Image-Based Textile Decoding: Explaining How AI Can Recover Weaving Patterns from Fabric Images

Textiles are not only visual objects; they are also structured materials created through a precise arrangement of yarns. In woven fabrics, vertical yarns called warp and horizontal yarns called weft cross each other repeatedly. At every crossing point, either the warp yarn appears on top or the weft yarn appears on top.

The paper Image-based Textile Decoding studies an interesting reverse-engineering problem: can we take a photograph of a woven fabric and automatically recover the hidden binary pattern that defines how the fabric was woven? This is especially important for Jacquard fabrics, where complex patterns can be created without simple repetition.

Table of Contents

1. What Problem Does the Paper Solve?
2. Why Jacquard Textile Decoding Is Difficult
3. Main Method Proposed in the Paper
4. Intermediate Representation
5. Neural Network Architecture
6. Post-Processing into a Binary Pattern
7. Experimental Results
8. Why This Paper Is Important for Textile AI
9. Limitations and Future Scope
10. Relevance to Saree Provenance Research

1. What Problem Does the Paper Solve?

The central problem of the paper is textile decoding. In textile production, a binary weaving pattern is first given to a loom. The loom then creates a physical woven fabric. This is the forward process: from digital pattern to fabric.

The paper tries to solve the reverse problem: starting from an observed fabric image, recover the original binary pattern that describes the warp-weft crossing structure.

At every crossing point, the fabric can be described using a binary value:

\[ P(i,j)= \begin{cases} 0, & \text{if warp is over weft at crossing point }(i,j) \\ 1, & \text{if weft is over warp at crossing point }(i,j) \end{cases} \]

Here, \(P(i,j)\) is the binary weaving pattern at the crossing between the \(i\)-th warp yarn and the \(j\)-th weft yarn. This binary matrix can be understood as the hidden code behind the woven textile.
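To make the convention concrete, here is a minimal sketch in plain Python. It builds the binary matrix for a plain weave, the simplest structure in which warp and weft alternate at every crossing. The function names `plain_weave` and `render` are illustrative and do not come from the paper.

```python
def plain_weave(n_warp, n_weft):
    """Build P as a list of rows, where P[i][j] follows the convention above:
    0 = warp over weft, 1 = weft over warp, for warp i and weft j."""
    return [[(i + j) % 2 for j in range(n_weft)] for i in range(n_warp)]


def render(pattern):
    """Draw the matrix as text: '#' for 1 (weft over warp), '.' for 0."""
    return "\n".join("".join("#" if v else "." for v in row) for row in pattern)


P = plain_weave(4, 6)
print(render(P))
```

A Jacquard pattern uses exactly the same data structure. The only difference is that its entries need not follow a short repeating rule such as `(i + j) % 2`.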

2. Why Jacquard Textile Decoding Is Difficult

In ordinary woven fabrics, the weave structure may repeat periodically. Such repeated structures are easier to analyze because once a small pattern unit is identified, it can often explain the larger fabric.

Jacquard fabrics are more complex. A Jacquard loom can control individual warp-weft crossing points, allowing large and non-repetitive patterns. This means the entire fabric may need to be analyzed rather than only a small repeating unit.
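The contrast between a repeating weave and a Jacquard pattern can be illustrated with a small sketch. The helper below, `repeat_unit`, is a hypothetical name and is not part of the paper's method. It looks for the smallest shift along each axis under which a binary matrix maps onto itself. The `irregular` matrix is made-up example data.

```python
def repeat_unit(pattern):
    """Return (period_i, period_j) for a binary pattern stored as a list of rows.

    A period p along an axis means every entry equals the entry p steps
    earlier along that axis. If no shorter period is found, the full extent
    of the matrix along that axis is returned.
    """
    rows, cols = len(pattern), len(pattern[0])

    def period(length, same):
        for p in range(1, length + 1):
            if all(same(k, k - p) for k in range(p, length)):
                return p

    period_i = period(rows, lambda a, b: pattern[a] == pattern[b])
    period_j = period(cols, lambda a, b: all(r[a] == r[b] for r in pattern))
    return period_i, period_j


plain = [[(i + j) % 2 for j in range(6)] for i in range(6)]
irregular = [[0, 1, 1, 0], [1, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 1]]

print(repeat_unit(plain))      # a small unit explains the whole matrix
print(repeat_unit(irregular))  # no shorter repeat is found
```

For the plain weave a 2 by 2 unit is enough, while for the irregular matrix the whole matrix is its own unit. That second situation is the one a Jacquard decoder has to handle.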

The difficulty becomes greater when the fabric is photographed. The crossing points in the observed image may not lie neatly on a perfect grid. Yarns may bend, shift, twist, or appear differently due to lighting, texture, and physical deformation. Therefore, simple template matching is not reliable.

Key idea: The challenge is not only to classify an image. The challenge is to recover a structured grid-like binary pattern from an imperfect photograph of a physical woven object.

3. Main Method Proposed in the Paper

The authors propose a method that combines image processing, manual labeling, deep learning, and post-processing. Instead of directly converting a fabric image into a binary matrix, they introduce an intermediate representation.

The overall pipeline can be described as:

\[ \text{Fabric Image} \rightarrow \text{Pre-processing} \rightarrow \text{Intermediate Representation} \rightarrow \text{Deep Neural Network} \rightarrow \text{Post-processing} \rightarrow \text{Binary Weaving Pattern} \]

Stage	Purpose
Pre-processing	Clean the fabric image and reduce fine fiber noise.
Manual labeling	Create training examples by marking crossing points.
Intermediate representation	Represent crossing-point likelihoods in an image-like form.
Deep neural network	Learn to predict the intermediate representation from fabric images.
Post-processing	Convert the predicted intermediate image into a clean binary matrix.

4. Intermediate Representation

A major contribution of the paper is the use of an intermediate representation. The authors found that asking a deep neural network to directly output the final binary matrix is too difficult. The image and the final matrix are structurally different: the image is pixel-based, while the weaving pattern is grid-based.

To bridge this gap, they convert the crossing-point information into an image-like representation. In this representation, each pixel may take one of three values:

Pixel Value	Meaning
\(0\)	Warp is on top of weft.
\(1\)	Weft is on top of warp.
\(0.5\)	The pixel is not a crossing point.

The basic impulse representation can be written as:

\[ I_0(x,y)= \begin{cases} 1, & \text{if }(x,y)\text{ is a crossing point with weft over warp} \\ 0, & \text{if }(x,y)\text{ is a crossing point with warp over weft} \\ 0.5, & \text{otherwise} \end{cases} \]
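As a rough sketch, the impulse representation can be built from a list of labeled crossing points. The function name `impulse_image` and the `(x, y, value)` label format are assumptions made for illustration. The paper's actual annotation format may differ.

```python
def impulse_image(height, width, crossings):
    """Build the tri-valued impulse image as a list of rows.

    crossings: iterable of (x, y, value), where value is 1 for weft over warp
    and 0 for warp over weft (hypothetical label format). Every pixel that is
    not a labeled crossing point keeps the neutral value 0.5.
    """
    img = [[0.5] * width for _ in range(height)]
    for x, y, value in crossings:
        img[y][x] = float(value)
    return img


img0 = impulse_image(5, 7, [(1, 1, 1), (4, 3, 0)])
for row in img0:
    print(row)
```

Only two of the 35 pixels in this toy image carry a label. That sparsity is a plausible reason why such a target is hard for a network to learn.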

However, the authors found that an impulse representation is too sharp and difficult for the network to learn. They therefore tested filtered versions of this representation. The best performance came from the box-filtered peak representation, where each crossing point is represented as a small region rather than a single sharp pixel.

A simplified form of the box-filtered representation, assuming the windows around neighboring crossing points do not overlap, is:

\[ I_B(x,y)= \begin{cases} 1, & \max_{(s,t)\in N(x,y)} I_0(s,t)=1 \\ 0, & \min_{(s,t)\in N(x,y)} I_0(s,t)=0 \\ 0.5, & \text{otherwise} \end{cases} \]

Here, \(N(x,y)\) represents the neighborhood around pixel \((x,y)\). In the paper, a \(9 \times 9\) window gave strong results.
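The idea of turning each crossing point into a small region can be sketched as follows. This is an illustrative reading of the box-filtered representation, not the authors' code. The name `box_peak_image` and the label format are assumptions, and the sketch assumes that the windows of different crossing points do not overlap (if they do, the later label simply overwrites the earlier one).

```python
def box_peak_image(height, width, crossings, window=9):
    """Expand each labeled crossing point into a window x window block.

    crossings: iterable of (x, y, value) with value 1 (weft over warp) or
    0 (warp over weft). Pixels outside every block stay at 0.5.
    Blocks are clipped at the image border.
    """
    half = window // 2
    img = [[0.5] * width for _ in range(height)]
    for x, y, value in crossings:
        for t in range(max(0, y - half), min(height, y + half + 1)):
            for s in range(max(0, x - half), min(width, x + half + 1)):
                img[t][s] = float(value)
    return img


img_b = box_peak_image(20, 20, [(5, 5, 1), (15, 15, 0)], window=9)
n_ones = sum(v == 1.0 for row in img_b for v in row)
n_zeros = sum(v == 0.0 for row in img_b for v in row)
print(n_ones, n_zeros)
```

Compared with a single impulse pixel, each label now covers a block of up to 81 pixels, so the target contains far more non-neutral pixels for the network to fit.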

5. Neural Network Architecture

The authors use a deep neural network with a U-Net-like structure. U-Net is well suited to image-to-image tasks because its encoder-decoder design with skip connections preserves spatial detail while also capturing context from surrounding regions.

This is important because textile decoding requires both local and global understanding. The network must inspect local yarn crossings, but it must also preserve the overall spatial arrangement of the grid.

The input to the network is a pre-processed fabric image, and the output is the intermediate representation image.

In simplified form:

\[ f_\theta(X) \approx I_B \]

where \(X\) is the pre-processed fabric image, \(I_B\) is the target intermediate representation, and \(f_\theta\) is the neural network with learnable parameters \(\theta\).

The authors use an \(L_1\) loss between the predicted image and the target label image:

\[ \mathcal{L} = \sum_{x,y} \left| \hat{I}(x,y)-I_B(x,y) \right| \]

Here, \(\hat{I}(x,y)\) is the value predicted by \(f_\theta(X)\) at pixel \((x,y)\), and \(I_B(x,y)\) is the target intermediate representation value at the same pixel.
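As a small worked example, the loss above can be computed by hand on a 2 by 2 image. The values below are made up for illustration, and `l1_loss` is a hypothetical helper name.

```python
def l1_loss(pred, target):
    """Sum of absolute per-pixel differences between two same-sized images."""
    return sum(
        abs(p - t)
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )


pred = [[0.9, 0.5], [0.1, 0.6]]    # made-up network output
target = [[1.0, 0.5], [0.0, 0.5]]  # made-up tri-valued target

loss = l1_loss(pred, target)
print(round(loss, 6))
```

The three imperfect pixels each contribute about 0.1, so the total is about 0.3. In PyTorch, `torch.nn.L1Loss` averages over pixels by default. Passing `reduction="sum"` gives the summed form written above.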

6. Post-Processing into a Binary Pattern

The neural network does not directly output the final weaving pattern. It outputs an intermediate image. Therefore, post-processing is required to convert this image into a binary matrix.

The post-processing has four major steps:

Step	Description
1. Tri-valued conversion	The continuous output is converted into \(0\), \(0.5\), and \(1\).
2. Region integration	Connected regions are merged so that each crossing point has one consistent value.
3. Yarn position estimation	Approximate warp and weft positions are estimated.
4. Binary assignment	Each grid point is assigned either \(0\) or \(1\).

The result is a binary matrix that can be interpreted as the recovered Jacquard weaving pattern.

\[ \hat{P}(i,j) \in \{0,1\} \]

where \(\hat{P}(i,j)\) is the decoded binary value at the crossing of warp \(i\) and weft \(j\).
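The first and last steps can be sketched in a few lines. This is a simplified illustration under several assumptions that are not taken from the paper: the thresholds 0.25 and 0.75 are arbitrary, the yarn positions `warp_xs` and `weft_ys` are assumed to be already known (so region integration and yarn position estimation are skipped), and each grid point is decided by a majority vote in a small window, with ties resolved to 0.

```python
def to_tri_valued(pred, low=0.25, high=0.75):
    """Snap continuous outputs to 0, 0.5, or 1 using assumed thresholds."""
    return [[0.0 if v < low else 1.0 if v > high else 0.5 for v in row] for row in pred]


def assign_binary(tri, warp_xs, weft_ys, radius=1):
    """Return pattern[i][j] for warp i (column warp_xs[i]) and weft j (row weft_ys[j]).

    Each value is a majority vote between 1-pixels and 0-pixels inside a
    (2 * radius + 1)-wide window around the grid point; ties become 0.
    """
    h, w = len(tri), len(tri[0])
    pattern = []
    for x in warp_xs:
        row = []
        for y in weft_ys:
            ones = zeros = 0
            for t in range(max(0, y - radius), min(h, y + radius + 1)):
                for s in range(max(0, x - radius), min(w, x + radius + 1)):
                    if tri[t][s] == 1.0:
                        ones += 1
                    elif tri[t][s] == 0.0:
                        zeros += 1
            row.append(1 if ones > zeros else 0)
        pattern.append(row)
    return pattern


# Made-up 6 x 6 network output with four crossing regions.
pred = [[0.5] * 6 for _ in range(6)]
for cx, cy, v in [(1, 1, 0.9), (4, 1, 0.1), (1, 4, 0.2), (4, 4, 0.8)]:
    for t in range(cy - 1, cy + 2):
        for s in range(cx - 1, cx + 2):
            pred[t][s] = v

decoded = assign_binary(to_tri_valued(pred), warp_xs=[1, 4], weft_ys=[1, 4])
print(decoded)
```

In a real pipeline the yarn positions would have to be estimated from the predicted image itself, which is what steps 2 and 3 of the table are for.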

7. Experimental Results

The authors tested the method using black-and-white Jacquard fabric images. They captured textile samples using a camera with a macro lens and then divided high-resolution images into smaller image patches.

Experimental Detail	Value
Original images	176
Image size	\(512 \times 320\) pixels
Data augmentation	Horizontal flip, vertical flip, and \(180^\circ\) rotation
Total augmented samples	704
Deep learning framework	PyTorch
Validation method	11-fold cross-validation
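The numbers in the table fit together, as a quick sketch shows. The three transforms plus the original give four versions of each image, and all of them keep the image dimensions unchanged. The helper names are illustrative. Splitting the folds over the original images, so that each fold holds 16 of them, is an assumption here and not a detail confirmed by the table.

```python
def hflip(img):
    """Mirror an image (list of rows) left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror an image top to bottom."""
    return img[::-1]


def rot180(img):
    """Rotate by 180 degrees, which equals a horizontal plus a vertical flip."""
    return hflip(vflip(img))


def augment(img):
    """Return the original image together with its three transformed versions."""
    return [img, hflip(img), vflip(img), rot180(img)]


n_original = 176
n_augmented = n_original * len(augment([[0]]))  # four versions per image
fold_size = n_original // 11                    # assumed: folds over originals

print(n_augmented, fold_size)
```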

The most important result is that the proposed method achieved about:

\[ \text{Accuracy} = 0.930 \]

and:

\[ F\text{-measure} = 0.929 \]

In other words, about 93% of the crossing points were assigned the correct over/under value. The authors also showed that the decoded binary patterns could be woven again to produce fabrics visually close to the original samples.
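For readers unfamiliar with these metrics, the sketch below shows how accuracy and F-measure could be computed by comparing a decoded pattern with the ground-truth pattern, crossing by crossing. Treating the value 1 (weft over warp) as the positive class is an assumption. The paper's exact evaluation protocol may differ, and the two small matrices are made-up data.

```python
def accuracy_and_f_measure(decoded, truth):
    """Compare two binary patterns entry by entry, with 1 as the positive class."""
    tp = fp = fn = tn = 0
    for decoded_row, truth_row in zip(decoded, truth):
        for d, t in zip(decoded_row, truth_row):
            if d == 1 and t == 1:
                tp += 1
            elif d == 1 and t == 0:
                fp += 1
            elif d == 0 and t == 1:
                fn += 1
            else:
                tn += 1
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    return accuracy, f_measure


truth = [[0, 1, 0, 1], [1, 0, 1, 0]]
decoded = [[0, 1, 0, 0], [1, 0, 1, 0]]  # one crossing decoded incorrectly

acc, f1 = accuracy_and_f_measure(decoded, truth)
print(acc, round(f1, 3))
```

With one wrong crossing out of eight, accuracy is 7/8 and the F-measure is 6/7. Even a single error changes the woven result at that crossing, which is why the authors' re-weaving check is a useful complement to the numeric scores.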

8. Why This Paper Is Important for Textile AI

This paper is important because it treats textile images as more than ordinary pictures. A woven fabric has an underlying physical and structural logic. The appearance of the fabric is created by the repeated interaction of warp and weft yarns.

Many textile image analysis studies focus on classification, defect detection, or visual similarity. This paper goes deeper by attempting to recover the actual weave structure from the observed image.

The paper also shows that direct deep learning may not always be enough. The authors had to design a carefully structured pipeline with intermediate representation and post-processing. This is a useful lesson for textile AI research: domain knowledge about yarns, grids, crossings, and weaving structure can improve machine learning methods.

9. Limitations and Future Scope

The paper has some limitations. First, the method was mainly tested on images of fabrics woven from black and white yarns. Real textiles often contain many colors, complex textures, metallic yarns, uneven lighting, and decorative effects.

Second, the dataset was relatively small: 176 original image patches, or 704 after augmentation. Although data augmentation helped, larger datasets would likely improve deep learning performance.

Third, manual labeling was required to prepare training data. This makes the approach semi-automatic during the dataset preparation stage.

Fourth, the method works on image patches. For very large textiles, the decoded patches would need to be stitched together to reconstruct the complete fabric pattern.

Limitation	Possible Future Direction
Only black-and-white yarns	Extend the method to multi-color yarns and real-world textile images.
Small dataset	Build larger annotated textile decoding datasets.
Manual labeling required	Develop weakly supervised or self-supervised labeling methods.
Patch-level decoding	Use image stitching or global textile reconstruction methods.
Partial decoding errors	Add structural constraints based on weaving rules.

10. Relevance to Saree Provenance Research

This paper is highly relevant to textile AI, but its objective is different from saree provenance classification.

The paper focuses on decoding the binary warp-weft structure of Jacquard fabrics. Saree provenance classification, on the other hand, tries to identify the regional or cultural origin of a saree using visual and structural cues such as motifs, borders, pallu design, weaving style, material, color layout, and craft tradition.

Image-Based Textile Decoding	Saree Provenance Classification
Recovers warp-weft binary pattern.	Identifies regional origin or craft tradition.
Works at yarn-crossing level.	Works at motif, border, pallu, texture, and whole-image level.
Uses U-Net and post-processing.	May use CNNs, Vision Transformers, metric learning, and graph neural networks.
Mainly tested on black-and-white Jacquard samples.	Must handle multi-color, multi-pattern, real-world saree images.

For saree provenance research, the key takeaway is that textile images contain recoverable structural information. A saree image is not only a visual pattern; it reflects weaving technique, motif grammar, regional design conventions, and material structure.

Therefore, future AI systems for saree classification may benefit from combining visual models with textile-domain knowledge. Instead of relying only on surface-level image classification, such systems can incorporate structured cues related to weave, motif, border, pallu, and region.

Conclusion

The paper Image-based Textile Decoding presents an interesting approach to recovering the hidden binary weaving pattern from a fabric image. Its main strength lies in the use of an intermediate representation that connects photographic fabric images with grid-based weaving patterns.

The study shows that deep learning can support textile structure analysis, but it also shows the importance of domain-specific processing. For textiles, the physical structure of yarns and crossings matters. A successful AI system must therefore understand not only pixels, but also the material logic behind the image.

For researchers working on saree provenance, textile classification, handloom recognition, or cultural heritage informatics, this paper is a useful example of how computer vision can move beyond simple image classification and begin to analyze the structural intelligence embedded in woven fabrics.

Disclaimer: This article is an educational explanation of the research paper Image-based Textile Decoding. It is intended for learning and discussion purposes. Readers should consult the original paper for complete technical details, experimental settings, and formal results.

My Research Notes

Saturday, 13 June 2026

Understanding the Paper: Image Based Textile Decoding