Protostar Labs


Data augmentation for improved and more robust computer vision models


Training machine learning and computer vision models generally requires a large annotated training set. However, it can be hard both to gather enough images and to annotate them. Training set augmentation is a common and very effective way to address both problems. It consists of creating new images from existing ones by applying simple transformations. Apart from enlarging the training set, data augmentation has a further benefit: the resulting model becomes more robust to common image variations, because it is trained to focus on relevant image features (for example, objects in the scene) and to ignore irrelevant ones such as brightness and contrast changes, noise and other degradations, rotation, perspective changes, and horizontal or vertical flipping. Sometimes this aspect is even more important than the enlargement of the training set.
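As a minimal sketch of what "simple transformations" can look like in practice, the NumPy example below creates several augmented variants of one grayscale image by combining random flips, a random brightness and contrast change, and additive Gaussian noise. The function names, parameter ranges, and the random stand-in image are illustrative assumptions, not the code used in the project.

```python
import numpy as np

def random_flip(image, rng):
    """Flip horizontally and/or vertically, each with probability 0.5."""
    if rng.random() < 0.5:
        image = image[:, ::-1]
    if rng.random() < 0.5:
        image = image[::-1, :]
    return image

def random_brightness_contrast(image, rng, max_shift=0.2, max_gain=0.3):
    """Apply out = gain * (in - 0.5) + 0.5 + shift with random gain and shift."""
    gain = 1.0 + rng.uniform(-max_gain, max_gain)
    shift = rng.uniform(-max_shift, max_shift)
    return np.clip(gain * (image - 0.5) + 0.5 + shift, 0.0, 1.0)

def add_gaussian_noise(image, rng, sigma=0.03):
    """Add zero-mean Gaussian noise and keep values in [0, 1]."""
    noise = rng.normal(0.0, sigma, size=image.shape)
    return np.clip(image + noise, 0.0, 1.0)

def augment(image, rng):
    """Compose the three transformations into one augmentation step."""
    image = random_flip(image, rng)
    image = random_brightness_contrast(image, rng)
    image = add_gaussian_noise(image, rng)
    return image

rng = np.random.default_rng(0)
original = rng.random((64, 64))  # stand-in for a real grayscale image in [0, 1]
augmented = [augment(original, rng) for _ in range(8)]
```

For detection tasks, geometric transformations such as flips also have to be applied to the annotations (for example, bounding boxes), while purely photometric changes like brightness or noise leave them untouched.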

Recently we developed a data augmentation solution for a client who uses Data Matrix Code (DMC) for product identification. DMC is a 2D code similar to the ubiquitous Quick Response (QR) code, but it has some advantages for product marking: it can encode more characters within the same space and is therefore better suited when the area available for the code is small. For reliable detection, a DMC detector should be insensitive to common image degradations and other sources of variability. In this use case, these include camera perspective changes, constant or structured brightness changes due to additional light sources, spurious reflections, randomly missing dots in the code, scratches, and background changes.

Frameworks such as Torchvision or Albumentations cover the most common augmentations, but specific use cases (such as this one) usually need customization. Here we needed to implement realistic non-constant brightness changes due to additional light sources, realistic reflections of the kind that usually appear due to the lighting of the scanner device, scratches, and changes of background (a different texture or a completely different background), as well as a detector for the DMC and the metal sheet. Below we include an example image of a DMC on a metal sheet and several examples of augmentations obtained by composing several degradations.
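To give a rough idea of how such custom degradations might be written, the sketch below implements two of them with NumPy: a non-constant brightness change modelled as a smooth Gaussian light spot, and scratches modelled as thin bright line segments. Both models, the function names, and all parameter values are simplifying assumptions chosen for illustration; the augmentations built for the client are more realistic than this.

```python
import numpy as np

def add_light_spot(image, rng, strength=0.5):
    """Brighten the image with a smooth Gaussian spot at a random position."""
    h, w = image.shape
    cy, cx = rng.uniform(0, h), rng.uniform(0, w)
    sigma = rng.uniform(0.2, 0.6) * max(h, w)
    yy, xx = np.mgrid[0:h, 0:w]
    spot = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2.0 * sigma ** 2))
    return np.clip(image + strength * spot, 0.0, 1.0)

def add_scratches(image, rng, count=3, intensity=0.9):
    """Draw a few thin, bright straight lines between random end points."""
    out = image.copy()
    h, w = out.shape
    for _ in range(count):
        y0, y1 = rng.uniform(0, h - 1, size=2)
        x0, x1 = rng.uniform(0, w - 1, size=2)
        # One sample per pixel step along the longer axis keeps the line connected.
        n = int(max(abs(y1 - y0), abs(x1 - x0))) + 1
        ys = np.round(np.linspace(y0, y1, n)).astype(int)
        xs = np.round(np.linspace(x0, x1, n)).astype(int)
        out[ys, xs] = intensity
    return out

rng = np.random.default_rng(1)
sheet = np.full((32, 48), 0.3)  # stand-in for a uniform grayscale metal sheet
lit = add_light_spot(sheet, rng)
scratched = add_scratches(sheet, rng)
degraded = add_scratches(add_light_spot(sheet, rng), rng)  # composed degradations
```

Because each function takes an image and returns an image, the degradations can be chained in any order, which is how composed examples of this kind can be produced.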

Figure 1. Example of DMC on a metal sheet
Figure 2. Example of augmented image with perspective change, scratches, brightness and lighting change

Figure 3. Example of augmented image with perspective change, scratches and additional lighting
Figure 4. Example of augmented image with background, brightness and lighting changes

Data augmentation is a very useful technique. Recent, very effective approaches learn augmentation policies for a specific use case from data, both for training set augmentation and for test-time augmentation. Test-time augmentation means averaging a model's predictions over multiple augmented versions of the same input, which can improve model performance.
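Test-time augmentation can be sketched in a few lines. In the example below, `predict_with_tta` averages the class probabilities a model returns for several transformed views of one image. The `toy_model` classifier and the fixed list of transforms are hypothetical stand-ins used only to make the sketch runnable; a learned policy would choose the transforms from data instead.

```python
import numpy as np

def predict_with_tta(model, image, transforms):
    """Average the model's predictions over several augmented views of one image."""
    predictions = [model(t(image)) for t in transforms]
    return np.mean(predictions, axis=0)

# Fixed, hand-picked views: identity, two flips, and a mild brightening.
transforms = [
    lambda img: img,
    lambda img: img[:, ::-1],
    lambda img: img[::-1, :],
    lambda img: np.clip(img * 1.1, 0.0, 1.0),
]

def toy_model(image):
    """Hypothetical stand-in classifier: returns [p_dark, p_bright]."""
    p_bright = float(image.mean())
    return np.array([1.0 - p_bright, p_bright])

image = np.random.default_rng(0).random((32, 32))
probs = predict_with_tta(toy_model, image, transforms)
```

Plain averaging works directly for classification outputs. For a detector, predictions made on geometrically transformed views would first have to be mapped back to the original image coordinates before they can be merged.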

If you are interested in computer vision and machine learning projects like this one (or much more complex ones), or if you need a similar or other specific solution, feel free to contact us (hello@protostar.ai): we are always on the lookout for new engineers and projects.
