How Do Grayscale Images Affect Visual Anomaly Detection?

  1. Introduction: Why grayscale images might affect anomaly detection.
  2. Anomaly detection, grayscale images: A quick recap of the two main subjects discussed in this article.
  3. Experiment setting: What and how we compare.
  4. Performance results: How grayscale images affect model performance.
  5. Speed results: How grayscale images affect inference speed.
  6. Conclusion

1. Introduction

In this article, we’ll explore how grayscale images affect the performance of anomaly detection models and examine how this choice influences inference speed.

In computer vision, it’s well established that fine-tuning pre-trained classification models on grayscale images can degrade performance. But what about anomaly detection models? These models do not require fine-tuning, but they rely on pre-trained classification models such as WideResNet or EfficientNet as feature extractors. This raises an important question: do these feature extractors produce less relevant features when applied to grayscale images?

Image taken from the VisA dataset (CC-BY-4.0) and processed using Anomalib library

This question is not just academic; it has real-world implications for anyone working on automating industrial visual inspection in manufacturing. For example, you might wonder whether a color camera is necessary or whether a cheaper grayscale one will suffice. Or you might be concerned about inference speed and want to take every opportunity to increase it.

2. Anomaly detection, grayscale images

If you are already familiar with both anomaly detection in computer vision and the basics of digital image representation, feel free to skip this section. Otherwise, it provides a brief overview and links for further exploration.

Anomaly detection

In computer vision, anomaly detection is a fast-evolving field within deep learning that focuses on identifying unusual patterns in images. Typically, these models are trained using only images without defects, allowing the model to learn what “normal” looks like. During inference, the model can detect images that deviate from this learned representation as abnormal. Such anomalies often correspond to various defects that may appear in a production environment but were not seen during training. For a more detailed introduction, see this link.

Grayscale images

For humans, color and grayscale images look quite similar (aside from the lack of color). But for computers, an image is an array of numbers, so the situation is a little more complicated. A grayscale image is a two-dimensional array of numbers, typically ranging from 0 to 255, where each value represents the intensity of a pixel, with 0 being black and 255 being white.

In contrast, color images are typically composed of three such separate grayscale images (called channels) stacked together to form a three-dimensional array. Each channel (red, green, and blue) describes the intensity of the respective color, and their combination creates a color image. You can learn more about this here.

3. Experiment setting

Models

We will use four state-of-the-art anomaly detection models: PatchCore, Reverse Distillation, FastFlow, and GLASS. These models represent different types of anomaly detection algorithms and, at the same time, are widely used in practical applications thanks to their fast training and inference. The first three models use the implementation from the Anomalib library; for GLASS, we employ the official implementation.

Image by author

Dataset

For our experiments, we use the VisA dataset with 12 categories of objects, which provides a variety of images and has no color-dependent defects.

Image taken from the VisA dataset (CC-BY-4.0)

Metrics

We will use image-level AUROC to check whether whole images are classified correctly without having to select a particular threshold, and pixel-level AUPRO, which shows how well we localize defective areas in the image. Speed will be evaluated using the frames-per-second (FPS) metric. For all metrics, higher values correspond to better results.

Grayscale conversion

To make an image grayscale, we will use torchvision transforms.

Image by author

For one-channel input, we also modify the feature extractors using the in_chans parameter of the timm library.

Image by author

The code for adapting Anomalib to use one channel is available here.

4. Performance results

RGB

These are regular images with red, green, and blue channels.

Image by author

Grayscale, three channels

Images were converted to grayscale using the torchvision Grayscale transform with three output channels.

Image by author

Grayscale, one channel

Images were converted to grayscale using the same torchvision Grayscale transform, this time with one output channel.

Image by author

Comparison

We can see that PatchCore and Reverse Distillation achieve similar results across all three experiments for both image- and pixel-level metrics. FastFlow becomes somewhat worse, and GLASS noticeably worse. Results are averaged across the 12 categories of objects in the VisA dataset.

What about results per category of objects? Perhaps some categories perform worse and others better, so that the averages only appear stable? Here is the visualization of the results for PatchCore across all three experiments, showing that results are quite stable within categories as well.

Image by author

The same visualization for GLASS shows that some categories become slightly better while others become noticeably worse. However, this is not necessarily caused by the grayscale transformation alone; some of it may be ordinary fluctuation from how the model is trained. The averaged results show a clear tendency: for this model, RGB images produce the best results, grayscale with three channels somewhat worse ones, and grayscale with one channel the worst.

Image by author

Bonus

How do results change per category? It is possible that some categories are simply better suited for RGB or grayscale images, even if there are no color-dependent defects.

Here is the visualization of the difference between RGB and one-channel grayscale for all the models. We can see that only the pipe_fryum category becomes worse for every model, whether slightly or substantially. The rest of the categories become worse or better depending on the model.

Image by author

Extra bonus

If you are curious what pipe_fryum looks like, here are a couple of examples with GLASS model predictions.

Images taken from the VisA dataset (CC-BY-4.0) and processed using GLASS and Anomalib library

5. Speed results

The number of channels affects only the first layer of the model; the rest remains unchanged. The speed improvement appears negligible, highlighting that first-layer feature extraction is only a small part of the computation performed by the models. GLASS shows a somewhat noticeable improvement, but it also shows the worst decline in metrics, so caution is required if you want to speed it up by switching to one channel.

Image by author

6. Conclusion

So how does using grayscale images affect visual anomaly detection? It depends, but RGB seems to be the safer bet. The impact varies depending on the model and data. PatchCore and Reverse Distillation generally handle grayscale inputs well, but you need to be more careful with FastFlow and especially GLASS, which shows some speed improvement but also the most significant drop in performance metrics. If you want to use grayscale input, you need to test and compare it with RGB on your specific data.

The Jupyter notebook with the Anomalib code: link.

Follow the author on LinkedIn for more on industrial visual anomaly detection.

References

1. C. Hughes, Transfer Learning on Greyscale Images: How to Fine-Tune Pretrained Models (2022), towardsdatascience.com

2. S. Wehkamp, A practical guide to image-based anomaly detection using Anomalib (2022), blog.ml6.eu

3. A. Baitieva, Y. Bouaouni, A. Briot, D. Ameln, S. Khalfaoui, and S. Akcay, Beyond Academic Benchmarks: Critical Analysis and Best Practices for Visual Industrial Anomaly Detection (2025), CVPR Workshop on Visual Anomaly and Novelty Detection (VAND)

4. Y. Zou, J. Jeong, L. Pemula, D. Zhang, and O. Dabeer, SPot-the-Difference Self-Supervised Pre-training for Anomaly Detection and Segmentation (2022), ECCV

5. S. Akcay, D. Ameln, A. Vaidya, B. Lakshmanan, N. Ahuja, and U. Genc, Anomalib (2022), ICIP
