When we talk about data, most people think of structured data: relational databases, CSV files, Excel spreadsheets, and so on. But there is a world of unstructured data that comprises between 80% and 90% of all available information. This unstructured data has attracted growing interest over the last 10 years and has been the driving force behind the study and development of new models now applied across industries: autonomous vehicles, audio translation in videos, quality assessment of agricultural plantations, fake-news detection in social networks, and many more.
Today we will focus on the world of images. Images are stored on our computers as 3-dimensional tensors representing the width, the height, and the number of channels (RGB being the most common scheme) that together encode the colors in the image. A color image has 3 channels, while a grayscale (black and white) image has only 1. When working with images, the first step is usually to find a vector representation (a descriptor, or feature) other than the raw tensor, one that captures the characteristics we care about, so that we can later train a model and solve the problem we are attacking, be it classification, segmentation, object detection, action detection, object tracking, or something else.
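To make this concrete, here is a minimal sketch (using Pillow and NumPy, with a hypothetical file photo.jpg) of what those tensors look like in practice; note that NumPy orders the dimensions as height, width, channels:

```python
import numpy as np
from PIL import Image

# Load a color image into a NumPy array (the path is illustrative).
img = np.array(Image.open("photo.jpg"))
print(img.shape)   # e.g. (480, 640, 3): height, width, channels

# A grayscale version collapses to a single channel.
gray = np.array(Image.open("photo.jpg").convert("L"))
print(gray.shape)  # e.g. (480, 640)
```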
Currently there are two ways to transform images into one or more vectors that capture their essence: the first is to use or design an algorithm specifically built to extract features from an image (handcrafted features), and the second is to let a machine learning model (generally from the deep learning subfield) learn the best way to represent the image's features on its own, just by looking at the data while training on a secondary task (learned features).
In this blog we will learn two classical algorithms for extracting features from an image. But why revisit these old algorithms when the state of the art uses neural networks for this? Although neural networks are very powerful, they also require huge amounts of data to be trained and use much more computational resources than the more classical image processing algorithms, so it never hurts to know tools that allow us to attack image problems when we are limited either by data or by resources.
Gabor filters are filters (or kernels) that, when applied to an image, can detect edges and textures within it. Each one is a sinusoidal wave modulated by a Gaussian envelope, with parameters that let the filter be varied in different ways, so in practice several are used together since they do not all capture the same thing. Figure 1 shows the types of variations that can be applied to these filters, and Figure 2 shows a sample of filters with variations in orientation and size, controlled by parameters that regulate each of the characteristics shown in Figure 1, with the exception of shape, which is an extension of the original filters.
Figure 1: Gabor filters with variations in distance, scale, orientation, number, location and shape.
Figure 2: Gabor filters with variations in size and orientation.
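To get a feel for the parameters behind the variations above, here is a minimal sketch of how a small filter bank with different orientations and scales could be built using OpenCV's cv2.getGaborKernel; all parameter values are illustrative, not the ones used to generate the figures:

```python
import cv2
import numpy as np

# A small bank of Gabor kernels: 4 orientations x 2 scales.
# theta rotates the filter, sigma controls the Gaussian envelope,
# lambd is the wavelength of the sinusoid, gamma the aspect ratio
# and psi the phase offset.
kernels = []
for theta in np.arange(0, np.pi, np.pi / 4):
    for sigma in (2.0, 4.0):
        kern = cv2.getGaborKernel(ksize=(21, 21), sigma=sigma,
                                  theta=theta, lambd=8.0,
                                  gamma=0.5, psi=0)
        # Normalization as in the OpenCV Gabor sample.
        kernels.append(kern / (1.5 * kern.sum()))
```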
But how can these filters be used to work with images? As seen in Figure 3, the resulting image depends on the orientation and size of the filter applied, and each filter captures different information. The examples on the right capture the texture of the sea and the rock well, while the sand appears to be unaffected by the filter. The resulting 'activation', the image obtained after applying the filter, can be used as a feature for later segmenting or classifying the image, or simply as a preprocessing step before applying any other transformation. If the activations are used as input features for another model, it is recommended to apply the filter to a downscaled image, both to avoid excessive memory usage and to reduce processing time. Another alternative is to sum all the values of the resulting image, obtaining a single value for each Gabor filter applied: if it is high, the filter activates in several parts of the image and gives us valuable information about its content.
Figure 3: Original image together with activation of 4 different Gabor filters.
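As a sketch of the aggregation alternative, the snippet below (again with illustrative parameters and a hypothetical photo.jpg) applies a small Gabor bank with cv2.filter2D and collapses each activation map into a single scalar:

```python
import cv2
import numpy as np

gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Same illustrative bank as before: 4 orientations x 2 scales.
kernels = [cv2.getGaborKernel((21, 21), sigma, theta, 8.0, 0.5, 0)
           for theta in np.arange(0, np.pi, np.pi / 4)
           for sigma in (2.0, 4.0)]

# One activation map per filter.
activations = [cv2.filter2D(gray, cv2.CV_32F, k) for k in kernels]

# One scalar per filter: a high value means the filter fires
# over large parts of the image.
features = np.array([act.sum() for act in activations])
```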
Another use of these filters is simply as a preprocessing step before applying an edge detection algorithm. As an example, Figure 4 shows how the results change when an image is preprocessed with Gabor filters before performing edge detection. With the Gabor filters, the edges of the image are detected better and without introducing background noise.
Figure 4: Comparison of edge detection results (Canny filter [1]) on an image preprocessed with 16 Gabor filters and on the same image without Gabor preprocessing.
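A minimal sketch of this pipeline, assuming a bank of 16 orientations as in Figure 4 and illustrative Canny thresholds:

```python
import cv2
import numpy as np

gray = cv2.imread("photo.jpg", cv2.IMREAD_GRAYSCALE)

# 16 orientations; taking the max over the bank keeps the
# strongest response at each pixel.
kernels = [cv2.getGaborKernel((21, 21), 4.0, theta, 8.0, 0.5, 0)
           for theta in np.arange(0, np.pi, np.pi / 16)]
filtered = np.max([cv2.filter2D(gray, cv2.CV_8U, k) for k in kernels],
                  axis=0)

edges_plain = cv2.Canny(gray, 100, 200)      # without Gabor preprocessing
edges_gabor = cv2.Canny(filtered, 100, 200)  # with Gabor preprocessing
```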
This is a method for extracting texture features from an image or from a region within it. It rests on the assumption that tone and texture always go hand in hand, and on the calculation of 14 metrics over the probability of co-occurrence of adjacent pixel values in a predetermined direction. That is, we compute a matrix that tells us, for every possible pair of pixel values, the probability of that combination occurring in a given direction: for example, the probability that a black pixel (value 0) sits right next to a white pixel (value 255), or vice versa in the same direction.
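Here is a minimal sketch of such a co-occurrence matrix, computed with scikit-image's graycomatrix on a tiny image with only 4 gray levels so the result stays readable:

```python
import numpy as np
from skimage.feature import graycomatrix

# A tiny image with only 4 gray levels keeps the matrix small.
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]], dtype=np.uint8)

# Co-occurrence at distance 1 in the horizontal (0 degree) direction,
# normalized so entry (i, j) is the probability that a pixel with
# value i has a pixel with value j immediately to its right.
glcm = graycomatrix(img, distances=[1], angles=[0], levels=4, normed=True)
print(glcm[:, :, 0, 0])
```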
As seen in Figure 5, there are 4 orientations in which the co-occurrence matrices are calculated. Once they are calculated, the recommendation is to average the 14 metrics proposed by Haralick across the directions and also to compute their range (max - min). The result is a vector with 28 values that, in theory, captures the essence of the textures in an image or a part of it.
Figure 5: Directions in which the co-occurrence of pixels within an image is calculated. 1 and 5 are horizontal at 0°, 7 and 3 vertical at 90°, 6 and 2 diagonal at 135°, and finally 4 and 8 diagonal at 45° (Source).
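In code, this descriptor can be sketched with the mahotas library; note that mahotas computes 13 of the 14 metrics (the notoriously unstable 14th is omitted), so the resulting vector has 26 entries instead of 28:

```python
import mahotas
import numpy as np
from PIL import Image

gray = np.array(Image.open("photo.jpg").convert("L"))

# mahotas returns a 4 x 13 matrix: 13 of Haralick's metrics,
# one row per direction.
h = mahotas.features.haralick(gray)

# Average across the 4 directions plus the range (max - min)
# per metric: a 26-dimensional texture descriptor.
descriptor = np.concatenate([h.mean(axis=0), np.ptp(h, axis=0)])
```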
For example, we might be interested in analyzing the texture in different parts of image A in Figure 6.
Figure 6: Image to be analyzed with Haralick textures.
A) Original image
B) Grayscale (black and white) image obtained from the average of all channels (RGB).
C) Quadrants of the image to be analyzed with Haralick textures.
Haralick textures are intended to operate on a single channel of the image, so a grayscale version is generated (Figure 6B), which is then subdivided into 100 quadrants from which the Haralick textures are extracted. As can be seen in Figure 7, each quadrant yields different values for the average of the 14 metrics computed on its co-occurrence matrix, and these values can be used as features to classify the textures within the image, or the image itself.
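A minimal sketch of this quadrant analysis, again with mahotas and a hypothetical photo.jpg (the quadrants are assumed large enough to contain more than one gray level):

```python
import mahotas
import numpy as np
from PIL import Image

# Grayscale from the average of the RGB channels, as in Figure 6B.
rgb = np.array(Image.open("photo.jpg"), dtype=np.float32)
gray = rgb.mean(axis=2).astype(np.uint8)

# Split into a 10 x 10 grid (100 quadrants) and compute, for each
# quadrant, the Haralick metrics averaged over the 4 directions.
features = np.array([
    mahotas.features.haralick(quad).mean(axis=0)
    for rows in np.array_split(gray, 10, axis=0)
    for quad in np.array_split(rows, 10, axis=1)
])
print(features.shape)  # (100, 13)
```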
Figure 7: Image vs. histogram of Haralick metrics.
A) Original grayscale image subdivided into 100 quadrants.
B) Histogram of the average of the 14 scaled metrics in each quadrant.
Haralick textures have been used in areas such as medical imaging and map segmentation, among others.
As we saw in this post, there are simple algorithms capable of extracting features from images, and they can even serve as preprocessing steps. So, if for some reason it is not possible to use neural networks, we now know two methods that allow us to extract features from an image.
Each method has its advantages and disadvantages: Haralick textures require essentially no parameter tuning and are fast to compute, while Gabor filters require a careful choice of parameters and/or the use of a large filter bank, which can make feature extraction unnecessarily slow.
Both methods are widely used in texture detection, and both can be applied in different ways depending on our objective: we can extract features from parts of an image or from the whole image, and we can aggregate and/or transform the activations or leave them as they are, depending on what we need.
In this blog we gave an intuition for how these algorithms work along with some very specific use cases. The important thing is to keep in mind that these more traditional methods exist and that we can use them to generate features that can later be used to train machine learning models and solve image-related problems.
[1] The edge detection algorithm is out of the scope of this post; it is only used to exemplify one of the applications of Gabor filters.