Image Similarity: Understanding and Implementing Methods

In the digital era, where images are widely used across industries like e-commerce, healthcare, and social media, the ability to measure how similar two images are has become an important task. This is known as image similarity. Whether for recommending similar products in an online store or detecting duplicate images in a large dataset, image similarity algorithms help automate processes that involve comparing and categorizing images based on visual content.

This article explains what image similarity is, surveys the most common methods for calculating it, and walks through a practical implementation in Python.

What is Image Similarity?

Image similarity refers to the degree of visual resemblance between two images. It can be based on various features like color, texture, shape, or more advanced representations using machine learning models. The goal is to create a metric that quantifies how "close" two images are in terms of their content.

The applications of image similarity are numerous:

Duplicate detection: Finding identical or near-duplicate images in large datasets.
Recommendation systems: Recommending visually similar products.
Content-based image retrieval (CBIR): Retrieving images from a database based on visual content.

Methods for Calculating Image Similarity

There are various approaches to calculating image similarity, depending on the type of features you want to extract from the images:

1. Pixel-based Similarity

One of the simplest ways to compare two images is to compare their pixel values directly. This works well for images that are identical or nearly so and share the same dimensions, but it breaks down under even slight changes in lighting, rotation, or scale, because corresponding pixels no longer line up.

Mean Squared Error (MSE): Compares the pixel values of two images by calculating the average squared difference between corresponding pixels.

MSE = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} (I_1(i,j) - I_2(i,j))^2
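
The formula above translates almost directly into NumPy. The following is a minimal sketch, assuming both images are already loaded as arrays of identical shape; the function name `mse` is illustrative and does not come from any library.

```python
import numpy as np

def mse(image_a, image_b):
    """Mean squared error between two images of identical shape (illustrative helper)."""
    if image_a.shape != image_b.shape:
        raise ValueError("images must have the same dimensions")
    # Cast to float first: subtracting uint8 arrays would wrap around instead of going negative
    diff = image_a.astype(np.float64) - image_b.astype(np.float64)
    return float(np.mean(diff ** 2))

# Two tiny 2x2 grayscale "images" that differ by 10 in a single pixel
a = np.array([[0, 50], [100, 150]], dtype=np.uint8)
b = np.array([[0, 50], [100, 160]], dtype=np.uint8)
print(mse(a, a))  # 0.0
print(mse(a, b))  # 25.0
```

An MSE of 0 means the images are pixel-identical; larger values indicate larger differences, and unlike SSIM the value is not normalized to a fixed range.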

Structural Similarity Index (SSIM): Measures the structural similarity between two images by focusing on luminance, contrast, and structure. SSIM is more perceptually relevant compared to MSE.
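
For reference, the commonly cited form of SSIM for two image windows x and y combines these three components into a single expression. The symbols are the conventional ones rather than notation taken from this article: \mu is the window mean, \sigma^2 the variance, \sigma_{xy} the covariance, and C_1, C_2 are small constants that keep the division stable.

\text{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}

The score is 1 for identical windows, and implementations typically average it over sliding windows across the image to produce a single value.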

2. Histogram-based Similarity

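This method compares the color distributions of two images using their histograms. Because a histogram discards spatial layout, it is robust to small shifts, rotations, and rearrangements of content. For the same reason, it cannot distinguish images that share a color distribution but differ in composition, and it is sensitive to lighting changes that shift the colors themselves.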

Histogram Intersection: Measures the overlap between two histograms by summing the bin-wise minimum. For normalized histograms the result ranges from 0 (no overlap) to 1 (identical).

\text{Intersection}(H_1, H_2) = \sum_{i=1}^{n} \min(H_1(i), H_2(i))
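
As a rough illustration of the intersection formula, the sketch below builds normalized grayscale histograms with NumPy and sums their bin-wise minima. The helper names, the choice of 32 bins, and the synthetic test images are assumptions made for the example.

```python
import numpy as np

def gray_histogram(image, bins=32):
    """Normalized intensity histogram of an 8-bit grayscale image (illustrative helper)."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """Sum of bin-wise minima: 1.0 for identical normalized histograms, 0.0 for disjoint ones."""
    return float(np.sum(np.minimum(h1, h2)))

# Synthetic 4x4 images: uniformly dark, uniformly bright, and half of each
dark = np.full((4, 4), 10, dtype=np.uint8)
bright = np.full((4, 4), 200, dtype=np.uint8)
mixed = np.concatenate([dark[:2], bright[:2]])

print(histogram_intersection(gray_histogram(dark), gray_histogram(dark)))    # 1.0
print(histogram_intersection(gray_histogram(dark), gray_histogram(bright)))  # 0.0
print(histogram_intersection(gray_histogram(dark), gray_histogram(mixed)))   # 0.5
```

Normalizing each histogram to sum to 1 keeps the score comparable across images of different sizes.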

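Chi-square Distance: Measures the difference between histograms by summing the squared bin-wise differences, each divided by the combined count of the two bins. A value of 0 means the histograms are identical, and larger values mean greater dissimilarity.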

\chi^2 = \sum_{i=1}^{n} \frac{(H_1(i) - H_2(i))^2}{H_1(i) + H_2(i)}
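
A minimal NumPy sketch of the chi-square formula above follows. The function name is illustrative, and skipping bins that are empty in both histograms is an assumption made here to avoid division by zero; library implementations may normalize differently.

```python
import numpy as np

def chi_square_distance(h1, h2):
    """Chi-square distance between two histograms, following the formula above."""
    h1 = np.asarray(h1, dtype=np.float64)
    h2 = np.asarray(h2, dtype=np.float64)
    denom = h1 + h2
    # Bins that are empty in both histograms contribute nothing and would divide by zero
    nonzero = denom > 0
    return float(np.sum((h1[nonzero] - h2[nonzero]) ** 2 / denom[nonzero]))

h_a = np.array([0.5, 0.5, 0.0, 0.0])
h_b = np.array([0.5, 0.0, 0.5, 0.0])
print(chi_square_distance(h_a, h_a))  # 0.0
print(chi_square_distance(h_a, h_b))  # 1.0
```

Unlike histogram intersection, this is a distance: identical histograms score 0 rather than 1.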

3. Feature-based Similarity

This method extracts higher-level features from the image, such as edges, corners, or shapes, and compares them.

SIFT (Scale-Invariant Feature Transform): Extracts local features that are invariant to scaling and rotation and partially robust to illumination changes.
ORB (Oriented FAST and Rotated BRIEF): A faster alternative to SIFT, used for keypoint matching and tracking.

4. Deep Learning-based Similarity

Using pre-trained deep learning models, images can be embedded in a high-dimensional space where their proximity indicates similarity. Popular approaches include:

Convolutional Neural Networks (CNNs): These extract hierarchical features from images that are robust to variations in object appearance and background noise.
Pre-trained models (e.g., VGG, ResNet): These can be used for feature extraction to create embeddings that can then be compared using distance metrics like cosine similarity.
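
Once each image has been reduced to an embedding vector, the comparison itself is simple. The sketch below computes cosine similarity with NumPy on small hand-written vectors that stand in for real embeddings; the function name and example values are illustrative only.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1 for the same direction, 0 for orthogonal."""
    u = np.asarray(u, dtype=np.float64).ravel()
    v = np.asarray(v, dtype=np.float64).ravel()
    norm = np.linalg.norm(u) * np.linalg.norm(v)
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(u, v) / norm)

emb_a = np.array([1.0, 2.0, 0.0])
emb_b = np.array([2.0, 4.0, 0.0])  # same direction as emb_a, different magnitude
emb_c = np.array([0.0, 0.0, 3.0])  # orthogonal to emb_a

print(cosine_similarity(emb_a, emb_b))  # approximately 1.0
print(cosine_similarity(emb_a, emb_c))  # 0.0
```

Because cosine similarity ignores vector magnitude, two embeddings that point in the same direction score as identical even when their norms differ.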

Implementing Image Similarity in Python

Below is an example of calculating image similarity using two methods: SSIM and a pre-trained deep learning model.

You can download the reference images from here:

Python

# Import necessary libraries
import cv2
import numpy as np
from skimage.metrics import structural_similarity as ssim
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.preprocessing import image
from tensorflow.keras.models import Model
from scipy.spatial.distance import cosine

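# Function to calculate SSIM
def calculate_ssim(imageA, imageB):
    # SSIM requires equal dimensions, so resize imageB to match imageA if needed
    if imageA.shape[:2] != imageB.shape[:2]:
        imageB = cv2.resize(imageB, (imageA.shape[1], imageA.shape[0]))

    # Convert images to grayscale
    grayA = cv2.cvtColor(imageA, cv2.COLOR_BGR2GRAY)
    grayB = cv2.cvtColor(imageB, cv2.COLOR_BGR2GRAY)

    # Compute the mean SSIM between the two 8-bit grayscale images
    score = ssim(grayA, grayB)
    return score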

# Function to extract features using VGG16 and calculate cosine similarity
def calculate_feature_similarity(image_path1, image_path2):
    # Load pre-trained VGG16 model
    base_model = VGG16(weights='imagenet', include_top=False)
    model = Model(inputs=base_model.input, outputs=base_model.get_layer('block5_pool').output)
    
    # Function to preprocess image for VGG16
    def preprocess_image(image_path):
        img = image.load_img(image_path, target_size=(224, 224))
        img_data = image.img_to_array(img)
        img_data = np.expand_dims(img_data, axis=0)
        img_data = preprocess_input(img_data)
        return img_data
    
    # Preprocess the two images
    img1 = preprocess_image(image_path1)
    img2 = preprocess_image(image_path2)
    
    # Extract features
    features_img1 = model.predict(img1)
    features_img2 = model.predict(img2)
    
    # Flatten features and calculate cosine similarity
    features_img1 = features_img1.flatten()
    features_img2 = features_img2.flatten()
    similarity = 1 - cosine(features_img1, features_img2)
    
    return similarity

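# Load sample images (cv2.imread returns None instead of raising if a file cannot be read)
imageA = cv2.imread('/content/ref image 2.webp')
imageB = cv2.imread('/content/ref image1.webp')
if imageA is None or imageB is None:
    raise FileNotFoundError("Could not read one or both reference images")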

# Calculate SSIM
ssim_score = calculate_ssim(imageA, imageB)
print(f"SSIM score: {ssim_score}")

# Calculate feature-based similarity using VGG16
feature_similarity = calculate_feature_similarity('/content/ref image 2.webp', '/content/ref image1.webp')
print(f"Feature-based similarity (Cosine similarity): {feature_similarity}")

Output:

SSIM score: 0.3995825443924526

58889256/58889256 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 905ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 543ms/step
Feature-based similarity (Cosine similarity): 0.37475860832700925

Both the SSIM score (0.3996) and the cosine similarity score (0.3748) are well below 1, the value each metric gives for identical images, which suggests that the two images are fairly dissimilar. The two numbers are not on a common scale, so their closeness should not be over-interpreted: SSIM measures agreement in local structure, while the VGG16-based score compares higher-level learned features. Here both approaches point to the same conclusion.

Conclusion

Image similarity is a critical task in numerous real-world applications, from identifying duplicate images to building recommendation systems. Various methods, such as pixel-based comparisons, histogram analysis, feature extraction, and deep learning-based techniques, offer different advantages depending on the use case. The implementation provided in this article showcases how structural and feature-based methods can be applied to measure image similarity in Python.