Image Difference Finder
Install
pip install difference-finder
🛎️ This is an ongoing project. There will be several changes, but I intend to maintain the core idea.
Motivation
With the extensive exploration of machine learning techniques, the quality of reconstructed images has reached a saturation point. Consequently, comparing baseline methods has become challenging: quantitative evaluations may indicate marginal improvements, yet demonstrating qualitative enhancements has been difficult (in my experience).
Hence, I decided to build a Python library that can spot the points of difference between two images, named Difference-Finder.
Quick Start
from pathlib import Path
from difference_finder.finder import Finder
img1 = Path(<img_path_1>)
img2 = Path(<img_path_2>)
worker = Finder(pre_processor='identity',
                post_processor='identity',
                strategy='gradient',
                metric='psnr',
                device='cuda')
results = worker.run(img1, img2)
Expected Input/Output
Here, img1 and img2 can each be an image file path, a directory of images, or a PyTorch tensor.
File path: supported image formats are .png, .jpeg, and .jpg.
Directories: the two directories must contain the same number of image files.
PyTorch tensor: assumed to have the shape (H, W), (C, H, W), (H, W, C), (N, C, H, W), or (N, H, W, C).
Currently, the output is a numpy.ndarray of size (H, W). The difference map is averaged over the channel dimension, and only the first sample of the batch dimension is returned. If directories are given, the output is a list of numpy.ndarray whose length equals the number of files.
🚀 An update to the output shape is planned: using 'reduce_mean', getting multi-batch results, and getting other output types such as torch.Tensor.
Pipeline
In fact, finding differences between two images can readily be done via simple pixel-wise subtraction.
However, there is much room to improve the performance of the difference finder. For example, we can leverage either the high-frequency or the low-frequency components alone to enhance the differences. We can also visually enhance the resulting difference images by clamping pixel values. Moreover, one can utilize image quality metrics such as the Peak Signal-to-Noise Ratio (PSNR) or the Structural Similarity Index Measure (SSIM).
To sum up, I designed the following pipeline.
Basically, Difference-Finder consists of pre-processor, post-processor, strategy, and metric modules. The pre- and post-processors handle difference-map enhancement, while the strategy and metric modules measure the pixel-wise differences between the two given images. In the following sections, I describe each module briefly.
Pre-processor
Currently, the following modules are implemented. Note that the code snippets below are pseudocode illustrating the concepts, not the actual implementations.
Identity (name: identity)
def identity(x: torch.Tensor) -> torch.Tensor:
return x
Do nothing.
Default option
Normalization (name: normalize)
def normalization(x: torch.Tensor) -> torch.Tensor:
return (x-x.min())/(x.max()-x.min())
Normalizes pixel values to [0, 1].
High-pass filter (name: highpass_filter)
def highpass_filter(x: torch.Tensor, factor: float) -> torch.Tensor:
kspace = fft2d(x)
kspace = get_high_freq(kspace, factor)
return ifft2d(kspace)
Takes only the high-frequency components: sensitive to edges.
Fills the central points of k-space with zeros according to factor.
The default value of factor is 0.1, which means the central 30 pixels are filled with zeros for an image size of 300 pixels. The value of factor is fixed for now.
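Since get_high_freq is left abstract in the pseudocode above, here is a minimal runnable sketch of the idea. It assumes the zeroed region is a centered block whose width is factor times the image size; the library's actual masking may differ.

```python
import torch

def highpass_filter(x: torch.Tensor, factor: float = 0.1) -> torch.Tensor:
    h, w = x.shape[-2:]
    # Shift k-space so the low frequencies sit in the centre.
    k = torch.fft.fftshift(torch.fft.fft2(x))
    ch, cw = int(h * factor), int(w * factor)  # width of the zeroed centre
    top, left = (h - ch) // 2, (w - cw) // 2
    k[..., top:top + ch, left:left + cw] = 0   # drop the low frequencies
    return torch.fft.ifft2(torch.fft.ifftshift(k)).real
```

On a constant image, zeroing the center removes the DC component, so the output is (numerically) all zeros.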
Low-pass filter (name: lowpass_filter)
def lowpass_filter(x: torch.Tensor, factor: float) -> torch.Tensor:
kspace = fft2d(x)
kspace = get_low_freq(kspace, factor)
return ifft2d(kspace)
Takes only the low-frequency components: sensitive to pixel values.
Fills the boundary points of k-space with zeros according to factor.
The default value of factor is 0.9, which means the central 270 pixels are kept while the boundary 30 pixels are zeroed for an image size of 300 pixels. The value of factor is fixed for now.
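As with the high-pass filter, get_low_freq can be sketched with a centered mask in shifted k-space. This sketch assumes the kept region is a centered block of width factor times the image size; the library's actual masking may differ.

```python
import torch

def lowpass_filter(x: torch.Tensor, factor: float = 0.9) -> torch.Tensor:
    h, w = x.shape[-2:]
    k = torch.fft.fftshift(torch.fft.fft2(x))
    ch, cw = int(h * factor), int(w * factor)  # width of the kept centre
    mask = torch.zeros(h, w)
    top, left = (h - ch) // 2, (w - cw) // 2
    mask[top:top + ch, left:left + cw] = 1     # keep only the low frequencies
    return torch.fft.ifft2(torch.fft.ifftshift(k * mask)).real
```

A constant image has all of its energy at DC, which lies inside the kept region, so it passes through unchanged up to floating-point error.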
Post-processor
Currently, there are no variants of the post-processor, but more processors are planned. Please check the project repository.
Identity (name: identity)
def identity(x: torch.Tensor) -> torch.Tensor:
return x
Do nothing.
Default option
Strategy
The Strategy module is the core part for detecting the differences between two images. Currently, two strategies exist: difference and gradient.
Difference (name: difference)
The strategy difference is a simple computation of the pixel-wise absolute difference between the two images. This strategy works for many cases. When the strategy is set to difference, the finder ignores the given Metric option.
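In code, the difference strategy amounts to a one-liner. This is a sketch of the concept, not the library's implementation:

```python
import torch

def difference_strategy(img1: torch.Tensor, img2: torch.Tensor) -> torch.Tensor:
    # Pixel-wise absolute difference; the Metric option plays no role here.
    return (img1 - img2).abs()
```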
Gradient (name: gradient)
This strategy is somewhat different from a simple difference: it calculates the gradient of one input image with respect to the computed (scalar-valued) metric. Each pixel of the returned gradient indicates how much that pixel contributes to the metric. Hence, the metric function must be differentiable by the PyTorch autograd backend, and it is recommended to construct all metric functions from PyTorch operations. Note that some Metric modules are based on external libraries. This strategy is the default.
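The idea above can be sketched with PyTorch autograd. This is a conceptual sketch, not the library's implementation; the function and metric names here are illustrative:

```python
import torch

def gradient_strategy(img1, img2, metric):
    x = img1.clone().requires_grad_(True)
    score = metric(x, img2)   # scalar-valued, differentiable metric
    score.backward()
    return x.grad.abs()       # |d(metric)/d(pixel)|: each pixel's contribution

def mse(a, b):
    return ((a - b) ** 2).mean()
```

For MSE, the gradient is 2 * (img1 - img2) / N, so pixels that differ more from the reference contribute larger values to the map.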
Metric
The Metric module is used by the gradient strategy. You can implement your own image quality metric or use one of the pre-implemented metrics. The following metrics are implemented for now.
[x] MSE ('mse') : Mean Squared Error
[x] PSNR ('psnr') : Peak Signal-to-Noise Ratio
[x] SSIM ('ssim') : Structural Similarity Index Measure
[x] MS-SSIM ('ms-ssim') : Multi-scale SSIM
[x] LPIPS ('lpips') : i.e. perceptual loss
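To illustrate what "constructed from PyTorch operations" means for a custom metric, here is a minimal differentiable PSNR, assuming pixel values in [0, max_val]. The library's built-in metrics may be implemented differently:

```python
import torch

def psnr(pred: torch.Tensor, target: torch.Tensor, max_val: float = 1.0) -> torch.Tensor:
    # Built from pure PyTorch ops, so autograd can differentiate through it.
    mse = ((pred - target) ** 2).mean()
    return 10 * torch.log10(max_val ** 2 / mse)
```

For example, a uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01 and therefore a PSNR of 20 dB.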
Conclusion
See the example at the top of this page, where Strategy is set to gradient and Metric is set to psnr. You can readily find seven difference spots in the difference map that are quite hard to identify in the left two images with the naked eye.
However, this project can still be improved; there are many TODO items. My final goal is to use this project for the qualitative comparisons in computer vision papers. There will be some announcements for updates, perhaps in my next free time.
If you think the project is cool, please leave a star on the repository. If you have any useful ideas, please leave them in the issues. All contributions are welcome. Thanks.