Anti-Twin ... and the Little Difference

Similar file content
Pixel-based image comparison

Similar File Content (binary)

If you want Anti-Twin to search not only for full duplicates but also for similar files, you can lower the desired minimum match from the default value of 100% to as little as 60%. This function was designed specifically for finding almost identical files in which only a tiny detail has been changed. Anti-Twin uses the similarity search as soon as you enter a value below 100%. Note that the similarity comparison takes much longer than the 100% full-duplicate search!
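
As a rough illustration of what a minimum match below 100% means, the following Python sketch computes a simple positional byte match between two byte strings and checks it against a threshold. This is only an assumption about how such a percentage could be calculated, not Anti-Twin's actual algorithm; the names byte_similarity, is_match and minimum_match are hypothetical.

```python
def byte_similarity(a: bytes, b: bytes) -> float:
    """Percentage of positions holding the same byte, relative to the longer input."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty inputs are trivially identical
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / longest


def is_match(a: bytes, b: bytes, minimum_match: float = 100.0) -> bool:
    """True if the two inputs reach the desired minimum match (in percent)."""
    return byte_similarity(a, b) >= minimum_match


original = b"Invoice 2023-01: total 100 EUR"
edited = b"Invoice 2023-01: total 900 EUR"  # one byte changed

print(f"{byte_similarity(original, edited):.1f}% match")
print("listed at 100%:", is_match(original, edited, 100.0))
print("listed at  90%:", is_match(original, edited, 90.0))
```

With the default of 100% the edited copy would not be listed; with a lower minimum match it would.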

Unfortunately, the similarity search in the byte-by-byte comparison only makes sense for a few file types, because a similarity can only be detected if the files are uncompressed and unencrypted. Examples of uncompressed files are unformatted text (.TXT) and HTML.
A similarity comparison of MP3 files (which are compressed) only makes sense if the ID3 tag (the embedded song title and artist) was changed in an otherwise identical copy of the file. The similarity search will not work for MP3 files that contain the same song but differ in, for example, sampling rate, volume or length.
Office documents such as Word/Writer or Excel/Calc files also change their binary content substantially as soon as a single letter is changed in the document. Here, too, Anti-Twin has hardly any chance of detecting a similarity.
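
Why compression defeats a byte-by-byte similarity search can be reproduced with a small Python sketch. It uses zlib as a stand-in for any compressed file format and a simple positional byte match as the similarity measure; both are assumptions made for illustration, and positional_match is a hypothetical helper, not part of Anti-Twin.

```python
import zlib


def positional_match(a: bytes, b: bytes) -> float:
    """Percentage of positions with equal bytes, relative to the longer input."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * sum(x == y for x, y in zip(a, b)) / longest


# Build a plain-text "document" and a copy in which a single letter differs.
plain_a = "".join(f"Line {i}: value {i * i}\n" for i in range(500)).encode("ascii")
plain_b = plain_a.replace(b"Line 10:", b"Lime 10:", 1)

# Compress both versions, as a compressed file format would do when saving.
packed_a = zlib.compress(plain_a)
packed_b = zlib.compress(plain_b)

raw_match = positional_match(plain_a, plain_b)
packed_match = positional_match(packed_a, packed_b)

print(f"uncompressed: {raw_match:.2f}% match")
print(f"compressed:   {packed_match:.2f}% match")
```

The uncompressed texts differ in exactly one byte, while the compressed streams typically diverge from the point of the change onwards, so their match is lower.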

Warning (particularly for the similarity comparison)
Before deleting similar files, please check them even more thoroughly than you would anyway. The similarity comparison lists files for deletion that are not completely identical, i.e. that are not real duplicates/copies.


Pixel-based Image Comparison

Select the option “Compare images” to search for similar pictures. Anti-Twin will then open and decompress every image and compare the pixels of all images with one another.

By contrast, the ordinary byte-by-byte comparison does not treat the files as images; it reads them without any interpretation, as a plain sequence of bytes. With that method, Anti-Twin cannot recognize two pictures as similar if the same picture has been saved, for example, with different compression rates (image quality) or in different file formats (JPG, GIF, BMP, TIFF etc.). This is only possible with the pixel comparison!

System requirements: To read images from different image file formats, Anti-Twin requires the graphics library GDI+, which is a standard component of Windows XP. For older Windows versions, you can download GDI+.

Comparison method: For technical reasons, the image comparison had to be programmed to detect only a fuzzy similarity. Because of different image formats and compression settings, pixels always differ slightly, even when no difference is intended (e.g. through color reduction or compression artifacts). Anti-Twin is therefore deliberately inaccurate, and even color-blind, when comparing pixels. In addition, Anti-Twin ignores the file sizes of the images as well as their height and width in pixels. Unfortunately, this makes the displayed results less reliable.
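
The idea of a fuzzy, color-blind comparison that ignores image dimensions can be sketched in Python as follows. This is not Anti-Twin's implementation, whose details are not given here. It is a minimal illustration that assumes the images are already decoded into rows of (red, green, blue) tuples; GRID, TOLERANCE and all function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255
Image = List[List[Pixel]]      # rows of pixels, already decoded

GRID = 8         # hypothetical: every image is reduced to GRID x GRID cells
TOLERANCE = 16   # hypothetical: allowed brightness difference per cell


def to_gray_grid(image: Image, grid: int = GRID) -> List[int]:
    """Reduce an image of any size to grid x grid average brightness values."""
    height, width = len(image), len(image[0])
    cells = []
    for gy in range(grid):
        y0 = gy * height // grid
        y1 = max(y0 + 1, (gy + 1) * height // grid)
        for gx in range(grid):
            x0 = gx * width // grid
            x1 = max(x0 + 1, (gx + 1) * width // grid)
            total = 0
            for y in range(y0, y1):
                for x in range(x0, x1):
                    r, g, b = image[y][x]
                    total += (r + g + b) // 3  # color is discarded here
            cells.append(total // ((y1 - y0) * (x1 - x0)))
    return cells


def image_similarity(a: Image, b: Image, tolerance: int = TOLERANCE) -> float:
    """Percentage of grid cells whose brightness differs by at most `tolerance`."""
    grid_a, grid_b = to_gray_grid(a), to_gray_grid(b)
    close = sum(abs(p - q) <= tolerance for p, q in zip(grid_a, grid_b))
    return 100.0 * close / len(grid_a)


def gradient(width: int, height: int, offset: int = 0, reverse: bool = False) -> Image:
    """Test image: horizontal gray gradient, optionally brightened or reversed."""
    rows = []
    for _ in range(height):
        row = []
        for x in range(width):
            level = 255 * x // (width - 1)
            if reverse:
                level = 255 - level
            level = min(255, level + offset)
            row.append((level, level, level))
        rows.append(row)
    return rows


def solid(width: int, height: int, color: Pixel) -> Image:
    """Test image filled with a single color."""
    return [[color] * width for _ in range(height)]


small = gradient(32, 32)
large = gradient(64, 48, offset=4)        # other size, slightly brighter
mirrored = gradient(32, 32, reverse=True)

print("same picture, other size:", image_similarity(small, large))
print("mirrored gradient:       ", image_similarity(small, mirrored))
print("red vs. green:           ",
      image_similarity(solid(16, 16, (200, 0, 0)), solid(16, 16, (0, 200, 0))))
```

The last comparison shows the price of such an approach: a plain red and a plain green image of equal brightness are reported as a full match, which is the kind of false positive that makes a manual check necessary.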

Warning (for image search)
Because of the fuzzy comparison described above and the resulting lack of reliability, you should be very careful when deleting similar image files. Please check the displayed results manually before deleting any files, i.e. by opening every listed image.



