Duplicate Photo Scanner

Add multiple images to find duplicates and near-duplicates using perceptual hashing. Detects copies even after resizing, recompression, slight cropping, or minor edits.

Supported formats: JPG, PNG, WebP, GIF, BMP, AVIF. Up to 50 images per scan • Compared in browser • Nothing uploaded

Perceptual Hashing

Uses the dHash algorithm to create visual fingerprints that survive resizing, compression, and minor edits.

Smart Grouping

Automatically clusters similar images into groups using Hamming distance with adjustable sensitivity.

Fast Batch Processing

Compare up to 50 images at once. Hashing and comparison happen in milliseconds per image.

100% Private

All hashing and comparison runs locally in your browser. Your images are never uploaded to any server.

How perceptual hashing finds duplicates

The Duplicate Photo Scanner uses the dHash (difference hash) algorithm to create a compact 64-bit visual fingerprint for each image. First, the image is resized to 9×8 pixels (9 wide, 8 tall) and converted to grayscale, removing color information and size differences. Then the algorithm compares each pixel to its right neighbor: if the left pixel is brighter, it records a 1; otherwise, a 0. Each of the 8 rows yields 8 comparisons, so the result is a 64-bit binary hash that captures the image's horizontal gradient structure. Two images with similar visual content will produce similar hashes regardless of resolution, file format, slight cropping, or JPEG recompression. For a detailed side-by-side comparison of two specific images, the Similarity Scanner uses a different approach.
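The steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the scanner's actual code: the function names `shrink` and `dhash` are hypothetical, the input is assumed to be an already-decoded grayscale image stored as a list of rows, and the box-average downscale is one plausible choice of resize filter among several.

```python
def shrink(gray, src_w, src_h, dst_w=9, dst_h=8):
    """Box-average a row-major grayscale image down to dst_w x dst_h.

    Assumption: the real tool may use a different resize filter.
    """
    out = []
    for y in range(dst_h):
        y0 = y * src_h // dst_h
        y1 = max(y0 + 1, (y + 1) * src_h // dst_h)
        row = []
        for x in range(dst_w):
            x0 = x * src_w // dst_w
            x1 = max(x0 + 1, (x + 1) * src_w // dst_w)
            block = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def dhash(grid):
    """Build a 64-bit integer hash from a 9-wide, 8-tall grayscale grid.

    Each row contributes 8 bits: 1 if a pixel is brighter than its
    right neighbor, otherwise 0.
    """
    bits = 0
    for row in grid:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


# Usage: the same left-to-right gradient at two resolutions.
small = [[255 - 10 * i for i in range(18)] for _ in range(16)]
large = [[255 - 5 * i for i in range(36)] for _ in range(32)]
hash_small = dhash(shrink(small, 18, 16))
hash_large = dhash(shrink(large, 36, 32))
```

In this synthetic example both resolutions reduce to the same 9×8 gradient, so the two hashes are equal. Real photos rarely match bit for bit, which is why the scanner compares hashes by distance rather than equality.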

When to use this tool

Photo deduplication is essential for managing large image libraries and freeing up storage space. Use this scanner to clean up your camera roll after a shoot, identify reposted or stolen images across collections, or find near-duplicate copies that differ only in resolution or compression. Photographers use it before archiving to remove burst-mode duplicates. Forensic investigators use it to find the same image distributed across multiple folders or devices. Content moderators use perceptual hashing to detect reposts. For verifying whether a specific image has been tampered with rather than copied, try the Authenticity Checker or ELA Scanner.

Understanding the sensitivity slider

The sensitivity slider controls the Hamming distance threshold — the maximum number of differing bits between two 64-bit hashes for them to be considered duplicates. At sensitivity 1–3, only near-exact copies are detected (same photo resaved or slightly recompressed). At 5–10, the scanner catches resized versions, minor crops, and light exposure adjustments. At 15–20, even significantly edited versions of the same scene may be grouped together. Start at the default (10) and adjust based on your needs. For pixel-perfect comparison of two specific images, the Compare Photos tool provides overlay and slider views.
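As a rough sketch of the threshold test described above: the Hamming distance between two hashes is the number of 1 bits in their XOR. The names `hamming_distance` and `is_duplicate` are illustrative, not taken from the tool, and hashes are assumed to be stored as plain integers.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_duplicate(a: int, b: int, threshold: int = 10) -> bool:
    """Treat two hashes as duplicates when at most `threshold` bits differ.

    The default of 10 mirrors the slider default mentioned in the text.
    """
    return hamming_distance(a, b) <= threshold


# Usage: these two values differ in exactly two bit positions.
print(hamming_distance(0b1011, 0b0010))  # 2
print(is_duplicate(0b1011, 0b0010, threshold=3))  # True
```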

Beyond duplicate detection

Perceptual hashing is just one layer of image analysis. Once you've identified duplicate groups, use the EXIF Checker to see which copy has the richest metadata — often the original. Check Privacy Score to ensure copies don't leak GPS coordinates or camera serial numbers. Run the File Hash Scanner on suspected duplicates — identical cryptographic hashes confirm byte-for-byte copies, while perceptual hashes catch visual copies that differ at the binary level. For batch metadata analysis across all your images, the Batch Scanner processes up to 20 files at once.

The dHash algorithm is designed for fast, robust image fingerprinting at scale. Unlike cryptographic hashes (MD5, SHA-256) that change completely if a single byte differs, perceptual hashes capture visual similarity: a JPEG saved at quality 80 and quality 95 will produce nearly identical dHash values even though their file contents are completely different. This makes perceptual hashing a standard approach for image deduplication in photo management software, reverse image search engines, and content moderation systems. Because each hash is only 64 bits, thousands of image pairs can be compared in milliseconds. Combined with the Quality Analyzer to identify which duplicate has the highest fidelity and the EXIF Remover to strip metadata before sharing the keeper, the Duplicate Scanner fits naturally into a complete photo management workflow.
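To illustrate the contrast with cryptographic hashes, the sketch below uses Python's standard `hashlib` to hash two byte strings that differ in a single byte. The byte strings are made-up stand-ins for file contents, not real image data, and `differing_bits` is a hypothetical helper.

```python
import hashlib

# Two stand-in "files" that differ only in their final byte.
file_a = b"example image bytes" + bytes([0x00])
file_b = b"example image bytes" + bytes([0x01])

digest_a = hashlib.sha256(file_a).digest()
digest_b = hashlib.sha256(file_b).digest()


def differing_bits(x: bytes, y: bytes) -> int:
    """Count differing bits between two equal-length byte strings."""
    return sum(bin(p ^ q).count("1") for p, q in zip(x, y))


print(digest_a == digest_b)  # False: any byte change yields a different digest
print(differing_bits(digest_a, digest_b), "of 256 bits differ")
```

A one-byte change typically flips a large share of the digest's bits, so the distance between cryptographic hashes says nothing about how similar the inputs are. A perceptual hash is built so that small visual changes move only a few bits.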

Frequently Asked Questions

What is perceptual hashing and how does dHash work?

Perceptual hashing creates a compact fingerprint based on an image's visual content rather than its raw file data. The dHash (difference hash) algorithm resizes each image to 9×8 pixels in grayscale, then compares adjacent horizontal pixels to produce a 64-bit binary code. Two visually similar images produce similar hashes even if they differ in resolution, file format, or compression level. The Hamming distance (number of differing bits) between two hashes indicates how visually similar they are.

Can it detect duplicates after resizing or compression?

Yes. Because dHash works on a tiny 9×8 representation and captures gradient patterns rather than exact pixel values, it reliably detects duplicates that have been resized to different resolutions, saved at different JPEG quality levels, converted between formats (JPG to PNG, for example), or slightly cropped. However, significant edits like heavy cropping, rotation, or color manipulation may produce hashes different enough to fall outside the default threshold.

What does the sensitivity slider control?

The sensitivity slider sets the maximum Hamming distance between two 64-bit hashes for images to be grouped as duplicates. A value of 0 means the hashes must be identical (the fingerprints match bit for bit). A value of 10 (the default) allows up to 10 differing bits out of 64, catching most resize and recompression variants. Higher values (15–20) detect more loosely related images, such as similar compositions with different exposure settings. Lower values (1–3) restrict matches to near-exact copies only.

How many images can I compare at once?

You can add up to 50 images in a single scan. The scanner computes a perceptual hash for each image and then compares all pairs, which means 50 images produce 50 × 49 / 2 = 1,225 pairwise comparisons. Because dHash computation takes only milliseconds per image and Hamming distance comparison is a simple bit operation, the entire process completes in seconds even for the maximum batch size. All processing happens in your browser, so no server uploads are required.
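The all-pairs comparison described above can be sketched as follows. How the scanner actually forms its groups is not stated, so this example assumes simple single-link clustering with a union-find structure: any two images within the threshold end up in the same group. The name `group_duplicates` and the sample hash values are hypothetical.

```python
from itertools import combinations


def group_duplicates(hashes, threshold=10):
    """Group image indices whose hashes are within `threshold` bits.

    Assumption: single-link clustering, so A~B and B~C puts A, B, C together.
    Returns only groups with two or more members.
    """
    parent = list(range(len(hashes)))

    def find(i):
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(hashes)), 2):
        if bin(hashes[i] ^ hashes[j]).count("1") <= threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(hashes)):
        groups.setdefault(find(i), []).append(i)
    return [members for members in groups.values() if len(members) > 1]


# Usage: 50 images yield 1,225 unique pairs.
pairs_for_50 = sum(1 for _ in combinations(range(50), 2))

sample = [0x0, 0x1, 0xFFFFFFFFFFFFFFFF, 0xFFFFFFFFFFFFFFFE, 0x00000000FFFFFFFF]
print(pairs_for_50)              # 1225
print(group_duplicates(sample))  # [[0, 1], [2, 3]]
```

One consequence of single-link grouping is that a chain of near-matches can pull fairly different images into one group at high thresholds, which is one reason to lower the sensitivity when groups look too broad.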

Is this different from the Similarity Scanner?

Yes, they serve different purposes. The Duplicate Photo Scanner is designed for batch deduplication — upload many images and automatically group the ones that look alike. The Similarity Scanner is designed for detailed two-image comparison — upload two specific images and get pixel-level difference visualization, structural similarity scores, and overlay views. Use the Duplicate Scanner to find which images match, then use the Similarity Scanner to examine specific pairs in detail.

Related tools: Similarity Scanner for detailed two-image comparison • Compare Photos for overlay views • File Hash Scanner for byte-level identity