I built a CLI tool in Rust to find and remove duplicate files.
At work I sometimes download the same file several times, then have to go through my downloads folder and remove the duplicates by hand, so I built this to speed that up.
https://github.com/jconvery1/hydra
I've found it quite useful.
I built a CLI years ago for the same purpose.
From what I can tell, your program treats files as duplicates if they share the same normalized filename and the exact same size; it doesn’t compare contents or hashes.
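For anyone curious what that name-plus-size check might look like, here is a minimal Rust sketch. It is not hydra's actual code: `normalize_name` and `group_candidates` are hypothetical names, and the normalization rule (lowercase, drop a trailing " (N)" copy counter) is only an assumption about what "normalized" could mean. Sizes are passed in directly to keep the example filesystem-free.

```rust
use std::collections::HashMap;

/// Hypothetical normalization (an assumption, not hydra's rule): lowercase
/// the name and drop a trailing " (N)" copy counter from the stem,
/// so "Report (1).pdf" becomes "report.pdf".
fn normalize_name(name: &str) -> String {
    let lower = name.to_lowercase();
    // Split into stem and extension at the last dot, if there is one.
    let (stem, ext) = match lower.rfind('.') {
        Some(i) if i > 0 => (&lower[..i], &lower[i..]),
        _ => (lower.as_str(), ""),
    };
    let stem = stem.trim_end();
    let cleaned = match stem.rfind(" (") {
        Some(i) if stem.ends_with(')') => {
            let digits = &stem[i + 2..stem.len() - 1];
            if !digits.is_empty() && digits.chars().all(|c| c.is_ascii_digit()) {
                &stem[..i]
            } else {
                stem
            }
        }
        _ => stem,
    };
    format!("{cleaned}{ext}")
}

/// Group (file name, size in bytes) pairs by normalized name and exact size.
/// Only groups with more than one member are returned. Contents are never read.
fn group_candidates(files: &[(String, u64)]) -> Vec<Vec<String>> {
    let mut groups: HashMap<(String, u64), Vec<String>> = HashMap::new();
    for (name, size) in files {
        groups
            .entry((normalize_name(name), *size))
            .or_default()
            .push(name.clone());
    }
    let mut dupes: Vec<Vec<String>> =
        groups.into_values().filter(|g| g.len() > 1).collect();
    dupes.sort(); // HashMap order is arbitrary, so sort for stable output
    dupes
}
```

In a real tool the sizes would come from something like `std::fs::metadata(path)?.len()`. The trade-off is speed versus certainty: no file is ever opened, but two different files with matching names and sizes would still be grouped together.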
Mine samples bytes at specific positions, hashes those samples, and compares the hashes to produce a similarity score rather than a strict match. This works well for photos: two shots taken in the same second can differ slightly in pixels but still depict the same scene, so they’re treated as duplicates. It also normalizes image orientation by rotating based on the brightest corner, so photos in different orientations are compared using the same features.
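A rough Rust sketch of the sample-and-hash idea, under some loud assumptions: the evenly spaced offsets, the window size, the use of the standard library's `DefaultHasher`, and the "fraction of matching samples" score are all illustrative choices, not the tool's real ones. `sample_hashes` and `similarity` are hypothetical names, and the orientation step is left out.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash `count` windows of `window` bytes taken at evenly spaced offsets.
/// The spacing and window size are assumptions made for illustration.
fn sample_hashes(data: &[u8], count: usize, window: usize) -> Vec<u64> {
    if data.is_empty() || count == 0 || window == 0 {
        return Vec::new();
    }
    let window = window.min(data.len());
    let last_start = data.len() - window;
    (0..count)
        .map(|i| {
            // Spread the window starts from 0 to last_start inclusive.
            let start = if count == 1 { 0 } else { last_start * i / (count - 1) };
            let mut hasher = DefaultHasher::new();
            data[start..start + window].hash(&mut hasher);
            hasher.finish()
        })
        .collect()
}

/// Fraction of sample positions whose hashes match, from 0.0 to 1.0.
/// Lists of different lengths are treated as not comparable (score 0.0).
fn similarity(a: &[u64], b: &[u64]) -> f64 {
    if a.is_empty() || a.len() != b.len() {
        return 0.0;
    }
    let matching = a.iter().zip(b).filter(|(x, y)| x == y).count();
    matching as f64 / a.len() as f64
}
```

A caller would pick a threshold, say treating anything above 0.8 as a likely duplicate (the cutoff is arbitrary here). Because each sample is hashed exactly, a sample only matches when its bytes are identical, so the score measures how many sampled regions are byte-for-byte the same.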