DupeSorter

  • April 11, 2006: Version 0.45: Added content comparison
  • January 20, 2006: Version 0.40: Initial Release

This tool scans the directories you specify for files with duplicate filenames, and then checks which of those files also have identical contents. The scan runs in two phases. The first phase keeps a tally of how many times each filename occurs, and that tally is then trimmed to the filenames that occur more than once. The second phase searches for the paths of those files and computes a hash of their contents. Finally, files with unique content are trimmed from the list. Working in this order keeps memory usage low.
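
The two phases described above could be sketched roughly as follows. This is only an illustration and not DupeSorter's actual source: the class name DupeSketch, the method names, the choice of SHA-256 as the hash, and the grouping of files by filename plus hash are all assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch; names and hash algorithm are assumptions, not DupeSorter's code.
final class DupeSketch {

    /** Returns groups of paths that share both a filename and a content hash. */
    static Map<String, List<Path>> findDuplicates(List<Path> roots)
            throws IOException, NoSuchAlgorithmException {
        // Phase 1: tally how often each bare filename occurs (paths are not stored).
        Map<String, Integer> counts = new HashMap<>();
        for (Path root : roots) {
            try (Stream<Path> walk = Files.walk(root)) {
                walk.filter(Files::isRegularFile)
                    .forEach(p -> counts.merge(p.getFileName().toString(), 1, Integer::sum));
            }
        }
        // Trim the tally to filenames that occur more than once.
        counts.values().removeIf(n -> n < 2);

        // Phase 2: collect paths for the remaining filenames and hash their contents.
        Map<String, List<Path>> groups = new HashMap<>();
        for (Path root : roots) {
            List<Path> candidates;
            try (Stream<Path> walk = Files.walk(root)) {
                candidates = walk.filter(Files::isRegularFile)
                    .filter(p -> counts.containsKey(p.getFileName().toString()))
                    .collect(Collectors.toList());
            }
            for (Path p : candidates) {
                // Assumed grouping key: filename plus content hash.
                String key = p.getFileName() + ":" + sha256(p);
                groups.computeIfAbsent(key, k -> new ArrayList<>()).add(p);
            }
        }
        // Drop files whose content is unique among same-named files.
        groups.values().removeIf(g -> g.size() < 2);
        return groups;
    }

    /** Hashes a file's contents with SHA-256 and returns the digest as hex. */
    static String sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

In this sketch the first pass stores only one counter per distinct filename, and full paths and hashes are kept only for files whose names repeat, which is one way the two-phase order can keep memory usage low.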

This tool runs on Java, so it should run on pretty much any recent platform, such as Windows, OS X, UN*X variants, etc. You'll need the JRE to use this app (you could first try running dupesorter.jar to see if it already works). To get the JRE, go to java.com.

Note: This tool lists duplicate files by comparing hash codes (that is, the file contents are chopped down to a much smaller chunk of numbers), so two different files could in rare cases produce the same hash. Do not remove files when you don't know what they're used for. This tool is best used when reorganizing your personal folders after backups, or when re-merging work that happens to be hard to compare by filename alone.

Another Note: This isn't like an ordinary search tool, as it must keep track of every filename it has already seen, so searching through more files will take more memory. The memory readings during the search process will tend to be in the 90s in any case, so that is nothing to worry about. If you want, you can cap the amount of memory available to the Java runtime (look up "-Xmx" when running "java" at the command line).

Depending on your system, you may be able to run this program by simply double-clicking the JAR file.


Copyright © 2000-2005 Edward L. Blake