Duplicate finding & removing

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

Maulz
Posts: 3
Joined: 26 Mar 2013 13:00

#1 Post by Maulz » 01 Apr 2013 06:20

Hi everybody! How can I find duplicate files among more than 7500 *.epub files using a batch file?
There are no exactly matching names (such as

Never Go Back - Charles DeVet.epub
Never Go Back - Charles DeVet.epub)

But there are a lot of

1984.epub
J.Orwell - 1984.epub
Jeorge Orwell - 1984.epub

(These three are all the same book, of course; the dates and sizes of the files may differ.)
I couldn't find a "duplicate remover" tool for this, because they only find 100% matches. Please help!

foxidrive
Expert
Posts: 6031
Joined: 10 Feb 2012 02:20

Re: Duplicate finding & removing

#2 Post by foxidrive » 01 Apr 2013 08:00

A batch file relies on rules that are derived from the makeup of the source files/text.
Your source file names are quite random in nature, so they aren't a good candidate for a simple batch file.

The best you could hope for is to match the title stored inside the epub files and produce a list of those that match. You can then manually work out which copy to keep from each set of files.
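
As a rough illustration of that idea, here is a minimal sketch in Python rather than batch (reading zip archives and XML is awkward in a plain batch file). It assumes each epub is a normal zip archive with a META-INF/container.xml that points to a package file containing a dc:title element. The function names epub_title, normalize and group_by_title are made up for this example. It only lists candidate groups and deletes nothing, and it will miss copies whose embedded titles are spelled differently.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import defaultdict
from pathlib import Path

# XML namespaces used by the EPUB container file and by Dublin Core metadata.
CONTAINER_NS = "{urn:oasis:names:tc:opendocument:xmlns:container}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def epub_title(path):
    """Return the first dc:title found in the epub's package file, or None."""
    try:
        with zipfile.ZipFile(path) as zf:
            # container.xml names the package (.opf) file inside the archive.
            container = ET.fromstring(zf.read("META-INF/container.xml"))
            rootfile = container.find(f".//{CONTAINER_NS}rootfile")
            if rootfile is None or not rootfile.get("full-path"):
                return None
            package = ET.fromstring(zf.read(rootfile.get("full-path")))
            title = package.find(f".//{DC_NS}title")
            if title is None or not title.text:
                return None
            return title.text.strip()
    except (zipfile.BadZipFile, KeyError, ET.ParseError, OSError):
        # Not a readable epub: skip it rather than abort the whole scan.
        return None


def normalize(title):
    """Ignore case and repeated whitespace when comparing titles."""
    return " ".join(title.casefold().split())


def group_by_title(folder):
    """Map each normalized title to the files sharing it (2 or more only)."""
    groups = defaultdict(list)
    for path in sorted(Path(folder).rglob("*.epub")):
        title = epub_title(path)
        if title:
            groups[normalize(title)].append(path)
    return {t: paths for t, paths in groups.items() if len(paths) > 1}
```

Calling group_by_title on the folder that holds the books returns a dictionary of title to file list, which can be printed and reviewed by hand before anything is removed.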
