Detecting same file size and deleting

Posted: 10 Jul 2010 10:24
by nerd
I am trying to write a script that goes through the sub-folders of a main folder, detects files of the same size using "dir /os", and then deletes the duplicate files in each sub-folder. How do I combine these steps, using the IF command to delete a file when the file size is the same?

Re: Detecting same file size and deleting

Posted: 10 Jul 2010 11:41
by aGerman
Comparing and deleting files based on their size alone is not a good idea, because two different files can have the same size. You should use an MD5 tool.
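
A minimal sketch of the hash idea, assuming certutil.exe is available (it ships with recent Windows versions, but it is not mentioned elsewhere in this thread and may be missing on older systems). The file names a.bin and b.bin and the helper label :md5 are hypothetical. The helper keeps the first output line that contains no colon, which is where certutil prints the hash.

```batch
@echo off
setlocal
rem Hypothetical file names; replace them with two real files.
set "file1=a.bin"
set "file2=b.bin"

call :md5 "%file1%" hash1
call :md5 "%file2%" hash2

if not defined hash1 goto :failed
if not defined hash2 goto :failed
if "%hash1%"=="%hash2%" (
  echo Same MD5, so the contents almost certainly match.
) else (
  echo Different MD5, so the contents differ.
)
goto :eof

:failed
echo At least one file could not be hashed.
goto :eof

:md5
rem First argument is the file, second is the name of the variable that receives the hash.
rem Header, footer and error lines of certutil contain a colon, the hash line does not.
set "%~2="
for /f "delims=" %%h in ('certutil -hashfile "%~1" MD5 2^>nul ^| find /v ":"') do (
  if not defined %~2 set "%~2=%%h"
)
goto :eof
```

Two files of the same size can produce different hashes, which is exactly the case a pure size comparison would get wrong.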

I don't understand what you want to do. Where exactly are the files located, and which of the files found should be deleted?

Regards
aGerman

Re: Detecting same file size and deleting

Posted: 11 Jul 2010 00:00
by !k
nerd

Code:

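@echo off
echo Used RHash http://rhash.anz.ru/
setlocal enableextensions
set "folder=c:\del dups"

rem Hash every file under the folder (-H is SHA1, -r is recursive) and sort
rem the output so that files with equal hashes end up on adjacent lines.
set "hash="
for /f "tokens=1,*" %%a in (
'rhash.exe -H -r "%folder%" ^|sort'
) do call :d "%%a" "%%b"
goto :eof

:d
rem First argument is the hash, second is the quoted path.
rem Delete the file when its hash repeats the hash of the previous line.
if "%hash%" == "%~1" del /q %2
set "hash=%~1"
goto :eof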

Re: Detecting same file size and deleting

Posted: 11 Jul 2010 20:19
by nerd
Hi,

aGerman: I am grabbing some files online for a project, and somehow I end up with lots of duplicated files. These duplicates have the same file size, so I am trying to delete them while keeping the original file. I create a new folder each time I grab files. E.g. I create an "ABC" folder, grab the files and place them in it. Next, I create a "BCD" folder, grab the files and place them in that folder.

So the "ABC" folder has 2 duplicate files of 1111 KB each, and the "BCD" folder has 1 file of 1111 KB as well. The batch program should go through the "ABC" folder and delete the duplicate copy of 1111 KB, because it has detected the same file size within the same folder. It should not delete the 1111 KB file in the "BCD" folder. I have never tried MD5 before, so I will take a look at it.

!k: Thanks for the input, I will try that out!

Re: Detecting same file size and deleting

Posted: 12 Jul 2010 06:34
by miskox
I would suggest using

Code:

fc /b file1 file2 >nul 2>nul


for this.
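
A minimal sketch of how the result of fc /b could be evaluated, assuming the usual exit codes of fc (0 for identical files, 1 for different files, 2 when a file cannot be found). The file names a.bin and b.bin are hypothetical.

```batch
@echo off
setlocal
rem Hypothetical file names; replace them with two real files.
set "file1=a.bin"
set "file2=b.bin"

rem Binary comparison; all output is discarded, only the exit code is used.
fc /b "%file1%" "%file2%" >nul 2>nul

rem IF ERRORLEVEL n is true for n or higher, so test the highest value first.
if errorlevel 2 (
  echo Could not compare, a file is missing or unreadable.
) else if errorlevel 1 (
  echo The files differ.
) else (
  echo The files are identical.
)
```

Unlike a size check, this reads both files completely, so it is slower but does not confuse different files that happen to have the same size.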

I have a batch file that finds .pdf files in different (sub)folders and compares them with .pdf files located in ONE folder only. The files in the (sub)folders have different filenames than the ones in that ONE folder.
I write the list of files to a file and then do a binary compare on them.

I can post the batch program later/tomorrow, I don't have it here.

Saso

Re: Detecting same file size and deleting

Posted: 12 Jul 2010 11:57
by aGerman
If you are sure there are never two files with the same size but different contents, you could use this:

Code:

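@echo off &setlocal
rem Root folder; each of its immediate sub-folders is processed on its own.
set "rootfolder=c:\your data root"
pushd "%rootfolder%" ||goto :eof
for /d %%a in (*) do (
  set "subfolder=%%~fa"
  call :proc
)
popd
pause
goto :eof

:proc
rem Inside one sub-folder, keep the first file of each size and delete the others.
pushd "%subfolder%" ||goto :eof
for %%a in (*) do (
  for %%b in (*) do (
    rem Skip names that were already deleted earlier in this pass.
    if exist "%%a" if exist "%%b" if "%%a" neq "%%b" (
      if "%%~za"=="%%~zb" (
        del "%%b"
      )
    )
  )
)
popd
goto :eof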


BTW: !k's tool can also calculate the MD5 hash of files (option -M instead of -H).
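
If same-size files with different contents cannot be ruled out, the size check can be combined with a binary compare. A minimal sketch for a single sub-folder, assuming a hypothetical path c:\your data root\ABC; a file is deleted only when fc /b also reports no differences.

```batch
@echo off
setlocal
rem Hypothetical folder; replace it with one real sub-folder.
set "subfolder=c:\your data root\ABC"
pushd "%subfolder%" || goto :eof
for %%a in (*) do (
  for %%b in (*) do (
    if exist "%%a" if exist "%%b" if "%%a" neq "%%b" if "%%~za"=="%%~zb" (
      rem Same size: delete only if the byte-by-byte comparison also matches.
      fc /b "%%a" "%%b" >nul 2>nul && del "%%b"
    )
  )
)
popd
```

The size test stays as a cheap first filter, so fc only has to read pairs of files that could actually be duplicates.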

Regards
aGerman

Re: Detecting same file size and deleting

Posted: 14 Jul 2010 10:26
by nerd
Thanks aGerman :D