I needed a batch file to find duplicate text files, and DosTips helped me with one. Since then the number of files has grown to over 800, but the batch file seems to handle only up to 713:
https://i.imgur.com/2915yCb.png
Some of the generated files have a different size, and the script does not compare them.
Kindly help.
-----------------------------------------------------------------------------
@Echo Off
SetLocal
Set Prompt=$g$s
Rem I move versioned backups to GOOD and identical ones to the trash
Rem here: check for duplicate backup files
:: --------------------------------------------------------------------------
Rem Folder
PushD E:\12
:: --------------------------------------------------------------------------
Set Doubles=TRASH
Set Versions=GOOD
Echo Off
For %%i In (%Doubles% %Versions%) Do If Not Exist "%%i\" ( Md "%%i"
If ErrorLevel 1 Exit /B 1
)
For /F "Delims==" %%i In ('2^>Nul Set _') Do @ Set "%%i="
Set /A CountFiles=CountTrash=0
Rem Only files of the same size are compared
Rem Same files (errorlevel = 0) will be marked / moved / removed
Rem System Applicable to (compare / create) backup
For %%i In (*) Do ( SetLocal EnableDelayedExpansion
  For /F "UseBackQTokens=1,2*" %%a In ('!CountFiles! !CountTrash! !_%%~zi!') Do (
    EndLocal
    Set /a CountFiles+=1
    Set "EX="
    For %%d In ( %%c ) Do If Not Defined EX (
      >nul Fc "%%i" "%%~d" && (
        Echo Weg %%i
        >Nul Move "%%i" TRASH
        Set /a CountTrash+=1, CountFiles-=1
        Set /a Ex=1
      )
    )
    If Not Defined Ex Set _%%~zi= %%c "%%i"
    Title Files %%a Trash %%b
  )
)
Title Files %CountFiles% Trash %CountTrash% Done
For /F "Delims==" %%i In ('2^>Nul Set _') Do @ Set "%%i="
Pause
Exit /B
[Solved] hELP
Moderator: DosItHelp
-
- Posts: 18
- Joined: 23 Apr 2017 22:36
[Solved] hELP
Last edited by sajjansinghania on 03 Nov 2019 18:00, edited 1 time in total.
-
- Expert
- Posts: 1166
- Joined: 06 Sep 2013 21:28
- Location: Virginia, United States
Re: hELP
Get rid of that second setlocal enabledelayedexpansion and the following endlocal; you don't need them and that's what's breaking the script.
-
- Posts: 240
- Joined: 04 Mar 2014 11:14
- Location: germany
Re: hELP
Hello,
The variable _filesize becomes too long: the names of all files of the same size are collected in a single variable, one per size. This means that there are very many (about 600) files of the same size with different content. Each name is about 6 characters long. I would design the script to create a list per file size to compare against. However, when so many files have the same size, going through this list is no longer efficient, because every file in the list is checked against each new file. Is there another criterion in the file name that shows from the outset that the content differs?
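To check how many files share a size, and so how long each per-size list would become, something like the following could be run inside the folder. It is only a sketch and not part of the scripts in this thread; cnt_ is a variable prefix made up for this example. For context, cmd.exe limits a command line to 8191 characters, which also caps what a single Set command can store, so a variable holding several hundred quoted file names gets close to that limit.

```batch
@echo off
setlocal
rem Sketch: count how many files in the current folder share each size.
rem "cnt_" is a hypothetical variable prefix used only in this example.

rem remove any cnt_ variables left over in the environment
for /f "delims==" %%V in ('2^>nul set cnt_') do set "%%V="

rem %%~zF is the file size in bytes; an undefined counter starts at 0
for %%F in (*) do set /a "cnt_%%~zF+=1"

rem "set cnt_" prints lines like cnt_1234=5 ; split them into size and count
for /f "tokens=2,3 delims=_=" %%A in ('2^>nul set cnt_') do (
    if %%B gtr 1 echo %%B files with %%A bytes
)
endlocal
```

Sizes that are reported with a count of several hundred are the ones whose list variable is likely to overflow.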
-
- Posts: 18
- Joined: 23 Apr 2017 22:36
Re: hELP
ShadowThief wrote: ↑22 Oct 2019 05:56
Get rid of that second setlocal enabledelayedexpansion and following endlocal; you don't need them and that's what's breaking the script
Sir, I have no knowledge of coding batch files. I would be grateful if code from which I can create a batch file could be posted here.
pieh-ejdsch wrote: ↑22 Oct 2019 20:56
Hello,
The variable _filesize becomes too long: the names of all files of the same size are collected in a single variable, one per size. This means that there are very many (about 600) files of the same size with different content. Each name is about 6 characters long. I would design the script to create a list per file size to compare against. However, when so many files have the same size, going through this list is no longer efficient, because every file in the list is checked against each new file. Is there another criterion in the file name that shows from the outset that the content differs?
Regards.
-
- Posts: 240
- Joined: 04 Mar 2014 11:14
- Location: germany
Re: hELP
Could you please change the thread title so that it describes your problem? "Help" is not helpful ...
I'm not sure if this script meets your requirements.
If you want to compare such a large number of files of the same size, you will make much better progress with the file hash.
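A minimal sketch of the hash idea, which is not the script posted in this thread: it assumes certutil.exe is available (it ships with current Windows versions), that the batch file is run inside the folder to be checked, and that a matching SHA256 hash is accepted as proof of identical content. The names hash_, h, f and :check are made up for this example.

```batch
@echo off
setlocal
rem Sketch: move every file whose SHA256 hash was already seen to TRASH.
rem Assumption: certutil.exe is available on this system.
rem "hash_", "h", "f" and ":check" are hypothetical names for this example.
if not exist "TRASH\" md "TRASH"
for /f "delims==" %%V in ('2^>nul set hash_') do set "%%V="

for %%F in (*) do if /i not "%%~fF"=="%~f0" (
    set "f=%%F"
    set "h="
    rem the header and status lines of certutil contain a colon, the hash line does not
    for /f "delims=" %%H in ('certutil -hashfile "%%F" SHA256 ^| findstr /v ":"') do if not defined h set "h=%%H"
    call :check
)
exit /b

:check
rem no hash line means certutil failed for this file, so leave the file alone
if not defined h exit /b
rem older certutil versions separate the hex bytes with spaces
set "h=%h: =%"
if defined hash_%h% (
    echo duplicate: "%f%"
    >nul move "%f%" "TRASH"
) else set "hash_%h%=1"
exit /b
```

Each file is read only once, and only one short variable per distinct hash is stored, so no variable grows with the number of files. Anyone who does not want to rely on the hash alone could confirm a match with fc before moving the file.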
According to your thread from January, you wanted to identify newer versions of one or more files.
That is why I asked whether the version of a file can be read from its name.
In any case, I ran this script over more than 2.4 million files (18.5 MB in total), of which about 15 thousand had different content.
After one day (24h)!!!, 600,000 duplicates had been sorted out.
With an optional quick'n'dirty function, the script ran a bit faster.
But this does not need to be used.
Originally this script was only meant to sort out a small number (about 200) of identical files.
But whatever...
Phil
Code:
@echo off
setlocal
set prompt=$g$s
rem I move versioned backups to GOOD and identical ones to the trash
rem here: check for duplicate backup files
:: --------------------------------------------------------------------------
rem Folder
pushD "%~1"
:: --------------------------------------------------------------------------
set doubles=TRASH
set versions=GOOD
for %%i in (%doubles% %versions%) do if not exist "%%i\" ( md "%%i"
if errorlevel 1 exit /b 1
)
set /a countFiles=countTrash=vergleiche=0
2>nul (for /f "delims==" %%i in ('set _^& set #') do @ set "%%i=")
rem only files of the same size are compared with each other
rem identical files (errorlevel = 0) are marked/moved/removed
rem system applicable to (comparison/creation of) backups
set "sorted=%temp%\allfiles"
2>nul del "%sorted%" "%sorted%log"
echo please wait ... create filelist ...
rem The files are sorted by size and timestamp.
robocopy /L . ". only test ..\\" /njh /ts /ns /nc /np /ndl /njs /bytes /log:"%sorted%log"
:: NOTICE: /ndL noDirectoryListing still outputs the file path without the /fp switch
echo ... sort filelist ...
rem The /R switch preserves the latest files and removes old duplicates.
rem To keep the oldest files, the /R switch must be removed.
sort /R "%sorted%log" /o "%sorted%"
echo ... filelist sorted.
echo ... compare files ...
rem start a new list/variable once this many different files have been collected
rem this is necessary to avoid overflowing the variable
rem if there are more different files, reduce this value; it can also speed up the search
set "getinList=200"
:: rem These are just quick and dirty settings that speed up the search but do NOT COMPLETE it
rem tryLast == Compare only the most recently found unique files
rem set to 0 to compare all files OR set to 100 to compare only the last 100 files
set /a "tryLast=0"
rem Only the last (now) list/variable is used for comparison.
rem leave blank to use all lists, or set to 1 to skip the old lists
set "splitList="
if %tryLast% equ 0 (set "tryLast=") else set "tryLast=|| ( 2>nul set/ai/=comp%%%tryLast% || set NOTtry=1)"
if defined splitList (set "n in (l) =%%n in (%%l)") else set "n in (l) =/l %%n in (%%l -1 0)"
:: which variable contains what?
rem g == listSorted
rem h == fileSize ( bytes )
rem i == fileTime
rem j == fullpathfileName (a b c d ab ad)
rem #h == countSizeList 1 ...
rem l == #h
rem # == nextSizeList +1
rem m == #
rem n == actualSizeList
rem old == sizeNow
set "old=0"
for %%g in ("%sorted%") do for /f "usebackQ tokens=1,3*" %%h in ("%%~g") do ( set "EX="
  set "comp="
  set "NOTtry="
  setlocal enabledelayedexpansion
  if !old! neq %%h call :clean :clean
  set /a "#%%h +=0, # =#%%h +1"
  for /f "usebackQtokens=1-2" %%l in ('!#%%h! !#!') do (
    for %n in (l) %do ( if :!! neq : setlocal enabledelayedexpansion
      for /f "usebackQtokens=1-2*" %%a in ('!countFiles! !countTrash! !_%%h#%%n!') do (
        endlocal
        if NOT defined NOTtry if NOT defined EX for %%d in (%%c) do if NOT defined NOTtry if NOT defined EX (
          set /a comp+=1
          >nul fc /lb1 "%%~nxj" "%%~d" && (
            echo weg %%j
            >nul move "%%~nxj" TRASH
            set /a countTrash+=1, EX=1
          ) %tryLast%
        )
        if %%n == %%l if NOT defined EX ( set /a "countfiles+=1"
          2>nul set/ai/=countFiles%%%getinList% || set "i="
          if NOT defined i ( set _%%h#%%m="%%~nxj"
            set "#%%h=%%m"
          )
          if defined i ( set _%%h#%%n="%%~nxj" %%c
            set "#%%h=%%l"
          )
        )
        title Files %%a Trash %%b
      )
    )
  )
  set "old=%%h"
)
title Files %countFiles% trash %countTrash% done
:clean
rem here the lists for the different files are saved
2>nul md "%~dp0VersionLists"
for /f "delims=_#= tokens=1,2*" %%i in ('2^>nul set _') do >"%~dp0VersionLists\fc_%%i#%%j" echo %%k
if "%~1" == ":clean" exit /b
pause
2>nul del "%sorted%" "%sorted%log"
exit /b
-
- Posts: 18
- Joined: 23 Apr 2017 22:36
Re: hELP
At the very outset I must thank pieh-ejdsch, Sir, for his invaluable help. You have solved my two-year-old problem. I thank God that I found DosTips.com and you.
Sir, being a below-average user, I am unable to understand the suggestions you made.
Having tested this batch file, I am fully happy with how it works for me. At present the first matched file is moved to Trash. If it is possible and not a problem, I would prefer the subsequent file/files to be moved to Trash instead.
Thank you Sir, Regards