Discussion forum for all Windows batch related topics.
Moderator: DosItHelp
#16 by foxidrive » 19 Jun 2015 04:37
Expert, Posts: 6031, Joined: 10 Feb 2012 02:20
If you have one file called runme.bat that contains
and another file called runme.bat that contains
Then your comparison will consider them both the same, but only because the name and file size are the same.
What is inside those two files can be totally different.
#17 by mingolito » 19 Jun 2015 05:37
Posts: 28, Joined: 04 Dec 2014 11:34
OK... Then you should also compare the file signature ("hash") to determine whether a file in one folder is 100% identical to a file with the same name and size in another folder.
#18 by foxidrive » 19 Jun 2015 07:33
Yes. You need a third-party tool, or you can look at carlos' post from recent weeks for his native batch script to compare files.
viewtopic.php?f=3&t=6439
#19 by mingolito » 19 Jun 2015 08:56
Great job by carlos.
I did not understand it, but it only works with PDF files.
#20 by dbenham » 19 Jun 2015 08:59
Expert, Posts: 2461, Joined: 12 Feb 2011 21:02, Location: United States (east coast)
If a hash or checksum were precomputed and available, then that would be a good test. But Windows does not give you that. So the simplest thing to do is compare the files with FC. But don't use FC unless the name and size are identical. I would use something like the following to test whether files with the same name and size are truly identical:
Code:
fc /t /lb1 file1 file2 2>nul && echo files are identical
But even if the files are identical, there could be a good reason for both to exist.
Sure, there may be situations where you want to delete true duplicates. But I would never recommend that anyone blindly delete duplicates as you propose.
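A minimal sketch of the "check size first, then run FC" idea described above. The two paths are hypothetical placeholders, and FC's /B (binary) switch is swapped in here for the /T /LB1 form so that the comparison is byte-by-byte; that choice is an assumption, not the command posted above.

```batch
@echo off
setlocal

rem Hypothetical paths: replace with the two same-named files to compare.
set "file1=C:\folderA\runme.bat"
set "file2=C:\folderB\runme.bat"

rem %%~zF expands to the size of the file in bytes.
for %%F in ("%file1%") do set "size1=%%~zF"
for %%F in ("%file2%") do set "size2=%%~zF"

rem Different sizes mean the contents cannot match, so FC is skipped.
if not "%size1%"=="%size2%" (
  echo sizes differ, files are not identical
  exit /b 1
)

rem FC returns errorlevel 0 only when it finds no differences.
fc /b "%file1%" "%file2%" >nul 2>nul && echo files are identical || echo files differ
```

If either file is missing, FC returns a non-zero errorlevel and the sketch reports "files differ", so a real script would probably want an explicit "if exist" check first.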
Dave Benham
#21 by mingolito » 19 Jun 2015 09:16
dbenham wrote: But I would never recommend that anyone blindly delete duplicates as you propose.
Oh yes. In fact, thanks to the work of @foxidrive, we were able to create a backup folder to prevent permanent loss of files.
Prevention is better than cure...
#22 by npocmaka_ » 19 Jun 2015 09:39
Posts: 514, Joined: 24 Jun 2013 17:10, Location: Bulgaria
A simple way to generate a checksum/hash with certutil:
Code:
for /f "skip=1 delims=" %%# in ('CertUtil -hashfile "C:\somefile" MD5^|find /v "CertUtil: -hashfile command completed successfully."') do set "hashcode=%%#"
echo %hashcode: =%
You can use the following hash algorithms: MD2, MD4, MD5, SHA1, SHA256, SHA384, SHA512 (in uppercase).
Keep in mind that certutil is not installed by default on Windows 2003/XP.
MAKECAB also provides its own checksum algorithm -
this script can be used directly.
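As a rough sketch, the certutil technique above could be wrapped in a subroutine so that two files are compared by hash. The :hash label, the variable names, the paths, and the choice of SHA256 are all assumptions made for illustration; the filter on the word "certutil" assumes the status and error lines of certutil contain that word, as they do on English systems.

```batch
@echo off
setlocal

rem Hypothetical paths: replace with the two files to compare.
set "file1=C:\folderA\runme.bat"
set "file2=C:\folderB\runme.bat"

call :hash "%file1%" hash1
call :hash "%file2%" hash2

rem An empty hash1 means certutil failed, so nothing is reported as a match.
if defined hash1 if "%hash1%"=="%hash2%" (echo hashes match) else (echo hashes differ)
exit /b

:hash
rem Arguments: %1 = file path, %2 = name of the variable that receives the hash.
set "h="
rem skip=1 drops the header line; find /v drops certutil status and error lines.
for /f "skip=1 delims=" %%H in ('certutil -hashfile "%~1" SHA256 ^| find /v /i "certutil"') do if not defined h set "h=%%H"
rem Older certutil versions print the hash with spaces between bytes.
if defined h set "h=%h: =%"
set "%~2=%h%"
exit /b
```

Each file is read once to produce its hash, so when many files share a name and size the hashes can be stored and compared without rereading any file.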
#23 by dbenham » 19 Jun 2015 09:48
But is that faster than FC? I suppose it could be given that the CERTUTIL technique need never read a file more than once, whereas FC must always read the root file for each comparison made. But I don't know...
FC with /LB1 can abort the comparison as soon as it finds a difference.
Dave Benham
#24 by Aacini » 19 Jun 2015 10:55
Expert, Posts: 1910, Joined: 06 Dec 2011 22:15, Location: México City, México
dbenham wrote:FC with /LB1 can abort the comparison as soon as it finds a difference.
Dave Benham
Hum, err... I am afraid not...
The FC's /LBn switch is used to group the number of consecutive different lines in just one reported group. As I said when I posted my FComp.bat program:
Aacini wrote: You may use /1 FC switch for a finer isolating of mismatched sections (the default is /2). This way, two deleted sections separated by 2 lines (instead of 3 by default) will be reported as two deleted sections instead of one large updated section, for example. However, in this case several updated sections separated by just one line will be reported separately with the same ending-beginning lines instead of as just one large updated section. You may tune up this parameter to fit your needs.
Antonio
#25 by dbenham » 19 Jun 2015 11:57
It is not as simple as I laid out, but it does make it more likely to abort when a difference is found. I have a 9.9 MB file, test1.test, and another nearly identical file, test2.test, which has one extra line of "a" at the beginning and at the end. Note the difference in results and timing with and without the /LB1 option:
Code:
D:\test>echo %time%&fc /t test1.test test2.test&call echo %^time%
13:47:35.40
Comparing files test1.test and TEST2.TEST
***** test1.test
ID3♥
***** TEST2.TEST
a
ID3♥
*****
***** test1.test
***** TEST2.TEST
a
*****
13:47:35.57
D:\test>echo %time%&fc /t /lb1 test1.test test2.test&call echo %^time%
13:47:52.33
Comparing files test1.test and TEST2.TEST
Resync Failed. Files are too different.
***** test1.test
ID3♥
***** TEST2.TEST
a
*****
13:47:52.34
Dave Benham
#26 by foxidrive » 19 Jun 2015 23:39
That old gettimestamp.bat makes it so much easier to evaluate the elapsed time!
Mind you, the time taken in your tests is not very long at all.