Discussion forum for all Windows batch related topics.
Moderator: DosItHelp
#16
Post
by Aacini » 14 Jan 2015 17:01
catalinnc wrote:this one is very light on variables...feel free to edit it to make it more readable...tested on win xp sp3 up to date...
I am afraid you misunderstood the core point of this problem... The problem is not that the program may use a large number of variables. The problem is that if that number is very large, the time required to process a large file may be excessive.
Yes, your program uses few variables, but it takes too much time to process a file compared with the other solution. I ran a couple of timing tests with both programs: the first with the Sample Text File given in the first post of this topic, and the second with a file 10 times larger (70 lines). Here are the results:
Code: Select all
C:\> Aacini.bat
Start time: 16:24:56.90
End time: 16:24:56.91
C:\> catalinnc.bat
time is 16:27:24.20 for starting operation
time is 16:27:24.26 for ending operation
Press any key to continue . . .
C:\> for /L %i in (1,1,10) do @type "_sample text file.txt" >> "_sample text file2.txt"
C:\> copy "_sample text file2.txt" "_sample text file.txt" /Y
1 file(s) copied.
C:\> Aacini.bat
Start time: 16:39:59.56
End time: 16:39:59.59
C:\> catalinnc.bat
time is 16:40:07.37 for starting operation
time is 16:40:10.49 for ending operation
Press any key to continue . . .
Antonio
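For anyone who wants to reproduce timings like the ones above, here is a minimal sketch of one common way to turn two %time% readings into an elapsed value. It is not taken from Aacini.bat or catalinnc.bat. The variable names (t0, t1, c0, c1, elapsed) and the ping delay standing in for the real work are assumptions for illustration, and it assumes %time% has the HH:MM:SS.cc shape shown in the output above, with "." or "," as the decimal separator.

```batch
@echo off
setlocal
rem Replace a leading space in the hour with 0 so every field has two digits.
set "t0=%time: =0%"

rem Placeholder for the work being timed (assumption): roughly a one-second delay.
ping -n 2 127.0.0.1 >nul

set "t1=%time: =0%"

rem Convert HH:MM:SS.cc to centiseconds.
rem Writing 1xx-100 keeps SET /A from reading 08 and 09 as invalid octal numbers.
for /f "tokens=1-4 delims=:.," %%a in ("%t0%") do set /a "c0=(((1%%a-100)*60+(1%%b-100))*60+(1%%c-100))*100+(1%%d-100)"
for /f "tokens=1-4 delims=:.," %%a in ("%t1%") do set /a "c1=(((1%%a-100)*60+(1%%b-100))*60+(1%%c-100))*100+(1%%d-100)"

set /a "elapsed=c1-c0"
rem If the run crossed midnight the difference is negative; add one day.
if %elapsed% lss 0 set /a "elapsed+=24*60*60*100"

echo Elapsed: %elapsed% centiseconds
endlocal
```

Applied to the second pair of readings above, the same arithmetic gives 312 centiseconds for catalinnc.bat (16:40:07.37 to 16:40:10.49) against 3 centiseconds for Aacini.bat (16:39:59.56 to 16:39:59.59) on the 70-line file.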
#17
Post
by catalinnc » 16 Jan 2015 13:56
Aacini wrote: I am afraid you misunderstood the core point of this problem... The problem is not that the program may use a large number of variables. The problem is that if that number is very large, the time required to process a large file may be excessive.
Yes, your program uses few variables, but it takes too much time to process a file compared with the other solution.
ok...here is a solution light on time too...
Code: Select all
@echo off
setlocal enabledelayedexpansion
echo time is %time% for starting operation
rem Sort first so that lines sharing the same prefix end up adjacent
sort < "SampleTextFile.txt" > "SampleTextFileSorted.txt"
rem Create (or empty) the output file
type nul > "DuplicateFound.txt"
for /f "delims=" %%A in (SampleTextFileSorted.txt) do (
    set "_string_full=%%A"
    rem Compare only the first 44 characters of each line
    set "_string_short=!_string_full:~0,44!"
    if /i [!_string_short!] equ [!_cache_short!] (
        if /i [!_first_time!] equ [true] (
            rem First repeat in a run: write the cached line as well
            (
                echo !_cache_full!
                echo !_string_full!
            )>> "DuplicateFound.txt"
            set "_first_time=false"
        ) else (echo !_string_full!>> "DuplicateFound.txt")
    ) else (set "_first_time=true")
    set "_cache_full=!_string_full!"
    set "_cache_short=!_string_short!"
)
echo time is %time% for ending operation
endlocal
pause
_
p.s. this solution will shine on "SampleTextFile.txt" with thousands of lines!!!
_
offtopic
here is a solution for filtering out dupe lines from "SampleTextFile.txt"
Code: Select all
@echo off
setlocal enabledelayedexpansion
echo time is %time% for starting operation
rem Sort first so that identical lines end up adjacent
sort < "SampleTextFile.txt" > "SampleTextFileSorted.txt"
type nul > "UniqueEntriesFound.txt"
for /f "delims=" %%A in (SampleTextFileSorted.txt) do (
    set "_string_full=%%A"
    rem Write a line only when it differs from the previous one
    if /i [!_string_full!] neq [!_cache_full!] (echo !_string_full!>> "UniqueEntriesFound.txt")
    set "_cache_full=!_string_full!"
)
echo time is %time% for ending operation
endlocal
pause
_
#18
Post
by Squashman » 16 Jan 2015 14:46
Code: Select all
sort < "SampleTextFile.txt" > "SampleTextFileSorted.txt"
Rookie mistake there. Always use the /O option for SORT output to a file. Much faster on large files.
Here is some dedupe code that I believe Dave has posted in the past. This just dedupes a file so you just have one instance of each line in your output.
Code: Select all
:DEDUPE
:: DEDUPE File
setlocal disableDelayedExpansion
set "file=%~1"
set "sorted=%file%.sorted"
set "deduped=%file%.deduped"
::Define a variable containing a linefeed character
set LF=^


::The 2 blank lines above are critical, do not remove
sort "%file%" /O "%sorted%"
>"%deduped%" (
  set "prev="
  for /f usebackq^ eol^=^%LF%%LF%^ delims^= %%A in ("%sorted%") do (
    set "ln=%%A"
    setlocal enableDelayedExpansion
    if /i "!ln!" neq "!prev!" (
      endlocal
      (echo %%A)
      set "prev=%%A"
    ) else endlocal
  )
)
>nul move /y "%deduped%" "%file%"
del "%sorted%"
GOTO :EOF
#19
Post
by catalinnc » 17 Jan 2015 13:20
Squashman wrote: Rookie mistake there. Always use the /O option for SORT output to a file. Much faster on large files.
Thanks a lot for the tip.
_
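To make the tip concrete, here is a minimal sketch of the unique-lines filter from post #17 with the /O option applied to the sort step. It is an illustration only, not code posted by anyone in this thread. It reuses the file names from post #17 and, as a further assumption borrowed from the DEDUPE routine in post #18, redirects the whole loop once instead of appending on every line. Because delayed expansion stays enabled throughout, lines containing "!" will not survive intact, and FOR /F still skips empty lines and lines starting with ";".

```batch
@echo off
setlocal enabledelayedexpansion
rem Sketch only: file names are the ones used in post #17.
rem /O lets SORT write the output file itself instead of going through stdout redirection.
sort "SampleTextFile.txt" /O "SampleTextFileSorted.txt"
rem Open the output file once for the whole loop rather than appending per line.
>"UniqueEntriesFound.txt" (
  for /f "usebackq delims=" %%A in ("SampleTextFileSorted.txt") do (
    set "_string_full=%%A"
    rem Sorted input puts equal lines next to each other, so compare with the previous line only.
    if /i "!_string_full!" neq "!_cache_full!" echo(!_string_full!
    set "_cache_full=!_string_full!"
  )
)
endlocal
```

For input that may contain "!" characters, the DEDUPE routine in post #18 is the safer choice, since it switches delayed expansion on only for the comparison.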