
need to optimize my batch file [resolved]

Posted: 20 Jan 2016 11:02
by zimxavier
Hello,

My current batch file searches for each line of STRINGS.txt (as a literal string) in every .txt file in C:\FOLDER\ (subfolders included). It tells me whether each string exists and in which file. The output file is IF_STRING_EXISTS.txt.
It works, but now I need to optimize and customize it, because in some cases there are thousands of .txt files in FOLDER\ and I have hundreds of strings.

My wishes:
1) I don't need to know where each string exists, only whether it exists. So as soon as a string is found once, the batch can move on to the next string (instead of searching the remaining files).

2) I would like to search only some subdirectories, because the others are irrelevant. Say I want to search C:\FOLDER\a\ and C:\FOLDER\b\ instead of C:\FOLDER\. How should I write that? I tried | and ; as separators, but to no avail.


My current batch:

Code:

@echo off
setlocal enableextensions disabledelayedexpansion

set "manifest_folder=C:\FOLDER\*.txt"
set "file_list=STRINGS.txt"

    (for /f "usebackq delims=" %%a in ("%file_list%") do (
        set "found="
        for /f "delims=" %%b in ('findstr /l /m /s /c:"%%a" "%manifest_folder%"') do (
            echo %%a is found in %%~nxb
            set "found=1"
        )
        if not defined found (
            echo %%a is not found
        )
    )) > "IF_STRING_EXISTS.txt"


Thanks in advance :)

EDIT:

Best answer from foxidrive. New batch:

Code:

@echo off

set manifest_folder="C:\FOLDER\a";"C:\FOLDER\b"
set "file_list=STRINGS.txt"

(
    for /f "usebackq delims=" %%a in ("%file_list%") do (
        findstr /l /m /s /c:"%%a" /d:%manifest_folder% *.txt >nul && (
            echo %%a is found
        ) || (
            echo %%a is not found
        )
    )
) > "IF_STRING_EXISTS.txt"

Re: need to optimize my batch file

Posted: 20 Jan 2016 14:47
by Squashman
Why don't you just use the /G and /D options of FINDSTR?

Re: need to optimize my batch file

Posted: 20 Jan 2016 16:16
by zimxavier
You mean without FOR? I don't know how.

Code:

findstr /l /S /M /G:STRINGS.txt  /D:"C:\FOLDER\a\*.txt;C:\FOLDER\b\*.txt" > a.txt

Nothing happens... a.txt is empty.

Even if it worked, wouldn't it search for every line in every file? My biggest file contains 2800 strings and the subdirectories contain 2700 .txt files in total, so I really need the fastest way. My current batch takes more than ten minutes with a smaller file (and I use an SSD).
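
For reference, a minimal sketch of how /G and /D are usually combined, assuming STRINGS.txt sits in the current directory (untested against the real data): /D takes only a semicolon-separated list of directories, and the file mask goes at the end as a separate argument. Note that this form only lists the files that contain at least one of the strings. It does not report which strings were found, so it cannot replace the per-string loop on its own.

```batch
@echo off
rem /G reads the search strings from a file (one per line).
rem /D takes a semicolon-separated list of directories, without any file mask.
rem The file mask (*.txt) is passed separately as the last argument.
rem /S descends into subdirectories, /M prints only the names of matching files.
findstr /L /S /M /G:"STRINGS.txt" /D:"C:\FOLDER\a;C:\FOLDER\b" *.txt > a.txt
```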

Re: need to optimize my batch file

Posted: 22 Jan 2016 11:01
by Aacini

Code:

@echo off
setlocal

set "manifest_folders=C:\FOLDER\a\*.txt C:\FOLDER\b\*.txt"
set "file_list=STRINGS.txt"

call :findStrings < "%file_list%" > "IF_STRING_EXISTS.txt"
goto :EOF


:findStrings

rem Reset errorlevel to 0
ver > NUL

:nextString

   rem Process strings in input file one by one until EOF
   set /P "string="
   if errorlevel 1 goto endStrings

   rem Search this string in all *.txt files in given folders
   set "found="
   rem "delims=" keeps the whole line, so file names containing spaces are not truncated
   for /F "delims=" %%a in ('findstr /M /L /S /C:"%string%" %manifest_folders%') do (
      rem When the string is found in the first file, show it
      echo %string% is found in %%~NXa
      rem ... and break the loop
      set "found=1"
      goto endFor
   )
   :endfor
   if not defined found (
      echo %string% is not found
   )

rem Go back for next string
goto nextString
:endStrings
exit /B

Note there must not be empty lines in the STRINGS.txt file.

Antonio

Re: need to optimize my batch file

Posted: 22 Jan 2016 12:36
by foxidrive
It would be kind of you to test all the solutions that people offer, and report the time taken.

It would be interesting to see the comparative results.

Code:

@echo off
set "file_list=STRINGS.txt"
set "found="

(for %%a in (
"C:\FOLDER\a"
"C:\FOLDER\b"
) do dir "%%~a\*.txt" /b /s /a-d) >"%temp%\list.tmp"

(for /f "usebackq delims=" %%a in ("%file_list%") do (
   for /f "usebackq delims=" %%b in ("%temp%\list.tmp") do (
     rem /C: treats the whole line as one literal string, even if it contains spaces
     if not defined found findstr /L /M /C:"%%a" "%%b" >nul && set found=1
   )
   if defined found (echo(%%a found) else (echo(not found: %%a)
  set "found="
))>"IF_STRING_EXISTS.txt"
del "%temp%\list.tmp"
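
To compare the solutions as suggested above, a simple wrapper that prints the clock before and after the run may be enough. This is only a sketch: solution.bat is a hypothetical name standing for whichever script is being timed.

```batch
@echo off
setlocal
rem solution.bat is a placeholder for the script under test
echo Started:  %time%
call solution.bat
echo Finished: %time%
```

The two %time% values are expanded when each echo line is executed, so the difference between them is the elapsed time of the call.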

Re: need to optimize my batch file

Posted: 22 Jan 2016 14:51
by zimxavier
Thank you for your help :)

@Aacini: your batch doesn't find any strings in C:\FOLDER\b\
As I said, neither a pipe nor a semicolon works as a separator, and apparently a space doesn't either.

@foxidrive: your batch doesn't work. It gives me an error message, something like "impossible to open ... .txt" for each file (I translated it from French).

Re: need to optimize my batch file

Posted: 22 Jan 2016 15:26
by zimxavier
I tested the batch from Aacini with one folder only and compared it with my batch from the internet:

my batch from the internet (1200 strings; 8068 txt files): 18min05s (124 'is not found'; 5108 'is found')
the batch from Aacini with one folder only (1200 strings; 8068 txt files): 19min03s (124 'is not found'; 1076 'is found')

=> With one folder it works, but strangely it is a little slower.

If I could select only my 4 subfolders, there would be 2659 txt files instead of 8068...
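
A possible explanation for the missing speed-up, as far as I know: FOR /F runs the command in its IN clause to completion and collects all of its output before the first iteration starts, so FINDSTR has already scanned every file by the time the GOTO leaves the loop. When the file names are not needed, the test can rely on FINDSTR's exit code instead (0 when at least one match exists, 1 otherwise). A minimal sketch, where "some text" is a placeholder search string:

```batch
@echo off
setlocal
rem Placeholder search string, for illustration only
set "string=some text"
rem Discard the list of file names; only the exit code is of interest
findstr /L /M /S /C:"%string%" "C:\FOLDER\*.txt" >nul 2>&1
if errorlevel 1 (
    echo %string% is not found
) else (
    echo %string% is found
)
```

This still scans all the files for each string, but it avoids the overhead of a FOR /F loop over the output.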

Re: need to optimize my batch file

Posted: 23 Jan 2016 02:34
by foxidrive
zimxavier wrote:
Thank you for your help :)
@Aacini : your batch doesn't find any strings in C:\FOLDER\b\

Does your folder path contain spaces?
Your sample folder paths don't have spaces in them.

zimxavier wrote:
@foxidrive : your batch doesn't work. It gives me an error message

Try the code I posted now - I added the /s switch to the dir command.

Re: need to optimize my batch file

Posted: 23 Jan 2016 07:06
by zimxavier
@foxidrive
I tested your batch files with my sample paths.
However, my real folder is:
C:\Users\Xavier\Documents\Paradox Interactive\Europa Universalis IV\mod\MEIOUandTaxes\
I wrote it as %USERPROFILE%\Docume~1\Parado~1\Europa~1\mod\MEIOUa~1\*.txt
It works, and I don't know any other way to avoid the spaces (because I guess we can't add additional quotation marks here).

Your new batch doesn't work. It shows me another error message, something like
"set found=" is not recognized as an internal command

Re: need to optimize my batch file

Posted: 23 Jan 2016 08:41
by Squashman
Show us the batch file you are using so we can see exactly how you are using that path. Sometimes people are not sure how to edit an existing batch file and they do it incorrectly.

Re: need to optimize my batch file

Posted: 23 Jan 2016 13:25
by zimxavier
As I said in my last post, I first tested your batch files with my sample paths, i.e. I didn't change anything...

Re: need to optimize my batch file

Posted: 23 Jan 2016 14:58
by Aacini
Try this new code, it should run much faster:

Code:

@echo off
setlocal

set manifest_folders="C:\FOLDER\a\*.txt" "C:\FOLDER\b\*.txt"
set "file_list=STRINGS.txt"

set "found="
(for /F "usebackq delims=" %%a in ("%file_list%") do (

   echo X > notFound.txt

   findstr /M /L /S /C:"%%a" %manifest_folders% 2> NUL | ( set /P "found=" & if defined found del notFound.txt )

   if exist notFound.txt (
      echo %%a is not found
   ) else (
      echo %%a is found
   )

)) > "IF_STRING_EXISTS.txt"

Antonio

Re: need to optimize my batch file

Posted: 23 Jan 2016 17:09
by zimxavier
@Aacini
Thank you Antonio!
A window shows: "del was unexpected." (translated from French)
notFound.txt contains a single line, X (it was not deleted at the end).
IF_STRING_EXISTS.txt contains 1200 lines, all of the form: name_of_the_string is not found.
So it didn't work at all.

I didn't change anything.
Two of the original subdirectories are in C:\FOLDER\a and two in C:\FOLDER\b:
C:\FOLDER\a\decisions\ txt files
C:\FOLDER\a\missions\ txt files
C:\FOLDER\b\common\...\ txt files
C:\FOLDER\b\events\ txt files

---------------------------
My real folders :

1 folder with 8068 txt files (my original batch uses it)
C:\Users\Xavier\Documents\Paradox Interactive\Europa Universalis IV\mod\MEIOUandTaxes

4 subdirectories with 2659 txt files (I hope my future batch will use them):
C:\Users\Xavier\Documents\Paradox Interactive\Europa Universalis IV\mod\MEIOUandTaxes\common
C:\Users\Xavier\Documents\Paradox Interactive\Europa Universalis IV\mod\MEIOUandTaxes\decisions
C:\Users\Xavier\Documents\Paradox Interactive\Europa Universalis IV\mod\MEIOUandTaxes\events
C:\Users\Xavier\Documents\Paradox Interactive\Europa Universalis IV\mod\MEIOUandTaxes\missions

Re: need to optimize my batch file

Posted: 23 Jan 2016 19:53
by foxidrive
zimxavier wrote:
However, my real folder is:
C:\Users\Xavier\Documents\Paradox Interactive\Europa Universalis IV\mod\MEIOUandTaxes\
I wrote it as %USERPROFILE%\Docume~1\Parado~1\Europa~1\mod\MEIOUa~1\*.txt
It works, and I don't know any other way to avoid the spaces (because I guess we can't add additional quotation marks here).

Spaces are fine when they are catered for. Your example didn't show any.

zimxavier wrote:
Your new batch doesn't work. It shows me another error message, something like
"set found=" is not recognized as an internal command

That was a typo - try it now.
The code I provided should handle spaces.
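
As an illustration of catering for spaces without short 8.3 names, here is an untested sketch that uses the real folder names from the earlier post. It assumes FINDSTR accepts the whole semicolon-separated /D list as one quoted argument. The variable names base and dirs and the search string "some text" are made up for this example.

```batch
@echo off
setlocal
rem Quoting the whole assignment keeps the spaces and avoids stray trailing blanks
set "base=%USERPROFILE%\Documents\Paradox Interactive\Europa Universalis IV\mod\MEIOUandTaxes"
rem Semicolon-separated directory list for FINDSTR /D
set "dirs=%base%\common;%base%\decisions;%base%\events;%base%\missions"
rem The quotes around %dirs% make the list a single argument despite the spaces
findstr /L /M /S /C:"some text" /D:"%dirs%" *.txt >nul 2>&1 && (
    echo found
) || (
    echo not found
)
```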

Re: need to optimize my batch file

Posted: 24 Jan 2016 04:25
by zimxavier
Thank you foxidrive

It's better because there are no more error messages, but now the problem is that it is horribly slow.
Other than that, it seems to work.
Sorry, I stopped it. By then it had processed 374 strings out of 1200 in... 40min42s :shock: