Move newest XML files by reading first lines

ZayaMia
Posts: 14
Joined: 25 Apr 2017 02:54

#1 Post by ZayaMia » 25 Apr 2017 03:11

Hello everyone,

I'm really new to batch, so please bear with me.

I have thousands of XML files, but only certain ones need to be moved into another folder. To speed up the process, the script only needs to read the first 10 lines of each file, since the information I need is at the very beginning.

My batch file is as follows:

Code:

    @echo off&pushd \\server5\Datapool
    set "file="
    for /f "usebackq tokens=*" %%a in (`dir /b /o:d *.xml`) do findstr /i "Cale Toyota231" %%a && set "file=%%a"
    if defined file (
      copy "%file%" C:\Users\folder1
    ) else (
      echo No matching file found
    )




This batch file finds the file and copies it into another folder if the attributes "Cale + Toyota231" match. These attributes are always on the same line.

Possibly this could help, but I haven't succeeded with it so far:

Code:

set /p texte=<file.txt
echo %texte%


How do I speed up the process by adjusting my batch script to read only the first 10 lines?

Many thanks in advance!
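
For reference, the `set /p` idea from the question can be extended to the first 10 lines by redirecting the file into a block once, so that every `set /p` consumes the next line. This is only a minimal sketch under assumptions: `sample.xml`, `maxLines` and `found` are names made up for illustration, and to my knowledge `set /p` reads at most about 1023 characters per line. Whether it is actually faster than findstr over a network share would have to be measured.

```batch
@echo off
setlocal enableExtensions enableDelayedExpansion
rem Hypothetical file name and line limit, chosen only for this sketch.
set "xmlFile=sample.xml"
set "maxLines=10"
set "found="
rem The file is opened once; every set /p below consumes the next line.
<"%xmlFile%" (
  for /l %%i in (1,1,%maxLines%) do (
    set "line="
    set /p "line="
    if defined line (
      rem If removing a string changes the line, the line contains that string.
      if not "!line:Cale=!"=="!line!" if not "!line:Toyota231=!"=="!line!" set "found=1"
    )
  )
)
if defined found (echo Both strings found on one line of "%xmlFile%") else (echo No match in the first %maxLines% lines of "%xmlFile%")
endlocal
```

The substring replacement is case-insensitive, which mirrors the `/i` switch of the findstr call in the original script.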

penpen
Expert
Posts: 2009
Joined: 23 Jun 2013 06:15
Location: Germany

Re: Move newest XML files by reading first lines

#2 Post by penpen » 25 Apr 2017 10:13

ZayaMia wrote: This batch file finds the file and copies it into another folder if the attributes "Cale + Toyota231" match. These attributes are always on the same line.
I'm unsure how to interpret the "+" character (boolean AND or boolean OR; both are possible).
The findstr command you are using finds every line that contains "Cale" OR "Toyota231" (ignoring case).
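
A small sketch of the difference, with "*.xml" used only as an example file mask: space-separated strings inside one quoted argument are alternatives, while `/c:` treats the whole text as a single literal search string.

```batch
@echo off
rem Space-separated strings: a line matches if it contains Cale OR Toyota231.
findstr /i "Cale Toyota231" "*.xml"
rem With /c: the whole text, including the space, is one literal search string.
findstr /i /c:"Cale Toyota231" "*.xml"
```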

ZayaMia wrote: How do I speed up the process by adjusting my batch script to read only the first 10 lines?
It should be much faster if you don't call findstr for each file.

Code:

:OR "Cale" or "Toyota231" is contained in each file
>"result.txt" findstr /i /m "Cale Toyota231" "*.xml"
for /f "usebackq tokens=* delims=" %%a in ("result.txt") do copy "%%~a" "C:\Users\folder1"
goto :eof

:AND both "Cale" and "Toyota231" must be contained in each file
>"result1.txt" findstr /i /m "Cale" "*.xml"
>"result2.txt" findstr /i /m /f:"result1.txt" "Toyota231"
for /f "usebackq tokens=* delims=" %%a in ("result2.txt") do copy "%%~a" "C:\Users\folder1"
goto :eof


penpen

Edit: I've appended "*.xml" to the first findstr command.

#3 Post by ZayaMia » 25 Apr 2017 21:47

Thank you so much for your effort.

Unfortunately I forgot to mention an important detail. My bad, sorry.
The batch file should only pick up files with the newest date (today's date), since thousands of XML files are generated every day.

How do I modify my .bat with your code (OR & AND) if I don't use findstr?

Sorry, newbie here.

elzooilogico
Posts: 128
Joined: 23 May 2016 15:39
Location: Spain

#4 Post by elzooilogico » 26 Apr 2017 05:22

Extracting only the files with the newest date has already been answered at http://stackoverflow.com/questions/43509202/extracting-files-with-newest-date/43516472#43516472

penpen wrote:

Code:

:AND both "Cale" and "Toyota231" must be contained in each file
>"result1.txt" findstr /i /m "Cale" "*.xml"
>"result2.txt" findstr /i /m /f:"result1.txt" "Toyota231"
for /f "usebackq tokens=* delims=" %%a in ("result2.txt") do copy "%%~a" "C:\Users\folder1"
goto :eof

From https://stackoverflow.com/questions/8844868/what-are-the-undocumented-features-and-limitations-of-the-windows-findstr-command

Code:

:AND both "Cale" and "Toyota231" are contained in the same line of a file
>"result.txt" findstr /i /r /m /c:"cale.*toyota231" /c:"toyota231.*cale" "*.xml"
for /f "usebackq tokens=* delims=" %%a in ("result.txt") do copy "%%~a" "C:\Users\folder1"
goto :eof

So only one `findstr` is needed.

ZayaMia wrote: How do I modify my .bat with your code (OR & AND) if I don't use findstr?
What does this really mean?

#5 Post by penpen » 26 Apr 2017 05:31

I was lazy with the code last time: I only provided the searches themselves, and I forgot a "*.xml" in the OR search above (I will correct it).

Your "mybat.bat" modified using the OR search ("mybatOr.bat", untested):

Code:

@echo off
setlocal enableExtensions disableDelayedExpansion
pushd \\server5\Datapool
forfiles /P "." /M "*.xml" /D +%date% /C "cmd /c for %%a in (@file) do @(echo(%%~a") | >"fileList.txt" findstr /V "^$"
>"result.txt" findstr /i /m /f:"fileList.txt" "Cale Toyota231"
for /f "usebackq tokens=* delims=" %%a in ("result.txt") do copy "%%~a" "C:\Users\folder1"
del "fileList.txt"
del "result.txt"
goto :eof


Your "mybat.bat" modified using the AND search ("mybatAnd.bat", untested):

Code:

@echo off
setlocal enableExtensions disableDelayedExpansion
pushd \\server5\Datapool
forfiles /P "." /M "*.xml" /D +%date% /C "cmd /c for %%a in (@file) do @(echo(%%~a") | >"fileList.txt" findstr /V "^$"
>"result1.txt" findstr /i /m /f:"fileList.txt" "Cale"
>"result2.txt" findstr /i /m /f:"result1.txt" "Toyota231"
for /f "usebackq tokens=* delims=" %%a in ("result2.txt") do copy "%%~a" "C:\Users\folder1"
del "fileList.txt"
del "result1.txt"
del "result2.txt"
goto :eof


Sidenote:
You may place the temporary files ("fileList.txt", "result.txt", "result1.txt", and "result2.txt") in your local directory (instead of writing them to the server directory).
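
The sidenote could look roughly like this for the OR variant. It is only a sketch under assumptions: the variable names `fileList` and `result` are made up for illustration, the local temp folder is assumed to be writable, and the date filter is left out (a plain `dir /b` lists every XML file) to keep the example short.

```batch
@echo off
setlocal enableExtensions disableDelayedExpansion
rem Hypothetical variable names; the list files go to the local temp folder, not to the share.
set "fileList=%TEMP%\fileList.txt"
set "result=%TEMP%\result.txt"
pushd \\server5\Datapool
rem No date filter in this sketch: dir /b simply lists every XML file in the share.
>"%fileList%" dir /b "*.xml"
>"%result%" findstr /i /m /f:"%fileList%" "Cale Toyota231"
for /f "usebackq tokens=* delims=" %%a in ("%result%") do copy "%%~a" "C:\Users\folder1"
del "%fileList%" "%result%"
popd
endlocal
```

Because the current directory is still the share after pushd, the relative file names in the list resolve correctly even though the list itself lives elsewhere.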


penpen

#6 Post by penpen » 26 Apr 2017 06:08

elzooilogico wrote:

Code:

:AND both "Cale" and "Toyota231" are contained in the same line of a file
>"result.txt" findstr /i /r /m /c:"cale.*toyota231" /c:"toyota231.*cale" "*.xml"
for /f "usebackq tokens=* delims=" %%a in ("result.txt") do copy "%%~a" "C:\Users\folder1"
goto :eof

So only one `findstr` is needed.
The findstr search with /c:"cale.*toyota231" and /c:"toyota231.*cale" looks for both search strings within one line ("." matches neither "\r" nor "\n").
The information may be located on different lines, so you cannot use this single findstr command.

The only multiline findstr search example I found on the linked page (but maybe I've overlooked something) is this:
Chapter: Searching across line breaks wrote:

Code:

@echo off
setlocal
::Define LF variable containing a linefeed (0x0A)
set LF=^


::Above 2 blank lines are critical - do not remove

::Define CR variable containing a carriage return (0x0D)
for /f %%a in ('copy /Z "%~dpf0" nul') do set "CR=%%a"

setlocal enableDelayedExpansion
::regex "!CR!*!LF!" will match both Unix and Windows style End-Of-Line
findstr /n /r /c:"A!CR!*!LF!A" TEST.TXT
Because the number of line endings is unknown, you cannot use this approach.


penpen

#7 Post by elzooilogico » 26 Apr 2017 06:20

Interesting stuff, and thanks for the example!

I posted that solution because the OP clearly stated that both strings always occur on the same line.

#8 Post by penpen » 26 Apr 2017 06:41

elzooilogico wrote: I posted that solution because the OP clearly stated that both strings always occur on the same line.
Oh, I hadn't seen this!
Sorry!

Then you are right! The AND version is then "mybatAnd.bat":

Code:

@echo off
setlocal enableExtensions disableDelayedExpansion
pushd \\server5\Datapool
forfiles /P "." /M "*.xml" /D +%date% /C "cmd /c for %%a in (@file) do @(echo(%%~a") | >"fileList.txt" findstr /V "^$"
>"result.txt" findstr /r /i /m /f:"fileList.txt" /C:"Cale.*Toyota231" /C:"Toyota231.*Cale"
for /f "usebackq tokens=* delims=" %%a in ("result.txt") do copy "%%~a" "C:\Users\folder1"
del "fileList.txt"
del "result.txt"
goto :eof


penpen

#9 Post by ZayaMia » 26 Apr 2017 21:54

Thanks for the response!

Yes, both attributes always occur on the same line.

So it must be AND.

It doesn't work for me somehow. Could you explain what exactly I have to modify?

I'm sorry for asking this.

#10 Post by penpen » 27 Apr 2017 05:45

ZayaMia wrote: It doesn't work for me somehow.
The method works. As proof, here is "testAnd.bat": the pushd is remmed out so that it runs from the local directory, echo is put in front of copy so that the copy commands are only displayed, and a few new test files with today's date are created first:

Code:

@echo off
>"Cale.xml" echo(Cale
>"Toyota231.xml" echo(Toyota231
>"Toyota231Cale.xml" echo(Toyota231Cale
>"CaleToyota231.xml" echo(CaleToyota231

setlocal enableExtensions disableDelayedExpansion
rem pushd \\server5\Datapool
forfiles /P "." /M "*.xml" /D +%date% /C "cmd /c for %%a in (@file) do @(echo(%%~a") | >"fileList.txt" findstr /V "^$"
>"result.txt" findstr /r /i /m /f:"fileList.txt" /C:"Cale.*Toyota231" /C:"Toyota231.*Cale"
for /f "usebackq tokens=* delims=" %%a in ("result.txt") do echo copy "%%~a" "C:\Users\folder1"
rem del "fileList.txt"
rem del "result.txt"
goto :eof

Result:

Code:

Z:\>testAnd.bat
copy "CaleToyota231.xml" "C:\Users\folder1"
copy "Toyota231Cale.xml" "C:\Users\folder1"

Z:\>


So you have to provide more details about what is going wrong:
Do you get any error messages?

Try this "myTestAnd.bat", and check whether the displayed copy commands match your expected result:

Code:

@echo off
setlocal enableExtensions disableDelayedExpansion
rem pushd \\server5\Datapool
forfiles /P "." /M "*.xml" /D +%date% /C "cmd /c for %%a in (@file) do @(echo(%%~a") | >"fileList.txt" findstr /V "^$"
>"result.txt" findstr /r /i /m /f:"fileList.txt" /C:"Cale.*Toyota231" /C:"Toyota231.*Cale"
for /f "usebackq tokens=* delims=" %%a in ("result.txt") do echo copy "%%~a" "C:\Users\folder1"
rem del "fileList.txt"
rem del "result.txt"
goto :eof


penpen

#11 Post by ZayaMia » 27 Apr 2017 23:17

OK, I think this is getting out of hand, since I didn't make my point clearly.

I'm really sorry for this.

Thousands of XML files are generated each day and I just need specific ones. Moreover, I just need those from today's date. XML files from yesterday or earlier are of no interest to me.

The XML files look like this:

Code:

<testInfo testDuration="57" holidayCount="0" completedtask="12" currentState="Executive13" Name="PHIL" testVersion="13" lockTime="2017-04-11T11:20:05" 
 <result testStepName="locating" sequenceNrResult="1" testStepResult="OK">
 etc.
 </testInfo>
</testresult>


So if the Name "PHIL" and currentState "Executive13" match, which are always both on the same line, the files should be moved from \\server5\Datapool to C:\Users\folder1.

My old batch file did its job, but took way too long. The XML files aren't named after Cale or Toyota231.

In the meantime I also looked for a solution and came up with a faster one, but it still takes quite some time to move these XML files to folder1.

Code:

rem -----------------test.bat--------------
@echo off & SetLocal EnableDelayedExpansion
set "file="
set "srcFolder=\\server5\Datapool"
set "dstFolder=C:\Users\folder1"

rem language independent time
for /f "tokens=2 delims==" %%a in ('wmic os get localdatetime /value') do set "Tm=%%a"
set "today=%Tm:~0,4%%Tm:~4,2%%Tm:~6,2%"

pushd "%srcFolder%"
for /f "usebackq tokens=*" %%a in (`dir /b /o:d *.xml`) do (
  set "fileTime=%%~ta"
  rem set "fileTime_spanish=!fileTime:~6,4!!fileTime:~3,2!!fileTime:~0,2!"
  set "fileTime_english=!fileTime:~6,4!!fileTime:~0,2!!fileTime:~3,2!"
  if "!fileTime_english!" equ "%today%" (
    findstr /i "Cale Toyota231" %%a && copy "%%a" "%dstFolder%"
  )
)
popd
EndLocal
exit/B
rem -----------------test.bat--------------


So my attempt here was to find a faster solution to move ALL specified files from today's date into another folder.

I hope it's clear now.

#12 Post by elzooilogico » 28 Apr 2017 03:44

You're fetching files from a network location.

Both forfiles and dir retrieve the desired file list quickly.

But findstr may be a bottleneck even on your own machine, and it certainly is one when it runs against a network location.

Consider

Code:

rem -----------------test.bat--------------
@echo off & SetLocal EnableDelayedExpansion
set "file="
set "srcFolder=\\server5\Datapool"
set "dstFolder=C:\Users\folder1"

pushd "%srcFolder%"

rem without /C, forfiles prints each file name enclosed in double quotes
forfiles /P "." /M "*.xml" /D "+%date%">"fileList.txt"

for /f "usebackq tokens=* delims=" %%a in ("fileList.txt") do (
  rem %%~a strips the quotes added by forfiles before the name is quoted again
  findstr /irc:"cale.*toyota231" /c:"toyota231.*cale" "%%~a" && copy "%%~a" "%dstFolder%"
)
popd
EndLocal
exit /B
rem -----------------test.bat--------------

Once pushd has run, you are in the network folder, but the commands still run on your computer.

So forfiles may get the list of the desired files quickly, but findstr is run in every iteration of the for loop. That means loading findstr into memory and then parsing the whole file across the network each time (this is SLOW).

The only workaround I can think of (but ONLY if you have permission to access the remote machine and deploy batch files on it) is to place the batch file on that machine and run it there, targeting the desired destination computer.

Also, you may create a scheduled task to run it (any trigger you may think about).

#13 Post by penpen » 28 Apr 2017 06:27

ZayaMia wrote: OK, I think this is getting out of hand, since I didn't make my point clearly.
I think you made it clear enough, and the batch "mybatAnd.bat" (above) does exactly what you asked for,
although you didn't give the right search phrases.

I suspect you just don't believe me (for whatever reason)?!
You have taken the trouble to explain in detail, so I will do the same in case something is unclear.


ZayaMia wrote: Moreover, I just need those from today's date. XML files from yesterday or earlier are of no interest to me.
This is done by the line:

Code:

forfiles /P "." /M "*.xml" /D +%date% /C "cmd /c for %%a in (@file) do @(echo(%%~a") | >"fileList.txt" findstr /V "^$"
The above forfiles command searches all XML files ("*.xml") in the current directory "." (= "\\server5\Datapool", because you have changed to it using pushd). If a file has today's date ("%date%") or later, its name is printed to stdout, which is piped to findstr to remove the empty line that forfiles adds to its output.
The file list is then redirected to the file "fileList.txt".
(One possible cause of errors: you might have insufficient rights to write to the server, as mentioned in the sidenote above.)

Files from yesterday shouldn't be listed, unless the date on your PC is wrong or you have replaced the "%date%" part with something else that results in, for example, yesterday's date.

ZayaMia wrote: So if the Name "PHIL" and currentState "Executive13" match, which are always both on the same line,
This is done by the line:

Code:

>"result.txt" findstr /r /i /m /f:"fileList.txt" /C:"Cale.*Toyota231" /C:"Toyota231.*Cale"
Note that it searches for "Cale" and "Toyota231" instead of "PHIL" and "Executive13", because those were the strings you gave.
Just replace every "Cale" with "PHIL" and every "Toyota231" with "Executive13", and it should search for what you are really looking for.

The names of all files containing the given search pattern are stored in the file "result.txt".


ZayaMia wrote: the files should be moved from \\server5\Datapool to C:\Users\folder1.
This is done by the line:

Code:

for /f "usebackq tokens=* delims=" %%a in ("result.txt") do copy "%%~a" "C:\Users\folder1" 

The "for /f" command loops over each line in the file "result.txt" (= the list of matching files) and copies the files to "C:\Users\folder1".

Note that your current directory should be "\\server5\Datapool" (pushd), so you are copying the matching files from "\\server5\Datapool" to "C:\Users\folder1".


ZayaMia wrote: The XML files aren't named after Cale or Toyota231.
Those names only appear in a batch file that proves that the algorithm works; they are just "test data".
I did this because I don't know the names of the files on your server, and the algorithm needs some files to test with.

The file names are not hardcoded, though; the files are found because of their content (not their names).
You could rename them arbitrarily, and the batch file would still find them.

Proof of that ("testAnd2.bat", with the real search patterns):

Code:

@echo off
>"Any.xml" echo(PHIL
>"Name.xml" echo(Executive13
>"You.xml" echo(Executive13PHIL
>"wish.xml" echo(PHILExecutive13

setlocal enableExtensions disableDelayedExpansion
rem pushd \\server5\Datapool
forfiles /P "." /M "*.xml" /D +%date% /C "cmd /c for %%a in (@file) do @(echo(%%~a") | >"fileList.txt" findstr /V "^$"
>"result.txt" findstr /r /i /m /f:"fileList.txt" /C:"PHIL.*Executive13" /C:"Executive13.*PHIL"
for /f "usebackq tokens=* delims=" %%a in ("result.txt") do echo copy "%%~a" "C:\Users\folder1"
rem del "fileList.txt"
rem del "result.txt"
goto :eof
Note that the batch files with "test" in their names are just test files in this case.
The file named "mybatAnd.bat" (or similar) contains the real algorithm.

Result:

Code:

Z:\>testAnd2.bat
copy "wish.xml" "C:\Users\folder1"
copy "You.xml" "C:\Users\folder1"

Z:\>


ZayaMia wrote: So my attempt here was to find a faster solution to move ALL specified files from today's date into another folder.

I hope it's clear now.
Again, please note that "mybatAnd.bat" (above) does exactly the same thing, which I have also proven to you.

If the batch file does something else on your system, you have to provide more information, as I said above.
If you changed something in the code, you should mention that as well (for example, if you replaced "%date%" with something else).


penpen

#14 Post by ZayaMia » 01 May 2017 21:20

Hey penpen,

penpen wrote: I suspect you just don't believe me (for whatever reason)?!

Not at all! I'm just really inexperienced with .bat files, so I still misunderstand a lot.

I honestly appreciate all your help and patience with me.

I tried your "mybatAnd.bat" again.

It gives me:

ERROR: Invalid argument/option - '05/02/2017'.
Type "FORFILES /? for usage.


Is there something I need to modify? Sorry for the late response.

I wasn't home all weekend.

#15 Post by ZayaMia » 01 May 2017 22:22

Hey elzooilogico,

Thanks for your reply as well.

It gives me:

ERROR: Invalid date specified.

Could it be that my system date differs?

System date is: 05/02/2017

Best
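
One possible explanation for both error messages, offered as an assumption rather than a confirmed diagnosis: on some systems %date% expands with a weekday prefix, for example "Tue 05/02/2017". Unquoted, the space would split the value into two arguments, so forfiles complains about the stray '05/02/2017'; quoted, the whole string is not a valid date. The sketch below first shows what %date% expands to, then sidesteps the date format by using a relative day count, which the forfiles help text describes for /D (untested against the server share):

```batch
@echo off
rem Show what the date variable expands to; the format depends on the regional settings.
echo Value of the date variable: [%date%]
rem List the XML files modified today or later. The +0 means current date plus zero days.
forfiles /P "." /M "*.xml" /D +0 /C "cmd /c echo @file"
```

If this lists the expected files, the same `/D +0` could be tried in place of `/D +%date%` in "mybatAnd.bat".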
