How to perfectly extract the content of a text file?

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

#1 Post by PaperTronics » 29 Sep 2017 07:05

Hey guys!

So after my previous for loop problem, I thought everything was solved and good, but little did I know that fortune had another problem in store for me!

Trying to extract data from my same text file "E_Summary.txt", I encountered another hair-tearing problem which was:

Content of the E_Summary File:

Code: Select all

IMPORT EXPORT DETAILS 2017

Export Details}

2157a45e9b - 15-4-2017 - To Unknown
541288t918b9s - 25-7-2017 - To KDM
--

Import Details}

124j19248I192L - 14-9-2017 - From Lahore
h7a8a99879a88 - 15-9-2017 - From Karachi
--





I wanted to extract the content from the file, assigning each line to a different variable, and dividing the content into sections. E.g., the content under the Import Details should be in ImportDetail[1], ImportDetail[2]... variables and the content under Export Details should be in ExportDetail[1], ExportDetail[2]... vars.

The only non-working and stupid piece of code that I've come up with till now is this:

Code: Select all

:ImportDetails 
set ImportDetails_Cnt=0
For /f "skip=1 tokens=*" %%A IN (E_Summary.txt) DO (
   if %%A == {*end*} (goto :ExportDetails)
   Set "ImportDetailsTodo[!ImportDetails_Cnt!]=%%A"
   Set /a ImportDetails_Cnt+=1
   )

:ExportDetails
set ExportDetails_Cnt=0
set /a ImportDetails_Cnt+=3
For /f "skip=%ImportDetails_Cnt% tokens=*" %%A IN (E_Summary.txt) DO (
   if %%A == {*end*} (goto :Start)
   Set "ExportDetailsTodo[!ExportDetails_Cnt!]=%%A"
   Set /a ExportDetails_Cnt+=1
   )



I'm sure this code will be a good one to laugh at once I've mastered the for loop, but until then, I've got to rely on the professional minds of DosTips to help me out.

Any help is greatly appreciated,
PaperTronics

#2 Post by Compo » 29 Sep 2017 09:28

Here's some example code for you:

Code: Select all

@Echo Off
SetLocal EnableDelayedExpansion
For /F "UseBackQ EOL=- Tokens=*" %%A In ("E_Summary.txt"
) Do Set "vl=%%A" & Call :PL !vl!
For /F "Delims=" %%A In ('Set^|Find "["') Do Echo %%A
Pause
Exit/B

:PL
Echo %*|FindStr/LIE "}">Nul&&(Set "vn=%1"&Set "vi=")||(If Defined vn (
      If Not Defined vi (Set "vi=1") Else Set/A "vi+=1"
      Set "%vn%[!vi!]=!vl!"))
GoTo :EOF


[Edit /]
…or without delayed expansion

Code: Select all

@Echo Off
For /F "UseBackQ EOL=- Tokens=*" %%A In ("E_Summary.txt"
) Do Set "vl=%%A" & Call :PL %%vl%%
For /F "Delims=" %%A In ('Set^|Find "["') Do Echo %%A
Pause
Exit/B

:PL
Echo %*|FindStr/LIE "}">Nul&&(Set "vn=%1"&Set "vi=")||(If Defined vn (
      If Not Defined vi (Set "vi=1") Else Set/A "vi+=1"
      Call Set "%vn%[%%vi%%]=%%vl%%"))
GoTo :EOF
Last edited by Compo on 30 Sep 2017 03:13, edited 3 times in total.

#3 Post by pieh-ejdsch » 29 Sep 2017 09:42

never go with a goto: mark out of a loop. This will be further muddled afterwards.
Unless you go with an exit / b from the loop that is inside a called subroutine.

This will work:

Code: Select all

:I_E_Details
@echo off
setlocal enabledelayedexpansion
set "prompt=$g$s"

for /f "delims==" %%i in ('2^>nul set cnt_') do set "%%i="
set "data="
for /f tokens^=1^,2^,4^,5^,7^,8delims^=^=^" %%A in (

 "exportDetails=Export Details}" ^
 "importDetails=Import Details}" ^
 "notThis={*end*}"

) do for /f "tokens=*" %%a in (e_summary.txt) do (
  if /i %%a == %%B set "data=%%A"
  if /i %%a == %%D set "data=%%C"
  if /i %%a == %%F set "data="

  if defined data (
    set /a "cnt_D=(cnt_!data!+=1)"
    set "!data!Todo[!Cnt_D!]=%%a"
  )
)

set exp
set imp
pause
exit /b

Phil

#4 Post by PaperTronics » 29 Sep 2017 22:32

@Phil & @Compo - I'm really grateful to you both, because the code you provided works. But I actually have 5 files that I want to use this code on. Four of them are similar to each other and the code works on them, but the 5th one is a bit different and the code doesn't work on it. Here is the content of the 5th file:

Code: Select all

Common Details}
Gear
1249nu8h12
{*end*}

Urgent Details}
29589fvzgd
295 cs38 r93112249902u741
{*end*}

Home Details}
71%ahr418901if15
{*end*}

Work Details}
58ggg1vv81 bc9sbu751
838591ffsas
13958933911v
18488
{*end*}


Office Details}
swiyr128rnddd12fcbq
fheuwvfewycey8r13353
385385y2751t3r8
3851375t1375831r1
2145
{*end*}


Also, data is regularly entered into this file by another batch file, so the number of lines in a section may not be the same every time. I DO NOT want the titles (Office Details, Work Details, etc.), and I also don't want the {*end*} markers.




Thanks for your help,
PaperTronics

#5 Post by pieh-ejdsch » 30 Sep 2017 02:47

Until now it was not clear enough what the file looks like (the example even differed from your explanation), nor which lines are to be extracted.

Well, a revised version:

Code: Select all

@echo off
setlocal disabledelayedexpansion
set "prompt=$g$s"

for /f "delims==" %%i in ('2^>nul set cnt_') do set "%%i="
set "data="


for /f "tokens=*" %%a in (e_summary.txt) do (
  set "Line=%%a"
  setlocal enabledelayedexpansion
   if !line! == {*end*} (
    set "data="
  ) else if .!line:*}^=! == . (
      set "data=!line: =!"
      set "data=!data:}=!"
  ) else if defined data (
    set /a "cnt_D=(cnt_!data!+=1)"
  )
  for /f "tokens=1,2delims=;" %%D in ("!data!;!Cnt_D!") do (
    endlocal
    set "data=%%D"
    set "cnt_%%D=%%E"
    if NOT .%%E == . set "%%DTodo[%%E]=%%a"
  )
  if NOT .!! == . endlocal
)

set |find "["
pause
exit /b

Phil

#6 Post by Compo » 30 Sep 2017 03:01

The problem would have been just the EOL character I was using; however, you sneakily decided to include a % character.

Try this slightly modified version, which is only for files that use the {*end*} type markers (I have fixed the scripts I posted previously for use with -- markers):

Code: Select all

@Echo Off
SetLocal EnableDelayedExpansion
For /F "UseBackQ EOL={ Tokens=*" %%A In ("E_Summary.txt"
) Do Set "vl=%%A" & Call :PL !vl!
For /F "Delims=" %%A In ('Set^|Find "["') Do Echo %%A
Pause
Exit/B

:PL
Echo %*|FindStr/LIE "}">Nul&&(Set "vn=%1"&Set "vi=")||(If Defined vn (
      If Not Defined vi (Set "vi=1") Else Set/A "vi+=1"
      Set "%vn%[!vi!]=!vl!"))
GoTo :EOF


…and without delayed expansion

Code: Select all

@Echo Off
For /F "UseBackQ EOL={ Tokens=*" %%A In ("E_Summary.txt"
) Do Set "vl=%%A" & Call :PL %%vl%%
For /F "Delims=" %%A In ('Set^|Find "["') Do Echo %%A
Pause
Exit/B

:PL
Echo %*|FindStr/LIE "}">Nul&&(Set "vn=%1"&Set "vi=")||(If Defined vn (
      If Not Defined vi (Set "vi=1") Else Set/A "vi+=1"
      Call Set "%vn%[%%vi%%]=%%vl%%"))
GoTo :EOF

#7 Post by PaperTronics » 30 Sep 2017 08:02

@Phil - Thanks a bunch man! Your code works perfectly!

@Compo - I tried your code but it doesn't seem to support poison characters. I still appreciate the efforts you put into brainstorming the code.



Thanks again,
PaperTronics

#8 Post by dbenham » 30 Sep 2017 10:42

pieh-ejdsch wrote: never go with a goto: mark out of a loop. This will be further muddled afterwards.
Unless you go with an exit / b from the loop that is inside a called subroutine.

I don't fully understand what you are saying, nor why you are bringing this up. But my limited understanding of your words appears to contradict techniques that I know work very well with batch.

Using GOTO :labelOutsideLoop is a long-standing, tried-and-true method to exit a FOR loop prematurely. I, and many others, use it frequently, with perfect results.

Trivial example that demonstrates processing only the first line of a file (there are better ways to do this):

Code: Select all

for /f "delims=" %%A in (file.txt) do (
  echo %%A
  goto :endLoop
)
:endLoop


The only situation for which it does not work well is with a FOR /L loop with many iterations, because the loop continues to iterate to completion (taking time), even though the contents of the DO loop are not actually executed after the GOTO has executed.

Trivial FOR /L example that does not work well

Code: Select all

for /l %%N in (1 1 1000000) do (
  echo %%N
  goto :endLoop
)
:endLoop

Only 1 line is printed, but the loop takes significant time because it still iterates 1 million times.
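
One commonly cited workaround for the FOR /L case, sketched here as an assumption rather than something taken from this thread: run the loop in a child cmd.exe and leave it with a plain EXIT, which ends that child process at once instead of counting to the end. The child process cannot set variables in the calling script, so anything it produces has to come back through its output.

```batch
@echo off
rem Sketch only: run the FOR /L loop in a child cmd.exe so a plain EXIT stops it at once.
rem The loop variable uses doubled percent signs because this line is in a batch file.
rem The @ keeps the child process from echoing the loop body before running it.
cmd /d /c "for /l %%N in (1 1 1000000) do @(echo %%N& exit)"
echo Back in the parent script.
```

This should print only the first value and return to the parent script without the long delay. If the loop needs to set variables for later use, a FOR /F loop or a counted GOTO loop is probably a better fit.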


Dave Benham

#9 Post by Compo » 30 Sep 2017 12:12

PaperTronics wrote: @Phil - Thanks a bunch man! Your code works perfectly!

@Compo - I tried your code but it doesn't seem to support poison characters.

The only poison character you provided was a percent character.

I'm certainly not spending my evenings/weekends providing a script which pre-guesses whichever poison characters you decide to throw at it after it's written.

Here's your text file now containing four additional poison characters, | ! ; &

Code: Select all

Common Details}
Gear
1249|nu8h12
{*end*}

Urgent Details}
29589fvzgd
295 cs38 r931122!49902u741
{*end*}

Home Details}
71%ahr418901if15
{*end*}

Work Details}
58ggg;1vv81 bc9sbu751
838591ffsas
13958933911v
18488
{*end*}


Office Details}
swiyr128rnddd12fcbq
fheuwvfe&wycey8r13353
385385y2751t3r8
3851375t1375831r1
2145
{*end*}

And here is the side-by-side output of the last scripts posted by Phil and myself:

Code: Select all

                    Phil's                          Compo's [no delayed expansion]             Expected Output
                                                Common[1]=Gear                        Common[1]=Gear
                                                                                      Common[2]=1249|nu8h12
HomeDetailsTodo[1]=71%ahr418901if15             Home[1]=71%ahr418901if15              Home[1]=71%ahr418901if15
OfficeDetailsTodo[1]=swiyr128rnddd12fcbq        Office[1]=swiyr128rnddd12fcbq         Office[1]=swiyr128rnddd12fcbq
OfficeDetailsTodo[2]=fheuwvfe&wycey8r13353                                            Office[2]=fheuwvfe&wycey8r13353
OfficeDetailsTodo[3]=385385y2751t3r8            Office[2]=385385y2751t3r8             Office[3]=385385y2751t3r8
OfficeDetailsTodo[4]=3851375t1375831r1          Office[3]=3851375t1375831r1           Office[4]=3851375t1375831r1
OfficeDetailsTodo[5]=2145                       Office[4]=2145                        Office[5]=2145
UrgentDetailsTodo[1]=29589fvzgd                 Urgent[1]=29589fvzgd                  Urgent[1]=29589fvzgd
UrgentDetailsTodo[2]=295 cs38 r93112249902u741  Urgent[2]=295 cs38 r931122!49902u741  Urgent[2]=295 cs38 r931122!49902u741
WorkDetailsTodo[1]=58ggg;1vv81 bc9sbu751        Work[1]=58ggg;1vv81 bc9sbu751         Work[1]=58ggg;1vv81 bc9sbu751
WorkDetailsTodo[2]=838591ffsas                  Work[2]=838591ffsas                   Work[2]=838591ffsas
WorkDetailsTodo[3]=13958933911v                 Work[3]=13958933911v                  Work[3]=13958933911v
WorkDetailsTodo[4]=18488                        Work[4]=18488                         Work[4]=18488

Now, as you can see, both scripts failed to output one entry each, but mine correctly output the exclamation mark in the second entry under Urgent Details} whereas Phil's didn't.

In the case above, my script output is therefore more accurate, so am I saying mine is better or that you are wrong? No, certainly not! Just that both scripts have issues with characters which were not revealed to us, since you provided your examples without ever mentioning the possibility of poison characters.
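
One common cause of a lost exclamation mark is expanding a FOR variable while delayed expansion is enabled. A minimal sketch of the usual toggling idiom follows, assuming the same E_Summary.txt and using a hypothetical variable name, line. It only echoes the lines and does not build the section arrays.

```batch
@echo off
setlocal DisableDelayedExpansion
rem Sketch only: echo every line of the file unchanged.
rem FOR /F still skips empty lines and, with the default EOL, lines starting with a semicolon.
rem The line is stored while delayed expansion is off, then read back with it on,
rem so exclamation marks, percent signs, ampersands and pipes are not re-parsed.
for /f "usebackq delims=" %%A in ("E_Summary.txt") do (
  set "line=%%A"
  setlocal EnableDelayedExpansion
  echo(!line!
  endlocal
)
endlocal
```

Anything set between the inner SETLOCAL and ENDLOCAL is discarded on each iteration, so values that must persist have to be carried across the ENDLOCAL, for example through another FOR /F variable.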
Last edited by Compo on 01 Oct 2017 04:34, edited 1 time in total.

#10 Post by PaperTronics » 30 Sep 2017 23:45

@Compo - I'm really sorry, and I apologize for the lack of details. Eventually there will be every type of character in the file, poison or otherwise. I hope that by then I'll have mastered the for loop and know how to get around this problem.

Anyways, about the code:
I tried your code again today, and it worked as I wanted it to. With the added benefit of your code being short and easy for me to understand, I'm using your code from now on. Thanks, and sorry for my foolishness. Is there any way I could restore your evenings? :D

@dbenham - I agree, too!



Thanks and sorry,
PaperTronics

#11 Post by Compo » 01 Oct 2017 04:38

TBF, looking at the side-by-side comparisons in my last posting (which I've edited to include the Expected Output), I wouldn't use either of them at this stage!

I may however prefer to use this version:

Code: Select all

@Echo Off
For /F "UseBackQEOL={Tokens=*" %%A In ("E_Summary.txt"
) Do Echo(%%A>"$.tmp"&Set "vl=%%A"&Call :PL
If Exist "$.tmp" (Del "$.tmp"&Set|Find "["&Timeout -1)
Exit/B
:PL
FindStr/E "}" "$.tmp">Nul&&(Set "vn=%vl: ="&:"%"
   Set "vi=")||(If Defined vn (If Not Defined vi (Set "vi=1") Else Set/A vi+=1
      Call Set "%%vn%%[%%vi%%]=%%vl%%"))
GoTo :EOF

#12 Post by PaperTronics » 04 Oct 2017 05:16

@Compo - Thanks again, for the rewrite. The revised version of your code printed the line with poison characters just fine and I noticed a slight speed change too.



Thanks for your help,
PaperTronics
