How to perfectly extract the content of a text file?
Posted: 29 Sep 2017 07:05
by PaperTronics
Hey guys!
So after my previous for loop problem, I thought everything was solved, but little did I know that fortune had another problem in store for me!
Trying to extract data from my same text file "E_Summary.txt", I encountered another hair-tearing problem which was:
Content of the E_Summary File:
Code: Select all
IMPORT EXPORT DETAILS 2017
Export Details}
2157a45e9b - 15-4-2017 - To Unknown
541288t918b9s - 25-7-2017 - To KDM
--
Import Details}
124j19248I192L - 14-9-2017 - From Lahore
h7a8a99879a88 - 15-9-2017 - From Karachi
--
I wanted to extract the content from the file, assigning each line to a different variable and dividing the content into sections. E.g., the content under Import Details should be in the ImportDetail[1], ImportDetail[2]... variables, and the content under Export Details should be in the ExportDetail[1], ExportDetail[2]... variables.
The only non-working and stupid piece of code that I've come up with till now is this:
Code: Select all
:ImportDetails
set ImportDetails_Cnt=0
For /f "skip=1 tokens=*" %%A IN (E_Summary.txt) DO (
if %%A == {*end*} (goto :ExportDetails)
Set "ImportDetailsTodo[!ImportDetails_Cnt!]=%%A"
Set /a ImportDetails_Cnt+=1
)
:ExportDetails
set ExportDetails_Cnt=0
set /a ImportDetails_Cnt+=3
For /f "skip=%ImportDetails_Cnt% tokens=*" %%A IN (E_Summary.txt) DO (
if %%A == {*end*} (goto :Start)
Set "ExportDetailsTodo[!ExportDetails_Cnt!]=%%A"
Set /a ExportDetails_Cnt+=1
)
I'm sure this code will be a good one to laugh at once I've mastered the for loop, but till then, I've got to rely on the professional minds of DosTips to help me out.
Any help is greatly appreciated,
PaperTronics
Re: How to perfectly extract the content of a text file?
Posted: 29 Sep 2017 09:28
by Compo
Here's some example code for you:
Code: Select all
@Echo Off
SetLocal EnableDelayedExpansion
For /F "UseBackQ EOL=- Tokens=*" %%A In ("E_Summary.txt"
) Do Set "vl=%%A" & Call :PL !vl!
For /F "Delims=" %%A In ('Set^|Find "["') Do Echo %%A
Pause
Exit/B
:PL
Echo %*|FindStr/LIE "}">Nul&&(Set "vn=%1"&Set "vi=")||(If Defined vn (
If Not Defined vi (Set "vi=1") Else Set/A "vi+=1"
Set "%vn%[!vi!]=!vl!"))
GoTo :EOF
[Edit /]…or without delayed expansion
Code: Select all
@Echo Off
For /F "UseBackQ EOL=- Tokens=*" %%A In ("E_Summary.txt"
) Do Set "vl=%%A" & Call :PL %%vl%%
For /F "Delims=" %%A In ('Set^|Find "["') Do Echo %%A
Pause
Exit/B
:PL
Echo %*|FindStr/LIE "}">Nul&&(Set "vn=%1"&Set "vi=")||(If Defined vn (
If Not Defined vi (Set "vi=1") Else Set/A "vi+=1"
Call Set "%vn%[%%vi%%]=%%vl%%"))
GoTo :EOF
Re: How to perfectly extract the content of a text file?
Posted: 29 Sep 2017 09:42
by pieh-ejdsch
never go with a goto: mark out of a loop. This will be further muddled afterwards.
Unless you go with an exit / b from the loop that is inside a called subroutine.
This will work:
Code: Select all
:I_E_Details
@echo off
setlocal enabledelayedexpansion
set "prompt=$g$s"
for /f "delims==" %%i in ('2^>nul set cnt_') do set "%%i="
set "data="
for /f tokens^=1^,2^,4^,5^,7^,8delims^=^=^" %%A in (
"exportDetails=Export Details}" ^
"importDetails=Import Details}" ^
"notThis={*end*}"
) do for /f "tokens=*" %%a in (e_summary.txt) do (
if /i %%a == %%B set "data=%%A"
if /i %%a == %%D set "data=%%C"
if /i %%a == %%F set "data="
if defined data (
set /a "cnt_D=(cnt_!data!+=1)"
set "!data!Todo[!Cnt_D!]=%%a"
)
)
set exp
set imp
pause
exit /b
Phil
Re: How to perfectly extract the content of a text file?
Posted: 29 Sep 2017 22:32
by PaperTronics
@Phil & @Compo - I'm really grateful to you because the code that you guys provided is working. But I actually have 5 Files that I want to use this code on. 4 of them are similar to each other and the code works on them, but the 5th one is a bit different and the code doesn't work on it. Here is the content of the 5th file:
Code: Select all
Common Details}
Gear
1249nu8h12
{*end*}
Urgent Details}
29589fvzgd
295 cs38 r93112249902u741
{*end*}
Home Details}
71%ahr418901if15
{*end*}
Work Details}
58ggg1vv81 bc9sbu751
838591ffsas
13958933911v
18488
{*end*}
Office Details}
swiyr128rnddd12fcbq
fheuwvfewycey8r13353
385385y2751t3r8
3851375t1375831r1
2145
{*end*}
Also, data is regularly entered into this file by another batch file, so the amount of lines in a section may not be the same every time. I DO NOT want the titles (Office Details, Work Details, etc.) and I also don't want the {*end*} thingys.
Thanks for your help,
PaperTronics
Re: How to perfectly extract the content of a text file?
Posted: 30 Sep 2017 02:47
by pieh-ejdsch
Until now it was not clear enough what the file looks like (even the example was different from your explanation), nor which lines are to be extracted.
Well, here is a revised version:
Code: Select all
@echo off
setlocal disabledelayedexpansion
set "prompt=$g$s"
for /f "delims==" %%i in ('2^>nul set cnt_') do set "%%i="
set "data="
for /f "tokens=*" %%a in (e_summary.txt) do (
set "Line=%%a"
setlocal enabledelayedexpansion
if !line! == {*end*} (
set "data="
) else if .!line:*}^=! == . (
set "data=!line: =!"
set "data=!data:}=!"
) else if defined data (
set /a "cnt_D=(cnt_!data!+=1)"
)
for /f "tokens=1,2delims=;" %%D in ("!data!;!Cnt_D!") do (
endlocal
set "data=%%D"
set "cnt_%%D=%%E"
if NOT .%%E == . set "%%DTodo[%%E]=%%a"
)
if NOT .!! == . endlocal
)
set |find "["
pause
exit /b
Phil
Re: How to perfectly extract the content of a text file?
Posted: 30 Sep 2017 03:01
by Compo
The problem would have just been the EOL character I was using; however, you decided to sneakily include a % character.
Try this slightly modified version, only for files which use the {*end*} type markers (I have fixed the scripts I posted previously for use with -- markers):
Code: Select all
@Echo Off
SetLocal EnableDelayedExpansion
For /F "UseBackQ EOL={ Tokens=*" %%A In ("E_Summary.txt"
) Do Set "vl=%%A" & Call :PL !vl!
For /F "Delims=" %%A In ('Set^|Find "["') Do Echo %%A
Pause
Exit/B
:PL
Echo %*|FindStr/LIE "}">Nul&&(Set "vn=%1"&Set "vi=")||(If Defined vn (
If Not Defined vi (Set "vi=1") Else Set/A "vi+=1"
Set "%vn%[!vi!]=!vl!"))
GoTo :EOF
…and without delayed expansion
Code: Select all
@Echo Off
For /F "UseBackQ EOL={ Tokens=*" %%A In ("E_Summary.txt"
) Do Set "vl=%%A" & Call :PL %%vl%%
For /F "Delims=" %%A In ('Set^|Find "["') Do Echo %%A
Pause
Exit/B
:PL
Echo %*|FindStr/LIE "}">Nul&&(Set "vn=%1"&Set "vi=")||(If Defined vn (
If Not Defined vi (Set "vi=1") Else Set/A "vi+=1"
Call Set "%vn%[%%vi%%]=%%vl%%"))
GoTo :EOF
Re: How to perfectly extract the content of a text file?
Posted: 30 Sep 2017 08:02
by PaperTronics
@Phil - Thanks a bunch man! Your code works perfectly!
@Compo - I tried your code but it doesn't seem to support poison characters. I still appreciate the efforts you put into brainstorming the code.
Thanks again,
PaperTronics
Re: How to perfectly extract the content of a text file?
Posted: 30 Sep 2017 10:42
by dbenham
pieh-ejdsch wrote:
never go with a goto: mark out of a loop. This will be further muddled afterwards.
Unless you go with an exit / b from the loop that is inside a called subroutine.
I don't fully understand what you are saying, nor why you are bringing this up. But my limited understanding of your words appears to contradict techniques that I know work very well with batch.
Using GOTO :labelOutsideLoop is a long standing, tried and true method to exit a FOR loop prematurely. I, and many others, use it frequently, with perfect results.
Trivial example that demonstrates processing only the first line of a file (there are better ways to do this):
Code: Select all
for /f "delims=" %%A in (file.txt) do (
echo %%A
goto :endLoop
)
:endLoop
The only situation for which it does not work well is with a FOR /L loop with many iterations, because the loop continues to iterate to completion (taking time), even though the contents of the DO loop are not actually executed after the GOTO has executed.
Trivial FOR /L example that does not work well:
Code: Select all
for /l %%N in (1 1 1000000) do (
echo %%N
goto :endLoop
)
:endLoop
Only 1 line is printed, but the loop takes significant time because it still iterates 1 million times.
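For comparison, the other approach raised earlier in the thread, leaving a loop with EXIT /B from inside a called subroutine, might look like the sketch below. It assumes the same file.txt as the first example; the label :printFirstLine is a hypothetical name chosen for illustration and does not come from any poster's script.

```batch
@echo off
rem Assumption: file.txt exists in the current directory.
call :printFirstLine "file.txt"
echo Back in the main script.
exit /b

:printFirstLine
rem %~1 is the file name passed by CALL, with surrounding quotes removed.
for /f "usebackq delims=" %%A in ("%~1") do (
    echo(%%A
    rem EXIT /B returns to the caller, so no further lines are processed.
    exit /b
)
exit /b
```

Because the loop lives in its own subroutine, control returns to the line after the CALL, which keeps the early exit separate from the main flow of the script.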
Dave Benham
Re: How to perfectly extract the content of a text file?
Posted: 30 Sep 2017 12:12
by Compo
PaperTronics wrote:
@Phil - Thanks a bunch man! Your code works perfectly!
@Compo - I tried your code but it doesn't seem to support poison characters.
The only poison character you provided was a percent character.
I am certainly not spending my evenings/weekends providing a script which pre-guesses whichever poison characters you decide to throw at it after it's written.
Here's your text file, now containing four additional poison characters: | ! ; &
Code: Select all
Common Details}
Gear
1249|nu8h12
{*end*}
Urgent Details}
29589fvzgd
295 cs38 r931122!49902u741
{*end*}
Home Details}
71%ahr418901if15
{*end*}
Work Details}
58ggg;1vv81 bc9sbu751
838591ffsas
13958933911v
18488
{*end*}
Office Details}
swiyr128rnddd12fcbq
fheuwvfe&wycey8r13353
385385y2751t3r8
3851375t1375831r1
2145
{*end*}
And here is the side-by-side output of the last posted scripts by Phil and myself:
Code: Select all
Phil's                                          Compo's [no delayed expansion]        Expected Output
(missing)                                       Common[1]=Gear                        Common[1]=Gear
(missing)                                       (missing)                             Common[2]=1249|nu8h12
HomeDetailsTodo[1]=71%ahr418901if15             Home[1]=71%ahr418901if15              Home[1]=71%ahr418901if15
OfficeDetailsTodo[1]=swiyr128rnddd12fcbq        Office[1]=swiyr128rnddd12fcbq         Office[1]=swiyr128rnddd12fcbq
OfficeDetailsTodo[2]=fheuwvfe&wycey8r13353      (missing)                             Office[2]=fheuwvfe&wycey8r13353
OfficeDetailsTodo[3]=385385y2751t3r8            Office[2]=385385y2751t3r8             Office[3]=385385y2751t3r8
OfficeDetailsTodo[4]=3851375t1375831r1          Office[3]=3851375t1375831r1           Office[4]=3851375t1375831r1
OfficeDetailsTodo[5]=2145                       Office[4]=2145                        Office[5]=2145
UrgentDetailsTodo[1]=29589fvzgd                 Urgent[1]=29589fvzgd                  Urgent[1]=29589fvzgd
UrgentDetailsTodo[2]=295 cs38 r93112249902u741  Urgent[2]=295 cs38 r931122!49902u741  Urgent[2]=295 cs38 r931122!49902u741
WorkDetailsTodo[1]=58ggg;1vv81 bc9sbu751        Work[1]=58ggg;1vv81 bc9sbu751         Work[1]=58ggg;1vv81 bc9sbu751
WorkDetailsTodo[2]=838591ffsas                  Work[2]=838591ffsas                   Work[2]=838591ffsas
WorkDetailsTodo[3]=13958933911v                 Work[3]=13958933911v                  Work[3]=13958933911v
WorkDetailsTodo[4]=18488                        Work[4]=18488                         Work[4]=18488
Now as you can see, both scripts failed to output one entry each, but mine correctly output the exclamation mark in the second entry under Urgent Details} whereas Phil's didn't.
In the case above, my script output is therefore more accurate. So am I saying mine is better or you are wrong? No, certainly not! Just that both scripts have issues with characters which were not revealed to us when you provided your examples without ever mentioning the possibility of poison characters.
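As general background on the exclamation mark: a FOR variable such as %%A that is expanded while delayed expansion is enabled loses any ! it contains. A common idiom for avoiding that is sketched below, assuming the same E_Summary.txt; it assigns the line with delayed expansion disabled and enables it only to read the value back. The variable name line is illustrative, and this is not a reconstruction of either script compared above.

```batch
@echo off
setlocal DisableDelayedExpansion
rem Assumption: E_Summary.txt is in the current directory.
rem Note: FOR /F still skips empty lines and, by default, lines starting with a semicolon.
for /f "usebackq delims=" %%A in ("E_Summary.txt") do (
    rem Assign while delayed expansion is OFF so that ! in the line is kept.
    set "line=%%A"
    setlocal EnableDelayedExpansion
    rem Reading via !line! keeps characters such as ^& ^| ^< ^> from being re-parsed.
    echo(!line!
    endlocal
)
endlocal
```

Each SETLOCAL inside the loop is paired with an ENDLOCAL in the same iteration, so the nesting depth stays constant however many lines the file has. Anything assigned between the two is discarded by ENDLOCAL, so values that must survive the iteration need to be carried across it separately.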
Re: How to perfectly extract the content of a text file?
Posted: 30 Sep 2017 23:45
by PaperTronics
@Compo - I'm really sorry. I apologize for the lack of details, but eventually there will be every type of character in the file. Whether it be a poison character or any other, I hope that by then I will have mastered the for loop and know how to get around this problem.
Anyways, about the code:
I tried your code again today, and it worked as I wanted it to. With the added benefit of your code being short and easy for me to understand, I'm using your code from now on. Thanks, and sorry for my foolishness. Is there any way I could restore your evenings?
@dbenham - I agree, too.
Thanks and sorry,
PaperTronics
Re: How to perfectly extract the content of a text file?
Posted: 01 Oct 2017 04:38
by Compo
TBF, looking at the side-by-side comparisons in my last posting (which I've edited to include the Expected Output), I wouldn't use either of them at this stage!
I may however prefer to use this version:
Code: Select all
@Echo Off
For /F "UseBackQEOL={Tokens=*" %%A In ("E_Summary.txt"
) Do Echo(%%A>"$.tmp"&Set "vl=%%A"&Call :PL
If Exist "$.tmp" (Del "$.tmp"&Set|Find "["&Timeout -1)
Exit/B
:PL
FindStr/E "}" "$.tmp">Nul&&(Set "vn=%vl: ="&:"%"
Set "vi=")||(If Defined vn (If Not Defined vi (Set "vi=1") Else Set/A vi+=1
Call Set "%%vn%%[%%vi%%]=%%vl%%"))
GoTo :EOF
Re: How to perfectly extract the content of a text file?
Posted: 04 Oct 2017 05:16
by PaperTronics
@Compo - Thanks again, for the rewrite. The revised version of your code printed the line with poison characters just fine and I noticed a slight speed change too.
Thanks for your help,
PaperTronics