Parsing XML using Batch

darioit
Posts: 230
Joined: 02 Aug 2010 05:25

Parsing XML using Batch

#1 Post by darioit » 20 Oct 2010 02:18

Hello everybody,

I need to parse many blocks of XML like the one below and extract only a few values from each:

<DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp">
<FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="uuuuuu" VALUE="abcdefghilmnoprz" TYPE="STRING" />
</DOCUMENT>

to obtain this row:
DISK="aaaaa";PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp";"Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0";"uuuuuu" VALUE="abcdefghilmnoprz"

or, alternatively, like this:
aaaaa;aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp;2000-01-01 00:00:00.0;abcdefghilmnoprz

Regards
Dario

ghostmachine4
Posts: 319
Joined: 12 May 2006 01:13

Re: Parsing XML using Batch

#2 Post by ghostmachine4 » 20 Oct 2010 03:15

Batch is not really suitable for parsing files, let alone XML. Ideally you should use XML processing tools, or a programming language that has XML APIs/libraries. Here's an example in native VBScript:

strFile = WScript.Arguments(0)                          ' path of the XML file, first argument
Set objFS = CreateObject("Scripting.FileSystemObject")  ' not used below
Set xmlDoc = CreateObject("Microsoft.XMLDOM")
xmlDoc.async = False                                    ' load synchronously
xmlDoc.load(strFile)
For Each x In xmlDoc.documentElement.attributes         ' attributes of the root element
  WScript.Echo x.nodeName, x.text
Next
Set xmlCol = xmlDoc.documentElement.childNodes
For Each Elem In xmlCol                                 ' each child of the root
  WScript.Echo "Now at child node: " & Elem.nodeName
  For Each z In Elem.attributes                         ' attributes of that child
    WScript.Echo z.nodeName, z.text
  Next
Next


Output:

C:\test>cscript //nologo test.vbs xmlfile
TYPE ABC
DISK aaaaa
PATH aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp
Now at child node: FIELD
NAME Data xxxxxxxxxx
VALUE 2000-01-01 00:00:00.0
TYPE DATE
Now at child node: FIELD
NAME uuuuuu
VALUE abcdefghilmnoprz
TYPE STRING
Last edited by ghostmachine4 on 20 Oct 2010 03:59, edited 1 time in total.
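
For readers without Windows Script Host, the same attribute walk can be sketched in Python with the standard library's xml.etree.ElementTree. This is only an illustration, not part of the original reply: Python is swapped in for VBScript, and the names SAMPLE and dump_attributes are made up for this sketch. It relies on attributes being returned in document order, which holds for Python 3.8 and later.

```python
import xml.etree.ElementTree as ET

# The single-DOCUMENT sample from the first post.
SAMPLE = """<DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp">
<FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="uuuuuu" VALUE="abcdefghilmnoprz" TYPE="STRING" />
</DOCUMENT>"""


def dump_attributes(xml_text):
    """Return the root's attributes, then each child's, as 'name value' lines."""
    root = ET.fromstring(xml_text)
    lines = [f"{name} {value}" for name, value in root.attrib.items()]
    for child in root:
        lines.append(f"Now at child node: {child.tag}")
        lines.extend(f"{name} {value}" for name, value in child.attrib.items())
    return lines


if __name__ == "__main__":
    print("\n".join(dump_attributes(SAMPLE)))
```

Like the VBScript version, this expects exactly one top-level element in the input.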

darioit
Posts: 230
Joined: 02 Aug 2010 05:25

Re: Parsing XML using Batch

#3 Post by darioit » 20 Oct 2010 03:20

It doesn't work; it says an object is required: xmlDoc.documentElement

Regards
Dario

ghostmachine4
Posts: 319
Joined: 12 May 2006 01:13

Re: Parsing XML using Batch

#4 Post by ghostmachine4 » 20 Oct 2010 03:57

Saying it doesn't work doesn't really help, does it? Which Windows version are you using? How did you run the script? What error messages did you see?

darioit
Posts: 230
Joined: 02 Aug 2010 05:25

Re: Parsing XML using Batch

#5 Post by darioit » 20 Oct 2010 04:09

Windows XP Professional SP3 (Italian)

This is the error:

test.vbs(6, 1) Microsoft VBScript runtime error: Object required: 'xmlDoc.documentElement'

Thanks
Dario

darioit
Posts: 230
Joined: 02 Aug 2010 05:25

Re: Parsing XML using Batch

#6 Post by darioit » 20 Oct 2010 04:14

It works only with one document:
<DOCUMENT ....
</DOCUMENT>

I have many documents:

<DOCUMENT ....
</DOCUMENT>
<DOCUMENT ....
</DOCUMENT>
<DOCUMENT ....
</DOCUMENT>

Regards
Dario
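
A likely reason for the "Object required" error is that a file holding several top-level DOCUMENT elements is not well-formed XML, so the load fails and there is no document element to walk. The Python sketch below illustrates the idea under stated assumptions: the attribute values in FRAGMENT and the names is_well_formed, parse_fragment and WRAPPER are all hypothetical, and wrapping only works when the text has no XML declaration of its own.

```python
import xml.etree.ElementTree as ET

# Two sibling DOCUMENT elements with no enclosing root (values are made up).
FRAGMENT = """<DOCUMENT TYPE="ABC" DISK="aaaaa">
</DOCUMENT>
<DOCUMENT TYPE="ABC" DISK="bbbbb">
</DOCUMENT>"""


def is_well_formed(xml_text):
    """Return True if the text parses as a single XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def parse_fragment(xml_text):
    """Parse text that may hold several top-level elements.

    The text is wrapped in a synthetic WRAPPER element so that the parser
    sees exactly one root. Assumes the text has no <?xml ...?> declaration.
    """
    return ET.fromstring(f"<WRAPPER>{xml_text}</WRAPPER>")


if __name__ == "__main__":
    print(is_well_formed(FRAGMENT))
    for doc in parse_fragment(FRAGMENT).findall("DOCUMENT"):
        print(doc.get("DISK"))
```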

ghostmachine4
Posts: 319
Joined: 12 May 2006 01:13

Re: Parsing XML using Batch

#7 Post by ghostmachine4 » 20 Oct 2010 04:37

Then show a representative sample of the XML file you are actually processing!

darioit
Posts: 230
Joined: 02 Aug 2010 05:25

Re: Parsing XML using Batch

#8 Post by darioit » 20 Oct 2010 04:57

<INDEX>
<DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp">
<FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="uuuuuu" VALUE="abcdefghilmnoprz" TYPE="STRING" />
<FIELD NAME="Tipo ABC" VALUE="V" TYPE="STRING" />
<FIELD NAME="Tipo O" VALUE="E" TYPE="STRING" />
<FIELD NAME="Tipo P" VALUE="0" TYPE="STRING" />
<FIELD NAME="Cod O" VALUE="456768345234" TYPE="STRING" />
<FIELD NAME="Cod A" VALUE="123123123" TYPE="STRING" />
<FIELD NAME="Data E ABC" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="Nu V" VALUE="null" TYPE="NUMBER" />
<FIELD NAME="D V" VALUE="null" TYPE="DATE" />
</DOCUMENT>
<DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp">
<FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="uuuuuu" VALUE="abcdefghilmnopr1" TYPE="STRING" />
<FIELD NAME="Tipo ABC" VALUE="V" TYPE="STRING" />
<FIELD NAME="Tipo O" VALUE="E" TYPE="STRING" />
<FIELD NAME="Tipo P" VALUE="0" TYPE="STRING" />
<FIELD NAME="Cod O" VALUE="456768345234" TYPE="STRING" />
<FIELD NAME="Cod A" VALUE="123123123" TYPE="STRING" />
<FIELD NAME="Data E ABC" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="Nu V" VALUE="null" TYPE="NUMBER" />
<FIELD NAME="D V" VALUE="null" TYPE="DATE" />
</DOCUMENT>
... and thousands more lines like these

I need to match a value taken from a list (for example abcdefghilmnoprz) against the row <FIELD NAME="uuuuuu" VALUE="abcdefghilmnoprz" TYPE="STRING" />,
and then write one line containing that value together with these other fields: TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp"
and "Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0"

Regards
Dario
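
The requirement above can be sketched in Python with an XML parser instead of text matching. This is a minimal sketch under assumptions: the sample is shortened to two fields per document and given a closing </INDEX> tag that the posted excerpt does not show, the function names are hypothetical, and the output uses the semicolon-separated form from the first post.

```python
import xml.etree.ElementTree as ET

# Shortened version of the posted sample; the closing </INDEX> is assumed.
SAMPLE = """<INDEX>
<DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp">
<FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="uuuuuu" VALUE="abcdefghilmnoprz" TYPE="STRING" />
</DOCUMENT>
<DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp">
<FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="uuuuuu" VALUE="abcdefghilmnopr1" TYPE="STRING" />
</DOCUMENT>
</INDEX>"""


def field_value(document, name):
    """Return the VALUE of the FIELD whose NAME equals name, or None."""
    for field in document.iter("FIELD"):
        if field.get("NAME") == name:
            return field.get("VALUE")
    return None


def matching_rows(xml_text, wanted):
    """Return one 'DISK;PATH;date;value' line per DOCUMENT whose uuuuuu field equals wanted."""
    rows = []
    for doc in ET.fromstring(xml_text).iter("DOCUMENT"):
        if field_value(doc, "uuuuuu") == wanted:
            date = field_value(doc, "Data xxxxxxxxxx") or ""
            rows.append(";".join([doc.get("DISK", ""), doc.get("PATH", ""), date, wanted]))
    return rows


if __name__ == "__main__":
    for row in matching_rows(SAMPLE, "abcdefghilmnoprz"):
        print(row)
```

Comparing attribute values through the parser avoids depending on how the lines happen to be broken in the file.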

ghostmachine4
Posts: 319
Joined: 12 May 2006 01:13

Re: Parsing XML using Batch

#9 Post by ghostmachine4 » 20 Oct 2010 05:23

Since your requirement is simple enough, you can use this gawk script:

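BEGIN {
    RS = "</DOCUMENT>"   # one record per DOCUMENT element (multi-character RS needs gawk)
    FS = "\n"            # one field per line of the record
}
{
    d = ""; n = ""       # reset so nothing leaks in from the previous record
    for (i = 1; i <= NF; i++) {
        if ($i ~ /DOCUMENT/) { d = $i }       # the opening DOCUMENT tag
        if ($i ~ /NAME="Data/) { n = $i }     # the date field; matching is case-sensitive
        if ($i ~ /VALUE="abcdefghilmnoprz"/) {
            print "->" d, n, $i
        }
    }
}

Save the above as myscript.awk and run:

C:\test>gawk -f myscript.awk file
-><DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp"> <FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" /> <FIELD NAME="uuuuuu" VALUE="abcdefghilmnoprz" TYPE="STRING" />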

darioit
Posts: 230
Joined: 02 Aug 2010 05:25

Re: Parsing XML using Batch

#10 Post by darioit » 20 Oct 2010 06:00

Great, it works fine. But I have many codes to find, about 200 of them in a text file. How can I pass in all the codes and get a list?

Regards
Dario
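
One possible way to handle a whole list of codes is sketched below in Python. It is a sketch under assumptions, not an answer from the thread: the codes file is assumed to hold one code per line in UTF-8, the XML file is assumed to be well-formed with a single root such as INDEX, and the function names, default field names and script name are hypothetical. ET.iterparse is used so that a file with thousands of documents does not have to be held in memory at once.

```python
import sys
import xml.etree.ElementTree as ET


def load_codes(path):
    """Read one code per line, ignoring blank lines (assumed file layout)."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def rows_for_codes(xml_path, codes, key_field="uuuuuu", date_field="Data xxxxxxxxxx"):
    """Return 'DISK;PATH;date;code' for every DOCUMENT whose key field is in codes."""
    rows = []
    for _event, elem in ET.iterparse(xml_path, events=("end",)):
        if elem.tag != "DOCUMENT":
            continue  # FIELD elements stay attached until their DOCUMENT ends
        fields = {f.get("NAME"): f.get("VALUE") for f in elem.iter("FIELD")}
        code = fields.get(key_field)
        if code in codes:
            date = fields.get(date_field) or ""
            rows.append(";".join([elem.get("DISK", ""), elem.get("PATH", ""), date, code]))
        elem.clear()  # drop the processed children to keep memory use low
    return rows


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python find_codes.py codes.txt index.xml
    for row in rows_for_codes(sys.argv[2], load_codes(sys.argv[1])):
        print(row)
```

Holding the codes in a set keeps each lookup cheap, so the XML file is read only once however many codes there are.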
