
Parsing XML using Batch

Posted: 20 Oct 2010 02:18
by darioit
Hello everybody,

I need to parse a lot of lines like this XML and extract only a few values from each:

<DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp">
<FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="uuuuuu" VALUE="abcdefghilmnoprz" TYPE="STRING" />
</DOCUMENT>

to obtain this row
DISK="aaaaa";PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp";"Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0";"uuuuuu" VALUE="abcdefghilmnoprz"

or also like this
aaaaa;aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp;2000-01-01 00:00:00.0;abcdefghilmnoprz

Regards
Dario
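
For reference, the second output format asked for above can be sketched in a few lines of Python with the standard-library XML parser. This is only a minimal illustration, not something posted in the thread: the function name document_to_row is made up, and it assumes every FIELD value should be appended in document order after DISK and PATH.

```python
import xml.etree.ElementTree as ET

# Sample element copied from the post above.
SAMPLE = '''<DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp">
<FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="uuuuuu" VALUE="abcdefghilmnoprz" TYPE="STRING" />
</DOCUMENT>'''


def document_to_row(doc):
    """Join DISK, PATH and the VALUE of every FIELD child with semicolons.

    Hypothetical helper: a missing attribute becomes an empty string.
    """
    parts = [doc.get("DISK", ""), doc.get("PATH", "")]
    parts += [field.get("VALUE", "") for field in doc.findall("FIELD")]
    return ";".join(parts)


row = document_to_row(ET.fromstring(SAMPLE))
print(row)
```

Reading the attributes through a parser rather than by splitting text means the order of attributes inside a tag does not matter.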

Re: Parsing XML using Batch

Posted: 20 Oct 2010 03:15
by ghostmachine4
Batch is not really suitable for parsing files, let alone for parsing XML. Ideally you should use XML processing tools, or a programming language that provides XML APIs/libraries. Here's an example in native VBScript:

Code: Select all

strFile = WScript.Arguments(0)
' Load the XML file synchronously into an MSXML DOM document
Set xmlDoc = CreateObject("Microsoft.XMLDOM")
xmlDoc.async = False
xmlDoc.load(strFile)
For Each x In xmlDoc.documentElement.attributes
  ' Attributes of the root element
  WScript.Echo x.nodeName, x.text
Next
Set xmlCol = xmlDoc.documentElement.childNodes
For Each Elem In xmlCol
  WScript.Echo "Now at child node: " & Elem.nodeName
  ' Attributes of each child element
  For Each z In Elem.attributes
    WScript.Echo z.nodeName, z.text
  Next
Next


output

Code: Select all

C:\test>cscript //nologo test.vbs xmlfile
TYPE ABC
DISK aaaaa
PATH aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp
Now at child node: FIELD
NAME Data xxxxxxxxxx
VALUE 2000-01-01 00:00:00.0
TYPE DATE
Now at child node: FIELD
NAME uuuuuu
VALUE abcdefghilmnoprz
TYPE STRING

Re: Parsing XML using Batch

Posted: 20 Oct 2010 03:20
by darioit
It doesn't work; it says it needs this: xmlDoc.documentElement

Regards
Dario

Re: Parsing XML using Batch

Posted: 20 Oct 2010 03:57
by ghostmachine4
Saying it doesn't work doesn't really help, does it? Which Windows version are you using? How did you execute it? What other error messages did you see?

Re: Parsing XML using Batch

Posted: 20 Oct 2010 04:09
by darioit
XP Professional SP3 (Italian)

This is the error:

test.vbs(6, 1) Microsoft VBScript runtime error: Object required: 'xmlDoc.documentElement'

Thanks
Dario

Re: Parsing XML using Batch

Posted: 20 Oct 2010 04:14
by darioit
It works only with one document:
<DOCUMENT ....
</DOCUMENT>

I have many documents...

<DOCUMENT ....
</DOCUMENT>
<DOCUMENT ....
</DOCUMENT>
<DOCUMENT ....
</DOCUMENT>

Regards
Dario
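
A likely explanation for the "Object required" error: well-formed XML has exactly one root element, so a file made of several sibling DOCUMENT elements is rejected by the parser and no document element is available. The Python sketch below only illustrates that rule and one possible workaround, wrapping the fragments in a synthetic root. The helper name parse_fragments and the root tag INDEX are assumptions for illustration, and the wrapping trick assumes the text carries no XML declaration of its own.

```python
import xml.etree.ElementTree as ET

# Two sibling elements with no common root: not a well-formed XML document.
FRAGMENTS = '''<DOCUMENT DISK="a"></DOCUMENT>
<DOCUMENT DISK="b"></DOCUMENT>'''


def parse_fragments(text, root_tag="INDEX"):
    """Wrap sibling fragments in one synthetic root element, then parse.

    Hypothetical helper; root_tag is an arbitrary name chosen for the wrapper.
    """
    return ET.fromstring("<%s>%s</%s>" % (root_tag, text, root_tag))


# Parsing the fragments directly fails because of the second top-level element.
try:
    ET.fromstring(FRAGMENTS)
    parsed_without_root = True
except ET.ParseError:
    parsed_without_root = False

# With a wrapper root, both DOCUMENT elements become reachable children.
disks = [doc.get("DISK") for doc in parse_fragments(FRAGMENTS).findall("DOCUMENT")]
print(parsed_without_root, disks)
```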

Re: Parsing XML using Batch

Posted: 20 Oct 2010 04:37
by ghostmachine4
Then show a proper subset of the XML file you are processing!

Re: Parsing XML using Batch

Posted: 20 Oct 2010 04:57
by darioit
<INDEX>
<DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp">
<FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="uuuuuu" VALUE="abcdefghilmnoprz" TYPE="STRING" />
<FIELD NAME="Tipo ABC" VALUE="V" TYPE="STRING" />
<FIELD NAME="Tipo O" VALUE="E" TYPE="STRING" />
<FIELD NAME="Tipo P" VALUE="0" TYPE="STRING" />
<FIELD NAME="Cod O" VALUE="456768345234" TYPE="STRING" />
<FIELD NAME="Cod A" VALUE="123123123" TYPE="STRING" />
<FIELD NAME="Data E ABC" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="Nu V" VALUE="null" TYPE="NUMBER" />
<FIELD NAME="D V" VALUE="null" TYPE="DATE" />
</DOCUMENT>
<DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp">
<FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="uuuuuu" VALUE="abcdefghilmnopr1" TYPE="STRING" />
<FIELD NAME="Tipo ABC" VALUE="V" TYPE="STRING" />
<FIELD NAME="Tipo O" VALUE="E" TYPE="STRING" />
<FIELD NAME="Tipo P" VALUE="0" TYPE="STRING" />
<FIELD NAME="Cod O" VALUE="456768345234" TYPE="STRING" />
<FIELD NAME="Cod A" VALUE="123123123" TYPE="STRING" />
<FIELD NAME="Data E ABC" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="Nu V" VALUE="null" TYPE="NUMBER" />
<FIELD NAME="D V" VALUE="null" TYPE="DATE" />
</DOCUMENT>
................................................. and thousands of other lines

I need to match (from a list) this value (abcdefghilmnoprz), found in this row: <FIELD NAME="uuuuuu" VALUE="abcdefghilmnoprz" TYPE="STRING" />
and write a line with the value and these other fields: TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp"
"Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0"

Regards
Dario

Re: Parsing XML using Batch

Posted: 20 Oct 2010 05:23
by ghostmachine4
Since your requirement is simple enough, you can use this gawk script:

Code: Select all

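BEGIN {
  RS = "</DOCUMENT>"   # one record per DOCUMENT block (multi-character RS needs gawk)
  FS = "\n"            # one field per line
}
{
  d = ""; n = ""       # reset for each record
  for (i = 1; i <= NF; i++) {
    if ($i ~ /<DOCUMENT/) { d = $i }               # opening tag with TYPE, DISK, PATH
    if (n == "" && $i ~ /NAME="Data/) { n = $i }   # first date field; attribute names are upper case
    if ($i ~ /VALUE="abcdefghilmnoprz"/) {
      print "->" d, n, $i
    }
  }
}


Save the above as myscript.awk and run:

Code: Select all

C:\test>gawk -f myscript.awk file
-><DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp"> <FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" /> <FIELD NAME="uuuuuu" VALUE="abcdefghilmnoprz" TYPE="STRING" />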

Re: Parsing XML using Batch

Posted: 20 Oct 2010 06:00
by darioit
Great, it works fine, but I have many codes to find, about 200 in a txt file. How can I pass all the codes and get a list?

Regards
Dario
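
One way to handle a whole list of codes, sketched here in Python rather than gawk, is to read the codes into a set and test each DOCUMENT against it while streaming through the file. Nothing below comes from the thread: the helper names load_codes and matching_rows, the one-code-per-line layout of the text file, and the output column order (code, TYPE, DISK, PATH, date) are all assumptions.

```python
import io
import xml.etree.ElementTree as ET


def load_codes(path):
    """Read one code per line from a text file into a set (assumed layout)."""
    with open(path, encoding="utf-8") as handle:
        return {line.strip() for line in handle if line.strip()}


def matching_rows(xml_source, codes, code_field="uuuuuu", date_field="Data xxxxxxxxxx"):
    """Return one semicolon-separated row per DOCUMENT whose code is in codes.

    xml_source is a file name or a binary file object. The field names are
    taken from the sample in the thread; adjust them for the real data.
    """
    rows = []
    # iterparse streams the file and reports each element when it is complete.
    for _event, elem in ET.iterparse(xml_source):
        if elem.tag != "DOCUMENT":
            continue
        values = {f.get("NAME"): f.get("VALUE") for f in elem.findall("FIELD")}
        code = values.get(code_field)
        if code in codes:
            rows.append(";".join([
                code,
                elem.get("TYPE", ""),
                elem.get("DISK", ""),
                elem.get("PATH", ""),
                values.get(date_field, ""),
            ]))
        elem.clear()  # drop the processed children to keep memory use low
    return rows


# Small in-memory demo built from the sample posted above.
SAMPLE = b'''<INDEX>
<DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp">
<FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="uuuuuu" VALUE="abcdefghilmnoprz" TYPE="STRING" />
</DOCUMENT>
<DOCUMENT TYPE="ABC" DISK="aaaaa" PATH="aaa_0000/aaa.0000000000.uuu_xxx.0000000000.pdf.ppp">
<FIELD NAME="Data xxxxxxxxxx" VALUE="2000-01-01 00:00:00.0" TYPE="DATE" />
<FIELD NAME="uuuuuu" VALUE="abcdefghilmnopr1" TYPE="STRING" />
</DOCUMENT>
</INDEX>'''

rows = matching_rows(io.BytesIO(SAMPLE), {"abcdefghilmnoprz"})
for row in rows:
    print(row)
```

With real files, the call would look like matching_rows("file.xml", load_codes("codes.txt")), where both file names are placeholders. A set lookup keeps the cost per DOCUMENT constant, so 200 codes are no slower to check than one.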