BatchSubstitute Maximum Input/Output

Discussion forum for all Windows batch related topics.

Moderator: DosItHelp

JDV
Posts: 7
Joined: 17 Dec 2010 14:52

BatchSubstitute Maximum Input/Output

#1 Post by JDV » 19 Jan 2011 12:07

Is there some limit to the amount of text that BatchSub (the improved, forum version) can search through?

I'm wondering because I'm using BatchSub in a second project that involves searching/replacing text in an XML file and saving the result as a new XML file (the first project was nearly identical and worked perfectly; the XML files involved just were not as large).

The XML is created by a stored procedure. When I retrieve 15 records, BatchSub is able to search/replace correctly.
When I retrieve more than 15 records, perhaps the amount of text throws it off - and much of the original text is deleted in the resulting file.

The odd thing is (the XML file is an RSS feed) that BatchSub leaves almost all of the RSS header and leaves the footer. It only deletes the body (all the "<items>") as well as one line of the header that suspiciously happens to be on the same line as the deleted body.

The original XML file seems to be created properly no matter the number of records, so I've narrowed it to BatchSub.
I've also tried changing the words being searched for and the records being retrieved and those don't seem to be a problem.

Any help is appreciated, Thanks.

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: BatchSubstitute Maximum Input/Output

#2 Post by aGerman » 19 Jan 2011 12:24

There is definitely a limit for the length of a line (I forgot the number of characters, sorry).
If you open the file in a text editor, do you find the file content cascaded or all in one line?

Regards
aGerman

JDV
Posts: 7
Joined: 17 Dec 2010 14:52

Re: BatchSubstitute Maximum Input/Output

#3 Post by JDV » 19 Jan 2011 13:28

Thanks. The length of a line is probably it.
All the deleted lines are fairly long, while the sections that are left are more cascaded.

I would have thought the lines would be truncated instead of deleted. Is this limit part of the Bat file or just something with CMD? Can it be adjusted?


aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: BatchSubstitute Maximum Input/Output

#5 Post by aGerman » 19 Jan 2011 16:06

Good link, ChickenSoup.

JDV
As you can see, there is no way to adjust it.
IMHO you should learn something about the XML DOM (it can be used with VBScript). It provides a better way to manipulate XML files.

Concerning the long lines: Are these single nodes or are there child nodes inside which could be cascaded?

Regards
aGerman

JDV
Posts: 7
Joined: 17 Dec 2010 14:52

Re: BatchSubstitute Maximum Input/Output

#6 Post by JDV » 19 Jan 2011 16:22

ChickenSoup wrote:http://support.microsoft.com/kb/830473

"Modify programs that require long command lines so that they use a file that contains the parameter information, and then include the name of the file in the command line.

For example, instead of using the ExecutableFile.exe Parameter1 Parameter2 ...ParameterN command line in a batch file, modify the program to use a command line that is similar to the following command line, where ParameterFile is a file that contains the required parameters (parameter1 parameter2 ...ParameterN):
ExecutableFile.exe c:\temp\ParameterFile.txt"

This workaround the KB mentions seems to be exactly what I already do when running a .BAT containing BatchSub, essentially this:

CALL BatchSubstitute.bat "OldWord" "NewWord" oldfile.xml>newfile.xml

Concerning the long lines: it is an RSS feed like <channel><item(1)></item><item(2)></item>...etc.</channel>. "Item" and its sub-tags are child nodes, correct?
When I open the .XML in Notepad, the section that gets deleted is wrapped across several lines (probably just Notepad's word wrap). Visual Studio displays the XML properly formatted (I assume it's just being "smart"). But the section actually starts out as a single long line, which comes from a single row in a database table.

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: BatchSubstitute Maximum Input/Output

#7 Post by aGerman » 19 Jan 2011 17:59

JDV wrote:This workaround the KB mentions seems to be exactly what I already do when running a .BAT containing BatchSub, essentially this:

CALL BatchSubstitute.bat "OldWord" "NewWord" oldfile.xml>newfile.xml

No, because BatchSubstitute.bat has to process each line of oldfile.xml. If a line is too large for a variable (that is expanded to the content of the line again) then it will not work.
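
To confirm which lines are affected, the file can be scanned for over-long lines outside of batch. The following is only a minimal Python sketch and not part of BatchSubstitute.bat: the function name find_long_lines is made up for illustration, and the threshold of 8191 characters is an assumption taken from the cmd.exe limit that KB 830473 gives for Windows XP and later.

```python
# Assumption: 8191 is the cmd.exe string limit quoted in KB 830473.
CMD_LIMIT = 8191


def find_long_lines(path, limit=CMD_LIMIT, encoding="utf-8"):
    """Return a list of (line_number, length) for lines longer than limit."""
    long_lines = []
    with open(path, "r", encoding=encoding) as handle:
        for number, line in enumerate(handle, start=1):
            # Do not count the trailing line break as part of the line.
            length = len(line.rstrip("\n"))
            if length > limit:
                long_lines.append((number, length))
    return long_lines
```

Calling find_long_lines("oldfile.xml") returns an empty list when every line is within the assumed limit; any entries point at the lines a batch variable is unlikely to hold.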

OK, it's a bit off topic for a batch forum, but let's try to transform your long lines into an indented block (BTW I'm not an expert in XML DOM ...)

I wrote the following XML file for testing:
test.xml

Code:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<test><string1>qwe</string1><string2>asd</string2><string3>yxc</string3></test>

As you can see, all nodes are on a single line.

Now we need a stylesheet that tells the parser how to transform the file:
transform.xslt

Code:

<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xalan="http://xml.apache.org/xslt" version="1.0">
   <xsl:output method="xml" encoding="UTF-8" standalone="yes" indent="yes" xalan:indent-amount="4"/>
   <xsl:strip-space elements="*"/>
   <xsl:template match="@*|node()">
      <xsl:copy>
         <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
   </xsl:template>
</xsl:stylesheet>

And a VBScript to do the job:
transform.vbs

Code:

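' Input file (overwritten with the indented result) and the stylesheet
Const xmlfile = "test.xml"
Const xsltfile = "transform.xslt"

' Load the source document synchronously; stop if it cannot be parsed
Set oXmlDoc = CreateObject("Microsoft.XMLDOM")
oXmlDoc.async = False
If Not oXmlDoc.load(xmlfile) Then
   WScript.Echo "Cannot load " & xmlfile & ": " & oXmlDoc.parseError.reason
   WScript.Quit 1
End If

' Load the stylesheet the same way
Set oXslDoc = CreateObject("Microsoft.XMLDOM")
oXslDoc.async = False
If Not oXslDoc.load(xsltfile) Then
   WScript.Echo "Cannot load " & xsltfile & ": " & oXslDoc.parseError.reason
   WScript.Quit 1
End If

' Empty document that receives the transformed output
Set oXmlOutDoc = CreateObject("Microsoft.XMLDOM")
oXmlOutDoc.async = False

' Apply the stylesheet and write the result back over the input file
oXmlDoc.transformNodeToObject oXslDoc, oXmlOutDoc
oXmlOutDoc.save xmlfile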


After I run transform.vbs, the content of the XML file is as follows:

Code:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<test>
   <string1>qwe</string1>
   <string2>asd</string2>
   <string3>yxc</string3>
</test>


NOTE:
Check the encoding declared in your XML file. I used UTF-8; if yours is different, change the encoding attribute of xsl:output in the XSLT file to match.

Hope this will help
aGerman

ghostmachine4
Posts: 319
Joined: 12 May 2006 01:13

Re: BatchSubstitute Maximum Input/Output

#8 Post by ghostmachine4 » 19 Jan 2011 21:47

JDV wrote:Any help is appreciated, Thanks.

people should stop using batch to do things like file processing (and others), especially if it's XML or HTML. Use a programming language with XML/HTML facilities (or at least with regular expression support + string manipulation functions)
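
As a minimal illustration of that advice, the following Python sketch does the same kind of literal search and replace as the BatchSubstitute call earlier in the thread, but reads the whole file at once, so line length does not matter. The function name substitute_in_file and the UTF-8 default are assumptions for illustration, and the sketch treats the XML as plain text rather than parsing it.

```python
def substitute_in_file(old, new, src_path, dst_path, encoding="utf-8"):
    """Copy src_path to dst_path, replacing every literal occurrence of old.

    Returns the number of replacements made.
    """
    # newline="" keeps the original line endings untouched on read and write.
    with open(src_path, "r", encoding=encoding, newline="") as src:
        text = src.read()
    count = text.count(old)
    with open(dst_path, "w", encoding=encoding, newline="") as dst:
        dst.write(text.replace(old, new))
    return count
```

A call such as substitute_in_file("OldWord", "NewWord", "oldfile.xml", "newfile.xml") mirrors the batch command line quoted above.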

JDV
Posts: 7
Joined: 17 Dec 2010 14:52

Re: BatchSubstitute Maximum Input/Output

#9 Post by JDV » 20 Jan 2011 09:50

ghostmachine4 wrote:people should stop using batch to do things like file processing (and others), especially if it's XML or HTML. Use a programming language with XML/HTML facilities (or at least with regular expression support + string manipulation functions)

:oops: I know. I'm just rather....limited in that area.

aGerman wrote:No, because BatchSubstitute.bat has to process each line of oldfile.xml. If a line is too large for a variable (that is expanded to the content of the line again) then it will not work.

OK, thanks for the explanation.

aGerman wrote:Hope this will help
aGerman


This was indeed a great help. Thank you! :D
*SOLVED*
