findstr.bat and repl.bat and NULLS

#16 Post by foxidrive » 23 Jul 2014 00:09

Thanks aGerman,

This slightly changed snippet changes nulls to a pipe, which repl and findrepl seemed unable to do.

Code: Select all

var objAdoS = WScript.CreateObject("ADODB.Stream");
objAdoS.Type = 2;                     // 2 = adTypeText: read the file as text
objAdoS.CharSet = "us-ascii";
objAdoS.Open();
objAdoS.LoadFromFile("file.j8i");
var strContent = objAdoS.ReadText();  // read the whole file into a string
objAdoS.Close();
// Replace every NUL character with a pipe.
var strFind = strContent.replace(/\x00/g,"|");

WScript.Echo(strFind);

#17 Post by carlos » 23 Jul 2014 06:55

In my bhx program I used an ADODB object, and I remember that the charset for handling binary data is "windows-1252"; otherwise some characters are interpreted incorrectly.

#18 Post by aGerman » 23 Jul 2014 13:50

Good point, carlos.
As long as all characters are standard ASCII, it doesn't matter, although you probably must not use a Unicode character set (even if UTF-8 could also be a possible encoding). Whether you need a different character set only becomes apparent if one or more extended ASCII characters appear in the plain-text phrase. I assume only foxidrive can answer that ...

@foxidrive
Basically it's not a question of the replace method but of how different string types or string streams are converted into a JScript string object. Printable ASCII characters always seem to be converted correctly. Since the NUL character is also used as a string terminator in some string types, an expression may be truncated at the first occurrence of a 0 byte. That's the behavior you will find if you use the ReadAll method instead of the ADO Stream workaround.
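As a small illustration of that point (a sketch with made-up sample data, not taken from file.j8i): once the data has actually arrived in a JScript/JavaScript string, a NUL is an ordinary character and the replace method handles it without trouble. Any truncation has to happen earlier, while the string is being read.

```javascript
// JScript/JavaScript strings are length-counted, so an embedded NUL is just data.
// "sample" is a hypothetical stand-in for content that was read correctly.
var sample = "A\x00B\x00C";

// The NULs count toward the length (5 characters here).
var sampleLength = sample.length;

// The same regular expression used in the snippet earlier in the thread.
var replaced = sample.replace(/\x00/g, "|");   // "A|B|C"
```

The code is written in plain ES3 style so that it should also run unchanged under cscript.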

Regards
aGerman

#19 Post by foxidrive » 23 Jul 2014 18:22

aGerman wrote: Since the NUL character is also used as a string terminator in some string types, an expression may be truncated at the first occurrence of a 0 byte. That's the behavior you will find if you use the ReadAll method instead of the ADO Stream workaround.


Thanks aGerman, that is the behaviour of every tool I tried in plain batch too.
set /p also truncates at the first nul for instance.

What I was interested in (my first post isn't really clear on this point, but the thread title is - sorry) is that repl.bat and findrepl.bat don't correctly handle a replacement of \x00 with another character.

The character set in this example was plain ascii so that aspect was ok.

#20 Post by dbenham » 23 Jul 2014 22:18

There is something wonky going on with JScript. REPL.BAT is designed to support replacement of 0x00, and in all my prior tests it worked fine.

I created a file test.txt with the following hex values:

Code: Select all

C:\test\test>hexdump test.txt
00 41 00 42 00 43 00

I was able to replace the null bytes no problem:

Code: Select all

C:\test\test><test.txt repl \x00 _ >new.txt

C:\test\test>type new.txt
_A_B_C_


But if I try to replace the nulls in the original file.j8i, without redirecting the output, I get screwy results:

Code: Select all

C:\test\test><file.j8i repl \x00 _
JS-8_U__♦___?_?_?U__C___‼_?_____?9__?9__&___?_?_____?9__?9__(___·_?_____?9__?9__
C___?_?_____?9__?9__☺___?_?_?U__☺___?_?_?U__☺___?_?_?U__'___?_?_____?9__?9__♣___
?_7_?U__♦___8_1_?U__C___?_1_____?9__?9__&___?_2_____?9__?9__(___?_2_____?9__?9__
C___·_2_____?9__?9______________________________?}___H_♀_____________________♦__
___d__ô♥A_^_☺♦__☻ _%I↕♦IéA'<S('♦(S☻(¢ò(,♦äJ♥E¢,(`♣"*¢?I►ôîéA<______ ______ _____
________________________________________________________________________________
___________________________a???JS-8_U______?9__?9__(___?_?_____?9__?9__C___7_?__
___?9__?9__☺___?_?_?U__☺___?_?_?U__☺___?_?_?U__'___?_?_____?9__?9__♣___?_I_?U__♦
___I_Q_?U__C___?_Q_____?9__?9__&___?_R_____?9__?9__(___?_R_____?9__?9__C___?_R__
___?9__?9__☺___?_?_?U__☺___Z_e_?U__☺___E_?_?U__'___Z_?_____?9__?9__♣___?_?_?U__♦
___?_?_?U__C___?_?_____?9__?9__&___Z_?_____?9__?9__(___?_?_____?9__?9__C___?_?__
___?9__?9__☺___?_?_?U__☺___?_?_?U__☺___?_?_?U__'___?_?_____?9__?9__♣___?_?_?U__♦
___?_?_?U__C___?_?_____?9__?9__&___?_?_____?9__?9__(___?_?______________________
___

The real mystery occurs when I try to redirect the output to a file:

Code: Select all

C:\test\test><file.j8i repl \x00 _ >file.new
C:\utils\repl.bat(292, 39) Microsoft JScript runtime error: Invalid procedure call or argument

That should not happen!

It looks to me like a loose pointer. I don't see how my code could logically give the above results. I wonder if this is a JScript bug?


Dave Benham

#21 Post by Aacini » 23 Jul 2014 22:29

This seems to be related to the old problem that JScript always handles characters as Unicode, so a problem arises when a character with a code above 255 is sent to the screen.

Didn't you include the character mapping we talked about some time ago in repl.bat?

Antonio

#22 Post by dbenham » 24 Jul 2014 05:12

I don't think that is the issue. Both your FINDREPL.BAT and my REPL.BAT are choking on foxi's file, and both have code to compensate for the unicode.

I created a file named ASCII.TXT containing all byte codes, in order, from 0x00 to 0xFF. I am able to properly replace \x00 with _ using:

Code: Select all

<ascii.txt repl \x00 _ m >ascii.new

The M option is used to prevent the addition of \x0D \x0A at the end.

I also replaced null with itself and used FC to verify that the output matched the input:

Code: Select all

<ascii.txt repl \x00 \x00 mx >ascii.new
fc /b ascii.txt ascii.new

I don't think there is anything wrong with the logic in REPL.BAT, but instead there is something in foxi's file that is triggering a bug in JScript, causing it to run amok.

Or perhaps some byte sequence in the file is interpreted by JScript as an invalid variable length Unicode sequence?


Dave Benham

#23 Post by penpen » 24 Jul 2014 13:22

I just noticed that the result is different when the command is repeated.

Code: Select all

Z:\><file.j8i repl \x00 _
JS-8_2__♦___?_?_?2__C___?_?_____?6__?6__&___._?_____?6__?6__(___"_?_____?6__?6__
C___?_?_____?6__?6__☺___?_?_?2__☺___?_?_?2__☺___?_?_?2__'___?_?_____?6__?6__♣___
?_?_?2__♦___?_?_?2__C___?_?_____?6__?6__&___?_?_____?6__?6__(___?_?_____?6__?6__
C___"_?_____?6__?6______________________________?}___H_♀_____________________♦__
___ð__ô♥À_^_☺♦__☻ _%Ï↕♦ÎéÂ'<S('♦(S☻(¢ò('♦äJ♥È¢'('♣"*¢?Ï►ôîéÂ<______ ______ _____
________________________________________________________________________________
___________________________a???JS-8_2______?6__?6__(___?_?_____?6__?6__C___¹_?__
___?6__?6__☺___?_?_?2__☺___?_?_?2__☺___?_?_?2__'___?_?_____?6__?6__♣___?_g_?2__♦
___H_?_?2__C___?_?_____?6__?6__&___?_N_____?6__?6__(___?_N_____?6__?6__C___?_N__
___?6__?6__☺___Q_?_?2__☺___T_?_?2__☺___K_F_?2__'___T_F_____?6__?6__♣___?_?_?2__♦
___?_?_?2__C___?_?_____?6__?6__&___T_?_____?6__?6__(___Q_?_____?6__?6__C___?_?__
___?6__?6__☺___?_?_?2__☺___?_?_?2__☺___?_?_?2__'___?_?_____?6__?6__♣___?_?_?2__♦
___?_?_?2__C___?_?_____?6__?6__&___?_?_____?6__?6__(___?_?______________________
___

Z:\><file.j8i repl \x00 _
JS-8_n__♦___?_?_?n__C___?_?_____?☼__?☼__&___._?_____?☼__?☼__(___"_?_____?☼__?☼__
C___?_?_____?☼__?☼__☺___?_?_?n__☺___?_?_?n__☺___?_?_?n__'___?_?_____?☼__?☼__♣___
?_?_?n__♦___?_?_?n__C___?_?_____?☼__?☼__&___?_?_____?☼__?☼__(___?_?_____?☼__?☼__
C___"_?_____?☼__?☼______________________________?}___H_♀_____________________♦__
___ð__ô♥À_^_☺♦__☻ _%Ï↕♦ÎéÂ'<S('♦(S☻(¢ò('♦äJ♥È¢'('♣"*¢?Ï►ôîéÂ<______ ______ _____
________________________________________________________________________________
___________________________a???JS-8_n______?☼__?☼__(___?_?_____?☼__?☼__C___¹_?__
___?☼__?☼__☺___?_?_?n__☺___?_?_?n__☺___?_?_?n__'___?_?_____?☼__?☼__♣___?_g_?n__♦
___H_?_?n__C___?_?_____?☼__?☼__&___?_N_____?☼__?☼__(___?_N_____?☼__?☼__C___?_N__
___?☼__?☼__☺___Q_?_?n__☺___T_?_?n__☺___K_F_?n__'___T_F_____?☼__?☼__♣___?_?_?n__♦
___?_?_?n__C___?_?_____?☼__?☼__&___T_?_____?☼__?☼__(___Q_?_____?☼__?☼__C___?_?__
___?☼__?☼__☺___?_?_?n__☺___?_?_?n__☺___?_?_?n__'___?_?_____?☼__?☼__♣___?_?_?n__♦
___?_?_?n__C___?_?_____?☼__?☼__&___?_?_____?☼__?☼__(___?_?______________________
___

Z:\>

So I think the problem is not located within the JScript part (otherwise it should always produce the same output).
I assume JScript has problems reading from the pipe (used for the redirected input stream).
If this is true, there should be a limit up to which reading is OK (somewhere around 1 KB).

And indeed the result was OK after I shrank the file to a size of 260 (0x104) bytes (why not 1 KB or 512 bytes, I don't know).
I replaced the entire file content with NUL characters, and later with random characters (containing at least one NUL character): no problems.
Errors always occur on files with sizes >= 261 bytes and one or more NUL characters within this block.

So I assume it is a kind of pipe "corruption" (or rather, JScript's inability to read from it correctly) when more than 260 bytes are used.

penpen

Edit: I've tested the above on pipes, too ("type file.j8i | repl \x00 _"), with the same results. So I tricked myself a little with my terms and conclusions.
Last edited by penpen on 27 Jul 2014 08:52, edited 1 time in total.

#24 Post by dbenham » 24 Jul 2014 14:59

penpen wrote: I just noticed that the result is different when the command is repeated.
...
So I think the problem is not located within the JScript part (otherwise it should always produce the same output).
I assume JScript has problems reading from the pipe (used for the redirected input stream).
If this is true, there should be a limit up to which reading is OK (somewhere around 1 KB).

And indeed the result was OK after I shrank the file to a size of 260 (0x104) bytes (why not 1 KB or 512 bytes, I don't know).
I replaced the entire file content with NUL characters, and later with random characters (containing at least one NUL character): no problems.
Errors always occur on files with sizes >= 261 bytes and one or more NUL characters within this block.

So I assume it is a kind of pipe "corruption" (or rather, JScript's inability to read from it correctly) when more than 260 bytes are used.

I interpret the evidence very differently. First off, there is no pipe - JScript is reading redirected stdin - a very basic OS function. Second, the issue cannot be strictly size based, as I have successfully used REPL.BAT on many files that had hundreds of megabytes. Finally, the fact that it does not give consistent results makes me think that there is a bug in JScript itself - again, I think there is a loose pointer within the JScript engine (or within some library that it uses).


Dave Benham

#25 Post by carlos » 24 Jul 2014 22:29

In cmd, when a program prints the character 0xA, cmd converts it to 0xD 0xA. Maybe some such interpretation occurs.

#26 Post by einstein1969 » 25 Jul 2014 01:23

My experiment:

I adapted a script and it seems to work, with no bug.

simple_repl.vbs

Code: Select all

set re=new regexp
re.global=true
re.pattern="[\x00]"
set stream2=createobject("adodb.stream")
set stream=createobject("adodb.stream")
stream.open
stream.type=1
stream.loadfromfile wscript.arguments.item(0)
for p=0 to stream.size-1 step 16
  buf=stream.read(16)
  txt=mid(hex(&H1000000+p),2)
  for k=0 to 15
    if p+k<stream.size then
      txt=txt & " " & mid(hex(&h100+ascb(midb(buf,k+1,1))),2)
    else
      txt=txt & "   "
    end if
  next
  stream2.open
  stream2.type=1
  stream2.write buf
  stream2.position=0
  stream2.type=2
  stream2.charset="iso-8859-1"
  wscript.echo txt,re.replace(stream2.readtext(-1),"|")
  stream2.close
next


output

Code: Select all

E:\x264\provini\tmp>cscript ..\simple_repl.vbs file.j8i
Microsoft (R) Windows Script Host Versione 5.8
Copyright (C) Microsoft Corporation 1996-2001. Tutti i diritti riservati.

000000 4A 53 2D 38 00 00 04 14 66 6D 74 20 00 00 00 04 JS-8||♦¶fmt |||♦
000010 00 00 00 01 4A 38 49 20 00 00 04 00 00 00 00 01 |||☺J8I ||♦||||☺
000020 50 6F 70 3A 43 6C 61 73 73 69 63 20 50 6F 70 00 Pop:Classic Pop|
000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000090 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0000A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0000B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0000C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0000D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0000E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0000F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000110 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000120 80 7D 00 00 00 48 00 0C 00 00 00 00 00 00 00 00 ?}|||H|♀||||||||
000130 00 00 00 00 00 00 00 00 00 00 00 00 00 04 00 00 |||||||||||||♦||
000140 00 00 00 F0 00 00 F4 00 08 03 C0 00 88 00 01 04 |||ð||ô♥À|^|☺♦
000150 00 00 02 20 00 89 CF 12 04 CE E9 C2 27 3C 8A 28 ||☻ |%Ï↕♦ÎéÂ'<S(
000160 92 04 28 8A 02 28 A2 F2 28 82 04 E4 4A 03 C8 A2 '♦(S☻(¢ò('♦äJ♥È¢
000170 82 28 91 05 22 2A 02 08 A2 81 CF 10 F4 EE E9 C2 '('♣"¢?Ï►ôîéÂ
000180 07 3C 00 08 00 00 00 00 00 00 20 00 08 00 00 00 <|||||| |||
000190 00 00 00 20 00 00 00 00 00 00 00 00 00 00 00 00 ||| ||||||||||||
0001A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0001B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0001C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0001D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0001E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0001F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000200 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000210 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000220 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000230 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000240 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000250 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000260 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000270 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000280 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000290 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0002A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0002B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0002C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0002D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0002E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0002F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000300 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000310 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000320 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000330 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000340 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000350 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000360 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000370 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000380 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000390 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0003A0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0003B0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0003C0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0003D0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0003E0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
0003F0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000400 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ||||||||||||||||
000410 00 00 00 00 00 00 00 00 00 00 00 00             ||||||||||||



einstein1969

#27 Post by penpen » 27 Jul 2014 10:53

I've done some more testing on this (and edited my post above).
dbenham wrote: First off, there is no pipe - JScript is reading redirected stdin - a very basic OS function
Right: I was wrong to call "redirected input" "piped input";
I've added an edit (note) to my post above explaining why I wrote about pipes.
My conclusions (too hasty; sorry, I didn't have much time that day) were wrong, too:
I must apologise for that.

dbenham wrote: Second, the issue cannot be strictly size based, as I have successfully used REPL.BAT on many files that had hundreds of megabytes.
It is not strictly size based: I only tested files containing at least one NUL character.
But if there is a NUL character, then this is a size-based issue.
But it does not relate to one fixed size; 261 characters (file bytes) is only the minimum size at which the issue occurs.

All my tests have the same result (Windows XP Home):
The data is divided into parts P_1 ... P_n (|P_1| = 260, |P_i| = 256 for i > 1).
All data in P_n is as it should be, and
all data in parts P_1, ... P_(n-1) is only OK up to the first NUL byte.
The content of my test files is (as a regular expression): 'x'* NUL 'x'* '@' (("123456789" NUL)^24) "12345".
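For anyone who wants to reproduce this, test data of that shape could be generated along the following lines. This is only a sketch: the helper names `repeat` and `makeTestData` and the two 'x' counts are assumptions chosen for illustration (the exact counts used in the tests above are not stated), and writing the string to a file is left out.

```javascript
// Repeat a piece of text a given number of times (ES3 compatible).
function repeat(text, count) {
  var parts = [];
  for (var i = 0; i < count; i++) {
    parts.push(text);
  }
  return parts.join("");
}

// Build data shaped like: 'x'* NUL 'x'* '@' ("123456789" NUL)^24 "12345"
// leadingX and middleX are hypothetical parameters for the two 'x'* runs.
function makeTestData(leadingX, middleX) {
  return repeat("x", leadingX) + "\x00" +
         repeat("x", middleX) + "@" +
         repeat("123456789\x00", 24) + "12345";
}

// 10 + 1 + 4 + 1 + 240 + 5 = 261 characters, the smallest size reported to fail.
var data = makeTestData(10, 4);
```

The fixed tail ('@', the 24 numbered groups and "12345") is 246 characters long, so the two 'x' runs together with the first NUL decide the total size.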

I now think the following is happening:
JScript reads the input from the StdIn text stream into an internal buffer (B1).
If the input exceeds a given size (B.crit in {260 + 256*i | i = 0, 1, 2, ...}), then a new buffer (B2) is created to hold the data.
Then the old data (string) is copied from B1 to B2 on the assumption that it holds a NUL-terminated string, so the data between the first NUL character and the first character in P_n gets corrupted.

penpen

Edit: It seems that instead of "str1 += WScript.StdIn.ReadLine();" (or ...ReadAll...), you could use:

Code: Select all

   // Reads up to the end of the current line; test AtEndOfStream instead to replace ReadAll().
   while (!WScript.StdIn.AtEndOfLine) {
      str1 += WScript.StdIn.Read(1);
   }

BUT this is very slow (especially for big files).

#28 Post by dbenham » 27 Jul 2014 23:14

Fantastic, penpen!

I optimized your work-around by using Read(260) instead of Read(1). I've updated REPL.BAT to accept the N option to enable proper reading of NULL bytes using the work-around. It only works when also using the M option.

I haven't tested, but I imagine it is still slow with large files; at least slow is better than broken.


Dave Benham

#29 Post by dbenham » 30 Jul 2014 18:26

It turns out it is only ReadAll() that suffers the bug. Read(n) works properly with any size n (within limits of string size).

I removed the N option from my new REPL.BAT version 4.1. The M option now always works properly with binary files. The initial binary read is 1024 bytes, and the size doubles each time there is more content to read. This is a major speed boost.

I successfully processed a 100 MB file containing NULL bytes in 8 seconds.
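The doubling read described above might look roughly like the following. This is a minimal sketch and not the actual REPL.BAT code: the function name `readAllByChunks` is made up for illustration, and `stream` is assumed to be any object that offers the WSH TextStream members Read(n) and AtEndOfStream, such as WScript.StdIn.

```javascript
// Read an entire text stream with Read(n) instead of ReadAll(),
// starting with a 1024-character chunk and doubling the size each time.
// `stream` must expose Read(n) and AtEndOfStream like a WSH TextStream.
function readAllByChunks(stream, initialSize) {
  var size = initialSize || 1024;   // size of the first read
  var parts = [];                   // collect chunks, join once at the end
  while (!stream.AtEndOfStream) {
    parts.push(stream.Read(size));
    size *= 2;                      // larger chunks mean fewer calls on big files
  }
  return parts.join("");
}

// Under cscript the call would be: var text = readAllByChunks(WScript.StdIn);
```

Collecting the chunks in an array and joining them once avoids repeated string concatenation. Doubling the chunk size keeps the number of Read calls small even for very large inputs.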


Note: I also tested ReadAll() using VBS instead of JScript, and it also failed with NULL bytes. So the bug is in the core of the scripting host; it is not specific to JScript.


Dave Benham
