
findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 06:06
by foxidrive
I've been fiddling around trying to parse some binary data that has some nulls in the first 32 bytes, followed by a text string, then more nulls, and then further binary data.

I tried WSH and then read that it was supposed to have binary data handling added in a 'future update'.
I also tried findstr and repl and got different results from each, which I can't really explain.

I tried to change all hex 00 to a pipe.

Image

The START of the file has 32 bytes of text mixed with hex 00, then "Pop:Classic Pop", followed by around 250 bytes of hex 00 and roughly another KB of data.

Code: Select all

JS-8  fmt       J8I       Pop:Classic Pop


You can see in the image that "Pop:Classic Pop" has disappeared (which I was trying to extract) as well as other text that disappeared.


I solved the problem with GNU sed but was wondering how it could be solved simply with native tools on a 64-bit system.

If anyone wants to have a look, here is a script that recreates the file:

Code: Select all

@echo off
(
echo -----BEGIN CERTIFICATE-----
echo SlMtOAAABBRmbXQgAAAABAAAAAFKOEkgAAAEAAAAAAFQb3A6Q2xhc3NpYyBQb3AA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo gH0AAABIAAwAAAAAAAAAAAAAAAAAAAAAAAAAAAAEAAAAAADwAAD0AAgDwACIAAEE
echo AAACIACJzxIEzunCJzyKKJIEKIoCKKLyKIIE5EoDyKKCKJEFIioCCKKBzxD07unC
echo BzwACAAAAAAAACAACAAAAAAAACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=
echo -----END CERTIFICATE-----
)>file.tmp
certutil -decode file.tmp file.j8i
del file.tmp

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 08:25
by Squashman
Look at it in binary mode then, and use findstr to remove the nulls.

Code: Select all

<nul set /p ".=A" >dummy.txt
for /l %%n in (1 1 15) do type dummy.txt >>dummy.txt

fc /b file.j8i dummy.txt | findstr /v /r /c:": 00 41$"

You could parse that output in a FOR /F command and then I assume you could use a hex to ascii converter to get the data you need in a usable format.

Credit to dbenham for the code above.
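
As a rough illustration of the parsing step suggested above, here is a minimal JavaScript sketch. It assumes the `fc /b` data lines have the form `OFFSET: XX YY` (offset, byte from the file, byte from the dummy file); the helper names `parseFcOutput` and `textAt` are made up for this example. One caveat of the FC approach itself: a byte equal to 0x41 is identical in both files and so never shows up in the diff, which means a string containing "A" would be cut short here.

```javascript
// Hypothetical helper: turn "fc /b" diff lines into {offset, value} pairs.
// Assumes data lines look like "00000020: 50 41"; other lines are ignored.
function parseFcOutput(text) {
  var result = [];
  var lines = text.split(/\r?\n/);
  for (var i = 0; i < lines.length; i++) {
    var m = /^([0-9A-F]{8}): ([0-9A-F]{2}) [0-9A-F]{2}$/i.exec(lines[i]);
    if (m) {
      result.push({ offset: parseInt(m[1], 16), value: parseInt(m[2], 16) });
    }
  }
  return result;
}

// Hypothetical helper: collect consecutive non-null bytes starting at "start".
// A gap in the offsets (a filtered-out line) or a 00 byte ends the text.
function textAt(entries, start) {
  var out = "";
  var expected = start;
  for (var i = 0; i < entries.length; i++) {
    var e = entries[i];
    if (e.offset < start) continue;
    if (e.offset !== expected || e.value === 0) break;
    out += String.fromCharCode(e.value);
    expected++;
  }
  return out;
}

// Shortened sample in the same shape as the findstr-filtered FC output
var sample = [
  "Comparing files file.j8i and DUMMY.TXT",
  "0000001F: 01 41",
  "00000020: 50 41",
  "00000021: 6F 41",
  "00000022: 70 41",
  "00000120: 80 41"
].join("\n");

console.log(textAt(parseFcOutput(sample), 0x20)); // Pop
```

Because the offsets are kept, the sketch works the same whether or not the null lines have been filtered out first.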

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 09:22
by foxidrive
Thanks Squashman.

I can see how it removes the nulls from the FC output - though it still needs more processing.

I like using the tools that the guys here have developed with native code - they are robust and quick and make scripts smaller - and I'm wondering if there is something in repl and findrepl that can handle hex 00 in an elegant way too, or if it is a limitation of jscript.

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 10:02
by Squashman
foxidrive wrote: Thanks Squashman.

I can see how it removes the nulls from the FC output - though it still needs more processing.


Yes, it needs some more processing, but I think it would be doable. Your output should currently look like this.

Code: Select all

Comparing files file.j8i and DUMMY.TXT
00000000: 4A 41
00000001: 53 41
00000002: 2D 41
00000003: 38 41
00000006: 04 41
00000007: 14 41
00000008: 66 41
00000009: 6D 41
0000000A: 74 41
0000000B: 20 41
0000000F: 04 41
00000013: 01 41
00000014: 4A 41
00000015: 38 41
00000016: 49 41
00000017: 20 41
0000001A: 04 41
0000001F: 01 41
00000020: 50 41
00000021: 6F 41
00000022: 70 41
00000023: 3A 41
00000024: 43 41
00000025: 6C 41
00000026: 61 41
00000027: 73 41
00000028: 73 41
00000029: 69 41
0000002A: 63 41
0000002B: 20 41
0000002C: 50 41
0000002D: 6F 41
0000002E: 70 41
00000120: 80 41
00000121: 7D 41
00000125: 48 41
00000127: 0C 41
0000013D: 04 41
00000143: F0 41
00000146: F4 41
00000148: 08 41
00000149: 03 41
0000014A: C0 41
0000014C: 88 41
0000014E: 01 41
0000014F: 04 41
00000152: 02 41
00000153: 20 41
00000155: 89 41
00000156: CF 41
00000157: 12 41
00000158: 04 41
00000159: CE 41
0000015A: E9 41
0000015B: C2 41
0000015C: 27 41
0000015D: 3C 41
0000015E: 8A 41
0000015F: 28 41
00000160: 92 41
00000161: 04 41
00000162: 28 41
00000163: 8A 41
00000164: 02 41
00000165: 28 41
00000166: A2 41
00000167: F2 41
00000168: 28 41
00000169: 82 41
0000016A: 04 41
0000016B: E4 41
0000016C: 4A 41
0000016D: 03 41
0000016E: C8 41
0000016F: A2 41
00000170: 82 41
00000171: 28 41
00000172: 91 41
00000173: 05 41
00000174: 22 41
00000175: 2A 41
00000176: 02 41
00000177: 08 41
00000178: A2 41
00000179: 81 41
0000017A: CF 41
0000017B: 10 41
0000017C: F4 41
0000017D: EE 41
0000017E: E9 41
0000017F: C2 41
00000180: 07 41
00000181: 3C 41
00000183: 08 41
0000018A: 20 41
0000018C: 08 41
00000193: 20 41
FC: DUMMY.TXT longer than file.j8i

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 10:36
by foxidrive
Squashman wrote: Yes some more processing but I think it would be doable.


One aspect of converting hex 00 to another character is that it would be fairly easy to find the end of the text string in question.
The string is of variable length and composition, but starts at byte 33, and removing every nul could introduce a difficulty in determining the correct string length.
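
To illustrate that point, here is a minimal JavaScript sketch. It assumes the file content is already available as a string with one character per byte; the function name `extractAfterHeader` and the stand-in header are invented for the example. Replacing each hex 00 with a pipe keeps every offset intact, so the text still starts at index 32 (byte 33) and ends at the next pipe.

```javascript
// Hypothetical helper: replace NULs with "|" and cut out the text that
// follows a fixed-length header. Offsets are preserved by the replacement.
function extractAfterHeader(content, headerLength) {
  var piped = content.replace(/\x00/g, "|");
  var end = piped.indexOf("|", headerLength);
  if (end === -1) end = piped.length; // no trailing NUL: take the rest
  return piped.substring(headerLength, end);
}

// Stand-in data: "JS-8" plus 28 NULs gives a 32-byte header, then the
// text, then NUL padding and some trailing bytes.
var header = "JS-8" + new Array(29).join("\x00");
var content = header + "Pop:Classic Pop" + new Array(11).join("\x00") + "\x80\x7D";

console.log(extractAfterHeader(content, 32)); // Pop:Classic Pop
```

This only works if the text itself never contains a pipe; any other character that cannot occur in the string would do equally well as the marker.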

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 10:56
by carlos
Doesn't more convert nul to spaces?

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 11:02
by Squashman
This is what I've got so far.

Code: Select all

@echo off
setlocal EnableDelayedExpansion
<nul set /p ".=A" >dummy.txt
for /l %%n in (1 1 15) do type dummy.txt >>dummy.txt

>"output.txt" (
  for /f "skip=33 tokens=1,2 delims=: " %%i in ('fc /b "file.j8i" "dummy.txt"') do (
   IF "%%j"=="00" goto next
    <nul set /p "=%%j"
  )
)

:next

Which outputs this.

Code: Select all

506F703A436C617373696320506F70
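
For reference, turning that hex string back into text is a short loop. This is a minimal JavaScript sketch; the function name `hexToAscii` is made up, and it assumes the input is an even-length run of two-digit hex values with no separators.

```javascript
// Hypothetical helper: decode a string of two-digit hex values into text.
function hexToAscii(hex) {
  var out = "";
  for (var i = 0; i + 1 < hex.length; i += 2) {
    out += String.fromCharCode(parseInt(hex.substring(i, i + 2), 16));
  }
  return out;
}

console.log(hexToAscii("506F703A436C617373696320506F70")); // Pop:Classic Pop
```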

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 12:23
by Squashman
Here is some powershell code to convert hex to ascii.
http://blogs.technet.com/b/heyscripting ... shell.aspx

I know you can run powershell code from a batch file and capture the output with the FOR /F command, but I am not sure how you would stream the hex data into that powershell code.

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 13:13
by Squashman
Or use FORFILES to convert the hex values back into ASCII characters.

Code: Select all

@echo off
setlocal
<nul set /p ".=A" >dummy.txt
for /l %%n in (1 1 15) do type dummy.txt >>dummy.txt

::Define a Linefeed variable
set LF=^


::above 2 blank lines are critical - do not remove.
set hexvar=
for /f "skip=33 tokens=1,2 delims=: " %%i in ('fc /b "file.j8i" "dummy.txt"') do (
   IF "%%j"=="00" goto next
   rem CALL SET appends without delayed expansion and avoids stacking SETLOCAL levels
   call set "hexvar=%%hexvar%%0x%%j"
)

:next
echo %hexvar%
pause
call :hexprint "%hexvar%" charvar
echo %charvar%
pause
GOTO :EOF

:hexPrint  string  [rtnVar]
  for /f eol^=^%LF%%LF%^ delims^= %%A in (
    'forfiles /p "%~dp0." /m "%~nx0" /c "cmd /c echo(%~1"'
  ) do if "%~2" neq "" (set %~2=%%A) else echo(%%A
exit /b

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 16:36
by penpen
If the string can be read using set /P, this may help too:

Code: Select all

@echo off
setlocal
set "input="
for /F "tokens=* delims=" %%A in ('type file.j8i ^| ^(^(for /L %%n in ^(1, 1, 32^) do @pause^) ^> nul ^& set /P "input=" ^& ^(cmd /V:ON /C echo^(!input!^)^)') do set "input=%%A"
echo(%input%
endlocal

penpen

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 17:47
by aGerman
I thought about something very similar using FIND.

Code: Select all

for /f delims^=^ eol^= %%i in ('type "file.j8i"^|^(^(for /l %%j in ^(1 1 32^) do @pause^) ^>nul ^&find /v ""^)') do if not defined str set "str=%%i"

Regards
aGerman

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 19:15
by foxidrive
Squashman, penpen, aGerman, your solutions all work fine, and are a little bit magic.

Convoluted would also describe them. :)

Thanks guys - though I am still keen to see if the jscript tools are able to handle it in a regular expression.

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 20:08
by Squashman
Should be able to do it with jscript. I found some code to do it in vbscript but it basically does the same thing that forfiles is doing.

Powershell does look like a good option here as well.

Re: findstr.bat and repl.bat and NULLS

Posted: 21 Jul 2014 23:51
by foxidrive
Squashman wrote: I found some code to do it in vbscript but it basically does the same thing that forfiles is doing.


That's of interest - I tried vbscript but with my lack of familiarity couldn't get it to handle nulls

Do you have the code to post here please?

Squashman wrote: Powershell does look like a good option here as well.


Thanks for that reminder too.

foxi

Re: findstr.bat and repl.bat and NULLS

Posted: 22 Jul 2014 16:17
by aGerman
foxidrive wrote: Thanks guys - though I am still keen to see if the jscript tools are able to handle it in a regular expression.

Speaking of "convoluted" :lol:
(just the proof of concept in pure JS:)

Code: Select all

var objAdoS = WScript.CreateObject("ADODB.Stream");
objAdoS.Type = 2;
objAdoS.CharSet = "us-ascii";
objAdoS.Open();
objAdoS.LoadFromFile("file.j8i");
var strContent = objAdoS.ReadText();
objAdoS.Close();
var strFind = strContent.replace(/(?:(^.{32})|\w+)/, function($0, $1) {
  return $1 ? "" : $0;
});

WScript.Echo(strFind);

Processing binary files in JScript is difficult (it doesn't work with the FileSystemObject), not to mention emulating a "positive lookbehind" with a callback function.

Regards
aGerman
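
Another way to stand in for the missing lookbehind is a capture group: match the fixed-length prefix, then capture what follows. The sketch below is only an illustration of that idea, not aGerman's method. It assumes the content has already been read into a string with one character per byte (for example with ADODB.Stream as in the post above), and the function name `textAfterPrefix` is invented.

```javascript
// Hypothetical helper: skip "prefixLength" characters, then capture the
// run of non-NUL characters that follows. [\s\S] matches any character,
// including line breaks, which "." would not.
function textAfterPrefix(content, prefixLength) {
  var re = new RegExp("^[\\s\\S]{" + prefixLength + "}([^\\x00]*)");
  var m = re.exec(content);
  return m ? m[1] : "";
}

// Stand-in data: 32-byte header, the text, NUL padding, trailing bytes
var prefix = "JS-8" + new Array(29).join("\x00");
var data = prefix + "Pop:Classic Pop" + new Array(11).join("\x00") + "more";

console.log(textAfterPrefix(data, 32)); // Pop:Classic Pop
```

Since the capture stops at the first hex 00, the string length does not need to be known in advance.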