Reading file with special characters

sandeepshampur
Posts: 1
Joined: 07 Oct 2018 07:40

#1 Post by sandeepshampur » 07 Oct 2018 07:50

I am trying to read a file line by line. The file contains special characters like ")" and "&", and the read fails with errors like
"-10314679.html: was unexpected at this time.".
Here is the code:

Code: Select all

    for /f "tokens=1 delims=@" %%A in (_HashList.tmp) do (
    call set myParam="%%A"
    call :myParseLine %%myParam%%
    )
    exit /b

    :myParseLine
    call set myParam=%~1
    call set myPartLine=%myParam:~0,8%

    if "%myPartLine%" == "CertUtil" ( 
        exit /b
    )

    if "%myPartLine%" == "MD5 hash" ( 
        call set "myPartLine=%myParam:~12,-1%"
        call set myPartLine=!myPartLine!;
        call echo | set /p=%%myPartLine%% >> z:\utilities\_HashDatabase.tmp
        exit /b 
    )

    call echo %%myParam%% >> z:\utilities\_HashDatabase.tmp
    exit /b
The file _HashList.tmp contains:

Code: Select all

    MD5 hash of z:\Church\Messages\Emails\19981112-The Stranger- You got to read this.... (fwd)-10314679.html:
    966b538d0f52fc66bbb7ef4fd98ec1ca
    CertUtil: -hashfile command completed successfully.
    MD5 hash of z:\Church\Messages\Emails\20061013-God_s perfect will-Q &-266668.html:
    32b3c1381bbff6f6d94fe00355c3bf29
    CertUtil: -hashfile command completed successfully.
How can I overcome this problem?

ShadowThief
Expert
Posts: 1166
Joined: 06 Sep 2013 21:28
Location: Virginia, United States

#2 Post by ShadowThief » 07 Oct 2018 12:23

If you don't mind regenerating the hashfiles, I've thrown together something that will automatically get the MD5 of the emails and put it in the format you wanted without the need for a temporary file.

Code: Select all

@echo off
setlocal enabledelayedexpansion

if "%~1"=="" goto :Usage
for %%A in (%*) do (
	set "email_file=%%~A"
	for /f %%B in ('certutil -hashfile "%%~A" MD5 ^| find /v "hash"') do set "hash=%%B"
	>>z:\utilities\_HashDatabase.tmp echo !email_file!; !hash!
)

exit /b

:Usage
echo EmailHasher.bat ^<mail_1^> [mail_2] [...] [mail_n]
Just drag and drop all of the emails onto the script.

ShadowThief
Expert
Posts: 1166
Joined: 06 Sep 2013 21:28
Location: Virginia, United States

#3 Post by ShadowThief » 07 Oct 2018 12:47

You can also shorten your original code to

Code: Select all

@echo off
setlocal enabledelayedexpansion

for /f "tokens=1-3,*" %%A in (_HashList.tmp) do (
	if "%%A"=="MD5" (
		set "email_name=%%D"
		set "email_name=!email_name:~0,-1!"
	) else (
		if not "%%A"=="CertUtil:" (
			set "email_hash=%%A"
			
			>>z:\utilities\_HashDatabase.tmp echo !email_name!; !email_hash!
		)
	)
)

dbenham
Expert
Posts: 2461
Joined: 12 Feb 2011 21:02
Location: United States (east coast)

#4 Post by dbenham » 07 Oct 2018 12:58

"Poison" characters like & | > < etc. can cause problems if they are exposed to the early phases of the parser. If they are quoted, there is no problem, but if they are not quoted (or escaped), they are interpreted as operators with special meaning.

Expanding environment variables with percent signs (%var%) exposes the value to that risk, because the value is substituted before the parser looks for operators.


When reading a file with unknown (unconstrained) content, there are basically two methods to safely work with the values:

1) Work strictly with FOR /F variables like %%A if possible. But note that delayed expansion must be off if the value may contain ! characters. If delayed expansion is on when %%A is expanded, then strings with ! will be corrupted.

2) Use delayed expansion whenever you expand a variable that may contain poison characters. SetLocal EnableDelayedExpansion to enable the delayed expansion, and !varName! to safely expand the value.

There may be a third option if you know that the value will never contain quotes. Just make sure that the value is always quoted. But that assumes you don't care if you introduce quotes in the value when you ECHO it.
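
As a rough illustration of methods 1 and 2, here is a minimal sketch. The variable name sample and its value are made up for the demonstration (the value is borrowed from the second file name in the question), and the loop assumes _HashList.tmp is in the current directory.

```batch
@echo off
setlocal DisableDelayedExpansion

rem Hypothetical sample value. Quoting the whole assignment keeps the
rem ampersand literal while the value is stored.
set "sample=God_s perfect will-Q &-266668.html:"

rem Method 2: delayed expansion substitutes the value after the line has
rem been parsed, so the ampersand is never seen as an operator.
setlocal EnableDelayedExpansion
echo !sample!
endlocal

rem Method 1: a FOR /F variable is also substituted after parsing.
rem Delayed expansion is off again here, so an exclamation mark in a
rem line would be preserved. The empty delims option keeps each line whole.
for /f "usebackq delims=" %%A in ("_HashList.tmp") do echo %%A
```

By contrast, echoing the same value with percent expansion would end the ECHO at the & and try to run the rest of the value as a second command. Note that FOR /F skips empty lines and, with the default eol setting, lines that begin with a semicolon.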

It looks like all you are trying to do is capture the full file paths of all files in a log file that resulted from processing by CERTUTIL.

That is easily done by using FINDSTR to keep only the "MD5 hash of" lines, and FOR /F to capture the remainder of each line after the 3rd space-delimited token (the text after "of ").

Code: Select all

@echo off
(for /f "tokens=3*" %%A in ('findstr /c:"MD5 hash of" _HashList.tmp') do echo %%B) >z:\utilities\_HashDatabase.tmp
But I don't understand the need for the code. It seems obvious that your _HashList.tmp is a log of various CERTUTIL runs. This implies that you must have known the paths of the files when you ran CERTUTIL. So I don't understand why you need to parse out the file paths after the fact. You should have captured the values as part of the CERTUTIL processing.

And I would think you would want to capture the actual hash values.
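
If re-running CERTUTIL is not an option, one possible way to pull both the path and the hash out of the existing log is sketched below. It combines the two methods above: the lines are read with delayed expansion off, and delayed expansion is switched on only for the moment the stored values are echoed. The variable names file and hash are placeholders, the log layout is assumed to match the sample in the question exactly, and the result is written to the console rather than to a file.

```batch
@echo off
setlocal DisableDelayedExpansion
set "file="

rem Tokens 1-3 land in A, B and C. The rest of the line lands in D.
for /f "usebackq tokens=1-3*" %%A in ("_HashList.tmp") do (
    if "%%A %%B %%C"=="MD5 hash of" (
        rem Remember the path. It still ends with the colon from the log.
        set "file=%%D"
    ) else if not "%%A"=="CertUtil:" (
        rem Any other line is taken to be the hash of the remembered path.
        set "hash=%%A"
        setlocal EnableDelayedExpansion
        rem The substring drops the trailing colon from the path.
        echo !file:~0,-1!; !hash!
        endlocal
    )
)
```

Toggling delayed expansion inside the loop costs a little speed, but it means a path containing an exclamation mark should not be corrupted when %%D is expanded.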

It would be helpful if you provided the big picture of your overall intended process. What is your overall goal, and what are all the steps you are taking to get there, not just the code you have shown. I suspect there is a much better way of accomplishing your goals.


Dave Benham
