How to replace special characters...

Posted: 28 Jun 2010 07:16
by miskox
Hi all,

let's say we have a file tmp.tmp with records like the following (each has some leading spaces in front of the <a title):

Code: Select all

 <a title="some text" href="http://some.site.com/somefile1.jpg" target="_blank">somefile1.jpg</a>


The file contains several lines like the one above, with the file name running from somefile1.jpg to somefile'n'.jpg.

Question:

How can we easily replace/remove special characters (", <, >) from this file?

Code: Select all

@echo off
setlocal ENABLEDELAYEDEXPANSION
for /f "tokens=1* delims=" %%a in (tmp.tmp) do set par1=%%a&& set par1=!par1:^<=!&& set par1=!par1:^>=! set par1=!par1:^"=!&&echo XX!par1!XX
goto :EOF


This is close to what I need but not 100% what I need. Just run it and you will see what I mean.

Please help.
Saso

Re: How to replace special characters...

Posted: 28 Jun 2010 07:44
by jeb
Hi,

this should work; you get the complete content of each line.

Code: Select all

@echo off
setlocal ENABLEDELAYEDEXPANSION
for /f "tokens=1* delims=" %%a in (tmp.tmp) do (
   set par1=%%a
   echo !par1!
)


If you want to escape the characters too, you could use something like this:

Code: Select all

@echo off
setlocal ENABLEDELAYEDEXPANSION
for /f "tokens=1* delims=" %%a in (tmp.tmp) do (
   set par1=%%a
   set par1=!par1:^<=^^^<!
   set par1=!par1:^>=^^^>!
   echo !par1!
)
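
Since the original question asks for removing the characters rather than escaping them, the same loop can be sketched with empty replacement strings. This is only a sketch under a few assumptions: tmp.tmp sits in the current directory, no line contains an exclamation mark (delayed expansion would swallow it), and the quoted `set "var=value"` form is used so that no carets are needed. One likely reason the one-liner in the first post misbehaves is that there is no `&&` between `set par1=!par1:^>=!` and `set par1=!par1:^"=!`, so the second `set` takes the third one as part of its value.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Sketch only: strips the less-than, greater-than and double-quote characters.
rem Assumes the lines of tmp.tmp hold no exclamation marks.
for /f "delims=" %%a in (tmp.tmp) do (
  set "par1=%%a"
  set "par1=!par1:<=!"
  set "par1=!par1:>=!"
  set "par1=!par1:"=!"
  echo XX!par1!XX
)
endlocal
```

Note that `for /f` skips empty lines and, with the default `eol` setting, lines that begin with a semicolon.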


jeb

Re: How to replace special characters...

Posted: 28 Jun 2010 10:48
by SenHu
I think Jeb provided an excellent response.

If you want to do fancier automated editing, you may want to consider these commands.

http://www.biterscripting.com/helppages_editors.html


I was unsure whether you wanted to remove the entire HTML tags or only manipulate some characters. If you want to remove the HTML tags (convert HTML source to plain text), here is a script that someone has already written.

http://www.biterscripting.com/helppages ... oText.html

You call the script with this command.

Code: Select all

script "SS_WebPageToText.txt" page("http://www.dostips.com/forum/viewtopic.php?f=3&t=1130")



This will show the plain text from this page itself.

Re: How to replace special characters...

Posted: 28 Jun 2010 15:03
by miskox
Thank you both.

I will check this tomorrow (it is late here).

I just want to extract a URL from the line above. The main problem is always with those special characters...

Thanks again.
Saso

Re: How to replace special characters...

Posted: 28 Jun 2010 15:45
by aGerman
Well, processing HTML code with batch is never a good idea. If the number of quotes before the URL is always the same, you could use these quotes to separate the URL from all the junk before and after it.
Try this:

Code: Select all

@echo off &setlocal
for /f "delims=" %%a in (tmp.tmp) do (
  set "line=%%a"
  call set "line=%%line:"=\%%"
  call :proc
)
pause
goto :eof

:proc
for /f "delims=\ tokens=4" %%a in ("%line%") do (
  echo %%a
)
goto :eof
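
For the sample line, replacing every quote with a backslash leaves ` <a title=` as token 1, `some text` as token 2, ` href=` as token 3 and the URL as token 4, which is why `tokens=4` is used above. A shorter variant that is often seen uses the double quote itself as the delimiter, so the replacement step and the subroutine are not needed. Treat it as a sketch: it assumes the URL is always the second quoted value on each line of tmp.tmp.

```batch
@echo off
rem Sketch only: uses the double quote itself as the delimiter and takes token 4.
rem The carets escape the equals signs, the space and the quote in the options.
rem Assumes the URL is always the second quoted value on each line of tmp.tmp.
for /f tokens^=4^ delims^=^" %%u in (tmp.tmp) do echo %%u
```

Because the options string cannot be wrapped in quotes when a quote is one of the delimiters, every character that the parser would otherwise treat as a separator has to be escaped with a caret.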



Regards
aGerman