How to replace special characters...


miskox
Posts: 630
Joined: 28 Jun 2010 03:46

How to replace special characters...

#1 Post by miskox » 28 Jun 2010 07:16

Hi all,

let's say we have a file tmp.tmp with the following records (there are some blanks (spaces) in front of the <a title):

Code: Select all

 <a title="some text" href="http://some.site.com/somefile1.jpg" target="_blank">somefile1.jpg</a>


This file contains several lines like the one above, with file names ranging from somefile1.jpg to somefile'n'.jpg.

Question:

How can we easily replace/remove special characters (", <, >) from this file?

Code: Select all

@echo off
setlocal ENABLEDELAYEDEXPANSION
for /f "tokens=1* delims=" %%a in (tmp.tmp) do set par1=%%a&& set par1=!par1:^<=!&& set par1=!par1:^>=! set par1=!par1:^"=!&&echo XX!par1!XX
goto :EOF


This is close to what I need, but not quite. Just run it and you will see what I mean.

Please help.
Saso

jeb
Expert
Posts: 1055
Joined: 30 Aug 2007 08:05
Location: Germany, Bochum

Re: How to replace special characters...

#2 Post by jeb » 28 Jun 2010 07:44

Hi,

this should work; it gives you the complete content of each line.

Code: Select all

@echo off
setlocal ENABLEDELAYEDEXPANSION
for /f "tokens=1* delims=" %%a in (tmp.tmp) do (
   set par1=%%a
   echo !par1!
)


If you want to escape the characters too, you could use something like this:

Code: Select all

@echo off
setlocal ENABLEDELAYEDEXPANSION
for /f "tokens=1* delims=" %%a in (tmp.tmp) do (
   set par1=%%a
   set par1=!par1:^<=^^^<!
   set par1=!par1:^>=^^^>!
   echo !par1!
)


jeb
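
Note that the second script above escapes < and > rather than removing them. For the removal asked about in the first post, a minimal sketch along the same lines might look like this. It assumes the input is tmp.tmp as in the question and that no line contains an exclamation mark (delayed expansion would swallow it); the variable name line is just illustrative.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Sketch: remove the double quote, less-than and greater-than characters from each line.
for /f "delims=" %%a in (tmp.tmp) do (
    set "line=%%a"
    set "line=!line:"=!"
    set "line=!line:<=!"
    set "line=!line:>=!"
    echo(!line!
)
```

Writing each assignment in the quoted form set "name=value" keeps < and > away from the parser, so no caret escaping is needed.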

SenHu
Posts: 19
Joined: 19 Mar 2009 14:57

Re: How to replace special characters...

#3 Post by SenHu » 28 Jun 2010 10:48

I think Jeb provided an excellent response.

If you want to do fancier automated editing, you may want to consider these commands.

http://www.biterscripting.com/helppages_editors.html


I was unsure whether you wanted to remove the entire HTML tags or just manipulate some characters. If you want to remove the HTML tags (convert HTML source to plain text), here is a script that someone has already written.

http://www.biterscripting.com/helppages ... oText.html

You call the script with this command.

Code: Select all

script "SS_WebPageToText.txt" page("http://www.dostips.com/forum/viewtopic.php?f=3&t=1130")



This will show the plain text from this page itself.

miskox
Posts: 630
Joined: 28 Jun 2010 03:46

Re: How to replace special characters...

#4 Post by miskox » 28 Jun 2010 15:03

Thank you both.

I will check this tomorrow (it is late here).

I just want to extract a URL from the line above. The main problem is always with those special characters...

Thanks again.
Saso

aGerman
Expert
Posts: 4678
Joined: 22 Jan 2010 18:01
Location: Germany

Re: How to replace special characters...

#5 Post by aGerman » 28 Jun 2010 15:45

Well, processing HTML code with batch is never a good idea. If the number of quotes before the URL is always the same, you could use these quotes to separate the URL from all the clutter before and after it.
Try this:

Code: Select all

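@echo off &setlocal
rem Read tmp.tmp line by line; the empty delims list keeps each line whole.
for /f "delims=" %%a in (tmp.tmp) do (
  set "line=%%a"
  rem Replace every double quote with a backslash.
  call set "line=%%line:"=\%%"
  call :proc
)
pause
goto :eof

:proc
rem Split the line at the backslashes; the 4th token is the href value.
for /f "delims=\ tokens=4" %%a in ("%line%") do (
  echo %%a
)
goto :eof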



Regards
aGerman
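
A variation on the same idea, not taken from the post above: FOR /F can use the double quote itself as the delimiter if the options are written without surrounding quotes and the special characters are escaped with carets. A minimal sketch, assuming the same tmp.tmp and that the URL is always the fourth quote-delimited token, as in aGerman's script:

```batch
@echo off
rem The options are unquoted, so the space, the equals signs and the quote are escaped with carets.
for /f tokens^=4^ delims^=^" %%a in (tmp.tmp) do echo %%a
```

This skips the intermediate replacement step and the subroutine call, at the cost of a less readable option string.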
