Discussion forum for all Windows batch related topics.
Moderator: DosItHelp
-
miskox
- Posts: 630
- Joined: 28 Jun 2010 03:46
#1
Post
by miskox » 28 Jun 2010 07:16
Hi all,
Let's say we have a file tmp.tmp with the following records (there are some spaces in front of the <a title):
Code: Select all
<a title="some text" href="http://some.site.com/somefile1.jpg" target="_blank">somefile1.jpg</a>
This file contains several lines like the one above, with file names ranging from somefile1.jpg to somefile'n'.jpg.
Question:
How can we easily replace/remove special characters (", <, >) from this file?
Code: Select all
@echo off
setlocal ENABLEDELAYEDEXPANSION
for /f "tokens=1* delims=" %%a in (tmp.tmp) do set par1=%%a&& set par1=!par1:^<=!&& set par1=!par1:^>=! set par1=!par1:^"=!&&echo XX!par1!XX
goto :EOF
This is close to what I need but not 100% what I need. Just run it and you will see what I mean.
Please help.
Saso
-
jeb
- Expert
- Posts: 1055
- Joined: 30 Aug 2007 08:05
- Location: Germany, Bochum
#2
Post
by jeb » 28 Jun 2010 07:44
Hi,
This should work; you get the complete content of each line.
Code: Select all
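@echo off
setlocal ENABLEDELAYEDEXPANSION
rem Read tmp.tmp line by line; the empty delims keeps each line whole
for /f "tokens=1* delims=" %%a in (tmp.tmp) do (
    set par1=%%a
    rem Delayed expansion prints the special characters without reparsing them
    echo !par1!
)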
If you want to escape the characters too, you could use something like
Code: Select all
@echo off
setlocal ENABLEDELAYEDEXPANSION
for /f "tokens=1* delims=" %%a in (tmp.tmp) do (
    set par1=%%a
    rem Put a caret in front of every opening and closing angle bracket
    set par1=!par1:^<=^^^<!
    set par1=!par1:^>=^^^>!
    echo !par1!
)
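For the removal asked about in the first post, a minimal sketch along the same lines might look like the following. It assumes the lines in tmp.tmp contain no exclamation marks (delayed expansion would drop them), and it reuses the XX markers from the original attempt only to make the line boundaries visible.

```batch
@echo off
setlocal EnableDelayedExpansion
for /f "usebackq delims=" %%a in ("tmp.tmp") do (
    set "par1=%%a"
    rem Inside the quoted assignment the angle brackets need no caret
    set "par1=!par1:<=!"
    set "par1=!par1:>=!"
    rem Remove the double quotes last
    set "par1=!par1:"=!"
    echo XX!par1!XX
)
endlocal
```

Putting each substitution on its own line inside the parenthesized block avoids chaining everything with && on a single line, which makes a missing separator easier to spot.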
jeb
-
SenHu
- Posts: 19
- Joined: 19 Mar 2009 14:57
#3
Post
by SenHu » 28 Jun 2010 10:48
I think Jeb provided an excellent response.
If you want to do more fancy automated editing, you may want to consider these commands.
http://www.biterscripting.com/helppages_editors.html
I was unsure whether you wanted to remove the entire HTML tags or just manipulate some characters. If you want to remove the HTML tags (convert HTML source to plain text), here is a possible script already written by someone.
http://www.biterscripting.com/helppages ... oText.html
You call the script with this command.
Code: Select all
script "SS_WebPageToText.txt" page("http://www.dostips.com/forum/viewtopic.php?f=3&t=1130")
This will show the plain text from this page itself.
-
miskox
- Posts: 630
- Joined: 28 Jun 2010 03:46
#4
Post
by miskox » 28 Jun 2010 15:03
Thank you both.
I will check this tomorrow (it is late here).
I just want to extract a URL from the line above. The main problem is always with those special characters...
Thanks again.
Saso
-
aGerman
- Expert
- Posts: 4678
- Joined: 22 Jan 2010 18:01
- Location: Germany
#5
Post
by aGerman » 28 Jun 2010 15:45
Well, processing HTML code with batch is never a good idea. If the number of quotes before the URL is always the same, you could use these quotes to separate the URL from all the junk before and after it.
Try this:
Code: Select all
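@echo off &setlocal
for /f "delims=" %%a in (tmp.tmp) do (
    set "line=%%a"
    rem Replace every double quote with a backslash so the line can be split on it
    call set "line=%%line:"=\%%"
    call :proc
)
pause
goto :eof

:proc
rem The URL is the fourth backslash-separated token, i.e. the second quoted value
for /f "delims=\ tokens=4" %%a in ("%line%") do (
    echo %%a
)
goto :eof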
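A possible shorter variant of the same quote-splitting idea, as a sketch only: the FOR /F options can be written without surrounding quotes, with carets escaping the spaces, equals signs and the quote, so that the double quote itself becomes the delimiter. This assumes, as above, that the URL is always the second quoted value on each line.

```batch
@echo off
rem Split each line of tmp.tmp on double quotes and print the fourth token
for /f tokens^=4^ delims^=^" %%a in (tmp.tmp) do echo %%a
```

For the sample line from the first post, the fourth token should be the href value. This skips the intermediate replacement with backslashes, so a URL that contains a backslash would not be cut short.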
Regards
aGerman