Hey guys,
I know this question has been asked several times, but even after digging through different posts I'm not able to apply the answers to my specific problem (and I'm certainly not skilled enough with DOS scripting, as a matter of fact).
So, here we go.
Let's say I have a variable called $html which contains this:
<source src="http://videos.dummysite.com/videos/video01.mp4" type="video/mp4"/>
Now, what I would like to do is extract just the URL between the first two quotes, so that in the end I retrieve just this:
http://videos.dummysite.com/videos/video01.mp4
(Note that it could be a video format other than mp4, which is why I'm talking about extracting the address between the first two quotes, not the quotes after the "type" attribute.)
Does that make any sense to you guys?
How can I do that?
using for loop and delims to extract a substring
Re: using for loop and delims to extract a substring
Give that a go:
Steffen
Code: Select all
@echo off &setlocal
set "$html=<source src="http://videos.dummysite.com/videos/video01.mp4" type="video/mp4"/>"
for /f tokens^=2^ delims^=^" %%i in ("%$html%") do echo %%i
pause
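In case the unquoted option string looks odd, here is a commented sketch of the same idea that stores the result instead of echoing it. The variable name url is only an assumption for illustration; everything else comes from the snippet above.

```batch
@echo off
setlocal
rem Sample line taken from the question.
set "$html=<source src="http://videos.dummysite.com/videos/video01.mp4" type="video/mp4"/>"

rem The option string is left unquoted because the delimiter is itself a quote.
rem The equals signs, the space between both options and the quote are each
rem escaped with a caret so that they reach FOR /F as one option string.
rem Splitting on the quote makes token 1 the text before the first quote
rem and token 2 the URL. The name url is hypothetical.
set "url="
for /f tokens^=2^ delims^=^" %%i in ("%$html%") do set "url=%%i"

echo %url%
```

Because the split happens on the quote and not on spaces, the extension of the video does not matter, and neither does a space inside the address.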
Re: using for loop and delims to extract a substring
Code: Select all
@echo off
setlocal
set "$html=<source src="http://videos.dummysite.com/videos/video01.mp4" type="video/mp4"/>"
set "x=%$html: =" & set "%"
echo %src:~1,-1%
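The substitution in that snippet replaces every space with a quote, an ampersand and a new set command, so one line turns into several assignments. A minimal sketch of the same mechanism on a simpler string may make it easier to follow; attrs, x, width and height are made-up names, not part of the original script.

```batch
@echo off
setlocal
rem Hypothetical input: two name=value pairs separated by one space.
set "attrs=width=640 height=480"

rem Each space is replaced by a quote, an ampersand and a new SET command,
rem so after expansion the next line is executed as two commands:
rem one that assigns x and one that assigns height.
set "x=%attrs: =" & set "%"

echo x=%x%
echo height=%height%
```

The first pair ends up in the throwaway variable x and every following pair becomes a variable of its own. Since the split is done on spaces, this approach presumably stops working as soon as a value contains a space.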
Re: using for loop and delims to extract a substring
Code: Select all
@echo off
setlocal
set "$html=<source src="http://videos.dummysite.com/videos/video01.mp4" type="video/mp4"/>"
set %$html: type=<%nul <nul
echo(%src%
Re: using for loop and delims to extract a substring
Thanks a lot for all those replies.
I realize that just posting a question without giving the full picture doesn't allow me to make full use of your answers.
I apologize for that.
So, here are the details.
My full script has the following purpose: go through an ASCII file containing a list of HTML pages, download each page, search it for the unique pattern "source src=" which gives me the URL of a video, and then download the video itself.
For the moment, my script looks like this:
Each downloaded page is saved locally following the naming convention Page-Nbvid.html (and the final video is named the same way).
Code: Select all
set /a PAGE=80
set /a NBVID=1
setlocal ENABLEDELAYEDEXPANSION
rem Read one page URL per line from Pages.txt
for /F "tokens=*" %%a in (Pages.txt) do (
    rem Download the page as Page-Nbvid.html
    curl -o !PAGE!-!NBVID!.html %%a --cacert cacert.pem -b cookies.txt --silent
    rem Take the third token of the "source src=" line as the video URL
    for /f "tokens=3 delims== " %%b in ('findstr /c:"source src=" !PAGE!-!NBVID!.html') do (
        curl -o !PAGE!-!NBVID!.mp4 %%~b --cacert cacert.pem -b cookies.txt --silent
    )
)
My actual problem, now that the extraction of the name is fixed (which is what I originally posted this question for), is that if there are one or more spaces in the name of the video, the tokens and delims parameters don't retrieve the full name: it stops after the first space.
Any idea how to work around this?
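One possible way around the space problem is to reuse the quote-delimiter trick from the first reply directly on the findstr output, so that the line is split on quotes instead of on spaces. The sketch below is only an illustration under a few assumptions: sample.html and its content are made up, the src attribute is assumed to be the first quoted value on the matching line, and the server is assumed to expect spaces encoded as %20.

```batch
@echo off
setlocal EnableDelayedExpansion
rem Hypothetical sample page, created only so the sketch is self-contained.
> "sample.html" echo ^<source src="http://videos.dummysite.com/videos/my video 01.mp4" type="video/mp4"/^>

rem Splitting on the quote keeps the spaces inside the address.
rem Assumption: src is the first quoted attribute on the matching line.
for /f tokens^=2^ delims^=^" %%b in ('findstr /c:"source src=" "sample.html"') do (
    set "url=%%b"
    echo Raw URL    : !url!
    rem Assumption: the server expects spaces to be percent-encoded.
    set "url=!url: =%%20!"
    echo Encoded URL: !url!
)
```

In the real loop the echo lines would be replaced by the curl call, with the address passed in quotes.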
Re: using for loop and delims to extract a substring
There are multiple issues here. For example, lines in HTML files are not guaranteed to stay under 8190 characters, and in most cases the files are UTF-8 encoded, ... .
You might be better off using a text extraction tool such as Dave Benham's "JREPL.BAT":
viewtopic.php?f=3&t=6044
(An example is given on the linked page; just search for "extracting between html tags".)
penpen
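As a rough idea of what that could look like, here is a sketch that assumes JREPL.BAT is in the current folder or on the PATH and that its /F (input file) and /JMATCH (print only the replace value for each match) options behave as described in its built-in help. The file sample.html and its content are made up for the example, and the regular expression is only one possible choice.

```batch
@echo off
setlocal
rem Hypothetical sample page, created only so the sketch is self-contained.
> "sample.html" echo ^<source src="http://videos.dummysite.com/videos/my video 01.mp4" type="video/mp4"/^>

rem Assumption: JREPL.BAT is in the current folder or on the PATH.
rem \x22 is the regex escape for a double quote, which keeps literal quotes
rem out of the search argument. With /JMATCH only the replace value is printed,
rem here the first captured group, which is the text between the quotes.
for /f "delims=" %%u in ('jrepl "source src=\x22(.*?)\x22" "$1" /jmatch /f "sample.html"') do (
    echo Found: %%u
)
```

Since the match is done by a regular expression instead of FOR /F tokens, spaces in the address and other attributes on the same line should not get in the way.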