I found that the downloader works, but I need to add a trailing slash after the HTTP address. I lost two days before somebody noticed it.
....
But I need to correct my script a bit, because I can't tell when a downloaded page contains no proxies because there are no more results. I should stop downloading as soon as a page does not contain enough results.
Old script:
Code:
@echo off
Setlocal EnableDelayedExpansion
SET proxy_3=hide_1.htm
SET source_3=http://www.hidemyass.com/proxy-list/search-227955/
FOR /L %%N IN (1,+1,40) DO CALL :download "http://www.hidemyass.com/proxy-list/2%%N" hidemyass_ %%N
Wait... I am not sure why I have the number 2 before %%N... I think it should rather be
Code:
FOR /L %%N IN (1,+1,40) DO CALL :download "!source_3!%%N" hidemyass_ %%N
The code for download:
Code:
:download
SET user_agent="User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.8) Gecko/20100214 Ubuntu/9.10 (karmic) Firefox/3.5.8"
SET accept="Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
SET accept_language="Accept-Language: en-us,en;q=0.5"
SET accept_charset="Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7"
SET keep_alive="Keep-Alive: 300"
SET connection="Connection: keep-alive"
SET url=http://request.urih.com/
REM Remove any old copy of the output file (%2%3.htm), not the URL in %1
del "%~2%~3.htm" 2>NUL
Echo Gonna download from %~1
curl -o "%~2%~3.htm" -H %user_agent% -H %accept% -H %accept_language% -H %accept_charset% -H %keep_alive% -H %connection% "%~1"
GOTO :eof
Now I need to detect the size of the downloaded file: if it is less than or equal to 26823 bytes and the file does not contain the string <span class="updatets ">,
then the page is empty and no further pages should be downloaded.
A better way, though, would be to count how many times the string <span class="updatets "> occurs in the downloaded file to decide whether I should continue downloading.
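A minimal sketch of stopping the loop early, under some assumptions: FOR /L has no clean way to break out, so a flag variable (here called stop, a hypothetical name) makes the remaining iterations do nothing. The :fetch routine is a simplified stand-in for :download, the search text updatets is shortened to avoid quoting problems, and the minimum of 1 is only a placeholder.

```batch
@echo off
Setlocal EnableDelayedExpansion
SET source_3=http://www.hidemyass.com/proxy-list/search-227955/
SET "stop="
FOR /L %%N IN (1,+1,40) DO (
    REM IF DEFINED is evaluated on every pass, so later iterations are skipped.
    IF NOT DEFINED stop CALL :fetch "!source_3!%%N" hidemyass_ %%N
)
GOTO :eof

:fetch
REM Simplified download; the real script passes extra headers to curl.
curl -s -o "%~2%~3.htm" "%~1"
SET count=0
REM grep -c prints the number of lines that contain the marker text.
FOR /F %%C IN ('grep -c "updatets" "%~2%~3.htm"') DO SET count=%%C
ECHO Page %3: %count% matching lines
REM 1 is a placeholder minimum; below it, no further pages are fetched.
IF %count% LSS 1 SET stop=1
GOTO :eof
```

The loop still counts up to 40, but once stop is set each remaining pass is a no-op, so no further requests are sent.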
But the download section should serve several servers, and hidemyass is just one of them, so I need a universal solution. I would like to take the second argument of the call and remove everything from the underscore to the end of the string, so hidemyass_ becomes hidemyass. Then, when working with hidemyass, I could define which subroutine is called to specify how many occurrences of a certain string the downloaded file must contain. If the count is lower, don't download the next file. If the count is 0, delete the file that was just downloaded.
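The idea above could be sketched roughly like this. The label names :check and :rules_hidemyass, the variables server, marker, min and stop, and the values assigned to them are all assumptions made for illustration; each server would need its own :rules_ label.

```batch
@echo off
Setlocal EnableDelayedExpansion
REM Example call: prefix and page number, matching the file hidemyass_1.htm
CALL :check hidemyass_ 1
GOTO :eof

:check
REM Keep only the part of the prefix before the first underscore.
FOR /F "tokens=1 delims=_" %%S IN ("%~1") DO SET "server=%%S"
SET "marker="
SET min=1
REM Per-server settings live in their own subroutine, e.g. :rules_hidemyass
CALL :rules_%server%
SET count=0
IF DEFINED marker FOR /F %%C IN ('grep -c "%marker%" "%~1%~2.htm"') DO SET count=%%C
ECHO %server%: %count% matching lines, minimum %min%
REM No results at all: remove the file that was just downloaded.
IF %count% EQU 0 DEL "%~1%~2.htm" 2>NUL
REM Too few results: set a flag that the calling loop can test.
IF %count% LSS %min% SET stop=1
GOTO :eof

:rules_hidemyass
REM Placeholder values; adjust the text and the minimum per server.
SET "marker=updatets"
SET min=1
GOTO :eof
```

Note that grep -c counts matching lines, not individual occurrences, so several spans on one line are counted once.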
I know how to get the file size:
for %%I in (!file!) do SET filesize=%%~zI
echo Filesize !filesize! B
but I would like help with filtering the file...
I have a few commands that do similar things with grep:
Code:
FOR /F "tokens=1-20 delims=<>" %%A IN ('grep -B 1411 -E "</table>" %file% ^| grep -E ^"^(display^|^>[0-9]{1^,2}^<^|[0-9][0-9][0-9]^|[0-9][0-9]{1^,2}^</td^>^|flag^|^<td^>HTTP^|rightborder^).*$^" ') DO (
This takes each line containing the tag </table> together with up to 1411 lines before it (that is what -B 1411 does) and filters that block with the second grep. I would just need to change the second part so that it returns the number of matches for the string <span class="updatets "> ...
So I believe I could have something like
Code:
grep -B 1411 -E "</table>" %2%3.htm ^| grep -c ^"<span class="updatets "> ... "
to get the number of matches ...
Any tips on how to complete this code?
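One possible way to capture that count in a variable, as a sketch only: FOR /F reads the single number printed by grep -c. The file name hidemyass_1.htm and the variable count are assumptions. The dots in the pattern stand in for the two double quotes, so nothing has to be escaped for cmd, at the cost of a slightly looser match.

```batch
@echo off
Setlocal EnableDelayedExpansion
REM Assumed file name; in the script this would be %2%3.htm
SET "file=hidemyass_1.htm"
SET count=0
REM Each dot matches any single character, here the quotes around the class value.
FOR /F %%C IN ('grep -B 1411 -E "</table>" "%file%" ^| grep -c "<span class=.updatets .>"') DO SET count=%%C
ECHO Found %count% matching lines in %file%
```

grep -c reports the number of matching lines, so if a page puts several of these spans on one line the result will be lower than the number of occurrences.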