Extract data from website (another problem)


#1 Post by miskox » 24 Apr 2018 04:52

Hello!

In viewtopic.php?f=3&t=7852#p52292, Hackoo provided a very useful tool for extracting links from a website.

I have 107 HTML files like 1200.html (see the attached .zip; it is from lenovo.com).

Hackoo's solution gives me this:

Code:

http://support.lenovo.com/en_US/downloads/detail.page?DocID=DS002917 ========> ... Learn more
../../../ibmdl/pub/pc/pccbbs/mobiles/tpafkq98.exe ========>  version 5.12.4028 - Audio driver for Windows 98
../../../ibmdl/pub/pc/pccbbs/mobiles/tpafkq98.txt ========>  Read me
http://support.lenovo.com/en_US/downloads/detail.page?DocID=DS001350 ========> ... Learn more
../../../ibmdl/pub/pc/pccbbs/mobiles/tpaf152k.exe ========>  version 5.12.01.4031 - Audio Features III for Windows 2000
../../../ibmdl/pub/pc/pccbbs/mobiles/tpaf152k.txt ========>  Read me
http://support.lenovo.com/en_US/downloads/detail.page?DocID=DS003130 ========> ... Learn more
../../../ibmdl/pub/pc/pccbbs/mobiles/aftpkw8m.exe ========>  version 5.12.01.4028 - Audio Features III for Windows 98/Me
../../../ibmdl/pub/pc/pccbbs/mobiles/aftpkw8m.txt ========>  Read me
http://support.lenovo.com/en_US/downloads/detail.page?DocID=DS003688 ========> ... Learn more
../../../ibmdl/pub/pc/pccbbs/mobiles/tpafkw2k.exe ========>  Version 5.12.01.4031 - Audio features IV for Windows 2000
../../../ibmdl/pub/pc/pccbbs/mobiles/tpafkw2k.txt ========>  Read me

I need more information:

The page can display only the information related to a selected category, and it contains categories such as 'audio', 'bios', 'cd and dvd drive', 'diskette drive'... I also need the category for each entry, together with the 'operating system' column, and all of it in a form that lets me match the category and operating systems to the corresponding links.

See attached image.

Hope this makes sense.
Thanks.
Saso
Attachments
1200.png (36.36 KiB)
1200.zip (8.19 KiB)


#2 Post by miskox » 25 Apr 2018 14:00

I checked the .html file. It looks like I need the following:

- The category is between

Code:

id='table1' name='

and

Code:

'><thead>

- The supported operating systems are between

Code:

<br><br></td><td>

and

Code:

<br></td><td>

If there is more than one, they are separated by

Code:

<br>

Well, I don't know how to extract this information.

Anyone?

Saso
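
One possible way to use the markers listed above is sketched here in Python rather than batch. It is only a rough sketch: the sample HTML is invented for illustration (the real 1200.html is not reproduced in this thread), the function names find_between and extract_rows are hypothetical, and it assumes every row of a category sits between that category's marker and the next one.

```python
# Markers taken from the post above.
CATEGORY_START = "id='table1' name='"
CATEGORY_END = "'><thead>"
OS_START = "<br><br></td><td>"
OS_END = "<br></td><td>"
OS_SEPARATOR = "<br>"


def find_between(text, start, end):
    """Return every substring of text that lies between start and end."""
    results = []
    pos = 0
    while True:
        begin = text.find(start, pos)
        if begin == -1:
            break
        begin += len(start)
        finish = text.find(end, begin)
        if finish == -1:
            break
        results.append(text[begin:finish])
        pos = finish + len(end)
    return results


def extract_rows(html):
    """Return a list of (category, [operating systems]) tuples, one per row."""
    rows = []
    # Split at each category marker; the piece before the first marker is skipped.
    for chunk in html.split(CATEGORY_START)[1:]:
        category, found, body = chunk.partition(CATEGORY_END)
        if not found:
            continue  # closing marker missing: skip this chunk
        for cell in find_between(body, OS_START, OS_END):
            systems = [s.strip() for s in cell.split(OS_SEPARATOR) if s.strip()]
            rows.append((category, systems))
    return rows


# Invented sample, NOT the real Lenovo page; it only mimics the markers.
SAMPLE = (
    "<table id='table1' name='Audio'><thead><tr><th>Title</th></tr></thead>"
    "<tr><td><a href='a.exe'>Audio driver</a><br><br></td><td>"
    "Windows 98<br>Windows Me<br></td><td>5.12</td></tr></table>"
    "<table id='table1' name='BIOS'><thead><tr><th>Title</th></tr></thead>"
    "<tr><td><a href='b.exe'>BIOS update</a><br><br></td><td>"
    "Windows 2000<br></td><td>1.0</td></tr></table>"
)

if __name__ == "__main__":
    for category, systems in extract_rows(SAMPLE):
        print(category, "|", ", ".join(systems))
```

If the rows come out in the same order as the links from Hackoo's tool, the two lists could probably be matched by position, but that would need checking against the real files.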
