Hello DOS Tips
We have a list of URLs in a txt file "c:\files\list.txt"
We want to keep only the URLs that end in one of the filetypes we care about.
For instance, only keep URLs that end in PDF, DOC, PPT, htm, or html and dump only these into "c:\files\list_scrubbed.txt"
What we have is:
findstr ".pdf" c:\files\list.txt >c:\files\list_scrubbed.txt
findstr ".doc" c:\files\list.txt >>c:\files\list_scrubbed.txt
findstr ".ppt" c:\files\list.txt >>c:\files\list_scrubbed.txt
findstr ".htm" c:\files\list.txt >>c:\files\list_scrubbed.txt
findstr ".html" c:\files\list.txt >>c:\files\list_scrubbed.txt
However, if the filetype appears anywhere in the URL other than the end, the URL still gets written to the scrubbed list.
We are trying to avoid any URLs that do not actually END in the filetype.
So http://www.somedomain.com/FlashyBadgers ... hadows.asp
is still ending up in the scrubbed file.
Any advice?
Aisha
findstr for file extension at end of line
Re: findstr for file extension at end of line
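FINDSTR treats the search string as a regular expression, where an unescaped . stands for "any character". Escape the period with a backslash to match a literal dot, and use the /e option so that the pattern only matches at the end of the line.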
findstr /rie "\.pdf \.doc \.ppt \.html*" c:\files\list.txt >c:\files\list_scrubbed.txt
Options:
/r use regular expressions
/i ignore case
/e match only at the end of the line
Regular expressions:
\. literal period
* zero or more occurrences of the preceding character, so \.html* matches both .htm and .html
For further information execute FINDSTR /? or have a look at the command index.
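If you want to check the pattern before running it on the real list, a small test like the following may help. It is only a sketch: the file name sample.txt and the example.com URLs are made up for illustration and are not from the original list.

```batch
@echo off
rem Hypothetical test data: sample.txt and the URLs below are invented for this check.
> "%TEMP%\sample.txt" (
  echo http://example.com/report.PDF
  echo http://example.com/page.html
  echo http://example.com/index.htm
  echo http://example.com/slides.ppt?download=yes
  echo http://example.com/my.pdf.viewer.asp
)

rem /r = regular expressions, /i = ignore case, /e = match only at end of line.
rem The search strings are separated by spaces; a line is kept if any one matches.
findstr /rie "\.pdf \.doc \.ppt \.html*" "%TEMP%\sample.txt"

rem Clean up the temporary test file.
del "%TEMP%\sample.txt"
```

With this input, only the first three URLs should be printed. The last two contain .ppt and .pdf somewhere in the middle but do not end with them, so /e should reject them. Note that a line with trailing spaces after the extension would also be rejected, since the extension is then no longer at the end of the line.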
Steffen
Re: findstr for file extension at end of line
that is perfect - thank you sir
Aisha