findstr for file extension at end of line
Posted: 11 Apr 2017 15:07
Hello DOS Tips
We have a list of URLs in a txt file "c:\files\list.txt"
We want to keep only the URLs that end in one of the file types we care about.
For instance, keep only URLs that end in PDF, DOC, PPT, htm, or html, and write just those to "c:\files\list_scrubbed.txt"
What we have is:
findstr ".pdf" c:\files\list.txt >c:\files\list_scrubbed.txt
findstr ".doc" c:\files\list.txt >>c:\files\list_scrubbed.txt
findstr ".ppt" c:\files\list.txt >>c:\files\list_scrubbed.txt
findstr ".htm" c:\files\list.txt >>c:\files\list_scrubbed.txt
findstr ".html" c:\files\list.txt >>c:\files\list_scrubbed.txt
However, if the file type appears anywhere in the URL other than the end, the URL still gets written to the scrubbed list.
We want to exclude any URL that does not actually END in the file type.
So http://www.somedomain.com/FlashyBadgers ... hadows.asp
is still ending up in the scrubbed file.
Any advice?
Aisha
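One possible direction, offered as a sketch rather than a tested fix: findstr's regular-expression mode treats $ as an end-of-line anchor and \. as a literal dot, so the five commands above could be collapsed into a single pass. The sketch assumes list.txt holds one URL per line with no trailing spaces and with Windows (CRLF) line endings. The /I switch is an added assumption, on the guess that upper-case extensions such as .PDF should be kept too.

```batch
@echo off
rem Keep only the lines that END in one of the wanted extensions.
rem   /I  ignore case (assumption: .PDF and .pdf should both be kept)
rem   /R  treat the search strings as regular expressions
rem   \.  a literal dot (a bare . would match any character)
rem   $   end of line
rem The space-separated strings inside the quotes are OR-ed together.
findstr /i /r "\.pdf$ \.doc$ \.ppt$ \.htm$ \.html$" "c:\files\list.txt" >"c:\files\list_scrubbed.txt"
```

Because this is a single pass, each matching URL is written once, whereas the separate ".htm" and ".html" commands would write every .html URL twice. Note that a URL with anything after the extension, such as a query string, would be dropped by this pattern.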