
Using PowerShell to check if URLs are valid

Posted: 05 Sep 2015 18:29
by born2achieve
Hello guys,

I am trying to check whether URLs exist. I have a text file containing 10,000 URLs, and I want to find the ones that are broken because the image no longer exists in the directory. I tried Wget, but it took 3 hours to check them all and report which URLs have no image. I searched around and found a nice article saying PowerShell can do this much faster. I can follow the code below, but I am a little confused about how to feed my "Url.txt" file to it. Any suggestions on how to use the code below to check whether each image exists and, if not, output the broken URL?

https://www.petri.com/testing-uris-urls-powershell
Code:

#requires -version 4.0

Function Test-URI {
<#
.Synopsis
Test a URI or URL
.Description
This command will test the validity of a given URL or URI that begins with either http or https. The default behavior is to write a Boolean value to the pipeline. But you can also ask for more detail.

Be aware that a URI may return a value of True simply because the server responded correctly. For example, this command makes it appear that the URI is valid.

test-uri -uri http://files.snapfiles.com/localdl936/CrystalDiskInfo7_2_0.zip

But if you look at the test in detail:

ResponseUri   : http://files.snapfiles.com/localdl936/CrystalDiskInfo7_2_0.zip
ContentLength : 23070
ContentType   : text/html
LastModified  : 1/19/2015 11:34:44 AM
Status        : 200

You'll see that the content type is Text and most likely a 404 page. By comparison, this is the desired result from the correct URI:

PS C:\> test-uri -detail -uri http://files.snapfiles.com/localdl936/CrystalDiskInfo6_3_0.zip

ResponseUri   : http://files.snapfiles.com/localdl936/CrystalDiskInfo6_3_0.zip
ContentLength : 2863977
ContentType   : application/x-zip-compressed
LastModified  : 12/31/2014 1:48:34 PM
Status        : 200

.Example
PS C:\> test-uri https://www.petri.com
True
.Example
PS C:\> test-uri https://www.petri.com -detail

ResponseUri   : https://www.petri.com/
ContentLength : -1
ContentType   : text/html; charset=UTF-8
LastModified  : 1/19/2015 12:14:57 PM
Status        : 200
.Example
PS C:\> get-content D:\temp\uris.txt | test-uri -Detail | where { $_.status -ne 200 -OR $_.contentType -notmatch "application"}

ResponseUri   : http://files.snapfiles.com/localdl936/CrystalDiskInfo7_2_0.zip
ContentLength : 23070
ContentType   : text/html
LastModified  : 1/19/2015 11:34:44 AM
Status        : 200

ResponseURI   : http://download.bleepingcomputer.com/grinler/rkill
ContentLength :
ContentType   :
LastModified  :
Status        : 404

Test a list of URIs and filter for those that are not OK or where the type is not an application.
.Notes
Last Updated: January 19, 2015
Version     : 1.0

Learn more about PowerShell:
http://jdhitsolutions.com/blog/essential-powershell-resources/

  ****************************************************************
  * DO NOT USE IN A PRODUCTION ENVIRONMENT UNTIL YOU HAVE TESTED *
  * THOROUGHLY IN A LAB ENVIRONMENT. USE AT YOUR OWN RISK.  IF   *
  * YOU DO NOT UNDERSTAND WHAT THIS SCRIPT DOES OR HOW IT WORKS, *
  * DO NOT USE IT OUTSIDE OF A SECURE, TEST SETTING.             *
  ****************************************************************

.Link
Invoke-WebRequest
#>

[cmdletbinding(DefaultParameterSetName="Default")]
Param(
[Parameter(Position=0,Mandatory,HelpMessage="Enter the URI path starting with HTTP or HTTPS",
ValueFromPipeline,ValueFromPipelineByPropertyName)]
[ValidatePattern( "^(http|https)://" )]
[Alias("url")]
[string]$URI,
[Parameter(ParameterSetName="Detail")]
[Switch]$Detail,
[ValidateScript({$_ -ge 0})]
[int]$Timeout = 30
)

Begin {
    Write-Verbose -Message "Starting $($MyInvocation.Mycommand)"
    Write-Verbose -message "Using parameter set $($PSCmdlet.ParameterSetName)"
} #close begin block

Process {

    Write-Verbose -Message "Testing $uri"
    Try {
        #hash table of parameter values for Invoke-WebRequest
        $paramHash = @{
            UseBasicParsing  = $True
            DisableKeepAlive = $True
            Uri              = $uri
            Method           = 'Head'
            ErrorAction      = 'stop'
            TimeoutSec       = $Timeout
        }

        $test = Invoke-WebRequest @paramHash

        if ($Detail) {
            $test.BaseResponse |
            Select-Object ResponseURI,ContentLength,ContentType,LastModified,
            @{Name="Status";Expression={$Test.StatusCode}}
        } #if $detail
        else {
            if ($test.statuscode -ne 200) {
                #it is unlikely this code will ever run but just in case
                Write-Verbose -Message "Failed to request $uri"
                Write-Verbose -Message ($test | Out-String)
                $False
            }
            else {
                $True
            }
        } #else quiet

    }
    Catch {
        #there was an exception getting the URI
        Write-Verbose -Message $_.exception
        if ($Detail) {
            #most likely the resource is 404
            $objProp = [ordered]@{
                ResponseURI   = $uri
                ContentLength = $null
                ContentType   = $null
                LastModified  = $null
                Status        = 404
            }
            #write a matching custom object to the pipeline
            New-Object -TypeName psobject -Property $objProp
        } #if $detail
        else {
            $False
        }
    } #close Catch block
} #close Process block

End {
    Write-Verbose -Message "Ending $($MyInvocation.Mycommand)"
} #close end block

} #close Test-URI Function
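
Note that the script above only defines the Test-URI function, so running it by itself prints nothing. The function has to be loaded and then fed the lines of the text file, as in the third .Example of its help text. A minimal sketch of that for the image check follows. The helper name Select-BrokenResult, the script file name Test-URI.ps1, and the output file name Broken_URL.txt are assumptions for illustration and do not come from the article.

```powershell
# Hypothetical helper: passes through only the results that look broken
# for an image check (non-200 status, or a content type that is not image/*).
function Select-BrokenResult {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Result
    )
    process {
        if ($Result.Status -ne 200 -or $Result.ContentType -notmatch '^image/') {
            $Result
        }
    }
}

# Usage, assuming the function above was saved as Test-URI.ps1 (hypothetical
# name) in the same folder as Url.txt:
#
#   . .\Test-URI.ps1
#   Get-Content .\Url.txt | Test-URI -Detail | Select-BrokenResult |
#       ForEach-Object { "$($_.ResponseUri)" } | Set-Content .\Broken_URL.txt
```

Filtering on the content type as well as the status is what catches the "200 but really an error page" case described in the help text.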

Re: Help Needed in PowerShell

Posted: 05 Sep 2015 22:55
by ShadowThief
I'm afraid I'm not too familiar with PowerShell, what with this being a forum dedicated to batch.

Re: Help Needed in PowerShell

Posted: 05 Sep 2015 22:57
by foxidrive
I'm not all that clued up on Powershell but that script doesn't return anything here. No info.

Re: Help Needed in PowerShell

Posted: 06 Sep 2015 11:23
by born2achieve
Apologies for the inconvenience, guys. I thought someone who knows PowerShell might be able to help. Not a problem; I am trying a different approach.

Thanks

Re: Help Needed in PowerShell

Posted: 06 Sep 2015 12:02
by Yury
Try a batch file with this simpler PowerShell code:

Code:

@powershell "gc 'URL.txt'|%%{if($(Try{(iwr $_).StatusCode}Catch{}) -eq 200){$_}}|sc 'True_URL.txt'"
.

Re: Help Needed in PowerShell

Posted: 06 Sep 2015 12:39
by born2achieve
Hi Yury,

Thanks for the reply. This is what I tried based on your input: I created a .bat file and pasted the code as below.

Code:

@powershell "gc 'E:\CheckImageExits\mm.txt'|%%{if($(Try{(iwr $_).StatusCode}Catch{}) -eq 200){$_}}|sc 'True_URL.txt'"

But nothing is happening. Any suggestions, please?

Re: Help Needed in PowerShell

Posted: 06 Sep 2015 19:52
by Meerkat
born2achieve wrote: Hi Yury,

Thanks for the reply. This is what I tried based on your input: I created a .bat file and pasted the code as below.

Code:

@powershell "gc 'E:\CheckImageExits\mm.txt'|%%{if($(Try{(iwr $_).StatusCode}Catch{}) -eq 200){$_}}|sc 'True_URL.txt'"

But nothing is happening. Any suggestions, please?


Hmm... IWR does not work in my PowerShell (I have v2.0). Maybe it exists only in later versions of PS...

Maybe this could help:

Code:

@powershell "gc 'links.txt'|%%{if($(try{[int][Net.WebRequest]::Create($_).GetResponse().Statuscode}Catch{}) -eq 200){$_}}|sc 'True_URL.txt'"


NOTE:
Every link in the input file must start with the http:// (or https://) prefix so that it is parsed correctly.

Meerkat
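
If some lines in the input file lack the prefix mentioned in the note, they can be normalized before the check. The following is only a sketch: the function name Add-HttpPrefix and the output file name links_fixed.txt are made up for illustration, and it assumes that a line without a scheme should be tried over plain http.

```powershell
# Hypothetical helper: trims each line, skips blanks, and prepends http://
# to any line that does not already start with http:// or https://.
function Add-HttpPrefix {
    param(
        [Parameter(ValueFromPipeline = $true)]
        [string]$Line
    )
    process {
        $url = $Line.Trim()
        if ($url -eq '') { return }            # skip blank lines
        if ($url -notmatch '^https?://') {     # no scheme yet, assume plain http
            $url = "http://$url"
        }
        $url
    }
}

# Usage (file names are assumptions):
#   Get-Content 'links.txt' | Add-HttpPrefix | Set-Content 'links_fixed.txt'
```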

Re: Help Needed in PowerShell

Posted: 06 Sep 2015 20:22
by born2achieve
Hi Meerkat,

Thanks,

I copied your code and saved it as a .bat file. The text file links.txt is in the same directory as the .bat file. Then I ran the .bat file and nothing happened. Am I missing anything here?

I have PowerShell version 1.0 and am working on Windows 7. Any help would be much appreciated.

Re: Help Needed in PowerShell

Posted: 06 Sep 2015 20:34
by Squashman
born2achieve wrote: I have PowerShell version 1.0 and am working on Windows 7. Any help would be much appreciated.

I highly doubt that.

Re: Help Needed in PowerShell

Posted: 06 Sep 2015 20:40
by born2achieve
Thanks, Squashman.

It would be great if there were a workaround to solve this.

Re: Help Needed in PowerShell

Posted: 06 Sep 2015 20:55
by foxidrive
Meerkat's code works here in my brief test.
The -command switch I added wasn't needed here but works fine.

Code:

@echo off
powershell -command "gc 'links.txt'|%%{if($(try{[int][Net.WebRequest]::Create($_).GetResponse().Statuscode}Catch{}) -eq 200){$_}}|sc 'True_URL.txt'"
pause


links.txt

Code:

http://www.google.com
https://www.google.com
http://www.teeeelstra.com
https://www.teeeelstra.com
http://www.telstra.com
https://www.telstra.com
http://www.dostips.com/forum/images/smilies/icon_rollrrreyes.gif
http://www.dostips.com/forum/images/smilies/icon_rolleyes.gif



The output file:

Code:

http://www.google.com
https://www.google.com
http://www.telstra.com
https://www.telstra.com
http://www.dostips.com/forum/images/smilies/icon_rolleyes.gif

Re: Using PowerShell to check if URLs are valid

Posted: 06 Sep 2015 21:10
by born2achieve
Hi Foxidrive,
Thanks, now it works. But I want to capture the invalid URLs, and the logic in the code outputs the valid ones. I have 10,000 URLs in the file and I need to output only the invalid ones. Any suggestions, please?

Re: Using PowerShell to check if URLs are valid

Posted: 06 Sep 2015 21:20
by born2achieve
Finally I was able to do it. Instead of "-eq" I need to use "-ne", which means "not equal". Great, and thanks everyone for the wonderful help.
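
For a list of 10,000 URLs, a slightly longer script than the one-liner may behave better. The sketch below sends HEAD requests (as Test-URI does), sets a timeout, and closes every response, because unclosed responses can hold on to the limited pool of connections per host and stall later requests. The function name Get-UrlStatus, the 10-second default timeout, and the file names Broken_URL.txt and check-urls.ps1 are assumptions for illustration, not something from this thread. Any failure (404, DNS error, timeout, malformed URL) is reported as 0, and a server that rejects HEAD would also be counted as broken.

```powershell
# Hypothetical helper: returns the HTTP status code as an integer,
# or 0 when the request fails for any reason.
function Get-UrlStatus {
    param(
        [string]$Url,
        [int]$TimeoutMs = 10000   # assumed default, adjust as needed
    )
    $response = $null
    try {
        $request = [Net.WebRequest]::Create($Url)
        $request.Method  = 'HEAD'        # headers only, no image download
        $request.Timeout = $TimeoutMs
        $response = $request.GetResponse()
        [int]$response.StatusCode
    }
    catch {
        0                                # 404, DNS failure, timeout, bad URL...
    }
    finally {
        if ($response) { $response.Close() }   # free the connection
    }
}

# Usage: write only the URLs that did NOT answer with 200.
#   Get-Content 'links.txt' |
#       Where-Object { (Get-UrlStatus $_) -ne 200 } |
#       Set-Content 'Broken_URL.txt'
```

Saved as check-urls.ps1 with the usage lines uncommented, this could be started from a batch file with `powershell -ExecutionPolicy Bypass -File check-urls.ps1`.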

Re: Using PowerShell to check if URLs are valid

Posted: 07 Sep 2015 04:58
by born2achieve
One quick question:

Is there any way to print the URL currently being checked? Right now I don't see what's happening on the screen; it's blank, but the process is running. I would like to see which URL is being checked. Any suggestions, please?
Thanks

Re: Using PowerShell to check if URLs are valid

Posted: 07 Sep 2015 05:56
by Meerkat
Sir, something like this? :)

Batch file code:

Code:

@echo off
powershell -command "gc 'links.txt'|%%{if($(try{[int][Net.WebRequest]::Create($_).GetResponse().Statuscode}Catch{}) -eq 200){$_+' [OK]'}else{$_+' [X]'}}"
pause


links.txt

Code:

http://www.google.com
https://www.google.com
http://www.teeeelstra.com
https://www.teeeelstra.com
http://www.telstra.com
https://www.telstra.com
http://www.dostips.com/forum/images/smilies/icon_rollrrreyes.gif
http://www.dostips.com/forum/images/smilies/icon_rolleyes.gif


Output of Batch File:

Code:

\Desktop>code
http://www.google.com [OK]
https://www.google.com [OK]
http://www.teeeelstra.com [X]
https://www.teeeelstra.com [X]
http://www.telstra.com [OK]
https://www.telstra.com [OK]
http://www.dostips.com/forum/images/smilies/icon_rollrrreyes.gif [X]
http://www.dostips.com/forum/images/smilies/icon_rolleyes.gif [OK]
Press any key to continue . . .

Desktop>


Meerkat
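
To see progress on screen and still collect the broken URLs in a file, the two outputs can be separated: Write-Host prints to the console, while values written to the pipeline can be redirected to a file. The sketch below illustrates the idea; the function name Write-UrlReport and its -Check parameter are hypothetical, and the check itself is passed in as a script block so that any of the tests from this thread can be plugged in.

```powershell
# Hypothetical helper: prints every URL with [OK] or [X] on the console
# and emits only the broken ones to the pipeline.
function Write-UrlReport {
    param(
        [string[]]$Urls,
        [scriptblock]$Check    # receives a URL, returns $true when it is reachable
    )
    $i = 0
    foreach ($url in $Urls) {
        $i++
        $ok = & $Check $url
        $mark = if ($ok) { '[OK]' } else { '[X]' }
        Write-Host "[$i/$($Urls.Count)] $url $mark"   # progress, console only
        if (-not $ok) { $url }                        # broken URL, to the pipeline
    }
}

# Usage with the status test used earlier in the thread (file names are assumptions):
#   $check = { param($u) $(try { [int][Net.WebRequest]::Create($u).GetResponse().StatusCode } catch {}) -eq 200 }
#   Write-UrlReport -Urls (Get-Content 'links.txt') -Check $check | Set-Content 'Broken_URL.txt'
```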