Powershell Web Data Fetching - powershell

I am not sure what I am doing wrong. I am trying to fetch the data from the URL below,
but the code below returns nothing useful.
What I want to fetch (34)
Powershell Code
$downloadURL = 'https://example.html'
$downloadRequest = Invoke-WebRequest -Uri $downloadURL
Result
Data not in the output code from Powershell

The page is generated dynamically with JavaScript, so the value you want is not in the HTML that the Invoke-WebRequest cmdlet downloads. Instead, try driving a browser through the Internet Explorer COM object, like this:
$url = 'https://example.com'
$ie = New-Object -ComObject InternetExplorer.Application
$ie.Visible = $false
$ie.Navigate2($url)
while($ie.ReadyState -ne 4) { Start-Sleep 1 }
$ie.document.body.getElementsByClassName('example-class-name') | ForEach-Object { $_.outerHTML }
$ie.document.body.getElementsByClassName('example-class-name') | ForEach-Object { $_.textContent }
For more information, see this article.
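A quick way to confirm that the value really is rendered by JavaScript is to check whether it appears in the raw HTML outside of script blocks. This is only a minimal sketch: the function name Test-HtmlContainsText and the sample markup are hypothetical, not taken from the question.

```powershell
# Hypothetical helper: returns $true when $Text occurs in the HTML markup itself,
# ignoring anything that only appears inside <script> blocks.
function Test-HtmlContainsText {
    param(
        [string] $Html,
        [string] $Text
    )
    # Drop script blocks so text that only exists in JavaScript source does not count
    $withoutScripts = $Html -replace '(?is)<script\b.*?</script>', ''
    return $withoutScripts.Contains($Text)
}

# Usage against the question's request (needs network access, so left commented out):
# $downloadRequest = Invoke-WebRequest -Uri $downloadURL -UseBasicParsing
# Test-HtmlContainsText -Html $downloadRequest.Content -Text '34'
```

If this returns False for a value you can see in the browser, the value is being filled in client-side and a plain HTTP request will not contain it.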

Related

OAuth2.0 authorization code flow in PowerShell

I wrote the function below to pop up an IE window that handles the user authentication step of the OAuth2.0 authorization code flow in PowerShell. The code works on its own, but when I call it as a function it doesn't stay in the while loop: it should wait for the URL of the IE window to change, filter out the OAuth2.0 authorization code, and then close the window.
Is there a way to keep the function "open" for longer and make sure it waits for the URL of the IE window to change?
All remarks regarding the function are welcome...
function Show-OAuth2AuthCodeWindow {
[CmdletBinding()]
param
(
[Parameter(Mandatory = $true, Position = 0, HelpMessage = "The OAuth2 authorization code URL pointing towards the oauth2/v2.0/authorize endpoint as documented here: https://learn.microsoft.com/en-us/azure/active-directory/develop/v2-oauth2-auth-code-flow")]
[System.Uri] $URL
)
try {
# create an Internet Explorer object to display the OAuth 2 authorization code browser window to authenticate
$InternetExplorer = New-Object -ComObject InternetExplorer.Application
$InternetExplorer.Width = "600"
$InternetExplorer.Height = "500"
$InternetExplorer.AddressBar = $false # disable the address bar
$InternetExplorer.ToolBar = $false # disable the tool bar
$InternetExplorer.StatusBar = $false # disable the status bar
# store the Console Window Handle (HWND) of the created Internet Explorer object
$InternetExplorerHWND = $InternetExplorer.HWND
# make the browser window visible and navigate to the OAuth2 authorization code URL supplied in the $URL parameter
$InternetExplorer.Navigate($URL)
# give Internet Explorer some time to start up
Start-Sleep -Seconds 1
# get the Internet Explorer window as application object
$InternetExplorerWindow = (New-Object -ComObject Shell.Application).Windows() | Where-Object {($_.LocationURL -match "(^https?://.+)") -and ($_.HWND -eq $InternetExplorerHWND)}
# wait for the URL of the Internet Explorer window to hold the OAuth2 authorization code after a successful authentication and close the window
while (($InternetExplorerWindow = (New-Object -ComObject Shell.Application).Windows() | Where-Object {($_.LocationURL -match "(^https?://.+)") -and ($_.HWND -eq $InternetExplorerHWND)})) {
Write-Host $InternetExplorerWindow.LocationURL
if (($InternetExplorerWindow.LocationURL).StartsWith($RedirectURI.ToString() + "?code=")) {
$OAuth2AuthCode = $InternetExplorerWindow.LocationURL
$OAuth2AuthCode = $OAuth2AuthCode -replace (".*code=") -replace ("&.*")
$InternetExplorerWindow.Quit()
}
}
# return the OAuth2 Authorization Code
return $OAuth2AuthCode
}
catch {
Write-Host -ForegroundColor Red "Could not create a browser window for the OAuth2 authentication"
}
}
The following example does what you want with a WebBrowser control, which allows you to register a Navigating event handler to catch the authorization code obtained from your authorization server.
PowerShell OAuth2 client
Answer from this blog post
I managed to get the auth code flow working with a headless browser (the code below uses the Edge driver). All you need are these two components:
Chrome/Edge driver
Selenium WebDriver
Once you have these set up, use the PowerShell commands below to obtain a token with the auth code flow:
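$SeleniumWebDriverFullPath = ".\WebDriver.dll" # Full path to selenium web driver
$AuthorizeEndpoint = "" # authorization endpoint of your authorization server
$AuthTokenEndpoint = "" # token endpoint of your authorization server
$ClientId = ""
$Scopes = ""
$RedirectUri = ""
$authCodeUri = "$($AuthorizeEndpoint.TrimEnd('/'))?client_id=$ClientId&scope=$Scopes&redirect_uri=$RedirectUri&response_type=code"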
Write-Host $authCodeUri
Import-Module $SeleniumWebDriverFullPath
$ChromeOptions = New-Object OpenQA.Selenium.Edge.EdgeOptions
$ChromeOptions.AddArgument('headless')
$ChromeOptions.AcceptInsecureCertificates = $True
$ChromeDriver = New-Object OpenQA.Selenium.Edge.EdgeDriver($ChromeOptions);
$ChromeDriver.Navigate().GoToUrl($authCodeUri);
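# Wait for a "code" query parameter; a plain Contains("code") would already match "response_type=code"
while ($ChromeDriver.Url -notmatch '[?&]code=') { Start-Sleep 1 }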
Write-Host $ChromeDriver.Url
Add-Type -AssemblyName System.Web # provides System.Web.HttpUtility
# Parse only the query part of the redirect URL and read the "code" parameter by name
$ParsedQueryString = [System.Web.HttpUtility]::ParseQueryString(([System.Uri]$ChromeDriver.Url).Query)
$Code = $ParsedQueryString['code']
Write-Host "Received code: $Code"
Write-Host "Exchanging code for a token"
$tokenrequest = @{ "client_id" = $ClientId; "grant_type" = "authorization_code"; "redirect_uri" = $RedirectUri; "code" = $Code }
$token = Invoke-RestMethod -Method Post -Uri $AuthTokenEndpoint -Body $tokenrequest
$tokenString = $token | ConvertTo-Json
My guess is that the function has no idea what $RedirectURI is: it is never defined inside the function and never passed in.
You should make it a second parameter of the function, or it should be (at least) script scoped.
I'd prefer a second parameter, but if you rely on scoping, you should be able to use it inside the function as $script:RedirectURI.
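As a rough illustration of the second-parameter approach, the check-and-extract step from the loop could live in a small helper that receives the redirect URI explicitly. The function name Get-OAuth2CodeFromUrl and the sample URLs are hypothetical; this is a sketch, not the asker's final function.

```powershell
# Hypothetical helper: returns the authorization code from $Url, or $null when the
# URL is not the redirect URI or carries no "code" query parameter.
function Get-OAuth2CodeFromUrl {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, Position = 0)]
        [string] $Url,

        # Passed in explicitly instead of relying on a variable from an outer scope
        [Parameter(Mandatory = $true, Position = 1)]
        [System.Uri] $RedirectURI
    )

    # Ignore every page that is not the redirect target (for example the login page)
    if (-not $Url.StartsWith($RedirectURI.ToString(), [System.StringComparison]::OrdinalIgnoreCase)) {
        return $null
    }

    # Pick the "code" parameter wherever it appears in the query string
    if ($Url -match '[?&]code=([^&#]+)') {
        return [System.Uri]::UnescapeDataString($Matches[1])
    }

    return $null
}

# Example with made-up values:
# Get-OAuth2CodeFromUrl -Url 'https://localhost/callback?code=abc123&state=xyz' -RedirectURI 'https://localhost/callback'
```

Inside the while loop, the function would then call this helper with $InternetExplorerWindow.LocationURL and its own $RedirectURI parameter, and quit the window once the result is not $null.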

Get URL of page in Internet Explorer using Powershell

I have a PowerShell script that opens an instance of Internet Explorer, goes to a certain website and finds a hidden link, but when I go to that link it redirects me to another page.
I'd like to store the URL of the page that Internet Explorer ends up on in a variable.
Thanks!
For those who only read code:
$IE = new-object -com internetexplorer.application
$IE.visible = $true
$Document = $IE.navigate2("example.com")
# Do stuff
$Document = $IE.navigate2($LinkIFound)
# It redirects me...
# TODO: Find URL of page I am now in.
To get the current location of the loaded document, you can use $IE.Document.url:
$IE = new-object -com internetexplorer.application
$IE.visible = $true
$Document = $IE.navigate2("example.com")
# Do stuff
$OldUrl = $IE.Document.url
$Document = $IE.navigate2($LinkIFound)
Start-Sleep -Seconds 3 # give the redirect time to finish before reading the URL
$NewUrl = $IE.Document.url

In PowerShell, is there a way to get the text of a URL without using a browser?

The site itself requires a login, which I do as a "ghost user" using the COM object. There is a page with a table that I'm able to copy by sending keystrokes, collecting a list of URLs/IDs/other info, which is saved to the clipboard or a text file. With the COM object, I'm able to open each URL from that file (as long as I'm logged in) and get the page text from the document, but only after the page has loaded (reading too soon returns a partial page).
This is extremely slow over 15,000 pages/URLs, and I was hoping to improve the speed without having to render each URL (I even have to put each one in a loop with checks in case it fails to load).
Is there a way to grab the various label:LabelValue properties from the URL of a webpage without loading it? Edit: I have this code:
[Net.ServicePointManager]::SecurityProtocol = [Net.SecurityProtocolType]::Tls12
$ie = New-Object -ComObject 'internetExplorer.Application'
$user = "me"; $pw = "pw"
$ie.Visible = $true
$ie.Navigate("https://Loginscreen.com")
$usernamefield = $ie.document.getElementByID('login')
$usernamefield.value = "$user"
$passwordfield = $ie.document.getElementByID('password')
$passwordfield.value = "$pw"
$Link = $ie.document.getElementByID('SubmitLogin')
$Link.click()
$ieProc = Get-Process | ? { $_.MainWindowHandle -eq $ie.HWND }
[Microsoft.VisualBasic.Interaction]::AppActivate($ieProc.Id)
$fc = gc "C:\TempProject.txt"
foreach ($f in $fc)
{
$url = $f.split("`t")[-1]
$HTML = Invoke-WebRequest $url
$body = $HTML.ParsedHTML.body.innerText
$body
}
However, it seems to be returning generic information about the login page. I couldn't find any reference to anything on the webpage that I wanted, either HTML tag info or the text itself.
If you mean "Is there a way to get the text of a URL without using a browser?", the answer is a resounding yes. Try Invoke-WebRequest and see if the Content property has what you are looking for. You will still need to extract the values from the HTML yourself, either manually or with a separate parsing package.
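One likely reason the question's loop returns the login page is that Invoke-WebRequest does not share cookies with the Internet Explorer COM object, so each request arrives unauthenticated. A minimal sketch of keeping the login in a web session follows. It assumes the site accepts a plain form POST; the function names, the login URL and the form field names are hypothetical (the field names are guessed from the element IDs in the question), and real login forms often need extra hidden fields.

```powershell
# Hypothetical helper: crude HTML-to-text conversion using regular expressions.
function ConvertTo-PlainText {
    param([Parameter(Mandatory = $true)] [string] $Html)
    # Remove script and style blocks, then all remaining tags
    $text = $Html -replace '(?is)<(script|style)\b.*?</\1>', ' '
    $text = $text -replace '(?s)<[^>]+>', ' '
    # Decode entities such as &amp; and collapse runs of whitespace
    $text = [System.Net.WebUtility]::HtmlDecode($text)
    return ($text -replace '\s+', ' ').Trim()
}

# Hypothetical helper: log in once, then reuse the same cookies for every URL.
function Get-PageTextWithSession {
    param(
        [Parameter(Mandatory = $true)] [string] $LoginUrl,
        [Parameter(Mandatory = $true)] [hashtable] $LoginFields,
        [Parameter(Mandatory = $true)] [string[]] $Url
    )
    # -SessionVariable stores the cookies returned by the login request in $session
    $null = Invoke-WebRequest -Uri $LoginUrl -Method Post -Body $LoginFields -SessionVariable session -UseBasicParsing
    foreach ($u in $Url) {
        # -WebSession sends those cookies along, so the request is authenticated
        $response = Invoke-WebRequest -Uri $u -WebSession $session -UseBasicParsing
        ConvertTo-PlainText -Html $response.Content
    }
}

# Example call (assumed values, needs network access):
# $urls = Get-Content 'C:\TempProject.txt' | ForEach-Object { $_.Split("`t")[-1] }
# Get-PageTextWithSession -LoginUrl 'https://Loginscreen.com' -LoginFields @{ login = 'me'; password = 'pw' } -Url $urls
```

Because no page is rendered, this should be much faster than the COM object, but it only works if the data is present in the returned HTML rather than filled in by JavaScript.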

Using htmlagilitypack to capture 'p class' in a website

I want to capture the text shown in the picture below, in a loop every minute, as this text changes every few minutes.
Here's the code I am using, with HtmlAgilityPack.HtmlDocument:
$metro = 'greatesthits'
$URL = "https://triplem.scadigital.com.au/stations/$metro/live"
[Reflection.Assembly]::LoadFile("C:\Users\makean\Downloads\htmlagilitypack.1.8.10\lib\Net45\HtmlAgilityPack.dll")
[HtmlAgilityPack.HtmlWeb]$web = @{}
[HtmlAgilityPack.HtmlDocument]$doc = $web.Load($url)
$doc.DocumentNode.SelectNodes(".//*[contains(@class,'sc-bdVaJa iHZvIS')]")
The code below is similar and does the same thing, just in a different way:
$metro = 'greatesthits'
$URL = "https://triplem.scadigital.com.au/stations/$metro/live"
Add-Type -path 'C:\Users\makean\Downloads\htmlagilitypack.1.8.10\lib\Net45\HtmlAgilityPack.dll'
$doc = New-Object HtmlAgilityPack.HtmlDocument
$wc = New-Object System.Net.WebClient
$doc.LoadHtml($wc.DownloadString($url))
$doc.DocumentNode.SelectNodes(".//*[contains(@class,'sc-bdVaJa iHZvIS')]")
The class sc-bdVaJa iHZvIS belongs to a div that sits just above PlayerNowPlaying__TrackInfo-kia103-1 gDXfGh, which is the element I actually want to capture. However, when I use that class in my code, it returns blank.
How can I return just the text I want? Any help greatly appreciated.
Thanks to the person above who pointed me in the right direction, I looked further and checked the Network tab in Chrome's 'Inspect' tools, then grabbed the metadata from the stream URL.
$metro = '2classicrock'
$URL = 'https://wz2web.scahw.com.au/live/' + $metro + '_32.stream/playlist.m3u8'
$null = (Invoke-WebRequest -Uri $URL).RawContent -match '(https.*m3u8.*)'
$StreamURL = $Matches[0]
$streamMetaData = Invoke-WebRequest -Uri $StreamURL
$null = $streamMetaData.RawContent -match '#EXTINF:4.*?,(.*)'
$Matches[1]
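The last two lines rely on the title following the comma of an #EXTINF entry in the playlist text. As a sketch of just that parsing step, here is a small helper that can be tried without touching the network. The function name Get-ExtInfTitle and the sample playlist are made up for illustration, and unlike the answer's regex it accepts any segment duration, not only ones starting with 4.

```powershell
# Hypothetical helper: returns the title of the first #EXTINF entry in an HLS playlist.
function Get-ExtInfTitle {
    param([Parameter(Mandatory = $true)] [string] $Playlist)
    foreach ($line in ($Playlist -split '\r?\n')) {
        # An #EXTINF line has the form "#EXTINF:<duration>,<title>"
        if ($line -match '^#EXTINF:[^,]*,(.*)$') {
            return $Matches[1].Trim()
        }
    }
}

# Made-up playlist text, shaped like what a stream URL might return
$sample = "#EXTM3U`n#EXTINF:4.0,Some Artist - Some Song`nsegment1.aac"
Get-ExtInfTitle -Playlist $sample
```

With the real stream, the input would be the Content of the second Invoke-WebRequest call, and the call could be wrapped in a loop with Start-Sleep -Seconds 60 to poll every minute.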

Add TFS Tag with REST API via powershell

I want to add a tag to a TFS project using the REST API in PowerShell.
I am trying to make this request based on the documentation for Visual Studio Integration.
I am calling this:
[void][System.Reflection.Assembly]::LoadWithPartialName('Microsoft.TeamFoundation.Client')
[void][System.Reflection.Assembly]::LoadWithPartialName('Microsoft.TeamFoundation.WorkItemTracking.Client')
if ( (Get-PSSnapin -Name "Microsoft.TeamFoundation.Powershell" -ErrorAction SilentlyContinue) -eq $null )
{
Add-PSSnapin "Microsoft.TeamFoundation.Powershell"
}
$SrcCollectionUrl = 'http://tfs.myCompany.com:8080/tfs/MyCollection'
$SrcProjectName = 'myProject'
[psobject] $tfs = [Microsoft.TeamFoundation.Client.TeamFoundationServerFactory]::GetServer($SrcCollectionUrl)
[Microsoft.TeamFoundation.WorkItemTracking.Client.WorkItemStoreFlags]$WorkItemBypass = [Microsoft.TeamFoundation.WorkItemTracking.Client.WorkItemStoreFlags]::BypassRules
$tfstp = New-Object Microsoft.TeamFoundation.Client.TfsTeamProjectCollection($SrcCollectionUrl)
$WorkItemStore = New-Object -TypeName 'Microsoft.TeamFoundation.WorkItemTracking.Client.WorkItemStore' -ArgumentList $tfs.TfsTeamProjectCollection, $WorkItemBypass
$SrcProject = $WorkItemStore.Projects[$SrcProjectName]
$ProjectGuid = Split-Path $SrcProject.Uri -Leaf
$AddTagsUrl = '{0}/_apis/tagging/scopes/{1}/tags?api-version=1.0' -f $SrcCollectionUrl,$ProjectGuid
$newTagParams = @{name="PWCreateTag2"}
$outjson = $newTagParams | ConvertTo-Json
$nresp = Invoke-RestMethod -Method POST -Uri $AddTagsUrl -UseDefaultCredentials -Body $outjson -ContentType 'application/json'
Everything works the first time. However, the documentation states: "If a tag by that name already exists, no tag is created. Instead, the response body includes the existing tag with that name."
The second time I run that last line I get: "The remote server returned an error: (400) Bad Request."
Does anyone have any idea why this fails the second time?
FYI: TFS Server is 2015, Powershell is V4
I created a PowerShell module for this: tfs
To add tags:
'tag1','tag2' | % { Add-TFSBuildTag -Id 520 -Tag $_ }