Getting redirected url PowerShell for webcrawler - powershell

I recently started learning PowerShell and made a web crawler. I'm trying to get the redirected URL of each job listing.
$links = @()
for ($i = 0; $i -lt 50; $i += 10) {
    # Collect the job-title links from one page of search results.
    $arr = ((Invoke-WebRequest -Uri ('http://www.indeed.com/jobs?q=software+engineer+entry+level+%2460%2C000&l=San+Diego%2C+CA&jt=fulltime&start=' + $i) -MaximumRedirection 2).Links |
        Where data-tn-element -eq "jobTitle").href
    foreach ($href in $arr) {
        $foo = "https://www.indeed.com" + $href -replace "mp;a", ""
        $bar = Invoke-WebRequest -Uri $foo -MaximumRedirection 10
        $test = $bar
        # I'm trying to get the URL of $bar
        $links += $test
    }
}
foreach ($link in $links) {
    echo $link
}
With some editing I can get the links that Indeed uses, but I'm trying to get the URL where the actual job description is listed.

If the request was redirected, the final URL is available on the response object that Invoke-WebRequest returns:
$bar.BaseResponse.ResponseUri
So:
$test = $bar.BaseResponse.ResponseUri
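One caveat: BaseResponse.ResponseUri exists in Windows PowerShell 5.1, where BaseResponse is an HttpWebResponse. In PowerShell 6 and later, BaseResponse is an HttpResponseMessage, and the final URI is found under RequestMessage.RequestUri instead. A minimal sketch that handles both cases, assuming a hypothetical helper named Get-FinalUri (not part of the original answer):

```powershell
# Hypothetical helper: return the final URI of an Invoke-WebRequest response.
function Get-FinalUri {
    param([Parameter(Mandatory)] $Response)
    $base = $Response.BaseResponse
    if ($base.PSObject.Properties['ResponseUri']) {
        # Windows PowerShell 5.1: BaseResponse is an HttpWebResponse.
        return $base.ResponseUri
    }
    # PowerShell 6+: BaseResponse is an HttpResponseMessage.
    return $base.RequestMessage.RequestUri
}
```

Inside the loop from the question this would be used as $links += Get-FinalUri $bar.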

Related

For-each loop unable to process all strings

I am trying to generate URLs from a list of strings and an anchor URL.
The strings in s.txt are:
123
234
345
The anchor URL is https://testurl.com/prod/hostdetails.php?qs=
The code I am using:
$ur = gc C:\temp\s.txt
foreach ($u in $ur) {
    $test = Invoke-WebRequest -Uri "https://testurl.com/prod/hostdetails.php?qs=$u" -UseDefaultCredentials
}
$test
but it returns data only for
https://testurl.com/prod/hostdetails.php?qs=345
It's because you reassign $test on each iteration of the loop, so only the last response survives. I suggest this solution, which builds a single URL from all the values:
$test = 'https://testurl.com/prod/hostdetails.php?{0}' -f ((Get-Content C:\temp\s.txt) -join '&')
You edited your question. This is probably what you are looking for. Note the += operator, which appends each result to the array:
$result = @()
Get-Content C:\temp\s.txt | ForEach-Object {
    $result += Invoke-WebRequest -Uri "https://testurl.com/prod/hostdetails.php?qs=$($_)"
}
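The difference between reassigning and collecting can be seen without any network access. The sketch below uses a hypothetical stand-in function, Get-FakeResponse, in place of Invoke-WebRequest, and an inline list in place of s.txt; both are assumptions made only so the example runs on its own.

```powershell
# Hypothetical stand-in for Invoke-WebRequest so the sketch runs offline.
function Get-FakeResponse([string]$Uri) { "response for $Uri" }

$ids = '123', '234', '345'   # same values as s.txt in the question

# Reassigning inside the loop keeps only the result of the last iteration.
foreach ($id in $ids) {
    $last = Get-FakeResponse "https://testurl.com/prod/hostdetails.php?qs=$id"
}

# Assigning the loop statement itself collects every result into an array.
$all = foreach ($id in $ids) {
    Get-FakeResponse "https://testurl.com/prod/hostdetails.php?qs=$id"
}
```

Assigning the output of the loop avoids rebuilding the array on every +=, which may matter for long lists.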

Getting unexpected Int value when passing value to powershell function as parameter [duplicate]

This question already has answers here:
How do I pass multiple parameters into a function in PowerShell?
(15 answers)
Closed 4 years ago.
I'm trying to call an API with a PowerShell script to return a paginated dump of all users and their values. I get a page count that tells me how many pages I need to request. I put that value into a for loop, and the counter increases with each iteration. In the middle of the loop, when I pass $I into my function, the function receives 0 instead of the number being passed in.
Function GetUserOnPage ([string]$AccessToken, [int]$I)
{
    write-host $I 'the loaded page'
    $Header=$null
    $Header = @{};
    $Header.Add("Authorization",'Bearer '+ $AccessToken)
    $URL='https://mycompany.myapplication.com/api/member?page='+ $I
    write-host $URL
    $request = Invoke-webrequest -UseDefaultCredentials -Method Get -uri $URL -Headers $Header -ContentType application/x-www-form-urlencoded
    $JsonParameters = ConvertFrom-Json -InputObject $request.content
    $memberList = $JsonParameters.member_list
    return $memberList
}
Function Execute()
{
    BuildDataTable
    $accessToken = LogintoBI
    $pageCount = GetUserPageCount($accessToken)
    $pages = $pageCount
    For($I = 1; $I -le $pages; $I++)
    {
        Write-host 'counting up' $I
        $members = GetUserOnPage($accessToken, [int]$I)
        write-host 'checking' $I
        Foreach($member in $members)
        {
            AddMemberToTable($member)
        }
    }
}
Execute
Below is the output of the Write-Host calls I added to check my values:
counting up 1
0 the loaded page
https://mycompany.myapplication.com/api/member?page=0
checking 1
counting up 2
0 the loaded page
https://mycompany.myapplication.com/api/member?page=0
checking 2
counting up 3
0 the loaded page
https://mycompany.myapplication.com/api/member?page=0
checking 3
counting up 4
0 the loaded page
https://mycompany.myapplication.com/api/member?page=0
checking 4
counting up 5
0 the loaded page
https://mycompany.myapplication.com/api/member?page=0
checking 5
counting up 6
0 the loaded page
https://mycompany.myapplication.com/api/member?page=0
checking 6
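PowerShell functions are called with space-separated arguments, not with parentheses and commas. Written as GetUserOnPage($accessToken, [int]$I), the two values are passed as a single array bound to the first parameter, so $I keeps its default of 0. Change the line below and it should clear up.
GetUserOnPage($accessToken, [int]$I)
Should be
GetUserOnPage $accessToken $I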
Changed answer because of comment
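The binding behaviour can be reproduced in isolation. This sketch uses a hypothetical function, Show-Args, with the same parameter shape as GetUserOnPage; the token and page values are made up for illustration.

```powershell
# Hypothetical function with the same parameter shape as GetUserOnPage.
function Show-Args ([string]$AccessToken, [int]$I) {
    "token=[$AccessToken] page=[$I]"
}

$token = 'abc'
$page  = 3

# Method-style call: ($token, $page) is one array bound to $AccessToken,
# so $I is never supplied and stays at its default of 0.
$wrong = Show-Args($token, $page)

# Command-style calls: each argument binds to its own parameter.
$right = Show-Args $token $page
$named = Show-Args -AccessToken $token -I $page
```

Naming the parameters, as in the last call, is a little more verbose but makes the binding explicit.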

Formatting Output of an Array

I've encountered a situation that I cannot seem to find a solution to. I am scraping a website using Invoke-WebRequest and when I look at my output from my array, several of the properties are System.Objects. I need to find a way to have them be strings so that when I Export-Csv I can actually see the values. Here is my code:
$params = @{api_id='';api_key='';page_size='100';site_id=''}
$stats = Invoke-WebRequest https://my.incapsula.com/api/visits/v1 -Method Post -Body $params
$s = $stats
$s = $s | ConvertFrom-Json
$s = $s.visits
Here is what my results look like:
My solution was to create a new custom object and use the following syntax on the properties that were an object themselves:
($_ | select -expandproperty 'propertyname')
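A rough sketch of that approach follows. The JSON sample and its field names (id, country, client) are assumptions invented for illustration; the real Incapsula response is not shown in the question and will differ.

```powershell
# Hypothetical sample with a nested object and an array property; field names are assumptions.
$json = '{"visits":[{"id":"1","country":["US"],"client":{"name":"Chrome","type":"Browser"}}]}'
$visits = ($json | ConvertFrom-Json).visits

# Build a flat object per visit so every column is a plain string.
$flat = $visits | ForEach-Object {
    [pscustomobject]@{
        Id         = $_.id
        Country    = $_.country -join ';'
        ClientName = $_ | Select-Object -ExpandProperty client |
                         Select-Object -ExpandProperty name
    }
}

# Export-Csv would write the same rows to a file.
$csv = $flat | ConvertTo-Csv -NoTypeInformation
```

Array-valued properties are joined into one string here; which separator to use depends on how the CSV will be consumed.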

TeamCity - iterating vcs roots with powershell

I was trying to create a meta-runner to generate a metadata file using PowerShell in TeamCity, and I was wondering if there is a way to iterate over the different VCS roots.
My Code:
$fileName = "metadata.json"
$vcsArray = @()
for ($i = 0; $i -le 5; $i++)
{
    $vcsObject = @{
        "VCSNumber" = "%build.vcs.number.Proj_App_TcTestApp%"
    }
    $vcsArray += $vcsObject
}
$content = @{
    "TeamCityBuildLogUrl" = "http://teamcity.hps.com/viewLog.html?buildId=%teamcity.build.id%&tab=buildResultsDiv&buildTypeId=%system.teamcity.buildType.id%";
    "TeamCityProjectName" = "%system.teamcity.projectName%";
    "TeamCityBuildNumber" = "%system.build.number%";
    "BuildDateGenerated" = (Get-Date).ToString();
    "TeamCityExecutionAgentName" = "%teamcity.agent.name%";
    "VCSes" = $vcsArray
}
$content = $content | Add-Member @{"VCS Version2" = "testValue"} -PassThru # How to add more members dynamically.
$content = ConvertTo-JSON $content
New-Item $fileName -type file -force -value "// Metadata file generated by TeamCity`n"
Add-Content $fileName $content
cat $fileName # Test afterwards
When I add another root, the names of the roots end up becoming the identifiers, which makes it difficult to iterate over them since I don't technically know the names of the roots.
Here's an example use-case:
I have two vcs roots:
%build.vcs.number.Proj_App_TcTestFW%
%build.vcs.number.Proj_App_TcTestApp%
Ideally, I'd like to iterate through them like so:
$vcsArray = @()
foreach ($vcsRoot in $vcsRoots)
{
    $vcsObject = @{
        "VCSName" = $vcsRoot;
        "VCSNumber" = "%build.vcs.number." + $vcsRoot + "%"
    }
    $vcsArray += $vcsObject
}
But it seems that I have to hardcode the names in my script, so I'm currently at a loss.
Does TeamCity expose the VCS roots in such a way that I can iterate over them?
Thanks
Alex
I don't have hands-on TeamCity experience, but it looks like you can get a list of roots by issuing a REST request:
Invoke-WebRequest -Uri 'http://teamcity.hps.com/httpAuth/app/rest/vcs-roots' -Method Get
which should return an XML response with a list of roots:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<vcs-roots count="1">
<vcs-root id="TestProject1_TestProject1perforce"
name="test-project1-perforce"
href="/httpAuth/app/rest/vcs-roots/id:TestProject1_TestProject1perforce"/>
</vcs-roots>
Is this what you are looking for?
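If the response has the shape shown above, the root ids could be pulled out roughly as follows. This is only a sketch: it parses the sample XML from this answer as a literal string rather than calling the server, and the VCSName key is borrowed from the question's desired output.

```powershell
# Sample response copied from above; a real script would take this text from the REST call.
$xmlText = @'
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<vcs-roots count="1">
<vcs-root id="TestProject1_TestProject1perforce"
name="test-project1-perforce"
href="/httpAuth/app/rest/vcs-roots/id:TestProject1_TestProject1perforce"/>
</vcs-roots>
'@

$doc = [xml]$xmlText

# One hashtable per <vcs-root> element, keyed like the question's desired output.
$vcsArray = @(
    foreach ($root in $doc.SelectNodes('/vcs-roots/vcs-root')) {
        @{ VCSName = $root.GetAttribute('id') }
    }
)
```

Whether the id can then be used to look up the matching build.vcs.number value at run time is not something this sketch establishes.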
References:
TeamCity REST API Commands

PowerShell: select string in standard output?

PS C:\squid\sbin> .\squid.exe -v
Squid Cache: Version 2.7.STABLE8
configure options: --enable-win32-service --enable-storeio='ufs aufs null coss' --enable-default-hostsfile=none --enable
-removal-policies='heap lru' --enable-snmp --enable-htcp --disable-wccp --disable-wccpv2 --enable-useragent-log --enable
-referer-log --enable-cache-digests --enable-auth='basic ntlm digest negotiate' --enable-basic-auth-helpers='LDAP NCSA m
swin_sspi squid_radius_auth' --enable-negotiate-auth-helpers=mswin_sspi --enable-ntlm-auth-helpers='mswin_sspi fakeauth'
--enable-external-acl-helpers='mswin_ad_group mswin_lm_group ldap_group' --enable-large-cache-files --enable-digest-aut
h-helpers='password LDAP eDirectory' --enable-forw-via-db --enable-follow-x-forwarded-for --enable-arp-acl --prefix=c:/s
quid
Compiled as Windows System Service.
PS C:\squid\sbin> .\squid.exe -v|Select-String Squid
squid.exe -v will output its version information, which contains keyword "Squid".
I want powershell to tell me whether keyword "Squid" exists in the output. So I use .\squid.exe -v|Select-String Squid, but it outputs nothing.
What's the right way to do it? I'm using PS 3.0.
You ARE doing it the right way :)
The problem is not your code but the Squid port itself. It's doing something unusual when it writes text to the console, so PowerShell and cmd can't capture it through the stdout/stderr streams. I'm guessing that instead of using the stdout/stderr API it manipulates characters on the console directly. I tried redirecting stderr to stdout (2>&1), but that didn't work either.
It comes with a change log text file, I guess you can just parse that instead...
EDIT --
Or you can use this kludgy but serviceable workaround to scrape the console text:
function Get-ConsoleText {
    if ($host.Name -eq 'ConsoleHost') {
        $text_builder = new-object system.text.stringbuilder
        $buffer_width = $host.ui.rawui.BufferSize.Width
        $buffer_height = $host.ui.rawui.CursorPosition.Y
        $rec = new-object System.Management.Automation.Host.Rectangle 0,0,($buffer_width -2), $buffer_height
        $buffer = $host.ui.rawui.GetBufferContents($rec)
        $console_out = @()
        for($i = 0; $i -lt $buffer_height; $i++) {
            $text_builder = new-object system.text.stringbuilder
            for($j = 0; $j -lt $buffer_width; $j++) {
                $cell = $buffer[$i,$j]
                $text_builder.Append($cell.Character) | Out-Null
            }
            $console_out += $text_builder.ToString()
        }
        return $console_out
    }
}
cls; .\squid.exe -v; Get-ConsoleText |
    ForEach-Object {
        if ($_ -match 'Version (.+)') {$matches[1]}
    }