PowerShell Invoke-RestMethod incorrect character

I'm using Invoke-RestMethod to get page names from an application I'm using. I notice that when I do a GET on the page it returns the page name like so
This page â is working
However the actual page name is
This page – is working
Here's how my request looks
Invoke-WebRequest -Uri ("https://example.com/rest/api/content/123789") -Method Get -Headers $Credentials -ContentType "application/json; charset=utf-8"
The problem is with the en-dash, does anyone know how I can fix this?

If Invoke-WebRequest does not detect the response encoding correctly, you can use RawContentStream and decode it with the encoding you need:
$resp = Invoke-WebRequest -Uri ...
$html=[system.Text.Encoding]::UTF8.GetString($resp.RawContentStream.ToArray());

Invoke-RestMethod or Invoke-WebRequest?
The Invoke-RestMethod cmdlet decodes the result using the encoding named by the HttpWebResponse.CharacterSet property.
If that is not set, it falls back to ISO-8859-1 (afaik).
I'm assuming your server is sending a wrong charset in the response headers (or dropping it), hence the body is being decoded wrongly.
Do you know what charset/encoding are sent in your response from your server?
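To see why a missing or wrong charset produces exactly this symptom, here is a minimal offline sketch. No request is made: the byte array merely stands in for a response body, and the variable names are illustrative. It assumes the server sends the page name as UTF-8 and decodes those bytes both ways:

```powershell
# Build the page name from a code point so the script file's own encoding does not matter.
$dash     = [char]0x2013                                   # en-dash
$original = "This page $dash is working"

# Stand-in for the raw response body: the page name encoded as UTF-8.
$bytes = [System.Text.Encoding]::UTF8.GetBytes($original)

# Decoding with ISO-8859-1 reproduces the garbled text.
$latin1  = [System.Text.Encoding]::GetEncoding('ISO-8859-1')
$garbled = $latin1.GetString($bytes)

# Decoding the same bytes as UTF-8 restores the text.
$repaired = [System.Text.Encoding]::UTF8.GetString($bytes)

# If only the garbled string is left, re-encode it as ISO-8859-1 and decode as UTF-8.
$recovered = [System.Text.Encoding]::UTF8.GetString($latin1.GetBytes($garbled))
```

The en-dash is three bytes in UTF-8 (E2 80 93). Read as ISO-8859-1, the first byte becomes â and the other two become invisible control characters, which matches the output in the question.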
If you're trying Invoke-WebRequest, check the headers in your response, e.g.
$r = invoke-webrequest http://example.com
$r.Headers
If you're dealing with an encoding issue, e.g. your server is not sending the right headers, you can always try to dump the response to a file and read it with a different encoding:
Invoke-WebRequest http://example.com -outfile .\test.txt
$content = get-content .\test.txt -Encoding utf8 -raw
In this case you will no longer be working with the HTTP response, but it might help you debug/find the encoding issues you're looking for.

One line solution (without files):
[system.Text.Encoding]::UTF8.GetString((Invoke-WebRequest "https://www.example.com").RawContentStream.ToArray())
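If the same fix is needed in several places, the decoding step could be wrapped in a small function. The sketch below is only an illustration: the name ConvertFrom-ResponseBytes and its parameters are hypothetical, not part of PowerShell. It uses the charset from a Content-Type value when one is present and recognized, and otherwise falls back to UTF-8:

```powershell
function ConvertFrom-ResponseBytes {          # hypothetical helper name
    param(
        [Parameter(Mandatory)] [byte[]] $Bytes,
        [string] $ContentType,
        [System.Text.Encoding] $Fallback = [System.Text.Encoding]::UTF8
    )

    $encoding = $Fallback
    # Look for "charset=..." in the Content-Type value, with or without quotes.
    if ($ContentType -match 'charset\s*=\s*"?([^";\s]+)') {
        try   { $encoding = [System.Text.Encoding]::GetEncoding($Matches[1]) }
        catch { $encoding = $Fallback }       # unknown charset name: keep the fallback
    }

    $encoding.GetString($Bytes)
}
```

With a real response it might be called as ConvertFrom-ResponseBytes -Bytes $resp.RawContentStream.ToArray() -ContentType $resp.Headers['Content-Type']. When the server declares a wrong charset, as suspected above, skip the -ContentType argument so that the fallback is used.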

Related

Invoke-WebRequest without OutFile?

I used Invoke-WebRequest in Powershell to download a file without using the -OutFile parameter, and, from the documentation here, the file should've ended up in the directory I was in. However, there is nothing. Response was OK, no error was shown.
What could've happened to that file? Am I mistaken about how Invoke-WebRequest should work without an Out parameter?
Thanks!
Note: I know I can easily download the file using the parameter, but it's pretty big and I'd like to make sure it doesn't end up clogging disk space somewhere I don't need
From the linked docs:
By default, Invoke-WebRequest returns the results to the pipeline.
That is, in the absence of -OutFile no file is created.
(If you don't capture or redirect the output, it will print to the host (console).)
As techguy1029 notes in a comment, the current directory only comes into play if you do use -OutFile but specify a mere file name rather than a path.
As an aside: the pipeline output is a response object whose type derives from WebResponseObject, whereas -OutFile saves only the response body (the equivalent of the .Content property value).
Let's talk about what the Microsoft documentation says for Invoke-WebRequest:
"-OutFile: Specifies the output file for which this cmdlet saves the response body. Enter a path and file name. If you omit the path, the default is the current location."
The key point here is that if the path is omitted, the current location is used.
-OutFile is a parameter of type String.
To save to the current location, the usage would be
Invoke-WebRequest "http://Test.com/test.pdf" -OutFile "Test.pdf"
or, to use a custom path,
Invoke-WebRequest "http://Test.com/test.pdf" -OutFile "C:\Test\Test.Pdf"

Invoke-WebRequest and Hebrew characters

I already tried the registry hack for PS to support Hebrew characters. I can type Hebrew without problems, but for some reason strings containing Hebrew returned from Invoke-WebRequest come out as gibberish (see the following screenshot).
Here's the site URL I'm attempting to query:
https://www.hometheater.co.il/vt278553.html
Update:
It looks like the content-type being returned is ALWAYS of charset Windows-1255 which is probably the issue.
This seems to be not only an issue of having to specify the encoding but also that the shell cannot display the encoding correctly. If you save the response to a file and open it with a decent text editor (not Notepad but e.g. Notepad++), you will see that it has been parsed correctly.
Invoke-WebRequest -Uri "https://www.hometheater.co.il/vt278553.html" -ContentType "text/plain; charset=Windows-1255" -OutFile content.txt
We can also test that the in-memory representation is correct by reading the file and writing it to another file:
Get-Content .\content.txt | Set-Content test.txt
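The charset mismatch itself can be reproduced without the site. The following is a rough offline sketch under two assumptions: that code page 1255 is available to .NET in the session (it is on Windows PowerShell 5.1; on other editions that depends on the code-page provider being registered), and that the gibberish comes from the bytes being decoded as ISO-8859-1. The sample word and variable names are illustrative:

```powershell
# Hebrew "shalom", built from code points so the script file's own encoding does not matter.
$text = -join [char[]](0x05E9, 0x05DC, 0x05D5, 0x05DD)

# Assumption: code page 1255 (Windows-1255) is available in this session.
$win1255 = [System.Text.Encoding]::GetEncoding(1255)
$bytes   = $win1255.GetBytes($text)          # stand-in for the raw response body

# Wrong decoder: every byte turns into an accented Latin letter.
$wrong = [System.Text.Encoding]::GetEncoding('ISO-8859-1').GetString($bytes)

# Matching decoder: the Hebrew text comes back intact.
$right = $win1255.GetString($bytes)
```

For a real response, the same $win1255.GetString(...) call could be applied to $resp.RawContentStream.ToArray(). Whether the console then shows the letters correctly still depends on the console font and output encoding, which is the display problem mentioned above.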

How to use a post request correctly within powershell

I am trying to work with an API whose examples use cURL, and I am trying to convert them to PowerShell.
The example given by this company, using cURL, is:
$ curl -X POST -u "youruser:yourkey" -H "Content-Type: application/json" \
"https://falconapi.crowdstrike.com/detects/entities/summaries/GET/v1" -d \
'{"ids": ["ldt:ddaab9931f4a4b90450585d1e748b324:148124137618026"]}'
Right now I am trying to convert this to PowerShell with Invoke-WebRequest, using the following:
Invoke-WebRequest -Method Post -Uri $site -Body -Credential 'whatever credentials' |
ConvertFrom-Json | Select -ExcludeProperty resources
The part I am getting confused on is how to format the -Body request to be something similar to:
'{"ids": ["ldt:ddaab9931f4a4b90450585d1e748b324:148124137618026"]}'
The ldt part comes from an array I am looping through, so instead of the literal ldt value I am trying to use a variable such as $detections, but I am unable to.
You could just create a hash table and convert to json:
-Body (@{ids = ($detections)} | ConvertTo-Json)
or if detections is an array you could omit the () around $detections
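A fuller sketch of the hash table approach follows. It is an illustration under assumptions, not the API vendor's documented usage: the second detection ID is made up, and translating curl's -u option into an HTTP Basic Authorization header is the usual equivalent but has not been checked against this API. The request itself is left commented out so that the snippet runs without credentials:

```powershell
# Detection IDs; the first is from the question, the second is a made-up placeholder.
$detections = @(
    'ldt:ddaab9931f4a4b90450585d1e748b324:148124137618026'
    'ldt:example:2'
)

# @() keeps "ids" a JSON array even when there is only one ID.
$body = @{ ids = @($detections) } | ConvertTo-Json -Compress

# Usual equivalent of curl's -u "youruser:yourkey": an HTTP Basic Authorization header.
$pair  = 'youruser:yourkey'
$token = [Convert]::ToBase64String([System.Text.Encoding]::ASCII.GetBytes($pair))

# Parameters collected in a hash table for splatting.
$site   = 'https://falconapi.crowdstrike.com/detects/entities/summaries/GET/v1'
$params = @{
    Method      = 'Post'
    Uri         = $site
    Headers     = @{ Authorization = "Basic $token" }
    ContentType = 'application/json'
    Body        = $body
}

# Uncomment to send the request (requires real credentials):
# $result = Invoke-RestMethod @params
```

Invoke-RestMethod is used here instead of Invoke-WebRequest piped to ConvertFrom-Json because it parses a JSON response on its own; either cmdlet accepts the same splatted parameters.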
You aren't able to use a variable because you're using single quoted ' strings, which don't interpret variables or escape sequences. You need to use a double quoted " string for that. Since your string contains double quotes you'd need to escape them with PowerShell's escape character, which is the backtick `.
"{`"ids`": [`"ldt:$detections`"]}"
This is likely not what you want though; you probably want to serialize the array into JSON, in which case you should use 4c74356b41's answer; that is: create an object with the values you want and then convert it to JSON at runtime. This is much less error prone.
You could use double-quoted strings around the outside of your JSON body, and then you are able to include the variable (the JSON goes in -Body, with the inner double quotes escaped by backticks):
Invoke-WebRequest -Method Post -Uri $site -Body "{`"ids`": [`"ldt:$detections`"]}" -Credential 'whatever credentials' |
ConvertFrom-Json | Select -ExcludeProperty resources

Parsing HTML with PowerShell on dynamic sites

Is there a way to parse HTML from http://www.pgatour.com site using Invoke-WebRequest cmdlet? When I try doing this, ParsedHtml does not contain elements that I need (because cmdlet incorrectly parses the page).
I tried getting data from this page by creating IE COM object in PowerShell and it works, but very slow, so I'm wondering if there is another approach using Invoke-WebRequest (or even external parsers).
Thanks!
You could give the HtmlAgilityPack a try to parse the content returned by Invoke-WebRequest. In this scenario, I would use the -UseBasicParsing parameter.
Windows 10 64-bit. PowerShell 5.1
Parsing HTML with PowerShell 5.1 on dynamic sites using Invoke-WebRequest and a regex that returns everything between un-nested tags like <html>,<title>,<head>, and <body>. It will take some tweaking for nested tags.
Invoke-WebRequest -Uri http://www.pgatour.com | sc golf.html
(gc -raw golf.html) -match '(<body>)(.*|\n).*?(<\/body>)'
$matches[0]
Everything between <div class="success-message"> and the next </div>
Invoke-WebRequest -Uri http://www.pgatour.com | sc golf.html
(gc -raw golf.html) -match '(<div class="success-message">)(.*?|\n)*(<\/div>)'
$matches[0]
Greedy and lazy quantifiers explained
regex101.com is your friend.
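The difference between the greedy and lazy quantifiers can be seen on a small string without downloading anything. The HTML fragment below is made up for illustration, and (?s) is the inline form of the Singleline option, which lets . match newlines too:

```powershell
# Made-up fragment with two sibling <div> elements.
$html = '<div class="success-message">Saved</div><div class="other">Later</div>'

# Greedy: .* runs to the LAST </div> in the string.
$greedy = [regex]::Match($html, '(?s)<div class="success-message">(.*)</div>').Groups[1].Value

# Lazy: .*? stops at the FIRST </div> after the opening tag.
$lazy = [regex]::Match($html, '(?s)<div class="success-message">(.*?)</div>').Groups[1].Value
```

Here $lazy holds only Saved, while $greedy also swallows the closing tag and the whole second div up to Later. Neither form copes with nested div elements, which is where a real parser such as the HtmlAgilityPack suggested above is the safer choice.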

Invoke-WebRequest - issue with special characters in json

I'm trying to send special characters (Norwegian) using Invoke-WebRequest to an ASP.NET MVC 4 API controller.
My problem is that the JSON object shows up as NULL when received by the controller if my JSON data contains characters like Æ Ø Å.
An example of my code:
$text = 'Æ Ø Å'
$jsondata = $text | ConvertTo-Json
Invoke-WebRequest -Method POST -Uri http://contoso.com/create -ContentType 'application/json; charset=utf8' -Body $jsondata
Also when looking in fiddler the characters turn up like the usual weird utf8 boxes.
Sending json data from fiddler to the same API controller works fine
Any advice?
For the Body parameter try this:
... -Body ([System.Text.Encoding]::UTF8.GetBytes($jsondata))
The string in PowerShell is Unicode (UTF-16 in memory), but you've specified a UTF-8 charset, so I think you need to give it some help getting to UTF-8.
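A minimal sketch of that suggestion, with the request left commented out so it can be run offline. The characters are built from code points only so that the encoding of the script file does not interfere; the variable names follow the question:

```powershell
# "Æ Ø Å" built from code points (U+00C6, U+00D8, U+00C5).
$text     = -join [char[]](0x00C6, 0x20, 0x00D8, 0x20, 0x00C5)
$jsondata = $text | ConvertTo-Json

# Encode explicitly: each of the three letters takes two bytes in UTF-8.
$bytes = [System.Text.Encoding]::UTF8.GetBytes($jsondata)

# Sanity check: decoding the bytes as UTF-8 gives the JSON text back.
$roundTrip = [System.Text.Encoding]::UTF8.GetString($bytes)

# Uncomment to send; -Body accepts a byte array as well as a string:
# Invoke-WebRequest -Method POST -Uri http://contoso.com/create -ContentType 'application/json; charset=utf-8' -Body $bytes
```

Passing bytes takes the cmdlet's own string-to-bytes conversion out of the picture, so what arrives at the controller is exactly the UTF-8 sequence declared in the Content-Type header.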