Read https://google.com doesn't work anymore in Red?

Weird, Read https://google.com had been working, and today it doesn't work anymore:
read https://google.com
*** Access Error: invalid UTF-8 encoding: #{E9206E27}

That's a long-known issue with pages served with an invalid or inconsistent UTF-8 encoding.
This can help:
convert-invalid: function [page] [
    ; map each byte of the binary to a char, sidestepping UTF-8 decoding
    collect/into [foreach c page [keep to-char c]] clear ""
]
>> convert-invalid read/binary https://google.com
== {<!doctype html><html itemscope="" itemtype="http://schema.org/Web...
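If you need the same trick outside Red, the idea ports directly. A minimal Python sketch of the same byte-to-char mapping (urllib is used here purely for illustration):

import urllib.request

# Fetch the raw bytes and map each byte 1:1 to a character, mirroring the
# Red workaround above: Latin-1 maps bytes 0-255 straight to code points
# 0-255, so the decode can never fail on malformed UTF-8.
raw = urllib.request.urlopen("https://google.com").read()
page = raw.decode("latin-1")
print(page[:80])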

Related

Requests fail authorization when query string contains certain characters

I'm making requests to Twitter, using the OAuth1.0 signing process to set the Authorization header. They explain it step-by-step here, which I've followed. It all works, most of the time.
Authorization fails whenever special characters are sent without percent encoding in the query component of the request. For example, ?status=hello%20world! fails, but ?status=hello%20world%21 succeeds. But the change from ! to the percent encoded form %21 is only made in the URL, after the signature is generated.
So I'm confused as to why this fails, because AFAIK that's a legally encoded query string. Only the raw strings ("status", "hello world!") are used for signature generation, and I'd assume the server would remove any percent encoding from the query params and generate its own signature for comparison.
When it comes to building the URL, I let URLComponents do the work, so I don't add percent encoding manually, e.g.:
var urlComps = URLComponents()
urlComps.scheme = "https"
urlComps.host = host
urlComps.path = path
urlComps.queryItems = [URLQueryItem(name: "status", value: "hello world!")]
urlComps.percentEncodedQuery // "status=hello%20world!"
I wanted to see how Postman handled the same request. I selected OAuth1.0 as the Auth type and plugged in the same credentials. The request succeeded. I checked the Postman console and saw ?status=hello%20world%21; it was percent-encoding the !. I updated Postman, because a nice little prompt asked me to. Then I tried the same request; now it was getting an authorization failure, and I saw ?status=hello%20world! in the console; the ! was no longer being percent-encoded.
I'm wondering who is at fault here. Perhaps Postman and I are making the same mistake. Perhaps it's Twitter. Or perhaps there's some proxy along the way that, who knows, double-encodes my !.
The OAuth1.0 spec says this, which I believe is in the context of both client (taking a request that's ready to go and signing it before it's sent), and server (for generating another signature to compare against the one received):
The parameters from the following sources are collected into a
single list of name/value pairs:
The query component of the HTTP request URI as defined by
[RFC3986], Section 3.4. The query component is parsed into a list
of name/value pairs by treating it as an
"application/x-www-form-urlencoded" string, separating the names
and values and decoding them as defined by
[W3C.REC-html40-19980424], Section 17.13.4.
That last reference, here, outlines the encoding for application/x-www-form-urlencoded, and says that space characters should be replaced with +, non-alphanumeric characters should be percent encoded, name separated from value by =, and pairs separated by &.
So, the OAuth1.0 spec says that the query string of the URL needs to be decoded as defined by application/x-www-form-urlencoded. Does that mean that our query string needs to be encoded this way too?
It seems to me that if a request is to be signed using OAuth1.0, the query component of the URL that gets sent must be encoded differently from how it would normally be encoded. That's a pretty significant detail if you ask me, and I haven't seen it explicitly mentioned, even in Twitter's documentation. Evidently the folks at Postman overlooked it too? Unless I'm not supposed to be using URLComponents to build a URL, but that's what it's for, no? Have I understood this correctly?
Note: ?status=hello+world%21 succeeds; it tweets "hello world!"
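For what it's worth, Python's standard library draws exactly this distinction between the two encodings, which makes the difference easy to see (Python used only for comparison):

from urllib.parse import quote, quote_plus

# Strict RFC 3986 percent-encoding: only unreserved characters
# (A-Z a-z 0-9 - _ . ~) pass through, so '!' becomes %21.
print(quote("hello world!", safe=""))  # hello%20world%21
# application/x-www-form-urlencoded: space becomes '+', '!' is still encoded.
print(quote_plus("hello world!"))      # hello+world%21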
I ran into a similar issue. Put the status in the POST body, not the query string.
Percent-encoding:
function encode(str: string): string {
    // encodeURIComponent() escapes all characters except: A-Z a-z 0-9 - _ . ! ~ * ' ( )
    // RFC 3986 section 2.3 Unreserved Characters (January 2005): A-Z a-z 0-9 - _ . ~
    return encodeURIComponent(str)
        .replace(/[!'()*]/g, c => "%" + c.charCodeAt(0).toString(16).toUpperCase());
}
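With this helper, encode("hello world!") returns "hello%20world%21", the percent-encoded form that succeeded above.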

Google Cloud Storage list objects with name containing "%" return 403 error

I am getting my objects by calling
https://<bucket>.storage.googleapis.com/?prefix=folder%2F<object name>%2F&delimiter=/&max-keys=1000
I have tried other special characters like !, #, $, ^, &, *, (, ), etc.
For the other special characters I just percent-encode them in the URL, and I get the response just fine.
For example, with object "!#" under folder, the url is:
https://<bucket>.storage.googleapis.com/?prefix=folder%2F%21%22%2F&delimiter=/&max-keys=1000
However, when I try with object names with "%" and encode the percent sign to "%25", I get the following error:
<?xml version='1.0' encoding='UTF-8'?><Error><Code>InvalidSecurity</Code> <Message>The provided security credentials are not valid.</Message><Details>Request was not signed or contained a malformed signature</Details></Error>
What could be causing this issue?
Edit
So I have tried double-encoding the percent sign so that the '%' character becomes "%2525" in the request. However, in the response, the prefix is strangely "%25". After testing more cases, it turns out a request only succeeds when "%25" is followed by two characters that are both in the range '0' to 'f'; however, the response prefix is then wrong. For example, "%25ab" in the request results in "%ab" in the response prefix.
I believe this is a service side bug: see https://issuetracker.google.com/issues/117932947
I think a workaround is to encode the percent twice. But this may start failing in the future when the bug is fixed.
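A quick Python sketch of that double-encoding workaround (the object name is made up for illustration):

from urllib.parse import quote

name = "50%_off"              # hypothetical object name containing '%'
once = quote(name, safe="")   # normal encoding: '50%25_off'
twice = quote(once, safe="")  # double encoding: '50%2525_off', which the buggy endpoint currently accepts
print(once, twice)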
The error message you're seeing means you don't have enough permissions to access your object.
If you're using an authentication method (API key, bearer token, etc.), make sure it has the needed roles for GCS.
However, I can see that you're fetching the objects with a plain GET request. Try making your objects public and try again with that encoding (%25). It should work.
Hope this is helpful!

How to pass "|" symbol in Axios URL?

url=`https://xxxxxxxxxxx.com/HiThere/${firstName}|${lastName}`
I'm passing this, but during the request it's changing to https://xxxxxxxxxxx.com/HiThere/${firstName}%7C${lastName}
Be certain that you are getting it correctly, and not converting it.
See: How to prevent Axios from encoding my request parameters?, Getting Started With Axios, or Get parameter array with key value.
The "encoding" ought to be done in the standard manner, it's up to the decoder to 'do the right thing'.
See: HTML URL Encoding Reference - W3Schools, URL Encoding | Maps URLs | Google Developers or Google fonts URL break HTML5 Validation on w3.org.
The pipe is encoded as: %7C
You cannot safely use the characters < > # % { } | \ ^ ~ [ ] in a URL.
Please check this link for more information about which characters can be used safely. So either change the symbol or decode the '%7C' on the receiving end.
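If you control the receiving end, decoding is a one-liner in most languages; for example, a Python sketch of the round trip:

from urllib.parse import quote, unquote

print(quote("|", safe=""))  # '%7C', what ends up on the wire
print(unquote("%7C"))       # '|', what the server should recover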

TYPO3 7.6: 404 error page: HTML wrapped in numbers

I created my own “404 Page not found” error page on a TYPO3 website and implemented it via the /typo3conf/LocalConfiguration.php as follows, using the page’s Speaking URL path:
return [
    ...
    'FE' => [
        ...
        'pageNotFound_handling' => '/page-not-found/',
    ],
];
Now when I call a non-existing page, the error page gets displayed, but there is a 4-digit hexadecimal number BEFORE the HTML source code and a "0" AFTER it. Example (the number at the beginning differs between most reloads):
37b3
<!DOCTYPE html>
...
</html>
0
When calling the error page URL itself the page is returned correctly without those numbers.
Having the RealURL extension activated or deactivated does not make a difference.
Thanks a lot in advance!
I added the full description from the install tool and I guess we might find the solution there.
How TYPO3 should handle requests for non-existing/accessible pages.
empty (default)
The next visible page upwards in the page tree is shown.
'true' or '1'
An error message is shown.
String
Static HTML file to show (reads content and outputs with correct headers), e.g. notfound.html or http://www.example.org/errors/notfound.html.
Prefix "REDIRECT:"
If prefixed with "REDIRECT:" it will redirect to the URL/script after the prefix.
Prefix "READFILE:"
If prefixed with "READFILE" then it will expect the remaining string to be a HTML file which will be read and outputted directly after having the marker "###CURRENT_URL###" substituted with REQUEST_URI and ###REASON### with reason text, for example: READFILE:fileadmin/notfound.html.
Prefix "USER_FUNCTION:"
If prefixed with "USER_FUNCTION:" a user function is called, e.g. USER_FUNCTION:fileadmin/class.user_notfound.php:user_notFound->pageNotFound where the file must contain a class user_notFound with a method pageNotFound() inside with two parameters $param and $ref.
What you configured:
You're passing a string, thus TYPO3 expects to find a file, which you don't have, because what you configured is more like a URL.
From what you try to achieve I'd go with REDIRECT:/page-not-found/.
Thanks for pointing this one out btw, I will remove the string configuration from the core since it does not make sense to have more people trip into this pitfall.
In short: change the following line in the FE section of your LocalConfiguration.php:
'pageNotFound_handling' => '/your404page.html',
to
'pageNotFound_handling' => 'REDIRECT:/your404page.html',
Cause
The actual cause is a combination of chunked Transfer-Encoding and TYPO3 not being able to decode it in some cases. In your case the page-not-found handler eventually uses GeneralUtility::getUrl() to retrieve the error page.
If you have [SYS][curlUse] enabled it will use cURL to retrieve the page and there is no problem.
If you don't have [SYS][curlUse] enabled it will open a socket, read the headers and then read the rest of the body. If the webserver uses "chunked" Transfer-Encoding the body will contain blocks of data, and each block starts with a line holding its length in hexadecimal. The content ends with an empty block (with, of course, a line with the length "0").
cURL knows how to decode chunked data.
getUrl() itself does not know how to handle chunked data and uses the content as-is as the page content.
In TYPO3 8 LTS the Guzzle library is used to handle HTTP requests. In the Guzzle code I can't find anything about handling chunked data. Guzzle will check if the cURL PHP extension is present and use that as the preferred transport. In most installations cURL is present, and since it decodes chunked data automagically no problem is visible. I have to test Guzzle with PHP that has cURL disabled to see if the issue is also present in v8/master.
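To make the chunked format concrete, here is a minimal Python sketch of what a dechunking step has to do (assuming a well-formed body and ignoring trailers):

def dechunk(body: bytes) -> bytes:
    # Each chunk starts with its length in hex on its own line, followed by
    # the data and a CRLF; a length of "0" terminates the body. This is
    # exactly the "37b3" ... "0" framing visible around the 404 page above.
    out, pos = b"", 0
    while True:
        eol = body.index(b"\r\n", pos)
        size = int(body[pos:eol].split(b";")[0], 16)  # ignore chunk extensions
        if size == 0:
            return out
        start = eol + 2
        out += body[start:start + size]
        pos = start + size + 2  # skip the data plus its trailing CRLF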
Workaround/solution
If the PHP extension cURL is enabled in your installation you can simply enable [SYS][curlUse] in the Install Tool. The numbers around the 404 page content will disappear.

Base64 decoding of MIME email not working (GMail API)

I'm using the GMail API to retrieve an email contents. I am getting the following base64 encoded data for the body: http://hastebin.com/ovucoranam.md
But when I run it through a base64 decoder, it either returns an empty string (error) or something that resembles the HTML data but with a bunch of weird characters.
Help?
I'm not sure if you've solved it yet, but GmailGuy is correct. You need to convert the body to the standard Base64 alphabet (RFC 4648). The gist is you'll need to replace - with + and _ with /.
I've taken your original input and did the replacement: http://hastebin.com/ukanavudaz
And used base64decode.org to decode it, and it was fine.
You need to use the URL-safe (aka "web-safe") base64 decoding alphabet (see RFC 4648), which it doesn't appear you're doing. Using the standard base64 alphabet may work sometimes but not always (2 of the 64 characters differ).
Docs don't seem to consistently mention this important detail. Here's one where it does though:
https://developers.google.com/gmail/api/guides/drafts
Also, if your particular library doesn't support the "URL safe" alphabet then you can do string substitution on the string first ("-" with "+" and "_" with "/") and then do normal base64 decoding on it.
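In Python, for instance, that substitution (plus restoring any stripped '=' padding) would look like this; decode_body is just an illustrative name:

import base64

def decode_body(data: str) -> bytes:
    # Map the URL-safe alphabet (RFC 4648) back to the standard one,
    # then restore '=' padding before a normal base64 decode.
    std = data.replace("-", "+").replace("_", "/")
    std += "=" * (-len(std) % 4)
    return base64.b64decode(std)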
I had the same issue decoding the 'data' fields in the message object response from the Gmail API. The Google Ruby API library wasn't decoding the text correctly either. I found I needed to do a url-safe base64 decode:
@data = Base64.urlsafe_decode64(JSON.parse(@result.data.to_json)["payload"]["body"]["data"])
Hope that helps!
Here is an example for Python 2.x and 3.x:
decodedContents = base64.urlsafe_b64decode(payload["body"]["data"].encode('ASCII'))
If you only need to decode for displaying purposes, consider using atob to decode the messages in JavaScript frontend (see ref).
Whilst playing with the API result, I found that once I had drilled down to the body, one of the available methods offered to decode the data.
val message = mService!!.users().messages().get(user, id).setFormat("full").execute()
println("Message snippet: " + message.snippet)
if (message.payload.mimeType == "text/plain") {
    val body = message.payload.body.decodeData() // getValue("body")
    Log.i("BODY", body.toString(Charset.defaultCharset()))
}
The result:
com.example.quickstart I/BODY: ISOLATE NORMAL: 514471,Fap, South Point Rolleston, 55 Faringdon Boulevard , Rolleston, 30 May 2018 20:59:21
I copied the base64 text to a file (b64.txt), then base64-decoded it using base64 (from coreutils) with the -d option (see http://linux.die.net/man/1/base64) and I got text that was perfectly readable. The command I used was:
cat b64.txt | base64 -d