Base64 Encoding difference in a particular String

I have a question about the Base64 encoding of one particular string.
We have an application that allows REST web services to be executed once Basic Authentication succeeds.
I have set the password CP#5N0v22nD17RrV8f4 for a user USER_NAME.
From my system, using Postman/Advanced REST client, the request is processed successfully. But the same request fails when made from most other systems using the same REST client.
When I set this password for another user, that user's credentials have the same problem.
I noticed that the output charset of the Base64 encoding is the problem, but most ready-made REST clients offer no way to change it.
But why does this happen only for this particular password? I checked other passwords and they all work fine.
String: USER_NAME:CP#5N0v22nD17RrV8f4
UTF-8: VVNFUl9OQU1FOkNQQDVOMHYyMm5EMTdSclY4ZjTigIs=
Windows-1252: VVNFUl9OQU1FOkNQQDVOMHYyMm5EMTdSclY4ZjQ=
ASCII: VVNFUl9OQU1FOkNQQDVOMHYyMm5EMTdSclY4ZjQ=
Only for CP#5N0v22nD17RrV8f4 does the UTF-8 output charset give a different Base64 result.
With any other password, all three outputs are the same.
Please help me understand why CP#5N0v22nD17RrV8f4 is different from the other strings.
Thanks in advance
Balu

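The string has an invisible character at the end. The UTF-8 output ends in the bytes E2 80 8B, which is the UTF-8 encoding of a zero-width space (U+200B). Windows-1252 and ASCII cannot represent that character, which is why those two outputs end right after the final 4.
I tested this using the following steps.
Decoded the UTF-8 string VVNFUl9OQU1FOkNQQDVOMHYyMm5EMTdSclY4ZjTigIs= at https://www.base64decode.org/
Copied the result and encoded it again as UTF-8 at https://www.base64decode.org/, but this time pressed backspace once at the end of the string. That gives the output VVNFUl9OQU1FOkNQQDVOMHYyMm5EMTdSclY4ZjQ=
You could also try typing the characters manually and then encoding.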
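The same check can be done locally rather than on a website. Here is a minimal Python sketch, assuming only the two Base64 strings quoted in the question (the variable names are just illustrative). It decodes the UTF-8 variant, names the trailing character, strips it, and encodes again.

```python
import base64
import unicodedata

# The two Base64 outputs quoted in the question.
utf8_b64 = "VVNFUl9OQU1FOkNQQDVOMHYyMm5EMTdSclY4ZjTigIs="
ascii_b64 = "VVNFUl9OQU1FOkNQQDVOMHYyMm5EMTdSclY4ZjQ="

decoded = base64.b64decode(utf8_b64).decode("utf-8")
last = decoded[-1]

# repr() and the code point make invisible characters visible.
print(repr(decoded))
print(f"U+{ord(last):04X}", unicodedata.name(last, "<unnamed>"))

# Strip the trailing zero-width space and encode again.
cleaned = decoded.rstrip("\u200b")
re_encoded = base64.b64encode(cleaned.encode("utf-8")).decode("ascii")

# True means the trailing character is the only difference.
print(re_encoded == ascii_b64)
```

If the last line prints True, the two encodings differ only by that one trailing character, so retyping the password by hand instead of pasting it should remove the difference.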

Polish name (Wężarów) returned from json service as W\u0119\u017car\u00f3w, renders as WÄ™Å¼arÃ³w. Can't figure out encoding/charset.

I'm using DB-IP.com to get city names from IP addresses. Many of these are international cities, with special characters in the names.
As an example, one of these cities is Wężarów in Poland. Checking the JSON return in the console or opening the request URL directly, it's being returned from DB-IP as "W\u0119\u017car\u00f3w" with a Content-Type of text/javascript;charset=UTF-8. This is rendered in the browser as WÄ™Å¼arÃ³w. It is also saved in my MySQL database as WÄ™Å¼arÃ³w (which I've tried with both utf8 and latin1 encoding).
I'm OK with saving it in the DB in another format, as long as I can convert it back to Wężarów for display in the browser. I've tried encoding and decoding to/from several formats, even just to display directly on the screen (ignoring the DB entirely). I'm completely confused about what I need to do here to get it into a readable format.
I'm working with Perl; however, if I can figure out what I need to do with the encoding/decoding/charset (as I'm currently clueless), I'm sure I can work it out from there.

It looks like the UTF-8 encoded string was interpreted by the browser as if it were Windows-1252. Here's how I deduced it:
% python3
>>> s = "W\u0119\u017car\u00f3w"
>>> b = bytes(s, encoding='utf-8')
>>> b
b'W\xc4\x99\xc5\xbcar\xc3\xb3w'
>>> str(b, encoding='utf-8')
'Wężarów'
>>> str(b, encoding='latin-1')
'WÄ\x99Å¼arÃ³w'
>>> str(b, encoding='windows-1252')
'WÄ™Å¼arÃ³w'
If you're not good with Python, what I'm doing here is encoding the string "W\u0119\u017car\u00f3w" into UTF-8, yielding the byte sequence 'W\xc4\x99\xc5\xbcar\xc3\xb3w'. Decoding that with UTF-8 yielded 'Wężarów', confirming that this is the correct UTF-8 encoding of the string you want. So I took a guess that the browser is using the wrong encoding to render it, and decoded it using Latin-1. That gave me something very close, so I looked up Latin-1 and noticed that it's named as the basis for Windows-1252. Decoding again as Windows-1252 gives the result you saw.
What's gone wrong here is that the browser can't tell what encoding to use to render the page, and it's guessing wrong. You need to fix this by telling it explicitly to use UTF-8. Here's a page by the W3C that describes how to do that. Essentially what you need to do is add an HTML <meta> element to the document head. If you also set an HTTP header with the encoding name in it, make sure they are consistent.
(In Firefox, while you're debugging, you can go to View -> Character Encoding to set the encoding on a page-by-page basis. I assume other browsers have the same feature.)
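If garbled text has already been stored, the wrong decoding step can often be reversed. Here is a minimal Python sketch, assuming the damage is exactly one round of UTF-8 bytes read as Windows-1252, as deduced above. The helper name `repair_mojibake` is made up for this example.

```python
import json

def repair_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 bytes decoded as Windows-1252'."""
    # Re-encode with the wrong charset to recover the original bytes,
    # then decode those bytes with the charset they were written in.
    return text.encode("windows-1252").decode("utf-8")

# The JSON escapes from the service already describe the right characters.
city = json.loads('"W\\u0119\\u017car\\u00f3w"')

# Reproduce the damage, then reverse it.
garbled = city.encode("utf-8").decode("windows-1252")
print(garbled)
print(repair_mojibake(garbled))
```

The repair raises an error if the text contains characters that Windows-1252 cannot encode, so it is a recovery tool rather than a substitute for declaring UTF-8 consistently.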

How to auto detect a String encoding?

I have a String that contains values encoded in some way, possibly Base64.
The problem is that I don't know whether it's actually Base64 (it contains A-Z, a-z, 0-9, +, /), so it could be some other encoding that I'm not familiar with.
Is there a tool or website that I can send an encoded input to and that will tell me which encoding it uses?
NOTE:
I'm not asking how to tell whether my String is UTF-8 or ISO-8859-1 or something like that.
What I need to know is which scheme my String is encoded with.
EDIT:
To be more clear,
I need something to get an input like: 23Nzi4lUE4qlc+Pmc3blWMS1Irmgo3i8UTQHhoL7VyzqpEV/i9bDhoiteZ0a7/TqcVSkrXR89V2Yj7tEFDGJx4gvWEBs= this is the encoded String that I have.
The output should be the type of encoding and the decoded value, like:
Base64 -> "Big yellow fish is swimming in the tube."
Maybe there is some program that takes an input and tries to decode it with a list of encoding types (Base64 etc.). The quality of the output doesn't really matter, because it's the user's decision whether it's good or not.

This site handles base64 de/encoding.
Since Base64 is just one instance of a class of encoding schemes (specifically, encoding a bit stream as a base_<n> number), you will probably never do better than testing for a handful of standard encoding schemes.
You can either check the input for well-formedness under each scheme, or try to decode it and see whether an error is thrown, using a web service or your own code.
In (possibly pathological) cases a given octet stream will decode successfully under more than one encoding scheme.
Rather than investing effort in setting up this verification, the better practice is to get the data provider to commit to one (or a few) encodings first, although that won't always be possible.
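The "try to decode without getting an error" approach can be sketched in a few lines of Python. This is a minimal sketch, not a real detector: the list of candidate schemes and the function name `candidates` are assumptions made for illustration, and a successful decode only means the input is well-formed for that scheme.

```python
import base64
import binascii

# Candidate decoders; this set is an assumption for illustration, not exhaustive.
DECODERS = {
    "base64": lambda s: base64.b64decode(s, validate=True),
    "base64url": lambda s: base64.b64decode(s, altchars=b"-_", validate=True),
    "base32": base64.b32decode,
    "base16": base64.b16decode,
}

def candidates(text: str) -> dict[str, bytes]:
    """Return every scheme that decodes `text` without raising an error."""
    results = {}
    for name, decode in DECODERS.items():
        try:
            results[name] = decode(text)
        except (binascii.Error, ValueError):
            pass  # not well-formed for this scheme
    return results

for name, raw in candidates("SGVsbG8=").items():
    print(name, raw)
```

The sample input is accepted by both the standard and the URL-safe Base64 decoder, which illustrates the point above that more than one scheme can succeed. Judging which result is meaningful is still left to the user.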

QNetworkRequest and automatic conversion of percent-encoded characters

I'm trying to download the audio samples from Amazon with the help of QNetworkAccessManager+QNetworkRequest+QNetworkReply. I've got a big problem in processing the redirect from, for example, http://www.amazon.com/gp/dmusic/aws/sampleTrack.html?clientid=Shazam&ASIN=B00DJBQWAE to http://d28julafmv4ekl.cloudfront.net/64%2F30%2F239068457_S64.mp3?Expires=1380627695&Signature=BlaBlaBlaBla&Key-Pair-Id=BlaBlaBla
(Note the percent-encoded path returned from the server.) The problem is that when the redirect target URL is passed to a new QNetworkRequest and the request is sent via QNAM, the %2F sequences are automatically converted to slashes. This seems to be correct behavior, BUT the server requires these slashes to remain encoded. Is there any way to disable this conversion?
By the way, QNetworkReply has a similar feature: it returns the redirect URL with the %xx sequences already converted.

You can percent-encode the URL once more before passing it on. That way '%2F' becomes '%252F', and QNetworkRequest decodes it back to '%2F'.
QUrl::toPercentEncoding does this: https://developer.blackberry.com/native/reference/cascades/qurl.html#toPercentEncoding
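The double-encoding idea is independent of Qt. The sketch below uses Python's `urllib.parse` purely to illustrate the string transformation. It does not call any Qt API, and whether QNetworkRequest performs exactly one decoding pass is an assumption taken from the answer.

```python
from urllib.parse import quote, unquote

# Path segment as it appears in the redirect target from the question.
path = "64%2F30%2F239068457_S64.mp3"

# Encode once more: '%' itself becomes '%25'.
double_encoded = quote(path, safe="")
print(double_encoded)

# A single decoding pass restores the original, still-encoded form.
print(unquote(double_encoded))
```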

Httperf: How to test a REST API with an encoded URI

I want to test my REST API which has a URI something like this:
/myrestAPI/search?startTime=0&endTime=10&count=8&filters={"params":
[{"field":"Topic","value":"Algorithms","type":"MATCH_EXACT"}]}
How would I do that? The httperf reply status is "505 HTTP Version Not Supported".
I suspect httperf is not encoding this URI properly before sending it.
How can I achieve that in httperf?

Since URLs often contain characters outside the ASCII set, the URL has to be converted into a valid ASCII format.
URL encoding replaces unsafe ASCII characters with a "%" followed by two hexadecimal digits.
In your case, it would be:
/myrestAPI/search?startTime=0&endTime=10&count=8&filters=%7B%22params%22%3A%20%5B%7B%22field%22%3A%22Topic%22%2C%22value%22%3A%22Algorithms%22%2C%22type%22%3A%22MATCH_EXACT%22%7D%5D%7D
Try experimenting with a URL encoder/decoder.
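The encoded URI can also be generated with a script instead of an online tool. Here is a minimal Python sketch, assuming the filter JSON is exactly the text from the question, with a single space where the line break was.

```python
from urllib.parse import quote

# JSON filter as written in the question (with a space after the first colon).
filters = '{"params": [{"field":"Topic","value":"Algorithms","type":"MATCH_EXACT"}]}'

# safe="" percent-encodes every reserved character: quotes, braces, brackets, space.
uri = "/myrestAPI/search?startTime=0&endTime=10&count=8&filters=" + quote(filters, safe="")
print(uri)
```

The printed value should match the encoded URI shown above and can then be passed to httperf through its `--uri` option.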

Blob data replace '+' with space

I have an iPhone app that converts an image into NSData and then into a Base64-encoded string.
When this encoded string is submitted to the server and stored in the server's database, each '+' gets converted into a space, so the decoder does not work properly.
I guess the issue is with the default encoding of the database table. Currently it's latin; I tried changing it to UTF8, but the problem still exists.
Is there any other encoding I should use? Please help.

This has nothing to do with the character encoding of the table. It is the format of POST and GET parameters that clashes with Base64: in form-encoded data a '+' stands for a space, so the server decodes it as one. In http://en.wikipedia.org/wiki/Base64#Variants_summary_table you can see alternatives that are designed to make Base64 work with URLs etc.
One of these variants is "Base64 with URL and Filename Safe Alphabet (RFC 4648 'base64url' encoding)", which replaces the + with - and the / with _.
Another alternative would be to replace the offending characters +, / and = with their respective hex representations in %xx form, but that makes the data unnecessarily longer.
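Both the clash and the two workarounds can be reproduced in a few lines. This is a minimal Python sketch rather than iOS or server code: the parameter name `img` and the three sample bytes are made up for illustration, and `parse_qs` stands in for whatever form decoder the server uses.

```python
import base64
from urllib.parse import parse_qs, quote

raw = bytes([0xFB, 0xEF, 0xFF])  # bytes chosen so the output contains '+' and '/'

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")
print(standard, url_safe)

# What a form decoder does with the unescaped standard variant: '+' turns into ' '.
print(repr(parse_qs("img=" + standard)["img"][0]))

# Fix 1: percent-encode the standard form before sending it.
print(parse_qs("img=" + quote(standard, safe=""))["img"][0] == standard)

# Fix 2: send the URL-safe alphabet, which needs no escaping for '+' or '/'.
print(base64.urlsafe_b64decode(parse_qs("img=" + url_safe)["img"][0]) == raw)
```

With the URL-safe variant, the receiving side has to decode with the matching alphabet, so both ends need to agree on it.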