How to fix a mutated Base64 encoding?

I've got a string that looks like a Base64-encoded ASCII string:
#2aHR0cDovL2RhdGEwMi1jZG4uZGF0YWxvY2sucnUvZmkybG0vZTAxNGNmZTZhMzE1ZjgyODgyZWUxMmJjOTY5MzQ1MDkvN2ZfVGVzdC5uYS5iZXJlbWVubm9zdC5zMDMuZTAyLldFQi1ETFJpcC4yNUt1em1pY2g\/\/b2xvbG8=uYTEuMTYuMTEuMjIubXA0
If I decode it without any edit, it comes out as a mess, but if I remove two characters from the very beginning (#2), it decodes into a mostly correct string:
http://data02-cdn.datalock.ru/fi2lm/e014cfe6a315f82882ee12bc96934509/7f_Test.na.beremennost.s03.e02.WEB-DLRip.25Kuzmich?
but it is still not complete. The URL should look like this:
http://data02-cdn.datalock.ru/fi2lm/f03143c36c778262bd9906da5d545f85/7f_Test.na.beremennost.s03.e02.WEB-DLRip.25Kuzmich.a1.16.11.22.mp4
If I remove more characters from the initial string (#2aHR0cDovL2RhdGEwMi1jZG4uZ), I get corrupted text with the correct ending of the decoded URL:
][KLKLMMLMYYLLML
LK\K\[Y[LPQ\R^ZXololo.a1.16.11.22.mp4
Is this a common problem with Base64 encoding, or is there some sort of mutation in the encoded string that can be undone?
In my experiments I used base64decode.org.

The slashes should not be backslashed and there should not be an equals sign in the middle of the stream.
If I remove the backslash-escaped slashes and the stray equals sign, I get:
http://data02-cdn.datalock.ru/fi2lm/e014cfe6a315f82882ee12bc96934509/7f_Test.na.beremennost.s03.e02.WEB-DLRip.25Kuzmich.16.11.22.184
I don't think there's anything "regular" about this corruption. Luckily they didn't inject valid Base64 or a lot of spurious junk.
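For reference, here is a minimal Python sketch of that cleanup, assuming the only damage is the injected junk identified above (the "#2" prefix, the backslash escapes and the "=" in the middle of the stream); the decoded URL will still carry whatever mutation was made to the encoded payload itself:

import base64

# The mutated string from the question.
raw = r"#2aHR0cDovL2RhdGEwMi1jZG4uZGF0YWxvY2sucnUvZmkybG0vZTAxNGNmZTZhMzE1ZjgyODgyZWUxMmJjOTY5MzQ1MDkvN2ZfVGVzdC5uYS5iZXJlbWVubm9zdC5zMDMuZTAyLldFQi1ETFJpcC4yNUt1em1pY2g\/\/b2xvbG8=uYTEuMTYuMTEuMjIubXA0"

cleaned = raw[2:]                     # drop the "#2" prefix
cleaned = cleaned.replace("\\/", "")  # drop the backslash-escaped slashes
cleaned = cleaned.replace("=", "")    # drop the "=" from the middle of the stream

# Base64 data characters come in 4-character blocks; a single leftover
# character can never be decoded, so drop it, otherwise pad with "=".
extra = len(cleaned) % 4
if extra == 1:
    cleaned = cleaned[:-1]
elif extra:
    cleaned += "=" * (4 - extra)

print(base64.b64decode(cleaned).decode("utf-8", errors="replace"))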

Related

Flutter JSON encode raw string

So I have this JSON data that contains the strings \r (carriage return) and \n (new line) - it's from Firebase. The problem is that when I encode the data using json.encode, it adds an escaping character, so \r becomes \\r.
I'm sending that data to another server.
json.encode works as expected if I do json.encode({'hello': 'world\r\n'}), but it adds \ when I use it on my other string.
Am I missing something?
Is there some type of encoding to prevent it from adding \?
It seems that the data you received does not contain actual CR and LF characters but rather their escape sequences (\ followed by r and \ followed by n). Therefore, when you encode that to JSON, the backslashes are escaped again.
You could do:
data = data.replaceAll('\\r', '\r').replaceAll('\\n', '\n');
which would probably work most of the time, but it has a corner case: it would also undesirably rewrite occurrences that were explicitly intended to stay escaped. (That is, the three-character sequence \, \, n would be transformed into \ followed by LF.)
Since the data is already escaped, you probably could unescape it with json.decode. Of course, decoding the data as JSON just to re-encode it to JSON seems a little silly, so if it's already properly encoded JSON, you ideally should pass it through unchanged.
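The double-escaping effect is easy to reproduce outside Dart; a small Python sketch (json.dumps playing the role of json.encode here) shows why data that already contains the two-character sequences \r and \n gets a second layer of backslashes, and what the replaceAll workaround does:

import json

already_escaped = "hello\\r\\n"   # the two-character sequences \r and \n, not real CR/LF
real_newline = "hello\r\n"        # actual CR and LF characters

print(json.dumps(already_escaped))   # "hello\\r\\n"  -- the backslashes get escaped again
print(json.dumps(real_newline))      # "hello\r\n"    -- encoded as expected

# Unescaping first (the same idea as the Dart replaceAll above) avoids the double
# escaping, at the cost of also rewriting backslashes that were meant literally.
fixed = already_escaped.replace("\\r", "\r").replace("\\n", "\n")
print(json.dumps(fixed))             # "hello\r\n"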

User-Token: which decoding/decryption algorithm is this?

I have a string, and what I want to know is how it was generated:
3C+msMRwFDOcepm960C2kUfeFdBe2WoWLFATI+u7EKiFt9nqdPuI6nXIByUhBeNoCqaivEHp/dHimnfAeT0n7ZsZU6AmJkONCulPOLd8q09i+EzfWhW0GJmnvSIC3YEh5kuZOF62E63f12gjESKwyYVq4Y/iWcAu2TdyueX977U5O4BdLIEbDsmjSUhKLfiH8RvaGZrj4OpggOvpytsqcQ==
I did some research over the last few days, and it seems to be a Base64 encoding, but here we also have special characters in the string like "/+=". The plain text should be b33912c6-b805-412b-9660-b80186fc3b9f, but no encoding/encryption method I found online produced the same string.
Which encryption or encoding algorithm is used here?

Is it possible to base64 decode part of a base64 encoded message

I am working on a project where I am getting parts of base64 encoded data, but not the whole thing. Is it possible to figure out what that part of the base64 encoded data was?
For example. Say I base64 encode hello world
It becomes aGVsbG8gd29ybGQ=
But say I am only able to capture sbG8gd29y
Which base64-decodes to ݽ
I am familiar with how the base64 encoding process works, and I cannot think of a way to figure out what part of a base64-encoded message this is, other than randomly adding data to the front and back of the chunk and comparing the results with dictionary words; the problem is I am not even 100% sure that the data I am working with includes dictionary words.
Thanks
I just spent a little time using an online converter (http://www.convertstring.com/EncodeDecode/Base64Decode).
If you take your captured section and run it through the converter, you can see that it has an invalid length for a base64-encoded string.
For a captured section to have a valid length you will need to add some extra characters (0-3, depending on the length of the section). A valid base64 string has a length that is exactly divisible by 4.
Pick a character ('a' for example) and then run through the possibilities of adding the correct number of characters to the section, front and back. With your added characters the string will be decodable, and one of the decoded values will be more readable; that will be the one that has the partially decoded data.
E.g.:
sbG8gd29yaaa
and
aaasbG8gd29y
decodes to:
����ݽɦ�
and
i��lo wor
You can make a rudimentary programmatic test for readability by counting the number of 'normal' characters within the string (a-z, for example). You will need to make up your own mind about what counts as 'normal'; it will depend on the expected language of the data and the context (is it known to be numeric only, for example).
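Here is a rough Python sketch of that brute-force approach, using the captured fragment from the question ('A' is used as the filler character, and readability is scored by counting printable ASCII bytes):

import base64

captured = "sbG8gd29y"   # the captured fragment from the question

def readability(data: bytes) -> int:
    # Rudimentary test: count bytes in the printable ASCII range.
    return sum(0x20 <= b < 0x7F for b in data)

# The fragment may start 0-3 characters into a 4-character quantum, so try each
# front offset; fill the back as well so the total length is divisible by 4.
for offset in range(4):
    candidate = "A" * offset + captured
    candidate += "A" * (-len(candidate) % 4)
    decoded = base64.b64decode(candidate)
    print(offset, readability(decoded), decoded)

# For this fragment, offset 3 scores highest and reveals b'\x00\x00,lo wor',
# i.e. the "lo wor" slice of the original "hello world".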

How to decode mixed string with unicode symbols?

Yesterday I was confused by the output of FM SSFC_PARSE_CERTIFICATE. It serves to decode the fields of an X.509 certificate into a readable format.
Everything is OK for Latin characters, but Cyrillic letters are turned into something like \u041F\u0440\u0438\u0432\u0435\u0442.
Besides, if the original text contains mixed symbols, i.e. Latin, non-Latin, spaces and digits, the task becomes even more complex: Hello! \u041F\u0440\u0438\u0432\u0435\u0442 1234.
I wrote some code myself to scan the string character by character and decode single entities using CL_ABAP_CONV_IN_CE=>UCCP, and it seems to work well, but I'd like to know if there is a standard way to achieve the same result.
Well, it seems that in your input xstring all non-Latin character codes have been escaped instead of being encoded in UTF-8. So if you are not satisfied with your DIY solution, you should work upstream of the call to FM SSFC_PARSE_CERTIFICATE.
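The escaping itself is easy to undo generically if fixing the upstream step is not an option; for illustration (in Python rather than ABAP), this mirrors what the character-by-character scan with CL_ABAP_CONV_IN_CE=>UCCP does:

import re

s = r"Hello! \u041F\u0440\u0438\u0432\u0435\u0442 1234"

# Replace every \uXXXX escape with the corresponding character.
# (Note: this simple form does not handle surrogate pairs for characters
# outside the Basic Multilingual Plane.)
decoded = re.sub(r"\\u([0-9A-Fa-f]{4})",
                 lambda m: chr(int(m.group(1), 16)),
                 s)

print(decoded)   # Hello! Привет 1234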

base64 encoding: input character

I'm trying to understand what the input requirements are for base64 encoding. Nicholas Zakas, whom I have tremendous respect for, has an article (Zakas Article on base64) where he quotes a specification saying that an error should be thrown if the input contains any character with a code higher than 255:
Before even attempting to base64 encode a string, you should check to see if the string contains only ASCII characters. Since base64 encoding requires eight bits per input character, any character with a code higher than 255 cannot be accurately represented. The specification indicates that an error should be thrown in this case:
if (/([^\u0000-\u00ff])/.test(text)){
    throw new Error("Can't base64 encode non-ASCII characters.");
}
He provides a link in another part of the article to RFC 3548, but I don't see any input requirements other than:
Implementations MUST reject the encoding if it contains characters
outside the base alphabet when interpreting base encoded data, unless
the specification referring to this document explicitly states
otherwise.
I'm not sure what "base alphabet" means, but perhaps this is what Zakas is referring to. However, by saying implementations must reject the encoding, the RFC seems to be talking about data that has already been encoded, as opposed to the input (of course, if the input is invalid it will also show up in the encoding, so perhaps the point is moot).
I'm a bit confused about what the standard is.
Fundamentally, it's a mistake to talk about "base64 encoding a string" where "string" is meant in terms of text.
Base64 encoding is applied to binary data (a sequence of bytes, or octets if you want to be even more picky), and the result is text. Every character in the output is printable ASCII text. The whole point of base64 is to provide a safe way of converting arbitrary binary data into a text format which can be reliably embedded in other text, transported etc. ASCII is compatible with almost all character sets, so you're very unlikely to be unable to encode ASCII text as part of something else.
When someone talks about "base64 encoding a string" they're really talking about encoding text as binary using some existing encoding (e.g. UTF-8), then applying a base64 encoding to the result. When decoding, you'd need to decode the base64 back to binary, and then decode that binary data with the original encoding, to get the original text.
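A short Python sketch of that two-step round trip, with UTF-8 as the (assumed) text encoding:

import base64

text = "héllo wörld"   # arbitrary Unicode text

# Step 1: turn the text into bytes using an explicit character encoding.
raw = text.encode("utf-8")

# Step 2: base64 encode the bytes; the result is printable ASCII.
b64 = base64.b64encode(raw).decode("ascii")

# Decoding reverses both steps in the opposite order.
assert base64.b64decode(b64).decode("utf-8") == text
print(b64)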
For me the (first) linked article has a fundamental problem:
Before even attempting to base64 encode a string, you should check to see if the string contains only ASCII characters
You don't base64 encode strings. You base64 encode byte sequences. And when you're dealing with any kind of encoding work, it's extremely important to keep in mind this difference.
Also, his check for 'ASCII' actually lets through everything from 80 to ff, which aren't ASCII - ASCII is only 00 to 7f.
Now, if you have a string which you have checked is pure ASCII, you can then safely treat it as a byte sequence of the ASCII values of the characters in it - but this is a separate earlier step, nothing strictly to do with the act of base64 encoding.
(I should say that I do like his repeated urging for the reader to note that base64 encoding is not in any shape or form encryption)
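To make the 00-7f point concrete, a strict ASCII check (shown in Python here rather than patching the article's JavaScript) accepts only code points below 0x80:

def is_ascii(text: str) -> bool:
    # True ASCII is U+0000..U+007F; the article's regex also lets U+0080..U+00FF through.
    return all(ord(ch) < 0x80 for ch in text)

print(is_ascii("hello"))   # True
print(is_ascii("héllo"))   # False: é is U+00E9, outside the ASCII range

# Python 3.7+ has the equivalent built-in str.isascii().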