How to figure out what an encoded string contains - encoding

I have a string that looks like this
H4sIALYnhUsCA9VXW5aDIAz9zypcgiU8dDnTWtfQ5Q8kEgSR
ap05c+YnhxLyumBu2r/s2PUvO3nh+rCaw0oFob1Q+Z51HfjNZ1jexCSsLAYx
BGG6eATZGJYALIIzG9QOy4NeaPYAyyarKfQY7TgypTjGI3ogkxDahSTw7kX/
FQUHeIgxsoClQD1JGRKF7Jy4oXNeQFou5TvJzlkJoAUIMuGAOlePMTEGWQry
2liLCfHNJPEwuiU7jmzEhM6gnGawSO3ORMnqLQRsNgki7AV4jEI9xKRU65V6
q7UUZVetqsZQC13z3UzMXkkM24nlvs+B/EktqmsnC0dxelvLycTaN+QugYw/
DTJeeTD4iy/ZXQHZ/KuXjH/2kvFKYtfaBfXtaUtlVZCZiIxw5WPLLxkFQZ2D
mMBmUaQJYCKyyBlShVqMuHUFSzu5/UTY1sVMVpwzSnimpEFOz5G7nKSoheIt
yqjg+pxU54zE64jd3zzdrYmW6Ybic2mVvcjAUKfg0s0QMfAXDadyotuGxOdH
hwZIU4NPR2fqbApbVnirTRdFGc/cjr7KwhmV+m6GGbMnf+RetoNNGwiohW4D
AREJ1R0FAhqo7gDx4b18iBh/uWPeGkwc07mMmdtKbBe0WQy9PMpr6TpLZwhR
whmj8/8FjTEWsv8ckhimqgj9+2q0hfWH1WpFCXPYfX27mEMGupKe1QA+gkwd
PDVv/xO+AbHzd9RzDQAA
My initial guess was that this is a Base64-encoded file of some sort. Any ideas on how I can figure out whether it is a file, and which type? I guess it should contain MIME info, but how would I save it to a file without fragmenting it?

It's base64. When you decode it, you get a gzipped file, which consists of a boatload of hex characters (literally, as ASCII 0xNN hex characters). They're mostly in the A-Z,a-z range.
I'd paste it here, but from this, I suspect this is part of some exercise you're doing, so I think I'll leave it to you to figure out.
P.S. For edification, I determined the binary output was a gzipped file by using the unix file command, to identify the "magic" bytes, which showed that it was gzipped. Use your decode_base64 function or whatever it is, then dump the return value into a file and gunzip it.
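The same check can be sketched in Python instead of the unix file command. This is only an illustration: the helper names sniff and decode_and_unpack and the small signature table are made up for this example, and the table is nowhere near complete.

```python
import base64
import gzip

# A few well-known file signatures ("magic" bytes); illustrative, not complete.
SIGNATURES = {
    b"\x1f\x8b": "gzip",
    b"PK\x03\x04": "zip",
    b"%PDF": "pdf",
    b"\x89PNG": "png",
}

def sniff(data: bytes) -> str:
    """Guess a file type from its leading bytes, like a tiny `file` command."""
    for magic, name in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return "unknown"

def decode_and_unpack(b64_text: str) -> bytes:
    """Base64-decode the text and gunzip the result if it has the gzip signature."""
    raw = base64.b64decode(b64_text)  # line breaks in the input are skipped by default
    return gzip.decompress(raw) if sniff(raw) == "gzip" else raw

# The first few characters of the string in the question already reveal the type:
print(sniff(base64.b64decode("H4sIALYnhUsC")))  # gzip
```

Working on bytes in memory, or writing them with a file opened in binary mode, avoids any corruption on the way to disk.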

Related

Has anyone ever heard of asciihex encoding?

This type of encoding is used in SOAP messages...
I'm receiving a message encoded in ASCIIHEX and I don't have any ideas on how this encoding actually works although I have the clear description of the encoding method:
"If this mode is used, every single original byte is encoded as a sequence of two characters representing it in hexadecimal. So, if the original byte was 0x0a, the transmitted bytes are 0x30 and 0x41 (‘0’ and ‘a’ in ASCII)."
The buffer received : "1f8b0800000000000000a58e4d0ac2400c85f78277e811f2e665329975bbae500f2022dd2978ff95715ae82cdcf9415efec823c6710247582d5965c32c65aab0f5fc0a5204c415855e7c190ef61b34710bcdc7486d2bab8a7a4910d022d5e107d211ed345f2f37a103da2ddb1f619ab8acefe7fdb1beb6394998c7dfbde3dcac3acf3f399f3eeae152012e010000"
The actual file contains this : "63CD13C1697540000000662534034000030000120011084173878R 00000001000018600050000000100460000009404872101367219 000000000000 DNSO_038114 000000002001160023Replacem000000333168625 N0000 00000000"
The provider sent me the file that contains the string above. I tried to start from the buffer string and get the same result as the one sent by the provider, but without success. I also searched for this "asciihex" encoding and found nothing. If someone knows anything about this encoding or can give me any advice, I would really appreciate it. I have pretty much no experience with SOAP services.
Based on the comments above, it's possible the buffer is compressed. It starts with 1F 8B which is a signature for GZIP compression. See the following list of signatures.
Write the bytes that correspond to the hex strings into a file. Name that file with a gz or tar.gz extension and try to extract it or open it with some file archiver tool.
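If you would rather do the hex-to-bytes step in code than with an archiver, a minimal Python sketch could look like this. The function names asciihex_to_bytes and unpack_buffer are hypothetical, and the sketch assumes the buffer is a complete gzip stream; a truncated buffer will make gzip.decompress raise an error.

```python
import gzip

def asciihex_to_bytes(hex_text: str) -> bytes:
    """Turn each pair of hex digits back into the byte it represents."""
    return bytes.fromhex(hex_text.strip())

def unpack_buffer(hex_text: str) -> bytes:
    """Decode the hex text and gunzip it if it starts with the gzip signature 1F 8B."""
    raw = asciihex_to_bytes(hex_text)
    if raw[:2] == b"\x1f\x8b":
        return gzip.decompress(raw)
    return raw

# The start of the buffer from the question carries the gzip signature:
print(asciihex_to_bytes("1f8b0800")[:2] == b"\x1f\x8b")  # True
```

Calling unpack_buffer on the full received buffer should then give the uncompressed record, provided the buffer arrived intact.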
Another thing you could try would be to not send the Compress element in your request, assuming it's an optional field and you can do that. If you can, check if the buffer changes and has the proper length and you can see similar patterns as the original content (for those zeros at the end, for example).

How do I use the StackExchange API from Matlab?

How do I access data from the StackExchange API using Matlab?
The naive
sitedata = urlread('http://api.stackoverflow.com/1.1/questions?tagged=matlab')
fails since the data is compressed. However, when I write this to file (using fprintf(fileID,'%s',sitedata)), I get a zip-file that cannot be uncompressed.
Try urlwrite() instead:
urlwrite('http://api.stackoverflow.com/1.1/questions?tagged=matlab',...
'tempfile.zip')
gunzip('tempfile.zip')
fid = fopen('tempfile');
str = textscan(fid,'%s','Delimiter','\n');
fclose(fid);
A better version of this snippet would use tempname to dynamically generate temporary filenames.
Matlab's urlread assumes you're getting text data back, not binary. The gzip binary data is getting mangled either when urlread is decoding the character data to Unicode values to stick in Matlab chars, or when the formatted-output fprintf function is writing them out, encoding them to UTF-8 or whatever default character encoding you're using for fileID and changing the byte sequence, or maybe both.
IIRC, urlread will default to using ISO-8859-1 encoding, which means the bytes will be turned into the Unicode code points with the same numeric values - effectively just a widening. So you can get the byte data back by doing sitebytes = uint8(sitedata). (That's a regular uint8() conversion, not a typecast().) (If this isn't the case, you can probably fiddle with urlread's CharSet option.)
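The widening argument is not specific to Matlab, so here is a small Python sketch of it (Python only because it is easy to run; widen and narrow are names invented for this example). Decoding as ISO-8859-1 maps every byte to the code point with the same value, so it can be undone exactly, while re-encoding the same text as UTF-8 changes every byte above 0x7F.

```python
import gzip

def widen(raw: bytes) -> str:
    """Decode bytes as ISO-8859-1: each byte becomes the code point of equal value."""
    return raw.decode("latin-1")

def narrow(text: str) -> bytes:
    """Undo widen(); fails if the text contains code points above 255."""
    return text.encode("latin-1")

payload = gzip.compress(b'{"total": 1}')   # stand-in for a gzipped HTTP response
text = widen(payload)                      # what a text-oriented reader hands back
recovered = narrow(text)                   # the analogue of uint8(sitedata)
mangled = text.encode("utf-8")             # what a text-oriented writer might produce

print(recovered == payload, mangled == payload)  # True False
```

The gzip header itself contains the byte 0x8B, so a UTF-8 round trip is guaranteed to damage the file.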
If you can't get the right bytes out from urlread by fiddling with the encoding and casts, then you can drop down and make calls against the Java HttpAgent like urlread does and bypass the character set decoding step, or fiddle with its options. See the urlread source for how to do it.
Once you have the right bytes in memory, you can write them out to a file using the lower-level fwrite() function, which won't mangle them by doing character set encoding. Then you'll have a valid gzip file of the site's original response. (I think it'll work if you also just use fwrite(fileID, sitedata, 'uint8') directly on the char string, but it's uglier IMHO.)
You can also unzip it in memory using Java classes and save a trip to the filesystem. Do jsitebytes = typecast(sitebytes, 'int8') to get them as Java-friendly signed bytes and then stick it into a ByteArrayInputStream and read it out through a GZIPInputStream. You'll need to build a little Java helper class because Matlab doesn't play well with passing byte[] buffers by reference like java.io wants, but it may be worthwhile if you do a lot of in-memory munging like this.
When working with web services or fancier data downloads (e.g. sites that need sessions or certificates), I've often ended up dropping down and coding directly against the HttpAgent and java.io classes from within Matlab.

Convert SHA1 hex to base64 encoded (for Apache htpasswd) in PERL

I have a PHP application that has passwords stored in the database as the output of sha1($password), which is apparently the hex representation of the binary SHA1 hash. (as I understand it)
I would like to convert that to a format that is compatible with Apache .htpasswd files, which needs to be the base64 encoded binary value, i.e. the output of base64_encode(sha1($password, true)).
I found this thread: Convert base64'd SHA1 hashes to Hex hashes ... which is doing the opposite of what I need to do, and it works great. I tried to change the ordering of the commands and use hex2bin instead of bin2hex, but that doesn't work:
Fatal error: Call to undefined function hex2bin() in php shell code on line 1
Apparently that is not available until PHP 5.4, and this server is still on 5.3.x
http://php.net/manual/en/function.hex2bin.php
Here is the real problem. I actually need it to convert in PERL, preferably only using standard built-in modules to keep everything simple. I am not sure why they are using perl for this step, but I am trying to edit one very small part of a larger application and don't want to change the whole thing yet :)
To be clear, I am not trying to convert hex numbers to binary numbers. This is a hex representation of a binary value, stored in a perl "string" :)
Thanks in advance,
Tommy
You explain so much but still leave me unsure as to what you want :/
If what you want is to convert from a hex string to a blob to base64 string, which is what you say you want in the top paragraph of your question,
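use MIME::Base64;
# pack "H*" turns the hex digits back into the raw 20-byte SHA1 digest
my $bin = pack "H*", $hex_str;
# the empty second argument suppresses the newline encode_base64 appends by default
my $encoded = encode_base64($bin, "");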
which exactly matches what you want: base64_encode(sha1($password, true))
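For cross-checking the result outside Perl, the same hex-to-raw-bytes-to-base64 conversion can be sketched in Python. The function name sha1_hex_to_base64 and the sample password are made up for this example.

```python
import base64
import hashlib

def sha1_hex_to_base64(hex_digest: str) -> str:
    """Convert a hex-encoded digest to the base64 encoding of its raw bytes."""
    return base64.b64encode(bytes.fromhex(hex_digest)).decode("ascii")

# What PHP's sha1($password) would have stored, for a sample password:
hex_digest = hashlib.sha1(b"secret").hexdigest()
# What base64_encode(sha1($password, true)) would produce:
expected = base64.b64encode(hashlib.sha1(b"secret").digest()).decode("ascii")

print(sha1_hex_to_base64(hex_digest) == expected)  # True
```

A SHA1 digest is 20 bytes, so the base64 form is always 28 characters ending in a single "=".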
Ignore my previous answers and edits.

How exactly are TMX map files base_64-encoded?

I am writing a game for iOS that uses .tmx map files. I am creating the maps in the application 'Tiled' and then at some point before they get to iOS, I'm parsing them with Perl.
When I save the files as straight XML, it's a cinch for perl to parse them. However, cocos2d insists that the files be base64-encoded. The 'Tiled' map editor has no problem saving files with this encoding scheme, and iOS reads them just fine, but it's presenting problems for my perl code.
For some reason, the standard MIME::Base64 decode_base64() method in perl is not cutting the mustard here- when I decode the strings, I get one or two binary characters-- question marks in diamond boxes and such.
And the vague documentation for the TMX file format makes it unclear if there is some other encoding going on before or after the base64 encoding which might be causing these problems. I looked at the cpp source for the encoder, and I saw lots of references to Latin1, but I couldn't decipher what's going on in detail.
I noticed that when I tried doing my own tests with MIME::Base64, encoding and then decoding a test string, the encoded text looks dramatically different than that which I see coming out of the TMX files-- for instance, my base64-encoded text for a short string looks like this:
aGVyZSBpcyBhIHNlbnRlbmNl
But the base64-encoded text coming from the TMX files looks like this:
9QAAAAABAAANAQAAGAEAAA==
Any suggestions on what else I might try in attempts to decode a string that looks like that?
I think this page might be what you're looking for. It suggests that first you decode_base64, then (if the compression="gzip" attribute is present) use gunzip to uncompress it, and finally use unpack('V*', $data) to extract the list of 4-byte little-endian integers.
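To see why the decoded data looks like binary junk, here is a Python sketch of the three steps (Python only because it is quick to try; the function name decode_tmx_layer is invented for this example). struct's "<I" format plays the role of Perl's unpack 'V', a 4-byte little-endian unsigned integer.

```python
import base64
import gzip
import struct

def decode_tmx_layer(b64_text, compression=None):
    """Decode TMX layer data into a list of tile IDs (32-bit little-endian integers)."""
    raw = base64.b64decode(b64_text.strip())
    if compression == "gzip":
        raw = gzip.decompress(raw)
    count = len(raw) // 4
    return list(struct.unpack("<%dI" % count, raw[:count * 4]))

# The sample string from the question is four uncompressed tile IDs:
print(decode_tmx_layer("9QAAAAABAAANAQAAGAEAAA=="))  # [245, 256, 269, 280]
```

So the payload was never text in the first place: it is a packed array of integers, which is why it looks nothing like base64-encoded prose.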

Are there any special variations of uuencoding / uudecoding?

I have written a small program which can encode/decode text with uuencode/uudecode. The code is based on the algorithm described on Wikipedia. It works fine when I encode/decode a string. But I have found a uuencoded file which I can't decode. This website can decode the file, but when I encode it again I don't get the same file. In addition, when I decode only one line of the file I don't get readable text (neither with my program nor with the decoder I linked before). But in uuencoding all lines are independent of each other, so this should be possible.
Does anyone know whether there are special variations of uuencoding that are not described on Wikipedia? I can decode some strings, so my decoder can't be totally wrong. Perhaps someone has written their own decoder, so I am posting the whole file:
begin 666 Restricted.zip
M4$L#!!0````(`%T[="_]<LYX`P(``'0#```.````4F5S=')I8W1E9"YT>'1M
M4\MNVT`0NQOP/TSNM#PT0!/X4N16`RE0%.GC.I9&TE;2CKH/J_K[<E;IX]"+
M'UJ20W)6^]U3)SX=]KO][D*]SD(7XHD2CX/S'26EU`L%U_6)9#E1?46NQ4,7
MR?E6P\3)J:=%#ABZY7'$P2MO"0J1GGT3Z;B1YJ#?I4ZT:!X;N#KI34)%3Y%6
MS8#>A#I-&[;E`-H%'(EY#G[/(-I',=GI;XN"H49?''YXT#LE]BNU.<!&,*(W
M0&4Y7V#,F_&11NV<-TNU-!D!>HZP5"MF91^YE0-D&H2C5CAL\T&P:#/'A*<+
M#F6(!IEXW?Q?13Q=#P[XLBHJ>L[UX,;U8+`"X3I)0S^RJX=Q+3-28)##+IK:
MEAD#AQRM7DY)ICG%BK[:(,\=L$C>20*EUCR/8BP'&'H+.OT5:+`V>,*NK$%9
MZ<;>Q1X"1WJOBZ#_8HQ+`3?K%(U<1U-:7.HI6A]_+/V[\RU,J]DW!SMV#<37
M89W+>5QCL6/"MDHTQPV&UT5-<R!=?%D)MG^AR&Y3^>]::JP0H2MZ4>3UR?F,
M[>18,L'"..I2K'.,BP8TF<K)YT_/IG1S#<#VZ^,KX$QO'[\\WC_<W;V[?_-P
MW>^`/%.?TGP^G99EJ29MCC^K6JL\G%H78CJQC[CGU=S/V_M2KEN<A0?;A5U`
M[AC.U2*6OUOE0<KD#Q#\MM_]`E!+`0(4`!0````(`%T[="_]<LYX`P(``'0#
M```.``````````$`(`"V#0````!297-T<FEC=&5D+G1X=%!+!08``````0`!
+`#P````O`#``````
`
end
I found the solution! The problem was that I did not notice the first line. This line holds information about the data encoded - a file named Restricted.zip. So the decoded data is a ZIP file which I just had to unpack.
I got a text file named Restricted.txt which contains the readable data.
The problem was so easy, but it took me days to see its solution.
That's a good segue into packing algorithms - perhaps the next thing I do is write my own program which can pack/unpack zip files.
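For anyone who hits the same wall, here is a minimal Python sketch of a decoder that reads the header line instead of skipping it. The function name uudecode and the tiny sample are made up for this example; it relies on binascii.a2b_uu from the standard library, which decodes one line at a time.

```python
import binascii

def uudecode(text: str):
    """Return (filename, mode, data) for the first uuencoded block in text."""
    lines = iter(text.splitlines())
    for line in lines:
        if line.startswith("begin "):
            # Header format: begin <octal mode> <filename>
            _, mode, filename = line.split(" ", 2)
            break
    else:
        raise ValueError("no 'begin' header found")
    chunks = []
    for line in lines:
        if line.strip() == "end":
            break
        if not line.strip():
            continue  # ignore blank lines rather than decoding them
        # a2b_uu accepts both space and backtick as the zero character
        chunks.append(binascii.a2b_uu(line))
    return filename, int(mode, 8), b"".join(chunks)

# A three-byte sample whose payload starts like a ZIP file ("PK"):
sample = "begin 666 demo.bin\n#4$L#\n`\nend\n"
print(uudecode(sample))  # ('demo.bin', 438, b'PK\x03')
```

The returned filename is the hint that was missed here: if it ends in .zip, or the data starts with the bytes PK, the payload is an archive and not text.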