Decoding an arbitrary block of NSData? - iphone

If I have an arbitrary block of NSData as a hex value, is there a way to determine what the object might have been before it was archived or serialized? I don't mind a few guess and check methods, but I need some pointers in the right direction.
I have an NSData object with some hex in it. What methods of NSData should I look at? Are there other classes to try as well?
Don't want to scare people away from answering, but I have a file of game data which was likely encoded using a Cocoa Touch class. The data, when viewed in a hex editor, shows gibberish and a username, which leads me to suspect that it's an archived or encoded object of sorts. I have copied the hex from the hex editor into a sample project which I am using to try and unarchive the data.
I don't believe this is related to the 3d format; the file extension is arbitrary.
Here's the data. I'm hoping it doesn't get lost in translation:
'µköXN[ÎÀü÷h/F9ó9Vìñ°ceE¸z¶=Hmoshbermú«ó¼Ppù#ÝVÔ=4â®L,K;Êç;ASÀ&Ë÷ëÓ%È;Úf¬G}tmQ;µéüø_87´y©ã©!߶óQòAçÛl©âSG4S½3ýJת9äô¡wxiD²M¼ÏB]39øþ:óñ7ª¾÷躣È3Ï¢ÍEFÍ¢ª»r]BmÁ'Ò+åygÞÅQ?luó>÷ú¼è6¸|}[¼[¶Ñ¦g!\OÎÒJSE..pSß&_ÈEäø)6òëó¨¼2¶ð°æà`ï7Ë=Ã¥:cƧ=L4qG-"µ(ÐÝïß ÓãXkÀ4fzæ·p\ññT<tu¥Æ©;Ìn4£³Ï¢ÌFåG´
And the corresponding hex:
27 B5 6B F6 01 00 00 00 58 4E 5B CE C0 FC F7 68 2F 46 86 87 83 39 F3 39 9E 56 EC F1 B0 63 9E 65 45 B8 7A B6 3D 07 99 48 6D 6F 73 68 62 65 72 6D 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 90 86 FA 03 0E AB F3 BC 0B 50 70 F9 23 DD 87 56 03 D4 3D 34 90 E2 AE 4C 2C 94 9E 8E 15 4B 0C 83 8C 3B 03 CA E7 3B 1B 41 53 C0 26 04 CB F7 EB D3 25 C8 3B DA 66 8A AC 47 7D 8A 7F 74 6D 51 3B B5 19 E9 FC F8 5F 38 37 B4 11 0C 79 A9 12 E3 A9 21 DF B6 F3 51 F2 41 E7 DB 85 02 9F 6C A9 E2 53 47 1F 34 86 53 BD 33 FD 4A D7 AA 39 C3 A4 F4 A1 77 78 69 44 B2 4D BC CF 42 5D 33 39 F8 FE 97 3A 81 F3 F1 10 37 AA BE 86 91 F7 1F E8 83 BA A3 C8 33 CF 1D A2 CD 45 7F 46 1F CD A2 AA BB 1A 72 5D 42 02 6D C1 0F 27 D2 2B E5 0B 79 67 DE C5 1A 51 3F 14 6C 75 F3 3E F7 FA BC E8 36 8E B8 7C 02 1C 7D 01 00 92 8C 19 5B BC 5B B6 D1 A6 67 7F 21 5C 84 13 4F CE 0C D2 4A 53 19 82 45 1B 2E 2E 96 70 53 DF 26 5F C8 1C 45 8F E4 F8 29 36 F2 EB 9D 95 F3 A8 BC 32 B6 F0 B0 E6 91 98 1A E0 99 60 EF 37 CB 3D C3 A5 3A 63 0C C6 A7 3D 4C 34 71 47 2D 22 B5 28 D0 DD EF DF 09 D3 E3 58 6B C0 17 34 66 7A E6 B7 70 5C F1 F1 54 3C 74 94 75 A5 C6 15 A9 9E 14 3B CC 15 10 83 6E 34 A3 B3 CF 0F A2 9C CC 8E 46 8C E5 00 00 47 B4 17 05 00 00 00 00
If anyone cares to help figure this out it would be much appreciated.

If I have an arbitrary block of NSData as a hex value, is there a way to determine what the object might have been before it was archived or serialized?
Not really. That's about as 'trivial' as reading arbitrary files correctly without the use of a UTI, extension, or MIME type. Of course, your program would also need to support reading all of those files/formats.
I don't mind a few guess and check methods, but I need some pointers in the right direction.
You need to narrow your problem/inputs down, if you don't want an impossibly difficult task.
I have an NSData object with some hex in it. What methods of NSData should I look at?
It's just a data blob of length bytes. It could represent anything -- if you don't know where it came from.
Are there other classes to try as well?
Perhaps you would start by saving all your data via NSCoder or another serializer/archiver which offers some introspection and support for you to enter your own information (which would be comparable to a UTI or MIME type).
Edit:
Don't want to scare people away from answering, but I have a file of game data which was likely encoded using a Cocoa Touch class. The data, when viewed in a hex editor, shows gibberish and a username, which leads me to suspect that it's an archived or encoded object of sorts. I have copied the hex from the hex editor into a sample project which I am using to try and unarchive the data.
Using these APIs, the data may be represented multiple ways. You're probably facing something within the domain of 1) a proprietary file format through 2) a keyed archive.
The latter is easier for nontrivial data representations. You would need to define any Objective-C classes you do not have available when unarchiving. In that case, a few sample representations would offer a rough outline of the data structures you will need (under conventional implementations). It could also be an archive similar to an NSDictionary, if the unarchiver is capable of opening it. This problem is easier in Cocoa than in other languages, since archiving often falls back on keys and values mapped to members.
Edit2:
The file came from the Draw Something directory. It's called gamedata.i3d
(shrug)

Try using NSKeyedUnarchiver to read it. It's not uncommon to use just the standard Foundation containers like NSArray, NSDictionary, and NSString to store data, so you might get lucky. That obviously won't work if custom classes were involved, but it might be worth 15 minutes of your time to try it.
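Before reaching for NSKeyedUnarchiver, it can help to check whether the blob even looks like a property list, since keyed archives are normally written as binary or XML property lists. The following Python sketch is only an illustration and not part of the original answers; the helper names are hypothetical, and the header bytes are the first twelve bytes of the hex dump in the question.

```python
import plistlib

def looks_like_plist(data: bytes) -> bool:
    """Return True if data starts like a binary or XML property list."""
    return data.startswith(b"bplist00") or data.lstrip().startswith(b"<?xml")

def try_decode_plist(data: bytes):
    """Return the decoded plist object, or None if data is not a plist."""
    if not looks_like_plist(data):
        return None
    try:
        return plistlib.loads(data)
    except Exception:
        # Anything that fails to parse is treated as "not a plist".
        return None

# First bytes of the game data file from the question.
header = bytes.fromhex("27 B5 6B F6 01 00 00 00 58 4E 5B CE")
print(looks_like_plist(header))
```

The header does not begin with bplist00, so the check prints False. That hints at a custom format (possibly compressed or encrypted) rather than a plain keyed archive, although it does not prove it.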

Related

Unknown data between h264 NAL units in an AVC file

I want to understand a weird observation I made while working with h264 encoded AVC files. In such files, each NAL unit is preceded by 1/2/4 bytes that encode the size of the NAL unit (not counting the size header itself). However, there have been some cases where the end of one NAL unit doesn't lead to another NAL unit; instead it leads to a sequence of other data until it eventually reaches another NAL unit.
For example, starting at 01ADF399, we have:
*00 00 35 99 41* 9A 12 25 83 A5 F0 7A 08 41 0C 1E 02 50 20 03 80 A4 12 30 B6 44 90
0C E1 CD A2 68 9F 9F 2E C0 2E 1C 18 A2 28 8A 85 65 AC 0B 7D F1 DD 0F ...
Which ends at 01AE2936 as:
21 1A 54 6D FC 34 3B 32 FA AA D6 71 8A BC 92 F9 95 79 75 8A E6 B5 A9 77 24 4A AC
1C E3 EF A2 9D 97 30 51 D1 7B EB 75 FD B2 8D 8A A7 B9 47 8A C6 59 1A 32 FB 9E 77
03 8E CA 67 23 B7 52 EE 2E A4 BA 43 CE F9 CD 46 48 C5 C4 41 35 32 F3 D6 5B CD BE
DA B8 B3 3E 1B 33 87 AE 65 A0 45 74 DF EB 37 96 2F DA 9C ...
This is clearly not the start of an NAL unit, since the byte FC that follows the would-be length field has the forbidden zero bit set.
However, at 01AE7535, we have the following:
00 00 27 EA 41 9A 14 29 81 29 7C 80 41 04 18 98 44 64 01 C6 54 00 0D 9F 34 58 71
E5 0A A6 CD B0 4B 38 60 7F E6 1F C8 00 24 7A 06 E5 9B 21 99 F0 51 24 9B ...
Which is the start of an NAL unit. I verified that those two NAL units are consecutive since filtering the file to the annex B h264 format removes the unknown data in between and places those two exact NAL units right next to each other.
I tried looking at ISO 14496 part 15 but it doesn't mention anything about this.
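The arithmetic behind this observation can be sketched in Python. This is a minimal illustration that assumes a 4-byte big-endian length prefix (the question allows 1, 2 or 4 bytes); the function names are hypothetical. It computes where the next unit should begin and splits a NAL header byte into its three fields.

```python
def next_unit_offset(offset: int, prefix: bytes, length_size: int = 4) -> int:
    """Offset of the byte that follows a length-prefixed NAL unit."""
    size = int.from_bytes(prefix[:length_size], "big")
    return offset + length_size + size

def parse_nal_header(byte: int) -> dict:
    """Split the one-byte H.264 NAL header into its bit fields."""
    return {
        "forbidden_zero_bit": byte >> 7,
        "nal_ref_idc": (byte >> 5) & 0x03,
        "nal_unit_type": byte & 0x1F,
    }

# Length prefix and header byte of the first unit quoted above.
start = 0x01ADF399
prefix = bytes.fromhex("00 00 35 99")
print(hex(next_unit_offset(start, prefix)))
print(parse_nal_header(0x41))  # header byte of the first unit
print(parse_nal_header(0xFC))  # fifth byte found at the computed offset
```

With these inputs the computed offset agrees with the 01AE2936 reported in the question, and 0xFC decodes with the forbidden zero bit set, which is why the data at that position cannot be a valid NAL header.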

RSA Private Key PEM in ASN.1 Format Contains Extra Bytes

I'm looking at an RSA private key in PEM format. When I decode the base64 string and review the components of the key, some of them have an extra byte, specifically the modulus, P, DP, and IQ. They all have a leading 0x00 byte. I'm handling this by just trimming the byte[] down to the expected lengths of 256 or 128 so I can use them with .NET RSAParameters and RSACryptoServiceProvider, but wondering why some of these INTEGER structures have the extra byte while others don't. It would appear that online and other libraries that decode the PEM to XML, etc handle this gracefully, so is this part of the RFC, just something that you have to protect against, or only a concern because I'm using the .NET libraries after decoding? Here's an example of the modulus with 257 bytes:
00 B7 55 AA 3F 14 89 BC CE ED AF 80 1C 54 2A DF
AB 3C 6A 44 B4 55 58 90 0E 0D 32 96 E6 EF 35 2D
AD B7 44 A7 AB CE 6F D3 BB 9D B4 4B FD 0A DE 87
96 03 55 23 81 49 FE 1B 3E CE 62 B6 2F B1 4C 33
E4 F8 C2 09 5F 0E 10 78 22 D0 F3 C9 BF B9 AC AC
11 00 17 28 09 23 10 D5 8A C9 2B E2 86 96 A7 E2
57 68 D7 3B 63 BE 74 ED B8 02 E2 63 EF F5 40 85
0C A6 9F D0 B6 88 36 8B 4E 6B 35 27 BE 11 CC C8
C3 0A 66 25 E0 AB B6 DD 6D E6 2B AF 9E 1C D7 11
CE 5F E7 C8 1F EB 3D 79 B3 B2 E1 FF D8 20 6D 76
A2 43 9E 20 67 58 97 39 46 D8 73 F6 F0 76 01 E0
61 8E 4A EE C4 03 A6 44 C7 D3 50 E3 C8 62 CF 33
D1 37 6B 85 F5 D4 3C 6D 1F 1A 14 B3 30 B5 E0 82
A5 94 83 4F 7A 17 DA 86 2B F7 2A 47 A3 5F D2 D5
7B 96 32 86 27 5E 2A 6A 85 6E C6 24 15 A9 09 65
BB 04 8C 0D 39 F7 15 D4 F0 F8 5F 0F B0 1D A7 2F
D7
Here's an example of the "D" parameter that is not padded with a leading 0x00:
04 07 EF 8A 5D 88 3D C7 8B 00 5D DF C1 96 03 BE
FF 20 1D 0C A8 07 BF 7B 1F 9D 2A 26 3F C2 3A 93
E4 40 B5 33 18 E1 EA 94 E8 7D C0 61 EF F8 3E A0
F4 C7 CD 75 0D 4C 72 0A EA 7C CF 26 B3 4E 4A A1
D1 3A 6A FA 55 11 D5 A2 66 57 C5 EA DA 49 4A AB
41 06 41 52 1A 1C 47 A5 BA 90 A5 75 72 20 94 E0
79 24 AA 60 A2 12 6E 1B AA AC 91 A7 F8 0B 88 21
64 14 85 81 4D F3 6D 12 B7 56 BE DD F6 04 3B B1
CC 95 A6 8C 9D A6 8D BF 05 C1 72 A4 0B 03 75 F6
40 B6 8E 25 91 3D 87 84 CD 23 EF 2C 29 13 DD A7
75 6E 48 F4 DE 49 98 4F B7 09 CF 5A A3 F5 39 05
37 C8 2B 79 64 F0 B8 AD 11 EF 79 FD 78 C0 6B 2B
50 7F DE BC 59 3D D1 A1 90 59 B7 7E 57 B4 2C A0
D2 20 D2 D6 7C 4A B3 3C 63 5D FA E6 67 18 58 AC
F3 EF 0E E1 C0 C9 B6 D9 8C D1 8E 3D CE 8A FF F0
12 BF C2 FE 72 DC 07 E4 3C 00 5B BE 05 D9 5A 61
And the DP parameter without leading 0:
3E 50 B2 28 A3 B1 71 F3 D5 31 B1 2D FD B3 60 4B
57 F8 C1 46 C7 89 B7 95 F4 7D AE 54 F2 EA 11 98
F7 61 93 30 50 D9 24 19 BF 7F 06 19 DB 97 01 06
8B 20 D7 7A 5E 1A FA 76 9A 0E 27 46 AB FF 25 3C
74 61 E2 9B 3E CE A5 F9 58 40 70 15 94 F2 58 3E
DB E4 90 91 3C 50 B0 24 8F C7 A7 55 EB E3 59 A7
5D 01 19 29 4F F9 F9 E6 EB 78 D1 93 14 61 E4 5C
36 D7 E7 82 58 E7 C5 60 21 F3 1E 5A D4 49 C6 D1
The RSA modulus is a positive number, and ASN.1 integers are all signed.
So if the leading 0x00 wasn't there, this byte encoding would represent a negative number because the first byte would have the high bit set (0xB7 >= 0x80). As a consequence, the 0x00 gets inserted into the DER data stream.
.NET's representation is based on the Windows CAPI representation. CAPI uses domain knowledge to know that the values are all unsigned integers, and then omits the leading 0x00 bytes. So it's up to whoever translates between the DER data and the .NET/CAPI data to add or remove the bytes as needed.
These values are encoded as the ASN.1 INTEGER type, which uses two's-complement notation. That is, if the most significant bit is set to 1, the number is negative. However, all numbers in the key (modulus, exponents, primes) are positive, so an extra leading zero octet is prepended where needed to mark the integer as positive. If the most significant bit is already 0, no extra byte is added.
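The rule described in these answers can be sketched in Python. This is an illustrative sketch only; both function names are hypothetical. One function produces the content bytes of a DER INTEGER for a non-negative value, and the other converts such content back to a fixed-length unsigned buffer of the kind expected by APIs like RSAParameters.

```python
def der_integer_content(value: int) -> bytes:
    """Content octets of a DER INTEGER for a non-negative value."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    length = max(1, (value.bit_length() + 7) // 8)
    raw = value.to_bytes(length, "big")
    if raw[0] & 0x80:
        # High bit set: prepend 0x00 so the value is not read as negative.
        raw = b"\x00" + raw
    return raw

def to_fixed_length(content: bytes, expected_len: int) -> bytes:
    """Drop a sign byte or left-pad with zeros to reach expected_len bytes."""
    return int.from_bytes(content, "big").to_bytes(expected_len, "big")

print(der_integer_content(0xB755).hex())           # first byte 0xB7 >= 0x80
print(der_integer_content(0x0407).hex())           # first byte 0x04 < 0x80
print(to_fixed_length(b"\x00\xb7\x55", 2).hex())   # sign byte removed
```

Converting through an integer rather than slicing off the first byte also covers the opposite case, where a value happens to be shorter than expected and needs left padding.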

Inspecting binary over sockets

I'm using WireShark to inspect data sent/received over a web-socket, however, all I see is nonsense.
0000 1c 74 0d 7d 42 24 d8 5d e2 26 c1 7d 08 00 45 00 .t.}B$.].&.}..E.
0010 00 3c 75 4e 40 00 80 06 22 eb c0 a8 01 c0 4f 89 .<uN@...".....O.
0020 50 91 c4 f1 0f 78 72 e5 d0 f4 ea 5e 6e e2 50 18 P....xr....^n.P.
0030 00 40 91 b3 00 00 c2 8e 6d 06 87 95 7f 76 78 62 .@......m....vxb
0040 92 f9 54 2a 92 f9 b4 95 6c 06 ..T*....l.
I've seen this type of output before. The left is a line of binary, and the right is the decoded string (ASCII), right?
Is this data obfuscated/encrypted?
Is it possible to get cogent information from my socket?
Also, what do the [FIN] and [MASKED] flags mean?
If you copy and paste the data you supplied into a text file and append a line beginning with 0050 with nothing following it, you can then run text2pcap -a infile.txt outfile.pcap to convert the data to a pcap file that Wireshark can read and decode for you.
See the text2pcap man page for more information about this tool.
I have done this and the packet appears to be a simple TCP segment. There is no [FIN] or [MASKED] flag at the TCP level, only PSH and ACK. For information about these TCP flags, refer to RFC 793, section 3.1, as well as the other RFCs mentioned at the top, which update this one.
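The flag check can also be reproduced by hand from the dump in the question. The Python sketch below assumes an Ethernet II frame carrying IPv4 with no VLAN tag; the function name is hypothetical. It reads the IPv4 header length and then the flags byte of the TCP header.

```python
# Flag bits of the TCP header, lowest bit first.
TCP_FLAGS = [(0x01, "FIN"), (0x02, "SYN"), (0x04, "RST"),
             (0x08, "PSH"), (0x10, "ACK"), (0x20, "URG")]

def tcp_flag_names(frame: bytes) -> list:
    """Return the names of the TCP flags set in an Ethernet/IPv4/TCP frame."""
    ip_start = 14                           # Ethernet II header is 14 bytes
    ihl = (frame[ip_start] & 0x0F) * 4      # IPv4 header length in bytes
    tcp_start = ip_start + ihl
    flags = frame[tcp_start + 13]           # flags byte of the TCP header
    return [name for bit, name in TCP_FLAGS if flags & bit]

# The 74 bytes shown in the question.
frame = bytes.fromhex("""
1c 74 0d 7d 42 24 d8 5d e2 26 c1 7d 08 00 45 00
00 3c 75 4e 40 00 80 06 22 eb c0 a8 01 c0 4f 89
50 91 c4 f1 0f 78 72 e5 d0 f4 ea 5e 6e e2 50 18
00 40 91 b3 00 00 c2 8e 6d 06 87 95 7f 76 78 62
92 f9 54 2a 92 f9 b4 95 6c 06
""")
print(tcp_flag_names(frame))
```

For this frame the flags byte is 0x18, which corresponds to PSH and ACK, in line with what Wireshark reports after the text2pcap conversion.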

Identifying an unknown data encoding?

I'm trying to understand an undocumented API I have discovered, and I can't get past the format of the data that is being returned.
Here is an example of what I get back when I perform a GET on the url I'm looking at:
A+uZL4258wXdnWztlEPJNXtdl3Tu4hRITtW2AUwQHUK5c6BATSBU/XsQEVIttCpI7wrW/oXWiBloT8+cdtUWBag3mzk3cLohKPvi7PWpf7jqCSbjNGh+5Iv5Gb8by2k31kp62sfwZ+i8r/3TA6nGrnJb6edOB7d0c6F34RTFRrrZSeJtiWYXAJ5JeD3yJY+C
At first I thought this was base64 encoded, but that just gives me back gibberish:
echo -n "<above snippet>" | base64 -D
?/???ݝl?C?5Vy??????,?8?s?#M T?{R-?*H?
???ֈhOϜv??7?97p?!(??????? &?4h~???i7?Jz???g輯???Ʈr[??N?ts?w??F??I?m?f?Ix=?%?
When I strip the URL down to just the domain, I get a website with Cyrillic text. Maybe the data could be converted to Cyrillic somehow?
Does this data format look familiar to you?
I'll continue to keep trying and report back if I make any progress.
This is definitely base64, because of the / and + characters.
When you decode that string using base64, you get this hexdump:
00000000 03 eb 99 2f 8d b9 f3 05 dd 9d 6c ed 94 43 c9 35 |.../......l..C.5|
00000010 7b 5d 97 74 ee e2 14 48 4e d5 b6 01 4c 10 1d 42 |{].t...HN...L..B|
00000020 b9 73 a0 40 4d 20 54 fd 7b 10 11 52 2d b4 2a 48 |.s.@M T.{..R-.*H|
00000030 ef 0a d6 fe 85 d6 88 19 68 4f cf 9c 76 d5 16 05 |........hO..v...|
00000040 a8 37 9b 39 37 70 ba 21 28 fb e2 ec f5 a9 7f b8 |.7.97p.!(.......|
00000050 ea 09 26 e3 34 68 7e e4 8b f9 19 bf 1b cb 69 37 |..&.4h~.......i7|
00000060 d6 4a 7a da c7 f0 67 e8 bc af fd d3 03 a9 c6 ae |.Jz...g.........|
00000070 72 5b e9 e7 4e 07 b7 74 73 a1 77 e1 14 c5 46 ba |r[..N..ts.w...F.|
00000080 d9 49 e2 6d 89 66 17 00 9e 49 78 3d f2 25 8f 82 |.I.m.f...Ix=.%..|
This just looks like 144 bytes of random data. And whenever you call this API URL again, you get a different string, although it starts with the same few characters.
Perhaps you should ask the maintainers of that website how to use their API. Maybe this string is some session ID that you should use in further calls.
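The decoding step can be reproduced with a few lines of Python. This is only a sketch using the string from the question; it decodes the base64 text and reports the length and leading bytes, which is usually the first thing to check before guessing at a format.

```python
import base64

# The response body quoted in the question.
encoded = (
    "A+uZL4258wXdnWztlEPJNXtdl3Tu4hRITtW2AUwQHUK5c6BATSBU/XsQEVIttCpI"
    "7wrW/oXWiBloT8+cdtUWBag3mzk3cLohKPvi7PWpf7jqCSbjNGh+5Iv5Gb8by2k3"
    "1kp62sfwZ+i8r/3TA6nGrnJb6edOB7d0c6F34RTFRrrZSeJtiWYXAJ5JeD3yJY+C"
)

# validate=True rejects characters outside the base64 alphabet.
decoded = base64.b64decode(encoded, validate=True)

print(len(decoded))              # number of decoded bytes
print(decoded[:16].hex(" "))     # first row of the hexdump
```

The decoded bytes contain no obvious magic number or readable text, which is consistent with the answer's guess that this is an opaque token or encrypted data rather than a document in some text encoding.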

How to insert public key and hash signature generated in smart card in a CSR with openssl API's

1) I am generating a key file and a CSR with the help of openssl commands.
When displaying the CSR information with the command "openssl req -in test_csr.pem -noout -text", I get the following output:
Certificate Request:
Data:
Version: 0 (0x0)
Subject: C=GB, O=Test
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
RSA Public Key: (2048 bit)
Modulus (2048 bit):
00:a6:af:51:e9:23:65:50:27:14:83:f5:c8:11:10:
b1:03:0b:c7:0d:2d:ae:09:81:d9:f8:31:ad:8e:d7:
8e:65:a8:e0:d4:b4:7e:f9:3e:99:fa:b0:43:5d:e0:
41:7a:ee:9f:90:3d:05:c0:6f:80:bb:bb:9e:dd:64:
1e:15:89:0c:bc:e6:3d:76:4e:d0:ef:5c:e4:de:34:
00:d0:ac:5c:e4:f8:73:b7:22:12:81:30:28:85:cd:
5a:bb:d6:28:c3:dc:01:67:f5:56:3a:3f:01:f3:d7:
8f:d9:19:67:90:1e:23:24:b0:58:e9:80:44:c9:36:
ae:2b:c3:81:a3:ce:de:af:8b:32:33:7d:f7:81:d7:
80:b8:d2:97:ce:8b:f3:21:2b:e8:e2:96:d0:b1:3f:
cc:dc:18:18:c1:e7:99:81:2a:e9:45:20:b7:80:39:
b3:5d:b3:ab:61:6a:61:f3:e1:7c:32:b7:a8:29:1a:
b2:e1:02:81:42:1f:b4:c3:7f:bf:21:f6:2d:4f:ec:
19:d4:3a:d4:bf:90:8a:3b:f0:24:cf:83:1b:21:ab:
b2:cb:15:38:f2:ac:1d:80:ba:33:2b:c8:f4:8d:52:
90:7a:25:2b:e5:08:68:a2:f2:84:61:2f:24:48:a9:
25:97:85:28:64:52:f9:15:91:eb:36:c6:d9:98:08:
09:d3
Exponent: 65537 (0x10001)
Attributes:
a0:00
Now when I open the key file in DER format with a hex editor, I get the following data:
30 82 01 22 30 0D 06 09 2A 86 48 86 F7 0D 01 01 01 05 00 03 82 01 0F 00 30 82 01 0A 02 82 01 01 00 A6 AF 51 E9 23 65 50 27 14 83 F5 C8 11 10 B1 03 0B C7 0D 2D AE 09 81 D9 F8 31 AD 8E D7 8E 65 A8 E0 D4 B4 7E F9 3E 99 FA B0 43 5D E0 41 7A EE 9F 90 3D 05 C0 6F 80 BB BB 9E DD 64 1E 15 89 0C BC E6 3D 76 4E D0 EF 5C E4 DE 34 00 D0 AC 5C E4 F8 73 B7 22 12 81 30 28 85 CD 5A BB D6 28 C3 DC 01 67 F5 56 3A 3F 01 F3 D7 8F D9 19 67 90 1E 23 24 B0 58 E9 80 44 C9 36 AE 2B C3 81 A3 CE DE AF 8B 32 33 7D F7 81 D7 80 B8 D2 97 CE 8B F3 21 2B E8 E2 96 D0 B1 3F CC DC 18 18 C1 E7 99 81 2A E9 45 20 B7 80 39 B3 5D B3 AB 61 6A 61 F3 E1 7C 32 B7 A8 29 1A B2 E1 02 81 42 1F B4 C3 7F BF 21 F6 2D 4F EC 19 D4 3A D4 BF 90 8A 3B F0 24 CF 83 1B 21 AB B2 CB 15 38 F2 AC 1D 80 BA 33 2B C8 F4 8D 52 90 7A 25 2B E5 08 68 A2 F2 84 61 2F 24 48 A9 25 97 85 28 64 52 F9 15 91 EB 36 C6 D9 98 08 09 D3 02 03 01 00 01
I observe that in addition to the key (from byte 33) as it is displayed in the previous step, there is extra data before the key (the first 32 bytes) and after the key (the last 5 bytes).
Does somebody know where the extra information comes from and how to decrypt it?
2) I have to test a configuration where the key pair (private and public) and the hash signature are generated in a smart card with the help of vendor APIs. With a first API I get the public key and its length from the smart card. With a second API I get the hash signature data and its length.
I guess that the Public key can be inserted in the CSR with openssl X509_REQ_set_pubkey API (is it correct?).
The question is: Is there an existing openssl API I can use to insert the hash signature in the CSR (something like X509_REQ_sign but without hashing and signature process that has already been done by the smart card).
Thanks.
P.L.
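The first 32 bytes are ASN.1 (DER) headers describing the key: the outer SEQUENCE, the algorithm identifier (the rsaEncryption OID with a NULL parameter), the BIT STRING wrapper, the inner SEQUENCE, and the tag and length of the modulus INTEGER.
The last 5 bytes are the RSA public exponent, 65537, in ASN.1 encoding.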
To get more information use ASN.1 decoder (or openssl asn1parse command).
Unfortunately I don't know of such a function in OpenSSL and don't have time to dig into the sources, but it is at least possible to build the CSR ASN.1 structure manually, and that is not that hard.
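To see what the extra bytes in the DER file are without a full ASN.1 decoder, a tiny tag-length reader is enough. The Python sketch below is an illustration only: read_tlv is a hypothetical helper that handles single-byte tags and definite lengths, and the offsets are picked by hand for the 32 header bytes and 5 trailing bytes quoted in the question.

```python
def read_tlv(data: bytes, offset: int):
    """Return (tag, length, value_offset) for the DER element at offset."""
    tag = data[offset]
    first = data[offset + 1]
    if first < 0x80:
        # Short form: the byte itself is the length.
        return tag, first, offset + 2
    # Long form: low 7 bits give the number of length bytes that follow.
    n = first & 0x7F
    length = int.from_bytes(data[offset + 2:offset + 2 + n], "big")
    return tag, length, offset + 2 + n

# First 32 bytes and last 5 bytes of the key file from the question.
header = bytes.fromhex(
    "30 82 01 22 30 0D 06 09 2A 86 48 86 F7 0D 01 01 01 05 00"
    " 03 82 01 0F 00 30 82 01 0A 02 82 01 01"
)
tail = bytes.fromhex("02 03 01 00 01")

# Offset 23 is skipped: it holds the BIT STRING's unused-bits count.
for offset in (0, 4, 6, 17, 19, 24, 28):
    tag, length, value_at = read_tlv(header, offset)
    print(f"offset {offset:2d}: tag 0x{tag:02X}, length {length}")

tag, length, value_at = read_tlv(tail, 0)
print(int.from_bytes(tail[value_at:value_at + length], "big"))
```

Tags 0x30, 0x06, 0x05, 0x03 and 0x02 are SEQUENCE, OBJECT IDENTIFIER, NULL, BIT STRING and INTEGER respectively. The last header element is an INTEGER of length 257, which is the 256-byte modulus plus its leading 0x00, and the tail decodes to the exponent. The openssl asn1parse command mentioned above shows the same structure.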