onMetaData marker in FLV file - flv

I want to know what the onMetaData marker in FLV files looks like. When I open an FLV file as plain text I get this:
FLV[][][][][](TAB)[][][][][][][]8[][][][][][][][][]
onMetaData[]
duration...
The docs say the first 3 bytes are the signature "FLV", the next byte gives the FLV version, and the next byte tells whether audio or video tags are present. The next 4 bytes are the data offset (the size of the header), which is 9; in ASCII that is the TAB code. After the TAB the body starts with the first "previous tag size" field, which is 0 (4 bytes). Next come the tag type (1 byte), the data size (3 bytes), the timestamp (4 bytes) and the stream ID (always 0, 3 bytes). After that, this remains:
[]
onMetaData[]
[][][][][][]
duration...
I suppose the onMetaData marker is (1 byte, newline) onMetaData (1 byte, newline), but what are the 7 bytes between the onMetaData marker and duration?

You would need to view this file in a hex editor to get anything useful from it; a text editor will just show you unprintable characters.
The ASCII "onMetaData" bit in the file is an AMF0 string that names the script-data object wrapping the "duration" field. The byte immediately after "onMetaData" (0x08) marks an ECMA array, the next 4 bytes are the array's element count (uint32, big-endian), and the 2 bytes after that ("\x00\x08") give the length of the name of the next property, which is "duration".

I suggest you use the hexedit tool: http://www.hexedit.com/
It lets you see all the data as strings alongside the raw bytes, and it has very nice navigation for analyzing bytes.
In addition, use https://www.adobe.com/content/dam/Adobe/en/devnet/flv/pdfs/video_file_format_spec_v10.pdf to get details about every byte in an FLV file.

Remember that the metadata is encoded using AMF. This means that after the string "onMetaData" you have a 0x08 byte to signify the start of an ECMA array, then 4 bytes giving the number of elements, and then 2 bytes giving the length of the first element's name in bytes. That accounts for the 7 bytes before "duration".
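To make the byte layout concrete, here is a minimal Python sketch that reads the start of a script-data tag body as AMF0. The function name `parse_onmetadata_start` and the sample bytes are made up for illustration (the sample is hand-built, not taken from the asker's file), and only the first property is read, assuming it is a number such as "duration".

```python
import struct

def parse_onmetadata_start(body):
    """Read the name, array count and first property of an AMF0 script-data body."""
    pos = 0
    # AMF0 string: type marker 0x02, uint16 big-endian length, then the bytes.
    if body[pos] != 0x02:
        raise ValueError("expected AMF0 string marker 0x02")
    (name_len,) = struct.unpack_from(">H", body, pos + 1)
    pos += 3
    name = body[pos:pos + name_len].decode("ascii")
    pos += name_len
    # ECMA array: type marker 0x08, then a uint32 big-endian element count.
    if body[pos] != 0x08:
        raise ValueError("expected AMF0 ECMA array marker 0x08")
    (count,) = struct.unpack_from(">I", body, pos + 1)
    pos += 5
    # First property: uint16 name length, the name, then a typed value.
    (key_len,) = struct.unpack_from(">H", body, pos)
    pos += 2
    key = body[pos:pos + key_len].decode("ascii")
    pos += key_len
    value = None
    if body[pos] == 0x00:  # AMF0 number: 8-byte big-endian double
        (value,) = struct.unpack_from(">d", body, pos + 1)
    return name, count, key, value

# Hand-built sample body (hypothetical values).
sample = (b"\x02\x00\x0a" b"onMetaData"
          b"\x08\x00\x00\x00\x01"
          b"\x00\x08" b"duration"
          b"\x00" + struct.pack(">d", 12.5))
print(parse_onmetadata_start(sample))
```

Note that the length of "onMetaData" is 10, i.e. 0x0A, which is why a text editor shows a newline just before it.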

Related

Has anyone ever heard of asciihex encoding?

This type of encoding is used in SOAP messages.
I'm receiving a message encoded in ASCIIHEX and I have no idea how this encoding actually works, although I have a clear description of the encoding method:
"If this mode is used, every single original byte is encoded as a sequence of two characters representing it in hexadecimal. So, if the original byte was 0x0a, the transmitted bytes are 0x30 and 0x41 (‘0’ and ‘a’ in ASCII)."
The buffer received : "1f8b0800000000000000a58e4d0ac2400c85f78277e811f2e665329975bbae500f2022dd2978ff95715ae82cdcf9415efec823c6710247582d5965c32c65aab0f5fc0a5204c415855e7c190ef61b34710bcdc7486d2bab8a7a4910d022d5e107d211ed345f2f37a103da2ddb1f619ab8acefe7fdb1beb6394998c7dfbde3dcac3acf3f399f3eeae152012e010000"
The actual file contains this : "63CD13C1697540000000662534034000030000120011084173878R 00000001000018600050000000100460000009404872101367219 000000000000 DNSO_038114 000000002001160023Replacem000000333168625 N0000 00000000"
The provider sent me the file that contains the string above. I tried to start from the buffer string and reproduce the content sent by the provider, but without success. I also searched for this "asciihex" encoding and found nothing. If someone knows anything about this encoding or can give me any advice, I would really appreciate it. I have pretty much no experience with SOAP services.
Based on the comments above, it's possible the buffer is compressed. It starts with 1F 8B which is a signature for GZIP compression. See the following list of signatures.
Write the bytes that correspond to the hex strings into a file. Name that file with a gz or tar.gz extension and try to extract it or open it with some file archiver tool.
Another thing you could try would be to not send the Compress element in your request, assuming it's an optional field and you can do that. If you can, check if the buffer changes and has the proper length and you can see similar patterns as the original content (for those zeros at the end, for example).
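The hex-decoding and GZIP check described above can also be done in a few lines of Python instead of writing a file by hand. This is only a sketch: `decode_asciihex_gzip` is a made-up helper name, and the round-trip sample below is synthetic rather than the buffer from the question.

```python
import gzip

def decode_asciihex_gzip(hex_text):
    """Turn an ASCII-hex string back into bytes, gunzipping if it looks like GZIP."""
    raw = bytes.fromhex(hex_text)       # two hex characters -> one byte
    if raw[:2] != b"\x1f\x8b":          # GZIP signature
        return raw                      # not compressed, return as-is
    return gzip.decompress(raw)

# Synthetic round trip: compress, hex-encode, then decode again.
original = b"example payload 000000001"
encoded = gzip.compress(original).hex()
print(encoded[:4])                      # starts with the 1f8b signature
print(decode_asciihex_gzip(encoded))
```

If `gzip.decompress` raises an error on the real buffer, the hex string may be truncated or the payload may use a different compression scheme.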

ITEXT PDFReader not able to read PDF

I am not able to read a PDF file using the iText PdfReader. The PDF appears to be valid when I open it in a viewer.
URL Of PDF: http://www.fundslibrary.co.uk/FundsLibrary.DataRetrieval/Documents.aspx?type=fund_class_kiid&id=f096b13b-3d0e-4580-8d3d-87cf4d002650&user=fidelitydocumentreport
The PDF in question is encrypted.
According to the PDF specification,
Encryption applies to all strings and streams in the document's PDF file, with the following exceptions:
The values for the ID entry in the trailer
Any strings in an Encrypt dictionary
Any strings that are inside streams such as content streams and compressed object streams, which themselves are encrypted
Later on there is information on special cases in which the document-level metadata stream is not encrypted either, or in which only attachments are encrypted.
The Cross-Reference Stream Dictionary of the PDF looks like this:
<<
/Root 101 0 R
/Info 63 0 R
/XRef(stream)
/Encrypt 103 0 R
/ID[<D034DE62220E1CBC2642AC517F0FE9C7><D034DE62220E1CBC2642AC517F0FE9C7>]
/Type/XRef
/W[1 3 2]
/Index[0 107]
/Size 107
/Length 642
>>
As you can see, there is a non-encrypted string here, (stream), which is neither the value for the ID entry, nor in an Encrypt dictionary, nor inside a stream. Furthermore, the aforementioned special cases do not apply here either.
Thus, this file violates the PDF specification here. Therefore, this file is not a valid PDF.
Furthermore, according to the PDF specification
The last line of the file shall contain only the end-of-file marker, %%EOF.
The file at hand ends like this
Thus, the last line of the file contains something other than the end-of-file marker (which is on the line before): a 0x06 and a 0x0c.
The file, therefore, violates the PDF specification here, too.
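The trailing-bytes check can be reproduced without iText. The following Python sketch returns whatever follows the last %%EOF marker; `bytes_after_eof` is a made-up helper and the sample bytes are a hand-built stand-in for the end of the file, not the actual document.

```python
def bytes_after_eof(pdf_bytes):
    """Return the bytes that follow the last %%EOF marker."""
    marker = b"%%EOF"
    idx = pdf_bytes.rfind(marker)
    if idx < 0:
        raise ValueError("no %%EOF marker found")
    return pdf_bytes[idx + len(marker):]

# Hand-built sample: a marker, a line break, then two stray bytes.
sample = b"%PDF-1.5\n...\n%%EOF\r\n\x06\x0c"
tail = bytes_after_eof(sample)
extra = tail.strip(b"\r\n")   # ignore the line break itself
print(extra)                  # anything left over is extra content after the marker
```

For a real file, reading only the last kilobyte or so is enough for this check.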

DFM file became binary and infected

We have a DFM file which began as a text file.
After some years, in one of our newer versions, Borland Developer Studio changed it into binary format.
In addition, the file became infected.
Can someone explain what I should do now? Where can I find out how the binary file structure is read?
Well, I found what happened to the DFM file, but I don't know why.
The change from text file to binary is a known occurrence and is covered in another Stack Overflow question. I'll describe only the infection of the file.
In Pascal, the original language of DFM files, a string is defined like this: the first byte is the length of the string (0-255) and the remaining bytes are the characters. (Unlike C, where the end of a string is marked by a null character.)
Someone (maybe BDS?), while changing the file from text to binary, also changed every string of length 13 (0D) to length 10 (0A). As a result, each such string ended after 10 characters, and the next character was read as the value of the property.
I downloaded a binary editor, fixed all the occurrences of length 10, and the file displayed and compiled correctly.
(Not only the properties' lengths were infected; one byte in the Icon.Data property had also been changed from 0D to 0A.)
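To illustrate why a single changed length byte breaks the file, here is a small Python sketch of reading a length-prefixed string as described above. The helper `read_short_string` and the property name "CaptionLabel1" are invented for the example; real DFM files contain more structure than this.

```python
def read_short_string(data, pos):
    """Read a Pascal-style string: one length byte, then that many characters."""
    length = data[pos]
    start = pos + 1
    return data[start:start + length], start + length

# A 13-character name stored with the correct length byte 0x0D.
good = bytes([0x0D]) + b"CaptionLabel1"
# The same bytes after the length byte was changed from 0x0D to 0x0A.
bad = bytes([0x0A]) + b"CaptionLabel1"

print(read_short_string(good, 0))
print(read_short_string(bad, 0))   # name cut to 10 chars; the rest is misread as data
```

With the damaged length byte, the reader stops three characters early, so everything after that point is parsed at the wrong offset.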

Album name gets corrupted when characters are Japanese

Please let me know the maximum length of "Album Name".
I'm developing a photo upload app with the Graph API.
When creating an album, the album name gets corrupted if the number of Japanese characters exceeds 21.
Below is the example of this issue.
e.g.
Input:
あいうえおかきくけこあいうえおかきくけこあい
Registered Album Name:
あいうえおかきくけこあいうえおかきくけこあ��
Note that the same issue occurs if more than 21 Korean or Chinese characters are set as Album Name.
It would appear that there is a length limit on this field. Guessing that they're using UTF-8, it would be a limit of 64 bytes, rather than an integral number of characters.
Facebook appear to be truncating the string at that number of bytes, regardless of whether that byte limit happens to align with a character boundary or not. This kind of misbehaviour is unfortunately common in languages that don't handle text strings as Unicode characters natively. In your case the last い takes up three bytes, but there's only room for two, so you get left with two trailing bytes that don't form a valid UTF-8 sequence, hence ��.
To stop this happening you'd have to do their job for them and impose the length limit in a Unicode-clean way. One way to do this would be to encode to UTF-8 yourself, do the truncation, and convert back to characters, ignoring the invalid end bytes. E.g. in Python 3:
>>> print('あいうえおかきくけこあいうえおかきくけこあい'.encode('utf-8')[:64].decode('utf-8', 'ignore'))
あいうえおかきくけこあいうえおかきくけこあ

Parsing H264 in mdat MP4

I have a file that only contains the mdat atom of an MP4 container. The data in the mdat is AVC data. I know the encoding parameters for the data. The format does not appear to be the Annex B byte stream format. I am wondering how I would go about parsing this. I have tried searching for the slice header, but have not had much luck.
Is it possible to parse the slices without the NALs?
AVC NAL units are in the following format in MDAT section:
[4 bytes] = NAL length, network order;
[NAL bytes]
In short, start codes are simply replaced by lengths.
Be careful! The NAL length field is not required to be 4 bytes! The AvcConfigurationBox ('moov/trak/mdia/minf/stbl/stsd/avc1/avcC') contains a field 'lengthSizeMinusOne' specifying the size of the length field (minus one). But the default is 4.
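As a rough sketch of the length-prefixed layout described above, the following Python splits an mdat payload into NAL units. The function name `split_nal_units` and the sample bytes are illustrative only; it assumes the payload contains nothing but AVC samples and that `length_size` has been taken from 'lengthSizeMinusOne' plus one (4 is assumed here).

```python
def split_nal_units(mdat_payload, length_size=4):
    """Split length-prefixed AVC data into a list of NAL unit byte strings."""
    units = []
    pos = 0
    while pos + length_size <= len(mdat_payload):
        # The length prefix is big-endian (network order).
        nal_len = int.from_bytes(mdat_payload[pos:pos + length_size], "big")
        pos += length_size
        if nal_len == 0 or pos + nal_len > len(mdat_payload):
            break  # truncated data, or not NAL data at this offset
        units.append(mdat_payload[pos:pos + nal_len])
        pos += nal_len
    return units

# Hand-built sample: two tiny fake NAL units with 4-byte length prefixes.
sample = b"\x00\x00\x00\x02\x65\x88" + b"\x00\x00\x00\x03\x41\x9a\x01"
for nal in split_nal_units(sample):
    # The low 5 bits of the first byte are the nal_unit_type.
    print(nal[0] & 0x1F, len(nal))
```

The payload passed in should start after the 8-byte 'mdat' box header (size and type).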
I found what michael was talking about defined in section 5.2.3 of ISO 14496-15.
Sebastian's answer refers to section 5.2.4.1.1 and 5.3.4.1.2.
You will not be able to parse the slices in the 'mdat' box without copies of the SPS and PPS from the 'avcC' box (defined in section 5.2.4.1.1)