What video containers are 'sequential' in their file encoding? - mp4

Suppose I have two video files with similar characteristics (file type, encoding, resolution, etc.) that start at the same point, but A runs for 10 seconds while B runs for 20. If A's file size is 10MB and B's is 20MB, and I read in, say, the first 5MB from both, will the major video encoding formats' binary sequences match for that 5MB?
E.G. MP4, AVI, MOV, WMV?

No. Different containers work differently, and the first X bytes will not contain the same number of frames. In some cases, such as MP4, you may get audio or metadata and no video at all, or you may get bytes that cannot be interpreted without information that comes later in the file.

Related

In an A/V stream, is the amount of data streamed constant or fluctuating?

The amount of activity in an A/V stream can vary. For instance, if the data being streamed is from an empty, silent room, there is much less going on than if the data is something like a loud and explosive video game.
What I am wondering is whether the actual amount of data going up and down differs depending on this subjective interpretation of "activity". In other words, am I downloading less data when watching a stream of the empty room versus the active video game? My hunch has always been a resounding "no"; after all, how would the program know the difference between the two?
I'm asking now, though, because I've noticed a difference when streaming video in the past. The video always seems to be fine during periods of subjectively "low" activity, and it begins to lag or skip during periods of "high" activity. Is this just coincidence, or is there actually some kind of algorithm or service in place which dilutes data in periods of low activity or something like that?
Well, the thing is that audio and video streams are compressed. They can be compressed with any one of a whole range of formats. Some formats will aim for a % reduction in size, some will set a quality value, others will perform the same steps whether the data is simple or complex.
Take for example the JPG and PNG formats. Open up your favourite editor and create a 640x480px image filled with pure white. Now save that file and look at its size. Then apply noise to the image and save it as a new file. Compare the two and see the huge difference in size.
I got 1.37kB for the white image and 331kB for the noisy one. (A single 8x8 or 16x16 tile can be repeated for the entire white image, while unique 8x8 or 16x16 blocks must be used for the noisy one.)
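You can reproduce the same effect with plain byte buffers and a general-purpose compressor. This sketch uses zlib on a uniform buffer and a random one (stand-ins for the white and noisy images above; the exact output sizes will vary):

```python
import os
import zlib

# A "blank image": 640*480 identical bytes; a "noisy image": the same number
# of pseudo-random bytes. Both are 307,200 bytes before compression.
flat = b"\xff" * (640 * 480)
noisy = os.urandom(640 * 480)

flat_size = len(zlib.compress(flat))
noisy_size = len(zlib.compress(noisy))

# The flat buffer collapses to a few hundred bytes; the random buffer
# barely shrinks at all (it may even grow slightly from header overhead).
print(flat_size, noisy_size)
```

The same asymmetry is what you observe in a stream: simple content compresses down to very little data, complex content doesn't.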
VBR (variable bit rate) and CBR (constant bit rate) are two terms used frequently when transcoding video (changing from one format to another).
Anyway - the answer is 'it depends on the format' - some formats do work like that, some don't.
The video card is always sending the same quantity of data to the screen each frame, even if there is very little information in it - it's uncompressed. Transmitted audio and video on the other hand are (almost) always compressed, so when there's less information, it takes less data to convey it.

MP4 slow start for large file

I have a server that streams MP4 (H.264). I use MP4Box to put the moov atom at the beginning of the file, with the default 500 ms interleave.
However, I noticed that at peak times, when the server is busy, the files start streaming more slowly, but not uniformly so: large videos (one hour or more) start much more slowly than small files.
I read about the moov atom being processed more slowly by lighttpd with an H.264 streaming module like mine...
Is there any way I can speed up playback start to about 2 seconds? Right now it's about 7 for large files...
You can use mp4parser to see which part of the moov box gets bigger as file size increases; then maybe you can look for a more compact way of representing that box. I think it is the sample size box (stsz). Also, consider segmenting the MP4 so that the header overhead is spread across the file. MP4Box does support segmenting MP4 files, but then you need to check whether your client is capable of understanding this format.
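To get a feel for why stsz dominates on long files: when samples have varying sizes, stsz stores one 32-bit entry per sample on top of a small fixed header. A back-of-the-envelope estimate (frame rates and durations here are illustrative, not taken from the question):

```python
# Rough size of the stsz (sample size) box when every sample needs its own
# 32-bit entry: 12 bytes of full-box header (size + type + version/flags),
# 8 bytes of fixed fields (sample_size + sample_count), then 4 bytes/sample.
def stsz_bytes(duration_s: float, fps: float) -> int:
    samples = int(duration_s * fps)
    return 12 + 8 + 4 * samples

print(stsz_bytes(60, 30))    # one-minute clip at 30 fps:   7220 bytes
print(stsz_bytes(3600, 30))  # one-hour movie at 30 fps:  432020 bytes
```

And that is for one track; audio samples add their own entries, which is why a front-loaded moov for an hour-long file can run into megabytes that must be fetched before playback starts.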

image and video compression

What are similar compressors to the RAR algorithm?
I'm interested in compressing videos (for example, avi) and images (for example, jpg)
WinRAR reduced an AVI video (1 frame/sec) to 0.88% of its original size (i.e. it was 49.8MB, and it went down to 442KB).
It finished the compression in less than 4 seconds.
So, I'm looking for a similar (open) algorithm. I don't care about decompression time.
Compressing "already compressed" formats is mostly pointless, because you can't gain much further. Some archivers even refuse to compress such files and store them as-is. If you really need to compress image and video files, you need to "recompress" them. That doesn't mean simply converting the file format; I mean decoding the image or video file to some extent (full decoding is not required) and applying a specific model of your own, with a stronger entropy coder, in place of the format's model. There have been several good attempts at this. Here is a short list:
PackJPG: Open-source, fast JPEG recompressor.
Dell's Experimental MPEG1 and MPEG2 Compressor: Closed source and proprietary, but you can at least test that experimental compressor's strength.
Precomp: Closed-source free software (though it is planned to be open-sourced in the near future). It recompresses GIF, BZIP2, JPEG (with PackJPG) and Deflate (only streams generated with the zlib library) streams.
Note that recompression is usually a very time-consuming process, because you have to ensure bit-identical restoration. Some programs (like Precomp) even check every possible parameter to ensure stability. Also, their models have to become more and more complex to gain something negligible.
Compressed formats like JPG can't really be compressed any further, since they are already close to their entropy limit; however, uncompressed formats like BMP, WAV, and AVI can.
Take a look at LZMA
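You can demonstrate the point about entropy directly with Python's built-in LZMA. Below, a highly redundant buffer stands in for uncompressed data (BMP/WAV/AVI-like), and a zlib-compressed random buffer stands in for already-compressed data (JPG-like); both are illustrative stand-ins, not real media files:

```python
import lzma
import os
import zlib

# ~64 KB of highly repetitive bytes: stand-in for uncompressed media data.
raw = b"RIFFdata" * 8192

# Incompressible data: zlib output of random bytes, stand-in for a JPG-like
# stream that has already been squeezed close to its entropy limit.
already = zlib.compress(os.urandom(65536))

raw_out = len(lzma.compress(raw))          # tiny: LZMA finds the repetition
already_out = len(lzma.compress(already))  # about the same size as the input
print(raw_out, already_out)
```

This is why WinRAR shredded the 1 fps AVI (lots of redundancy left in the container) but no general-purpose compressor will do much to a JPG.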

What is the maximum size of JPEG metadata?

Is there a theoretical maximum to the amount of metadata (EXIF, etc) that can be incorporated in a JPEG file? I'd like to allocate a buffer that is assured to be sufficient to hold the metadata for any JPEG image without having to parse it myself.
There is no theoretical maximum, since certain APP markers can be used multiple times (e.g. APP1 is used for both the EXIF header and also the XMP block). Also, there is nothing to prevent multiple comment blocks.
In practice, the one that much more commonly results in a large header is the APP2 marker used to store the ICC color profile for the image. Since some complicated color profiles can be several megabytes, the profile actually gets split across many APP2 segments (since each APP segment has a 16-bit length limit).
Each APPn data area has a 2-byte length field, and that length includes the field itself, so the biggest single segment payload is 65,533 bytes. If you are just worried about the EXIF data, it would be a bit less.
http://www.fileformat.info/format/jpeg/egff.htm
There are at most 16 different APPN markers in a single file. I don't think they can be repeated, so 16*65K should be the theoretical max.
Wikipedia states:
Exif metadata are restricted in size to 64 kB in JPEG images because according to the specification this information must be contained within a single JPEG APP1 segment.
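If you do end up parsing rather than guessing a buffer size, scanning the APPn segments is only a few lines. This is a simplified sketch (it ignores restart/standalone markers and stops at the first non-APPn marker); the demo stream is a hand-built hypothetical, not a real JPEG:

```python
import struct

def app_metadata_bytes(jpeg: bytes) -> int:
    """Sum the payload bytes of all APP0..APP15 segments in a JPEG stream.

    Each segment's 2-byte length field includes itself, so a single payload
    tops out at 65,533 bytes, but segments with the same marker can repeat.
    """
    assert jpeg[:2] == b"\xff\xd8"  # SOI marker
    total, i = 0, 2
    while i + 4 <= len(jpeg) and jpeg[i] == 0xFF:
        marker = jpeg[i + 1]
        if not (0xE0 <= marker <= 0xEF):  # stop at the first non-APPn marker
            break
        (length,) = struct.unpack(">H", jpeg[i + 2:i + 4])
        total += length - 2  # exclude the length field itself
        i += 2 + length      # skip marker bytes + segment
    return total

# Hypothetical stream: SOI, two APP1 segments of 10 payload bytes each, then SOS.
demo = (b"\xff\xd8"
        + b"\xff\xe1" + struct.pack(">H", 12) + b"A" * 10
        + b"\xff\xe1" + struct.pack(">H", 12) + b"B" * 10
        + b"\xff\xda")
print(app_metadata_bytes(demo))  # 20
```

Note the repeated APP1 segments: this is why summing segment lengths, rather than assuming one 64 kB block, is the safe approach.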

How do you use afconvert to convert from wav to aac caf WITHOUT RESAMPLING

I'm making an iPhone game. We need to use a compressed format for sound, and we want to be able to loop SEAMLESSLY back to a specific sample in the audio file (so there is an intro, then it loops back to an offset).
Currently, THE ONLY export process I have found that allows seamless looping (reports the right priming and padding frame numbers, no clicking when looping, etc.) is using Apple's afconvert to an AAC format in a CAF file.
But when we try to encode at lower bitrates, it automatically resamples the sound! We do NOT want the sound resampled; every other encoder I have encountered has an option to set the output sample rate, but I can't find it for this one.
On another note, if anyone has had any luck with seamless looping of a compressed file format using Audio Queues, let me know.
currently I'm working off the information found at:
http://developer.apple.com/mac/library/qa/qa2009/qa1636.html
Note that this DID work PERFECTLY when I left the encode bitrate at the default (~128kbps), but when I set it to 32kbps with the -b option, it resampled, and looping clicks now.
It needs to be at least 48kbps; at 32kbps, afconvert will downsample to a lower sample rate.
I think you are confusing sample rate (typical values: 32kHz, 44.1kHz, 48kHz) and bit rate (typical values: 128kbps, 160kbps, 192kbps).
For a bit rate, 32kbps is extremely low; sound will have poor quality at that bit rate. You probably intended to set the sample rate to 32kHz instead, which is also not exactly typical, but makes more sense.
When you compress to AAC and decompress back to WAV, you will not get the same audio file back, because AAC represents the audio data in a completely different way than raw wave data. For example, you can get shifts of a few microseconds, which are necessary for conversion to the compressed format. You cannot completely avoid this with any highly compressed format.
The clicking sound originates from the sudden change between two samples that are played in direct succession. This is likely happening because the offset you jump back to in your loop does not end up at exactly the same position in the AAC file as it was in the WAV file (as explained above, there can be shifts of microseconds).
You will not get around these slight changes when compressing. Instead, you have to compensate for them after compression by adjusting the offset. That means you have to open the compressed sound file in an audio editor, e.g. Audacity, and manually find another offset close to the original one, which is suitable for looping.
How to find an offset which is suitable for looping?
Zoom in to the waveform's end. Look at how the waveform looks there. Then zoom in to the waveform at the original offset and search in its neighbourhood for an offset at which the waveform connects seamlessly to the end of the waveform.
For an example of how this should look, open the uncompressed audio file in the audio editor and examine the end of the waveform and the offset there.