What is the maximum size of JPEG metadata?

Is there a theoretical maximum to the amount of metadata (EXIF, etc) that can be incorporated in a JPEG file? I'd like to allocate a buffer that is assured to be sufficient to hold the metadata for any JPEG image without having to parse it myself.

There is no theoretical maximum, since certain APP markers can be used multiple times (e.g. APP1 is used for both the EXIF header and also the XMP block). Also, there is nothing to prevent multiple comment blocks.
In practice, the marker most likely to produce a large header is APP2, which is used to store the image's ICC color profile. Complicated color profiles can run to several megabytes, so the profile gets split across many APP2 segments (each APP segment has a 16-bit length field).
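If you decide to measure rather than guess, the marker walk is short. Here is a minimal sketch in Python (stdlib only; metadata_size is a made-up helper name, and the loop ignores corner cases such as fill bytes before markers):

    import struct

    def metadata_size(path):
        """Sum the sizes of all APPn (FFE0-FFEF) and COM (FFFE) segments
        in a JPEG file by walking the marker stream up to the first SOS."""
        total = 0
        with open(path, "rb") as f:
            if f.read(2) != b"\xff\xd8":          # SOI: file must start here
                raise ValueError("not a JPEG file")
            while True:
                marker = f.read(2)
                if len(marker) < 2 or marker[0] != 0xFF:
                    break
                kind = marker[1]
                if kind in (0xD9, 0xDA):          # EOI or SOS: stop scanning
                    break
                if kind == 0x01 or 0xD0 <= kind <= 0xD7:
                    continue                      # standalone markers, no length
                (length,) = struct.unpack(">H", f.read(2))
                if 0xE0 <= kind <= 0xEF or kind == 0xFE:   # APPn or COM
                    total += 2 + length           # marker bytes + segment
                f.seek(length - 2, 1)             # skip the rest of the payload
        return total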

Each APPn segment has a 2-byte length field, so a single segment tops out at 65,535 bytes (and since the length field counts itself, the payload is at most 65,533 bytes). If you are only worried about the EXIF data, the limit is slightly lower still.
http://www.fileformat.info/format/jpeg/egff.htm
There are 16 different APPn markers (APP0 through APP15). If none were repeated, 16 × 64 KB would be the theoretical max, but as noted above, some markers can appear more than once.

Wikipedia states:
Exif metadata are restricted in size to 64 kB in JPEG images because according to the specification this information must be contained within a single JPEG APP1 segment.
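Putting numbers on those limits (the length field counts its own two bytes, and the six-byte Exif identifier is part of the APP1 payload):

    MAX_SEGMENT = 0xFFFF               # largest value of the 2-byte length field
    MAX_PAYLOAD = MAX_SEGMENT - 2      # the length field counts itself
    MAX_EXIF = MAX_PAYLOAD - len(b"Exif\x00\x00")  # APP1 payload starts with this ID

    print(MAX_SEGMENT, MAX_PAYLOAD, MAX_EXIF)      # 65535 65533 65527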

Related

What is the purpose of `kCGImageSourceShouldAllowFloat` for Image I/O?

When you use Image I/O on macOS, there's an option kCGImageSourceShouldAllowFloat which is documented as follows:
Whether the image should be returned as a CGImage object that uses floating-point values, if supported by the file format. CGImage objects that use extended-range floating-point values may require additional processing to render in a pleasing manner.
But it doesn’t say what file formats support it or what the benefits are, just that it might be slower.
Does anyone know what file formats support this and what the benefits would be?
TIFF files support floating point values. For example, the 128 bits per pixel format accepts 32-bit float components. See About Bitmap Images and Image Masks. Also see Supported Pixel Formats for table of supported pixel formats for graphics contexts.
In terms of the benefits of floating point at 32 bits per channel, it simply means you have more possible gradations of color per channel. In general you can't see the difference with the naked eye (beyond 16 bits per channel), but if you start applying adjustments (traditionally, multiple curves or levels adjustments) you're less likely to see posterization of the image. So if (a) the image already has this level of information, and (b) you might need to perform those sorts of adjustments, then the extra data of 32 bits per component has benefits. Otherwise the benefits of this amount of information are somewhat limited.
Bottom line, use floating point if you are possibly editing assets that might already have floating point components. But often we don’t need or use this level of information. Most of the JPG and PNG assets we deal with are 8 bits per component, anyway.
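None of this is specific to Image I/O, but a quick sketch (Python with NumPy standing in for any floating-point pipeline) shows why the extra precision matters once you stack adjustments:

    import numpy as np

    # A smooth gradient, put through the same round-trip curve adjustment
    # (darken, then re-brighten) in float and in 8-bit.
    gradient = np.linspace(0.0, 1.0, 4096)

    # Float pipeline: the two curves compose almost losslessly.
    float_result = (gradient.astype(np.float32) ** 2.2) ** (1 / 2.2)

    # 8-bit pipeline: every step quantizes back to 256 levels.
    x = np.round(gradient * 255) / 255
    x = np.round((x ** 2.2) * 255) / 255
    x = np.round((x ** (1 / 2.2)) * 255) / 255

    print(len(np.unique(float_result)))  # ~4096 distinct levels survive
    print(len(np.unique(x)))             # far fewer -> visible banding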

Which printable paper data storage format has the best information capacity?

I need to store data on a sheet of paper that is meant to be digitally processed. Once delivered, the sheet will be photographed with a smartphone camera.
What's the most effective barcode format, that can store the largest number of 8-bits characters?
How many bytes per page can be stored?
High Capacity Color Barcode (HCCB), Data Matrix, and QR codes all have high capacity and can be tiled onto a single A4 page. The achievable capacity depends on the printing and scanning resolution, and can reach 50-70 KB per page at 600 DPI (see recent experiments).
Other, less standard formats include PaperBack (there is a Linux version), Optar, and ColorSafe.
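A rough upper bound for the page capacity is easy to work out. In this sketch every constant is an assumption (a 4x4 block of printed pixels per module, half the raw bits lost to error correction, finder patterns and quiet zones), but it shows why real results land in the tens of kilobytes:

    # Back-of-envelope capacity for an A4 page printed at 600 DPI.
    dpi = 600
    width_in, height_in = 8.27, 11.69          # A4 dimensions in inches
    modules = (width_in * dpi / 4) * (height_in * dpi / 4)  # 4x4 px per module
    usable_bits = modules * 0.5                # overhead: ECC, finders, quiet zones
    print(f"~{usable_bits / 8 / 1024:.0f} KiB per page")    # ~133 KiB upper bound

Camera optics, focus, and perspective distortion push practical systems well below this bound, which is consistent with the 50-70 KB figure above.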
See also:
Paper data storage
Popular 2D Bar Codes
As of now, the highest-capacity color barcode I've found that actually works is JAB Code.
More on this here

Storing lots of images on server compression

We have a project that will generate lots (hundreds of thousands) of .PNG images around 1 MB each. Rapid serving is not a priority, as we use the images internally rather than on a front end.
We know to store them on the filesystem, not in a DB.
We'd like to know how best to compress these images on the server to minimise long-term storage costs.
Linux server.
They are already compressed, so you would need to recode the images into another lossless format while preserving all of the information present in the PNG files. I don't know of an off-the-shelf format that does that, but you can roll your own: recode the image data with a better lossless compressor (you can see benchmarks here) and keep a separate metadata file that retains the other information from the original .png files, so that you can reconstruct the originals.
The best you could get losslessly, based on the benchmarks, would be about 2/3 of their current size. You would need to test the compressors on your actual data. Your mileage may vary.
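As a sketch of the roll-your-own approach (Python with Pillow; LZMA stands in for whichever compressor wins on your benchmark, and this handles plain L/RGB/RGBA images only; palettes, ICC profiles, and other ancillary PNG chunks would need to be carried along too):

    import json, lzma
    from PIL import Image   # pip install Pillow

    def pack(png_path, out_path):
        """Recode a PNG as raw pixels + LZMA, with enough metadata
        to reconstruct the image exactly."""
        im = Image.open(png_path)
        meta = json.dumps({"mode": im.mode, "size": im.size}).encode()
        with open(out_path, "wb") as f:
            f.write(len(meta).to_bytes(4, "big") + meta)
            f.write(lzma.compress(im.tobytes(), preset=9))

    def unpack(packed_path):
        with open(packed_path, "rb") as f:
            hlen = int.from_bytes(f.read(4), "big")
            meta = json.loads(f.read(hlen))
            raw = lzma.decompress(f.read())
        return Image.frombytes(meta["mode"], tuple(meta["size"]), raw)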

Which 2D barcode has the highest data capacity/density?

If you wanted to encode 2 MB of data into a 2D barcode, which format would be a good starting point or recommendation?
There are lots of different types of 2D barcodes out today: Aztec, MaxiCode, PDF417, Microsoft HCCB, VeriCode, etc., all unique in their own way.
I guess in a nutshell my question is: which barcode would make a good starting point for encoding 2 MB of data?
I tried reading through the QR code international standard; it turns out that even at version 40 with the lowest (L) error correction, the most data you can encode in a single QR code is:
1) numeric data: 7,089 characters
2) alphanumeric data: 4,296 characters
3) 8-bit byte data: 2,953 characters
4) Kanji data: 1,817 characters
All of which are a far cry from the nearly 17 million bits that make up 2 MB.
my goal was to create something like
http://realestatemobilemarketingsolutions.com/wp-content/uploads/2012/07/real-estate-mobile-marketing.png
After you scan the barcode you can view photos of the house/property on your phone, so you don't have to walk in or wait for an open home; 20 photos at 100 KB each is about 2 MB.
Even if you could create a single 2D barcode which will encode the whole thing, the user won't be able to scan the whole thing in one go. No one has a cellphone imager which will support that kind of resolution. Your best bet is to do a QR-code with a URL in it.
Things like DataMatrix and QR-codes are extensible. You have a limit to how much data can be encoded into one block, but you CAN create a code which has multiple blocks. Indeed, if you look at this page, you'll see a discussion of using pages full of 2D barcodes as a form of data backup. They were able to fit up to 1/2 MByte of raw data into a single page. That's at 600 dpi, which will require a scanner (not a smartphone) to decode.
From what I've been reading, DataMatrix tends to have less overhead and, therefore, will stuff more (payload) data into a square inch for a given DPI. You would need a mobile app capable of shooting multiple images (tiles) of a very large image and either:
compositing the individual images into one large one for decoding OR
decoding each of the smaller blocks and reconstructing the original data from the pieces
I know of no app which will do that.
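The reassembly bookkeeping for the second option is actually the easy part. Here is a minimal sketch with a made-up 4-byte (index, total) header per chunk; the camera-side decoding is where the real work is:

    import math

    PAYLOAD = 2953 - 4   # one version-40 QR byte payload, minus our header

    def split(data: bytes):
        """Prefix each chunk with (index, total) so codes can be
        scanned in any order."""
        total = math.ceil(len(data) / PAYLOAD)
        return [bytes([i >> 8, i & 0xFF, total >> 8, total & 0xFF])
                + data[i * PAYLOAD:(i + 1) * PAYLOAD]
                for i in range(total)]

    def join(chunks):
        parts, total = {}, 0
        for c in chunks:
            idx, total = (c[0] << 8) | c[1], (c[2] << 8) | c[3]
            parts[idx] = c[4:]
        assert len(parts) == total, "missing or duplicate chunks"
        return b"".join(parts[i] for i in range(total))

At 2,949 payload bytes per code, 2 MB of data comes to 712 separate version-40 QR codes, which makes the scanning-patience problem concrete.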
I've pondered providing bulk data via 2D barcodes. I once considered publishing a mobile app in a magazine, giving people a way to "download" the app from the printed page without needing to provide a website or FTP site. I'd first need to provide an app which could decode such a monster. Then, the end user would have to be patient enough to scan the whole thing. Good luck with that.
I MIGHT be able to provide a large 2D barcode containing a .torrent file and then use existing BitTorrent apps to download the resulting app; I have a .torrent for a recent Linux live DVD that is under 32 KB.
But a chunk of data (an app or images) in the megabyte-or-larger range is really not feasible through this channel.
VOICEYE code is the highest-density 2D code I have been able to find. It works well too, but the code-making software is prohibitively expensive to experiment with (500.00-ish).
How about using some variant of DataGlyphs, which has a lot in common with steganography? In other words, you use a greyscale image to also store your data...
I have developed a reader for JAB codes that can read a whole audio file from a single code. JAB codes have very high capacity due to their polychrome nature.
More on this here

Array data compression for 13268 bits (1.66 kBytes)

i.e. the array holds 100*125 bits of data for each aircraft, plus 8 ASCII messages of 12 characters each.
What compression technique should I apply to such data?
Depends mostly on what those 12500 bits look like, since that's the biggest part of your data. If there aren't any real patterns in it, or if they aren't byte-sized or word-sized patterns, "compressing" it may actually make it bigger, since almost every compression algorithm will add a small amount of extra data just to make decompression possible.
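To see that concretely, compare a patterned buffer of the same size against a random one (Python stdlib; zlib and lzma standing in for "almost every compression algorithm"):

    import lzma, os, zlib

    SIZE = 13268 // 8 + 1                           # ~13268 bits -> 1659 bytes
    patterned = bytes(i % 16 for i in range(SIZE))  # strong byte-level pattern
    random_data = os.urandom(SIZE)                  # no pattern at all

    for name, data in (("patterned", patterned), ("random", random_data)):
        print(f"{name}: raw={len(data)} "
              f"zlib={len(zlib.compress(data, 9))} "
              f"lzma={len(lzma.compress(data))}")

    # The patterned buffer shrinks dramatically; the random one typically
    # grows slightly, because stream headers cost more than they save.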