Put PNG scanline image data into a zlib stream without compression?

I am making a simple PNG image from scratch and already have the scanline data for it. Now I want to put it into a zlib stream without compressing it. How can I do that? I have read the "ZLIB Compressed Data Format Specification version 3.3" at https://www.ietf.org/rfc/rfc1950.txt but still don't understand it. Could someone give me a hint about how to lay out the bytes of the zlib stream?
Thanks in advance!

As mentioned in RFC 1950, the details of the compression algorithm are described in another RFC: "DEFLATE Compressed Data Format Specification version 1.3" (RFC 1951).
There we find
3.2.3. Details of block format
Each block of compressed data begins with 3 header bits
containing the following data:
first bit BFINAL
next 2 bits BTYPE
Note that the header bits do not necessarily begin on a byte
boundary, since a block does not necessarily occupy an integral
number of bytes.
BFINAL is set if and only if this is the last block of the data
set.
BTYPE specifies how the data are compressed, as follows:
00 - no compression
[... a few other types]
which is the one you want. These 2 bits of BTYPE, in combination with the last-block marker BFINAL, are all you need to write "uncompressed" zlib-compatible data:
3.2.4. Non-compressed blocks (BTYPE=00)
Any bits of input up to the next byte boundary are ignored.
The rest of the block consists of the following information:
  0   1   2   3   4...
+---+---+---+---+================================+
|  LEN  | NLEN  |... LEN bytes of literal data...|
+---+---+---+---+================================+
LEN is the number of data bytes in the block. NLEN is the
one's complement of LEN.
So the pseudo-algorithm is:
set the initial 2 bytes to 78 9c ("default compression").
for every block of 32768 or fewer bytesᵃ
    if it's the last block, write 01, else write 00
    write [block length] [one's complement of block length]ᵇ
    write the literal data itself
repeat until all data is written.
Don't forget to add the Adler-32 checksum at the end of the stream, in big-endian order, after 'compressing' the data this way. The Adler-32 checksum verifies the uncompressed, original data. In the case of PNG images, that data has already been run through the PNG filters and has a filter-type byte prepended to every row – and that is "the" data that gets compressed by this FLATE-compatible scheme.
ᵃ This is a value that happened to be convenient for me at the time; it ought to be safe to write blocks as large as 65535 bytes (just don't try to cross that line).
ᵇ Both as 16-bit words with the low byte first, then the high byte. This is briefly mentioned in the introduction.
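
Putting those steps together, here is a minimal Python sketch of the procedure (the helper name stored_zlib_stream is mine; the standard zlib module is used only for the Adler-32 helper and to verify that the output round-trips through a real decompressor):

import zlib  # used only for adler32() and for the round-trip check below

def stored_zlib_stream(data: bytes, block_size: int = 32768) -> bytes:
    """Wrap `data` in a zlib stream made only of stored (BTYPE=00) deflate blocks."""
    out = bytearray(b"\x78\x9c")                         # zlib header ("default compression")
    pos = 0
    while True:
        chunk = data[pos:pos + block_size]
        pos += len(chunk)
        last = pos >= len(data)
        out.append(0x01 if last else 0x00)               # BFINAL in bit 0, BTYPE=00 in bits 1-2
        out += len(chunk).to_bytes(2, "little")          # LEN, low byte first
        out += (len(chunk) ^ 0xFFFF).to_bytes(2, "little")  # NLEN, one's complement of LEN
        out += chunk                                     # the literal data, uncompressed
        if last:
            break
    out += zlib.adler32(data).to_bytes(4, "big")         # Adler-32 of the original data, big-endian
    return bytes(out)

# quick self-check: the stream must decompress back to the original data
payload = bytes(range(256)) * 300
assert zlib.decompress(stored_zlib_stream(payload)) == payload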

Related

How to decode 16-bit signed binary file in IEEE754 standard

I have a file format called .ogpr (openGPR, a dead format used for Ground Radar data), and I'm trying to read this file and convert it into a matrix using Matlab(R).
In the first part of the file there is a JSON header that describes the characteristics of the data acquisition (number of traces, position etc.), and in the second part there are two different data blocks.
The first block contains the 'real' GPR data, and I know that it is formatted as:
Multibyte binary data are little-endian
Floating point binary data follow the IEEE 754 standard
Integer data follow the two’s complement encoding
I also know the total number of bytes and the number of bytes in each single 'slice' (we have 512 samples * 10 channels * 3971 slices [x2 bytes per sample]).
Furthermore: 'A Data Block of type Radar Volume stores a 3D array of radar Samples. At the moment, each sample value is stored in a 16-bit signed integer. Each Sample value is in volts in the range [-20, 20].'
The second block contains geolocation info.
I'd like to read and convert the Data Block from that encoding, but it isn't clear to me how the data are divided into bytes and how to convert them from that encoding into numbers.
I tried to use this piece of code:
bin_data = ogpr_data(48:(length(ogpr_data)-1),1);
writematrix(bin_data, 'bin_data.txt');
fileID = fopen('bin_data.txt', 'r', 'ieee-le');
format = 'uint16';
Data = fread(fileID, Inf, format);
fclose(fileID);
Looks like your posted code is mixing text files and binary files. The writematrix( ) routine writes values as comma delimited text. Then you turn around and try to use fopen( ) and fread( ) to read this as a binary file in IEEE Little Endian format. These are two totally different things. You need to pick one format and use it consistently, either human readable comma delimited text files, or machine readable binary IEEE format files.
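
For what it's worth, here is a small Python (numpy) sketch of the direct binary route, i.e. reading the samples straight from the raw bytes as little-endian signed 16-bit integers instead of round-tripping through a text file. The file name, the 48-byte offset, the 512 x 10 x 3971 shape and the volts-per-count scale factor are taken from the question or assumed, and would need to be checked against the openGPR specification:

import numpy as np

with open("example.ogpr", "rb") as f:              # hypothetical file name
    raw = f.read()

# '<i2' means little-endian signed 16-bit integer (two's complement)
samples = np.frombuffer(raw, dtype="<i2",
                        count=512 * 10 * 3971,     # samples * channels * slices, from the question
                        offset=48)                 # header size assumed from the posted code
volume = samples.reshape(3971, 10, 512)            # slices x channels x samples (assumed order)
volts = volume * (20.0 / 32768.0)                  # assumed linear mapping to the [-20, 20] V range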

How to read and decode zlib block and deflate block in a png image file

Recently I started writing a PNG decoder for fun, and to understand the format and the compression I have read the following:
PNG (Portable Network Graphics) Specification, Version 1.2
RFC 1950 ("ZLIB Compressed Data Format Specification")
RFC 1951 ("DEFLATE Compressed Data Format Specification")
So I have a basic understanding of the format and the compression.
At first I decided to implement a simple decoder that decodes a simple PNG image file, and for that I have chosen a simple PNG image containing a single IDAT chunk, which holds a single zlib stream with one deflate block.
The image file used is this:
[image: the image file used for decoding]
I have extracted the zlib data from the image file, and in a hex editor it looks like this:
[image: hex view of the IDAT chunk]
The binary representation of the part marked in red is this:
[image: binary representation of the zlib data]
From what I have understood from reading the specs, I decoded it as follows:
[image: decoded binary representation of the zlib data]
BFINAL=FINAL BLOCK
BTYPE=DYNAMIC HUFFMAN
HLIT=29
HDIST=29
HCLEN=11
The parts marked in green are the (HCLEN + 4) code lengths for the code length alphabet.
The lengths read are as follows: 6, 6, 0, 2, 3, 3, 5, 4, 4, 4, 3, 6, 4, 6, 5
The Huffman codes generated for the above code lengths are as follows:
[image: generated Huffman codes]
After assigning them to the corresponding code length alphabet symbols, I get the following (note: symbol 18 is not used, as its code length was zero):
[image: assigned Huffman codes]
Now when I started to decode the Huffman codes for the (HLIT + 257) code lengths of the literal/length alphabet using the assigned Huffman codes, the first symbol I decoded was 16. But that cannot be right, because symbol 16 means "copy the previous code length", and there is no previous code length for the very first symbol.
Hence there is some error in my understanding of the format that I cannot figure out, and this is where I need help.
The way you're representing the codes doesn't make sense, since you are showing a bunch of zeros that aren't part of the codes, so you can't tell what their lengths are. Also you are showing the codes in reverse, as compared to how they show up in the bytes.
More importantly, somehow you got the assignments wrong. Here are the correct codes, showing only the bits in the codes, in the correct bit order, with the correct assignments:
00 - 0
010 - 7
110 - 8
001 - 11
0101 - 5
1101 - 6
0011 - 10
1011 - 12
00111 - 9
10111 - 13
001111 - 3
101111 - 4
011111 - 16 (followed by two bits, +3 is the repeat count)
111111 - 17 (followed by three bits, +3 is the zeros count)
The first code length code is 001111, which says the literal 0 has code length 3.
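
For reference, here is a small Python sketch (my own, not part of the answer) that rebuilds those codes from the lengths in the question, using the canonical-code construction in RFC 1951, section 3.2.2. It prints each code MSB first; reversing the bits gives the stream order shown in the table above:

# Code lengths are sent for the code length alphabet in this fixed symbol order
# (RFC 1951, section 3.2.7); HCLEN=11 means the first 15 entries are present.
ORDER = [16, 17, 18, 0, 8, 7, 9, 6, 10, 5, 11, 4, 12, 3, 13, 2, 14, 1, 15]
lengths_read = [6, 6, 0, 2, 3, 3, 5, 4, 4, 4, 3, 6, 4, 6, 5]
code_length = {sym: n for sym, n in zip(ORDER, lengths_read) if n > 0}

# Canonical Huffman construction from RFC 1951, section 3.2.2.
max_bits = max(code_length.values())
bl_count = [0] * (max_bits + 1)
for n in code_length.values():
    bl_count[n] += 1

next_code = [0] * (max_bits + 1)
code = 0
for bits in range(1, max_bits + 1):
    code = (code + bl_count[bits - 1]) << 1
    next_code[bits] = code

codes = {}
for sym in sorted(code_length):                    # ascending symbol order, as in the RFC
    n = code_length[sym]
    codes[sym] = f"{next_code[n]:0{n}b}"
    next_code[n] += 1

for sym, bits in sorted(codes.items(), key=lambda kv: (len(kv[1]), kv[1])):
    # bits[::-1] is how the code appears when reading the bytes bit by bit, LSB first
    print(f"{bits}  (stream order {bits[::-1]})  -> symbol {sym}")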

Size of binary file after base64 encoding? Need explanation on the solution

So I'm studying for an upcoming exam, and there's this question: given a binary file with a size of 31 bytes, what will its size be after encoding it to base64?
The solution the teacher gave us was (40 + 4) bytes, as it needs to be a multiple of 4.
I can't arrive at this solution, and I have no idea how to work it out, so I was hoping somebody could help me figure it out.
Base64 encoding divides the input data into six-bit blocks, and each block is encoded as one ASCII character.
If you have 31 bytes of input, you have 31*8/6 six-bit blocks to encode. As a rule of thumb, every three bytes of input produce four bytes of output.
If the input is not a whole number of six-bit blocks, base64 fills the last block with zero bits.
In your example you have 42 six-bit blocks, the last one padded with the missing zero bits, which gives 42 output characters.
Base64 implementations then append '=' symbols so that the final result is a multiple of 4, giving 42 + 2 = 44 bytes.
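
A quick way to confirm this with Python's standard base64 module (my own check, not part of the answer):

import base64, os

data = os.urandom(31)          # any 31-byte binary input
encoded = base64.b64encode(data)

print(len(encoded))            # 44 = ceil(31 / 3) * 4
print(encoded[-2:])            # b'==': one leftover input byte means two '=' padding characters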

Compression or encrypt data

I have two bytes and I want to compress them into a single byte using a key (the key length can be up to 64 bits).
Further, I want to be able to retrieve the two bytes using the compressed byte and the same key.
Does anyone have an idea how to do that?
Thanks.
There are 2^16 = 65,536 ways to choose a pair of 8-bit bytes.
However, the result of your procedure is only one 8-bit byte, which can take 2^8 = 256 different values.
So you could use this one byte as input to some decompression procedure, but because there are only 256 different inputs, the procedure cannot produce more than 256 different results. You can therefore retrieve no more than 256 of the 65,536 possible pairs; the other pairs are not reachable, because you have run out of names for them, so to speak.
This makes the procedure impractical if more than 256 different input byte pairs can occur.
Compression would only be practical if there are restrictions on your input data. E.g. if only the pairs p1 = (42, 37) and p2 = (127, 255) can occur as possible input, you could compress them as 01 and 02.
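
To see the counting argument concretely, here is a small Python check (my own illustration; the XOR-with-a-key-byte "compressor" is an arbitrary hypothetical choice, and any other two-bytes-to-one-byte function behaves the same way):

from collections import Counter

def compress(a: int, b: int, key: int = 0x5A) -> int:
    # hypothetical one-byte "compressor": any function from 65,536 pairs to 256 values will do
    return (a ^ b ^ key) & 0xFF

outputs = Counter(compress(a, b) for a in range(256) for b in range(256))
print(len(outputs))            # at most 256 distinct outputs exist
print(max(outputs.values()))   # so some output is shared by at least 65,536 / 256 = 256 pairs,
                               # and those pairs cannot be told apart again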

After encoding data size is increasing

I have text data in XML format, and its length is around 816814 bytes. It contains some image data as well as some text data.
We are using the ZLIB algorithm for compression, and after compressing, the compressed data length is 487239 bytes.
After compressing, we encode the data using BASE64Encoder. But after encoding the compressed data, the size increases: the length of the encoded data is 666748 bytes.
Why does the data size increase after encoding? Are there any better encoding techniques?
Regards,
Siddesh
As noted, when you are encoding binary 8-bit bytes with 256 possible values into a smaller set of characters, in this case 64 values, you will necessarily increase the size. For a set of n allowed characters, the expansion factor for random binary input will be log(256)/log(n), at a minimum.
If you would like to reduce this impact, then use more characters. Chances are that whatever medium you are using, it can handle more than 64 characters transparently. Find out how many by simply sending all 256 possible bytes, and see which ones make it through. Test the candidate set thoroughly, and then ideally find documentation of the medium that backs up that set of n < 256.
Once you have the set, then you can use a simple hard-wired arithmetic code to convert from the set of 256 to the set of n and back.
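
For a sense of the numbers, a quick check of that lower bound in Python (my own illustration):

from math import log

# minimum expansion factor for packing random 8-bit bytes into an alphabet of n characters
for n in (64, 85, 128, 256):
    print(n, round(log(256) / log(n), 3))
# 64  -> 1.333  (the familiar ~33% base64 overhead)
# 85  -> 1.248  (an Ascii85-sized alphabet)
# 256 -> 1.0    (no expansion if all byte values get through)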
That is perfectly normal.
Base64 is needed when your transmission medium is not designed to carry binary data, but only textual data (e.g. XML).
So your compressed data gets base64 encoded.
Plainly speaking, it requires the transcoder to turn "non-ASCII" bytes into an ASCII form while still remembering the way back.
As a rule of thumb, it's around a 33% size increase ( http://en.wikipedia.org/wiki/Base64#Examples ).
This is the downside of base64. You are better off using a protocol that supports file transfer... but for data embedded within XML, you are pretty much out of options.