DEFLATE Encoding with static Huffman Codes - encoding

need some help to understand how DEFLATE Encoding works. I know that is a combination of the LZSS algorithm and Huffman coding.
So let encode for example "Deflate late". Params: [Search buffer: 8kb and Look-ahead buffer 4kb] Well, the output of LZSS algorithm is "Deflate <5, 4>" The next step uses static huffman coding to reduce the redundancy. Here is my problem, I dont know how should i encode this pair <5, 4> with huffman.
[Edited]
D 000
f 001
l 010
a 011
t 100
_ 101
e 11
So well, according to this table the string "Deflate " is written as 000 11 001 010 011 100 11 101. As a next step lets encode the pair (5, 4). The fixed prefix code of the length 4 according to the book "Data Compression - The Complete Reference" is 258, followed by fixed prefix code of the distance 5 (Code 4 + 1 Extra bit).
That can be summarized as:
length 4 -> 258 -> 0000010
distance 5 -> 4 + 1 extra bit -> 00100|0
So, the encoded string is written as [header: 1 01] 000 11 001 010 011 100 11 101 0000010 001000 [end-of-block: 0000000], BUT if i create a huffman tree, it is not a static huffman anymore, right?
Good day

D 000
f 001
l 010
a 011
t 100
_ 101
e 11
is not the Deflate static code. The static literal/length codes are all 7, 8, or 9 bits, and the distance codes are all 5 bits. You asked about the static codes.
'Deflate late' encoded in static deflate format as the literals 'Deflate ' and a length 4, distance 5 match in hex is:
73 49 4d cb 49 2c 49 55 00 11 00
That is broken down as follows (bits are read from the least significant part of each byte first):
011 - 01 means fixed code, 1 means last block
00101110 - D
10101001 - e
01101001 - f
00111001 - l
10001001 - a
00100101 - t
10101001 - e
00001010 - space
0100000 - length 4
00100 - distance 5 or 6 depending on one extra bit
0 - extra bit -> distance 5
0000000 - end code
0 - fill bit to byte boundary

Related

formatting an AVC/H.264 mdat

Can anyone tell me or point me to a section of the specification(s) that clearly demonstrates how from an elementary stream with a series of NALUs how these should be written into a ISO BMFF mdat?
I can see looking at samples and other code that I should have something like: AUD, SPS, PPS, SEI, VideoSlice, AUD etc etc
Things that are not entirely clear to me:
If the SPS and PPS are also stored out of band in the AVCC are they required in the mdat?
If they are required in the mdat when/where should they be written? e.g. just prior to an IDR?
What is the requirement for AUDs?
If I am generating sample sizes for the trun is the calcuation for this? In the example I am working to recreate the first sample in the trun has a size of 22817 however if I look at the first sample in the mdat the NALU size prefix is 22678. The value in the trun appears to be the size of all the NALUs + sizes up to and including the first sample (see my example below)
>
1 0016E405 (1500165) - box.Size
2 6D646174 (mdat) - box.Type
3 00000002 (2) NAL Size
4 0910 - (2) AUD # 5187
5 00000025 (37)
6 27640020 AC248C0F 0117EF01 10000003 00100000 078E2800 0F424001 E84EF7B8 0F844229 C0 (37) # 5193 SPS
7 00000004 (4)
8 28DEBCB0 (4) PPS
9 0000000B (11)
10 06000781 36288029 67C080 (? SEI ?)
11 0000000C (12)
12 06010700 00F00000 03020480 (? SEI is type 6)
13 0000002D (45) # 5269
14 060429B5 00314741 393403CA FFFC8080 FA0000FA 0000FA00 00FA0000 FA0000FA 0000FA00 00FA0000 FA0000FF 80 (SEI ??)
15 00005896 (22678)
16 25888010 02047843 00580010 08410410 0002….. 22678 bytes video # 5322
If the SPS and PPS are also stored out of band in the AVCC are they required in the mdat?
No
If they are required in the mdat when/where should they be written? e.g. just prior to an IDR?
Yes, if you choose to include them, but there is no reason to
What is the requirement for AUDs?
They are optional
If I am generating sample sizes for the trun is the calcuation for this?
The number of bytes in the access unit (AU, aka frame). Which may contain more than one NALU. SPS/PPS/SEI/AUD all counted toward the AU size. The 4 byte size prefixed to each NALUs is also counted in the AU size recored in the trun.
bytes
4 | 3 00000002 (2) NAL Size
2 | 4 0910 - (2) AUD # 5187
4 | 5 00000025 (37)
37 | 6 27640020 AC248C0F 0117EF01 10000003 00100000 078E2800 0F424001 E84EF7B8 0F844229 C0 (37) # 5193 SPS
4 | 7 00000004 (4)
4 | 8 28DEBCB0 (4) PPS
4 | 9 0000000B (11)
11 | 10 06000781 36288029 67C080 (? SEI ?)
4 | 11 0000000C (12)
12 | 12 06010700 00F00000 03020480 (? SEI is type 6)
4 | 13 0000002D (45) # 5269
45 | 14 060429B5 00314741 393403CA FFFC8080 FA0000FA 0000FA00 00FA0000 FA0000FA 0000FA00 00FA0000 FA0000FF 80 (SEI ??)
4 | 15 00005896 (22678)
22678 | 16 25888010 02047843 00580010 08410410 0002….. 22678 bytes video # 5322
------|
22817 | <- bytes total

how to fix and remove zero among a number to equalize the digits in an array

consider an array in MATLAB :
a = [102 20 1 30 8 255];
In this array, I need to make all the numbers to three digits by prefixing zero to all values lie this :
a = 102 020 001 030 008 255
After that, I need to reverse it again to the same. how can i do this?
I tried to separate the digits and do this. But it failed.
You want to use the notation of fprintf, which can be saved as a string with sprintf:
>> a = [102 20 1 30 8 255]
a =
102 20 1 30 8 255
>> b = sprintf('%.3d ',a) % b is a single string
b =
102 020 001 030 008 255
>> a = str2num(b)
a =
102 20 1 30 8 255
You probably need to convert to string. Have a look at the int2str or num2strfunctions for example. Then you can easily concatenate zeros at the beginning. For example:
s = int2str(10);
['0' s]
This gives you 010 as an output.
You can then revert with the str2num function.

Wiegand RFID reader VS USB RFID reader Raspberry PI

I have two Raspberry Pis running python code to retrieve the serial number of an RFID tag. One has an RFID reader with a Wiegand interface hooked to GPIO pins and other has an RFID reader that behaves like a keyboard connected over USB. However, I get different numbers from the two reader when scanning the same RFID tag.
For example, for one tag, I get 57924897 from the Raspberry Pi with the Wiegand reader and 0004591983 from the Raspberry Pi with the USB keyboard reader.
Can sombody explain the difference? Are both readers reading the same? Or are they just reading some different parameter?
Looking at those two values, it seems that you do not properly read and convert the value from the Wiegand interface.
The USB keyboard reader reads the serial number in 10 digit decimal form. A Wiegand reader typically trunkaes the serial number into a 26 bit value (1 parity bit + 8 bit site code + 16 bit tag ID + 1 parity bit).
So let's look at the two values that you get:
+--------------+------------+-------------+-----------------------------------------+
| READER | DECIMAL | HEXADECIMAL | BINARY |
+--------------+------------+-------------+-----------------------------------------+
| USB keyboard | 0004591983 | 0046116F | 0000 0000 0100 0110 0001 0001 0110 1111 |
| Wiegand | 57924897 | 373DD21 | 1 1011 1001 1110 1110 1001 0000 1 |
+--------------+------------+-------------+-----------------------------------------+
When you take a close look at the binary representation of those two values, you will see that they correlate with each other:
USB keyboard: 0000 0000 0100 0110 0001 0001 0110 1111
Wiegand: 1 1011 1001 1110 1110 1001 0000 1
So it seems as if the Wiegand value matches the inverted value obtained from the USB keyboard reader:
USB keyboard: 0000 0000 0100 0110 0001 0001 0110 1111
NOT(Wiegand): 0 0100 0110 0001 0001 0110 1111 0
So the inverted value (logical NOT) from the Wiegand interface matches the value read by the USB reader.
Next, let's look at the two parity bits. The data over the Wiegand interface typically looks like:
b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 b10 b11 b12 b13 b14 b15 b16 b17 b18 b19 b20 b21 b22 b23 b24 b25
PE D23 D22 D21 D20 D19 D18 D17 D16 D15 D14 D13 D12 D11 D10 D9 D8 D7 D6 D5 D4 D3 D2 D1 D0 PO
The first line being the bits numbered as they arrive over the Wiegand wires. The second line being the same bits as they need to be interpreted by the receiver, where PE (b0) is an even parity bit over D23..D12 (b1..b12), PO (b25) is an odd parity bit over D11..D0 (b13..b24), and D23..D0 are the data bits representing an unsigned integer number.
So looking at your number, you would have received:
PE D23 D22 D21 D20 D19 D18 D17 D16 D15 D14 D13 D12 D11 D10 D9 D8 D7 D6 D5 D4 D3 D2 D1 D0 PO
0 0 1 0 0 0 1 1 0 0 0 0 1 0 0 0 1 0 1 1 0 1 1 1 1 0
If we check the parity bits PE and PO, we get:
PE D23........D12
0 0100 0110 0001
contains 4 ones (1), hence even parity is met.
D21.........D0 PO
0001 0110 1111 0
contains 7 ones (1), hence odd parity is met.
So, to summarize the above, your code reading from the Wiegand interface does not properly handle the Wiegand data format. First, it does not trim the parity bits and second, it reads the bits with wrong polarity (zeros are ones and ones are zeros).
In order to get the correct number from the Wiegand reader, you either have to fix you code for reading from the Wiegand interface (to fix polarity, to skip the first and the last bit from the data value, and to possibly check the parity bits). Or you can take the value that you currently get, invert that value, and strip the lower and upper bits. In C, that would look something like this:
int unsigned currentWiegandValue = ...;
int unsigned newWiegandValue = ((~currentWiegandValue) >> 1) & 0x0FFFFFF;

Matlab : String vector - character subtraction

I'm trying to make a linear algebra-based algorithm for shift(Ceasar) cryptography cipher . Supposing I have a string : 'hello ' . When I'm trying to convert it into a (int)number matrix I do this :
'hello' - 'a'
And the result is
ans =
7 4 11 11 14
This is the desired result . But if I subtract the character 'g' the result will be
ans =
1 -2 5 5 8
I'd like to ask what happens in Matlab(or Octave) when I subtract a character and I get the results above .
As Mohit Jain wrote, the results you get are based on a conversion to ASCII which is the most widely accepted way to numerically encode textual information. ASCII is also included as a subset in the current standard of Unicode, and on supporting platforms Matlab actually uses a 16-bit Unicode encoding, which enables it to not only represent the 95 printable characters of ASCII which support English text, but a large number of international scripts, special characters for applications in mathematics, typography and many other fields. Explicit conversion between numeric and character data in Matlab is done through char and double:
>> double('aAΔ')
ans =
97 65 916
A small latin letter 'a' has the ASCII code 97, a large latin letter 'A' the ASCII code 65, and a large greek letter Delta has the Unicode number 916. Since the latin letters are encoded in sequence with codes 97 to 122 for small letters and 65 to 90 for capitals, you can generate the English alphabet e.g. like this:
>> char(65 : 90)
ans =
ABCDEFGHIJKLMNOPQRSTUVWXYZ
When you apply an arithmetic operator like - to character strings, the characters are implicitly converted to numbers as if you had used double
>> double('hello')
ans =
104 101 108 108 111
>> double('g')
ans =
103
and therefore 'hello' - 'a' is the same as
>> [104 101 108 108 111] - 103
ans =
1 -2 5 5 8
It changes characters of string to their ascii value and then subtracts each value
'hello' - 'a' = 7 4 11 11 14 because h - a = 8 -1 =7
(these should be ascii values but i am using these values for simplicity because its all relative)
e-a=5-1=4
l-a = 12-1 =11 and so on
'hello' - 'g'
h-g=8-7=1
e-g=5-7=-2 and so on

Agilent Vee or Matlab: 4 byte ASCII to floating point

I currently have an instrument that sends 4 bytes representing a floating point number of 32-bit in little endian format, the data looks like:
Gz*=
<«�=
N×e=
or this
à|ƒ=
is there a conversion for this in matlab, Agilent vee and manually
To convert an array of char to single, you can use typecast:
c = 'Gz*=';
f = typecast(c, 'single')
f = 0.041621
Just implicitly!
>> data = ['Gz*=';'<«�=';'N×e=']
data =
Gz*=
<«�=
N×e=
>> data+0
ans =
71 122 42 61
60 171 65533 61
78 215 101 61
data+0 forces it to be interpreted as a number which is fine.
If it's interpreted it backwards (I'm not sure if MATLAB is big or little endian) just use the swapbytes function.