Two byte report count for hid report descriptor - hid

I'm trying to create an HID report descriptor for USB 3.0 with a report count of 1024 bytes.
The documentation at usb.org for HID does not seem to mention a two byte report count. Nonetheless, I have seen some people use 0x96 (instead of 0x95) to enter a two byte count, such as:
0x96, 0x00, 0x02, // REPORT_COUNT (512)
which was taken from here:
Custom HID device HID report descriptor
Likewise, from this same example, 0x26 is used for a two byte logical maximum.
Where did this 0x96 and 0x26 field come from? I don't see any documentation for it.

REPORT_COUNT is defined in the Device Class Definition for HID 1.11 document in section 6.2.2.7 Global Items on page 36 as:
Report Count 1001 01 nn Unsigned integer specifying the number of data
fields for the item; determines how many fields are included in the
report for this particular item (and consequently how many bits are
added to the report).
The nn in the above code is the item length indicator (bSize) and is defined earlier in section 6.2.2.2 Short Items as:
bSize Numeric expression specifying size of data:
0 = 0 bytes
1 = 1 byte
2 = 2 bytes
3 = 4 bytes
Rather confusingly, the valid values of bSize are listed in decimal. So, in binary, the bits for nn would be:
00 = 0 bytes (i.e. there is no data associated with this item)
01 = 1 byte
10 = 2 bytes
11 = 4 bytes
Putting it all together for REPORT_COUNT, which is an unsigned integer, the following alternatives could be specified:
1001 01 00 = 0x94 = REPORT_COUNT with no length (can only have value 0?)
1001 01 01 = 0x95 = 1-byte REPORT_COUNT (can have a value from 0 to 255)
1001 01 10 = 0x96 = 2-byte REPORT_COUNT (can have a value from 0 to 65535)
1001 01 11 = 0x97 = 4-byte REPORT_COUNT (can have a value from 0 to 4294967295)
Similarly, for LOGICAL_MAXIMUM, which is a signed integer (usually, there is an exception):
0010 01 00 = 0x24 = LOGICAL_MAXIMUM with no length (can only have value 0?)
0010 01 01 = 0x25 = 1-byte LOGICAL_MAXIMUM (can have a values from -128 to 127)
0010 01 10 = 0x26 = 2-byte LOGICAL_MAXIMUM (can have a value from -32768 to 32767)
0010 01 11 = 0x27 = 4-byte LOGICAL_MAXIMUM (can have a value from -2147483648 to 2147483647)
The specification is unclear on what value a zero-length item defaults to in general. It only mentions, at the end of section 6.2.2.4 Main Items, that MAIN item types and, within that type, INPUT item tags, have a default value of 0:
Remarks - The default data value for all Main items is zero (0).
- An Input item could have a data size of zero (0) bytes. In this case the value of
each data bit for the item can be assumed to be zero. This is functionally
identical to using a item tag that specifies a 4-byte data item followed by four
zero bytes.
It would be reasonable to assume 0 as the default for other item types too, but for REPORT_COUNT (a GLOBAL item) a value of 0 is not really a sensible default (IMHO). The specification doesn't really say.

Related

Encoding Spotify URI to Spotify Codes

Spotify Codes are little barcodes that allow you to share songs, artists, users, playlists, etc.
They encode information in the different heights of the "bars". There are 8 discrete heights that the 23 bars can be, which means 8^23 different possible barcodes.
Spotify generates barcodes based on their URI schema. This URI spotify:playlist:37i9dQZF1DXcBWIGoYBM5M gets mapped to this barcode:
The URI has a lot more information (62^22) in it than the code. How would you map the URI to the barcode? It seems like you can't simply encode the URI directly. For more background, see my "answer" to this question: https://stackoverflow.com/a/62120952/10703868
The patent explains the general process, this is what I have found.
This is a more recent patent
When using the Spotify code generator the website makes a request to https://scannables.scdn.co/uri/plain/[format]/[background-color-in-hex]/[code-color-in-text]/[size]/[spotify-URI].
Using Burp Suite, when scanning a code through Spotify the app sends a request to Spotify's API: https://spclient.wg.spotify.com/scannable-id/id/[CODE]?format=json where [CODE] is the media reference that you were looking for. This request can be made through python but only with the [TOKEN] that was generated through the app as this is the only way to get the correct scope. The app token expires in about half an hour.
import requests
head={
"X-Client-Id": "58bd3c95768941ea9eb4350aaa033eb3",
"Accept-Encoding": "gzip, deflate",
"Connection": "close",
"App-Platform": "iOS",
"Accept": "*/*",
"User-Agent": "Spotify/8.5.68 iOS/13.4 (iPhone9,3)",
"Accept-Language": "en",
"Authorization": "Bearer [TOKEN]",
"Spotify-App-Version": "8.5.68"}
response = requests.get('https://spclient.wg.spotify.com:443/scannable-id/id/26560102031?format=json', headers=head)
print(response)
print(response.json())
Which returns:
<Response [200]>
{'target': 'spotify:playlist:37i9dQZF1DXcBWIGoYBM5M'}
So 26560102031 is the media reference for your playlist.
The patent states that the code is first detected and then possibly converted into 63 bits using a Gray table. For example 361354354471425226605 is encoded into 010 101 001 010 111 110 010 111 110 110 100 001 110 011 111 011 011 101 101 000 111.
However the code sent to the API is 6875667268, I'm unsure how the media reference is generated but this is the number used in the lookup table.
The reference contains the integers 0-9 compared to the gray table of 0-7 implying that an algorithm using normal binary has been used. The patent talks about using a convolutional code and then the Viterbi algorithm for error correction, so this may be the output from that. Something that is impossible to recreate whithout the states I believe. However I'd be interested if you can interpret the patent any better.
This media reference is 10 digits however others have 11 or 12.
Here are two more examples of the raw distances, the gray table binary and then the media reference:
1.
022673352171662032460
000 011 011 101 100 010 010 111 011 001 100 001 101 101 011 000 010 011 110 101 000
67775490487
2.
574146602473467556050
111 100 110 001 110 101 101 000 011 110 100 010 110 101 100 111 111 101 000 111 000
57639171874
edit:
Some extra info:
There are some posts online describing how you can encode any text such as spotify:playlist:HelloWorld into a code however this no longer works.
I also discovered through the proxy that you can use the domain to fetch the album art of a track above the code. This suggests a closer integration of Spotify's API and this scannables url than previously thought. As it not only stores the URIs and their codes but can also validate URIs and return updated album art.
https://scannables.scdn.co/uri/800/spotify%3Atrack%3A0J8oh5MAMyUPRIgflnjwmB
Your suspicion was correct - they're using a lookup table. For all of the fun technical details, the relevant patent is available here: https://data.epo.org/publication-server/rest/v1.0/publication-dates/20190220/patents/EP3444755NWA1/document.pdf
Very interesting discussion. Always been attracted to barcodes so I had to take a look. I did some analysis of the barcodes alone (didn't access the API for the media refs) and think I have the basic encoding process figured out. However, based on the two examples above, I'm not convinced I have the mapping from media ref to 37-bit vector correct (i.e. it works in case 2 but not case 1). At any rate, if you have a few more pairs, that last part should be simple to work out. Let me know.
For those who want to figure this out, don't read the spoilers below!
It turns out that the basic process outlined in the patent is correct, but lacking in details. I'll summarize below using the example above. I actually analyzed this in reverse which is why I think the code description is basically correct except for step (1), i.e. I generated 45 barcodes and all of them matched had this code.
1. Map the media reference as integer to 37 bit vector.
Something like write number in base 2, with lowest significant bit
on the left and zero-padding on right if necessary.
57639171874 -> 0100010011101111111100011101011010110
2. Calculate CRC-8-CCITT, i.e. generator x^8 + x^2 + x + 1
The following steps are needed to calculate the 8 CRC bits:
Pad with 3 bits on the right:
01000100 11101111 11110001 11010110 10110000
Reverse bytes:
00100010 11110111 10001111 01101011 00001101
Calculate CRC as normal (highest order degree on the left):
-> 11001100
Reverse CRC:
-> 00110011
Invert check:
-> 11001100
Finally append to step 1 result:
01000100 11101111 11110001 11010110 10110110 01100
3. Convolutionally encode the 45 bits using the common generator
polynomials (1011011, 1111001) in binary with puncture pattern
110110 (or 101, 110 on each stream). The result of step 2 is
encoded using tail-biting, meaning we begin the shift register
in the state of the last 6 bits of the 45 long input vector.
Prepend stream with last 6 bits of data:
001100 01000100 11101111 11110001 11010110 10110110 01100
Encode using first generator:
(a) 100011100111110100110011110100000010001001011
Encode using 2nd generator:
(b) 110011100010110110110100101101011100110011011
Interleave bits (abab...):
11010000111111000010111011110011010011110001...
1010111001110001000101011000010110000111001111
Puncture every third bit:
111000111100101111101110111001011100110000100100011100110011
4. Permute data by choosing indices 0, 7, 14, 21, 28, 35, 42, 49,
56, 3, 10..., i.e. incrementing 7 modulo 60. (Note: unpermute by
incrementing 43 mod 60).
The encoded sequence after permuting is
111100110001110101101000011110010110101100111111101000111000
5. The final step is to map back to bar lengths 0 to 7 using the
gray map (000,001,011,010,110,111,101,100). This gives the 20 bar
encoding. As noted before, add three bars: short one on each end
and a long one in the middle.
UPDATE: I've added a barcode (levels) decoder (assuming no errors) and an alternate encoder that follows the description above rather than the equivalent linear algebra method. Hopefully that is a bit more clear.
UPDATE 2: Got rid of most of the hard-coded arrays to illustrate how they are generated.
The linear algebra method defines the linear transformation (spotify_generator) and mask to map the 37 bit input into the 60 bit convolutionally encoded data. The mask is result of the 8-bit inverted CRC being convolutionally encoded. The spotify_generator is a 37x60 matrix that implements the product of generators for the CRC (a 37x45 matrix) and convolutional codes (a 45x60 matrix). You can create the generator matrix from an encoding function by applying the function to each row of an appropriate size generator matrix. For example, a CRC function that add 8 bits to each 37 bit data vector applied to each row of a 37x37 identity matrix.
import numpy as np
import crccheck
# Utils for conversion between int, array of binary
# and array of bytes (as ints)
def int_to_bin(num, length, endian):
if endian == 'l':
return [num >> i & 1 for i in range(0, length)]
elif endian == 'b':
return [num >> i & 1 for i in range(length-1, -1, -1)]
def bin_to_int(bin,length):
return int("".join([str(bin[i]) for i in range(length-1,-1,-1)]),2)
def bin_to_bytes(bin, length):
b = bin[0:length] + [0] * (-length % 8)
return [(b[i]<<7) + (b[i+1]<<6) + (b[i+2]<<5) + (b[i+3]<<4) +
(b[i+4]<<3) + (b[i+5]<<2) + (b[i+6]<<1) + b[i+7] for i in range(0,len(b),8)]
# Return the circular right shift of an array by 'n' positions
def shift_right(arr, n):
return arr[-n % len(arr):len(arr):] + arr[0:-n % len(arr)]
gray_code = [0,1,3,2,7,6,4,5]
gray_code_inv = [[0,0,0],[0,0,1],[0,1,1],[0,1,0],
[1,1,0],[1,1,1],[1,0,1],[1,0,0]]
# CRC using Rocksoft model:
# NOTE: this is not quite any of their predefined CRC's
# 8: number of check bits (degree of poly)
# 0x7: representation of poly without high term (x^8+x^2+x+1)
# 0x0: initial fill of register
# True: byte reverse data
# True: byte reverse check
# 0xff: Mask check (i.e. invert)
spotify_crc = crccheck.crc.Crc(8, 0x7, 0x0, True, True, 0xff)
def calc_spotify_crc(bin37):
bytes = bin_to_bytes(bin37, 37)
return int_to_bin(spotify_crc.calc(bytes), 8, 'b')
def check_spotify_crc(bin45):
data = bin_to_bytes(bin45,37)
return spotify_crc.calc(data) == bin_to_bytes(bin45[37:], 8)[0]
# Simple convolutional encoder
def encode_cc(dat):
gen1 = [1,0,1,1,0,1,1]
gen2 = [1,1,1,1,0,0,1]
punct = [1,1,0]
dat_pad = dat[-6:] + dat # 6 bits are needed to initialize
# register for tail-biting
stream1 = np.convolve(dat_pad, gen1, mode='valid') % 2
stream2 = np.convolve(dat_pad, gen2, mode='valid') % 2
enc = [val for pair in zip(stream1, stream2) for val in pair]
return [enc[i] for i in range(len(enc)) if punct[i % 3]]
# To create a generator matrix for a code, we encode each row
# of the identity matrix. Note that the CRC is not quite linear
# because of the check mask so we apply the lamda function to
# invert it. Given a 37 bit media reference we can encode by
# ref * spotify_generator + spotify_mask (mod 2)
_i37 = np.identity(37, dtype=bool)
crc_generator = [_i37[r].tolist() +
list(map(lambda x : 1-x, calc_spotify_crc(_i37[r].tolist())))
for r in range(37)]
spotify_generator = 1*np.array([encode_cc(crc_generator[r]) for r in range(37)], dtype=bool)
del _i37
spotify_mask = 1*np.array(encode_cc(37*[0] + 8*[1]), dtype=bool)
# The following matrix is used to "invert" the convolutional code.
# In particular, we choose a 45 vector basis for the columns of the
# generator matrix (by deleting those in positions equal to 2 mod 4)
# and then inverting the matrix. By selecting the corresponding 45
# elements of the convolutionally encoded vector and multiplying
# on the right by this matrix, we get back to the unencoded data,
# assuming there are no errors.
# Note: numpy does not invert binary matrices, i.e. GF(2), so we
# hard code the following 3 row vectors to generate the matrix.
conv_gen = [[0,1,0,1,1,1,1,0,1,1,0,0,0,1]+31*[0],
[1,0,1,0,1,0,1,0,0,0,1,1,1] + 32*[0],
[0,0,1,0,1,1,1,1,1,1,0,0,1] + 32*[0] ]
conv_generator_inv = 1*np.array([shift_right(conv_gen[(s-27) % 3],s) for s in range(27,72)], dtype=bool)
# Given an integer media reference, returns list of 20 barcode levels
def spotify_bar_code(ref):
bin37 = np.array([int_to_bin(ref, 37, 'l')], dtype=bool)
enc = (np.add(1*np.dot(bin37, spotify_generator), spotify_mask) % 2).flatten()
perm = [enc[7*i % 60] for i in range(60)]
return [gray_code[4*perm[i]+2*perm[i+1]+perm[i+2]] for i in range(0,len(perm),3)]
# Equivalent function but using CRC and CC encoders.
def spotify_bar_code2(ref):
bin37 = int_to_bin(ref, 37, 'l')
enc_crc = bin37 + calc_spotify_crc(bin37)
enc_cc = encode_cc(enc_crc)
perm = [enc_cc[7*i % 60] for i in range(60)]
return [gray_code[4*perm[i]+2*perm[i+1]+perm[i+2]] for i in range(0,len(perm),3)]
# Given 20 (clean) barcode levels, returns media reference
def spotify_bar_decode(levels):
level_bits = np.array([gray_code_inv[levels[i]] for i in range(20)], dtype=bool).flatten()
conv_bits = [level_bits[43*i % 60] for i in range(60)]
cols = [i for i in range(60) if i % 4 != 2] # columns to invert
conv_bits45 = np.array([conv_bits[c] for c in cols], dtype=bool)
bin45 = (1*np.dot(conv_bits45, conv_generator_inv) % 2).tolist()
if check_spotify_crc(bin45):
return bin_to_int(bin45, 37)
else:
print('Error in levels; Use real decoder!!!')
return -1
And example:
>>> levels = [5,7,4,1,4,6,6,0,2,4,3,4,6,7,5,5,6,0,5,0]
>>> spotify_bar_decode(levels)
57639171874
>>> spotify_barcode(57639171874)
[5, 7, 4, 1, 4, 6, 6, 0, 2, 4, 3, 4, 6, 7, 5, 5, 6, 0, 5, 0]

CRC CMD18 SD Card

I'm try to read SDHC card with SPI and a DSP.
I succeed to read a lot of information (capacity, some other informations) using CMD17 command.
Now I want to use CMD18 command (READ_MULTIPLE_BLOCK), because I want to read 2 sectors (2 * 512 bytes). I put all the values in a buffer.
When I read it, there are 4 bytes (when I'm using a 4GB Class 4 or 10 bytes when I'm using a 4GB Class 10) between the 2 sectors which are not on the card (I read the 2 sectors with HxD software). What are these values?
This is an example with a 4GB Class 4:
Buffer values :
buffer[511] = 68 **// Good value**
buffer[512] = 143 // Bad value
buffer[513] = 178 // Bad value
buffer[514] = 255 // Bad value
buffer[515] = 254 // Bad value
buffer[516] = 48 **// Good value**
Real values readed with HxD
buffer[511] = 68 **// Good value**
buffer[512] = 48 **// Good value**
buffer[513] = 54 **// Good value**
buffer[514] = 48 **// Good value**
buffer[515] = 52 **// Good value**
buffer[516] = 69 **// Good value**
I don't send CRC (0xFF), does the problem from that?
Thank you for your help.
Regards,
buffer[512] = 143 // Bad value
buffer[513] = 178 // Bad value
These two bytes are CRC for 512-byte block. Can be not used, but can not be rejected from received stream.
buffer[514] = 255 // Bad value
Card pre-charge next block, you will get random quantity (include zero) of 255's each time. (Depends of SPI speed and card speed)
buffer[515] = 254 // Bad value
Card is ready to stream next data block. You should awaiting for 254, then you know the following data will be correct. In fact, even before first block of data, you should also wait/get these 255...255, 254.

convert 16 bits signed (x2) to 32bits unsigned

I've got a problem with a modbus device :
The device send data in modbus protocol.
I read 4 bytes from the modbus communication that represent a pressure value
I have to convert theses 4 bytes in a unsigned 32bits integer.
There is the modbus documentation :
COMBINING 16bit REGISTERS TO 32bit VALUE
Pressure registers 2 & 3 in SENSOR INPUT REGISTER MAP of this guide are stored as u32 (UNSIGNED 32bit INTEGER)
You can calculate pressure manually :
1) Determine what display you have - if register values are positive skip to step 3.
2) Convert negative register 2 & 3 values from Signed to Unsigned (note: 65536 = 216 ):
(reg 2 value) + 65536* = 35464 ; (reg 3 value) + 65536 = 1
3) Shift register #3 as this is the upper 16 bits: 65536 * (converted reg 3 value) = 65536
4) Put two 16bit numbers together: (converted reg 2 value) + (converted reg 3 value) = 35464 + 65536 = 101000 Pa
Pressure information is then 101000 Pascal.
I don't find it very clear... For exemple, we don't have the 4 bytes that gives this calcul.
So, if anybody has a formula to convert my bytes into a 32bits unsigned int it could be very helpful
You should be able to read your bytes in some kind of type representation (hex, dec, bin, oct...)
let's assume you're receiving the following bytes frame:
in hex:
0x00, 0x06, 0x68, 0xA0
in bin:
0000 0000, 0000 0110, 0110 1000, 1010 0000
all of these are different representation of the same 4 bytes values.
Another thing that you should know is the bytes position (endianess):
If you're frame is transmitted in big endian, you're going to read the bytes in the order that you have them ( so 0x00, 0x06, 0x68, 0xA0 is correct).
If the frame is transmitted in little endian, you need to perform the following operation:
Switch the first 2 bytes with the last 2:
0x68, 0xA0, 0x00, 0x06
and then switch the position between the first and the second byte and the third and the fourth byte:
0xA0, 0x68, 0x06, 0x00
so if your frame is in little endian, the correct frame will be 0xA0, 0x68, 0x06, 0x00.
If you don't know the endianess, assume it's in big endian.
Now you simply have to 'put' your values togheter:
0x00, 0x06, 0x68, 0xA0 will become 0x000668A0
or
0000 0000, 0000 0110, 0110 1000, 1010 0000 will become 00000000000001100110100010100000
Once you have your hex or bin, you can convert your bin to an integer or convert your hex to an integer
Here you can find an interesting tool for converting HEX to float, unit32, int32, int16 in all endianess.
TL;DR
if you can use python, you should use struct:
import struct
frame = [0x00, 0x06, 0x68, 0xA0] # or [0, 6, 104, 160] in dec or [0b00000000, 0b00000110, 0b01101000, 0b10100000] in bin
print struct.unpack('>L', ''.join(map(chr, frame)))[0]

UTF-8 & Unicode, what's with 0xC0 and 0x80?

I've been reading about Unicode and UTF-8 in the last couple of days and I often come across a bitwise comparison similar to this :
int strlen_utf8(char *s)
{
int i = 0, j = 0;
while (s[i])
{
if ((s[i] & 0xc0) != 0x80) j++;
i++;
}
return j;
}
Can someone clarify the comparison with 0xc0 and checking if it's the most significant bit ?
Thank you!
EDIT: ANDed, not comparison, used the wrong word ;)
It's not a comparison with 0xc0, it's a logical AND operation with 0xc0.
The bit mask 0xc0 is 11 00 00 00 so what the AND is doing is extracting only the top two bits:
ab cd ef gh
AND 11 00 00 00
-- -- -- --
= ab 00 00 00
This is then compared to 0x80 (binary 10 00 00 00). In other words, the if statement is checking to see if the top two bits of the value are not equal to 10.
"Why?", I hear you ask. Well, that's a good question. The answer is that, in UTF-8, all bytes that begin with the bit pattern 10 are subsequent bytes of a multi-byte sequence:
UTF-8
Range Encoding Binary value
----------------- -------- --------------------------
U+000000-U+00007f 0xxxxxxx 0xxxxxxx
U+000080-U+0007ff 110yyyxx 00000yyy xxxxxxxx
10xxxxxx
U+000800-U+00ffff 1110yyyy yyyyyyyy xxxxxxxx
10yyyyxx
10xxxxxx
U+010000-U+10ffff 11110zzz 000zzzzz yyyyyyyy xxxxxxxx
10zzyyyy
10yyyyxx
10xxxxxx
So, what this little snippet is doing is going through every byte of your UTF-8 string and counting up all the bytes that aren't continuation bytes (i.e., it's getting the length of the string, as advertised). See this wikipedia link for more detail and Joel Spolsky's excellent article for a primer.
An interesting aside by the way. You can classify bytes in a UTF-8 stream as follows:
With the high bit set to 0, it's a single byte value.
With the two high bits set to 10, it's a continuation byte.
Otherwise, it's the first byte of a multi-byte sequence and the number of leading 1 bits indicates how many bytes there are in total for this sequence (110... means two bytes, 1110... means three bytes, etc).

Need help identifying and computing a number representation

I need help identifying the following number format.
For example, the following number format in MIB:
0x94 0x78 = 2680
0x94 0x78 in binary: [1001 0100] [0111 1000]
It seems that if the MSB is 1, it means another character follows it. And if it is 0, it is the end of the number.
So the value 2680 is [001 0100] [111 1000], formatted properly is [0000 1010] [0111 1000]
What is this number format called and what's a good way for computing this besides bit manipulation and shifting to a larger unsigned integer?
I have seen this called either 7bhm (7-bit has-more) or VLQ (variable length quantity); see http://en.wikipedia.org/wiki/Variable-length_quantity
This is stored big-endian (most significant byte first), as opposed to the C# BinaryReader.Read7BitEncodedInt method described at Encoding an integer in 7-bit format of C# BinaryReader.ReadString
I am not aware of any method of decoding other than bit manipulation.
Sample PHP code can be found at
http://php.net/manual/en/function.intval.php#62613
or in Python I would do something like
def encode_7bhm(i):
o = [ chr(i & 0x7f) ]
i /= 128
while i > 0:
o.insert(0, chr(0x80 | (i & 0x7f)))
i /= 128
return ''.join(o)
def decode_7bhm(s):
o = 0
for i in range(len(s)):
v = ord(s[i])
o = 128*o + (v & 0x7f)
if v & 0x80 == 0:
# found end of encoded value
break
else:
# out of string, and end not found - error!
raise TypeError
return o