Decoding ima4 audio format - iphone

To reduce the download size of an iPhone application I'm compressing some audio files. Specifically I'm using afconvert on the command line to change .wav format to .caf format w/ ima4 compression.
I've read this (wooji-juice.com) awesome post about this exact topic. I'm having trouble w/ the "decoding ima4 packets" step. I've looked at their sample code and I'm stuck. Please help w/ some pseudo code or sample code that can guide me in the right direction.
Thanks!
Additional info:
Here is what I've completed and where I'm having trouble...
I can play .wav files in both the simulator and on the phone.
I can compress .wav files to .caf w/ ima4 compression using afconvert on the command line. I'm using the SoundEngine that came w/ CrashLanding (I fixed one memory leak).
I modified the SoundEngine code to look for the mFormatID 'ima4'.
I don't understand the blog post linked above starting w/ "Calculating the size of the unpacked data". Why do I need to do this? Also, what does the term "packet" refer to? I'm very new to any sort of audio programming.

After gathering all the data from Wooji-Juice, Multimedia Wiki and Apple, here is my proposal (may need some experiment):
File structure
Apple IMA4 file are made of packet of 34 bytes. This is the packet unit used to build the file.
Each 34 bytes packet has two parts:
the first 2 bytes contain the preamble: an initial predictor and a step index
the 32 bytes left contain the sound nibbles (a nibble of 4 bits is used to retrieve a 16 bits sample)
Each packet has 32 bytes of compressed data, that represent 64 samples of 16 bits.
If the sound file is stereo, the packets are interleaved (one for the left, one for the right); there must be an even number of packets.
Decoding
Each packet of 34 bytes will lead to the decompression of 64 samples of 16 bits. So the size of the uncompressed data is 128 bytes per packet.
The decoding pseudo code looks like:
int[] ima_index_table = ... // Index table from [Multimedia Wiki][2]
int[] step_table = ... // Step table from [Multimedia Wiki][2]
byte[] packet = ... // A packet of 34 bytes compressed
short[] output = ... // The output buffer of 128 bytes
int preamble = (packet[0] << 8) | packet[1];
int predictor = preamble && 0xFF80; // See [Multimedia Wiki][2]
int step_index = preamble && 0x007F; // See [Multimedia Wiki][2]
int i;
int j = 0;
for(i = 2; i < 34; i++) {
byte data = packet[i];
int lower_nibble = data && 0x0F;
int upper_nibble = (data && 0xF0) >> 4;
// Decode the lower nibble
step_index += ima_index_table[lower_nibble];
diff = ((signed)nibble + 0.5f) * step / 4;
predictor += diff;
step = ima_step_table[step index];
// Clamp the predictor value to stay in range
if (predictor > 65535)
output[j++] = 65535;
else if (predictor < -65536)
output[j++] = -65536;
else
output[j++] = (short) predictor;
// Decode the uppper nibble
step_index += ima_index_table[upper_nibble];
diff = ((signed)nibble + 0.5f) * step / 4;
predictor += diff;
step = ima_step_table[step index];
// Clamp the predictor value to stay in range
if (predictor > 65535)
output[j++] = 65535;
else if (predictor < -65536)
output[j++] = -65536;
else
output[j++] = (short) predictor;
}

The term "packet" refers to a group of compressed audio samples with a header. You need the header to decode the data immediately following. If you consider your ima4 file to be a book, then each packet is a page. At the top are the values needed to decode that page, followed by the compressed audio.
That's why you need to calculate the size of the unpacked data (and then make space for it) -- since it's compressed, you need to convert data from compressed audio to uncompressed audio before you can output it. In order to allocate an output buffer, you need to know how big it has to be (note: you may need to output in chunks that are larger than a single packet at a time).
It looks like the typical structure, per the earlier "Overview" section, is that sets of 64 samples, each 16 bits (so 128 bytes) are translated to a 2-byte header and a 32-byte set of compressed samples (34 bytes in all). So, in the typical case, you can produce your expected output datasize by taking the input data size, dividing by 34 to get the number of packets, then multiplying by 128 bytes for the uncompressed audio per packet.
You shouldn't do that, though. It looks like you should instead query kAudioFilePropertyDataFormat to get the mBytesPerPacket -- this is the "34" value above, and mFramesPerPacket -- this is the 64, above, that gets multiplied by 2 (for 16-byte samples) to make 128 bytes of output.
Then, for each packet, you will need to run through the decoding described in the post. In somewhat longer pseudo C-code, assuming you are getting arrays of bytes, to handle the header:
packet = GetPacket();
Header = (packet[0] << 8) | packet[1]; //Big-endian 16-bit value
step_index = Header & 0x007f; //Lower seven bits
predictor = Header & 0xff80; //Upper nine bits
for (i = 2; i < mBytesPerPacket; i++)
{
nibble = packet[i] & 0x0f; //Low Nibble
process that nibble, per the blogpost -- be careful on sign-extension!
nibble = (packet[i] & 0xf0) >> 4; //High Nibble
process that nibble, per the blogpost -- be careful on sign-extension!
}
The sign-extension above refers to the fact that the post involves handling each nibble both in an unsigned and a signed way. If the high bit of a nibble (bit 3) is a 1, then it is negative; additionally the bit-shift may do sign-extension. This is not handled in the above pseudocode.

Related

Why are the hex numbers for big endian different than little endian?

#include<stdio.h>
int main()
{
typedef unsigned char *byte_pointer;
void show_bytes(byte_pointer start, size_t len)
{
int i;
for (i = 0; i < len; i++)
{
printf(" %.2x", start[i]);
printf("\n");
}
}
void show_int(int x)
{
show_bytes((byte_pointer) &x, sizeof(int));
}
void show_float(int x)
{
show_bytes((byte_pointer) &x, sizeof(float));
}
void show_pointer(int x)
{
show_bytes((byte_pointer) &x, sizeof(void *));
}
int a = 0x12345678;
byte_pointer ap = (byte_pointer) &a;
show_bytes(ap, 3);
return 0;
}
(Solutions according to the CS:APP book)
Big endian: 12 34 56
Little endian: 78 56 34
I know systems have different conventions for storage allocation but if two systems use the same convention but are different endian why are the hex values different?
Endian-ness is an issue that arises when we use more than one storage location for a value/type, which we do because somethings won't fit in a single storage location.
As soon as we use multiple storage locations for a single value that gives rise to the question of:  What part of the value will we store in each storage location?
The first byte of a two-byte item will have a lower address than the second byte, and in particular, the address of the second byte will be at +1 from the address of the lower byte.
Storing a two-byte item in two bytes of storage, do we store the most significant byte first and the least significant byte second, or vice versa?
We choose to use directly consecutive bytes for the two bytes of the two-byte item, so no matter which (endian) way we choose to store such an item, we refer to the whole two-byte item by the lower address (the address of its first byte).
We can express these storage choices with a formula, here item[0] refer to the first byte while item[1] refers to the second byte.
item[0] = value >> 8 // also value / 256
item[1] = value & 0xFF // also value % 256
value = (item[0]<<8) | item[1] // also item[0]*256 | item[1]
--vs--
item[0] = value & 0xFF // also value % 256
item[1] = value >> 8 // also value / 256
value = item[0] | (item[1]<<8) // also item[0] | item[1]*256
The first set of formulas is for big endian, and the second for little endian.
By these formulas, it doesn't matter what order we access memory as to whether item[0] first, then item[1], or vice versa, or both at the same time (common in hardware), as long as the formulas for one endian are consistently used.
If the item in question is a four-byte value, then there are 4 possible orderings(!) — though only two of them are truly sensible.
For efficiency, the hardware offers us multibyte memory access in one instruction (and with one reference, namely to the lowest address of the multibyte item), and therefore, the hardware itself needs to define and consistently use one of the two possible/reasonable orderings.
If the hardware did not offer multibyte memory access, then the ordering would be entirely up to the software program itself to define (accessing memory one byte at a time), and the program could choose big or little endian, even differently for each variable, as long as it consistently accesses the multiple bytes of memory in the same manner to reassemble the values stored there.
In a similar manner, when we define a structure of multiple items (e.g. struct point { int x; int y; }, software chooses whether x comes first or y comes first in memory ordering.  However, since programmers (and compilers) will still choose to use hardware instructions to access individual fields such as x in one go, the hardware's endian configuration remains necessary.

Heart Rate Value in BLE

I am having a hard time getting a valid value out of the HR characteristics. I am clearly not handling the values properly in Dart.
Example Data:
List<int> value = [22, 56, 55, 4, 7, 3];
Flags Field:
I convert the first item in the main byte array to binary to get the flags
22 = 10110 (as binary)
this leads me to believe that it is U16 (bit[0] is == 1)
HR Value:
Because it is 16 bit I am trying to get the bytes in the 1 & 2 indexes. I then try to buffer them into a ByteData. From there I get convert them to Uint16 with the Endian set to Little. This is giving me a value of 14136. Clearly I am missing something fundamental about how this is supposed to work.
Any help in clearing up what I am not understanding about how to process the 16 bit BLE values would be much appreciated.
Thank you.
/*
Constructor - constructs the heart rate value from a BLE message
*/
HeartRate(List<int> values) {
var flags = values[0];
var s = flags.toRadixString(2);
List<String> flagsArray = s.split("");
int offset = 0;
//Determine whether it is U16 or not
if (flagsArray[0] == "0") {
//Since it is Uint8 i will only get the first value
var hr = values[1];
print(hr);
} else {
//Since UTF 16 is two bytes I need to combine them
//Create a buffer with the first two bytes after the flags
var buffer = new Uint8List.fromList(values.sublist(1, 3)).buffer;
var hrBuffer = new ByteData.view(buffer);
var hr = hrBuffer.getUint16(0, Endian.little);
print(hr);
}
}
Your updated data looks much better. Here's how to decode it, and the process you'd use to figure this out yourself from scratch.
Determine the format
The Bluetooth site has been reorganized recently (~2020), and in particular they got rid of some of the document viewers, which makes things much harder to find and read IMO. All the documentation is in the Heart Rate Service (HRS) document, linked from the main GATT page, but for just parsing the format, the best source I know of is the XML for org.bluetooth.characteristic.heart_rate_measurement. (Since the reorganization, I don't know how you can find this page without searching for it. It doesn't seem to be linked anymore.)
Byte 0 - Flags: 22 (0001 0110)
Bits are numbered from LSB (0) to MSB (7).
Bit 0 - Heart Rate Value Format: 0 => UINT8 beats per minute
Bit 1-2 - Sensor Contact Status: 11 => Supported and detected
Bit 3 - Energy Expended Status: 0 => Not present
Bit 4 - RR-Interval: 1 => One or more values are present
The meaning of RR-intervals is explained in the HRS document, linked above. It sounds like you just want the heart rate value, so I won't go into them here.
Byte 1 - UINT8 BPM: 56
Since Bit 0 of flags was 0, this is the beats per minute. 56.
Bytes 2-5 - UINT16 RR Intervals: 55, 4, 7, 3
You probably don't care about these, but there are two UINT16 values here (there can be an arbitrary number of RR-Interval values). BLE is always little-endian, so [55, 4] is 1,079 (55 + 4<<8), and [7, 3] is 775 (7 + 3<<8).
I believe the docs are a little confusing on this one. The XML suggests that these values are in seconds, but the comments say "Resolution of 1/1024 second." The normal way to express this would be <BinaryExponent>-10</BinaryExponent>, and I'm certain that's what they meant. So these would be:
RR0: 1.05s (1079/1024)
RR1: 0.76s (775/1024)

Convert 24-bit ADC serial read data from 3-byte format to signed integer (int32) in Matlab

I am receiving EEG data from a 24 bit ADC over serial. The ADC data is transmitting in 3 bytes from MSB to LSB. The full packet is 21 bytes:
The first byte is the start byte - 0xFF (255 in decimal)
Then packet number byte.
Then the next 3 bytes are the 24 bit ADC value broken into MSB LSB2 LSB1
I can parse the data fine, but re-constructing a 2's complement signed int32 number is causing issues. The values I am getting out certainly don't reflect what the ADC should be giving out.
Below are the lines to read and parse the 504 samples (which gives me 24 ADC values (504samples/21bytes = 24 values)). I have tried uint8 instead of uchar with similar results (when I try int8 I get a invalid specified precision error).
comEEGSMT = serial(com,'BaudRate',3000000);
fopen(comEEGSMT);
rawData(1:504) = fread(comEEGSMT, 504, 'uchar');
fclose(comEEGSMT);
startPackets = find(rawData == 255);
bytes = rawData([startpackets+2 startpackets+3 startpackets+4]);
I have tried the following method to reconstruct the value:
ADC_value = bytes(:,1)*256^2 + bytes(:,2)*256 + bytes(:,3);
and the following line is the formula to convert the above number to volts:
ADC_value_volts = ADC_value*(5/3)*(1/(2^32));
The values are in the range of 4000 - 8000 microvolts with large jumps in value. The values SHOULD be in the range of 200 - 600 microvolts with small changes.
I have found other questions relating to similar issues, but have had no success trying the proposed solutions such as in the link below:
https://uk.mathworks.com/matlabcentral/answers/137965-concatenate-3-bytes-array-of-real-time-serial-data-into-single-precision
Any help would be very much appreciated as I've been stuck on this for quite long.
Thanks Mark
Starting with ADC_value as int32 with value 0, then:
ADC_value |= MSB << 16;
ADC_value |= LSB2 << 8;
ADC_value |= LSB1;
And then, to find out the corresponding volts value, supposing your ADC has a reference voltage VREF, in volts (e.g. 5.0V):
ADC_value_volts = (ADC_value * VREF)/2^24
since your converter is 24 bits, not 32.
Note the above expressions are in C language equivalent, not Matlab.
EDIT:
The ADC data sheet tell us the PGA gain can be set for the following values:
1, 2, 4, 6, 8, 12, 24, one value at the time for each channel.
The FSR (full scale range) of measurement is: (2*VREF)/Gain = 5/3, for Gain=6,
(eq.(5) page 23) so this must be accounted for in expression computing the volts
values. (these can be verified if you have access to the hardware and can make some
measurements).
Data resulted from ADC is already in two's complement, binary form, 24 bits.
The weird thing is the data sheet counts bits starting with 1, not 0, so this
is why shifting with "17" instead 16 - this is in fact 16 for coding.
(revealed in fig 47, page 42).
So the computing formula of ADC_value_volts should be:
ADC_value_volts = (AC_value * FSR/(2^23))/3 (1LSB=FSR/(2^23), pg.37)
If some other calculations/modifications from original, them these must be explained by provider.
If the provider is not friendly, worth to be changed...

How to store data larger than 128 byte in JavaCard

I can't write data at index above 128 in byte array.
code is given below.
private void Write1(APDU apdu) throws ISOException
{
apdu.setIncomingAndReceive();
byte[] apduBuffer = apdu.getBuffer();
byte j = (byte)apduBuffer[4]; // Return incoming bytes lets take 160
Buffer1 = new byte[j]; // initialize a array with size 160
for (byte i=0; i<j; i++)
Buffer1[(byte)i] = (byte)apduBuffer[5+i];
}
It gives me error 6F 00 (It means reach End Of file).
I am using:
smart card type = contact card
using java card 2.2.2 with jcop using apdu
Your code contains several problems:
As already pointed out by 'pst' you are using a signed byte value which works only up to 128 - use a short instead
Your are creating a new buffer Buffer1 on every call of your Write1 method. On JavaCard there is usually no automatic garbage collection - therefore memory allocation should only be done once when the app is installed. If you only want to process the data in the adpu buffer just use it from there. And if you want to copy data from one byte array into another better use javacard.framework.Util.arrayCopy(..).
You are calling apdu.setIncomingAndReceive(); but ignore the return value. The return value gives you the number of bytes of data you can read.
The following code is from the API docs and shows the common way:
short bytesLeft = (short) (buffer[ISO7816.OFFSET_LC] & 0x00FF);
if (bytesLeft < (short)55) ISOException.throwIt( ISO7816.SW_WRONG_LENGTH );
short readCount = apdu.setIncomingAndReceive();
while ( bytesLeft > 0){
// process bytes in buffer[5] to buffer[readCount+4];
bytesLeft -= readCount;
readCount = apdu.receiveBytes ( ISO7816.OFFSET_CDATA );
}
short j = (short) apdu_buffer[ISO7816.OFFSET_LC] & 0xFF
Elaborating on pst's answer. A byte has 2^8 bits numbers, or rather 256. But if you are working with signed numbers, they will work in a cycle instead. So, 128 will be actually -128, 129 will be -127 and so on.
Update: While the following answer is "valid" for normal Java, please refer to Roberts answer for Java Card-specific information, as well additional concerns/approaches.
In Java a byte has values in the range [-128, 127] so, when you say "160", that's not what the code is really giving you :)
Perhaps you'd like to use:
int j = apduBuffer[4] & 0xFF;
That "upcasts" the value apduBuffer[4] to an int while treating the original byte data as an unsigned value.
Likewise, i should also be an int (to avoid a nasty overflow-and-loop-forever bug), and the System.arraycopy method could be handy as well...
(I have no idea if that is the only/real problem -- or if the above is a viable solution on a Java Card -- but it sure is a problem and aligns with the "128 limit" mentioned.)
Happy coding.

Creating and loading .pngs in RGBA4444 RGBA5551 for openGL

I'm creating an openGL game and so far I have been using .pngs in the RGBA8888 format as texture sheets, but those are too memory hungry, and my app crashes frequently. I read in Apple's site that such format such be used just when too much quality is needed, and recommends to use RGBA4444 and RGBA5551 instead ( I already converted my textures to PVR but the quality loss is too great in most of the sprite sheets).
I only need to use GL_UNSIGNED_SHORT_5_5_5_1 or GL_UNSIGNED_SHORT_4_4_4_4 in my glTexImage2D call inside my texture loader class in order to load my textures, but I need to convert my texture sheets to RGBA4444 and RGBA5551, and I'm clueless about how could I achieve this.
Seriously? There are libraries to do this kind of conversion. But frankly, this is a bit of bit twiddling. There are libraries that use asm, or specialized SSE commands to accellerate this which will be fast, but its pretty easy to roll your own format converter in C/C++.
Your basic process would be:
Given a buffer of RGBA8888 encoded values
Create a buffer big enough to hold the RGBA4444 or RGBA5551 values. In this case, its simple - half the size.
Loop over the source buffer, unpacking each component, and repacking into the destination format, and write it into the destination buffer.
void* rgba8888_to_rgba4444(
void* src, // IN, pointer to source buffer
int cb) // IN size of source buffer, in bytes
{
// this code assumes that a long is 4 bytes and short is 2.
//on some compilers this isnt true
int i;
// compute the actual number of pixel elements in the buffer.
int cpel = cb/4;
unsigned long* psrc = (unsigned long*)src;
// create the RGBA4444 buffer
unsigned short* pdst = (unsigned short*)malloc(cpel*2);
// convert every pixel
for(i=0;i<cpel; i++)
{
// read a source pixel
unsigned pel = psrc[i];
// unpack the source data as 8 bit values
unsigned r = p & 0xff;
unsigned g = (pel >> 8) & 0xff;
unsigned b = (pel >> 16) & 0xff;
unsigned a = (pel >> 24) & 0xff;
//convert to 4 bit vales
r >>= 4;
g >>= 4;
b >>= 4;
a >>= 4;
// and store
pdst[i] = r | g << 4 | b << 8 | a << 12;
}
return pdst;
}
The actual conversion loop I did very wastefully, the components can be extracted, converted and repacked in a single pass, making for far faster code. I did it this way to make the conversion explicit, and easy to change. Also, im not sure that I got the component order the right way around. So it might be b, r, g, a, but it shouldn't effect the result of the function as it repackes in the same order into the dest buffer.
Using ImageMagick you can create RGBA4444 PNG files by running:
convert source.png -depth 4 destination.png
You can get ImageMagick from MacPorts.
You may consider using Imagination's PVRTexTool for Windows. It's specifically for creating PVR textures in every supported color format. It can create both PVRTC compressed textures (what you call "PVR") as well as uncompressed textures in 8888, 5551, 4444, etc.
However, it doesn't output PNGs (only PVRs) so your loading code would have change. Also, sometimes PVRs are much larger than PNGs because the pixels in PNGs are compressed with deflate compression.
Since you're most likely running OS X, you can use Darwine (now WineBottler) to run it (and other windows programs) on OS X.
You'll need to register as an Imagination developer before you can download PVRTexTool. Registration and the tool are both free.
Once you set it up, it's pretty painless and it gives you a decent GUI for working with PVRs.
You might also want to look how to optimize RGBA8888 for conversion to RGBA4444 using floyd-steinberg dithering in GIMP: http://www.youtube.com/watch?v=v1xGYecsnX0
You could also use http://www.texturepacker.com for conversion.
Here is an optimized in-place conversion of Chris' code which should run 2x as fast but is not as strait forward. The in-place conversion helps to avoid crashes by lowering the memory spike. Just thought I'd share in case anyone was planning on using this code. I've tested it and it works great:
void* rgba8888_to_rgba4444( void* src, // IN, pointer to source buffer
int cb) // IN size of source buffer, in bytes
{
int i;
// compute the actual number of pixel elements in the buffer.
int cpel = cb/4;
unsigned long* psrc = (unsigned long*)src;
unsigned short* pdst = (unsigned short*)src;
// convert every pixel
for(i=0;i<cpel; i++)
{
// read a source pixel
unsigned pel = psrc[i];
// unpack the source data as 8 bit values
unsigned r = (pel << 8) & 0xf000;
unsigned g = (pel >> 4) & 0x0f00;
unsigned b = (pel >> 16) & 0x00f0;
unsigned a = (pel >> 28) & 0x000f;
// and store
pdst[i] = r | g | b | a;
}
return pdst;
}