I have three CGLayers whose data I'd like to compare.
void *a = CGBitmapContextGetData(CGLayerGetContext(layerA));
void *b = CGBitmapContextGetData(CGLayerGetContext(layerB));
void *c = CGBitmapContextGetData(CGLayerGetContext(layerC));
I'd like to get a result like ((a OR b) AND c) where only bits that are on in layerA or layerB and also on in layerC end up in the result. These layers are kCGImageAlphaOnly so they are only 8 bits "deep", and I've only drawn into them with 1.0 alpha. I also don't need to know where the overlap lies, I just need to know whether there are any bits on in the result.
I'm really missing QuickDraw today, it had plenty of bit-oriented operations that were very speedy. Any thoughts on how to accomplish something like this?
Here's a naive implementation, assuming all three are the same size:
unsigned char *a = CGBitmapContextGetData(CGLayerGetContext(layerA));
unsigned char *b = CGBitmapContextGetData(CGLayerGetContext(layerB));
CGContextRef context = CGLayerGetContext(layerC);
unsigned char *c = CGBitmapContextGetData(context);
size_t bytesPerRow = CGBitmapContextGetBytesPerRow(context);
size_t height = CGBitmapContextGetHeight(context);
size_t len = bytesPerRow * height;
BOOL bitsFound = NO;
for (size_t i = 0; i < len; i++) {
if ((a[i] | b[i]) & c[i]) { bitsFound = YES; break; }
}
Since you're hankering for QuickDraw, I assume you could have written that yourself, and you know that will probably be slow.
If you can guarantee the bitmap sizes, you could use int instead of char and operate on four bytes at a time.
For more serious optimization, you should check out the Accelerate framework.
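As a rough sketch of the "several bytes at a time" idea in plain C: the helper name `any_overlap` is made up for illustration, and it assumes all three buffers are at least `len` bytes long. It reads eight bytes per step through `memcpy`, so it does not depend on the buffers being aligned, and falls back to single bytes for the tail.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if any bit of ((a | b) & c) is set across len bytes, else 0.
   Hypothetical helper; a, b and c must each point to at least len bytes. */
int any_overlap(const unsigned char *a, const unsigned char *b,
                const unsigned char *c, size_t len)
{
    size_t i = 0;

    /* Process eight bytes per iteration. memcpy sidesteps alignment and
       aliasing problems; compilers typically turn it into a plain load. */
    for (; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t)) {
        uint64_t wa, wb, wc;
        memcpy(&wa, a + i, sizeof wa);
        memcpy(&wb, b + i, sizeof wb);
        memcpy(&wc, c + i, sizeof wc);
        if ((wa | wb) & wc)
            return 1;
    }

    /* Handle the remaining 0..7 bytes one at a time. */
    for (; i < len; i++) {
        if ((a[i] | b[i]) & c[i])
            return 1;
    }
    return 0;
}
```

With the variables from the naive version, the call would be `any_overlap(a, b, c, len)`. Since only a yes/no answer is needed, returning on the first hit keeps the common case cheap.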
What about the CGBlendModes? kCGBlendModeDestinationOver acts as OR for A and B, and then you can use kCGBlendModeDestinationIn to AND that result with C.
I'm just starting to explore Cairo, but right now I really want to use it for something very simple.
I have a very low-tech bitmap, i.e., a 3*X*Y array of numbers. I'd like to use Cairo to make this into a bitmap and write to a file. I'm looking through tutorials and I'm not seeing a way to use it for comparatively low-level functions like this.
I don't think I need guidance on how to use the tool once I know what the tool is.
I didn't actually test this, but the following should give you lots of useful hints:
#include <cairo.h>
#include <stdint.h>
#define WIDTH 42
#define HEIGHT 42
uint8_t data[WIDTH][HEIGHT][3];
cairo_surface_t* convert()
{
cairo_surface_t *result;
unsigned char *current_row;
int stride;
result = cairo_image_surface_create(CAIRO_FORMAT_RGB24, WIDTH, HEIGHT);
if (cairo_surface_status(result) != CAIRO_STATUS_SUCCESS)
return result;
cairo_surface_flush(result);
current_row = cairo_image_surface_get_data(result);
stride = cairo_image_surface_get_stride(result);
for (int y = 0; y < HEIGHT; y++) {
uint32_t *row = (void *) current_row;
for (int x = 0; x < WIDTH; x++) {
uint32_t r = data[x][y][0];
uint32_t g = data[x][y][1];
uint32_t b = data[x][y][2];
row[x] = (r << 16) | (g << 8) | b;
}
current_row += stride;
}
cairo_surface_mark_dirty(result);
return result;
}
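The pixel packing and stride handling in that loop can be tried out without Cairo. The sketch below uses a plain byte buffer in place of the surface data; `pack_rgb24` and `fill_rgb24` are hypothetical names, and the input is assumed to be tightly packed, row-major RGB triples (a different layout from the `data[x][y][c]` array above).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Packs one pixel as 0x00RRGGBB in a native-endian 32-bit word, which is
   the layout CAIRO_FORMAT_RGB24 uses (the upper 8 bits are unused). */
uint32_t pack_rgb24(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* Fills a width x height pixel buffer from RGB triples stored as
   rgb[(y * width + x) * 3 + channel] (an assumed layout).
   stride is the distance between rows in bytes; it may exceed width * 4. */
void fill_rgb24(unsigned char *pixels, int stride,
                const uint8_t *rgb, int width, int height)
{
    for (int y = 0; y < height; y++) {
        unsigned char *row = pixels + (size_t)y * (size_t)stride;
        for (int x = 0; x < width; x++) {
            const uint8_t *src = rgb + ((size_t)y * width + x) * 3;
            uint32_t px = pack_rgb24(src[0], src[1], src[2]);
            /* Padding bytes at the end of each row are left untouched. */
            memcpy(row + (size_t)x * 4, &px, sizeof px);
        }
    }
}
```

The point of the exercise is that rows must be advanced by the stride, not by `width * 4`, because the surface may pad each row.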
I'm having trouble understanding some basics. I want to create an int array through a pointer (a1), copy it into an NSData object (c1), then create a second array (b1) from that NSData and check that a1 and b1 have the same contents.
But I run into two errors:
First, "incorrect checksum for freed object" when I call NSData dataWithBytes:length:
Second, the bytes in a1 and b1 aren't the same.
Could someone tell me why? For the record, I'm using Xcode 5 with ARC.
- (void) testCopyBuffer {
int const bufferSize =4096;
int* a1;
a1 = (int*)malloc(bufferSize);
for (int i=0; i<bufferSize; i++) {
a1[i] = i;
}
NSData *c1 = [NSData dataWithBytes:a1 length:bufferSize];
int* b1;
b1 = malloc(bufferSize);
[c1 getBytes:b1 length:bufferSize];
for (int i=0; i<bufferSize; i++) {
XCTAssertTrue(a1[i]==b1[i], @"They should be the same");
}
}
You don't allocate the right amount of memory.
For bufferSize number of ints, you need to allocate
a1 = malloc(bufferSize * sizeof(int));
And later consequently
NSData *c1 = [NSData dataWithBytes:a1 length:(bufferSize * sizeof(int))];
etc. In your case,
for (int i=0; i<bufferSize; i++) {
a1[i] = i;
}
writes beyond the allocated memory, which can lead to all kinds of undefined
behaviour.
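To make the bytes-versus-elements distinction concrete, here is a plain C sketch of the same round trip. It is only an illustration: an ordinary byte buffer stands in for NSData, and `round_trip_ints` is a made-up name.

```c
#include <stdlib.h>
#include <string.h>

/* Allocates count ints, fills them with 0..count-1, copies them through an
   intermediate byte buffer (standing in for NSData) and compares the result.
   Returns 1 on a match, 0 on a mismatch, -1 if an allocation failed. */
int round_trip_ints(size_t count)
{
    size_t byteLength = count * sizeof(int);   /* size in bytes, not elements */
    int *a1 = malloc(byteLength);
    unsigned char *c1 = malloc(byteLength);
    int *b1 = malloc(byteLength);
    int result = -1;

    if (a1 && c1 && b1) {
        for (size_t i = 0; i < count; i++)
            a1[i] = (int)i;
        memcpy(c1, a1, byteLength);   /* plays the role of dataWithBytes:length: */
        memcpy(b1, c1, byteLength);   /* plays the role of getBytes:length: */
        result = (memcmp(a1, b1, byteLength) == 0);
    }

    free(a1);
    free(c1);
    free(b1);
    return result;
}
```

Every length passed to an allocation or copy is `count * sizeof(int)`, while the fill loop runs `count` times, so nothing is written past the end of the allocations.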
Can anyone help me convert an int to a char array? I have a buffer declared as follows:
char *buffer = NULL;
int lengthOfComponent = -1;
char *obj;
buffer[index]= (char *)&lengthOfComponent;
If I do this, it throws EXC_BAD_ACCESS after execution. How can I store the value of obj in buffer using memcpy?
Of course you cannot write in buffer[index], it is not allocated!
buffer = malloc(sizeof(char) * lengthOfBuffer);
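should do it. After that you can write to the buffer with memcpy or with an assignment. Note that assigning (char *)&lengthOfComponent to buffer[index] stores a truncated pointer value rather than the integer itself; to store the integer, copy its bytes with memcpy.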
buffer[index] = (char *)&lengthOfComponent;
buffer[index] is like dereferencing the pointer. But buffer is not pointing to any valid location. Hence the runtime error.
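Since the question asks about memcpy specifically, here is a small sketch of copying an int's raw bytes into an allocated char buffer and reading them back. `put_int` and `get_int` are hypothetical helper names, and `capacity` is assumed to be the allocated size of the buffer in bytes.

```c
#include <stddef.h>
#include <string.h>

/* Copies the raw bytes of value into buffer starting at index.
   Returns the index just past the written bytes, or 0 if it would not fit. */
size_t put_int(char *buffer, size_t capacity, size_t index, int value)
{
    if (index > capacity || capacity - index < sizeof value)
        return 0;
    memcpy(buffer + index, &value, sizeof value);
    return index + sizeof value;
}

/* Reads an int back from the same position in the buffer. */
int get_int(const char *buffer, size_t index)
{
    int value;
    memcpy(&value, buffer + index, sizeof value);
    return value;
}
```

This stores the value in the machine's native byte order, so it is only suitable when the same kind of machine reads the buffer back.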
The C solution is using snprintf. Try -
int i = 11;
char buffer[10];
snprintf(buffer, sizeof(buffer), "%d", i);
I am currently in the process of building an application that reads in audio from my iPhone's microphone, and then does some processing and visuals. Of course I am starting with the audio stuff first, but am having one minor problem.
I am defining my sampling rate to be 44100 Hz and defining my buffer to hold 4096 samples, which it does. However, when I print this data out and copy it into MATLAB to double-check accuracy, the sample rate I have to use is half of my iPhone-defined rate, or 22050 Hz, for it to be correct.
I think it has something to do with the following code and how it is putting 2 bytes per packet, and when I am looping through the buffer, the buffer is spitting out the whole packet, which my code assumes is a single number. So what I am wondering is how to split up those packets and read them as individual numbers.
- (void)setupAudioFormat {
memset(&dataFormat, 0, sizeof(dataFormat));
dataFormat.mSampleRate = kSampleRate;
dataFormat.mFormatID = kAudioFormatLinearPCM;
dataFormat.mFramesPerPacket = 1;
dataFormat.mChannelsPerFrame = 1;
// dataFormat.mBytesPerFrame = 2;
// dataFormat.mBytesPerPacket = 2;
dataFormat.mBitsPerChannel = 16;
dataFormat.mReserved = 0;
dataFormat.mBytesPerPacket = dataFormat.mBytesPerFrame = (dataFormat.mBitsPerChannel / 8) * dataFormat.mChannelsPerFrame;
dataFormat.mFormatFlags =
kLinearPCMFormatFlagIsSignedInteger |
kLinearPCMFormatFlagIsPacked;
}
If what I described is unclear, please let me know. Thanks!
EDIT
Adding the code that I used to print the data
float *audioFloat = (float *)malloc(numBytes * sizeof(float));
int *temp = (int*)inBuffer->mAudioData;
int i;
float power = pow(2, 31);
for (i = 0;i<numBytes;i++) {
audioFloat[i] = temp[i]/power;
printf("%f ",audioFloat[i]);
}
I found the problem with what I was doing. It was a c pointer issue, and since I have never really programmed in C before, I of course got them wrong.
You cannot treat inBuffer->mAudioData as an array of int: each sample is 16 bits wide, so reading 32 bits at a time fuses two samples into one number. What I did instead was the following:
SInt16 *buffer = (SInt16 *)inBuffer->mAudioData;
This worked out just fine and now my data is of correct length and the data is represented properly.
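For reference, a minimal sketch of reading the buffer as 16-bit samples and normalizing them, written in plain C with `int16_t` standing in for SInt16. The function name `pcm16_to_float` is made up, and the data is assumed to be in the host's byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Converts numBytes bytes of 16-bit signed PCM into floats in [-1, 1).
   out must have room for numBytes / 2 floats. Returns the sample count. */
size_t pcm16_to_float(const void *audioData, size_t numBytes, float *out)
{
    const int16_t *samples = audioData;
    size_t sampleCount = numBytes / sizeof(int16_t);  /* two bytes per sample */

    for (size_t i = 0; i < sampleCount; i++)
        out[i] = samples[i] / 32768.0f;               /* 2^15, not 2^31 */

    return sampleCount;
}
```

Reading the same bytes as 32-bit values yields half as many numbers, which would be consistent with the data only lining up at 22050 Hz.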
I saw your answer. There is also an underlying issue that produces wrong sample values: the two bytes of each sample are swapped because of an endianness mismatch.
-(void)feedSamplesToEngine:(UInt32)audioDataBytesCapacity audioData:(void *)audioData {
int sampleCount = audioDataBytesCapacity / sizeof(SAMPLE_TYPE);
SAMPLE_TYPE *samples = (SAMPLE_TYPE*)audioData;
//SAMPLE_TYPE *sample_le = (SAMPLE_TYPE *)malloc(sizeof(SAMPLE_TYPE)*sampleCount );//for swapping endians
std::string shorts;
double power = pow(2,10);
for(int i = 0; i < sampleCount; i++)
{
SAMPLE_TYPE sample_le = (0xff00 & (samples[i] << 8)) | (0x00ff & (samples[i] >> 8)) ; //Endianess issue
char dataInterim[30];
sprintf(dataInterim,"%f ", sample_le/power); // normalize it.
shorts.append(dataInterim);
}
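The byte swap on its own can be written as a small helper. This is a sketch with a made-up name (`swap16`), using `int16_t` for the sample type. A swap like this is only needed when the byte order of the data differs from the host's, so it is worth checking the stream's format flags before applying it.

```c
#include <stdint.h>

/* Swaps the two bytes of a 16-bit sample. Working on the unsigned bit
   pattern avoids sign-extension surprises when shifting right. */
int16_t swap16(int16_t sample)
{
    uint16_t u = (uint16_t)sample;
    u = (uint16_t)((u << 8) | (u >> 8));
    return (int16_t)u;
}
```

Applying the function twice returns the original value, which makes for an easy sanity check.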
I have two image blocks stored as 1D arrays and have to do the following bitwise AND operation on their elements.
int compare(unsigned char *a, int a_pitch,
unsigned char *b, int b_pitch, int a_lenx, int a_leny)
{
int overlap =0 ;
for(int y=0; y<a_leny; y++)
for(int x=0; x<a_lenx; x++)
{
if(a[x + y * a_pitch] & b[x+y*b_pitch])
overlap++ ;
}
return overlap ;
}
Actually, I have to do this job about 220,000 times, so it becomes very slow on iPhone devices.
How could I accelerate this job on iPhone ?
I heard that NEON could be useful, but I'm not really familiar with it. In addition it seems that NEON doesn't have bitwise AND...
Option 1 - Work in the native width of your platform (it's faster to fetch 32-bits into a register and then do operations on that register than it is to fetch and compare data one byte at a time):
#include <stdint.h>
int compare(unsigned char *a, int a_pitch,
unsigned char *b, int b_pitch, int a_lenx, int a_leny)
{
// Assumes a, b, both pitches and a_lenx are multiples of 4,
// so every row can be read as whole 32-bit words.
int overlap = 0;
uint32_t* a_int = (uint32_t*)a;
uint32_t* b_int = (uint32_t*)b;
// Only the horizontal quantities shrink by 4; the row count stays a_leny.
int a_lenx_int = a_lenx / 4;
int a_pitch_int = a_pitch / 4;
int b_pitch_int = b_pitch / 4;
for(int y=0; y<a_leny; y++)
for(int x=0; x<a_lenx_int; x++)
{
uint32_t aVal = a_int[x + y * a_pitch_int];
uint32_t bVal = b_int[x+y*b_pitch_int];
// Test each of the four byte lanes separately.
if ((aVal & 0xFF) & (bVal & 0xFF))
overlap++;
if (((aVal >> 8) & 0xFF) & ((bVal >> 8) & 0xFF))
overlap++;
if (((aVal >> 16) & 0xFF) & ((bVal >> 16) & 0xFF))
overlap++;
if (((aVal >> 24) & 0xFF) & ((bVal >> 24) & 0xFF))
overlap++;
}
return overlap ;
}
Option 2 - Use a heuristic to get an approximate result using fewer calculations (a good approach if the absolute difference between 101 overlaps and 100 overlaps is not important to your application):
int compare(unsigned char *a, int a_pitch,
unsigned char *b, int b_pitch, int a_lenx, int a_leny)
{
int overlap =0 ;
for(int y=0; y<a_leny; y+= 10)
for(int x=0; x<a_lenx; x+= 10)
{
//we compare 1% of all the pixels, and use that as the result
if(a[x + y * a_pitch] & b[x+y*b_pitch])
overlap++ ;
}
return overlap * 100;
}
Option 3 - Rewrite your function in inline assembly code. You're on your own for this one.
Your code is Rambo for the CPU - its worst nightmare :
byte access. Like aroth mentioned, ARM is VERY slow reading bytes from memory
random access. Two absolutely unnecessary multiply/add operations in addition to the already steep performance penalty by its nature.
Simply put, everything is wrong that can be wrong.
Don't call me rude. Let me be your angel instead.
First, I'll provide you a working NEON version. Then an optimized C version showing you exactly what you did wrong.
Just give me some time. I have to go to bed right now, and I have an important meeting tomorrow.
Why don't you learn ARM assembly? It's much easier and useful than x86 assembly.
It will also improve your C programming capabilities by a huge step.
Strongly recommended
cya
==============================================================================
Ok, here is an optimized version written in C with ARM assembly in mind.
Please note that both the pitches AND a_lenx have to be multiples of 4. Otherwise, it won't work properly.
There isn't much room left for optimizations with ARM assembly upon this version. (NEON is a different story - coming soon)
Take a careful look at how to handle variable declarations, loop, memory access, and AND operations.
And make sure that this function runs in ARM mode and not Thumb for best results.
unsigned int compare(unsigned int *a, unsigned int a_pitch,
unsigned int *b, unsigned int b_pitch, unsigned int a_lenx, unsigned int a_leny)
{
unsigned int overlap =0;
unsigned int a_gap = (a_pitch - a_lenx)>>2;
unsigned int b_gap = (b_pitch - a_lenx)>>2;
unsigned int aval, bval, xcount;
do
{
xcount = (a_lenx>>2);
do
{
aval = *a++;
// ldr aval, [a], #4
bval = *b++;
// ldr bval, [b], #4
aval &= bval;
// and aval, aval, bval
if (aval & 0x000000ff) overlap += 1;
// tst aval, #0x000000ff
// addne overlap, overlap, #1
if (aval & 0x0000ff00) overlap += 1;
// tst aval, #0x0000ff00
// addne overlap, overlap, #1
if (aval & 0x00ff0000) overlap += 1;
// tst aval, #0x00ff0000
// addne overlap, overlap, #1
if (aval & 0xff000000) overlap += 1;
// tst aval, #0xff000000
// addne overlap, overlap, #1
} while (--xcount);
a += a_gap;
b += b_gap;
} while (--a_leny);
return overlap;
}
First of all, why the double loop? You can do it with a single loop and a couple of pointers.
Also, you don't need to calculate x+y*pitch for every single pixel; just increment two pointers by one. Incrementing by one is a lot faster than x+y*pitch.
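A sketch of the pointer-increment idea, keeping the original byte-wise semantics; `compare_rows` is a made-up name. It still has an outer loop over rows, because collapsing everything into one loop is only valid when both pitches equal a_lenx, but the inner loop no longer computes x + y * pitch.

```c
/* Counts positions where a and b share a set bit, walking each row with
   two pointers instead of computing x + y * pitch for every pixel. */
int compare_rows(const unsigned char *a, int a_pitch,
                 const unsigned char *b, int b_pitch,
                 int a_lenx, int a_leny)
{
    int overlap = 0;

    for (int y = 0; y < a_leny; y++) {
        const unsigned char *pa = a;
        const unsigned char *pb = b;
        const unsigned char *end = pa + a_lenx;

        while (pa < end) {
            if (*pa++ & *pb++)
                overlap++;
        }

        a += a_pitch;   /* jump to the start of the next row */
        b += b_pitch;
    }
    return overlap;
}
```

A good compiler may already perform this transformation on the original loop, so it is worth measuring before and after.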
Why exactly do you need to perform this operation? I would make sure there are no high-level optimizations/changes available before looking into a low-level solution like NEON.