I'm just starting to explore Cairo, but right now I really want to use it for something very simple.
I have a very low-tech bitmap, i.e., a 3*X*Y array of numbers. I'd like to use Cairo to make this into a bitmap and write to a file. I'm looking through tutorials and I'm not seeing a way to use it for comparatively low-level functions like this.
I don't think I need guidance on how to use the tool once I know what the tool is.
I didn't actually test this, but the following should give you lots of useful hints:
#include <cairo.h>
#include <stdint.h>
#define WIDTH 42
#define HEIGHT 42
uint8_t data[WIDTH][HEIGHT][3];
cairo_surface_t* convert()
{
cairo_surface_t *result;
unsigned char *current_row;
int stride;
result = cairo_image_surface_create(CAIRO_FORMAT_RGB24, WIDTH, HEIGHT);
if (cairo_surface_status(result) != CAIRO_STATUS_SUCCESS)
return result;
cairo_surface_flush(result);
current_row = cairo_image_surface_get_data(result);
stride = cairo_image_surface_get_stride(result);
for (int y = 0; y < HEIGHT; y++) {
uint32_t *row = (void *) current_row;
for (int x = 0; x < WIDTH; x++) {
uint32_t r = data[x][y][0];
uint32_t g = data[x][y][1];
uint32_t b = data[x][y][2];
row[x] = (r << 16) | (g << 8) | b;
}
current_row += stride;
}
cairo_surface_mark_dirty(result);
return result;
}
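The packing step in the loop above can be tried without Cairo at all. The sketch below assumes the CAIRO_FORMAT_RGB24 layout (one native-endian 32-bit word per pixel, upper 8 bits unused). The names pack_rgb24 and fill_rgb24 are hypothetical, and the source buffer is assumed to be row-major (unlike the data[x][y] array above). In real code the stride should come from cairo_image_surface_get_stride.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Pack one pixel the way CAIRO_FORMAT_RGB24 stores it:
 * a native-endian 32-bit word, upper 8 bits unused, then R, G, B. */
static uint32_t pack_rgb24(uint8_t r, uint8_t g, uint8_t b)
{
    return ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
}

/* Hypothetical helper: copy a tightly packed, row-major RGB buffer
 * (src[(y * width + x) * 3 + channel]) into a pixel buffer whose
 * rows are `stride` bytes apart. */
static void fill_rgb24(unsigned char *dst, int stride,
                       const uint8_t *src, int width, int height)
{
    assert(stride >= width * 4);
    for (int y = 0; y < height; y++) {
        unsigned char *row = dst + (size_t)y * (size_t)stride;
        for (int x = 0; x < width; x++) {
            const uint8_t *p = src + ((size_t)y * (size_t)width + (size_t)x) * 3;
            uint32_t pixel = pack_rgb24(p[0], p[1], p[2]);
            /* memcpy avoids any alignment assumption about dst. */
            memcpy(row + (size_t)x * 4, &pixel, sizeof pixel);
        }
    }
}
```

Once the surface data is filled and marked dirty, cairo_surface_write_to_png(surface, "out.png") should be enough to write it to a file.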
I have two variable bit-shifting code fragments that I want to SSE-vectorize by some means:
1) a = 1 << b (where b = 0..7 exactly), i.e. 0/1/2/3/4/5/6/7 -> 1/2/4/8/16/32/64/128
2) a = 1 << (8 * b) (where b = 0..7 exactly), i.e. 0/1/2/3/4/5/6/7 -> 1/0x100/0x10000/etc
OK, I know that AMD's XOP VPSHLQ would do this, as would AVX2's VPSLLVQ. But my challenge here is whether this can be achieved on 'normal' (i.e. up to SSE4.2) SSE.
So, is there some funky SSE-family opcode sequence that will achieve the effect of either of these code fragments? These only need yield the listed output values for the specific input values (0-7).
Update: here's my attempt at 1), based on Peter Cordes' suggestion of using the floating point exponent to do simple variable bitshifting:
#include <stdint.h>
typedef union
{
int32_t i;
float f;
} uSpec;
void do_pow2(uint64_t *in_array, uint64_t *out_array, int num_loops)
{
uSpec u;
for (int i=0; i<num_loops; i++)
{
int32_t x = (int32_t)in_array[i]; /* low 32 bits, without the aliasing pointer cast */
u.i = (127 + x) << 23;
int32_t r = (int32_t) u.f;
out_array[i] = r;
}
}
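To spell out why the exponent trick works: an IEEE 754 single with a zero mantissa and a biased exponent of 127 + b has the value 2^b, so building that bit pattern and converting it back to an integer yields 1 << b. The scalar sketch below (pow2_via_float is a hypothetical name) uses memcpy for the type pun. Mapping the three steps to the SSE2 instructions PADDD, PSLLD and CVTTPS2DQ is my assumption about how one would vectorize it, not tested SIMD code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compute 1 << b for b in 0..7 through the float exponent field.
 * Assumes float is an IEEE 754 binary32. */
static uint32_t pow2_via_float(uint32_t b)
{
    assert(b <= 7);
    uint32_t bits = (127u + b) << 23; /* add the bias, shift into the exponent */
    float f;
    memcpy(&f, &bits, sizeof f);      /* reinterpret the bits as a float: 2^b */
    return (uint32_t)f;               /* exact, since 2^b is an integer */
}
```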
I'm following this example but I'm not sure what I missed. Specifically, I have this struct in MATLAB:
a = struct; a.one = 1.0; a.two = 2.0; a.three = 3.0; a.four = 4.0;
And this is my test code in MEX.
First, I wanted to make sure that I'm passing in the right thing, so I did this check:
int nfields = mxGetNumberOfFields(prhs[0]);
mexPrintf("nfields =%i \n\n", nfields);
And it does yield 4, since I have four fields.
However, when I tried to extract the value in field three:
tmp = mxGetField(prhs[0], 0, "three");
mexPrintf("data =%f \n\n", (double *)mxGetData(tmp) );
It returns data =1.000000. I'm not sure what I did wrong. My logic is that I want to get the first element (hence index is 0) of the field three, so I expected data =3.00000.
Can I get a pointer or a hint?
EDITED
Ok, since you didn't provide your full code but you are working on a test, let's try to make a new one from scratch.
On the MATLAB side, use the following code:
a.one = 1;
a.two = 2;
a.three = 3;
a.four = 4;
read_struct(a);
Now, create and compile the MEX read_struct function as follows (the file is read_struct.c; the entry point of a MEX file must be named mexFunction):
#include "mex.h"
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
if (nrhs != 1)
mexErrMsgTxt("One input argument required.");
/* Let's check if the input is a struct... */
if (!mxIsStruct(prhs[0]))
mexErrMsgTxt("The input must be a structure.");
mwSize ne = mxGetNumberOfElements(prhs[0]);
int nf = mxGetNumberOfFields(prhs[0]);
mexPrintf("The structure contains %d elements and %d fields.\n", (int)ne, nf);
int i; /* field number */
mwIndex j; /* element index */
mxArray *mxValue;
double *value;
for (i = 0; i < nf; ++i)
{
for (j = 0; j < ne; ++j)
{
mxValue = mxGetFieldByNumber(prhs[0], j, i);
/* Assumes every field holds a non-empty real double array. */
value = mxGetPr(mxValue);
mexPrintf("Field %s(%d) = %.1f\n", mxGetFieldNameByNumber(prhs[0], i), (int)j, value[0]);
}
}
return;
}
Does this correctly print your structure?
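For reference, the likely cause of the data =1.000000 output in the question is that the mexPrintf call passes the pointer returned by mxGetData to %f instead of the double it points to. A minimal, untested sketch of the fix, assuming field three holds a real double scalar:

```c
#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    const mxArray *tmp;

    if (nrhs != 1 || !mxIsStruct(prhs[0]))
        mexErrMsgTxt("One struct input required.");

    /* Element 0 of the struct array, field "three". */
    tmp = mxGetField(prhs[0], 0, "three");
    if (tmp == NULL || !mxIsDouble(tmp) || mxIsEmpty(tmp))
        mexErrMsgTxt("Field 'three' is missing or is not a double.");

    /* Dereference the data pointer ... */
    mexPrintf("data = %f\n", *(double *)mxGetData(tmp));
    /* ... or let mxGetScalar return the first element as a double. */
    mexPrintf("data = %f\n", mxGetScalar(tmp));
}
```

Both calls should print 3.000000 for the struct in the question.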
1) How can I access in forEach_root() other elements except for the current one?
In OpenCL we have a pointer to the first element and can use get_global_id(0) to get the current index, but we can still access all the other elements. In Renderscript, do we only have a pointer to the current element?
2) How can I loop through an Allocation in forEach_root()?
I have code that uses a nested (double) loop in Java. Renderscript automates the outer loop, but I can't find any information on implementing the inner loop. Below is my best effort:
void root(const float3 *v_in, float3 *v_out) {
rs_allocation alloc = rsGetAllocation(v_in);
uint32_t cnt = rsAllocationGetDimX(alloc);
*v_out = 0;
for(int i=0; i<cnt; i++)
*v_out += v_in[i];
}
But here rsGetAllocation() fails when called from forEach_root().
05-11 21:31:29.639: E/RenderScript(17032): ScriptC::ptrToAllocation, failed to find 0x5beb1a40
Just in case, here is my OpenCL code, which works great under Windows. I'm trying to port it to Renderscript:
typedef float4 wType;
__kernel void gravity_kernel(__global wType *src,__global wType *dst)
{
int id = get_global_id(0);
int count = get_global_size(0);
double4 tmp = 0;
for(int i=0;i<count;i++) {
float4 diff = src[i] - src[id];
float sq_dist = dot(diff, diff);
float4 norm = normalize(diff);
if (sq_dist<0.5/60)
tmp += convert_double4(norm*sq_dist);
else
tmp += convert_double4(norm/sq_dist);
}
dst[id] = convert_float4(tmp);
}
You can pass data to the script separately from the root function's arguments. In the current Android version (4.2) you could do the following (an example from an image-processing scenario):
Renderscript snippet:
#pragma version(1)
#pragma rs java_package_name(com.example.renderscripttests)
//Define global variables in your renderscript:
rs_allocation pixels;
int width;
int height;
// And access these in your root function via rsGetElementAt(pixels, px, py)
void root(uchar4 *v_out, uint32_t x, uint32_t y)
{
for(int px = 0; px < width; ++px)
for(int py = 0; py < height; ++py)
{
// unpack a color to a float4
float4 f4 = rsUnpackColor8888(*(uchar4*)rsGetElementAt(pixels, px, py));
...
Java file snippet
// In your java file, create a renderscript:
RenderScript renderscript = RenderScript.create(this);
ScriptC_myscript script = new ScriptC_myscript(renderscript);
// Create Allocations for in- and output (As input the bitmap 'bitmapIn' should be used):
Allocation pixelsIn = Allocation.createFromBitmap(renderscript, bitmapIn,
Allocation.MipmapControl.MIPMAP_NONE, Allocation.USAGE_SCRIPT);
Allocation pixelsOut = Allocation.createTyped(renderscript, pixelsIn.getType());
// Set width, height and pixels in the script:
script.set_width(640);
script.set_height(480);
script.set_pixels(pixelsIn);
// Call the for each loop:
script.forEach_root(pixelsOut);
// Copy Allocation to the bitmap 'bitmapOut':
pixelsOut.copyTo(bitmapOut);
As you can see, the input 'pixelsIn' is set on the script beforehand and then read inside the Renderscript code when forEach_root is called to calculate the values for 'pixelsOut'. The width and height are likewise set beforehand.
I have three CGLayers whose data I'd like to compare.
void *a = CGBitmapContextGetData(CGLayerGetContext(layerA));
void *b = CGBitmapContextGetData(CGLayerGetContext(layerB));
void *c = CGBitmapContextGetData(CGLayerGetContext(layerC));
I'd like to get a result like ((a OR b) AND c) where only bits that are on in layerA or layerB and also on in layerC end up in the result. These layers are kCGImageAlphaOnly so they are only 8 bits "deep", and I've only drawn into them with 1.0 alpha. I also don't need to know where the overlap lies, I just need to know whether there are any bits on in the result.
I'm really missing QuickDraw today, it had plenty of bit-oriented operations that were very speedy. Any thoughts on how to accomplish something like this?
Here's a naive implementation, assuming all three are the same size:
unsigned char *a = CGBitmapContextGetData(CGLayerGetContext(layerA));
unsigned char *b = CGBitmapContextGetData(CGLayerGetContext(layerB));
CGContextRef context = CGLayerGetContext(layerC);
unsigned char *c = CGBitmapContextGetData(context);
size_t bytesPerRow = CGBitmapContextGetBytesPerRow(context);
size_t height = CGBitmapContextGetHeight(context);
size_t len = bytesPerRow * height;
BOOL bitsFound = NO;
for (size_t i = 0; i < len; i++) {
if ((a[i] | b[i]) & c[i]) { bitsFound = YES; break; }
}
Since you're hankering for QuickDraw, I assume you could have written that yourself, and you know that will probably be slow.
If you can guarantee the bitmap sizes, you could use int instead of char and operate on four bytes at a time.
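A rough sketch of that idea, written in plain C so it can be tried outside Core Graphics. The function name any_overlap is hypothetical; it takes the three byte buffers and their common length, handles four bytes per iteration through memcpy (which avoids alignment assumptions), and finishes any leftover bytes one at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return true if any bit is set in ((a OR b) AND c) over len bytes. */
static bool any_overlap(const unsigned char *a, const unsigned char *b,
                        const unsigned char *c, size_t len)
{
    size_t i = 0;

    assert(a != NULL && b != NULL && c != NULL);

    /* Four bytes at a time. */
    for (; i + sizeof(uint32_t) <= len; i += sizeof(uint32_t)) {
        uint32_t wa, wb, wc;
        memcpy(&wa, a + i, sizeof wa);
        memcpy(&wb, b + i, sizeof wb);
        memcpy(&wc, c + i, sizeof wc);
        if ((wa | wb) & wc)
            return true;
    }

    /* Remaining tail bytes, if len is not a multiple of four. */
    for (; i < len; i++) {
        if ((a[i] | b[i]) & c[i])
            return true;
    }
    return false;
}
```

With the buffers from the snippet above, the call would be any_overlap(a, b, c, len).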
For more serious optimization, you should check out the Accelerate framework.
What about the CGBlendModes? kCGBlendModeDestinationOver acts as OR for A and B, and then you can use kCGBlendModeDestinationIn to AND that result with C.
I have used the following code to convert a bigint (decimal) to a byte array (raw data), but I'm getting the wrong result.
What is the mistake here?
I'm trying this on an Apple Mac (for an iPhone app).
COMP_BYTE_SIZE is 4.
Is there a big-endian/little-endian issue? Please help.
void bi_export(BI_CTX *ctx, bigint *x, uint8_t *data, int size)
{
int i, j, k = size-1;
check(x);
memset(data, 0, size); /* ensure all leading 0's are cleared */
for (i = 0; i < x->size; i++)
{
for (j = 0; j < COMP_BYTE_SIZE; j++)
{
comp mask = 0xff << (j*8);
int num = (x->comps[i] & mask) >> (j*8);
data[k--] = num;
if (k < 0)
{
break;
}
}
}
}
Thanks.
Is the size argument at least x->size*4, i.e., is the target array big enough? Also cast the mask constant to comp before shifting:
comp mask = (comp)0xff << (j*8);
num should also be cast to uint8_t before it is copied:
data[k--] = (uint8_t)num;
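Putting those suggestions together, a self-contained sketch of the export loop might look like the following. The bigint_sketch struct and the export_big_endian name are hypothetical stand-ins for the library's types, and comp is assumed to be a 32-bit unsigned integer. It writes the least significant byte last (big-endian output) and stops both loops once the buffer is full; note that the break in the question only leaves the inner loop, so the outer loop can keep writing with k < 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t comp;      /* assumption: 32-bit components */
#define COMP_BYTE_SIZE 4

/* Hypothetical stand-in for the library's bigint type. */
typedef struct {
    int size;               /* number of components in use */
    const comp *comps;      /* least significant component first */
} bigint_sketch;

/* Write x into data[0..size-1] as big-endian bytes, zero-padded on the left. */
static void export_big_endian(const bigint_sketch *x, uint8_t *data, int size)
{
    int k = size - 1;

    assert(x != NULL && data != NULL && size > 0);
    memset(data, 0, (size_t)size);  /* clear leading zeros */

    /* Both loops test k, so neither keeps writing once the buffer is full. */
    for (int i = 0; i < x->size && k >= 0; i++) {
        for (int j = 0; j < COMP_BYTE_SIZE && k >= 0; j++) {
            data[k--] = (uint8_t)(x->comps[i] >> (j * 8));
        }
    }
}
```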