Newbie X11 double buffering coding help request - double

I botched together the following code to do a simpler rotating radar display.
q1) How can I eliminate the flicker of the line being drawn and then drawn over. Can I use double buffering somehow ?
q2) How can I get Mouse and Keyborad inputs and process/parse them.
My goal is to read angle and distance data from an ultrasonic sensor and display that as a sweep. This data comes from an Arduino with an ultrasonic rangefinder mounted on a servo sweeping backwards and forwards 180 Deg. I also cant quite get the math right. I'd like it to go from 0 to 180, left to right and back depending on the angle data from the Arduino. 0 Deg is 90 Deg off to the right. So I know I have to offset it by 180 Deg but am hopeless at math. The sweep would have to start at 270 Deg and sweep through to 0 Deg.
Any help would be appreciated.
Pls don't jump on me. Just trying to learn.
Here is my code. I used a tutorial as a base and kind of guessed my way through.
CODE:
/*
* simple-drawing.c - demonstrate drawing of pixels, lines, arcs, etc.
* on a window. All drawings are done in black color
* over a white background.
*/
#include <X11/Xlib.h>
#include <stdio.h>
#include <stdlib.h> /* getenv(), etc. */
#include <unistd.h> /* sleep(), etc. */
#include <math.h>
/*
* function: create_simple_window. Creates a window with a white background
* in the given size.
* input: display, size of the window (in pixels), and location of the window
* (in pixels).
* output: the window's ID.
* notes: window is created with a black border, 2 pixels wide.
* the window is automatically mapped after its creation.
*/
Window
create_simple_window(Display* display, int width, int height, int x, int y)
{
int screen_num = DefaultScreen(display);
int win_border_width = 2;
Window win;
/* create a simple window, as a direct child of the screen's */
/* root window. Use the screen's black and white colors as */
/* the foreground and background colors of the window, */
/* respectively. Place the new window's top-left corner at */
/* the given 'x,y' coordinates. */
win = XCreateSimpleWindow(display, RootWindow(display, screen_num),
x, y, width, height, win_border_width,
BlackPixel(display, screen_num),
WhitePixel(display, screen_num));
/* make the window actually appear on the screen. */
XMapWindow(display, win);
/* flush all pending requests to the X server. */
XFlush(display);
return win;
}
GC
create_gc(Display* display, Window win, int reverse_video)
{
GC gc; /* handle of newly created GC. */
unsigned long valuemask = 0; /* which values in 'values' to */
/* check when creating the GC. */
XGCValues values; /* initial values for the GC. */
unsigned int line_width = 2; /* line width for the GC. */
int line_style = LineSolid; /* style for lines drawing and */
int cap_style = CapButt; /* style of the line's edje and */
int join_style = JoinBevel; /* joined lines. */
int screen_num = DefaultScreen(display);
gc = XCreateGC(display, win, valuemask, &values);
if (gc < 0) {
fprintf(stderr, "XCreateGC: \n");
}
/* allocate foreground and background colors for this GC. */
if (reverse_video) {
XSetForeground(display, gc, WhitePixel(display, screen_num));
XSetBackground(display, gc, BlackPixel(display, screen_num));
}
else {
XSetForeground(display, gc, BlackPixel(display, screen_num));
XSetBackground(display, gc, WhitePixel(display, screen_num));
}
/* define the style of lines that will be drawn using this GC. */
XSetLineAttributes(display, gc,
line_width, line_style, cap_style, join_style);
/* define the fill style for the GC. to be 'solid filling'. */
XSetFillStyle(display, gc, FillSolid);
return gc;
}
void
main(int argc, char* argv[])
{
Display* display; /* pointer to X Display structure. */
int screen_num; /* number of screen to place the window on. */
Window win; /* pointer to the newly created window. */
unsigned int display_width,
display_height; /* height and width of the X display. */
unsigned int width, height; /* height and width for the new window. */
char *display_name = getenv("DISPLAY"); /* address of the X display. */
GC gc; /* GC (graphics context) used for drawing */
/* in our window. */
Colormap screen_colormap; /* color map to use for allocating colors. */
XColor black, white;
/* used for allocation of the given color */
/* map entries. */
Status rc; /* return status of various Xlib functions. */
// Johns stuff
int angle=0;
int x1,x2,y1,y2;
int px,py;
// End
/* open connection with the X server. */
display = XOpenDisplay(display_name);
if (display == NULL) {
fprintf(stderr, "%s: cannot connect to X server '%s'\n",
argv[0], display_name);
exit(1);
}
/* get the geometry of the default screen for our display. */
screen_num = DefaultScreen(display);
display_width = DisplayWidth(display, screen_num);
display_height = DisplayHeight(display, screen_num);
/* make the new window occupy 1/9 of the screen's size. */
width = (display_width / 2);
height = (display_height / 2);
printf("window width - '%d'; height - '%d'\n", width, height);
/* create a simple window, as a direct child of the screen's */
/* root window. Use the screen's white color as the background */
/* color of the window. Place the new window's top-left corner */
/* at the given 'x,y' coordinates. */
win = create_simple_window(display, width, height, 50, 50);
/* allocate a new GC (graphics context) for drawing in the window. */
gc = create_gc(display, win, 0);
XSync(display, False);
/* catch expose events */
XSelectInput(display, win, ExposureMask);
/* wait for the expose event */
XEvent ev;
XNextEvent(display, &ev);
/* get access to the screen's color map. */
screen_colormap = DefaultColormap(display, DefaultScreen(display));
/* allocate the set of colors we will want to use for the drawing. */
rc = XAllocNamedColor(display, screen_colormap, "black", &black, &black);
if (rc == 0) {
fprintf(stderr, "XAllocNamedColor - failed to allocated 'black' color.\n");
exit(1);
}
rc = XAllocNamedColor(display, screen_colormap, "white", &white, &white);
if (rc == 0) {
fprintf(stderr, "XAllocNamedColor - failed to allocated 'white' color.\n");
exit(1);
}
/* draw one pixel near each corner of the window */
XDrawPoint(display, win, gc, 5, 5);
XDrawPoint(display, win, gc, 5, height-5);
XDrawPoint(display, win, gc, width-5, 5);
XDrawPoint(display, win, gc, width-5, height-5);
// Draw line.
x1=width / 2;
y1=height / 2;
px=x1;
py=y1;
while(1){
for(angle=0; angle < 360; angle++)
{
x2=x1 + cos((angle * M_PI) / 180) * 150;
y2=y1 + sin((angle * M_PI) / 180) * 150;
XSetForeground(display, gc, black.pixel);
XDrawLine(display, win, gc, x1, y1, x2, y2);
px=x2;
py=y2;
XFlush(display);
usleep(3500);
XSetForeground(display, gc, white.pixel);
XDrawLine(display, win, gc, x1, y1, px ,py);
XFlush(display);
XSetForeground(display, gc, black.pixel);
usleep(3500);
}
}
// End.
/* flush all pending requests to the X server. */
XFlush(display);
XSync(display, False);
/* make a delay for a short period. */
sleep(4);
/* close the connection to the X server. */
XCloseDisplay(display);
}
Simply trying to draw a moving line without flicker. I understand the concept but am a crap coder.
Any help would be appreciated.
Thanks,
John.``

Related

How to scale, crop, and rotate all at once in Android RenderScript

Is it possible to take a camera image in Y'UV format and using RenderScript:
Convert it to RGBA
Crop it to a certain region
Rotate it if necessary
Yes! I figured out how and thought I would share it with others. RenderScript has a bit of a learning curve, and more simple examples seem to help.
When cropping, you still need to set up an input and output allocation as well as one for the script itself. It might seem strange at first, but the input and output allocations have to be the same size so if you are cropping you need to set up yet another Allocation to write the cropped output. More on that in a second.
#pragma version(1)
#pragma rs java_package_name(com.autofrog.chrispvision)
#pragma rs_fp_relaxed
/*
* This is mInputAllocation
*/
rs_allocation gInputFrame;
/*
* This is where we write our cropped image
*/
rs_allocation gOutputFrame;
/*
* These dimensions define the crop region that we want
*/
uint32_t xStart, yStart;
uint32_t outputWidth, outputHeight;
uchar4 __attribute__((kernel)) yuv2rgbFrames(uchar4 in, uint32_t x, uint32_t y)
{
uchar Y = rsGetElementAtYuv_uchar_Y(gInputFrame, x, y);
uchar U = rsGetElementAtYuv_uchar_U(gInputFrame, x, y);
uchar V = rsGetElementAtYuv_uchar_V(gInputFrame, x, y);
uchar4 rgba = rsYuvToRGBA_uchar4(Y, U, V);
/* force the alpha channel to opaque - the conversion doesn't seem to do this */
rgba.a = 0xFF;
uint32_t translated_x = x - xStart;
uint32_t translated_y = y - yStart;
uint32_t x_rotated = outputWidth - translated_y;
uint32_t y_rotated = translated_x;
rsSetElementAt_uchar4(gOutputFrame, rgba, x_rotated, y_rotated);
return rgba;
}
To set up the allocations:
private fun createAllocations(rs: RenderScript) {
/*
* The yuvTypeBuilder is for the input from the camera. It has to be the
* same size as the camera (preview) image
*/
val yuvTypeBuilder = Type.Builder(rs, Element.YUV(rs))
yuvTypeBuilder.setX(mImageSize.width)
yuvTypeBuilder.setY(mImageSize.height)
yuvTypeBuilder.setYuvFormat(ImageFormat.YUV_420_888)
mInputAllocation = Allocation.createTyped(
rs, yuvTypeBuilder.create(),
Allocation.USAGE_IO_INPUT or Allocation.USAGE_SCRIPT)
/*
* The RGB type is also the same size as the input image. Other examples write this as
* an int but I don't see a reason why you wouldn't be more explicit about it to make
* the code more readable.
*/
val rgbType = Type.createXY(rs, Element.RGBA_8888(rs), mImageSize.width, mImageSize.height)
mScriptAllocation = Allocation.createTyped(
rs, rgbType,
Allocation.USAGE_SCRIPT)
mOutputAllocation = Allocation.createTyped(
rs, rgbType,
Allocation.USAGE_IO_OUTPUT or Allocation.USAGE_SCRIPT)
/*
* Finally, set up an allocation to which we will write our cropped image. The
* dimensions of this one are (wantx,wanty)
*/
val rgbCroppedType = Type.createXY(rs, Element.RGBA_8888(rs), wantx, wanty)
mOutputAllocationRGB = Allocation.createTyped(
rs, rgbCroppedType,
Allocation.USAGE_SCRIPT)
}
Finally, since you're cropping you need to tell the script what to do before invocation. If the image sizes don't change you can probably optimize this by moving the LaunchOptions and variable settings so they occur just once (rather than every time) but I'm leaving them here for my example to make it clearer.
override fun onBufferAvailable(a: Allocation) {
// Get the new frame into the input allocation
mInputAllocation!!.ioReceive()
// Run processing pass if we should send a frame
val current = System.currentTimeMillis()
if (current - mLastProcessed >= mFrameEveryMs) {
val lo = Script.LaunchOptions()
/*
* These coordinates are the portion of the original image that we want to
* include. Because we're rotating (in this case) x and y are reversed
* (but still offset from the actual center of each dimension)
*/
lo.setX(starty, endy)
lo.setY(startx, endx)
mScriptHandle.set_xStart(lo.xStart.toLong())
mScriptHandle.set_yStart(lo.yStart.toLong())
mScriptHandle.set_outputWidth(wantx.toLong())
mScriptHandle.set_outputHeight(wanty.toLong())
mScriptHandle.forEach_yuv2rgbFrames(mScriptAllocation, mOutputAllocation, lo)
val output = Bitmap.createBitmap(
wantx, wanty,
Bitmap.Config.ARGB_8888
)
mOutputAllocationRGB!!.copyTo(output)
/* Do something with the resulting bitmap */
listener?.invoke(output)
mLastProcessed = current
}
}
All this might seem like a bit much but it's very fast - way faster than doing the rotation on the java/kotlin side, and thanks to RenderScript's ability to run the kernel function over a subset of the image it's less overhead than creating a bitmap then creating a second, cropped one.
For me, all the rotation is necessary because the image seen by the RenderScript was 90 degrees rotated from the camera. I am told this is some kind of peculiarity of having a Samsung phone.
RenderScript was intimidating at first but once you get used to what it's doing it's not so bad. I hope this is helpful to someone.

pass matlab image to open3d three::Image in a mex script

I am trying to load an image in a mex script and cast it to the corresponding format that the Open3D library uses, i.e. three::Image. I am using the following code:
uint8_t* rgb_image = (uint8_t*) mxGetPr(prhs[3]);
int* dims = (int*) mxGetDimensions(prhs[3]);
int height = dims[0];
int width = dims[2];
int channels = dims[4];
int imsize = height * width;
Image image;
image.PrepareImage(height, width, 3, sizeof(uint8_t)); // parameters: height, width, num_of_channels, bytes_per_channel
memcpy(image.data_.data(), rgb_image, image.data_.size());
The above works well when I give a grayscale image and specify num_of_channels to 1 but not for 3 channel images as you can notice below:
Then I tried to create a function where I am manually looping through the raw data and assigning them to the output image
auto image_ptr = std::make_shared<Image>();
image_ptr->PrepareImage(height, width, channels, sizeof(uint8_t));
for (int i = 0; i < height * width; i++) {
uint8_t *p = (uint8_t *)(image_ptr->data_.data() + i * channels * sizeof(uint8_t));
*p++ = *rgb_image++;
}
But now it seems that the color channels are wrongly assigned:
Any idea how to address this issue. The point is that it seems to be something easy but since my knowledge with C++ and pointers is quite limited I cannot figure it out straight forward.
I found this solution here (Reading image in matlab in a format acceptable to mex) as well but I am not sure how exactly I can use it. To be honest I am quite of confused.
ok the solution was quite straight forward as I was though in first place. It was just playing correctly with the pointers:
std::shared_ptr<Image> CreateRGBImageFromMat(uint8_t *mat_image, int width, int height, int channels)
{
auto open3d_image = std::make_shared<Image>();
open3d_image->PrepareImage(height, width, channels, sizeof(uint8_t));
for (int i = 0; i < height * width; i++) {
uint8_t *p = (uint8_t *)(open3d_image->data_.data() + i * channels * sizeof(uint8_t));
*p++ = *(mat_image + i);
*p++ = *(mat_image + i + height*width);
*p++ = *(mat_image + i + height*width*2);
}
return open3d_image;
}
since the three::Image expects the data in contiguous order row x col x channel while from matlab the image comes in blocks rows x cols x channel_1 (after you transpose the image since matlab is column major). My question though now is whether I can do the same with memcpy() or std::copy() where I can copy the bloc data to contiguous form so that I bypass the for loop.

How to manipulate a shaped area of terrain in runtime - Unity 3D

My game has a drawing tool - a looping line renderer that is used as a marker to manipulate an area of the terrain in the shape of the line. This all happens in runtime as soon as the player stops drawing the line.
So far I have managed to raise terrain verteces that match the coordinates of the line renderer's points, but I have difficulties with raising the points that fall inside the marker's shape. Here is an image describing what I currently have:
I tried using the "Polygon Fill Algorithm" (http://alienryderflex.com/polygon_fill/), but raising the terrain vertices one line at a time is too resourceful (even when the algorithm is narrowed to a rectangle that surrounds only the marked area). Also my marker's outline points have gaps between them, meaning I need to add a radius to the line that raises the terrain, but that might leave the result sloppy.
Maybe I should discard the drawing mechanism and use a mesh with a mesh collider as the marker?
Any ideas are appreciated on how to get the terrain manipulated in the exact shape as the marker.
Current code:
I used this script to create the line - the first and the last line points have the same coordinates.
The code used to manipulate the terrain manipulation is currently triggered when clicking a GUI button:
using System;
using System.Collections;
using UnityEngine;
public class changeTerrainHeight_lineMarker : MonoBehaviour
{
public Terrain TerrainMain;
public LineRenderer line;
void OnGUI()
{
//Get the terrain heightmap width and height.
int xRes = TerrainMain.terrainData.heightmapWidth;
int yRes = TerrainMain.terrainData.heightmapHeight;
//GetHeights - gets the heightmap points of the tarrain. Store them in array
float[,] heights = TerrainMain.terrainData.GetHeights(0, 0, xRes, yRes);
if (GUI.Button(new Rect(30, 30, 200, 30), "Line points"))
{
/* Set the positions to array "positions" */
Vector3[] positions = new Vector3[line.positionCount];
line.GetPositions(positions);
/* use this height to the affected terrain verteces */
float height = 0.05f;
for (int i = 0; i < line.positionCount; i++)
{
/* Assign height data */
heights[Mathf.RoundToInt(positions[i].z), Mathf.RoundToInt(positions[i].x)] = height;
}
//SetHeights to change the terrain height.
TerrainMain.terrainData.SetHeights(0, 0, heights);
}
}
}
Got to the solution thanks to Siim's personal help, and thanks to the article: How can I determine whether a 2D Point is within a Polygon?.
The end result is visualized here:
First the code, then the explanation:
using System;
using System.Collections;
using UnityEngine;
public class changeTerrainHeight_lineMarker : MonoBehaviour
{
public Terrain TerrainMain;
public LineRenderer line;
void OnGUI()
{
//Get the terrain heightmap width and height.
int xRes = TerrainMain.terrainData.heightmapWidth;
int yRes = TerrainMain.terrainData.heightmapHeight;
//GetHeights - gets the heightmap points of the tarrain. Store them in array
float[,] heights = TerrainMain.terrainData.GetHeights(0, 0, xRes, yRes);
//Trigger line area raiser
if (GUI.Button(new Rect(30, 30, 200, 30), "Line fill"))
{
/* Set the positions to array "positions" */
Vector3[] positions = new Vector3[line.positionCount];
line.GetPositions(positions);
float height = 0.10f; // define the height of the affected verteces of the terrain
/* Find the reactangle the shape is in! The sides of the rectangle are based on the most-top, -right, -bottom and -left vertex. */
float ftop = float.NegativeInfinity;
float fright = float.NegativeInfinity;
float fbottom = Mathf.Infinity;
float fleft = Mathf.Infinity;
for (int i = 0; i < line.positionCount; i++)
{
//find the outmost points
if (ftop < positions[i].z)
{
ftop = positions[i].z;
}
if (fright < positions[i].x)
{
fright = positions[i].x;
}
if (fbottom > positions[i].z)
{
fbottom = positions[i].z;
}
if (fleft > positions[i].x)
{
fleft = positions[i].x;
}
}
int top = Mathf.RoundToInt(ftop);
int right = Mathf.RoundToInt(fright);
int bottom = Mathf.RoundToInt(fbottom);
int left = Mathf.RoundToInt(fleft);
int terrainXmax = right - left; // the rightmost edge of the terrain
int terrainZmax = top - bottom; // the topmost edge of the terrain
float[,] shapeHeights = TerrainMain.terrainData.GetHeights(left, bottom, terrainXmax, terrainZmax);
Vector2 point; //Create a point Vector2 point to match the shape
/* Loop through all points in the rectangle surrounding the shape */
for (int i = 0; i < terrainZmax; i++)
{
point.y = i + bottom; //Add off set to the element so it matches the position of the line
for (int j = 0; j < terrainXmax; j++)
{
point.x = j + left; //Add off set to the element so it matches the position of the line
if (InsidePolygon(point, bottom))
{
shapeHeights[i, j] = height; // set the height value to the terrain vertex
}
}
}
//SetHeights to change the terrain height.
TerrainMain.terrainData.SetHeightsDelayLOD(left, bottom, shapeHeights);
TerrainMain.ApplyDelayedHeightmapModification();
}
}
//Checks if the given vertex is inside the the shape.
bool InsidePolygon(Vector2 p, int terrainZmax)
{
// Assign the points that define the outline of the shape
Vector3[] positions = new Vector3[line.positionCount];
line.GetPositions(positions);
int count = 0;
Vector2 p1, p2;
int n = positions.Length;
// Find the lines that define the shape
for (int i = 0; i < n; i++)
{
p1.y = positions[i].z;// - p.y;
p1.x = positions[i].x;// - p.x;
if (i != n - 1)
{
p2.y = positions[(i + 1)].z;// - p.y;
p2.x = positions[(i + 1)].x;// - p.x;
}
else
{
p2.y = positions[0].z;// - p.y;
p2.x = positions[0].x;// - p.x;
}
// check if the given point p intersects with the lines that form the outline of the shape.
if (LinesIntersect(p1, p2, p, terrainZmax))
{
count++;
}
}
// the point is inside the shape when the number of line intersections is an odd number
if (count % 2 == 1)
{
return true;
}
else
{
return false;
}
}
// Function that checks if two lines intersect with each other
bool LinesIntersect(Vector2 A, Vector2 B, Vector2 C, int terrainZmax)
{
Vector2 D = new Vector2(C.x, terrainZmax);
Vector2 CmP = new Vector2(C.x - A.x, C.y - A.y);
Vector2 r = new Vector2(B.x - A.x, B.y - A.y);
Vector2 s = new Vector2(D.x - C.x, D.y - C.y);
float CmPxr = CmP.x * r.y - CmP.y * r.x;
float CmPxs = CmP.x * s.y - CmP.y * s.x;
float rxs = r.x * s.y - r.y * s.x;
if (CmPxr == 0f)
{
// Lines are collinear, and so intersect if they have any overlap
return ((C.x - A.x < 0f) != (C.x - B.x < 0f))
|| ((C.y - A.y < 0f) != (C.y - B.y < 0f));
}
if (rxs == 0f)
return false; // Lines are parallel.
float rxsr = 1f / rxs;
float t = CmPxs * rxsr;
float u = CmPxr * rxsr;
return (t >= 0f) && (t <= 1f) && (u >= 0f) && (u <= 1f);
}
}
The used method is filling the shape one line at a time - "The Ray Casting method". It turns out that this method starts taking more resources only if the given shape as a lot of sides. (A side of the shape is a line that connects two points in the outline of the shape.)
When I posted this question, my Line Renderer had 134 points defining the line. This also means the shape has the same number of sides that needs to pass the ray cast check.
When I narrowed down the number of points to 42, the method got fast enough, and also the shape did not lose almost any detail.
Furthermore I am planning on using some methods to make the contours smoother, so the shape can be defined with even less points.
In short, you need these steps to get to the result:
Create the outline of the shape;
Find the 4 points that mark the bounding box around the shape;
Start ray casting the box;
Check the number of how many times the ray intersects with the sides of the shape. The points with the odd number are located inside the shape:
Assign your attributes to all of the points that were found in the shape.

How is defined the data type FXPT2DOT30 in the BMP file structure?

The type FXPT2DOT30 appears in defining the struct CIEXYZ for BMP files,
according to the definition provided by Microsoft:
http://msdn.microsoft.com/en-us/library/windows/desktop/dd371828(v=vs.85).aspx
However, I cannot found the exact definition of FXPT2DOT30 anywhere.
Which is its precise definition?
What kind of data is supposed to hold?
#RobinGreen provided a link:
There it is the answer to my question:
[...] FXPT2DOT30 [...] which means that they are interpreted as fixed-point values with a 2-bit integer part and a 30-bit fractional part.
According to fileformat.info here is an unpacked version of the BITMAPV4HEADER structure. The fields between CSType and GammaRed correspond to the CIEXYZTRIPLE bV4Endpoints in BITMAPV4HEADER.
typedef struct _Win4xBitmapHeader
{
DWORD Size; /* Size of this header in bytes */
LONG Width; /* Image width in pixels */
LONG Height; /* Image height in pixels */
WORD Planes; /* Number of color planes */
WORD BitsPerPixel; /* Number of bits per pixel */
DWORD Compression; /* Compression methods used */
DWORD SizeOfBitmap; /* Size of bitmap in bytes */
LONG HorzResolution; /* Horizontal resolution in pixels per meter */
LONG VertResolution; /* Vertical resolution in pixels per meter */
DWORD ColorsUsed; /* Number of colors in the image */
DWORD ColorsImportant; /* Minimum number of important colors */
/* Fields added for Windows 4.x follow this line */
DWORD RedMask; /* Mask identifying bits of red component */
DWORD GreenMask; /* Mask identifying bits of green component */
DWORD BlueMask; /* Mask identifying bits of blue component */
DWORD AlphaMask; /* Mask identifying bits of alpha component */
DWORD CSType; /* Color space type */
LONG RedX; /* X coordinate of red endpoint */
LONG RedY; /* Y coordinate of red endpoint */
LONG RedZ; /* Z coordinate of red endpoint */
LONG GreenX; /* X coordinate of green endpoint */
LONG GreenY; /* Y coordinate of green endpoint */
LONG GreenZ; /* Z coordinate of green endpoint */
LONG BlueX; /* X coordinate of blue endpoint */
LONG BlueY; /* Y coordinate of blue endpoint */
LONG BlueZ; /* Z coordinate of blue endpoint */
DWORD GammaRed; /* Gamma red coordinate scale value */
DWORD GammaGreen; /* Gamma green coordinate scale value */
DWORD GammaBlue; /* Gamma blue coordinate scale value */
} WIN4XBITMAPHEADER;
From gdiplus.h:
typedef long FXPT2DOT30, FAR *LPFXPT2DOT30;
typedef struct tagCIEXYZ
{
FXPT2DOT30 ciexyzX;
FXPT2DOT30 ciexyzY;
FXPT2DOT30 ciexyzZ;
} CIEXYZ;

iPhone SDK - Optimize a for loop

I'm developing an image processing application and I'm looking for an advise to tune my code.
My need is to split the image into blocs (80x80), and for each blocs, calculate the average color.
My first method contains the main loops where the second method is called :
- (NSArray*)getRGBAsFromImage:(UIImage *)image {
int width = image.size.width;
int height = image.size.height;
int blocPerRow = 80;
int blocPerCol = 80;
int pixelPerRowBloc = width / blocPerRow;
int pixelPerColBloc = height / blocPerCol;
int xx,yy;
// Row loop
for (int i=0; i<blocPerRow; i++) {
xx = (i * pixelPerRowBloc) + 1;
// Colon loop
for (int j=0; j<blocPerCol; j++) {
yy = (j * pixelPerColBloc) +1;
[self getRGBAsFromImageBloc:image
atX:xx
andY:yy
withPixelPerRow:pixelPerRowBloc
AndPixelPerCol:pixelPerColBloc];
}
}
// return my NSArray not done yet !
}
My second method browses the pixel bloc and returns a ColorStruct :
- (ColorStruct*)getRGBAsFromImageBloc:(UIImage*)image
atX:(int)xx
andY:(int)yy
withPixelPerRow:(int)pixelPerRow
AndPixelPerCol:(int)pixelPerCol {
// First get the image into your data buffer
CGImageRef imageRef = [image CGImage];
NSUInteger width = CGImageGetWidth(imageRef);
NSUInteger height = CGImageGetHeight(imageRef);
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
unsigned char *rawData = malloc(height * width * 4);
NSUInteger bytesPerPixel = 4;
NSUInteger bytesPerRow = bytesPerPixel * width;
NSUInteger bitsPerComponent = 8;
CGContextRef context = CGBitmapContextCreate(rawData, width, height,
bitsPerComponent, bytesPerRow, colorSpace,
kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
CGColorSpaceRelease(colorSpace);
CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
CGContextRelease(context);
// Now your rawData contains the image data in the RGBA8888 pixel format.
int byteIndex = (bytesPerRow * yy) + xx * bytesPerPixel;
int red = 0;
int green = 0;
int blue = 0;
int alpha = 0;
int currentAlpha;
// bloc loop
for (int i = 0 ; i < (pixelPerRow*pixelPerCol) ; ++i) {
currentAlpha = rawData[byteIndex + 3];
red += (rawData[byteIndex] ) * currentAlpha;
green += (rawData[byteIndex + 1]) * currentAlpha;
blue += (rawData[byteIndex + 2]) * currentAlpha;
alpha += currentAlpha;
byteIndex += 4;
if ( i == pixelPerRow ) {
byteIndex += (width-pixelPerRow) * 4;
}
}
red /= alpha;
green /= alpha;
blue /= alpha;
ColorStruct *bColorStruct = newColorStruct(red, blue, green);
free(rawData);
return bColorStruct;
}
ColorStruct :
typedef struct {
int red;
int blue;
int green;
} ColorStruct;
with constructor :
ColorStruct *newColorStruct(int red, int blue, int green) {
ColorStruct *ret = malloc(sizeof(ColorStruct));
ret->red = red;
ret->blue = blue;
ret->green = green;
return ret;
}
As you can see, I have three level of loop : the row loop, the colon loop, and the bloc loop.
I have tested my code and it takes about 5 to 6 seconds for an 320x480 pictures.
Any help is welcomed.
Thanks,
Bahaaldine
Seem like a perfect problem to give it the Grand Central Dispatch ?
I think the main problem in this code is there are too many image reads. The entire image is loaded to memory for every(!) block (malloc is expensive too). You should preload image data once (cache it) and then use that memory in getRGBAsFromImageBloc(). Now for 320x480 picture you have 4 x 6 = 24 blocks. So you can speed up you app manyfold by only using caching.
At the end of the day taking an image and performing three multiplies and five additions on each pixel sequentially is always going to be relatively slow.
Luckily, what you're doing can be thought of as a special case of interpolating an image from one size to another - i.e. the average pixel of an image is the same as that image resized to a size of 1x1 (assuming the resizing is using some form of linear interpolation, but that's usually the standard way to do it) and there's a few highly optimized (or at least more optimized than you're likely to get without enormous effort) options for doing that that are part of the iPhone's graphics libraries. At first I'd try using the Quartz methods to resize an image:
CGImageRef sourceImage = yourImage;
int numBytesPerPixel = 4;
u_char* scaledImageData = (u_char*)malloc(numBytesPerPixel);
CGColorSpaceRef colorspace = CGImageGetColorSpace(sourceImage);
CGContextRef context = CGBitmapContextCreate (scaledImageData, 1, 1, 8, numBytesPerPixel, colorspace, kCGImageAlphaNoneSkipFirst);
CGColorSpaceRelease(colorspace);
CGContextDrawImage(context, CGRectMake(0,0,1,1), sourceImage);
int a = scaledImageData[0];
int r = scaledImageData[1];
int g = scaledImageData[2];
int b = scaledImageData[3];
(this just scales the original image down to 1 pixel and doesn't show the cropping of the sub regions but unfortunately I don't have time for that code right now - if you try to implement it and get stuck add a comment and I can show you how you would do that).
If that doesn't work you could always try using OpenGL ES to do this (create a texture out of the part of your image you need to scale, render it to a 1x1 buffer, and test the result from the buffer). This is a lot more complicated but might have some advantages in that it gives you access to the GPU, which might be a lot faster for large images.
Hope that makes sense and helps...
P.S. - Definitely follow y0prst's suggestion and only read the image in once - that is an easy fix that is going to buy you a ton of performance.
P.P.S - I haven't tested the code so usual caveats apply.
You're inspecting every single pixel - something that, it would seem, is going to take roughly the same amount of time no matter how you loop through it (provided you only inspect each pixel once).
I would suggest using a random sampling within the bloc - every "n'th" pixel, which would reduce the loop time (and the accuracy), or allow for an adjustable granularity.
Now, if there is an existing algorithm for computing the average of a group of pixels - that would be something to consider as an alternative.
You can speed things up by not calling a method in the middle of your loop. Just include the code inline.
ADDED: Also, you might try doing the draw image only once, not repeated in a loop, if you have enough memory.
After you do that, you can try hoisting some of the multiplies out of the inner loop as well for a little additional performance (although the Compiler may optimize some of this for you).