libyuv::NV21ToI420 messes up colors

I am using libyuv to convert NV21 image format to I420:
void convert(uint8* input, int width, int height) {
    int size = width * height * 3 / 2;
    uint8* output = new uint8[size];

    uint8* src_y = input;
    int src_stride_y = width;

    uint8* src_vu = input + (width * height);
    int src_stride_vu = width / 2;

    uint8* dst_y = output;
    int dst_stride_y = width;

    uint8* dst_u = dst_y + (width * height);
    int dst_stride_u = width / 2;

    uint8* dst_v = dst_u + (width * height) / 4;
    int dst_stride_v = width / 2;

    libyuv::NV21ToI420(src_y, src_stride_y,
                       src_vu, src_stride_vu,
                       dst_y, dst_stride_y,
                       dst_u, dst_stride_u,
                       dst_v, dst_stride_v,
                       width, height);

    dumpToFile(dst_y, size);
    ...
}
The size of my input is 640x480.
I display the dumped file using ImageMagick's display:
$ display -size 640x480 -depth 8 -sampling-factor 4:2:0 -colorspace srgb MyI420_1.yuv
However, the colors are messed up in the displayed image. The other aspects of the image look okay.
I am wondering if I am making a mistake in my code. Perhaps my stride calculations are not correct.
Note that if I use my own function to rearrange V1U1V2U2... as U1U2...V1V2... and dump the output, it displays fine. However, I would prefer to use libyuv, since it has optimizations for NEON, SSE2, etc. Regards.

Your src_stride_vu needs to be the same as width, since it is the combined stride of the interleaved VU plane.
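In other words, only one line of the setup needs to change (a minimal sketch of the fix, everything else as in the question):

uint8* src_vu = input + (width * height);
int src_stride_vu = width; // each VU row holds width/2 V bytes + width/2 U bytes = width bytes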


pangocairo text in middle of the circle

I am writing an extension/widget for GNOME Shell, so I want to show some information as text. Specifically, I want to put text in the middle of a circle using PangoCairo, but I can't figure out how to do it.
Here is my code: it draws the circles and has a function that is supposed to set the text in the middle of the circles, but the text is not showing in the middle.
draw_stuff(canvas, cr, width, height) {
    cr.setSourceRGBA(1, 1, 1, 0.2);
    cr.save();
    cr.setOperator(Cairo.Operator.CLEAR);
    cr.paint();
    cr.restore();
    cr.setOperator(Cairo.Operator.OVER);
    cr.scale(width, height);
    cr.setLineCap(Cairo.LineCap.BUTT);
    //test
    cr.setLineWidth(0.08);
    cr.translate(0.5, 0.5);
    cr.arc(0, 0, 0.4, 0, Math.PI * 2);
    cr.stroke();
    cr.setSourceRGBA(1, 1, 1, 0.8);
    cr.rotate(-Math.PI / 2);
    cr.save();
    cr.arc(0, 0, 0.4, 0, this.currentR / 100 * 2 * Math.PI);
    cr.stroke();
    cr.setSourceRGBA(1, 1, 1, 0.9);
    cr.save();
    cr.arc(0, 0, 0.25, 0, 2 * Math.PI);
    cr.fill();
    cr.moveTo(0, 0);
    cr.setSourceRGBA(0, 0, 0, 0.9);
    cr.save();
    this.text_show(cr, "WoW");
    cr.restore();
    return true;
}

update() {
    this.currentR = 60;
    this._canvas.connect("draw", this.draw_stuff.bind(this));
    this._canvas.invalidate();
}

text_show(cr, showtext, font = "DejaVuSerif Bold 16") {
    let pl = PangoCairo.create_layout(cr);
    pl.set_text(showtext, -1);
    pl.set_font_description(Pango.FontDescription.from_string(font));
    PangoCairo.update_layout(cr, pl);
    let [w, h] = pl.get_pixel_size();
    cr.relMoveTo(-w / 2, 0);
    PangoCairo.show_layout(cr, pl);
    cr.relMoveTo(w / 2, 0);
}
I don't know what's wrong here.
let [w, h] = pl.get_pixel_size();
cr.relMoveTo(-w / 2, 0);
You are placing your text something like 1000 pixels off the screen. get_pixel_size() is documented as:
pango_layout_get_size() returns the width and height scaled by PANGO_SCALE.
and PANGO_SCALE has the value 1024.
Thus, you need to change the code above to
let [w, h] = pl.get_pixel_size();
cr.relMoveTo(-w / 2048, 0);
Bonus points for not hardcoding the value of PANGO_SCALE and instead getting it from your Pango API bindings; however, I don't know enough about GNOME Shell extensions for that. Is this code JavaScript? I don't know.
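If it is GJS (the JavaScript dialect GNOME Shell extensions are written in, which this looks like), the constant should be exposed by the GObject-Introspection bindings as Pango.SCALE; a sketch, reusing the Pango import the question's code already relies on:

// PANGO_SCALE (1024) is available as Pango.SCALE in GJS,
// so the magic 2048 above becomes 2 * Pango.SCALE:
cr.relMoveTo(-w / (2 * Pango.SCALE), 0);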

pass matlab image to open3d three::Image in a mex script

I am trying to load an image in a mex script and cast it to the corresponding format that the Open3D library uses, i.e. three::Image. I am using the following code:
uint8_t* rgb_image = (uint8_t*) mxGetPr(prhs[3]);
// note: mxGetDimensions() actually returns const mwSize* (8 bytes per
// dimension on 64-bit MATLAB); read through int* on a little-endian
// machine, the dimensions land at indices 0, 2 and 4, which is
// presumably why these indices work here
int* dims = (int*) mxGetDimensions(prhs[3]);
int height = dims[0];
int width = dims[2];
int channels = dims[4];
int imsize = height * width;

Image image;
// parameters: height, width, num_of_channels, bytes_per_channel
image.PrepareImage(height, width, 3, sizeof(uint8_t));
memcpy(image.data_.data(), rgb_image, image.data_.size());
The above works well when I give it a grayscale image and specify num_of_channels as 1, but not for 3-channel images, as you can notice below:
Then I tried to create a function where I manually loop through the raw data and assign it to the output image:
auto image_ptr = std::make_shared<Image>();
image_ptr->PrepareImage(height, width, channels, sizeof(uint8_t));
for (int i = 0; i < height * width; i++) {
    uint8_t *p = (uint8_t *)(image_ptr->data_.data() + i * channels * sizeof(uint8_t));
    *p++ = *rgb_image++;
}
But now it seems that the color channels are wrongly assigned:
Any idea how to address this issue? The point is that it seems like something easy, but since my knowledge of C++ and pointers is quite limited, I cannot figure it out straight away.
I found this solution here (Reading image in matlab in a format acceptable to mex) as well, but I am not sure how exactly I can use it. To be honest, I am quite confused.
OK, the solution was quite straightforward, as I thought in the first place. It was just a matter of getting the pointers right:
std::shared_ptr<Image> CreateRGBImageFromMat(uint8_t *mat_image, int width, int height, int channels)
{
    auto open3d_image = std::make_shared<Image>();
    open3d_image->PrepareImage(height, width, channels, sizeof(uint8_t));

    for (int i = 0; i < height * width; i++) {
        uint8_t *p = (uint8_t *)(open3d_image->data_.data() + i * channels * sizeof(uint8_t));
        *p++ = *(mat_image + i);                      // R: first planar block
        *p++ = *(mat_image + i + height * width);     // G: second planar block
        *p++ = *(mat_image + i + height * width * 2); // B: third planar block
    }
    return open3d_image;
}
This is because three::Image expects the data interleaved, in row x col x channel order, while from MATLAB the image comes in planar blocks rows x cols x channel_1, rows x cols x channel_2, rows x cols x channel_3 (after you transpose the image, since MATLAB is column-major). My question now, though, is whether I can do the same with memcpy() or std::copy(), copying the block data into interleaved form so that I bypass the for loop.
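For what it's worth, a single memcpy()/std::copy() cannot produce the interleaved layout: each destination pixel takes one byte from each of the three planar blocks, so the destination bytes belonging to one source plane are never contiguous. The closest alternative is one strided pass per channel; a sketch under the same assumptions as the function above:

// One pass per channel: planar block c of the (already transposed) MATLAB
// data is scattered into every channels-th byte of the Open3D buffer.
for (int c = 0; c < channels; c++) {
    const uint8_t *src = mat_image + c * height * width;
    uint8_t *dst = open3d_image->data_.data() + c;
    for (int i = 0; i < height * width; i++) {
        dst[i * channels] = src[i];
    }
}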

Reduce border width on QR Codes generated by ZXing?

I'm using com.google.zxing.qrcode.QRCodeWriter to encode data and com.google.zxing.client.j2se.MatrixToImageWriter to generate the QR Code image. On a 400x400 image, there is about a 52-pixel-wide border around the code. I'd like this border to be narrower, maybe 15 pixels, but I don't see anything in the API for doing that. Am I missing something in the documentation? Or would I need to process the image myself?
For reference, here is an example 400x400 QR Code produced with the ZXing library:
The QR spec requires a four module quiet zone and that's what zxing creates. (See QUIET_ZONE_SIZE in QRCodeWriter.renderResult.)
More recent versions of ZXing allow you to set the size of the quiet zone (basically the intrinsic padding of the QR code) by supplying an int value with the EncodeHintType.MARGIN key. Simply include it in the hints Map you supply to the Writer's encode(...) method, e.g.:
Map<EncodeHintType, Object> hints = new EnumMap<EncodeHintType, Object>(EncodeHintType.class);
hints.put(EncodeHintType.CHARACTER_SET, "UTF-8");
hints.put(EncodeHintType.MARGIN, 2); /* default = 4 */
If you change this, you risk lowering the decode success rate.
Even when setting EncodeHintType.MARGIN to 0, the algorithm that converts the QR code "dot" matrix to pixel data can generate a small margin (the algorithm enforces a constant number of pixels per dot, so the margin's pixel size is the remainder of the integer division of the pixel size by the QR code's dot size).
However, you can completely bypass this "dot to pixel" generation: compute the QR code dot matrix directly by calling the public com.google.zxing.qrcode.encoder.Encoder class, and generate the pixel image yourself. Code below:
// Step 1 - generate the QRCode dot array
Map<EncodeHintType, Object> hints = new HashMap<EncodeHintType, Object>(1);
hints.put(EncodeHintType.CHARACTER_SET, "UTF-8");
QRCode qrCode = Encoder.encode(what, ErrorCorrectionLevel.L, hints);

// Step 2 - create a BufferedImage out of this array
int width = qrCode.getMatrix().getWidth();
int height = qrCode.getMatrix().getHeight();
BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
int[] rgbArray = new int[width * height];
int i = 0;
for (int y = 0; y < height; y++) {
    for (int x = 0; x < width; x++) {
        // in the ByteMatrix, 1 means a dark module, so map it to black
        rgbArray[i] = qrCode.getMatrix().get(x, y) > 0 ? 0x000000 : 0xFFFFFF;
        i++;
    }
}
image.setRGB(0, 0, width, height, rgbArray, 0, width);
The conversion of the BufferedImage to PNG data is left as an exercise for the reader. You can also scale the image by setting a fixed number of pixels per dot.
It's usually more optimized that way, as the generated image size is the smallest possible. If you rely on the client to scale the image (without blur), you do not need more than 1 pixel per dot.
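For the PNG step, the standard javax.imageio API is enough; a minimal sketch (the output path is illustrative):

// Write the BufferedImage built above to disk as a PNG (throws IOException).
javax.imageio.ImageIO.write(image, "png", new java.io.File("qrcode.png"));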
Map<EncodeHintType, Object> hintMap = new HashMap<EncodeHintType, Object>();
hintMap.put(EncodeHintType.ERROR_CORRECTION, ErrorCorrectionLevel.Q);
hintMap.put(EncodeHintType.MARGIN, -1);
Result: no margin.
UPDATE
Add the dependencies (from comments):
<dependency>
    <groupId>com.google.zxing</groupId>
    <artifactId>core</artifactId>
    <version>3.2.0</version>
    <type>jar</type>
</dependency>
<dependency>
    <groupId>com.google.zxing</groupId>
    <artifactId>javase</artifactId>
    <version>3.2.0</version>
</dependency>
In Swift you can:
let hints = ZXEncodeHints()
hints.margin = NSNumber(int: 0)
let result = try writer.encode(code, format: format, width: Int32(size.width), height: Int32(size.height), hints: hints)
let cgImage = ZXImage(matrix: result, onColor: UIColor.blackColor().CGColor, offColor: UIColor.clearColor().CGColor).cgimage
let QRImage = UIImage(CGImage: cgImage)
My problem is that I need to generate a PNG image with a transparent background, fixed to x * x pixels.
I find that whatever I do with EncodeHintType.MARGIN, there is always some unexpected margin.
After reading its source code, I found a way to fix my problem. This is my code; there is no margin in the output BufferedImage.
BufferedImage oriQrImg = getQrImg(CONTENT_PREFIX + userInfo, ErrorCorrectionLevel.L, BLACK);
BufferedImage scaledImg = getScaledImg(oriQrImg, REQUIRED_QR_WIDTH, REQUIRED_QR_HEIGHT);

private static BufferedImage getQrImg(String content, ErrorCorrectionLevel level, int qrColor) throws WriterException {
    QRCode qrCode = Encoder.encode(content, level, QR_HINTS);
    ByteMatrix input = qrCode.getMatrix();
    int w = input.getWidth(), h = input.getHeight();
    BufferedImage qrImg = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
    Graphics2D g2d = qrImg.createGraphics();
    qrImg = g2d.getDeviceConfiguration().createCompatibleImage(w, h, Transparency.BITMASK);
    g2d.dispose();
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            if (input.get(x, y) == 1) {
                qrImg.setRGB(x, y, qrColor);
            } else {
                qrImg.setRGB(x, y, Transparency.BITMASK);
            }
        }
    }
    return qrImg;
}

static BufferedImage getScaledImg(BufferedImage oriImg, int aimWidth, int aimHeight) {
    Image scaled = oriImg.getScaledInstance(aimWidth, aimHeight, SCALE_DEFAULT);
    Graphics2D g2d = new BufferedImage(aimWidth, aimHeight, BufferedImage.TYPE_INT_RGB).createGraphics();
    BufferedImage scaledImg = g2d.getDeviceConfiguration().createCompatibleImage(aimWidth, aimHeight, Transparency.BITMASK);
    g2d.dispose();
    scaledImg.createGraphics().drawImage(scaled, 0, 0, null);
    return scaledImg;
}

iPhone SDK - Optimize a for loop

I'm developing an image processing application and I'm looking for advice on tuning my code.
My need is to split the image into blocs (80x80) and, for each bloc, calculate the average color.
My first method contains the main loops, from which the second method is called:
- (NSArray*)getRGBAsFromImage:(UIImage *)image {
    int width = image.size.width;
    int height = image.size.height;
    int blocPerRow = 80;
    int blocPerCol = 80;
    int pixelPerRowBloc = width / blocPerRow;
    int pixelPerColBloc = height / blocPerCol;
    int xx, yy;

    // Row loop
    for (int i = 0; i < blocPerRow; i++) {
        xx = (i * pixelPerRowBloc) + 1;
        // Column loop
        for (int j = 0; j < blocPerCol; j++) {
            yy = (j * pixelPerColBloc) + 1;
            [self getRGBAsFromImageBloc:image
                                    atX:xx
                                   andY:yy
                        withPixelPerRow:pixelPerRowBloc
                         AndPixelPerCol:pixelPerColBloc];
        }
    }
    // return my NSArray, not done yet!
}
My second method walks the pixels of a bloc and returns a ColorStruct:
- (ColorStruct*)getRGBAsFromImageBloc:(UIImage*)image
                                  atX:(int)xx
                                 andY:(int)yy
                      withPixelPerRow:(int)pixelPerRow
                       AndPixelPerCol:(int)pixelPerCol {
    // First get the image into your data buffer
    CGImageRef imageRef = [image CGImage];
    NSUInteger width = CGImageGetWidth(imageRef);
    NSUInteger height = CGImageGetHeight(imageRef);
    CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
    unsigned char *rawData = malloc(height * width * 4);
    NSUInteger bytesPerPixel = 4;
    NSUInteger bytesPerRow = bytesPerPixel * width;
    NSUInteger bitsPerComponent = 8;
    CGContextRef context = CGBitmapContextCreate(rawData, width, height,
        bitsPerComponent, bytesPerRow, colorSpace,
        kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
    CGColorSpaceRelease(colorSpace);
    CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
    CGContextRelease(context);

    // Now rawData contains the image data in the RGBA8888 pixel format.
    int byteIndex = (bytesPerRow * yy) + xx * bytesPerPixel;
    int red = 0;
    int green = 0;
    int blue = 0;
    int alpha = 0;
    int currentAlpha;

    // bloc loop
    for (int i = 0; i < (pixelPerRow * pixelPerCol); ++i) {
        currentAlpha = rawData[byteIndex + 3];
        red   += (rawData[byteIndex])     * currentAlpha;
        green += (rawData[byteIndex + 1]) * currentAlpha;
        blue  += (rawData[byteIndex + 2]) * currentAlpha;
        alpha += currentAlpha;
        byteIndex += 4;
        if (i == pixelPerRow) {
            byteIndex += (width - pixelPerRow) * 4;
        }
    }
    red /= alpha;
    green /= alpha;
    blue /= alpha;

    ColorStruct *bColorStruct = newColorStruct(red, blue, green);
    free(rawData);
    return bColorStruct;
}
ColorStruct:
typedef struct {
    int red;
    int blue;
    int green;
} ColorStruct;
with constructor:
ColorStruct *newColorStruct(int red, int blue, int green) {
    ColorStruct *ret = malloc(sizeof(ColorStruct));
    ret->red = red;
    ret->blue = blue;
    ret->green = green;
    return ret;
}
As you can see, I have three levels of loops: the row loop, the column loop, and the bloc loop.
I have tested my code and it takes about 5 to 6 seconds for a 320x480 picture.
Any help is welcome.
Thanks,
Bahaaldine
Seems like a perfect problem to hand over to Grand Central Dispatch?
I think the main problem in this code is that there are too many image reads. The entire image is loaded into memory for every(!) bloc (and malloc is expensive too). You should extract the image data once (cache it) and then use that memory in getRGBAsFromImageBloc(). For a 320x480 picture you have 4 x 6 = 24 blocs, so you can speed up your app manyfold through caching alone.
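A sketch of that restructuring, lifted from the buffer-extraction code already in the question: build the buffer once before the row/column loops, hand the raw pointer to the per-bloc method, and free it at the end.

// Extract the RGBA8888 buffer once, outside the loops.
CGImageRef imageRef = [image CGImage];
NSUInteger width = CGImageGetWidth(imageRef);
NSUInteger height = CGImageGetHeight(imageRef);
CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
unsigned char *rawData = malloc(height * width * 4);
CGContextRef context = CGBitmapContextCreate(rawData, width, height, 8, 4 * width,
    colorSpace, kCGImageAlphaPremultipliedLast | kCGBitmapByteOrder32Big);
CGColorSpaceRelease(colorSpace);
CGContextDrawImage(context, CGRectMake(0, 0, width, height), imageRef);
CGContextRelease(context);

// ...run the existing bloc loops against rawData, passing the pointer to a
// variant of getRGBAsFromImageBloc that no longer recreates the bitmap...

free(rawData);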
At the end of the day, taking an image and performing three multiplies and five additions on each pixel sequentially is always going to be relatively slow.
Luckily, what you're doing can be thought of as a special case of interpolating an image from one size to another: the average pixel of an image is the same as that image resized to 1x1 (assuming the resizing uses some form of linear interpolation, which is usually the standard way to do it). There are a few highly optimized options for doing that (or at least more optimized than you're likely to get without enormous effort) in the iPhone's graphics libraries. As a first attempt, I'd use the Quartz methods to resize the image:
CGImageRef sourceImage = yourImage;
int numBytesPerPixel = 4;
u_char* scaledImageData = (u_char*)malloc(numBytesPerPixel);
CGColorSpaceRef colorspace = CGImageGetColorSpace(sourceImage);
CGContextRef context = CGBitmapContextCreate(scaledImageData, 1, 1, 8, numBytesPerPixel,
                                             colorspace, kCGImageAlphaNoneSkipFirst);
CGColorSpaceRelease(colorspace);
CGContextDrawImage(context, CGRectMake(0, 0, 1, 1), sourceImage);
CGContextRelease(context); // was missing: the context leaks otherwise

int a = scaledImageData[0];
int r = scaledImageData[1];
int g = scaledImageData[2];
int b = scaledImageData[3];
free(scaledImageData); // free the one-pixel buffer when done with the samples
(This just scales the original image down to 1 pixel and doesn't show the cropping of the sub-regions; unfortunately I don't have time for that code right now. If you try to implement it and get stuck, add a comment and I can show you how you would do that.)
If that doesn't work you could always try using OpenGL ES to do this (create a texture out of the part of your image you need to scale, render it to a 1x1 buffer, and test the result from the buffer). This is a lot more complicated but might have some advantages in that it gives you access to the GPU, which might be a lot faster for large images.
Hope that makes sense and helps...
P.S. - Definitely follow y0prst's suggestion and only read the image in once - that is an easy fix that is going to buy you a ton of performance.
P.P.S - I haven't tested the code so usual caveats apply.
You're inspecting every single pixel, something that, it would seem, is going to take roughly the same amount of time no matter how you loop through it (provided you inspect each pixel only once).
I would suggest using random sampling within the bloc: every nth pixel, which would reduce the loop time (and the accuracy), or allow for an adjustable granularity.
Now, if there is an existing algorithm for computing the average of a group of pixels, that would be something to consider as an alternative.
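For example, a sketch of a strided version of the question's bloc loop (n is a tunable granularity; 1 gives the exact average, larger values are faster but coarser):

int n = 4; // sample every nth pixel in each direction
for (int y = 0; y < pixelPerCol; y += n) {
    for (int x = 0; x < pixelPerRow; x += n) {
        int idx = (int)(bytesPerRow * (yy + y)) + (xx + x) * (int)bytesPerPixel;
        currentAlpha = rawData[idx + 3];
        red   += rawData[idx]     * currentAlpha;
        green += rawData[idx + 1] * currentAlpha;
        blue  += rawData[idx + 2] * currentAlpha;
        alpha += currentAlpha;
    }
}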
You can speed things up by not calling a method in the middle of your loop. Just include the code inline.
ADDED: Also, you might try doing the image draw only once, not repeatedly in a loop, if you have enough memory.
After you do that, you can try hoisting some of the multiplies out of the inner loop as well for a little additional performance (although the compiler may optimize some of this for you).

Image resized error: CGBitmapContextCreate: unsupported parameter

I am using the following code (from a blog post) to resize an image:
if (inImage.size.width <= inImage.size.height) {
    // Portrait
    ratio = inImage.size.height / inImage.size.width;
    resizedRect = CGRectMake(0, 0, width, width * ratio);
}
else {
    // Landscape
    ratio = inImage.size.width / inImage.size.height;
    resizedRect = CGRectMake(0, 0, height * ratio, height);
}

CGImageRef imageRef = [inImage CGImage];
CGImageAlphaInfo alphaInfo = CGImageGetAlphaInfo(imageRef);
if (alphaInfo == kCGImageAlphaNone)
    alphaInfo = kCGImageAlphaNoneSkipLast;

CGContextRef bitmap = CGBitmapContextCreate(
    NULL,
    resizedRect.size.width,               // width
    resizedRect.size.height,              // height
    CGImageGetBitsPerComponent(imageRef), // really needs to always be 8
    4 * resizedRect.size.width,           // rowbytes
    CGImageGetColorSpace(imageRef),
    alphaInfo
);
but for some reason, depending on the size I am trying to resize to, I get the following error:

CGBitmapContextCreate: unsupported parameter combination: 8 integer bits/component; 32 bits/pixel; 3-component colorspace; kCGImageAlphaNoneSkipFirst; XXX bytes/row.

where XXX differs depending on the image.
The rect I am creating is proportional to the image: I take a ratio from the width/height (depending on aspect) and multiply that by the target width/height.
Here are some examples (X = errors, / = doesn't); the resize size will be 50xX or Xx50 depending on aspect:
Source    50x50  69x69
430x320   /      X
240x320   /      /
272x320   /      /
480x419   /      X
426x320   X      X
480x256   X      X
Where you wrote thumbRect, did you mean resizedRect? thumbRect does not otherwise occur.
I suspect the problem is that resizedRect.size.width is non-integral. Note that it's floating point.
The width and bytesPerRow parameters of CGBitmapContextCreate are declared as integers. When you pass a floating point value, such as here, it gets truncated.
Suppose your resizedRect.size.width is 1.25. Then you will end up passing 1 for the width, and floor(1.25 * 4) == 5 as the bytes per row. That's inconsistent. You always want to pass four times whatever you passed for the width for the bytes per row.
You can also just leave bytesPerRow as 0, by the way. Then the system picks the best bytesPerRow (which is often larger than 4 times the width - it pads out for alignment).
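Putting both points together, a sketch of the fixed call (the rounding strategy is illustrative; any consistently integral width works):

size_t pixelWidth = (size_t)round(resizedRect.size.width);
size_t pixelHeight = (size_t)round(resizedRect.size.height);
CGContextRef bitmap = CGBitmapContextCreate(
    NULL,
    pixelWidth,
    pixelHeight,
    CGImageGetBitsPerComponent(imageRef), // still 8
    0, // let CoreGraphics choose an aligned bytesPerRow
    CGImageGetColorSpace(imageRef),
    alphaInfo
);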