Not able to save image in PNG format while using Halide

I tried running the following program on a Windows machine using Visual Studio:
#include <Halide.h>
#include "halide_image_io.h"
#include "png.h"
using namespace Halide;
using namespace Halide::Tools;
int main(int argc, char** argv)
{
Buffer<uint8_t> in = load_image("images/rgb.png");
Func blurx, out;
Var x, y,c, xi, yi;
printf("width : %d, height: %d and channels: %d",in.width(),in.height(),in.channels());
//width = 768, height = 1280
blurx(x, y,c) = (in(x, y,c) + in(x, y,c) + in(x, y,c)) / 3.0f;
out(x, y,c) = (blurx(x, y,c) + blurx(x, y,c) + blurx(x, y,c)) / 3.0f;
out.tile(x, y, xi, yi, 256, 32).vectorize(xi, 8).parallel(y);
Buffer<uint8_t> result = out.realize(in.width(), in.height(),in.channels());
save_image(result, "output/output.png");
return 0;
}
I am getting the error "Image cannot be saved in this format". The error does not occur when I remove the "/ 3.0f", so I suspect the division produces pixel values that are invalid for this format, which is why the image cannot be saved as .png. How can I solve this?
Please note: the formula should have been (in(x-1, y, c) + in(x, y, c) + in(x+1, y, c)) / 3.0f, but that gives me the error "access out of boundaries of input buffer". I am trying to resolve the division error first, so for the time being I modified the formula, which helped me isolate this error.

Adding the division by 3.0f changes the inferred type of out (and blurx) from uint8_t to float. I believe the issue is that the save_image helper doesn't allow saving floating point buffers as PNGs. To get back to uint8_t try wrapping the whole expression defining out in a cast<uint8_t>(...).
(Note: you also probably want to upcast the input values to at least uint16_t to avoid overflowing when you add three of them together.)
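To make the overflow concern concrete, here is a minimal plain C++ sketch (not Halide code). Plain C++ promotes uint8_t operands to int, so the first function truncates each intermediate sum by hand to imitate arithmetic that stays 8 bits wide, which is what I understand Halide does for uint8_t expressions. The function names are made up for this illustration.

```cpp
#include <cstdint>

// Imitates 8-bit arithmetic: every intermediate sum is truncated back to
// uint8_t, so values above 255 wrap around (assumed to mirror Halide's
// behaviour for uint8_t expressions).
std::uint8_t Average3Wrapping(std::uint8_t a, std::uint8_t b, std::uint8_t c) {
    const std::uint8_t ab = static_cast<std::uint8_t>(a + b);
    const std::uint8_t abc = static_cast<std::uint8_t>(ab + c);
    return static_cast<std::uint8_t>(abc / 3);
}

// Widens to 16 bits first: the largest possible sum is 3 * 255 = 765,
// which fits comfortably, so the average is exact (rounded down).
std::uint8_t Average3Widened(std::uint8_t a, std::uint8_t b, std::uint8_t c) {
    const std::uint16_t sum =
        static_cast<std::uint16_t>(static_cast<std::uint16_t>(a) + b + c);
    return static_cast<std::uint8_t>(sum / 3);
}
```

With three inputs of 200, the wrapping version returns 29 while the widened version returns 200, which is why the cast to a wider type has to happen before the additions.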

I made the following changes and it works now:
#include <Halide.h>
#include "halide_image_io.h"
#include "png.h"
using namespace Halide;
using namespace Halide::Tools;
int main(int argc, char** argv)
{
Buffer<uint8_t> in = load_image("images/rgb.png");
Func blurx, out;
Var x, y,c, xi, yi;
Func in_bounded = BoundaryConditions::repeat_edge(in);
Func input_16;
// Widen to 16 bits so the sum of three 8-bit values cannot overflow.
input_16(x, y, c) = cast<uint16_t>(in_bounded(x, y, c));
blurx(x, y, c) = (input_16(x, y, c) + input_16(x, y, c) + input_16(x, y, c)) / 3;
out(x, y, c) = cast<uint8_t>((blurx(x, y, c) + blurx(x, y, c) + blurx(x, y, c)) / 3);
out.tile(x, y, xi, yi, 256, 32).vectorize(xi, 8).parallel(y);
Buffer<uint8_t> result = out.realize(in.width(), in.height(),in.channels());
save_image(result, "output/output.png");
return 0;
}
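
For reference, the blur the question was originally aiming for, (in(x-1) + in(x) + in(x+1)) / 3, can be written in plain C++ for a single row as below. This is only a sketch of the idea and not Halide code: clamping the neighbour indices plays the role that BoundaryConditions::repeat_edge plays above, and the 16-bit sum is the same widening step. The name BlurRow3 is hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// 3-tap horizontal box blur of one row with repeat-edge boundary handling.
std::vector<std::uint8_t> BlurRow3(const std::vector<std::uint8_t>& row) {
    std::vector<std::uint8_t> out(row.size());
    const int last = static_cast<int>(row.size()) - 1;
    for (int x = 0; x <= last; ++x) {
        // Clamp neighbour coordinates so reads never leave the buffer.
        const int left = std::max(x - 1, 0);
        const int right = std::min(x + 1, last);
        // Sum in 16 bits: three 8-bit values can reach 765.
        const std::uint16_t sum =
            static_cast<std::uint16_t>(row[left] + row[x] + row[right]);
        out[x] = static_cast<std::uint8_t>(sum / 3);
    }
    return out;
}
```

For the row {0, 30, 60} this yields {10, 30, 50}: the first and last pixels reuse themselves as the missing neighbour.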

Related

pass matlab image to open3d three::Image in a mex script

I am trying to load an image in a mex script and cast it to the corresponding format that the Open3D library uses, i.e. three::Image. I am using the following code:
uint8_t* rgb_image = (uint8_t*) mxGetPr(prhs[3]);
int* dims = (int*) mxGetDimensions(prhs[3]);
int height = dims[0];
int width = dims[2];
int channels = dims[4];
int imsize = height * width;
Image image;
image.PrepareImage(height, width, 3, sizeof(uint8_t)); // parameters: height, width, num_of_channels, bytes_per_channel
memcpy(image.data_.data(), rgb_image, image.data_.size());
The above works well when I give it a grayscale image and set num_of_channels to 1, but not for 3-channel images.
Then I tried to create a function where I am manually looping through the raw data and assigning them to the output image
auto image_ptr = std::make_shared<Image>();
image_ptr->PrepareImage(height, width, channels, sizeof(uint8_t));
for (int i = 0; i < height * width; i++) {
uint8_t *p = (uint8_t *)(image_ptr->data_.data() + i * channels * sizeof(uint8_t));
*p++ = *rgb_image++;
}
But now it seems that the color channels are wrongly assigned.
Any idea how to address this issue? It seems to be something easy, but since my knowledge of C++ and pointers is quite limited I cannot figure it out straight away.
I found this solution here (Reading image in matlab in a format acceptable to mex) as well, but I am not sure how exactly I can use it. To be honest, I am quite confused.
OK, the solution was quite straightforward, as I thought in the first place. It was just a matter of handling the pointers correctly:
std::shared_ptr<Image> CreateRGBImageFromMat(uint8_t *mat_image, int width, int height, int channels)
{
auto open3d_image = std::make_shared<Image>();
open3d_image->PrepareImage(height, width, channels, sizeof(uint8_t));
for (int i = 0; i < height * width; i++) {
uint8_t *p = (uint8_t *)(open3d_image->data_.data() + i * channels * sizeof(uint8_t));
*p++ = *(mat_image + i);
*p++ = *(mat_image + i + height*width);
*p++ = *(mat_image + i + height*width*2);
}
return open3d_image;
}
This is needed because three::Image expects the data in contiguous row x col x channel order (interleaved), while from MATLAB the image arrives as one rows x cols block per channel (after you transpose the image, since MATLAB is column-major). My question now is whether I can do the same with memcpy() or std::copy(), copying the block data into contiguous form so that I can bypass the for loop.
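
On the memcpy() question: as far as I can tell, a single memcpy() cannot do this, because with more than one channel no two neighbouring output bytes come from neighbouring input bytes. The sketch below spells out the index mapping for the untransposed MATLAB layout (column-major, one plane per channel), so the transpose in MATLAB would not be needed. It writes into a plain std::vector rather than a three::Image, and the function name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert a MATLAB-style image (column-major, one height x width plane per
// channel) into a row-major, channel-interleaved buffer.
std::vector<std::uint8_t> PlanarColumnMajorToInterleaved(
    const std::uint8_t* src, int height, int width, int channels) {
    const std::size_t plane = static_cast<std::size_t>(height) * width;
    std::vector<std::uint8_t> dst(plane * channels);
    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            for (int ch = 0; ch < channels; ++ch) {
                // MATLAB offset: row + col * height + ch * (height * width)
                const std::size_t s =
                    row + static_cast<std::size_t>(col) * height + ch * plane;
                // Interleaved offset: (row * width + col) * channels + ch
                const std::size_t d =
                    (static_cast<std::size_t>(row) * width + col) * channels + ch;
                dst[d] = src[s];
            }
        }
    }
    return dst;
}
```

The resulting bytes could then be copied into image.data_ with one memcpy(), since at that point both buffers share the same layout.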

Renderscript Greyscale not quite working

This is my renderscript code for now:
#pragma version(1)
#pragma rs java_package_name(com.apps.foo.bar)
rs_allocation inPixels;
uchar4 RS_KERNEL root(uchar4 in, uint32_t x, uint32_t y) {
uchar4 pixel = in.rgba;
pixel.r = (pixel.r + pixel.g + pixel.b)/3;
pixel.g = (pixel.r + pixel.g + pixel.b)/3;
pixel.b = (pixel.r + pixel.g + pixel.b)/3;
return pixel;
}
My phone shows a "greyscaled" picture. I say "grayscaled" because red for example, is still kinda red...It is gray-ish but you can still see that is red. I know I can use more sophisticated methods, but I would like to stick to the simple one for now.
I would like to know if my renderscript code is wrong. Should I be converting the char to another type?
Use a temporary variable to hold the result as you compute it. Otherwise, in the first line you're modifying pixel.r, and in the very next one you are using it to calculate pixel.g. No wonder you get artifacts.
Also, don't forget to assign the alpha value to avoid surprises with "invisible" output.
Also I would recommend not to use equal weights for r, g and b but the weights as below. See e.g. http://www.johndcook.com/blog/2009/08/24/algorithms-convert-color-grayscale/
uchar4 __attribute__((kernel)) gray(uchar4 in) {
uchar4 out;
float gr= 0.2125*in.r + 0.7154*in.g + 0.0721*in.b;
out.r = out.g = out.b = gr;
out.a = in.a;
return out;
}
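
The same two points (compute the gray value once in a temporary, and copy alpha through) can be checked on the CPU with a small plain C++ sketch. The Rgba struct and both function names are made up for this illustration; the weights are the ones from the kernel above.

```cpp
#include <cmath>
#include <cstdint>

struct Rgba {
    std::uint8_t r, g, b, a;
};

// Equal-weight average. The gray value is computed once from the original
// channels, so no channel is overwritten before it has been read.
Rgba GrayAverage(Rgba in) {
    const std::uint8_t gray =
        static_cast<std::uint8_t>((in.r + in.g + in.b) / 3);
    return Rgba{gray, gray, gray, in.a};
}

// Weighted version using the coefficients from the kernel above,
// rounded to the nearest integer. Alpha is passed through unchanged.
Rgba GrayWeighted(Rgba in) {
    const float luma = 0.2125f * in.r + 0.7154f * in.g + 0.0721f * in.b;
    const std::uint8_t gray = static_cast<std::uint8_t>(std::lround(luma));
    return Rgba{gray, gray, gray, in.a};
}
```

With the original code, a pixel (200, 100, 0) would give r = 100, then g = (100 + 100 + 0) / 3 = 66, then b = 55, which is the leftover tint described in the question.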

Using OpenGL in Matlab to get depth buffer

I've asked a similar question before and didn't manage to find a direct answer.
Could someone provide sample code for extracting the depth buffer of the rendering of an object into a figure in Matlab?
So let's say I load an obj file, or even just make a simple surf call, render it, and now want to get at its depth buffer. What code will do that for me using both Matlab and OpenGL? I.e., how do I set this up and then access the actual data?
I essentially want to be able to use Matlab's powerful plotting functions and then access the underlying graphics context to get the depth buffer out.
NOTE: The bounty specifies JOGL, but that is not a must. Any code which acts as above and can provide me with the depth buffer after running it in Matlab is sufficient.
Today, I went drinking with my colleagues, and after five beers and some tequilas I found this question and thought, "have at ya!" I struggled for a while, but then I found a simple solution using MEX. I theorized that the OpenGL context created by the last window could be left active and would therefore be accessible from C, as long as the script ran in the same thread.
I created a simple C program which calls one Matlab function, called "testofmyfilter", which plots the frequency response of a filter (that was the only script I had at hand). This is rendered using OpenGL. Then the program uses glGetIntegerv() (with GL_VIEWPORT) and glReadPixels() to get at the OpenGL buffers. Then it creates a matrix, fills it with the depth values, and passes it to the second function, called "trytodisplaydepthmap", which just displays the depth map using the imshow function. Note that a MEX function is allowed to return values as well, so maybe the postprocessing would not have to be another function, but I'm in no state to understand how that's done. It should be trivial, though. I'm working with MEX for the first time today.
Without further delay, here is the source code I used:
testofmyfilter.m
imp = zeros(10000,1);
imp(5000) = 1;
% impulse
[bwb,bwa] = butter(3, 0.1, 'high');
b = filter(bwb, bwa, imp);
% filter impulse by the filter
fs = 44100; % sampling frequency (all frequencies are relative to fs)
frequency_response=fft(b); % calculate response (complex numbers)
amplitude_response=20*log10(abs(frequency_response)); % calculate module of the response, convert to dB
frequency_axis=(0:length(b)-1)*fs/length(b); % generate frequency values for each response value
min_f=2;
max_f=fix(length(b)/2)+1; % min, max frequency
figure(1);
lighting gouraud
set(gcf,'Renderer','OpenGL')
semilogx(frequency_axis(min_f:max_f),amplitude_response(min_f:max_f),'r-') % plot with logarithmic axis using red line
axis([frequency_axis(min_f) frequency_axis(max_f) -90 10]) % set axis limits
xlabel('frequency [Hz]');
ylabel('amplitude [dB]'); % legend
grid on % draw grid
test.c
//You can include any C libraries that you normally use
#include "windows.h"
#include "stdio.h"
#include "stdlib.h" // malloc, free
#include "math.h"
#include "mex.h" //--This one is required
extern WINAPI void glGetIntegerv(int n_enum, int *p_value);
extern WINAPI void glReadPixels(int x,
int y,
int width,
int height,
int format,
int type,
void * data);
#define GL_VIEWPORT 0x0BA2
#define GL_DEPTH_COMPONENT 0x1902
#define GL_FLOAT 0x1406
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
int viewport[4], i, x, y;
int colLen;
float *data;
double *matrix;
mxArray *arg[1];
mexCallMATLAB(0, NULL, 0, NULL, "testofmyfilter");
// call an .m file which creates OpenGL window and draws a plot inside
glGetIntegerv(GL_VIEWPORT, viewport);
printf("GL_VIEWPORT = [%d, %d, %d, %d]\n", viewport[0], viewport[1], viewport[2], viewport[3]);
// print viewport dimensions, should be [0, 0, m, n]
// where m and n are size of the GL window
data = (float*)malloc(viewport[2] * viewport[3] * sizeof(float));
glReadPixels(0, 0, viewport[2], viewport[3], GL_DEPTH_COMPONENT, GL_FLOAT, data);
// alloc data and read the depth buffer
/*for(i = 0; i < 10; ++ i)
printf("%f\n", data[i]);*/
// debug
arg[0] = mxCreateNumericMatrix(viewport[3], viewport[2], mxDOUBLE_CLASS, mxREAL);
matrix = mxGetPr(arg[0]);
colLen = mxGetM(arg[0]);
printf("0x%08x 0x%08x 0x%08x %d\n", data, arg[0], matrix, colLen); // debug
for(x = 0; x < viewport[2]; ++ x) {
for(y = 0; y < viewport[3]; ++ y)
matrix[x * colLen + y] = data[x + (viewport[3] - 1 - y) * viewport[2]];
}
// create matrix, copy data (this is stupid, but matlab switches
// rows/cols, also convert float to double - but OpenGL could have done that)
free(data);
// don't need this anymore
mexCallMATLAB(0, NULL, 1, arg, "trytodisplaydepthmap");
// pass the array to a function (returning something from here
// is beyond my understanding of mex, but should be doable)
mxDestroyArray(arg[0]);
// cleanup
return;
}
trytodisplaydepthmap.m:
function [] = trytodisplaydepthmap(depthMap)
figure(2);
imshow(depthMap, []);
% see what's inside
Save all of these to the same directory, then compile test.c by typing the following into the Matlab console:
mex test.c Q:\MATLAB\R2008a\sys\lcc\lib\opengl32.lib
Here "Q:\MATLAB\R2008a\sys\lcc\lib\opengl32.lib" is the path to the "opengl32.lib" file.
And finally execute it all by merely typing "test" in matlab console. It should bring up a window with filter frequency response, and another window with the depth buffer. Note the front and back buffers are swapped at the moment "C" code reads the depth buffer, so it might be required to run the script twice to get any results (so the front buffer which now contains the results swaps with back buffer again, and the depth can be read out). This could be done automatically by "C", or you can try including getframe(gcf); at the end of your script (that reads back from OpenGL as well so it swaps the buffers for you, or something).
This works for me in Matlab 7.6.0.324 (R2008a). The script runs and spits out the following:
>>test
GL_VIEWPORT = [0, 0, 560, 419]
0x11150020 0x0bd39620 0x12b20030 419
And of course it displays the images. Note the depth buffer range depends on Matlab, and can be quite high, so making any sense of the generated images may not be straightforward.
the swine's answer is the correct one.
Here is a slightly formatted and simpler version that is cross-platform.
Create a file called mexGetDepth.c
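#include <stdlib.h> /* malloc, free */
#include "mex.h"
/* glGetIntegerv and glReadPixels are not declared in this file; include your
   platform's OpenGL header (or declare them as in test.c above) if your
   compiler requires declarations. */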
#define GL_VIEWPORT 0x0BA2
#define GL_DEPTH_COMPONENT 0x1902
#define GL_FLOAT 0x1406
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
int viewport[4], i, x, y;
int colLen;
float *data;
double *matrix;
glGetIntegerv(GL_VIEWPORT, viewport);
data = (float*)malloc(viewport[2] * viewport[3] * sizeof(float));
glReadPixels(0, 0, viewport[2], viewport[3], GL_DEPTH_COMPONENT, GL_FLOAT, data);
plhs[0] = mxCreateNumericMatrix(viewport[3], viewport[2], mxDOUBLE_CLASS, mxREAL);
matrix = mxGetPr(plhs[0]);
colLen = mxGetM(plhs[0]);
for(x = 0; x < viewport[2]; ++ x) {
for(y = 0; y < viewport[3]; ++ y)
matrix[x * colLen + y] = data[x + (viewport[3] - 1 - y) * viewport[2]];
}
free(data);
return;
}
Then, if you're on Windows, compile using
mex mexGetDepth.c "path to OpenGL32.lib"
or, if you're on a *nix system,
mex mexGetDepth.c "path to opengl32.a"
Then run the following small script to test the new function:
peaks;
figure(1);
depthData=mexGetDepth;
figure
imshow(depthData);
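
The copy loop in both MEX files does two things at once: it flips the image vertically (glReadPixels returns rows bottom-up) and it converts from OpenGL's row-major float buffer to Matlab's column-major double matrix. The following standalone C++ sketch isolates just that index mapping so it can be checked without Matlab or OpenGL. The function name is hypothetical, and the input vector stands in for the buffer filled by glReadPixels.

```cpp
#include <cstddef>
#include <vector>

// gl_data: width * height floats, row-major, first row = bottom of the image.
// Returns height x width doubles in column-major order, first row = top.
std::vector<double> DepthToColumnMajor(const std::vector<float>& gl_data,
                                       int width, int height) {
    std::vector<double> matrix(static_cast<std::size_t>(width) * height);
    for (int x = 0; x < width; ++x) {
        for (int y = 0; y < height; ++y) {
            // Source row (height - 1 - y) performs the vertical flip.
            const std::size_t src =
                x + static_cast<std::size_t>(height - 1 - y) * width;
            // Column-major destination: column x, row y.
            const std::size_t dst = static_cast<std::size_t>(x) * height + y;
            matrix[dst] = gl_data[src];
        }
    }
    return matrix;
}
```

For a 2 x 2 buffer {0, 1, 2, 3} (bottom row 0 1, top row 2 3), the result is {2, 0, 3, 1}, i.e. the Matlab matrix [2 3; 0 1].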

Transferring an image from Matlab to an OpenCV IplImage

I have an image in Matlab:
img = imread('image.jpg')
which returns a uint8 array of size height x width x channels (3 channels: RGB).
Now I want to use openCV to do some manipulations on it, so I write up a MEX file which takes the image as a parameter and constructs an IplImage from it:
#include "mex.h"
#include "cv.h"
void mexFunction(int nlhs, mxArray **plhs, int nrhs, const mxArray **prhs) {
char *matlabImage = (char *)mxGetData(prhs[0]);
const mwSize *dim = mxGetDimensions(prhs[0]);
CvSize size;
size.height = dim[0];
size.width = dim[1];
IplImage *iplImage = cvCreateImageHeader(size, IPL_DEPTH_8U, dim[2]);
iplImage->imageData = matlabImage;
iplImage->imageDataOrigin = iplImage->imageData;
/* Show the openCV image */
cvNamedWindow("mainWin", CV_WINDOW_AUTOSIZE);
cvShowImage("mainWin", iplImage);
}
This result looks completely wrong, because openCV uses other conventions than matlab for storing an image (for instance, they interleave the color channels).
Can anyone explain what the differences in conventions are and give some pointers on how to display the image correctly?
After spending the day doing fun image format conversions </sarcasm> I can now answer my own question.
Matlab stores images as 3 dimensional arrays: height × width × color
OpenCV stores images as 2 dimensional arrays: (color × width) × height
Furthermore, for best performance, OpenCV pads the images with zeros so rows are always aligned on 32 bit blocks.
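As a rough illustration of the padded, interleaved layout described above, the sketch below computes the padded row length and the byte offset of one channel of one pixel. It is plain C++ with no OpenCV dependency; the function names and the default alignment of 4 bytes are assumptions taken from the description above, not from the OpenCV API.

```cpp
#include <cstddef>

// Row length in bytes, rounded up to the next multiple of `alignment`.
std::size_t AlignedWidthStep(std::size_t width, std::size_t channels,
                             std::size_t bytes_per_channel,
                             std::size_t alignment = 4) {
    const std::size_t row_bytes = width * channels * bytes_per_channel;
    return (row_bytes + alignment - 1) / alignment * alignment;
}

// Byte offset of channel `ch` of pixel (row, col) in an interleaved image
// with one byte per channel and rows that are `width_step` bytes apart.
std::size_t PixelOffset(std::size_t row, std::size_t col, std::size_t ch,
                        std::size_t channels, std::size_t width_step) {
    return row * width_step + col * channels + ch;
}
```

For example, a 5-pixel-wide, 3-channel, 8-bit image has 15 bytes of data per row, which is padded to a width step of 16.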
I've done the conversion in Matlab:
function [cv_img, dim, depth, width_step] = convert_to_cv(img)
% Exchange rows and columns (handles 3D cases as well)
img2 = permute( img(:,end:-1:1,:), [2 1 3] );
dim = [size(img2,1), size(img2,2)];
% Convert double precision to single precision if necessary
if( isa(img2, 'double') )
img2 = single(img2);
end
% Determine image depth
if( ndims(img2) == 3 && size(img2,3) == 3 )
depth = 3;
else
depth = 1;
end
% Handle color images
if(depth == 3 )
% Switch from RGB to BGR
img2(:,:,[3 2 1]) = img2;
% Interleave the colors
img2 = reshape( permute(img2, [3 1 2]), [size(img2,1)*size(img2,3) size(img2,2)] );
end
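% Pad the image so that each row length is a multiple of 4
% (mod(-n, 4) is the number of elements missing up to the next multiple of 4)
width_step = size(img2,1) + mod( -size(img2,1), 4 );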
img3 = uint8(zeros(width_step, size(img2,2)));
img3(1:size(img2,1), 1:size(img2,2)) = img2;
cv_img = img3;
% Output to openCV
cv_display(cv_img, dim, depth, width_step);
The code to transform this into an IplImage is in the MEX file:
#include "mex.h"
#include "cv.h"
#include "highgui.h"
#define IN_IMAGE prhs[0]
#define IN_DIMENSIONS prhs[1]
#define IN_DEPTH prhs[2]
#define IN_WIDTH_STEP prhs[3]
void mexFunction(int nlhs, mxArray **plhs, int nrhs, const mxArray **prhs) {
bool intInput = true;
if(nrhs != 4)
mexErrMsgTxt("Usage: cv_display(image, dimensions, depth, width_step)");
if( mxIsUint8(IN_IMAGE) )
intInput = true;
else if( mxIsSingle(IN_IMAGE) )
intInput = false;
else
mexErrMsgTxt("Input should be a matrix of uint8 or single precision floats.");
if( mxGetNumberOfElements(IN_DIMENSIONS) != 2 )
mexErrMsgTxt("Dimension vector should contain two elements: [width, height].");
char *matlabImage = (char *)mxGetData(IN_IMAGE);
double *imgSize = mxGetPr(IN_DIMENSIONS);
size_t width = (size_t) imgSize[0];
size_t height = (size_t) imgSize[1];
size_t depth = (size_t) *mxGetPr(IN_DEPTH);
size_t widthStep = (size_t) *mxGetPr(IN_WIDTH_STEP) * (intInput ? sizeof(unsigned char):sizeof(float));
CvSize size;
size.height = height;
size.width = width;
IplImage *iplImage = cvCreateImageHeader(size, intInput ? IPL_DEPTH_8U:IPL_DEPTH_32F, depth);
iplImage->imageData = matlabImage;
iplImage->widthStep = widthStep;
iplImage->imageDataOrigin = iplImage->imageData;
/* Show the openCV image */
cvNamedWindow("mainWin", CV_WINDOW_AUTOSIZE);
cvShowImage("mainWin", iplImage);
}
You could optimize your program with mxGetDimensions and mxGetNumberOfDimensions to get the size of the image and use the mxGetClassID to determine the depth more accurately.
I wanted to do the same, but I think it would be better to do this using a Matlab DLL and calllib. I would not do the transformation of the image into OpenCV format in Matlab, because it would be slow. This is one of the biggest problems with using Matlab and OpenCV together. I think OpenCV 2.2 has some good solutions for that problem. It looks like the OpenCV community has produced some solutions like that for Octave, but I still don't understand them. They somehow use the C++ functionality of OpenCV.
Try using the library developed by Kota Yamaguchi:
http://github.com/kyamagu/mexopencv
It defines a class called 'MxArray' that can perform all types of conversions from MATLAB mxArray variables to OpenCV objects (and from OpenCV to MATLAB). For example, this library can convert between mxArray and cv::Mat data types. Btw, IplImage is not relevant anymore if you use C++ API of OpenCV, it's better to use cv::Mat instead.
Note: if using the library, make sure to compile your mex function with MxArray.cpp file from the library; you can do so in MATLAB command line with:
mex yourmexfile.cpp MxArray.cpp
Based on the answer above and on "How the image matrix is stored in the memory on OpenCV", we can do the conversion with OpenCV Mat operations only:
C++: Mat::Mat(int ndims, const int* sizes, int type, void* data, const size_t* steps=0)
C++: void merge(const Mat* mv, size_t count, OutputArray dst)
Then the mex C/C++ code is:
#include "mex.h"
#include <opencv2/opencv.hpp>
#define uint8 unsigned char
void mexFunction(int nlhs, mxArray *out[], int nrhs, const mxArray *input[])
{
// assume the type of image is uint8
if(!mxIsClass(input[0], "uint8"))
{
mexErrMsgTxt("Only image arrays of the UINT8 class are allowed.");
return;
}
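// uint8 data must be read with mxGetData; mxGetPr is only for double arrays
uint8* rgb = (uint8*) mxGetData(input[0]);
// mwSize is not int on 64-bit builds, so keep the real type and convert per element
const mwSize* dims = mxGetDimensions(input[0]);
int height = (int) dims[0];
int width = (int) dims[1];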
int imsize = height * width;
cv::Mat imR(1, imsize, cv::DataType<uint8>::type, rgb);
cv::Mat imG(1, imsize, cv::DataType<uint8>::type, rgb+imsize);
cv::Mat imB(1, imsize, cv::DataType<uint8>::type, rgb+imsize + imsize);
// opencv is BGR and matlab is column-major order
cv::Mat imA[3];
imA[2] = imR.reshape(1,width).t();
imA[1] = imG.reshape(1,width).t();
imA[0] = imB.reshape(1,width).t();
// done! imf is what we want!
cv::Mat imf;
merge(imA,3,imf);
}

Looking for some help working with premultiplied alpha

I am trying to update a source image with the contents of multiple destination images. From what I can tell, using premultiplied alpha is the way to go with this, but I think I am doing something wrong (function below). The image I am starting with is initialized with all ARGB values set to 0. When I run the function once, the resulting image looks great, but when I start compositing any others onto it, all the pixels that have alpha information get really messed up. Does anyone know if I am doing something glaringly wrong, or if there is something extra I need to do to modify the color values?
void CompositeImage(unsigned char *src, unsigned char *dest, int srcW, int srcH){
int w = srcW;
int h = srcH;
int px0;
int px1;
int px2;
int px3;
int inverseAlpha;
int r;
int g;
int b;
int a;
int y;
int x;
for (y = 0; y < h; y++) {
for (x= 0; x< w*4; x+=4) {
// pixel number
px0 = (y*w*4) + x;
px1 = (y*w*4) + (x+1);
px2 = (y*w*4) + (x+2);
px3 = (y*w*4) + (x+3);
inverseAlpha = 1 - src[px3];
// create new values
r = src[px0] + inverseAlpha * dest[px0];
g = src[px1] + inverseAlpha * dest[px1];
b = src[px2] + inverseAlpha * dest[px2];
a = src[px3] + inverseAlpha * dest[px3];
// update destination image
dest[px0] = r;
dest[px1] = g;
dest[px2] = b;
dest[px3] = a;
}
}
}
I'm not clear on what data you are working with. Do your source images already have the alpha values pre-multiplied as they are stored? If not, then pre-multiplied alpha does not apply here and you would need to do normal alpha blending.
Anyway, the big problem in your code is that you're not keeping track of the value ranges that you're dealing with.
inverseAlpha = 1 - src[px3];
This needs to be changed to:
inverseAlpha = 255 - src[px3];
You have all integral value types here, so the normal incoming 0..255 value range will result in an inverseAlpha range of -254..1, which will give you some truly wacky results.
After changing the 1 to 255, you also need to divide your results for each channel by 255 to scale them back down to the appropriate range. The alternative is to do the intermediate calculations using floats instead of integers and divide the initial channel values by 255.0 (instead of these other changes) to get values in the 0..1 range.
If your source data really does already have pre-multiplied alpha, then your result lines should look like this.
r = src[px0] + inverseAlpha * dest[px0] / 255;
If your source data does not have pre-multiplied alpha, then it should be:
r = src[px0] * src[px3] / 255 + inverseAlpha * dest[px0] / 255;
There's nothing special about blending the alpha channel. Use the same calculation as for r, g, and b.
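
Putting those corrections together, a minimal sketch of the premultiplied case might look like the following. It assumes, as the question's code does, 4 bytes per pixel with alpha stored in the fourth byte, and that src already holds premultiplied values. The function name and the final clamp to 255 are additions for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Composite premultiplied `src` over premultiplied `dest`, in place.
// Both buffers hold width * height pixels of 4 bytes each, alpha last.
void CompositePremultiplied(const std::uint8_t* src, std::uint8_t* dest,
                            int width, int height) {
    const std::size_t count = static_cast<std::size_t>(width) * height;
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t px = i * 4;
        // 0..255 range: 255 means src is fully transparent here.
        const int inverse_alpha = 255 - src[px + 3];
        // Alpha is blended with the same formula as the color channels.
        for (int ch = 0; ch < 4; ++ch) {
            const int value =
                src[px + ch] + inverse_alpha * dest[px + ch] / 255;
            // Guard against inputs that are not truly premultiplied.
            dest[px + ch] = static_cast<std::uint8_t>(std::min(value, 255));
        }
    }
}
```

An opaque source pixel replaces the destination, a fully transparent one leaves it untouched, and a half-transparent premultiplied red (128, 0, 0, 128) over opaque blue (0, 0, 255, 255) gives (128, 0, 127, 255).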