RenderScript Sobel implementation, different input and output types - renderscript

I want to implement a Sobel filter in RenderScript with a uchar4 input allocation and a float[] output allocation. I am not quite sure whether it is possible to use different types for the input and output allocations of a RenderScript kernel. I want to develop the solution myself, but would be grateful for some advice on the best RenderScript structure to tackle that problem. Somewhere I read that it is possible to use
float __attribute__((kernel)) root(uchar4 v_in, uint32_t x, uint32_t y) {
}
Would you recommend such an approach, or can this be done without actually using a kernel, i.e. just a function? Thanks in advance.
My .rs code for the Sobel filter (X direction) now looks as follows:
#pragma version(1)
#pragma rs java_package_name(com.example.xxx)
#pragma rs_fp_relaxed
rs_allocation gIn;
int32_t width;
int32_t height;
float __attribute__((kernel)) sobelX(uchar4 v_in, uint32_t x, uint32_t y) {
    float out = 0;
    if (x > 0 && y > 0 && x < (width-1) && y < (height-1)) {
        uchar4 c11 = rsGetElementAt_uchar4(gIn, x-1, y-1);
        uchar4 c21 = rsGetElementAt_uchar4(gIn, x, y-1);
        uchar4 c31 = rsGetElementAt_uchar4(gIn, x+1, y-1);
        uchar4 c13 = rsGetElementAt_uchar4(gIn, x-1, y+1);
        uchar4 c23 = rsGetElementAt_uchar4(gIn, x, y+1);
        uchar4 c33 = rsGetElementAt_uchar4(gIn, x+1, y+1);
        float4 f11 = convert_float4(c11);
        float4 f21 = convert_float4(c21);
        float4 f31 = convert_float4(c31);
        float4 f13 = convert_float4(c13);
        float4 f23 = convert_float4(c23);
        float4 f33 = convert_float4(c33);
        out = f11.r - f13.r + 2*(f21.r - f23.r) + f31.r - f33.r;
    }
    return out;
}
What I am struggling with is passing the parameters from the Java side:
float[][] gx = new float[width][height];
ScriptC_sobel script;
script = new ScriptC_sobel(rs);
script.set_width(width);
script.set_height(height);
script.set_gIn(bmpGray);
Allocation inAllocation = Allocation.createFromBitmap(rs, bmpGray,
        Allocation.MipmapControl.MIPMAP_NONE, Allocation.USAGE_SCRIPT);
Allocation outAllocation = Allocation.createTyped(rs, float, 2);
script.forEach_sobelX(inAllocation, outAllocation);
outAllocation.copyTo(gx);
I understand that, in order to use the rsGetElementAt functions (to access neighboring data within the kernel), I need to make the input allocation available as a script global as well (rs_allocation gIn in the .rs code). However, I'm not sure how to handle this "double allocation" from the Java side. Also, the outAllocation statement in the Java code is probably not correct. Specifically, I am not sure whether the kernel will return the result as float[] or as float[][].

It is possible to use different types for input and output. In your case, I would actually suggest:
float __attribute__((kernel)) sobel(uchar4 v_in, uint32_t x, uint32_t y) {}
You certainly want to use a kernel, so that performance can benefit from execution across multiple threads.
Also, have a look at this example of doing 3x3 convolution in RS.
UPDATE: Generally, the best in/out parameter types depend on the kind of output you want this filter to generate - is it just the magnitude? Then a uint output will most likely suffice.
UPDATE 2: If you are going to use a global variable to pass the input allocation, then you don't need it in the kernel parameters, i.e.:
float __attribute__((kernel)) sobelX(uint32_t x, uint32_t y)
The rest of the script looks OK. As for the Java part, below I am pasting a demonstration of how you can prepare the output allocation and start the script. The kernel will then be invoked for every cell (i.e. every float) in the output allocation.
float[] gx = new float[width * height];
Type.Builder typeBuilder = new Type.Builder(mRS, Element.F32(mRS));
typeBuilder.setX(width).setY(height);
Allocation outAllocation = Allocation.createTyped(mRS, typeBuilder.create());
mScript.forEach_sobelX(outAllocation);
outAllocation.copyTo(gx);
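To address the "double allocation" part of the question: there is only one input allocation; you create it once on the Java side and bind it to the script global. A minimal sketch of that input-side wiring, reusing mRS and mScript from above (bmpGray is the Bitmap from the question):
// Create the input allocation from the bitmap once...
Allocation inAllocation = Allocation.createFromBitmap(mRS, bmpGray,
        Allocation.MipmapControl.MIPMAP_NONE, Allocation.USAGE_SCRIPT);
// ...and bind it to the rs_allocation global, so the kernel can read
// neighbouring pixels via rsGetElementAt_uchar4(gIn, ...).
mScript.set_gIn(inAllocation); // bind the Allocation, not the Bitmap itself
As for the float[] vs. float[][] doubt: copyTo() flattens the 2D float allocation into a 1D array with x varying fastest, so the result lands in gx[y * width + x].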

Related

Do two floats in a compute shader being added or subtracted not give the same value 100% of the time?

I have a function I call to generate some randomness in my HLSL compute shader code:
float rand3dTo1d(float3 value, float3 dotDir = float3(12.9898, 78.233, 37.719)){
    //make value smaller to avoid artefacts
    float3 smallValue = sin(value);
    //get scalar value from 3d vector
    float random = dot(smallValue, dotDir);
    //make value more random by making it bigger and then taking the fractional part
    random = frac(sin(random) * 43758.5453);
    return random;
}
If I pass in an incoming vector's location, all is fine, but if I try to pass in the center point of three vectors, computed using this function, into the randomness:
float3 GetTriangleCenter3d(float3 a, float3 b, float3 c) {
    return (a + b + c) / 3.0;
}
Then occasionally SOME of my points are not the same from frame to frame (shown by the color I paint the triangles with using this code). I get flickering of color.
float3 color = lerp(_ColorFrom, _ColorTo, rand1d);
I am at a total loss. I was able to at least get consistent results by using the thread id as the seed for the randomness, but not being able to use the center point of the triangle is really weird to me, and I have no idea what I am doing wrong or what I am missing. Any help would be great.

How to scale, crop, and rotate all at once in Android RenderScript

Is it possible to take a camera image in Y'UV format and, using RenderScript:
Convert it to RGBA
Crop it to a certain region
Rotate it if necessary
Yes! I figured out how, and thought I would share it with others. RenderScript has a bit of a learning curve, and simpler examples seem to help.
When cropping, you still need to set up an input and an output allocation, as well as one for the script itself. It might seem strange at first, but the input and output allocations have to be the same size, so if you are cropping you need to set up yet another allocation to write the cropped output into. More on that in a second.
#pragma version(1)
#pragma rs java_package_name(com.autofrog.chrispvision)
#pragma rs_fp_relaxed
/*
* This is mInputAllocation
*/
rs_allocation gInputFrame;
/*
* This is where we write our cropped image
*/
rs_allocation gOutputFrame;
/*
* These dimensions define the crop region that we want
*/
uint32_t xStart, yStart;
uint32_t outputWidth, outputHeight;
uchar4 __attribute__((kernel)) yuv2rgbFrames(uchar4 in, uint32_t x, uint32_t y)
{
    uchar Y = rsGetElementAtYuv_uchar_Y(gInputFrame, x, y);
    uchar U = rsGetElementAtYuv_uchar_U(gInputFrame, x, y);
    uchar V = rsGetElementAtYuv_uchar_V(gInputFrame, x, y);
    uchar4 rgba = rsYuvToRGBA_uchar4(Y, U, V);
    /* force the alpha channel to opaque - the conversion doesn't seem to do this */
    rgba.a = 0xFF;
    uint32_t translated_x = x - xStart;
    uint32_t translated_y = y - yStart;
    /* rotate 90 degrees; the -1 keeps the write inside the output's width */
    uint32_t x_rotated = outputWidth - 1 - translated_y;
    uint32_t y_rotated = translated_x;
    rsSetElementAt_uchar4(gOutputFrame, rgba, x_rotated, y_rotated);
    return rgba;
}
To set up the allocations:
private fun createAllocations(rs: RenderScript) {
    /*
     * The yuvTypeBuilder is for the input from the camera. It has to be the
     * same size as the camera (preview) image.
     */
    val yuvTypeBuilder = Type.Builder(rs, Element.YUV(rs))
    yuvTypeBuilder.setX(mImageSize.width)
    yuvTypeBuilder.setY(mImageSize.height)
    yuvTypeBuilder.setYuvFormat(ImageFormat.YUV_420_888)
    mInputAllocation = Allocation.createTyped(
        rs, yuvTypeBuilder.create(),
        Allocation.USAGE_IO_INPUT or Allocation.USAGE_SCRIPT)
    /*
     * The RGB type is also the same size as the input image. Other examples write this as
     * an int, but I don't see a reason why you wouldn't be more explicit about it to make
     * the code more readable.
     */
    val rgbType = Type.createXY(rs, Element.RGBA_8888(rs), mImageSize.width, mImageSize.height)
    mScriptAllocation = Allocation.createTyped(
        rs, rgbType,
        Allocation.USAGE_SCRIPT)
    mOutputAllocation = Allocation.createTyped(
        rs, rgbType,
        Allocation.USAGE_IO_OUTPUT or Allocation.USAGE_SCRIPT)
    /*
     * Finally, set up an allocation to which we will write our cropped image. The
     * dimensions of this one are (wantx, wanty).
     */
    val rgbCroppedType = Type.createXY(rs, Element.RGBA_8888(rs), wantx, wanty)
    mOutputAllocationRGB = Allocation.createTyped(
        rs, rgbCroppedType,
        Allocation.USAGE_SCRIPT)
    /*
     * Bind the script globals declared in the .rs file, so the kernel can read
     * from the camera frame and write into the cropped output.
     */
    mScriptHandle.set_gInputFrame(mInputAllocation)
    mScriptHandle.set_gOutputFrame(mOutputAllocationRGB)
}
Finally, since you're cropping, you need to tell the script what to do before invocation. If the image sizes don't change, you can probably optimize this by moving the LaunchOptions and variable settings so they occur just once (rather than every time), but I'm leaving them here in my example to make it clearer.
override fun onBufferAvailable(a: Allocation) {
    // Get the new frame into the input allocation
    mInputAllocation!!.ioReceive()
    // Run processing pass if we should send a frame
    val current = System.currentTimeMillis()
    if (current - mLastProcessed >= mFrameEveryMs) {
        val lo = Script.LaunchOptions()
        /*
         * These coordinates are the portion of the original image that we want to
         * include. Because we're rotating (in this case), x and y are reversed
         * (but still offset from the actual center of each dimension).
         */
        lo.setX(starty, endy)
        lo.setY(startx, endx)
        mScriptHandle.set_xStart(lo.xStart.toLong())
        mScriptHandle.set_yStart(lo.yStart.toLong())
        mScriptHandle.set_outputWidth(wantx.toLong())
        mScriptHandle.set_outputHeight(wanty.toLong())
        mScriptHandle.forEach_yuv2rgbFrames(mScriptAllocation, mOutputAllocation, lo)
        val output = Bitmap.createBitmap(wantx, wanty, Bitmap.Config.ARGB_8888)
        mOutputAllocationRGB!!.copyTo(output)
        /* Do something with the resulting bitmap */
        listener?.invoke(output)
        mLastProcessed = current
    }
}
All this might seem like a bit much, but it's very fast - way faster than doing the rotation on the Java/Kotlin side, and thanks to RenderScript's ability to run the kernel function over a subset of the image, it's less overhead than creating a bitmap and then creating a second, cropped one.
For me, all the rotation is necessary because the image seen by RenderScript was rotated 90 degrees from the camera. I am told this is some kind of peculiarity of having a Samsung phone.
RenderScript was intimidating at first, but once you get used to what it's doing, it's not so bad. I hope this is helpful to someone.
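As a footnote, the "subset of the image" launch used above is the Script.LaunchOptions mechanism. A minimal Java-side sketch of the idea, with placeholder names (myKernel, inAllocation, outAllocation and the range variables are illustrative, not from the code above):
// Restrict a kernel launch to a rectangular sub-region of the allocations.
// Ranges are start-inclusive and end-exclusive.
Script.LaunchOptions lo = new Script.LaunchOptions();
lo.setX(xStart, xEnd);
lo.setY(yStart, yEnd);
script.forEach_myKernel(inAllocation, outAllocation, lo);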

HLSL Unity5> Multiple render targets

I am attempting to direct a frag output into a particular texture (render target) depending on some logic.
To summarise my shader:
I am performing a Texture3D raycasting method that allows the user to 'see' inside the Texture3D data.
My issue arises when wanting to sample an area of this main texture and dump it into another texture of a smaller resolution (allowing for an eventual 'zooming' effect).
My research thus far has brought me to the use of multiple render targets, in that (to my understanding) I would send this frag function's output to another function which then outputs to the different targets.
Feedback is appreciated, especially if there is an easier way to sample an area into another texture (I have tried various compute shader methods; CPU methods are too slow). A CPU-based analogy would be Unity's GetPixels function.
Extract:
float alpha is actually a raycast step result
float4 t is the colour plus alpha at the mapping of the input Texture3D
_sample is a pseudo-bool flag for sampling
texture3Dsampler (within the commented if statement) is the smaller-resolution Texture3D that I wish to write to, given that the pixel being evaluated in the input Texture3D is within texture3Dsampler's bounds from a certain start point, as shown in the if statement logic
float a = (1 - alpha);
float4 t = float4(t3d, a);
//Cn = Current pixel
int Cx = start.x;
int Cy = start.y;
int Cz = start.z;
if (_sample == 1 &&
    ((Cx >= _XSS) && (Cx <= (_XSS + _Tex3DSampled.x))) &&
    ((Cy >= _YSS) && (Cy <= (_XSS + _Tex3DSampled.y))) &&
    ((Cz >= _ZSS) && (Cz <= (_YSS + _Tex3DSampled.z))))
{
    //render t into BOTH texture3D to screen output and texture3Dsampler output
}
else //if not sampling into the other Texture3D, simply return t to render onto the screen
{
    return t;
}
}

Renderscript Greyscale not quite working

This is my renderscript code for now:
#pragma version(1)
#pragma rs java_package_name(com.apps.foo.bar)
rs_allocation inPixels;
uchar4 RS_KERNEL root(uchar4 in, uint32_t x, uint32_t y) {
    uchar4 pixel = in.rgba;
    pixel.r = (pixel.r + pixel.g + pixel.b)/3;
    pixel.g = (pixel.r + pixel.g + pixel.b)/3;
    pixel.b = (pixel.r + pixel.g + pixel.b)/3;
    return pixel;
}
My phone shows a "greyscaled" picture. I say "greyscaled" because red, for example, is still kinda red... It is grey-ish, but you can still see that it is red. I know I can use more sophisticated methods, but I would like to stick to the simple one for now.
I would like to know if my RenderScript code is wrong. Should I be converting the uchar values to another type?
Use a temporary variable to hold the result as you compute it. Otherwise, in the first line you're modifying pixel.r, and in the very next one you're using that already-modified value to calculate pixel.g. No wonder you get artifacts.
Also, don't forget to assign the alpha value to avoid surprises with "invisible" output.
Also, I would recommend not using equal weights for r, g and b, but rather the weights below. See e.g. http://www.johndcook.com/blog/2009/08/24/algorithms-convert-color-grayscale/
uchar4 __attribute__((kernel)) gray(uchar4 in) {
    uchar4 out;
    float gr = 0.2125f*in.r + 0.7154f*in.g + 0.0721f*in.b;
    out.r = out.g = out.b = (uchar)gr;
    out.a = in.a;
    return out;
}
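For completeness, a minimal Java-side sketch of running a kernel like this over a Bitmap; the file name gray.rs (and hence the generated class ScriptC_gray) and srcBitmap are assumptions for illustration:
// Assumes the kernel above lives in gray.rs, so the build generates ScriptC_gray.
RenderScript rs = RenderScript.create(context);
ScriptC_gray script = new ScriptC_gray(rs);

Allocation inAlloc = Allocation.createFromBitmap(rs, srcBitmap);
// The output allocation mirrors the input's type and size.
Allocation outAlloc = Allocation.createTyped(rs, inAlloc.getType());

// Run the kernel over every pixel, then copy the result back out.
script.forEach_gray(inAlloc, outAlloc);
Bitmap result = srcBitmap.copy(srcBitmap.getConfig(), true);
outAlloc.copyTo(result);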

Using OpenGL in Matlab to get depth buffer

I've asked a similar question before and didn't manage to find a direct answer.
Could someone provide sample code for extracting the depth buffer of the rendering of an object into a figure in Matlab?
So let's say I load an .obj file or even just make a simple surf call, render it, and now want to get at its depth buffer. What code will do that for me, using both Matlab and OpenGL? I.e., how do I set this up and then access the actual data?
I essentially want to be able to use Matlab's powerful plotting functions and then be able to access the underlying graphics context for getting the depth buffer out.
(NOTE: The bounty specifies JOGL, but that is not a must. Any code which acts as above and can provide me with the depth buffer after running it in Matlab is sufficient.)
Today I went drinking with my colleagues, and after five beers and some tequilas I found this question and thought, "have at ya!" So I struggled for a while, but then I found a simple solution using MEX. I theorized that the OpenGL context, created by the last window, could be left active and could therefore be accessible from "C", if the script ran in the same thread.
I created a simple "C" program which calls one Matlab function, called "testofmyfilter", which plots the frequency response of a filter (that was the only script I had at hand). This is rendered using OpenGL. Then the program uses glGetIntegerv(GL_VIEWPORT, ...) and glReadPixels() to get to the OpenGL buffers. Then it creates a matrix, fills it with the depth values, and passes it to the second function, called "trytodisplaydepthmap". That function just displays the depth map using the imshow function. Note that a MEX function is allowed to return values as well, so maybe the postprocessing would not have to be another function, but I'm in no state to be able to understand how that's done. It should be trivial, though. I'm working with MEX for the first time today.
Without further delay, here are the source codes I used:
testofmyfilter.m
imp = zeros(10000,1);
imp(5000) = 1;
% impulse
[bwb,bwa] = butter(3, 0.1, 'high');
b = filter(bwb, bwa, imp);
% filter impulse by the filter
fs = 44100; % sampling frequency (all frequencies are relative to fs)
frequency_response=fft(b); % calculate response (complex numbers)
amplitude_response=20*log10(abs(frequency_response)); % calculate module of the response, convert to dB
frequency_axis=(0:length(b)-1)*fs/length(b); % generate frequency values for each response value
min_f=2;
max_f=fix(length(b)/2)+1; % min, max frequency
figure(1);
lighting gouraud
set(gcf,'Renderer','OpenGL')
semilogx(frequency_axis(min_f:max_f),amplitude_response(min_f:max_f),'r-') % plot with logarithmic axis using red line
axis([frequency_axis(min_f) frequency_axis(max_f) -90 10]) % set axis limits
xlabel('frequency [Hz]');
ylabel('amplitude [dB]'); % legend
grid on % draw grid
test.c
//You can include any C libraries that you normally use
#include "windows.h"
#include "stdio.h"
#include "math.h"
#include "mex.h" //--This one is required
extern WINAPI void glGetIntegerv(int n_enum, int *p_value);
extern WINAPI void glReadPixels(int x, int y, int width, int height,
    int format, int type, void *data);
#define GL_VIEWPORT 0x0BA2
#define GL_DEPTH_COMPONENT 0x1902
#define GL_FLOAT 0x1406
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
    int viewport[4], i, x, y;
    int colLen;
    float *data;
    double *matrix;
    mxArray *arg[1];

    mexCallMATLAB(0, NULL, 0, NULL, "testofmyfilter");
    // call an .m file which creates an OpenGL window and draws a plot inside

    glGetIntegerv(GL_VIEWPORT, viewport);
    printf("GL_VIEWPORT = [%d, %d, %d, %d]\n", viewport[0], viewport[1], viewport[2], viewport[3]);
    // print viewport dimensions, should be [0, 0, m, n]
    // where m and n are the size of the GL window

    data = (float*)malloc(viewport[2] * viewport[3] * sizeof(float));
    glReadPixels(0, 0, viewport[2], viewport[3], GL_DEPTH_COMPONENT, GL_FLOAT, data);
    // alloc data and read the depth buffer

    /*for(i = 0; i < 10; ++ i)
        printf("%f\n", data[i]);*/
    // debug

    arg[0] = mxCreateNumericMatrix(viewport[3], viewport[2], mxDOUBLE_CLASS, mxREAL);
    matrix = mxGetPr(arg[0]);
    colLen = mxGetM(arg[0]);
    printf("0x%08x 0x%08x 0x%08x %d\n", data, arg[0], matrix, colLen); // debug
    for(x = 0; x < viewport[2]; ++ x) {
        for(y = 0; y < viewport[3]; ++ y)
            matrix[x * colLen + y] = data[x + (viewport[3] - 1 - y) * viewport[2]];
    }
    // create matrix, copy data (this is stupid, but matlab switches
    // rows/cols, also convert float to double - but OpenGL could have done that)

    free(data);
    // don't need this anymore

    mexCallMATLAB(0, NULL, 1, arg, "trytodisplaydepthmap");
    // pass the array to a function (returning something from here
    // is beyond my understanding of mex, but should be doable)

    mxDestroyArray(arg[0]);
    // cleanup
    return;
}
trytodisplaydepthmap.m:
function [] = trytodisplaydepthmap(depthMap)
figure(2);
imshow(depthMap, []);
% see what's inside
Save all of these to the same directory and compile test.c with (type this into the Matlab console):
mex test.c Q:\MATLAB\R2008a\sys\lcc\lib\opengl32.lib
Where "Q:\MATLAB\R2008a\sys\lcc\lib\opengl32.lib" is path to "opengl32.lib" file.
And finally execute it all by merely typing "test" into the Matlab console. It should bring up a window with the filter's frequency response, and another window with the depth buffer. Note that the front and back buffers are swapped at the moment the "C" code reads the depth buffer, so it might be necessary to run the script twice to get any results (so the front buffer, which now contains the results, swaps with the back buffer again, and the depth can be read out). This could be done automatically by the "C" code, or you can try including getframe(gcf); at the end of your script (that reads back from OpenGL as well, so it swaps the buffers for you).
This works for me in Matlab 7.6.0.324 (R2008a). The script runs and spits out the following:
>>test
GL_VIEWPORT = [0, 0, 560, 419]
0x11150020 0x0bd39620 0x12b20030 419
And of course it displays the images. Note the depth buffer range depends on Matlab, and can be quite high, so making any sense of the generated images may not be straightforward.
the swine's answer is the correct one.
Here is a slightly reformatted and simpler version that is cross-platform.
Create a file called mexGetDepth.c
#include "mex.h"
#define GL_VIEWPORT 0x0BA2
#define GL_DEPTH_COMPONENT 0x1902
#define GL_FLOAT 0x1406
void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[])
{
int viewport[4], i, x, y;
int colLen;
float *data;
double *matrix;
glGetIntegerv(GL_VIEWPORT, viewport);
data = (float*)malloc(viewport[2] * viewport[3] * sizeof(float));
glReadPixels(0, 0, viewport[2], viewport[3], GL_DEPTH_COMPONENT, GL_FLOAT, data);
plhs[0] = mxCreateNumericMatrix(viewport[3], viewport[2], mxDOUBLE_CLASS, mxREAL);
matrix = mxGetPr(plhs[0]);
colLen = mxGetM(plhs[0]);
for(x = 0; x < viewport[2]; ++ x) {
for(y = 0; y < viewport[3]; ++ y)
matrix[x * colLen + y] = data[x + (viewport[3] - 1 - y) * viewport[2]];
}
free(data);
return;
}
Then, if you're on Windows, compile it using
mex mexGetDepth.c "path to OpenGL32.lib"
or, if you're on a *nix system,
mex mexGetDepth.c "path to opengl32.a"
Then run the following small script to test out the new function:
peaks;
figure(1);
depthData=mexGetDepth;
figure
imshow(depthData, []);