I've been struggling with this for a while now and it is quite time critical, so I have to ask here. I'm quite new to compute shaders, but from what I've read they are what I need for my use case. I'm trying to find the total score from an array of textures, with the score being the sum of each channel multiplied by a given weight. Previously I was using NodeJS to do it, but that doesn't scale well: multiplying the dimensions by 4 increases the area of each texture by a factor of 16, and with multiple textures this isn't a good solution.
This is my compute shader right now:
// Each #kernel tells which function to compile; you can have many kernels
#pragma kernel CSMain
// Create a RenderTexture with enableRandomWrite flag and set it
// with cs.SetTexture
SamplerState linearClampSampler;
float4 weights;
RWStructuredBuffer<Texture2DArray<float4>> scoreInput;
float output;
[numthreads(8,8,1)]
void CSMain (uint3 id : SV_DispatchThreadID)
{
    float4 result_mult = scoreInput[id.z].Sample(id.uv).rgba * weights.xyzw;
    output = result_mult.r + result_mult.g + result_mult.b + result_mult.a;
}
For my C# dispatcher, I am doing:
string[] paths = new string[sessionData.masks.Length];
Texture2D[] textures = new Texture2D[sessionData.masks.Length];
for (int i = 0; i < sessionData.masks.Length; i++)
{
    paths[i] = sessionData.masks[i].combinedMasks;
    textures[i] = CustomUtility.LoadPNG(paths[i]);
}
int colourSize = sizeof(float) * 4;
ComputeBuffer wallBuffer = new ComputeBuffer(textures.Length, colourSize);
wallBuffer.SetData(textures);
CalculateScoreShader.SetBuffer(0, "scoreInput", wallBuffer);
CalculateScoreShader.Dispatch(0, 8,8,1);
I can't figure out how to sample the texture properly, and I want to make sure that I am setting up the buffer correctly for the shader to use it like this. I also want to retrieve the output, but again I'm unsure how to do this.
I have looked through a decent amount of tutorials and documentation but I just can't seem to find the solution.
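For reference, here is a minimal sketch of how I would expect the C# side to look, assuming the masks are uploaded as a Texture2DArray rather than a structured buffer (HLSL structured buffers cannot hold textures), and that the kernel is rewritten to read the array directly and accumulate one score per slice into an RWStructuredBuffer<float>. The scoreOutput name and the per-slice layout are assumptions, not something from the original code:

// Sketch only: pack the loaded masks into a Texture2DArray, bind an output
// buffer with one float per slice, dispatch, and read the scores back.
// Assumes the kernel declares "Texture2DArray<float4> scoreInput" and
// "RWStructuredBuffer<float> scoreOutput" (hypothetical names) and that all
// masks share the same size and format.
int width = textures[0].width;
int height = textures[0].height;
Texture2DArray textureArray = new Texture2DArray(width, height, textures.Length, textures[0].format, false);
for (int i = 0; i < textures.Length; i++)
{
    Graphics.CopyTexture(textures[i], 0, 0, textureArray, i, 0);
}

ComputeBuffer outputBuffer = new ComputeBuffer(textures.Length, sizeof(float));
int kernel = CalculateScoreShader.FindKernel("CSMain");
CalculateScoreShader.SetTexture(kernel, "scoreInput", textureArray);
CalculateScoreShader.SetBuffer(kernel, "scoreOutput", outputBuffer);
CalculateScoreShader.SetVector("weights", weightsVector); // weightsVector: your Vector4 of channel weights (assumed)
CalculateScoreShader.Dispatch(kernel, Mathf.CeilToInt(width / 8f), Mathf.CeilToInt(height / 8f), textures.Length);

// Blocking readback: one accumulated score per texture slice.
float[] scores = new float[textures.Length];
outputBuffer.GetData(scores);
outputBuffer.Release();

Note that a single float written by every thread is a data race; inside the kernel, the per-pixel weighted sums still have to be accumulated per slice (for example with InterlockedAdd on fixed-point integers, or with a reduction pass) before this readback is meaningful.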
Related
I want to implement an algorithm on the GPU using Graphics.Blit. The input values are floats and the output values are also floats. I create a texture with the RFloat format and want to set a value for every pixel. How can I do that? According to the Unity manual, SetPixels doesn't work:
This function works only on ARGB32, RGB24 and Alpha8 texture formats.
For other formats SetPixels is ignored.
The algorithm needs float precision, so neither of these formats is usable. So how can it be done?
EDIT: After more struggling with Unity RenderTextures, here is the code I came up with to transfer data to the GPU.
int res = 512;
Texture2D tempTexture = new Texture2D(res, res, TextureFormat.RFloat, false);

public void ApplyHeightsToRT(float[,] heights, RenderTexture renderTexture)
{
    RenderTexture.active = renderTexture;
    tempTexture.ReadPixels(new Rect(0, 0, renderTexture.width, renderTexture.height), 0, 0, false);
    for (int i = 0; i < tempTexture.width; i++)
        for (int j = 0; j < tempTexture.height; j++)
        {
            tempTexture.SetPixel(i, j, new Color(heights[i, j], 0, 0, 0));
        }
    tempTexture.Apply();
    RenderTexture.active = null;
    Graphics.Blit(tempTexture, renderTexture);
}
This code successfully uploads tempTexture to the RenderTexture. The inverse operation (copying the RenderTexture back into tempTexture) is done similarly with the following method:
public void ApplyRTToHeights(RenderTexture renderTexture, float[,] heights)
{
    RenderTexture.active = renderTexture;
    tempTexture.ReadPixels(new Rect(0, 0, renderTexture.width, renderTexture.height), 0, 0, false);
    for (int i = 0; i < tempTexture.width; i++)
        for (int j = 0; j < tempTexture.height; j++)
        {
            heights[i, j] = tempTexture.GetPixel(i, j).r;
        }
    RenderTexture.active = null;
}
To test the code, I get the heightmap of a terrain, call the first method to fill the RenderTexture with the heightmap, then call the second method to read the pixels back from the RenderTexture and put them onto the terrain. It should do nothing. Right?
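In code, that round trip looks roughly like this (a sketch; the terrain and renderTexture references are assumed, and res is the field from above):

// Hypothetical round-trip test: heights -> RenderTexture -> heights.
// If upload and readback were lossless, the terrain should be unchanged.
float[,] heights = terrain.terrainData.GetHeights(0, 0, res, res);
ApplyHeightsToRT(heights, renderTexture);
ApplyRTToHeights(renderTexture, heights);
terrain.terrainData.SetHeights(0, 0, heights);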
Actually, calling the two methods one after another flips the terrain heightmap and also creates banding artifacts. Very weird. After further investigation, the flip turned out to be a format problem: the tempTexture that is created above the two methods is actually an ARGB32 texture, not the RFloat I hoped it would be.
This explains the flip. After changing tempTexture to be an ARGB32 texture and changing the RenderTexture to RGBA32, the flipping went away. Now only the banding artifacts remain.
That is understandable, since I'm using only 8 bits (the red channel) of both tempTexture and the RenderTexture.
So the problem is no longer about setting data on an RFloat texture. The problem is that RFloat textures are not supported on my graphics card, and probably not on many other graphics devices either. The problem is to find a way to transfer float arrays to the RenderTexture.
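One workaround that avoids float texture formats entirely is to pack each float into the four 8-bit channels of an ARGB32 pixel and decode it in the shader; UnityCG.cginc already ships matching EncodeFloatRGBA/DecodeFloatRGBA helpers for the shader side. Here is a minimal sketch of a CPU-side packer mirroring that convention, assuming the heights are normalized to [0, 1):

// Sketch: pack a float in [0, 1) into four 8-bit channels, matching the
// layout that DecodeFloatRGBA in UnityCG.cginc expects, so an ARGB32
// texture can carry near-float precision.
static Color EncodeFloatRGBA(float v)
{
    // Scale the value so each successive channel holds finer bits.
    Vector4 enc = new Vector4(1f, 255f, 65025f, 16581375f) * v;
    // Keep only the fractional part of each scaled component.
    enc = new Vector4(enc.x - Mathf.Floor(enc.x), enc.y - Mathf.Floor(enc.y),
                      enc.z - Mathf.Floor(enc.z), enc.w - Mathf.Floor(enc.w));
    // Remove the bits that the next channel already represents.
    enc -= new Vector4(enc.y, enc.z, enc.w, enc.w) / 255f;
    return new Color(enc.x, enc.y, enc.z, enc.w);
}

The SetPixel loop above would then store EncodeFloatRGBA(heights[i, j]) instead of a raw red value, and the shader reading the texture would call DecodeFloatRGBA(tex2D(_MainTex, uv)) to recover the float.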
I've been working on a scene in Unity3D where I have the KinectV2 depth information coming in at 512 x 424, and I'm converting that in real time to a mesh that is also 512 x 424. So there is a 1:1 ratio of pixel data (depth) and vertices (mesh).
My end goal is to make the 'Monitor 3D View' scene found in 'Microsoft Kinect Studio v2.0' with the Depth.
I've pretty much got it working in terms of the point cloud. However, there is a large amount of warping in my Unity scene. I thought it might have been down to my maths, etc.
However, I noticed that it's the same for the Unity demo supplied in the Kinect development kit.
I'm just wondering if I'm missing something obvious here? Each of my pixels (or vertices, in this case) is mapped out in a 1:1 fashion.
I'm not sure if it's because I need to process the data from the DepthFrame before rendering it to the scene, or if there's some additional step I've missed to get the true representation of my room, because it looks like there's a slight 'spherical' effect being added right now.
These two images are a top-down shot of my room. The green line represents my walls.
The left image is the Kinect in a Unity scene, and the right is within Microsoft Kinect Studio. Ignoring the colour difference, you can see that the left (Unity) is warped, whereas the right is linear and perfect.
I know it's quite hard to make out, especially as you don't know the layout of the room I'm sat in :/ Side view too. Can you see the warping on the left? Use the green lines as a reference - these are straight in the actual room, as shown correctly in the right image.
Check out my video to get a better idea:
https://www.youtube.com/watch?v=Zh2pAVQpkBM&feature=youtu.be
Code C#
Pretty simple to be honest. I'm just grabbing the depth data straight from the Kinect SDK, and placing it into a point cloud mesh on the Z axis.
//called on application start
void Start(){
    _Reader = _Sensor.DepthFrameSource.OpenReader();
    _Data = new ushort[_lengthInPixels];
    _Sensor.Open();
}

//called once per frame
void Update(){
    if(_Reader != null){
        var dep_frame = _Reader.AcquireLatestFrame();
        dep_frame.CopyFrameDataToArray(_Data);
        dep_frame.Dispose();
        dep_frame = null;
        UpdateScene();
    }
}
//update point cloud in scene
void UpdateScene(){
    for(int y = 0; y < height; y++){
        for(int x = 0; x < width; x++){
            int index = (y * width) + x;
            float depthAdjust = 0.1f; //scale raw depth (millimetres) into scene units
            Vector3 new_pos = new Vector3(points[index].x, points[index].y, _Data[index] * depthAdjust);
            points[index] = new_pos;
        }
    }
}
Kinect API can be found here:
https://msdn.microsoft.com/en-us/library/windowspreview.kinect.depthframe.aspx
Would appreciate any advice, thanks!
With thanks to Edward Zhang, I figured out what I was doing wrong.
It came down to me not projecting my depth points correctly: I needed to use the CoordinateMapper to map my DepthFrame into CameraSpace.
Previously, my code assumed an orthographic depth projection instead of treating the depth camera as a perspective camera. I just needed to implement this:
https://msdn.microsoft.com/en-us/library/windowspreview.kinect.coordinatemapper.aspx
//camera-space points shared between Update and UpdateScene
CameraSpacePoint[] _CameraSpace;

//called once per frame
void Update(){
    if(_Reader != null){
        var dep_frame = _Reader.AcquireLatestFrame();
        dep_frame.CopyFrameDataToArray(_Data);
        dep_frame.Dispose();
        dep_frame = null;
        //allocate once rather than every frame
        if(_CameraSpace == null) _CameraSpace = new CameraSpacePoint[_Data.Length];
        _Mapper.MapDepthFrameToCameraSpace(_Data, _CameraSpace);
        UpdateScene();
    }
}

//update point cloud in scene
void UpdateScene(){
    for(int y = 0; y < height; y++){
        for(int x = 0; x < width; x++){
            int index = (y * width) + x;
            Vector3 new_pos = new Vector3(_CameraSpace[index].X, _CameraSpace[index].Y, _CameraSpace[index].Z);
            points[index] = new_pos;
        }
    }
}
I am attempting to direct a fragment output into a particular texture (render target), depending on some logic.
To summarise my shader:
I am performing a Texture3D raycasting method that allows the user to 'see' inside the 3D texture data.
My issue arises when I want to sample an area of this main texture and dump it into another texture of a smaller resolution (allowing for an eventual 'zooming' effect).
My research thus far has brought me to multiple render targets, in that (to my understanding) I would send this frag function's output to different render targets depending on the logic.
Feedback is appreciated, especially if there is an easier way to sample an area into another texture (I have tried various compute shader methods; CPU methods are too slow). A CPU-based analogy would be Unity's GetPixels function.
Extract:
- float alpha is actually a raycast step result
- float4 t is the colour plus alpha at the mapped point of the input Texture3D
- _sample is a pseudo-bool flag for sampling
- texture3Dsampler (in the commented if statement) is the smaller-resolution Texture3D that I wish to write to, given the pixel being evaluated in the input Texture3D is within texture3Dsampler's bounds from a certain start point - as shown in the if statement logic
float a = (1 - alpha);
float4 t = float4(t3d, a);

//Cn = current pixel
int Cx = start.x;
int Cy = start.y;
int Cz = start.z;

//note: the y and z bounds below originally reused _XSS and _YSS,
//which looked like copy-paste slips and are corrected here
if(_sample == 1 &&
   ((Cx >= _XSS) && (Cx <= (_XSS + _Tex3DSampled.x))) &&
   ((Cy >= _YSS) && (Cy <= (_YSS + _Tex3DSampled.y))) &&
   ((Cz >= _ZSS) && (Cz <= (_ZSS + _Tex3DSampled.z))))
{
    //render t into BOTH the screen output and the texture3Dsampler output
}
else //if not sampling into the other Texture3D, simply return t to render it onto the screen
{
    return t;
}
}
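For what it's worth, Unity can bind several colour buffers at once on the C# side, so that a single fragment shader invocation writes to SV_Target0 and SV_Target1. A rough sketch follows, with two caveats: most graphics APIs require simultaneously bound render targets to share dimensions (so a genuinely smaller target may need its own downsampling pass), and mrtMaterial and sourceTexture are hypothetical stand-ins, not names from the original code:

// Sketch: bind two render targets so one draw fills both; the fragment
// shader would output a struct with SV_Target0 and SV_Target1 members.
RenderTexture screenRT = new RenderTexture(1024, 1024, 24);
RenderTexture sampleRT = new RenderTexture(1024, 1024, 0);

RenderBuffer[] colourBuffers = { screenRT.colorBuffer, sampleRT.colorBuffer };
Graphics.SetRenderTarget(colourBuffers, screenRT.depthBuffer);
Graphics.Blit(sourceTexture, mrtMaterial); // draws once into both targets

If Blit does not preserve the MRT binding on a given Unity version, the usual fallback is to draw a full-screen quad manually with GL.Begin/GL.End while the targets are bound.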
As I'm a complete noob with shaders, I've run into some problems while trying to get a 2D lighting system to work. It basically covers the screen with a black 2D texture that has transparent holes where the lit areas are.
As I'm using only one texture, I guess I must do this in the fragment shader, right?
Fragment shader:
#ifdef GL_ES
precision mediump float;
#endif

// Texture, coordinates and size
uniform sampler2D u_texture;
varying vec2 v_texCoord;
uniform vec2 textureSize;

uniform int lightCount;

struct LightSource
{
    vec2 position;
    float radius;
    float strength;
};
uniform LightSource lights[10];

void main()
{
    float alpha = 1.0;
    vec2 pos = vec2(v_texCoord.x * textureSize.x, v_texCoord.y * textureSize.y);
    int i;
    for (i = 0; i < lightCount; i++)
    {
        LightSource source = lights[i];
        float distance = distance(source.position, pos);
        if (distance < source.radius)
        {
            alpha -= mix(source.strength, 0.0, distance / source.radius);
        }
    }
    gl_FragColor = vec4(0.0, 0.0, 0.0, alpha);
}
The problem is that the performance is really terrible (it cannot run at 60fps with 2 lights and nothing else on screen). Any suggestions to make it better, or even different ways to approach this problem?
By the way, I'm doing this from cocos2d-x, so if anyone has an idea that uses cocos2d elements it will be welcome as well :)
I totally agree with Tim. If you want to improve the overall speed, you have to avoid for loops. If the lights array size is always ten, I recommend replacing the loop statement with ten copies of the loop body. Be aware that any variable you declare inside a loop body is conceptually created and destroyed on every iteration! So it's a good idea to unroll the loop into ten parts (ugly, but it's an old-school trick ;)
Besides that, I also recommend disabling statements one at a time to see which instruction is hurting you (you can't print from a fragment shader, so this is the closest you get to profiling it by hand). I bet the mix operation is the culprit. I don't know anything about cocos2d, but is it possible to make a single call to mix at the end of the process, with a summary of the distances and strengths? It seems that at some point there's a pretty float-consuming, annoying operation.
Two things I would try (not guaranteed to help):
1. Remove the for loop and just hardcode in two lights. For loops can be expensive if they are not handled properly by the driver, and it would be good to know whether that is what is slowing you down.
2. If statements can be expensive, and I don't think this is a good application of mix (you're computing a*(1-c) + 0.0*c, and the second half of that term is pointless). I might try replacing this if statement:
if (distance < source.radius)
{
    alpha -= mix(source.strength, 0.0, distance/source.radius);
}
With this single line:
alpha -= (1.0-min(distance/source.radius, 1.0)) * source.strength;
Hello friendly computer people,
I've been studying OpenGL with the book iPhone 3D Programming from O'Reilly. Below I've posted an example from the text which shows how to draw a cone. I'm still trying to wrap my head around it, which is a bit difficult since I'm not super familiar with C++.
Anyway, what I would like to do is draw a cube. Could anyone suggest the best way to replace the following code with one that would draw a simple cube?
const float coneRadius = 0.5f;
const float coneHeight = 1.866f;
const int coneSlices = 40;
{
    // Allocate space for the cone vertices.
    m_cone.resize((coneSlices + 1) * 2);

    // Initialize the vertices of the triangle strip.
    vector<Vertex>::iterator vertex = m_cone.begin();
    const float dtheta = TwoPi / coneSlices;
    for (float theta = 0; vertex != m_cone.end(); theta += dtheta) {
        // Grayscale gradient
        float brightness = abs(sin(theta));
        vec4 color(brightness, brightness, brightness, 1);

        // Apex vertex
        vertex->Position = vec3(0, 1, 0);
        vertex->Color = color;
        vertex++;

        // Rim vertex
        vertex->Position.x = coneRadius * cos(theta);
        vertex->Position.y = 1 - coneHeight;
        vertex->Position.z = coneRadius * sin(theta);
        vertex->Color = color;
        vertex++;
    }
}
Thanks for all the help.
If all you want is an OpenGL ES 1.1 cube, I created such a sample application (one that has a texture and lets you rotate the cube using your finger); you can grab the code for it here. I generated this sample for the OpenGL ES session of my course on iTunes U (I've since fixed the broken texture rendering you see in that class video).
The author is demonstrating how to build a generic 3-D engine in C++ in the book, so his code is a little more involved than mine. In this part of the code, he's looping through an angle from 0 to 2 * pi in a number of steps corresponding to coneSlices. You could replace his loop with a series of manual vertex additions corresponding to the vertices I have in my sample application in order to draw a cube instead of his cone. You'd also need to remove the code he has elsewhere for drawing the circular base of the cone.
In OpenGL ES 1 you would probably draw a cube using glVertexPointer to submit geometry and glDrawArrays to draw it. See these tutorials:
http://iphonedevelopment.blogspot.com/2009/05/opengl-es-from-ground-up-table-of.html
OpenGL ES is a C-based library.