So I ran a frame capture to see the performance. To my surprise, it was my full-screen rendering passes that were to blame. Take a look.
Here are the two hogging functions. I have disabled the texture lookup on the full-screen texture to illustrate how extreme this is!
Program #3
Vert:
precision highp float;
attribute vec2 position;
uniform mat4 matrix;
void main()
{
gl_Position = matrix * vec4(position.xy, 0.0, 1.0);
}
Frag:
precision highp float;
uniform float alpha;
void main()
{
gl_FragColor = vec4(0.0, 0.0, 0.0, 1.0 - alpha);
}
Context:
//**Set up data
glUseProgram(shade_black.progId)
glBindBuffer(GLenum(GL_ARRAY_BUFFER), black_buffer) //Bind the coordinates
//**Pass in coordinates
let aTexCoordLoc = GLuint(black_attribute_position)
glEnableVertexAttribArray(aTexCoordLoc);
glVertexAttribPointer(aTexCoordLoc, 2, GLenum(GL_FLOAT), GLboolean(GL_FALSE), 0, BUFFER_OFFSET(0)) //Send to shader
//**Pass in uniforms
glUniformMatrix4fv(black_uniform_ortho, 1, GLboolean(GL_FALSE), &orthographicMatrix) //Pass matrix
glUniform1f(black_unifrom_alpha, 0.95) //Pass alpha
counter += timedo
//**Draw (instanced)
//The number 3 is actually variable but for this purpose I set it flat out
glDrawArraysInstanced(GLenum(GL_TRIANGLE_STRIP), 0, 4, 3 )// GLsizei(timedo)) //Draw it
//**Clean up
glBindBuffer(GLenum(GL_ARRAY_BUFFER), 0) //Clean up
Program #2
Vert:
precision highp float;
attribute vec4 data;
uniform mat4 matrix;
uniform float alpha;
varying vec2 v_texcoord;
varying float o_alpha;
void main()
{
gl_Position = matrix * vec4(data.xy, 0.0, 1.0);
v_texcoord = data.zw;
o_alpha = alpha;
}
Frag:
precision highp float;
uniform sampler2D s_texture;
varying float o_alpha;
varying vec2 v_texcoord;
void main()
{
//vec4 color = texture2D(s_texture, v_texcoord);
gl_FragColor = vec4(1.0);
//This line below is what it should be, but I wanted to isolate the issue, the picture results are from setting it to white.
//gl_FragColor = vec4(color.rgb, step(0.4, color.a ) * (color.a - o_alpha));
}
Context:
func drawTexture(texture: FBO, alpha: GLfloat)
{
//**Start up
//DONE EARLIER
//**Pass in vertices
glBindBuffer(GLenum(GL_ARRAY_BUFFER), textures_buffer)
let aTexCoordLoc = GLuint(textures_attribute_data)
glEnableVertexAttribArray(aTexCoordLoc);
glVertexAttribPointer(aTexCoordLoc, 4, GLenum(GL_FLOAT), GLboolean(GL_FALSE), 0, BUFFER_OFFSET(0)) //Tell gpu where
//**Pass in uniforms
glUniform1i(textures_uniform_texture, 0)
glUniformMatrix4fv(textures_uniform_matrix, 1, GLboolean(GL_FALSE), &orthographicMatrix)
glUniform1f(textures_uniform_alpha, alpha)
//**Texture
glBindTexture(GLenum(GL_TEXTURE_2D), texture.texture)
//**Draw
glDrawArrays(GLenum(GL_TRIANGLE_STRIP), 0, 4)
//**Clean up
glBindTexture(GLenum(GL_TEXTURE_2D), 0)
glBindBuffer(GLenum(GL_ARRAY_BUFFER), 0)
}
For the others you can at least see their draw calls, but they aren't causing much damage.
What on earth is going on that makes the most complicated shaders responsible for less than 1% of the latency?
NOTE: Both of these shaders use a VBO that is created and filled at the start of the app.
It does look kind of surprising. Here's how I'd try to make sense of those figures (assuming those times are GPU timings for those render calls):
Fill-rate is everything on mobile. Even running a simple pixel shader over 3 million pixels or so (iPad retina) is going to be an expensive task, and you shouldn't be too surprised that it's more expensive than a large number of much smaller particles. Your percentages are going to add up to 100%, so if all your other stuff is just a few hundred vertices and fills a few thousand pixels, you shouldn't be surprised if the full-screen stuff is huge relative to that. It also says '5ms', which is tempting to think of as an absolute figure, but bear in mind that the CPU and GPU automatically start running slower when there's not much work to do, so even a millisecond timing can be very misleading when the device is mostly idle.
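To put rough numbers on that (a back-of-the-envelope illustration only, assuming a 2048x1536 iPad retina target and that each of the 3 instances drawn by Program #3 covers the whole screen, plus the one textured quad from Program #2):
2048 x 1536 ≈ 3.1M pixels per full-screen layer
(3 instanced black quads + 1 textured quad) x 3.1M ≈ 12.6M fragment invocations per frame
That's a lot of work even for a trivial fragment shader.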
Do you have a glClear at the start of the frame? If not, then you can pay a pretty high price because the first thing the GPU must do when it processes a tile is load in the old contents. With a glClear at the start of your rendering, it knows it needn't bother loading old contents. Maybe you're seeing that price on your first full-screen pass if you don't have a glClear.
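As a minimal sketch (assuming a standard GLKit/EAGL render loop; if you also have a depth or stencil attachment, clear those too), clearing at the very top of the frame looks like this:
//First thing in the frame, before any draw calls, so the tiler never has to reload old contents
glClearColor(0.0, 0.0, 0.0, 1.0)
glClear(GLbitfield(GL_COLOR_BUFFER_BIT) | GLbitfield(GL_DEPTH_BUFFER_BIT))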
So I have a simple simulation set up on my phone. The goal is to have circles of red, white, and blue that appear on the screen with various transparencies. I have most of that working, except for one thing: while transparency sort of works, the only blending happens with the black background. As a result the circle in the center appears dark red instead of showing the white circles under it. What am I doing wrong?
Note: I am working with an orthographic 2D projection matrix. All of the objects' z positions are the same, and they are rendered in a specific order.
Here is how I set it so transparency works:
glEnable(GLenum(GL_DEPTH_TEST))
glEnable(GLenum(GL_POINT_SIZE));
glEnable(GLenum(GL_BLEND))
glBlendFunc(GLenum(GL_SRC_ALPHA), GLenum(GL_ONE_MINUS_SRC_ALPHA))
glEnable(GLenum(GL_POINT_SMOOTH))
//Note: some of these things aren't available in OpenGL ES, but they can't hurt, right?
Here is the fragment shader:
precision mediump float;
varying vec4 outColor;
varying vec3 center;
varying float o_width;
varying float o_height;
varying float o_pointSize;
void main()
{
vec4 fc = gl_FragCoord;
vec3 fp = vec3(fc);
vec2 circCoord = 2.0 * gl_PointCoord - 1.0;
if (dot(circCoord, circCoord) > 1.0) {
discard;
}
gl_FragColor = outColor;//colorOut;
}
Here is how I pass each circle to the shader:
func drawParticle(part: Particle,color_loc: GLint, size_loc: GLint)
{
//print("Drawing: " , part)
let p = part.position
let c = part.color
glUniform4f(color_loc, GLfloat(c.h), GLfloat(c.s), GLfloat(c.v), GLfloat(c.a))
glUniform1f(size_loc, GLfloat(part.size))
glVertexAttribPointer(0, GLint(3), GLenum(GL_FLOAT), GLboolean(GL_FALSE), 0, [p.x, p.y, p.z]);
glEnableVertexAttribArray(0);
glDrawArrays(GLenum(GL_POINTS), 0, GLint(1));
}
Here is how I set it so transparency works:
glEnable(GLenum(GL_DEPTH_TEST))
glEnable(GLenum(GL_POINT_SIZE));
glEnable(GLenum(GL_BLEND))
glBlendFunc(GLenum(GL_SRC_ALPHA), GLenum(GL_ONE_MINUS_SRC_ALPHA))
glEnable(GLenum(GL_POINT_SMOOTH))
And that's not how transparency works. OpenGL is not a scene graph; it just draws geometry in the order you submit it. If the first things you draw are the red circles, they will blend with the background. Once things get drawn that are "behind" the red circles, the "occluded" parts will simply be discarded due to the depth test. There is no way for OpenGL (or any other depth-test-based algorithm) to automatically sort the different depth layers and blend them appropriately.
What you're trying to do there is order-independent transparency, a problem for which efficient solutions are still an active area of research.
For what you want to achieve you'll have to:
sort your geometry far to near and draw it in that order
disable the depth test while rendering the transparent geometry (a minimal sketch follows below)
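A minimal sketch of that in the question's Swift (particles, drawParticle, color_loc and size_loc are the question's names; the sort key assumes a camera looking down -z, so adjust the comparison for your projection):
//Transparent pass: depth test off, standard alpha blending on
glDisable(GLenum(GL_DEPTH_TEST))
glEnable(GLenum(GL_BLEND))
glBlendFunc(GLenum(GL_SRC_ALPHA), GLenum(GL_ONE_MINUS_SRC_ALPHA))
//Draw the farthest circles first so each one blends over what is already behind it
for part in particles.sorted(by: { $0.position.z < $1.position.z }) {
    drawParticle(part: part, color_loc: color_loc, size_loc: size_loc)
}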
I am working on an application that changes the color effects of an image. I have done almost everything. Now the problem is that for one of the effects I have to produce something like the Glowing Edges filter in Photoshop. That filter traces the edges of the image in their own colors and turns the rest of the image black. Using Brad Larson's GPUImage (GPUImageSobelEdgeDetectionFilter or GPUImageCannyEdgeDetectionFilter) I can find the edges, but only as white edges, and I need to find the edges in color. Is there any other way to find edges in color using GPUImage or OpenCV?
Any help would be much appreciated.
Thanks
You really owe it to yourself to play around with writing custom shaders. It's extremely approachable, and can very quickly become powerful if you invest the effort.
That said, I think you're trying for something like this result:
There are many acceptable ways you could get here, but writing a custom shader for a subclass of GPUImageTwoInputFilter then targeting it with both the original image AND the edgeDetection image is how I accomplished the picture you see here.
The subclass would look something like this:
#import "OriginalColorEdgeMixer.h"
//Assumes you have targeted this filter with the original image first, then with an edge detection filter that returns white pixels on edges
//We are setting the threshold manually here, but it could just as easily be a uniform which is dynamically fed at runtime
#if TARGET_IPHONE_SIMULATOR || TARGET_OS_IPHONE
NSString *const kOriginalColorEdgeMixer = SHADER_STRING
(
varying highp vec2 textureCoordinate;
varying highp vec2 textureCoordinate2;
uniform sampler2D inputImageTexture;
uniform sampler2D inputImageTexture2;
lowp float threshold;
mediump float resultingRed;
mediump float resultingGreen;
mediump float resultingBlue;
void main()
{
mediump vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
mediump vec4 textureColor2 = texture2D(inputImageTexture2, textureCoordinate2);
threshold = step(0.3, textureColor2.r);
resultingRed = threshold * textureColor.r;
resultingGreen = threshold * textureColor.g;
resultingBlue = threshold *textureColor.b;
gl_FragColor = vec4(resultingRed, resultingGreen, resultingBlue, textureColor.a);
}
);
#else
NSString *const kGPUImageDifferenceBlendFragmentShaderString = SHADER_STRING
(
varying vec2 textureCoordinate;
varying vec2 textureCoordinate2;
uniform sampler2D inputImageTexture;
uniform sampler2D inputImageTexture2;
float threshold;
float resultingRed;
float resultingGreen;
float resultingBlue;
void main()
{
vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
vec4 textureColor2 = texture2D(inputImageTexture2, textureCoordinate2);
threshold = step(0.3,textureColor2.r);
resultingRed = threshold * textureColor.r;
resultingGreen = threshold * textureColor.g;
resultingBlue = threshold *textureColor.b;
gl_FragColor = vec4(resultingRed, resultingGreen, resultingBlue, textureColor.a);
}
);
#endif
@implementation OriginalColorEdgeMixer
- (id)init;
{
if (!(self = [super initWithFragmentShaderFromString:kOriginalColorEdgeMixer]))
{
return nil;
}
return self;
}
@end
As I've written this, we're expecting the edgeDetection filter's output to be the second input of this custom filter.
I arbitrarily chose a threshold value of 0.3 for intensities on the edgeDetection image to enable the original color to show through. This could easily be made dynamic by tying it to a uniform fed from a UISlider in your app (there are many examples of this in Brad's sample code).
For the sake of clarity for people just starting out with GPUImage, using that custom filter you wrote is really easy. I did it like this:
[self configureCamera];
edgeDetection = [[GPUImageSobelEdgeDetectionFilter alloc] init];
edgeMixer = [[OriginalColorEdgeMixer alloc] init];
[camera addTarget:edgeDetection];
[camera addTarget:edgeMixer];
[edgeDetection addTarget:edgeMixer];
[edgeMixer addTarget:_previewLayer];
[camera startCameraCapture];
In summary, don't be scared to start writing some custom shaders! The learning curve is brief, and the errors thrown by the debugger are extremely helpful in letting you know exactly where you messed up the syntax.
Lastly, this is a great place for documentation of the syntax and usage of OpenGL-specific functions.
I am trying to learn shaders to implement something in my iPhone app. So far I have understood easy examples like converting a color image to grayscale, thresholding, etc. Most of the examples involve simple operations where processing the input image pixel I(x,y) results in a simple modification of the colors of that same pixel.
But what about convolutions? For example, the easiest example would be the Gaussian filter,
where the output image pixel O(x,y) depends not only on I(x,y) but also on the surrounding 8 pixels:
O(x,y) = (I(x,y) + the 8 surrounding pixel values) / 9
Normally, this cannot be done with one single image buffer, or the input pixels would change as the filter is performed. How can I do this with shaders? Also, should I handle the borders myself, or is there a built-in function or something that checks for invalid pixel accesses like I(-1,-1)?
Thanks in advance
PS: I will be generous(read:give a lot of points) ;)
A highly optimized shader-based approach for performing a nine-tap Gaussian blur was presented by Daniel Rákos. His process uses the underlying interpolation provided by hardware texture filtering to perform a nine-tap filter using only five texture reads per pass. This is also split into separate horizontal and vertical passes to further reduce the number of texture reads required.
I rolled an implementation of this, tuned for OpenGL ES and the iOS GPUs, into my image processing framework (under the GPUImageFastBlurFilter class). In my tests, it can perform a single blur pass of a 640x480 frame in 2.0 ms on an iPhone 4, which is pretty fast.
I used the following vertex shader:
attribute vec4 position;
attribute vec2 inputTextureCoordinate;
uniform mediump float texelWidthOffset;
uniform mediump float texelHeightOffset;
varying mediump vec2 centerTextureCoordinate;
varying mediump vec2 oneStepLeftTextureCoordinate;
varying mediump vec2 twoStepsLeftTextureCoordinate;
varying mediump vec2 oneStepRightTextureCoordinate;
varying mediump vec2 twoStepsRightTextureCoordinate;
void main()
{
gl_Position = position;
vec2 firstOffset = vec2(1.3846153846 * texelWidthOffset, 1.3846153846 * texelHeightOffset);
vec2 secondOffset = vec2(3.2307692308 * texelWidthOffset, 3.2307692308 * texelHeightOffset);
centerTextureCoordinate = inputTextureCoordinate;
oneStepLeftTextureCoordinate = inputTextureCoordinate - firstOffset;
twoStepsLeftTextureCoordinate = inputTextureCoordinate - secondOffset;
oneStepRightTextureCoordinate = inputTextureCoordinate + firstOffset;
twoStepsRightTextureCoordinate = inputTextureCoordinate + secondOffset;
}
and the following fragment shader:
precision highp float;
uniform sampler2D inputImageTexture;
varying mediump vec2 centerTextureCoordinate;
varying mediump vec2 oneStepLeftTextureCoordinate;
varying mediump vec2 twoStepsLeftTextureCoordinate;
varying mediump vec2 oneStepRightTextureCoordinate;
varying mediump vec2 twoStepsRightTextureCoordinate;
// const float weight[3] = float[]( 0.2270270270, 0.3162162162, 0.0702702703 );
void main()
{
lowp vec3 fragmentColor = texture2D(inputImageTexture, centerTextureCoordinate).rgb * 0.2270270270;
fragmentColor += texture2D(inputImageTexture, oneStepLeftTextureCoordinate).rgb * 0.3162162162;
fragmentColor += texture2D(inputImageTexture, oneStepRightTextureCoordinate).rgb * 0.3162162162;
fragmentColor += texture2D(inputImageTexture, twoStepsLeftTextureCoordinate).rgb * 0.0702702703;
fragmentColor += texture2D(inputImageTexture, twoStepsRightTextureCoordinate).rgb * 0.0702702703;
gl_FragColor = vec4(fragmentColor, 1.0);
}
to perform this. The two passes can be achieved by sending a 0 value for the texelWidthOffset (for the vertical pass), and then feeding that result into a run where you give a 0 value for the texelHeightOffset (for the horizontal pass).
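For reference, here is where those magic numbers come from (as I understand Rákos' derivation; the underlying nine-tap weights below are the ones from his article). Each linear-filtered tap replaces two adjacent texels of the nine-tap kernel, so the combined weight is the sum of the pair and the combined offset is the weight-averaged position of the pair:
weight1 = 0.1945946 + 0.1216216 = 0.3162162
offset1 = (1.0 * 0.1945946 + 2.0 * 0.1216216) / 0.3162162 = 1.3846154
weight2 = 0.0540541 + 0.0162162 = 0.0702703
offset2 = (3.0 * 0.0540541 + 4.0 * 0.0162162) / 0.0702703 = 3.2307692
The centre texel keeps its own weight of 0.2270270, which is why only five texture reads are needed.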
I also have some more advanced examples of convolutions in the above-linked framework, including Sobel edge detection.
Horizontal blur taking advantage of bilinear interpolation. The vertical blur pass is analogous. Unroll the loop to optimise.
//5 offsets for 10 pixel sampling!
//(GLSL ES 3.0 / desktop array syntax; on ES 2.0, unroll the loop and inline the constants)
float offset[5] = float[5](-4.0, -2.0, 0.0, 2.0, 4.0);
//int weight[5] = int[5](1, 4, 6, 4, 1); //sum = 16
float weightInverse[5] = float[5](0.0625, 0.25, 0.375, 0.25, 0.0625);
vec4 finalColor = vec4(0.0);
//texCoord is the interpolated texture coordinate and texelWidth = 1.0 / texture width (both assumed to be passed in)
for(int i = 0; i < 5; i++)
    finalColor += texture2D(inputImage, texCoord + vec2(offset[i] * texelWidth, 0.0)) * weightInverse[i];
As I'm a complete noob with shaders, I've run into some problems while trying to get a 2D lighting system to work. It basically covers the screen with a black 2D texture that has transparent holes where the lit areas are.
Since I'm using only one texture, I guess I must do this in the fragment shader, right?
Fragment shader:
#ifdef GL_ES
precision mediump float;
#endif
// Texture, coordinates and size
uniform sampler2D u_texture;
varying vec2 v_texCoord;
uniform vec2 textureSize;
uniform int lightCount;
struct LightSource
{
vec2 position;
float radius;
float strength;
};
uniform LightSource lights[10];
void main()
{
float alpha = 1.0;
vec2 pos = vec2(v_texCoord.x * textureSize.x, v_texCoord.y * textureSize.y);
int i;
for (i = 0; i < lightCount; i++)
{
LightSource source = lights[i];
float distance = distance(source.position, pos);
if (distance < source.radius)
{
alpha -= mix(source.strength, 0.0, distance/source.radius);
}
}
gl_FragColor = vec4(0.0, 0.0, 0.0, alpha);
}
The problem is that the performance is really terrible (it cannot run at 60 fps with 2 lights and nothing else on screen). Any suggestions to make it better, or even different ways to approach this problem?
By the way, I'm doing this from cocos2d-x, so if anyone has an idea that uses cocos2d elements it will be welcome as well :)
I totally agree with Tim. If you want to improve the overall speed, you have to avoid for loops. If the lights array size is always ten, I recommend replacing the loop statement with ten copies of the loop body. Be aware that any variable you declare inside the loop will be freed at the end of each iteration, so splitting the loop into ten parts is a good idea (ugly, but it's an old-school trick ;)
Besides, I also recommend testing with individual statements removed to see which instruction is causing the trouble. I bet that the mix operation is the culprit. I don't know anything about cocos2d, but is it possible to make a single call to mix at the end of the process, using a summation of distances and strengths? It seems that somewhere in there is a pretty expensive floating-point operation.
Two things I would try (not guaranteed to help)
Remove the for loop and just hardcode in two lights (a combined sketch of this appears at the end of this answer). For loops can be expensive if they are not handled properly by the driver, and it would be good to know whether that is what's slowing you down.
If statements can be expensive, and I don't think that's a good application of mix (you're computing a*(1-c) + 0.0*c, and the second half of that term is pointless). I might try replacing this if statement:
if (distance < source.radius)
{
alpha -= mix(source.strength, 0.0, distance/source.radius);
}
With this single line:
alpha -= (1.0-min(distance/source.radius, 1.0)) * source.strength;
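Putting both suggestions together, a hedged sketch of what the fully unrolled, branch-free main() might look like for exactly two lights (lights, textureSize and v_texCoord are the question's declarations; treat this as illustrative, not tested):
void main()
{
    vec2 pos = v_texCoord * textureSize;
    //Light 0: contribution falls to zero at and beyond the radius, no branch needed
    float d0 = distance(lights[0].position, pos);
    float a0 = (1.0 - min(d0 / lights[0].radius, 1.0)) * lights[0].strength;
    //Light 1
    float d1 = distance(lights[1].position, pos);
    float a1 = (1.0 - min(d1 / lights[1].radius, 1.0)) * lights[1].strength;
    gl_FragColor = vec4(0.0, 0.0, 0.0, 1.0 - a0 - a1);
}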
I am learning shader programming and looking for examples, specifically for image processing. I'd like to apply some Photoshop effects to my photos, e.g. Curves, Levels, Hue/Saturation adjustments, etc.
I'll assume you have a simple uncontroversial vertex shader, as it's not really relevant to the question, such as:
void main()
{
gl_Position = modelviewProjectionMatrix * position;
texCoordVarying = vec2(textureMatrix * vec4(texCoord0, 0.0, 1.0));
}
So that does much the same as ES 1.x would if lighting was disabled, including the texture matrix that hardly anyone ever uses.
I'm not a Photoshop expert, so please forgive my statements of what I think the various tools do — especially if I'm wrong.
I think I'm right to say that the levels tool effectively stretches (and clips) the brightness histogram? In that case an example shader could be:
varying mediump vec2 texCoordVarying;
uniform sampler2D tex2D;
const mediump mat4 rgbToYuv = mat4( 0.257, 0.439, -0.148, 0.06,
0.504, -0.368, -0.291, 0.5,
0.098, -0.071, 0.439, 0.5,
0.0, 0.0, 0.0, 1.0);
const mediump mat4 yuvToRgb = mat4( 1.164, 1.164, 1.164, -0.07884,
2.018, -0.391, 0.0, 1.153216,
0.0, -0.813, 1.596, 0.53866,
0.0, 0.0, 0.0, 1.0);
uniform mediump float centre, range;
void main()
{
lowp vec4 srcPixel = texture2D(tex2D, texCoordVarying);
lowp vec4 yuvPixel = rgbToYuv * srcPixel;
yuvPixel.r = ((yuvPixel.r - centre) * range) + 0.5;
gl_FragColor = yuvToRgb * yuvPixel;
}
You'd control that by setting the centre of the range you want to let through (which will be moved to the centre of the output range) and the total range you want to let through (1.0 for the entire range, 0.5 for half the range, etc).
One thing of interest is that I switch from the RGB input space to a YUV colour space for the intermediate adjustment. I do that using a matrix multiplication. I then adjust the brightness channel, and apply another matrix that transforms back from YUV to RGB. To me it made most sense to work in a luma/chroma colour space and from there I picked YUV fairly arbitrarily, though it has the big advantage for ES purposes of being a simple linear transform of RGB space.
I am under the impression that the curves tool also remaps the brightness, but according to some function f(x) = y which is monotonically increasing (so it will intersect any horizontal or vertical line exactly once) and is set in the interface as a curve from bottom left to top right somehow.
Because GL ES isn't fantastic with data structures and branching is to be avoided where possible, I'd suggest the best way to implement that is to upload a 256x1 luminance texture where the value at 'x' is f(x). Then you can just map through the secondary texture, e.g. with:
... same as before down to ...
lowp vec4 yuvPixel = rgbToYuv * srcPixel;
yuvPixel.r = texture2D(lookupTexture, vec2(yuvPixel.r, 0.0)).r;
... and as above to convert back to RGB, etc ...
You're using a spare texture unit to index a lookup table, effectively. On iOS devices that support ES 2.0 you get at least eight texture units so you'll hopefully have one spare.
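If it helps, a hedged sketch of how that 256x1 lookup texture might be created host-side (Swift, assuming an ES 2.0 context and that curve holds your 256 precomputed f(x) bytes; the function name is made up for illustration):
//Builds a 256x1 luminance texture so the shader can read f(x) at texture coordinate x
func makeCurveLookupTexture(curve: [UInt8]) -> GLuint {
    precondition(curve.count == 256)
    var tex: GLuint = 0
    glGenTextures(1, &tex)
    glBindTexture(GLenum(GL_TEXTURE_2D), tex)
    //No mipmaps for a 256x1 table; clamp so out-of-range lookups stick to the ends
    glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_MIN_FILTER), GL_LINEAR)
    glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_MAG_FILTER), GL_LINEAR)
    glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_WRAP_S), GL_CLAMP_TO_EDGE)
    glTexParameteri(GLenum(GL_TEXTURE_2D), GLenum(GL_TEXTURE_WRAP_T), GL_CLAMP_TO_EDGE)
    glTexImage2D(GLenum(GL_TEXTURE_2D), 0, GL_LUMINANCE, 256, 1, 0,
                 GLenum(GL_LUMINANCE), GLenum(GL_UNSIGNED_BYTE), curve)
    return tex
}
Bind it to a spare texture unit and set the lookupTexture sampler uniform to that unit before drawing.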
Hue/saturation adjustments are more painful to show because the mapping from RGB to HSV involves a lot of conditionals, but the process is basically the same: map from RGB to HSV, perform the modifications you want on H and S, map back to RGB and output.
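If you only need saturation (not hue), a shortcut consistent with the luma-based approach above is to mix each pixel with its own luma and skip the HSV round trip entirely. A minimal sketch (saturation is an assumed uniform: 0.0 gives greyscale, 1.0 leaves the image unchanged, values above 1.0 oversaturate):
varying mediump vec2 texCoordVarying;
uniform sampler2D tex2D;
uniform mediump float saturation;
void main()
{
    lowp vec4 srcPixel = texture2D(tex2D, texCoordVarying);
    // Rec. 601 luma weights, consistent with the YUV matrices above
    lowp float luma = dot(srcPixel.rgb, vec3(0.299, 0.587, 0.114));
    gl_FragColor = vec4(mix(vec3(luma), srcPixel.rgb, saturation), srcPixel.a);
}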
Based on a quick Google search, this site offers some downloadable code that includes some Photoshop functions (though not curves or levels as far as I can see) and, significantly, supplies example implementations of the functions RGBToHSL and HSLToRGB. It's for desktop GLSL, which has more predefined variables, types and functions, but you shouldn't have any big problems working around that. Just remember to add precision modifiers and supply your own replacements for the absent min and max functions.
For curves, Photoshop uses bicubic spline interpolation. For a given set of control points you can precalculate all 256 values for each channel and for the master curve. I found it easier to store the results as a 256x1 texture, pass it to the shader, and then adjust the value of each component:
uniform sampler2D curvesTexture;
vec3 RGBCurvesAdjustment(vec3 color)
{
return vec3(texture2D(curvesTexture, vec2(color.r, 1.0)).r,
texture2D(curvesTexture, vec2(color.g, 1.0)).g,
texture2D(curvesTexture, vec2(color.b, 1.0)).b);
}