Minimising glDrawArrays calls in OpenGL ES - iPhone

I'd like to hear what people think the optimal approach to draw calls is for OpenGL ES (on the iPhone).
Specifically, I've read in many places that it is best to minimise the number of calls to glDrawArrays/glDrawElements - I think Apple says 10 should be the max in their recent WWDC presentation. As I understand it, to do this you need to put all the vertices into one array if possible, so you only need to make the drawArrays call once.
But I am confused, because this surely means you can't use the translate, rotate and scale functions, since they would apply across the whole geometry. Which is fine, except doesn't that mean you need to pre-calculate every vertex position yourself, rather than getting OpenGL to do it?
Also, doesn't it mean you can't use any of the fan/strip settings unless you just have a continuous shape?
These drawbacks make me think I'm not understanding something correctly, so I guess I'm looking for confirmation that I should:
Be trying to make an uber array of all triangles to draw.
Resign myself to the fact I'll have to work out all the vertex positions myself.
Forget about pushing and popping each thing to draw into its desired location
Is that what others do?
Thanks

Vast question; batching is always a matter of compromise.
The ideal structure for performance would be, as you mention, to have one single array containing all the triangles to draw.
Starting from there, we can add constraints:
One additional constraint is that keeping vertex indices in 16 bits saves bandwidth and memory, and is probably the fast path for your platform. So you could consider grouping triangles into chunks of at most 65536 vertices.
Then, if you want to switch the shader/material/glState used to draw geometry, you have no choice (*) but to emit one draw call per shader/material/glState. So you could consider grouping triangles by shaderID/materialID/glStateID.
Next, if you want to animate things, you have no choice (*) but to transmit your transform matrix to GL and then issue a draw call. So you could consider grouping triangles into 'transform groups': for example, all static geometry together, and animated geometry that shares a common transform grouped as well.
In these cases, you'd have to transform the vertices yourself (on the CPU) before merging the meshes together.
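To make that concrete, here is a hedged sketch (TypeScript with WebGL-style naming, since the calls have the same shape as OpenGL ES on the iPhone): several static meshes are pre-transformed on the CPU, merged into one vertex array and one 16-bit index array, and drawn with a single call. `transformPoint` is a hypothetical helper, not part of any GL API.

```ts
// Hypothetical helper: applies a 4x4 column-major matrix to (x, y, z).
declare function transformPoint(m: Float32Array, x: number, y: number, z: number): [number, number, number];

interface SourceMesh {
  positions: Float32Array;  // xyz triples in object space
  indices: Uint16Array;     // triangle list
  world: Float32Array;      // 4x4 model transform
}

function mergeStaticMeshes(meshes: SourceMesh[]) {
  let vertexCount = 0, indexCount = 0;
  for (const m of meshes) { vertexCount += m.positions.length / 3; indexCount += m.indices.length; }
  if (vertexCount > 65536) throw new Error("start a new batch: 16-bit indices exhausted");

  const positions = new Float32Array(vertexCount * 3);
  const indices = new Uint16Array(indexCount);
  let vertexOffset = 0, indexOffset = 0;

  for (const m of meshes) {
    // Bake the per-mesh transform on the CPU, since there will be only one draw call.
    for (let v = 0; v < m.positions.length; v += 3) {
      const p = transformPoint(m.world, m.positions[v], m.positions[v + 1], m.positions[v + 2]);
      positions.set(p, vertexOffset * 3 + v);
    }
    // Re-base the indices so they point into the merged vertex array.
    for (let i = 0; i < m.indices.length; i++) indices[indexOffset + i] = m.indices[i] + vertexOffset;
    vertexOffset += m.positions.length / 3;
    indexOffset += m.indices.length;
  }
  return { positions, indices };
}
// Upload both arrays once, then draw everything with a single
// gl.drawElements(gl.TRIANGLES, indices.length, gl.UNSIGNED_SHORT, 0);
```

Anything that needs a different material, or a transform that changes per frame, goes into its own batch, as described above.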
Regarding triangle strips, you can convert any mesh into strips, even if it has discontinuities in its topology, by introducing degenerate triangles, so this is a technique that always applies.
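As a small illustration (a sketch, not from the original answer): two index strips can be glued by repeating the last index of the first strip and the first index of the second, which yields zero-area triangles the GPU discards.

```ts
// Concatenate two triangle-strip index lists into one drawable strip.
// The repeated indices produce degenerate (zero-area) triangles; the extra
// duplicate keeps winding parity consistent when `a` has an odd length.
function joinStrips(a: number[], b: number[]): number[] {
  const bridge = [a[a.length - 1], b[0]];
  if (a.length % 2 !== 0) bridge.push(b[0]); // parity fix
  return [...a, ...bridge, ...b];
}
```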
All in all, reducing draw calls is a game of compromises; some techniques might work well for one 3D model, while others may be better suited to other models. IMHO, the key is to be creative and to carefully benchmark your application to see whether your changes actually improve performance on your target platform.
HTH, cheers,
(*) Actually, there are techniques that let you reduce the number of draw calls even in these cases, such as:
texture atlases, which group different textures into a single one to avoid switching textures in GL and thus help limit draw calls (see the UV sketch after this list)
(pseudo) hardware instancing, which allows shaders to fetch transforms from various sources to transform mesh instances in different ways
...
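For the texture-atlas option, the geometry side is mostly a UV remap; a hedged sketch, assuming a simple square grid of equally sized tiles:

```ts
// Remap a mesh's 0..1 UVs into tile (tileX, tileY) of a square atlas laid
// out as tilesPerRow x tilesPerRow equally sized sub-images.
function remapUVsToAtlas(uvs: Float32Array, tileX: number, tileY: number, tilesPerRow: number): void {
  const scale = 1 / tilesPerRow;
  for (let i = 0; i < uvs.length; i += 2) {
    uvs[i]     = (tileX + uvs[i])     * scale; // u
    uvs[i + 1] = (tileY + uvs[i + 1]) * scale; // v
  }
}
```

Meshes that previously used different textures can then share one texture binding and be merged into the same batch.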

Related

Different ways to detect size of image on mesh versus size of mesh

I'm creating a puzzle game that generates random-sized pieces with 2D meshes. The images contain transparent portions, and sometimes a piece is completely transparent. I need to detect what percentage of a piece is transparent. One way I found to do this is to go pixel by pixel. I posted my solution to this HERE. However, this process adds a few seconds during loading which I'd like to avoid, and I'm looking for other ideas.
I've considered using the selection outline of a MeshCollider to somehow get a surface area I can compare to the surface area of the mesh, but everything I find is about rendering outlines with specialized shaders. Does anyone have any ideas on how to solve this?
1) I guess you could add a PolygonCollider2D to your sprite and use its path for the outline and the calculation of the surface area. I'm not sure, however, whether this will be faster.
PolygonCollider2D.GetPath:
A path is a cyclic sequence of line segments between points that define the outline of the Collider
Checking PolygonCollider2D.GetTotalPointCount or path length may be good enough to determine if the sprite is 'empty'.
Sprite.vertices and Sprite.triangles may also be helpful.
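If you go the collider-path route, the area enclosed by each path can be computed with the shoelace formula. A hedged sketch (Point2 is a stand-in for the engine's 2D vector type; the points are assumed to come from something like GetPath):

```ts
interface Point2 { x: number; y: number; }

// Shoelace formula: area enclosed by a closed outline given as an ordered
// list of points. Summing the areas of all collider paths and comparing
// against the full piece's area estimates how much of it is "empty".
function polygonArea(path: Point2[]): number {
  let twiceArea = 0;
  for (let i = 0; i < path.length; i++) {
    const a = path[i];
    const b = path[(i + 1) % path.length]; // the path is cyclic, so wrap around
    twiceArea += a.x * b.y - b.x * a.y;
  }
  return Math.abs(twiceArea) / 2;
}
```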
2) You could also improve the performance of your first approach (see the sketch after this list):
instead of calling GetPixel as you do now, use GetPixels or GetPixels32 and loop through the returned array in a single for loop
Using GetPixels can be faster than calling GetPixel repeatedly, especially for large textures. In addition, GetPixels can access individual mipmap levels. For most textures, even faster is to use GetPixels32 which returns low precision color data without costly integer-to-float conversions.
check only every 2nd or nth pixel, since an approximation should be good enough
limit the number of type casts
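A language-agnostic sketch of the single-loop idea (in Unity the array would be the Color32[] returned by GetPixels32; here a flat RGBA byte array stands in, and the sampling step is an assumption you would tune):

```ts
// Estimate the fraction of transparent pixels by checking the alpha byte
// of every `step`-th pixel in one pass over a flat RGBA array.
function transparentFraction(rgba: Uint8Array, step: number = 4): number {
  let sampled = 0, transparent = 0;
  for (let i = 3; i < rgba.length; i += 4 * step) { // i points at an alpha byte
    sampled++;
    if (rgba[i] === 0) transparent++;
  }
  return sampled === 0 ? 0 : transparent / sampled;
}
```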

Why does merging geometries improve rendering speed?

In my web application I only need to add static objects to my scene. It was working slowly, so I started searching and found that merging geometries and merging vertices were the solution. When I implemented it, it indeed worked a lot better. All the articles said that the reason for this improvement is the decrease in the number of WebGL calls. As I am not very familiar with things like OpenGL and WebGL (I use Three.js to avoid their complexity), I would like to know why exactly it reduces the WebGL calls.
Because you send one large object instead of many little ones, the overhead is reduced. So I understand that loading one big mesh into the scene is faster than loading many small meshes.
BUT I do not understand why merging geometries also has a positive influence on the rendering itself. I would also like to know the difference between merging geometries and merging vertices.
Thanks in advance!
three.js is a framework that helps you work with the WebGL API.
What a "mesh" is to three.js is, to WebGL, a series of low-level calls that set up state and issue work to the GPU.
Let's take a sphere for example. With three.js you would create it with a few lines:
var sphereGeometry = new THREE.SphereGeometry(10);
var sphereMaterial = new THREE.MeshBasicMaterial({color:'red'});
var sphereMesh = new THREE.Mesh( sphereGeometry, sphereMaterial);
myScene.add( sphereMesh );
You have your renderer.render() call, and poof, a sphere appears on screen.
A lot of stuff happens under the hood though.
The first line creates the sphere "geometry": the CPU does a bunch of math and logic to describe a sphere with points and triangles. Points are vectors, three floats grouped together; triangles are a structure that groups these points by indices (groups of integers).
Somewhere there is a loop that calculates the vectors based on trigonometry (sin, cos), and another that weaves the resulting array of vectors into triangles (take every N, N + M, N + 2M, create a triangle, etc.).
Now these numbers exist in JavaScript land; it's just a bunch of floats and ints, grouped together in a specific way to describe shapes such as cubes, spheres and aliens.
You need a way to draw this construct on a screen - a two dimensional array of pixels.
WebGL does not actually know much about 3D. It knows how to manage memory on the gpu, how to compute things in parallel (or gives you the tools), it does know how to do mathematical operations that are crucial for 3d graphics, but the same math can be used to mine bitcoins, without even drawing anything.
In order for WebGL to draw something on screen, it first needs the data put into appropriate buffers; it needs the shader programs; it needs to be set up for that specific call (is there going to be blending, i.e. transparency in three.js land, depth testing, stencil testing, etc.); then it needs to know what it's actually drawing (you need to provide strides, attribute sizes and so on, so it knows where a 'mesh' actually is in memory), how it's drawing it (triangle strips, fans, points...), and what to draw it with, i.e. which shaders it will apply to the data you provided.
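To give a feel for that, here is a hedged sketch of roughly what has to happen for every mesh, every frame, in raw WebGL; the program, attribute and uniform locations are assumed to have been created and looked up elsewhere:

```ts
declare const positionLocation: number;                    // from gl.getAttribLocation(...)
declare const worldMatrixLocation: WebGLUniformLocation;   // from gl.getUniformLocation(...)

interface GpuMesh { vbo: WebGLBuffer; ibo: WebGLBuffer; indexCount: number; worldMatrix: Float32Array; }

function drawOneMesh(gl: WebGLRenderingContext, mesh: GpuMesh): void {
  gl.bindBuffer(gl.ARRAY_BUFFER, mesh.vbo);                              // where the vertices live
  gl.vertexAttribPointer(positionLocation, 3, gl.FLOAT, false, 0, 0);    // how to read them
  gl.enableVertexAttribArray(positionLocation);
  gl.uniformMatrix4fv(worldMatrixLocation, false, mesh.worldMatrix);     // this object's transform
  gl.bindBuffer(gl.ELEMENT_ARRAY_BUFFER, mesh.ibo);
  gl.drawElements(gl.TRIANGLES, mesh.indexCount, gl.UNSIGNED_SHORT, 0);  // the actual draw call
}
// With a thousand separate meshes this runs a thousand times per frame;
// with one merged geometry it runs once.
```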
So, you need a way to 'teach' WebGL to do 3d.
I think the best way to get familiar with this concept is to look at this tutorial, re-reading it if necessary, because it explains what happens for pretty much every 3d object drawn in perspective, ever.
To sum up the tutorial:
a perspective camera is basically two 4x4 matrices: a perspective matrix, which puts things into perspective, and a view matrix, which moves the entire world into camera space. Every camera you make consists of these two matrices.
Every object exists in its own object space. A TRS matrix (the world matrix in three.js terms) is used to transform this object into world space.
So this stuff, a concept such as the "projection matrix", is what teaches WebGL how to draw perspective.
Three.js abstracts this further and gives you things like "field of view" and "aspect ratio" instead of left/right, top/bottom.
Three.js also abstracts the transformation matrices (view matrix on the camera, and world matrices on every object) because it allows you to set "position" and "rotation" and computes the matrix based on this under the hood.
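A hedged illustration of where those matrices live in three.js (continuing the snippet above, so THREE and sphereMesh come from there; the property names are real three.js API, the values are arbitrary):

```ts
const camera = new THREE.PerspectiveCamera(60, 16 / 9, 0.1, 1000); // fov, aspect, near, far
camera.position.set(0, 5, 20);
camera.updateMatrixWorld();                          // also refreshes matrixWorldInverse

const projectionMatrix = camera.projectionMatrix;    // the "perspective matrix"
const viewMatrix = camera.matrixWorldInverse;        // the "view matrix"

sphereMesh.position.set(3, 0, 0);                    // set TRS, three.js builds the matrix
sphereMesh.rotation.y = Math.PI / 4;
sphereMesh.updateMatrixWorld();
const worldMatrix = sphereMesh.matrixWorld;          // the TRS / "world matrix"
```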
Since every mesh has to be processed by the vertex shader and the pixel shader in order to appear on the screen, every mesh needs to have all this information available.
When a draw call is issued for a specific mesh, that mesh will have the same perspective matrix and view matrix as any other object being rendered with the same camera. Each will have its own world matrix: the numbers that move it around your scene.
This is transformation alone, happening in the vertex shader. These results are then rasterized, and go to the pixel shader for processing.
Let's consider two materials: black plastic and red plastic. They will have the same shader, perhaps one you wrote using THREE.ShaderMaterial, or maybe one from three's library. It's the same shader, but it has one uniform value exposed: color. This allows you to have many instances of a plastic material (green, blue, pink), but it means that each of these requires a separate draw call.
WebGL will have to issue specific calls to change that uniform from red to black, and then it's ready to draw stuff using that 'material'.
So now imagine a particle system displaying a thousand cubes, each with a unique color. You have to issue a thousand draw calls to draw them all if you treat them as separate meshes and change colors via a uniform.
If, on the other hand, you assign vertex colors to each cube, you no longer rely on the uniform but on an attribute. Now if you merge all the cubes together, you can issue a single draw call, processing all the cubes with the same shader.
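A hedged sketch of that cube example against recent three.js (the merge helper lives in the BufferGeometryUtils addon; older releases call it mergeBufferGeometries and use a different import path, so adjust for your version; myScene is the scene from the earlier snippet):

```ts
import * as THREE from 'three';
import { mergeGeometries } from 'three/addons/utils/BufferGeometryUtils.js';

const pieces: THREE.BufferGeometry[] = [];
for (let i = 0; i < 1000; i++) {
  const g = new THREE.BoxGeometry(1, 1, 1);
  // Bake one random colour per cube into a per-vertex 'color' attribute.
  const c = new THREE.Color(Math.random(), Math.random(), Math.random());
  const colors = new Float32Array(g.attributes.position.count * 3);
  for (let v = 0; v < g.attributes.position.count; v++) colors.set([c.r, c.g, c.b], v * 3);
  g.setAttribute('color', new THREE.BufferAttribute(colors, 3));
  g.translate(Math.random() * 100, Math.random() * 100, Math.random() * 100);
  pieces.push(g);
}

const merged = mergeGeometries(pieces);
const cubes = new THREE.Mesh(merged, new THREE.MeshBasicMaterial({ vertexColors: true }));
myScene.add(cubes); // one draw call for all 1000 cubes
```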
You can see why this is more efficient simply by taking a glance at the WebGLRenderer from three.js and all the stuff it has to do in order to translate your 3d calls to WebGL. Better done once than a thousand times.
Back to those lines of code: sphereMaterial can take a color argument, and if you look at the source, this translates to a uniform vec3 in the shader. However, you can also achieve the same thing with vertex colors, assigning the color you want beforehand.
sphereMesh will wrap that computed geometry into an object that three's WebGLRenderer understands, which in turn sets up WebGL accordingly.

Alternatives to diamond-square for incremental procedural terrain generation?

I'm currently in the process of coding a procedural terrain generator for a game. For that purpose, I divide my world into chunks of equal size and generate them one by one as the player strolls along. So far, nothing special.
Now, I specifically don't want the world to be persistent, i.e. if a chunk gets unloaded (maybe because the player moved too far away) and later loaded again, it should not be the same as before.
From my understanding, implicit approaches like treating 3D Simplex Noise as a density function input for Marching Cubes don't suit my problem. That is because I would need to reseed the generator to obtain different return values for the same point in space, leading to discontinuities along chunk borders.
I also looked into Midpoint Displacement / Diamond-Square. By seeding each chunk's heightmap with values from the borders of adjacent chunks and randomizing the chunk corners that don't have any other chunks nearby, I was able to generate a tileable terrain that exhibits the desired behavior. Still, the results look rather dull. Specifically, since this method relies on heightmaps, it lacks overhangs and the like. Moreover, even with the corner randomization, terrain features tend to be confined to small areas, i.e. there are no multiple-chunk hills or similar landmarks.
Now I was wondering if there are other approaches to this that I haven't heard of/thought about yet. Any help is highly appreciated! :)
Cheers!
Post process!
After you do the heightmaps, run back through adding features.
This is how Minecraft does it to get the various caverns and cliff overhangs.

Drawing a 3D tree structure in WebGL

I am working on drawing large directed acyclic graphs in WebGL using the gwt-g3d library as per the technique shown here: http://www-graphics.stanford.edu/papers/h3/
At this point, I have a simple two-level graph rendering.
Performance is terrible -- it takes about 1.5-2 seconds to render this thing. I'm not an OpenGL expert, so here is the general approach I am taking. Maybe somebody can point out some optimizations that will get this rendering quicker.
I am astonished how long it takes to push the MODELVIEW matrix and buffers to the graphics card. This is where the lion's share of the time is wasted. Should I instead be doing MODELVIEW transformations in the vertex shader?
This leads me to believe that manipulating the MODELVIEW matrix and pushing it once for each node shouldn't be a bad practice, but the timings don't lie:
https://gamedev.stackexchange.com/questions/27042/translate-the-modelview-matrix-or-change-vertex-coordinates
Group nodes into larger chunks instead of rendering them separately. In the background, cache all geometry whose transforms will most likely not be modified, with those transforms already applied, store it in one buffer, and render it in one call.
Another solution: store the nodes (box + line) in one buffer (you can store more than you need at the current time) and their transformations in a texture, then apply the transformations in the vertex shader based on a node index (texture coordinates). It should be drastically faster.
To test for support, use this site. I have MAX_VERTEX_TEXTURE_IMAGE_UNITS = 4.
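A hedged sketch of what that vertex shader could look like (plain GLSL in a TypeScript string; the attribute/uniform names are illustrative, not from gwt-g3d, and it assumes a floating-point texture holding one 4x4 matrix per node as four RGBA texels in a single row):

```ts
const vertexShaderSource = `
  attribute vec3 aPosition;
  attribute float aNodeIndex;          // which node this vertex belongs to
  uniform mat4 uProjection;
  uniform mat4 uView;
  uniform sampler2D uTransforms;       // width = 4 * nodeCount, height = 1
  uniform float uTexWidth;

  mat4 nodeMatrix(float index) {
    float base = index * 4.0;
    vec4 c0 = texture2D(uTransforms, vec2((base + 0.5) / uTexWidth, 0.5));
    vec4 c1 = texture2D(uTransforms, vec2((base + 1.5) / uTexWidth, 0.5));
    vec4 c2 = texture2D(uTransforms, vec2((base + 2.5) / uTexWidth, 0.5));
    vec4 c3 = texture2D(uTransforms, vec2((base + 3.5) / uTexWidth, 0.5));
    return mat4(c0, c1, c2, c3);
  }

  void main() {
    gl_Position = uProjection * uView * nodeMatrix(aNodeIndex) * vec4(aPosition, 1.0);
  }
`;
// Updating a node's transform then only touches the texture, and all nodes
// still go out in one draw call.
```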
The best solution would be geometry instancing, but it currently isn't supported in WebGL.

Fractal microscope simulator

I've done work on software used for controlling imaging hardware, such as microscopes, that are sometimes hard to get time on. This means it is difficult to test out new/different algorithms which would require access to the instrument. I'd like to create a synthetic instrument that could be used for some of these testing purposes, and I was thinking of using some kind of fractal image generation to create the synthetic images. The key would be to be able to generate features at many different 'magnifications' and locations in some sort of deterministic manner. This is because some of the algorithms being tested may need to pan/zoom and relocate previously 'imaged' areas. Onto these base images I can then apply whatever instrument 'defects' are appropriate (focus, noise, saturation, etc.).
I'm at a bit of a loss on how to select/implement a good fractal algorithm for the base image. Any help would be appreciated. Preferably it would have the following qualities:
Be fast at rendering new image areas.
Fairly wide 'feature' coverage at as many locations and scales as possible.
Be deterministic (but initialized from random starting parameters).
Ability to tune to make images look more like 'real' images.
Item 2 is important; for example, a Mandelbrot set, with its large smooth/empty regions, might not be good, since the software controlling the synthetic scope might fall into one of these areas.
So far I've thought of using something like a Mandelbrot set, but randomly shifting/rotating/scaling and merging two or more fractal sets to get more complete 'feature' coverage.
I've also seen images of the fractal flame algorithms and they seem to generate images that might be useful (and nice to look at).
Finally, I've thought of using some sort of paused particle simulation run to generate images that are more cell-like (my current imaging target), but I'm not sure if this approach can be made to work with the other requirements.
Edit:
@Jeffrey - So it sounds like some kind of terrain generation might be the way to go, as long as I have complete control over the PRNG. Perhaps I can use some stored initial seed + x position + y position to generate my random numbers? But then I am unsure how to consistently generate the terrain across scales, except, as you mentioned, to create the base terrain at the coarsest scale and, at certain pre-determined 'magnifications', add new deterministic pseudo-random variations to this base. I'd also have to be careful about when to generate the next level of terrain, since if I'm too aggressive I'd have to generate and integrate the results appropriately for display at the coarser level... This is why I initially was leaning toward a more 'traditional' fractal, since this integration from finer scales would be handled more implicitly (I think).
The idea behind a fractal terrain creation algorithm is to build the image at each scale separately. For a landscape it's easy: just make a small array of height values, and set them randomly. Then scale it up to a larger array, averaging the values so that the contour is smooth, and then add small random amounts to those values. Then scale it up, etc. The original small bumps have become mountains, and they are filled with complex terrain.
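A hedged sketch of one such refinement step (rand stands for whatever deterministic PRNG you use, returning values in [-1, 1]; the amplitude shrinks at each level so later passes only add fine detail):

```ts
// Upsample a square heightmap to twice the resolution with simple
// neighbour averaging, then perturb each sample by a small random amount.
function refine(coarse: number[][], rand: () => number, amplitude: number): number[][] {
  const n = coarse.length;
  const fine: number[][] = [];
  for (let y = 0; y < n * 2; y++) {
    fine.push([]);
    for (let x = 0; x < n * 2; x++) {
      const x0 = Math.floor(x / 2), y0 = Math.floor(y / 2);
      const x1 = Math.min(x0 + 1, n - 1), y1 = Math.min(y0 + 1, n - 1);
      const smoothed = (coarse[y0][x0] + coarse[y0][x1] + coarse[y1][x0] + coarse[y1][x1]) / 4;
      fine[y].push(smoothed + rand() * amplitude);
    }
  }
  return fine;
}
// Start with, say, a 4x4 random grid, then call refine() repeatedly with a
// halving amplitude: the first bumps become mountains covered in detail.
```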
There are two particular difficulties with the problem posed here, though. First, you don't want to store any of these values, since it would be potentially huge. Secondly, the features at each scale are of a different kind than the features at other scales.
These problems are not insurmountable.
Basically, you would divide the image up into a grid and, using deterministic pseudorandom numbers, establish the key features of each square in the grid. For example, each square could have a certain density of cell types.
At the next level of magnification, subdivide each square into another grid, and apply a gradient of values across that grid based on the values of the containing square and its surrounding squares. Then apply pseudorandom variations to that, seeded with the containing square's grid coordinates. For the random seed, always use the coordinates of the square immediately containing the subdivision under consideration, regardless of where the image is cropped, in order to ensure that it is recreated correctly across multiple runs.
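A hedged sketch of such coordinate-based seeding (the mixing constants are arbitrary; any decent integer hash works, as long as the same inputs always give the same output):

```ts
// Deterministic "random" value in [0, 1) for a grid square, derived only
// from the world seed, the square's coordinates and the zoom level, so it
// is identical on every run and for every crop of the image.
function squareRandom(worldSeed: number, x: number, y: number, level: number): number {
  let h = worldSeed ^ Math.imul(x, 374761393) ^ Math.imul(y, 668265263) ^ Math.imul(level, 2246822519);
  h = Math.imul(h ^ (h >>> 13), 1274126177);
  h ^= h >>> 16;
  return (h >>> 0) / 4294967296; // map the 32-bit hash to [0, 1)
}
```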
At some level of magnification, the random values go from being densities of particle types to particle locations. Then for each particle there are particle features, then features on those features.
Although arbitrary left/right and up/down scrolling will be desired, the image at all levels of magnification above the current scene will have to be calculated each time the frame is shifted, to ensure that all necessary features are included. This way the image can be scrolled from one cell to another without loss of consistency. Particle simulations can be used to ensure that cells or cell features don't overlap. This could be done in a repeatable, deterministic manner.
And don't forget to apply a smoothing gradient based on averages of surrounding squares at higher levels before adding in the random variations. Otherwise, the abrupt changes will make the squares themselves appear in the images!
This answer is somewhat rambling and probably confusing, but that is the best I can explain it right now. I hope it helps!