The right way to handle procedural LOD, distant terrain chunks - unity3d

I would really appreciate your thoughts on compute-generated terrain, LOD, etc. and what the 'right way' to do it is.
Here's my current plan:
I'm procedurally generating a large finite world, where at any point most or all of the map is visible.
Texturing/Colouring is done in vert/frag shader.
I'm about 50% through implementing:
Generate the closest chunks (e.g. a 5x5 heightmap around the player) on the CPU using a noise function.
In the middle distance, use instances of a compute shader to generate the vertices of heightmap chunks and pass the buffer to the vert/frag shader.
In the far distance, generate one huge combined chunk with much less dense vertices and pass it to the vert/frag shader.
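To make the three tiers concrete, here is a rough sketch (in Python rather than Unity C#, purely for illustration) of how a chunk could be assigned to a tier by its distance from the player's chunk. The names `Tier`, `tier_for_chunk` and `near_chunks`, and the `MID_RADIUS` cutoff, are assumptions made up for the example; only the 5x5 near block comes from the plan above.

```python
from enum import Enum


class Tier(Enum):
    NEAR = "cpu"       # generated on the CPU with a noise function
    MID = "compute"    # generated per chunk by a compute shader
    FAR = "combined"   # folded into one large low-density mesh


NEAR_RADIUS = 2  # 2 chunks either side of the player gives the 5x5 block
MID_RADIUS = 8   # hypothetical cutoff for the middle distance


def tier_for_chunk(chunk, player_chunk):
    """Classify a chunk by its Chebyshev distance (in chunks) from the player."""
    dx = abs(chunk[0] - player_chunk[0])
    dz = abs(chunk[1] - player_chunk[1])
    distance = max(dx, dz)
    if distance <= NEAR_RADIUS:
        return Tier.NEAR
    if distance <= MID_RADIUS:
        return Tier.MID
    return Tier.FAR


def near_chunks(player_chunk):
    """List the chunk coordinates of the near block around the player."""
    px, pz = player_chunk
    span = range(-NEAR_RADIUS, NEAR_RADIUS + 1)
    return [(px + dx, pz + dz) for dx in span for dz in span]
```

Re-running the classification whenever the player crosses a chunk border would tell you which chunks need to change tier.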
My questions are:
Is this the (or A) right way to handle LOD/Chunks/Distant terrain?
Should I instead generate everything on the GPU and pass a mesh back to CPU for collision, instead of using CPU for near chunks?
What function should I be using to generate and draw the map? I'm currently using DrawProceduralNow in OnRenderObject().
I'm just starting experimenting with using MaterialPropertyBlocks and DrawProcedural in Update().
One idea is to have semi-autonomous chunks that change their settings depending on player location.
Or relative chunks that are based on the space around the player, so a middle distance chunk will always be a middle LOD (distance between vertices), but the heightmap is updated as the player walks towards it.
I'm trying to avoid spending too much time going down the wrong rabbit hole.
If I can establish the right concepts first, that will save me a lot of time.
Edit: I'm also considering pre-generating heightmap textures to save on procedural calculation time. But that's a moot point, my questions are around what to do after I already have the vertices.

Related

Different ways to detect size of image on mesh versus size of mesh

I'm creating a puzzle game that generates random sized pieces with 2D meshes. The images contain transparent portions and sometimes a piece is completely transparent. I need to detect what percentage of a piece is transparent. One way I found to do this is to go pixel by pixel. I posted my solution to this HERE. However, this process adds a few seconds during loading, which I'd like to avoid, so I'm looking for other ideas.
I've considered using the selection outline of a MeshCollider to somehow get a surface area I can compare to the surface area of the mesh, but everything I find is about rendering the outline with specialized shaders. Does anyone have any ideas on how to solve this?
1) I guess you could add a PolygonCollider2D to your sprite and use its Path for the outline and calculation of the surface area. Not sure however if this will be faster.
PolygonCollider2D.GetPath:
A path is a cyclic sequence of line segments between points that define the outline of the Collider
Checking PolygonCollider2D.GetTotalPointCount or path length may be good enough to determine if the sprite is 'empty'.
Sprite.vertices, Sprite.triangles may also be helpful.
2) You could also improve performance of your first approach:
Instead of calling GetPixel as you do now, use GetPixels or GetPixels32 and loop through the array in a single for loop.
Using GetPixels can be faster than calling GetPixel repeatedly, especially for large textures. In addition, GetPixels can access individual mipmap levels. For most textures, even faster is to use GetPixels32 which returns low precision color data without costly integer-to-float conversions.
Check only every 2nd or nth pixel, as that should be good enough for an approximation.
Limit the number of type casts.
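As a rough illustration of the second approach, the sketch below (Python, not Unity C#) walks a flat pixel array once and samples every nth entry. The flat list of `(r, g, b, a)` tuples stands in for what GetPixels32 would return; the function name `transparent_fraction` and its `step` and `alpha_threshold` parameters are made up for the example.

```python
def transparent_fraction(pixels, step=1, alpha_threshold=0):
    """Estimate the share of transparent pixels in a flat RGBA array.

    pixels: sequence of (r, g, b, a) tuples with 0-255 channels.
    step: sample every step-th pixel; larger values trade accuracy for speed.
    alpha_threshold: alpha values at or below this count as transparent.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    sampled = 0
    transparent = 0
    # One pass over the flat array, with no per-pixel lookup calls.
    for i in range(0, len(pixels), step):
        sampled += 1
        if pixels[i][3] <= alpha_threshold:
            transparent += 1
    return transparent / sampled if sampled else 0.0
```

Note that a fixed stride over a flat array can line up with the image width and sample the same columns on every row, so it is worth comparing the estimate against a full scan on a few pieces.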

How can I create larger worlds/levels in Unity without adding lag?

How can I scale up the size of my world/level to include more gameobjects without causing lag for the player?
I am creating an asset for the asset store. It is a random procedural world generator. There is only one major problem: world size.
I can't figure out how to scale up the worlds to have more objects/tiles.
I have generated worlds up to 2000x500 tiles, but it lags very badly.
The maximum sized world that will not affect the speed of the game is around 500x200 tiles.
I have generated worlds of the same size with smaller blocks: 1/4th the size (it doesn't affect how many tiles you can spawn)
I would like to create a world at least the size of 4200x1200 blocks without lag spikes.
I have looked at object pooling (it doesn't seem like it can help me that much)
I have looked at LoadLevelAsync (don't really know how to use this, and rumor is that you need Unity Pro which I do not have)
I have tried setting chunks Active or Deactive based on player position (This caused more lag than just leaving the blocks alone).
Additional Information:
The terrain is split up into chunks. It is 2D, and I have box colliders on all solid tiles/blocks. Players can dig/place blocks. I am not worried about the amount of time it takes for the level to load initially, but rather about the smoothness of the game while playing it: no lag spikes while playing.
question on Unity Forums
If you're storing each tile as an individual GameObject, don't. Use a texture atlas and 'tile data' to generate the look of each chunk whenever it is dug into or a tile placed on it.
Also make sure to disable, potentially even delete any chunks not within the visible range of the player. Object pooling will help significantly here if you can work out the maximum number of chunks that will ever be needed at once, and just recycle chunks as they go off the screen.
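The recycling idea could look roughly like this. It is a language-neutral sketch in Python rather than Unity code, and `Chunk`, `ChunkPool` and `update` are names invented for the example; in Unity the `active` flag would correspond to enabling or disabling the chunk's GameObject.

```python
class Chunk:
    """Stand-in for a chunk GameObject that gets reused rather than destroyed."""

    def __init__(self):
        self.coord = None
        self.active = False


class ChunkPool:
    def __init__(self, size):
        # Allocate every chunk up front, so nothing is created during play.
        self.free = [Chunk() for _ in range(size)]
        self.live = {}  # chunk coordinate -> Chunk currently on screen

    def update(self, visible_coords):
        """Recycle chunks that left the visible range for newly visible ones."""
        visible = set(visible_coords)
        for coord in [c for c in self.live if c not in visible]:
            chunk = self.live.pop(coord)
            chunk.active = False
            self.free.append(chunk)
        for coord in visible:
            if coord not in self.live:
                if not self.free:
                    raise RuntimeError("pool is smaller than the visible range")
                chunk = self.free.pop()
                chunk.coord = coord
                chunk.active = True  # the chunk's mesh would be rebuilt here
                self.live[coord] = chunk
```

The pool size is the maximum number of chunks that can ever be visible at once, which is why it pays to work that number out first.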
DETAILS:
There is a lot to talk about for the optimal generation, so I'm going to post this link (http://studentgamedev.blogspot.co.uk/2013/08/unity-voxel-tutorial-part-1-generating.html). It shows you how to do it in a 3D space, but the principles are essentially the same, if not a little easier, for 2D space. The following is just a rough outline of what might be involved. Going down this path will result in huge benefits, but will require a lot of work to get there. I've included all the benefits at the bottom of the answer.
Each tile can be made to be a simple struct with fields like int id, vector2d texturePos, bool visible in its simplest form. You can then store these tiles in a two-dimensional array within each chunk, though to make them even more memory efficient you could store the texturePos once elsewhere in the program and write a method to get a texturePos by id.
When you make changes to this 2 dimensional array which represents either the addition or removal of a tile, you update the chunk, which is the actual GameObject used to represent the tiles. By iterating over the tile data stored in the chunk, it will be possible to generate a mesh of vertices based on the position of each tile in the 2 dimensional array. If visible is false, simply don't generate any vertices for it.
This mesh alone could be used as a collider, but won't look like anything. It will also be necessary to generate UV co-ords which happen to be the texturePos. When Unity then displays the mesh, it will display specific points of the texture atlas as defined by the UV co-ords of the mesh.
This has the benefit of resulting in significantly fewer GameObjects, better texture batching for Unity, less memory usage, faster random access for any tile as it's not got any MonoBehaviour overhead, and a genuine plethora of additional benefits.
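Here is a rough sketch of the mesh-building step described above, written in Python instead of Unity C# so it can be run on its own. The `Tile` fields follow the struct suggested earlier; the 4x4 atlas layout, `texture_pos` and `build_chunk_mesh` are assumptions made up for the example, and the triangle winding is clockwise on the assumption that the target engine treats that as front-facing.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    id: int
    visible: bool


ATLAS_COLUMNS = 4  # hypothetical atlas layout: 4x4 equally sized cells
ATLAS_ROWS = 4


def texture_pos(tile_id):
    """Look up the atlas cell (column, row) for a tile id, stored once."""
    return tile_id % ATLAS_COLUMNS, tile_id // ATLAS_COLUMNS


def build_chunk_mesh(tiles):
    """Turn a 2D array of tiles into vertex, UV and triangle index lists."""
    vertices, uvs, triangles = [], [], []
    du, dv = 1.0 / ATLAS_COLUMNS, 1.0 / ATLAS_ROWS
    for y, row in enumerate(tiles):
        for x, tile in enumerate(row):
            if not tile.visible:
                continue  # no vertices at all for empty tiles
            base = len(vertices)
            # One quad per tile: bottom-left, bottom-right, top-right, top-left.
            vertices += [(x, y), (x + 1, y), (x + 1, y + 1), (x, y + 1)]
            col, atlas_row = texture_pos(tile.id)
            u0, v0 = col * du, atlas_row * dv
            uvs += [(u0, v0), (u0 + du, v0), (u0 + du, v0 + dv), (u0, v0 + dv)]
            # Two clockwise triangles per quad.
            triangles += [base, base + 2, base + 1, base, base + 3, base + 2]
    return vertices, uvs, triangles
```

Whenever a tile is dug out or placed, only the chunk that contains it needs to run this again.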

Why does merging geometries improve rendering speed?

In my web application I only need to add static objects to my scene. It worked slow so I started searching and I found that merging geometries and merging vertices were the solution. When I implemented it, it indeed worked a lot better. All the articles said that the reason for this improvement is the decrease in number of WebGL calls. As I am not very familiar with things like OpenGL and WebGL (I use Three.js to avoid their complexity), I would like to know why exactly it reduces the WebGL calls?
Because you send one large object instead of many littles, the overhead reduces. So I understand that loading one big mesh to the scene goes faster than many small meshes.
BUT I do not understand why merging geometries also has a positive influence on the rendering calculation? I would also like to know the difference between merging geometries and merging vertices?
Thanks in advance!
three.js is a framework that helps you work with the WebGL API.
What three.js calls a "mesh" is, to WebGL, a series of low-level calls that set up state and issue commands to the GPU.
Let's take a sphere for example. With three.js you would create it with a few lines:
var sphereGeometry = new THREE.SphereGeometry(10);
var sphereMaterial = new THREE.MeshBasicMaterial({color:'red'});
var sphereMesh = new THREE.Mesh( sphereGeometry, sphereMaterial);
myScene.add( sphereMesh );
You have your renderer.render() call, and poof, a sphere appears on screen.
A lot of stuff happens under the hood though.
The first line creates the sphere "geometry": the CPU will do a bunch of math and logic describing a sphere with points and triangles. Points are vectors, three floats grouped together; triangles are a structure that groups these points by indices (groups of integers).
Somewhere there is a loop that calculates the vectors based on trigonometry (sin, cos), and another, that weaves the resulting array of vectors into triangles (take every N , N + M , N + 2M, create a triangle etc).
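To give a feel for those two loops, here is a rough sketch in plain Python. It is not the three.js implementation, just one common way of building a latitude/longitude sphere; `build_sphere` and its parameters are names made up for the example.

```python
import math


def build_sphere(radius, width_segments=8, height_segments=6):
    """Return (positions, indices) for a latitude/longitude sphere."""
    positions = []
    # Loop 1: compute the points with trigonometry.
    for iy in range(height_segments + 1):
        phi = math.pi * iy / height_segments  # polar angle, 0 at the top
        for ix in range(width_segments + 1):
            theta = 2.0 * math.pi * ix / width_segments  # angle around the axis
            positions.append((
                radius * math.sin(phi) * math.cos(theta),
                radius * math.cos(phi),
                radius * math.sin(phi) * math.sin(theta),
            ))
    # Loop 2: weave neighbouring points into triangles by index.
    indices = []
    row = width_segments + 1  # number of points per ring
    for iy in range(height_segments):
        for ix in range(width_segments):
            a = iy * row + ix  # point on this ring
            b = a + row        # the point below it on the next ring
            indices += [a, b, a + 1, a + 1, b, b + 1]
    return positions, indices
```

For simplicity this version also emits zero-area triangles at the poles, which a real implementation would normally skip.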
Now these numbers exist in javascript land, it's just a bunch of floats and ints, grouped together in a specific way to describe shapes such as cubes, spheres and aliens.
You need a way to draw this construct on a screen - a two dimensional array of pixels.
WebGL does not actually know much about 3D. It knows how to manage memory on the gpu, how to compute things in parallel (or gives you the tools), it does know how to do mathematical operations that are crucial for 3d graphics, but the same math can be used to mine bitcoins, without even drawing anything.
In order for WebGL to draw something on screen, it first needs the data put into appropriate buffers, it needs to have the shader programs, it needs to be setup for that specific call (is there going to be blending - transparency in three.js land, depth testing, stencil testing etc), then it needs to know what it's actually drawing (so you need to provide strides, sizes of attributes etc to let it know where a 'mesh' actually is in memory), how it's drawing it (triangle strips, fans, points...) and what to draw it with - which shaders will it apply on the data you provided.
So, you need a way to 'teach' WebGL to do 3d.
I think the best way to get familiar with this concept is to look at this tutorial, re-reading if necessary, because it explains what happens pretty much on every single 3d object in perspective, ever.
To sum up the tutorial:
a perspective camera is basically two 4x4 matrices - a perspective matrix, that puts things into perspective, and a view matrix, that moves the entire world into camera space. Every camera you make, consists of these two matrices.
Every object exists in its object space. A TRS matrix (world matrix in three.js terms) is used to transform this object into world space.
So this stuff - a concept such as "projective matrix" is what teaches webgl how to draw perspective.
Three.js abstracts this further and gives you things like "field of view" and "aspect ratio" instead of left right, top bottom.
Three.js also abstracts the transformation matrices (view matrix on the camera, and world matrices on every object) because it allows you to set "position" and "rotation" and computes the matrix based on this under the hood.
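As a rough illustration of what sits behind "field of view" and "aspect ratio", the sketch below builds an OpenGL-style perspective matrix in plain Python and pushes a camera-space point through it. It assumes the usual convention of a camera looking down the negative z axis, with visible depth mapped to the range -1 to 1; the function names are made up for the example and are not three.js API.

```python
import math


def perspective(fov_degrees, aspect, near, far):
    """Build a 4x4 perspective matrix (row-major, for column vectors)."""
    f = 1.0 / math.tan(math.radians(fov_degrees) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def transform(matrix, point):
    """Multiply a 4x4 matrix by the point (x, y, z, 1)."""
    vector = (point[0], point[1], point[2], 1.0)
    return [sum(matrix[r][c] * vector[c] for c in range(4)) for r in range(4)]


def project(matrix, point):
    """Apply the matrix, then the perspective divide by w."""
    x, y, z, w = transform(matrix, point)
    return x / w, y / w, z / w
```

With a 90 degree field of view and an aspect ratio of 1, a point one unit to the right and one unit in front of the camera lands on the right edge of the view, which is the "putting things into perspective" part.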
Since every mesh has to be processed by the vertex shader and the pixel shader in order to appear on the screen, every mesh needs to have all this information available.
When a draw call is being issued for a specific mesh, that mesh will have the same perspective matrix and view matrix as any other object being rendered with the same camera. They will each have their own world matrices - numbers that move them around your scene.
This is transformation alone, happening in the vertex shader. These results are then rasterized, and go to the pixel shader for processing.
Let's consider two materials - black plastic and red plastic. They will have the same shader, perhaps one you wrote using THREE.ShaderMaterial, or maybe one from three's library. It's the same shader, but it has one uniform value exposed - color. This allows you to have many instances of a plastic material, green, blue, pink, but it means that each of these requires a separate draw call.
Webgl will have to issue specific calls to change that uniform from red to black, and then it's ready to draw stuff using that 'material'.
So now imagine a particle system, displaying a thousand cubes each with a unique color. You have to issue a thousand draw calls to draw them all, if you treat them as separate meshes and change colors via a uniform.
If on the other hand, you assign vertex colors to each cube, you don't rely on the uniform any more, but on an attribute. Now if you merge all the cubes together, you can issue a single draw call, processing all the cubes with the same shader.
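The merge step can be sketched like this, in plain Python rather than three.js, with single triangles standing in for the cubes to keep it short. `make_triangle` and `merge_with_vertex_colors` are made-up names; the point is only that each mesh's color moves from a per-draw uniform to a per-vertex attribute, and its indices are shifted so they still point at its own vertices.

```python
def make_triangle(x, color):
    """A tiny stand-in mesh: one triangle at offset x with a single color."""
    return {
        "positions": [(x, 0.0, 0.0), (x + 1.0, 0.0, 0.0), (x, 1.0, 0.0)],
        "indices": [0, 1, 2],
        "color": color,  # would be a uniform if drawn as a separate mesh
    }


def merge_with_vertex_colors(meshes):
    """Concatenate meshes into one buffer set that needs a single draw call."""
    positions, colors, indices = [], [], []
    for mesh in meshes:
        offset = len(positions)  # where this mesh's vertices start
        positions.extend(mesh["positions"])
        # Copy the per-mesh color onto every vertex of that mesh.
        colors.extend([mesh["color"]] * len(mesh["positions"]))
        indices.extend(i + offset for i in mesh["indices"])
    return {"positions": positions, "colors": colors, "indices": indices}
```

Drawn separately, a thousand of these means a thousand uniform updates and draw calls; merged, it is three buffers and one call, at the cost of not being able to move the pieces independently afterwards.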
You can see why this is more efficient simply by taking a glance at webglrenderer from three.js, and all the stuff it has to do in order to translate your 3d calls to webgl. Better done once than a thousand times.
Back to those lines of code: sphereMaterial can take a color argument; if you look at the source, this translates to a uniform vec3 in the shader. However, you can also achieve the same thing by rendering the vertex colors and assigning the color you want beforehand.
sphereMesh will wrap that computed geometry into an object that three's webglrenderer understands, which in turn sets up webgl accordingly.

Alternatives to diamond-square for incremental procedural terrain generation?

I'm currently in the process of coding a procedural terrain generator for a game. For that purpose, I divide my world into chunks of equal size and generate them one by one as the player strolls along. So far, nothing special.
Now, I specifically don't want the world to be persistent, i.e. if a chunk gets unloaded (maybe because the player moved too far away) and later loaded again, it should not be the same as before.
From my understanding, implicit approaches like treating 3D Simplex Noise as a density function input for Marching Cubes don't suit my problem. That is because I would need to reseed the generator to obtain different return values for the same point in space, leading to discontinuities along chunk borders.
I also looked into Midpoint Displacement / Diamond-Square. By seeding each chunk's heightmap with values from the borders of adjacent chunks and randomizing the chunk corners that don't have any other chunks nearby, I was able to generate a tileable terrain that exhibits the desired behavior. Still, the results look rather dull. Specifically, since this method relies on heightmaps, it lacks overhangs and the like. Moreover, even with the corner randomization, terrain features tend to be confined to small areas, i.e. there are no multiple-chunk hills or similar landmarks.
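For reference, the border-seeding idea can be shown in one dimension. The sketch below is plain 1D midpoint displacement, not full diamond-square, and `midpoint_displacement` and its parameters are names made up for the example: the two end heights are fixed (taken from a neighbouring chunk, or randomized if there is none) and only the interior is generated.

```python
import random


def midpoint_displacement(left, right, levels, roughness=0.5, rng=None):
    """Generate 2**levels + 1 heights between two fixed border heights."""
    rng = rng or random.Random()
    heights = [left, right]
    amplitude = 1.0
    for _ in range(levels):
        refined = []
        for a, b in zip(heights, heights[1:]):
            # Keep the existing point, then add a displaced midpoint after it.
            refined.append(a)
            refined.append((a + b) / 2.0 + rng.uniform(-amplitude, amplitude))
        refined.append(heights[-1])
        heights = refined
        amplitude *= roughness  # smaller displacements at finer scales
    return heights
```

Because the end points are never displaced, a chunk regenerated with a different seed still lines up with its neighbours, while its interior changes.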
Now I was wondering if there are other approaches to this that I haven't heard of/thought about yet. Any help is highly appreciated! :)
Cheers!
Post process!
After you do the heightmaps, run back through adding features.
This is how Minecraft does it to get the various caverns and cliff overhangs.

Minimising glDrawArrays calls in OpenGL ES

I'd like to hear what people think the optimal draw calls are for OpenGL ES (on the iPhone).
Specifically I've read in many places that it is best to minimise the number of calls to glDrawArrays/glDrawElements - I think Apple say 10 should be the max in their recent WWDC presentation. As I understand it to do this you need to put all the vertices into one array if possible, so you only need to make the drawArrays call once.
But I am confused because this surely means you can't use the translate, rotate, scale functions, because they would apply across the whole geometry. Which is fine, except doesn't that mean you need to pre-calculate every vertex position yourself, rather than getting OpenGL to do it?
Also, doesn't it mean you can't use any of the fan/strip settings unless you just have a continuous shape?
These drawbacks make me think I'm not understanding something correctly, so I guess I'm looking for confirmation that I should:
Be trying to make an uber array of all triangles to draw.
Resign myself to the fact I'll have to work out all the vertex positions myself.
Forget about pushing and popping each thing to draw into its desired location.
Is that what others do?
Thanks
Vast question, batching is always a matter of compromise.
The ideal structure for performance would be, as you mention, to have one single array containing all triangles to draw.
Starting from here, we can start adding constraints:
One additional constraint is that having vertex indices in 16 bits saves bandwidth and memory, and is probably the fast path for your platform. So you could consider grouping triangles in chunks of 65536 vertices.
Then, if you want to switch the shader/material/glState used to draw geometry, you have no choice (*) but to emit one draw call per shader/material/glState. So you could consider grouping triangles by shaderID/materialID/glStateID.
Next, if you want to animate things, you have no choice (*) but to transmit your transform matrix to GL, and then issue a draw call. So you could consider grouping triangles by 'transform groups': for example, all static geometry together; animated geometry that shares a common transform can be grouped too.
In these cases, you'd have to transform the vertices yourself (using CPU) before merging the meshes together.
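A minimal sketch of that CPU-side step, in Python and in 2D to keep it short: each object's vertices are scaled, rotated and translated up front, then appended to one shared array that can be drawn in a single call. `transform_vertices` and `merge_objects` are made-up names, and the scale, rotate, translate order is just the choice made for this example.

```python
import math


def transform_vertices(vertices, angle_degrees, scale, tx, ty):
    """Apply scale, then rotation, then translation to 2D vertices."""
    cos_a = math.cos(math.radians(angle_degrees))
    sin_a = math.sin(math.radians(angle_degrees))
    result = []
    for x, y in vertices:
        sx, sy = x * scale, y * scale
        result.append((sx * cos_a - sy * sin_a + tx,
                       sx * sin_a + sy * cos_a + ty))
    return result


def merge_objects(objects):
    """Bake every object's transform and concatenate into one vertex array."""
    merged = []
    for obj in objects:
        merged.extend(transform_vertices(
            obj["vertices"], obj["angle"], obj["scale"], obj["tx"], obj["ty"]))
    return merged
```

This only pays off for geometry whose transform rarely changes, since any change means redoing the math and re-uploading the array.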
Regarding triangle strips, you can turn any mesh into strips, even if it has discontinuities in its topology, by introducing degenerate triangles. So this is a technique that always applies.
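The degenerate-triangle trick can be sketched on index lists alone. In this Python example (function names made up for illustration), two strips are joined by repeating the last index of one and the first index of the next; the repeated indices produce zero-area triangles that draw nothing. An extra repeat is added when needed so the second strip starts on an even position and keeps its winding.

```python
def stitch_strips(strips):
    """Join several triangle strips into one using degenerate triangles."""
    merged = []
    for strip in strips:
        if not strip:
            continue
        if merged:
            merged.append(merged[-1])  # repeat the end of the previous strip
            merged.append(strip[0])    # repeat the start of the next strip
            if len(merged) % 2 == 1:
                merged.append(strip[0])  # keep the winding parity intact
        merged.extend(strip)
    return merged


def strip_to_triangles(strip):
    """Expand a strip into triangles, skipping degenerate (zero-area) ones."""
    triangles = []
    for i in range(len(strip) - 2):
        a, b, c = strip[i], strip[i + 1], strip[i + 2]
        if a == b or b == c or a == c:
            continue
        # Every other triangle in a strip has its winding flipped.
        triangles.append((a, b, c) if i % 2 == 0 else (b, a, c))
    return triangles
```

The cost is a few wasted indices per join, which is usually much cheaper than an extra draw call.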
All in all, reducing draw calls is a game of compromises, some techniques might work well for a 3d model, while others may be more suited for other 3d models. IMHO, the key is to be creative and to carefully benchmark your application to see if your changes actually improve performance on your target platform.
HTH, cheers,
(*) Actually there are techniques that allow you to reduce the number of draw calls in these cases, such as:
texture atlases, which group different textures into a single one to avoid switching textures in GL, thus limiting draw calls
(pseudo) hardware instancing, which allows shaders to fetch transforms from various sources to transform mesh instances in different ways.
...