3D Texture emulation in shader (subpixel related) - unity3d

I am working on a Unity3D project that currently relies on a 3D texture.
The problem is, Unity only allows Pro users to make use of Texture3D. Hence I'm looking for an alternative to Texture3D, perhaps a one dimensional texture (although not natively available in Unity) that is interpreted as 3 dimensional in the shader (which uses the 3D texture).
Is there a way to do this whilst (preferably) keeping subpixel information?
(GLSL and Cg tags added because here lies the core of the problem)
Edit: The problem is addressed here as well: webgl glsl emulate texture3d
However this is not yet finished and working properly.
Edit: For the time being I disregard proper subpixel information. So any help on converting a 2D texture to contain 3D information is appreciated!
Edit: I retracted my own answer as it isn't sufficient as of yet:
float2 uvFromUvw( float3 uvw ) {
    float2 uv = float2(uvw.x, uvw.y / _VolumeTextureSize.z);
    uv.y += float(round(uvw.z * (_VolumeTextureSize.z - 1))) / _VolumeTextureSize.z;
    return uv;
}
With initialization as Texture2D(volumeWidth, volumeHeight * volumeDepth).
Most of the time it works, but sometimes it shows wrong pixels, probably because of subpixel information it is picking up on. How can I fix this? Clamping the input doesn't work.

I'm using this for my 3D clouds if that helps:
float SampleNoiseTexture( float3 _UVW, float _MipLevel )
{
    float2 WrappedUW = fmod( 16.0 * (1000.0 + _UVW.xz), 16.0 ); // UW wrapped in [0,16[
    float IntW = floor( WrappedUW.y ); // Integer slice number
    float dw = WrappedUW.y - IntW; // Remainder for interpolating between slices
    _UVW.x = (17.0 * IntW + WrappedUW.x + 0.25) * 0.00367647058823529411764705882353; // divided by 17*16 = 272
    float4 Value = tex2Dlod( _TexNoise3D, float4( _UVW.xy, 0.0, 0.0 ) ); // tex2Dlod takes the float4 coordinate and always samples mip 0
    return lerp( Value.x, Value.y, dw );
}
The "3D texture" is packed as 16 slices of 17 pixels wide in a 272x16 texture, with the 17th column of each slice being a copy of the 1st column (wrap address mode)...
Of course, no mip-mapping allowed with this technique.
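The address arithmetic of this packing can be checked on the CPU. The following is only a sketch in Python, not shader code: `atlas_lookup`, `SLICES` and `SLICE_WIDTH` are hypothetical names, and the 16-slice, 17-texel layout is taken from the description above.

```python
import math

SLICES = 16        # number of W slices packed side by side
SLICE_WIDTH = 17   # 16 texels plus one padding column per slice

def atlas_lookup(u, w):
    """Map normalized (u, w) to (atlas x in [0,1), slice index, blend factor).

    Mirrors the shader: wrap into [0,16), split w into an integer slice and
    a remainder, then offset u into the chosen 17-texel-wide slice.
    """
    wrapped_u = math.fmod(16.0 * (1000.0 + u), 16.0)
    wrapped_w = math.fmod(16.0 * (1000.0 + w), 16.0)
    slice_index = math.floor(wrapped_w)
    blend = wrapped_w - slice_index  # lerp factor between the red and green channels
    atlas_x = (SLICE_WIDTH * slice_index + wrapped_u + 0.25) / (SLICE_WIDTH * SLICES)
    return atlas_x, int(slice_index), blend

# Example: halfway along u, halfway between slices 8 and 9
print(atlas_lookup(0.5, 0.53125))
```

Because `wrapped_u` stays below 16, the sampled position never leaves the 17-texel slice it was assigned to, which is what the padding column is for.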

Here's the code I'm using to create the 3D texture, if that's what's bothering you:
const int NOISE3D_TEXTURE_POT = 4;
// NOISE3D_TEXTURE_SIZE (presumably 1 << NOISE3D_TEXTURE_POT = 16) and the NoiseValues
// array are declared elsewhere in the original code and are not shown here.

// <summary>
// Create the "3D noise" texture
// To simulate 3D textures that are not available in Unity, I create a single long 2D slice of (17*16) x 16
// The width is 17*16 so that all 3D slices are packed into a single line, and I use 17 as a single slice width
// because I pad the last pixel with the first column of the same slice so bilinear interpolation is correct.
// The texture contains 2 significant values in Red and Green :
// Red is the noise value in the current W slice
// Green is the noise value in the next W slice
// Then, the actual 3D noise value is an interpolation of red and green based on the W remainder
// </summary>
protected NuajTexture2D Build3DNoise()
{
    // Build first noise mip level
    for ( int W=0; W < NOISE3D_TEXTURE_SIZE; W++ )
        for ( int V=0; V < NOISE3D_TEXTURE_SIZE; V++ )
            for ( int U=0; U < NOISE3D_TEXTURE_SIZE; U++ )
                NoiseValues[U,V,W] = (float) SimpleRNG.GetUniform();

    // Build actual texture
    int MipLevel = 0; // In my original code, I build several textures for several mips...
    int MipSize = NOISE3D_TEXTURE_SIZE >> MipLevel;
    int Width = MipSize*(MipSize+1); // Pad with an additional column
    Color[] Content = new Color[MipSize*Width];

    // Build content
    for ( int W=0; W < MipSize; W++ )
    {
        int Offset = W * (MipSize+1); // W Slice offset
        for ( int V=0; V < MipSize; V++ )
            for ( int U=0; U <= MipSize; U++ )
            {
                Content[Offset+Width*V+U].r = NoiseValues[U & (MipSize-1),V,W];
                Content[Offset+Width*V+U].g = NoiseValues[U & (MipSize-1),V,(W+1) & (MipSize-1)];
            }
    }

    // Create texture
    NuajTexture2D Result = Help.CreateTexture( "Noise3D", Width, MipSize, TextureFormat.ARGB32, false, FilterMode.Bilinear, TextureWrapMode.Repeat );
    Result.SetPixels( Content, 0 );
    Result.Apply( false, true );
    return Result;
}

I followed Patapom's response and came up with the following. However, the result is still off.
float getAlpha(float3 position)
{
    float2 WrappedUW = fmod( _Volume.xz * (1000.0 + position.xz), _Volume.xz ); // UW wrapped in [0,16[
    float IntW = floor( WrappedUW.y ); // Integer slice number
    float dw = WrappedUW.y - IntW; // Remainder for interpolating between slices
    position.x = ((_Volume.z + 1.0) * IntW + WrappedUW.x + 0.25) / ((_Volume.z + 1.0) * _Volume.x); // divided by 17*16 = 272
    float4 Value = tex2Dlod( _VolumeTex, float4( position.xy, 0.0, 0.0 ) );
    return lerp( Value.x, Value.y, dw );
}
public int GetPixelId(int x, int y, int z) {
    return y * (volumeWidth + 1) * volumeDepth + z * (volumeWidth + 1) + x;
}

// Code to set the pixelbuffer one pixel at a time starting from a clean slate
pixelBuffer[GetPixelId(x, y, z)].r = color.r;
if (z > 0)
    pixelBuffer[GetPixelId(x, y, z - 1)].g = color.r;
if (z == volumeDepth - 1 || z == 0)
    pixelBuffer[GetPixelId(x, y, z)].g = color.r;
if (x == 0) {
    pixelBuffer[GetPixelId(volumeWidth, y, z)].r = color.r;
    if (z > 0)
        pixelBuffer[GetPixelId(volumeWidth, y, z - 1)].g = color.r;
    if (z == volumeDepth - 1 || z == 0)
        pixelBuffer[GetPixelId(volumeWidth, y, z)].g = color.r;
}


MeshData GetVertexData has the incorrect length

I'm trying to optimize some mesh generation using MeshData & the Job System, but for some reason when I try to use 2 params in meshData.SetVertexBufferParams, the resulting meshData.GetVertexData is half the length it should be (I set the vertex count to 5120, but the resulting VertexData NativeArray is only 2560 items long).
When I force it to be double the length (SetVertexBufferParams(numVerts * 2, ...)), it creates a mesh that appears to treat both the normals and the vertex positions as position data. It also makes the screen go black, so I can't provide a screenshot.
Here's my code:
// generate 256 height values
int[] arr = new int[256];
for (int i = 0; i < arr.Length; i++)
    arr[i] = (int) (Mathf.PerlinNoise(i / 16 / 16f, i % 16 / 16f) * 5);
// put it in a NativeArray
NativeArray<int> heights = new NativeArray<int>(arr, Allocator.TempJob);
// 4 verts per face * 5 faces = 20
int numVerts = heights.Length * 20; // this value is always 5120
// 2 tris per face * 5 faces * 3 indices = 30
int indices = heights.Length * 30;
// MeshData setup
Mesh.MeshDataArray meshDataArray = Mesh.AllocateWritableMeshData(1);
Mesh.MeshData meshData = meshDataArray[0];
meshData.SetVertexBufferParams(numVerts,
    new VertexAttributeDescriptor(VertexAttribute.Position, VertexAttributeFormat.Float32, 3, stream:0),
    new VertexAttributeDescriptor(VertexAttribute.Normal, VertexAttributeFormat.Float32, 3, stream:1)
);
meshData.SetIndexBufferParams(indices, IndexFormat.UInt16);
// Create job
Job job = new Job
{
    Heights = heights,
    MeshData = meshData
};
// run job
// struct I'm using for vertex data
public struct VData
{
    public float3 Vert;
    public float3 Norm;
}

// Here's some parts of the job
public struct Job : IJob
{
    public NativeArray<int> Heights;
    public Mesh.MeshData MeshData;

    public void Execute()
    {
        NativeArray<VData> Verts = MeshData.GetVertexData<VData>();
        NativeArray<ushort> Tris = MeshData.GetIndexData<ushort>();
        // loops from 0 to 255
        for (int i = 0; i < Heights.Length; i++)
        {
            ushort t1 = (ushort)(w1 + 16);
            // This indicates that Verts.Length is 2560 when it should be 5120
            int t = i * 30; // tris
            int height = Heights[i];
            // x and y coordinate in chunk
            int x = i / 16;
            int y = i % 16;
            float3 up = new float3(0, 1, 0);
            // This throws an index out of bounds error because t1 becomes larger than Verts.Length
            Verts[t1] = new VData { Vert = new float3(x + 1, height, y + 1), Norm = up};
            // ...
        }
    }
}
new VertexAttributeDescriptor(VertexAttribute.Position, VertexAttributeFormat.Float32, 3, stream:0),
new VertexAttributeDescriptor(VertexAttribute.Normal, VertexAttributeFormat.Float32, 3, stream:1)
Your SetVertexBufferParams call places VertexAttribute.Position and VertexAttribute.Normal on separate streams, so each stream's buffer holds only one float3 per vertex. If the stream 0 buffer is then reinterpreted by mistake with a struct that is twice that size, the resulting array is half as long.
This is how documentation explains streams:
Vertex data is laid out in separate "streams" (each stream goes into a separate vertex buffer in the underlying graphics API). While Unity supports up to 4 vertex streams, most meshes use just one. Separate streams are most useful when some vertex attributes don't need to be processed, for example skinned meshes often use two vertex streams (one containing all the skinned data: positions, normals, tangents; while the other stream contains all the non-skinned data: colors and texture coordinates).
But why does it end up reinterpreted at half the length? Because of this line:
NativeArray<VData> Verts = MeshData.GetVertexData<VData>();
How? Because there is an implicit stream parameter there (doc):
public NativeArray<T> GetVertexData(int stream = 0);
and it defaults to 0. So what happens here is this:
var Verts = Positions_Only.Reinterpret<Position_And_Normals>();
or in other words:
var Verts = NativeArray<float3>().Reinterpret<float3x2>();
Case solved. To fix it, do one of the following:
Change stream:1 to stream:0 so both vertex attributes end up on the same stream,
or read each stream separately: var Positions = MeshData.GetVertexData<float3>(0); and var Normals = MeshData.GetVertexData<float3>(1);
or create a dedicated struct per stream: var Stream0 = MeshData.GetVertexData<VStream0>(0); and var Stream1 = MeshData.GetVertexData<VStream1>(1);
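The halved length follows directly from the byte sizes involved. Here is a small Python sketch of that arithmetic; it does not call any Unity API, and `reinterpreted_length` is a hypothetical helper that only assumes Float32 components of 4 bytes each.

```python
FLOAT32_BYTES = 4

def reinterpreted_length(vertex_count, stream_components, struct_components):
    """Element count when a stream buffer is viewed as an array of a given struct.

    stream_components: floats stored per vertex in the stream (3 for a lone float3)
    struct_components: floats in the struct used to read it (6 for VData)
    """
    stream_bytes = vertex_count * stream_components * FLOAT32_BYTES
    struct_bytes = struct_components * FLOAT32_BYTES
    if stream_bytes % struct_bytes != 0:
        raise ValueError("buffer size is not a multiple of the struct size")
    return stream_bytes // struct_bytes

print(reinterpreted_length(5120, 3, 6))  # positions-only stream read as VData: 2560
print(reinterpreted_length(5120, 3, 3))  # positions-only stream read as float3: 5120
print(reinterpreted_length(5120, 6, 6))  # both attributes on stream 0 read as VData: 5120
```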

How to manipulate a shaped area of terrain in runtime - Unity 3D

My game has a drawing tool - a looping line renderer that is used as a marker to manipulate an area of the terrain in the shape of the line. This all happens at runtime as soon as the player stops drawing the line.
So far I have managed to raise the terrain vertices that match the coordinates of the line renderer's points, but I have difficulty raising the points that fall inside the marker's shape. Here is an image describing what I currently have:
I tried using the "Polygon Fill Algorithm" (http://alienryderflex.com/polygon_fill/), but raising the terrain vertices one line at a time is too resource-intensive (even when the algorithm is narrowed to a rectangle that surrounds only the marked area). Also, my marker's outline points have gaps between them, meaning I need to add a radius to the line that raises the terrain, but that might leave the result sloppy.
Maybe I should discard the drawing mechanism and use a mesh with a mesh collider as the marker?
Any ideas are appreciated on how to get the terrain manipulated in the exact shape as the marker.
Current code:
I used this script to create the line - the first and the last line points have the same coordinates.
The code that manipulates the terrain is currently triggered when clicking a GUI button:
using System;
using System.Collections;
using UnityEngine;

public class changeTerrainHeight_lineMarker : MonoBehaviour
{
    public Terrain TerrainMain;
    public LineRenderer line;

    void OnGUI()
    {
        //Get the terrain heightmap width and height.
        int xRes = TerrainMain.terrainData.heightmapWidth;
        int yRes = TerrainMain.terrainData.heightmapHeight;
        //GetHeights - gets the heightmap points of the terrain. Store them in array
        float[,] heights = TerrainMain.terrainData.GetHeights(0, 0, xRes, yRes);
        if (GUI.Button(new Rect(30, 30, 200, 30), "Line points"))
        {
            /* Set the positions to array "positions" */
            Vector3[] positions = new Vector3[line.positionCount];
            line.GetPositions(positions);
            /* use this height for the affected terrain vertices */
            float height = 0.05f;
            for (int i = 0; i < line.positionCount; i++)
            {
                /* Assign height data */
                heights[Mathf.RoundToInt(positions[i].z), Mathf.RoundToInt(positions[i].x)] = height;
            }
            //SetHeights to change the terrain height.
            TerrainMain.terrainData.SetHeights(0, 0, heights);
        }
    }
}
Got to the solution thanks to Siim's personal help, and thanks to the article: How can I determine whether a 2D Point is within a Polygon?.
The end result is visualized here:
First the code, then the explanation:
using System;
using System.Collections;
using UnityEngine;

public class changeTerrainHeight_lineMarker : MonoBehaviour
{
    public Terrain TerrainMain;
    public LineRenderer line;

    void OnGUI()
    {
        //Get the terrain heightmap width and height.
        int xRes = TerrainMain.terrainData.heightmapWidth;
        int yRes = TerrainMain.terrainData.heightmapHeight;
        //GetHeights - gets the heightmap points of the terrain. Store them in array
        float[,] heights = TerrainMain.terrainData.GetHeights(0, 0, xRes, yRes);
        //Trigger line area raiser
        if (GUI.Button(new Rect(30, 30, 200, 30), "Line fill"))
        {
            /* Set the positions to array "positions" */
            Vector3[] positions = new Vector3[line.positionCount];
            line.GetPositions(positions);
            float height = 0.10f; // define the height of the affected vertices of the terrain
            /* Find the rectangle the shape is in! The sides of the rectangle are based on the most-top, -right, -bottom and -left vertex. */
            float ftop = float.NegativeInfinity;
            float fright = float.NegativeInfinity;
            float fbottom = Mathf.Infinity;
            float fleft = Mathf.Infinity;
            for (int i = 0; i < line.positionCount; i++)
            {
                //find the outmost points
                if (ftop < positions[i].z)
                    ftop = positions[i].z;
                if (fright < positions[i].x)
                    fright = positions[i].x;
                if (fbottom > positions[i].z)
                    fbottom = positions[i].z;
                if (fleft > positions[i].x)
                    fleft = positions[i].x;
            }
            int top = Mathf.RoundToInt(ftop);
            int right = Mathf.RoundToInt(fright);
            int bottom = Mathf.RoundToInt(fbottom);
            int left = Mathf.RoundToInt(fleft);
            int terrainXmax = right - left; // width of the bounding rectangle, in heightmap samples
            int terrainZmax = top - bottom; // height of the bounding rectangle, in heightmap samples
            float[,] shapeHeights = TerrainMain.terrainData.GetHeights(left, bottom, terrainXmax, terrainZmax);
            Vector2 point; //Create a point Vector2 point to match the shape
            /* Loop through all points in the rectangle surrounding the shape */
            for (int i = 0; i < terrainZmax; i++)
            {
                point.y = i + bottom; //Add offset to the element so it matches the position of the line
                for (int j = 0; j < terrainXmax; j++)
                {
                    point.x = j + left; //Add offset to the element so it matches the position of the line
                    if (InsidePolygon(point, bottom))
                    {
                        shapeHeights[i, j] = height; // set the height value to the terrain vertex
                    }
                }
            }
            //SetHeights to change the terrain height.
            TerrainMain.terrainData.SetHeightsDelayLOD(left, bottom, shapeHeights);
        }
    }

    //Checks if the given vertex is inside the shape.
    bool InsidePolygon(Vector2 p, int terrainZmax)
    {
        // Assign the points that define the outline of the shape
        Vector3[] positions = new Vector3[line.positionCount];
        line.GetPositions(positions);
        int count = 0;
        Vector2 p1, p2;
        int n = positions.Length;
        // Find the lines that define the shape
        for (int i = 0; i < n; i++)
        {
            p1.y = positions[i].z;// - p.y;
            p1.x = positions[i].x;// - p.x;
            if (i != n - 1)
            {
                p2.y = positions[(i + 1)].z;// - p.y;
                p2.x = positions[(i + 1)].x;// - p.x;
            }
            else
            {
                // the last point connects back to the first one
                p2.y = positions[0].z;// - p.y;
                p2.x = positions[0].x;// - p.x;
            }
            // check if the given point p intersects with the lines that form the outline of the shape.
            if (LinesIntersect(p1, p2, p, terrainZmax))
            {
                count++;
            }
        }
        // the point is inside the shape when the number of line intersections is an odd number
        if (count % 2 == 1)
        {
            return true;
        }
        return false;
    }

    // Function that checks if two lines intersect with each other
    bool LinesIntersect(Vector2 A, Vector2 B, Vector2 C, int terrainZmax)
    {
        // D is the end point of the vertical test ray that starts at C
        Vector2 D = new Vector2(C.x, terrainZmax);
        Vector2 CmP = new Vector2(C.x - A.x, C.y - A.y);
        Vector2 r = new Vector2(B.x - A.x, B.y - A.y);
        Vector2 s = new Vector2(D.x - C.x, D.y - C.y);
        float CmPxr = CmP.x * r.y - CmP.y * r.x;
        float CmPxs = CmP.x * s.y - CmP.y * s.x;
        float rxs = r.x * s.y - r.y * s.x;
        if (CmPxr == 0f)
        {
            // Lines are collinear, and so intersect if they have any overlap
            return ((C.x - A.x < 0f) != (C.x - B.x < 0f))
                || ((C.y - A.y < 0f) != (C.y - B.y < 0f));
        }
        if (rxs == 0f)
            return false; // Lines are parallel.
        float rxsr = 1f / rxs;
        float t = CmPxs * rxsr;
        float u = CmPxr * rxsr;
        return (t >= 0f) && (t <= 1f) && (u >= 0f) && (u <= 1f);
    }
}
The method used fills the shape one line at a time - "the ray casting method". It turns out that this method only becomes expensive when the given shape has a lot of sides. (A side of the shape is a line that connects two points in the outline of the shape.)
When I posted this question, my Line Renderer had 134 points defining the line. This means the shape had the same number of sides, and every tested point has to be checked against each of them.
When I narrowed down the number of points to 42, the method got fast enough, and the shape lost almost no detail.
Furthermore, I am planning on using some methods to make the contours smoother, so the shape can be defined with even fewer points.
In short, you need these steps to get to the result:
Create the outline of the shape;
Find the 4 points that mark the bounding box around the shape;
Start ray casting the box;
Check how many times the ray intersects the sides of the shape. The points with an odd number of intersections are located inside the shape;
Assign your attributes to all of the points that were found in the shape.
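The steps above can be sketched outside Unity. This Python version is an illustration only: it uses the common horizontal-ray crossing test rather than the segment intersection routine from the answer, and `point_in_polygon` and `fill_mask` are hypothetical names.

```python
import math

def point_in_polygon(px, py, polygon):
    """Even-odd rule: count the edges crossed by a ray going right from (px, py)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # the last point connects back to the first
        # Only edges that straddle the horizontal line through the point can be crossed
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def fill_mask(polygon):
    """Test every integer grid point of the bounding box; return (left, bottom, mask)."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    left, right = math.floor(min(xs)), math.ceil(max(xs))
    bottom, top = math.floor(min(ys)), math.ceil(max(ys))
    mask = [[point_in_polygon(x, y, polygon) for x in range(left, right + 1)]
            for y in range(bottom, top + 1)]
    return left, bottom, mask

triangle = [(0.0, 0.0), (8.0, 0.0), (4.0, 6.0)]
left, bottom, mask = fill_mask(triangle)
for row in reversed(mask):  # print with the top row first
    print("".join("#" if cell else "." for cell in row))
```

The cost is (grid points in the box) x (number of sides), which matches the observation that reducing the outline from 134 to 42 points made the fill noticeably faster.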

SKShader to create parallax background

A parallax background with a fixed camera is easy to do, but since I'm making a top-down 2D space exploration game, I figured it would be easier to have a single SKSpriteNode fill the screen as a child of my SKCameraNode and use an SKShader to draw a parallax starfield.
I went on shadertoy and found this simple looking shader. I adapted it successfully on shadertoy to accept a vec2() for the velocity of the movement that I want to pass as an SKAttribute so it can follow the movement of my ship.
Here is the original source:
I managed to make the conversion of the original code so it compiles without any error, but nothing shows up on the screen. I tried the individual functions and they do work to generate a fixed image.
Any pointers to make it work?
This isn't really an answer, but it's a lot more info than a comment, and highlights some of the oddness and appropriateness of how SK does particles:
There are a couple of odd things about particles in SceneKit that might also apply to SpriteKit.
When you move the particle system, you can have the particles move with it. This is the default behaviour:
From the docs:
When the emitter creates particles, they are rendered as children of
the emitter node. This means that they inherit the characteristics of
the emitter node, just like nodes do. For example, if you rotate the
emitter node, the positions of all of the spawned particles are
rotated also. Depending on what effect you are simulating with the
emitter, this may not be the correct behavior.
For most applications, this is in fact the wrong behaviour. But for what you want to do, it is ideal. You can position new SKEmitterNodes offscreen where the ship is heading, and fix them to "space" so they rotate in conjunction with the directional changes of the player's ship, and the particles will do exactly what you need to create the feeling of moving through space.
SpriteKit has a prebuild, or populate ability in the form of advancing the simulation: https://developer.apple.com/reference/spritekit/skemitternode/1398027-advancesimulationtime
This means you can have stars ready to show wherever the ship is heading, as the SKEmitterNodes come on screen. There's no need for a loading delay to build stars; this does it immediately.
As near as I can figure, you'd need three particle emitters to pull this off, each the size of the device's screen. Burst the particles out, then release each layer you want for parallax to a target node at the right "depth" from the camera, and carry on by moving these targets as per the screen movement.
Bit messy, but probably quicker, easier, and much more powerfully full of potential for playful effects than creating your own system.
Maybe... I could be wrong.
EDIT : Code is clean and working now. I've setup a GitHub repo for this.
I guess I didn't explain what I wanted properly. I needed a starfield background that follows the camera, like the one you could find in Subspace (back in the day).
The result is pretty cool and convincing! I'll probably come back to this later when the node quantity becomes a bottleneck. I'm still convinced that the proper way to do that is with shaders!
Here is a link to my code on GitHub. I hope it can be useful to someone. It's still a work in progress but it works well. Included in the repo is the source from SKTUtils (a library by Ray Wenderlich that is already freely available on GitHub) and from my own extension to Ray's tools that I called nuts-n-bolts. These are just extensions for common types that everyone should find useful. You, of course, have the source for the StarfieldNode and the InteractiveCameraNode along with a small demo project.
The short answer is that in SpriteKit you use the fragment coordinates directly, without needing to scale against the viewport resolution (iResolution in shadertoy land), so the line:
vec2 samplePosition = (fragCoord.xy / maxResolution) + vec2(0.0, iTime * 0.01);
can be changed to omit the scaling:
vec2 samplePosition = fragCoord.xy + vec2(0.0, iTime * 0.01);
This is likely the root cause (hard to know for sure without your rendition of the shader code) of why you're only seeing black from the shader.
For a full answer for an implementation of a SpriteKit shader making a star field, let's take the original shader and simplify it so there's only one star field, no "fog" (just to keep things simple), and add a variable to control the velocity vector of the movement of the stars:
(this is still in shadertoy code)
float Hash(in vec2 p)
{
    float h = dot(p, vec2(12.9898, 78.233));
    return -1.0 + 2.0 * fract(sin(h) * 43758.5453);
}

vec2 Hash2D(in vec2 p)
{
    float h = dot(p, vec2(12.9898, 78.233));
    float h2 = dot(p, vec2(37.271, 377.632));
    return -1.0 + 2.0 * vec2(fract(sin(h) * 43758.5453), fract(sin(h2) * 43758.5453));
}

float Noise(in vec2 p)
{
    vec2 n = floor(p);
    vec2 f = fract(p);
    vec2 u = f * f * (3.0 - 2.0 * f);
    return mix(mix(Hash(n), Hash(n + vec2(1.0, 0.0)), u.x),
               mix(Hash(n + vec2(0.0, 1.0)), Hash(n + vec2(1.0)), u.x), u.y);
}

vec3 Voronoi(in vec2 p)
{
    vec2 n = floor(p);
    vec2 f = fract(p);
    vec2 mg, mr;
    float md = 8.0;
    for(int j = -1; j <= 1; ++j)
    {
        for(int i = -1; i <= 1; ++i)
        {
            vec2 g = vec2(float(i), float(j));
            vec2 o = Hash2D(n + g);
            vec2 r = g + o - f;
            float d = dot(r, r);
            if(d < md)
            {
                md = d;
                mr = r;
                mg = g;
            }
        }
    }
    return vec3(md, mr);
}

vec3 AddStarField(vec2 samplePosition, float threshold)
{
    vec3 starValue = Voronoi(samplePosition);
    if(starValue.x < threshold)
    {
        float power = 1.0 - (starValue.x / threshold);
        return vec3(power * power * power);
    }
    return vec3(0.0);
}

void mainImage( out vec4 fragColor, in vec2 fragCoord )
{
    float maxResolution = max(iResolution.x, iResolution.y);
    vec2 velocity = vec2(0.01, 0.01);
    vec2 samplePosition = (fragCoord.xy / maxResolution) + vec2(iTime * velocity.x, iTime * velocity.y);
    vec3 finalColor = AddStarField(samplePosition * 16.0, 0.00125);
    fragColor = vec4(finalColor, 1.0);
}
If you paste that into a new shadertoy window and run it you should see a monochrome star field moving towards the bottom left.
To adjust it for SpriteKit is fairly simple. We need to remove the "in"s from the function variables, change the name of some constants (there's a decent blog post about the shadertoy to SpriteKit changes which are needed), and use an Attribute for the velocity vector so we can change the direction of the stars for each SKSpriteNode this is applied to, and over time, as needed.
Here's the full SpriteKit shader source, with a_velocity as a needed attribute defining the star field movement:
float Hash(vec2 p)
{
    float h = dot(p, vec2(12.9898, 78.233));
    return -1.0 + 2.0 * fract(sin(h) * 43758.5453);
}

vec2 Hash2D(vec2 p)
{
    float h = dot(p, vec2(12.9898, 78.233));
    float h2 = dot(p, vec2(37.271, 377.632));
    return -1.0 + 2.0 * vec2(fract(sin(h) * 43758.5453), fract(sin(h2) * 43758.5453));
}

float Noise(vec2 p)
{
    vec2 n = floor(p);
    vec2 f = fract(p);
    vec2 u = f * f * (3.0 - 2.0 * f);
    return mix(mix(Hash(n), Hash(n + vec2(1.0, 0.0)), u.x),
               mix(Hash(n + vec2(0.0, 1.0)), Hash(n + vec2(1.0)), u.x), u.y);
}

vec3 Voronoi(vec2 p)
{
    vec2 n = floor(p);
    vec2 f = fract(p);
    vec2 mg, mr;
    float md = 8.0;
    for(int j = -1; j <= 1; ++j)
    {
        for(int i = -1; i <= 1; ++i)
        {
            vec2 g = vec2(float(i), float(j));
            vec2 o = Hash2D(n + g);
            vec2 r = g + o - f;
            float d = dot(r, r);
            if(d < md)
            {
                md = d;
                mr = r;
                mg = g;
            }
        }
    }
    return vec3(md, mr);
}

vec3 AddStarField(vec2 samplePosition, float threshold)
{
    vec3 starValue = Voronoi(samplePosition);
    if (starValue.x < threshold)
    {
        float power = 1.0 - (starValue.x / threshold);
        return vec3(power * power * power);
    }
    return vec3(0.0);
}

void main()
{
    vec2 samplePosition = v_tex_coord.xy + vec2(u_time * a_velocity.x, u_time * a_velocity.y);
    vec3 finalColor = AddStarField(samplePosition * 20.0, 0.00125);
    gl_FragColor = vec4(finalColor, 1.0);
}
(Worth noting: this is simply a modified version of the original.)
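To reason about the star field without a GPU, the same logic can be ported to the CPU. The Python below is a rough sketch of the shader above, not SpriteKit code: the function names are hypothetical, and because `sin` is evaluated in double precision here, the exact star positions will differ from what a GPU produces.

```python
import math

def fract(x):
    return x - math.floor(x)

def hash2d(px, py):
    """Port of Hash2D: a pseudo-random offset in [-1, 1) per grid cell."""
    h = px * 12.9898 + py * 78.233
    h2 = px * 37.271 + py * 377.632
    return (-1.0 + 2.0 * fract(math.sin(h) * 43758.5453),
            -1.0 + 2.0 * fract(math.sin(h2) * 43758.5453))

def voronoi_distance(px, py):
    """Squared distance from (px, py) to the nearest feature point in the 3x3 cells around it."""
    nx, ny = math.floor(px), math.floor(py)
    fx, fy = px - nx, py - ny
    md = 8.0
    for j in (-1, 0, 1):
        for i in (-1, 0, 1):
            ox, oy = hash2d(nx + i, ny + j)
            rx, ry = i + ox - fx, j + oy - fy
            md = min(md, rx * rx + ry * ry)
    return md

def star_brightness(px, py, threshold=0.00125):
    """Port of AddStarField: cubic falloff inside the threshold, black outside."""
    d = voronoi_distance(px, py)
    if d < threshold:
        power = 1.0 - d / threshold
        return power * power * power
    return 0.0

def sample_position(u, v, time, velocity):
    """Scroll normalized texture coordinates by velocity * time, as in main()."""
    return (u + time * velocity[0], v + time * velocity[1])

x, y = sample_position(0.5, 0.5, 2.0, (0.01, 0.01))
print(star_brightness(x * 20.0, y * 20.0))
```

Changing the `velocity` tuple here plays the role of the a_velocity attribute: it only shifts where the field is sampled, so the stars themselves stay fixed in "space".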

How does the reversebits function of HLSL SM5 work?

I am trying to implement an inverse FFT in an HLSL compute shader and don't understand how the new reversebits function works. The shader is run under Unity3D, but that shouldn't make a difference.
The problem is that the resulting texture remains black, with the exception of the leftmost one or two pixels in every row. It seems to me as if the reversebits function doesn't return the correct indices.
My very simple code is as follows:
#pragma kernel BitReverseHorizontal
Texture2D<float4> inTex;
RWTexture2D<float4> outTex;

uint2 getTextureThreadPosition(uint3 groupID, uint3 threadID) {
    uint2 pos;
    pos.x = (groupID.x * 16) + threadID.x;
    pos.y = (groupID.y * 16) + threadID.y;
    return pos;
}

[numthreads(16,16,1)] // not shown in the original snippet; a 16x16 group is implied by the "* 16" above
void BitReverseHorizontal (uint3 threadID : SV_GroupThreadID, uint3 groupID : SV_GroupID)
{
    uint2 pos = getTextureThreadPosition(groupID, threadID);
    uint xPos = reversebits(pos.x);
    uint2 revPos = uint2(xPos, pos.y);
    float4 values;
    values.x = inTex[pos].x;
    values.y = inTex[pos].y;
    values.z = inTex[revPos].z;
    values.w = 0.0f;
    outTex[revPos] = values;
}
I played around with this for quite a while and found out, that if I replace the reversebits line with this one here:
uint xPos = reversebits(pos.x << 23);
it works. Although I have no idea why. Could be just coincidence. Could someone please explain to me, how I have to use the reversebits function correctly?
Are you sure you want to reverse the bits?
x = 0: reversed: x = 0
x = 1: reversed: x = 2,147,483,648
x = 2: reversed: x = 1,073,741,824
If you fetch texels from a texture using coordinates exceeding the width of the texture then you're going to get black. Unless the texture is > 1 billion texels wide (it isn't) then you're fetching well outside the border.
I am doing the same thing and ran into the same problem. These answers actually solved it for me, but I'll give you the explanation and a complete solution.
So the solution with variable length buffers in HLSL is:
uint reversedIndx;
uint bits = 32 - log2(xLen); // bit width of uint - log2(numberOfIndices)
for (uint j = 0; j < xLen; j ++)
{
    reversedIndx = reversebits(j << bits);
}
And what you found/noticed essentially pushes out all the leading 0 of your index so you are just reversing the least significant or rightmost bits up until the max bits we want.
for example:
int length = 8;
int bits = 32 - 3; // because 1 << 3 is 0b1000 and we want the inverse like a mask
int j = 6;
and since the size of an int is generally 32bits in binary j would be
j = 0b00000000000000000000000000000110;
and reversed it would be (AKA reversebits(j);)
j = 0b01100000000000000000000000000000;
Which was our error, so j bit shifted by bits would be
j = 0b11000000000000000000000000000000;
and then reversed and what we want would be
j = 0b00000000000000000000000000000011;
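The worked example above can be reproduced on the CPU. This Python sketch emulates a 32-bit reversal; `reversebits32` and `fft_reversed_index` are hypothetical helper names, and the length is assumed to be a power of two.

```python
def reversebits32(x):
    """Reverse the bit order of a 32-bit unsigned value, like HLSL reversebits on a uint."""
    result = 0
    for _ in range(32):
        result = (result << 1) | (x & 1)
        x >>= 1
    return result

def fft_reversed_index(index, length):
    """Bit-reverse index within log2(length) bits by shifting before the 32-bit reversal."""
    log2_length = length.bit_length() - 1
    if length <= 0 or (1 << log2_length) != length:
        raise ValueError("length must be a power of two")
    shift = 32 - log2_length
    # Mask to 32 bits because Python integers do not overflow like a uint does
    return reversebits32((index << shift) & 0xFFFFFFFF)

print(reversebits32(1))                               # 2147483648, far outside any texture
print(fft_reversed_index(6, 8))                       # 3
print([fft_reversed_index(i, 8) for i in range(8)])   # [0, 4, 2, 6, 1, 5, 3, 7]
```

For a 512-wide texture, log2(512) = 9 and the shift is 32 - 9 = 23, which is exactly the `<< 23` that made the original shader work.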

How would I implement the Matlab skeletonizing/thinning algorithm on the iPhone?

How can I implement the Matlab algorithm that will skeletonize / thin binary (black
and white) images in Objective-C within an iPhone app?
Well, basically you could use morphological operators for this...
Build eight hit-or-miss operators like this:
      0 0 0
St1 = x 1 x   (for deleting upper pixels)
      1 1 1
Rotate this 4 times to get it for the 4 sides. Then also build 4 more for the corners like this:
      0 0 x
St5 = 0 1 1   (rotate this again 4 times for the 4 corners)
      x 1 1
Then you erode your image (with loops) until none of the operators can be used anymore... what is left is the skeleton of that image...
This shouldn't be too hard to implement in Objective C I guess... (not familiar with it) ... this is a general strategy...
Hope that helps... if not, keep asking... ;-)
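The strategy above can be prototyped before porting it to Objective-C. The following Python sketch is one possible reading of it, not a reference implementation: images are assumed to be non-empty lists of rows of 0/1, pixels outside the image count as 0, and the names (`thin`, `matches`, `ELEMENTS`) are hypothetical.

```python
X = None  # "don't care" cell in a structuring element

ST1 = [[0, 0, 0],
       [X, 1, X],
       [1, 1, 1]]
ST5 = [[0, 0, X],
       [0, 1, 1],
       [X, 1, 1]]

def rotate(element):
    """Rotate a 3x3 element by 90 degrees clockwise."""
    return [list(row) for row in zip(*element[::-1])]

def all_rotations(element):
    rotations = [element]
    for _ in range(3):
        rotations.append(rotate(rotations[-1]))
    return rotations

ELEMENTS = all_rotations(ST1) + all_rotations(ST5)  # the eight operators

def matches(image, row, col, element):
    """Hit-or-miss test: every non-X cell must equal the pixel underneath it."""
    height, width = len(image), len(image[0])
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            expected = element[dr + 1][dc + 1]
            if expected is X:
                continue
            r, c = row + dr, col + dc
            value = image[r][c] if 0 <= r < height and 0 <= c < width else 0
            if value != expected:
                return False
    return True

def thin(image):
    """Apply the eight operators in turn until a full pass removes nothing."""
    image = [list(row) for row in image]
    changed = True
    while changed:
        changed = False
        for element in ELEMENTS:
            hits = [(r, c) for r in range(len(image)) for c in range(len(image[0]))
                    if matches(image, r, c, element)]
            for r, c in hits:
                image[r][c] = 0
            changed = changed or bool(hits)
    return image

block = [[0] * 7 for _ in range(7)]
for r in range(1, 6):
    for c in range(1, 6):
        block[r][c] = 1
for row in thin(block):
    print("".join("#" if v else "." for v in row))
```

Each operator collects all of its hits first and removes them together, so the result does not depend on the scan order within one operator.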
I've written a GLSL fragment shader that performs fast skeletonization on images. You can apply this shader in a loop until you get what you need. GLSL shader code:
uniform sampler2D Texture0;
varying vec2 texCoord;
// 3x3 pixel window
// (-1,+1) (0,+1) (+1,+1)
// (-1,0)  (0,0)  (+1,0)
// (-1,-1) (0,-1) (+1,-1)
float dtex = 1.0 / float(textureSize(Texture0,0));

vec4 pixel(int dx, int dy) {
    return texture2D(Texture0,texCoord +
        vec2(float(dx)*dtex, float(dy)*dtex));
}

int exists(int dx, int dy) {
    return int(pixel(dx,dy).r < 0.5);
}

int neighbors() {
    return exists(-1,+1) +
        exists(0,+1) +
        exists(+1,+1) +
        exists(-1,0) +
        exists(+1,0) +
        exists(-1,-1) +
        exists(0,-1) +
        exists(+1,-1);
}

int transitions() {
    return int(
        clamp(float(exists(-1,+1))-float(exists(0,+1)),0.,1.) + // (-1,+1) -> (0,+1)
        clamp(float(exists(0,+1))-float(exists(+1,+1)),0.,1.) + // (0,+1) -> (+1,+1)
        clamp(float(exists(+1,+1))-float(exists(+1,0)),0.,1.) + // (+1,+1) -> (+1,0)
        clamp(float(exists(+1,0))-float(exists(+1,-1)),0.,1.) + // (+1,0) -> (+1,-1)
        clamp(float(exists(+1,-1))-float(exists(0,-1)),0.,1.) + // (+1,-1) -> (0,-1)
        clamp(float(exists(0,-1))-float(exists(-1,-1)),0.,1.) + // (0,-1) -> (-1,-1)
        clamp(float(exists(-1,-1))-float(exists(-1,0)),0.,1.) + // (-1,-1) -> (-1,0)
        clamp(float(exists(-1,0))-float(exists(-1,+1)),0.,1.)   // (-1,0) -> (-1,+1)
    );
}

int MarkedForRemoval() {
    int neib = neighbors();
    int tran = transitions();
    if (exists(0,0)==0 // do not remove if already white
        || neib==0 // do not remove an isolated point
        || neib==1 // do not remove tip of a line
        || neib==7 // do not remove located in concavity
        || neib==8 // do not remove not a boundary point
        || tran>=2 // do not remove on a bridge connecting two or more edge pieces
    )
        return 0;
    return 1;
}

void main(void)
{
    int remove = MarkedForRemoval();
    vec4 curr = texture2D(Texture0,texCoord);
    vec4 col = vec4(remove,remove,remove,1.0);
    gl_FragColor = (remove==1)? col:((curr.r > 0.05)?
    // (the rest of this expression is cut off in the source)
}
Only this time code is based on this lecture (actually on first part of lecture, so algorithm has some bugs :-) )
See what happens when poor chimpanzee was constantly fed with this GLSL shader:
[Images: results after iterations 0, 5, 10 and 15]