I am studying with the raywenderlich Metal by Tutorials book. There is an example in the book about object selection.
typedef struct {
    uint width;
    uint height;
    uint tiling;
    uint lightCount;
    vector_float3 cameraPosition;
    uint objectId;
    uint touchX;
    uint touchY;
} Params;
// this struct is sent to the fragment shader with
encoder.setFragmentBytes(&params, length: MemoryLayout<Params>.stride, index:
// and the hit test in the fragment shader looks like this
fragment float4 fragment_main(constant Params &params [[buffer(ParamsBuffer)]],
    if(params.objectId != 0 && objectID == params.objectId) {
        material.baseColor = float3(0.9, 0.5, 0);
    }
So far I can only do the hit test in the fragment shader. If I want to get the selected objectId value on the Swift side, what should I do? Or is the hit test supposed to happen only here? I didn't understand this part. Can values (objectId in this example) be shared between the CPU and the GPU, or can I only process the values I send inside the shader? Can somebody explain this, please?
The selection test works correctly, but the objectId value is always 0 when I try to read it on the Swift side.

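Yes, you have to create an MTLBuffer from the data and bind it to the fragment shader with the setFragmentBuffer(_:offset:index:) method. setFragmentBytes only copies the struct to the GPU, so nothing the shader does is visible in the Swift-side params afterwards. For the shader to write a value back, the parameter also has to be declared in the device address space rather than constant, and the buffer contents should be read on the CPU only after the command buffer has completed.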


Something wrong with custom function GetLightingInformation in Shader Graph

void GetLightingInformation_float(out float3 Direction, out float3 Color, out float Attenuation)
{
    Direction = float3(-0.5,0.5,-0.5);
    Color = float3(1,1,1);
    Attenuation = 0.4;
    Light light = GetMainLight();
    Direction = light.direction;
    Attenuation = light.distanceAttenuation;
    Color = light.color;
}
I think there's something wrong with the code or the Unity version (Unity 2021.3.4f1).
If I don't use this function everything is OK, but if I use it, I get an error like this:
Shader error in 'Master': 'GetLightingInformation_float': output parameter 'Direction' not completely initialized at Assets/Shaders/MainLight.hlsl(1) (on d3d11)
Compiling Subshader: 0, Pass: BuiltIn ForwardAdd, Fragment program with POINT _ADDITIONAL_LIGHTS_VERTEX
I can't show images because I don't have enough reputation.

Updating a DirectX constant buffer from a stack or heap variable

Hi, I have recently been learning DirectX 11, and I have a problem using a constant buffer.
I create a constant buffer for a directional light and update its value so that I can use it in my shader.
Constant buffer structure on the application side:
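struct SHADER_DIRECTIONAL_LIGHT
{
    XMFLOAT4 ambient;
    XMFLOAT4 diffuse;
    XMFLOAT4 specular;
    XMFLOAT4 enabled;
    XMFLOAT4 intensity;
};
class DirectionalLight
{
    //1. member variable:    SHADER_DIRECTIONAL_LIGHT m_data;
    //2. static variable:    static SHADER_DIRECTIONAL_LIGHT m_data;
    //3. allocated variable: SHADER_DIRECTIONAL_LIGHT* m_data;
};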
Constant buffer structure on the shader side:
cbuffer DIRECTIONAL_LIGHT : register(b0)
{
    float4 d_Ambient;
    float4 d_Diffuse;
    float4 d_Specular;
    float4 d_Dir;
    float4 d_Enabled;
    float4 d_intensity;
};
How I update the constant buffer:
// of course, edit `m_data` before calling Map
D3D11_MAPPED_SUBRESOURCE mappedData;
ZeroMemory(&mappedData, sizeof(D3D11_MAPPED_SUBRESOURCE));
HRESULT hr = dContext->Map(m_cb, 0, D3D11_MAP_WRITE_DISCARD, 0, &mappedData);
CopyMemory(mappedData.pData, &m_data, sizeof(SHADER_DIRECTIONAL_LIGHT));
dContext->Unmap(m_cb, 0);
dContext->PSSetConstantBuffers(SHADER_REG_CB_DIRECTIONAL_LIGHT, 1, &m_cb);
The problem is this:
When I declare m_data as SHADER_DIRECTIONAL_LIGHT m_data; or static SHADER_DIRECTIONAL_LIGHT m_data;,
it works fine, and the value I write into the constant buffer with Map is properly applied on the shader side too.
But when I declare m_data as SHADER_DIRECTIONAL_LIGHT* m_data, the value I update doesn't arrive. The value on the shader side is just an uninitialized random value.
From debugging, my guess is that the problem comes from the different memory space of the variable I use to update the constant buffer. If I use a variable on the stack, I successfully update the constant buffer; when I use a variable on the heap, I don't.
I hope somebody can clarify what the actual problem is here.
You are passing the address of the member pointer m_data in the call
CopyMemory(mappedData.pData, &m_data, sizeof(SHADER_DIRECTIONAL_LIGHT));
Since the pointer already holds the address of the data, pass it without the address-of operator, like this:
CopyMemory(mappedData.pData, m_data, sizeof(SHADER_DIRECTIONAL_LIGHT));
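The difference between the two calls can be reproduced without Direct3D. Below is a minimal sketch, assuming a simplified stand-in struct (LightData) and plain std::memcpy in place of CopyMemory and the mapped buffer. The names LightData, copyFromPointer and copyFromValue are hypothetical and do not come from the original code.

```cpp
#include <cassert>
#include <cstring>
#include <memory>

// Hypothetical stand-in for SHADER_DIRECTIONAL_LIGHT (assumption: one float4 is enough to show the effect).
struct LightData {
    float ambient[4];
};

// Heap case: `data` already IS the address of the struct, so it is passed as-is.
// Writing `&data` here would copy the bytes of the pointer variable itself
// (and read past it, since sizeof(LightData) > sizeof(data)), which is the bug in the question.
inline LightData copyFromPointer(const LightData* data) {
    LightData dst{};
    std::memcpy(&dst, data, sizeof(LightData));
    return dst;
}

// Stack/member case: the struct is held by value, so the address-of operator is required.
inline LightData copyFromValue(const LightData& data) {
    LightData dst{};
    std::memcpy(&dst, &data, sizeof(LightData));
    return dst;
}
```

Used with a stack value and a heap copy of the same value, both helpers produce identical bytes in the destination, which is what the constant buffer should receive in either case.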

How to use SCNBufferBindingBlock in SceneKit?

I'm looking at SceneKit's handle binding method with the SCNBufferBindingBlock call back as described here:
Does anyone have an example of how this works?
let program = SCNProgram()
program.handleBinding(ofBufferNamed: "", frequency: .perFrame) { (steam, theNode, theShadable, theRenderer) in
}
To me it reads like I can use a *.metal shader on an SCNNode without having to go through the hassle of SCNTechnique. Any takers?
Just posting this in case someone else came here looking for a concise example. Here's how SCNProgram's handleBinding() method can be used with Metal:
First define a data structure in your .metal shader file:
struct MyShaderUniforms {
    float myFloatParam;
    float2 myFloat2Param;
};
Then pass this as an argument to a shader function:
fragment half4 myFragmentFunction(MyVertex vertexIn [[stage_in]],
constant MyShaderUniforms& shaderUniforms [[buffer(0)]]) {
Next, define the same data structure in your Swift file:
struct MyShaderUniforms {
    var myFloatParam: Float = 1.0
    var myFloat2Param = simd_float2()
}
Now create an instance of this data structure, change its values and define the SCNBufferBindingBlock:
var myUniforms = MyShaderUniforms()
myUniforms.myFloatParam = 3.0
program.handleBinding(ofBufferNamed: "shaderUniforms", frequency: .perFrame) { (bufferStream, node, shadable, renderer) in
    bufferStream.writeBytes(&myUniforms, count: MemoryLayout<MyShaderUniforms>.stride)
}
Here, the string passed to ofBufferNamed: corresponds to the argument name in the fragment function. The block's bufferStream parameter is the stream that the user-defined MyShaderUniforms value is written to. SceneKit calls the block at the requested frequency (here, every frame), so updated values are written each time.
The handleBinding(ofBufferNamed:frequency:handler:) method registers a block for SceneKit to call at render time for binding a Metal buffer to the shader program. This method can only be used with Metal-based programs. An SCNProgram object performs this custom rendering: it contains a vertex shader and a fragment shader, and using it completely replaces SceneKit's rendering. Your shaders take input from SceneKit and become responsible for all transform, lighting and shading effects you want to produce. Use handleBinding() to associate a block with a Metal shader program to handle setup of a buffer used in that shader.
Here's a link to Developer Documentation on SCNProgram class.
You also need the instance method writeBytes(_:count:), which copies the necessary data bytes into the underlying Metal buffer for use by a shader.
The SCNTechnique class is made specifically for post-processing SceneKit's rendering of a scene using additional drawing passes with custom Metal or OpenGL shaders. Using SCNTechnique you can create effects such as color grading, displacement, motion blur and ambient occlusion, as well as other render passes.
Here is a first code excerpt showing how to use the handleBinding() method:
func useTheseAPIs(shadable: SCNShadable,
                  bufferStream: SCNBufferStream,
                  voidPtr: UnsafeMutableRawPointer,
                  bindingBlock: @escaping SCNBindingBlock,
                  bufferFrequency: SCNBufferFrequency,
                  bufferBindingBlock: @escaping SCNBufferBindingBlock,
                  program: SCNProgram) {
    bufferStream.writeBytes(voidPtr, count: 4)
    shadable.handleBinding!(ofSymbol: "symbol", handler: bindingBlock)
    shadable.handleUnbinding!(ofSymbol: "symbol", handler: bindingBlock)
    program.handleBinding(ofBufferNamed: "pass",
                          frequency: bufferFrequency,
                          handler: bufferBindingBlock)
}
And here is a second code's excerpt:
let program = SCNProgram()
program.delegate = self as? SCNProgramDelegate
program.vertexShader = NextLevelGLContextYUVVertexShader
program.fragmentShader = NextLevelGLContextYUVFragmentShader
forSymbol: NextLevelGLContextAttributeVertex,
options: nil)
forSymbol: NextLevelGLContextAttributeTextureCoord,
options: nil)
if let material = self._material {
    material.program = program
    material.handleBinding(ofSymbol: NextLevelGLContextUniformTextureSamplerY, handler: {
        (programId: UInt32, location: UInt32, node: SCNNode?, renderer: SCNRenderer) in
        glUniform1i(GLint(location), 0);
    })
    material.handleBinding(ofSymbol: NextLevelGLContextUniformTextureSamplerUV, handler: {
        (programId: UInt32, location: UInt32, node: SCNNode?, renderer: SCNRenderer) in
        glUniform1i(GLint(location), 1);
    })
}
Also, look at the SO post "Simulating refraction in SceneKit".

Connection between [[stage_in]], MTLVertexDescriptor and MTKMesh

What is the connection between:
Using [[stage_in]] in a Metal Shader
Using MTLVertexDescriptor
Using MTKMesh
For example
Is it possible to use [[stage_in]] without using MTLVertexDescriptor?
Is it possible to use MTLVertexDescriptor without using MTKMesh, but an array of a custom struct based data structure? Such as struct Vertex {...}, Array<Vertex>?
Is it possible to use MTKMesh without using MTLVertexDescriptor? For example using the same struct based data structure?
I didn't find this information on the internet, and the Metal Shading Language Specification doesn't even include the words "descriptor" or "mesh".
No. If you try to create a render pipeline state from a pipeline descriptor without a vertex descriptor, and the corresponding vertex function has a [[stage_in]] parameter, the pipeline state creation call will fail.
Yes. After all, when you draw an MTKMesh, you're still obligated to call setVertexBuffer(...) with the buffers wrapped by the mesh's constituent MTKMeshBuffers. You could just as readily create an MTLBuffer yourself and copy your custom vertex structs into it.
Yes. Instead of having a [[stage_in]] parameter, you'd have a parameter attributed with [[buffer(0)]] (assuming all of the vertex data is interleaved in a single vertex buffer) of type MyVertexType *, as well as a [[vertex_id]] parameter that tells you where to index into that buffer.
Here's an example of setting the vertex buffers from an MTKMesh on a render command encoder:
for (index, vertexBuffer) in mesh.vertexBuffers.enumerated() {
    commandEncoder.setVertexBuffer(vertexBuffer.buffer, offset: vertexBuffer.offset, index: index)
}
vertexBuffer is of type MTKMeshBuffer, while its buffer property is of type MTLBuffer; I mention this because it can be confusing.
Here is one way in which you might create a vertex descriptor to tell Model I/O and MetalKit to lay out the mesh data you're loading:
let mdlVertexDescriptor = MDLVertexDescriptor()
mdlVertexDescriptor.attributes[0] = MDLVertexAttribute(name: MDLVertexAttributePosition, format: MDLVertexFormat.float3, offset: 0, bufferIndex: 0)
mdlVertexDescriptor.attributes[1] = MDLVertexAttribute(name: MDLVertexAttributeNormal, format: MDLVertexFormat.float3, offset: 12, bufferIndex: 0)
mdlVertexDescriptor.attributes[2] = MDLVertexAttribute(name: MDLVertexAttributeTextureCoordinate, format: MDLVertexFormat.float2, offset: 24, bufferIndex: 0)
mdlVertexDescriptor.layouts[0] = MDLVertexBufferLayout(stride: 32)
You can create a corresponding MTLVertexDescriptor in order to create a render pipeline state suitable for rendering such a mesh:
let vertexDescriptor = MTKMetalVertexDescriptorFromModelIO(mdlVertexDescriptor)!
Here's a vertex struct that matches that layout:
struct VertexIn {
    float3 position  [[attribute(0)]];
    float3 normal    [[attribute(1)]];
    float2 texCoords [[attribute(2)]];
};
Here's a stub vertex function that consumes one of these vertices:
vertex VertexOut vertex_main(VertexIn in [[stage_in]])
And finally, here's a vertex struct and vertex function you could use to render the exact same mesh data without a vertex descriptor:
struct VertexIn {
    packed_float3 position;
    packed_float3 normal;
    packed_float2 texCoords;
};
vertex VertexOut vertex_main(device VertexIn *vertices [[buffer(0)]],
                             uint vid [[vertex_id]])
{
    // fetch this vertex manually instead of relying on [[stage_in]]
    VertexIn in = vertices[vid];
}
Note that in this last case, I need to mark the struct members as packed since by default, Metal's simd types are padded for alignment purposes (specifically, the stride of a float3 is 16 bytes, not 12 as we requested in our vertex descriptor).
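The size difference behind that note can be checked with ordinary C++ structs. The sketch below is only an illustration: the types Float3, Float2, PackedFloat3, PackedFloat2, PaddedVertex and PackedVertex are hypothetical stand-ins that mimic the size and alignment of Metal's float3/float2 and packed_float3/packed_float2 (assumption: float3 occupies 16 bytes with 16-byte alignment, float2 occupies 8 bytes with 8-byte alignment).

```cpp
#include <cassert>
#include <cstddef>

// Stand-ins for Metal's aligned vector types (assumed sizes/alignments, see lead-in).
struct alignas(16) Float3 { float x, y, z; };
struct alignas(8)  Float2 { float x, y; };

// Stand-ins for the packed variants: no padding, 4-byte alignment.
struct PackedFloat3 { float x, y, z; };
struct PackedFloat2 { float x, y; };

// Layout you get if the buffer struct uses the aligned types.
struct PaddedVertex {
    Float3 position;
    Float3 normal;
    Float2 texCoords;
};

// Layout that matches the vertex descriptor offsets 0 / 12 / 24 with stride 32.
struct PackedVertex {
    PackedFloat3 position;
    PackedFloat3 normal;
    PackedFloat2 texCoords;
};

static_assert(sizeof(Float3) == 16, "aligned float3 stand-in occupies 16 bytes");
static_assert(sizeof(PackedFloat3) == 12, "packed float3 stand-in occupies 12 bytes");
static_assert(offsetof(PackedVertex, normal) == 12, "matches descriptor offset 12");
static_assert(offsetof(PackedVertex, texCoords) == 24, "matches descriptor offset 24");
static_assert(sizeof(PackedVertex) == 32, "matches descriptor stride 32");
static_assert(offsetof(PaddedVertex, normal) == 16, "aligned layout does not match");
```

With the aligned stand-ins the normal would be read from byte 16 instead of byte 12, so every attribute after the position would be misread from a buffer laid out with a 32-byte stride.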

glDrawArrays allocates memory on every frame

I recently found that glDrawArrays is allocating and releasing huge amounts of memory on every frame.
I suspect that it's related to the "Shaders compiled outside of initialization" issue reported by the OpenGL profiler. It occurs on every frame! Shouldn't it happen only once and disappear after the shaders are compiled?
EDIT: I also double-checked that my vertices are properly aligned, so I'm really confused about what memory the driver needs to allocate on every frame.
EDIT #2: I'm using VBOs and degenerate triangle strips to render sprites. I'm passing geometry on every frame (GL_STREAM_DRAW).
EDIT #3:
I think I'm close to the issue but still unable to solve it. The problem disappears if I pass the same texture id value to the shader (see the source code comment). I think this issue is somehow related to the fragment shader.
In my sprite batch I have a list of sprites, and I render them by texture id in FIFO order.
Here's source code of my sprite batch class:
void spriteBatch::renderInRange(shader& prog, int start, int count){
    int curTexture = textures[start];
    int startFrom = start;
    //Looping through all vertexes and rendering them by texture id's
    for(int i=start;i<start+count;++i){
        if(textures[i] != curTexture || i == (start + count) -1){
            //Problem occurs after decommenting this line
            // prog.setUniform("texture", curTexture-1);
            prog.setUniform("texture", 0); // if I pass same texture id everything is OK
            int startVertex = startFrom * vertexesPerSprite;
            int cnt = ((i - startFrom) * vertexesPerSprite);
            //If last one has same texture we just adding it
            //to last render call
            if(i == (start + count) - 1 && textures[i] == curTexture)
                cnt = ((i + 1) - startFrom) * vertexesPerSprite;
            render(vbo, GL_TRIANGLE_STRIP, startVertex+1, cnt-1);
            //if last element has different texture
            //we need to render it separately
            if(i == (start + count) - 1 && textures[i] != curTexture){
                // prog.setUniform("texture", textures[i]-1);
                render(vbo, GL_TRIANGLE_STRIP, (i * vertexesPerSprite) + 1, 5);
            }
            curTexture = textures[i];
            startFrom = i;
        }
    }
}
inline GLint getUniformLocation(GLuint shaderID, const string& name) {
    GLint iLocation = glGetUniformLocation(shaderID, name.c_str());
    if(iLocation == -1){ // shader variable not found
        stringstream errorText;
        errorText << "Uniform \"" << name << "\" was not found!";
        throw logic_error(errorText.str());
    }
    return iLocation;
}
void shader::setUniform(const string& name, const matrix& value) {
    GLint location = getUniformLocation(this->programID, name);
    glUniformMatrix4fv(location, 1, GL_FALSE, &(value[0]));
}
void shader::setUniform(const string& name, int value) {
    GLint iLocation = getUniformLocation(this->programID, name);
    //GLenum error = glGetError();
    glUniform1i(iLocation, value);
    // error = glGetError();
}
EDIT #4: I tried to profile the app on iOS 6 and an iPhone 5, and the allocations are much bigger. But the methods are different in this case. I'm attaching a new screenshot.
The issue is resolved by creating a separate shader for each texture.
It looks like a bug in the driver implementation that happens on all iOS devices (I tested on iOS 5/6). However, on higher iPhone models it's not that noticeable.
On the iPhone 4 the performance hit was very significant: from 60 FPS down to 38!
More code would help, but have you checked whether the amount of memory involved is comparable to the amount of geometry you're updating? (Although that would seem like a lot of geometry!) It looks like GL is holding your update until glDrawArrays and releasing it once it can be pulled into internal GL state.
If you can run the code in a macOS app, the OpenGL Profiler tool may be able to further isolate the condition (look in the Xcode documentation for more info if you're not familiar with this tool). I'd also suggest looking at texture use, given the amount of memory involved.
The easiest thing to do might be to conditionally break on malloc() for a large allocation, note the address, and examine what's been loaded there.
Try to query the texture uniform location just once (at initialization) and cache it. Calling glGetUniformLocation too often in one frame will hammer performance (depending on the sprite count).
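One way to cache the location is a small map keyed by uniform name. The following is a minimal sketch, not the poster's code: the class UniformLocationCache is hypothetical, and the lookup callable is an assumed stand-in for a wrapper around glGetUniformLocation, so that the sketch does not need a GL context.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Caches uniform locations by name so the expensive lookup runs once per name.
// `lookup` is a stand-in (assumption) for something like:
//   [program](const std::string& n) { return glGetUniformLocation(program, n.c_str()); }
class UniformLocationCache {
public:
    explicit UniformLocationCache(std::function<int(const std::string&)> lookup)
        : lookup_(std::move(lookup)) {}

    // Returns the cached location, performing the real lookup only on the first request.
    int location(const std::string& name) {
        auto it = cache_.find(name);
        if (it != cache_.end()) {
            return it->second;
        }
        int loc = lookup_(name);
        cache_.emplace(name, loc);
        return loc;
    }

private:
    std::function<int(const std::string&)> lookup_;
    std::unordered_map<std::string, int> cache_;
};
```

In a shader class like the one in the question, setUniform would ask such a cache for the location instead of calling getUniformLocation on every call. The cache would need to be cleared if the program is relinked, since locations may change then.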