How to use Depth Testing with CAMetalLayer using Metal and Swift? - swift

Recently I decided to learn how to use the Metal framework with Swift. I read some tutorials, watched videos, experimented a bit, and finally got to the part where I have to use depth testing to make things look right.
I haven't done such low-level graphics programming before, so I searched all over the Internet for how depth testing works and how to implement it using CAMetalLayer and Metal.
However, all the examples of depth testing I could find were written for OpenGL, and I couldn't find the corresponding functions in Metal.
How do I implement Depth Testing with CAMetalLayer using Metal and Swift?
Thank you in advance!

This is a good example.
http://metalbyexample.com/up-and-running-3/
The key is that CAMetalLayer does not maintain a depth buffer for you. You need to create and manage the depth texture explicitly, and attach it to the depth attachment of the render pass descriptor that you use to create the render command encoder.

The question in this Stack Overflow post contains your answer, although it's written in Objective-C. Basically, as Dong Feng pointed out, you need to create and manage the depth texture yourself.
Here's a Swift 4 snippet for how to create a depth texture
func buildDepthTexture(_ device: MTLDevice, _ size: CGSize) -> MTLTexture {
    let desc = MTLTextureDescriptor.texture2DDescriptor(
        pixelFormat: .depth32Float_stencil8,
        width: Int(size.width), height: Int(size.height), mipmapped: false)
    desc.storageMode = .private
    desc.usage = .renderTarget
    return device.makeTexture(descriptor: desc)!
}
And here's how you need to attach it to a MTLRenderPassDescriptor
let renderPassDesc = MTLRenderPassDescriptor()
let depthAttachment = renderPassDesc.depthAttachment!
// depthTexture is created using the above function
depthAttachment.texture = depthTexture
depthAttachment.loadAction = .clear   // clear the depth buffer at the start of the pass
depthAttachment.clearDepth = 1.0
depthAttachment.storeAction = .dontCare
// Maybe set up the color attachments, etc.
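To actually get depth testing, the encoder also needs a depth/stencil state, and the render pipeline needs a matching depth pixel format. Here is a minimal sketch of that part, assuming pipelineDesc is your MTLRenderPipelineDescriptor and encoder is the render command encoder you create from the pass descriptor above:
// The pipeline must know which depth/stencil format the attachment uses
pipelineDesc.depthAttachmentPixelFormat = .depth32Float_stencil8
pipelineDesc.stencilAttachmentPixelFormat = .depth32Float_stencil8
// Describe the depth test itself
let depthStencilDesc = MTLDepthStencilDescriptor()
depthStencilDesc.depthCompareFunction = .less   // keep fragments closer to the camera
depthStencilDesc.isDepthWriteEnabled = true
let depthStencilState = device.makeDepthStencilState(descriptor: depthStencilDesc)!
// When encoding a frame:
encoder.setDepthStencilState(depthStencilState)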

Related

How do I use my own occlusion ML model with ARKit?

ARKit has built-in people occlusion, which you can enable with something like
guard let config = arView.session.configuration as? ARWorldTrackingConfiguration else {
    fatalError("Unexpectedly failed to get the configuration.")
}
guard ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) else {
    fatalError("People occlusion is not supported on this device.")
}
config.frameSemantics.insert(.personSegmentationWithDepth)
arView.session.run(config)
I would like to provide my own binary segmentation model ("binary" as in person/not-person for each pixel), presumably using CoreML, instead of using whatever Apple is using to segment people because I want to occlude something else instead of people. How do I do this? Is there a straightforward way to do this or will I have to re-implement parts of the rendering pipeline? They show some code about how to use people segmentation with a custom Metal renderer in this WWDC 2019 video (starts around 9:30) but it's not clear to me how to use my own model based on that, and I would prefer to use ARKit/RealityKit instead of implementing my own rendering (I am a mere mortal).
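For context, I can already run my own Core ML segmentation model against each camera frame with Vision, roughly as sketched below (MySegmentation is a placeholder for my model), but that only gives me a mask per frame; I don't see how to feed it back into ARKit/RealityKit's occlusion:
import ARKit
import CoreML
import Vision
// MySegmentation is a placeholder for my own binary-segmentation Core ML model.
let visionModel = try! VNCoreMLModel(for: MySegmentation(configuration: MLModelConfiguration()).model)
func segmentationMask(for frame: ARFrame) -> CVPixelBuffer? {
    let request = VNCoreMLRequest(model: visionModel)
    request.imageCropAndScaleOption = .scaleFill
    let handler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage, options: [:])
    try? handler.perform([request])
    // If the model's output is an image, Vision returns it as a pixel-buffer observation.
    return (request.results?.first as? VNPixelBufferObservation)?.pixelBuffer
}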

Using CoreML to classify NSImages

I'm trying to work with Core ML in Xcode to classify images that are simply single digits or letters. To start out with, I'm just using .png images of digits. Using the Create ML tool, I built an image classifier (NOT including any Vision support stuff) and provided a set of about 300 training images and a separate set of 50 testing images. When I run this model, it trains and tests successfully and generates a model. Still within the tool, I access the model and feed it another set of 100 images to classify. It works properly, identifying 98 of them correctly.
Then I created a Swift sample program to access the model (from the macOS single view template); it's set up to accept a dropped image file, access the model's prediction method, and print the result. The problem is that the model expects an object of type CVPixelBuffer and I'm not sure how to properly create this from NSImage. I found some reference code and incorporated it, but when I actually drag my classification images into the app it's only about 50% accurate. So I'm wondering if anyone has any experience with this type of model. It would be nice if there were a way to look at the Create ML source code to see how it processes a dropped image when predicting from the model.
The code for processing the image and invoking model prediction method is:
// initialize the model
mlModel2 = MLSample() // MLSample is the model generated by the Create ML tool and imported into the project
// prediction logic for the image
// (included in a func)
//
let fimage = NSImage(contentsOfFile: fname)! // fname is obtained from the dropped file
do {
    let fcgImage = fimage.cgImage(forProposedRect: nil, context: nil, hints: nil)
    let imageConstraint = mlModel2?.model.modelDescription.inputDescriptionsByName["image"]?.imageConstraint
    let featureValue = try MLFeatureValue(cgImage: fcgImage!, constraint: imageConstraint!, options: nil)
    let pxbuf = featureValue.imageBufferValue
    let mro = try mlModel2?.prediction(image: pxbuf!)
    if let mro = mro {
        let mroLbl = mro.classLabel
        let mroProb = mro.classLabelProbs[mroLbl] ?? 0.0
        print(String(format: "M2 MLFeature: %@ %5.2f", mroLbl, mroProb))
        return
    }
}
catch {
    print(error.localizedDescription)
}
return
There are several ways to do this.
The easiest is what you're already doing: create an MLFeatureValue from the CGImage object.
My repo CoreMLHelpers has a different way to convert CGImage to CVPixelBuffer.
A third way is to get Xcode 12 (currently in beta). The automatically-generated class now accepts images instead of just CVPixelBuffer.
In cases like this it's useful to look at the image that Core ML actually sees. You can use the CheckInputImage project from https://github.com/hollance/coreml-survival-guide to verify this (it's an iOS project but easy enough to port to the Mac).
If the input image is correct, and you still get the wrong predictions, then probably the image preprocessing options on the model are wrong. For more info: https://machinethink.net/blog/help-core-ml-gives-wrong-output/
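If you do end up converting a CGImage to a CVPixelBuffer by hand (the second approach above), a minimal sketch of such a conversion looks roughly like this. It is not the exact CoreMLHelpers code, just the standard Core Video / Core Graphics route, and it assumes a 32ARGB buffer:
import CoreVideo
import CoreGraphics
func pixelBuffer(from image: CGImage, width: Int, height: Int) -> CVPixelBuffer? {
    var pb: CVPixelBuffer?
    guard CVPixelBufferCreate(kCFAllocatorDefault, width, height,
                              kCVPixelFormatType_32ARGB, nil, &pb) == kCVReturnSuccess,
          let buffer = pb else { return nil }
    CVPixelBufferLockBaseAddress(buffer, [])
    defer { CVPixelBufferUnlockBaseAddress(buffer, []) }
    // Draw the CGImage into the buffer's backing memory via a CGContext.
    guard let context = CGContext(data: CVPixelBufferGetBaseAddress(buffer),
                                  width: width, height: height,
                                  bitsPerComponent: 8,
                                  bytesPerRow: CVPixelBufferGetBytesPerRow(buffer),
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) else { return nil }
    context.draw(image, in: CGRect(x: 0, y: 0, width: width, height: height))
    return buffer
}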

How to get frames from a local video file in Swift?

I need to get the frames from a local video file so I can process them before the video is played. I already tried using AVAssetReader and VideoOutput.
[EDIT] Here is the code I used, from Accessing Individual Frames using AVPlayer:
let asset = AVAsset(url: inputUrl)
let reader = try! AVAssetReader(asset: asset)
let videoTrack = asset.tracks(withMediaType: .video)[0]
// read video frames as BGRA
let trackReaderOutput = AVAssetReaderTrackOutput(track: videoTrack,
    outputSettings: [String(kCVPixelBufferPixelFormatTypeKey): NSNumber(value: kCVPixelFormatType_32BGRA)])
reader.add(trackReaderOutput)
reader.startReading()
while let sampleBuffer = trackReaderOutput.copyNextSampleBuffer() {
    print("sample at time \(CMSampleBufferGetPresentationTimeStamp(sampleBuffer))")
    if let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) {
        // process each CVPixelBuffer here
        // see CVPixelBufferGetWidth, CVPixelBufferLockBaseAddress, CVPixelBufferGetBaseAddress, etc.
    }
}
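A minimal sketch of what that per-frame processing placeholder can look like, using the CVPixelBuffer calls mentioned in the comment (assuming the BGRA output configured above):
func process(_ imageBuffer: CVImageBuffer) {
    CVPixelBufferLockBaseAddress(imageBuffer, .readOnly)
    defer { CVPixelBufferUnlockBaseAddress(imageBuffer, .readOnly) }
    let width = CVPixelBufferGetWidth(imageBuffer)
    let height = CVPixelBufferGetHeight(imageBuffer)
    let bytesPerRow = CVPixelBufferGetBytesPerRow(imageBuffer)
    if let base = CVPixelBufferGetBaseAddress(imageBuffer) {
        // base points at the first row of pixels (BGRA, 4 bytes per pixel here);
        // read or modify the raw bytes as needed
        let bytes = base.assumingMemoryBound(to: UInt8.self)
        print(width, height, bytesPerRow, bytes[0])
    }
}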
I believe AVAssetReader should work. What did you try? Have you seen this sample code from Apple? https://developer.apple.com/library/content/samplecode/ReaderWriter/Introduction/Intro.html
I found out what the problem was! It was with my implementation. The code I posted is correct. Thank you all!
You can have a look at VideoToolbox : https://developer.apple.com/documentation/videotoolbox
But beware: this is close to the hardware decompressor and sparsely documented terrain.
Depending on what processing you want to do, OpenCV may be an option, in particular if you are detecting or tracking objects in your frames. If your needs are simpler, then the effort to use OpenCV with Swift may be a little too much - see below.
You can open a video, read it frame by frame, do your work on the frames and then display them - bearing in mind the need to be efficient to avoid delaying the display.
The basic code structure is quite simple - this is a Python example, but the same principles apply across supported languages:
import numpy as np
import cv2
cap = cv2.VideoCapture('vtest.avi')
while cap.isOpened():
    ret, frame = cap.read()
    # Do whatever work you want on the frame here - in this example
    # from the tutorial the image is being converted from one colour
    # space to another
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # This displays the resulting frame
    cv2.imshow('frame', gray)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
More info here: http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_gui/py_video_display/py_video_display.html
The one caveat is that using OpenCV with Swift requires some additional effort - this is a good example, but the process evolves constantly, so it is worth searching around if you decide to go this way: https://medium.com/@yiweini/opencv-with-swift-step-by-step-c3cc1d1ee5f1

CUICatalog: Invalid Request: requesting subtype without specifying idiom (Where is it coming from and how to fix it?)

When I run my SpriteKit game, I receive this error multiple times in the console. As far as I can tell (though I'm not completely sure), the game itself is unaffected, but the error might have some other implications, along with crowding the debug console.
I did some research into the error and found a few possible solutions, none of which seem to have completely worked. These include setting ignoresSiblingOrder to false and specifying textures as SKTextureAtlas(named: "atlasName").textureNamed("textureName"), but neither fixed it.
I think the error is coming from somewhere in the use of textures and texture atlases in the asset catalogue, though I'm not completely sure. Here is how I am implementing some of these textures/images:
let texture = SKTextureAtlas(named: "character").textureNamed("character1")
character = SKSpriteNode(texture: texture)
also:
let atlas = SKTextureAtlas(named: "character")
var frames = [SKTexture]()
let numImages = atlas.textureNames.count
for i in 1...numImages {
    let textureName = "character\(i)"
    frames.append(atlas.textureNamed(textureName))
}
for i in stride(from: numImages, through: 1, by: -1) {
    let textureName = "character\(i)"
    frames.append(atlas.textureNamed(textureName))
}
let firstFrame = frames[0]
character = SKSpriteNode(texture: firstFrame)
The above code is just used to create an array from which to animate the character, and the animation runs completely fine.
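The animation is run from that array roughly like this (the timePerFrame value here is arbitrary):
let animation = SKAction.animate(with: frames, timePerFrame: 0.05)
character.run(SKAction.repeatForever(animation), withKey: "characterAnimation")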
For all my other sprite nodes, I initialize with SKSpriteNode(imageNamed: "imageName") with the image name from the asset catalogue, but not within a texture atlas. All the images have @1x, @2x, and @3x versions.
I'm not sure if there are any other possible sources for the error message, or if the examples above are the sources of the error.
Is this just a bug with sprite kit, or a legitimate error with my code or assets?
Thanks!
I have this error too. In my opinion, it's an Xcode 7.2 bug and not your fault. I updated Xcode in the middle of making an app and this message started showing up constantly in the console. According to this and that link, you have nothing to fear here.
Product > Clean
seems to do the trick.
The error seems to start popping up when you delete an item from the asset catalogue but a reference to it stays buried in code somewhere. (In my case it was the default spaceship asset, which I had deleted.)

Colorizing an image in Swift

I am trying to figure out some basic operations in working with Swift and images (PNG and JPG).
I have gotten to the point where I can successfully load a given image, but am unsure how to properly apply image adjustments that will stick.
Specifically I am trying to be able to trigger the following:
colorize (HSB adjustment)
invert colors
From the samples I could find online, it seems most code examples are for Objective-C, and I've been unable to get anything working in my current playground. It would seem from the documentation that I should be able to use filters (using Core Image), but that is where I get lost.
Can anyone point me to or show me a valid (simple) approach that accomplishes this in Swift?
Many thanks in advance!
*** EDIT ***
Here's the code I've got so far - working a bit better thanks to that link. However, I still run into a crash when trying to output the results (that line is commented out).
So far, all the examples I could find around the filtering code are Objective-C based.
import UIKit
var img = UIImage(named: "background.png")
var context = CIContext(options:nil)
var filter = CIFilter(name: "CIColorInvert");
filter.setValue(img, forKey: kCIInputImageKey)
//let newImg = filter.outputImage
Have you tried Google? "coreimage swift" gave me: http://www.raywenderlich.com/76285/beginning-core-image-swift
If this doesn't help, please post the code you've tried that didn't work.
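For what it's worth, one common cause of a crash at that point is handing the filter a UIImage where it expects a CIImage, and the output also has to be rendered back through the CIContext before the adjustment will stick. A minimal sketch of the invert case (the same pattern works for colorizing with filters such as CIHueAdjust or CIColorControls):
import UIKit
import CoreImage
let uiImage = UIImage(named: "background.png")!
let ciImage = CIImage(image: uiImage)!              // CIFilter expects a CIImage, not a UIImage
let context = CIContext(options: nil)
let filter = CIFilter(name: "CIColorInvert")!
filter.setValue(ciImage, forKey: kCIInputImageKey)
if let output = filter.outputImage,
   let cgImage = context.createCGImage(output, from: output.extent) {
    let result = UIImage(cgImage: cgImage)          // the adjusted image, ready to display or save
}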