I cannot for the life of me figure out how to create an SCNMatrix4 from a transform in Objective-C.
The Swift code I'm trying to use in Objective-C:
let affineTransform = frame.displayTransform(for: .portrait, viewportSize: sceneView.bounds.size)
let transform = SCNMatrix4(affineTransform)
faceGeometry.setValue(SCNMatrix4Invert(transform), forKey: "displayTransform")
I've got the first and third lines, but I can't find any way to create this SCNMatrix4 from the CGAffineTransform.
CGAffineTransform affine = [self.sceneView.session.currentFrame displayTransformForOrientation:UIInterfaceOrientationPortrait viewportSize:self.sceneView.bounds.size];
SCNMatrix4 trans = ??
[f setValue:SCNMatrix4Invert(trans) forKey:@"displayTransform"];
There is no SCNMatrix4Make; I tried simd_matrix4x4, but that didn't seem to work either.
Thank you
Edit:
The Swift code is from Apple's example project "ARKitFaceExample"; here is the full code:
/*
See LICENSE folder for this sample’s licensing information.
Abstract:
Demonstrates using video imagery to texture and modify the face mesh.
*/
import ARKit
import SceneKit
/// - Tag: VideoTexturedFace
class VideoTexturedFace: TexturedFace {
override func renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode? {
guard let sceneView = renderer as? ARSCNView,
let frame = sceneView.session.currentFrame,
anchor is ARFaceAnchor
else { return nil }
#if targetEnvironment(simulator)
#error("ARKit is not supported in iOS Simulator. Connect a physical iOS device and select it as your Xcode run destination, or select Generic iOS Device as a build-only destination.")
#else
// Show video texture as the diffuse material and disable lighting.
let faceGeometry = ARSCNFaceGeometry(device: sceneView.device!, fillMesh: true)!
let material = faceGeometry.firstMaterial!
material.diffuse.contents = sceneView.scene.background.contents
material.lightingModel = .constant
guard let shaderURL = Bundle.main.url(forResource: "VideoTexturedFace", withExtension: "shader"),
let modifier = try? String(contentsOf: shaderURL)
else { fatalError("Can't load shader modifier from bundle.") }
faceGeometry.shaderModifiers = [ .geometry: modifier]
// Pass view-appropriate image transform to the shader modifier so
// that the mapped video lines up correctly with the background video.
let affineTransform = frame.displayTransform(for: .portrait, viewportSize: sceneView.bounds.size)
let transform = SCNMatrix4(affineTransform)
faceGeometry.setValue(SCNMatrix4Invert(transform), forKey: "displayTransform")
contentNode = SCNNode(geometry: faceGeometry)
#endif
return contentNode
}
}
In case anyone ever needs this, here is the extension I was missing
extension SCNMatrix4 {
    /**
     Create a 4x4 matrix from a CGAffineTransform, which represents a 3x3 matrix
     but stores only the 6 elements needed for 2D affine transformations.

         [ a  b  0 ]        [ a  b  0  0 ]
         [ c  d  0 ]   ->   [ c  d  0  0 ]
         [ tx ty 1 ]        [ 0  0  1  0 ]
                            [ tx ty 0  1 ]

     Used for transforming texture coordinates in the shader modifier.
     (Needs to be SCNMatrix4, not SIMD float4x4, for passing to shader modifier via KVC.)
     */
    init(_ affineTransform: CGAffineTransform) {
        self.init()
        m11 = Float(affineTransform.a)
        m12 = Float(affineTransform.b)
        m21 = Float(affineTransform.c)
        m22 = Float(affineTransform.d)
        m41 = Float(affineTransform.tx)
        m42 = Float(affineTransform.ty)
        m33 = 1
        m44 = 1
    }
}
To replicate the Swift extension for creating an SCNMatrix4 from a CGAffineTransform, you can implement the following function:
Some .h file:
extern SCNMatrix4 SCNMatrix4FromTransform(CGAffineTransform transform);
Some .m file:
SCNMatrix4 SCNMatrix4FromTransform(CGAffineTransform transform) {
    // Start from the identity matrix so the elements that aren't set below are well defined.
    SCNMatrix4 matrix = SCNMatrix4Identity;
    matrix.m11 = transform.a;
    matrix.m12 = transform.b;
    matrix.m21 = transform.c;
    matrix.m22 = transform.d;
    matrix.m41 = transform.tx;
    matrix.m42 = transform.ty;
    return matrix;
}
Then your code becomes:
CGAffineTransform affineTransform = [self.sceneView.session.currentFrame displayTransformForOrientation:UIInterfaceOrientationPortrait viewportSize:self.sceneView.bounds.size];
SCNMatrix4 transform = SCNMatrix4FromTransform(affineTransform);
[f setValue:[NSValue valueWithSCNMatrix4:SCNMatrix4Invert(transform)] forKey:@"displayTransform"];
Note the use of [NSValue valueWithSCNMatrix4:]. It wraps the struct in an object, which satisfies the key-value coding used to set the displayTransform shader argument.
Related
I'm using Apple's example https://developer.apple.com/library/ios/samplecode/AVCustomEdit/Introduction/Intro.html and have some issues with video transformation.
If a source asset has a preferredTransform other than identity, the output video has incorrectly rotated frames. This can be fixed if the AVMutableVideoComposition has no value in its customVideoCompositorClass property and the AVMutableVideoCompositionLayerInstruction's transform is set to asset.preferredTransform. But because I'm using a custom video compositor that adopts the AVVideoCompositing protocol, I can't use the standard video compositing instructions.
How can I pre-transform the input asset tracks before their CVPixelBuffers are fed into the Metal shaders? Or is there another way to fix this?
Fragment of original code:
func buildCompositionObjectsForPlayback(_ forPlayback: Bool, overwriteExistingObjects: Bool) {
// Proceed only if the composition objects have not already been created.
if self.composition != nil && !overwriteExistingObjects { return }
if self.videoComposition != nil && !overwriteExistingObjects { return }
guard !clips.isEmpty else { return }
// Use the naturalSize of the first video track.
let videoTracks = clips[0].tracks(withMediaType: AVMediaType.video)
let videoSize = videoTracks[0].naturalSize
let composition = AVMutableComposition()
composition.naturalSize = videoSize
/*
With transitions:
Place clips into alternating video & audio tracks in composition, overlapped by transitionDuration.
Set up the video composition to cycle between "pass through A", "transition from A to B", "pass through B".
*/
let videoComposition = AVMutableVideoComposition()
if self.transitionType == TransitionType.diagonalWipe.rawValue {
videoComposition.customVideoCompositorClass = APLDiagonalWipeCompositor.self
} else {
videoComposition.customVideoCompositorClass = APLCrossDissolveCompositor.self
}
// Every videoComposition needs these properties to be set:
videoComposition.frameDuration = CMTimeMake(1, 30) // 30 fps.
videoComposition.renderSize = videoSize
buildTransitionComposition(composition, andVideoComposition: videoComposition)
self.composition = composition
self.videoComposition = videoComposition
}
UPDATE:
I made a workaround for the transformation like this:
private func makeTransformedPixelBuffer(fromBuffer buffer: CVPixelBuffer, withTransform transform: CGAffineTransform) -> CVPixelBuffer? {
guard let newBuffer = renderContext?.newPixelBuffer() else {
return nil
}
// Correct transformation example I took from https://stackoverflow.com/questions/29967700/coreimage-coordinate-system
var preferredTransform = transform
preferredTransform.b *= -1
preferredTransform.c *= -1
var transformedImage = CIImage(cvPixelBuffer: buffer).transformed(by: preferredTransform)
preferredTransform = CGAffineTransform(translationX: -transformedImage.extent.origin.x, y: -transformedImage.extent.origin.y)
transformedImage = transformedImage.transformed(by: preferredTransform)
let filterContext = CIContext(mtlDevice: MTLCreateSystemDefaultDevice()!)
filterContext.render(transformedImage, to: newBuffer)
return newBuffer
}
But I'm wondering whether there is a more memory-efficient way that avoids creating new pixel buffers.
How can I pre-transform the input asset tracks before their CVPixelBuffers are fed into the Metal shaders?
The best way to achieve maximum performance is to transform your video frame directly in the shader. You just need to apply a rotation matrix in your vertex shader.
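For example, here is a minimal sketch of the CPU side (the helper name, buffer index, and usage comments are assumptions about your compositor, not part of the AVCustomEdit sample): build a float4x4 from the track's preferredTransform and pass it to the vertex shader, which multiplies each vertex position by it.
import AVFoundation
import simd
// Sketch: convert a track's preferredTransform (a CGAffineTransform) into a
// column-major float4x4 that a vertex shader can multiply vertex positions by.
// CGAffineTransform maps (x, y) to (a*x + c*y + tx, b*x + d*y + ty).
// For a quad in normalized device coordinates you usually only need the rotation
// part; the pixel-space translation (tx, ty) would have to be rescaled.
func makeVertexTransform(for track: AVAssetTrack) -> simd_float4x4 {
    let t = track.preferredTransform
    return simd_float4x4(columns: (
        SIMD4<Float>(Float(t.a),  Float(t.b),  0, 0),
        SIMD4<Float>(Float(t.c),  Float(t.d),  0, 0),
        SIMD4<Float>(0,           0,           1, 0),
        SIMD4<Float>(Float(t.tx), Float(t.ty), 0, 1)
    ))
}
// Hypothetical usage inside the compositor, before drawing the full-screen quad:
// var transform = makeVertexTransform(for: videoTrack)
// renderEncoder.setVertexBytes(&transform, length: MemoryLayout<simd_float4x4>.stride, index: 1)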
I want to reshape the face contour as shown in the video: https://www.dropbox.com/s/vsttylwgt25szha/IMG_6590.TRIM.MOV?dl=0 (sorry, unfortunately the video is about 11 MB in size).
I've just captured the face coordinates using the iOS Vision API:
// Facial landmarks are GREEN.
fileprivate func drawFeatures(onFaces faces: [VNFaceObservation], onImageWithBounds bounds: CGRect) {
CATransaction.begin()
for faceObservation in faces {
let faceBounds = boundingBox(forRegionOfInterest: faceObservation.boundingBox, withinImageBounds: bounds)
guard let landmarks = faceObservation.landmarks else {
continue
}
// Iterate through landmarks detected on the current face.
let landmarkLayer = CAShapeLayer()
let landmarkPath = CGMutablePath()
let affineTransform = CGAffineTransform(scaleX: faceBounds.size.width, y: faceBounds.size.height)
// Treat eyebrows and lines as open-ended regions when drawing paths.
let openLandmarkRegions: [VNFaceLandmarkRegion2D?] = [
//landmarks.leftEyebrow,
//landmarks.rightEyebrow,
landmarks.faceContour,
landmarks.noseCrest,
// landmarks.medianLine
]
// Draw eyes, lips, and nose as closed regions.
let closedLandmarkRegions = [
landmarks.nose
].compactMap { $0 } // Filter out missing regions.
// Draw paths for the open regions.
for openLandmarkRegion in openLandmarkRegions where openLandmarkRegion != nil {
landmarkPath.addPoints(in: openLandmarkRegion!,
applying: affineTransform,
closingWhenComplete: false)
}
// Draw paths for the closed regions.
for closedLandmarkRegion in closedLandmarkRegions {
landmarkPath.addPoints(in: closedLandmarkRegion ,
applying: affineTransform,
closingWhenComplete: true)
}
// Format the path's appearance: color, thickness, shadow.
landmarkLayer.path = landmarkPath
landmarkLayer.lineWidth = 1
landmarkLayer.strokeColor = UIColor.green.cgColor
landmarkLayer.fillColor = nil
landmarkLayer.shadowOpacity = 1.0
landmarkLayer.shadowRadius = 1
// Locate the path in the parent coordinate system.
landmarkLayer.anchorPoint = .zero
landmarkLayer.frame = faceBounds
landmarkLayer.transform = CATransform3DMakeScale(1, -1, 1)
pathLayer?.addSublayer(landmarkLayer)
}
CATransaction.commit()
}
How do I move forward from here? Can anyone guide me, please?
The following code places the node in front of the camera, but always at the center, 10 cm away from the camera position. I want to place the node 10 cm away in the z-direction but at the x and y coordinates of where I touch the screen. So touching different parts of the screen should result in a node being placed 10 cm in front of the camera, but at the x and y location of the touch, not always at the center.
var cameraRelativePosition = SCNVector3(0,0,-0.1)
let sphere = SCNNode()
sphere.geometry = SCNSphere(radius: 0.0025)
sphere.geometry?.firstMaterial?.diffuse.contents = UIColor.white
Service.addChildNode(sphere, toNode: self.sceneView.scene.rootNode,
inView: self.sceneView, cameraRelativePosition:
cameraRelativePosition)
Service.swift
class Service: NSObject {
static func addChildNode(_ node: SCNNode, toNode: SCNNode, inView:
ARSCNView, cameraRelativePosition: SCNVector3) {
guard let currentFrame = inView.session.currentFrame else { return }
let camera = currentFrame.camera
let transform = camera.transform
var translationMatrix = matrix_identity_float4x4
translationMatrix.columns.3.x = cameraRelativePosition.x
translationMatrix.columns.3.y = cameraRelativePosition.y
translationMatrix.columns.3.z = cameraRelativePosition.z
let modifiedMatrix = simd_mul(transform, translationMatrix)
node.simdTransform = modifiedMatrix
toNode.addChildNode(node)
}
}
The result should look exactly like this: https://justaline.withgoogle.com
We can use the unprojectPoint(_:) method of SCNSceneRenderer (SCNView and ARSCNView both conform to this protocol) to convert a point on the screen to a 3D point.
When tapping the screen we can calculate a ray this way:
func getRay(for point: CGPoint, in view: SCNSceneRenderer) -> SCNVector3 {
let farPoint = view.unprojectPoint(SCNVector3(Float(point.x), Float(point.y), 1))
let nearPoint = view.unprojectPoint(SCNVector3(Float(point.x), Float(point.y), 0))
let ray = SCNVector3Make(farPoint.x - nearPoint.x, farPoint.y - nearPoint.y, farPoint.z - nearPoint.z)
// Normalize the ray
let length = sqrt(ray.x*ray.x + ray.y*ray.y + ray.z*ray.z)
return SCNVector3Make(ray.x/length, ray.y/length, ray.z/length)
}
The ray has a length of 1, so by multiplying it by 0.1 and adding the camera location we get the point you were searching for.
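Putting it together, a minimal sketch (the placeNode helper and its parameters are illustrative, not part of the question's Service class):
import ARKit
// Sketch: place a node 10 cm along the ray from the camera through the tapped
// screen point, reusing the getRay(for:in:) helper above.
func placeNode(_ node: SCNNode, at screenPoint: CGPoint,
               in sceneView: ARSCNView, distance: Float = 0.1) {
    guard let cameraNode = sceneView.pointOfView else { return }
    let ray = getRay(for: screenPoint, in: sceneView)   // unit-length direction
    let cameraPosition = cameraNode.worldPosition
    node.position = SCNVector3(cameraPosition.x + ray.x * distance,
                               cameraPosition.y + ray.y * distance,
                               cameraPosition.z + ray.z * distance)
    sceneView.scene.rootNode.addChildNode(node)
}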
I'm actually trying to put a 3D object on a QR code with ARKit.
For that I use an AVCaptureDevice to detect a QR code and establish the area of the QR code, which gives me a CGRect.
Then I do a hit test on every point of the CGRect to get the average 3D coordinates, like so:
var cpts: Float = 0   // hit-test result count (declared here so the fragment is complete)
positionGiven = SCNVector3(0, 0, 0)
for column in Int(qrZone.origin.x)...2*Int(qrZone.origin.x + qrZone.width) {
for row in Int(qrZone.origin.y)...2*Int(qrZone.origin.y + qrZone.height) {
for result in sceneView.hitTest(CGPoint(x: CGFloat(column)/2,y:CGFloat(row)/2), types: [.existingPlaneUsingExtent,.featurePoint]) {
positionGiven.x+=result.worldTransform.columns.3.x
positionGiven.y+=result.worldTransform.columns.3.y
positionGiven.z+=result.worldTransform.columns.3.z
cpts += 1
}
}
}
positionGiven.x=positionGiven.x/cpts
positionGiven.y=positionGiven.y/cpts
positionGiven.z=positionGiven.z/cpts
But the hit test doesn't return any results and freezes the camera, whereas a hit test from a touch on the screen works.
Do you have any idea why it's not working?
Do you have another idea that could help me achieve what I want to do?
I already thought about a 3D translation with CoreMotion, which can give me the tilt of the device, but that seems really tedious.
I also heard about ARWorldAlignmentCamera, which can lock the scene coordinate system to match the orientation of the camera, but I don't know how to use it!
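For reference, ARWorldAlignmentCamera corresponds to the worldAlignment property of the session configuration; a minimal sketch of enabling it (assuming sceneView is the ARSCNView):
// Sketch: run the session with the scene coordinate system locked to the
// camera orientation (ARWorldAlignmentCamera in Objective-C).
let configuration = ARWorldTrackingConfiguration()
configuration.worldAlignment = .camera
sceneView.session.run(configuration)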
Edit: I tried moving my 3D object every time I touch the screen and the hit test succeeds, and it's pretty accurate! I really don't understand why a hit test over an area of pixels doesn't work...
Edit 2: Here is the hit-test code that works with 2 to 5 touches on the screen:
@objc func touch(sender: UITapGestureRecognizer) {
for result in sceneView.hitTest(CGPoint(x: sender.location(in: view).x,y: sender.location(in: view).y), types: [.existingPlaneUsingExtent,.featurePoint]) {
//Pop up message for testing
alert("\(sender.location(in: view))", message: "\(result.worldTransform.columns.3)")
//Moving the 3D Object to the new coordinates
let objectList = sceneView.scene.rootNode.childNodes
for object : SCNNode in objectList {
object.removeFromParentNode()
}
addObject(SCNVector3(result.worldTransform.columns.3.x,result.worldTransform.columns.3.y,result.worldTransform.columns.3.z))
}
}
Edit 3:
I managed to partially resolve my problem.
I take the transform matrix of the camera (session.currentFrame.camera.transform) so that the object is placed in front of the camera.
Then I apply a translation on (x, y) based on the position of the CGRect, as sketched below.
However, I can't translate along the z-axis because I don't have enough information, and I will probably need an estimate of the z coordinate like the hit test provides.
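A rough sketch of that partial workaround (the helper name and offsets are illustrative; zDistance has to be guessed, which is exactly the depth a hit test would normally provide):
import ARKit
// Sketch: place objectNode in front of the camera by composing the camera
// transform with a translation derived from the QR code's CGRect.
func placeObject(_ objectNode: SCNNode, xOffset: Float, yOffset: Float,
                 zDistance: Float, in sceneView: ARSCNView) {
    guard let frame = sceneView.session.currentFrame else { return }
    var translation = matrix_identity_float4x4
    translation.columns.3.x = xOffset
    translation.columns.3.y = yOffset
    translation.columns.3.z = -zDistance   // negative z is in front of the camera
    objectNode.simdTransform = simd_mul(frame.camera.transform, translation)
}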
Thanks in advance! :)
You could use Apple's Vision API to detect the QR code and place an anchor.
To start detecting QR codes, use:
var qrRequests = [VNRequest]()
var detectedDataAnchor: ARAnchor?
var processing = false
func startQrCodeDetection() {
// Create a Barcode Detection Request
let request = VNDetectBarcodesRequest(completionHandler: self.requestHandler)
// Set it to recognize QR code only
request.symbologies = [.QR]
self.qrRequests = [request]
}
In the ARSession delegate's session(_:didUpdate:) method:
public func session(_ session: ARSession, didUpdate frame: ARFrame) {
DispatchQueue.global(qos: .userInitiated).async {
do {
if self.processing {
return
}
self.processing = true
// Create a request handler using the captured image from the ARFrame
let imageRequestHandler = VNImageRequestHandler(cvPixelBuffer: frame.capturedImage,
options: [:])
// Process the request
try imageRequestHandler.perform(self.qrRequests)
} catch {
    // If the Vision request throws, clear the flag so later frames can still be processed.
    self.processing = false
}
}
}
Handle the Vision QR request and trigger the hit test
func requestHandler(request: VNRequest, error: Error?) {
// Get the first result out of the results, if there are any
if let results = request.results, let result = results.first as? VNBarcodeObservation {
guard let payload = result.payloadStringValue else {return}
// Get the bounding box for the bar code and find the center
var rect = result.boundingBox
// Flip coordinates
rect = rect.applying(CGAffineTransform(scaleX: 1, y: -1))
rect = rect.applying(CGAffineTransform(translationX: 0, y: 1))
// Get center
let center = CGPoint(x: rect.midX, y: rect.midY)
DispatchQueue.main.async {
self.hitTestQrCode(center: center)
self.processing = false
}
} else {
self.processing = false
}
}
func hitTestQrCode(center: CGPoint) {
if let hitTestResults = self.latestFrame?.hitTest(center, types: [.featurePoint] ),
let hitTestResult = hitTestResults.first {
if let detectedDataAnchor = self.detectedDataAnchor,
let node = self.sceneView.node(for: detectedDataAnchor) {
let previousQrPosition = node.position
node.transform = SCNMatrix4(hitTestResult.worldTransform)
} else {
// Create an anchor. The node will be created in delegate methods
self.detectedDataAnchor = ARAnchor(transform: hitTestResult.worldTransform)
self.sceneView.session.add(anchor: self.detectedDataAnchor!)
}
}
}
Then handle adding the node when the anchor is added.
func renderer(_ renderer: SCNSceneRenderer, nodeFor anchor: ARAnchor) -> SCNNode? {
// If this is our anchor, create a node
if self.detectedDataAnchor?.identifier == anchor.identifier {
let sphere = SCNSphere(radius: 1.0)
sphere.firstMaterial?.diffuse.contents = UIColor.red
let sphereNode = SCNNode(geometry: sphere)
sphereNode.transform = SCNMatrix4(anchor.transform)
return sphereNode
}
return nil
}
Source
I am trying to display a texture loaded with MTKTextureLoader. I have a buffer that stores my vertex coordinates (I build two triangles to make a rectangle in which to display my image), and another buffer that stores the texture coordinates of each vertex.
I made a sampler to sample data from my texture; the problem is that I am getting nothing (a black image).
I've included the Swift code just in case my error comes from there, but I think it comes from the Metal code. If you look at my fragment shader, you will see two commented-out lines; they show something that I can't understand:
If I give the coordinates directly to the sample function, it works (it colours the triangles with the colour that corresponds to the given coordinates).
If I give the coordinates I pass to the sampler as colour components, it also displays something coherent (triangles coloured as a function of the given coordinates).
So it doesn't seem to come from the sampler, nor from the coordinates; that's what I don't understand.
Here is my Swift code:
import Cocoa
import MetalKit
import Metal
class ViewController: NSViewController, MTKViewDelegate {
var device:MTLDevice!
var texture:MTLTexture!
var commandQueue:MTLCommandQueue!
var vertexBuffer:MTLBuffer!
var vertexCoordinates:[Float] = [
-1, 1, 0, 1,
-1, -1, 0, 1,
1, -1, 0, 1,
1,-1,0,1,
1,1,0,1,
-1,1,0,1,
]
var vertexUVBuffer:MTLBuffer!
var vertexUVCoordinates:[Float] = [
0,1,
0,0,
1,0,
1,0,
1,1,
0,1
]
var library:MTLLibrary!
var defaultPipelineState:MTLRenderPipelineState!
var samplerState:MTLSamplerState!
#IBOutlet var metalView: MTKView!
override func viewDidLoad() {
super.viewDidLoad()
device = MTLCreateSystemDefaultDevice()
let textureLoader = MTKTextureLoader(device: device)
metalView.device = device
metalView.delegate = self
metalView.preferredFramesPerSecond = 0
metalView.sampleCount = 4
texture = try! textureLoader.newTextureWithContentsOfURL(NSBundle.mainBundle().URLForResource("abeilles", withExtension: "jpg")!, options: [MTKTextureLoaderOptionAllocateMipmaps:NSNumber(bool: true)])
commandQueue = device.newCommandQueue()
library = device.newDefaultLibrary()
vertexBuffer = device.newBufferWithBytes(&vertexCoordinates, length: sizeof(Float)*vertexCoordinates.count, options: [])
vertexUVBuffer = device.newBufferWithBytes(&vertexUVCoordinates, length: sizeof(Float)*vertexUVCoordinates.count, options: [])
let renderPipelineDescriptor = MTLRenderPipelineDescriptor()
renderPipelineDescriptor.vertexFunction = library.newFunctionWithName("passTroughVertex")
renderPipelineDescriptor.fragmentFunction = library.newFunctionWithName("myFragmentShader")
renderPipelineDescriptor.sampleCount = metalView.sampleCount
renderPipelineDescriptor.colorAttachments[0].pixelFormat = metalView.colorPixelFormat
defaultPipelineState = try! device.newRenderPipelineStateWithDescriptor(renderPipelineDescriptor)
let samplerDescriptor = MTLSamplerDescriptor()
samplerDescriptor.minFilter = .Linear
samplerDescriptor.magFilter = .Linear
samplerDescriptor.mipFilter = .Linear
samplerDescriptor.sAddressMode = .ClampToEdge
samplerDescriptor.rAddressMode = .ClampToEdge
samplerDescriptor.tAddressMode = .ClampToEdge
samplerDescriptor.normalizedCoordinates = true
samplerState = device.newSamplerStateWithDescriptor(samplerDescriptor)
metalView.draw()
// Do any additional setup after loading the view.
}
func drawInMTKView(view: MTKView) {
let commandBuffer = commandQueue.commandBuffer()
let commandEncoder = commandBuffer.renderCommandEncoderWithDescriptor(metalView.currentRenderPassDescriptor!)
commandEncoder.setRenderPipelineState(defaultPipelineState)
commandEncoder.setVertexBuffer(vertexBuffer, offset: 0, atIndex: 0)
commandEncoder.setVertexBuffer(vertexUVBuffer, offset:0, atIndex:1)
commandEncoder.setFragmentSamplerState(samplerState, atIndex: 0)
commandEncoder.setFragmentTexture(texture, atIndex: 0)
commandEncoder.drawPrimitives(MTLPrimitiveType.Triangle, vertexStart: 0, vertexCount: 6, instanceCount: 1)
commandEncoder.endEncoding()
commandBuffer.presentDrawable(metalView.currentDrawable!)
commandBuffer.commit()
}
func mtkView(view: MTKView, drawableSizeWillChange size: CGSize) {
// view.draw()
}
override var representedObject: AnyObject? {
didSet {
// Update the view, if already loaded.
}
}
}
Here is my Metal code:
#include <metal_stdlib>
using namespace metal;
struct VertexOut {
float4 position [[position]];
float2 texCoord;
};
vertex VertexOut passTroughVertex(uint vid [[ vertex_id]],
constant float4 *vertexPosition [[ buffer(0) ]],
constant float2 *vertexUVPos [[ buffer(1)]]) {
VertexOut vertexOut;
vertexOut.position = vertexPosition[vid];
vertexOut.texCoord = vertexUVPos[vid];
return vertexOut;
}
fragment float4 myFragmentShader(VertexOut inFrag [[stage_in]],
texture2d<float> myTexture [[ texture(0)]],
sampler mySampler [[ sampler(0) ]]) {
float4 myColor = myTexture.sample(mySampler,inFrag.texCoord);
// myColor = myTexture.sample(mySampler,float2(1));
// myColor = float4(inFrag.texCoord.r,inFrag.texCoord.g,0,1);
return myColor;
}
You're allocating space for mipmaps but not actually generating them. The docs say that when specifying MTKTextureLoaderOptionAllocateMipmaps, "a full set of mipmap levels are allocated for the texture when the texture is loaded, and it is your responsibility to generate the mipmap contents."
Your sampler configuration causes the resulting texture to be sampled at the base mipmap level as long as the texture is small relative to the rect on the screen, but if you feed in a larger texture, it starts sampling the smaller levels of the mipmap stack, picking up all-black pixels, which are then blended together to either darken the image or cause the output to be entirely black.
You should use the -generateMipmapsForTexture: method on a MTLBlitCommandEncoder to generate a complete set of mipmaps once your texture is loaded.
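For example, right after loading the texture and creating the command queue, something like this would fill the allocated mipmap levels (a sketch that reuses the question's texture and commandQueue properties and the same pre-Swift 3 Metal API names as the code above):
// Sketch: encode a blit pass once to generate the contents of every mip level.
let mipmapCommandBuffer = commandQueue.commandBuffer()
let blitEncoder = mipmapCommandBuffer.blitCommandEncoder()
blitEncoder.generateMipmapsForTexture(texture)
blitEncoder.endEncoding()
mipmapCommandBuffer.commit()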