What is the best way, if any, to use Apple's new ARKit with multiple users/devices?
It seems that each device gets its own scene understanding individually. My best guess so far is to use the raw feature point positions and try to match them across devices to glue together the different points of view, since ARKit doesn't offer any absolute frame of reference.
===Edit1, Things I've tried===
1) Feature points
I've played around with the exposed raw feature points and I'm now convinced that, in their current state, they are a dead end:
they are not raw feature points: they only expose positions, with none of the attributes typically found in tracked feature points
their instantiation doesn't carry over from frame to frame, nor are the positions exactly the same
the reported feature points often change a lot while the camera input is almost unchanged, with many appearing or disappearing at once.
So overall I think it's unreasonable to try to use them in any meaningful way: good point matching isn't possible within one device, let alone across several.
An alternative would be to implement my own feature point detection and matching, but that would be replacing ARKit more than leveraging it.
2) QR code
As #Rickster suggested, I've also tried detecting an easily identifiable object like a QR code and deriving the relative change of reference frame from that fixed point (see this question). It's a bit difficult and required me to use OpenCV to estimate the camera pose. More importantly, it's very limiting.
As some newer answers have added, multiuser AR is a headline feature of ARKit 2 (aka ARKit on iOS 12). The WWDC18 talk on ARKit 2 has a nice overview, and Apple has two developer sample code projects to help you get started: a basic example that just gets 2+ devices into a shared experience, and SwiftShot, a real multiplayer game built for AR.
The major points:
ARWorldMap wraps up everything ARKit knows about the local environment into a serializable object, so you can save it for later or send it to another device. In the latter case, "relocalizing" to a world map saved by another device in the same local environment gives both devices the same frame of reference (world coordinate system).
Use the networking technology of your choice to send the ARWorldMap between devices: AirDrop, cloud shares, carrier pigeon, etc. all work, but Apple's Multipeer Connectivity framework is one good, easy, and secure option, so it's what Apple uses in their example projects.
All of this gives you only the basis for creating a shared experience: multiple copies of your app on multiple devices, all using a world coordinate system that lines up with the same real-world environment. That's all you need to get multiple users experiencing the same static AR content, but if you want them to interact in AR, you'll need to use your favorite networking technology some more.
Apple's basic multiuser AR demo shows encoding an ARAnchor and sending it to peers, so that one user can tap to place a 3D model in the world and all others can see it. The SwiftShot game example builds a whole networking protocol so that all users get the same gameplay actions (like firing slingshots at each other) and synchronized physics results (like blocks falling down after being struck). Both use Multipeer Connectivity.
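For illustration, here's a minimal sketch of the anchor round trip described above. It is not Apple's sample code: the function names (encodeAnchor, decodeAnchor, share, receive) are made up for this example, and it assumes you already have a connected MCSession and a running ARSession that has relocalized to the shared world map.

```swift
import ARKit
import MultipeerConnectivity

/// Archive an anchor so it can travel over any byte-oriented transport.
func encodeAnchor(_ anchor: ARAnchor) throws -> Data {
    return try NSKeyedArchiver.archivedData(withRootObject: anchor,
                                            requiringSecureCoding: true)
}

/// Rebuild an anchor from bytes received from a peer (nil if the data holds no anchor).
func decodeAnchor(from data: Data) throws -> ARAnchor? {
    return try NSKeyedUnarchiver.unarchivedObject(ofClass: ARAnchor.self, from: data)
}

/// Send a newly placed anchor to every connected peer (hypothetical helper).
func share(_ anchor: ARAnchor, over mcSession: MCSession) throws {
    // Nothing to do when nobody is connected yet.
    guard !mcSession.connectedPeers.isEmpty else { return }
    let data = try encodeAnchor(anchor)
    try mcSession.send(data, toPeers: mcSession.connectedPeers, with: .reliable)
}

/// Add an anchor received from a peer to the local session (hypothetical helper).
func receive(_ data: Data, into arSession: ARSession) throws {
    if let anchor = try decodeAnchor(from: data) {
        arSession.add(anchor: anchor)
    }
}
```

You'd call receive from your MCSessionDelegate's data callback. The anchor's transform only means the same thing on both devices once they share a world coordinate system, which is what the world map exchange is for.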
(BTW, the second and third points above are where you get the "2 to 6" figure from #andy's answer — there's no limit on the ARKit side, because ARKit has no idea how many people may have received the world map you saved. However, Multipeer Connectivity has an 8 peer limit. And whatever game / app / experience you build on top of this may have latency / performance scaling issues as you add more peers, but that depends on your technology and design.)
Original answer below for historical interest...
This seems to be an area of active research in the iOS developer community — I met several teams trying to figure it out at WWDC last week, and nobody had even begun to crack it yet. So I'm not sure there's a "best way" yet, if even a feasible way at all.
Feature points are positioned relative to the session, and aren't individually identified, so I'd imagine correlating them between multiple users would be tricky.
The session alignment mode gravityAndHeading might prove helpful: that fixes all the directions to a (presumed/estimated to be) absolute reference frame, but positions are still relative to where the device was when the session started. If you could find a way to relate that position to something absolute — a lat/long, or an iBeacon maybe — and do so reliably, with enough precision... Well, then you'd not only have a reference frame that could be shared by multiple users, you'd also have the main ingredients for location based AR. (You know, like a floating virtual arrow that says turn right there to get to Gate A113 at the airport, or whatever.)
Another avenue I've heard discussed is image analysis. If you could place some real markers — easily machine recognizable things like QR codes — in view of multiple users, you could maybe use some form of object recognition or tracking (a ML model, perhaps?) to precisely identify the markers' positions and orientations relative to each user, and work back from there to calculate a shared frame of reference. Dunno how feasible that might be. (But if you go that route, or similar, note that ARKit exposes a pixel buffer for each captured camera frame.)
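To make the "work back from there" step concrete, here's a rough sketch of the math, assuming each device has somehow measured the pose of the same physical marker in its own session's world space. The detection itself is not shown, and the names transformFromBToA and translation are invented for this example.

```swift
import simd

/// Transform that maps coordinates in device B's world space into device A's
/// world space, given the pose of the same physical marker in each space.
func transformFromBToA(markerInA: simd_float4x4,
                       markerInB: simd_float4x4) -> simd_float4x4 {
    // p_A = markerInA * (markerInB^-1 * p_B):
    // first express the point relative to the marker, then place it in A's world.
    return markerInA * markerInB.inverse
}

/// Helper for the example only: a pure translation matrix.
func translation(_ x: Float, _ y: Float, _ z: Float) -> simd_float4x4 {
    var m = matrix_identity_float4x4
    m.columns.3 = SIMD4<Float>(x, y, z, 1)
    return m
}

// Device A sees the marker 1 m along x; device B sees it 2 m along z.
let bToA = transformFromBToA(markerInA: translation(1, 0, 0),
                             markerInB: translation(0, 0, 2))

// The marker's origin as B sees it, re-expressed in A's world space.
let pointInA = bToA * SIMD4<Float>(0, 0, 2, 1)
```

In practice the measured poses would include rotation and noise, so the quality of the shared frame depends entirely on how precisely each device can locate the marker.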
Good luck!
Now that ARKit 2.0 has been released at WWDC 2018, it's possible to make games for 2 to 6 users.
For this, you need to use the ARWorldMap class. By saving world maps and using them to start new sessions, your iOS application can now add new Augmented Reality capabilities: multiuser and persistent AR experiences.
AR Multiuser experiences. You can now create a shared frame of reference by sending archived ARWorldMap objects to a nearby iPhone or iPad. With several devices simultaneously tracking the same world map, you can build an experience where all users (up to 6) can share and see the same virtual 3D content (use Pixar's USDZ file format for 3D in Xcode 10+ and iOS 12+).
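// Sending device: capture the current world map.
session.getCurrentWorldMap { worldMap, error in
    guard let worldMap = worldMap else {
        showAlert(error)
        return
    }
    // Archive worldMap here (e.g. with NSKeyedArchiver) and send it to the other device.
}

// Receiving device: worldMap is the ARWorldMap unarchived from the received data.
let configuration = ARWorldTrackingConfiguration()
configuration.initialWorldMap = worldMap
session.run(configuration)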
AR Persistent experiences. If you save a world map and your iOS application then becomes inactive, you can easily restore it on the next launch of the app, as long as you are in the same physical environment. You can use ARAnchors from the resumed world map to place the same virtual 3D content (in USDZ or DAE format) at the same positions as in the previously saved session.
These are not bulletproof answers, more like workarounds, but maybe you'll find them helpful.
All assume the players are in the same place.
DIY: ARKit sets up its world coordinate system shortly after the AR session has been started. So if you can have all players, one after another, place and align their devices at the same physical location and start the session there, there you go. Imagine the inside edges of an L square ruler fixed to whatever is available. Or any flat surface with a hole: hold the phone against the surface with the camera looking through the hole, then (re)init the session.
Medium: Save the players from aligning their phones manually; instead, detect a real-world marker with image analysis, just like #Rickster described.
Involved: Train a Core ML model to recognize iPhones and iPads and their camera locations, like it's done with human faces and eyes. Aggregate the data on a server, then turn off ML to save power. Note: make sure your model is cover-proof. :)
I'm in the process of updating my game controller framework (https://github.com/robreuss/VirtualGameController) to support a shared controller capability, so that all devices receive input from the control elements on the screens of all devices. The purpose of this enhancement is to support ARKit-based multiplayer functionality. I'm assuming developers will use the first approach mentioned by diviaki, where the general positioning of the virtual space is defined by starting the session on each device from a common point in physical space, a shared reference; specifically, I have in mind players being on opposite sides of a table. All the devices would launch the game at the same time and use a common coordinate space relative to physical size, and by using the inputs from all the controllers, the game would theoretically remain in sync on all devices. Still testing. The obvious potential problem is that latency or a disruption in the network makes the sync fall apart, and it would be difficult to recover except by restarting the game. The approach and framework may work fairly well for some types of games (for example, straightforward arcade-style games) but certainly not for many others (for example, any game with significant randomness that cannot be coordinated across devices).
This is a hugely difficult problem - the most prominent startup that is working on it is 6D.ai.
"Multiplayer AR" is the same problem as persistent SLAM, where you need to position yourself in a map that you may not have built yourself. It is the problem that most self driving car companies are actively working on.
I'm doing research that requires a camera that is automated, but it also has to coordinate with the rotation of a filter wheel and take a series of images relatively quickly (4 images in less than 2 seconds). I'd like to do this by writing a Matlab script to control everything and handle incoming data.
I know there are scientific cameras out there that can do this job and have very good SDKs, but they are also very expensive if they have the sensor size that I need (APS-C or larger). Using a simple Sony mirrorless camera would work perfectly for my needs as long as I can control it.
I'd like to use Matlab or LabView to automate the data acquisition, but I'm not sure what is possible with this API Beta SDK. My understanding is that it is designed to allow the user to create a stand-alone app, but not to integrate camera commands into a programming environment like Matlab. I know there are ways to call an external application from within Matlab, but I've also read one person's account of trying this indirect method and it sounds like it takes a long time to trigger the camera this way (five seconds or more for a single image). That would be too slow.
Does the SDK allow camera control directly from a program like Matlab?
"My understanding is that it is designed to allow the user to create a stand-alone app, but not to integrate camera commands into a programming environment like Matlab."
Don't trust marketing statements; that's just how they advertise their SDK. If you take a closer look at the documentation, you will realize that your camera runs a server which accepts JSON-RPC commands over HTTP. I would take one of the existing examples for Android (Java) and adapt it to run on your operating system; you can call Java code directly from your Matlab console.
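To show what such a command looks like on the wire, here's a small sketch that only builds the HTTP request. Several things here are assumptions to check against the API reference for your camera model: the endpoint URL is a placeholder (the camera advertises its real one), and the method name actTakePicture and the "version" field are written as I recall them from Sony's Camera Remote API documentation. It's in Swift purely for illustration; the same JSON body can be posted from any environment that can make an HTTP POST.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on non-Apple platforms
#endif

/// Build an HTTP POST carrying a single JSON-RPC call (hypothetical helper).
func makeCameraRequest(endpoint: URL,
                       method: String,
                       params: [Any] = [],
                       id: Int = 1) throws -> URLRequest {
    // JSON-RPC body: method name, positional params, request id, API version.
    let body: [String: Any] = [
        "method": method,
        "params": params,
        "id": id,
        "version": "1.0"
    ]
    var request = URLRequest(url: endpoint)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONSerialization.data(withJSONObject: body)
    return request
}

// Placeholder endpoint: replace with the URL your camera actually advertises.
let endpoint = URL(string: "http://192.168.122.1:8080/sony/camera")!
let request = try! makeCameraRequest(endpoint: endpoint, method: "actTakePicture")
```

Sending the request (for example with URLSession) and parsing the JSON reply are left out. The round-trip time of each call is what limits how quickly you can trigger the camera.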
I've had great success communicating between MatLab and a Sony QX1 (the 'webwrite' function is your friend!).
That said, you will definitely struggle to implement anything like precise triggering. The call-response times vary greatly (roughly 5 seconds, give or take 2).
You might be able to get away with shooting video and then pulling the relevant frames out of the sequence?
According to Sensiya (http://www.sensiya.com/), their SDK can detect activities like walking, running, sitting, driving, etc.
I guess acceleration data can be fed to a classifier to detect running and walking.
But sitting and driving look much the same to an accelerometer. What other technique do they use to distinguish driving from sitting? Does anyone have any insight?
Many thanks
For full disclosure, I work at Sensiya. Many algorithms that recognize the device user's activity rely mainly on analysis of the accelerometer data, as you mentioned, but if you want to fine-tune and expand the types of activities you track, I suggest using the device's other sensors, like proximity, magnetic field, etc., or just use our tools ;)
For the specific driving and still recognition technique:
Differentiating the still and driving states is a tough task. A simple solution would be to recognize that the device is in a still state while its GPS location changes, although this solution is not efficient in terms of battery life. Our driving recognition tries to save battery life during this kind of recognition, and we succeeded in finding a slight difference between the device's perfectly still state and its still-while-driving state in the real-time data you can collect from the device.
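The simple GPS-based solution mentioned above could be sketched roughly like this. This is not Sensiya's method, only the naive battery-hungry approach; the feature (variance of accelerometer magnitude) and both thresholds are made-up values for illustration.

```swift
/// Coarse activity classes for this example.
enum Activity {
    case still    // device at rest and not changing location
    case moving   // device itself is being shaken or carried (walking, running, ...)
    case driving  // device at rest relative to the user, but location changes quickly
}

/// Naive rule: "accelerometer says still, but GPS says we are travelling" means driving.
/// - accelerationVariance: variance of accelerometer magnitude over a short window
///   (hypothetical feature, arbitrary units).
/// - gpsSpeed: speed in m/s derived from successive GPS fixes.
/// - stillVarianceLimit, drivingSpeedLimit: illustrative thresholds, not tuned values.
func classify(accelerationVariance: Double,
              gpsSpeed: Double,
              stillVarianceLimit: Double = 0.05,
              drivingSpeedLimit: Double = 5.0) -> Activity {
    // A noisy accelerometer signal means the device itself is in motion.
    guard accelerationVariance < stillVarianceLimit else { return .moving }
    // The device is calm: let the GPS speed decide between sitting and driving.
    return gpsSpeed > drivingSpeedLimit ? .driving : .still
}

let inCar = classify(accelerationVariance: 0.01, gpsSpeed: 15.0)   // calm device, fast GPS
let atDesk = classify(accelerationVariance: 0.01, gpsSpeed: 0.0)   // calm device, no travel
```

Keeping the GPS on to get gpsSpeed is exactly the battery cost mentioned above, so in practice you'd sample it sparingly.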
This is good material to start with:
dialnet.unirioja.es/descarga/articulo/3954593.pdf
In order to do usability testing, I'd like to record an iPhone's display along with every user action. I can't modify the application itself; however, jailbreaking the phone wouldn't be a problem.
Ideally I'd like to get a full resolution video of the screen display with an overlay showing touch events on top of it.
For now, the best solution I've found is to use a video-out cable and record its output, but with this solution I'd need an external camera to capture what the user was doing, and it wouldn't be very precise.
Other ideas?
The Display Recorder application, found in the BigBoss repo (Cydia), works very well for this.
I have tried MirrorOp (requires a jailbreak) and AirSquirrels' Reflector (no jailbreak required) for usability testing. Both work very well, but neither captures touch feedback. You can use a second camera or a "hug the notebook" approach.
A coworker displayed the route he uses to commute to/from work on Google Earth but won't tell me how he did it. I have a laptop with GE installed.
Can you tell me how to do this? I guess I need something to collect the coordinates and then create some sort of track, but any pointers would be helpful.
Also, can this be done in real time? In other words, can I update my location on GE while driving?
You can drag and drop xml-files with coordinates onto GE. These files can be created with most GPS software/systems.
http://en.wikipedia.org/wiki/GPS_eXchange_Format
I just figured out how to do this. You will need a USB GPS Receiver. I also got a program called "Earth-Bridge" which allows generic USB GPS units to send their data in realtime to Google Earth.
Just install Earth Bridge, and then Install your Drivers for the USB receiver, and then plug in your Receiver and start Earth Bridge. The GPS will take about 45-60 seconds the first time to get a lock on your position but after that it is pretty good at staying up to date.
Since the GPS uses an internal serial-to-USB converter, you have to deal with a virtual COM port. Just make sure you check the instructions for the USB receiver and configure Earth Bridge to use the right COM port, baud rate, etc.
Sometimes, the GPS shows me out in the ocean (or somewhere completely different from where I am) and I wait about a minute for it to lock on but it won't get a lock. Then there is a button on Earth Bridge "Reload in Google Earth" and it will usually fix it right away.
A good GPS receiver (the one I bought and love) is here on Amazon for about 38USD.
Google Earth speaks KML, or Keyhole Markup Language. It is nothing more than an XML file with a specific schema.
From Wikipedia: http://en.wikipedia.org/wiki/Keyhole_Markup_Language
You don't need an XML package to build it. You can just write it out to a text file if you want to. You will still obviously need a GPS to pull the coordinates from.
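As a rough sketch of the "just write it out as text" idea, here's how a track could be turned into a minimal KML document. The Fix type, the makeKML function, and the sample coordinates are invented for this example, and the track name is not XML-escaped, so keep it plain.

```swift
import Foundation

/// One GPS fix in decimal degrees (hypothetical type for this example).
struct Fix {
    let latitude: Double
    let longitude: Double
}

/// Build a minimal KML document containing one LineString track.
func makeKML(trackName: String, fixes: [Fix]) -> String {
    // KML coordinate tuples are "longitude,latitude,altitude", separated by spaces.
    let coordinates = fixes
        .map { "\($0.longitude),\($0.latitude),0" }
        .joined(separator: " ")
    return """
    <?xml version="1.0" encoding="UTF-8"?>
    <kml xmlns="http://www.opengis.net/kml/2.2">
      <Document>
        <Placemark>
          <name>\(trackName)</name>
          <LineString>
            <coordinates>\(coordinates)</coordinates>
          </LineString>
        </Placemark>
      </Document>
    </kml>
    """
}

// Two made-up fixes, just to show the output shape.
let kml = makeKML(trackName: "Commute", fixes: [
    Fix(latitude: 37.33, longitude: -122.03),
    Fix(latitude: 37.34, longitude: -122.01)
])
```

Save the resulting string to a file ending in .kml and open it in Google Earth. Note that longitude comes before latitude, which is an easy thing to get backwards.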
You could even go so far as to write a BlackBerry app that uses its on-board GPS to do this. I've been contemplating doing this for a while to track my walks. My plan is to then automatically upload the track log to the web, where I can view it later or my wife can view it in real time.
This is an old post, but I have been doing this with the MotionX-GPS iPhone app for years. You record and save a track in the app, and then email it to yourself. The email has a .KMZ attachment, which can be opened directly in Google Earth to display your track in glorious detail. I have a friend who has tracked all of his flight training with this. It's way cool!