I'm currently developing an application with the initial goal of obtaining, in real time, a 3D model of the environment "seen" by a Kinect device. This information would be later on used for projection mapping but that's not an issue, for the moment.
There are a couple of challenges to overcome, namely the fact that the Kinect will be mounted on a mobile platform (robot) and the model generation has to be in real-time (or close to it).
After researching this topic at length, I came up with several possible (?) architectures:
1) Use the depth data obtained from Kinect, convert it into a point cloud (using PCL for this step), then a Mesh and then export it into Unity for further work.
2) Use the depth data obtained from Kinect, convert it into a point cloud (using PCL for this step), export it into Unity and then convert it into a Mesh.
3) Use KinectFusion, which already has the option of creating a Mesh model, and (somehow) automatically load the resulting Mesh model into Unity.
4) Use OpenNI+ZDK (+ wrapper) to obtain the depth map and generate the Mesh using Unity.
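All four options start from the same step: turning a depth map into 3D points. Here is a minimal sketch of that back-projection, assuming a pinhole camera model, depth values in millimetres, and 0 meaning "no reading". The intrinsics `fx`, `fy`, `cx`, `cy` below are made-up placeholder values, not calibrated Kinect parameters, and `depth_to_points` is a hypothetical helper, not a PCL or Kinect SDK call.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_points(depth_mm: Sequence[Sequence[int]],
                    fx: float, fy: float,
                    cx: float, cy: float) -> List[Point]:
    """Back-project a depth image into camera-space points (metres)."""
    points: List[Point] = []
    for v, row in enumerate(depth_mm):        # v = pixel row
        for u, d in enumerate(row):           # u = pixel column
            if d <= 0:                        # assumed: 0 marks an invalid pixel
                continue
            z = d / 1000.0                    # millimetres -> metres
            x = (u - cx) * z / fx             # pinhole model
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 "depth image" with placeholder intrinsics.
depth = [[0, 1000],
         [2000, 2000]]
cloud = depth_to_points(depth, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

For a real depth frame you would want a vectorised or GPU version of the same formula, since a pure-Python loop is far too slow for real-time use.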
Quite honestly, I'm kinda lost here. My main issue is that the real-time requirement, along with being forced to integrate several software components, makes this a tricky problem. I don't know which, if any, of these solutions are viable, and information/tutorials on these issues aren't exactly abundant, unlike those for Skeleton tracking, for example.
Any sort of help would be greatly appreciated.
Regards,
Nuno
Sorry, I might not be providing a solution for real-time mesh creation within Unity, but the process discussed here was interesting enough for me to reply.
In the hard science novel Memories with Maya, there is a discussion of exactly such a scenario:
"“Point taken,” he said. “So… Satish showed me a demo of the Quad [Quad=Drone] acquiring real-time depth and texture maps.”
“Nothing new in that,” I said.
“Yeah, but look above us.”
I tilted my head up. The crude shape of the Quad came into view.
“The Quad is here, but you can't see it because the FishEye [Fisheye=Kinect 2] is on it aimed straight ahead.”
“So it's mapping video texture over live geometry? Cool,” I said.
“Yeah, the breakthrough is I can freeze a frame… freeze real life as it were, step out of the scene and study it.”
“All you do is block out the live world with the cross polarizers?”
“Yeah,” he said. “It's a big deal for AYREE to be able to use such data-sets.”
“The resolution has improved,” I said.
“Good observation,” he said. “So has the range sensing. The lens optics have also been upgraded.”
“I noticed that if I turn around I don't see the live feed, just the empty street,” I said.
“Yes, of course,” he replied. “The Quad is facing the other way around. It's why I'm standing in front of you. The whole street, however, is a 3D model done by a standard laser scan taken from the top of that high tower.”
Krish pointed to a building block at the far end of the street. I turned back to the live 3D view again. He walked in front of me.
“This is uber cool. Everyone looks so real.”
“Haha. You should see how cool it is when you're here in person with the Wizer on,” he said. “I'm here watching these real people pass by, only they have a mesh of themselves mapped onto them.”
“Ahhh! Yes.”
“Yeah, it's like they have living paint on them. I feel like reaching out and touching, just to feel the texture.”...
The work that you're thinking of doing in this area, and this use of a live mesh, goes far beyond Projection Mapping for events, for sure!
Wishing you the best on the project, and I will be following your updates.
Some of the science behind the story is on www.dirrogate.com if the topic interests you.
Kind Regards.
I would use Kinect Fusion, as it has a sample with the ability to export to .obj, which Unity supports. You can save the file automatically and import it into Unity to generate a mesh automatically. If you have multiple Kinects, Microsoft even has a sample showing the basics of Kinect Fusion with multiple Kinects. Also, since Fusion is already written, there is not much code you will have to write.
Here is an example of a mesh from Fusion with one camera:
I do want you to notice how many vertices there are though... This could cause performance problems later on.
Good luck!
I'm currently doing a project with Microsoft's HoloLens. The problem is that the HoloLens has limited memory, so I can only make a spatial mapping of a room and not of a whole building, because it can't remember the entire building. I had an idea: maybe I can create several objects and assemble them? But no one talks about this... Do you think it's possible?
Thanks for reading.
Y.P
Since you don’t have a compass, you could establish some convention to help. For example, you could start the scanning by giving a voice command (and stop it by another one), and decide to only start scanning when you’re facing north, for example. Then it would be easy to know the orientation of each room. What may be harder is to get the angle exactly right. Your head might be off by a few degrees and you may have to work some “magic” (post processing) to correct it.
Or placing QR codes on a wall (printer paper + scotch tape) and using something like Vuforia can help you avoid this orientation problem altogether (you would get the QR code’s orientation which would match that of the wall).
You can also simplify the scanned mesh and convert it to planes. That way you can remember simpler objects instead of the raw spatial mapping mesh. (Search for the SurfaceToPlanes script in the Holographic Academy tutorials).
Scanning, the first layer, as in HoloLens trying to reason about the environment, is an unstoppable process. There is no API for starting or stopping it. That process also slowly consumes more and more memory, as far as I know. The only thing you can do is delete space (aka delete holograms) or cover the sensors. But that's OS/hardware level, not app level, which is presumably what you want.
Layer two, which is probably what you are talking about, is starting and stopping the spatial reconstruction process, where that raw spatial data is processed into a low-poly mesh (aka spatial mapping). This process can be started or stopped, for example through Unity's SpatialMappingCollider and SpatialMappingRenderer components, if you use Unity.
Finally, the third level is extracting objects/segments from that spatial mapping mesh into primitives, as SurfaceToPlanes does. You can also fully control when that happens.
There has been great confusion, especially due to the renaming parties in MixedRealityToolkit (overuse of the word Scanning) and Unity (SpatialAnchor to WorldAnchor etc.), and to misleading tutorials that use a lot of colloquialisms instead of crisp terminology.
Theory aside: if you want the HoloLens to think of your entire building as one continuous space in terms of the first layer, you're out of luck. It was designed for a living room, and there is a lot of voodoo involved in making it work stably in facilities of 30x30 meters. You probably want to rely on disjointed "islands" with specific detection anchors to identify where you are. Or rely on markers and coordinates relative to them.
Cheers
Say a Tango (Unity) device is used in a controlled room, where all objects and walls are stationary and pre-known.
You take a 3DR scan of the room, and have a good ADF.
How could you approach detecting/tracking only new/changed objects?
E.g., a single chair is added to the room.
Has anyone done this piece of work previously? Or can you suggest the way you would approach the problem?
So far my best idea is to use a "constructive solid geometry" library on a new mesh generated, compared to the pre-captured mesh... although it looks like it might be too expensive to run in real-time alongside everything else..
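A cheaper alternative to full CSG is to compare voxel occupancy instead of meshes. The sketch below is only an illustration under stated assumptions: both point sets are already expressed in the same coordinate frame (e.g. via the ADF), and the reference occupancy is grown by one voxel in every direction to tolerate sensor noise. The function names are hypothetical and not part of the Tango API.

```python
import math
from typing import Iterable, Set, Tuple

Point = Tuple[float, float, float]
Voxel = Tuple[int, int, int]


def voxelize(points: Iterable[Point], voxel_size: float) -> Set[Voxel]:
    """Map each point to the integer index of the voxel that contains it."""
    return {
        (math.floor(x / voxel_size),
         math.floor(y / voxel_size),
         math.floor(z / voxel_size))
        for x, y, z in points
    }


def dilate(voxels: Set[Voxel]) -> Set[Voxel]:
    """Grow the occupied set by one voxel in every direction (noise margin)."""
    offsets = (-1, 0, 1)
    return {
        (x + dx, y + dy, z + dz)
        for x, y, z in voxels
        for dx in offsets for dy in offsets for dz in offsets
    }


def new_voxels(reference: Iterable[Point], live: Iterable[Point],
               voxel_size: float) -> Set[Voxel]:
    """Voxels occupied in the live scan but not near anything in the reference."""
    return voxelize(live, voxel_size) - dilate(voxelize(reference, voxel_size))


# Reference scan: a patch of floor. Live scan: same floor plus one raised point.
floor = [(x + 0.25, y + 0.25, 0.0) for x in range(4) for y in range(4)]
chair = [(1.25, 1.25, 1.25)]
changed = new_voxels(floor, floor + chair, voxel_size=0.5)
```

Set operations on integer voxel indices are cheap, so this kind of check can run on a subsampled cloud every few frames. Connected groups of changed voxels would then be the candidates for "new object".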
tl;dr
How can a software developer who gets a highly detailed 3D model quickly and easily optimize it for mobile apps, so that they can focus their time and energy on developing the app logic?
I think it's a pretty common use case and there may be a tool for this already out somewhere.
Long story
I have a 3D model (COLLADA) of a machine. This model, created by the machine's engineering team, contains a lot of minute details essential to building the machine hardware.
Now, I am developing a mobile app with Unity that needs to render this machine along with 10 other machines in a single scene. The thing I like about the available models is that they look exactly like the real stuff. At the same time, I am not interested in the internal stuff; the external shell is enough for me. I have no interaction with the 3D modelling team (let's assume I downloaded the model from some archive), and hence can't ask them to make any changes for me. The model is all I have. I am on my own.
There are two problems I am facing:
1) How to get rid of the interiors of the models?
2) How to get rid of the high-resolution details in the external shell which the human eye can't detect on a mobile phone?
To give a sense of the scale, the real equipment and hence the model can be as big as 100 ft. (30 m.) while these will never occupy more than a 5 inch HD display. The size of the models ranges from 50MB to 400MB. The entire scene hence can go up to 2GB. Each model has nearly 300k faces.
The other challenge I am facing is that I am a software developer familiar with code and my familiarity with 3D modelling tools is very limited and I would like it to be that way :) I can play around with these tools, but I don't want to start spending half my time with these tools.
I have tried Blender's decimate modifier, but the results aren't good: the loss of detail is very uniform instead of being targeted at the interiors. I don't want to spend time going through each mesh and deleting them manually.
Also, for some reason, when I import a model exported from Blender into Unity, it looks horrible (some faces/polygons that I can see in Blender don't show up in Unity), even with 0 decimation.
I am unable to digest the fact that the manual process is the only way. I feel that with today's technology this would be a simple automation. The steps, as I see them, are:
1) Detect polygons that aren't directly reachable from any exterior raycasts. If required, I can define the set (14 may be enough) of points for the raycasts' origins, basically the camera locations.
2) Delete these polygons
3) Detect polygons with dimensions less than a threshold
4) Delete these polygons
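The steps listed above can be sketched in a few dozen lines. This is a brute-force illustration, not a production tool: it assumes a triangle soup, tests visibility only at each triangle's centroid (so partially visible triangles can be misjudged), uses triangle area as the "dimension" threshold, and checks every triangle against every other, which would need a spatial acceleration structure for a 300k-face model. All names here are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Vec = Tuple[float, float, float]
Tri = Tuple[Vec, Vec, Vec]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_hit(origin: Vec, direction: Vec, tri: Tri,
            eps: float = 1e-9) -> Optional[float]:
    """Moller-Trumbore test: ray parameter t of the hit, or None if missed."""
    e1, e2 = sub(tri[1], tri[0]), sub(tri[2], tri[0])
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:                 # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    tvec = sub(origin, tri[0])
    u = dot(tvec, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(tvec, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    return t if t > eps else None


def area(tri: Tri) -> float:
    n = cross(sub(tri[1], tri[0]), sub(tri[2], tri[0]))
    return 0.5 * dot(n, n) ** 0.5


def centroid(tri: Tri) -> Vec:
    return (sum(v[0] for v in tri) / 3.0,
            sum(v[1] for v in tri) / 3.0,
            sum(v[2] for v in tri) / 3.0)


def is_visible(index: int, tris: Sequence[Tri], cameras: Sequence[Vec]) -> bool:
    """True if some camera sees the triangle's centroid with nothing in front."""
    target = centroid(tris[index])
    for cam in cameras:
        direction = sub(target, cam)   # unnormalized, so t == 1 at the centroid
        blocked = False
        for j, other in enumerate(tris):
            if j == index:
                continue
            t = ray_hit(cam, direction, other)
            if t is not None and t < 1.0 - 1e-6:
                blocked = True
                break
        if not blocked:
            return True
    return False


def cull(tris: Sequence[Tri], cameras: Sequence[Vec], min_area: float) -> List[Tri]:
    """Keep triangles that are both large enough and visible from a camera."""
    return [tri for i, tri in enumerate(tris)
            if area(tri) >= min_area and is_visible(i, tris, cameras)]


# Toy scene: a big wall, a triangle hidden behind it, and a tiny detail in front.
wall = ((-10.0, -10.0, 1.0), (10.0, -10.0, 1.0), (0.0, 10.0, 1.0))
hidden = ((-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (0.0, 1.0, 0.0))
tiny = ((0.0, 0.0, 2.0), (0.01, 0.0, 2.0), (0.0, 0.01, 2.0))
scene = [wall, hidden, tiny]

kept_front = cull(scene, cameras=[(0.0, 0.0, 10.0)], min_area=0.001)
kept_both = cull(scene, cameras=[(0.0, 0.0, 10.0), (0.0, 0.0, -10.0)], min_area=0.001)
```

With a single camera in front, only the wall survives. Adding a camera behind the wall makes the "hidden" triangle count as exterior, which is why the choice of camera positions matters.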
Blender-to-Unity models can have slight problems if you don't export them the right way. How to do this is outside my field, as I am also a developer and personally prefer 3ds Max.
What I would recommend is to do what you don't want to do; it is the easiest way. Select the faces (just drag and select) and then delete them all (the inside faces, that is). You should be able to hide the outer shell if you have a proper 3D program; just google how to do it if it's too complicated for you.
If you want to delete smaller details on the outside, do exactly the same: just select the polys and delete them. I wouldn't recommend using a built-in tool, because most of those tend to take the whole object and give it fewer or more polys, depending on which tool you use.
In the end, next time just try to get into the program. As a programmer I dislike having to use 3D modelling software as well, but it is part of the job, so put some effort in and just learn the tools. It's less work than it seems.
Edit: As for the tools you are asking for, those do not really exist. You don't normally take a high-poly model and change it into a low-poly model for a mobile game. Instead you usually get a 3D artist to make a low-poly model. The fact that you do not have any communication with the team is a bit odd, but so be it. I'd recommend either getting in touch with them or, like I said before, putting the effort in and learning a 3D program. What you want to do literally sounds like just click, drag, select and then press delete to remove some polys you wouldn't see anyway.
-Lars
with vcglib
vcglib may work for you; see its sample for simplifying a PLY 3D file. It can also be applied to many other 3D file formats, such as STL and OBJ. As vcglib is a C++ library, you can write a simple program that uses this library to simplify your STL model. This method works on an OS without X, such as Ubuntu Server. You can refer to my question "Failed to to simplify 3D models with vcglib, Assertion `0' failed" on how to use vcglib to simplify a PLY 3D file.
with meshlabserver
If you want to do the automatic simplification on an OS with X, or on Windows or Mac OS, it's much easier: you can use meshlabserver (MeshLab is also built on vcglib). You can run a command like the following, where PLYmesher_script.mlx is the filter script; you can write this file by hand or generate it with MeshLab.
meshlabserver -i ./option-0000.ply -o ./meshed.ply -m vc vn -s scripts/PLYmesher_script.mlx
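Since the question involves around a dozen models, the command above can be wrapped in a small batch script. This is only a sketch: it reuses exactly the flags shown above (`-i`, `-o`, `-m vc vn`, `-s`), assumes `meshlabserver` is installed and on the PATH (recent MeshLab releases may no longer ship it), and the directory and script names in the usage note are placeholders.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(src: Path, dst: Path, script: Path) -> List[str]:
    """Build the argument list for one meshlabserver run (same flags as above)."""
    return ["meshlabserver", "-i", str(src), "-o", str(dst),
            "-m", "vc", "vn", "-s", str(script)]


def simplify_all(input_dir: str, output_dir: str, script: str,
                 dry_run: bool = True) -> List[List[str]]:
    """Build (and optionally run) one command per .ply file in input_dir."""
    commands = []
    for src in sorted(Path(input_dir).glob("*.ply")):
        cmd = build_command(src, Path(output_dir) / src.name, Path(script))
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)   # raises if meshlabserver fails
    return commands


example = build_command(Path("option-0000.ply"), Path("meshed.ply"),
                        Path("PLYmesher_script.mlx"))
```

Calling `simplify_all("models", "simplified", "scripts/PLYmesher_script.mlx", dry_run=False)` would then process every `.ply` file in a hypothetical `models` directory. Leaving `dry_run=True` only returns the commands so they can be inspected first.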
I need to find a way to model a physical place inside an iPhone application. For example, I want to be able to take images of a restaurant and then use some tools or a programming API to model this restaurant as a 3D place, so that the user can navigate and explore the place and its rooms.
I have thought about HTML5 inside a web view, but I don't think WebGL is compatible with the iPhone web view (Safari engine).
Can you please recommend a method, API, Commercial Library or anything to help me achieve this task?
First, you need to be able to display 3D models on the iPhone. One of the most popular 3D engines is Unity3D:
http://unity3d.com/
It is extremely easy to start playing with Unity3D. You even have a free license with limited features:
http://unity3d.com/unity/licenses
Then you need to reconstruct a 3D model from pictures. This is not a trivial problem, so it helps if you know some computer vision. You can try playing with OpenCV:
http://opencv.willowgarage.com/wiki/
Best regards.
Actually Nuke from the Foundry has a decent start at the future of creating computer models from images.
Basically it takes a high contrast point and tracks it through successive moments. Given hundreds and thousands of tracked points, the next step is to calculate the perspective change between points.
Say two points are a known pixel distance apart at time zero, and a certain time later they are a different distance apart. This change in distance could be due to a bad tracking point. But assuming that the two points are tracking perfectly, the distance change could be caused by the camera moving laterally or rotating. And in real space, a point further away from you will show a different perspective change than a closer point. This perspective change is a mathematical certainty.
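To make the "mathematical certainty" concrete: for the simplest case of a camera that only slides sideways (no rotation), a tracked point at depth Z shifts in the image by f * B / Z pixels, where f is the focal length in pixels and B is the distance the camera moved. The sketch below inverts that relation. It is a textbook pinhole-camera illustration with made-up numbers, not a description of how Nuke's tracker works internally.

```python
def depth_from_parallax(focal_px: float, baseline_m: float, shift_px: float) -> float:
    """Depth of a tracked point from its image shift under pure lateral motion."""
    if shift_px <= 0:
        raise ValueError("shift must be positive; zero shift means infinite depth")
    return focal_px * baseline_m / shift_px


# Camera moved 0.5 m sideways; focal length of 800 px (assumed values).
near = depth_from_parallax(focal_px=800.0, baseline_m=0.5, shift_px=100.0)
far = depth_from_parallax(focal_px=800.0, baseline_m=0.5, shift_px=20.0)
```

The point that moved 100 pixels comes out five times closer than the one that moved 20 pixels. Real solvers have to estimate the camera motion and the depths together, which is where most of the difficulty lies.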
Initially the tracking is typically used to re-film a piece of film in order to stabilize it. But the result of the software's analysis of the film can be saved; it is often called a point cloud. Many close points that track very closely together usually do so because they are parts of the same surface, so a model can be built.
But my friend, we are still barbarians when it comes to the speed and the software needed to do that perfectly. Otherwise all the CG artists out there would have nothing to model in Maya except fantasy monsters and spaceships that don't exist yet...
For a while I've been attempting to simulate flowing water with algorithms I've scavenged from "Real-Time Fluid Dynamics for Games". The trouble is that I don't seem to get water-like behavior out of those algorithms.
My guess is that I'm doing something wrong and that those algorithms aren't all suitable for water-like fluids.
What am I doing wrong with these algorithms? Are these algorithms correct at all?
I have the associated project in a Bitbucket repository. (It requires gletools and the newest pyglet to run.)
Voxel-based solutions are fine for simulating liquids, and are frequently used in film.
Ron Fedkiw's website gives some academic examples - all of the ones there are based on a grid. That code underpins many of the simulations used by Pixar and ILM.
A good source is also Robert Bridson's Fluid Simulation course notes from SIGGRAPH and his website. He has a book "Fluid Simulation for Computer Graphics" that goes through developing a liquid simulator in detail.
The most specific answer I can give to your question is that Stam's real-time fluids for games is focused on smoke, i.e. where there isn't a boundary between the fluid (water) and an external air region. Basically, smoke and liquids use the same underlying mechanism, but for a liquid you also need to track the position of the liquid surface and apply appropriate boundary conditions on the surface.
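One common way to track the liquid region on a grid is with marker particles: cells containing at least one particle are liquid, the rest are air, and liquid cells that touch air form the free surface where the boundary condition is applied. The sketch below shows only that classification step in 2D, under the assumption that the domain edges are solid walls. It is an illustration, not code from Stam's paper or from Bridson's book, and all names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

AIR, FLUID = 0, 1


def mark_cells(particles: Sequence[Tuple[float, float]],
               nx: int, ny: int, cell_size: float) -> List[List[int]]:
    """Label each grid cell FLUID if it contains a marker particle, else AIR."""
    grid = [[AIR] * nx for _ in range(ny)]
    for x, y in particles:
        i, j = math.floor(x / cell_size), math.floor(y / cell_size)
        if 0 <= i < nx and 0 <= j < ny:       # ignore particles outside the grid
            grid[j][i] = FLUID
    return grid


def surface_cells(grid: Sequence[Sequence[int]]) -> List[Tuple[int, int]]:
    """FLUID cells with at least one AIR neighbour, as (i, j) column/row pairs."""
    ny, nx = len(grid), len(grid[0])
    result = []
    for j in range(ny):
        for i in range(nx):
            if grid[j][i] != FLUID:
                continue
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                # Cells outside the grid are treated as walls, not air.
                if 0 <= ni < nx and 0 <= nj < ny and grid[nj][ni] == AIR:
                    result.append((i, j))
                    break
    return result


# A full bottom row of liquid with a one-cell bump in the middle.
particles = [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5), (1.5, 1.5)]
cells = mark_cells(particles, nx=3, ny=3, cell_size=1.0)
surface = surface_cells(cells)
```

In a full solver the particles are moved by the velocity field each step and the cells are re-marked. The pressure solve then runs only over the liquid cells, with the air cells providing the surface boundary condition.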
Cem Yuksel presented a fantastic talk about his Wave Particles at SIGGRAPH 2007. They give a very realistic effect for quite a low cost. He was even able to simulate interaction with rigid bodies like boxes and boats. Another interesting aspect is that the boat motion isn't scripted, it's simulated via the propeller's interaction with the fluid.
(source: cemyuksel.com)
At the conference he said he was planning to release the source code, but I haven't seen anything yet. His website contains the full paper and the videos he showed at the conference.
Edit: Just saw your comment about wanting to simulate flowing liquids rather than rippling pools. This wouldn't be suitable for that, but I'll leave it here in case someone else finds it useful.
What type of water are you trying to simulate? Pools of water that ripple, or flowing liquids?
I don't think I've ever seen flowing water, except in rendered movies. Rippling water is fairly easy to do; this site usually crops up in this type of question.
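For the rippling case, a widely used trick is a two-buffer height field: each new height is half the sum of the four neighbours minus the previous height at that cell, multiplied by a damping factor. The sketch below is a generic version of that trick with an assumed damping value. It may or may not match what the site mentioned above describes.

```python
from typing import List


def ripple_step(current: List[List[float]], previous: List[List[float]],
                damping: float = 0.99) -> List[List[float]]:
    """Advance the height field one step; border cells stay fixed at 0."""
    h, w = len(current), len(current[0])
    nxt = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neighbours = (current[y - 1][x] + current[y + 1][x] +
                          current[y][x - 1] + current[y][x + 1])
            nxt[y][x] = (neighbours / 2.0 - previous[y][x]) * damping
    return nxt


# Drop a single "pebble" in the middle of a 5x5 pool.
size = 5
previous = [[0.0] * size for _ in range(size)]
current = [[0.0] * size for _ in range(size)]
current[2][2] = 1.0
after_one = ripple_step(current, previous, damping=1.0)
```

To animate it, keep calling `ripple_step` and swap the buffers each frame: the old `current` becomes `previous` and the result becomes `current`.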
Yeah, this type of voxel-based solution only really works if your liquid is confined to very discrete and static boundaries.
For simulating flowing liquid, do some investigation into particles. Quite a lot of progress has been made recently in accelerating them on the GPU, and you can get some stunning results.
Take a look at http://nzone.com/object/nzone_cascades_home.html as a great example of what can be achieved.