Is there any way to tell if a video is 360 or panorama? - virtual-reality

For a project I am working on, I need to automatically decide whether a video is a VR (360) video and, if so, what format it is. Is there any way to tell? I was thinking of the metadata, but I could not find any information on this.

Checking the size is pointless: most properly encoded movies use standard sizes such as 1080p (1920×1080), WQHD (2560×1440) or 4K (3840×2160), because hardware decoding handles them better. To fit those sizes they often don't have square pixels, so you shouldn't guess anything from the aspect ratio.
What you should do is check for the presence of a zenith and nadir: check whether the topmost and the bottommost regions of the image are each a single (near-uniform) color, assuming the most common equirectangular projection.
This approach needs some adjustment if stereoscopy is involved: you have to repeat the procedure for each eye's region. As a bonus, you can also deduce some stereoscopy layouts, for example distinguishing top-bottom, mono and left-right. Unfortunately you can't tell which image belongs to which eye, so you have to assume the more common convention, where the left eye's image is usually the top or left one.
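A minimal sketch of that zenith/nadir test, assuming the frame arrives as a NumPy array; the band width and the uniformity threshold (`max_std`) are arbitrary values you would have to tune on real footage:

```python
import numpy as np

def looks_equirectangular(frame, band=0.02, max_std=12.0):
    """Heuristic 360 check: in an equirectangular projection the top
    rows (zenith) and bottom rows (nadir) collapse towards a single
    colour, so their standard deviation should be very small."""
    h = frame.shape[0]
    n = max(1, int(h * band))           # thin band at top and bottom
    top, bottom = frame[:n], frame[-n:]
    return bool(top.std() <= max_std and bottom.std() <= max_std)
```

For stereoscopic layouts you would run the same test separately on each eye's half of the frame.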

There is an RFC for Metadata to describe spherical data in MP4 videos:
https://github.com/google/spatial-media/blob/master/docs/spherical-video-v2-rfc.md
This introduces a new spherical video header box, svhd; you can test for its presence to detect whether a video is a VR 360 video.
This is not ubiquitous yet, but it has support from key players such as Google, and, as you have discovered, something like this is necessary, so its use seems likely to spread.
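A rough way to test for that box, assuming a simple 32-bit-size MP4 box layout. Real files can use 64-bit sizes, and per the RFC the svhd box sits inside an sv3d container within the sample description, whose extra header bytes this walker ignores, so treat it purely as a sketch:

```python
import struct

# Pure container boxes we recurse into (simplified; real sample-entry
# boxes such as stsd carry extra header bytes before their children).
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl", b"sv3d"}

def contains_box(data, fourcc):
    """Walk a flat sequence of MP4 boxes looking for `fourcc`,
    recursing into known container boxes. 32-bit sizes only."""
    pos = 0
    while pos + 8 <= len(data):
        (size,) = struct.unpack(">I", data[pos:pos + 4])
        btype = data[pos + 4:pos + 8]
        if size < 8:                    # 64-bit size or junk: give up
            return False
        if btype == fourcc:
            return True
        if btype in CONTAINERS and contains_box(data[pos + 8:pos + size], fourcc):
            return True
        pos += size
    return False
```

You would call it as `contains_box(open("video.mp4", "rb").read(), b"svhd")`.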

Related

Is it possible to create a 3D photo from a normal photo?

If I understand correctly, 3D 360 photos are created from a panorama photo, so I guess it should be possible to create a 3D photo (non-360) from a normal photo. But how? I didn't find anything on Google! Any idea what I should search for?
So far, if nothing is available (which I doubt), I'll try duplicating the same photo for each eye: one copy shifted slightly to the right, the other slightly to the left. But I suspect the real distortion algorithm is much more complicated.
Note: I'm also receiving answers here: https://plus.google.com/u/0/115463690952639951338/posts/4KdqFcqUTT9
I am in no way certain of this, but my intuition on how 3D 360 images are created in GoogleVR is this:
As you take a panorama image, the phone actually captures a series of images. As you turn the phone, the perspective changes slightly with each shot, not only in angle but also in offset (except in the unlikely event that you spin the phone exactly around its own axis). When it stitches the final image together, it creates one image per eye, picking suitable shots from the series so that the pair produces a 3D effect when viewed together. The same area of the scene comes from a different source image for each eye.
You can't do anything similar with a single image. It's the multitude of images produced, each with a different perspective coming from the turning of the phone, that enables the algorithm to create a 3D image.
2D lacks a dimension and hence cannot be converted to 3D just like that, but there are clever workarounds. For example, the Google Pixel, even though it doesn't have two cameras, can make an image seem 3D by applying machine-learning algorithms that create an effect of perspective and depth through selective blurring.
3D photos can't be taken with a normal camera, but 360 photos can. There are many apps with which you can do this, and there are also algorithms to do it programmatically.

Taking Depth Image From Iphone (or consumer camera)

I have read that it's possible to create a depth image from a stereo camera setup, where two cameras with identical focal length, aperture and other settings photograph an object from slightly different positions.
Would it be possible to take two snapshots almost immediately after each other (on the iPhone, for example) and use the differences between the two pictures to build a depth image?
Small amounts of hand movement and shake will obviously rock the camera, creating some angular displacement, and perhaps that displacement can be estimated from the overall displacement of features detected in both photographs.
Another way to look at this problem is as structure-from-motion, a nice review of which can be found here.
Generally speaking, resolving spatial correspondence can also be factored as a temporal correspondence problem. If the scene doesn't change, then taking two images simultaneously from different viewpoints - as in stereo - is effectively the same as taking two images using the same camera but moved over time between the viewpoints.
I recently came upon a nice toy example of this in practice - implemented using OpenCV. The article includes some links to other, more robust, implementations.
For a deeper understanding I would recommend you get hold of an actual copy of Hartley and Zisserman's "Multiple View Geometry in Computer Vision" book.
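The pixel-correspondence step at the heart of such stereo and structure-from-motion pipelines can be illustrated with a deliberately naive SAD block matcher. This is a sketch for intuition only; real implementations such as OpenCV's StereoBM are far faster and more robust:

```python
import numpy as np

def block_match_disparity(left, right, max_disp=8, win=3):
    """Deliberately naive stereo matcher: for each pixel in the left
    image, slide a small window leftwards across the right image and
    keep the shift (disparity) with the lowest sum of absolute
    differences (SAD)."""
    h, w = left.shape
    pad = win // 2
    L = np.pad(left.astype(np.float32), pad, mode="edge")
    R = np.pad(right.astype(np.float32), pad, mode="edge")
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(h):
        for x in range(w):
            patch = L[y:y + win, x:x + win]
            best_sad, best_d = np.inf, 0
            for d in range(min(max_disp, x) + 1):
                sad = np.abs(patch - R[y:y + win, x - d:x - d + win]).sum()
                if sad < best_sad:
                    best_sad, best_d = sad, d
            disp[y, x] = best_d
    return disp
```

Nearby objects produce larger disparities than distant ones, which is exactly the depth signal the question is after; the hard parts in practice are rectifying the two hand-held shots first and handling textureless regions.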
You could probably come up with a very crude depth map from a "cha-cha" stereo image (as it's known in 3D photography circles) but it would be very crude at best.
Matching up the images is extremely CPU-intensive.
An iPhone is not a great device for that kind of number-crunching: its CPU isn't that fast, and its memory bandwidth isn't great either.
Once Apple lets us use OpenCL on iOS, you could write OpenCL code, which would help somewhat.

How to transform a video (using MATLAB) which was taken at an offset to correct viewing angle?

I have a video taken at an angle to the axis of a circular body. Since it was taken from an unknown angle, the circle appears as an ellipse.
How can I find the camera's offset angle from the video? Also, is it correct to apply the same transformation to all the frames in the video, given that the camera was in a fixed location?
For a super easy fix, go back to the scene and take the video again; this time, make sure the circle looks like a circle.
That being said, this is an interesting topic in academia, and there are various solutions and articles aimed at this kind of problem. Based on your reputation, I believe you already know that, but I still wanted to give Stack Overflow members a shot at answering. So here it goes.
For an easy fix, you can start with this function, guessing the camera location by trial and error until you find an acceptable transformation for your image (a frame of the video). The function does not work right out of the box; you have to debug it a little.
If you have access to the (virtual) scene of the image, you can take another image. Based on mutual feature points between the new image and the original one, register the two images and obtain the transformation (ex1, ex2).
Finally, apply the same transformation to each frame of the video.
To answer your second question: although the camera location is fixed, there may be objects moving in the scene, so applying the same transformation to every frame will only correct the objects that are still. It's not ideal. In the end, it depends on what the project's aims are and how this correction (or lack of it) affects them.
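The "same transformation for every frame" step boils down to estimating one projective transform (a homography) from point correspondences and then warping each frame with it; in MATLAB that would be fitgeotrans followed by imwarp. A NumPy sketch of the estimation step (direct linear transform from four point pairs) looks like this:

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 homography H with dst ~ H @ src from four
    point correspondences (DLT: stack two linear equations per pair
    and take the null space via SVD)."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]        # normalise so H[2,2] == 1
```

For the ellipse-to-circle problem, the source points would be four points on the observed ellipse and the destination points the corresponding points on the circle you expect to see.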

Measuring distance with iPhone camera

How can I measure distances in real time on the iPhone (using the video camera?), like this app, which uses a card of known size and compares its apparent size in the image to get the actual distance?
Are there any other ways to measure distance? Or how would I go about implementing the card method? What framework should I use?
Well, you do have something for reference, hence the use of the card. That said, after watching a video of the app, it doesn't seem too user-friendly.
So you either need a reference object of known size, or you need to deduce the size from the image itself. One idea I just had that might help is to use the iPhone 4's flash (I'm sure it's very complicated, but it might just work for some things).
Here's what I think.
When the user wants to measure something, he takes a picture of it, but you actually capture two separate images: one with the flash on, one with it off. Then you can analyze the lighting differences and the flash reflection in the two images to estimate the scale. I guess this will only work for close and not-too-shiny objects.
But that's about the only other way I could think of to deduce scale from an image without any fixed reference objects.
I like Ron Srebro's idea and have thought about something similar -- please share if you get it to work!
An alternative approach would be to use the auto-focus feature of the camera. Point-and-shoot cameras often have a laser range finder that they use to auto-focus. The iPhone doesn't have this, and its f-stop is fixed. However, users can change the focus by tapping the camera screen, and the phone can also switch between regular and macro focus.
If the API exposes the current focus settings, maybe there's a way to use this to determine range?
Another solution may be to use two laser pointers.
Basically, you shine two laser pointers in parallel at, say, a wall. The further back you go, the closer together the beams look in the video, even though they remain the same physical distance apart. You can then easily come up with a formula for the distance based on how far apart the dots appear in the image.
See this thread for more details: Possible to measure distance with an iPhone and laser pointer?.
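Both the card method and the laser-dot method reduce to the same pinhole-camera relation: a known physical size divided by its apparent size in pixels, scaled by the focal length in pixels. The focal length would have to come from a one-off calibration or the EXIF data; this tiny helper just shows the arithmetic:

```python
def distance_from_known_size(focal_px, real_size, apparent_px):
    """Pinhole camera model: an object (or laser-dot spacing) of
    physical size `real_size` that spans `apparent_px` pixels in
    the image lies at distance focal_px * real_size / apparent_px
    (same length units as `real_size`)."""
    return focal_px * real_size / apparent_px
```

For example, with a focal length of 1000 px, a credit card 0.0856 m wide that spans 85.6 px in the image is about 1 m away.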

iphone, Image processing

I am building a night-vision application, but I can't find any useful algorithm to apply to dark images to make them clearer. Can anyone please suggest a good algorithm?
Thanks in advance.
With the size of the iPhone's lens and sensor, you are going to have a lot of noise no matter what you do. I would practice manipulating the image in Photoshop first; you'll probably find it useful to pick a white point from a sample of the brighter pixels in the image and to apply a curve. You'll probably also need to run a noise-reduction filter and a smoother. Edge detection or condensation may allow you to emphasize some areas of the image. As for specific algorithms to perform each of these filters, there are a lot of computer-science books and lists on the subject. Here is one list:
http://www.efg2.com/Lab/Library/ImageProcessing/Algorithms.htm
Many OpenGL implementations can be found if you find a standard name for an algorithm you need.
Real (useful) night vision typically uses an infrared light and an infrared-tuned camera. I think you're out of luck.
Of course using the iPhone 4's camera light could be considered "night vision" ...
Your real problem is the camera, not the algorithm.
You can apply algorithms to clarify images, but they won't turn a dark image into a well-lit one as if by magic ^^
But if you want to try some algorithms, you should take a look at OpenCV (http://opencv.willowgarage.com/wiki/); there are ports, for example http://ildan.blogspot.com/2008/07/creating-universal-static-opencv.html
I suppose there are two ways to refine a dark image: an active one, which uses infrared, and a passive one, which manipulates the pixels of the image.
The images will be noisy, but you can always try scaling up the pixel values (all of the RGB components, or just the luminance in HSV; either linearly or with some sort of curve; either globally or locally in just the darker areas) and saturating them, and/or using a contrast or edge-enhancement filter.
If the camera and subject matter are sufficiently motionless (tripod, etc.) you could try summing each pixel over several image captures. Or you could do what some HDR apps do, and try aligning images before pixel processing across time.
I haven't seen any documentation on whether the iPhone's camera sensor has a wider wavelength gamut than the human eye.
I suggest conducting a simple test before trying to actually implement this:
1. Save a photo taken in a dark room.
2. Open it in GIMP (or a similar application).
3. Apply the "Stretch HSV" algorithm (or equivalent).
4. Check whether the resulting image quality is good enough.
This should give you an idea of whether your camera is good enough to try it.
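If the GIMP test looks promising, the core of that stretch is easy to reproduce in code. GIMP's Stretch HSV remaps each channel to the full range independently; a NumPy sketch of that linear stretch for a single 8-bit channel (e.g. the V channel after an RGB-to-HSV conversion):

```python
import numpy as np

def stretch_channel(channel):
    """Linear contrast stretch: remap an 8-bit channel so its darkest
    pixel becomes 0 and its brightest becomes 255. Roughly what
    GIMP's "Stretch HSV" does to each channel independently."""
    c = channel.astype(np.float32)
    lo, hi = c.min(), c.max()
    if hi == lo:                     # flat image: nothing to stretch
        return channel.copy()
    return ((c - lo) * (255.0 / (hi - lo))).astype(np.uint8)
```

On a dark photo this brightens everything, noise included, which is why the earlier answers recommend pairing it with noise reduction.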