How to extract orientation information from videos? - iphone

After going through a lot of documentation on the web, it seems that the iPhone always shoots video at 480x360 and applies a transformation matrix to the video track. (The 480x360 resolution may vary, but it's always the same for a given device.)
Here is a way of modifying the ffmpeg source within an iOS project to access the matrix: http://www.seqoy.com/correct-orientation-for-iphone-recorded-movies-with-ffmpeg/
Here is a cleaner way of finding the transformation matrix in iOS 4:
How to detect (iPhone SDK) if a video file was recorded in portrait orientation, or landscape.
How can the orientation of the video be extracted in any of the environments below?
- iOS 3.2
- ffmpeg (through the command line server side)
- ruby
Any help will be appreciated.

Since most cameras store their rotation/orientation in the EXIF metadata, I would suggest using exiftool together with a Ruby wrapper gem called mini_exiftool, which is actively maintained.
Install exiftool:
apt-get install libimage-exiftool-perl || brew install exiftool || port install exiftool
or use whatever package manager is available
Install mini_exiftool:
gem install mini_exiftool
Try it:
irb>
require 'mini_exiftool'
movie = MiniExiftool.new('test_movie.mov')
movie.orientation #=> 90
cheers

You can use ffprobe. No need for any grep, or any other additional processes, or any regex operations to parse the output as shown in other answers.
If you want the rotate metadata:
Command:
ffprobe -loglevel error -select_streams v:0 -show_entries stream_tags=rotate -of default=nw=1:nk=1 input.mp4
Example output:
90
If you want the display matrix rotation side data:
Command:
ffprobe -loglevel error -select_streams v:0 -show_entries side_data=rotation -of default=nw=1:nk=1 input.mp4
Example output:
-90
If you want the display matrix:
Command:
ffprobe -loglevel error -select_streams v:0 -show_entries side_data=displaymatrix -of default=nw=1:nk=1 input.mp4
Example output:
00000000: 0 65536 0
00000001: -65536 0 0
00000002: 15728640 0 1073741824
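The matrix above is stored as fixed-point integers, and the rotation reported as side data can be derived from it. Here is a minimal Python sketch, assuming the three-line text layout shown in the example output and a 16.16 fixed-point encoding for the first two columns. The function names are my own, and the sign convention (negating the atan2 angle) mirrors what I understand FFmpeg's av_display_rotation_get to do, so treat it as an assumption rather than a reference implementation.

```python
import math

def parse_display_matrix(text):
    """Parse ffprobe's 'displaymatrix' text into a flat list of 9 ints."""
    values = []
    for line in text.strip().splitlines():
        # Each line looks like '00000000: 0 65536 0'; drop the offset prefix.
        _, _, row = line.partition(":")
        values.extend(int(v) for v in row.split())
    if len(values) != 9:
        raise ValueError("expected a 3x3 matrix")
    return values

def rotation_from_matrix(m):
    """Return the rotation in degrees encoded in a 3x3 display matrix."""
    # Assumption: the first two columns are 16.16 fixed point.
    a, b = m[0] / 65536, m[1] / 65536
    c, d = m[3] / 65536, m[4] / 65536
    scale_x = math.hypot(a, c)
    scale_y = math.hypot(b, d)
    if scale_x == 0 or scale_y == 0:
        raise ValueError("degenerate matrix")
    return -math.degrees(math.atan2(b / scale_y, a / scale_x))

example = """
00000000: 0 65536 0
00000001: -65536 0 0
00000002: 15728640 0 1073741824
"""
print(rotation_from_matrix(parse_display_matrix(example)))
```

For the example matrix this evaluates to -90 degrees, which agrees with the rotation side data shown earlier.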
What the options mean
-loglevel error Omit the header and other info from output.
-select_streams v:0 Only process the first video stream and ignore everything else. Useful if your input contains multiple video streams and you only want info from one.
-show_entries stream_tags=rotate Chooses to output the rotate tag from the video stream.
-of default=nw=1:nk=1 Use default output format, but omit including the section header/footer wrappers and each field key.
Output format
The output from ffprobe can be formatted in several ways. For example, JSON:
ffprobe -loglevel error -show_entries stream_tags=rotate -of json input.mp4
{
    "streams": [
        {
            "tags": {
                "rotate": "90"
            },
            "side_data_list": [
                {
                }
            ]
        }
    ]
}

From what I've found so far, ffmpeg doesn't have the ability to detect the iPhone's orientation. But the open-source library mediainfo can. A command line example:
$ mediainfo test.mp4 | grep Rotation
Rotation : 90°
More example output from the same iphone video:
Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Baseline#L3.0
Format settings, CABAC : No
Format settings, ReFrames : 1 frame
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 7s 941ms
Bit rate mode : Variable
Bit rate : 724 Kbps
Width : 480 pixels
Height : 360 pixels
Display aspect ratio : 4:3
Rotation : 90°
Frame rate mode : Variable
Frame rate : 29.970 fps
Minimum frame rate : 28.571 fps
Maximum frame rate : 31.579 fps
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Progressive
Bits/(Pixel*Frame) : 0.140
Stream size : 702 KiB (91%)
Title : Core Media Video
Encoded date : UTC 2011-06-22 15:58:25
Tagged date : UTC 2011-06-22 15:58:34
Color primaries : BT.601-6 525, BT.1358 525, BT.1700 NTSC, SMPTE 170M
Transfer characteristics : BT.709-5, BT.1361
Matrix coefficients : BT.601-6 525, BT.1358 525, BT.1700 NTSC, SMPTE 170M

ffmpeg reports the metadata with the rotation value for .mov files:
ffmpeg -i myrotatedMOV.mov
....
Duration: 00:00:14.31, start: 0.000000, bitrate: 778 kb/s
Stream #0:0(und): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p, 480x360, 702 kb/s, 29.98 fps, 30 tbr, 600 tbn, 1200 tbc
Metadata:
rotate : 180
creation_time : 2013-01-09 12:47:36
handler_name : Core Media Data Handler
Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, s16, 62 kb/s
Metadata:
creation_time : 2013-01-09 12:47:36
handler_name : Core Media Data Handler
In my application I pull it out with a regex, i.e. in Python:
import re
import subprocess

# Pass the arguments as a list so paths containing spaces survive intact.
p = subprocess.Popen(
    ['ffmpeg', '-i', pathtofile],
    stderr=subprocess.PIPE,
    close_fds=True,
)
# ffmpeg writes its stream information to stderr, not stdout.
_, stderr = p.communicate()
reo_rotation = re.compile(r'rotate\s+:\s(?P<rotation>\d+)')
match_rotation = reo_rotation.search(stderr.decode('utf-8', 'replace'))
# If no rotate tag is reported, there is nothing to extract.
rotation = match_rotation.group('rotation') if match_rotation else None
I haven't tried this with a wide range of videos, only a couple of .mov files recorded on an iPhone 5, using ffmpeg version 1.0. But so far so good.

Similar to @HdN8's answer, but without the Python regex:
$ ffprobe -show_streams any.MOV 2>/dev/null | grep rotate
TAG:rotate=180
Or JSON:
$ ffprobe -of json -show_streams IMG_8738.MOV 2>/dev/null | grep rotate
"rotate": "180",
Or you could parse the JSON (or other output format).
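Parsing the JSON instead of grepping it could look roughly like the sketch below. It assumes the output shape shown in the answers here (a "streams" list whose entries may carry a "rotate" tag), and it additionally assumes that builds reporting the rotation as side data expose a "rotation" key inside "side_data_list". The function names are hypothetical.

```python
import json
import subprocess

def rotation_from_ffprobe_json(text):
    """Return the rotation of the first stream that reports one, or None."""
    data = json.loads(text)
    for stream in data.get("streams", []):
        tags = stream.get("tags", {})
        if "rotate" in tags:
            return int(tags["rotate"])
        # Assumption: side data entries carry the rotation under 'rotation'.
        for side_data in stream.get("side_data_list", []):
            if "rotation" in side_data:
                return int(side_data["rotation"])
    return None

def probe_rotation(path):
    """Run ffprobe on path (ffprobe must be on PATH) and return the rotation."""
    cmd = ["ffprobe", "-loglevel", "error", "-select_streams", "v:0",
           "-show_streams", "-of", "json", path]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return rotation_from_ffprobe_json(result.stdout)

sample = '{"streams": [{"tags": {"rotate": "180"}}]}'
print(rotation_from_ffprobe_json(sample))  # prints 180
```

Keeping the parser separate from the subprocess call makes it possible to check the parsing logic without a video file at hand.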

I have extracted it on iOS using an AVAssetExportSession, an AVMutableComposition and the input AVAssetTrack's preferredTransform. I concatenate the preferred transform with a transformation to fill the target size.
After exporting to a file, I upload it to my Rails server using ASIHTTPRequest and send the data to Amazon S3 using Paperclip.

Related

Lossless compression of a sequence of similar grayscale images

I would like to get the best compression ratio for a sequence of similar grayscale images. Note that I need an absolutely lossless solution (meaning I should be able to verify it with a hash algorithm).
What I tried
I had the idea of converting my images into a video, because the images follow a chronological order. The encoder could then exploit the fact that not all of the scene changes between two pictures. So I tried using ffmpeg, but I ran into several problems due to the sRGB -> YUV colorspace conversion. I didn't understand all of it, but it seems like a nightmare.
Example of code used :
ffmpeg -i %04d.png -c:v libx265 -crf 0 video.mp4 #To convert into video
ffmpeg -i video.mp4 %04d.png #To recover images
My second idea was to do it by hand with ImageMagick. I took the first image as a reference and created a new image that is the difference between image 1 and image 2. Then I tried to add the difference image to image 1 (to recover image 2), but it didn't work. Judging by the size of the recreated picture, it's clear that the image is not the same. I think there was some unwanted compression during the process.
Example of code used :
composite -compose difference 0001.png 0002.png diff.png #To create the diff image
composite -compose difference 0001.png diff.png recover.png #To recover image 2
Do you have any ideas about my problem?
And why can't I get a perfect recovery with ImageMagick?
Thanks ;)
Here are 20 samples images : https://cloud.damien.gdn/d/f1a7954a557441989432/
I tried a few ideas with your dataset and summarise what I found below. My calculations and percentages assume that 578kB is a representative image size.
Method 1 - crush - 69%
I just ran pngcrush on one of your images like this:
pngcrush -bruteforce input.png crushed.png
The output size was 400kB, so your image is now only taking 69% of the original space on disk.
Method 2 - rotate and crush - 34%
I rotated your images through 90 degrees and crushed the result:
magick input.png -rotate 90 result.png
pngcrush -bruteforce result.png crushed.png
The rotated crushed image takes 34% of the original space on disk.
Method 3 - rotate and difference - 24%
I rotated your images with ImageMagick, then differenced two adjacent images in the series and saved the result. I then "pngcrushed" that which resulted in 142kB, or 24% of the original space.
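For a differencing approach like this to stay lossless, the difference has to be invertible. ImageMagick's difference compose stores the absolute difference, which discards the sign, so adding it back cannot reliably recover the second image. A minimal Python sketch of one invertible alternative, assuming 8-bit grayscale pixels held as plain bytes (the function names are hypothetical, and loading real PNGs is left out), uses subtraction modulo 256:

```python
def delta_encode(reference: bytes, image: bytes) -> bytes:
    """Store image as its per-pixel difference from reference, modulo 256."""
    return bytes((b - a) % 256 for a, b in zip(reference, image))

def delta_decode(reference: bytes, delta: bytes) -> bytes:
    """Recover the original image from the reference and the stored delta."""
    return bytes((a + d) % 256 for a, d in zip(reference, delta))

def abs_difference(reference: bytes, image: bytes) -> bytes:
    """Absolute difference, shown only to illustrate why it loses information."""
    return bytes(abs(b - a) for a, b in zip(reference, image))

frame1 = bytes([10, 200, 0, 255])
frame2 = bytes([12, 190, 255, 0])

delta = delta_encode(frame1, frame2)
print(delta_decode(frame1, delta) == frame2)  # True: the round trip is exact

# The absolute difference is the same in both directions, so the sign is gone.
print(abs_difference(frame1, frame2) == abs_difference(frame2, frame1))  # True
```

The round trip can then be verified with a hash of the recovered pixels, which is the check the question asks for.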
Method 4 - combined to RGB - 28%
I combined three of your single channel images into a 3-channel RGB image and pngcrushed the result:
magick 000[123].png -combine result.png
pngcrush -bruteforce result.png crushed.png
That resulted in a 490kB file containing 3 images, i.e. 163kB per image or 28% of the original size.
I suspect video with "motion" estimation/detection would yield the best results if you are able to do it losslessly.
You might get some gain out of MNG, which is intended for lossless animation compression. You can use libmng to try it out.

Difference between 'display_aspect_ratio' and 'sample_aspect_ratio' in ffprobe [duplicate]

I am trying to change the dimensions of the video file through FFMPEG.
I want to convert any video file to 480*360.
This is the command that I am using...
ffmpeg -i oldVideo.mp4 -vf scale=480:360 newVideo.mp4
After running this command, a 1280*720 video comes out as 640*360 rather than 480*360.
I have also attached a video; it will take less than a minute for any experts out there. Is there anything wrong?
You can see it here. (In the video, after 20 seconds, jump directly to 1:35; the rest is just processing time.)
UPDATE :
I found the command from this tutorial
Every video has a Sample Aspect Ratio associated with it. A video player will multiply the video width with this SAR to produce the display width. The height remains the same. So, a 640x720 video with a SAR of 2 will be displayed as 1280x720. The ratio of 1280 to 720 i.e. 16:9 is labelled the Display Aspect Ratio.
The scale filter preserves the input's DAR in the output, so that the output does not look distorted. It does this by adjusting the SAR of the output. The remedy is to reset the SAR after scaling.
ffmpeg -i oldVideo.mp4 -vf scale=480:360,setsar=1 newVideo.mp4
Since the DAR may no longer be the same, the output can look distorted. One way to avoid this is by scaling proportionally and then padding with black to achieve target resolution.
ffmpeg -i oldVideo.mp4 -vf "scale=480:360:force_original_aspect_ratio=decrease,pad=480:360:(ow-iw)/2:(oh-ih)/2,setsar=1" newVideo.mp4
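The SAR/DAR arithmetic described above can be checked numerically. This is a small Python sketch using exact fractions; the helper names are my own, and it only models the relationship DAR = (width / height) x SAR, not ffmpeg itself. It reproduces the 640*360 result from the question: scaling 1280*720 (SAR 1) to 480*360 while preserving the 16:9 DAR forces the output SAR to 4:3.

```python
from fractions import Fraction

def display_aspect_ratio(width, height, sar=Fraction(1)):
    """DAR is the storage aspect ratio multiplied by the sample aspect ratio."""
    return Fraction(width, height) * sar

def sar_after_scale(in_w, in_h, in_sar, out_w, out_h):
    """SAR the output needs so that the input's DAR is preserved."""
    dar = display_aspect_ratio(in_w, in_h, in_sar)
    return dar / Fraction(out_w, out_h)

def display_size(width, height, sar):
    """Size a player shows: the width is stretched by the SAR."""
    return int(width * sar), height

sar = sar_after_scale(1280, 720, Fraction(1), 480, 360)
print(sar)                          # 4/3
print(display_size(480, 360, sar))  # (640, 360)
```

With setsar=1 the SAR is reset to 1, so the same 480*360 frame is displayed as 480*360, at the cost of changing the DAR from 16:9 to 4:3.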

Losslessly encode png frames to webm with ffmpeg

I need to convert a directory of frames to webm with absolutely no image compression, just raw images-to-frames. Using ffmpeg version N-82889-g54931fd, this is what I'm at right now.
ffmpeg -framerate 30 -f image2 -i frames/%02d.png -pix_fmt yuva420p -crf 0 output.webm
I was told that the -crf 0 flag was the answer, but the output is still full of compression artifacts. Is there an option to make each frame as close to identical as possible to its corresponding png image frame?
VP8, default encoder for WebM, does not have a lossless mode. Use VP9.
ffmpeg -framerate 30 -i frames/%02d.png -c:v libvpx-vp9 -pix_fmt yuva420p -lossless 1 out.webm
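Note that due to the pixel format conversion (RGB -> YUV, with 4:2:0 chroma subsampling in the case of yuva420p), the output will not be perfectly lossless relative to the source PNGs, as there will be rounding errors and reduced chroma resolution when decoding back to RGB.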

Matlab - Reading and writing the same video results in a bigger size file

I am trying to read and write the same video using the following code:
video = VideoReader('test.mp4');
videoOutput = VideoWriter('testOutput.mp4', 'MPEG-4');
open(videoOutput);
while hasFrame(video)
    writeVideo(videoOutput, readFrame(video, 'native'));
end
close(videoOutput);
However, testOutput.mp4 is almost double in size. For example:
Input video file size: 5.01 MB
Output video file size: 8.15 MB
I use MPEG-4 on VideoWriter because input video is H264 - MPEG4 (Part 10) as well.
Take a look at the Quality property of the VideoWriter object:
Video quality, specified as an integer in the range, [0,100]. Higher quality numbers result in higher video quality and larger file sizes. Lower quality numbers result in lower video quality and smaller file sizes.

To geo-tag a photo, what attributes are required? More than lat/lon?

I am trying to geo-tag photos taken with the iPhone camera. Since I'm not saving the photos to the camera roll, the photos do not have any EXIF data.
So, what elements are required in valid EXIF data? I think I have written the latitude, longitude, and altitude to the EXIF data, but when I export the photo, there doesn't seem to be any EXIF data.
I suspect that for a service like Flickr to display the EXIF data, I need to add more than just the latitude and longitude or some such foolishness.
Anyone have any experience with writing EXIF data?
EDIT #3: Aha, got it now! Thanks for your help. What I had to do to address EDIT #2 was add the make, model, and software.
EDIT #2: Thanks to the first respondent (to whom I will give the correct answer regardless), my photo is now being geo-tagged, but now I have a related question.
The photo gets automatically located when I upload it to Flickr (when I go to map it, it suggests the location), but it doesn't show the nice chart of EXIF data.
This photo is geo-tagged, but there is no nice chart of EXIF data on this page: http://www.flickr.com/photos/33766454@N02/3980488205/.
This one exported from the camera roll shows a More Properties link to show the EXIF data in full: http://www.flickr.com/photos/33766454@N02/3978868900/
Now I'm wondering what I have to do to get Flickr (and also Picasa on my desktop) to display the link to see the EXIF data. When I look at the photo's properties on my Mac too, I don't see any EXIF data, but I know it's there.
EDIT: Based on the advice in the first two answers, I set the suggested EXIF tags and inspected them in EXIFTool. Unfortunately, though the EXIF data shows up in EXIFTool, it doesn't show up when I upload to Picasa or Flickr. Here is the output from EXIFTool. Any idea what I still might be missing?
~/Downloads: exiftool My_Photo_-_10-4-09_5_22_20AM.jpg
ExifTool Version Number : 7.96
File Name : My_Photo_-_10-4-09_5_22_20AM.jpg
Directory : .
File Size : 287 kB
File Modification Date/Time : 2009:10:04 05:22:51-07:00
File Type : JPEG
MIME Type : image/jpeg
JFIF Version : 1.01
Resolution Unit : None
X Resolution : 1
Y Resolution : 1
Exif Byte Order : Big-endian (Motorola, MM)
Orientation : Horizontal (normal)
Color Space : sRGB
Exif Image Width : 319
Exif Image Height : 480
GPS Version ID : 2.2.0.0
GPS Latitude Ref : North
GPS Longitude Ref : West
GPS Altitude Ref : Above Sea Level
Image Width : 319
Image Height : 480
Encoding Process : Baseline DCT, Huffman coding
Bits Per Sample : 8
Color Components : 3
Y Cb Cr Sub Sampling : YCbCr4:4:4 (1 1)
GPS Altitude : 0 m Above Sea Level
GPS Latitude : 37 deg 19' 54.08" N
GPS Longitude : 122 deg 1' 50.63" W
GPS Position : 37 deg 19' 54.08" N, 122 deg 1' 50.63" W
Image Size : 319x480
I've done this to geo-tag photos which Picasa will recognise, for Picasa you need to add the following tags:
GPSVersionID ("0 0 2 2 "), GPSlongituderef ("W" or "E"), GPSlatituderef ("N" or "S") and also GPSAltitudeRef ("Above Sea Level")
Values in brackets are the ones I used. These are in addition to the lat, long and altitude tags. As Brian mentions exiftool is excellent for examining and modifying EXIF tags.
EDIT
Output from exiftool:
ExifTool Version Number : 7.01
File Name : bleatarn.jpg
Directory : .
File Size : 3 MB
File Modification Date/Time : 2008:03:01 12:43:44
File Type : JPEG
MIME Type : image/jpeg
JFIF Version : 1.1
Resolution Unit : None
X Resolution : 1
Y Resolution : 1
Exif Byte Order : Little-endian (Intel)
Software : Picasa 3.0
Exif Version : 0210
Interoperability Index : Unknown ( )
Interoperability Version : 0100
Image Unique ID : 6fda6fa9628b8615d99abc81663c9b01
GPS Version ID : 0.0.2.2
GPS Latitude Ref : North
GPS Longitude Ref : West
GPS Altitude Ref : Above Sea Level
GPS Altitude : 0 m
Caption-Abstract : Blea Tarn
Image Width : 3151
Image Height : 1375
Encoding Process : Baseline DCT, Huffman coding
Bits Per Sample : 8
Color Components : 3
Y Cb Cr Sub Sampling : YCbCr4:2:0 (2 2)
GPS Latitude : 54 deg 25' 44.33" N
GPS Longitude : 3 deg 5' 27.44" W
GPS Position : 54 deg 25' 44.33" N, 3 deg 5' 27.44" W
Image Size : 3151x1375
Only difference I can see is that GPS Version ID is different, and you're Big-endian rather than little-endian (that shouldn't matter, should it?)
This may help. It details how to copy the geotagging data to a photo without geotagged data, and details the EXIF fields used by exiftool.
I would also set GPSMapDatum, as the EXIF standard strongly recommends including this tag. The value would be "WGS-84", assuming your coordinates come from a GPS unit or are based on common web maps or any other current source of location data. (If you're using a different datum, it'd be better to convert it to WGS-84 anyway, since I suspect most viewers just assume that to be the datum.)
If you're not familiar with it, a map datum tells how to interpret the latitude and longitude values. Since the earth is not a perfect sphere, there are many ways the coordinates can be interpreted. Using the wrong interpretation can sometimes have a significant effect on where the marker will end up on a map.
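To tie the tags from these answers together, here is a rough Python sketch that turns signed decimal coordinates into the reference letters and degrees/minutes/seconds values the GPS tags expect. The dictionary keys follow the tag names mentioned above, but the function names are hypothetical and the sketch does not write anything to a file; GPSVersionID is left to the caller because the two exiftool listings above show different values for it.

```python
def to_dms(value):
    """Split decimal degrees into (degrees, minutes, seconds), ignoring sign."""
    # Work in hundredths of an arcsecond so rounding never yields 60 seconds.
    total = round(abs(value) * 360000)
    degrees, rest = divmod(total, 360000)
    minutes, hundredths = divmod(rest, 6000)
    return degrees, minutes, hundredths / 100

def gps_tags(lat, lon, alt=0.0):
    """Build a dict of GPS tag values from signed decimal coordinates."""
    return {
        "GPSLatitudeRef": "N" if lat >= 0 else "S",
        "GPSLatitude": to_dms(lat),
        "GPSLongitudeRef": "E" if lon >= 0 else "W",
        "GPSLongitude": to_dms(lon),
        "GPSAltitudeRef": "Above Sea Level" if alt >= 0 else "Below Sea Level",
        "GPSAltitude": abs(alt),
        "GPSMapDatum": "WGS-84",  # assumes the coordinates are WGS-84
    }

tags = gps_tags(37.331689, -122.030731)
for name, value in tags.items():
    print(name, value)
```

The sample coordinates were chosen so that they come out as 37 deg 19' 54.08" N and 122 deg 1' 50.63" W, the position shown in the question's exiftool output.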