Store orientation to an array - and compare - iPhone

I want to achieve the following:
I want the user to be able to "record" the movement of the iPhone using the gyroscope. And after that, the user should be able to replicate the same movement. I extract the pitch, roll and yaw using:
[self.motionManager startDeviceMotionUpdatesToQueue:[NSOperationQueue currentQueue]
                                        withHandler:^(CMDeviceMotion *motion, NSError *error)
{
    CMAttitude *attitude = motion.attitude;
    NSLog(@"pitch: %f, roll: %f, yaw: %f", attitude.pitch, attitude.roll, attitude.yaw);
}];
I'm thinking that I could store these values into an array if the user is in record mode. And when the user tries to replicate that movement, I could compare the replicated movement's array to the recorded one. The thing is, how can I compare the two arrays in a smart way? They will never have exactly the same values, but they can be somewhat the same.
Am I at all on the right track here?
UPDATE: I think that maybe Ali's answer about using DTW could be the right way for me here. But I'm not that smart (apparently), so if anyone could help me out with the first steps of comparing two arrays I would be a happy man!
Thanks!

Try dynamic time warping. Here is an illustrative example with 1D arrays. In the database we already have the following 2 arrays:
Array 1: [5, 3, 1]
Array 2: [1, 3, 5, 8, 8]
We measured [2, 4, 6, 7]. Which array is most similar to the newly measured one? Obviously, the second array is similar to the newly measured one and the first is not.
Let's compute the cost matrices according to this paper, subsection 2.1:
D(i,j)=Dist(i,j)+MIN(D(i-1,j),D(i,j-1),D(i-1,j-1))
Here D(i,j) is the (i,j) element of the cost matrix, see below. Check Figure 3 of that paper to see how this recurrence relation is applied. In short: columns are computed first, starting from D(1,1); D(0,*) and D(*,0) are left out of the MIN. If we are comparing arrays A and B, then Dist(i,j) is the distance between A[i] and B[j]. I simply used ABS(A[i]-B[j]). The cost matrices for this example:
For Array 1 we have 13 as score, for Array 2 we have 5. The lower score wins, so the most similar array is Array 2. The best warping path is marked gray.
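If it helps to see the recurrence in code, here is a bare-bones C version of the 1D case (the function name and the variable-length cost matrix are my own choices, not from the paper). It reproduces the scores above: 13 for Array 1 and 5 for Array 2.
#include <math.h>

// DTW score for aligning a[0..n-1] with b[0..m-1] using
// D(i,j) = Dist(i,j) + MIN(D(i-1,j), D(i,j-1), D(i-1,j-1)).
static double dtwScore(const double *a, int n, const double *b, int m)
{
    double D[n][m];                       // cost matrix (C99 variable-length array)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < m; j++) {
            double dist = fabs(a[i] - b[j]);
            double best;
            if (i == 0 && j == 0)  best = 0.0;             // D(1,1) = Dist(1,1)
            else if (i == 0)       best = D[0][j-1];       // first row
            else if (j == 0)       best = D[i-1][0];       // first column
            else best = fmin(D[i-1][j-1], fmin(D[i-1][j], D[i][j-1]));
            D[i][j] = dist + best;
        }
    }
    return D[n-1][m-1];                   // lower score = more similar
}

// Example from above:
// double measured[] = {2, 4, 6, 7};
// double array1[]   = {5, 3, 1};
// double array2[]   = {1, 3, 5, 8, 8};
// dtwScore(measured, 4, array1, 3) == 13, dtwScore(measured, 4, array2, 5) == 5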
This is only a sketch of DTW. There are a number of issues you have to address in a real-world application. For example, using an offset instead of fixed ending points, or defining measures of fit: see this paper, page 363, point 5 (boundary conditions), and page 364. The above linked paper has further details too.
I just noticed you are using yaw, pitch and roll. Simply put: don't, and there is another reason not to as well. Can you use the accelerometer data instead? "An accelerometer is a direct measurement of orientation" (from the DCM manuscript) and that is what you need. And as for tc's question, does the orientation relative to North matter? I guess not.
It is far easier to compare the acceleration vectors than orientations (Euler angles, rotation matrices, quaternions), as tc pointed out. If you are using acceleration data, you have 3-dimensional vectors at each time point, the (x,y,z) coordinates. I would simply compute
Dist(i,j)=SQRT((A[i][X]-B[j][X])^2+(A[i][Y]-B[j][Y])^2+(A[i][Z]-B[j][Z])^2),
that is, the Euclidean distance between the two points.
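As a sketch, assuming each acceleration sample is stored as an (x, y, z) triple, that distance is a drop-in replacement for the fabs(a[i] - b[j]) used in the 1D code above:
typedef struct { double x, y, z; } Sample3D;

// Euclidean distance between two acceleration samples.
static double dist3D(Sample3D a, Sample3D b)
{
    double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return sqrt(dx*dx + dy*dy + dz*dz);
}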

I think Ali's approach is in general a good way to go, but there is a general problem called gimbal lock (see also the SO discussions on this topic) when using Euler angles, i.e. pitch, roll and yaw. You will run into it when you record a more complex movement lasting longer than a few ticks, leading to large angle deltas in different angular directions.
In a nutshell that means that you will have more than one mathematical representation for the same position, depending only on the order of movements you made to get there - and a loss of information on the other side. Consider an airplane flying up in the air from left to right. The X axis runs from left to right, the Y axis points up into the air. The following two movement sequences will lead to the same end position, although you get there by totally different ways:
Sequence A:
Rotation around yaw +90°
Rotation around pitch +90°
Sequence B:
Rotation around pitch +90°
Rotation around roll +90°
In both cases your airplane points down to the ground and you can see its bottom from your position.
The only solution to this is to avoid Euler angles and thus make things more complicated. Quaternions are the best way to deal with this but it took a while (for me) to get an idea of this pretty abstract representation. OK, this answer doesn't take you any step further regarding your original problem, but it might help you avoiding waste of time. Maybe you can do some conceptual changes to set up your idea.
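If you do end up with quaternions, one possible per-sample distance for the record/replay comparison is the angle between two CMAttitude quaternions (CMAttitude exposes a quaternion property). This is only a sketch and not necessarily the right metric for your gesture:
#import <CoreMotion/CoreMotion.h>
#include <math.h>

// Angle in radians needed to rotate one orientation into the other.
// CMAttitude.quaternion delivers unit quaternions, which this assumes.
static double attitudeDistance(CMQuaternion qa, CMQuaternion qb)
{
    double dot = qa.w*qb.w + qa.x*qb.x + qa.y*qb.y + qa.z*qb.z;
    dot = fabs(dot);                 // q and -q describe the same rotation
    if (dot > 1.0) dot = 1.0;        // guard against rounding errors
    return 2.0 * acos(dot);          // 0 = identical orientation, pi = opposite
}
You could feed a distance like this into the DTW comparison from Ali's answer in place of the 1D distance.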
Kay

Related

Hashing a graph to find duplicates (including rotated and reflected versions)

I am making a game that involves solving a path through graphs. Depending on the size of the graph this can take a little while so I want to cache my results.
This has me looking for an algorithm to hash a graph to find duplicates.
This is straightforward for exact copies of a graph; I simply use the node positions relative to the top corner. It becomes quite a bit more complicated for rotated or even reflected graphs. I suspect this isn't a new problem, but I'm unsure what the terminology for it is.
My specific case is on a grid, so a node (if present) will always be connected to its four neighbors, north, south, east and west. In my current implementation each node stores an array of its adjacent nodes.
Suggestions for further reading or even complete algorithms are much appreciated.
My current hashing implementation starts at the first found node in the graph, which depends on how I iterate over the playfield, then notes the position of all nodes relative to it. The base graph will have a hash that might be something like: 0:1,0:2,1:2,1:3,-1:1,
I suggest you do this:
Make a function to generate a hash for any graph, position-independent. It sounds like you already have this.
When you first generate the pathfinding solution for a graph, cache it by the hash for that graph...
...Then also generate the 7 other unique forms of that graph (rotated 90deg; rotated 270deg; flipped x; flipped y; flipped x & y; flipped along one diagonal axis; flipped along the other diagonal axis). You can of course generate these using simple vector/matrix transformations. For each of these 7 transformed graphs, you also generate that graph's hash, and cache the same pathfinding solution (which you first apply the same transform to, so the solution maps appropriately to the new graph configuration).
You're done. Later your code will look up the pathfinding solution for a graph, and even if it's an alternate (rotated, flipped) form of the graph you found the earlier solution for, the cache already contains the correct solution.
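As a rough sketch of step 3, assuming the nodes are stored as (x, y) grid coordinates and your existing hash function accepts such a list (the struct, table and function names here are made up for illustration), the 8 symmetries can be applied as 2x2 integer matrices:
typedef struct { int x, y; } GridPoint;

// The 8 symmetries of the square grid (4 rotations x optional reflection),
// as 2x2 integer matrices applied to each node coordinate. Index 0 is the
// original graph, the other 7 are the transformed forms described above.
static const int kTransforms[8][4] = {
    { 1, 0,  0, 1},   // identity
    { 0,-1,  1, 0},   // rotate 90 deg
    {-1, 0,  0,-1},   // rotate 180 deg (= flipped x & y)
    { 0, 1, -1, 0},   // rotate 270 deg
    {-1, 0,  0, 1},   // flip x
    { 1, 0,  0,-1},   // flip y
    { 0, 1,  1, 0},   // flip along one diagonal
    { 0,-1, -1, 0},   // flip along the other diagonal
};

// Apply transform t to every node. Re-anchoring the result to the top
// corner is left to the existing hash function, which already does that.
static void applyTransform(const GridPoint *nodes, int count, int t, GridPoint *out)
{
    for (int i = 0; i < count; i++) {
        out[i].x = kTransforms[t][0] * nodes[i].x + kTransforms[t][1] * nodes[i].y;
        out[i].y = kTransforms[t][2] * nodes[i].x + kTransforms[t][3] * nodes[i].y;
    }
}
Each transformed node list then goes through the same hash function, and all 8 hashes are cached against the (correspondingly transformed) pathfinding solution.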
I spent some time this morning thinking about this and I think this is probably the most optimal solution. But I'll share the other over-analyzed versions of the solution that I was also thinking about...
I was considering the fact that what you really needed was a function that would take a graph G, and return the "canonical version" of G (which I'll call G'), AND the transform matrix required to convert G to G'. (It seemed like you would need the transform so you could apply it to the pathfinding data and get the correct path for G, since you would have just stored the pathfinding data for G'.) You could, of course, look up pathfinding data for G', apply the transform matrix to it, and have your pathfinding solution.
The problem is that I don't think there's any unambiguous and performant way to determine a "canonical version" of G, because it means you have to recognize all 8 variants of G and always pick the same one as G' based on some criteria. I thought I could do something clever by looking at each axis of the graph, counting the number of points along each row/column in that axis, and then rotating/flipping to put the more imbalanced half of the axis always in the top-or-left... in other words, if you pass in "d", "q", "b", "d", "p", etc. shapes, you would always get back the "p" shape (where the imbalance is towards the top-left). This would have the nice property that it should recognize when the graph was symmetrical along a given axis, and not bother to distinguish between the flipped versions on that axis, since they were the same.
So basically I just took the row-by-row/column-by-column point counts, counting the points in each half of the shape, and then rotating/flipping until the count is higher in the top-left. (Note that it doesn't matter that the count would sometimes be the same for different shapes, because all the function was concerned with was transforming the shape into a single canonical version out of all the different possible permutations.)
Where it fell down for me was deciding which axis was which in the canonical case - basically handling the case of whether to invert along the diagonal axis. Once again, for shapes that are symmetrical about a diagonal axis, the function should recognize this and not care; for any other case, it should have a criteria for saying "the axis of the shape that has the property [???] is, in the canonical version, the x axis of the shape, while the other axis will be the y axis". And without this kind of criteria, you can't distinguish two graphs that are flipped about the diagonal axis (e.g. "p" versus "σ"/sigma). The criteria I was trying to use was again "imbalance", but this turned out to be harder and harder to determine, at least the way I was approaching it. (Maybe I should have just applied the technique I was using for the x/y axes to the diagonal axes? I haven't thought through how that would work.) If you wanted to go with such a solution, you'd either need to solve this problem I failed to solve, or else give up on worrying about treating versions that are flipped about the diagonal axis as equivalent.
Although I was trying to focus on solutions that just involved calculating simple sums, I realized that even this kind of summing is going to end up being somewhat expensive to do (especially on large graphs) at runtime in pathfinding code (which needs to be as performant as possible, and which is the real point of your problem). In other words I realized that we were probably both overthinking it. You're much better off just taking a slight hit on the initial caching side and then having lightning-fast lookups based on the graph's position-independent hash, which also seems like a pretty foolproof solution as well.
Based on the twitter conversation, let me rephrase the problem (I hope I got it right):
How to compare graphs (planar, on a grid) that are treated as invariant under 90deg rotations and reflection. Bonus points if it uses hashes.
I don't have a full answer for you, but a few ideas that might be helpful:
Divide the problem into subproblems that are independently solvable. That would give:
1. How to compare the graphs given the invariance conditions
2. How to transform them into a canonical basis
3. How to hash this canonical basis subject to tradeoffs (speed, size, collisions, ...)
You could try to solve 1 and 2 in a single step. A naive geometric approach could be as follows:
For rotation invariance, you could try to count the edges in each direction and rotate the graph so that the major direction always points to the right. If there is no main direction, you could see the graph as a point cloud of its vertices and use eigenvectors and Principal Component Analysis (PCA) to obtain the main direction and rotate it accordingly.
I don't have a smart solution for the reflection problem. My brute force way would be to just create the reflected graph all the time. Say you have a graph g and the reflected graph r(g). If you want to know if some other graph h == g you have to answer h == g || h == r(g).
Now onto the hashing:
For the hashing you probably have to trade off speed, size and collisions. If you just use the string of edges, you are high on speed and size and low on collisions. If you take this string and apply some generic string hasher to it, you get a different set of tradeoffs.
If you use a short hash, with more frequent collisions, you can achieve a rather small cost for comparing non-matching graphs. The cost for matching graphs is a bit higher then, as you have to do a full comparison to see if they actually match.
Hope this makes some kind of sense...
best, Simon
update: another thought on the rotation problem if the edges don't give a clear winner: Compute the center of mass of the vertices and see to which side of the center of the bounding box it falls. Rotate accordingly.

Calculating Lean Angle with Core Motion

I have a record session in my application. When the user starts a record session, I start collecting data from the device's CMMotionManager object and store it in Core Data to process and present later. The data I'm collecting includes GPS data, accelerometer data and gyro data. The frequency of the data is 10 Hz.
Currently I'm struggling to calculate the lean angle of the device from the motion data. It is possible to calculate which side of the device faces the ground by using the gravity data, but I want to calculate the right or left angle between the user and the ground, regardless of travel direction.
This problem requires some linear algebra knowledge to solve. For example, at one point I must calculate the equation of a 3D line on a calculated plane. I have been working on this for a day and it's getting more complex. I'm not good at math at all. Some math examples related to the problem would be appreciated too.
It depends on what you want to do with the collected data and in what ways the user will move around with that recording iPhone in her/his pocket. The reason is that Euler angles are neither a safe nor a unique way to express a rotation. Consider a situation where the user puts the phone upright into his jeans' back pocket and then turns left around 90°. Because CMAttitude is related to a device lying flat on the table, you have two subsequent rotations for (pitch=x, roll=y, yaw=z) according to this picture:
pitch +90° for getting the phone upright => (90, 0, 0)
roll +90° for turning left => (90, 90, 0)
But you can get the same position by:
yaw +90° for turning the phone left (0, 0, 90)
pitch -90° for making the phone upright (-90, 0, 90)
You see two different representations, (90, 90, 0) and (-90, 0, 90), for getting to the same rotation, and there are more of them. So you press the Start button, do some fancy rotations to put the phone into the pocket, and you are in trouble because you can't rely on Euler angles when doing more complex motions (see gimbal lock for more headaches on this ;-)
Now the good news: you are right, linear algebra will do the job. What you can do is force your users to always put the phone in the same position, e.g. fixed upright in the right back pocket, and calculate the angle(s) relative to the ground by building the dot product of the gravity vector g = (x, y, z) from CMDeviceMotion and the position vector p, which is the -Y axis (0, -1, 0) in the upright position:
g • p = x*0 + y*(-1) + z*0 = -y = ||g|| * 1 * cos(alpha)
=> alpha = arccos(-y/9.81) as the total angle. Note that the gravitational acceleration g is constant at about 9.81.
To get the left-right lean angle and the forward-back angle we use the tangent:
alphaLR = arctan (x/y)
alphaFB = arctan (z/y)
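A minimal sketch of these formulas with Core Motion. Note that CMDeviceMotion.gravity is reported in multiples of g, so I divide by the measured magnitude rather than by 9.81; the function name is mine, and it assumes the fixed upright starting position described above (so y is not 0):
#import <CoreMotion/CoreMotion.h>
#include <math.h>

// Lean angles for a phone that is worn upright, i.e. with gravity along
// the device's -Y axis (0, -1, 0) in the reference position.
static void leanAngles(CMDeviceMotion *motion,
                       double *alphaTotal, double *alphaLR, double *alphaFB)
{
    CMAcceleration g = motion.gravity;                  // in multiples of g
    double norm = sqrt(g.x*g.x + g.y*g.y + g.z*g.z);    // close to 1.0
    *alphaTotal = acos(-g.y / norm);                    // total angle to the upright position
    *alphaLR    = atan(g.x / g.y);                      // left-right lean, as above
    *alphaFB    = atan(g.z / g.y);                      // forward-back lean, as above
}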
[UPDATE:]
If you can't rely on having the phone in a predefined position like (0, -1, 0) in the equations above, you can only calculate the total angle, but not the specific angles alphaLR and alphaFB. The reason is that you only have one axis of the new coordinate system where you need two of them. The new Y axis y' will then be defined as the average gravity vector, but you don't know your new X axis, because every vector perpendicular to y' would be valid.
So you would have to provide further information, like letting the users walk a longer distance in one direction without deviating and using GPS and magnetometer data to get the 2nd axis z'. Sounds pretty error-prone in practice.
The total angle is no problem as we can replace (0, -1, 0) with the average gravity vector (pX, pY, pZ):
g • p = x*pX + y*pY + z*pZ = ||g|| * ||p|| * cos(alpha) = ||g||^2 * cos(alpha)
alpha = arccos((x*pX + y*pY + z*pZ) / 9.81^2)
Two more things to bear in mind:
Different persons wear different trousers with different pockets. So the gravity vector will be different even for the same person wearing other clothes, and you might need some kind of normalisation
CMMotionManager does not work in the background i.e. the users must not push the standby button
If I understand your question, I think you are interested in getting the attitude of your device. You can do this using the attitude property of the CMDeviceMotion object that you get from the deviceMotion property of the CMMotionManager object.
There are two different angles in the CMAttitude class that you might be interested in: roll and pitch. If you imagine your device as an airplane with the propeller at the top (where the headphone jack is), pitch is the angle the plane/device would make with the ground if the plane were in a climb or dive. Meanwhile, roll is the angle that the "wings" would make with the ground if the plane were banking or in mid barrel roll.
(BTW, there is a third angle called yaw that I think is not relevant for your question.)
The angles will be given in radians, but it's easy enough to convert them to degrees if that's what you want (by multiplying by 180 and then dividing by pi).
Assuming I understand what you want, the good news is that you may not need to understand any linear algebra to capture and use these angles. (If I'm missing something, please clarify and I'd be happy to help further.)
UPDATE (based on comments):
The attitude values in the CMAttitude object are relative to the ground (i.e., the default reference frame has the Z axis as vertical, pointing in the opposite direction to gravity), so you don't have to worry about cancelling out gravity. So, for example, if you lay your device on a flat table top and then roll it up onto its side, the roll property of the CMAttitude object will change from 0 to plus or minus 90 degrees (± 0.5 pi radians), depending on which side you roll it onto. Meanwhile, if you start it lying flat and then gradually stand it up on its end, the same will happen to the pitch property.
While you can use the pitch, roll, and yaw angles directly if you want, you can also set a different reference frame (e.g., a different direction for "up"). To do this, just capture the attitude in that orientation during a "calibration" step and then use CMAttitude's multiplyByInverseOfAttitude: method to transform your attitude data to the new reference frame.
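A short sketch of that calibration step. It assumes device-motion updates are already running on the motion manager; the class name and properties are placeholders:
#import <CoreMotion/CoreMotion.h>

@interface LeanRecorder : NSObject
@property (nonatomic, strong) CMMotionManager *motionManager;
@property (nonatomic, strong) CMAttitude *referenceAttitude;   // captured during calibration
@end

@implementation LeanRecorder

// Call this while the device is held in the known starting orientation.
- (void)calibrate
{
    self.referenceAttitude = [self.motionManager.deviceMotion.attitude copy];
}

// Current attitude expressed relative to the calibrated reference frame.
- (CMAttitude *)relativeAttitude
{
    CMAttitude *attitude = [self.motionManager.deviceMotion.attitude copy];
    if (self.referenceAttitude) {
        [attitude multiplyByInverseOfAttitude:self.referenceAttitude];
    }
    return attitude;   // pitch/roll/yaw are now relative to the calibration pose
}

@end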
Even though your question only mentioned capturing the "lean angle" (with the ground), you will probably want to capture at least 2 of the 3 attitude angles (e.g., pitch and either roll or yaw, depending on what they are doing), potentially all three, if the device is going to be in a person's pocket. (The device could rotate in the pocket in various ways if the pocket is baggy, for example.) For the most part, though, I think you will probably be able to rely on just two of the three (unless you see radical shifts in yaw throughout the course of a recording session). So, for example, in my jeans pocket, the phone is usually nearly vertical. Thus, for me, pitch would vary a whole bunch as I, say, walk, sit or run. Roll would vary whenever I change the direction I'm facing. Meanwhile, yaw would not vary much at all (unless I do cartwheels, which I can't!). So yaw can probably be ignored for me.
To summarize the main point: to use these attitude angles, you don't need to do any linear algebra, nor worry about gravity (although you may want to use this for other purposes, of course).
UPDATE 2 (based on Kay's new post):
Kay just replied and showed how to use gravity and linear algebra to make sure your angles are unique. (And, btw, I think you should give the bounty to that post, fwiw.)
Depending on what you want to do, you may want to use this math. You would want to use the linear algebra and gravity if you need a standardized way of "talking about" and/or comparing attitudes over the course of your recording session. If you just want to visualize them, you can probably still get away with not using the increased complexity. (For example, visualizing (pitch=90, roll=0, yaw=0) should be the same as visualizing (pitch=0, roll=90, yaw=90).) In my approach above, while you could have multiple ways of referring to the "same" attitude, none of them is actually wrong, per se. They will still give you the angles relative to the ground.
But the fact that the gyroscope can switch from one valid description of an attitude to another means that what I wrote above about getting away with only 2 of the 3 components needs to be corrected: because of this, you will need to capture all three components, no matter what. Sorry.

iPhone iOS is it possible to create a rangefinder with 2 laser pointers and an iPhone?

I'm working on an iPhone robot that would be moving around. One of the challenges is estimating distance to objects - I don't want the robot to run into things. I saw some very expensive (~$1000) laser rangefinders, and would like to emulate one using the iPhone.
I have one or two camera feeds and two laser pointers. The laser pointers are mounted about 6 inches apart, at an angle. The angle of the lasers in relation to the cameras is known. The angle of the cameras to each other is known.
The lasers are pointing ahead of cameras, creating 2 dots on a camera feed. Is it possible to estimate the distance to the dots by looking at the distance between the dots in a camera image?
The lasers form a trapezoid from the laser mount to the wall:

      / wall \
     /        \
    /laser mount\
As the laser mount gets closer to the wall, the points should be moving further away from each other.
Is what I'm talking about feasible? Has anyone done something like that?
Would I need one or two cameras for such calculation?
If you just don't want to run into things, rather than have an accurate idea of the distance to them, then you could go "dambusters" on it and just detect when the two points become one - this would be at a known distance from the object.
For calculation, it is probably cheaper to have four lasers instead, in two pairs, each pair at a different angle, one pair above the other. Then a comparison between the relative differences of the dots would probably let you work out a reasonably accurate distance. Math Overflow for that one, though.
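For the original two-laser setup, an idealized pinhole-model version of that math might look like the sketch below. It assumes the camera sits centred between the lasers, both beams are angled inward by the same known angle, and the wall is roughly perpendicular; all names and parameters are mine, for illustration only.
#include <math.h>

// Estimated distance (metres) to the wall from the pixel separation of the
// two laser dots in a single camera image.
//
//   baseline     physical distance between the two lasers (m), ~0.15 for 6 inches
//   inwardAngle  how far each beam is angled toward the centre (radians)
//   pixelSep     measured separation of the two dots in the image (pixels)
//   focalPx      camera focal length expressed in pixels
//
// Dot separation on the wall is s = baseline - 2*d*tan(inwardAngle), and it
// projects to pixelSep = focalPx * s / d; solving for d gives the line below.
static double estimateDistance(double baseline, double inwardAngle,
                               double pixelSep, double focalPx)
{
    return baseline / (pixelSep / focalPx + 2.0 * tan(inwardAngle));
}
In line with the answers here, expect the result to be crude: lens distortion, dot-detection noise and non-perpendicular surfaces all feed straight into the estimate.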
In theory, yes, something like this can work. Google "light striping" or "structured light depth measurement" for some good discussions of using this sort of idea on a larger scale.
In practice, your measurements are likely to be crude. There are a number of factors to consider: the camera intrinsic parameters (focal length, etc) and extrinsic parameters will affect how the dots appear in the image frame.
With only two sample points (note that structured light methods use lines, etc), the environment will present difficulties for distance measurement. Surfaces that are directly perpendicular to the floor (and direction of travel) can be handled reasonably well. Slopes and off-angle walls may be detectable, but you will find many situations that will give ambiguous or incorrect distance measures.

Calibrating code to iPhone accelerometer and gyro data

I'm working on the concept for an iPhone app that will require precise calibration to the iPhone's accelerometer and gyro data. I will have to recognize specific movements that I would eventually like to use to execute code (think shake-to-shuffle, or shake-to-undo).
Is there a good way of doing this already? or something you can come up with? Perhaps some way to generate a time/value graph of the movement data as it is being captured?
Movement data being captured - see the accelerometer graph sample app, which shows the data in real time: http://developer.apple.com/library/ios/#samplecode/AccelerometerGraph/Introduction/Intro.html
The data is pretty noisy - the gyro and accelerometer aren't good enough right now to be able to track where the phone is in local 3d space, for example. The rotation, however, is very solid, and the orientation of the device can be pretty accurately tracked. You may have the best results making gestures out of rotation data instead of movement along an axis. Or, basic direction like shakes along an axis will work as Jacob Jennings said.
A good starting point for accelerometer gesture recognition is this tutorial by Kevin Bomberry at AblePear:
http://blog.ablepear.com/2010/02/iphone-sdk-shake-rattle-roll.html
He sets a blanket threshold for the absolute value of acceleration on any axis. I would generate an 'event' for the axis that had the highest acceleration when the threshold is broken (Z POSITIVE, X NEGATIVE, etc.), and push these onto an 'event history' queue. At the end of each didAccelerate call, evaluate the queue for patterns that match a gesture, for example:
X POSITIVE, X NEGATIVE, X POSITIVE, X NEGATIVE might be considered a 'shake' along that axis. This should provide a couple different gesture commands.
See the following for a simple queue category addition to NSMutableArray:
How do I make and use a Queue in Objective-C?
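A rough sketch of the threshold / event-queue idea, using the old UIAccelerometerDelegate callback the tutorial is based on. The threshold value, the event strings and the eventHistory property are placeholders:
#import <UIKit/UIKit.h>
#include <math.h>

static const UIAccelerationValue kThreshold = 1.5;   // placeholder, in g

// In the class acting as UIAccelerometerDelegate, with an
// NSMutableArray *eventHistory property as the 'event history' queue.
- (void)accelerometer:(UIAccelerometer *)accelerometer
        didAccelerate:(UIAcceleration *)acceleration
{
    // 1. Generate an event for the axis with the highest over-threshold value.
    UIAccelerationValue ax = fabs(acceleration.x);
    UIAccelerationValue ay = fabs(acceleration.y);
    UIAccelerationValue az = fabs(acceleration.z);
    NSString *event = nil;
    if (ax > kThreshold && ax >= ay && ax >= az) {
        event = acceleration.x > 0 ? @"X POSITIVE" : @"X NEGATIVE";
    } else if (ay > kThreshold && ay >= az) {
        event = acceleration.y > 0 ? @"Y POSITIVE" : @"Y NEGATIVE";
    } else if (az > kThreshold) {
        event = acceleration.z > 0 ? @"Z POSITIVE" : @"Z NEGATIVE";
    }
    // Only record a change, so holding the phone tilted doesn't flood the queue.
    if (event && ![event isEqualToString:self.eventHistory.lastObject]) {
        [self.eventHistory addObject:event];
        if (self.eventHistory.count > 8) {
            [self.eventHistory removeObjectAtIndex:0];
        }
    }

    // 2. Evaluate the queue for patterns that match a gesture.
    NSArray *shakeX = @[@"X POSITIVE", @"X NEGATIVE", @"X POSITIVE", @"X NEGATIVE"];
    if (self.eventHistory.count >= shakeX.count) {
        NSRange tailRange = NSMakeRange(self.eventHistory.count - shakeX.count, shakeX.count);
        if ([[self.eventHistory subarrayWithRange:tailRange] isEqualToArray:shakeX]) {
            [self.eventHistory removeAllObjects];
            // ...trigger the action for a shake along X here
        }
    }
}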

How to determine absolute orientation

I have a xyz accelerometer and magnetometer. Now I want to determine the orientation of the device using both. The problem I see is that depending on the device orientation, I'd need to use the sensors in different order.
Let me give an example. If I have the device facing me then changes in both the roll and pitch can be determined with the accelerometer. For yaw I use the magnetometer.
But if I put the device horizontally (i.e. turn it 90°, facing the ceiling), then any change in the up vector (now horizontal) isn't noticed, as the accelerometer doesn't detect any change. This can now be detected with the magnetometer.
So the question is, how to determine when to use one or the other. Is this enough with both sensors or do I need something else?
Thanks
The key is to use the cross product of the two vectors, gravity and magnetometer. The cross product gives a new vector perpendicular to them both. That means it is horizontal (perpendicular to down) and 90 degrees away from north. Now you have three vectors which define orientation. It is a little ugly because they are not all perpendicular to each other, but that is easy to fix: if you then cross this new vector back with the gravity vector, that gives a third vector perpendicular to both the gravity vector and the magnet-plane vector. Now you have three perpendicular vectors which define your 3D orientation coordinate system. The original accelerometer (gravity) vector defines Z (up/down) and the two cross-product vectors define the east/west and north/south components of the orientation.
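A compact sketch of that construction in plain C (the struct and function names are mine; whether the third axis comes out as east or west depends on your sign conventions, so treat the labels loosely):
#include <math.h>

typedef struct { double x, y, z; } Vec3;

static Vec3 cross(Vec3 a, Vec3 b)
{
    Vec3 c = { a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x };
    return c;
}

static Vec3 normalize(Vec3 v)
{
    double n = sqrt(v.x*v.x + v.y*v.y + v.z*v.z);
    Vec3 r = { v.x/n, v.y/n, v.z/n };
    return r;
}

// Builds an orthonormal basis from the gravity and magnetometer readings,
// following the cross-product construction described above.
static void orientationBasis(Vec3 gravity, Vec3 magnet,
                             Vec3 *east, Vec3 *north, Vec3 *up)
{
    *up    = normalize(gravity);             // Z: up/down
    *east  = normalize(cross(magnet, *up));  // horizontal, 90 degrees from north
    *north = cross(*up, *east);              // completes the perpendicular set
}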
Here is some documentation that walks through this project. As is clear from other answers, the math can be tricky.
http://www.freescale.com/files/sensors/doc/app_note/AN4248.pdf
I think the question "how to determine when to use one or the other" is misguided. You should always use both sensors for orientation. There are cases where one of them is useless. However, these are edge cases.
If I understand you correctly, you'll need something to detect pitch (tilting) and orientation according to the cardinal points (North, East, South and West).
The pitch can be read from the accelerometer.
The orientation according to the cardinal points can be read from a compass.
Combining the output from these two sensors correctly with the right math in your software will most likely give you the absolute orientation.
I think it's doable that way.
Good luck.
In the event you still need absolute orientation, you can check this breakout board from Adafruit: https://www.adafruit.com/products/2472. The nice thing about this board is that it has an ARM Cortex-M0 processor to do all of the calculations for you.