Users can sketch in my app using a very simple tool (move mouse while holding LMB). This results in a series of mousemove events and I record the cursor location at each event. The resulting polyline curve tends to be rather dense, with recorded points almost every other pixel. I'd like to smooth this pixelated polyline, but I don't want to smooth intended kinks. So how do I figure out where the kinks are?
The image shows the recorded trail (red pixels) and the 'implied' shape as a human would understand it. People tend to slow down near corners, so there is usually even more noise here than on the straight bits.
Polyline tracker http://www.freeimagehosting.net/uploads/c83c6b462a.png
What you're describing may be related to gesture recognition techniques, so you could search on them for ideas.
The obvious approach is to apply a curve fit, but that will have the effect of smoothing away all the interesting details and kinks. Another approach suggested is to look at speeds and accelerations, but that can get hairy (direction changes can be very fast, or very slow and deliberate).
A fairly basic but effective approach is to simplify the samples directly into a polyline.
For example, work your way through the samples, say from sample 1 to sample 4, and check whether all 4 samples lie within a reasonable error of the straight line between 1 and 4. If they do, extend this to points 1..5 and repeat until the straight line from the start point to the end point no longer provides a reasonable approximation to the curve defined by those samples. Create a line segment up to the previous sample point and start accumulating a new line segment.
You have to be careful about your thresholds when the samples are too close to each other, so you might want to adjust the sensitivity when regarding samples fewer than 4-5 pixels away from each other.
This will give you a set of straight lines that will follow the original path fairly accurately.
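The segment-growing procedure described above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function names and the `tolerance` parameter (in pixels) are made up for illustration, and the error measure is assumed to be the perpendicular distance from each sample to the chord.

```python
import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the infinite line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        # Degenerate chord: fall back to the distance to the single point.
        return math.hypot(px - ax, py - ay)
    return abs(dx * (py - ay) - dy * (px - ax)) / length

def simplify_greedy(samples, tolerance=1.5):
    """Grow each segment until an intermediate sample strays more than
    `tolerance` (an assumed threshold) from the chord, then start a new one."""
    if len(samples) < 3:
        return list(samples)
    result = [samples[0]]
    start, end = 0, 2
    while end < len(samples):
        chord_ok = all(
            point_line_distance(samples[i], samples[start], samples[end]) <= tolerance
            for i in range(start + 1, end)
        )
        if chord_ok:
            end += 1
        else:
            # The chord to `end` failed, so close the segment at the previous sample.
            start = end - 1
            result.append(samples[start])
            end = start + 2
    result.append(samples[-1])
    return result
```

A larger tolerance gives fewer, longer segments but lets the detected corner drift a pixel or two past the real one, which is the threshold sensitivity mentioned above.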
If you require additional smoothing, or want to create a scalable vector graphic, you can then curve-fit from the polyline. First, identify the kinks (the places in your polyline where the angle between one line and the next is sharp, e.g. anything over 140 degrees is considered a smooth curve, anything less than that is considered a kink) and break the polyline at those discontinuities. Then curve-fit each of these sub-sections of the original gesture to smooth them. This will have the effect of smoothing the smooth stuff and sharpening the kinks. (You could go further and insert small smooth corner fillets instead of these sharp joints to reduce the sharpness of the joins.)
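One way to break a polyline at its kinks might look like this. It is a sketch under assumptions: the angle is taken as the interior angle at each vertex (180 degrees for a straight continuation), the 140 degree threshold is the example value from the text, and the function names are hypothetical.

```python
import math

def interior_angle(a, b, c):
    """Angle at vertex b in degrees: 180 means straight on, small means a sharp turn."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0.0 or n2 == 0.0:
        return 180.0  # duplicate points: treat as no turn
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))

def split_at_kinks(polyline, kink_below_deg=140.0):
    """Split a polyline into pieces; each kink vertex ends one piece and starts the next."""
    if len(polyline) < 2:
        return [list(polyline)]
    pieces = []
    current = [polyline[0]]
    for i in range(1, len(polyline) - 1):
        current.append(polyline[i])
        if interior_angle(polyline[i - 1], polyline[i], polyline[i + 1]) < kink_below_deg:
            pieces.append(current)
            current = [polyline[i]]
    current.append(polyline[-1])
    pieces.append(current)
    return pieces
```

Each returned piece can then be curve-fitted on its own, so the shared kink vertex stays sharp.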
Brute force, but it may just achieve what you want.
Rather than trying to do this from the resultant data, have you considered looking at the timing of the data as it comes in? If the mouse stops or slows noticeably, you use the trend since the last 'kink' (the last time the mouse slowed) to establish the direction of travel. If the user goes off in a new direction, you call it a kink; otherwise, you ignore the current slowing trend and start waiting for the next one.
Well, one way would be to use a true curve-fitting algorithm. Generate a bezier curve (with exact endpoints, using Catmull-Rom or something similar), then optimize & recursively subdivide (using distance from actual line points as a cost metric). This may be too complicated for your use-case, though.
Record the order the pixels are drawn in. Then, compute the slope between pixels that are "near" but not "close". I'm guessing a graph of the slope between pixel(i) and pixel(i+7) might exhibit easily identifiable "jumps" around kinks in the curve.
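A rough sketch of that idea follows. Note one swap: it uses the heading angle from `atan2` rather than the raw slope, since slope is undefined for vertical strokes. The lag of 7 is the guess from the text, and the function name is made up.

```python
import math

def heading_changes(points, lag=7):
    """For each pivot index, compare the heading of the chord arriving at the pivot
    (from `lag` samples back) with the chord leaving it (to `lag` samples ahead).
    Returns a list of (pivot_index, absolute_change_in_radians)."""
    headings = [
        math.atan2(points[i + lag][1] - points[i][1],
                   points[i + lag][0] - points[i][0])
        for i in range(len(points) - lag)
    ]
    changes = []
    for j in range(len(headings) - lag):
        delta = headings[j + lag] - headings[j]
        # Wrap into [-pi, pi) so a turn across the +/-180 degree seam is not exaggerated.
        delta = (delta + math.pi) % (2.0 * math.pi) - math.pi
        changes.append((j + lag, abs(delta)))
    return changes
```

Local maxima of the change that exceed some threshold are kink candidates; the threshold itself would have to be tuned against real strokes.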
I have several images that I would like to correct for artifacts. They show different animals, but they look as if they were folded (look at the image attached). The folds are straight and they go through the wings as well; they are hard to see there, but they are present. I would like to remove the folds while preserving the information in the picture (structure and color of the wings).
I am using MATLAB right now and I have tried several methods, but nothing seems to work.
Initially I tried an FFT, but I do not see a structure in the spectrum that I can remove. I tried several edge detection methods (like Sobel, etc.), but the problem is that the edge detection always finds the edges of the wings (because they are stronger) rather than the straight lines. I was wondering if anyone has any ideas about how to proceed with this problem? I am not attaching any code because none of the methods I have tried (and described) are working.
Thank you for the help in advance.
I'll leave this bit here for anyone that knows how to erase those lines without affecting the quality of the image:
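% Work in double precision: diff on uint8 saturates negative differences to 0.
a = im2double(imread('https://i.stack.imgur.com/WpFAA.jpg'));
% Absolute horizontal gradient (difference between neighbouring columns).
b = abs(diff(a,1,2));
% Strongest response across the colour channels.
b = max(b,[],3);
% Erode with a tall, thin element so only long vertical streaks survive.
c = imerode(b,strel('rectangle',[200,1]));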
I think you should use a 2-dimensional Fast Fourier Transform
It might be easier to first use GIMP / Photoshop if a filter can resolve it.
I'm guessing the CCD sensor got broken (it looks too good for old scanner problems). Maybe there was an electrical distortion while the camera sensor was being read. Such signals in theory have a repeating nature.
I don't think this was caused by a wrong color depth / color space translation.
If you like to code, then you might also write a custom pixel-based filter in which you take x vertical pixels (say 20 or so) and compare them to the next vertical row of 20 pixels. Compare in HSL (using L, lightness), not RGB.
From all pixels calculate brightness changes this way.
Then, per pixel, check whether its H (hue) is within range of the nearby pixels; if so, take a sloped average of their brightness. For example, take 30 horizontal pixels, calculate the average brightness of the first 10 and the last 10 pixels, and apply that brightness to the center pixel (pixel 15). The values 30, 15 and 10 are starting points; try to find what works well.
Since you have whole strokes that appear brighter or darker, such a filter would smooth that effect out. The difficulty is retaining the other patterns (the wings are less distorted). Knowing what color space the sensor used might allow a better choice than HSL, maybe HSV or similar.
I am making a game that involves solving a path through graphs. Depending on the size of the graph this can take a little while so I want to cache my results.
This has me looking for an algorithm to hash a graph to find duplicates.
This is straightforward for exact copies of a graph, I simply use the node positions relative to the top corner. It becomes quite a bit more complicated for rotated or even reflected graphs. I suspect this isn't a new problem, but I'm unsure of what the terminology for it is?
My specific case is on a grid, so a node (if present) will always be connected to its four neighbors, north, south, east and west. In my current implementation each node stores an array of its adjacent nodes.
Suggestions for further reading or even complete algorithms are much appreciated.
My current hashing implementation starts at the first node found in the graph, which depends on how I iterate over the playfield, then notes the position of all nodes relative to it. The base graph will have a hash that might be something like: 0:1,0:2,1:2,1:3,-1:1,
I suggest you do this:
Make a function to generate a hash for any graph, position-independent. It sounds like you already have this.
When you first generate the pathfinding solution for a graph, cache it by the hash for that graph...
...Then also generate the 7 other unique forms of that graph (rotated 90deg; rotated 270deg; flipped x; flipped y; flipped x & y; flipped along one diagonal axis; flipped along the other diagonal axis). You can of course generate these using simple vector/matrix transformations. For each of these 7 transformed graphs, you also generate that graph's hash, and cache the same pathfinding solution (which you first apply the same transform to, so the solution maps appropriately to the new graph configuration).
You're done. Later your code will look up the pathfinding solution for a graph, and even if it's an alternate (rotated, flipped) form of the graph you found the earlier solution for, the cache already contains the correct solution.
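The caching scheme above could be sketched like this. Everything here is an assumption for illustration: the graph is represented as a collection of `(x, y)` grid cells (adjacency is implied by the four-neighbour rule from the question), the solution is a list of cells, and a plain `dict` keyed by the normalized cell tuple stands in for the hash table.

```python
def normalize(cells):
    """Position-independent key: shift so the minimum x and y are 0, then sort."""
    min_x = min(x for x, _ in cells)
    min_y = min(y for _, y in cells)
    return tuple(sorted((x - min_x, y - min_y) for x, y in cells))

# The eight symmetries of a square grid: identity, three rotations, four reflections.
TRANSFORMS = [
    lambda x, y: (x, y),
    lambda x, y: (-y, x),
    lambda x, y: (-x, -y),
    lambda x, y: (y, -x),
    lambda x, y: (-x, y),
    lambda x, y: (x, -y),
    lambda x, y: (y, x),
    lambda x, y: (-y, -x),
]

def cache_all_variants(cache, cells, path):
    """Store the solved path under the key of every rotated/flipped form,
    applying the same transform and shift to the path itself."""
    for t in TRANSFORMS:
        moved = [t(x, y) for x, y in cells]
        min_x = min(x for x, _ in moved)
        min_y = min(y for _, y in moved)
        key = tuple(sorted((x - min_x, y - min_y) for x, y in moved))
        moved_path = [(px - min_x, py - min_y)
                      for px, py in (t(x, y) for x, y in path)]
        # Symmetric graphs produce repeated keys; keep the first path stored.
        cache.setdefault(key, moved_path)
```

A later lookup is then just `cache.get(normalize(cells))`. The cached path is in normalized coordinates, so the caller would add the graph's own minimum x and y back on.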
I spent some time this morning thinking about this and I think this is probably the most optimal solution. But I'll share the other over-analyzed versions of the solution that I was also thinking about...
I was considering the fact that what you really needed was a function that would take a graph G, and return the "canonical version" of G (which I'll call G'), AND the transform matrix required to convert G to G'. (It seemed like you would need the transform so you could apply it to the pathfinding data and get the correct path for G, since you would have just stored the pathfinding data for G'.) You could, of course, look up pathfinding data for G', apply the transform matrix to it, and have your pathfinding solution.
The problem is that I don't think there's any unambiguous and performant way to determine a "canonical version" of G, because it means you have to recognize all 8 variants of G and always pick the same one as G' based on some criteria. I thought I could do something clever by looking at each axis of the graph, counting the number of points along each row/column in that axis, and then rotating/flipping to put the more imbalanced half of the axis always in the top-or-left... in other words, if you pass in "d", "q", "b", "d", "p", etc. shapes, you would always get back the "p" shape (where the imbalance is towards the top-left). This would have the nice property that it should recognize when the graph was symmetrical along a given axis, and not bother to distinguish between the flipped versions on that axis, since they were the same.
So basically I just took the row-by-row/column-by-column point counts, counting the points in each half of the shape, and then rotating/flipping until the count is higher in the top-left. (Note that it doesn't matter that the count would sometimes be the same for different shapes, because all the function was concerned with was transforming the shape into a single canonical version out of all the different possible permutations.)
Where it fell down for me was deciding which axis was which in the canonical case - basically handling the case of whether to invert along the diagonal axis. Once again, for shapes that are symmetrical about a diagonal axis, the function should recognize this and not care; for any other case, it should have a criteria for saying "the axis of the shape that has the property [???] is, in the canonical version, the x axis of the shape, while the other axis will be the y axis". And without this kind of criteria, you can't distinguish two graphs that are flipped about the diagonal axis (e.g. "p" versus "σ"/sigma). The criteria I was trying to use was again "imbalance", but this turned out to be harder and harder to determine, at least the way I was approaching it. (Maybe I should have just applied the technique I was using for the x/y axes to the diagonal axes? I haven't thought through how that would work.) If you wanted to go with such a solution, you'd either need to solve this problem I failed to solve, or else give up on worrying about treating versions that are flipped about the diagonal axis as equivalent.
Although I was trying to focus on solutions that just involved calculating simple sums, I realized that even this kind of summing is going to end up being somewhat expensive to do (especially on large graphs) at runtime in pathfinding code (which needs to be as performant as possible, and which is the real point of your problem). In other words I realized that we were probably both overthinking it. You're much better off just taking a slight hit on the initial caching side and then having lightning-fast lookups based on the graph's position-independent hash, which also seems like a pretty foolproof solution as well.
Based on the twitter conversation, let me rephrase the problem (I hope I got it right):
How to compare graphs (planar, on a grid) that are treated as invariant under 90deg rotations and reflection. Bonus points if it uses hashes.
I don't have a full answer for you, but a few ideas that might be helpful:
Divide the problem into subproblems that are independently solvable. That would make:
1. How to compare the graphs given the invariance conditions
2. How to transform them into a canonical basis
3. How to hash this canonical basis subject to tradeoffs (speed, size, collisions, ...)
You could try to solve 1 and 2 in a single step. A naive geometric approach could be as follows:
For rotation invariance, you could try to count the edges in each direction and rotate the graph so that the major direction always points to the right. If there is no main direction, you could treat the graph as a point cloud of its vertices and use eigenvectors and Principal Component Analysis (PCA) to obtain the main direction and rotate it accordingly.
I don't have a smart solution for the reflection problem. My brute force way would be to just create the reflected graph all the time. Say you have a graph g and the reflected graph r(g). If you want to know if some other graph h == g you have to answer h == g || h == r(g).
Now onto the hashing:
For the hashing you probably have to trade off speed, size and collisions. If you just use the string of edges, you are high on speed and size and low on collisions. If you just take this string and apply some generic string hasher to it, you get different results.
If you use a short hash, with more frequent collisions, you can achieve a rather small cost for comparing non-matching graphs. The cost for matching graphs is then a bit higher, as you have to do a full comparison to see if they actually match.
Hope this makes some kind of sense...
best, Simon
update: another thought on the rotation problem if the edges don't give a clear winner: Compute the center of mass of the vertices and see to which side of the center of the bounding box it falls. Rotate accordingly.
I'm currently in the process of coding a procedural terrain generator for a game. For that purpose, I divide my world into chunks of equal size and generate them one by one as the player strolls along. So far, nothing special.
Now, I specifically don't want the world to be persistent, i.e. if a chunk gets unloaded (maybe because the player moved too far away) and later loaded again, it should not be the same as before.
From my understanding, implicit approaches like treating 3D Simplex Noise as a density function input for Marching Cubes don't suit my problem. That is because I would need to reseed the generator to obtain different return values for the same point in space, leading to discontinuities along chunk borders.
I also looked into Midpoint Displacement / Diamond-Square. By seeding each chunk's heightmap with values from the borders of adjacent chunks and randomizing the chunk corners that don't have any other chunks nearby, I was able to generate a tileable terrain that exhibits the desired behavior. Still, the results look rather dull. Specifically, since this method relies on heightmaps, it lacks overhangs and the like. Moreover, even with the corner randomization, terrain features tend to be confined to small areas, i.e. there are no multiple-chunk hills or similar landmarks.
Now I was wondering if there are other approaches to this that I haven't heard of/thought about yet. Any help is highly appreciated! :)
Cheers!
Post process!
After you do the heightmaps, run back through adding features.
This is how Minecraft does it to get the various caverns and cliff overhangs.
I'm working on an iPhone robot that would be moving around. One of the challenges is estimating distance to objects: I don't want the robot to run into things. I saw some very expensive (~$1000) laser rangefinders, and would like to emulate one using an iPhone.
I have one or two camera feeds and two laser pointers. The laser pointers are mounted about 6 inches apart, at an angle. The angle of the lasers in relation to the cameras is known. The angle of the cameras to each other is known.
The lasers are pointing ahead of cameras, creating 2 dots on a camera feed. Is it possible to estimate the distance to the dots by looking at the distance between the dots in a camera image?
The lasers form a trapezoid from the
  /   wall   \
 /            \
/ laser mount  \
As the laser mount gets closer to the wall, the points should be moving further away from each other.
Is what I'm talking about feasible? Has anyone done something like that?
Would I need one or two cameras for such calculation?
If you just don't want to run into things, rather than have an accurate idea of the distance to them, then you could go "dambusters" on it and just detect when the two points become one - this would be at a known distance from the object.
For calculation, it is probably cheaper to have four lasers instead, in two pairs, each pair at a different angle, one pair above the other. Then a comparison between the relative differences of the dots would probably let you work out a reasonably accurate distance. That one is a question for MathOverflow, though.
In theory, yes, something like this can work. Google "light striping" or "structured light depth measurement" for some good discussions of using this sort of idea on a larger scale.
In practice, your measurements are likely to be crude. There are a number of factors to consider: the camera intrinsic parameters (focal length, etc) and extrinsic parameters will affect how the dots appear in the image frame.
With only two sample points (note that structured light methods use lines, etc), the environment will present difficulties for distance measurement. Surfaces that are directly perpendicular to the floor (and direction of travel) can be handled reasonably well. Slopes and off-angle walls may be detectable, but you will find many situations that will give ambiguous or incorrect distance measures.
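For the well-behaved case (a flat wall perpendicular to the direction of travel), the relationship between dot separation and distance can be worked out under some strong assumptions that are not from the question: a single pinhole camera sitting midway between the two lasers and looking straight ahead, both lasers tilted inward by the same angle, and a focal length known in pixels. At distance d the dots are `baseline - 2*d*tan(tilt)` apart on the wall, which the camera sees as `focal_px * (baseline/d - 2*tan(tilt))` pixels. All names below are illustrative.

```python
import math

def dot_separation_px(distance, baseline, tilt_rad, focal_px):
    """Predicted pixel separation of the two dots on a wall `distance` away.
    Negative values mean the beams have crossed and the dots swapped sides."""
    wall_separation = baseline - 2.0 * distance * math.tan(tilt_rad)
    return focal_px * wall_separation / distance

def distance_from_separation(sep_px, baseline, tilt_rad, focal_px):
    """Invert the model above: wall distance from the (signed) pixel separation."""
    return focal_px * baseline / (sep_px + 2.0 * focal_px * math.tan(tilt_rad))

def crossing_distance(baseline, tilt_rad):
    """Distance at which the two dots merge into one."""
    return baseline / (2.0 * math.tan(tilt_rad))
```

In this model the separation grows as the mount approaches the wall, matching the behaviour described in the question, and `crossing_distance` is the known distance at which the two dots become one. Real lens distortion, mounting error and angled surfaces would all break the simple formula.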
I have an application in which users interact with each other. I want to visualize these interactions so that I can determine whether clusters of users exist (within which interactions are more frequent).
I've assigned a 2D point to each user (where each coordinate is between 0 and 1). My idea is that two users' points move closer together when they interact, an "attractive force", and I just repeatedly go through my interaction logs over and over again.
Of course, I need a "repulsive force" that will push users apart too, otherwise they will all just collapse into a single point.
First I tried monitoring the lowest and highest of each of the XY coordinates, and normalizing their positions, but this didn't work, a few users with a small number of interactions stayed at the edges, and the rest all collapsed into the middle.
Does anyone know what equations I should use to move the points, both for the "attractive" force between users when they interact, and a "repulsive" force to stop them all collapsing into a single point?
Edit: In response to a question, I should point out that I'm dealing with about 1 million users, and about 10 million interactions between users. If anyone can recommend a tool that could do this for me, I'm all ears :-)
In the past, when I've tried this kind of thing, I've used a spring model to pull linked nodes together, something like: dx = -k*(x-l). dx is the change in the position, x is the current position, l is the desired separation, and k is the spring coefficient that you tweak until you get a nice balance between spring strength and stability, it'll be less than 0.1. Having l > 0 ensures that everything doesn't end up in the middle.
In addition to that, a general "repulsive" force between all nodes will spread them out, something like: dx = k / x^2. This will be larger the closer two nodes are, tweak k to get a reasonable effect.
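The two forces above might be combined into one update step along these lines. This is only a sketch: the parameter values are placeholders to be tuned, the function name is made up, and the all-pairs repulsion loop is O(n^2), so it would only be usable on a small sample of users, not on a million of them.

```python
import math

def layout_step(pos, interactions, rest_len=0.05, k_spring=0.05, k_repel=1e-4):
    """One iteration: springs pull interacting pairs toward `rest_len` apart,
    and an inverse-square force pushes every pair of nodes apart.
    `pos` maps user -> (x, y); `interactions` is a list of (user, user) pairs."""
    disp = {u: [0.0, 0.0] for u in pos}

    def unit_and_dist(u, v):
        dx, dy = pos[v][0] - pos[u][0], pos[v][1] - pos[u][1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            return 0.0, 0.0, 0.0  # coincident points: no defined direction
        return dx / dist, dy / dist, dist

    # Spring: the separation changes by -k * (x - l), split between both ends.
    for u, v in interactions:
        ux, uy, dist = unit_and_dist(u, v)
        move = 0.5 * k_spring * (dist - rest_len)
        disp[u][0] += move * ux
        disp[u][1] += move * uy
        disp[v][0] -= move * ux
        disp[v][1] -= move * uy

    # Repulsion: k / x^2 between every pair of nodes.
    users = list(pos)
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            ux, uy, dist = unit_and_dist(u, v)
            if dist == 0.0:
                continue
            push = k_repel / (dist * dist)
            disp[u][0] -= push * ux
            disp[u][1] -= push * uy
            disp[v][0] += push * ux
            disp[v][1] += push * uy

    return {u: (pos[u][0] + disp[u][0], pos[u][1] + disp[u][1]) for u in pos}
```

Calling `layout_step` repeatedly, once per pass over the interaction log, moves interacting users together while the repulsion keeps the layout from collapsing.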
I can recommend some possibilities: first, try log-scaling the interactions or running them through a sigmoidal function to squash the range. This will give you a smoother visual distribution of spacing.
Independent of this scaling issue: look at some of the rendering strategies in graphviz, particularly the programs "neato" and "fdp". From the man page:
neato draws undirected graphs using ``spring'' models (see Kamada and
Kawai, Information Processing Letters 31:1, April 1989). Input files
must be formatted in the dot attributed graph language. By default,
the output of neato is the input graph with layout coordinates
appended.
fdp draws undirected graphs using a ``spring'' model. It relies on a
force-directed approach in the spirit of Fruchterman and Reingold (cf.
Software-Practice & Experience 21(11), 1991, pp. 1129-1164).
Finally, consider one of the scaling strategies, an attractive force, and some sort of drag coefficient instead of a repulsive force. Actually moving things closer and then possibly farther later on may just get you cyclic behavior.
Consider a model in which everything will collapse eventually, but slowly. Then just run until some condition is met (a node crosses the center of the layout region or some such).
Drag or momentum can just be encoded as a basic resistance to motion and amount to throttling the movements; it can be applied differentially (things can move slower based on how far they've gone, where they are in space, how many other nodes are close, etc.).
Hope this helps.
The spring model is the traditional way to do this: make an attractive force between each node based on the interaction, and a repulsive force between all nodes based on the inverse square of their distance. Then solve, minimizing the energy. You may need some fairly high powered programming to get an efficient solution to this if you have more than a few nodes. Make sure the start positions are random, and run the program several times: a case like this almost always has several local energy minima in it, and you want to make sure you've got a good one.
Also, unless you have only a few nodes, I would do this in 3D. An extra dimension of freedom allows for better solutions, and you should be able to visualize clusters in 3D as well if not better than 2D.