Visual Odometry, Dataset - KITTI

I am working with VO (Visual Odometry) and there are many things I don't understand. For example, is a dataset always needed? I want to use VO, but not with the KITTI dataset: I want to run the algorithm on my drone, which will be flying in my neighborhood. If a dataset is always needed, how do I create one, and how do I get the poses?

The purpose of the KITTI dataset is two-fold. First, it's a standardized set of images and LIDAR data that researchers use in order to compare the relative performance of different algorithms. Second -- and most importantly for your case -- it's also a source of ground truth to debug or analyze your algorithm.
So, if you want to use visual odometry on your drone: pick a VO algorithm that will run on your drone's hardware, and get it working on your desktop computer first, using KITTI data to debug. That is, make sure your VO algorithm reports the same positions as the KITTI ground truth for the sequence you are using. Once it does, the algorithm runs on your drone's own camera images; the dataset is only needed for this validation step. Note that many VO algorithms require stereo cameras, and many also use an IMU to generate better results.
This is a big project; don't expect quick results in a day or even a week. Work carefully, document your process, and be prepared to fail over and over again until it works.
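The ground-truth check described above can be sketched as follows. KITTI odometry ground-truth files store one pose per line as 12 numbers, the row-major 3x4 matrix [R | t]. The function names are hypothetical, and the sketch assumes your VO output has already been written in the same format and coordinate frame.

```python
import math

def parse_kitti_poses(text):
    """Parse KITTI-style pose text: one pose per line, 12 floats (row-major 3x4 [R|t])."""
    positions = []
    for line in text.strip().splitlines():
        values = [float(v) for v in line.split()]
        if len(values) != 12:
            raise ValueError("expected 12 values per pose line")
        # The translation is the last column of the 3x4 matrix: indices 3, 7, 11.
        positions.append((values[3], values[7], values[11]))
    return positions

def translation_rmse(estimated, ground_truth):
    """Root-mean-square position error between two equally long trajectories."""
    if len(estimated) != len(ground_truth):
        raise ValueError("trajectories must have the same number of poses")
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

An RMSE near zero on a KITTI sequence suggests the pipeline is working. Real evaluations usually also align the trajectories and report drift relative to distance travelled, which this sketch omits.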


Analyze 3D objects by Voxels

I plan to use OpenVDB to analyze 3D objects/meshes. The objective is:
1. Detect object surface regions that meet a certain criterion, such as slope
2. Manipulate those regions, for example by adding other 3D objects to them
OpenVDB has some tools available:
Conversion Tools
Filters
Topological Operations
Level Set Tools
Morphological Operations
Geometric Transforms
Compositing Tools
...
It is a large and confusing set of tools to choose from. Does anybody with OpenVDB experience know:
Is OpenVDB the proper library to achieve my objective?
If so, which OpenVDB tool best suits my needs?
Answer provided by OpenVDB community:
An important question is what you mean by "3D objects/meshes." OpenVDB is very good at performing those kinds of operations on surfaces by representing them as signed distance fields. But the word "mesh" raises some alarm bells that you may want to maintain topology. In that case another library may be more effective.
It also sounds like you have a problem domain you are trying to explore. For that, I would not go straight to code but instead explore solutions using 3D applications first. My own biased first choice would be Houdini, whose Apprentice version you can get for free. This provides most of the VDB code as separate nodes. So, for example, you can use a File SOP to load a mesh from disk, a VDB From Polygons to convert it to a signed distance field, and then VDB Analysis to compute the gradient. The gradient, I think, matches what you are looking for as slope, but it is also possible you are looking for curvature...
To return to mesh land, you can use a VDB Convert. Finally, a ROP Geometry can save it out.
Attached is a file showing a network to compute an approximate Y-slope as a volume, apply it back to a mesh, and save to disk.
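To illustrate the idea of reading slope off the gradient of a signed distance field, here is a minimal sketch using a dense NumPy grid in place of OpenVDB or Houdini. The sphere test shape, the function names, and the definition of "Y-slope" as the angle between the gradient and the +Y axis are all assumptions made for illustration; this is not the network in the attached file.

```python
import numpy as np

def sphere_sdf(shape, center, radius, voxel_size=1.0):
    """Signed distance to a sphere sampled on a dense voxel grid (negative inside)."""
    axes = [np.arange(n) * voxel_size for n in shape]
    x, y, z = np.meshgrid(*axes, indexing="ij")
    dist = np.sqrt((x - center[0]) ** 2 + (y - center[1]) ** 2 + (z - center[2]) ** 2)
    return dist - radius

def y_slope_degrees(sdf, voxel_size=1.0):
    """Angle in degrees between the SDF gradient (the surface normal) and the +Y axis."""
    gx, gy, gz = np.gradient(sdf, voxel_size)
    norm = np.sqrt(gx ** 2 + gy ** 2 + gz ** 2)
    norm[norm == 0] = 1.0  # avoid division by zero where the gradient vanishes
    cos_angle = np.clip(gy / norm, -1.0, 1.0)
    return np.degrees(np.arccos(cos_angle))

def select_region(sdf, slope, max_slope_deg, band=1.0):
    """Voxels within `band` of the surface whose slope is below the threshold."""
    return (np.abs(sdf) < band) & (slope < max_slope_deg)

sdf = sphere_sdf((33, 33, 33), center=(16, 16, 16), radius=10)
slope = y_slope_degrees(sdf)
upward_facing = select_region(sdf, slope, max_slope_deg=30.0)
```

The resulting boolean mask marks the near-surface voxels that face mostly upward, which is the kind of region one could then manipulate or transfer back to a mesh.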

iBeacon: Particle filter extension for Fingerprinting position estimation

I have implemented a full fingerprinting solution in my application.
Offline phase: I can create multiple observation points and calibrate them with the mean RSSI values of all the beacons in the room.
Live phase: Here I compare the actual values with the database values to get the closest position.
Now I've read that the inclusion of a particle filter can improve the accuracy of the fingerprint solution.
Does anybody know how I can implement this, and why it helps?
I assume you can use them together as complementary solutions, since I'm not aware of an approach that combines both of them in practice.
Here is a nice paper about using particle filters with BLE; it also discusses other approaches, including fingerprinting.
To comment on your question, I know that particle filters will work better when there is line of sight between the observer and beacons. On the other hand, your current solution should work with better accuracy when there is no line of sight and especially when you are already using a database to map beacon distances to your observations.
What I would do as an "extension" is use both methods side by side, taking advantage of the database in known locations depending on line of sight. For example, you can use the particle filter inside small rooms with fewer obstacles; elsewhere, you can set a threshold on your estimate, compare it with your database value, and switch to fingerprinting in more obstructed or larger indoor areas.
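For the "how" part of the question, here is a minimal sketch of one possible way to put a particle filter on top of a fingerprint database: each particle is a candidate position, and its weight comes from how well the live RSSI reading matches the stored fingerprint nearest to that particle. This is an assumption made for illustration, not a method taken from the paper or endorsed by the answer above. The random-walk motion model, the Gaussian RSSI noise, and all names and parameter values are hypothetical.

```python
import numpy as np

def pf_step(particles, observed_rssi, fp_positions, fp_rssi, rng,
            motion_std=0.3, rssi_std=4.0):
    """One predict / update / resample cycle.

    particles:     (N, 2) candidate positions
    observed_rssi: (B,) live RSSI reading, one value per beacon
    fp_positions:  (M, 2) positions of the calibrated observation points
    fp_rssi:       (M, B) mean RSSI stored for each observation point
    """
    n = len(particles)
    # Predict: random-walk motion model (assumed; replace with step or heading data if available).
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # Update: each particle expects the RSSI of its nearest fingerprint.
    d2 = ((particles[:, None, :] - fp_positions[None, :, :]) ** 2).sum(axis=2)
    expected = fp_rssi[d2.argmin(axis=1)]
    err2 = ((expected - observed_rssi) ** 2).sum(axis=1)
    log_w = -0.5 * err2 / rssi_std ** 2
    weights = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    weights /= weights.sum()
    # Position estimate: weighted mean of the particles.
    estimate = weights @ particles
    # Resample particles in proportion to their weights.
    idx = rng.choice(n, size=n, p=weights)
    return particles[idx], estimate
```

Compared with picking the single closest fingerprint for every reading, the filter carries its belief from one reading to the next, so a single noisy RSSI sample is less likely to make the estimate jump across the room.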

Mapping Vision Outputs To Neural Network Inputs

I'm fairly new to MATLAB, but have acquainted myself with Simulink and Computer Vision over the past few days. My problem statement involves taking a traffic/highway video input and detecting if an accident has occurred.
I plan to do this by extracting the centroid values to plot trajectories, the velocity difference (between frames), and the distance between two vehicles. I can successfully track the centroids, and aim to derive the rest of the features.
What I don't know is how to map these to an ANN. Every image has more than one vehicle blob, which means there are multiple centroids in a single frame/image. So how does an NN act on multiple inputs (the extracted features per vehicle) simultaneously? I am obviously missing the link. Please help me figure it out.
Also, am I looking at time series data?
I am not exactly sure about your question. The problem can be treated either as time series data or not. You might be able to transform the time series version of the problem so that it can be solved using an ANN, but it is sort of a Maslow's hammer :). Also, could you rephrase the problem?
As you said, you could give it features from two or three frames and then use the classifier to decide whether there is an accident or not, but it might be difficult to train such a classifier. The problem is really difficult, so you might need tons of training samples to get it right, especially really good negative samples (for example, cars travelling close to each other).
There are multiple ways you can try to solve this problem of accident detection. For example: build a classifier (ANN/SVM etc.) to detect accidents without time series data. In that case your input would be accident images and non-accident images, or some other sort of positive and negative samples, for training, and later images for testing. In this specific case you are not looking at time series data. But here you might need lots of features to detect the same thing (this is in some sense a single-frame version of the problem).
The second method would be to use time series data, in which case you will have to detect the features, track the features (say using Lucas-Kanade/Horn-Schunck) and then use the information about velocity and centroid to detect the accident. You might even be able to formulate it for HMMs.
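On the question of how a network can handle several vehicles per frame, one possible approach (an assumption for illustration, not something stated above) is to build one fixed-length feature vector per pair of vehicles from the last three frames, so the classifier always sees the same number of inputs regardless of how many vehicles are in view. The function name, the choice of four features, and the `fps` parameter are hypothetical.

```python
from itertools import combinations

import numpy as np

def pair_features(tracks, fps=25.0):
    """Build one feature vector per pair of tracked vehicles.

    tracks: dict mapping vehicle id -> sequence of (x, y) centroids over the
            same T >= 3 consecutive frames.
    Returns a dict mapping (id_a, id_b) -> array of 4 features.
    """
    features = {}
    for a, b in combinations(sorted(tracks), 2):
        pa = np.asarray(tracks[a], dtype=float)
        pb = np.asarray(tracks[b], dtype=float)
        # Frame-to-frame velocity in pixels per second, then scalar speed.
        speed_a = np.linalg.norm(np.diff(pa, axis=0) * fps, axis=1)
        speed_b = np.linalg.norm(np.diff(pb, axis=0) * fps, axis=1)
        dist = np.linalg.norm(pa - pb, axis=1)
        features[(a, b)] = np.array([
            dist[-1],                   # current separation
            dist[-1] - dist[-2],        # change in separation over the last frame
            speed_a[-1] - speed_a[-2],  # speed change of vehicle a
            speed_b[-1] - speed_b[-2],  # speed change of vehicle b
        ])
    return features
```

Each vector can then be fed to the classifier separately, and a frame flagged if any pair is classified as an accident. Whether these four features are discriminative enough is something only training data can show.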

Matlab versus simulation products such as ANSYS and COMSOL

This may be the wrong place to ask this, but I can't find a better place on the SE network.
I've briefly worked with both Matlab and Ansys, and from what I have learnt and can gather, Matlab is a programming environment with functions that perform common math, visualization and analysis operations. You primarily write programs in a textual fashion (.m files) or use Simulink to generate flow graphs (model-based development). Ansys, on the other hand, is primarily a simulation environment where quite a lot can be done simply with the GUI (3D models, physics domains, configuration, display settings), and you can add equations at various points in the simulation engine in order to modify the simulation flow.
My understanding is cursory and only serves as an overview. Can anyone give me a suitable real-world comparison between Matlab and Ansys (or any other simulation product such as COMSOL) that would help us understand when to use which, and the weaknesses of each system?
I haven't used Ansys, but Ansys is often compared with Comsol, and I've used Comsol and Matlab for years.
Matlab:
A programming language and the environment that runs it, which means it can do anything that any other programming language can do. What are its highlights compared to other languages?
Hundreds of built-in functions for working with matrices. For example, in one project I needed to do simple matrix algebra (add, multiply, scale matrices), and also needed singular value decomposition. SVD is not something you could write in 50 lines of code, so I needed a ready-made library. At the time I used a library for Java, and wrote my own code for representing matrices and doing matrix algebra on them. That was a few hundred lines of code. Had I used Matlab, it would have been about ten lines of code, because all of it is there. I would only have needed to type help svd to find out how to use it. However, if you don't need any of that, stay away from Matlab at all costs! There are much better languages that are free.
Great to use as a calculator that is always open on the desktop, and can do back-of-the-envelope style calculations.
Plotting graphs. Many academics recommend Matlab as the tool of choice for producing publication-quality graphics. These can be exported as PDF and imported into Inkscape for further editing. The best thing is that commands for plotting a graph could be put into a script file, and then parts of it can be changed later as needed, which can save a lot of work compared to manually drawing a graph (imagine you wanted to change the axes or symbols used to present the data points).
Personally, I also use it for curve-fitting. It has many toolboxes, one of which is a neat tool that allows me to find equations that model a set of data points.
Comsol:
Specialised tool for solving partial differential equations (PDEs) on complicated domains using the finite element method (FEM). This might sound obscure, but many real-world engineering needs reduce to this. Such things as:
Finding loads, stresses and strains in civil engineering structures with complicated real-world geometry (what happens when there is gusty wind blowing onto a building or bridge?)
How do currents flow in particular conductive objects?
Chemical reactions in various industrial reactors.
What is the power efficiency of a generator (magnet spinning in coil) design?
How to place aircon outlets in a nontrivially-shaped room to achieve both good temperature distribution and good efficiency?
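To make the finite element idea above a little more concrete, here is a minimal sketch of what such a solver does on the simplest possible case: the 1D equation -u'' = f on the interval (0, 1) with u = 0 at both ends, using linear elements. Comsol handles all of this (and far more complicated geometry and physics) behind its GUI; the function name and the test problem are chosen purely for illustration.

```python
import numpy as np

def solve_poisson_1d(n_elements, f=1.0):
    """Solve -u'' = f on (0, 1) with u(0) = u(1) = 0 using linear finite elements."""
    n_nodes = n_elements + 1
    h = 1.0 / n_elements
    K = np.zeros((n_nodes, n_nodes))  # global stiffness matrix
    F = np.zeros(n_nodes)             # global load vector
    k_local = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
    f_local = np.array([0.5, 0.5]) * f * h
    # Assemble each element's contribution into the global system.
    for e in range(n_elements):
        nodes = [e, e + 1]
        K[np.ix_(nodes, nodes)] += k_local
        F[nodes] += f_local
    # Apply the boundary conditions by solving only for the interior nodes.
    interior = slice(1, -1)
    u = np.zeros(n_nodes)
    u[interior] = np.linalg.solve(K[interior, interior], F[interior])
    x = np.linspace(0.0, 1.0, n_nodes)
    return x, u

x, u = solve_poisson_1d(8)
```

For f = 1 the exact solution is u(x) = x(1 - x)/2, so the computed values can be checked directly against it at the nodes.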
Comsol, like any other FEM tool that can work with arbitrary equations, can do multiphysics. This means, for example, that one could solve for the chemistry of a battery together with its temperature and pressure, and for how those feed back into the chemical reaction (speeding it up or slowing it down). Compared with a tool where you need to provide the equations yourself, Comsol already includes most of what is needed to solve most problems; it just needs to be selected and applied to the geometry, which is also built inside Comsol. Arbitrary user-defined equations can also be introduced.
The mathematical descriptions of how these physical systems behave are the PDEs.
Once Comsol has finished solving a problem, the data can be exported for post-processing to Matlab, which has much more versatile tools for manipulating data and making various plots.

classification technique

My BE final year project is about sign language recognition. I'm terribly confused about choosing the right classification technique for the patterns seen in video of signs made by a non-speaking user. I have learned that neural nets (NNs) are better than hidden Markov models (HMMs) in several respects, but fine-tuning the parameters of an NN requires a lot of time. Further, some reports say that Support Vector Machines (SVMs) perform better than NNs. Which of these alternatives should I choose, or are there better alternatives, so that it would be feasible to complete my project within 4-5 months and I could continue in that field in my masters?
The system will be fed with real-time video, and we intend to recognize both hand postures and spatiotemporal gestures. So it's entire sentences I'm trying to recognize.
Based on my reading so far, I'm inclined to use:
1. Hu moments and eigenspace size functions to represent hand shapes
2. SVM for posture classification
3. Threshold HMM for spatiotemporal gesture recognition
What would you say about these decisions?
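As a small illustration of the first item in the list above, here is a sketch that computes the first two Hu moment invariants of a binary hand mask with NumPy. These values stay the same when the shape is translated or rotated, which is why they are attractive as posture features. The function name is hypothetical, and a real pipeline would typically use all seven invariants from an image-processing library.

```python
import numpy as np

def hu_first_two(mask):
    """First two Hu moment invariants of a binary (0/1) image."""
    mask = np.asarray(mask, dtype=float)
    ys, xs = np.mgrid[:mask.shape[0], :mask.shape[1]]
    m00 = mask.sum()
    # Centroid of the shape.
    cx = (xs * mask).sum() / m00
    cy = (ys * mask).sum() / m00

    def eta(p, q):
        """Normalized central moment of order (p, q)."""
        mu = (((xs - cx) ** p) * ((ys - cy) ** q) * mask).sum()
        return mu / m00 ** (1 + (p + q) / 2)

    h1 = eta(2, 0) + eta(0, 2)
    h2 = (eta(2, 0) - eta(0, 2)) ** 2 + 4 * eta(1, 1) ** 2
    return h1, h2
```

The resulting numbers, one small vector per frame, are the kind of fixed-length input an SVM posture classifier would be trained on.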