classification technique - neural-network

My BE final year project is about sign language recognition. I'm terribly confused in choosing the right classification technique for patterns seen in the video of signs generated by a dumb user. I learned neural nets(NN) are better than hidden markov model in several aspects but fine tuning the parameters of NN requires a lot of time. Further, some reports say that Support Vector Machine are better in performance than NN. What do I choose among these alternatives or are there any other better alternatives so that it would be feasible to complete my project within 4-5 months and I could continue with that field in my masters?

Actually the system will be fed with real time video and we intend to recognize the hand postures and spatiotemporal gestures. So, its the entire sentences I'm trying to find.
On the basis of studies till now, I'm making my mind to use
1. Hu moments & eigenspace size functions to represent hand shapes
2. SVM for posture classification &
3. Threshold HMM for spatiotemporal gesture recognition.
What would u comment in these decisions?

Related

Use a trained neural network to imitate its training data

I'm in the overtures of designing a prose imitation system. It will read a bunch of prose, then mimic it. It's mostly for fun so the mimicking prose doesn't need to make too much sense, but I'd like to make it as good as I can, with a minimal amount of effort.
My first idea is to use my example prose to train a classifying feed-forward neural network, which classifies its input as either part of the training data or not part. Then I'd like to somehow invert the neural network, finding new random inputs that also get classified by the trained network as being part of the training data. The obvious and stupid way of doing this is to randomly generate word lists and only output the ones that get classified above a certain threshold, but I think there is a better way, using the network itself to limit the search to certain regions of the input space. For example, maybe you could start with a random vector and do gradient descent optimisation to find a local maximum around the random starting point. Is there a word for this kind of imitation process? What are some of the known methods?
How about Generative Adversarial Networks (GAN, Goodfellow 2014) and their more advanced siblings like Deep Convolutional Generative Adversarial Networks? There are plenty of proper research articles out there, and also more gentle introductions like this one on DCGAN and this on GAN. To quote the latter:
GANs are an interesting idea that were first introduced in 2014 by a
group of researchers at the University of Montreal lead by Ian
Goodfellow (now at OpenAI). The main idea behind a GAN is to have two
competing neural network models. One takes noise as input and
generates samples (and so is called the generator). The other model
(called the discriminator) receives samples from both the generator
and the training data, and has to be able to distinguish between the
two sources. These two networks play a continuous game, where the
generator is learning to produce more and more realistic samples, and
the discriminator is learning to get better and better at
distinguishing generated data from real data. These two networks are
trained simultaneously, and the hope is that the competition will
drive the generated samples to be indistinguishable from real data.
(DC)GAN should fit your task quite well.

OCR for Equations and Formulae on the iOS Platform (Xcode)

I'm currently developing an application which uses the iOS enabled device camera to recognise equations from the photo and then match these up to the correct equation in a library or database - basically an equation scanner. For example you could scan an Image of the Uncertainty Principle or Schrodinger Equation and the iOS device would be able to inform the user it's name and certain feedback.
I was wondering how to implement this using Xcode, I was thinking of using an open-source framework such as Tesseract OCR or OpenCV but I'm not sure how to apply these to equations.
Any help would be greatly appreciated.
Thanks.
Here's the reason why this is super ambitious. What OCR is doing is basically taking a confined set of dots and trying to match it to one of a number of members of a very small set. What you are talking about doing is more at the idiom than the character level. For instance, if I do a representation of Bayes' Rule as an equation, I have something like:
P(A|B) = P(B|A)P(A)/P(B)
Even if it recognizes each of those characters successfully, you have to have it then patch up features in the equation to families of equations. Not to mention, this is only one representation of Bayes Rule. There are others that use Sigma Notation (LaPlace's variant), and some use logs so they don't have to special case 0s.
This, btw, could be done with Bayes. Here are a few thoughts on that:
First you would have to treat the equations as Classifications, and you would have to describe them in terms of a set of features, for instance, the presence of Sigma Notation, or the application of a log.
The System would then be trained by being shown all the equations you want it to recognize, presumably several variations of each (per above). Then these classifications would have feature distributions.
Finally, when shown a new equation, the system would have to find each of these features, and then loop through the classifications and compute the overall probability that the equation matches the given classification.
This is how 90% of spam engines are done, but there, they only have two classifications: spam and not spam, and the feature representations are ludicrously simple: merely ratios of word occurrences in different document types.
Interesting problem, surely no simple answer.

Mapping Vision Outputs To Neural Network Inputs

I'm fairly new to MATLAB, but have acquainted myself with Simulink and Computer Vision over the past few days. My problem statement involves taking a traffic/highway video input and detecting if an accident has occurred.
I plan to do this by extracting the values of centroid to plot trajectory, velocity difference (between frames) and distance between two vehicles. I can successfully track the centroids, and aim to derive the rest of the features.
What I don't know is how to map these to ANN. I mean, every image has more than one vehicle blobs, which means, there are multiple centroids in a single frame/image. So, how does NN act on multiple inputs (the extracted features per vehicle) simultaneously? I am obviously missing the link. Help me figure it out please.
Also, am I looking at time series data?
I am not exactly sure about your question. The problem can be both time series data and not. You might be able to transform the time series version of the problem, such that it can be solved using ANN, but it is sort of a Maslow's hammer :). Also, Could you rephrase the problem.
As you said, you could give it features from two or three frames and then use the classifier to detect accident or not, but it might be difficult to train such a classifier. The problem is really difficult and the so you might need tons of training samples to get it right, esp really good negative samples (for examples cars travelling close to each other) etc.
There are multiple ways you can try to solve this problem of accident detection. For example : Build a classifier (ANN/SVM etc) to detect accidents without time series data. In which case your input would be accident images and non accident images or some sort of positive and negative samples for training and later images for test. In this specific case, you are not looking at the time series data. But here you might need lots of features to detect the same (this in some sense a single frame version of the problem).
The second method would be to use time series data, in which case you will have to detect the features, track the features (say using Lucas Kanade/Horn and Schunck) and then use the information about velocity and centroid to detect the accident. You might even be able to formulate it for HMMs.

Training for pattern recognition (neural network)

How do you train Neural Network for pattern recognition? For example a face recognition in a picture how would you define the output neurons? (eg. how to detect where is the face exactly, rather than just saying that there is a face in camera). Also, how about detecting multiple faces and different size of faces?
If anyone could give me a pointer it would be really great
Cheers!
Generally speaking I would split the problem into multiple stages e.g.
1 - Is there a face in the picture?
2 - Where is the face in the picture?
3 - Is the face in the picture one that the NN (Neural network) recognises?
In each instance I would suggest you build a separate NN and train it to answer the questions posed.
As for the structure of the NN, that's a bit trickier to answer as it depends on your input data and desired output. For example if you had a 100x100 px image then I suppose its feasible to have 10,000 inputs. You might want to consider doing some preprocessing before hand to say detect ovals that way you could look and see if there are a number of ovals in a predictable outline (1 for the face, 2 for the eyes, and one for the mouth possibly). If you are preprocessing the data then you might have inputs for each oval.
Now for the output... for question one you could just have one output to say how sure the NN is that there is a face in the input data i.e a valuer of 0.0 (defiantly no face) --> 1.0 (defiantly a face). This way you can move onto stages 2 and 3.
I might say at this point that this is a non-trivial problem and you might be better to have a look at some of the frameworks available e.g. OpenCV
Now for the training part, you need to have a stockpile of images available to train the NN. There are a number of ways in which you could train the NN. One potential solution is to use a technique called back propagation 1, 2. In general terms, you use the NN on an image and compare it to a predetermined output. If its wrong tweak the NN to produce the desired output and repeat.
If you want a good book on AI, then I would highly recommend Artificial Intelligence: A Modern Approach by Russell and Norvig. Im sure that there are more appropriate Computer Vision textbooks, but the Russell & Norvig book is an excellent starter.
Dear GantengX, you should prepare your self to the fact that the answer is so large, complex and hard to understand. There is so many approaches to pattern and face recognition. And implementing real-life face recognition system is a huge array of work that one person can never handle. Prepare your self for at least 10 years of life behind books on mathematic and artificial intelligence, I'm not talking about hiring 5 highly payed developers in the end who will understand what you want them to do. And maybe you will end up having your own face recognition system. There are also dozen of other issues that will jump out during the process. So be ready for a life full of stresses and problems.
I'm sorry for telling obvious things, but your question was not specific, complete answer would touch many different scientific spheres and will result as a book with over 1k pages.
Regarding your question (the short answer).
There are several principal parts that each face recognition app consists of:
Artificial intelligence algorithm
Optimization algorithm (for AI optimization)
Different filtration algorithms
Effective data set development
Items 1. and 2. are the central part of each system, they do the actual work. Any other preprocessing just makes the input data less complex, making it easier to do a decision for your AI. Don't start 3. and 4. until you will have your first results.
P.S.
Using existing solutions is more cost-effective, but if you are studying things then don't loose time like I did, and start your dissertation right away.

Project ideas for discrete mathematics course using MATLAB? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
A professor asked me to help making a specification for a college project.
By the time the students should know the basics of programming.
The professor is a mathematician and has little experience in other programming languages, so it should really be in MATLAB.
I would like some projects ideas. The project should
last about 1 to 2 months
be done individually
have web interface would be great
doesn't necessary have to go deep in maths, but some would be great
use a database (or store data in files)
What kind of project would make the students excited?
If you have any other tips I'll appreciate.
UPDATE: The students are sophomores and have already studied vector calculus. This project is for an one year Discrete Mathematics course.
UPDATE 2: The topics covered in the course are
Formal Logic
Proofs, Recursion, and Analysis of Algorithms
Sets and Combinatorics
Relations, Functions, and Matrices
Graphs and Trees
Graph Algorithms
Boolean Algebra and Computer Logic
Modeling Arithmetic, Computation, and Languages
And it'll be based on this book Mathematical Structures for Computer Science: A Modern Approach to Discrete Mathematics by Judith L. Gersting
General Suggestions:
There are many teaching resources at The MathWorks that may give you some ideas for course projects. Some sample links:
The MATLAB Central blogs, specifically some posts by Loren that include using LEGO Mindstorms in teaching and a webinar about MATLAB for teaching (note: you will have to sign up to see the webinar)
The Curriculum Exchange: a repository of course materials
Teaching with MATLAB and Simulink: a number of other links you may find useful
Specific Suggestions:
One of my grad school projects in non-linear dynamics that I found interesting dealt with Lorenz oscillators. A Lorenz oscillator is a non-linear system of three variables that can exhibit chaotic behavior. Such a system would provide an opportunity to introduce the students to numerical computation (iterative methods for simulating systems of differential equations, stability and convergence, etc.).
The most interesting thing about this project was that we were using Lorenz oscillators to encode and decode signals. This "encrypted communication" aspect was really cool, and was based on the following journal article:
Kevin M. Cuomo and Alan V. Oppenheim,
Circuit Implementation of Synchronized Chaos with Applications
to Communications, Physical Review
Letters 71(1), 65-68 (1993)
The article addresses hardware implementations of a chaotic communication system, but the equivalent software implementation should be simple enough to derive (and much easier for the students to implement!).
Some other useful aspects of such a project:
The behavior of the system can be visualized in 2-D and 3-D plots, thus exposing the students to a number of graphing utilities in MATLAB (PLOT, PLOT3, COMET, COMET3, etc.).
Audio signals can be read from files, encrypted using the Lorenz equations, written out to a new file, and then decrypted once again. You could even have the students each encrypt a signal with their Lorenz oscillator code and give it to another student to decrypt. This would introduce them to various file operations (FREAD, FWRITE, SAVE, LOAD, etc.), and you could even introduce them to working with audio data file formats.
You can introduce the students to the use of the PUBLISH command in MATLAB, which allows you to format M-files and publish them to various output types (like HTML or Word documents). This will teach them techniques for making useful help documentation for their MATLAB code.
I have found that implementing and visualizing Dynamical systems is great
for giving an introduction to programming and to an interesting branch of
applied mathematics. Because one can see the 'life' in these systems,
our students really enjoy this practical module.
We usually start off by visualizing a 1D attractor, so that we can
overlay the evolution rule/rate of change with the current state of
the system. That way you can teach computational aspects (integrating the system) and
visualization, and the separation of both in implementation (on a simple level, refreshing
graphics at every n-th computation step, but in C++ leading to threads, unsure about MATLAB capabilities here).
Next we add noise, and then add a sigmoidal nonlinearity to the linear attractor. We combine this extension with an introduction to version control (we use a sandbox SVN repository for this): The
students first have to create branches, modify the evolution rule and then merge
it back into HEAD.
When going 2D you can simply start with a rotation and modify it to become a Hopf oscillator, and visualize either by morphing a grid over time or by going 3D when starting with a distinct point. You can also visualize the bifurcation diagram in 3D. So you again combine generic MATLAB skills like 3D plotting with the maths.
To link in other topics, browse around in wikipedia: you can bring in hunter/predator models, chaotic systems, physical systems, etc.etc.
We usually do not teach object-oriented-programming from within MATLAB, although it is possible and you can easily make up your own use cases in the dynamical systems setting.
When introducing inheritance, we will already have moved on to C++, and I'm again unaware of MATLAB's capabilities here.
Coming back to your five points:
Duration is easily adjusted, because the simple 1D attractor can be
done quickly and from then on, extensions are ample and modular.
We assign this as an individual task, but allow and encourage discussion among students.
About the web interface I'm at a loss: what exactly do you have in mind, why is it
important, what would it add to the assignment, how does it relate to learning MATLAB.
I would recommend dropping this.
Complexity: A simple attractor is easily understood, but the sky's the limit :)
Using a database really is a lot different from config files. As to the first, there
is a database toolbox for accessing databases from MATLAB. Few institutes have the license though, and apart from that: this IMHO does not belong into such a course. I suggest introducing to the concept of config files, e.g. for the location and strength of the attractor, and later for the system's respective properties.
All this said, I would at least also tell your professor (and your students!) that Python is rising up against MATLAB. We are in the progress of going Python with our tutorials, but I understand if someone wants to stick with what's familiar.
Also, we actually need the scientific content later on, so the usefulness for you will probably depend on which department your course will be related to.
A lot of things are possible.
The first example that comes in mind is to model a public transportation network (the network of your city, with underground, buses, tramways, ...). It is represented by a weighted directed graph (you can use sparse matrix to represent it, for example).
You may, for example, ask them to compute the shortest path from one station to another one (Moore-dijkistra algorithm, for example) and display it.
So, for the students, the several steps to do are:
choose an appropriate representation for the network (it could be some objects to represent the properties of the stations and the lines, and a sparse matrix for the network)
load all the data (you can provide them the data in an XML file)
be able to draw the network (since you will put the coordinates of the stations)
calculate the shortest path from one point to another and display it in a pretty way
create a fronted (with GUI)
Of course, this could be complicated by adding connection times (when you change from one line to another), asking for several options (shortest path with minimum connections, take in considerations the time you loose by waiting for a train/bus, ...)
The level of details will depend on the level of the students and the time they could spend on it (it could be very simple, or very realist)
You want to do a project with a web interface and a database, but not any serious math... and you're doing it in MATLAB? Do you understand that MATLAB is especially designed to be used for "deep math", and not for web interfaces or databases?
I think if this is an intro to a Discrete Mathematics course, you should probably do something involving Discrete Mathematics, and not waste the students' time as they learn a bunch of things in that language that they'll never actually use.
Why not do something involving audio? I did an undergraduate project in which we used MATLAB to automatically beat-match different tunes and DJ mix between them. The full program took all semester, but you could do a subset of it. wavread() and the like are built in and easy to use.
Or do some simple image processing like finding Waldo using cross-correlation.
Maybe do something involving cryptography, have them crack a simple encryption scheme and feel like hackers.
MATLAB started life as a MATrix LAB, so maybe concentrating on problems in linear algebra would be a natural fit.
Discrete math problems using matricies include:
Spanning trees and shortest paths
The marriage problem (bipartite graphs)
Matching algorithms
Maximal flow in a network
The transportation problem
See Gil Strang's "Intro to Applied Math" or Knuth's "Concrete Math" for ideas.
You might look here: http://www.mathworks.com/academia/student_center/tutorials/launchpad.html
on the MathWorks website. The interactive tutorial (second link) is quite popular.
--Loren
I always thought the one I was assigned in grad school was a good choice-a magnetic lens simulator. The math isn't completely overwhelming so you can focus more on learning the language, and it's a good intro to the graphical capabilities (e.g., animating the path of an off-axis electron going through the lens).
db I/O and fancy interfaces are out of place in a discrete math course.
my matlab labs were typically algorithm implementations, with charts as output, and simple file input.
how hard is the material? image processing is really easy in matlab, can you do some discrete 2D filtering? blurs and stuff. http://homepages.inf.ed.ac.uk/rbf/HIPR2/filtops.htm