I'm implementing an character recognition system with Hidden Markov Model(HMM). I have used skeleton to extract features of image. And I thought to use HMM for training images.
My question is how I can give those features to HMM? I got to know that I have to save those features into a file and then that file should feed to the HMM.
Can someone please help me? I am stuck here for two months. Still, I couldn't find the solution for this.
Appreciate your help a lot.
I was passing by and just saw this question.
Maybe you looked somewhere because your question is almost a month ago.
You give the features to HMM by clustering your data, you can use k-means, or you can use windows with lengths. If you use k-means, you will obtain the centers, you can use the centers to obtain the features, after this you need to crossfold validation to see if it really learns the features you labeled. Also K-means gives you the states and the initial transition probabilities.
Hope this helps you
I want to implement some kind of improvement of DBSCAN algorithm, where user do not need to enter input parameters (minPts and Eps). My idea is to use the K-distances plot, but what is the best method to compute the 'knee' of this plot? How to count when there are 2 or more knees on the plot?
Where I can find the source code for some DBSCAN improvement, like AUTODBSCAN, VDBSCAN, PDBSCAN or DBSCAN-DLP? Im looking for some basics, but nowhere I can find a good help. Maybe you've seen somewhere sample source codes?
DBSCAN has already been improved to death.
In Google Scholar, it has 5361 citations, and probably 1000+ of these "improve" DBSCAN. And probably a dozen of these use the k-distance plot. But none of these are used in practise.
If you want to continue this line of research, best get updated on what has been done since.
In particular, have a look at OPTICS which does away with the Epsilon parameter completely (except for performance reasons when using indexes).
Also have a look at HDBSCAN* by one of the original DBSCAN authors, Joerg Sander. That will likely be the most important DBSCAN extension besides his work on OPTICS and GDBSCAN.
I want to cluster a data set, but I do not know how many kinds in this data set, which clustering algorithm is better? Can someone give me some suggestions. Thank you very much.
There is no free lunch. And no "best" clustering algorithm.
Cluster analysis is an explorative technique. There are multiple correct answers, and it is up to you to decide which is most useful to you.
I am new in neural networks and I need to determine the pattern among a given set of inputs and outputs. So how do I decide which neural network to use for training or even which learning method to use? I have little idea about the pattern or relation between the given input and outputs.
Any sort of help will be appreciated. If you want me to read some stuff then it would be great if links are provided.
If any more info is needed plz say so.
Thanks.
Choosing the right neural networks is something of an art form. It's a bit difficult to give generic suggestions as the best NN for a situation will depend on the problem at hand. As with many of these problems neural netowrks may or may not be the best solution. I'd highly recommned trying out different networks and testing their performance vs a testing data set. When I did this I usually used the ANN tools though the R software package.
Also keep your mind open to other statistical learning techniques as well, things like decision trees and Support Vector Machines may be a better choice for some problems.
I'd suggest the following books:
http://www.amazon.com/Neural-Networks-Pattern-Recognition-Christopher/dp/0198538642
http://www.stats.ox.ac.uk/~ripley/PRbook/#Contents
I'm quite new with this topic so any help would be great. What I need is to optimize a neural network in MATLAB by using GA. My network has [2x98] input and [1x98] target, I've tried consulting MATLAB help but I'm still kind of clueless about what to do :( so, any help would be appreciated. Thanks in advance.
Edit: I guess I didn't say what is there to be optimized as Dan said in the 1st answer. I guess most important thing is number of hidden neurons. And maybe number of hidden layers and training parameters like number of epochs or so. Sorry for not providing enough info, I'm still learning about this.
If this is a homework assignment, do whatever you were taught in class.
Otherwise, ditch the MLP entirely. Support vector regression ( http://www.csie.ntu.edu.tw/~cjlin/libsvm/ ) is much more reliably trainable across a broad swath of problems, and pretty much never runs into the stuck-in-a-local-minima problem often hit with back-propagation trained MLP which forces you to solve a network topography optimization problem just to find a network which will actually train.
well, you need to be more specific about what you are trying to optimize. Is it the size of the hidden layer? Do you have a hidden layer? Is it parameter optimization (learning rate, kernel parameters)?
I assume you have a set of parameters (# of hidden layers, # of neurons per layer...) that needs to be tuned, instead of brute-force searching all combinations to pick a good one, GA can help you "jump" from this combination to another one. So, you can "explore" the search space for potential candidates.
GA can help in selecting "helpful" features. Some features might appear redundant and you want to prune them. However, say, data has too many features to search for the best set of features by some approaches such as forward selection. Again, GA can "jump" from this set candidate to another one.
You will need to find away to encode the data (input parameters, features...) fed to GA. For finding a set of input paras or a good set of features, I think binary encoding should work. In addition, choosing operators for GA to reproduce offsprings is also important. Yet GA needs to be tuned, too (early stopping which can also be applied to ANN).
Here are just some ideas. You might want to search for more info about GA, feature selection, ANN pruning...
Since you're using MATLAB already I suggest you look into the Genetic Algorithms solver (known as GATool, part of the Global Optimization Toolbox) and the Neural Network Toolbox. Between those two you should be able to save quite a bit of figuring out.
You'll basically have to do 2 main tasks:
Come up with a representation (or encoding) for your candidate solutions
Code your fitness function (which basically tests candidate solutions) and pass it as a parameter to the GA solver.
If you need help in terms of coming up with a fitness function, or encoding of candidate solutions then you'll have to be more specific.
Hope it helps.
Matlab has a simple but great explanation for this problem here. It explains both the ANN and GA part.
For more info on using ANN in command line see this.
There is also plenty of litterature on the subject if you google it. It is however not related to MATLAB, but simply the results and the method.
Look up Matthew Settles on Google Scholar. He did some work in this area at the University of Idaho in the last 5-6 years. He should have citations relevant to your work.