I want to prepare training and testing sets for automatic speech recognition using a MATLAB toolbox. I already have a sample set containing several recorded audio files (.wav). I am new to MATLAB. To use the toolbox, I need to create training and testing sets saved in a .mat file. How can I create a single .mat file containing all the audio? Thanks a million.
To create disjoint training and testing sets, the best method is to use the crossvalind command. It assigns the samples to k folds for k-fold cross-validation, where k is the parameter given as input. If k=5, then 1/5 of the data is used for testing and 4/5 for training. The code is as follows:
data=randi(20,[500 20]); %creating random data with 500 rows and 20 columns.
indices=crossvalind('Kfold',size(data,1),5); %assign each row to one of 5 folds
test = (indices == 2); %you can put any number between 1 to 5
train = ~test;
trainData=data(train,:);
testData=data(test,:);
savefile='dataFile.mat';
save(savefile,'trainData','testData');
If you change the number 2 to another fold number, you get a different train-test split with the same proportions, and because the fold assignment is random, the split differs on each run. You can also put this in a for loop, but then you need to save each split under a different file name (or pause at a breakpoint on each iteration and save manually) so that the data is not overwritten. This is a general technique for creating train-test sets. I hope you will be able to apply it to your problem.
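The for-loop variant mentioned above could be sketched roughly as follows. This is only an illustration: the file name pattern dataFile_fold%d.mat is hypothetical, the data is the same random placeholder as before, and the fold labels are generated by hand (instead of with crossvalind) so that the sketch needs no toolbox.

```matlab
% Sketch: write each of the k train/test splits to its own .mat file.
data = randi(20, [500 20]);   % placeholder data: 500 rows, 20 columns
k = 5;                        % number of folds
n = size(data, 1);

% Hand-rolled random fold labels in 1..k (assumption: a stand-in for
% crossvalind('Kfold', n, k) so that no toolbox is required).
indices = mod(randperm(n), k) + 1;
indices = indices(:);

for fold = 1:k
    test = (indices == fold); % logical mask of the rows in this fold
    train = ~test;
    trainData = data(train, :);
    testData = data(test, :);
    % Hypothetical naming scheme: one file per fold, so nothing is overwritten.
    savefile = sprintf('dataFile_fold%d.mat', fold);
    save(savefile, 'trainData', 'testData');
end
```

Each file can later be read back with load, for example S = load('dataFile_fold3.mat'), which gives S.trainData and S.testData.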
I am working on traffic sign recognition code in MATLAB using the Belgian Traffic Sign Dataset. This dataset can be found here.
The dataset consists of training data and test data (or evaluation data).
I resized the given images and extracted HOG features using the vl_hog function from the VLFeat library.
Then, I trained a multi-class SVM using all of the signs in the training dataset. There are 62 categories (i.e. different types of traffic signs) and 4577 frames in the training set.
I used the fitcecoc function to obtain the classifier.
After training the multi-class SVM, I wanted to test the classifier's performance on the test data, so I used the predict and confusionmat functions, respectively.
For some reason, the size of the returned confusion matrix is 53 by 53 instead of 62 by 62.
Why is the size of the confusion matrix not the same as the number of categories?
Some of the folders in the testing dataset are empty, so those classes have no test samples. MATLAB omits the rows and columns for classes that occur in neither the true nor the predicted labels.
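If a full-size matrix is wanted anyway, one option is the 'Order' argument of confusionmat, which lists the classes explicitly. The sketch below uses made-up numeric labels and only 4 classes instead of 62; it assumes the class labels are numeric, which may not match the original code.

```matlab
% Sketch: force the confusion matrix to cover every class, including
% classes that never occur among the test labels or predictions.
numClasses = 4;                    % toy value; the question has 62 classes
trueLabels = [1; 1; 2; 4; 4];      % made-up labels: class 3 has no test samples
predLabels = [1; 2; 2; 4; 1];      % made-up predictions

% Default behaviour: only classes that actually occur get a row and column.
C_default = confusionmat(trueLabels, predLabels);

% With 'Order', every listed class gets a row and column; classes without
% any samples simply contribute all-zero rows and columns.
C_full = confusionmat(trueLabels, predLabels, 'Order', (1:numClasses)');

disp(size(C_default));
disp(size(C_full));
```

The all-zero rows make it easy to see which categories were never tested.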
For my undergraduate thesis, I am creating a neural network (NN) to control an automated shifting algorithm for a vehicle.
I have created the NN from scratch, starting from an .m script that works correctly. I tested it by having it recognize some shapes.
Some brief background:
A NN connects neurons, which are mathematical blocks located in a layer. There are multiple layers, and the output of one layer is the input of the next layer. The actual output is subtracted from the known output to obtain the error. The backpropagation algorithm, which consists of some algebraic equations, then updates the coefficients of the neurons.
What I want to do is this:
In the code there are 6 input matrices (they don't have to be matrices, they can be anything) and corresponding outputs. Let's call them the x(i) matrices and y(i) vectors. In a for loop I go through each matrix and vector to train the network. Finally, using the last updated coefficients, the network gives a response to an unknown input.
I couldn't find a way to simulate that for loop in Simulink so that it goes through each input-output pair. When the network is done with one pair, it should switch to the next input, compare the result with the corresponding output, and then update the coefficient matrices.
I modeled the layers as given and fed them just one input, but I need multiple.
For the automatic transmission control problem, it should do all of this in real time: continuously read the output, update the coefficients, and make the decision.
Check out the "For Each Subsystem" block. It has existed since 2011b.
To create the input signals, use the "Concatenate" block, which would have six inputs in your case and a three-dimensional output x with dimensions [1x20x6]. You can then iterate over the third dimension.
This is a very useful pattern for creating smaller models that run faster and for keeping your code DRY (Don't Repeat Yourself).
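In plain MATLAB script terms, the data layout described above looks roughly like the sketch below. It is only a script analogy for what the Concatenate block and the For Each Subsystem do, not Simulink code; the input values and the per-slice computation are made up for illustration.

```matlab
% Sketch: stack six 1x20 inputs along the third dimension, then visit
% one slice per iteration.
numPairs = 6;
x = cell(1, numPairs);
for i = 1:numPairs
    x{i} = i * ones(1, 20);        % made-up 1x20 input vectors
end

X = cat(3, x{:});                  % concatenated signal of size 1x20x6

sliceSums = zeros(1, numPairs);    % placeholder for the per-input result
for i = 1:size(X, 3)
    xi = X(:, :, i);               % the i-th 1x20 input
    sliceSums(i) = sum(xi);        % stand-in for the per-input network update
end

disp(size(X));
```

The loop body sees one 1x20 input at a time, which is the same view that each iteration of the subsystem gets.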
I have a problem with the concept of validation for NNs. Suppose I have 100 sets of input variables (for example 8 inputs, X1,...,X8) and want to predict one target (Y). I have two ways to use the NN:
1- Use 70 sets of data to train the NN, then use the trained NN to predict the targets of the other 30 sets for validation, and plot output vs. target for these 30 sets as the validation plot.
2- Use all 100 sets of data to train the NN, then divide the outputs into two parts (70% and 30%). Plot 70% of the outputs vs. the corresponding targets as the training plot, then plot the other 30% of the outputs vs. their corresponding targets as the validation plot.
Which one is correct?
Also, what is the difference between checking the NN with a new data set and with a validation data set?
Thanks
You cannot use data for validation if it has already been used for training, because the trained NN will already "know" your validation examples. The result of such a validation would be strongly biased. I would definitely use the first way.
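A minimal sketch of the first way, with made-up random data standing in for the 100 samples, could look like this. The variable names are illustrative, and the training step itself is left out because it depends on the network being used.

```matlab
% Sketch: a disjoint 70/30 hold-out split of 100 samples.
numSamples = 100;
X = rand(numSamples, 8);           % made-up inputs X1..X8, one row per sample
Y = rand(numSamples, 1);           % made-up targets

perm = randperm(numSamples);       % random ordering of the sample indices
trainIdx = perm(1:70);             % 70 samples used for training only
valIdx = perm(71:end);             % 30 samples never seen during training

Xtrain = X(trainIdx, :);
Ytrain = Y(trainIdx);
Xval = X(valIdx, :);
Yval = Y(valIdx);

% Train the network on Xtrain/Ytrain only, then compare its outputs for
% Xval against Yval to produce the validation plot.
```

Because trainIdx and valIdx come from one permutation, no sample can end up in both sets.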
I want to use the NN toolbox in MATLAB.
My input is 42*3 and my target is 42*1.
I have 42 samples with 3 features.
But I can't load the target. There is no error message; it just doesn't load.
Can anyone help me?
Try loading an example dataset first. MATLAB provides six example data sets that you can choose in the GUI. If you have no problems with those, the problem is with your data.
I'm new to MATLAB as well as LIBSVM. I calculated a feature vector for every point, holding the r, g, b values of the point in a single vector, and stored it in a .mat file. Currently I have around 420 points and 4 classes, viz. Red/Green/Blue/Other. Now I want to pass this .mat file to LIBSVM for training and, based on that, classify a newly arriving test point as red, blue, green, or other. Needless to say, it's a multiclass classification problem, and I don't know how to deal with it.
svmtrain(TrainingSet,Groups,'kernel_function','rbf'); where TrainingSet is my 420*4 feature vector set and Groups holds the class names.
Thanks in advance for help.