I am trying to load feature vectors into classifiers such as a k-nearest neighbors classifier.
I have my code for GLCM, so I get contrast, correlation, energy, homogeneity in numbers (feature vectors).
My question is: how can I save the feature vectors from all the training images? I have seen somewhere that people had a .set file to load into classifiers (maybe it is a special case for a particular classifier toolbox).
load 'mydata.set';
for example.
I suppose it does not have to be a .set file.
I'd just need a way to store all the feature vectors from all the training images in a separate file that can be loaded.
I've googled and found this, which may be useful, but I am not entirely sure.
Thanks for your time and help in advance.
Regards.
If you arrange your feature vectors as the columns of an array called X, then just issue the command
save('some_description.mat','X');
Alternatively, if you want the save file to be readable, say in ASCII, then just use this instead:
save('some_description.txt', 'X', '-ASCII');
Later, when you want to re-use the data, just say
var = {'X'}; % <-- You can modify this if you want to load multiple variables.
load('some_description.mat', var{:});
X = load('some_description.txt', '-ascii'); % <-- Use this if you saved to .txt file. An ASCII file stores no variable names, so assign the result yourself.
Then the variable named 'X' will be in the workspace and its columns will be the same feature vectors you computed before (to about 8 significant digits if you saved in the ASCII format).
You will want to replace the some_description part of each file name above and instead use something that allows you to easily identify which data set's feature vectors are saved in the file (if you have multiple data sets). Your array of feature vectors may also be called something besides X, so you can change the name accordingly.
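As a minimal sketch of the whole round trip, assuming four GLCM features per image: the placeholder feature values, the labels vector, and the file name glcm_features_demo.mat are made up for illustration, and in real code each column would come from your own GLCM routine.

```matlab
% Collect one feature vector per training image as a column of X.
nImages = 5;                      % assumed number of training images
X = zeros(4, nImages);            % rows: contrast, correlation, energy, homogeneity
labels = zeros(1, nImages);       % hypothetical class label per image
for k = 1:nImages
    % Placeholder values; replace with the GLCM features of image k.
    X(:, k) = [k; 0.1 * k; 0.5; 0.9];
    labels(k) = mod(k, 2);
end

% Save both variables in one MAT-file, then load them back into a struct.
matFile = fullfile(tempdir, 'glcm_features_demo.mat');
save(matFile, 'X', 'labels');
S = load(matFile, 'X', 'labels'); % S.X and S.labels hold the saved data
```

Loading into a struct (S = load(...)) instead of straight into the workspace avoids silently overwriting an existing variable that happens to be called X.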
Unfortunately I am not too tech proficient and only have a basic MATLAB/programming background...
I have several csv data files in a folder, and would like to make a histogram plot of all of them simultaneously in order to compare them. I am not sure how to go about doing this. Some digging online gave a script:
d=dir('*.csv'); % return the list of csv files
for i=1:length(d)
m{i}=csvread(d(i).name); % put into cell array
end
The problem is I cannot now simply write histogram(m(i)) command, because m(i) is a cell type not a csv file type (I'm not sure I'm using this terminology correctly, but MATLAB definitely isn't accepting the former).
I am not quite sure how to proceed. In fact, I am not sure what exactly is the nature of the elements m(i) and what I can/cannot do with them. The histogram command wants a matrix input, so presumably I would need a 'vector of matrices' and a command which plots each of the vector elements (i.e. matrices) on a separate plot. I would have about 14 altogether, which is quite a lot and would take a long time to load, but I am not sure how to proceed more efficiently.
Generalizing the question:
I will later be writing a script to reduce the noise and smooth out the data in the csv file, and binarise it (the csv files are for noisy images with vague shapes, and I want to distinguish these shapes by setting a cut-off for the pixel intensity/value in the csv matrix, so as to create a binary image showing these shapes). Ideally, I would like to apply this to all of the images in my folder at once so I can sift out which images are best for analysis. So my question is, how can I run a script with all of the csv files in my folder so that I can compare them all at once? I presume whatever technique I use for the histogram plots can apply to this too, but I am not sure.
It would probably be better to write a script which:
-makes a histogram plot and/or runs the binarising script for each csv file in the folder
-and puts all of the images into a new, designated folder, so I can sift through these.
I would greatly appreciate pointers on how to do this. As I mentioned, I am quite new to programming and am getting overwhelmed when looking at suggestions, seeing various different commands used to apparently achieve the same thing- reading several files at once.
The function csvread natively returns a matrix. I am not sure, but it is possible that if some elements inside the csv file are not numbers, Matlab automatically makes a cell array out of the output. Since I don't know the structure of your csv files, I recommend trying out some similar functions (readtable, xlsread):
M = readtable(d(i).name) % Reads table like data, most recommended
M = xlsread(d(i).name) % Excel like structures, but works also on similar data
Try them out and let me know if it worked. If not please upload a file sample.
The function csvread(filename) always returns a numeric matrix M and will never return a cell.
If you have textual data inside the .csv file, it will give you an error because the data is not purely numeric. The only reason I can see for using a cell array when reading the files is if the matrices read from each file have different dimensions, for example if the first .csv file contains data organised as 3xA and the second as 2xB. A cell array lets you place them all in a single structure.
However, it is still possible to use histogram on a cell array, by extracting the element as an array instead of as a cell.
If M is a cell array, there are two options for extracting the data: M(i) and M{i}. M(i) gives you a 1x1 cell, which cannot be passed to histogram, whereas M{i} returns the element in its original form, which is a numeric matrix.
TL;DR use histogram(M{i}) instead of histogram(M(i)).
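Putting this together with the folder loop from the question, a rough sketch could look like the following. The scratch folder csv_hist_demo, the histograms subfolder, the sample matrices, and the _hist.png naming are assumptions made so the example runs on its own; point inDir at your real folder and drop the csvwrite lines.

```matlab
% Scratch folder with two small CSV files (made-up sample data).
inDir  = fullfile(tempdir, 'csv_hist_demo');
outDir = fullfile(inDir, 'histograms');        % designated output folder
if ~exist(outDir, 'dir'), mkdir(outDir); end   % also creates inDir
csvwrite(fullfile(inDir, 'a.csv'), magic(4));
csvwrite(fullfile(inDir, 'b.csv'), magic(5));

d = dir(fullfile(inDir, '*.csv'));             % list of csv files
m = cell(1, numel(d));
for i = 1:numel(d)
    m{i} = csvread(fullfile(inDir, d(i).name));  % numeric matrix stored in a cell

    fig = figure('Visible', 'off');              % draw without popping up a window
    histogram(m{i});                             % curly braces pass the matrix itself
    title(d(i).name, 'Interpreter', 'none');

    [~, base] = fileparts(d(i).name);
    saveas(fig, fullfile(outDir, [base '_hist.png']));
    close(fig);
end
```

The same loop body is where a binarising step would go: anything that works on one matrix m{i} can be applied to every file this way.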
I used classregtree to fit a tree to my data set in order to classify the data. All of the predictors and the response are quantitative. I want to save the range of each variable on the terminal nodes, because I am going to use those ranges in another function.
So is there any way that I can have access to those ranges? I can see the variable ranges in view(tree) plot but I need to save them in like a matrix to use them.
I am not totally sure that this is what you were asking for, but this gives you the split criteria for all trees:
B = TreeBagger(nTrees,M,tag, 'Method', 'classification','OOBPred','on');
for t = 1:B.NTrees
view(B.Trees{t}) % view takes one tree at a time
end
where M is your training data set and tag contains the classes.
I have 50 matrices contained in one folder, all of dimension 181 x 360. How do I cycle through that folder and take an average of each corresponding data points across all 50 matrices?
If the matrices are contained within Matlab variables stored using save('filename','VariableName') then they can be opened using load('filename.mat').
As such, you can use the result of filesInDirectory = dir; to get a list of all your files, using a search pattern if appropriate, like files = dir('*.mat');
Next you can use your load command, and then whos to see which variables were loaded. You should consider storing these names so they are easy to clear after each iteration of your loop.
Once you have your matrix loaded (one at a time), you can take averages as you need, probably summing a value across multiple loop iterations, then dividing by a total counter you've been keeping (using perhaps count = count + size(MatrixVar, dimension);).
If you need all of the matrices loaded at once, then you can modify the above idea to load them in a loop and then average outside of the loop. In this case you may need to take care with memory, but 50*181*360 values isn't too bad, I suspect.
A brief introduction to the load command can be found at this link. It talks mainly about opening one matrix, then plotting the values, but there are some comments about dealing with headers, if needed, and different ways in which you can open data, if load is insufficient. It doesn't talk about binary files, though.
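For the element-wise average the question asks about, the load-based approach might be sketched as follows. This assumes each MAT-file holds exactly one matrix and that all matrices have the same size; the folder mat_average_demo, the file names, and the tiny 2x3 matrices are invented so the example is self-contained (the real data would be 181x360).

```matlab
% Create three small MAT-files to stand in for the 50 real ones.
demoDir = fullfile(tempdir, 'mat_average_demo');
if ~exist(demoDir, 'dir'), mkdir(demoDir); end
for k = 1:3
    A = k * ones(2, 3);
    save(fullfile(demoDir, sprintf('matrix_%02d.mat', k)), 'A');
end

% Sum the matrices one file at a time, then divide by the file count.
files = dir(fullfile(demoDir, 'matrix_*.mat'));
total = 0;                                  % grows to matrix size on first addition
for f = 1:numel(files)
    S = load(fullfile(demoDir, files(f).name));  % struct, one field per variable
    names = fieldnames(S);
    total = total + S.(names{1});           % assumes one matrix per file
end
avg = total / numel(files);                 % element-wise mean across files
```

Loading into the struct S replaces the whos/clear bookkeeping, since S is simply overwritten on each iteration.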
Note on binary files, based on comment to OP's question:
If the file can be opened using
FID = fopen('filename.dat');
fread(FID, 'float');
then you can replace the steps referring to load above, and instead use a loop to find filenames using dir, open the matrices using fopen and fread, then average as needed, finally closing the files and clearing the matrices.
In this case, probably your file identifier is the only part you're likely to need to change during the loop (although your total will increase, if that's how you want to average your data)
Reshaping the matrix, or inverting it, might make the code clearer (which is good!), but might not be necessary depending on what you're trying to average - it may be that selecting only a subsection of the matrix is sufficient.
Possible example code?
Assuming that all of the files in the current directory are to be opened, and that no files are elsewhere, you could try something like:
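listOfFiles = dir('*.dat');
Total = 0; % running sum of every value read so far
Counter = 0; % running count of values read so far
for f = 1:numel(listOfFiles)
    FID = fopen(listOfFiles(f).name);
    Data = fread(FID, 'float'); % returns a column vector
    fclose(FID);
    % Reshape if needed, e.g. Data = reshape(Data, 181, 360); if the file is stored column by column
    Total = Total + sum(Data(:)); % This might vary, depending on what you want to average etc.
    Counter = Counter + numel(Data); % This will be the 181*360 you had in the matrix, in this case
end
Av = Total/Counter; % mean over all values in all files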
I am writing a program and I need some help. It starts by asking this question:
A = questdlg('What would you like to do?','Artificial Neural Network',...
'Train','Test','Exit','Exit');
Then, depending on what the user chooses, it asks certain questions and does certain things:
if strcmp(A,'Train')
    B = questdlg('Would you like to create a new network or add to the already trained data?',...
        '!','Create','Add','Exit','Exit');
    if strcmp(B,'Create')
        %add as many text files as he wants to - need to figure out how I
        %can extract the data from them though
        [fname,dirpath] = uigetfile('*.txt','Select a txt file','MultiSelect',...
            'on');
    elseif strcmp(B,'Add')
        %choose what type it is
        D = listdlg('PromptString','What colour is it?',...
            'SelectionMode','single', 'ListString',...
            {'Strawberry','Orange',...
            'Chocolate','Banana','Rose'}, 'Name','Select Ice Cream',...
            'ListSize',[230 130]);
        %and then whatever choice he makes will be fed to the main
        %function. For example if he chooses Orange then it will go to the
        %second part of the training, if he chooses Rose then the fifth
        %one and so on.
    elseif strcmp(B,'Exit')
        disp('Exit')
    end
end
So the things I want help with are:
How can the program use the txt files once the user has imported them in Matlab? and
How can the user add more choices to the listdlg, so that when a choice is selected the code automatically goes to the corresponding step?
Any help would be appreciated!
Thanks!! :)
PS: Sorry for the long post!
With uigetfile etc. you only get the filename and path. To get the data you have to load the file:
For mat-files use:
TMW: load mat-files
For other files use:
TMW: load data from file
To select a file to open in MATLAB, you can use uigetfile. To select where to save a file, you can use uiputfile. These open the standard file dialog boxes for opening and saving files. With 'MultiSelect' set to 'on', the result is a cell array of file names (or a plain character array if only one file was picked), and you can then use textscan to read the data from the individual files.
For the choices, you should use switch-case. On selecting one of the choices, you can train the neural network accordingly. The training should preferably be written in separate m-files or in different subfunctions for readability.
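A non-interactive sketch of both steps is given below. To keep it runnable without dialogs, fname, dirpath and D are filled in by hand to mimic what uigetfile and listdlg would return; the demo files, their two-column numeric layout, and the step variable are all assumptions for illustration.

```matlab
% Stand-ins for [fname, dirpath] = uigetfile(..., 'MultiSelect', 'on').
dirpath = tempdir;
fname = {'demo_a.txt', 'demo_b.txt'};
for k = 1:numel(fname)                       % write two small sample files
    fid = fopen(fullfile(dirpath, fname{k}), 'w');
    fprintf(fid, '%g %g\n', [1 2; 3 4] * k);
    fclose(fid);
end

% uigetfile returns a char array when a single file is picked, so normalise.
if ischar(fname), fname = {fname}; end

% Read every selected file with textscan (assumes two numeric columns).
data = cell(1, numel(fname));
for k = 1:numel(fname)
    fid = fopen(fullfile(dirpath, fname{k}), 'r');
    C = textscan(fid, '%f %f');
    fclose(fid);
    data{k} = [C{1}, C{2}];
end

% Stand-in for D = listdlg(...): index of the chosen list entry.
flavours = {'Strawberry', 'Orange', 'Chocolate', 'Banana', 'Rose'};
D = 2;
switch flavours{D}
    case 'Strawberry'
        step = 1;      % would call the first training part here
    case 'Orange'
        step = 2;      % would call the second training part here
    otherwise
        step = D;      % remaining entries map to their list position
end
```

Keeping the list entries in one cell array (flavours here) and passing it to listdlg as the 'ListString' means a new choice only has to be added in one place plus one case branch.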
I found a Matlab implementation of SIFT features at http://www.cs.ubc.ca/~lowe/keypoints/ with the help of Stack Overflow. I want to save features to a .mat file. The features are roundness, color, the white pixel count in the binary image, and SIFT features. For the SIFT features I took the descriptors from the above code { [siftImage, descriptors, locs] = sift(filteredImg) }. So my feature vector is now FeaturesTest = [roundness, nWhite, color, descriptors, outputs]; When saving this to a .mat file using save('features.mat','Features'); it gives an error. The error is this:
??? Error using ==> horzcat
CAT arguments dimensions are not consistent.
Error in ==> user_interface>extract_features at 336
FeaturesTest = [roundness, nWhite, color, descriptors, outputs];
As far as I can understand, the issue is the size of the descriptor feature vector. It is <14x128 double>. There are 14 rows for this feature, whereas each of the others has only one row in the .mat file. How can I save this feature vector to the .mat file with my other features?
Awaiting your reply. Thanks in advance.
From what I can understand, it looks like you are trying to put the variables roundness, nWhite, color, descriptors, and outputs into a single vector, but the variables have different dimensions, so they cannot be concatenated horizontally.
Maybe it would be better to use a cell or a structure to store the data. To store the data in a cell, just change square brackets to curly braces, like so:
FeaturesTest = {roundness, nWhite, color, descriptors, outputs};
However, that would require you to remember which cells were which when you pulled the data back out of the .mat file. A structure may be more useful for you:
FeaturesTest.roundness = roundness;
FeaturesTest.nWhite = nWhite;
FeaturesTest.color = color;
FeaturesTest.descriptors = descriptors;
FeaturesTest.outputs = outputs;
Then, when you load the .mat file, all of the data will be contained in that structure, which you can easily reference. If you needed to look at just the color variable, you would type FeaturesTest.color, press enter, and the variable would be displayed. Alternatively, you could browse the structure by double clicking on it in the workspace window.
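A small round-trip sketch of the structure approach, with made-up feature values and a hypothetical file name features_demo.mat (only the 14x128 descriptor size is taken from the question):

```matlab
% Fields may have different sizes, unlike elements of one row vector.
FeaturesTest.roundness   = 0.82;              % placeholder scalar
FeaturesTest.nWhite      = 1530;              % placeholder pixel count
FeaturesTest.color       = [0.4 0.3 0.3];     % placeholder 1x3 colour feature
FeaturesTest.descriptors = rand(14, 128);     % same size as the SIFT descriptors
FeaturesTest.outputs     = 1;                 % placeholder class label

% Save the whole structure under its variable name.
matFile = fullfile(tempdir, 'features_demo.mat');
save(matFile, 'FeaturesTest');

% Load it back and pick out individual features by field name.
clear FeaturesTest
S = load(matFile);
restored = S.FeaturesTest;
descSize = size(restored.descriptors);        % 14 rows, 128 columns
```

Note that the second argument to save is the variable's name as a string, so it must match the variable that actually exists in the workspace.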
Alternatively, you could just save the variables separately, passing their names as strings:
save(filename, 'roundness', 'nWhite', 'color', 'descriptors', 'outputs')
Hope this helps.