Writing labels & data to a CSV file for face recognition. MATLAB - matlab

I have done PCA for 21 images of the same person in different conditions. LAst step of the PCA is projection of original data : signals=PC'*data. Size of signals is 21*21, now I want to write this to a CSV file with a label as +1. Please guide me how to do this in matlab. I tried csvwrite but it does not write the labels, only the data.

[signals,V]=pca(im2double(inputdata));
for i=1:length(signals)
for j=1:1
label(i,j)=+1;
end(both for)
csvwrite('f1.csv',[label signals]);

Related

How to save labeled superpixels and visualize them in .pgm format in Matlab?

I'm working with the ERS (Entropy Rate Superpixels) algorithm in Matlab for segmenting images into superpixels. I'm considering forming 200 superpixels. When saving the file in .pgm format, with the commands
imwrite(labels, 'labels.pgm')
and
imwrite(labels, 'labels.pgm', 'MaxValue', 199)
I get a view from the entire range 0 to 255 as shown below:
When using the command
imshow(labels, [0 199])
I get a better view of all the superpixels as shown below:
Questions: How do I save image 2 in .pgm format? Are there practical differences if saving in a limit range or the whole range (I don't think so)? Regarding the creation of .pgm files in Matlab, is there any specific issue that I should consider for correct visualization and extraction of characteristics such as (color and texture) in other programs?

MATLAB: making a histogram plot from csv files read and put into cells?

Unfortunately I am not too tech proficient and only have a basic MATLAB/programming background...
I have several csv data files in a folder, and would like to make a histogram plot of all of them simultaneously in order to compare them. I am not sure how to go about doing this. Some digging online gave a script:
d=dir('*.csv'); % return the list of csv files
for i=1:length(d)
m{i}=csvread(d(i).name); % put into cell array
end
The problem is I cannot now simply write histogram(m(i)) command, because m(i) is a cell type not a csv file type (I'm not sure I'm using this terminology correctly, but MATLAB definitely isn't accepting the former).
I am not quite sure how to proceed. In fact, I am not sure what exactly is the nature of the elements m(i) and what I can/cannot do with them. The histogram command wants a matrix input, so presumably I would need a 'vector of matrices' and a command which plots each of the vector elements (i.e. matrices) on a separate plot. I would have about 14 altogether, which is quite a lot and would take a long time to load, but I am not sure how to proceed more efficiently.
Generalizing the question:
I will later be writing a script to reduce the noise and smooth out the data in the csv file, and binarise it (the csv files are for noisy images with vague shapes, and I want to distinguish these shapes by setting a cut off for the pixel intensity/value in the csv matrix, such as to create a binary image showing these shapes). Ideally, I would like to apply this to all of the images in my folder at once so I can shift out which images are best for analysis. So my question is, how can I run a script with all of the csv files in my folder so that I can compare them all at once? I presume whatever technique I use for the histogram plots can apply to this too, but I am not sure.
It should probably be better to write a script which:
-makes a histogram plot and/or runs the binarising script for each csv file in the folder
-and puts all of the images into a new, designated folder, so I can sift through these.
I would greatly appreciate pointers on how to do this. As I mentioned, I am quite new to programming and am getting overwhelmed when looking at suggestions, seeing various different commands used to apparently achieve the same thing- reading several files at once.
The function csvread returns natively a matrix. I am not sure but it is possible that if some elements inside the csv file are not numbers, Matlab automatically makes a cell array out of the output. Since I don't know the structure of your csv-files I will recommend you trying out some similar functions(readtable, xlsread):
M = readtable(d(i).name) % Reads table like data, most recommended
M = xlsread(d(i).name) % Excel like structures, but works also on similar data
Try them out and let me know if it worked. If not please upload a file sample.
The function csvread(filename)
always return the matrix M that is numerical matrix and will never give the cell as return.
If you have textual data inside the .csv file, it will give you an error for not having the numerical data only. The only reason I can see for using the cell array when reading the files is if the dimensions of individual matrices read from each file are different, for example first .csv file contains data organised as 3xA, and second .csv file contains data organised as 2xB, so you can place them all into a single structure.
However, it is still possible to use histogram on cell array, by extracting the element as an array instead of extracting it as cell element.
If M is a cell matrix, there are two options for extracting the data:
M(i) and M{i}. M(i) will give you the cell element, and cannot be used for histogram, however M{i} returns element in its initial form which is numerical matrix.
TL;DR use histogram(M{i}) instead of histogram(M(i)).

Gaussian mixture model in pyspark

I have gone through the link https://spark.apache.org/docs/latest/mllib-clustering.html regarding fitting a GMM in pyspark. I have carried out the same operation successfully in python, but after several iteration, I am unable to run in pyspark.
The questions i have are as follow;
1. The link mentioned above & another example of fitting GMM in pyspark I checked, takes a txt file with no column headings. I have a csv with 17 columns. The code is,
data = sc.textFile("..path/mydata.csv")
parsedData = data.map(lambda line: array([float(x) for x in line.strip().split(' ')]))
This worked, but when i am trying to fit GaussianMixture.train specifying some components, It is not working.
If the data used in the examples have no column headings, how can I judge which column is coming from which distribution & how the change in pattern is appearing?
How can I get heat-map from here so that whenever a new data comes in, I will use my trained model's heat-map to judge the distribution pattern of my new test data & can point out the mis-matches.
Thanks.

How to I give input through a file in MATLAB?

I have a data file having 50 2-D data points written in Notepad. I want to use it in clustering algorithm to cluster these 50 points. How can I import this file? Is there any other way to use it in program?
You can save the data as a .csv file or you can save it to an excel spreadsheet and use xlsread(). See here for more info: http://www.mathworks.com/help/techdoc/ref/xlsread.html
For the .csv case, this post should prove helpful: Fastest way to import CSV files in MATLAB
Imagine you had the following data:
X = [randn(100,2)-1 ; randn(100,2)];
save data.mat X
Then its as simple as doing:
%# load data from MAT-file
load data.mat
%# cluster into K=2 clusters
C = kmeans(X,2);
%# show cluster assignment
gscatter(X(:,1), X(:,2), C)
It depends how you have formatted the data file. You say it is saved on notepad but that is not too helpful. Depending on what you have used as the data delimiter you can import the datafile into an array using the dlmread function. For example if your file is called filename.dat and have used a ; character to separate each data item within this file you could read the data into a matrix A using
A = dlmread("filename.dat",';');
I would suggest reading the help documentation on the dlmread function in matlab.

Reading in points from a file

I have a txt file in which each row has the x, y ,z coordinates of the point. seperated by space.I want to read points from this txt file and store it as a matrix in matlab of the form [Pm_1 Pm_2 ... Pm_nmod] where each Pm_n is a point .Could someone help me with this?
I have to actually enter it into a code which accepts the model as :
"model - matrix with model points, [Pm_1 Pm_2 ... Pm_nmod]"
I use importdata heavily for this. It reads all kinds of formats ; I normally use other methods like dlmread only if importdata doesn't work.
Usage is as simple as M = importdata('data.txt');
Just use
load -ascii data.txt
That creates a matrix called `data' in your workspace whose rows contain the coordinates.
You can find all the details of the conversion in the documentation for the load command.