Load a dataset in matlab for knn classification - matlab

Hello, I want to load this dataset into MATLAB to run kNN classification on it, but I don't know how. I tried load, readtable, and others, but none of them worked. Then I tried this code:
FID=fopen('file','rt');
a=textscan(FID,''...);
But I could not work out which format specifier to pass to textscan to extract the data I want, so it was a dead end. Can anyone help me with this, please?
This is what the inside of my data file looks like:
[image: screenshot of the data file contents]
This is the dataset file: http://lms.ui.ac.ir/public/group/a7/b2/06/6a5fb_24fb.gz

Maybe you can try something like the following:
EDIT: code now imports file correctly
FID = fopen('toy.m','rt');
data = {};
i = 1;
while ~feof(FID)
    line = fgetl(FID);
    line = fgetl(FID);
    if feof(FID)
        break;
    end
    %sprintf('%d %s', i, line) %debug
    if (isempty(line))
        headers = 3;
    elseif (line(1) ~= '#')
        headers = 2;
    else
        headers = 1;
    end
    rows = textscan(FID, '%s %s %d', 1, 'headerlines', headers);
    data{i} = cell2mat(textscan(FID, '', rows{3}, 'headerlines', 2));
    i = i + 1;
end
fclose(FID);
fprintf('File contained %d blocks of data\n', numel(data));
fprintf('To access a value in cell "data", for example value (1,2) inside the second matrix, type "data{2}(1,2)": ');
data{2}(1,2)
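Once a block has been imported as a numeric matrix, the kNN step itself needs only a few lines of base MATLAB. The following is a rough sketch, not tied to the actual dataset: it assumes (hypothetically) that each row is one sample and that the last column is the class label, and all numbers are invented for illustration.

```matlab
% Hypothetical training data: each row is [feature1 feature2 label].
% The layout (label in the last column) is an assumption, not taken from the dataset.
train = [1.0 1.1 1;
         1.2 0.9 1;
         5.0 5.2 2;
         4.8 5.1 2];
query = [1.1 1.0];   % sample to classify (made up)
k = 3;               % number of neighbours to consult

X = train(:, 1:end-1);   % features
y = train(:, end);       % labels

% Euclidean distance from the query to every training row
d = sqrt(sum((X - repmat(query, size(X, 1), 1)).^2, 2));

% Majority vote among the k closest rows
[~, order] = sort(d);
predicted = mode(y(order(1:k)));
disp(predicted)
```

If the Statistics and Machine Learning Toolbox is available, fitcknn and predict do the same job with more options.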

Related

How to write multiple matrixes in a textfile in MATLAB?

I have three matrices of different sizes, and I need to write them to a text file. I tried the following:
fileID = fopen('results.txt','w');
fprintf(fileID,'HEADER\n');
fprintf(fileID,'\nmatrix1 = ');
fprintf(fileID,'%d',m1,'\n');
fprintf(fileID,'\nmatrix2 = ');
fprintf(fileID,'%d',m2,'\n');
fprintf(fileID,'\nresult = ');
fprintf(fileID,'%d',m3,'\n');
fclose(fileID);
The result is:
HEADER
matrix1 = 1111121121111111111132133113132333223333213233222212112411442341243123122112323313342431432334132434333341241424433334334333412414244333343321321212221211211222213213122212112112222132232232222231222222344333342243323232333224333343324223233443243343433343333334432433434333433333233443443434443444443444344343433443434443244224343444344444443341442444434434333413133242131123132234344433432434334433124313312212222124222241243323223113222323323343212434321111433213223121241442414334232433243434434412211241211113211121224333412141433122334444444444444444444492110
matrix2 = 1221314111312212211134432433434333433333211212122212112112224334432434244434444492110
result = 1041111041031091131061021111001071031011021061081001091059792110
But this isn't what I need. matrix1 is 20x28, matrix2 is 20x4, and matrix3 is 1x20. They should look like matrices in the text file.
I also need to write many more matrices to the same file, so writing something new to the file must not delete the previous data.
Here's a function that wraps MathWorks' existing dlmwrite, which represents matrices in a file in the way you want. The wrapper is necessary to allow different naming of multiple variables in the file:
function mwrite(filename, variableName, data, mode)
    if nargin < 4, mode = 'w'; end % pass mode 'w' to overwrite or 'a' to append
    f = fopen(filename, mode);
    fprintf(f, '%s = [', variableName);
    if numel(data) == size(data, 2)
        % Row vector: write it on a single line
        fprintf(f, '%s];\n', num2str(data) );
        fclose(f);
    else
        % Matrix: let dlmwrite append the rows, then close the bracket
        fprintf(f, '\n');
        fclose(f);
        dlmwrite(filename, data, '-append');
        f = fopen(filename, 'a');
        fprintf(f, '];\n\n');
        fclose(f);
    end
end
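As a side note on why the original attempt produced one long line: fprintf treats every argument after the format string as data, so the trailing '\n' argument is printed through %d as the character codes 92 and 110 (the "92110" at the end of each output line). Below is a minimal sketch of a base-MATLAB alternative, assuming integer-valued matrices. The example matrices and the file name results_demo.txt are made up for illustration.

```matlab
% Hypothetical example data; any integer matrices work the same way.
m1 = [1 2 3; 4 5 6];
m2 = [7 8; 9 10];
outFile = fullfile(tempdir, 'results_demo.txt');   % assumed file name

% Start the file once with the header ('w' overwrites).
fid = fopen(outFile, 'w');
fprintf(fid, 'HEADER\n');
fclose(fid);

names = {'matrix1', 'matrix2'};
mats  = {m1, m2};
for k = 1:numel(mats)
    M = mats{k};
    fid = fopen(outFile, 'a');                      % 'a' appends, earlier data is kept
    fprintf(fid, '\n%s =\n', names{k});
    % One %d per column, newline at the end of each row
    rowFmt = [repmat('%d ', 1, size(M, 2) - 1), '%d\n'];
    % fprintf consumes elements column by column, so pass the transpose
    fprintf(fid, rowFmt, M.');
    fclose(fid);
end
type(outFile)
```

Opening with 'a' for every later write is what keeps previously written matrices in the file.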

Matlab: func2str from a function in a m-file

In a main m-file I have
conformal = maketform('custom', 2, 2, [], @conformalInverse_0001, []);
used in imtransform that refers to the function defined in conformalInverse_0001.m:
function U = conformalInverse_0001(X, ~)
%#codegen
U = [zeros(size(X))];
Z = complex(X(:,1),X(:,2));
W = 1./(4.*Z.^2-1);
U(:,2) = imag(W);
U(:,1) = real(W);
How can I get the string '1./(4.*Z.^2-1)' in the main program?
I found a way to solve it, but it's not very elegant.
Assume conformalInverse_0001.m is a file in your folder.
You can parse the file as a text file and search for your formula.
Example:
Assume you know the formula is on the 5th line of the file, and that the line starts with W =.
You can use something like the following code to read '1./(4.*Z.^2-1)' in the main program:
%Open file for reading.
fid = fopen('conformalInverse_0001.m', 'r');
%Read 5 lines.
s = textscan(fid, '%s', 5, 'delimiter', '\n');
fclose(fid);
%Get the 5th line.
s = s{1}(5);
%Convert cell array to string.
s = s{1};
%Get characters from the 5th character to one character before the end of the string.
s = s(5:end-1)
Result: s = 1./(4.*Z.^2-1)
You can check the textscan documentation to find a more elegant solution.
I'm not sure I fully understand the problem here, but what about adding something like this to your conformalInverse_0001 function:
str = '1./(4.*Z.^2-1)';
save('temp_str','str') % or whatever data that you want to save from it
and then adding this to your main file:
load('temp_str.mat') % or you can use 'importdata'
where you want to extract it.
I have hacked together two solutions with textscan: the first assumes the line number is known, and the second searches for the first line that starts with the substring 'W = '.
% read line line_num = 5 and process string
line_num = 5;
f_id = fopen(conformalInverse_m_path);
conformalInverse_cell = textscan(f_id,'%s','delimiter','\n'); %disp(conformalInverse_cell); % {68×1 cell}
fclose(f_id);
func_string = conformalInverse_cell{1}{line_num}; disp(func_string); % W = 1./(4.*Z.^2-1); OK
func_string_2=func_string(5:end-1); disp(func_string_2); % 1./(4.*Z.^2-1) OK
% read first line that starts with substring 'W = ' and process string
W_string = 'W = ';
file_lines = conformalInverse_cell{1};
for i=1:numel(file_lines)
    % strncmp is safe even when a line is shorter than 4 characters
    if strncmp(file_lines{i},W_string,numel(W_string)); line_nr = i; break; end
end
func_string_2 = file_lines{line_nr};
func_string_3=func_string_2(5:end-1);
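If the line number is not known in advance, a regular expression over the whole file text is another option. This is only a sketch: it assumes the formula sits on a line of the form W = ...; and it uses an in-memory copy of part of the function source (the variable src) in place of reading the real file.

```matlab
% Stand-in for: src = fileread('conformalInverse_0001.m');
src = sprintf(['function U = conformalInverse_0001(X, ~)\n' ...
               'Z = complex(X(:,1),X(:,2));\n' ...
               'W = 1./(4.*Z.^2-1);\n' ...
               'U(:,2) = imag(W);\n']);

% 'lineanchors' makes ^ and $ match at the start and end of each line.
% The lazy token (.*?) captures everything between "W =" and the closing ";".
tok = regexp(src, '^\s*W\s*=\s*(.*?);\s*$', 'tokens', 'once', 'lineanchors');
formula = tok{1};
disp(formula)
```

Unlike the fixed-index version, this does not depend on the exact spacing around the equals sign.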

Text Scanning to read in unknown number of variables and unknown number of runs

I am trying to read in a csv file which will have the format
Var1 Val1A Val1B ... Val1Q
Var2 Val2A Val2B ... Val2Q
...
And I will not know ahead of time how many variables (rows) or how many runs (columns) will be in the file.
I have been trying to get textscan to work, but no matter what I try I cannot get either the variable names isolated or a rows-by-columns cell array. This is what I've been trying:
fID = fopen(strcat(pwd,'/',inputFile),'rt');
if fID == -1
disp('Could not find file')
return
end
vars = textscan(fID, '%s,%*s','delimiter','\n');
fclose(fID);
Does anyone have a suggestion?
If the file has the same number of columns in each row (you just don't know how many to begin with), try the following.
First, figure out how many columns there are by parsing just the first row, then parse the full file:
% Open the file, get the first line
fid = fopen('myfile.txt');
line = fgetl(fid);
fclose(fid);
tmp = textscan(line, '%s');
% textscan returns a 1x1 cell; the number of tokens inside it is the number of columns
n = numel(tmp{1});
% Now scan the file
fid = fopen('myfile.txt');
tmp = textscan(fid, repmat('%s ', [1, n]));
fclose(fid);
For any given file, are all the lines equal length? If they are, you could start by reading in the first line and use that to count the number of fields and then use textscan to read in the file.
fID = fopen(strcat(pwd,'/',inputFile),'rt');
firstLine = fgetl(fID);
numFields = length(strfind(firstLine,' ')) + 1;
fclose(fID);
formatString = repmat('%s',1,numFields);
fID = fopen(strcat(pwd,'/',inputFile),'rt');
vars = textscan(fID, formatString, 'Delimiter', ' ');
fclose(fID);
Now you will have a cell array whose first entry holds the variable names and whose remaining entries hold the observations.
In this case I assumed the delimiter was space even though you said it was a csv file. If it is really commas, you can change the code accordingly.
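The same idea (count the fields on the first line, then parse every line) can also be sketched without textscan, using strsplit. This is a swapped-in technique, not what either answer above uses. It assumes space-delimited rows of equal length, and rawText is a made-up stand-in for fileread(inputFile).

```matlab
% Stand-in for: rawText = fileread(inputFile);
rawText = sprintf('Var1 1 2 3\nVar2 4 5 6\n');

% One cell per line of the file
fileLines = strsplit(strtrim(rawText), '\n');
nRows = numel(fileLines);

% The first line determines the number of columns
nCols = numel(strsplit(strtrim(fileLines{1}), ' '));

% Fill a rows-by-columns cell array, one row per line
cells = cell(nRows, nCols);
for r = 1:nRows
    cells(r, :) = strsplit(strtrim(fileLines{r}), ' ');
end

varNames = cells(:, 1);                 % first column: variable names
values   = str2double(cells(:, 2:end)); % remaining columns: numeric runs
disp(varNames)
disp(values)
```

For a real comma-separated file, replacing ' ' with ',' in both strsplit calls should be enough, as long as no field contains a comma.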

matlab reading variables with varying lengths into the workspace

This is a follow up question to
Reading parameters from a text file into the workspace
I am wondering, how would I read the following:
% ---------------------- details --------------------------
%---------------------------------------------------------------
% location:
lat = 54.35
lon = -2.9833
%
Eitan T suggested using:
fid = fopen(filename);
C = textscan(fid, '%[^= ]%*[= ]%f', 'CommentStyle', '%')
fclose(fid);
to obtain the information from the file and then
lat = C{2}(strcmp(C{1}, 'lat'));
lon = C{2}(strcmp(C{1}, 'lon'));
to obtain the relevant parameters. How could I alter this to read the following:
% ---------------------- details --------------------------
%---------------------------------------------------------------
% location:
lat = 54.35
lon = -2.9833
heights = 0, 2, 4, 7, 10, 13, 16, 19, 22, 25, 30, 35
Volume = 6197333, 5630000, 4958800, 4419400, 3880000, 3340600,
3146800, 2780200, 2413600, 2177000, 1696000, 811000
%
where the variable should contain all of the data points following the equal sign (up until the start of the next variable, Volume in this case)?
Thanks for your help
Here's one method, which uses some filthy string hacking and eval to get the result. This works on your example, but I wouldn't really recommend it:
fid = fopen('data.txt');
contents = textscan(fid, '%s', 'CommentStyle', '%', 'Delimiter', '\n');
fclose(fid);
contents = contents{1};
for ii = 1:length(contents)
    line = contents{ii};
    eval( [strrep(line, '=', '=['), '];'] ) % convert to valid Matlab syntax
end
A better method would be to read each of the lines using textscan:
for ii = 1:length(contents)
    idx = strfind(contents{ii}, ' = ');
    vars{ii} = contents{ii}(1:idx-1);
    vals(ii) = textscan(contents{ii}(idx+3:end), '%f', 'Delimiter', ',');
end
Now the variables vars and vals hold the names of your variables and their values. To extract the values you could do something like
ix = strcmp(vars, 'lat');
lat = vals{ix};
ix = strcmp(vars, 'lon');
lon = vals{ix};
ix = strcmp(vars, 'heights');
heights = vals{ix};
ix = strcmp(vars, 'Volume');
volume = vals{ix};
This can be accomplished using a 2-step approach:
1) Read the leading string (first element), the equals sign (ignored), and the rest of the line as a string (second element)
2) Convert these rest-of-line strings to floats (second element)
There is a slight drawback, however: your lines seem to follow two formats. One is the format described in step 1); the other is a continuation of the previous line, which contains numbers.
Because of this, an extra step is required:
1) Read the leading string (first element), the equals sign (ignored), and the rest of the line as a string (second element)
2) This will fail when the "other format" is encountered. Detect this, correct it, and continue
3) Convert these rest-of-line strings to floats (second element)
I think this will do the trick:
fid = fopen('data.txt');
C = [];
try
while ~feof(fid)
% Read next set of data, assuming the "first format"
C_new = textscan(fid, '%[^= ]%*[= ]%s', 'CommentStyle', '%', 'Delimiter', '');
C = [C; C_new]; %#ok
% When we have non-trivial data, check for the "second format"
if ~all(cellfun('isempty', C_new))
% Second format detected!
if ~feof(fid)
% get the line manually
D = fgetl(fid);
% Insert new data from fgetl
C{2}(end) = {[C{2}{end} C{1}{end} D]};
% Correct the cell
C{1}(end) = [];
end
else
% empty means we've reached the end
C = C(1:end-1,:);
end
end
fclose(fid);
catch ME
% Just to make sure to not have any file handles lingering about
fclose(fid);
throw(ME);
end
% convert second elements to floats
C{2} = cellfun(@str2num, C{2}, 'UniformOutput', false);
If you can get rid of the multi-line Volume line, what you have written is valid matlab. So, just invoke the parameter file as a matlab script using the run command.
run(scriptName)
Your only other alternative, as others have shown, is to write what will end up looking like a bastardized Matlab parser. There are definitely better ways to spend your time than doing that!
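For what it's worth, a small hand-written reader for this particular layout does not have to be long. The sketch below is not any of the methods above: it assumes that comment lines start with %, that every variable line has the form name = numbers, and that a line without an equals sign continues the previous variable. The cell array raw is a shortened, made-up stand-in for the lines of the parameter file.

```matlab
% Stand-in for the file contents, one cell per line (shortened example values)
raw = {'% location:', 'lat = 54.35', 'lon = -2.9833', ...
       'heights = 0, 2, 4', ...
       'Volume = 6197333, 5630000,', '3146800, 811000', '%'};

params = struct();
lastName = '';
for k = 1:numel(raw)
    ln = strtrim(raw{k});
    if isempty(ln) || ln(1) == '%'
        continue;                        % skip blank and comment lines
    end
    pos = strfind(ln, '=');
    if ~isempty(pos)
        % New variable: the name is everything before the first '='
        lastName = strtrim(ln(1:pos(1)-1));
        params.(lastName) = [];
        ln = ln(pos(1)+1:end);
    end
    % Parse the comma-separated numbers and append them to the current variable
    nums = sscanf(strrep(ln, ',', ' '), '%f').';
    params.(lastName) = [params.(lastName), nums];
end
disp(params)
```

Collecting everything in a struct avoids eval and keeps the parsed names out of the workspace until you choose to copy them, e.g. lat = params.lat.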

Outputing cell array to CSV file ( MATLAB )

I've created an m x n cell array using cell(m,n), and filled each of the cells with arbitrary strings.
How do I output the cell array as a CSV file, where each cell in the array is a cell in the CSV 'spreadsheet'?
I've tried using cell2CSV, but I get errors ...
Error in ==> cell2csv at 71
fprintf(datei, '%s', var);
Caused by:
Error using ==> dlmwrite at 114
The input cell array cannot be converted to a matrix.
Any guidance will be well received :)
Here is a somewhat vectorized solution:
%# lets create a cellarray filled with random strings
C = cell(10,5);
chars = char(97:122);
for i=1:numel(C)
C{i} = chars(ceil(numel(chars).*rand(1,randi(10))));
end
%# build cellarray of lines, values are comma-separated
[m n] = size(C);
CC = cell(m,n+n-1);
CC(:,1:2:end) = C;
CC(:,2:2:end,:) = {','};
CC = arrayfun(@(i) [CC{i,:}], 1:m, 'UniformOutput',false)'; %'
%# write lines to file
fid = fopen('output.csv','wt');
fprintf(fid, '%s\n',CC{:});
fclose(fid);
The strings:
C =
'rdkhshx' 'egxpnpvnfl' 'qnwcxcndo' 'gubkafae' 'yvsejeaisq'
'kmsvpoils' 'zqssj' 't' 'ge' 'lhntto'
'sarlldvig' 'oeoslv' 'xznhcnptc' 'px' 'qdnjcdfr'
'jook' 'jlkutlsy' 'neyplyr' 'fmjngbleay' 'sganh'
'nrys' 'sckplbfv' 'vnorj' 'ztars' 'xkarvzblpr'
'vdbce' 'w' 'pwk' 'ofufjxw' 'qsjpdbzh'
'haoc' 'r' 'lh' 'ipxxp' 'zefiyk'
'qw' 'fodrpb' 'vkkjd' 'wlxa' 'dkj'
'ozonilmbxb' 'd' 'clg' 'seieik' 'lc'
'vkpvx' 'l' 'ldm' 'bohgge' 'aouglob'
The resulting CSV file:
rdkhshx,egxpnpvnfl,qnwcxcndo,gubkafae,yvsejeaisq
kmsvpoils,zqssj,t,ge,lhntto
sarlldvig,oeoslv,xznhcnptc,px,qdnjcdfr
jook,jlkutlsy,neyplyr,fmjngbleay,sganh
nrys,sckplbfv,vnorj,ztars,xkarvzblpr
vdbce,w,pwk,ofufjxw,qsjpdbzh
haoc,r,lh,ipxxp,zefiyk
qw,fodrpb,vkkjd,wlxa,dkj
ozonilmbxb,d,clg,seieik,lc
vkpvx,l,ldm,bohgge,aouglob
The last comment was written in "pure" C, so it doesn't work in MATLAB.
Here is the right solution.
function [ ] = writecellmatrixtocsvfile( filename, matrix )
%WRITECELLMATRIXTOCSVFILE Summary of this function goes here
% Detailed explanation goes here
fid = fopen(filename,'w');
for i = 1:size(matrix,1)
for j = 1:size(matrix,2)
fprintf(fid,'%s',matrix{i,j});
if j~=size(matrix,2)
fprintf(fid,'%s',',');
else
fprintf(fid,'\n');
end
end
end
fclose(fid);
end
It's easy enough to write your own CSV writer.
-- edited to reflect comments --
fid = fopen('myfilename.csv','w');
for i = 1:size(A,1)
    for j = 1:size(A,2)
        fprintf(fid,'%s',A{i,j});
        if j ~= size(A,2)
            fprintf(fid,',');
        else
            fprintf(fid,'\n');
        end
    end
end
fclose(fid);
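The inner loop can also be replaced by strjoin, which builds each comma-separated line in one call. A minimal sketch, assuming every cell holds a character vector and no value itself contains a comma or a quote. The sample cell array and the file name cells_demo.csv are made up for illustration.

```matlab
% Hypothetical cell array of strings
C = {'a', 'b', 'c';
     'd', 'e', 'f'};
outFile = fullfile(tempdir, 'cells_demo.csv');   % assumed file name

fid = fopen(outFile, 'w');
for r = 1:size(C, 1)
    % Join one row of cells with commas and terminate the line
    fprintf(fid, '%s\n', strjoin(C(r, :), ','));
end
fclose(fid);
type(outFile)
```

Values that may contain commas would need quoting, which this sketch does not attempt.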