Matlab: How to use a .dat file in Neural Network? - matlab

I'm trying to use the data in a .dat file to create and train an RBF neural network, but I don't know how to use its columns as input and target data in the network.
This is an image of the file in MATLAB: train.dat (image not shown)
I tried this:
fid = fopen('train.dat','r');
A = fscanf(fid, '%f');
C1 = textscan(fid,'%s%f%s%f'); %read the first line
nb_col = C1{4}; %get the number of columns (could be set by user too)
%read the remaining of the file
C2 = textscan(fid, repmat('%f',1,nb_col), 'CollectOutput',1);
fclose(fid); %close the connection
My question is: what code should I write at the beginning to open the train.dat file and put its first column into one vector (patterns) and its third column into another vector (target)?

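I'm not sure if this will work (it requires the file to contain only numeric columns), but you can try:
load train.dat % creates a variable named train
patterns = train(:,1); % If just the first column holds the patterns
% patterns = train(:,1:2); % Alternative: if columns 1 & 2 are the vector patterns
target = train(:,3);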
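A minimal sketch of the same idea with an explicit output variable, assuming the file holds nothing but numeric columns. The sample data and the file name train_example.dat are hypothetical stand-ins for the real train.dat:

```matlab
% Hypothetical stand-in for train.dat: three numeric columns, no header line.
sample = [0.10 0.50 1; 0.20 0.60 0; 0.30 0.70 1];
fname = fullfile(tempdir, 'train_example.dat');   % assumed file name
save(fname, 'sample', '-ascii');

% With an output argument, load returns the numeric matrix directly
% instead of creating a variable named after the file.
data = load(fname, '-ascii');

patterns = data(:, 1);   % first column as the input patterns
target   = data(:, 3);   % third column as the targets
```

If the network is created with a function that expects one sample per column (newrb is one such function, as far as I know), the two vectors would probably need to be transposed first, e.g. patterns.' and target.'.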

Related

MatLab: Add an extra header line and save the data to a csv file using fprintf or a similar low-level function

I am currently working with accelerometer (ActivPAL micro 3) data collected at 8 bits, and I am converting this data to a higher resolution (10 bits). After applying the conversion I need to keep the file in its original layout so that the data can be processed in other software. To do that I have to add an extra header line and then save the data (Time, Uncompressed sample index, and the X, Y, and Z axes) column by column and row by row. So far I have managed to write code (see below) that works well, but I think there is a simpler and faster way to do the same thing.
Note: my files are large; they all have more than a million lines. Also, I cannot lose the format of the first and second columns of the file.
% Loading data
[filename, ~] = uigetfile('*.csv');
data = table2array(readtable(filename));
% X axis
data(:,3) = 2.02183 * data(:,3) + 256.728;
% Y axis
data(:,4) = 2.02183 * data(:,4) + 256.728;
% Z axis
data(:,5) = 2.02183 * data(:,5) + 256.728;
newfilename = replace(filename, "2a", "4at");
fid = fopen(newfilename, 'w');
% First header
fprintf(fid, 'sep=;\n');
% Second header (variable names)
fprintf(fid, '%s;%s;%s;%s;%s\n','Time','Uncompressed sample index','X','Y','Z');
% Saving data
for k = 1:size(data,1) % one iteration per row
    fprintf(fid, '%.10f;%.0f;%f;%f;%f\n', data(k,1), data(k,2), data(k,3), data(k,4), data(k,5));
end
fclose(fid);
As the file is big, I could not attach it here, so I uploaded it to my Google Drive (in this folder you can find a compressed file and an uncompressed file that is bigger than the other).
Thank you in advance,
Luiz Augusto
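One possible simplification, offered as a sketch rather than a tested solution for the real files: fprintf reuses its format string over all elements of an array argument in column-major order, so passing the transposed matrix writes every row in a single call and the loop can be dropped. The two data rows and the output file name below are hypothetical:

```matlab
% Hypothetical stand-in for the converted data: Time, sample index, X, Y, Z.
data = [43000.1234567890 0 510.1 498.3 515.7;
        43000.1234573677 1 511.2 497.9 516.0];
newfilename = fullfile(tempdir, 'converted_example.csv');   % assumed output path

fid = fopen(newfilename, 'w');
fprintf(fid, 'sep=;\n');                                    % first header
fprintf(fid, '%s;%s;%s;%s;%s\n', 'Time', 'Uncompressed sample index', 'X', 'Y', 'Z');
% The format is recycled over the elements in column-major order, so the
% transpose makes each row of data fill exactly one output line.
fprintf(fid, '%.10f;%.0f;%f;%f;%f\n', data(:, 1:5).');
fclose(fid);
```

The format string is the same as in the loop version, so the first and second columns should keep their original appearance.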

Open .mtx file in MATLAB

I have a .mtx file which contains a vector that I am supposed to use for matrix-vector multiplication. I tried to open the file using fopen('filename'), but this does not work: it returns a single number. I have also tried using readmtx, but this gives me the following error: File size does not match inputs. Expected a file size of 232316 bytes. Instead the file size is 365 bytes. Could you please advise how I can open and work with this type of file in MATLAB?
fopen('filename') returns only the file identifier, which is an integer of 3 or greater on success (or -1 if the file could not be opened). This fileID can then be passed to, e.g., fscanf(fileID, '%f') to get the content of the file.
For the .mtx format, where each line contains [row-number column-number matrix-entry], you could also do something like:
%% file
filename = 'my_matrix.mtx';
%% read Matrix
Mat = read_mtx(filename);
%% function
function Mat = read_mtx(filename)
    % open file (assumes only numeric "row column value" lines, no header)
    fID = fopen(filename,'r');
    M = fscanf(fID, '%f');
    fclose(fID);
    % reshape M vector into Nx3 matrix
    M = reshape(M, 3, []).';
    % assemble final matrix, sized by the largest row and column index
    Mat = zeros(max(M(:,1)), max(M(:,2)));
    for ii = 1:size(M,1)
        Mat(M(ii,1),M(ii,2)) = M(ii,3);
    end
end
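As a possible alternative to the assembly loop, the triplets can be handed to sparse() directly. This is only a sketch under the same assumption as above (plain "row column value" lines with no header); the file name, the three entries, and the vector v are hypothetical:

```matlab
% Hypothetical triplet file: one "row column value" entry per line, no header.
fname = fullfile(tempdir, 'triplets_example.mtx');   % assumed file name
fid = fopen(fname, 'w');
fprintf(fid, '%d %d %g\n', [1 1 4; 2 3 -1.5; 3 2 7].');
fclose(fid);

% Read all numbers as a 3-by-N array, then transpose to N-by-3.
fid = fopen(fname, 'r');
T = fscanf(fid, '%f', [3 Inf]).';
fclose(fid);

% sparse() assembles the matrix in one call. Unlike the loop, it sums
% entries that share the same (row, column) pair instead of overwriting.
S = sparse(T(:, 1), T(:, 2), T(:, 3));

v = [1; 2; 3];        % hypothetical vector for the product
y = S * v;            % matrix-vector multiplication
```

full(S) converts the result to an ordinary matrix if a dense array is needed.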

Saving binary file in Matlab in a loop?

Consider a matrix A in Matlab of dimension mxn and suppose I want to save it as a binary file test.dat using
File_id = fopen('test.dat', 'w');
fwrite(File_id, A, 'float32');
fclose(File_id);
Now suppose that A is created within a loop for h=1:100: how can I assign to the binary files the names test1.dat, test2.dat,...,test100.dat? In other words this is what I want to do and my question is related to step 2):
%for h=1:H
%1)do something that creates A
%2) Save A using
%File_id = fopen('test'h'.dat', 'w'); %clearly wrong
%fwrite(File_id, A, 'float32');
%fclose(File_id);
%end
In the code you posted, the line:
%File_id = fopen('test'h'.dat', 'w'); %clearly wrong
should read:
File_id = fopen(strcat('test',num2str(h),'.dat'),'w');
and that should do the trick nicely.
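For completeness, here is a sketch of the whole loop using sprintf to build the name instead of strcat and num2str. The loop count, the output folder, and the placeholder matrix are assumptions made for illustration:

```matlab
H = 3;                       % hypothetical loop count (100 in the question)
outdir = tempdir;            % assumed output folder
for h = 1:H
    A = h * ones(2, 3);      % placeholder for "something that creates A"
    % sprintf substitutes the loop index into the file name
    fname = fullfile(outdir, sprintf('test%d.dat', h));
    File_id = fopen(fname, 'w');
    fwrite(File_id, A, 'float32');
    fclose(File_id);
end
```

Using '%03d' instead of '%d' would give zero-padded names (test001.dat, test002.dat, ...), which sort correctly in a directory listing.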

Fastest way to export a 2d matrix to a triples format CSV file in Matlab

I want to convert a 2d matrix, for example:
10 2
3 5
to a (row,col,value) CSV file, for example:
1,1,10
1,2,2
2,1,3
2,2,5
Is it possible to do it in a single MATLAB command?
I didn't find a way with a single command, but try the following code:
[c,r] = ind2sub(size(A.'),1:numel(A)); % subscripts into A.', so non-square A also works
csvwrite('test.csv',[r',c',reshape(A.',numel(A),1)]);
The output is:
type test.csv
1,1,10
1,2,2
2,1,3
2,2,5
Assuming A to be the input matrix, two approaches can be suggested here.
fprintf based solution -
output_file = 'data.txt'; % Edit if needed to be saved to a different path
At = A.';
[y,x] = ndgrid(1:size(At,1),1:size(At,2));
fid = fopen(output_file, 'w+');
for ii=1:numel(At)
    fprintf(fid, '%d,%d,%d\n',x(ii),y(ii),At(ii));
end
fclose(fid);
dlmwrite based approach -
At = A.';
[y,x] = ndgrid(1:size(At,1),1:size(At,2));
dlmwrite(output_file,[x(:) y(:) At(:)]);
Some quick tests seem to suggest that fprintf performs better across varying input datasizes.
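The fprintf loop can probably be collapsed further, because fprintf recycles its format over all elements of an array argument. The following is a sketch using the example matrix from the question; the output file name is hypothetical, and no timing claim is made for it:

```matlab
A = [10 2; 3 5];                                   % example matrix from the question
At = A.';                                          % transpose so (:) walks A row by row
[col, row] = ndgrid(1:size(A, 2), 1:size(A, 1));   % subscripts in the same order as At(:)
triples = [row(:) col(:) At(:)];                   % one (row, col, value) triple per row

output_file = fullfile(tempdir, 'triples_example.csv');   % assumed output path
fid = fopen(output_file, 'w');
% Elements are consumed in column-major order, so the transpose makes
% each triple fill exactly one output line.
fprintf(fid, '%d,%d,%g\n', triples.');
fclose(fid);
```

The %g conversion for the value column is a choice made here so that non-integer entries are also printed compactly.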

Matlab clustering and data formats

Leading on from a previous question, FCM Clustering numeric data and csv/excel file, I'm now trying to figure out how to take the output and create a workable .dat file for use with clustering in MATLAB.
%# read the list of features
fid = fopen('kddcup.names','rt');
C = textscan(fid, '%s %s', 'Delimiter',':', 'HeaderLines',1);
fclose(fid);
%# determine type of features
C{2} = regexprep(C{2}, '.$',''); %# remove "." at the end
attribNom = [ismember(C{2},'symbolic');true]; %# nominal features
%# build format string used to read/parse the actual data
frmt = cell(1,numel(C{1}));
frmt( ismember(C{2},'continuous') ) = {'%f'}; %# numeric features: read as number
frmt( ismember(C{2},'symbolic') ) = {'%s'}; %# nominal features: read as string
frmt = [frmt{:}];
frmt = [frmt '%s']; %# add the class attribute
%# read dataset
fid = fopen('kddcup.data','rt');
C = textscan(fid, frmt, 'Delimiter',',');
fclose(fid);
%# convert nominal attributes to numeric
ind = find(attribNom);
G = cell(numel(ind),1);
for i=1:numel(ind)
[C{ind(i)},G{i}] = grp2idx( C{ind(i)} );
end
%# all numeric dataset
M = cell2mat(C);
I have several types of data, which look like this:
I tried the method below to create a .dat file, but it produced these errors:
>> a = load('matlab.mat');
>> save 'matlab.dat' a -ascii
Warning: Attempt to write an unsupported data type
to an ASCII file.
Variable 'a' not written to file.
>> a = load('data.mat');
>> save 'matlab.dat' a -ascii
Warning: Attempt to write an unsupported data type
to an ASCII file.
Variable 'a' not written to file.
>> save 'matlab.dat' a
>> findcluster('matlab.dat')
??? Error using ==> load
Number of columns on line 1 of ASCII file
C:\Users\Garrith\Documents\MATLAB\matlab.dat
must be the same as previous lines.
Error in ==> findcluster>localloadfile at 471
load(filename);
Error in ==> findcluster at 160
localloadfile(filename, param);
MATLAB's clustering tool works on multi-dimensional data sets, but only displays two dimensions at a time. You then choose which dimensions to compare on the x and y axes, but I'm not quite sure whether I will be able to create a 2-D clustering analysis from the current data.
What I need to do is normalize the m file from my previous post, FCM Clustering numeric data and csv/excel file.
To normalize the data I need:
the minimum and maximum of the dataset
the minimum and maximum of the normalized scale
a number in the data set
its normalized value
So the first question is: how do I find the minimum and maximum numbers in my dataset (m)?
Step 1:
Find the smallest and largest values in the data set and represent them with the capital letters A and B:
Let's say the minimum is A = 92000
and the maximum is B = 6425000
Step 2:
Choose the smallest and largest values of the normalized scale and represent them with the lower-case letters a and b
(I'm unsure how to do this in MATLAB, or how to normalize the data to start with)
set the minimum = a = 1
set the maximum = b = 10
Step 3:
Calculate the normalized value of any number x using the equation
A = 92000
B = 6425000
a = 1
b = 10
x = 2214000
a + (x - A)*(b - a)/(B - A)
1 + (2214000 - 92000)*(10 - 1)/(6425000 - 92000)
≈ 4.02
Looking at the errors in the middle of your question: a = load(matfile) returns a structure, and save with the -ascii option cannot write structures (it only handles numeric and character arrays). Try reading the documentation for load and save.
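A rough sketch of how the pieces might fit together: pull the numeric matrix out of the structure returned by load, rescale it to the range [a, b] with the min-max formula from the question, and write the result as plain ASCII. The matrix values, the variable name m, and both file names are hypothetical. The sketch also assumes a single global minimum and maximum, as in the question, rather than per-column scaling:

```matlab
% Hypothetical numeric matrix standing in for the dataset from the question.
m = [92000 2214000; 6425000 500000];
matfile = fullfile(tempdir, 'data_example.mat');   % assumed file name
save(matfile, 'm');

s = load(matfile);        % s is a struct with one field per saved variable
m = s.m;                  % extract the numeric matrix from the struct

A = min(m(:));            % smallest value in the data set
B = max(m(:));            % largest value in the data set
a = 1;                    % lower end of the normalized scale
b = 10;                   % upper end of the normalized scale
mNorm = a + (m - A) .* (b - a) ./ (B - A);

% save -ascii accepts a numeric matrix, so this call writes plain text.
datfile = fullfile(tempdir, 'data_example.dat');   % assumed file name
save(datfile, 'mNorm', '-ascii');
check = load(datfile, '-ascii');   % reads the text file back as a matrix
```

Note that save without the -ascii option writes a binary MAT-file whatever the file extension is, which would explain why findcluster could not read matlab.dat as text.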