Importing CSV into Matlab - matlab

Chemical composition of a certain material
Hi,
I am trying to import the below mentioned data in CSV format in matlab, which is [1000x10] in dimensions.
HCL;H2SO4;CH4; SULPHUR;CHLORINE;S2O3;SO2;NH3;CO2;O2
144 2 3 141 140 6 7 137 136 10 11 133
13 131 130 16 17 127 126 20 21 123 122 24
25 119 118 28 29 115 114 32 33 111 110 36
108 38 39 105 104 42 43 101 100 46 47 97
96 50 51 93 92 54 55 89 88 58 59 85
61 83 82 64 65 79 78 68 69 75 74 72
73 71 70 76 77 67 66 80 81 63 62 84
60 86 87 57 56 90 91 53 52 94 95 49
48 98 99 45 44 102 103 41 40 106 107 37
109 35 34 112 113 31 30 116 117 27 26 120
121 23 22 124 125 19 18 128 129 15 14 132
12 134 135 9 8 138 139 5 4 142 143 1
I am able to import this data through my code
fid = fopen(uigetfile('.csv'),'rt');
FileName = fopen(fid);
headers = fgets(fid); %get first line
headers = textscan(headers,'%s','delimiter',';'); %read first line
format = repmat('%f',1,size(headers{1,1},1)); %count columns n makeformat string
data = textscan(fid,format,'delimiter',';'); %read rest of the file
data = [data{:}];
I am getting data in matrix form in variable data [1000x10] and name of all the components like HCL, H2SO4 in a cell array named headers{1x1}.
Now I have two questions like the built in import feature in matlab you have flexibility to import data as separate column vectors, numeric matrix,cell array and table format. Is it possible to do as such through code, like i get column vectors with their name HCL with [1000x1] and H2sO4 with [1000x1] in my workspace after import and so on all the column vectors with their names with [1000x1]dimensions.
if yes then help me please...?
If above mentioned is not possible then i can do alternatively that now I have names of column vectors in headers cell array, how I can extract those name and use those names as column vector names through code and I can assign data from data matrix [1000x10] to each column vector with their corresponding names.
like if i say
x = headers {1*1}{1*1}; i will get x = "HCL"
x = genvarname(x); I will get x= x0x22HCL0x2 BUT
I want that x get replaced with HCL.and then I assign
HCL = data(:,1) and same like this other variables H2SO4,SULPHUR, CHLORINE.
You can say i try to implement the import feature of column vector through my code.
Kindly help me to solve this issue. thanks

Have you tried the built-in readtable function?
You can access each column of the table by using the named column header.

If you'd like, you can use the two data types to create a table in MatLab. I'm not terribly familiar with its use, but it seems to be well documented. I'm sure someone else can expand upon this.
Edit:
After re-reading your question, I think this is closer to what you are after.
n=10;
what='HCL';%change this to any of the strings you interested in
numstr = repmat('%f',1,n);
hdrstr = repmat('%s',1,n);
headers = textscan(headers,hdrstr,'delimiter',';');
headers = headers(1,:)
data = cell2mat(textscan(fid,numstr,'delimiter',';'));
datout = data(:,strcmp(headers,what));%datout will be 1000x1 HCL data
Depending on what you want to do, you can loop through these appropriately

I know this is not what you asked for, but I would convert to a struct:
x=cell2struct(num2cell(data),headers,2)
reason is simple, selecting for example the third row with individual variables is not possible. With a struct simply use x(3)
If at some point you need the vectors you originally asked for and you can't use the strcut, use [x.HCL]

Related

Matlab: select submatrix from matrix by certain criteria

I have a matrix A
A=[f magic(10)]
A=
931142103 92 99 1 8 15 67 74 51 58 40
931142103 98 80 7 14 16 73 55 57 64 41
931142103 4 81 88 20 22 54 56 63 70 47
459200101 85 87 19 21 3 60 62 69 71 28
459200101 86 93 25 2 9 61 68 75 52 34
459200101 17 24 76 83 90 42 49 26 33 65
459200101 23 5 82 89 91 48 30 32 39 66
37833100 79 6 13 95 97 29 31 38 45 72
37833100 10 12 94 96 78 35 37 44 46 53
37833100 11 18 100 77 84 36 43 50 27 59
The first column are firm codes. The rest columns are firms' data, with each row referring to the firm in Column 1 in a given year. Notice that years may not be balance for every firms.
I would like to subtract sub-matrices according to the first column. For instance, for A(1:3,2:11) for 931142103:
A(1:3,2:11)
ans =
92 99 1 8 15 67 74 51 58 40
98 80 7 14 16 73 55 57 64 41
4 81 88 20 22 54 56 63 70 47
Same as 459200101 (which would be A(4:7,2:11)) and A(8:10,2:11) for 37833100.
I get a sense that the code should like this:
indices=find(A(:,1));
obs=size(A(:,1));
for i=1:obs,
if i==indices(i ??)
A{i}=A(??,2:11);
end
end
I have difficulties in indexing these complicated codes: 459200101 and 37833100 in order to gather them together. And how can I write the rows of my submatrix A{i}?
Thanks so much!
One approach with arrayfun -
%// Get unique entries from first column of A and keep the order
%// with 'stable' option i.e. don't sort
unqA1 = unique(A(:,1),'stable')
%// Use arrayfun to select each such submatrix and store as a cell
%// in a cell array, which is the final output
outA = arrayfun(#(n) A(A(:,1)==unqA1(n),:),1:numel(unqA1),'Uni',0)
Or this -
[~,~,row_idx] = unique(A(:,1),'stable')
outA = arrayfun(#(n) A(row_idx==n,:),1:max(row_idx),'Uni',0)
Finally, you can verify results with a call to celldisp(outA)
If values in column 1 always appear grouped (as in your example), you can use mat2cell as follows:
result = mat2cell(A, diff([0; find(diff(A(:,1))); size(A,1)]));
If they don't, just sort the rows of A according to column 1 before applying the above:
A = sortrows(A,1);
result = mat2cell(A, diff([0; find(diff(A(:,1))); size(A,1)]));
If you don't mind the results internally not being ordered, you can use accumarray for this:
[~,~,I] = unique(A(:,1),'stable');
partitions = accumarray(I, 1:size(A,1), [], #(I){A(I,2:end)});

read/load parts of the irregular file by Matlab

I would like to partly load a PTX file by matlab (please see the following example)
I need to read and write the first two row (2 numbers) into 2 variables say a and b. And read and write the data from 5th row to the end into a matrix
Thanks for your help
114
221
1 0 0
1 0 0 0
-5.566405 -7.161944 -1.144557 0.197208 24 29 35
-5.560656 -7.154540 -1.137673 0.222400 29 32 39
-5.559846 -7.153491 -1.131895 0.254002 37 40 49
-5.560894 -7.154833 -1.126452 0.305013 51 54 63
-5.560084 -7.153783 -1.120633 0.290013 72 76 88
-5.561128 -7.155119 -1.115189 0.243214 105 113 134
-5.563203 -7.157782 -1.109926 0.227604 130 143 177
-5.569191 -7.165479 -1.105504 0.201602 121 140 173
-7.833616 -10.078705 -1.546952 0.130007 94 112 134
Look at the tdfread function in order to get the data into Matlab. It should be something like datafile = tdfread(filename, '\t'). Once you have that, index into the variable returned from that function like
a = datafile(1, 1);
b = datafile(2, 1);
data = datafile(5:end, :);

Function readmtx on matlab

I want to read a matrix that is on my matlab path. I was using the function readmtx but I don't know what to put on 'precision' (mtx = readmtx(fname,nrows,ncols,precision)).
I was wondering if you could help me with that. Or suggest a better way to read the matrix
You could read a matrix from text file with load command. If the first line include text, that should be started with %.
Note that each row of the text file should be values of a row in matrix, which are separated by a space, for Example:
%C1 C2 C3
1 2 3
4 5 6
7 8 9
Then, if you use load command you can read the text file into a matrix, something like:
myMatrix = load('textFileName.txt')
Now, Let's talk about readmtx ;)
About precision as described here:
Both binary and formatted data files can be read. If the file is binary, the precision argument is a format string recognized by fread. Repetition modifiers such as '40*char' are not supported. If the file is formatted, precision is a fscanf and sscanf-style format string of the form '%nX', where n is the number of characters within which the formatted data is found, and X is the conversion character such as 'g' or 'd'. Fortran-style double-precision output such as '0.0D00' can be read using a precision string such as '%nD', where n is the number of characters per element. This is an extension to the C-style format strings accepted by sscanf. Users unfamiliar with C should note that '%d' is preferred over '%i' for formatted integers. MATLAB syntax follows C in interpreting '%i' integers with leading zeros as octal. Formatted files with line endings need to provide the number of trailing bytes per row, which can be 1 for platforms with carriage returns or linefeed (Macintosh, UNIX®), or 2 for platforms with carriage returns and linefeeds (DOS).
Check this example also:
Write and read a binary matrix file:
fid = fopen('binmat','w');
fwrite(fid,1:100,'int16');
fclose(fid);
mtx = readmtx('binmat',10,10,'int16')
mtx =
1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20
21 22 23 24 25 26 27 28 29 30
31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50
51 52 53 54 55 56 57 58 59 60
61 62 63 64 65 66 67 68 69 70
71 72 73 74 75 76 77 78 79 80
81 82 83 84 85 86 87 88 89 90
91 92 93 94 95 96 97 98 99 100
mtx = readmtx('binmat',10,10,'int16',[2 5],3:2:9)
mtx =
13 15 17 19
23 25 27 29
33 35 37 39
43 45 47 49

Load with octave created ASCII format .mat file in Matlab

As I am running out of licenses I am using both Matlab and Octave and as long as I keep things simple I haven't had any trouble.
I recently started using .mat files to decrease my amount of single files.
When I do everything in Matlab it works just fine but when I use 'save' in Octave it saves the file as ASCII and it looks somewhat like this at the beginning and has multiple matrixes in it and so multiple headers.
# Created by Octave 3.6.4, Mon Feb 24 21:34:39 2014 CET <***#****>
# name: C
# type: matrix
# rows: 10000
# columns: 10
79 79 79 79 79 79 79 79 79 79
74 115 87 55 101 46 83 92 113 61
69 142 128 48 160 45 87 113 114 71
84 107 145 62 245 78 69 88 149 78
120 73 148 32 299 114 57 79 137 76
This is fine but Matlab refuses to read the file. Neither with 'load' and '-ASCII' nor with importdata.
(Warning: File contains uninterpretable
data....)
Is there anything I can do? Octave loads the files just fine with 'load'.
Thanks!!
In Octave save as "-ascii" or as matlab binary format v4, v6, v7
octave:1> a = rand(3, 3)
a =
0.086093 0.541999 0.889222
0.029643 0.633532 0.762954
0.544787 0.150573 0.927285
octave:2> save ("-ascii", "yourfile.asc")
octave:3> save ("-v7", "yourfile.mat")
back in matlab do
>> b = load ('yourfile.asc')
b =
0.0861 0.5420 0.8892
0.0296 0.6335 0.7630
0.5448 0.1506 0.9273
or
>> load ('yourfile.mat')
>> a
a =
0.0861 0.5420 0.8892
0.0296 0.6335 0.7630
0.5448 0.1506 0.9273

MATLAB: extracting groups of columns into a submatrix?

I have a data-set, in which I want to extract columns 1-3, 7-9, 13-15, all the way to the end of the matrix
As an example, I've used the standard magic function to create a matrix
A=magic(10)
A =
92 99 1 8 15 67 74 51 58 40
98 80 7 14 16 73 55 57 64 41
4 81 88 20 22 54 56 63 70 47
85 87 19 21 3 60 62 69 71 28
86 93 25 2 9 61 68 75 52 34
17 24 76 83 90 42 49 26 33 65
23 5 82 89 91 48 30 32 39 66
79 6 13 95 97 29 31 38 45 72
10 12 94 96 78 35 37 44 46 53
11 18 100 77 84 36 43 50 27 59
I know that I can extract single columns starting at 1, in intervals of 3 with the command:
Aex=a(:,1 : 3 : end)
Aex =
92 8 74 40
98 14 55 41
4 20 56 47
85 21 62 28
86 2 68 34
17 83 49 65
23 89 30 66
79 95 31 72
10 96 37 53
11 77 43 59
Say I want to extract groups of columns instead (e.g. column 1-3, 7-9 etc.).
Is there a way to do this without having to manually point out all the column numbers?
Thanks for your help!
Rasmus
Is this what you are looking for:
Aex = A(:,[1:3 7:9])
?
I am assuming that you would like the result all concatenated into another large matrix?
If that is the case, try this one on for size:
result = A(diag(0:2)*ones(3,floor((size(A,2) - 3)/6) + 1) + ...
ones(3,floor((size(A,2) - 3)/6) + 1)*diag(1:6:(size(A,2)-3)))
That could probably be shortened with some matrix math rules. You could also parameterize the values so that it can be modified to do more than what this problem expects, (and also might make more sense),
a = 3;
b = 6;
result = A(diag(0:a-1)*ones(a,floor((size(A,2) - a)/b) + 1) + ...
ones(a,floor((size(A,2) - a)/b) + 1)*diag(1:b:(size(A,2)-a)))
where a is the size of "group" (length([1 2 3]) = length([7 8 9]) = ... = 3), etc. and b is the column spacing ([1...7...13...] in your example)
If you would like them separated, I put them in cells here, but they can go to wherever you need:
a = 3;
b = 6;
results = {};
for Cols = 1:b:(size(A,2)-a)
results{end+1} = A(:, Cols:(Cols+2));
end
I didn't check the speed of either of these, but I think the first one may be faster. You may want to split it up into terms so it's more readable, I just did it to fit on a single line (which isn't always the best way of writing code).
The simple way to do this:
M = magic(10);
n = size(M,2)
idx = sort([1:3:n 2:3:n 3:3:n])
M(:,idx)
If however, the pattern of removal is simpler than the pattern of colums that you want to keep you could use this instead:
A = magic(10);
B = A;
B(:,4:3:end)=[];
B(:,4:3:end)=[]; %Yes 3x the same line of code.
B(:,4:3:end)=[];