Read (m x n) comma separated lines of a .txt file - matlab

Hello, I have this kind of data in a text file and I want to read it:
2003,04,15,15,15,00,38.4279,-76.61,1565,3.7,0.0,38.19,-999,-999,3.9455,3.1457,2.9253
2003,04,15,16,50,00,38.368,-76.5,1566,3.7,0.0,35.01,-999
2003,04,15,17,50,00,38.3074,-76.44
I have used the following code:
a= zeros(4460,216);
nl = a(:,1);
nc = a(1,:);
if fid>0
for i = 1:length(nl)
d = textscan(Ligne,'%f','whitespace',',');
numbers = d{:}';
D = a(i) + numbers;
i = i+1;
end
Ligne = fgetl(fid);
end
The problem is that I can't build up the matrix D: its contents are replaced on every iteration. Can somebody help me, please?

Assuming your file looks like:
Header
Header
Header
Header
2003,04,15,15,15,00,38.4279,-76.61,1565,3.7,0.0,38.19,-999,-999,3.9455,3.1457,2.9253
2003,04,15,16,50,00,38.368,-76.5,1566,3.7,0.0,35.01,-999
2003,04,15,17,50,00,38.3074,-76.44
In the example you have 4 header lines and the delimiter is ','. Now just use importdata, a very convenient import function:
X = importdata('myData.txt',',',4)
which returns:
X =
data: [3x17 double]
textdata: {4x17 cell}
colheaders: {1x17 cell}
X.data contains your numeric data. As the data in your file has a different number of entries in every row, missing values are filled with NaN. X.textdata contains the skipped header lines as strings.
You can process them, if needed with textscan:
additionalInformation = textscan(X.textdata, ... )
The alternative suggested by Shai, using csvread with the row offset set to 4, does the job as well. But be aware that missing values are replaced with zeros, which I personally dislike for further processing of the data, especially as your actual data also contains zeros.
X = csvread('myData.txt',4)

You already said it: D is replaced every time. This is happening since you don't specify indices when accessing D. You should do something like
D = zeros(size(a))
....
if ...
for ...
...
D(i,:) = a(i,:) + numbers;
...
end
end
But as Shai pointed out, there might be a simpler solution to your problem.

Have you considered using csvread?
D = csvread( filename );
Regarding your code, you have two major bugs
D = a(i)+numbers; - you actually overwrite D at each iteration. Try D(i,:) = a(i,:)+numbers; instead
i=i+1; - you change the loop variable inside the loop! if you are using a for-loop on i you do not need to increment it manually.
And some comments:
It is best not to use i as a variable name in Matlab.
You pre-allocated a but not D, consider pre-allocating D as well.

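Finally, I used these lines of code:
D = NaN(size(a));
i=1;
while ischar(Ligne) % fgetl returns the number -1 at end of file
d = textscan(Ligne,'%f','whitespace',',');
numbers = d{:}';
D(i,1:numel(numbers)) = numbers; % shorter rows keep their trailing NaNs
Ligne = fgetl(fid);
i=i+1;
end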
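For reference, here is a minimal self-contained sketch of the same idea. It writes a small ragged sample file first so that it can be run as is; the temporary file, the bounds maxRows and maxCols, and the variable names are assumptions made for the illustration, not part of the original code.

```matlab
% Write a small ragged sample file (hypothetical temporary file).
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '2003,04,15,38.4279,-76.61\n');
fprintf(fid, '2003,04,15,38.368\n');
fprintf(fid, '2003,04,15\n');
fclose(fid);

maxRows = 10;                % assumed upper bound on the number of lines
maxCols = 5;                 % assumed upper bound on entries per line
D = NaN(maxRows, maxCols);   % NaN marks entries missing from short lines

fid = fopen(fname, 'r');
row = 0;
txt = fgetl(fid);
while ischar(txt)            % fgetl returns the number -1 at end of file
    row = row + 1;
    numbers = str2double(strsplit(txt, ','));
    D(row, 1:numel(numbers)) = numbers;   % fill only the columns present
    txt = fgetl(fid);
end
fclose(fid);
delete(fname);

D = D(1:row, :);             % drop the unused preallocated rows
```

Writing into D(row, 1:numel(numbers)) instead of D(row, :) is what lets lines of different lengths share one matrix.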

loop and concatenate variable in workspace

I am trying to concatenate the matrices station_1, station_2, ..., station_10 from my MATLAB workspace automatically in a loop, instead of listing them one by one like this:
cat(1,station_1,station_2,station_3,station_4... ,station_5,station_6,station_7,station_8... ,station_9,station_10 )
Any ideas?
The code below is what I was trying to improve:
for jj= 1 : 10
T= cat(1,eval(['station_', num2str(jj)]));
MegaMat = cat(1,T)
end
Reading your code I think at the end of your loop you will have T = station_10.
If you want to concatenate all of them you would do
T = []
for jj= 1:10
T = cat(1, T, eval(['station_', num2str(jj)]));
end
MegaMat = T;
Using eval is not a good practice. Instead of creating station_1 to station_10 you could create a cell array
station{1} = ...
station{2} = ...
Then you could iterate like
T = []
for jj = 1:length(station)
T = cat(1, T, station{jj});
end
If the number of arrays is big, this will be slow due to memory reallocation and copying. In that case it is more efficient to initialize T as a matrix of the final dimensions and write slices into it.
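A minimal sketch of that preallocation, assuming all matrices have the same number of columns. The example data and the helper variables nRows, last and rows are made up for the illustration:

```matlab
% Hypothetical example data: three matrices with the same number of columns.
station = {ones(2,3), 2*ones(4,3), 3*ones(1,3)};

nRows = cellfun(@(m) size(m,1), station);    % rows contributed by each matrix
T = zeros(sum(nRows), size(station{1},2));   % allocate the final size once

last = 0;                                    % last row of T written so far
for jj = 1:numel(station)
    rows = last + (1:nRows(jj));             % slice of T for this matrix
    T(rows, :) = station{jj};
    last = last + nRows(jj);
end
```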
Appendix:
There is an interesting notation trick pointed out by @Cris Luengo in the comments: when you have a cell array station, you can write [station{:}]. I have to admit this notation is new to me. The only caveat is that [station{:}] concatenates the matrices horizontally rather than vertically.
The answer from @Mateo V is also good, probably with less overhead since it calls eval only once. That approach can be refined into a one-line solution which, to be honest, does not feel too unreadable.
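For vertical concatenation the same comma-separated-list expansion can be passed to vertcat (or cat(1, ...)). A short sketch with made-up example matrices:

```matlab
% Hypothetical example data: matrices with the same number of columns.
station = {[1 2; 3 4], [5 6], [7 8; 9 10]};

% station{:} expands to station{1}, station{2}, station{3}
MegaMat = vertcat(station{:});    % equivalent to cat(1, station{:})
```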
MegaMat = eval(['cat(1', num2str(1:10, ', station_%d'), ')']);
No need for loops:
str = num2str(1:10, 'station_%i,'); % returns the string 'station_1,station_2,...,station_10,'
str = str(1:end-1); % remove last comma
eval(['MegaMat = cat(1, ', str, ');'])

Not sure what to do about error message "Conversion to double from cell is not possible."

I'm writing a program that finds the indices of a matrix G where there is only a single 1 for either a column index or a row index, and removes any found index if it has a 1 for both the column and row index. Then I want to take these indices and use them as indices into an array U, which is where the trouble comes. The indices do not seem to be stored as integers, and I'm not sure what they are being stored as or why. I'm quite new to MATLAB (but that's probably obvious), so I don't really understand how types work in MATLAB or how they're assigned. So I'm not sure why I'm getting the error message mentioned in the title, and I'm not sure what to do about it. Any assistance you can provide would be greatly appreciated.
I forgot to mention this before, but G is a matrix that contains only 1s and 0s, and U is an array of strings (I think that is what would be called a cell array?).
function A = ISClinks(U, G)
B = [];
[rownum,colnum] = size(G);
j = 1;
for i=1:colnum
s = sum(G(:,i));
if s == 1
B(j,:) = i;
j = j + 1;
end
end
for i=1:rownum
s = sum(G(i,:));
if s == 1
if ismember(i, B)
B(B == i) = [];
else
B(j,:) = i;
j = j+1;
end
end
end
A = [];
for i=1:size(B,1)
s = B(i,:);
A(i,:) = U(s,:);
end
end
This is the problem code, but I'm not sure what's wrong with it.
A = [];
for i=1:size(B,1)
s = B(i,:);
A(i,:) = U(s,:);
end
Your program seems to be structured as though it had been written in a language like C. In MATLAB, you can often replace low-level loops with specialized functions (e.g. any() or sum() applied along a dimension). Your function could be written more efficiently as:
function A = ISClinks(U, G)
% Find columns and rows that contain exactly one 1, as the question requires
active_columns=sum(G,1)==1;
active_rows=(sum(G,2)==1).';
% (Optional) Prevent columns and rows with same index from being simultaneously set
%exclusive_active_columns = active_columns & ~active_rows; %not needed; this line is only for illustrative purposes
%exclusive_active_rows = active_rows & ~active_columns; %same as above
% Merge column state vector and row state vector by XORing them
active_indices=xor(active_columns,active_rows);
% Select appropriate rows of matrix U
A=U(active_indices,:);
end
This function does not cause errors with the example input matrices I tested. If U is a cell array (e.g. U={'Lorem','ipsum'; 'dolor','sit'; 'amet','consectetur'}), then return value A will also be a cell array.
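As for the error in the title: in the original loop, A starts out as [] (a double array) while U(s,:) is a cell, so the assignment A(i,:) = U(s,:) asks MATLAB to convert a cell to double. A minimal sketch of two ways around that, using the example U from above and a made-up index vector B:

```matlab
U = {'Lorem','ipsum'; 'dolor','sit'; 'amet','consectetur'};
B = [1; 3];                  % hypothetical indices, as computed from G

% Option 1: index U directly; the result keeps U's type (cell array).
A1 = U(B, :);

% Option 2: keep the loop, but start from an empty cell array, not [].
A2 = cell(0, size(U, 2));
for i = 1:size(B, 1)
    A2(i, :) = U(B(i), :);
end
```

Both variants return the same 2-by-2 cell array; the first avoids the loop altogether.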

MATLAB: Using a for loop within another function

I am trying to concatenate several structs. What I take from each struct depends on a function that requires a for loop. Here is my simplified array:
t = 1;
for t = 1:5 %this isn't the for loop I am asking about
a(t).data = t^2; %it just creates a simple struct with 5 data entries
end
Here I am doing concatenation manually:
A = [a(1:2).data a(1:3).data a(1:4).data a(1:5).data] %concatenation function
As you can see, the range (1:2), (1:3), (1:4), and (1:5) can be looped, which I attempt to do like this:
t = 2;
A = [for t = 2:5
a(1:t).data
end]
This results in an error "Illegal use of reserved keyword "for"."
How can I do a for loop within the concatenate function? Can I do loops within other functions in Matlab? Is there another way to do it, other than copy/pasting the line and changing 1 number manually?
You were close to getting it right! This will do what you want.
A = []; %% note: no need to initialize t, the for-loop takes care of that
for t = 2:5
A = [A a(1:t).data]
end
This seems strange though...you are concatenating the same elements over and over...in this example, you get the result:
A =
1 4 1 4 9 1 4 9 16 1 4 9 16 25
If what you really need is just the .data elements concatenated into a single array, then that is very simple:
A = [a.data]
A couple of notes about this: why are the brackets necessary? Because the expressions
a.data, a(1:t).data
don't return all the numbers in a single array, like many functions do. They return a separate answer for each element of the structure array. You can test this like so:
>> [b,c,d,e,f] = a.data
b =
1
c =
4
d =
9
e =
16
f =
25
Five different answers there. But MATLAB gives you a cheat -- the square brackets! Put an expression like a.data inside square brackets, and all of a sudden those separate answers are compressed into a single array. It's magic!
Another note: for very large arrays, the for-loop version here will be very slow. It would be better to allocate the memory for A ahead of time. In the for-loop here, MATLAB is dynamically resizing the array each time through, and that can be very slow if your for-loop has 1 million iterations. If it's less than 1000 or so, you won't notice it at all.
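A minimal sketch of that preallocation for the loop above. The total length is known in advance (2+3+4+5 elements), so A can be allocated once and filled in place; counts and pos are names introduced for the illustration:

```matlab
a = struct('data', {1, 4, 9, 16, 25});   % same values as the struct built above

counts = 2:5;                 % number of elements appended on each pass
A = zeros(1, sum(counts));    % allocate the final size once
pos = 0;                      % last position of A written so far
for t = counts
    A(pos + (1:t)) = [a(1:t).data];
    pos = pos + t;
end
```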
Finally, the reason that HBHB could not run your struct creating code at the top is that it doesn't work unless a is already defined in your workspace. If you initialize a like this:
%% t = 1; %% by the way, you don't need this, the t value is overwritten by the loop below
a = []; %% always initialize!
for t = 1:5 %this isn't the for loop I am asking about
a(t).data = t^2; %it just creates a simple struct with 5 data entries
end
then it runs for anyone the first time.
As an appendix to gariepy's answer:
The matrix concatenation
A = [A k];
as a way of appending to it is actually pretty slow. You end up reassigning N elements every time you concatenate to an N size vector. If all you're doing is adding elements to the end of it, it is better to use the following syntax
A(end+1) = k;
In MATLAB this is optimized such that on average you only need to reassign about 80% of the elements in a matrix. This might not seem like much, but for 10k elements it adds up to roughly an order of magnitude of difference in time (at least for me).
Bear in mind that this works only in MATLAB 2012b and higher, as described in this thread: Octave/Matlab: Adding new elements to a vector
This is the code I used. tic/toc syntax is not the most accurate method for profiling in MATLAB, but it illustrates the point.
close all; clear all; clc;
t_cnc = []; t_app = [];
N = 1000;
for n = 1:N;
% Concatenate
tic;
A = [];
for k = 1:n;
A = [A k];
end
t_cnc(end+1) = toc;
% Append
tic;
A = [];
for k = 1:n;
A(end+1) = k;
end
t_app(end+1) = toc;
end
t_cnc = t_cnc*1000; t_app = t_app*1000; % Convert to ms
% Fit a straight line on a log scale
P1 = polyfit(log(1:N),log(t_cnc),1); P_cnc = @(x) exp(P1(2)).*x.^P1(1);
P2 = polyfit(log(1:N),log(t_app),1); P_app = @(x) exp(P2(2)).*x.^P2(1);
% Plot and save
loglog(1:N,t_cnc,'.',1:N,P_cnc(1:N),'k--',...
1:N,t_app,'.',1:N,P_app(1:N),'k--');
grid on;
xlabel('log(N)');
ylabel('log(Elapsed time / ms)');
title('Concatenate vs. Append in MATLAB 2014b');
legend('A = [A k]',['O(N^{',num2str(P1(1)),'})'],...
'A(end+1) = k',['O(N^{',num2str(P2(1)),'})'],...
'Location','northwest');
saveas(gcf,'Cnc_vs_App_test.png');

Save a sparse array in csv

I have a huge sparse matrix a and I want to save it in a .csv file. I cannot call full(a) because I do not have enough RAM, so calling dlmwrite with full(a) as the argument is not possible. Note that dlmwrite does not work with sparse matrices.
The .csv format is depicted below. Note that the first row and column with the characters should be included in the .csv file. The semicolon in the (0,0) position of the .csv file is necessary too.
;A;B;C;D;E
A;0;1.5;0;1;0
B;2;0;0;0;0
C;0;0;1;0;0
D;0;2.1;0;1;0
E;0;0;0;0;0
Could you please help me to tackle this problem and finally save the sparse matrix in the desired form?
You can use the csvwrite function:
csvwrite('matrix.csv',a)
You could do this iteratively, as follows:
A = sprand(20,30000,.1);
delimiter = ';';
filename = 'filecontaininghugematrix.csv';
dims = size(A);
N = max(dims);
% create names first
idx = 1:26;
alphabet = dec2base(9+idx,36);
n = ceil(log(N)/log(26));
q = 26.^(1:n);
names = cell(sum(q),1);
p = 0;
for ii = 1:n
temp = repmat({idx},ii,1);
names(p+(1:q(ii))) = num2cell(alphabet(fliplr(combvec(temp{:})')),2);
p = p + q(ii);
end
names(N+1:end) = [];
% formats for writing
headStr = repmat(['%s' delimiter],1,dims(2));
headStr = [delimiter headStr(1:end-1) '\n'];
lineStr = repmat(['%f' delimiter],1,dims(2));
lineStr = ['%s' delimiter lineStr(1:end-1) '\n'];
fid = fopen(filename,'w');
% write header
header = names(1:dims(2));
fprintf(fid,headStr,header{:});
% write matrix rows
for ii = 1:dims(1)
row = full(A(ii,:));
fprintf(fid, lineStr, names{ii}, row);
end
fclose(fid);
The names cell array is quite memory demanding for this example. I have no time to fix that now, so think about this part yourself if it is really a problem ;) Hint: just write the header element wise, first A;, then B; and so on. For the rows, you can create a function that maps the index ii to the desired character, in which case the complete first part is not necessary.
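One possible sketch of such a mapping, assuming the labels should run A, B, ..., Z, AA, AB, ... (the same order the names array above produces). The example indices and the variable names are made up for the illustration; in the writing loop, the inner while loop would be run for each ii instead of looking up names{ii}:

```matlab
idx = [1 26 27 28 702 703];           % example indices to convert
labels = cell(size(idx));
for n = 1:numel(idx)
    k = idx(n);
    name = '';
    while k > 0
        r = mod(k - 1, 26);           % 0..25 maps to 'A'..'Z'
        name = [char('A' + r), name]; % prepend the next letter
        k = (k - 1 - r) / 26;         % move on to the next position
    end
    labels{n} = name;
end
```

This needs no precomputed table, so memory use no longer grows with the size of the matrix.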

insert NaNs where needed in matlab

I have the following three vectors and I want to insert NaNs into B where A is missing data points that are present in An. So my Bn should be [0.1;0.2;0.3;NaN;NaN;0.6;0.7]. How can I get Bn? Thanks. --Jackie
A=[1;2;3;6;7];
An=[1;2;3;4;5;6;7];
B=[0.1;0.2;0.3;0.6;0.7];
Okay, so first off: I originally wrote that you can't store NaN in a matrix and that it would have to go into a cell array. Forget that part (thanks, David K.): NaN is a numeric value and fits in an ordinary double array.
The code snippet below gives you your solution.
Please let me know any questions or concerns you might have.
% NaN solution for Jackie
A=[1;2;3;6;7]; An=[1;2;3;4;5;6;7]; B=[0.1;0.2;0.3;0.6;0.7];
len = max(length(A),length(An));
Bn = zeros(len,1);
k = 0; % number of NaNs inserted so far, so that B is never indexed outside of its size
for i =1 :len
ind= A(An(i)==A); % empty when An(i) does not occur in A
if isempty(ind)
Bn(i) = NaN;
k = k+1;
else
Bn(i) = B(i-k);
end
end
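As a loop-free alternative (not part of the answer above), the same result can be sketched with ismember, which reports for each element of An whether it occurs in A and at which position:

```matlab
A  = [1;2;3;6;7];
An = [1;2;3;4;5;6;7];
B  = [0.1;0.2;0.3;0.6;0.7];

[tf, loc] = ismember(An, A);   % tf(i) is true when An(i) occurs in A, at A(loc(i))
Bn = NaN(size(An));            % start with NaN everywhere
Bn(tf) = B(loc(tf));           % copy the values that do exist
```

Unlike the loop, this does not rely on A being sorted in the same order as An.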