Reading a text file with varying length of lines - matlab

I have a data file (.txt) which is as follows:
A 2.2 5
B 3.2 0.5
C 0 2
A 3 2 B
A 2 6 C
B 2.3 4.5 C
The first three lines represent the nodes (name, feature1, feature2), whereas the last three represent the relations between nodes (node A, node B, node C). As you can see, nodes and relations have different formats (nodes = string numeric numeric, relations = string numeric numeric string). In the end I will plot them based on their initial features and relations through time. I tried a couple of things, but the fact that nodes have 3 parameters and relations have 4 makes it difficult to solve.
So, basically, I want to read the text file line by line, define all the nodes with their parameters as string numeric numeric, and define all the relations as well, so that I can plot them in the end.
Any help is appreciated.

Check out the built-in function fgetl.
fid = fopen(filename);
lineoftext = fgetl(fid);
while ischar(lineoftext)
    C = strsplit(strtrim(lineoftext)); % this will be a cell array
    if length(C) == 3
        % then it's a node, put code here
    else
        % then it's relational, put code here
    end
    lineoftext = fgetl(fid);
end
fclose(fid);
This will read a single line from the file, split it into chunks of text in a cell array, and then count the number of chunks to see if it's a node or a relation line. You'll have to put your own code inside the if/else branches. Then it reads in another line and does it all over again. When it reaches the end of the file, fgetl returns -1, so ischar(lineoftext) is false and the while loop ends.
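As a rough illustration of what the two branches could contain, here is a minimal sketch that parses the six sample lines from the question into two struct arrays. The variable textLines and the field names (name, features, from, values, to) are made up for this example and are not part of the original answer; in a real script the lines would come from fgetl as shown above.

```matlab
% Hypothetical stand-in for the lines that fgetl would return from the file.
textLines = {'A 2.2 5', 'B 3.2 0.5', 'C 0 2', 'A 3 2 B', 'A 2 6 C', 'B 2.3 4.5 C'};

% Empty struct arrays; the field names are illustrative assumptions.
nodes = struct('name', {}, 'features', {});
relations = struct('from', {}, 'values', {}, 'to', {});

for k = 1:numel(textLines)
    C = strsplit(strtrim(textLines{k}));   % cell array of tokens
    if numel(C) == 3
        % node: string numeric numeric
        nodes(end+1) = struct('name', C{1}, 'features', str2double(C(2:3)));
    elseif numel(C) == 4
        % relation: string numeric numeric string
        relations(end+1) = struct('from', C{1}, 'values', str2double(C(2:3)), 'to', C{4});
    end
end
```

After the loop, nodes(1).features holds [2.2 5] and relations(1).to holds 'B', so the numeric parts are ready for plotting.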

Related

Matlab, load adjacency lists (irregular file)

I have already looked here and here, but I'm not sure I found what I need.
I have an irregular file (that represents neighbors of particles 1 to 5) that looks like this (the fourth line is empty, since particle 4 has no neighbors):
2 3 5
1 3
1 2

1
I want to figure out a way to load it (as 'something' called A) and do the following things:
Count the number of elements on one line (for instance size(A(1,:)) should give me 3)
Given an array B (of size 5), select the elements of B corresponding to the indices given by a line (something like B(A(1,:)) should give me [B(2) B(3) B(5)])
Since you want arrays whose size depends on their first index, you're probably left with cell arrays. Your cell array A could be such that A{1} equals [2 3 5] and A{2} equals [1 3] in your example, etc. To do this, you can read your file infile with
fid=fopen(infile,'rt');
A={}; %start with an empty cell array
while 1
    nextline=fgets(fid);
    if ~ischar(nextline) %fgets returns -1 when EOF is reached
        break;
    end
    A{end+1}=sscanf(nextline,'%d');
end
fclose(fid);
%demonstrate use for indexing
B=randi(10,5,1);
B(A{3}) %2-element vector
B(A{4}) %empty vector
Then A{i} is a vector corresponding to the ith line in your file. If the line is empty, then it's the empty vector. You can use it to index B as you want to; see the example above. Note that a blank line at the very end of your infile will give you a spurious empty element in A. If you know in advance how many lines you need to read, this is not an issue.
The number of entries in line i is given by length(A{i}), with i=1:length(A).
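To make the two requested operations concrete, here is a small sketch that skips the file reading and builds A by hand. It assumes the file has five lines with the fourth one empty (which is what the B(A{4}) comment above implies); the values in B are arbitrary and chosen only so the result is easy to check.

```matlab
% A as the reading loop would build it, assuming line 4 of the file is empty.
% sscanf returns column vectors, so each entry is a column.
A = {[2; 3; 5], [1; 3], [1; 2], [], 1};

% Number of entries on every line at once.
counts = cellfun(@numel, A);    % [3 2 2 0 1]

% Arbitrary example data for the 5 particles.
B = (10:10:50)';

% Select the elements of B listed on line 1.
neighborsOf1 = B(A{1});         % [20; 30; 50]
```

cellfun(@numel, A) is just a compact way of calling length(A{i}) for every i.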

Copy matrix rows matlab

Let's say I have a 300x65 matrix A. The last (65th) column contains ordered values (1, 2, 3): the first 102 elements are 1, the next 50 elements are 2, and the remainder are 3.
I have another matrix B, which is 3x65, and I want to repeat the first row of B as many times as there are 1s in matrix A. The second row of B should be repeated as many times as there are 2s in matrix A, and the third row as many times as there are 3s. This should turn B into a 300x65 matrix.
I've tried to use MATLAB's repmat function with no success. Does anyone know how to do this?
There are several inconsistencies in your problem.
First, if you copy one row of B for every element of A (which is what your description amounts to), the result will be a 19500x65 matrix.
Second, "copy" is a vague term: do you mean duplicate? Do you want to store the copied values in a new variable?
What I gathered from your problem is that you want to perform some operation between A and B to create a matrix and store it in B, which will corrupt the process as it goes unless you have another variable to store the result in.
So I suggest using a third variable C to store the result, and then, if you need it to be in B, setting B = C.
Also, for whatever process you described, I recommend learning to use a 'for' loop effectively, because it seems like that is what you would need.
Syntax for a 'for' loop:
for i = start:increment:stop
    % loops once for each element of start:increment:stop
    % sets i to the nth element of start:increment:stop, where n is the number of times the loop has run
end
If I understand your question, this should do it:
index = A(:,end); % will be a column of numbers with values of 1, 2, or 3
newB = B(index,:); % B has 3 rows, which are copied as required by "index"
This should result in newB having the same number of rows as A and the same number of columns as the original B.
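The same indexing trick can be checked on a scaled-down case. The sketch below uses a made-up 6x2 matrix A and a 3x2 matrix B instead of the 300x65 and 3x65 matrices from the question; only the sizes and values are assumptions, the two indexing lines are the ones from the answer.

```matlab
% Small stand-ins for the matrices in the question (values are arbitrary).
A = [zeros(6, 1), [1; 1; 2; 3; 3; 3]];   % last column holds the labels 1, 2, 3
B = [10 11; 20 21; 30 31];               % one row per label

index = A(:, end);      % column of row numbers into B
newB = B(index, :);     % row k of newB is row index(k) of B
% newB is [10 11; 10 11; 20 21; 30 31; 30 31; 30 31]
```

Indexing with a repeated row number simply repeats that row, which is why no repmat call is needed.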

Matlab doesn't recognize empty values at the end of a CSV file line

I am reading a CSV file with MATLAB; the file can contain empty values, which I want to convert into 0:
FID=fopen('/file.txt','r');
text_line = fgetl(FID);
C = textscan(text_line,'%d','delimiter',',','EmptyValue', 0);
If the empty value is in the middle of the line, e.g.
5,,6
everything works fine and the variable C gets
5 0 6
as values. If the empty value is at the end, e.g.
5,6,
Matlab doesn't recognize it and the C variable gets
5 6
as values, instead of
5 6 0
EDIT after Dennis's answer:
I don't understand why the expected number of elements is needed; I give the delimiter, shouldn't that be enough? Anyway, I tried it and the result is different: with %d%d%d I get
C =
[5] [0x1 int32] [6]
with %d everything is in the first element so
C{1}
ans =
5
0
6
This code snippet is part of a procedure which imports a very big CSV matrix into a MATLAB sparse matrix (see my post Handling a very big and sparse matrix in Matlab), and I guess (not tried yet) that the first approach is faster.
Anyway, my lines actually contain more than 290k values each, so I guess specifying all the %d would not be feasible.
Judging from this answer on matlab central you need to tell Matlab how many values you expect.
In your case I would expect this to translate to:
FID=fopen('/file.txt','r');
text_line = fgetl(FID);
C = textscan(text_line,'%d%d%d','delimiter',',','EmptyValue', 0);
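If listing one %d per value is impractical, one possible alternative (not taken from the answer above) is to split the line on commas yourself and convert the pieces. This is only a sketch: it swaps textscan for regexp and str2double, it returns doubles rather than int32, and it has not been timed on lines with hundreds of thousands of values.

```matlab
% Example line with an empty value at the end.
text_line = '5,6,';

% Split on every comma; empty fields (including a trailing one) are kept
% as empty strings.
fields = regexp(text_line, ',', 'split');   % {'5', '6', ''}

% Convert to numbers; empty strings become NaN.
values = str2double(fields);

% Replace the NaN placeholders with 0.
values(isnan(values)) = 0;                  % [5 6 0]
```

Note that this also turns any non-numeric field into 0, since str2double returns NaN for anything it cannot parse.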

Writing data from multiple arrays into separate columns in matlab

I am a beginner in Matlab programming and I have been trying to generate a *.dat file for my function. It sits in a loop that generates a number of subplots based on the number of parameters I initiate. It looks like this:
x=-10:10;
parameter_Array = [1 2 3 4 5];
for i = 1:length(parameter_Array)
    subplot(length(parameter_Array),1,i);
    S = x+parameter_Array(i);
    plot(x,S);
end
What I would like to do is this: on the first of the five passes through the loop above, write the arrays x and S into two columns separated by a tab; on the second pass, write S into a third column; on the third pass, write S into a fourth column; and so on. So the file should have the following form:
x S1 S2 S3 S4 S5
How do I go about doing this? I am trying to figure out a solution by inserting the following at the end of each loop:
fileID = fopen('MyFilet.dat','w');
fprintf(fileID,'%6s %12s\n','x','S(x)');
fprintf(fileID,'%6.6f %6.6f\n',x,S);
fclose(fileID);
The output file has two things wrong with it: 1. both columns contain the same data (I believe it's actually S, not x); and 2. I only get the data from the last iteration of the for loop (I was expecting this one).
I would be grateful for any help you can provide.
Thank you!
This will save as test.dat.
x=(-10:10)';
parameter_Array = [1 2 3 4 5]';
for i = 1:length(parameter_Array)
    subplot(length(parameter_Array),1,i);
    S(:,i) = x+parameter_Array(i);
    plot(x,S(:,i));
end
D=[x S];
dlmwrite('test.dat',D,'\t')
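If the header row from the question (x S1 S2 ...) is also wanted, one way is to go back to fprintf and pass the transposed matrix. The sketch below is an assumption-laden variant of the answer: it leaves out the plotting, writes to a temporary file (outfile is a made-up name) and hard-codes the header for five parameters. fprintf consumes its data arguments column by column, which is also why fprintf(fileID, fmt, x, S) in the question prints all of x before any of S.

```matlab
x = (-10:10)';
parameter_Array = [1 2 3 4 5];

% One column of S per parameter; bsxfun expands x against the row vector.
S = bsxfun(@plus, x, parameter_Array);
D = [x S];                                  % 21 rows, 6 columns

outfile = [tempname '.dat'];                % hypothetical output location
fid = fopen(outfile, 'w');
fprintf(fid, 'x\tS1\tS2\tS3\tS4\tS5\n');    % header row

% Row format: one tab-separated %.6f per column of D.
fmt = [repmat('%.6f\t', 1, size(D, 2) - 1) '%.6f\n'];

% D' is passed so that each row of D is consumed as one line of output.
fprintf(fid, fmt, D');
fclose(fid);
```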

matlab: understanding matlab behavior

Could somebody explain the following code snippet? I have no background in computer science or programming and just recently became aware of Matlab. I understand the preallocation part from data=ceil(rand(7,5)*10)... to ...N*(N-1)/2).
I need to understand every aspect of how MATLAB processes the code from kk=0 to the end, and also the reasons why the code is written in that manner. There's no need to explain what bsxfun(@minus, ...) does in general, just how it operates in the scheme of this code.
data=ceil(rand(7,5)*10);
N = size(data,2);
b=cell(N-1,1);
c=NaN(size(data,1),N*(N-1)/2);
kk=0;
for ii=1:N-1
    b{ii} = bsxfun(@minus,data(:,ii),data(:,ii+1:end));
    c(:,kk+(1:N-ii)) = bsxfun(@minus,data(:,ii),data(:,ii+1:end));
    kk=kk+N-ii;
end
Start at zero
kk=0;
Loop with ii going from 1 up to N-1, incrementing by 1 every iteration. Type 1:10 in the MATLAB command line and you'll see that it outputs 1 2 3 4 5 6 7 8 9 10. This colon operator is a very important operator to understand in MATLAB.
for ii=1:N-1
b{ii} = ... this just stores a matrix in the next element of the cell vector b. Cell arrays can hold anything in each of their elements; this is necessary because in this case each iteration creates a matrix with one fewer column than the previous iteration.
data(:,ii) --> just get the ii-th column of the matrix data (: means get all the rows)
data(:,ii+1:end) means get a subset of the matrix data consisting of all the rows but only the columns that appear after column ii
bsxfun(@minus, data(:,ii), data(:,ii+1:end)) --> subtract each column of the matrix data(:,ii+1:end) from the single column data(:,ii)
b{ii} = bsxfun(@minus,data(:,ii),data(:,ii+1:end));
The next line does the same computation as the line above, but instead of storing the resulting matrix in a separate cell of a cell array, it writes the new matrix into the next free columns of the preallocated array c. Note that the new matrix has the same number of rows each time but one fewer column, so the results are laid out side by side as new columns.
c(:,kk+(1:N-ii)) = ... --> 1:(N-ii) produces the numbers 1 up to the number of columns in the result of this iteration. In MATLAB, you can index an array using another array. For example, try this in the MATLAB command line: a = [0 0 0 0 0]; a([1 3 5]) = 1. The result you should see is a = 1 0 1 0 1. You can also extend a matrix like this, so for example now type a(6) = 2. The result: a = 1 0 1 0 1 2. So by using c(:, 1:N-ii) we are indexing all the rows of c and also the right number of columns (in order). Adding kk just offsets it so that we do not overwrite our previous results.
c(:,kk+(1:N-ii)) = bsxfun(@minus,data(:,ii),data(:,ii+1:end));
Now we just increment kk by the number of new columns we added, so that in the next iteration the new results are placed after the existing ones.
kk=kk+N-ii;
end
I suggest that you put a breakpoint in this code, step through it line by line, and look at how the variables change in MATLAB. To do this, click on the little dashed line next to kk=0; in the m-file; you will see a red dot appear there. Then run the code. The code will only execute as far as the dot, and you are now in debug mode. If you hover over a variable in debug mode, MATLAB will show its contents in a tooltip. For a really big variable, check it out in the workspace. Now step through the code line by line and use my explanations above to make sure you understand how each line changes each variable. For more complex lines like b{ii} = bsxfun(@minus,data(:,ii),data(:,ii+1:end)); you should highlight code snippets and run these in the command line to see what each part is doing. For example, run data(:,ii) to see what that does, and then try data(:,ii+1:end) or even just ii+1:end (well, in that case it won't work; replace end with size(data, 2)). Debugging is the best way to understand code that confuses you.
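One quick way to convince yourself of the relationship between b and c described above is to run the loop and compare the two results. This check is an addition, not part of the original code; it relies on the fact that [b{:}] concatenates the matrices stored in b side by side.

```matlab
data = ceil(rand(7, 5) * 10);
N = size(data, 2);
b = cell(N-1, 1);
c = NaN(size(data, 1), N*(N-1)/2);
kk = 0;
for ii = 1:N-1
    % column ii minus every later column
    d = bsxfun(@minus, data(:, ii), data(:, ii+1:end));
    b{ii} = d;                 % stored as a separate matrix
    c(:, kk + (1:N-ii)) = d;   % stored in the next N-ii columns of c
    kk = kk + N - ii;
end

% c is the matrices in b laid side by side.
sameResult = isequal(c, [b{:}]);   % true
```

With N = 5 the blocks have 4, 3, 2 and 1 columns, which add up to the N*(N-1)/2 = 10 columns preallocated for c.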
bsxfun(@minus,A,B)
is almost the same as
A-B
The difference is that the bsxfun version will handle inputs of different size: in each dimension (“direction,” if you find it easier to think about that way), if one of the inputs has size 1 and the other one is larger, the size-1 input will simply be repeated as often as needed.
http://www.mathworks.com/help/techdoc/ref/bsxfun.html
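A tiny example of that expansion, with made-up values: a 3x1 column minus a 3x2 matrix. The repmat line shows the explicit repetition that bsxfun does implicitly.

```matlab
A = [1; 2; 3];                 % 3x1 column
B = [10 20; 10 20; 10 20];     % 3x2 matrix

% A is repeated along the second dimension to match B.
D = bsxfun(@minus, A, B);      % [-9 -19; -8 -18; -7 -17]

% Same result with the repetition written out by hand.
D2 = repmat(A, 1, 2) - B;
```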