I want to write a sparse matrix to a text file. Let's say my sparse matrix is A. The first row of A has nonzero values at column indices 10 and 11. The second row has nonzero values at column indices 1 and 2. Then, when I write the data to a text file, it should look something like this:
10 11
1 2
......
....
How can I do this in MATLAB?
First post, so not sure if posting code is allowed, but maybe this is what you're looking for?
[rows, cols] = size(A);
outFile = fopen('sparse.txt', 'a');
for m = 1:rows
    n = 1;
    while n <= cols
        if A(m, n) ~= 0
            fprintf(outFile, '%d ', n);   % write the column index, not the stored value
        end
        n = n+1;
    end
    fprintf(outFile, '\r\n');   % end the line for this row
end
fclose(outFile);
Given the constraints, I only really see one sensible option for the given format:
f = fopen('out.txt', 'w');
for ii=1:size(mat, 1)
    fprintf(f, '%u ', find(mat(ii, :)));
    fprintf(f, '\n');
end
fclose(f);
Since the number of elements per row isn't constant, this means two things:
We can't construct a single matrix to pass to a vectorised function, so we're stuck with some form of per-row operation.
We also can't give fprintf a constant format string to write one or more whole rows in a single call, so we're stuck with multiple fprintf calls.
So, optimising for those constraints:
Iterate over the rows directly - yes, we could pull out the row and column indices all at once with find, but then we've wasted memory effectively copying the entire dataset, and we'd still have to iterate over that somehow.
Minimise the amount of high-level work - let the low-level internals of find and vectorised fprintf at least make processing a single row as quick as feasibly possible.
Sometimes a simple loop really is the best option - it's liable to be I/O-bound anyway, so the overhead of the loop itself really should be negligible compared to even the minimal number of fprintf calls.
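As a quick check of the per-row loop above, here is a minimal sketch on a small matrix matching the question's description. The matrix size (2 by 12), the stored values, and the temporary file name are assumptions made only for this illustration.

```matlab
% Hypothetical 2x12 sparse matrix: row 1 has nonzeros in columns 10 and 11,
% row 2 has nonzeros in columns 1 and 2.
mat = sparse([1 1 2 2], [10 11 1 2], [5 6 7 8], 2, 12);

fname = [tempname, '.txt'];      % throwaway file name, not from the original post
f = fopen(fname, 'w');
for ii = 1:size(mat, 1)
    fprintf(f, '%u ', find(mat(ii, :)));   % column indices of the nonzeros in row ii
    fprintf(f, '\n');
end
fclose(f);

written = fileread(fname);       % file contents as one char vector
disp(written)
delete(fname);
```

Each row of the file should then list the column indices of that row's nonzeros, with a trailing space before the newline because of the '%u ' format.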
I am trying to compute this [formula shown as an image in the original post] in MATLAB, but the code takes about 8 hours to run. In particular, e, Ft=[h(t);q(t)] and Omega are 2x1 matrices (e' is 1x2), Gamma is a 2x2 matrix and n=30. Can someone help me optimize this code?
I tried in this way:
aux=[0;0];
for k=0:29
    for j=1:k-1
        aux=[aux Gamma^j*Omega];
    end
    E(t,k+1)= e'*(sum(aux,2)+Gamma^k*[h(t);q(t)]);
end
Vix=1/30*sum(E,2);
EDIT
I have now changed it to this, and it is faster, but I am not sure that I am applying the formula in the picture correctly...
for t=2:T
    % 1. compute today's volatility
    csi(t) = log(SP500(t)/SP500(t-1))-r(t)+0.5*h(t);
    q(t+1) = omega+rho*q(t)+phi*((csi(t)-lambda*sqrt(h(t)))^2-h(t));
    h(t+1) = q(t)+alpha*((csi(t)-lambda*sqrt(h(t)))^2-q(t))+beta*(h(t)-q(t));
    for k=1:30
        aux=zeros(2,k);
        for j=0:k-1
            aux(:,j+1)=Gamma^j*Omega;
        end
        E(t,k)= e'*(sum(aux,2)+Gamma^k*[h(t);q(t)]);
    end
end
Vix(2:end)=1/30*sum(E(2:end,:),2);
(I don't need Vix(1))
Here are some reasons I can think of:
REPEATED COPYING (no preallocation) The main reason for the long run time is the line aux=[aux Gamma^j*Omega], in which an array is concatenated at every loop iteration. The Code Analyzer in MATLAB's editor should have flagged this line and suggested preallocating the array, for example with zeros.
Essentially, when one concatenates arrays this way, MATLAB internally makes a copy of the array at every loop iteration, so copying takes place in addition to the math operations. As the array grows, the copying operations become ever more expensive. This is avoided by preallocation, which consists of predefining the size of the storage array (in this case the variable aux) so that MATLAB doesn't have to keep allocating space on the go. Try:
aux = zeros(2, 406); % Creates a 2 by 406 array. I explain how I get 406 below:
p = 0; % A variable that indexes the columns of aux
for k=0:29
    for j=1:k-1
        p = p+1; % Update column counter
        aux(:,p) = Gamma^j*Omega; % A 2x2 matrix multiplied by a 2x1 matrix yields a 2x1.
    end
    E(t,k+1)= e'*(sum(aux,2)+Gamma^k*[h(t);q(t)]);
end
Vix=1/30*sum(E,2);
Now, MATLAB simply overwrites the individual elements of aux instead of copying aux, and concatenating it with Gamma^j*Omega, and then overwriting aux. Essentially, the above makes MATLAB allocate space for aux ONCE instead of 406 times. I figured out that aux ends up being a 2 by 406 array for the n=30 case in the end by running this code:
p = 0;
for k = 0:29
    for j = 1:k-1
        p = p + 1;
    end
end
To know the final size of aux for other values of n you should see if a formula for it is available (or derive your own).
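For this particular pair of loops the count does have a closed form: the inner loop runs k-1 times for each k from 2 to n-1, which sums to (n-1)(n-2)/2, i.e. 406 for n=30. A small sketch that checks the closed form against the counting loop (the name pFormula is only illustrative):

```matlab
n = 30;

% Count the inner-loop iterations directly, as in the snippet above.
p = 0;
for k = 0:n-1
    for j = 1:k-1
        p = p + 1;
    end
end

% Closed form: sum of (k-1) for k = 2..n-1.
pFormula = (n-1)*(n-2)/2;

fprintf('loop count = %d, closed form = %d\n', p, pFormula);
```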
LOOPING TRANSPOSITION OF A CONSTANT?
Next, e'. As you may know, ' is the transpose operation. From your sample code, the variable e is not edited inside the for loops, yet you have the ' operator inside the outer for loop. If you perform the transpose operation once outside the outer for loop you save yourself the expense of transposing it at every loop iteration.
RUNNING TOTAL
As a final note, I would suggest replacing sum(aux,2) with a variable that keeps a running total. This is because currently, this makes MATLAB sum over the entirety of aux at every loop iteration.
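A minimal sketch of the running-total idea, assuming the quantity wanted for each k is e'*(sum of Gamma^j*Omega for j=0..k-1, plus Gamma^k*Ft), as coded in the asker's edited version. The numeric values of Gamma, Omega, e and Ft are hypothetical and chosen only so the sketch runs; the slow direct evaluation is included purely as a cross-check.

```matlab
% Hypothetical inputs, chosen only so the sketch runs.
Gamma = [0.90 0.05; 0.02 0.95];
Omega = [0.01; 0.02];
e     = [1; 0];
Ft    = [0.04; 0.03];            % stands in for [h(t); q(t)] at one time t
n     = 30;

et = e';                         % transpose once, outside the loop
S  = zeros(2, 1);                % running total of Gamma^j*Omega, j = 0..k-1
P  = eye(2);                     % current power of Gamma, starts at Gamma^0
Ek = zeros(1, n);                % preallocated results for k = 1..n

for k = 1:n
    S = S + P*Omega;             % add the j = k-1 term
    P = P*Gamma;                 % P is now Gamma^k
    Ek(k) = et*(S + P*Ft);
end

% Direct (slow) evaluation for comparison.
EkDirect = zeros(1, n);
for k = 1:n
    acc = zeros(2, 1);
    for j = 0:k-1
        acc = acc + Gamma^j*Omega;
    end
    EkDirect(k) = et*(acc + Gamma^k*Ft);
end
maxDiff = max(abs(Ek - EkDirect));
```

Besides avoiding the repeated sum, this also replaces the repeated matrix powers with one multiplication per step; maxDiff should be at the level of rounding error.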
Hope this helps mate.
I am trying to sum in the second dimension a matrix QI in Matlab. The trick is, the columns contain a series of increasing numbers, but not all columns have the same number of elements (i.e. numel(QI(:,1)) ~= numel(QI(:,2)) and so on). For the sake of clarity, I attach a picture of it. Note that I padded the missing areas with 0, so the previous condition becomes nnz(QI(:,1)) ~= nnz(QI(:,2)).
One initial strategy that I thought of was to treat this as an image and construct a mask for each different gradient level, but that seems like a tedious job.
Does anyone have a better idea on how to do this? I should also mention that I am free to modify how QI is generated, but I'd rather not if there is a solution for this problem.
EDIT:
Hopefully the new colored image should give a better understanding.
FYI, each column was previously stored in a cell array without the trailing zeros. Then I extracted the columns one by one and stored them in a matrix in order to perform the summation, padding the extra zeros whenever the length isn't the same.
Generally these column data should have the same number of rows, but sometimes that's not the case and, even worse, they do not align properly.
I'm starting to wonder whether it would be better to rework the code that generates the cell arrays rather than this matrix. Thoughts?
Thank you,
Edit: following your comment, I modified the answer. Be aware that your data cannot really be "aligned", because the columns do not have the same number of values.
One way would be to use a cell array as storage for your measurements.
valueMissing = 0; % here you can put the default value you want
% transform your matrix into a cell array
QICell = arrayfun(@(x) QI(QI(:,x)~=valueMissing,x), 1:size(QI,2),'UniformOutput', false);
Now you can sum the last elements of the vectors inside the cell array:
QIsum = sum(cellfun(@(x) x(end), QICell))
Or reorder the vectors so that their last elements are "aligned":
QICellReordered = cellfun(@(x) x(end:-1:1),QICell, 'UniformOutput',false);
Then you can make all possible sums:
m = min(cellfun(@numel, QICellReordered));
QIsum = zeros(m,1);
for i=1:m
    QIsum(i) = sum(cellfun(@(x) x(i), QICellReordered));
end
% reorder QIsum to your original order
QIsum = QIsum(end:-1:1);
I hope this helps!
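To make the "aligned on the last element" idea concrete, here is a small worked sketch of the steps above. The 3-by-2 matrix is a made-up example, not the asker's data, and it assumes that 0 never occurs as a genuine measurement (otherwise the mask would drop it too).

```matlab
% Hypothetical padded matrix: column 1 holds [1;2;3], column 2 holds [2;3] plus one pad.
QI = [1 2; 2 3; 3 0];
valueMissing = 0;

QICell = arrayfun(@(x) QI(QI(:,x) ~= valueMissing, x), 1:size(QI,2), 'UniformOutput', false);

% Sum of the last valid element of every column: 3 + 3.
lastSum = sum(cellfun(@(x) x(end), QICell));

% Align on the last element and sum position by position.
QICellReordered = cellfun(@(x) x(end:-1:1), QICell, 'UniformOutput', false);
m = min(cellfun(@numel, QICellReordered));
QIsum = zeros(m, 1);
for i = 1:m
    QIsum(i) = sum(cellfun(@(x) x(i), QICellReordered));
end
QIsum = QIsum(end:-1:1);         % back to the original order
```

Here only the last two positions exist in both columns, so QIsum has two entries: 2+2 for the second-to-last elements and 3+3 for the last ones.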
I want to write a function that takes number n as input, then outputs a tab separated word document that looks like 5 rows of:
1 2 3...n n n-1 n-2 ..1
Let me tell you what I have tried already: It is easy to create a vector like this with the integers I want, but if I save a file in an ascii format, in the output the integers come out in a format like " 1.0000000e+00".
Now I googled to find that the output can be formatted using %d and fprintf, but given the row length is part of the input, what would be the most efficient way to achieve it?
maybe something like this:
Nrow = 5;
N = 10;
dlmwrite('my_filename.txt', repmat([1:N, N:-1:1], Nrow, 1), 'delimiter', '\t', 'precision', '%d');
If you mean a normal *.txt kind of file, I would normally use a for loop with fprintf(fileid,'%d things to print',5), with the appropriate fopen(.) statement. You'd be surprised what a good job fopen with 'w' and 'a' does. Try it and let us know!
In response to rayryeng: You are right! Here is a sample of code for writing a matrix to file using fprintf, without a for-loop.
A=rand(5);
fid=fopen('Rand_mat.txt','w');
fprintf(fid,'%0.4f %0.4f %0.4f %0.4f %0.4f\n',A');
fclose(fid);
Here A is transposed because fprintf consumes the matrix elements in column-major order, so each column of A' (that is, each row of A) fills one line of the format.
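When the row length depends on the input, the format string itself can be built at run time. The following is a sketch under that assumption, applied to the 1..n, n..1 rows from the question; the values of N and Nrow and the temporary file name are illustrative only.

```matlab
N    = 4;                        % example value for the input n
Nrow = 5;
row  = [1:N, N:-1:1];

% One '%d\t' per value except the last, which ends the line instead.
fmt = [repmat('%d\t', 1, numel(row) - 1), '%d\n'];

fname = [tempname, '.txt'];      % throwaway file name for the illustration
fid = fopen(fname, 'w');
fprintf(fid, fmt, repmat(row, Nrow, 1)');   % transpose: fprintf reads column by column
fclose(fid);

txt = fileread(fname);
delete(fname);
```

Because the format covers exactly one row, fprintf recycles it for every column of the transposed matrix, so a single call writes all Nrow lines as integers.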
Thanks!
I want to make 1000 random permutations of a vector in matlab. I do it like this
% vector is A
num_A = length(A);
for i=1:1000
n = randperm(num_A);
A = A(n); % This is one permutation
end
This takes like 73 seconds. Is there any way to do it more efficiently?
Problem 1 - Overwriting the original vector inside loop
Each time, A = A(n); overwrites A, the input vector, with a new permutation. That might seem harmless, since you only need the elements of A and not their order. However, it's extremely inefficient because you have to re-write a million-element array in every iteration.
Solution: Store the permutation into a new variable -
B(ii, :) = A(n);
Problem 2 - Using i as iterator
We at Stack Overflow are always telling serious MATLAB users that using i and j as iterators in loops is absolutely a bad idea. Check this answer to see why it makes your code slow, and check other answers in that page for why it's bad.
Solution - use ii instead of i.
Problem 3 - Using an unnecessary for loop
Actually, you can avoid writing this for loop altogether, since the iterations are independent of each other. Note that arrayfun does not run in parallel by itself, so treat any speed-up as something to measure rather than assume.
Solution - use arrayfun to generate 1000 results at once.
Final solution
Use arrayfun to generate 1000 x num_A indices. I think (didn't confirm) it's faster than directly accessing A.
n = cell2mat(arrayfun(@(x) randperm(num_A), 1:1000', 'UniformOutput', false)');
Then store all 1000 permutations at once, into a new variable.
B = A(n);
I found this code pretty attractive. You can replace randperm with Shuffle. Example code -
B = Shuffle(repmat(A, 1000, 1), 2);
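For comparison, the fix from Problem 1 on its own can be sketched as a plain loop with a preallocated result. The input vector below is a hypothetical stand-in, and nPerm is just a name for the 1000 from the question.

```matlab
A = 1:20;                        % hypothetical input vector
num_A = numel(A);
nPerm = 1000;

B = zeros(nPerm, num_A);         % preallocate one row per permutation
for ii = 1:nPerm
    B(ii, :) = A(randperm(num_A));   % A itself is never overwritten
end
```

Every row of B is then an independent permutation of A, which makes a convenient baseline for timing the arrayfun and Shuffle variants.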
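P = perms(A);
P = P(1:1000, :);
perms returns all the different permutations as rows, so just take the first 1000 rows. Note that this is only feasible for very short vectors, since perms(A) has factorial(numel(A)) rows, and the first 1000 rows are not a random sample.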
I'm fairly new to MATLAB and I'm currently trying to create a loop that goes through each column and each row and increments A and B as it goes. I know that this can be done with indexing, but I'd like to learn how to do it step by step. I've come up with the pseudocode for it, but I'm struggling with the actual MATLAB syntax.
Pseudocode:
For columns i 1-300;
Increment A
For rows j 1-4
Increment B
End
End
My actual code that I've been trying to get to work is:
%testmatrix = 4:300 Already defined earlier as a 4 row and 300 column matrix
for i = testmatrix (:,300)
for j = testmatrix (4,:)
B=B+1
end
A=A+1
end
I'm not 100% sure how I'm supposed to format the code so it'll read testmatrix(1,1) all the way through to testmatrix (4,300).
Any help would be greatly appreciated!
You can loop over the column indices and, within each column, over the row indices. The loop variables have to run over index ranges rather than being taken from the matrix itself:
[rows, cols] = size(testmatrix); %// rows=4, cols=300
for i = 1:cols
    for j = 1:rows
        temp = testmatrix(j,i); %// contains the element of your matrix at (j,i)
        B=B+1;
    end
    A=A+1;
end
The semicolons ; suppress the output to the command line. Remove them if you want to output A and B at each step.
Here, temp will cycle through the elements (1,1) through (4,300) and you can do whatever you want with them. Note that this is generally an inefficient way to do most things. MATLAB supports highly efficient vectorized calculations, which you should use. But unless I know what exactly you're trying to achieve, I can't really help you with that. (If all you want is A and B's final values, it's as easy as A=A+cols;B=B+cols*rows;.)
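As a rough illustration of the loop versus a vectorized call, the sketch below sums every element both ways. The matrix contents and the choice of summation as the per-element work are assumptions for the example, since the question does not say what the loop is ultimately for.

```matlab
testmatrix = reshape(1:1200, 4, 300);   % hypothetical 4x300 data
[rows, cols] = size(testmatrix);

A = 0; B = 0; total = 0;
for i = 1:cols
    for j = 1:rows
        total = total + testmatrix(j, i);   % example of per-element work
        B = B + 1;
    end
    A = A + 1;
end

totalVectorized = sum(testmatrix(:));   % same result without loops
```

Starting from zero, A ends at cols and B at cols*rows, matching the shortcut in the parenthetical above.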