Forcing a specific size when using spconvert in MATLAB

I am loading a sparse matrix in MATLAB using the command:
A = spconvert(load('mymatrix.txt'));
I know that my matrix is 1222 x 1222, but it is loaded as 1220 x 1221. I understand that MATLAB cannot infer the real size of my matrix when it is saved in sparse form.
A possible solution for making A the right size is to include a line in mymatrix.txt with the contents "1222 1222 0". But I have hundreds of matrices, and I do not want to do this for all of them.
How can I make MATLAB change the size of the matrix to 1222 x 1222?

I found the following solution to the problem, which is simple and short, but not as elegant as I hoped:
A = spconvert(load('mymatrix.txt'));
if size(A,1) ~= pSize || size(A,2) ~= pSize
A(pSize,pSize) = 0;
end
where pSize is the preferred size of the matrix. So I load the matrix, and if the dimensions are not as I wanted, I insert a 0-element in the lower right corner.

Sorry, this post is more a pair of clarifying questions than it is an answer.
First, is the issue with the 'load' command or with 'spconvert'? As in, if you do
B = load('mymatrix.txt')
is B the size you expect? If not, then you can use 'textread' or 'fread' to write a function that creates the matrix of the right size before inputting into 'spconvert'.
Second, you say that you are loading several matrices. Is the issue consistent among all the matrices you are loading? That is, does the matrix always end up with two fewer rows and one fewer column than you expect?

I had the same problem, and this is the solution I came across:
nRows = 1222;
nCols = 1222;
A = spconvert(load('mymatrix.txt'));
[i,j,s] = find(A);
A = sparse(i,j,s,nRows,nCols);
It's an adaptation of one of the examples here.
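Since spconvert only sees the triplets stored in the file, another way to reach the same result is to pass the loaded columns straight to sparse together with the known dimensions. The following is a minimal sketch, assuming each file holds three columns (row, column, value); the temporary file, the sample triplets T, and the 6 x 6 target size are made up so that the example runs on its own.

```matlab
% Write a small triplet file so the example is self-contained
% (hypothetical data; a real file would be e.g. mymatrix.txt).
fname = [tempname '.txt'];
T = [1 1 2.5; 3 2 -1.0; 4 5 7.0];     % each row: [row col value]
save(fname, 'T', '-ascii');

nRows = 6;                             % known true size (assumed for this example)
nCols = 6;

D = load(fname);                       % n-by-3 array of triplets
A = sparse(D(:,1), D(:,2), D(:,3), nRows, nCols);   % size is forced here

delete(fname);                         % clean up the temporary file
```

Wrapped in a loop over file names, this would avoid editing any of the files. Files with a fourth column (complex values, which spconvert also accepts) are not handled by this sketch.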

Related

How to sum a matrix with unaligned elements?

I am trying to sum in the second dimension a matrix QI in Matlab. The trick is, the columns contain a series of increasing numbers, but not all columns have the same number of elements (i.e. numel(QI(:,1)) ~= numel(QI(:,2)) and so on). For the sake of clarity, I attach a picture of it. Note that I padded the missing areas with 0, so the previous condition becomes nnz(QI(:,1)) ~= nnz(QI(:,2)).
One initial strategy that I thought of was to treat this as an image and construct a mask for each different gradient level, but that seems like a tedious job.
Does anyone have a better idea on how to do this? I should also mention that I am able to freely modify how QI is generated, but I'd rather not if there is a solution for this problem.
EDIT:
Hopefully the new colored image should give a better understanding.
FYI, each column was previously stored in a cell array without the trailing zeros. Then I extracted the columns one by one and stored them in a matrix in order to perform the summation, padding the extra zeros whenever the length isn't the same.
Generally these columns should have the same number of rows, but sometimes that's not the case, and even worse, they do not align properly.
I'm starting to wonder whether it would be better to rework the code that generates the cell arrays rather than this matrix. Thoughts?
Thank you,
Edit: following your comment, I modified the answer. Be aware that your data cannot really be "aligned" because the columns do not have the same number of values.
One way would be to use a cell array as storage for your measurements.
valueMissing = 0; % here you can put the default value you want
% transform your matrix into a cell array (one vector per column, padding removed)
QICell = arrayfun(@(x) QI(QI(:,x)~=valueMissing,x), 1:size(QI,2),'UniformOutput', false);
Now you can sum the last elements of the vectors inside the cell array:
QIsum = sum(cellfun(@(x) x(end), QICell))
Or reverse the vectors so that their last elements are "aligned":
QICellReordered = cellfun(@(x) x(end:-1:1),QICell, 'UniformOutput',false);
Then you can compute all possible sums:
m = min(cellfun(@numel, QICellReordered));
QIsum = zeros(m,1);
for i=1:m
QIsum(i) = sum(cellfun(@(x) x(i), QICellReordered));
end
% restore QIsum to your original order
QIsum = QIsum(end:-1:1);
I hope this helps!
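To make the end-aligned sum concrete, here is a small sketch on made-up data. It is a loop-free variant of the steps above rather than the exact code from the answer: it keeps only the last m entries of each column and sums them row-wise. The 3 x 3 matrix QI below is hypothetical and assumes, as in the question, that 0 marks padding and never occurs as a real value.

```matlab
% Hypothetical data: columns of unequal length, zero-padded at the bottom.
QI = [1 2 5;
      3 4 7;
      6 0 0];
valueMissing = 0;

% One vector per column, with the padding removed.
QICell = arrayfun(@(x) QI(QI(:,x) ~= valueMissing, x), ...
                  1:size(QI,2), 'UniformOutput', false);

% Length of the shortest column: only that many rows can be aligned.
m = min(cellfun(@numel, QICell));

% Keep the last m entries of every column, so the ends line up.
tails = cellfun(@(x) x(end-m+1:end), QICell, 'UniformOutput', false);

% Concatenate into an m-by-nCols matrix and sum across columns.
QIsum = sum([tails{:}], 2);
```

Here the last entries are 6, 4 and 7 and the ones before them are 3, 2 and 5, so QIsum holds 10 and 17 in the original (top to bottom) order.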

How to extract a submatrix without making a copy in Matlab

I have a large matrix, and I need to extract a small matrix from a sliding window that runs over the whole large matrix. The content of the extracted matrix does not change during the operations, so I'd like to extract the submatrix without creating a new copy, so that it instead acts like a C pointer to a portion of the large matrix. How can I do this? Please help me, thank you very much :)
I did some benchmarking to test if not using an explicit temporary matrix is faster, and it's probably not:
function move_mean(N)
M = randi(100,N);
window_size = [50 50];
dir_time = timeit(@() direct(M,window_size))
tmp_time = timeit(@() with_tmp(M,window_size))
end
function direct(M,window_size)
m = zeros(size(M)./2);
for r = 1:size(M,1)-window_size(1)
for c = 1:size(M,2)-window_size(2)
m(r,c) = mean(mean(M(r:r+window_size(1),c:c+window_size(2))));
end
end
end
function with_tmp(M,window_size)
m = zeros(size(M)./2);
for r = 1:size(M,1)-window_size(1)
for c = 1:size(M,2)-window_size(2)
tmp = M(r:r+window_size(1),c:c+window_size(2));
m(r,c) = mean(mean(tmp));
end
end
end
For M of size 100 x 100:
dir_time =
0.22739
tmp_time =
0.22339
So it seems that using a temporary variable only makes your code more readable, not slower.
In this answer I describe what the 'best' solution is in general. For this answer I define 'best' as most readable without a significant performance hit (partially shown by the existing answer).
Basically there are 2 situations that you may be in.
1. You use your submatrix several times
In this situation the best solution in general is to create a temporary variable containing the submatrix.
A = M(rmin:rmax, cmin:cmax)
There may be ways around it (defining a function/anonymous function that indexes into the matrix for you), but in general that won't make you happy.
2. You use your submatrix only 1 time
In this case the best solution is typically exactly what you referred to in the comments:
M(rmin:rmax, cmin:cmax)
A specific case of using the submatrix only 1 time, is when it is passed once to a function. Of course the contents of the submatrix may be used in that function several times, but that is irrelevant.
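If the per-window operation happens to be a mean, as in the benchmark above, the window never has to be extracted at all. The sketch below swaps in a different technique, 2-D convolution with conv2, and checks it against explicit extraction with a temporary variable; the matrix M, the window width w, and the variable names are made up for illustration. It does not apply to arbitrary window functions.

```matlab
M = magic(6);                 % small hypothetical input
w = 3;                        % window is w-by-w

% Mean over every full w-by-w window, computed in one call.
windowMeans = conv2(M, ones(w) / w^2, 'valid');

% Reference: extract each window into a temporary and average it.
ref = zeros(size(M) - w + 1);
for r = 1:size(ref, 1)
    for c = 1:size(ref, 2)
        tmp = M(r:r+w-1, c:c+w-1);
        ref(r, c) = mean(tmp(:));
    end
end

% Largest disagreement between the two approaches (floating-point noise only).
maxDiff = max(abs(windowMeans(:) - ref(:)));
```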

Details in sparse indexing

I have some code which uses sparse indexing (and there's no way that I can get around that). I run this in a function, and use it for two problems, where the sizes of all the variables involved do not change. However, for one problem, the sparse indexing part takes 5 seconds, and for the other, takes 25 seconds.
I checked the size of every variable involved, and they are the same for both problems. I also checked that xv is a full matrix for both problem types.
So, anyone else ever run into something weird like this? Any ideas as to why this would happen? Mainly I am trying to make the code more efficient, and while 5 seconds is ok for my particular application, 25 seconds (especially when I can't explain it) is very bad.
Edit: Here is a link to a photo that profiles this weird behavior. The runtime values were recorded on the third run to ensure that the size of X is also not changing. And I did check that xv is a dense (not sparse) matrix both times.
https://www.dropbox.com/s/i41j6afanzbjdyg/weird_bcd_thing.png?dl=0
Thanks so much for any help!
Code below (runs in a for loop). If I use ptype = 1, then it's 5 seconds, ptype = 3 is 25 seconds.
clvec = cliques{k};
xcurr = full(X(clvec));
xv = reshape(xcurr - Z(offset_index(k) + 1 : offset_index(k) + ncl^2),ncl,ncl);
%these two functions both take a dense symmetric matrix and return a dense symmetric matrix, and in both cases the size is the same for a given k.
if ptype == 1
xv = proj_PSD(xv,0,0);
elseif ptype == 3
xv = proj_Schoenberg(xv,0);
end
Xd = vec(xv) - xcurr;
%THIS IS THE WEIRD LINE
tic
X(clvec) = xv;
toc;
In the 'WEIRD LINE', X(clvec) = xv;, you are assigning into a sparse matrix by random access.
The cost of this access is not constant: it may depend on the matrix's stored values and on the indices you are trying to access.
This is not the case for a regular (full) matrix, where access time is usually stable and faster.
To get stable access times, try to change the implementation based on your specific matrix usage, and try to avoid assigning values by random access.
See the following code as a reference:
X = sparse(randi(100,50,1),randi(100,50,1),randn(1),100,100);
for i=1:10000
rand_inds{i} = randperm(10000,100);
end
for i=1:100
ti = tic;
X(rand_inds{i}) = 3;
to_X(i) = toc(ti);
end
Xf = full(X);
for i=1:100
ti = tic;
Xf(rand_inds{i}) = 3;
to_Xf(i) = toc(ti);
end
figure;plot(to_X);hold on;plot(to_Xf,'r');
I solved my problem! I'm posting the answer because I think it's interesting.
One thing I didn't mention in the question is that the loop goes from k = 1 to k = L, and for ptype = 3, we add one more step, and that's assigning all the diagonal indices to 0:
X(diag_index) = 0
where diag_index is computed ahead of time.
The problem is that, instead of just storing zeros at those positions, MATLAB automatically discards these entries from the sparse matrix, so in the next iteration, when the diagonal indices are assigned again, it has to re-allocate storage for X. So, I changed that line to
X(diag_index) = eps;
and now they both run equally fast! (It's not the best solution, since that's going to be a source of error later, but there's no more mystery!)
The answer is never what you think it would be...
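The effect is easy to see on a tiny example by counting stored entries with nnz. This is only an illustrative sketch: the 3 x 3 matrix and the index vector diag_index below are made up and are not the X and diag_index from the question.

```matlab
% Small sparse matrix with three stored diagonal entries (hypothetical data).
X = sparse([1 2 3], [1 2 3], [4 5 6], 3, 3);
diag_index = [1 5 9];          % linear indices of the diagonal of a 3-by-3 matrix

nBefore = nnz(X);              % entries stored initially

X(diag_index) = 0;             % assigning zero removes the entries from storage
nAfterZero = nnz(X);

X(diag_index) = eps;           % a tiny nonzero value keeps the entries stored
nAfterEps = nnz(X);
```

After the zero assignment nothing is stored on the diagonal any more, so a later assignment to those positions has to insert new entries; with eps the sparsity pattern stays fixed.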

Diffusion outer bounds

I'm attempting to run this simple diffusion case (I understand that it isn't ideal generally), and I'm doing fine with getting the inside of the solid, but need some help with the outer edges.
global M
size=100
M=zeros(size,size);
M(25,25)=50;
for diffusive_steps=1:500
oldM=M;
newM=zeros(size,size);
for i=2:size-1;
for j=2:size-1;
%we're considering the ij-th pixel
pixel_conc=oldM(i,j);
newM(i,j+1)=newM(i,j+1)+pixel_conc/4;
newM(i,j-1)=newM(i,j-1)+pixel_conc/4;
newM(i+1,j)=newM(i+1,j)+pixel_conc/4;
newM(i-1,j)=newM(i-1,j)+pixel_conc/4;
end
end
M=newM;
end
It's a pretty simple piece of code, and I know that. I'm not very good at using Octave yet (chemist by trade), so I'd appreciate any help!
If you have concerns about the border of your simulation, you could pad your matrix with NaN values and then remove the border after the simulation has completed. NaN stands for "not a number" and is often used to denote blank data. There are many MATLAB functions that work in a useful way with these values.
e.g. finding the mean of an array which has blanks:
nanmean([0 nan 5 nan 10])
ans =
5
In your case, I would start by adding a border of NaNs to your M matrix. I'm using 'n' instead of 'size', since size is an important function in MATLAB, and using it as a variable can lead to confusing errors.
n=100;
blankM=zeros(n+2,n+2);
blankM([1,end],:) = nan;
blankM(:, [1,end]) = nan;
Now we can define 'M'. N.B. the first column and row will be NaNs, so we need to add an offset (25+1):
M = blankM;
M(26,26)=50;
Run the simulation through,
m = size(blankM, 1);
n = size(blankM, 2);
for diffusive_steps=1:500
oldM = M;
newM = blankM;
for i=2:m-1;
for j=2:n-1;
pixel_conc=oldM(i,j);
newM(i,j+1)=newM(i,j+1)+pixel_conc/4;
newM(i,j-1)=newM(i,j-1)+pixel_conc/4;
newM(i+1,j)=newM(i+1,j)+pixel_conc/4;
newM(i-1,j)=newM(i-1,j)+pixel_conc/4;
end
end
M=newM;
end
and then extract the area of interest
finalResult = M(2:end-1, 2:end-1);
One simple change you might make is to add a boundary of ghost cells, or halo, around the domain of interest. Rather than mis-use the name size I've used a variable called sz. Replace:
M=zeros(sz,sz)
with
M=zeros(sz+2,sz+2)
and then compute your diffusion over the interior of this augmented matrix, i.e. over cells (2:sz+1,2:sz+1). When it comes to considering the results, discard or just ignore the halo.
Even simpler would be to take what you already have and ignore the cells in your existing matrix which are on the N, S, E and W edges.
This technique is widely used in problems such as, and similar to, yours, and avoids the need to write code which deals with the computations on cells which don't have a full complement of neighbours. Setting the appropriate value for the contents of the halo cells is a problem-dependent matter; 0 isn't always the right value.
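Put together, the halo approach might look like the sketch below. It reuses the update rule from the question and the name sz from this answer; the choice to let whatever reaches the halo simply disappear (an absorbing boundary, with halo cells reset to 0 every step) is an assumption made for the example, not something the question specifies.

```matlab
sz = 100;                        % interior size
M = zeros(sz+2, sz+2);           % one ring of halo cells around the interior
M(26, 26) = 50;                  % interior cell (25,25), shifted by the halo

for step = 1:500
    newM = zeros(sz+2, sz+2);    % halo starts at 0 each step (assumed boundary value)
    for i = 2:sz+1               % loop over every interior cell, edges included
        for j = 2:sz+1
            q = M(i, j) / 4;     % a quarter goes to each of the four neighbours
            newM(i, j+1) = newM(i, j+1) + q;
            newM(i, j-1) = newM(i, j-1) + q;
            newM(i+1, j) = newM(i+1, j) + q;
            newM(i-1, j) = newM(i-1, j) + q;
        end
    end
    M = newM;                    % halo cells are never read, so nothing flows back
end

result = M(2:sz+1, 2:sz+1);      % discard the halo
totalLeft = sum(result(:));      % material still inside the domain
```

Because edge cells now have four neighbours to write to, the inner loops need no special cases. A reflecting or fixed-concentration boundary would need the halo values set differently.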

solving linear equations using matrices in MATLAB

My script creates a matrix and 2 vectors using several 'for' loops, and as an example they are returned as follows:
K =
1.0e+006 *
1.2409 0.6250 0.8153 0.1250
0.6250 3.6591 -0.1250 3.5375
0.8153 -0.1250 1.2409 -0.6250
0.1250 3.5375 -0.6250 3.6591
F =
1.0e+006 *
0.1733
1.3533
-0.1066
1.3371
U =
u3
v3
u4
v4
As can be seen, the 'U' vector is a set of variables and I need to solve 'K*U=F' for variables contained in 'U'.
When I try to do that using linsolve or solve I get unexpected results and a message that the inverse of my matrix is close to singular.
However, when I make another script and enter the same matrix and vectors by hand, it all works fine, and I can't figure out what's wrong.
Is that somehow related to the way MATLAB stores matrices created in loops, so that I need to change the state of the matrix in some way after the loop?
Also, when I enter the matrix by hand, MATLAB displays it without the 1.0e+006 multiplier in front of it:
K11 =
1240900 625000 815300 125000
625000 3659100 -125000 3537500
815300 -125000 1240900 -625000
125000 3537500 -625000 3659100
Can that be related?
Thanks in advance.
Try the backslash operator:
U = K\F
See this reference.
From the previous discussion it's clear that your matrix is singular. This means that your equations are not linearly independent. When this happens there are two possibilities. Your system may be inconsistent (over-constrained), in which case no solutions exist. Or alternatively, it can also mean that your equations are under-constrained, in which case there is an infinite set of solutions.
To determine which case it is, you can use rref to get the "reduced row echelon form" of the augmented matrix. Do this as follows:
KF = [K,F]
rref(KF)
If the last row goes entirely to zeros then you're under-constrained and can extract a solution set (but not a unique solution) from your reduced matrix.
In this case however I get a row of [0 0 0 0 1], which makes the system over-constrained and hence without any solution.
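The same distinction can be checked with rank, by comparing the rank of K with the rank of the augmented matrix. The sketch below uses a made-up 2 x 2 singular system rather than the matrix from the question, since the values shown there are rounded for display; K, F1 and F2 are hypothetical.

```matlab
K  = [1 2;
      2 4];                  % hypothetical singular matrix (row 2 = 2 * row 1)
F1 = [1; 3];                 % right-hand side that contradicts the dependency
F2 = [1; 2];                 % right-hand side that respects it

rK = rank(K);
r1 = rank([K F1]);           % larger than rK: no solution exists
r2 = rank([K F2]);           % equal to rK: infinitely many solutions

R1 = rref([K F1]);           % last row is [0 0 1], i.e. 0 = 1
R2 = rref([K F2]);           % last row is all zeros
```

For a well-conditioned, nonsingular K both ranks equal the number of unknowns and K\F returns the unique solution.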