I need to construct a huge sparse matrix iteratively. The code is as follows:
function Huge_Matrix = Create_Huge_Matrix(len, Weight, Index)
k = size(Weight,1);
Huge_Matrix = spalloc(len, len,floor(len*k));
parfor i = 1:len
temp = sparse(1,len);
ind = Index(:,i);
temp(ind) = Weight(:,i);
Huge_Matrix(i,:) = temp;
end
Huge_Matrix = Huge_Matrix + spdiags(-k*ones(len,1),0,len,len);
end
As shown, len is the height * width of the input image; for a 200*200 image, len is 40000! I am assigning Weight into this huge matrix according to the positions stored in Index. Even though I use parfor to accelerate the loop, it is very slow.
I also tried creating a full matrix first. That seems to make the code faster, but memory is limited. Is there any other way to speed up the code? Thanks in advance!
As @CrisLuengo says in the comments, there is probably a better way to do what you're trying to do than to create a 40k-by-40k matrix, but if you have to create a large sparse matrix, it's better to let MATLAB do it for you.
The sparse function has a signature that takes lists of rows, columns and the corresponding values for the nonzero elements of the matrix:
S = sparse(i,j,v) generates a sparse matrix S from the triplets i, j, and v such that S(i(k),j(k)) = v(k). The max(i)-by-max(j) output matrix has space allotted for length(v) nonzero elements. sparse adds together elements in v that have duplicate subscripts in i and j.
If the inputs i, j, and v are vectors or matrices, they must have the same number of elements. Alternatively, the argument v and/or one of the arguments i or j can be scalars.
In your loop, row i of the matrix receives Weight(:,i) at the column positions Index(:,i). So we can simply pass Index as the column indices and Weight as the values; all we need is an array of row indices the same size as Index:
row_idx = repmat(1:len, k, 1);
Huge_Matrix = sparse(row_idx, Index, Weight, len, len);
(The last two parameters specify the size of the sparse matrix.)
The next step is to create another large sparse matrix and add it to the first. That seems kind of wasteful, so why not just add those entries to the existing arrays before creating the matrix?
Here's the final function:
function Huge_Matrix = Create_Huge_Matrix(len, Weight, Index)
k = size(Weight,1);
% add diagonal indices/weights to arrays
% this avoids creating second huge sparse array
Index(end+1, :) = 1:len;
Weight(end+1, :) = -k*ones(1,len);
% create array of row numbers corresponding to each Index
% make k+1 rows because we've added the diagonal
row_idx = repmat(1:len, k+1, 1);
% let sparse do the work
Huge_Matrix = sparse(row_idx, Index, Weight, len, len);
end
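As a sanity check, here is a minimal sketch that compares the row-by-row fill from the question with a single triplet call to sparse. The inputs (len = 4, k = 2, and the particular Index and Weight values) are made up for illustration, and it assumes, as in the question's loop, that column i of Index lists the column positions used in row i.

```matlab
% Toy inputs, invented for this check: len = 4, k = 2
len = 4;
Index  = [2 1 4 3; 3 4 1 2];      % column i: column positions filled in row i
Weight = [5 6 7 8; 9 10 11 12];   % column i: values written into row i
k = size(Weight, 1);

% Reference: fill row by row, then add the diagonal, as in the question
A = sparse(len, len);
for i = 1:len
    A(i, Index(:, i)) = Weight(:, i).';
end
A = A + spdiags(-k * ones(len, 1), 0, len, len);

% Triplet form: one call to sparse, with the diagonal appended as extra triplets
rows = repmat(1:len, k + 1, 1);
cols = [Index; 1:len];
vals = [Weight; -k * ones(1, len)];
B = sparse(rows, cols, vals, len, len);

same = isequal(A, B);   % true for these inputs
```

The two constructions only agree when no (row, column) pair repeats: sparse adds duplicate triplets together, while indexed assignment keeps the last value written.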
I've written a function that generates a sparse matrix of size d-by-n
and puts 2 non-zero values in each column.
function [M] = generateSparse(n,d)
M = sparse(d,n);
sz = size(M);
nnzs = 2;
val = ceil(rand(nnzs,n));
inds = zeros(nnzs,n); % one column of row indices per matrix column
for i=1:n
ind = randperm(d,nnzs);
inds(:,i) = ind;
end
points = (1:n);
nnzInds = zeros(nnzs,n);
for i=1:nnzs
nnzInd = sub2ind(sz, inds(i,:), points);
nnzInds(i,:) = nnzInd;
end
M(nnzInds) = val;
end
However, I'd like to give the function another parameter, num-nnz, which makes it choose num-nnz cells at random and put 1 there.
I can't use sprand because it requires a density, and I need the number of non-zero entries to be independent of the matrix size. A density inherently depends on the matrix size.
I am a bit confused about how to pick the indices and fill them. I did it with a loop, which is extremely costly, and would appreciate help.
EDIT:
Everything has to be sparse. A big enough matrix will crash in memory if I don't do it in a sparse way.
You seem close!
You could pick num_nnz random (unique) integers between 1 and the number of elements in the matrix, then assign the value 1 to the elements at those linear indices.
To pick the random unique integers, use randperm. To get the number of elements in the matrix, use numel.
M = sparse(d, n); % create dxn sparse matrix
num_nnz = 10; % number of non-zero elements
idx = randperm(numel(M), num_nnz); % get unique random indices
M(idx) = 1; % Assign 1 to those indices
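A variant of the same idea, sketched under assumed sizes (d = 1000, n = 2000 are placeholders): convert the random linear indices to row/column subscripts with ind2sub and hand them to sparse as triplets, so the matrix is built in one call rather than by indexed assignment.

```matlab
d = 1000;                                % assumed number of rows
n = 2000;                                % assumed number of columns
num_nnz = 10;                            % number of non-zero elements wanted

idx = randperm(d * n, num_nnz);          % unique random linear indices
[r, c] = ind2sub([d, n], idx);           % matching row and column subscripts
M = sparse(r, c, 1, d, n);               % d-by-n sparse matrix with ones there
```

Because randperm returns unique indices, no triplet repeats, so every stored value is exactly 1 and nnz(M) equals num_nnz.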
I have a huge MxN matrix, say, A=rand([M,N]); and an index vector with N integer values between 1 and M, say, RandomIndex = randi(M,[1,N]);.
Now I would like to generate a row vector with entries
result = [A(RandomIndex(1),1), A(RandomIndex(2),2), ..., A(RandomIndex(N),N)]
What would be an efficient way to do this? It should be a very cheap operation but all my implementations are slow. I don't think there is a notation in Matlab to do this directly, is there?
The fastest option so far is
indexFunction = @(r,c) A(r,c);
result = cell2mat(arrayfun(indexFunction,RandomIndex,1:N,'UniformOutput',false));
Is there a more efficient way?
Use sub2ind
A(sub2ind(size(A), RandomIndex, 1:N))
sub2ind will convert the row and column indices given by RandomIndex and 1:N to linear indices based on size(A) which you can then use to index A directly.
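To make the conversion concrete, here is a small worked sketch on a made-up 2x3 matrix. MATLAB stores arrays column by column, so element (r,c) of an M-row matrix has linear index r + (c-1)*M, which is what sub2ind computes.

```matlab
A = [11 12 13; 21 22 23];                 % 2x3 example, values chosen for readability
RandomIndex = [2 1 2];                    % row to pick in each column
N = size(A, 2);

lin = sub2ind(size(A), RandomIndex, 1:N); % linear indices [2 3 6]
result = A(lin);                          % [21 12 23]

% The same indices computed by hand from the column-major layout
lin_manual = RandomIndex + (0:N-1) * size(A, 1);
```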
Another way to do this is to use RandomIndex and 1:N to return an NxN matrix and then take the diagonal of this with diag
diag(A(RandomIndex, 1:N)).'
Note: .' is used to convert the column vector returned by diag to a row vector.
M=10;N=50;
A=rand([M,N]);
RandomIndex = randi(M,[1,N]);
out = zeros(1,numel(RandomIndex));
for ii = 1:numel(RandomIndex)
out(ii)=A(RandomIndex(ii),ii);
end
Another approach would be to use sparse and logical indexing:
mask = sparse(RandomIndex, 1:N, 1, size(A,1), size(A,2)) == 1; % same size as A
out = A(mask).';
The first line of code generates a logical matrix where there is only 1 true value set in each column. This is defined by each value of RandomIndex. We convert this to a logical matrix, then index into your matrix to obtain the final random vector.
Use your index directly.
M = 100;N=100;
A = rand(M,N);
% get a random index that can be as large as your matrix
% 10 rows by 1 column
idx = randi(numel(A), 10,1);
t = A(idx);
I have a small code below for one-to-one correspondence index of the img matrix
for k = 1:length(I)
img(I(k),J(k)) = 0;
end
Now I'd like to get rid of the for loop, but I cannot find the proper MATLAB syntax for it.
img(I(1:length(I)), J(1:length(I)),1:3) = 0;
does not pair the indices one-to-one. Any help achieving the same result is appreciated.
Indexing in a linear fashion along multiple dimensions can be done using the sub2ind function:
img(sub2ind(size(img), I, J)) = 0;
You can also use sparse to get this done:
ind = sparse(I, J, 1, size(img,1), size(img,2)) == 1;
img(ind) = 0;
The first line of code generates a sparse matrix where the row values stored in I and the column values stored in J set the matrix values to 1, and we ensure that this is the same size as your image. We also convert to a logical array by equating the statement with 1. When you're done, simply use the result to index into your actual array and set the values to 0.
If you have a multi-channel matrix, you can apply the mask to every channel by replicating it with repmat (sparse arrays cannot be N-D, so convert the mask to full first):
img(repmat(full(ind), [1 1 size(img,3)])) = 0;
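Here is a small self-contained sketch of the multi-channel case. The image size and the I, J vectors are invented for illustration, and it builds a plain (full) logical mask with sub2ind rather than a sparse one.

```matlab
img = ones(3, 4, 3);                       % assumed 3x4 image with 3 channels
I = [1 2 3];                               % row of each pixel to clear
J = [2 3 4];                               % column of each pixel to clear

mask = false(size(img, 1), size(img, 2));  % 2-D logical mask
mask(sub2ind(size(mask), I, J)) = true;    % mark (I(k), J(k)) pairs one-to-one

img(repmat(mask, [1 1 size(img, 3)])) = 0; % clear those pixels in every channel
```

Only the three paired pixels are cleared in each channel, not the full 3x3 block that img(I, J, :) = 0 would touch.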
I have a cell array of 53 different (40,000 x 2000) sparse matrices. I need to take the mean over the third dimension, so that for example element (2,5) is averaged across the 53 cells. This should yield a single (33,000 x 2016) output. I think there ought to be a way to do this with cellfun(), but I am not able to write a function that works across cells on the same within-cell indices.
You can convert each sparse matrix to the indices and values of its nonzero entries, and then use sparse to automatically obtain the sum in sparse form:
myCell = {sparse([0 1; 2 0]), sparse([3 0; 4 0])}; % example
C = numel(myCell);
[rows, cols] = size(myCell{1}); % all matrices are assumed to share this size
M = cell(1,C); % preallocate
N = cell(1,C);
V = cell(1,C);
for c = 1:C
[m, n, v] = find(myCell{c}); % rows, columns and values of nonzero entries
M{c} = m.';
N{c} = n.';
V{c} = v.';
end
% "sparse" sums over repeated indices; passing the size keeps trailing all-zero rows/columns
result = sparse([M{:}],[N{:}],[V{:}],rows,cols)/C;
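Since the question asks about cellfun, the same triplet idea can be sketched without an explicit loop. This assumes every cell holds a sparse matrix of the same size with more than one row, so that find returns column vectors that can be stacked with vertcat.

```matlab
myCell = {sparse([0 1; 2 0]), sparse([3 0; 4 0])};   % toy example
C = numel(myCell);
[rows, cols] = size(myCell{1});                      % common size of all matrices

% Collect rows, columns and values of the nonzeros of every cell
[m, n, v] = cellfun(@find, myCell, 'UniformOutput', false);

% sparse adds values that share a (row, column) pair; dividing by C gives the mean
result = sparse(vertcat(m{:}), vertcat(n{:}), vertcat(v{:}), rows, cols) / C;
```

For this toy input the mean is [1.5 0.5; 3 0], and the result stays sparse.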
This should do the trick, just initialize an empty array and sum over each element of the cell array. I don't see any way around using a for loop without concatenating it into one giant 3D array (which will almost definitely run out of memory)
running_sum = sparse(size(cell_arr{1},1), size(cell_arr{1},2)); % all-zero sparse accumulator
for i = 1:numel(cell_arr)
running_sum = running_sum + cell_arr{i};
end
means = running_sum ./ numel(cell_arr);
I'm running the following code, where M is a ~200,000 by ~200,000 sparse matrix and points is a ~200,000 by 2 matrix:
inds=sub2ind(size(M),points(:,1),points(:,2));
M(inds)=M(inds)+1;
The problem is that the second line takes very long to run (15-90 seconds).
The operation takes longer the more indices in inds are 'new' (i.e. don't already have a value in the sparse matrix).
Is there a more efficient way to do this?
Here's an idea:
M = M + sparse(points(:,1),points(:,2),1,size(M,1),size(M,2),size(points,1));
Just so you know,
S = sparse(i,j,s,m,n,nzmax) uses vectors i, j, and s to generate an
m-by-n sparse matrix such that S(i(k),j(k)) = s(k), with space
allocated for nzmax nonzeros. Vectors i, j, and s are all the same
length. Any elements of s that are zero are ignored, along with the
corresponding values of i and j. Any elements of s that have duplicate
values of i and j are added together.
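One consequence of that last sentence is worth a quick check: if points contains the same (row, column) pair more than once, the sparse-based update counts it every time, while indexed assignment applies it only once. A tiny sketch with made-up values:

```matlab
points = [1 1; 2 3; 1 1];                        % (1,1) appears twice
M0 = sparse([1 2], [1 2], [10 20], 3, 3);        % small starting matrix

% Update via sparse: duplicate pairs are added together
Ms = M0 + sparse(points(:,1), points(:,2), 1, 3, 3);

% Update via indexed assignment: each distinct index is incremented once
Mi = M0;
inds = sub2ind(size(Mi), points(:,1), points(:,2));
Mi(inds) = Mi(inds) + 1;

a = full(Ms(1,1));                               % 12
b = full(Mi(1,1));                               % 11
```

When every pair in points is unique, the two updates give the same matrix.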
For the curious:
M = sprand(200000,200000,1e-6);
points = [randperm(200000) ; randperm(200000)].'; % initialization done
Mo = M;
tic;
inds=sub2ind(size(Mo),points(:,1),points(:,2));
Mo(inds) = Mo(inds)+1;
toc
tic;
M = M + sparse(points(:,1),points(:,2),1,size(M,1),size(M,2),size(points,1));
toc