Inside a loop, my code needs to modify the elements of a sparse matrix. When I do this, MATLAB warns me that "This sparse indexing expression is likely to be slow." I am preallocating the sparse arrays using the spalloc function but am still getting this warning. What is the best approach for assembling sparse matrices? This is what I am currently doing:
K=spalloc(n,n,100); f=spalloc(n,1,100);
for i = 1:Nel
[Ke,fe] = myFunction(Ex(i),Ey(i));
inds = data(i,2:end);
K(inds,inds) = K(inds,inds) + Ke;
f(inds) = f(inds)+fe;
end
The indices in inds may appear several times in the loop, meaning an element in K or f may receive multiple contributions. The last two lines within the loop are where I'm getting the warnings.
A common approach is to use the triplet form of the sparse constructor:
S = sparse(i,j,v,m,n)
Here i and j are row and column index vectors and v is the corresponding data vector. Values corresponding to repeated indices are summed, which is what your loop does. So you could instead build up the row and column index vectors along with a data vector and then call sparse once with those.
For example something like:
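nidx = size(data,2)-1;          % number of indices per element
nK = Nel*nidx^2;                % Ke is nidx-by-nidx, so nidx^2 triplets per element
% Row indices, column indices and values for K
Krows = zeros(nK,1);
Kcols = zeros(nK,1);
Kvals = zeros(nK,1);
% Row indices and values for f
frows = zeros(Nel*nidx,1);
fvals = zeros(Nel*nidx,1);
for i = 1:Nel
    [Ke,fe] = myFunction(Ex(i),Ey(i));
    inds = data(i,2:end);
    % rr(a,b) = inds(a), cc(a,b) = inds(b), so Ke(a,b) goes to K(inds(a),inds(b))
    [cc,rr] = meshgrid(inds,inds);
    kIdx = (i-1)*nidx^2 + (1:nidx^2);
    Krows(kIdx) = rr(:);
    Kcols(kIdx) = cc(:);
    Kvals(kIdx) = Ke(:);
    fIdx = (i-1)*nidx + (1:nidx);
    frows(fIdx) = inds(:);
    fvals(fIdx) = fe(:);
end
% sparse sums the values of repeated (row,column) pairs
K = sparse(Krows,Kcols,Kvals,n,n);
f = sparse(frows,1,fvals,n,1);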
Depending on your application, that may or may not actually be faster.
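To see the summing behaviour of the triplet constructor in isolation, here is a minimal sketch. The index and value vectors are made up for illustration and are not taken from the question.

```matlab
% Three contributions; the last two both target entry (2,3)
rows = [1 2 2];
cols = [1 3 3];
vals = [5 1 4];

% sparse adds together the values that share the same (row, column) pair
S = sparse(rows, cols, vals, 3, 3);

disp(full(S))   % entry (2,3) holds 1 + 4 = 5
```

This is the same accumulation that K(inds,inds) = K(inds,inds) + Ke performs inside the loop, but done in a single call.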
I've written a function that generates a sparse matrix of size d x n and puts 2 non-zero values in each column.
function [M] = generateSparse(n,d)
M = sparse(d,n);
sz = size(M);
nnzs = 2;
val = ceil(rand(nnzs,n));
inds = zeros(nnzs,n); % one column of row indices per matrix column
for i=1:n
ind = randperm(d,nnzs);
inds(:,i) = ind;
end
points = (1:n);
nnzInds = zeros(nnzs,n); % linear indices, one column per matrix column
for i=1:nnzs
nnzInd = sub2ind(sz, inds(i,:), points);
nnzInds(i,:) = nnzInd;
end
M(nnzInds) = val;
end
However, I'd like to be able to give the function another parameter, num_nnz, which makes it choose num_nnz cells at random and put 1 there.
I can't use sprand, as it requires a density, and I need the number of non-zero entries to be independent of the matrix size. A density, by definition, depends on the matrix size.
I am a bit confused about how to pick the indices and fill them. I did it with a loop, which is extremely costly, and would appreciate help.
EDIT:
Everything has to be sparse. A big enough matrix will crash in memory if I don't do it in a sparse way.
You seem close!
You could pick num_nnz random (unique) integers between 1 and the number of elements in the matrix, then assign the value 1 to the elements at those linear indices.
To pick the random unique integers, use randperm. To get the number of elements in the matrix use numel.
M = sparse(d, n); % create dxn sparse matrix
num_nnz = 10; % number of non-zero elements
idx = randperm(numel(M), num_nnz); % get unique random indices
M(idx) = 1; % Assign 1 to those indices
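A possible variant, sketched here under the assumption that d*n is small enough to be represented exactly as a double, converts the linear indices to subscripts and builds the matrix in one call to sparse, so no indexed assignment into an existing sparse matrix is needed. The sizes below are arbitrary example values.

```matlab
d = 1000;        % number of rows (example value)
n = 2000;        % number of columns (example value)
num_nnz = 10;    % number of non-zero entries wanted

idx = randperm(d*n, num_nnz);        % unique random linear indices
[r, c] = ind2sub([d n], idx);        % convert to row/column subscripts
M = sparse(r, c, ones(1,num_nnz), d, n);   % build the matrix directly
```

Because the linear indices are unique, no two triplets collide and every stored value is exactly 1.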
I have a cell array of 53 different (40,000 x 2000) sparse matrices. I need to take the mean over the third dimension, so that for example element (2,5) is averaged across the 53 cells. This should yield a single (33,000 x 2016) output. I think there ought to be a way to do this with cellfun(), but I am not able to write a function that works across cells on the same within-cell indices.
You can convert from sparse matrix to indices and values of nonzeros entries, and then use sparse to automatically obtain the sum in sparse form:
myCell = {sparse([0 1; 2 0]), sparse([3 0; 4 0])}; %// example
C = numel(myCell);
M = cell(1,C); %// preallocate
N = cell(1,C);
V = cell(1,C);
for c = 1:C
[m n v] = find(myCell{c}); %// rows, columns and values of nonzero entries
M{c} = m.';
N{c} = n.';
V{c} = v.';
end
result = sparse([M{:}],[N{:}],[V{:}],size(myCell{1},1),size(myCell{1},2))/C; %// "sparse" sums over repeated indices; the explicit size keeps trailing all-zero rows and columns
This should do the trick, just initialize an empty array and sum over each element of the cell array. I don't see any way around using a for loop without concatenating it into one giant 3D array (which will almost definitely run out of memory)
running_sum=zeros(size(cell_arr{1}));
for i=1:length(cell_arr)
running_sum=running_sum+cell_arr{i};
end
means = running_sum./length(cell_arr);
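If a full accumulator of that size is a concern, one option is to keep the running sum sparse as well. This is only a sketch: cell_arr below is a two-matrix toy example, not the asker's data, and whether it saves memory depends on how much the sparsity patterns of the matrices overlap.

```matlab
% Toy input: two small sparse matrices of equal size
cell_arr = {sparse([0 1; 2 0]), sparse([3 0; 4 0])};

[r, c] = size(cell_arr{1});
running_sum = sparse(r, c);            % all-zero sparse accumulator
for k = 1:numel(cell_arr)
    running_sum = running_sum + cell_arr{k};   % sparse + sparse stays sparse
end
means = running_sum / numel(cell_arr);

disp(full(means))
```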
I have a column vector of data in variable vdata and a list of indices idx. I want to access vdata at the indices x before and x after each index in idx. One way I would do it in a for loop is:
x = 10;
accessed_data = [];
for ii = 1:length(idx)
    accessed_data = [accessed_data; vdata(idx(ii)-x:idx(ii)+x)]; % append the window around idx(ii)
end
Is there a way to do this in a vectorized function? I found a solution to a very similar question here: Addressing multiple ranges via indices in a vector but I don't understand the code :(.
Assuming min(idx)-x>0 and max(idx)+x<=numel(vdata) then you can simply do
iidx = bsxfun(@plus, idx(:), -x:x); % create all indices
accessed_data = vdata( iidx );
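As a small worked example of what this produces, here is the same computation on made-up data (x, idx and vdata below are illustrative values only). Each row of iidx is the window idx(k)-x : idx(k)+x, and indexing a vector with a matrix returns a result with the shape of the index matrix.

```matlab
x = 1;
idx = [3 6];
vdata = (10:10:80).';                 % column vector 10, 20, ..., 80

iidx = bsxfun(@plus, idx(:), -x:x);   % rows are [2 3 4] and [5 6 7]
accessed_data = vdata(iidx);          % one row of data per entry of idx

disp(accessed_data)
```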
One scheme that uses direct indexing instead of a for loop:
xx = (-x:x).'; % Range of indices
idxx = bsxfun(@plus,xx(:,ones(1,numel(idx))),idx(:).'); % Build array
idxx = idxx(:); % Columnize to interleave columns
idxx = idxx(idxx>=1&idxx<=length(vdata)); % Make sure the idx+/-x is valid index
accessed_data = vdata(idxx); % Indices of data
The second line can be replaced with a form of the first line from @Shai's answer. This scheme checks that all of the resultant indices are valid. Because some might have to be removed, you could end up with a ragged array. One way to solve this is to use cell arrays, but here I just make idxx a vector, and thus accessed_data is as well.
This gives the solution in a matrix, with one row for each value in idx. It assumes that all values in idx are greater than x, and less than or equal to length(vdata)-x.
% Data
x = 10;
idx = [12 20 15];
vdata = 1:100;
ind = repmat(-x:x,length(idx),1) + repmat(idx(:),1,2*x+1);
vdata(ind)
Is there a well-vectorized way to take the product of all the nonzero elements in each column of a sparse matrix in octave (or matlab) (returning a row-vector of products)?
I'd combine find with accumarray:
%# create a random sparse array
s = sprand(4,4,0.6);
%# find the nonzero values
[rowIdx,colIdx,values] = find(s);
%# calculate product
product = accumarray(colIdx,values,[size(s,2) 1],@prod).' %# row vector; columns without nonzeros give 0
Some alternatives (that might be less efficient; you may want to profile them)
%# simply set the zero-elements to 1, then apply prod
%# may lead to memory issues
s(s==0) = 1;
product = prod(s,1);
%# do "manual" accumarray
[rowIdx,colIdx,values] = find(s);
product = zeros(1,size(s,2));
uCols = unique(colIdx);
for col = uCols(:)'
product(col) = prod(values(colIdx==col));
end
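As a quick check of the find/accumarray approach on a matrix whose answer is easy to verify by hand, here is a minimal sketch. The matrix is made up, and the fill value of 1 for columns with no nonzeros (the empty product) is a convention chosen for this example. The default fill value is 0.

```matlab
s = sparse([2 0 3; 0 0 4; 5 0 0]);

[~, colIdx, values] = find(s);        % column index and value of each nonzero

% Fix the output length to the number of columns and use 1 for empty columns
product = accumarray(colIdx, values, [size(s,2) 1], @prod, 1).';

disp(product)   % column 1: 2*5, column 2: empty, column 3: 3*4
```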
I found an alternative approach to solving this, but it may be slower and not quite as precise in the worst-case:
Simply take the log of all the nonzero elements and then sum the columns. Then take the exp of the resulting vector:
function [r] = prodnz(m)
nzinds = find(m != 0);
vals = full(m(nzinds));
vals = log(vals);
m(nzinds) = vals;
s = full(sum(m));
r = exp(s);
endfunction
I've got 2 different files. One is an input matrix (X) of size 3823 x 63 (3823 inputs and 63 features); the other is a class vector (R) of size 3823 x 1, whose elements take values from 0 to 9 (there are 10 classes).
I have to compute a covariance matrix for every class. So far, I could only compute the mean vector for every class, using many nested loops, and it is wearing me out.
Is there an easier way?
Here is the code that does what I need (thanks to Sam Roberts):
xTra = importdata('optdigits.tra');
xTra = xTra(:,2:64); % first column's inputs are all zero
rTra = importdata('optdigits.tra');
rTra = rTra(:,65); % classes of the data
c = numel(unique(rTra));
for i = 1:c
rTrai = (rTra==i-1); % Get indices of the elements from the ith class
meanvect{i} = mean(xTra(rTrai,:)); % Calculate their mean
covmat{i} = cov(xTra(rTrai,:)); % Calculate their covariance
end
Does this do what you need?
X = rand(3823,63);
R = randi(10,3823,1)-1; % class labels 0..9
numClasses = numel(unique(R));
for i = 1:numClasses
    Ri = (R==i-1); % Logical mask of the elements from the ith class (labels start at 0)
    meanvect{i} = mean(X(Ri,:)); % Calculate their mean
    covmat{i} = cov(X(Ri,:)); % Calculate their covariance
end
This code loops through each of the classes, selects the rows of R that correspond to observations from that class, and then gets the same rows from X and calculates their mean and covariance. It stores them in cell arrays, so you can access the results like this:
% Display the mean vector of the first class (label 0)
meanvect{1}
% Display the covariance matrix of the second class (label 1)
covmat{2}
Hope that helps!
Don't use mean and sum as variable names, because they are names of useful Matlab built-in functions. (Type doc mean or doc sum for usage help.)
Also cov will calculate the covariance matrix for you.
You can use logical indexing to pull out the examples.
m = numel(unique(rTra)); % number of classes (labels 0..m-1)
covarianceMatrices = cell(m,1);
for k=0:m-1
    covarianceMatrices{k+1} = cov(xTra(rTra==k,:)); % cell indices start at 1
end
One-liner
covarianceMatrices = arrayfun(@(k) cov(xTra(rTra==k,:)), 0:m-1, 'UniformOutput', false);
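To make the logical-indexing step concrete, here is a small self-contained sketch on made-up data. X and R below are toy values, not the optdigits data, and the variable names classes and covs are introduced only for this illustration.

```matlab
% Five observations with two features, labelled 0 or 1
X = [1 2; 3 4; 5 7; 2 2; 4 6];
R = [0; 0; 0; 1; 1];

classes = unique(R);   % labels actually present, here [0; 1]

% One covariance matrix per class; R==k selects the rows of that class
covs = arrayfun(@(k) cov(X(R==k,:)), classes, 'UniformOutput', false);

disp(covs{2})   % covariance of the rows labelled 1
```

Looping over unique(R) instead of 0:m-1 avoids assuming that the labels are consecutive integers starting at 0.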
First construct the data matrix for each class.
Second compute the covariance for each data matrix.
The code below does this.
% assume allData contains all the data you've read in, each row is one data point
% assume classVector contains the class of each data point
numClasses = 10;
data = cell(numClasses,1); %use cells to store each of the data matrices
covariance = cell(numClasses,1); %store each of the class covariance matrices
numData = size(allData,1); %number of data points (rows)
%get the data out of allData and into each class' data matrix
%there is probably a nice matrix way to do this, but this is hopefully clearer
for i = 1:numData
currentClass = classVector(i) + 1; %Matlab indexes from 1
currentData = allData(i,:);
data{currentClass} = [data{currentClass}; currentData];
end
%calculate the covariance matrix for each class
for i = 1:numClasses
covariance{i} = cov(data{i});
end