Matrix indices for creating sparse matrix - matlab

I want to create a 4 by 4 sparse matrix A. I want assign values (e.g. 1) to following entries:
A(2,1), A(3,1), A(4,1)
A(2,2), A(3,2), A(4,2)
A(2,3), A(3,3), A(4,3)
A(2,4), A(3,4), A(4,4)
According to the manual page, I know that I should store the indices by row and column respectively. That is, for row indices,
r=[2,2,2,2,3,3,3,3,4,4,4,4]
Also, for column indices
c=[1,2,3,4,1,2,3,4,1,2,3,4]
Since I want to assign 1 to each of the entries, so I use
value = ones(1,length(r))
Then, my sparse matrix will be
Matrix = sparse(r,c,value,4,4)
My problem is this:
Indeed, I want to construct a square matrix of arbitrary dimension. Says, if it is a 10 by 10 matrix, then my column vector will be
[1,2,..., 10, 1,2, ..., 10, 1,...,10, 1,...10]
For row vector, it will be
[2,2,...,2,3,3,...,3,...,10, 10, ...,10]
I would like to ask if there is a quick way to build these column and row vector in an efficient manner? Thanks in advance.

I think the question aims to create vectors c,r in an easy way.
n = 4;
c = repmat(1:n,1,n-1);
r = reshape(repmat(2:n,n,1),1,[]);
Matrix = sparse(r,c,value,n,n);
This will create your specified vectors in general.
However as pointed out by others full sparse matrixes are not very efficient due to overhead. If I recall correctly a sparse matrix offers advantages if the density is lower than 25%. Having everything except the first row will result in slower performance.

You can sparse a matrix after creating its full version.
A = (10,10);
A(1,:) = 0;
B = sparse(A);

Related

Access multiple elements and assign to each selected element a different value

I need to know if there is any efficient way of doing the following in MATLAB.
I have several big sparse matrices, the size of each one is roughly 9000000x9000000.
I need to access multiple element of such matrix and assign to each selected element a different value stored in another array. I'll give an example:
What I have:
SPARSE MATRIX of size 9000000x9000000
Matrix with the list of indexes and values I want to access, this is a matrix like this:
[row1, col1, value1;
row2, col2, value2;
...
rowN, colN, valueN]
Where N is the length of such matrix.
What I need:
Assign to the SPARSE MATRIX the corresponding value to the corresponding index, this is:
SPARSE_MATRIX(row1, col1) = value1
SPARSE_MATRIX(row2, col2) = value2
...
SPARSE_MATRIX(rowN, colN) = valueN
Thanks in advance!
EDIT:
Thank you to both for answering, I think I did not explain myself well, I'll try again.
I already have a large SPARSE MATRIX of about 9000000 rows x 9000000 columns, it is a SPARSE MATRIX filled with zeros.
Then I have another array or matrix, let's call it M with N number of rows, where N could take values from 0 to 9000000; and 3 columns. The first two columns are used to index an element of my SPARSE MATRIX, and the third column stores the value I want to transfer to the SPARSE MATRIX, this is, given a random row of M, i:
SPARSE_MATRIX(M(i, 1), M(i, 2)) = M(i, 3)
The idea is to do that for all the rows, I have tried it with common indexing:
SPARSE_MATRIX(M(:, 1), M(:, 2)) = M(:, 3)
Now I would like to do this assignation for all the rows in M as fast as possible, because if I use a loop or common indexing it takes ages (I am using a 7th Gen i7 processor with 16 GB of RAM). And I also need to keep the zeros in the SPARSE_MATRIX.
EDIT 2: SOLVED! Thank you Metahominid, I was not thinking through, but yes the sparse function does solve my problem, I just think my brain circuits were shortcircuited yesterday and was unable to see through it hahaha. Thank you to both anyway!
Regards!
You can construct a sparse matrix like this.
A = sparse(i,j,v)
S = sparse(i,j,v) generates a sparse matrix S from the triplets i, j,
and v such that S(i(k),j(k)) = v(k). The max(i)-by-max(j) output
matrix has space allotted for length(v) nonzero elements. sparse adds
together elements in v that have duplicate subscripts in i and j.
So you can simply construct the row vector, column vector and value vector.
I am answering in part because I cannot comment. You question seems a little confusing to me. The sparse() function in MATLAB does just this.
You can enter your arrays of indices and values directly into the interface, or declare a sparse matrix of zeros and set each individually.
Given your data format make three vectors, ROWS = [row1; ...; rown], COLS = [col1; ...; coln], and DATA = [val1; ... valn]. I am assuming that your size is the overall size of the full matrix and not the sparse portion.
Then
A = sparse(ROWS, COLS, DATA) will do just what you want. You can even specify the original matrix size.
A = sparse(ROWS, COLS, DATA, 90...., 90....).

How do I reshape a non-quadratic matrix?

I have a column vector A with dimensions (35064x1) that I want to reshape into a matrix with 720 lines and as many columns as it needs.
In MATLAB, it'd be something like this:
B = reshape(A,720,[])
in which B is my new matrix.
However, if I divide 35604 by 720, there'll be a remainder.
Ideally, MATLAB would go about filling every column with 720 values until the last column, which wouldn't have 720 values; rather, 504 values (48x720+504 = 35064).
Is there any function, as reshape, that would perform this task?
Since I am not good at coding, I'd resort to built-in functions first before going into programming.
reshape preserves the number of elements but you achieve the same in two steps
b=zeros(720*ceil(35604/720),1); b(1:35604)=a;
reshape(b,720,[])
A = rand(35064,1);
NoCols = 720;
tmp = mod(numel(A),NoCols ); % get the remainder
tmp2 = NoCols -tmp;
B = reshape([A; nan(tmp2,1)],720,[]); % reshape the extended column
This first gets the remainder after division, and then subtract that from the number of columns to find the amount of missing values. Then create an array with nan (or zeros, whichever suits your purpose best) to pad the original and then reshape. One liner:
A = rand(35064,1);
NoCols = 720;
B = reshape([A; nan(NoCols-mod(numel(A),NoCols);,1)],720,[]);
karakfa got the right idea, but some error in his code.
Fixing the errors and slightly simplifying it, you end up with:
B=nan(720,ceil(numel(a)/720));
B(1:numel(A))=A;
Create a matrix where A fits in and assingn the elemnent of A to the first numel(A) elements of the matrix.
An alternative implementation which is probably a bit faster but manipulates your variable b
%pads zeros at the end
A(720*ceil(numel(A)/720))=0;
%reshape
B=reshape(A,720,[]);

matlab Create growing matrix with for loop that grows by 3 per loop

So I have written this:
HSRXdistpR = squeeze(comDatape_m1(2,7,1,:,isubj));
HSRXdistpL = squeeze(comDatape_m1(2,4,1,:,isubj));
TocomXdistp = squeeze(comDatape_m1(2,10,1,:,isubj));
for i = 1:2;
HSRXp = NaN(8,3*i);
HSRXp(:,i*3) = [HSRXdistpR(:,i) HSRXdistpL(:,i) TocomXdistp(:,i)];
end
In the first part I am just selecting data from a 5-D matrix, nothing special. All that's important here is that it creates an 8x2 matrix per line (isubj=2). Now I want to add the first column of each matrix into an 8x3 matrix, and then the second column of each matrix into the same matrix (creating an 8x6 matrix). Since the number of my subjects will vary, I want to do this in a for loop. This way, if the isubj increases to 3, it should go on to create an 8x9 matrix.
So I tried to create a matrix that will grow by 3 for each iteration of i, which selects the ith column of each of the 3 matrices and then puts them in there.
However I get the following error:
Subscripted assignment dimension mismatch.
Is it possible to let a matrix grow by more than one in a for loop? Or how should it be done otherwise?
Here is your problem:
HSRXp(:,i*3) = [HSRXdistpR(:,i) HSRXdistpL(:,i) TocomXdistp(:,i)];
You're trying to assign an n x 3 matrix (RHS) into an n x 1 vector (LHS). It would be easier to simply use horizontal concatenation:
HSRXp = [HSRXp, [HSRXdistpR(:,i) HSRXdistpL(:,i) TocomXdistp(:,i)]];
But that would mean reallocation at each step, which might slow your code down if the matrix becomes large.

MatLab: Create matrix row element if row elements of two previous matrices are identical

Sorry for the title. I could not think of something better.
I have the following problem.
I have two four-column matrices build up like this:
Property | X | Y | Z
The two matrices have different sizes, since matrix 1 has a large amount of additional rows compared to matrix 2.
What I want to do is the following:
I need to create a third matrix that only features those rows (of the large matrix) that are identical in columns X, Y and Z to rows in matrix2(the property column is always different).
I tried an if-statement but it did not really work out due to my programming syntax. Has somebody a tip?
Thank you!
I tried something like this: (in this case A is the larger matrix and I want its property column for X,Y,Z-positions that are identical to another matrix B.. I am terrible with the MatLab-syntax..
if (A(:,2) == B(:,2) and (A(:,3) == B(:,3) and (A(:,4) == B(:,4))
newArray(:,1) = A(:,1);
end
Use ismember with the 'rows' option to find the desired rows, and then use that as an index to build the result:
ind = ismember(A(:,2:4), B(:,2:4), 'rows');
C = A(ind,:);
I have assumed that a row of A is selected if its last three columns match those of any row of B.

N-Dimensional Histogram Counts

I am currently trying to code up a function to assign probabilities to a collection of vectors using a histogram count. This is essentially a counting exercise, but requires some finesse to be able to achieve efficiently. I will illustrate with an example:
Say that I have a matrix X = [x1, x2....xM] with N rows and M columns. Here, X represents a collection of M, N-dimensional vectors. IN other words, each of the columns of X is an N-dimensional vector.
As an example, we can generate such an X for M = 10000 vectors and N = 5 dimensions using:
X = randint(5,10000)
This will produce a 5 x 10000 matrix of 0s and 1s, where each column is represents a 5 dimensional vector of 1s and 0s.
I would like to assign a probability to each of these vectors through a basic histogram count. The steps are simple: first find the unique columns of X; second, count the number of times each unique column occurs. The probability of a particular occurrence is then the #of times this column was in X / total number of columns in X.
Returning to the example above, I can do the first step using the unique function in MATLAB as follows:
UniqueXs = unique(X','rows')'
The code above will return UniqueXs, a matrix with N rows that only contains the unique columns of X. Note that the transposes are due to weird MATLAB input requirements.
However, I am unable to find a good way to count the number of times each of the columns in UniqueX is in X. So I'm wondering if anyone has any suggestions?
Broadly speaking, I can think of two ways of achieving the counting step. The first way would be to use the find function, though I think this may be slow since find is an elementwise operation. The second way would be to call unique recursively as it can also provide the index of one of the unique columns in X. This should allow us to remove that column from X and redo unique on the resulting X and keep counting.
Ideally, I think that unique might already be doing some counting so the most efficient way would probably be to work without the built-in functions.
Here are two solutions, one assumes all values are either 0's or 1's (just like the example in your description), the other does not. Both codes should be very fast (more so the one with binary values), even on large data.
1) only zeros and ones
%# random vectors of 0's and 1's
x = randi([0 1], [5 10000]); %# RANDINT is deprecated, use RANDI instead
%# convert each column to a binary string
str = num2str(x', repmat('%d',[1 size(x,1)])); %'
%# convert binary representation to decimal number
num = (str-'0') * (2.^(size(s,2)-1:-1:0))'; %'# num = bin2dec(str);
%# count frequency of how many each number occurs
count = accumarray(num+1,1); %# num+1 since it starts at zero
%# assign probability based on count
prob = count(num+1)./sum(count);
2) any positive integer
%# random vectors with values 0:MAX_NUM
x = randi([0 999], [5 10000]);
%# format vectors as strings (zero-filled to a constant length)
nDigits = ceil(log10( max(x(:)) ));
frmt = repmat(['%0' num2str(nDigits) 'd'], [1 size(x,1)]);
str = cellstr(num2str(x',frmt)); %'
%# find unique strings, and convert them to group indices
[G,GN] = grp2idx(str);
%# count frequency of occurrence
count = accumarray(G,1);
%# assign probability based on count
prob = count(G)./sum(count);
Now we can see for example how many times each "unique vector" occurred:
>> table = sortrows([GN num2cell(count)])
table =
'000064850843749' [1] # original vector is: [0 64 850 843 749]
'000130170550598' [1] # and so on..
'000181606710020' [1]
'000220492735249' [1]
'000275871573376' [1]
'000525617682120' [1]
'000572482660558' [1]
'000601910301952' [1]
...
Note that in my example with random data, the vector space becomes very sparse (as you increase the maximum possible value), thus I wouldn't be surprised if all counts were equal to 1...