Finding the largest vector inside a matrix - matlab

I am trying to find the largest vector inside a matrix compound by vectors with MATLAB, however I am having some difficulties, so I would be very thankful if someone help me. I have this:
The matrix paths (solution of a Dijkstra function), which is a 1000x1000 matrix, whose values are vectors of 1 row and different number of columns (when the columns are bigger than 10, the values appear as "1x11 double, 1x12 double, etc"). The matrix paths have this form:
1 2 3 ....
1 1 <1x20 double> <1x16 double>
2 <1x20 double> 2 [2,870,183,492,641,863,611,3]
3 <1x16 double> [3,611,863,641,492,183,870,2] 3
4 <1x25 double> <1x12 double> <1x14 double>
.
.
.
At first I thought in just finding the largest vector in the matrix by
B = max(length(paths))
However MATLAB returns B = 1000, value which is feasible, but not likely. When trying to find out the position of the vector by using:
[row,column] = find(length(paths) == B)
MATLAB returns row = 1, column = 1, which for sure is wrong... I have thought that maybe is a problem of how MATLAB takes the data. It is like it doesn't consider the entries of the matrix as vectors, because when I enter:
length(paths(3,2))
It returns me 1, but it should return 8 as I understand, also when introducing:
paths(3,2)
It returns [1x8 double] but I expect to see the whole vector. I don't know what to do, maybe a "for" loop, I really do not know if MATLAB takes the data of the matrix as vectors or as simple double values.

The cell with the largest vector can be found using cellfun and numel to get the number of elements in each numeric matrix stored in the cells of paths:
vecLens = cellfun(#numel,paths);
[maxLen,im] = max(vecLens(:));
[rowMax,colMax] = ind2sub(size(vecLens),im)
This gets a 1000x1000 numeric matrix vecLens containing the sizes, max gets the linear index of the largest element, and ind2sub translates that to row,column indexes.
A note on length: It gives you the size of the largest dimension. The size of paths is 1000x1000, so length(paths) is 1000. My advice is, Don't ever use length. Use size, specifying the dimension you want.
If multiple vectors are the same length, you get the first one with the above approach. To get all of them (starting after the max command):
maxMask = vecLens==maxLen;
if nnz(maxMask)>1,
[rowMax,colMax] = find(maxMask);
else
[rowMax,colMax] = ind2sub(size(vecLens),im)
end
or just
[rowMax,colMax] = find(vecLens==maxLen);

Related

Access multiple elements and assign to each selected element a different value

I need to know if there is any efficient way of doing the following in MATLAB.
I have several big sparse matrices, the size of each one is roughly 9000000x9000000.
I need to access multiple element of such matrix and assign to each selected element a different value stored in another array. I'll give an example:
What I have:
SPARSE MATRIX of size 9000000x9000000
Matrix with the list of indexes and values I want to access, this is a matrix like this:
[row1, col1, value1;
row2, col2, value2;
...
rowN, colN, valueN]
Where N is the length of such matrix.
What I need:
Assign to the SPARSE MATRIX the corresponding value to the corresponding index, this is:
SPARSE_MATRIX(row1, col1) = value1
SPARSE_MATRIX(row2, col2) = value2
...
SPARSE_MATRIX(rowN, colN) = valueN
Thanks in advance!
EDIT:
Thank you to both for answering, I think I did not explain myself well, I'll try again.
I already have a large SPARSE MATRIX of about 9000000 rows x 9000000 columns, it is a SPARSE MATRIX filled with zeros.
Then I have another array or matrix, let's call it M with N number of rows, where N could take values from 0 to 9000000; and 3 columns. The first two columns are used to index an element of my SPARSE MATRIX, and the third column stores the value I want to transfer to the SPARSE MATRIX, this is, given a random row of M, i:
SPARSE_MATRIX(M(i, 1), M(i, 2)) = M(i, 3)
The idea is to do that for all the rows, I have tried it with common indexing:
SPARSE_MATRIX(M(:, 1), M(:, 2)) = M(:, 3)
Now I would like to do this assignation for all the rows in M as fast as possible, because if I use a loop or common indexing it takes ages (I am using a 7th Gen i7 processor with 16 GB of RAM). And I also need to keep the zeros in the SPARSE_MATRIX.
EDIT 2: SOLVED! Thank you Metahominid, I was not thinking through, but yes the sparse function does solve my problem, I just think my brain circuits were shortcircuited yesterday and was unable to see through it hahaha. Thank you to both anyway!
Regards!
You can construct a sparse matrix like this.
A = sparse(i,j,v)
S = sparse(i,j,v) generates a sparse matrix S from the triplets i, j,
and v such that S(i(k),j(k)) = v(k). The max(i)-by-max(j) output
matrix has space allotted for length(v) nonzero elements. sparse adds
together elements in v that have duplicate subscripts in i and j.
So you can simply construct the row vector, column vector and value vector.
I am answering in part because I cannot comment. You question seems a little confusing to me. The sparse() function in MATLAB does just this.
You can enter your arrays of indices and values directly into the interface, or declare a sparse matrix of zeros and set each individually.
Given your data format make three vectors, ROWS = [row1; ...; rown], COLS = [col1; ...; coln], and DATA = [val1; ... valn]. I am assuming that your size is the overall size of the full matrix and not the sparse portion.
Then
A = sparse(ROWS, COLS, DATA) will do just what you want. You can even specify the original matrix size.
A = sparse(ROWS, COLS, DATA, 90...., 90....).

Matrix indices for creating sparse matrix

I want to create a 4 by 4 sparse matrix A. I want assign values (e.g. 1) to following entries:
A(2,1), A(3,1), A(4,1)
A(2,2), A(3,2), A(4,2)
A(2,3), A(3,3), A(4,3)
A(2,4), A(3,4), A(4,4)
According to the manual page, I know that I should store the indices by row and column respectively. That is, for row indices,
r=[2,2,2,2,3,3,3,3,4,4,4,4]
Also, for column indices
c=[1,2,3,4,1,2,3,4,1,2,3,4]
Since I want to assign 1 to each of the entries, so I use
value = ones(1,length(r))
Then, my sparse matrix will be
Matrix = sparse(r,c,value,4,4)
My problem is this:
Indeed, I want to construct a square matrix of arbitrary dimension. Says, if it is a 10 by 10 matrix, then my column vector will be
[1,2,..., 10, 1,2, ..., 10, 1,...,10, 1,...10]
For row vector, it will be
[2,2,...,2,3,3,...,3,...,10, 10, ...,10]
I would like to ask if there is a quick way to build these column and row vector in an efficient manner? Thanks in advance.
I think the question aims to create vectors c,r in an easy way.
n = 4;
c = repmat(1:n,1,n-1);
r = reshape(repmat(2:n,n,1),1,[]);
Matrix = sparse(r,c,value,n,n);
This will create your specified vectors in general.
However as pointed out by others full sparse matrixes are not very efficient due to overhead. If I recall correctly a sparse matrix offers advantages if the density is lower than 25%. Having everything except the first row will result in slower performance.
You can sparse a matrix after creating its full version.
A = (10,10);
A(1,:) = 0;
B = sparse(A);

matlab Create growing matrix with for loop that grows by 3 per loop

So I have written this:
HSRXdistpR = squeeze(comDatape_m1(2,7,1,:,isubj));
HSRXdistpL = squeeze(comDatape_m1(2,4,1,:,isubj));
TocomXdistp = squeeze(comDatape_m1(2,10,1,:,isubj));
for i = 1:2;
HSRXp = NaN(8,3*i);
HSRXp(:,i*3) = [HSRXdistpR(:,i) HSRXdistpL(:,i) TocomXdistp(:,i)];
end
In the first part I am just selecting data from a 5-D matrix, nothing special. All that's important here is that it creates an 8x2 matrix per line (isubj=2). Now I want to add the first column of each matrix into an 8x3 matrix, and then the second column of each matrix into the same matrix (creating an 8x6 matrix). Since the number of my subjects will vary, I want to do this in a for loop. This way, if the isubj increases to 3, it should go on to create an 8x9 matrix.
So I tried to create a matrix that will grow by 3 for each iteration of i, which selects the ith column of each of the 3 matrices and then puts them in there.
However I get the following error:
Subscripted assignment dimension mismatch.
Is it possible to let a matrix grow by more than one in a for loop? Or how should it be done otherwise?
Here is your problem:
HSRXp(:,i*3) = [HSRXdistpR(:,i) HSRXdistpL(:,i) TocomXdistp(:,i)];
You're trying to assign an n x 3 matrix (RHS) into an n x 1 vector (LHS). It would be easier to simply use horizontal concatenation:
HSRXp = [HSRXp, [HSRXdistpR(:,i) HSRXdistpL(:,i) TocomXdistp(:,i)]];
But that would mean reallocation at each step, which might slow your code down if the matrix becomes large.

Matlab: Structure with 6x6 matrices - how to get the average across the group

I have a structure named 'data' with 100 entries, each corresponding to a participant from an experiment. Each of the 100 entries contains multiple 6x6 matrices giving different values.
For instance, an example of a matrix from my first participant is:
data.p001.matrixCB
18.9737 17.0000 14.2829 12.4499 11.7898 10.0995
18.1384 16.0000 13.4907 11.7898 11.2250 10.3441
14.7986 12.5300 11.7898 11.7473 12.2066 9.3808
14.3527 13.4536 12.9615 13.3417 12.7279 11.7047
18.0278 17.8885 17.6068 17.4642 17.1464 16.6132
24.1661 24.7790 23.7697 23.3880 22.6495 23.8537
...and this is one of 100 entries in the structure with a similar setup.
I'd like to get the mean average value for each cell in the matrix across my 100 participants. So I would have a mean value for the 100 values in position matrixCB(1,1), and all other positions in the matrix. Unfortunately I can't see how this is done, and the help functions are less than helpful. Any assistance would be greatly appreciated!
You can sum all your 100 matrix into Sum and then divide it by 100 - Sum./100 and then each cell would represent the avg of all 100 cells on each index .
For example -
Sum = A + B ;
Sum./2 ;
Structures can be a pain. To avoid typing out a bunch of code, you could take the following approach:
Convert required matrices to cell array
Reshape the cell array into 3D matrix
Compute means across 3rd dimension
Code for this:
Mcell = arrayfun(#(x) data.(sprintf('p%03d',x)).matrixCB, 1:100, 'uni', 0);
M = mean( reshape(cell2mat(Mcell), 6, 6, []), 3 );

N-Dimensional Histogram Counts

I am currently trying to code up a function to assign probabilities to a collection of vectors using a histogram count. This is essentially a counting exercise, but requires some finesse to be able to achieve efficiently. I will illustrate with an example:
Say that I have a matrix X = [x1, x2....xM] with N rows and M columns. Here, X represents a collection of M, N-dimensional vectors. IN other words, each of the columns of X is an N-dimensional vector.
As an example, we can generate such an X for M = 10000 vectors and N = 5 dimensions using:
X = randint(5,10000)
This will produce a 5 x 10000 matrix of 0s and 1s, where each column is represents a 5 dimensional vector of 1s and 0s.
I would like to assign a probability to each of these vectors through a basic histogram count. The steps are simple: first find the unique columns of X; second, count the number of times each unique column occurs. The probability of a particular occurrence is then the #of times this column was in X / total number of columns in X.
Returning to the example above, I can do the first step using the unique function in MATLAB as follows:
UniqueXs = unique(X','rows')'
The code above will return UniqueXs, a matrix with N rows that only contains the unique columns of X. Note that the transposes are due to weird MATLAB input requirements.
However, I am unable to find a good way to count the number of times each of the columns in UniqueX is in X. So I'm wondering if anyone has any suggestions?
Broadly speaking, I can think of two ways of achieving the counting step. The first way would be to use the find function, though I think this may be slow since find is an elementwise operation. The second way would be to call unique recursively as it can also provide the index of one of the unique columns in X. This should allow us to remove that column from X and redo unique on the resulting X and keep counting.
Ideally, I think that unique might already be doing some counting so the most efficient way would probably be to work without the built-in functions.
Here are two solutions, one assumes all values are either 0's or 1's (just like the example in your description), the other does not. Both codes should be very fast (more so the one with binary values), even on large data.
1) only zeros and ones
%# random vectors of 0's and 1's
x = randi([0 1], [5 10000]); %# RANDINT is deprecated, use RANDI instead
%# convert each column to a binary string
str = num2str(x', repmat('%d',[1 size(x,1)])); %'
%# convert binary representation to decimal number
num = (str-'0') * (2.^(size(s,2)-1:-1:0))'; %'# num = bin2dec(str);
%# count frequency of how many each number occurs
count = accumarray(num+1,1); %# num+1 since it starts at zero
%# assign probability based on count
prob = count(num+1)./sum(count);
2) any positive integer
%# random vectors with values 0:MAX_NUM
x = randi([0 999], [5 10000]);
%# format vectors as strings (zero-filled to a constant length)
nDigits = ceil(log10( max(x(:)) ));
frmt = repmat(['%0' num2str(nDigits) 'd'], [1 size(x,1)]);
str = cellstr(num2str(x',frmt)); %'
%# find unique strings, and convert them to group indices
[G,GN] = grp2idx(str);
%# count frequency of occurrence
count = accumarray(G,1);
%# assign probability based on count
prob = count(G)./sum(count);
Now we can see for example how many times each "unique vector" occurred:
>> table = sortrows([GN num2cell(count)])
table =
'000064850843749' [1] # original vector is: [0 64 850 843 749]
'000130170550598' [1] # and so on..
'000181606710020' [1]
'000220492735249' [1]
'000275871573376' [1]
'000525617682120' [1]
'000572482660558' [1]
'000601910301952' [1]
...
Note that in my example with random data, the vector space becomes very sparse (as you increase the maximum possible value), thus I wouldn't be surprised if all counts were equal to 1...