Selecting columns/rows of a matrix in Julia - select

this is a very basic problem but I didn't find any hints on it. Let's say I have a 2x4 matrix and I want to reduce the dimension of the matrix to only these columns that are in the sum larger than 1:
A=rand(2,4)
ind = sum(A,1).>1
That gives me an indicator of the columns I want to retain. Naively one would assume that I can do that:
A[:,ind]
which doesn't work as ind is a BitArray and only for Bool Arrays this is allowed, i.e., the following works
A[:,[true,true,false,true]]
in return, the following does work:
A[A.>0.5]
But it returns a vector of filtered elements.
What is the logic behind this and how do I solve my problem?

As noted in the comments, this is fixed by using a version of Julia which is >=v0.4.

Related

Deletion of all but the first channel in a cell of matrices

I have a row cell vector M, containing matrices in each cell. Every matrix m (matrix inside the big matrix M) is made of 2 channels (columns), of which I only want to use the first.
The approach I thought about was going through each m, check if it has 2 channels, and if that is the case delete the second channel.
Is there a way to just slice it in matlab? or loop it and obtain the matrix M as the matrix m would disappear.
First code is:
load('ECGdata.mat')
I have the below.
when I double-click in one of the variable , here is what I can see:
As you can see the length of each matrix in each cell is different. Now let's see one cell:
The loop I'm trying to get must check the shape of the matrix (I'm talking python here/ I mean if the matrix has 2 columns then delete the second) because some of the variables of the dataframe have matrix containing one column (or just a normal column).
In here I'm only showing the SR variable that has 2 columns for each matrix. Its not the case for the rest of the variables
You do not need to delete the extra "channel", what you can do is quite simple:
newVar = cellfun(#(x)x(:,1), varName, 'UniformOutput', false);
where varName is SR, VF etc. (just run this command once for each of the variables you load).
What the code above does is go over each element of the input cell (an Nx2 matrix in your example), and select the first column only. Then it stores all outputs in a new cell array. In case of matrices with a single column, there is no effect - we just get the input back.
(I apologize in advance if there is some typo / error in the code, as I am writing this answer from my phone and cannot test it. Please leave a comment if something is wrong, and I'll do my best to fix it tomorrow.)

(matlab matrix operation), Is it possible to get a group of value from matrix without loop?

I'm currently working on implementing a gradient check function in which it requires to get certain index values from the result matrix. Could someone tell me how to get a group of values from the matrix?
To be specific, for a result matrx res with size M x N, I'll need to get element res(3,1), res(4,2), res(1,3), res(2,4)...
In my case, M is dimension and N is batch size and there's a label array whose size is 1xbatch_size, [3 4 1 2...]. So the desired values are res(label(:),1:batch_size). Since I'm trying to practice vectorization programming and it's better not using loop. Could someone tell me how to get a group of value without a iteration?
Cheers.
--------------------------UPDATE----------------------------------------------
The only idea I found is firstly building a 'mask matrix' then use the original result matrix to do element wise multiplication (technically called 'Hadamard product', see in wiki). After that just get non-zero element out and do the sum operation, the code in matlab should look like:
temp=Mask.*res;
desired_res=temp(temp~=0); %Note: the temp(temp~=0) extract non-zero elements in a 'column' fashion: it searches temp matrix column by column then put the non-zero number into container 'desired_res'.
In my case, what I wanna do next is simply sum(desired_res) so I don't need to consider the order of those non-zero elements in 'desired_res'.
Based on this idea above, creating mask matrix is the key aim. There are two methods to do this job.
Codes are shown below. In my case, use accumarray function to add '1' in certain location (which are stored in matrix 'subs') and add '0' to other space. This will give you a mask matrix size [rwo column]. The usage of full(sparse()) is similar. I made some comparisons on those two methods (repeat around 10 times), turns out full(sparse) is faster and their time costs magnitude is 10^-4. So small difference but in a large scale experiments, this matters. One benefit of using accumarray is that it could define the matrix size while full(sparse()) cannot. The full(sparse(subs, 1)) would create matrix with size [max(subs(:,1)), max(subs(:,2))]. Since in my case, this is sufficient for my requirement and I only know few of their usage. If you find out more, please share with us. Thanks.
The detailed description of those two functions could be found on matlab's official website. accumarray and full, sparse.
% assume we have a label vector
test_labels=ones(10000,1);
% method one, accumarray(subs,1,[row column])
tic
subs=zeros(10000,2);
subs(:,1)=test_labels;
subs(:,2)=1:10000;
k1=accumarray(subs,1,[10, 10000]);
t1=toc % to compare with method two to check which one is faster
%method two: full(sparse(),1)
tic
k2=full(sparse(test_labels,1:10000,1));
t2=toc

Mean of outlinks of a sparse matrix

So I'm working with sparse matrix and I have to find out different info about a very big one (10^6 size) and I need to find out the mean of the outlinks. Just to be sure I think mean is what you get from 3+4+5/3=4, 4 is the mean.
I thought of something like this:
[row,col] = find(A(:,2),1,'first')
and then I would do 1/numberInThatIndex or something similar, since it's a S-matrix (pretty sure it's called that).
And I would iterate column by column but for some reason it's not giving me the first number in each column, if I do find(A(:,1),1,'first') it does give me the first in the first column, but not in the second if I change it to A(:,2).
I'd also need something to store that index to access the value, I thought of a 2xN vector but I guess it's not the best idea. I mean, find is going to give me index, but I need the value in that index, and then store that or show it. Not sure if I'm explaining myself properly but I'm trying, sorry about that.
Just to be clear both when I input A(:,1) and A(:,2) it gives me index from the first column, and I do not want that, I want first element found from each column, so I can calculate the mean out of the number in that index.
edit: allright it seems like that indeed does work, but when I was checking the results I was putting 3817 instead of 3871 that was the given answer and so I found a 0 when I wanted something that's not a zero. Not sure if I should delete all of this.
To solve your problem, you can do the following:
numberNonZerosPerColumn = sum(S~=0,1);
meanValue = nanmean(1./numberNonZerosPerColumn);
Count the number of nonzero elements in every column n(i)
Compute the values v(i) that are stored there, which are defined by v(i) := 1/n(i)
Take the mean of those values where n(i) is not zero (i.e. summing all those values, where v(i) is not NaN and divide by the number of columns that contain at least one zero)
If you want to treat columns without any nonzero entry as v(i):= 0, but still use them in your mean, you can use:
numberNonZerosPerColumn = sum(S~=0,1);
meanValue = nansum(1./numberNonZerosPerColumn)/size(S,2);

Error using - Matrix dimensions must agree

I got the following error while working with a MATLAB program:
Error using - Matrix dimensions must agree
I noticed that the sizes of the matrices I'm trying to subtract from each other were:
firstMatrix --> 425x356
secondMatrix --> 426x356
How can I make them of equal size and go ahead and do my subtraction process?
I tried reshape, but the number of elements here seem to have to be equal.
Thanks.
I think both answers are missing the key point. Blithely subtracting two arrays of different size forgets that those arrays are NOT just numbers. The numbers must mean something. Else, they are just meaningless.
As well, simply deleting a row from the beginning or end may well be wrong, or padding with zeros. Only you know what the numbers mean, and why those arrays are not the same size. So only you can decide what is the proper action.
It might be right to pad, delete, interpolate, do any of these things. Or you might realize there is a bug in your code that created these arrays.
Your matrices have a different number of elements, so there's no point using reshape here (since it maintains the total number of elements). You'll have to discard one of the lines in the larger matrix before doing the subtraction:
For instance, you can discard the last line:
firstMatrix - secondMatrix(1:end - 1, :)
or discard the first line:
firstMatrix - secondMatrix(2:end, :)
Alternatively, you can pad the smaller matrix with default values (e.g NaN or zeroes), as suggested in another answer.
You're missing a row in firstMatrix
So can try:
firstMatrix=[firstMatrix;zeros(1,356)];
This will add a row of zeros at end of firstMatrix making it of 426x356

Compact MATLAB matrix indexing notation

I've got an n-by-k sized matrix, containing k numbers per row. I want to use these k numbers as indexes into a k-dimensional matrix. Is there any compact way of doing so in MATLAB or must I use a for loop?
This is what I want to do (in MATLAB pseudo code), but in a more MATLAB-ish way:
for row=1:1:n
finalTable(row) = kDimensionalMatrix(indexmatrix(row, 1),...
indexmatrix(row, 2),...,indexmatrix(row, k))
end
If you want to avoid having to use a for loop, this is probably the cleanest way to do it:
indexCell = num2cell(indexmatrix, 1);
linearIndexMatrix = sub2ind(size(kDimensionalMatrix), indexCell{:});
finalTable = kDimensionalMatrix(linearIndexMatrix);
The first line puts each column of indexmatrix into separate cells of a cell array using num2cell. This allows us to pass all k columns as a comma-separated list into sub2ind, a function that converts subscripted indices (row, column, etc.) into linear indices (each matrix element is numbered from 1 to N, N being the total number of elements in the matrix). The last line uses these linear indices to replace your for loop. A good discussion about matrix indexing (subscript, linear, and logical) can be found here.
Some more food for thought...
The tendency to shy away from for loops in favor of vectorized solutions is something many MATLAB users (myself included) have become accustomed to. However, newer versions of MATLAB handle looping much more efficiently. As discussed in this answer to another SO question, using for loops can sometimes result in faster-running code than you would get with a vectorized solution.
I'm certainly NOT saying you shouldn't try to vectorize your code anymore, only that every problem is unique. Vectorizing will often be more efficient, but not always. For your problem, the execution speed of for loops versus vectorized code will probably depend on how big the values n and k are.
To treat the elements of the vector indexmatrix(row, :) as separate subscripts, you need the elements as a cell array. So, you could do something like this
subsCell = num2cell( indexmatrix( row, : ) );
finalTable( row ) = kDimensionalMatrix( subsCell{:} );
To expand subsCell as a comma-separated-list, unfortunately you do need the two separate lines. However, this code is independent of k.
Convert your sub-indices into linear indices in a hacky way
ksz = size(kDimensionalMatrix);
cksz = cumprod([ 1 ksz(1:end-1)] );
lidx = ( indexmatrix - 1 ) * cksz' + 1; #'
% lindx is now (n)x1 linear indices into kDimensionalMatrix, one index per row of indexmatrix
% access all n values:
selectedValues = kDimensionalMatrix( lindx );
Cheers!