I'm looking for an efficient way to turn the vector:
[1,1,1,2,3,3,3,4,4,4,5,1]
into a vector of vectors such that:
[[1,2,3,12],[4],[5,6,7],[8,9,10],[11]]
In general:
newVector[i] = the indexes of the initial vector whose elements equal i
Preferably in MATLAB/Octave, but I'm mainly curious whether there is an efficient way of achieving this.
I tried looking it up on Google and Stack Overflow, but I have no idea what to call this operation, so nothing came up.
There is an easy way to do it using accumarray:
A = [1,1,1,2,3,3,3,4,4,4,5,1]
accumarray(A',A',[],@(x){find(ismember(A,x))})
But next time, please show your own attempt in your question.
Alternatively (but only if A starts from 1 and doesn't skip any numbers):
accumarray(A', (1:size(A,2))', [], @(x){sort(x)})
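For reference, a quick check of the second approach (a sketch; accumarray collects the indices as column vectors, so the display differs from the row-vector notation above):
A = [1,1,1,2,3,3,3,4,4,4,5,1];
groups = accumarray(A', (1:numel(A))', [], @(x){sort(x)});
celldisp(groups)   % groups{1} = [1;2;3;12], groups{2} = 4, groups{3} = [5;6;7], ...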
Sorry in advance that the title may not be precise, but I am just not sure what this should be called.
Consider an index vector id=[1,1,2] and a data vector d=[3,4,5]. I would like to have
A(id)=A(id)+d;
Of course, I am aware that this does not work as intended (with repeated indices, the assignments are not accumulated). I just wonder if there is an efficient way (avoiding a for loop) when length(id) = length(d) is very large.
To be more precise, I want to have:
for ii = 1:length(id)
    A(id(ii)) = A(id(ii)) + d(ii);
end
So for the example above, I expect A = [3+4,5] = [7,5].
You can use accumarray:
A = accumarray(id(:), d);
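A quick check with the numbers from the question (a sketch; accumarray sums all d values that share the same id):
id = [1,1,2];
d  = [3,4,5];
A  = accumarray(id(:), d(:))   % returns the column vector [7; 5]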
After some thought, maybe I should just expand into another dimension, trading space for time:
dummy = zeros(max(id), length(d));
dummy(sub2ind(size(dummy), id, 1:length(d))) = d;
A = sum(dummy, 2);
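If the dense dummy matrix is too large, a possible alternative (not from the original answers, just a sketch) is to let sparse do the accumulation, since it sums the values supplied for repeated subscripts:
id = [1,1,2];
d  = [3,4,5];
A  = full(sparse(id, 1, d))   % also gives [7; 5]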
I have a very big sparse csc_matrix x and I want to apply an element-wise exp() to it. Basically, I want the same result I would get from numpy.exp(x.toarray()), but I can't do that (my memory won't allow me to convert the sparse matrix into a dense array). Is there any way out? Thanks in advance!
If you don't have the memory to hold x.toarray(), you don't have the memory to hold the output you're asking for. The output won't be sparse; in fact, unless your input has negative infinities in it, the output probably won't have a single 0.
It'd probably be better to compute exp(x)-1, which is as simple as
x.expm1()
If you want to do something on nonzeros only: the data attribute is writable at least in some representations including csr and csc. Some representations allow for duplicate entries, so make sure you are acting on a "normalised" form.
To change non-zero elements, maybe this would work for you:
x = ...  # some big sparse matrix
np.exp( x.data, out=x.data )  # ask np.exp() to store results in the existing x.data
Presumably slower (allocates a new array for the result, whereas the above does not):
x.data = np.exp( x.data )
I've been wrestling with how to take an element-wise log2() of each non-zero array element. I ended up doing something like:
np.log2( x.data, out=x.data )
The two techniques above seem like exactly what I was looking for. My matrix is sparse, but it still has plenty of non-zero elements.
Credit to @DSM here for the idea of directly changing x.data; I think that is a superb insight about sparse matrices.
Credit to @Mike Müller for the idea of using "out" as itself. In the same thread, @kmario23 points out an important caveat about promoting .data to floats (the input could be int, for example) so it is compatible with exp() or whatever function is applied; I would want to do that if I were writing something for general use.
Note: I'm just starting to learn about sparse matrices, so I'd like to know if this is a bad idea for reasons I'm not seeing. Please let me know if I'm on thin ice with this.
Normally I wouldn't mess with private attributes, but .data shows up pretty clearly in the attributes documentation for the various sparse matrices I've looked at.
I have two matrices S and T which have n columns, and a row vector v of length n. By my construction, I know that S does not have any duplicate rows. What I'm looking for is a fast way to find out whether or not the row vector v appears as one of the rows of S. Currently I'm using the test:
if min([sum(abs(S - repmat(v,size(S,1),1)),2); sum(abs(T - repmat(v,size(T,1),1)),2)]) ~= 0 ...
When I first wrote it, I had a for loop testing each row (I knew this would be slow; I was just making sure the whole thing worked first). I then changed this to defining a difference matrix from the two components above and then summing, but this was slightly slower than the above.
All the advice I've found online says to use the function unique. However, this is very slow because it also sorts the matrix, which I don't need, so it is a massive waste of time. This is a bottleneck in my code, taking nearly 90% of the run time. If anyone has any advice on how to speed this up, I'd be most appreciative!
I imagine there's a fairly straightforward way, but I'm not hugely experienced with MATLAB. I know how to use the basic stuff, but not some of the more specialist functions.
Thanks!
To clarify, following Sardar_Usama's comment: I want this to work for a matrix with any number of rows and a single vector. I'd forgotten to mention that the elements are all in the set {0,1,...,q-1}; I don't know whether or not that helps make it faster!
You may want this:
ismember(v,S,'rows')
and swap the arguments S and v to get the logical indices of the rows of S that match v:
ismember(S,v,'rows')
Or, to test whether v is a row of S:
any(all(bsxfun(@eq, S, v), 2))
This returns the logical indices of all rows of S equal to v:
all(bsxfun(@eq, S, v), 2)
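A small worked example with made-up data (a sketch):
S = [0 1 2;
     1 1 0;
     2 0 1];
v = [1 1 0];
ismember(v, S, 'rows')            % 1: v appears as a row of S
find(ismember(S, v, 'rows'))      % 2: index of the matching row
any(all(bsxfun(@eq, S, v), 2))    % 1: the same test without ismember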
I searched a lot on Google but didn't find an answer that helped me without reducing performance.
I have two matrices A and B of the same size with different values. I want to filter:
indices = find(A<5 & B>3)
A(indices) = ...
B(indices) = ...
Now I want to apply a function on the remaining indices, indices_2 = find(A>=5 | B<=3), without using the find function on the whole matrices A and B again. Logical operations alone are not enough in this case because I need the indices, not 0s and 1s.
Something like:
A(~indices) = ...
B(~indices) = ...
instead of:
indices_2=find(A>=5 | B<=3)
A(indices_2) = ...
B(indices_2) = ...
And after that I want to split these sets once again... just filtering.
I used indices_2 = setdiff(indices, size(A)) but it badly hurt my computation performance. Is there any other method to split the matrices into subsets without using find twice?
I hope you understand my problem and that it fits the rules.
I don't understand why you can't just use find again, nor why you can't use logical indexing in this case, but if you are going to restrict yourself like this then you could accomplish it using setdiff:
indices_2 = setdiff(1:numel(A), indices)
However, if you are worried about performance, you should stick to logical indexing:
indices = A<5 & B>3
A(indices)=...
B(indices)=...
A(~indices)=...
B(~indices)=...
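For instance, with made-up data and placeholder operations (a sketch):
A = magic(4);
B = 10*rand(4);
mask = A<5 & B>3;      % compute the logical mask once
A(mask)  = 0;          % operate on the first set
B(mask)  = 0;
A(~mask) = -1;         % operate on the complement without calling find again
B(~mask) = -1;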
I think you may be looking for something like this:
% Split your data in two and keep track of which numbers you have
ranks = 1:numel(A);
indices = A<5 & B>3;   % use a logical mask so that ~indices works below
% Update the numbers list to contain the set of numbers you are interested in
ranks_2 = ranks(~indices);
% Operate on the set you are interested in and find the relevant ranks
indices_2 = A(~indices)>=5 | B(~indices)<=3;
ranks_2 = ranks_2(indices_2)
I have one big matrix, for example 3000x300, and I need to take each element and do several calculations with it. I looked into using the arrayfun function, but because the output of my computation is not a single value, this did not seem possible.
It works fine now with the loops, but it has to perform much faster, so I want to remove the for loop.
Maybe I'll try to be more specific: each value of the big matrix has to give me an answer consisting of 4 different matrices, each of size 4x6020.
So I don't know if it is possible to vectorize this...
Maybe somebody has other suggestions to make it faster?
greetings,
You can use arrayfun and set 'UniformOutput' to false.
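For illustration, a minimal sketch (the per-element function myCalc and the matrix sizes are made up for the example; with 'UniformOutput' set to false, arrayfun returns a cell array with one cell per element):
M = rand(5, 3);                                   % small example matrix
% hypothetical per-element computation returning four matrices
myCalc = @(v) {v*ones(2,3), v+ones(2,3), v-ones(2,3), (v^2)*ones(2,3)};
results = arrayfun(myCalc, M, 'UniformOutput', false);
% results is a 5x3 cell array; results{i,j} is a 1x4 cell with the matrices for M(i,j)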