cumsum of values for same timeunit

I have the following vectors:
A=[1 0 1 0 0 1 0 1 0 0];
B=[1 2 3 4 5 6 7 8 9 10];
Here A represents a time vector, where each 1 signals the beginning of a time unit.
Now I want to add up all the values in B that belong to a time unit with a length of exactly 3 steps.
In this example, that means the 3rd, 4th and 5th values and the 8th, 9th and 10th values of B should be summed, because these fall in time units of length 3:
B_result=[12 27];
I know cumsum() is the command for this, but I don't know how to specify that only the values selected by the time indices in A should be summed.
Can you help me?
Thanks a lot

You can use cumsum alongside accumarray and hist:
csa = cumsum(A); %// map each step to the index of its time unit
n = hist(csa, 1:max(csa)); %// count num of steps in each unit
B_result = accumarray( csa', B' ); %// accumulate B into different time units
B_result(n~=3) = []; %// discard all time units that do not have 3 steps
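To make the intermediate values concrete, here is a minimal sketch of the same idea on the vectors from the question. It counts the steps per unit with accumarray instead of hist (a substitution, not what the answer above uses), and it assumes A(1) is 1 so that every step belongs to a unit. The names unitId, unitLen and unitSum are illustrative only.

```matlab
A = [1 0 1 0 0 1 0 1 0 0];
B = [1 2 3 4 5 6 7 8 9 10];

unitId  = cumsum(A);                     % unit index of every step: 1 1 2 2 2 3 3 4 4 4
unitLen = accumarray(unitId(:), 1);      % number of steps in each unit
unitSum = accumarray(unitId(:), B(:));   % sum of B over each unit
B_result = unitSum(unitLen == 3).';      % keep only the units that are 3 steps long
```

For these inputs B_result is [12 27].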

For simpler pattern matching, you can use strfind:
loc = strfind([A,1],[1 0 0 1]); %// append a 1 to A and end the pattern with a 1 so that longer intervals are excluded
idx = bsxfun(@plus,loc,(0:2)'); %'// get the indices that need to be summed
result = sum(B(idx),1); %// obtain the result
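The same pattern can be built for any unit length. The sketch below is one possible generalization, with N and pattern as illustrative names. It also reshapes B(idx) before summing: when there is only one match, idx is a column vector and indexing the row vector B with it returns a row, so summing along dimension 1 would not add anything up. The reshape is a guard added here, not part of the answer above.

```matlab
N = 3;                                    % length of the time units to keep
A = [1 0 1 0 0 1 0 1 0 0];
B = 1:10;

pattern = [1 zeros(1, N-1) 1];            % a 1, N-1 zeros, then the next 1
loc = strfind([A 1], pattern);            % start index of every unit of length N
idx = bsxfun(@plus, loc, (0:N-1).');      % N-by-numel(loc) matrix of indices
result = sum(reshape(B(idx), N, []), 1);  % one sum per matching unit
```

With the vectors from the question this gives [12 27].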

N = 3; %// We want to detect a one followed by exactly N-1 zeros. Call that
%// sequence an "interesting part"
ind = find([A 1]); %// find ones. Append a last one to detect a possible
%// interesting part at the end.
ind = ind(diff(ind)==N); %// index of beginning of interesting parts
cs = [0 cumsum(B)]; %// accumulate values; the leading 0 handles a part that starts at index 1
B_result = cs(ind+N)-cs(ind); %// use index to build result
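A short sketch of the cumulative-sum difference on an input whose first unit already has length N, which is the case where an index of ind-1 would be 0. The input vectors here are made up for illustration, and cs0 is a cumulative sum with a zero prepended so that cs0(k+1) equals sum(B(1:k)).

```matlab
N = 3;
A = [1 0 0 1 0 1 0 0];           % units of length 3, 2 and 3
B = 1:8;

ind = find([A 1]);               % positions of the ones, plus a sentinel at the end
ind = ind(diff(ind) == N);       % starts of the units that are exactly N long
cs0 = [0 cumsum(B)];             % cs0(k+1) == sum(B(1:k))
B_result = cs0(ind + N) - cs0(ind);
```

Here B_result is [6 21], that is sum(B(1:3)) and sum(B(6:8)).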

A more generic application of Jonas' Idea:
A = [1 0 1 0 0 1 0 1 0 0 0 0 1];
B = [1 2 3 4 5 6 7 8 9 10 11 12];
n = 3;
result = arrayfun(@(x) sum( B(x:x+n-1) ), strfind([A,1],num2str(10^n+1)-48))
or use cumsum instead of sum, I was not sure what you actually want:
result = arrayfun(@(x) cumsum( B(x:x+n-1) ), ...
strfind( [A,1],num2str(10^n+1)-48 ) ,'uni',0)
%optional:
result = cell2mat(result')

Related

Remove single elements from a vector

I have a vector M containing single elements and repeats. I want to delete all the single elements. Turning something like [1 1 2 3 4 5 4 4 5] to [1 1 4 5 4 4 5].
I thought I'd try to get the count of each element then use the index to delete what I don't need, something like this:
uniq = unique(M);
list = [uniq histc(M,uniq)];
Though I'm stuck here and not sure how to go forward. Can anyone help?

Here is a solution using unique, histcounts and ismember:
tmp=unique(M) ; %finding unique elements of M
%Now keeping only those elements in tmp which appear only once in M
tmp = tmp(histcounts(M,[tmp tmp(end)])==1); %Thanks to rahnema for his insight on this
[~,ind] = ismember(tmp,M); %finding the indexes of these elements in M
M(ind)=[];
histcounts was introduced in R2014b. For earlier versions, hist can be used by replacing that line with this:
tmp=tmp(hist(M,tmp)==1);

You can get the result with the following code:
A = [a.', ones(length(a),1)];
[C,~,ic] = unique(A(:,1));
result = [C, accumarray(ic,A(:,2))];
a = A(~ismember(A(:,1),result(result(:,2) == 1))).';
The idea is to append a column of ones to a', then use accumarray, grouped by the first column (the elements of a), to count how often each element occurs. The elements whose accumulated sum in the second column is 1 appear exactly once in a, so they are removed from the first column of A.
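The counting idea can also be written without the helper matrix. This is a minimal sketch using the vector M from the question; counts and isSingle are illustrative names, and the reshape only makes sure the mask has the same orientation as M.

```matlab
M = [1 1 2 3 4 5 4 4 5];

[~, ~, ic] = unique(M);                        % ic maps every element to its unique value
counts = accumarray(ic(:), 1);                 % occurrences of each unique value
isSingle = reshape(counts(ic) == 1, size(M));  % true where the element occurs only once
M(isSingle) = [];                              % drop the single elements
```

After this, M is [1 1 4 5 4 4 5].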

Here is a cheaper alternative:
[s ii] = sort(a);
x = [false s(2:end)==s(1:end-1)]
y = [x(2:end)|x(1:end-1) x(end)]
z(ii) = y;
result = a(z);
Assuming the input is
a =
1 1 8 8 3 1 4 5 4 6 4 5
We sort the list to get s, together with the index ii of the sorted elements:
s=
1 1 1 3 4 4 4 5 5 6 8 8
We can find the repeated elements by checking whether each element is equal to the previous one:
x =
0 1 1 0 0 1 1 0 1 0 0 1
However, x misses the first element of each block. To include it, we apply an OR between each element of x and the next one:
y =
1 1 1 0 1 1 1 1 1 0 1 1
We now have a logical index of the repeated elements in sorted order. It has to be put back into the original order, which we do with the sort index ii:
z =
1 1 1 1 0 1 1 1 1 0 1 1
finally use z to extract only the repeated elements.
result =
1 1 8 8 1 4 5 4 4 5
Here is a result of a test in Octave* for the following input:
a = randi([1 100000],1,10000000);
-------HIST--------
Elapsed time is 5.38654 seconds.
----ACCUMARRAY------
Elapsed time is 2.62602 seconds.
-------SORT--------
Elapsed time is 1.83391 seconds.
-------LOOP--------
Doesn't complete in 15 seconds.
*histcounts hasn't been implemented in Octave, so I used hist instead.
You can test it Online

X = [1 1 2 3 4 5 4 4 5];
Y = X;
A = unique(X);
for i = 1:length(A)
idx = find(X==A(i));
if length(idx) == 1
Y(idx) = NaN;
end
end
Y(isnan(Y)) = [];
Then Y is [1 1 4 5 4 4 5]. The loop detects all single elements and marks them as NaN, and the last line removes all NaN elements from the vector.

Finding the column index for the 1 in each row of a matrix

I have the following matrix in Matlab:
M = [0 0 1
1 0 0
0 1 0
1 0 0
0 0 1];
Each row has exactly one 1. How can I (without looping) determine a column vector so that the first element is a 2 if there is a 1 in the second column, the second element is a 3 for a one in the third column etc.? The above example should turn into:
M = [ 3
1
2
1
3];

You can actually solve this with simple matrix multiplication.
result = M * (1:size(M, 2)).';
3
1
2
1
3
This works by multiplying your M x 3 matrix with a 3 x 1 array where the elements of the 3x1 are simply [1; 2; 3]. Briefly, for each row of M, element-wise multiplication is performed with the 3 x 1 array. Only the 1's in the row of M will yield anything in the result. Then the result of this element-wise multiplication is summed. Because you only have one "1" per row, the result is going to be the column index where that 1 is located.
So for example for the first row of M.
element_wise_multiplication = [0 0 1] .* [1 2 3]
[0, 0, 3]
sum(element_wise_multiplication)
3
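The row-by-row reasoning above can be checked for the whole matrix at once. In this small sketch, byProduct is the matrix multiplication and byRowSum spells out the element-wise multiplication followed by a row sum; both names are illustrative.

```matlab
M = [0 0 1
     1 0 0
     0 1 0
     1 0 0
     0 0 1];

colIdx = (1:size(M, 2)).';                        % column numbers as a column vector: [1; 2; 3]
byProduct = M * colIdx;                           % matrix multiplication
byRowSum = sum(bsxfun(@times, M, colIdx.'), 2);   % element-wise product, then sum over each row
```

Both variables hold [3; 1; 2; 1; 3].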
Update
Based on the solutions provided by @rayryeng and @Luis below, I decided to run a comparison to see how the performance of the various methods compared.
To set up the test matrix (M), I created a matrix of the form specified in the original question and varied the number of rows. Which column had the 1 was chosen randomly using randi([1 nCols], 1, size(M, 1)). Execution times were analyzed using timeit.
When run using M of type double (MATLAB's default) you get the following execution times.
If M is a logical, then the matrix multiplication takes a hit due to the fact that it has to be converted to a numerical type prior to matrix multiplication, whereas the other two have a bit of a performance improvement.
Here is the test code that I used.
sizes = round(linspace(100, 100000, 100));
times = zeros(numel(sizes), 3);
for k = 1:numel(sizes)
M = generateM(sizes(k));
times(k,1) = timeit(@()M * (1:size(M, 2)).');
M = generateM(sizes(k));
times(k,2) = timeit(@()max(M, [], 2), 2);
M = generateM(sizes(k));
times(k,3) = timeit(@()find(M.'), 2);
end
figure
plot(sizes, times * 1000); % timeit returns seconds; convert to milliseconds
legend({'Multiplication', 'Max', 'Find'})
xlabel('Number of rows in M')
ylabel('Execution Time (ms)')
function M = generateM(nRows)
M = zeros(nRows, 3);
col = randi([1 size(M, 2)], 1, size(M, 1));
M(sub2ind(size(M), 1:numel(col), col)) = 1;
end

You can also abuse find and observe the row positions of the transpose of M. You have to transpose the matrix first as find operates in column major order:
M = [0 0 1
1 0 0
0 1 0
1 0 0
0 0 1];
[out,~] = find(M.');
Not sure if this is faster than matrix multiplication though.

Yet another approach: use the second output of max:
[~, result] = max(M.', [], 1);
Or, as suggested by @rayryeng, use max along the second dimension instead of transposing M:
[~, result] = max(M, [], 2);
For
M = [0 0 1
1 0 0
0 1 0
1 0 0
0 0 1];
this gives
result =
3 1 2 1 3
If M contains more than one 1 in a given row, this will give the index of the first such 1.
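To see how the three approaches differ once a row holds more than one 1, here is a small sketch on a made-up matrix that breaks the "exactly one 1 per row" assumption. The variable names are illustrative.

```matlab
M = [0 1 1
     1 0 0];                          % the first row has two 1s

[~, byMax] = max(M, [], 2);           % index of the first 1 in each row
byProduct = M * (1:size(M, 2)).';     % sum of the column indices of all 1s in each row
[byFind, ~] = find(M.');              % one entry per 1, so more entries than rows
```

Here byMax is [2; 1], byProduct is [5; 1] and byFind is [2; 3; 1], so only max still returns one valid column index per row.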

Row-by-row comparison in MATLAB

I want to compare every row of a matrix with its every other row, element by element wise, using MATLAB. If two of the entries match, the result will be stored as 1, and if they don't match, it will be 0. This will give a symmetric matrix consisting of 0s and 1s.
For example, let A = [4 6 7 9 5; 2 6 9 9 1]
Then, the result expected is [1 1 1 1 1; 0 1 0 1 0; 0 1 0 1 0; 1 1 1 1 1]
The code I am using is (for a 1000*1000 random matrix):
A = randi(50,1000,1000);
B = zeros(1000000,1000);
D = zeros(1000000,1);
c=0;
for i=1:1000
for k=1:1000
for j=1:1000
if A(i,j)==A(k,j)
B(k+c,j)=1;
else
B(k+c,j)=0;
end
end
end
c=c+1000;
end
for l=1:1000000
D(l)=0;
for m=1:1000
D(l)=D(l)+(B(l,m)/(1000));
end
end
E=reshape(D,1000,1000);
This goes out of memory. Could anyone please suggest a solution or a more efficient code?

You can compare row by row directly: replicate one complete row and compare it with every row of the matrix at once.
For example:
A = [4 6 7 9 5; 2 6 9 9 1];
nA = size(A, 1); % number of rows
finalMat = [];
for i = 1:nA
matRow = ones(nA,1)*A(i,:); % matrix the size of A in which every row equals row i
finalMat = [finalMat;matRow == A];
end
See if this works for your application.

You can use permute to align dimensions appropriately and then bsxfun for the comparisons:
reshape(bsxfun(@eq, permute(A, [1 3 2]), permute(A, [3 1 2])), [], size(A,2))
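If what is ultimately needed is the matrix E from the question (the fraction of matching entries for each pair of rows), the full comparison array does not have to be stored. The sketch below is one way to do that, assuming the averaged result is the goal: it loops over the columns and accumulates the matches, so it only ever holds a matrix with one entry per pair of rows.

```matlab
A = [4 6 7 9 5; 2 6 9 9 1];
[m, n] = size(A);

E = zeros(m);                                   % E(i,k): fraction of columns where rows i and k agree
for j = 1:n
    E = E + bsxfun(@eq, A(:, j), A(:, j).');    % m-by-m matches for column j
end
E = E / n;
```

For this A, rows 1 and 2 agree in 2 of 5 columns, so E is approximately [1 0.4; 0.4 1].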

How to shift zero in the last column of a matrix

I have one matrix like below-
A=[1 1 1 1 1;
0 1 1 1 2;
0 0 1 1 3]
But I want to place all the 0 at the end of the row, so A should be like-
A=[1 1 1 1 1;
1 1 1 2 0;
1 1 3 0 0]
How can I do this? Matlab experts please help me.

There you go. Whole matrix, no loops, works even for non-contiguous zeros:
A = [1 1 1 1 1; 0 1 1 1 2; 0 0 1 1 3];
At = A.'; %// It's easier to work with the transpose
[~, rows] = sort(At~=0,'descend'); %// This is the important part.
%// It sends the zeros to the end of each column
cols = repmat(1:size(At,2),size(At,1),1);
ind = sub2ind(size(At),rows(:),cols(:));
sol = repmat(NaN,size(At,1),size(At,2));
sol(:) = At(ind);
sol = sol.'; %'// undo transpose
As usual, for Matlab versions that do not support the ~ symbol as a function output, replace ~ with a dummy variable, for example:
[nada, rows] = sort(At~=0,'descend'); %// This is the important part.

A more generic example:
A = [1 3 0 1 1;
0 1 1 1 2;
0 0 1 1 3]
% Sort columns directly
[~,srtcol] = sort(A == 0,2);
% Sorted positions
sz = size(A);
pos = bsxfun(@plus, (srtcol-1)*sz(1), (1:sz(1))'); % or use sub2ind
The result
B = A(pos)
B =
1 3 1 1 0
1 1 1 2 0
1 1 3 0 0
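The sub2ind variant mentioned in the comment could look like the sketch below. The name rows is illustrative; it holds the row number of every entry so that sub2ind can pair it with the sorted column positions.

```matlab
A = [1 3 0 1 1;
     0 1 1 1 2;
     0 0 1 1 3];

[~, srtcol] = sort(A == 0, 2);                    % within each row, non-zeros first, original order kept
rows = repmat((1:size(A, 1)).', 1, size(A, 2));   % row number of every entry
B = A(sub2ind(size(A), rows, srtcol));            % pick the entries in their new order
```

B equals [1 3 1 1 0; 1 1 1 2 0; 1 1 3 0 0], the same result as above.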

There are many ways to do this. One quick way, shown here for a single row vector:
a = [1 2 3 4 0 5 7 0];
idx=(find(a==0));
idx =
5 8
b=a; % save a new copy of the vector
b(idx)=[]; % remove zero elements
b =
1 2 3 4 5 7
c=[b zeros(size(idx))]
c =
1 2 3 4 5 7 0 0
You may modify this code as well.

If your zeros are always together, you could use the circshift command. This shifts values in an array by a specified number of places, and wraps values that run off the edge over to the other side. It looks like you would need to do this separately for each row in A, so in your example above you could try:
A(2,:) = circshift(A(2,:), [1 -1]); % shift the second row one to the left with wrapping
A(3,:) = circshift(A(3,:), [1 -2]); % shift the third row two to the left with wrapping
In general, if your zeros are always at the front of the row in A, you could try something like:
for ii = 1:size(A,1) % iterate over rows in A
numShift = numel(find(A(ii,:) == 0)); % assuming zeros at the front of the row, this is how many times we have to shift the row.
A(ii,:) = circshift(A(ii,:), [1 -numShift]); % shift it
end

Try this (just a fast hack):
for row_k = 1:size(A, 1)
[A_sorted, A_sortmap] = sort(A(row_k, :) == 0, 'ascend');
% update row in A:
A(row_k, :) = A(row_k, A_sortmap);
end
Now optimized for versions of MATLAB not supporting ~ as garbage lhs-identifier.

@LuisMendo's answer is inspiring in its elegance, but I couldn't get it to work (perhaps a matlab version thing). The following (based on his answer) worked for me:
Aaux = fliplr(reshape([1:numel(A)],size(A)));
Aaux(find(A==0))=0;
[Asort iso]=sort(Aaux.',1,'descend');
iso = iso + repmat([0:size(A,1)-1]*size(A,2),size(A,2),1);
A=A.';
A(iso).'

I've also asked this question and got a super elegant answer (none of the above answers is the same) here:
Optimize deleting matrix leading zeros in MATLAB

MATLAB: Fastest Way to Count Unique # of 2 Number Combinations in a Vector of Integers

Given a vector of integers such as:
X = [1 2 3 4 5 1 2]
I would like to find a really fast way to count the number of unique combinations with 2-elements.
In this case the two-number combinations are:
[1 2] (occurs twice)
[2 3] (occurs once)
[3 4] (occurs once)
[4 5] (occurs once)
[5 1] (occurs once)
As it stands, I am currently doing this in MATLAB as follows
X = [1 2 3 4 5 1 2];
N = length(X)
X_max = max(X);
COUNTS = nan(X_max); %store as a X_max x X_max matrix
for i = 1:X_max
first_number_indices = find(X==i)
second_number_indices = first_number_indices + 1;
second_number_indices(second_number_indices>N) = [] %just in case last entry = 1
second_number_vals = X(second_number_indices);
for j = 1:X_max
COUNTS(i,j) = sum(second_number_vals==j)
end
end
Is there a faster/smarter way of doing this?

Here is a super fast way:
>> counts = sparse(X(1:end-1),X(2:end),1)
counts =
(5,1) 1
(1,2) 2
(2,3) 1
(3,4) 1
(4,5) 1
You could convert to a full matrix simply as: full(counts)
Here is an equivalent solution using accumarray:
>> counts = accumarray([X(1:end-1);X(2:end)]', 1)
counts =
0 2 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 0 1
1 0 0 0 0

EDIT: @Amro has provided a much better solution (well, better in the vast majority of cases, I suspect my method would work better if MaxX is very large and X contains zeros - this is because the presence of zeros will rule out the use of sparse while a large MaxX will slow down the accumarray approach as it creates a matrix of size MaxX by MaxX).
EDIT: Thanks to @EitanT for pointing out an improvement that can be made using accumarray.
Here is how I would solve it:
%Generate some random data
T = 20;
MaxX = 3;
X = randi(MaxX, T, 1);
%Get the unique combinations and an index. Note, I am assuming X is a column vector.
[UniqueComb, ~, Ind] = unique([X(1:end-1), X(2:end)], 'rows');
NumComb = size(UniqueComb, 1);
%Count the number of occurrences of each combination
Count = accumarray(Ind, 1);
All unique sequential two element combinations are now stored in UniqueComb, while the corresponding counts for each unique combination are stored in Count.
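Applied to the vector from the question, the same steps give the counts listed there. This is a minimal sketch in which X is written as a column vector (as assumed above) and pairs is an illustrative name for the matrix of consecutive pairs.

```matlab
X = [1 2 3 4 5 1 2].';                          % column vector from the question

pairs = [X(1:end-1), X(2:end)];                 % every consecutive pair, one per row
[UniqueComb, ~, Ind] = unique(pairs, 'rows');   % distinct pairs and an index into them
Count = accumarray(Ind, 1);                     % occurrences of each distinct pair

disp([UniqueComb, Count]);                      % columns: first number, second number, count
```

UniqueComb comes out as [1 2; 2 3; 3 4; 4 5; 5 1] and Count as [2; 1; 1; 1; 1].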