I have written an algorithm that finds the local maxima and minima in a signal.
[id_max, id_min] = find_max_min(signal);
I would now like to check whether the alternation of maxima and minima is respected,
i.e. id_max(1)<id_min(1)<id_max(2)<id_min(2)<...
The sequence could also start with a minimum; this is not known in advance.
Suppose that:
id_max = [1 3 5 7 10 14 20];
id_min = [2 4 6 8 16 19];
I would like to have two vectors, missing_max and missing_min, indicating the locations of the missing maxima and minima.
A missing maximum (minimum) occurs when there is no maximum (minimum) between two consecutive minima (maxima) in id_min (id_max).
In this example a maximum is missing in the 7th position of id_max, because id_min contains two consecutive values (16 19) without a maximum between them.
Then we have
missing_max = [7]
missing_min = [5]
since
id_max = [1 3 5 7 10 14 X 20];
id_min = [2 4 6 8 X 16 19]; (with X I marked the missing values)
If the alternation is correct the vectors should be empty. Can you suggest an efficient way to do that without for loops?
Thanks in advance
Here's a script that you can adapt to a function if you want:
id_max = [1 3 5 7 10 14 20];
id_min = [2 4 6 8 16 19];
% Group all values, codify extremity (1-max, 0-min), and position
id_all = [ id_max, id_min ];
code_all = [ones(size(id_max)), zeros(size(id_min))];
posn_all = [ 1:numel(id_max), 1:numel(id_min) ];
% Reshuffle the codes and positions according to sorted IDs of min/max
[~, ix] = sort(id_all);
code_all = code_all(ix);
posn_all = posn_all(ix);
% Find adjacent IDs that have the same code, i.e. code diff = 0
code_diff = (diff(code_all)==0);
% Get the indices of same-code neighbors, and their original positions
ix_missing_min = find([code_diff,false] & (code_all==1));
ix_missing_max = find([code_diff,false] & (code_all==0));
missing_min = posn_all(ix_missing_min+1);
missing_max = posn_all(ix_missing_max+1);
Caveats on IDs:
Make sure your id_min and id_max are rows (even if empty);
Make sure that at least one of them is not empty;
They need not be sorted, but their values must be unique (within each ID vector and across both).
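Before computing the missing positions, a quick check on the sorted codes already tells whether the alternation holds at all. This is a minimal sketch built on the same merge-and-sort idea as the script above; the names code_sorted and is_alternating are my own and do not appear in the original script.

```matlab
id_max = [1 3 5 7 10 14 20];
id_min = [2 4 6 8 16 19];

% Merge both ID lists and tag each entry (1 = max, 0 = min)
id_all   = [id_max, id_min];
code_all = [ones(size(id_max)), zeros(size(id_min))];

% Order the tags by ID
[~, ix]     = sort(id_all);
code_sorted = code_all(ix);

% The alternation holds only if no two neighbours carry the same tag
is_alternating = all(diff(code_sorted) ~= 0);   % false for this example
```

The check works regardless of whether the sequence starts with a maximum or a minimum.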
Later edit:
New version of the code, based on new explanations about the definition:
id_max = [1 3 5 7 10 14 20];
id_min = [2 4 6 8 16 19];
%id_max = [12 14]
%id_min = [2 4 6 8 10];
id_min_ext = [-Inf, id_min];
id_max_ext = [-Inf, id_max];
% Group all values, and codify their extremity (1-max, 0-min), and position
id_all = [ id_max_ext, id_min_ext ];
code_all = [ones(size(id_max_ext)), zeros(size(id_min_ext))];
posn_all = [ 0:numel(id_max), 0:numel(id_min) ];
% Reshuffle the codes and position according to sorted positions of min/max
[~, ix] = sort(id_all);
code_all = code_all(ix);
posn_all = posn_all(ix);
% Find adjacent IDs that have the same code, i.e. code diff = 0
code_diff = (diff(code_all)==0);
% Get the indices of same-code neighbours, and their original positions
ix_missing_min = find([code_diff,false] & (code_all==1));
ix_missing_max = find([code_diff,false] & (code_all==0));
missing_min = unique(posn_all(ix_missing_min-1))+1;
missing_max = unique(posn_all(ix_missing_max-1))+1;
However, the code contains a subtle bug. It will be removed either by the person who asked the question, or by me once he/she improves the question so that it is really clear what is being asked. :-) Because there are two virtual extrema (one max and one min, at ID = −∞), it is possible that the first missing extremum is marked twice: once at −∞ and once at the first element of the ID list. unique() takes care of that (though it is a heavy function call just to check whether the first two elements of an array have the same value).
This question is motivated by a very specific combinatorial optimization problem, where the search space is defined as the space of permuted subsets of an unsorted vector of discrete values with multiplicities.
I am looking for an efficient (fast enough, vectorized, or otherwise more clever) function that finds the indices of subsets in the following manner:
t = [1 1 3 2 2 2 3 ]
is an unsorted vector of all possible values, including their multiplicities.
item = [2 3 1; 2 1 2; 3 1 1; 1 3 3]
is a list of permuted subsets of vector t.
I need to find, for each subset in item, the list of indices into t that correspond to its values. So, for the above example we have:
item =
2 3 1
2 1 2
3 1 1
1 3 3
t =
1 1 3 2 2 2 3
ind = item2ind(item,t)
ind =
4 3 1
4 1 5
3 1 2
1 3 7
So, for item = [2 3 1] we get ind = [4 3 1], which means that:
the first value "2" in item corresponds to the first "2" in t, at position 4,
the second value "3" in item corresponds to the first "3" in t, at position 3, and
the third value "1" in item corresponds to the first "1" in t, at position 1.
In the case item = [2 1 2] we get ind = [4 1 5], which means that:
the first value "2" in item corresponds to the first "2" in t, at position 4,
the second value "1" in item corresponds to the first "1" in t, at position 1, and
the third value "2" in item corresponds to the second(!!!) "2" in t, at position 5.
For
item = [1 1 1]
no solution exists, because vector t contains only two "1"s.
My current version of the function item2ind is trivial serial code, which can easily be parallelized by changing the outer "for" loop to "parfor":
function ind = item2ind(item,t)
    [nlp,N] = size(item);
    ind = zeros(nlp,N);
    for i = 1:nlp
        auxitem = item(i,:);
        auxt = t;
        for j = 1:N
            % first position in t that still holds the requested value
            I = find(auxitem(j) == auxt,1,'first');
            if ~isempty(I)
                auxt(I) = 0;   % mark this position as used
                ind(i,j) = I;
            else
                error('Incompatible content of item and t.');
            end
        end
    end
end
But I need something definitely more clever ... and faster:)
Test case for larger input data:
t = 1:10; % 10 unique values at vector t
t = repmat(t,1,5); % unsorted vector t with multiplicity of all unique values 5
nlp = 100000; % number of item rows
[~,p] = sort(rand(nlp,length(t)),2); % 100000 random permutations
item = t(p); % transform permutations to items
item = item(:,1:30); % transform item to shorter subset
tic;ind = item2ind(item,t);toc % running and timing of the original function
tic;ind_ = item2ind_new(item,t);toc % running and timing of the new function
isequal(ind,ind_) % comparison of solutions
To vectorize the code, I have assumed that the error case is not present. Such rows should be discarded first, with a simple procedure I will present below.
Method
First, let's compute the indexes of all elements in t:
t = t(:);
mct = max(accumarray(t,1));
G = accumarray(t,1:length(t),[],@(x) {sort(x)});
G = cellfun(@(x) padarray(x.',[0 mct-length(x)],0,'post'), G, 'UniformOutput', false);
G = vertcat(G{:});
Explanation: after reshaping the input into a column vector, we compute the maximum number of occurrences of any value in t using accumarray. Next, we collect the indexes of every value. This gives a cell array, since the values need not all occur the same number of times. To form a matrix, we pad each array independently to the maximum length (named mct), and then concatenate the cell array into a matrix. For the larger test case above, we have at this step:
G =
1 11 21 31 41
2 12 22 32 42
3 13 23 33 43
4 14 24 34 44
5 15 25 35 45
6 16 26 36 46
7 17 27 37 47
8 18 28 38 48
9 19 29 39 49
10 20 30 40 50
Now, we process item. For that, let's figure out how to compute, for each element of a vector, the running count of occurrences of its value. For example, if I have:
A = [1 1 3 2 2 2 3];
then I want to get:
B = [1 2 1 1 2 3 2];
Thanks to implicit expansion, we can have it in one line:
B = diag(cumsum(A==A'));
As easy as this. The syntax A==A' expands into a matrix where each element is A(i)==A(j). Taking the cumulative sum along one dimension only and then extracting the diagonal gives the desired result: each column is the cumulative count of occurrences of one value.
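The trick can be checked step by step on the vector A from above. This is only an illustrative sketch: the intermediate names M and S are mine, and implicit expansion requires MATLAB R2016b or later.

```matlab
A = [1 1 3 2 2 2 3];

% M(i,j) is true where A(i) == A(j)  (7-by-7, via implicit expansion)
M = (A == A');

% S(k,j) counts how many of A(1..k) equal A(j)
S = cumsum(M, 1);

% On the diagonal, S(k,k) counts how many of A(1..k) equal A(k)
B = diag(S).';   % transposed so that B is a row like A
% B is [1 2 1 1 2 3 2]
```

Note that diag returns a column vector, hence the transpose at the end.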
To use this trick with item, which is 2-D, we need a 3-D array. Let m=size(item,1) and n=size(item,2). Then:
C = cumsum(reshape(item,m,1,n)==item,3);
is a (big) 3-D array of all cumulative occurrence counts. The last step is to select the entries that lie on the diagonal along dimensions 2 and 3:
ia = C(sub2ind(size(C),repelem((1:m).',1,n),repelem(1:n,m,1),repelem(1:n,m,1)));
Now, with all these matrices, indexing is easy:
ind = G(sub2ind(size(G),item,ia));
Finally, let's recap the code of the function:
function ind = item2ind_new(item,t)
t = t(:);
[m,n] = size(item);
mct = max(accumarray(t,1));
G = accumarray(t,1:length(t),[],@(x) {sort(x)});
G = cellfun(@(x) padarray(x.',[0 mct-length(x)],0,'post'), G, 'UniformOutput', false);
G = vertcat(G{:});
C = cumsum(reshape(item,m,1,n)==item,3);
ia = C(sub2ind(size(C),repelem((1:m).',1,n),repelem(1:n,m,1),repelem(1:n,m,1)));
ind = G(sub2ind(size(G),item,ia));
Results
Running the provided script on an old 4-core machine, I get:
Elapsed time is 4.317914 seconds.
Elapsed time is 0.556803 seconds.
ans =
logical
1
The speed-up is substantial (nearly 8x), but so is the memory consumption (because of matrix C). I guess this part could be improved to save more memory.
EDIT
Generating ia this way can cost a lot of memory. One way to save memory is to build the array directly with a for-loop:
ia = zeros(size(item));
for i=unique(t(:)).'
    ia = ia+cumsum(item==i, 2).*(item==i);
end
In all cases, once you have ind, it's easy to test whether item contains an error with respect to t:
any(ind(:)==0)
A simple way to get the rows of item that are in error (as a mask) is then
min(ind,[],2)==0
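To see the pieces working together, here is a minimal sketch on the small example from the question, using the loop variant for ia. One deliberate deviation: instead of the accumarray/padarray construction above, the lookup table G is built here with a single accumarray call on (value, occurrence number) pairs, which avoids the Image Processing Toolbox dependency of padarray. The names occ_t and bad_rows are mine, and the sketch assumes the values in t are positive integers.

```matlab
t    = [1 1 3 2 2 2 3];
item = [2 3 1; 2 1 2; 3 1 1; 1 3 3];

% Occurrence number of each element of t (1st "2", 2nd "2", ...)
occ_t = diag(cumsum(t(:).' == t(:), 1));   % column vector

% Lookup table: G(v,k) = position in t of the k-th occurrence of value v,
% and 0 where v occurs fewer than k times
G = accumarray([t(:), occ_t], (1:numel(t)).');

% Occurrence number of each element within its own row of item
ia = zeros(size(item));
for v = unique(t)
    mask = (item == v);
    ia = ia + cumsum(mask, 2) .* mask;
end

% Look up the positions and flag rows that hit a padded (zero) entry
ind      = G(sub2ind(size(G), item, ia));
bad_rows = min(ind, [], 2) == 0;
```

For this input, ind reproduces the matrix given in the question and bad_rows is all false.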
Overview
An n×m matrix A and an n×1 vector Date are the inputs of the function S = sumdate(A,Date).
The function returns an n×m matrix S such that each row of S is the sum of all rows of A that share the same date.
For example, if
A = [1 2 7 3 7 3 4 1 9
6 4 3 0 -1 2 8 7 5]';
Date = [161012 161223 161223 170222 160801 170222 161012 161012 161012]';
Then I would expect the returned matrix S to be
S = [15 9 9 6 7 6 15 15 15;
26 7 7 2 -1 2 26 26 26]';
Because the elements Date(2) and Date(3) are the same, we have
S(2,1) and S(3,1) are both equal to the sum of A(2,1) and A(3,1)
S(2,2) and S(3,2) are both equal to the sum of A(2,2) and A(3,2).
Because the elements Date(1), Date(7), Date(8) and Date(9) are the same, we have
S(1,1), S(7,1), S(8,1), S(9,1) equal the sum of A(1,1), A(7,1), A(8,1), A(9,1)
S(1,2), S(7,2), S(8,2), S(9,2) equal the sum of A(1,2), A(7,2), A(8,2), A(9,2)
The same for S([4,6],1) and S([4,6],2)
Since the element Date(5) does not repeat, S(5,1) = A(5,1) = 7 and S(5,2) = A(5,2) = -1.
The code I have written so far
Here is my try on the code for this task.
function S = sumdate(A,Date)
    S = A; %Pre-assign S as a matrix in the same size of A.
    Dlist = unique(Date); %Sort out a non-repeating list from Date
    for J = 1 : length(Dlist)
        loc = (Date == Dlist(J)); %Compute a logical indexing vector for locating the J-th element in Dlist
        S(loc,:) = repmat(sum(S(loc,:)),sum(loc),1); %Replace the located rows of S by the sum of them
    end
end
I tested it on my computer using A and Date with these attributes:
size(A) = [33055 400];
size(Date) = [33055 1];
length(unique(Date)) = 2645;
It took my PC about 1.25 seconds to perform the task.
This task is performed hundreds of thousands of times in my project, so my code is too time-consuming. I think performance would improve considerably if I could eliminate the for-loop above.
I have found some built-in functions which do special types of sums like accumarray or cumsum, but I still do not have any ideas on how to eliminate the for-loop.
I would appreciate your help.
You can do this with accumarray, but you'll need to generate a set of row and column subscripts into A to do it. Here's how:
[~, ~, index] = unique(Date); % Get indices of unique dates
subs = [repmat(index, size(A, 2), 1) ... % repmat to create row subscript
repelem((1:size(A, 2)).', size(A, 1))]; % repelem to create column subscript
S = accumarray(subs, A(:)); % Reshape A into column vector for accumarray
S = S(index, :); % Use index to expand S to original size of A
S =
15 26
9 7
9 7
6 2
7 -1
6 2
15 26
15 26
15 26
Note #1: This will use more memory than your for loop solution (subs will have twice the number of elements as A), but may give you a significant speed-up.
Note #2: If you are using a version of MATLAB older than R2015a, you won't have repelem. Instead you can replace that line using kron (or one of the other solutions here):
kron((1:size(A, 2)).', ones(size(A, 1), 1))
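As a sanity check, the steps above can be assembled with the kron replacement and compared against the result expected in the question. This is just a sketch on the small example; the names n, m and S_expected are mine.

```matlab
A = [1 2 7 3 7 3 4 1 9
     6 4 3 0 -1 2 8 7 5]';
Date = [161012 161223 161223 170222 160801 170222 161012 161012 161012]';

[n, m] = size(A);
[~, ~, index] = unique(Date);               % group number of each row

% Row subscript = group number, column subscript = column of A
subs = [repmat(index, m, 1), ...
        kron((1:m).', ones(n, 1))];         % kron in place of repelem

S = accumarray(subs, A(:));                 % one row of sums per unique date
S = S(index, :);                            % expand back to n-by-m

S_expected = [15 9 9 6 7 6 15 15 15;
              26 7 7 2 -1 2 26 26 26]';
isequal(S, S_expected)                      % true
```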
I have a <206x193> matrix A. It contains the values of a parameter at 206 different locations at 193 time steps. I am interested in the maximum value at each location over all times as well as the corresponding indices. I have another matrix B with the same dimensions of A and I'm interested in values for each location at the time that A's value at that location was maximal.
I've tried [max_val pos] = max(A,[],2), which gives the right maximum values, but A(pos) does not equal max_val.
How exactly does this function work?
I tried a smaller example as well. Still I don't understand the meaning of the indices....
>> H
H(:,:,1) =
1 2
3 4
H(:,:,2) =
5 6
7 8
>> [val pos] = max(H,[],2)
val(:,:,1) =
2
4
val(:,:,2) =
6
8
pos(:,:,1) =
2
2
pos(:,:,2) =
2
2
The indices in idx (pos in your code) are the column index of the max value in each row. You can use sub2ind to create a linear index if you want to test whether A(pos) equals max_val:
A=rand(206, 193);
[max_val, idx]=max(A, [], 2);
A_max=A(sub2ind(size(A), (1:size(A,1))', idx));
Similarly, you can access the values of B with:
B_Amax=B(sub2ind(size(A), (1:size(A,1))', idx));
From your example:
H(:,:,2) =
5 6
7 8
[val pos] = max(H,[],2)
val(:,:,2) =
6
8
pos(:,:,2) =
2
2
pos(:,:,2) is [2; 2] because the maximum is at position 2 in both rows.
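A tiny example may make the difference between the returned column indices and linear indices concrete. The matrices below are made up for illustration only.

```matlab
A = [1 9 3;
     7 2 5];
B = [10 20 30;
     40 50 60];

[max_val, idx] = max(A, [], 2);      % max_val = [9; 7], idx = [2; 1]

% A(idx) treats idx as LINEAR indices, so it picks A(2) and A(1)
wrong = A(idx);                      % [7; 1], not the row maxima

% Combine each row number with its column index to get linear indices
lin    = sub2ind(size(A), (1:size(A,1))', idx);   % [3; 2]
A_max  = A(lin);                     % [9; 7], equals max_val
B_Amax = B(lin);                     % [20; 40]
```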
max is primarily intended for use with vectors. By default, even multi-dimensional arrays are treated as a series of vectors along which the max function is applied.
So, to get the values in B at each location at the time where A is maximal, you should
% find the maximum values and positions in A
[c,i] = max(A, [], 2);
% iterate along the first dimension to retrieve the corresponding values in B
C = zeros(1, size(A,1));
for k=1:size(A,1)
    C(k) = B(k,i(k));
end
You can refer to @Jigg's answer for a more concise way of creating C.
I want to raise a matrix element-wise to the power of another matrix and subtract one before taking the product.
e.g.
A = [2 3 5
2 3 0]
B = [2 2 1
1 2 0]
so prod(A.^B-1) would be:
first row (2^2-1)*(3^2-1)*(5^1-1)=96
second row (2^1-1)*(3^2-1)=8
and we would have prod(A.^B-1) = 96, 8. The trick is also to skip past the zeros. I keep getting zero or NaN; I think the zero is being included in the calculation as well.
Is there a way to code this?
This is the code I have in mind:
if A~=0 && B~=0
prod(A.^B-1)
end
You could do it like this using logical indexing to replace instances where A.^B-1 is 0:
A = [2 3 5;2 3 0];
B = [2 2 1;1 2 0];
C = A.^B-1;
C(C==0) = 1; % Replace zeros with ones
D = prod(C,2) % Product across the columns
which returns
D =
96
8
Provided that you remove the zeros, I don't think you should get NaN unless your original matrices contain it. However, you can replace it in the same manner as well (C(isnan(C)) = 1;).
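One thing to keep in mind is that C==0 also matches genuine factors that happen to be zero (for example 1^3-1). If the zeros in A are meant as "no entry here" markers, which is my reading of the question rather than something it states, a variant that masks on A instead could look like this:

```matlab
A = [2 3 5; 2 3 0];
B = [2 2 1; 1 2 0];

C = A.^B - 1;
C(A == 0) = 1;      % neutralise only the entries where A is a placeholder
D = prod(C, 2);     % D = [96; 8]
```

With this mask, a row containing a real zero factor still yields a product of zero.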
The expression is:
for i=1:n
X(:,i) = [P{i}(:)];
end
where X is a DxN matrix and P is a cell-array.
reshape(cat(3,P{:}),[numel(P{1}) n])
Of course, the above solution is just for fun. I would recommend profiling both solutions and only using this one if it has a significant performance advantage.
Maintenance and readability are also very important factors to consider when writing code.
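To convince yourself that the one-liner matches the loop, a quick comparison on a small made-up cell array can help. The sketch assumes all cells have the same size; X_loop and X_vec are names chosen here.

```matlab
P = {[1 2; 3 4], [7 8; 9 10], [11 12; 13 14]};
n = numel(P);

% Loop version: one flattened cell per column
X_loop = zeros(numel(P{1}), n);
for i = 1:n
    X_loop(:, i) = P{i}(:);
end

% Vectorized version: stack along the 3rd dimension, then flatten each page
X_vec = reshape(cat(3, P{:}), [numel(P{1}) n]);

isequal(X_loop, X_vec)   % true
```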
If you obtained the cell array via mat2cell, you may want to arrange blocks of an image into the columns of an array X. This can be achieved in a single step using the command IM2COL
%# rearrange the large array so that each column of X
%# corresponds to the 4 pixels of each 2-by-2 block
X = im2col(largeArray,[2 2],'distinct');
You might be able to get away with:
P{1} = [ 1 2; 3 4];
P{2} = [ 7 8; 9 10];
P{3} = [ 11 12; 13 14];
X = [P{:}]
X =
1 2 7 8 11 12
3 4 9 10 13 14
Then some sort of reshape() to get to where you want to be.
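One possible reshape, assuming every cell has the same number of elements and the goal is one flattened cell per column:

```matlab
P{1} = [ 1 2; 3 4];
P{2} = [ 7 8; 9 10];
P{3} = [ 11 12; 13 14];

% Horizontal concatenation keeps each cell's columns contiguous in memory,
% so reshaping to numel(P{1}) rows puts one cell in each column
X = reshape([P{:}], numel(P{1}), []);
% X =
%      1     7    11
%      3     9    13
%      2     8    12
%      4    10    14
```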