Grouping unique values in a vector and putting them in a matrix - matlab

I have a vector that contains repeated numbers like so:
[1 1 1 1 5 5 5 5 93 93 93 6 6 6 6 6 6] and so on. What I want to do is to group the similar values (1's, 5's, etc.). I would like to have each of the unique values in a row of a big matrix, like:
[ 1 1 1 1 0 0
5 5 5 5 0 0
93 93 93 0 0 0
6 6 6 6 6 6]
I don't know the maximum number of occurrences of a unique value, so it is OK to create an initial zero matrix with a large number of columns (one that I am sure exceeds the maximum number of occurrences of any unique value).
Any help is highly appreciated.

How about this?
A = [1 1 1 1 5 5 5 5 93 93 93 6 6 6 6 6 6];
[a,b] = hist(A,unique(A))
f = @(x) [ones(1,a(x)) zeros(1,max(a)-a(x))]
X = cell2mat( arrayfun(@(x) {f(x)*b(x)}, 1:numel(b) )' )
returns:
X =
1 1 1 1 0 0
5 5 5 5 0 0
6 6 6 6 6 6
93 93 93 0 0 0
I know the order is different, is that important? Otherwise:
n = hist(A,1:max(A)) % counts how often every number appears
a = unique(A,'stable') % gets all unique numbers in order of first appearance
n = n(a) % matches the counts to the unique numbers
f = @(x) [ones(1,n(x)) zeros(1,max(n)-n(x))] % creates the logical index
% vector for every single row
X = cell2mat( arrayfun(@(x) {f(x)*a(x)}, 1:numel(a) )' ) % fills the rows
Or, inspired by Luis Mendo's answer, a little shorter:
n = hist(A,1:max(A));
a = unique(A,'stable')
n = n(a)
X = repmat(a',1,max(n)).*bsxfun(@le, cumsum(ones(max(n),numel(n))), n)'
returns:
X =
1 1 1 1 0 0
5 5 5 5 0 0
93 93 93 0 0 0
6 6 6 6 6 6
For the bored people out there, there is a one-line solution:
X = getfield(cell2mat(arrayfun(@(x,y) padarray( padarray(x,[0 y],'replicate','pre'),[0 max(hist(A,1:max(A)))-y],'post'),1:max(A),hist(A,1:max(A)),'uni',0)'),{unique(A,'stable'),2:1+max(hist(A,1:max(A)))})
Or an almost lovely two-liner:
n = hist(A,1:max(A))
X = getfield(cell2mat(arrayfun(@(x,y) padarray( padarray(x,[0 y],'replicate',...
'pre'),[0 max(n)-y],'post'),1:max(A),n,'uni',0)'),...
{unique(A,'stable'),2:1+max(n)})
just for fun ;)

Vectorized solution (no loops):
x = [1 1 1 1 5 5 5 5 93 93 93 6 6 6 6 6 6]; %// data
ind = [find(diff(x)) numel(x)]; %// end of each run of equal values
values = x(ind); %// unique values (maintaining order)
count = diff([0 ind]); %// count of each value
result = bsxfun(@le, meshgrid(1:max(count),1:numel(values)), count.'); %'// mask
result = bsxfun(@times, result, values.'); %'// fill with the values
EDIT:
Alternative procedure that avoids the second bsxfun:
x = [1 1 1 1 5 5 5 5 93 93 93 6 6 6 6 6 6]; %// data
ind = [find(diff(x)) numel(x)];
values = x(ind); %// unique values (maintaining order)
count = diff([0 ind]); %// count of each value
mask = bsxfun(@le, ndgrid(1:max(count),1:numel(values)), count);
result = zeros(size(mask)); %// pre-allocate and pre-shape (transposed) result
result(mask) = x; %// fill in values
result = result.';
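The mask idea above can also be written without bsxfun. The following is a minimal sketch, assuming MATLAB R2016b or later (or Octave), where relational and arithmetic operators expand singleton dimensions implicitly. It reuses the variable names from the answer and, like it, assumes that equal values sit next to each other in x.

```matlab
x = [1 1 1 1 5 5 5 5 93 93 93 6 6 6 6 6 6];  % data, equal values contiguous
ind = [find(diff(x)) numel(x)];               % end of each run of equal values
values = x(ind);                              % unique values, original order
count = diff([0 ind]);                        % length of each run
% Row r of the mask is true in its first count(r) columns.
% (1:max(count)) is 1-by-C and count.' is R-by-1, so <= expands to R-by-C.
mask = (1:max(count)) <= count.';
% Multiply each mask row by its value; false entries stay 0.
result = mask .* values.';
```

On older releases the two expanded operations have to go through bsxfun, as in the answer.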

This could be one approach -
%%// Input
array1 = [1 1 1 1 5 5 5 5 93 93 93 6 6 6 6 6 6];
%// Main Processing
id = unique(array1,'stable'); %//Find the unique numbers/IDs
mat1 = zeros(numel(id),nnz(array1==mode(array1))); %%// Create a matrix to hold the final result
for k=1:numel(id)
extent_each_id = nnz(array1==id(k)); %%// Count of no. of occurrences for each ID
mat1(k,1:extent_each_id)=id(k); %%// Starting from the left to the extent for each ID store that ID
end
Gives -
mat1 =
1 1 1 1 0 0
5 5 5 5 0 0
93 93 93 0 0 0
6 6 6 6 6 6

Related

Insert certain value after occurrence of a set of n equal values

Example:
input = [1 255 0 0 0 9 9 9 1 6 6 6 6 6 6 1]; % array of numbers (uint8)
output = [1 255 0 0 0 255 9 9 9 255 1 6 6 6 255 6 6 6 255 1];
% output must have 255 inserted at positions 6, 10, 15, 19
% because 0, 9, 6, 6 have occurred three times respectively
outputIndex = [6 10 15 19];
% outputIndex must indicate the positions where 255 was inserted
This could be one vectorized approach to get things done efficiently -
%// Input
A = [1 255 0 0 0 9 9 9 1 6 6 6 6 6 6 1]
%// Input parameter (how many times a value must be repeated for detection)
search_count = 3;
%// Find difference between consecutive elements and set all non zero
%// differences as ones, otherwise as zeros in a binary array
diffA = diff(A)~=0
%// Find start and end indices of "islands" of same value
starts = strfind([1 diffA],[1 zeros(1,search_count-1)])
ends = strfind([diffA 1],[zeros(1,search_count-1) 1])+search_count
%// For each island of same valued elements, find out where first group ends
firstgrp = starts + search_count
%// Find how many times a group of that same value of search_count times repeats
%// within each "island" of same valued elements. Also get the max repeats.
pattern_repeats = floor((ends - starts)./search_count)
max_repeat = max(pattern_repeats)
%// All possible repeat indices within all islands
all_repeats = bsxfun(@plus,firstgrp,[0:max_repeat-1]'*(search_count)) %//'
%// Use a binary mask to select only those repeats allowed with pattern_repeat
out_idx = all_repeats(bsxfun(@lt,[0:max_repeat-1]',pattern_repeats)) %//'
out_idx = out_idx + [0:numel(out_idx)-1]' %//'
%// Create output array, insert 255 at out_idx locations and put values
%// from input array into rest of the locations
out = zeros(1,numel(A)+numel(out_idx));
out(out_idx) = 255
out(out==0) = A
Code run -
>> A
A =
Columns 1 through 13
1 255 0 0 0 9 9 9 1 6 6 6 6
Columns 14 through 16
6 6 1
>> out_idx
out_idx =
6
10
15
19
>> out
out =
Columns 1 through 13
1 255 0 0 0 255 9 9 9 255 1 6 6
Columns 14 through 20
6 255 6 6 6 255 1
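For checking a vectorized result like the one above, a plain loop is easier to reason about. This is only a reference sketch, not the answer's method; the names runLen and marker are made up for the illustration, and it grows the output inside the loop, which is slow for long inputs.

```matlab
A = [1 255 0 0 0 9 9 9 1 6 6 6 6 6 6 1];
n = 3;            % run length that triggers an insertion
marker = 255;     % value to insert (hypothetical name)
out = [];         % output vector, grown as we go
out_idx = [];     % positions in out where marker was inserted
runLen = 0;       % length of the current run of equal values
for k = 1:numel(A)
    if k > 1 && A(k) == A(k-1)
        runLen = runLen + 1;
    else
        runLen = 1;
    end
    out(end+1) = A(k);
    if runLen == n
        out(end+1) = marker;         % insert after the n-th equal value
        out_idx(end+1) = numel(out);
        runLen = 0;                  % start counting the next group afresh
    end
end
```

For the example input this reproduces the positions 6, 10, 15 and 19 requested in the question.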
I don't understand the downvotes, it's actually an interesting question.
Here is the long answer:
n = 3;
subst = 255;
input = [1 255 0 0 0 9 9 9 1 6 6 6 6 6 6 61];
%// mask
X = NaN(1,numel(input));
%// something complicated (see below)
X(filter(ones(1,n-1),1,~([0. diff(input)])) == n-1) = 1;
%// loop to split multiple occurrences of n-tuples
for ii = 1:numel(input)
if X(ii) == 1 && ii < numel(X)-n+1
X(ii+1:ii+n-1) = NaN(1,n-1);
end
end
%// output vector
D = [input; X.*subst];
E = D(:);
output = E(isfinite(E))
%// indices of inserted 255
D = [input.*0; X.*subst];
E = D(:);
outputIndex = find(E(isfinite(E)))
Explanation of the complicated part:
%// finite differences of input
A = [0 diff(input)];
%// conversion to logical
B = ~A;
%// mask consecutive values
mask = filter(ones(1,n-1),1,B) == n-1;
%// set masked values to 1
X(mask) = 1;
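To see what the filter step produces, here is a small worked sketch on the question's data. The variable names vec, same and runSum are chosen for the illustration (vec avoids shadowing the built-in input function), and the logical vector is converted to double before filtering as a precaution.

```matlab
vec = [1 255 0 0 0 9 9 9 1 6 6 6 6 6 6 1];
n = 3;
% 1 where an element equals its predecessor; the padded first entry is also 1
same = double([0 diff(vec)] == 0);
% moving sum over the last n-1 entries of "same"
runSum = filter(ones(1, n-1), 1, same);
% positions that end at least n equal values in a row
mask = runSum == n-1;
positions = find(mask);
```

positions comes out as 5, 8, 12, 13, 14 and 15: every element of the long run of 6s from its third member onward is flagged, which is why the loop in the answer thins the flags out to every n-th one. Because the padded first entry of same is 1, an input that starts with two equal values would be flagged one element early; that edge case does not occur in this example.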
If you have the Image Processing Toolbox, you can avoid the loop with this fancy one-liner for getting the mask:
mask = accumarray(bwlabel(filter(ones(1,n-1),1,~([0. diff(input)])) == n-1).'+1,1:numel(input),[],@(x) {getfield(sort(x),{find(mod(cumsum(1:numel(x)),n) == 1)})});
X = NaN(1,numel(input));
X(vertcat(mask{2:end})) = subst;
%// output vector
D = [input; X];
E = D(:);
output = E(isfinite(E))
%// indices of inserted 255
D = [input.*0; X];
E = D(:);
outputIndex = find(E(isfinite(E)))

Matlab: How I can make this transformation on the matrix A?

I have a matrix A 4x10000, I want to use it to find another matrix C.
I'll simplify my problem with a simple example:
from a matrix A
20 4 4 74 20 20
36 1 1 11 36 36
77 1 1 15 77 77
3 4 2 6 7 8
First, I want to build an intermediate entity B:
               2  3  4  6  7  8
[20 36 77]     0  1  0  0  1  1    3
[4 1 1]        1  0  1  0  0  0    2
[74 11 15]     0  0  0  1  0  0    1
An entry is 1 if the value in the header row (taken from the last row of A) and the vector on the left together form a column of A.
The last column of B is the number of 1s in each row.
Finally, I want a matrix C made up of the vectors on the left of B, but only those whose count of 1s is greater than or equal to 2.
For my example:
    20  4
C = 36  1
    77  1
N.B: for my problem, I use a matrix A 4x10000
See if this works for you -
%// We need to replace this as it's not available in your old version of MATLAB:
%// [unqcols,~,col_match] = unique(A(1:end-1,:).','rows','stable') %//'
A1 = A(1:end-1,:).'; %//'
[unqmat_notinorder,row_ind,labels] = unique(A1,'rows');
[tmp_sortedval,ordered_ind] = sort(row_ind);
unqcols = unqmat_notinorder(ordered_ind,:);
[tmp_matches,col_match] = ismember(labels,ordered_ind);
%// OR use - "[tmp2,col_match] = ismember(A1,unqcols,'rows');"
C = unqcols(sum(bsxfun(@eq,col_match,1:max(col_match)),1)>=2,:).'; %//'
%// OR use - "C = unqcols(accumarray(col_match,ones(1,numel(col_match)))>=2,:).'"
This should work:
[a,~,c] = unique(A(1:end-1,:).', 'rows', 'stable');
C=a(histc(c,unique(c))>=2, :).';
Edit: For older versions of MATLAB:
D=A(1:end-1,:);
C=unique(D(:,squeeze(sum(all(bsxfun(@eq, D, permute(D, [1 3 2])))))>=2).', 'rows').';
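As a worked check on the question's small example, the counting step can be sketched with accumarray. This assumes a MATLAB version (or Octave) whose unique supports the 'stable' option, and, like the answers above, it counts how many columns of A share the same first three entries, taking for granted that the last-row values within such a group are distinct. The names cols, label and counts are chosen for the illustration.

```matlab
A = [20  4  4 74 20 20
     36  1  1 11 36 36
     77  1  1 15 77 77
      3  4  2  6  7  8];
% One row per distinct column of A(1:end-1,:), in order of first appearance;
% label(j) tells which distinct column the j-th column of A matches.
[cols, ~, label] = unique(A(1:end-1,:).', 'rows', 'stable');
% Number of columns of A that fall into each group
counts = accumarray(label(:), 1);
% Keep the groups that occur at least twice, back in column orientation
C = cols(counts >= 2, :).';
```

For this A the groups occur 3, 2 and 1 times, so C keeps the first two vectors.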

Matlab: index array by Hamming Weights

a contains indices and their occurrence counts. I need to re-index it by Hamming weight, so that the counts of indices with equal Hamming weight are summed. How can I do this Hamming-weight indexing? Is there a ready-made command for it in MATLAB?
>> a=[1,2;2,3;5,2;10,1;12,2;13,8]
1 2
2 3
5 2
10 1
12 2
13 8
>> dec2bin(a(:,1))
ans =
0001
0010
0101
1010
1100
1101
Goal: index things by Hamming weight
HW Count
1 5 (=2+3)
2 5 (=2+1+2)
3 8 (=8)
You can do it as follows:
a = [1,2;2,3;5,2;10,1;12,2;13,8]
The following line needs to be added so that a Hamming weight of zero is also considered:
if nnz(a(:,1)) == numel(a(:,1)); a = [0,0;a]; end
% or just
a = [0,0;a]; %// wouldn't change the result
to get the indices
rowidx = sum( de2bi(a(:,1)), 2 )
to get the sums
sums = accumarray( rowidx+1, a(:,2) ) %// +1 to consider a Hamming weight of zero
to get the Hamming weight vector
HW = unique(rowidx)
returns:
rowidx =
1
1
2
2
2
3
sums =
5
5
8
and all together:
result = [HW, sums]
%or
result = [unique(rowidx), accumarray(rowidx+1,a(:,2))]
result =
0 0
1 5
2 5
3 8
If you are bothered by the 0 0 line, filter it out
result(sum(result,2)>0,:)
The result for a = [0,2;2,3;5,2;10,1;12,2;13,8] would be:
result =
0 2
1 3
2 5
3 8
Try this -
a = [1 2
2 3
5 2
10 1
12 2
13 8]
HW = dec2bin(a(:,1)) - '0';
out = accumarray(sum(HW,2), a(:,2), [], @sum);%%// You don't need that "sum" option it seems, as that's the default operation with accumarray
final_out = [unique(sum(HW,2)) out]
Output -
a =
1 2
2 3
5 2
10 1
12 2
13 8
final_out =
1 5
2 5
3 8
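Both answers pair unique(...) with an accumarray output indexed directly by the weight, which only lines up when every weight from the smallest to the largest actually occurs. A minimal sketch that avoids this, assuming only base MATLAB (dec2bin, unique, accumarray), groups by the third output of unique instead; the names hw, weights and grp are chosen for the illustration.

```matlab
a = [ 1 2
      2 3
      5 2
     10 1
     12 2
     13 8];
% Hamming weight of each index: count the '1' characters in its binary form
hw = sum(dec2bin(a(:,1)) - '0', 2);
% weights: distinct Hamming weights; grp(i): which of them row i belongs to
[weights, ~, grp] = unique(hw);
% Sum the counts of all rows that share a Hamming weight
counts = accumarray(grp(:), a(:,2));
result = [weights counts];
```

Because grp runs over 1..numel(weights) without gaps, result always has one row per weight that is present, including a weight of zero if index 0 appears.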

Greatest values in a matrix, row by row - matlab

I have an m-by-n matrix. For each row, I want to find the positions of the k greatest values and set the others to 0.
Example, for k=2
                  I WANT
[1 2 3 5          [0 0 3 5
 4 5 9 3           0 5 9 0
 2 6 7 1]          0 6 7 0]
You can achieve it easily using the second output of sort:
data = [ 1 2 3 5
4 5 9 3
2 6 7 1 ];
k = 2;
[M N] = size(data);
[~, ind] = sort(data,2);
data(repmat((1:M).',1,N-k) + (ind(:,1:N-k)-1)*M) = 0;
In the example, this gives
>> data
data =
0 0 3 5
0 5 9 0
0 6 7 0
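The linear-index arithmetic above can be spelled out with sub2ind. The following sketch is a variant, not the answer's exact code: it sorts in descending order and copies the k largest entries of each row into a zero matrix, rather than zeroing the N-k smallest in place. The names rows, keep and out are chosen for the illustration.

```matlab
data = [1 2 3 5
        4 5 9 3
        2 6 7 1];
k = 2;
[M, N] = size(data);
% ind(r,:) lists the column numbers of row r from largest to smallest value
[~, ind] = sort(data, 2, 'descend');
% Row numbers matching the first k columns of ind
rows = repmat((1:M).', 1, k);
% Linear indices of the k largest entries in every row
keep = sub2ind([M N], rows, ind(:, 1:k));
out = zeros(M, N);
out(keep) = data(keep);
```

Since exactly k positions per row are selected from the sort order, ties do not change how many entries survive in a row.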
You can use the prctile command to find a threshold for each row.
prctile returns percentiles of the values in each row of data, so it can easily be tuned to return the threshold above which the k largest elements of each row lie:
T = prctile( data, 100*(1 - k/size(data,2)), 2 ); % find the threshold
out = bsxfun(@gt, data, T) .* data; % set lower than T to zero
For the data matrix posted in the question we get
>> out
out =
0 0 3 5
0 5 9 0
0 6 7 0

Matlab: sorting a vector by the number of time each unique value occurs

We have, e.g., i = 1:25 iterations.
Each iteration result is a 1xlength(N) cell array, where 0<=N<=25.
iteration 1: 4 5 9 10 20
iteration 2: 3 8 9 13 14 6
...
iteration 25: 1 2 3
We collect the results of all iterations into one matrix, sorted in descending order by how often each value occurs, as in this example:
Matrix=
Columns 1 through 13
16 22 19 25 2 5 8 14 17 21 3 12 13
6 5 4 4 3 3 3 3 3 3 2 2 2
Columns 14 through 23
18 20 1 6 7 9 10 11 15 23
2 2 1 1 1 1 1 1 1 1
Result explanation: Column 1: N == 16 is present in 6 iterations, column 2: N == 22 is present in 5 iterations etc.
If a number N does not appear in any iteration (in this example N == 4 and N == 24), it is not listed with a frequency of zero either.
I want to associate each N with the first iteration i in which it appears, e.g. N == 9 should be associated only with the first iteration i = 1 and not with i = 2 as well, N == 3 only with i = 2 and not with i = 25 as well, and so on, until every N is associated with a unique i.
Thank you in advance.
Here's a way that uses a feature of unique (i.e. that it returns the index to the first value) that was introduced in R2012a
%# make some sample data
iteration{1} = [1 2 4 6];
iteration{2} = [1 3 6];
iteration{3} = [1 2 3 4 5 6];
nIter= length(iteration);
%# create an index vector so we can associate N's with iterations
nn = cellfun(@numel,iteration);
idx = zeros(1,sum(nn));
idx([1,cumsum(nn(1:end-1))+1]) = 1;
idx = cumsum(idx); %# has 4 ones, 3 twos, 6 threes
%# create a vector of the same length as idx with all the N's
nVec = cat(2,iteration{:});
%# run `unique` on the vector to identify the first occurrence of each N
[~,firstIdx] = unique(nVec,'first');
%# create a "cleanIteration" array, where each N only appears once
cleanIter = accumarray(idx(firstIdx)',firstIdx',[nIter,1],@(x){sort(nVec(x))},{});
cleanIter =
[1x4 double]
[ 3]
[ 5]
>> cleanIter{1}
ans =
1 2 4 6
Here is another solution using accumarray. Explanations in the comments
% example data (from your question)
iteration{1} = [4 5 9 10 20 ];
iteration{2} = [3 8 9 13 14 6];
iteration{3} = [1 2 3];
niterations = length(iteration);
% create iteration numbers
% same as Jonas did in the first part of his code, but using a short loop
for i=1:niterations
idx{i} = i*ones(size(iteration{i}));
end
% count occurrences of values from all iterations
% sort them in descending order
occurences = accumarray([iteration{:}]', 1);
[occ val] = sort(occurences, 1, 'descend');
% remove zero occurrences and create the Matrix
nonzero = find(occ);
Matrix = [val(nonzero) occ(nonzero)]'
Matrix =
3 9 1 2 4 5 6 8 10 13 14 20
2 2 1 1 1 1 1 1 1 1 1 1
% find minimum iteration number for all occurrences
% again, using accumarray with @min function
assoc = accumarray([iteration{:}]', [idx{:}]', [], @min);
nonzero = find(assoc);
result = [nonzero assoc(nonzero)]'
result =
1 2 3 4 5 6 8 9 10 13 14 20
3 3 2 1 1 2 2 1 1 2 2 1
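The frequency table can also be built without allocating a slot for every integer up to the largest value. A minimal sketch, assuming the same three example iterations and using unique to compress the values first; the names allVals, vals, grp, freq and order are chosen for the illustration.

```matlab
iteration = {[4 5 9 10 20], [3 8 9 13 14 6], [1 2 3]};
allVals = [iteration{:}];            % all N's from all iterations in one row
% vals: distinct values (ascending); grp(j): which of them allVals(j) is
[vals, ~, grp] = unique(allVals);
% How often each distinct value occurs
freq = accumarray(grp(:), 1);
% Most frequent first
[freqSorted, order] = sort(freq, 'descend');
% First row: values, second row: their frequencies
Matrix = [vals(order); freqSorted.'];
```

Values that never occur are absent by construction, so no filtering of zero counts is needed.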