Vector of the occurence number - matlab

I have a vector a=[1 2 3 1 4 2 5]'
I am trying to create a new vector that would give for each row, the occurence number of the element in a. For instance, with this matrix, the result would be [1 1 1 2 1 2 1]': The fourth element is 2 because this is the first time that 1 is repeated.
The only way I can see to achieve that is by creating a zero vector whose number of rows would be the number of unique elements (here: c = [0 0 0 0 0] because I have 5 elements).
I also create a zero vector d of the same length as a. Then, going through the vector a, adding one to the row of c whose element we read and the corresponding number of c to the current row of d.
Can anyone think about something better?

This is a nice way of doing it
C=sum(triu(bsxfun(#eq,a,a.')))
My first suggestion was this, a not very nice for loop
for i=1:length(a)
F(i)=sum(a(1:i)==a(i));
end

This does what you want, without loops:
m = max(a);
aux = cumsum([ ones(1,m); bsxfun(#eq, a(:), 1:m) ]);
aux = (aux-1).*diff([ ones(1,m); aux ]);
result = sum(aux(2:end,:).');

My first thought:
M = cumsum(bsxfun(#eq,a,1:numel(a)));
v = M(sub2ind(size(M),1:numel(a),a'))

on a completely different level, you can look into tabulate to get info about the frequency of the values. For example:
tabulate([1 2 4 4 3 4])
Value Count Percent
1 1 16.67%
2 1 16.67%
3 1 16.67%
4 3 50.00%

Please note that the solutions proposed by David, chappjc and Luis Mendo are beautiful but cannot be used if the vector is big. In this case a couple of naïve approaches are:
% Big vector
a = randi(1e4, [1e5, 1]);
a1 = a;
a2 = a;
% Super-naive solution
tic
x = sort(a);
x = x([find(diff(x)); end]);
for hh = 1:size(x, 1)
inds = (a == x(hh));
a1(inds) = 1:sum(inds);
end
toc
% Other naive solution
tic
x = sort(a);
y(:, 1) = x([find(diff(x)); end]);
y(:, 2) = histc(x, y(:, 1));
for hh = 1:size(y, 1)
a2(a == y(hh, 1)) = 1:y(hh, 2);
end
toc
% The two solutions are of course equivalent:
all(a1(:) == a2(:))
Actually, now the question is: can we avoid the last loop? Maybe using arrayfun?

Related

Matlab index matrix by vector specifying column

Let A be of size [n,m], i.e. it has n rows and m columns. Given I of size [n,1] with max(I)<=m, what is the fastest way to return B of size [n,1], such that B(i)=A(i,I(i))?
Example:
A =
8 1 6
3 5 7
4 9 2
and
I =
1
2
2
I want B to look like
B =
8
5
9
There obviously exist several ways to implement this, but in my case n is in the order of 1e6 and m in the order of 1e2, which is why I'm interested in the fastest implementation. I would like to avoid ind2sub or sub2ind since they both appear to be too slow as well. Any idea is greatly appreciated! Thanks!
You can replicate the behavior of sub2ind yourself. This gives me a speedup in my test:
clear
%% small example
A = rand(4,6)
I = [3 2 2 1]
inds = (I-1)*size(A,1) + (1:length(I));
B = A(inds)
%% timing
n = 1e4;
m = 1e2;
A = rand(n, m);
I = ceil(rand(1,n) * m);
% sub2ind
F = #() A(sub2ind(size(A), 1:size(A,1), I));
timeit(F)
% manual
F = #() A((I-1)*size(A,1) + (1:length(I)));
timeit(F)
You can also use something like this:
A(meshgrid(1:size(A,2),1:size(A,1)) == repmat(I,1,size(A,2)))
which will give you the same result, with no loop and no sub2ind.

Eliminate/Remove duplicates from array Matlab

How can I remove any number that has duplicate from an array.
for example:
b =[ 1 1 2 3 3 5 6]
becomes
b =[ 2 5 6]
Use unique function to extract unique values then compute histogram of data for unique values and preserve those that have counts of 1.
a =[ 1 1 2 3 3 5 6];
u = unique(a)
idx = hist(a, u) ==1;
b = u(idx)
result
2 5 6
for multi column input this can be done:
a = [1 2; 1 2;1 3;2 1; 1 3; 3 5 ; 3 6; 5 9; 6 10] ;
[u ,~, uid] = unique(a,'rows');
idx = hist(uid,1:size(u,1))==1;
b= u(idx,:)
You can first sort your elements and afterwards remove all elements which have the same value as one of its neighbors as follows:
A_sorted = sort(A); % sort elements
A_diff = diff(A_sorted)~=0; % check if element is the different from the next one
A_unique = [A_diff true] & [true A_diff]; % check if element is different from previous and next one
A = A_sorted(A_unique); % obtain the unique elements.
Benchmark
I will benchmark my solution with the other provided solutions, i.e.:
using diff (my solution)
using hist (rahnema1)
using sum (Jean Logeart)
using unique (my alternative solution)
I will use two cases:
small problem (yours): A = [1 1 2 3 3 5 6];
larger problem
rng('default');
A= round(rand(1, 1000) * 300);
Result:
Small Large Comments
----------------|------------|------------%----------------
using `diff` | 6.4080e-06 | 6.2228e-05 % Fastest method for large problems
using `unique` | 6.1228e-05 | 2.1923e-04 % Good performance
using `sum` | 5.4352e-06 | 0.0020 % Only fast for small problems, preserves the original order
using `hist` | 8.4408e-05 | 1.5691e-04 % Good performance
My solution (using diff) is the fastest method for somewhat larger problems. The solution of Jean Logeart using sum is faster for small problems, but the slowest method for larger problems, while mine is almost equally fast for the small problem.
Conclusion: In general, my proposed solution using diff is the fastest method.
timeit(#() usingDiff(A))
timeit(#() usingUnique(A))
timeit(#() usingSum(A))
timeit(#() usingHist(A))
function A = usingDiff (A)
A_sorted = sort(A);
A_unique = [diff(A_sorted)~=0 true] & [true diff(A_sorted)~=0];
A = A_sorted(A_unique);
end
function A = usingUnique (A)
[~, ia1] = unique(A, 'first');
[~, ia2] = unique(A, 'last');
A = A(ia1(ia1 == ia2));
end
function A = usingSum (A)
A = A(sum(A==A') == 1);
end
function A = usingHist (A)
u = unique(A);
A = u(hist(A, u) ==1);
end

Permuting columns of a matrix in MATLAB

Say I have an n by d matrix A and I want to permute the entries of some columns. To do this, I compute permutations of 1 ... n as
idx1 = randperm(n)'
idx2 = randperm(n)'
Then I could do:
A(:,1) = A(idx1,1)
A(:,2) = A(idx2,2)
However, I dont want to do this using a for-loop, as it'll be slow. Say I have an n by d matrix A and an n by d index matrix IDX that specifies the permutations, is there a quicker equivalent of the following for-loop:
for i = 1:d
A(:,i) = A(IDX(:,i),i);
end
Using linear-indexing with the help of bsxfun -
[n,m] = size(A);
newA = A(bsxfun(#plus,IDX,[0:m-1]*n))
I guess another rather stupid way to do it is with cellfun, stupid because you have to convert it into a cell and then convert it back, but it is there anyways.
N=ones(d,1)*n; %//create a vector of d length with each element = n
M=num2cell(N); %//convert it into a cell
P=cellfun(#randperm, M,'uni',0); %//cellfun applys randperm to each cell
res = cell2mat(P); %//Convert result back into a matrix (since results are numeric).
This also allows randperm of type Char and String, but the cell2mat will not work for those cases, results are in Cell Array format instead.
for d = 5, n = 3:
>> res =
1 3 2
1 2 3
2 3 1
3 1 2
3 2 1

average number of different values in a column

I had a question in Matlab. It is so, I try to take average of the different number of values ​​in a column. For example, if we have the column below,
X = [1 1 2 3 4 3 8 2 1 3 5 6 7 7 5]
first I want to start by taking the average of 5 values ​​and plot them. In the case above, I should receive three averages that I could plot. Then take 10 values ​​at a time and so on.
I wonder if you have to write custom code to fix it.
The fastest way is probably to rearrange your initial vector X into some matrix, with each column storing the required values to average:
A = reshape(X, N, []);
where N is the desired number of rows in the new matrix, and the empty brackets ([]) tell MATLAB to calculate the number of columns automatically. Then you can average each column using mean:
X_avg = mean(A);
Vector X_avg stores the result. This can be done in one line like so:
X_avg = mean(reshape(X, N, []));
Note that the number of elements in X has to be divisible by N, otherwise you'll have to either pad it first (e.g with zeroes), or handle the "leftover" tail elements separately:
tail = mod(numel(X), N);
X_avg = mean(reshape(X(1:numel(X) - tail), N, [])); %// Compute average values
X_avg(end + 1) = mean(X(end - tail + 1:end)); %// Handle leftover elements
Later on you can put this code in a loop, computing and plotting the average values for a different value of N in each iteration.
Example #1
X = [1 1 2 3 4 3 8 2 1 3 5 6 7 7 5];
N = 5;
tail = mod(numel(X), N);
X_avg = mean(reshape(X(1:numel(X) - tail), N, []))
X_avg(end + 1) = mean(X(end - tail + 1:end))
The result is:
X_avg =
2.2000 3.4000 6.0000
Example #2
Here's another example (this time the length of X is not divisible by N):
X = [1 1 2 3 4 3 8 2 1 3 5 6 7 7 5];
N = 10;
tail = mod(numel(X), N);
X_avg = mean(reshape(X(1:numel(X) - tail), N, []))
X_avg(end + 1) = mean(X(end - tail + 1:end))
The result is:
X_avg =
2.8000 6.0000
This should do the trick:
For a selected N (the number of values you want to take the average of):
N = 5;
mean_vals = arrayfun(#(n) mean(X(n-1+(1:N))),1:N:length(X))
Note: This does not check if Index exceeds matrix dimensions.
If you want to skip the last numbers, this should work:
mean_vals = arrayfun(#(n) mean(X(n-1+(1:N))),1:N:(length(X)-mod(length(X),N)));
To add the remaining values:
if mod(length(X),N) ~= 0
mean_vals(end+1) = mean(X(numel(X)+1-mod(length(X),N):end))
end
UPDATE: This is a modification of Eitan's first answer (before it was edited). It uses nanmean(), which takes the mean of all values that are not NaN. So, instead of filling the remaining rows with zeros, fill them with NaN, and just take the mean.
X = [X(:); NaN(mod(N - numel(X), N), 1)];
X_avg = nanmean(reshape(X, N, []));
It would be helpful if you posted some code and point out exactly what is not working.
As a first pointer. If
X = [1 1 2 3 4 3 8 2 1 3 5 6 7 7 5]
the three means in blocks of 5 you are interested in are
mean(X(1:5))
mean(X(6:10))
mean(X(11:15))
You will have to come up with a for loop or maybe some other way to iterate through the indices.
I think you want something like this (I didn't use Matlab in a while, I hope the syntax is right):
X = [1 1 2 3 4 3 8 2 1 3 5 6 7 7 5],
currentAmount=5,
block=0,
while(numel(X)<=currentAmount)
while(numel(X)<=currentAmount+block*currentAmount)
mean(X(block*currentAmount+1:block*currentAmount+currentAmount));
block =block+1;
end;
currentAmount = currentAmount+5;
block=0;
end
This code will first loop through all elements calculating means of 5 elements at a time. Then, it will expand to 10 elements. Then to 15, and so on, until the number of elements from which you want to make the mean is bigger than the number of elements in the column.
If you are looking to average K random samples in your N-dimensional vector, then you could use:
N = length(X);
K = 20; % or 10, or 30, or any integer less than or equal to N
indices = randperm(N, K); % gives you K random indices from the range 1:N
result = mean(X(indices)); % averages the values of X at the K random
% indices from above
A slightly more compact form would be:
K = 20;
result = mean(X(randperm(length(X), K)));
If you are just looking to take every K consecutive samples from the list and average them then I am sure one of the previous answers will give you what you want.
If you need to do this operation a lot, it might be worth writing your own function for it. I would recommend using #EitanT's basic idea: pad the data, reshape, take mean of each column. However, rather than including the zero-padded numbers at the end, I recommend taking the average of the "straggling" data points separately:
function m = meanOfN(x, N)
% function m = meanOfN(x, N)
% create groups of N elements of vector x
% and return their mean
% if numel(x) is not a multiple of N, the last value returned
% will be for fewer than N elements
Nf = N * floor( numel( x ) / N ); % largest multiple of N <= length of x
xr = reshape( x( 1:Nf ), N, []);
m = mean(xr);
if Nf < N
m = [m mean( x( Nf + 1:end ) )];
end
This function will return exactly what you were asking for: in the case of a 15 element vector with N=5, it returns 3 values. When the size of the input vector is not a multiple of N, the last value returned will be the "mean of what is left".
Often when you need to take the mean of a set of numbers, it is the "running average" that is of interest. So rather than getting [mean(x(1:5)) mean(x(6:10)) mean(11:15))], you might want
m(1) = mean(x(1:N));
m(2) = mean(x(2:N+1));
m(3) = mean(x(3:N+2));
...etc
That could be achieved using a simple convolution of your data with a vector of ones; for completeness, here is a possible way of coding that:
function m = meansOfN(x, n)
% function m = meansOfN(x, n)
% taking the running mean of the values in x
% over n samples. Returns a row vector of size (sizeof(x) - n + 1)
% if numel(x) < n, this returns an empty matrix
mv = ones(N,1) / N; % vector of ones, normalized
m = convn(x(:), mv, 'valid'); % perform 1D convolution
With these two functions in your path (save them in a file called meanOfN.m and meansOfN.m respectively), you can do anything you want. In any program you will be able to write
myMeans = meanOfN(1:30, 5);
myMeans2 = meansOfN(1:30, 6);
etc. Matlab will find the function, perform the calculation, return the result. Writing your custom functions for specific operations like this can be very helpful - not only does it keep your code clean, but you only have to test the function once...

Matlab swap

I am trying to create a function that will swap a specific number in a matrix with a specific number in the same matrix. For examlpe, if I start with A = [1 2 3;1 3 2], I want to be able to create B = [2 1 3; 2 3 1], simply by telling matlab to swap the 1's with the 2's. Any advice would be appreciated. Thanks!
If you have the following matrix:
A = [1 2 3; 1 3 2];
and you want all the ones to become twos and the twos to become ones, the following would be the simplest way to do it:
B = A;
B(find(A == 1)) = 2;
B(find(A == 2)) = 1;
EDIT:
As Kenny suggested, this can even be further simplified as:
B = A;
B(A == 1) = 2;
B(A == 2) = 1;
Another way to deal with the original problem is to create a permutation vector indicating to which numbers should the original entries be mapped to. For the example, entries [1 2 3] should be mapped respectively to [2 1 3], so that we can write
A = [1 2 3; 1 3 2];
perm = [2 1 3];
B = perm(A)
(advantage here is that everything is done in one step, and that it also works for operations more complicated than swaps ; drawback is that all elements of A must be positive integers with a known maximum)
Not sure why you would to perform that particular swap (row/column interchanges are more common). Matlab often denotes ':' to represent all of something. Here's how to swap rows and columns:
To swap rows:
A = A([New order of rows,,...], :)
To Swap columns:
A = A(:, [New order of columns,,...])
To change the entire i-th column:
A(:, i) = [New; values; for; i-th; column]
For example, to swap the 2nd and 3rd columns of A = [1 2 3;1 3 2]
A = A(:, [1, 3, 2])
A = [1 2 3; 1 3 2]
alpha = 1;
beta = 2;
indAlpha = (A == alpha);
indBeta = (A == beta);
A(indAlpha) = beta;
A(indBeta ) = alpha
I like this solution, it makes it clearer what is going on. Less magic numbers, could easily be made into a function. Recycles the same matrix if that is important.
I don't have a copy of MatLab installed, but I think you can do some thing like this;
for i=1:length(A)
if (A(i)=1), B(i) = 2, B(i)=A(i)
end
Note, that's only convert 1's to 2's and it looks like you also want to convert 2's to 1's, so you'll need to do a little more work.
There also probably a much more elegant way of doing it given you can do this sort of thing in Matlab
>> A = 1:1:3
A = [1,2,3]
>> B = A * 2
B = [2,4,6]
There might be a swapif primitive you can use, but I haven't used Matlab in a long time, so I'm not sure the best way to do it.
In reference to tarn's more elegant way of swapping values you could use a permutation matrix as follows:
>> a =[1 2 3];
>> T = [1 0 0;
0 0 1;
0 1 0];
>> b = a*T
ans =
1 3 2
but this will swap column 2 and column 3 of the vector (matrix) a; whereas the question asked about swapping the 1's and 2's.
Update
To swap elements of two different values look into the find function
ind = find(a==1);
returns the indices of all the elements with value, 1. Then you can use Mitch's suggestion to change the value of the elements using index arrays. Remeber that find returns the linear index into the matrix; the first element has index 1 and the last element of an nxm matrix has linear index n*m. The linear index is counted down the columns. For example
>> b = [1 3 5;2 4 6];
>> b(3) % same as b(1,2)
ans = 3
>> b(5) % same as b(1,3)
ans = 5
>> b(6) % same as b(2,3)
ans = 6