This is a fairly simple task I want to perform, but I can't seem to figure out a way to do it. I've tried sortrows, reshaping, and other solutions, but none of them do exactly what I want.
Essentially, I have two vectors from the same range of values, of unequal lengths. Some of the values are equal, some are not. E.g.
A = [1 5 20 30 53 70 92]
B = [2 3 4 16 20 30 60 95 100]
What I want to do is add "NaNs" to each vector to "stand in" for the values in the other vector that aren't shared. So, I want them to look like:
A = [1 NaN NaN NaN 5 NaN 20 30 53 NaN 70 92 NaN NaN]
B = [NaN 2 3 4 NaN 16 20 30 NaN 60 NaN NaN 95 100]
Some method by which the vector will have placeholders for the value of the other vector.
Do I combine the vectors, sort it, then somehow search and replace all values from the other vector with NaNs? That seems like a bit of a clunky solution, though not impossible. I feel like there is some more elegant way to accomplish this that I am missing.
Thanks!
Here is one solution using a simple map:
A = [1 5 20 30 53 70 92]
B = [2 3 4 16 20 30 60 95 100]
% map all A and B elements
% use 1 for A and 2 for B
map = zeros(max([A,B]),1);
map(A) = 1;
map(B) = bitor(map(B), 2);
% find the values present in either A, or B
[~,~,j] = find(map);
AA = nan(size(j));
BB = nan(size(j));
AA(bitand(j,1)~=0) = A;
BB(bitand(j,2)~=0) = B;
Comparison with Rodys solution shows this method is a bit faster:
A = unique(randi(10000, 1000, 1));
B = unique(randi(10000, 1000, 1));
tic;
for i=1:1000
map=zeros(10000,1);
map(A) = 1;
map(B) = bitor(map(B), 2);
[~,~,j] = find(map);
AA = nan(size(j));
BB = nan(size(j));
AA(bitand(j,1)~=0) = A;
BB(bitand(j,2)~=0) = B;
end
toc
tic
for i=1:1000
C = union(A,B);
Ap = NaN(size(C));
Ap(ismember(C,A)) = A;
Bp = NaN(size(C));
Bp(ismember(C,B)) = B;
end
toc
isequalwithequalnans(BB, Bp)
isequalwithequalnans(AA, Ap)
Elapsed time is 0.283828 seconds.
Elapsed time is 0.457204 seconds.
ans =
1
ans =
1
Well, here's one way:
% union of sets A and B
C = union(A,B);
% initialize new sets, check membership, and
% assign old values when applicable
Ap = NaN(size(C)); Ap(ismember(C,A)) = A;
Bp = NaN(size(C)); Bp(ismember(C,B)) = B;
Note that union gets rid of repititions. In case you want to keep all repetitions, use a manual sort and the second output of ismember:
% combine and sort, KEEPING repetitions
C = sort([A B]);
% initialize new sets, check membership, and
% assign old values when applicable
Ap = NaN(size(C)); Bp = NaN(size(C));
[I,Ab] = ismember(C,A); [I,Bb] = ismember(C,B);
Ap(I) = A(Ab(I)); Bp(I) = B(Bb(I));
Related
Imagine I have two matrices with different sizes (let's say 6x2 and 5x2) as follows:
A = [47 10;
29 10;
23 10;
34 10;
12 10;
64 10];
B = [23 20;
12 20;
54 20
47 20;
31 20];
I need to compare A(:,1) with B(:,1) and delete the rows in matrix A whose first-column-element is different from matrix B's first-column-element (so my focus is only on first columns of the matrices). So I should eventually get something like this:
A = [47 10;
12 10;
23 10];
as "47", "12", and "23" are the only first-column-elements in A that also exist in B! I have written this but I get the error "Matrix dimensions must agree."!
TF = A(:,1) ~= B(:,1); %define indexes in A that A(:,1) is not equal to B(:,1)
A(TF,:) = [];
Any ideas how I could fix this?
You can use ismember:
result = A(ismember(A(:,1), B(:,1)), :);
Replacing this line
TF = A(:,1) ~= B(:,1);
with this line
[~,TF] = setdiff(A(:,1),B(:,1));
yields the desired result.
How is it possible to create a sequence if I have vectors of starting and ending numbers of the subsequences in a vectorized way in Matlab?
Example Input:
A=[12 20 34]
B=[18 25 37]
I want to get (spacing for clarity):
C=[12 13 14 15 16 17 18 20 21 22 23 24 25 34 35 36 37]
Assuming that A and B are sorted ascending and that B(i) < A(i+1) holds then:
idx = zeros(1,max(B)+1);
idx(A) = 1;
idx(B+1) = -1;
C = find(cumsum(idx))
To get around the issue mentioned by Dennis in the comments:
m = min(A)-1;
A = A-m;
B = B-m;
idx = zeros(1,max(B)+1);
idx(A) = 1;
idx(B+1) = -1;
C = find(cumsum(idx)) + m;
cumsum based approach for a generic case (negative numbers or overlaps) -
%// Positions in intended output array at which group shifts
intv = cumsum([1 B-A+1])
%// Values to be put at those places with intention of doing cumsum at the end
put_vals = [A(1) A(2:end) - B(1:end-1)]
%// Get vector of ones and put_vals
id_arr = ones(1,intv(end)-1)
id_arr(intv(1:end-1)) = put_vals
%// Final output with cumsum of id_arr
out = cumsum(id_arr)
Sample run -
>> A,B
A =
-2 -3 1
B =
5 -1 3
>> out
out =
-2 -1 0 1 2 3 4 5 -3 -2 -1 1 2 3
Benchmarking
Here's a runtime test after warming-up tic-toc to compare the various approaches listed to solve the problem for large sized A and B -
%// Create inputs
A = round(linspace(1,400000000,200000));
B = round((A(1:end-1) + A(2:end))/2);
B = [B A(end)+B(1)];
disp('------------------ Divakar Method')
.... Proposed approach in this solution
disp('------------------ Dan Method')
tic
idx = zeros(1,max(B)+1);
idx(A) = 1;
idx(B+1) = -1;
C = find(cumsum(idx));
toc, clear C idx
disp('------------------ Santhan Method')
tic
In = [A;B];
difIn = diff(In);
out1 = bsxfun(#plus, (0:max(difIn)).',A); %//'
mask = bsxfun(#le, (1:max(difIn)+1).',difIn+1); %//'
out1 = out1(mask).'; %//'
toc, clear out1 mask difIn In
disp('------------------ Itamar Method')
tic
C = cell2mat(cellfun(#(a,b){a:b},num2cell(A),num2cell(B)));
toc, clear C
disp('------------------ dlavila Method')
tic
C = cell2mat(arrayfun(#(a,b)a:b, A, B, 'UniformOutput', false));
toc
Runtimes
------------------ Divakar Method
Elapsed time is 0.793758 seconds.
------------------ Dan Method
Elapsed time is 2.640529 seconds.
------------------ Santhan Method
Elapsed time is 1.662889 seconds.
------------------ Itamar Method
Elapsed time is 2.524527 seconds.
------------------ dlavila Method
Elapsed time is 2.096454 seconds.
Alternative using bsxfun
%// Find the difference between Inputs
difIn = B - A;
%// Do colon on A to A+max(difIn)
out = bsxfun(#plus, (0:max(difIn)).',A); %//'
%// mask out which values you want
mask = bsxfun(#le, (1:max(difIn)+1).',difIn+1); %//'
%// getting only the masked values
out = out(mask).'
Results:
>> A,B
A =
-2 -4 1
B =
5 -3 3
>> out
out =
-2 -1 0 1 2 3 4 5 -4 -3 1 2 3
This is one option:
C = cell2mat(cellfun(#(a,b){a:b},num2cell(A),num2cell(B)));
You can do this:
C = cell2mat(arrayfun(#(a,b)a:b, A, B, "UniformOutput", false))
arrayfun(...) create all the subsequences taking pairs of A and B.
cell2mat is used to concatenate each subsequence.
I work with an image that I consider as a matrix.
I want to turn a 800 x 800 matrix (A) into a 400 x 400 matrix (B) where the mean of 4 cells of the A matrix = 1 cell of the B matrix (I know this not a right code line) :
B[1,1] =mean2(A[1,1 + 1,2 + 2,1 + 2,2])
and so on for the whole matrix ...
B [1,2]=mean2(A[1,3 + 1,4 + 2,3 + 2,4 ])
I thought to :
1) Reshape the A matrix into a 2 x 320 000 matrix so I get the four cells I need to average next to each other and it is easier to deal with the row number afterwards.
Im4bis=reshape(permute(reshape(Im4,size(Im4,2),2,[]),[2,3,1]),2,[]);
2) Create a cell-array with the 4 cells I need to average (subsetted) and calculate the mean of it. That's where it doesn't work
I{1,160000}=ones,
for k=drange(1:2:319999)
for n=1:160000
I{n}=mean2(Im4bis(1:2,k:k+1));
end
end
I created an empty matrix of 400 x 400 cells (actually a vector of 1 x 160000) and I wanted to fill it with the mean but I get a matrix of 1 x 319 999 cells with one cell out of 2 empty.
Looking for light
My Input Image:
Method 1
Using mat2cell and cellfun
AC = mat2cell(A, repmat(2,size(A,1)/2,1), repmat(2,size(A,2)/2,1));
out = cellfun(#(x) mean(x(:)), AC);
Method 2
using im2col
out = reshape(mean(im2col(A,[2 2],'distinct')),size(A)./2);
Method 3
Using simple for loop
out(size(A,1)/2,size(A,2)/2) = 0;
k = 1;
for i = 1:2:size(A,1)
l = 1;
for j = 1:2:size(A,2)
out(k,l) = mean(mean(A(i:i+1,j:j+1)));
l = l+1;
end
k = k+1;
end
Test on input image:
A = rgb2gray(imread('inputImage.png'));
%// Here, You could use any of the method from any answers
%// or you could use the best method from the bench-marking tests done by Divakar
out = reshape(mean(im2col(A,[2 2],'distinct')),size(A)./2);
imshow(uint8(out));
imwrite(uint8(out),'outputImage.bmp');
Output Image:
Final check by reading the already written image
B = imread('outputImage.bmp');
>> whos B
Name Size Bytes Class Attributes
B 400x400 160000 uint8
Let A denote your matrix and
m = 2; %// block size: rows
n = 2; %// block size: columns
Method 1
Use blockproc:
B = blockproc(A, [m n], #(x) mean(x.data(:)));
Example:
>> A = magic(6)
A =
35 1 6 26 19 24
3 32 7 21 23 25
31 9 2 22 27 20
8 28 33 17 10 15
30 5 34 12 14 16
4 36 29 13 18 11
>> B = blockproc(A, [m n], #(x) mean(x.data(:)))
B =
17.7500 15.0000 22.7500
19.0000 18.5000 18.0000
18.7500 22.0000 14.7500
Method 2
If you prefer the reshaping way (which is probably faster), use this great answer to organize the matrix into 2x2 blocks tiled along the third dimension, average along the first two dimensions, and reshape the result:
T = permute(reshape(permute(reshape(A, size(A, 1), n, []), [2 1 3]), n, m, []), [2 1 3]);
B = reshape(mean(mean(T,1),2), size(A,1)/m, size(A,2)/n);
Method 3
Apply a 2D convolution (conv2) and then downsample. The convolution computes more entries than are really necessary (hence the downsampling), but on the other hand it can be done separably, which helps speed things up:
B = conv2(ones(m,1)/m, ones(1,n)/n ,A,'same');
B = B(m-1:m:end ,n-1:n:end);
One approach based on this solution using reshape, sum & squeeze -
sublen = 2; %// subset length
part1 = reshape(sum(reshape(A,sublen,[])),size(A,1)/sublen,sublen,[]);
out = squeeze(sum(part1,2))/sublen^2;
Benchmarking
Set #1
Here are the runtime comparisons for the approaches listed so far for a input datasize of 800x 800 -
%// Input
A = rand(800,800);
%// Warm up tic/toc.
for k = 1:50000
tic(); elapsed = toc();
end
disp('----------------------- With RESHAPE + SUM + SQUEEZE')
tic
sublen = 2; %// subset length
part1 = reshape(sum(reshape(A,sublen,[])),size(A,1)/sublen,sublen,[]);
out = squeeze(sum(part1,2))/sublen^2;
toc, clear sublen part1 out
disp('----------------------- With BLOCKPROC')
tic
B = blockproc(A, [2 2], #(x) mean(x.data(:))); %// [m n]
toc, clear B
disp('----------------------- With PERMUTE + MEAN + RESHAPE')
tic
m = 2;n = 2;
T = permute(reshape(permute(reshape(A, size(A, 1), n, []),...
[2 1 3]), n, m, []), [2 1 3]);
B = reshape(mean(mean(T,1),2), size(A,1)/m, size(A,2)/m);
toc, clear B T m n
disp('----------------------- With CONVOLUTION')
tic
m = 2;n = 2;
B = conv2(ones(m,1)/m, ones(1,n)/n ,A,'same');
B = B(m-1:m:end ,n-1:n:end);
toc, clear m n B
disp('----------------------- With MAT2CELL')
tic
AC = mat2cell(A, repmat(2,size(A,1)/2,1), repmat(2,size(A,2)/2,1));
out = cellfun(#(x) mean(x(:)), AC);
toc
disp('----------------------- With IM2COL')
tic
out = reshape(mean(im2col(A,[2 2],'distinct')),size(A)./2);
toc
Runtime results -
----------------------- With RESHAPE + SUM + SQUEEZE
Elapsed time is 0.004702 seconds.
----------------------- With BLOCKPROC
Elapsed time is 6.039851 seconds.
----------------------- With PERMUTE + MEAN + RESHAPE
Elapsed time is 0.006015 seconds.
----------------------- With CONVOLUTION
Elapsed time is 0.002174 seconds.
----------------------- With MAT2CELL
Elapsed time is 2.362291 seconds.
----------------------- With IM2COL
Elapsed time is 0.239218 seconds.
To make the runtimes more fair, we can use a number of trials of 1000 on top of the fastest three approaches for the same input datasize of 800 x 800, giving us -
----------------------- With RESHAPE + SUM + SQUEEZE
Elapsed time is 1.264722 seconds.
----------------------- With PERMUTE + MEAN + RESHAPE
Elapsed time is 3.986038 seconds.
----------------------- With CONVOLUTION
Elapsed time is 1.992030 seconds.
Set #2
Here are the runtime comparisons for a larger input datasize of 10000x 10000 for the fastest three approaches -
----------------------- With RESHAPE + SUM + SQUEEZE
Elapsed time is 0.158483 seconds.
----------------------- With PERMUTE + MEAN + RESHAPE
Elapsed time is 0.589322 seconds.
----------------------- With CONVOLUTION
Elapsed time is 0.307836 seconds.
I am currently experimenting with Matlab functions. Basically I am trying to perform a function on each value found in a matrix such as the following simple example:
k = [1:100];
p = [45 60 98 100; 46 65 98 20; 47 65 96 50];
p(find(p)) = getSum(k, find(p), find(p) + 1);
function x = getSum(k, f, g, h)
x = sum(k(f:g));
end
Why the corresponding output matrix values are all 3, in other words why all indices are depending on the first calculated sum?
The output is the following:
p =
3 3 3 3
3 3 3 3
3 3 3 3
f:g returns the value between f(1,1) and g(1,1), so 1:2.
find(p) returns the indices of non zero values. Since all values are non-zero, you get all indices.
So if we break down the statement p(find(p)) = getSum(k, find(p), fin(p) + 1)
We get
find(p) = 1:12
We then get
f = 1:12 and g = 2:13 which lead to k = 1:2 (as explained above)
finally sum(1:2) = 3
And this value is apply over p(1:12), which is the same as p(:,:) (all the matrix)
I have an NxM matrix in MATLAB that I would like to reorder in similar fashion to the way JPEG reorders its subblock pixels:
(image from Wikipedia)
I would like the algorithm to be generic such that I can pass in a 2D matrix with any dimensions. I am a C++ programmer by trade and am very tempted to write an old school loop to accomplish this, but I suspect there is a better way to do it in MATLAB.
I'd be rather want an algorithm that worked on an NxN matrix and go from there.
Example:
1 2 3
4 5 6 --> 1 2 4 7 5 3 6 8 9
7 8 9
Consider the code:
M = randi(100, [3 4]); %# input matrix
ind = reshape(1:numel(M), size(M)); %# indices of elements
ind = fliplr( spdiags( fliplr(ind) ) ); %# get the anti-diagonals
ind(:,1:2:end) = flipud( ind(:,1:2:end) ); %# reverse order of odd columns
ind(ind==0) = []; %# keep non-zero indices
M(ind) %# get elements in zigzag order
An example with a 4x4 matrix:
» M
M =
17 35 26 96
12 59 51 55
50 23 70 14
96 76 90 15
» M(ind)
ans =
17 35 12 50 59 26 96 51 23 96 76 70 55 14 90 15
and an example with a non-square matrix:
M =
69 9 16 100
75 23 83 8
46 92 54 45
ans =
69 9 75 46 23 16 100 83 92 54 8 45
This approach is pretty fast:
X = randn(500,2000); %// example input matrix
[r, c] = size(X);
M = bsxfun(#plus, (1:r).', 0:c-1);
M = M + bsxfun(#times, (1:r).'/(r+c), (-1).^M);
[~, ind] = sort(M(:));
y = X(ind).'; %'// output row vector
Benchmarking
The following code compares running time with that of Amro's excellent answer, using timeit. It tests different combinations of matrix size (number of entries) and matrix shape (number of rows to number of columns ratio).
%// Amro's approach
function y = zigzag_Amro(M)
ind = reshape(1:numel(M), size(M));
ind = fliplr( spdiags( fliplr(ind) ) );
ind(:,1:2:end) = flipud( ind(:,1:2:end) );
ind(ind==0) = [];
y = M(ind);
%// Luis' approach
function y = zigzag_Luis(X)
[r, c] = size(X);
M = bsxfun(#plus, (1:r).', 0:c-1);
M = M + bsxfun(#times, (1:r).'/(r+c), (-1).^M);
[~, ind] = sort(M(:));
y = X(ind).';
%// Benchmarking code:
S = [10 30 100 300 1000 3000]; %// reference to generate matrix size
f = [1 1]; %// number of cols is S*f(1); number of rows is S*f(2)
%// f = [0.5 2]; %// plotted with '--'
%// f = [2 0.5]; %// plotted with ':'
t_Amro = NaN(size(S));
t_Luis = NaN(size(S));
for n = 1:numel(S)
X = rand(f(1)*S(n), f(2)*S(n));
f_Amro = #() zigzag_Amro(X);
f_Luis = #() zigzag_Luis(X);
t_Amro(n) = timeit(f_Amro);
t_Luis(n) = timeit(f_Luis);
end
loglog(S.^2*prod(f), t_Amro, '.b-');
hold on
loglog(S.^2*prod(f), t_Luis, '.r-');
xlabel('number of matrix entries')
ylabel('time')
The figure below has been obtained with Matlab R2014b on Windows 7 64 bits. Results in R2010b are very similar. It is seen that the new approach reduces running time by a factor between 2.5 (for small matrices) and 1.4 (for large matrices). Results are seen to be almost insensitive to matrix shape, given a total number of entries.
Here's a non-loop solution zig_zag.m. It looks ugly but it works!:
function [M,index] = zig_zag(M)
[r,c] = size(M);
checker = rem(hankel(1:r,r-1+(1:c)),2);
[rEven,cEven] = find(checker);
[cOdd,rOdd] = find(~checker.'); %'#
rTotal = [rEven; rOdd];
cTotal = [cEven; cOdd];
[junk,sortIndex] = sort(rTotal+cTotal);
rSort = rTotal(sortIndex);
cSort = cTotal(sortIndex);
index = sub2ind([r c],rSort,cSort);
M = M(index);
end
And a test matrix:
>> M = [magic(4) zeros(4,1)];
M =
16 2 3 13 0
5 11 10 8 0
9 7 6 12 0
4 14 15 1 0
>> newM = zig_zag(M) %# Zig-zag sampled elements
newM =
16
2
5
9
11
3
13
10
7
4
14
6
8
0
0
12
15
1
0
0
Here's a way how to do this. Basically, your array is a hankel matrix plus vectors of 1:m, where m is the number of elements in each diagonal. Maybe someone else has a neat idea on how to create the diagonal arrays that have to be added to the flipped hankel array without a loop.
I think this should be generalizeable to a non-square array.
% for a 3x3 array
n=3;
numElementsPerDiagonal = [1:n,n-1:-1:1];
hadaRC = cumsum([0,numElementsPerDiagonal(1:end-1)]);
array2add = fliplr(hankel(hadaRC(1:n),hadaRC(end-n+1:n)));
% loop through the hankel array and add numbers counting either up or down
% if they are even or odd
for d = 1:(2*n-1)
if floor(d/2)==d/2
% even, count down
array2add = array2add + diag(1:numElementsPerDiagonal(d),d-n);
else
% odd, count up
array2add = array2add + diag(numElementsPerDiagonal(d):-1:1,d-n);
end
end
% now flip to get the result
indexMatrix = fliplr(array2add)
result =
1 2 6
3 5 7
4 8 9
Afterward, you just call reshape(image(indexMatrix),[],1) to get the vector of reordered elements.
EDIT
Ok, from your comment it looks like you need to use sort like Marc suggested.
indexMatrixT = indexMatrix'; % ' SO formatting
[dummy,sortedIdx] = sort(indexMatrixT(:));
sortedIdx =
1 2 4 7 5 3 6 8 9
Note that you'd need to transpose your input matrix first before you index, because Matlab counts first down, then right.
Assuming X to be the input 2D matrix and that is square or landscape-shaped, this seems to be pretty efficient -
[m,n] = size(X);
nlim = m*n;
n = n+mod(n-m,2);
mask = bsxfun(#le,[1:m]',[n:-1:1]);
start_vec = m:m-1:m*(m-1)+1;
a = bsxfun(#plus,start_vec',[0:n-1]*m);
offset_startcol = 2- mod(m+1,2);
[~,idx] = min(mask,[],1);
idx = idx - 1;
idx(idx==0) = m;
end_ind = a([0:n-1]*m + idx);
offsets = a(1,offset_startcol:2:end) + end_ind(offset_startcol:2:end);
a(:,offset_startcol:2:end) = bsxfun(#minus,offsets,a(:,offset_startcol:2:end));
out = a(mask);
out2 = m*n+1 - out(end:-1:1+m*(n-m+1));
result = X([out2 ; out(out<=nlim)]);
Quick runtime tests against Luis's approach -
Datasize: 500 x 2000
------------------------------------- With Proposed Approach
Elapsed time is 0.037145 seconds.
------------------------------------- With Luis Approach
Elapsed time is 0.045900 seconds.
Datasize: 5000 x 20000
------------------------------------- With Proposed Approach
Elapsed time is 3.947325 seconds.
------------------------------------- With Luis Approach
Elapsed time is 6.370463 seconds.
Let's assume for a moment that you have a 2-D matrix that's the same size as your image specifying the correct index. Call this array idx; then the matlab commands to reorder your image would be
[~,I] = sort (idx(:)); %sort the 1D indices of the image into ascending order according to idx
reorderedim = im(I);
I don't see an obvious solution to generate idx without using for loops or recursion, but I'll think some more.