Combine and sort data based on indices in Matlab

I have 2 datasets like this
dataset1 (31x1 double):
0
32
45
8
...
91
dataset2 (40x1 double):
5
12
27
10
...
15
I also have dataset1_index (31x1 double), which gives the positions of the values of dataset1 within a larger dataset
2
5
6
9
...
58
Similarly, I have dataset2_index (40x1 double), which gives the positions of the values of dataset2 within the same larger dataset
3
7
8
13
...
62
I would like to combine dataset1 and dataset2 into dataset3 (71x1 double) but the order of values in dataset3 should follow the order (from small to large) of dataset1_index and dataset2_index. Could anyone help?

You could create a 71x2 matrix containing the indices and values, call sortrows() on the index column, and then take the sorted values column.
B = sortrows(A,column) sorts A based on the columns specified in the vector column. For example, sortrows(A,4) sorts the rows of A in ascending order based on the elements in the fourth column
dataset1 = [0
32
45
8];
dataset2 = [5
12
27
10];
dataset1_index = [2
5
6
9];
dataset2_index = [3
7
8
13];
tmp = sortrows([dataset1_index dataset1; dataset2_index dataset2], 1);
% 2 0
% 3 5
% 5 32
% 6 45
% 7 12
% 8 27
% 9 8
% 13 10
dataset3 = tmp(:, 2);
dataset3_index = tmp(:, 1);
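An equivalent route, sketched here on the same four-element example, is to sort the combined index vector and reuse the permutation that sort returns as its second output. The names all_index, all_values and order are illustrative and not part of the original answer.

```matlab
% Same small example data as in the answer above
dataset1       = [0; 32; 45; 8];
dataset2       = [5; 12; 27; 10];
dataset1_index = [2; 5; 6; 9];
dataset2_index = [3; 7; 8; 13];

% Stack indices and values in the same order (illustrative names)
all_index  = [dataset1_index; dataset2_index];
all_values = [dataset1; dataset2];

% sort returns the sorted indices and the permutation that sorts them
[dataset3_index, order] = sort(all_index);

% Apply the same permutation to the values
dataset3 = all_values(order);
% dataset3 is [0; 5; 32; 45; 12; 27; 8; 10]
```

This avoids building the temporary two-column matrix, which may matter if the values and indices have different types.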

Related

Reshaping a vector into a larger matrix with arbitrary m and n

I'm attempting to create a function that takes a vector of any length and uses its entries to generate a matrix of size mxn, where m and n are arbitrary numbers. If the matrix has a greater number of entries than the original vector, the entries should repeat. E.g. a vector (1,2,3,4) would make a 3x3 matrix (1,2,3;4,1,2;3,4,1).
So far I have this function:
function A = MyMatrix(Vector,m,n)
A = reshape([Vector,Vector(1:(m*n)-length(Vector))],[m,n]);
end
which is successful in some cases:
>> m=8;n=5;Vector=(1:20);
>> A = MyMatrix(Vector,m,n)
A =
1 9 17 5 13
2 10 18 6 14
3 11 19 7 15
4 12 20 8 16
5 13 1 9 17
6 14 2 10 18
7 15 3 11 19
8 16 4 12 20
However, this only works for values of m and n whose product is less than or equal to twice the number of entries in 'Vector', so 40 in this case. When m*n is larger than 40, this code yields:
>> m=8;n=6;Vector=(1:20);
>> A = MyMatrix(Vector,m,n)
Index exceeds the number of array elements (20).
Error in MyMatrix (line 3)
A = reshape([Vector,Vector(1:(m*n)-length(Vector))],[m,n]);
I have tried to create a workaround using functions such as repmat, however, so far I have not been able to create a matrix with larger m and n.
You only need to
index the vector using "modular", 1-based indexing;
reshape it taking into account that Matlab is column-major, so you need to swap m and n;
transpose to swap m and n back.
V = [10 20 30 40 50 60]; % vector
m = 4; % number of rows
n = 5; % number of columns
A = reshape(V(mod(0:m*n-1, numel(V))+1), n, m).';
This gives
A =
10 20 30 40 50
60 10 20 30 40
50 60 10 20 30
40 50 60 10 20
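As a quick check of the three steps above, here is the same one-liner split into parts and applied to the example from the question (vector 1:4, 3x3 result). The intermediate variable idx is introduced only for illustration.

```matlab
V = 1:4;   % vector from the question
m = 3;     % number of rows
n = 3;     % number of columns

% Step 1: modular, 1-based indices that wrap around the vector
idx = mod(0:m*n-1, numel(V)) + 1;   % 1 2 3 4 1 2 3 4 1

% Steps 2 and 3: reshape column-wise into n-by-m, then transpose to m-by-n
A = reshape(V(idx), n, m).';
% A is [1 2 3; 4 1 2; 3 4 1]
```

Because the indices wrap with mod, m*n can be any multiple of, or larger than, numel(V), so the case m=8, n=6 with a 20-element vector no longer runs out of elements.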

Summing specific columns for each row in a matrix of double

I would like to sum specific columns of each row in a matrix using a for loop. Below I have included a simplified version of my problem. As of right now, I am calculating the column sums individually, but this is not effective as my actual problem has multiple matrices (data sets).
a = [1 2 3 4 5 6; 4 5 6 7 8 9];
b = [2 2 3 4 4 6; 3 3 3 4 5 5];
% Repeat the 3 lines of code below for row 2 of matrix a
% Repeat the entire process for matrix b
c = sum(a(1,1:3)); % Sum columns 1:3 of row 1
d = sum(a(1,4:6)); % Sum columns 4:6 of row 1
e = sum(a(1,:)); % Sum all columns of row 1
I would like to know how to create a for loop that automatically loops through and sums the specific columns of each row for each matrix that I have.
Thank you.
Here is a solution that does not need a for loop.
Assuming that you have a matrix a of size 2x12 and you want the row sums over each consecutive block of 4 columns, you can use reshape() and squeeze() to get the final result:
k = 4;
a = [1:12
13:24];
% a =
% 1 2 3 4 5 6 7 8 9 10 11 12
% 13 14 15 16 17 18 19 20 21 22 23 24
s = squeeze(sum(reshape(a,size(a,1),k,[]),2));
and you will get
s =
10 26 42
58 74 90
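Applied to the matrices from the question, the same idea might look like the sketch below. It assumes the number of columns is a multiple of k and that each matrix has more than one row (squeeze changes the orientation for a single row). The cell array datasets and the output names are hypothetical.

```matlab
a = [1 2 3 4 5 6; 4 5 6 7 8 9];
b = [2 2 3 4 4 6; 3 3 3 4 5 5];
k = 3;                       % columns per group (1:3 and 4:6)

datasets  = {a, b};          % hypothetical container for all matrices
groupSums = cell(size(datasets));
rowTotals = cell(size(datasets));

for i = 1:numel(datasets)
    M = datasets{i};
    % rows x k x groups, summed over the k columns of each group
    groupSums{i} = squeeze(sum(reshape(M, size(M,1), k, []), 2));
    % sum over all columns of each row
    rowTotals{i} = sum(M, 2);
end
% groupSums{1} is [6 15; 15 24], rowTotals{1} is [21; 39]
```

The only loop left runs over the matrices, not over rows or column groups.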

Combine index-based and logical addressing in Matlab

Consider a matrix X. I have to update a submatrix of X, X(row1:row2, col1:col2), with a matrix Z (of size row2-row1+1, col2-col1+1) but only on those positions where a logical matrix L (of size row2-row1+1, col2-col1+1) is true.
E.g. if
X=[ 1 2 3 4 5 6
11 12 13 14 15 16
21 22 23 24 25 26
31 32 33 34 34 36]
Z=[31 41
32 42]
L=[ 1 0
0 1]
row1 = 2; row2 = 3; col1 = 3; col2 = 4
then after the update I should get:
X=[ 1 2 3 4 5 6
11 12 31 14 15 16
21 22 23 42 25 26
31 32 33 34 34 36]
Currently I do the following:
Y = X(row1:row2, col1:col2);
Y(L) = Z(L);
X(row1:row2, col1:col2) = Y;
This code is in a tight loop and according to Matlab's (v2019a) profiler is the main bottleneck of my program. In the real code X is a 2000x1500x3 cube; row1, row2, col1, col2, Z and L change in the loop.
The question is whether it can be rewritten into a single / faster assignment.
Thanks.
Honestly, without seeing your actual code, I get the sense that your solution may be as fast as you can get. The reason I say that is because I tested a few different solutions by creating some random sample data closer to your actual problem. I assumed X is an image of type uint8 with size 2000-by-1500-by-3, Z is size N-by-N (i.e. we will only be modifying the first page of X), L is an N-by-N logical array, and the row and column indices are randomly chosen:
X = randi([0 255], 2000, 1500, 3, 'uint8');
N = 20; % Submatrix size
Z = randi([0 255], N, N, 'uint8');
L = (rand(N, N) > 0.5);
row1 = randi([1 2000-N]);
row2 = row1+N-1;
col1 = randi([1 1500-N]);
col2 = col1+N-1;
I then tested 3 different solutions: your original solution, a solution using find and sub2ind to create a linear index for X, and a solution that creates a logical index for X:
% Original solution:
Y = X(row1:row2, col1:col2, 1);
Y(L) = Z(L);
X(row1:row2, col1:col2, 1) = Y;
% Linear index solution:
[rIndex, cIndex] = find(L);
X(sub2ind(size(X), rIndex+row1-1, cIndex+col1-1)) = Z(L);
% Logical index solution
[R, C, ~] = size(X);
fullL = false(R, C);
fullL(row1:row2, col1:col2) = L;
X(fullL) = Z(L);
I tested these repeatedly with randomly-generated sample data using timeit and found that your original solution is consistently the fastest. The linear index solution is very close, but slightly slower. The logical index solution takes more than twice as long.
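Before comparing timings it can be worth confirming that the three variants really produce the same array. The sketch below does that on a smaller random cube; the sizes, the copies X1 to X3 and the flag sameResult are illustrative choices, the page subscript is passed to sub2ind explicitly, and the logical mask is built with the full size of X.

```matlab
X0 = randi([0 255], 200, 150, 3, 'uint8');   % smaller stand-in for X
N  = 20;
Z  = randi([0 255], N, N, 'uint8');
L  = rand(N, N) > 0.5;
row1 = randi([1 200-N]);  row2 = row1+N-1;
col1 = randi([1 150-N]);  col2 = col1+N-1;

% Variant 1: extract, modify, write back
X1 = X0;
Y = X1(row1:row2, col1:col2, 1);
Y(L) = Z(L);
X1(row1:row2, col1:col2, 1) = Y;

% Variant 2: linear indices into page 1
X2 = X0;
[rIndex, cIndex] = find(L);
page = ones(size(rIndex));
X2(sub2ind(size(X2), rIndex+row1-1, cIndex+col1-1, page)) = Z(L);

% Variant 3: logical mask with the same size as X
X3 = X0;
fullL = false(size(X3));
fullL(row1:row2, col1:col2, 1) = L;
X3(fullL) = Z(L);

% All three variants should agree
sameResult = isequal(X1, X2) && isequal(X1, X3);
```

All three rely on Z(L), find(L) and logical indexing visiting the true elements in the same column-major order.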
Let's define some example data:
X = randi(9,5,6);
Y = 10+X;
row1 = 2;
row2 = 4;
col1 = 3;
col2 = 4;
L = logical([0 1; 0 0; 1 1]);
Then:
ind_subm = bsxfun(@plus, (row1:row2).',size(X,1)*((col1:col2)-1));
% linear index for submatrix
ind_subm_masked = ind_subm(L);
% linear index for masked submatrix
X(ind_subm_masked) = Y(ind_subm_masked);
Example results:
X before:
X =
6 2 1 7 9 6
3 3 3 5 5 7
6 3 8 6 5 4
7 4 1 3 3 4
2 5 9 5 5 9
L:
L =
3×2 logical array
0 1
0 0
1 1
X after:
X =
6 2 1 7 9 6
3 3 3 15 5 7
6 3 8 6 5 4
7 4 11 13 3 4
2 5 9 5 5 9
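For comparison, here is the same linear-index idea applied to the exact example from the question, where the replacement values come from the small matrix Z rather than from a full-size Y. This is a sketch that assumes MATLAB R2016b or later, where implicit expansion lets a plain + replace bsxfun.

```matlab
X = [ 1  2  3  4  5  6
     11 12 13 14 15 16
     21 22 23 24 25 26
     31 32 33 34 34 36];
Z = [31 41
     32 42];
L = logical([1 0
             0 1]);
row1 = 2; row2 = 3; col1 = 3; col2 = 4;

% Linear indices of the submatrix (column vector + row vector expands)
ind_subm = (row1:row2).' + size(X,1)*((col1:col2)-1);   % [10 14; 11 15]

% Keep only the masked positions and assign the matching entries of Z
X(ind_subm(L)) = Z(L);
% X(2,3) is now 31 and X(3,4) is now 42
```

Whether this beats the three-line version from the question would need to be measured on the real data.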

Matlab: shuffle the elements of a vector while keeping runs of the same number together

I have the following vector
a = 3 3 5 5 20 20 20 4 4 4 2 2 2 10 10 10 6 6 1 1 1
Does anyone know how to shuffle this vector so that identical elements are never separated?
Something like below:
a = 10 10 10 5 5 4 4 4 20 20 20 1 1 1 3 3 2 2 2 6 6
thank you, best regard...
You can use unique combined with accumarray to create a cell array where each group of values is placed into a separate cell element. You can then shuffle these elements and recombine them into an array.
% Put each group into a separate cell of a cell array
[~, ~, ind] = unique(a);
C = accumarray(ind(:), a(:), [], @(x){x});
% Shuffle it
shuffled = C(randperm(numel(C)));
% Now make it back into a vector
out = cat(1, shuffled{:}).';
% 20 20 20 1 1 1 3 3 10 10 10 5 5 4 4 4 6 6 2 2 2
Another option is to get the values using unique and then count the number of times each one occurs. You can then shuffle the values and use repelem to expand out the result.
u = unique(a);
counts = histc(a, u);
% Shuffle the values
inds = randperm(numel(u));
% Now expand out the array
out = repelem(u(inds), counts(inds));
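Recent MATLAB documentation marks histc as not recommended, so a variant of the second option that counts with accumarray may be preferable. This is a sketch of that substitution, not the answerer's code; the names order and nRuns are illustrative.

```matlab
a = [3 3 5 5 20 20 20 4 4 4 2 2 2 10 10 10 6 6 1 1 1];

% Unique values plus, for every element of a, the position of its value in u
[u, ~, ind] = unique(a);

% Count how many times each unique value occurs
counts = accumarray(ind(:), 1).';

% Shuffle the unique values, then repeat each one by its count
order = randperm(numel(u));
out = repelem(u(order), counts(order));

% Number of runs of equal values in the result
nRuns = nnz(diff(out) ~= 0) + 1;
```

If every value ends up in a single block, nRuns equals numel(u).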
A very similar answer to @Suever's, using a loop and logical indexing rather than cells:
a = [3 3 5 5 20 20 20 4 4 4 2 2 2 10 10 10 6 6 1 1 1];
vals = unique(a); %find unique values
vals = vals(randperm(length(vals))); %shuffle vals matrix
aout = []; %initialize output matrix
for ii = 1:length(vals)
aout = [aout a(a==(vals(ii)))]; %add correct number of each value
end
Here's another approach:
a = [3 3 5 5 20 20 20 4 4 4 2 2 2 10 10 10 6 6 1 1 1];
[~, ~, lab] = unique(a);
r = randperm(max(lab));
[~, ind] = sort(r(lab));
result = a(ind);
Example result:
result =
2 2 2 3 3 5 5 20 20 20 4 4 4 10 10 10 1 1 1 6 6
It works as follows:
Assign unique labels to each element of a depending on their values (this is vector lab);
Apply a random bijection from the values of lab to themselves (the random bijection is represented by r; the result of applying it is r(lab));
Sort r(lab) and get the indices of the sorting (this is ind);
Apply those indices to a.
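The four steps can be traced by hand on a shorter vector if the random permutation is replaced by a fixed one. In the sketch below r is hard-coded purely for illustration; in real use it would come from randperm as above.

```matlab
a = [3 3 5 5 20 20 20];

% Step 1: labels 1, 2, 3 for the values 3, 5, 20
[~, ~, lab] = unique(a);        % lab is 1 1 2 2 3 3 3

% Step 2: a fixed "random" bijection, chosen by hand for this trace
r = [3 1 2];                    % label 1 -> 3, label 2 -> 1, label 3 -> 2
relabelled = r(lab);            % 3 3 1 1 2 2 2

% Step 3: sort the relabelled vector and keep the sorting indices
[~, ind] = sort(relabelled);    % ind is 3 4 5 6 7 1 2

% Step 4: apply those indices to a
result = a(ind);
% result is [5 5 20 20 20 3 3]
```

Because sort groups equal labels together, each value ends up as one contiguous block whatever permutation is drawn.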

How to efficiently reshape one column matrix to many specific length columns by moving specific interval

The input is an N-by-1 matrix. I need to reshape it to L-by-M matrix. The following is an example.
Input:
b =
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
Set length = 18, Output:
X =
1 2 3
2 3 4
3 4 5
4 5 6
5 6 7
6 7 8
7 8 9
8 9 10
9 10 11
10 11 12
11 12 13
12 13 14
13 14 15
14 15 16
15 16 17
16 17 18
17 18 19
18 19 20
Because I have a very big matrix, using a loop to reshape is very inefficient. How can I improve the reshape speed?
Your example output matrix X is the perfect matrix to index a vector of length N to get what you want. It's also very easy to create using bsxfun:
N = 20;
b = rand(N,1);
M = 3; %// number of columns
L = N-M; %// Note that N-M is an upper limit for L!
idx = bsxfun(@plus, (0:L)', 1:M)
X = b(idx)
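With the deterministic input from the question the index matrix and the result coincide, which makes the construction easy to verify. The sketch below assumes MATLAB R2016b or later so that implicit expansion can stand in for bsxfun; windowLen and nWindows are illustrative names.

```matlab
b = (1:20).';                    % example data from the question
N = numel(b);
windowLen = 18;                  % desired number of rows
nWindows  = N - windowLen + 1;   % number of columns, here 3

% Row i, column j holds the index i + j - 1
idx = (1:windowLen).' + (0:nWindows-1);

% Indexing a vector with a matrix returns a matrix of the same shape
X = b(idx);
% First row of X is [1 2 3], last row is [18 19 20]
```

Only the index matrix is built explicitly, so no loop over the windows is needed.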
That's exactly what im2col (from the Image Processing Toolbox) does:
b = (1:20).'; %'// example data
L = 18; % // desired length of sliding blocks
x = im2col(b, [L 1]); % // result
I'd use horzcat. For example:
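function X = reshaper(b,len)
% Build a len-by-nCols matrix whose i-th column is b(i:i+len-1)
nCols = length(b) - len + 1; % number of windows (renamed to avoid shadowing diff)
X = b(1:len);
for i=2:nCols
X = horzcat(X,b(i:len+(i-1))); % append the next window as a new column
end
end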
You could probably remove the for loop with some further thought.