Creating combinations without repetitions - matlab

I have so far the following code:
data = xlsread('filename');
% 1000 samples without replacement
% each element of y contains 10 values without repetition
y = cell(1000,1);
for i = 1:1000
    y{i} = datasample(data,10,'Replace',false);
end
Now I don't want the same vector to appear twice in the cell y, and by twice I also mean vectors like [1 2 3 4 5 6 7 8 9 10] and [1 2 3 4 5 6 7 8 10 9], i.e. the ordering of the elements does not matter: if two vectors contain the same elements, I want one of them deleted. How do I do that? Alternatively, is there a way to sample some of the combinations without replacement directly from data? Data contains 171 values, so the full set of combinations without replacement would probably run into the millions, whereas I only need around 1000 of them. Thanks
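One possible approach, sketched here under some assumptions: draw index sets rather than values, sort each set so that ordering no longer matters, and keep drawing until enough distinct rows have been collected. The variable data below is a hypothetical stand-in for the spreadsheet column, and randperm is swapped in for datasample so that no toolbox is needed. The names nSamples, k and idx are illustrative only.

```matlab
data = (1:171)';                 % hypothetical stand-in for xlsread('filename')
nSamples = 1000;                 % number of distinct combinations wanted
k = 10;                          % elements per combination

idx = zeros(0, k);               % each row will hold one sorted index set
while size(idx, 1) < nSamples
    need = nSamples - size(idx, 1);
    candidates = zeros(need, k);
    for r = 1:need
        % k distinct indices, sorted so that order does not matter
        candidates(r, :) = sort(randperm(numel(data), k));
    end
    % unique(...,'rows') drops any combination that was drawn twice
    idx = unique([idx; candidates], 'rows');
end

samples = data(idx);             % 1000 x 10 matrix, one combination per row
```

Working on indices also means that repeated values inside data cannot make two different combinations look identical. If a cell array is preferred, num2cell(samples, 2) converts each row into its own cell.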

Related

Solving a Simple Sorting Problem in MATLAB

I have been given this problem in a MATLAB course I am doing. The instructor's solution provided there is wrong, and I have been struggling with the same problem for hours as I am a beginner who has just started coding (a science student here).
Consider a one-dimensional matrix A such as A = [5 8 8 8 9 9 6 6 5 5 4 1 2 3 5 3 3]. Show the percentage frequencies of unique elements in the matrix A in descending order.
Hint: Use the functions tabulate and sort.
How do I solve this problem using only tabulate, sort, and find functions (find is for eliminating zero frequency elements in tabulate table, which my instructor did not do)?
I tried first extracting the indices of non-zero elements in the percentage column of tabulating table using the find function, which I succeeded in doing using the following:
A = [5 8 8 8 9 9 6 6 5 5 4 1 2 3 5 3 3 ];
B = tabulate(A);
C = find(B(:,3) > 0)
But I am now struggling to return the values in the 3rd column of B at the indices in C. Please help. Also, please show me an alternative syntax for building a vector from only the nonzero elements of a row or column. I'll do the rest of the problem myself.
With your find command, you are just finding the indices of the matrix and not the values themselves.
So you can either do something like this:
A = [5 8 8 8 9 9 6 6 5 5 4 1 2 3 5 3 3 ];
B = tabulate(A);
% Loop backwards so that deleting a row does not shift the rows still to be checked
for i = size(B,1):-1:1
    if B(i,3) == 0
        B(i,:) = [];
    end
end
sortrows(B,3,'descend')
which removes the rows whose frequency is 0.
Or, since C already holds the indices of all the numbers with nonzero frequency, you can ask for just those rows, like this:
A = [5 8 8 8 9 9 6 6 5 5 4 1 2 3 5 3 3 ];
B = tabulate(A);
C = find(B(:,3) > 0);
sortrows(B(C(:),:),3,'descend')
in a slightly more elegant way. B(C(:),:) selects all the rows of B whose row indices are listed in C, which is exactly what you are asking for. At the same time, sortrows sorts the matrix by column 3 in descending order.
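On the side question about building a vector from only the nonzero elements: a minimal sketch of three equivalent options, using a made-up vector v rather than the tabulate output.

```matlab
v = [0 3 0 7 5 0];     % made-up example vector

nz1 = v(v ~= 0);       % logical indexing, keeps the row orientation
nz2 = nonzeros(v)';    % nonzeros always returns a column, so transpose it
nz3 = v(find(v));      % find returns the indices of the nonzero entries
```

All three give [3 7 5]. Logical indexing is usually the shortest to write, and the same pattern works on rows of a matrix, e.g. B(B(:,3) ~= 0, :).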

How to split a matrix based on how close the values are?

Suppose I have a matrix A:
A = [1 2 3 6 7 8];
I would like to split this matrix into sub-matrices based on how relatively close the numbers are. For example, the above matrix must be split into:
B = [1 2 3];
C = [6 7 8];
I understand that I need to define some sort of criterion for this grouping, so I thought I'd take the absolute difference between each number and the next one, and define a limit up to which a number is allowed to be in a group. But the problem is that I cannot fix a static limit on the difference, since the matrices and sub-matrices will be changing.
Another example:
A = [5 11 6 4 4 3 12 30 33 32 12];
So, this must be split into:
B = [5 6 4 4 3];
C = [11 12 12];
D = [30 33 32];
Here, the matrix is split into three parts based on how close the values are. So the criterion for this matrix is different from the previous one, though what I want out of each matrix is the same: to separate it based on the closeness of its numbers. Is there any way I can specify a general set of conditions to make the criterion dynamic rather than static?
I'm afraid, my answer comes too late for you, but maybe future readers with a similar problem can profit from it.
In general, your problem calls for cluster analysis. Nevertheless, maybe there's a simpler solution to your actual problem. Here's my approach:
First, sort the input A.
To find a criterion to distinguish between "intraclass" and "interclass" elements, I calculate the differences between adjacent elements of A, using diff.
Then, I calculate the median over all these differences.
Finally, I find the indices of all differences that are greater than or equal to three times the median, with a minimum difference of 1. (Depending on the actual data, this might be modified, e.g. by using the mean instead.) These are the indices where you will have to "split" the (sorted) input.
Last, I set up two vectors with the start and end indices of each "sub-matrix" and pass them to arrayfun to get a cell array with all desired "sub-matrices".
Now, here comes the code:
% Sort input, and calculate differences between adjacent elements
AA = sort(A);
d = diff(AA);
% Calculate median over all differences
m = median(d);
% Find indices with "significantly higher difference",
% e.g. greater than or equal to three times the median
% (minimum difference should be 1)
idx = find(d >= max(1, 3 * m));
% Set up proper start and end indices
start_idx = [1 idx+1];
end_idx = [idx numel(A)];
% Generate cell array with desired vectors
out = arrayfun(@(x, y) AA(x:y), start_idx, end_idx, 'UniformOutput', false)
Due to the unknown number of possible vectors, I can't think of a way to "unpack" these into individual variables.
Some tests:
A =
1 2 3 6 7 8
out =
{
[1,1] =
1 2 3
[1,2] =
6 7 8
}
A =
5 11 6 4 4 3 12 30 33 32 12
out =
{
[1,1] =
3 4 4 5 6
[1,2] =
11 12 12
[1,3] =
30 32 33
}
A =
1 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 3
out =
{
[1,1] =
1 1 1 1 1 1 1
[1,2] =
2 2 2 2 2 2
[1,3] =
3 3 3 3 3 3 3
}
Hope that helps!
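Regarding the remark about "unpacking": a small sketch of two ways to consume the cell array without creating one variable per group. It repeats the approach from the answer on the first test input. The loop works for any number of groups, whereas the comma-separated-list assignment at the end is only valid when the number of groups is known beforehand (two in this example).

```matlab
A = [1 2 3 6 7 8];

% Same steps as in the answer above
AA = sort(A);
d = diff(AA);
idx = find(d >= max(1, 3 * median(d)));
start_idx = [1 idx+1];
end_idx = [idx numel(AA)];
out = arrayfun(@(x, y) AA(x:y), start_idx, end_idx, 'UniformOutput', false);

% Option 1: loop over however many groups were found
for k = 1:numel(out)
    fprintf('Group %d: %s\n', k, mat2str(out{k}));
end

% Option 2: only when the group count is known in advance (two here)
[B, C] = out{:};
```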

How to create a submatrix of a randomly filled matrix? [duplicate]

I have a 30 x 30 matrix called A, and I want to assign B as the leftmost 30 x 20 block of A. How can I do that?
Is this the correct way to do it?
B = A[30 ; 20]
No, the correct way is
B = A(:, 1:20);
where : is shorthand for all of the rows in A.
Matrix indexing in MATLAB uses round brackets, (). Square brackets, [], are used to declare matrices (or vectors) as in
>> v = [1 2 3; 4 5 6; 7 8 9]
v =
1 2 3
4 5 6
7 8 9
excaza provides a very good link on Matrix Indexing in MATLAB which should help you. There is also Matrix Indexing.
A_new = A(:,1:20)
takes all the rows of A (the : in the first position) and the first 20 columns (the 1:20 in the second position).
A_new is now 30 x 20.
You can also iterate over elements in two loops, but the above answer is easiest
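For comparison, a minimal sketch of both variants side by side. The matrix A is filled with rand here as a stand-in for the asker's data, and B_loop is an illustrative name.

```matlab
A = rand(30);              % stand-in for the randomly filled 30 x 30 matrix

% Indexing: all rows, first 20 columns
B = A(:, 1:20);

% Equivalent element-by-element copy with two loops
B_loop = zeros(30, 20);    % preallocate the result
for r = 1:30
    for c = 1:20
        B_loop(r, c) = A(r, c);
    end
end
```

Both produce the same 30 x 20 block. The indexing form is shorter and is generally the idiomatic choice in MATLAB.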

How to create more than one matrix in a row using matlab

I am trying to get a series of vectors that all come from the same original vector. As a simple example, suppose V = (1,2,3,4,5,6,7,8,9,10) (of course mine is bigger).
The first vector has to look like this:
R1=(1,3,5,7,9)= V(1:1:end)
The second vector:
R2=(2,4,6,8,10)=V(2:1:end)
The third vector:
R3=(3,5,7,9)=V(3:1:end)
The fourth vector:
R4=(4,6,8,10)=V(4:1:end)
...
R8=(8,10)=V(8:1:end)
So my questions are:
Is there an easier way to get this result?
How can I know the total number of Ri vectors with distance = 1 that can be obtained from V?
Use Matlab's cell object which can hold a vector in every cell.
Use a for loop to fill this cell object gradually.
Code example:
%initialize V
V = [1,2,3,4,5,6,7,8,9,10];
%initialize an empty cell of size [length(V)-2, 1], i.e. [8,1] here
R = cell(length(V)-2,1);
%fill the cell
for ii = 1:length(R)
    R{ii} = V(ii:2:end);
end
%print results
for ii = 1:length(R)
    R{ii}
end
Results (each row is a different vector):
1 3 5 7 9
2 4 6 8 10
3 5 7 9
4 6 8 10
5 7 9
6 8 10
7 9
8 10
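As for the second question (how many such vectors exist): a short sketch, assuming that "distance = 1" means taking every second element (a step of 2) and that only vectors with at least two elements count, as in the listing above. Under those assumptions the last valid start index is numel(V) - step. The names step, nVectors and lens are illustrative.

```matlab
V = 1:10;
step = 2;                       % assumed meaning of "distance = 1"

% A start index s yields at least two elements only if s + step <= numel(V)
nVectors = numel(V) - step;

% Build all vectors without an explicit loop and check their lengths
R = arrayfun(@(s) V(s:step:end), 1:nVectors, 'UniformOutput', false);
lens = cellfun(@numel, R);
```

For the example this gives nVectors = 8, matching the eight rows shown above.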

Find the increasing and decreasing trend in a curve MATLAB

a=[2 3 6 7 2 1 0.01 6 8 10 12 15 18 9 6 5 4 2].
Here is an array; I need to extract the exact values where each increasing or decreasing trend starts.
The output for the array a should be [2 (first element) 2 6 9]:
a=[2 3 6 7 2 1 0.01 6 8 10 12 15 18 9 6 5 4 2].
^ ^ ^ ^
| | | |
Kindly help me get this result in MATLAB for any similar array.
You just have to find where the sign of the difference between consecutive numbers changes.
With some common sense and the functions diff, sign and find, you get this solution:
a = [2 3 6 7 2 1 0.01 6 8 10 12 15 18 9 6 5 4 2];
sda = sign(diff(a));
idx = [1 find(sda(1:end-1)~=sda(2:end))+2 ];
result = a(idx);
EDIT:
The sign function messes things up when there are two consecutive numbers which are the same, because sign(0) = 0, which is falsely identified as a trend change. You'd have to filter these out. You can do this by first removing the consecutive duplicates from the original data. Since you only want the values where the trend change starts, and not the position where it actually starts, this is easiest:
a(diff(a)==0) = [];
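Putting the duplicate removal together with the original solution, as a sketch on a made-up input that contains plateaus:

```matlab
a = [1 2 2 3 3 1 0 0 4];      % made-up input with repeated neighbours

a(diff(a) == 0) = [];         % drop consecutive duplicates -> [1 2 3 1 0 4]
sda = sign(diff(a));          % direction of each step: [1 1 -1 -1 1]
idx = [1 find(sda(1:end-1) ~= sda(2:end)) + 2];
result = a(idx);              % first value of each trend
```

Here result is [1 1 4]: the first element, the first value of the decreasing run, and the first value of the final increasing run.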
This is a great place to use the diff function.
Your first step will be to do the following:
B = [0 diff(a)]
The reason we add the 0 there is to keep the matrix the same length, because of the way the diff function works. It starts with the first element in the matrix and reports the difference between that and the next element. There's no leading element before the first one, so it just truncates the matrix by one element. We add a zero because there is no change there, as it's the starting element.
If you look at the results in B now, it is quite obvious where the turning points are (where the differences change sign, e.g. from positive to negative).
To pull this out programmatically there are a number of things you can do. I tend to use a little multiplication and the find command.
Result = find(B(1:end-1).*B(2:end)<0)
This will return the index of each turning point, i.e. the last element before the trend reverses. In this case it will be:
ans =
4 7 13
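To get from these indices to the values the question asks for, one possible follow-up, sketched with the illustrative names turning and starts: a(Result) gives the peaks and troughs themselves, while a(Result + 1) gives the first value of each new trend.

```matlab
a = [2 3 6 7 2 1 0.01 6 8 10 12 15 18 9 6 5 4 2];

B = [0 diff(a)];
Result = find(B(1:end-1) .* B(2:end) < 0);   % indices 4, 7 and 13

turning = a(Result);                 % peaks and troughs: 7, 0.01, 18
starts  = [a(1) a(Result + 1)];      % first element plus trend starts: 2, 2, 6, 9
```

Note that, like the sign-based solution, this product test does not flag a plateau (a zero difference) as a sign change, so consecutive duplicates would need to be removed first.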