Random selection of matrix columns - matlab

I have an m x n matrix and I want to use it in some neural networks applications in MATLAB.
For example,
A = [ 24 22 35 40 30 ; 32 42 47 45 39 ; 14 1 10 5 9 ; 2 8 4 1 8] ;
I want to randomly train some columns and test the other remaining columns.
So, the first matrix will contain three random, distinct columns taken from the original matrix A, while the second matrix contains the remaining two columns.
How can I extract these matrices ?

This will do:
s = randperm(5);
train = A(:, s(1:3));
test = A(:, s(4:end));

Neural Network Toolbox comes with a set of functions that do this for you, such as dividerand and divideblock.

Related

Weighted Random number?

How can I randomize and generate numbers from 0-50 in matrix of 5x5 with SUM or each row printed on the right side?
+
is there any way to give weight to individual numbers before generating the numbers?
Please help
Thanks!
To generate a random matrix of integers between 0 and 50 (sampled with replacement) you could use
M = randint(5,5,[0,50])
To print the matrix with the sum of each row execute the following command
[M sum(M,2)]
To use a different distribution there are a number of techniques but one of the easiest is to use the datasample function from the Statistics and Machine Learning toolbox.
% sample from a truncated Normal distribution. No need to normalize
x = 0:50;
weights = exp(-0.5*(x-25).^2 / 5^2);
M = reshape(datasample(x,25,'Weights',weights),[5,5])
Edit:
Based on your comment you want to perform random sampling without replacement. You can perform such a random sampling without replacement if the weights are non-negative integers by simulating the classic ball-urn experiment.
First create an array containing the appropriate number of each value.
Example: If we have the values 0,1,2,3,4 with the following weights
w(0) = 2
w(1) = 3
w(2) = 5
w(3) = 4
w(4) = 1
Then we would first create the urn array
>> urn = [0 0 1 1 1 2 2 2 2 2 3 3 3 3 4];
then, we would shuffle the urn using randperm
>> urn_shuffled = urn(randperm(numel(urn)))
urn_shuffled =
2 0 4 3 0 3 2 2 3 3 1 2 1 2 1
To pick 5 elements without replacement we would simple select the first 5 elements of urn_shuffled.
Rather than typing out the entire urn array, we can construct it programatically given an array of weights for each value. For example
weight = [2 3 5 4 1];
urn = []
v = 0
for w = weight
urn = [urn repmat(v,1,w)];
v = v + 1;
end
In your case, the urn will contain many elements. Once you shuffle you would select the first 25 elements and reshape them into a matrix.
>> M = reshape(urn_shuffled(1:25),5,5)
To draw random integer uniformly distributed numbers, you can use the randi function:
>> randi(50,[5,5])
ans =
34 48 13 28 13
33 18 26 7 41
9 30 35 8 13
6 12 45 13 47
25 38 48 43 18
Printing the sum of each row can be done by using the sum function with 2 as the dimension argument:
>> sum(ans,2)
ans =
136
125
95
123
172
For weighting the various random numbers, see this question.

MATLAB APPLY CUMSUM IN STEPS

I have data of integers in x = 500 X 612 matrix. I need a new variable xx in a 500 X 612 matrix but I need to apply cumsum along each row (500) across 12 column steps and applying cumsum like this 51 times --> 500 X (12 X 51) matrix. Then I need a for loop to produce 51 plots of the 500 rows and 12 columns of the cumsum time series. thank you!
I will rephrase what the question is asking to benefit those who are reading.
The OP wishes to segment a matrix into chunks by splitting up the matrix into a bunch of columns. A cumsum is applied to each row individually for each column and are then concatenated together to build a final matrix. As such, given this source matrix:
x =
1 2 3 4 5 6 7 8 9 10 11 12
13 14 15 16 17 18 19 20 21 22 23 24
Supposing that we wish to split up the matrix by columns 3, 6 and 9 and 12, we will have four chunks to work with. We do a cumsum on each of these blocks individually and piece the final result together. So the result would like the following:
xx =
1 3 6 4 9 15 7 15 24 10 21 33
13 27 42 16 33 51 19 39 60 22 45 69
First, you need to determine how many columns you want to break up the matrix into. In your case, we wish to segment the matrix into 4 chunks: Columns 1 - 3, columns 4 - 6, columns 7 - 9, and columns 10 - 12. As such, I'm going to reshape this matrix so that each column is an individual row from a chunk in this matrix. We then apply cumsum over this reshaped matrix and we then reshape it back to what you had originally.
Therefore, do this:
num_chunks = 4; %// Columns 3, 6, 9, 12
divide_point = size(x,2) / num_chunks; %// Determine how many elements are in a row for a cumsum
x_reshape = reshape(x.', divide_point, []); %// Get reshaped matrix
xy = cumsum(x_reshape); %// cumsum over all columns individually
xx = reshape(xy, size(x,2), size(x,1)).'; %// Reconstruct matrix
In the third line of code, x_reshape = reshape(x.', divide_point, []); may seem a bit daunting, but it's actually not that bad. I had to transpose the matrix first because you want to take each row of a chunk and place them into individual columns so we can perform a cumsum on each column. When you reshape something in MATLAB, it collects values column-wise and reshapes the input into an output of a specified size. Therefore, to collect the rows, we need to collect row-wise and so we must transpose this matrix. Next, divide_point tells you how many elements we have for a single row in one chunk. As such, we want to construct a matrix that is of size divide_point x N where divide_point tells you how many elements we have in a row of a chunk and N is the total number of rows over all chunks. Because I don't want to calculate how many there are (am rather lazy actually....), the [] syntax is to automatically infer this number so that we can get a reshaped matrix that respects the total number of elements in the original input. We then perform cumsum on each of these columns, and then we need to reshape this back into the original shape of the input. With this, we use reshape again on the cumsum result, but in order to get it back into the row-order that you want, we have to determine the transpose as reshape takes values in column-major order, then re-transpose that result.
We get:
xx =
1 3 6 4 9 15 7 15 24 10 21 33
13 27 42 16 33 51 19 39 60 22 45 69
In general, the total number of elements to sum over for a row needs to be evenly divisible by the total number of columns that your matrix contains. For example, given the above, if you were to try to segment this matrix into 5 chunks, you would certainly get an error as the number of rows to cumsum over is not symmetric.
As another example, let's say we wanted to break up the matrix into 6 chunks. Therefore, by setting num_chunks = 6, we get:
xx =
1 3 3 7 5 11 7 15 9 19 11 23
13 27 15 31 17 35 19 39 21 43 23 47
You can see that cumsum restarts at every second column, as we desired 6 chunks and to get 6 chunks with a matrix of 12 columns, a chunk is created at every second column.

Removing a row in a matrix, by removing an entry from a possibly different row for each column

I have a vector of values which represent an index of a row to be removed in some matrix M (an image). There's only one row value per column in this vector (i.e. if the image is 128 x 500, my vector contains 500 values).
I'm pretty new to MATLAB so I'm unsure if there's a more efficient way of removing a single pixel (row,col value) from a matrix so I've come here to ask that.
I was thinking of making a new matrix with one less row, looping through each column up until I find the row whose value I wish to remove, and "shift" the column up by one and then move onto the next column to do the same.
Is there a better way?
Thanks
Yes, there is a solution which avoids loops and is thus faster to write and to execute. It makes use of linear indexing, and exploits the fact that you can remove a matrix entry by assigning it an empty value ([]):
% Example data matrix:
M = [1 5 9 13 17
2 6 10 14 18
3 7 11 15 19
4 8 12 16 20];
% Example vector of rows to be removed for each column:
vector = [2 3 4 1 3];
[r c] = size(M);
ind = sub2ind([r c],vector,1:c);
M(ind) = [];
M = reshape(M,r-1,c);
This gives the result:
>> M =
1 5 9 14 17
3 6 10 15 18
4 8 11 16 20

Finding index of vector from its original matrix

I have a matrix of 2d lets assume the values of the matrix
a =
17 24 1 8 15
23 5 7 14 16
4 6 13 20 22
10 12 19 21 3
17 24 1 8 15
11 18 25 2 9
This matrix is going to be divided into three different matrices randomly let say
b =
17 24 1 8 15
23 5 7 14 16
c =
4 6 13 20 22
11 18 25 2 9
d =
10 12 19 21 3
17 24 1 8 15
How can i know the index of the vectors in matrix d for example in the original matrix a,note that the values of the matrix can be duplicated.
for example if i want to know the index of {10 12 19 21 3} in matrix a?
or the index of {17 24 1 8 15} in matrix a,but for this one should return only on index value?
I would appreciate it so much if you can help me with this. Thank you in advance
You can use ismember with the 'rows' option. For example:
tf = ismember(a, c, 'rows')
Should produce:
tf =
0
0
1
0
0
1
To get the indices of the rows, you can apply find on the result of ismember (note that it's redundant if you're planning to use this vector for matrix indexing). Here find(tf) return the vector [3; 6].
If you want to know the number of the row in matrix a that matches a single vector, you either use the method explained and apply find, or use the second output parameter of ismember. For example:
[tf, loc] = ismember(a, [10 12 19 21 3], 'rows')
returns loc = 4 for your example. Note that here a is the second parameter, so that the output variable loc would hold a meaningful result.
Handling floating-point numbers
If your data contains floating point numbers, The ismember approach is going to fail because floating-point comparisons are inaccurate. Here's a shorter variant of Amro's solution:
x = reshape(c', size(c, 2), 1, []);
tf = any(all(abs(bsxfun(#minus, a', x)) < eps), 3)';
Essentially this is a one-liner, but I've split it into two commands for clarity:
x is the target rows to be searched, concatenated along the third dimension.
bsxfun subtracts each row in turn from all rows of a, and the magnitude of the result is compared to some small threshold value (e.g eps). If all elements in a row fall below it, mark this row as "1".
It depends on how you build those divided matrices. For example:
a = magic(5);
d = a([2 1 2 3],:);
then the matching rows are obviously: 2 1 2 3
EDIT:
Let me expand on the idea of using ismember shown by #EitanT to handle floating-point comparisons:
tf = any(cell2mat(arrayfun(#(i) all(abs(bsxfun(#minus, a, d(i,:)))<1e-9,2), ...
1:size(d,1), 'UniformOutput',false)), 2)
not pretty but works :) This would be necessary for comparisons such as: 0.1*3 == 0.3
(basically it compares each row of d against all rows of a using an absolute difference)

Matlab: Dividing chunks of data randomly into equal sized sets

I have a large dataset that I need to divide randomly into 5 almost equal sized sets for cross validation. I have happily used _crossvalind_ to divide into sets before, however this time I need to divide chunks of data into these groups at a time.
Let's say my data looks like this:
data = [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18];
Then I want to divide them randomly into 5 groups in chunks of 2, e.g. like this
g1 = [3 4], [11 12]
g2 = [9 10]
g3 = [1 2], [15 16]
g4 = [7 8], [17 18]
g5 = [5 6], [13 14]
I think I can do this with some for-loops, but I'm guessing there must be a much more cost-efficient way to do it in matlab :-)
Any suggestions?
I'm interpreting your needs to be random ordering of sets, but within each set, the ordering of elements is unchanged from the parent set. You can use randperm to randomly order the number of sets and use linear indexing for the elements.
dataElements=numel(data);%# get number of elements
totalGroups=5;
groupSize=dataElements/totalGroups;%# I'm assuming here that it's neatly divisible as in your example
randOrder=randperm(totalGroups);%# randomly order of numbers from 1 till totalGroups
g=reshape(data,groupSize,totalGroups)'; %'# SO formatting
g=g(randOrder,:);
The different rows of g give you the different groupings.
You can shuffle the array (randperm) and then divide it into consequentive equal parts.
data = [10 20 30 40 50 60 70 80 90 100 110 120 130 140 150];
permuted = data(randperm(length(data)));
% padding may be required if the length of data is not divisible by the size of chunks
k = 5;
g = reshape(permuted, k, length(data)/k);