Sequential feature selection Matlab

Can somebody explain how to use the sequentialfs function in MATLAB?
It looks straightforward, but I don't know how to design a function handle for it.
Any clue?

Here's a simpler example than the one in the documentation.
First let's create a very simple dataset. We have some class labels y. 500 are from class 0, and 500 are from class 1, and they are randomly ordered.
>> y = [zeros(500,1); ones(500,1)];
>> y = y(randperm(1000));
And we have 100 variables x that we want to use to predict y. 99 of them are just random noise, but one of them is highly correlated with the class label.
>> x = rand(1000,99);
>> x(:,100) = y + rand(1000,1)*0.1;
Now let's say we want to classify the points using linear discriminant analysis. If we were to do this directly without applying any feature selection, we would first split the data up into a training set and a test set:
>> xtrain = x(1:700, :); xtest = x(701:end, :);
>> ytrain = y(1:700); ytest = y(701:end);
Then we would classify them:
>> ypred = classify(xtest, xtrain, ytrain);
And finally we would measure the error rate of the prediction:
>> sum(ytest ~= ypred)
ans =
0
and in this case we get perfect classification.
To make a function handle to be used with sequentialfs, just put these pieces together:
>> f = @(xtrain, ytrain, xtest, ytest) sum(ytest ~= classify(xtest, xtrain, ytrain));
And pass all of them together into sequentialfs:
>> fs = sequentialfs(f,x,y)
fs =
Columns 1 through 16
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Columns 17 through 32
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Columns 33 through 48
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Columns 49 through 64
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Columns 65 through 80
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Columns 81 through 96
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Columns 97 through 100
0 0 0 1
The final 1 in the output indicates that variable 100 is, as expected, the best predictor of y among the variables in x.
The example in the documentation for sequentialfs is a little more complex, mostly because the predicted class labels are strings rather than numerical values as above, so ~strcmp is used to calculate the error rate rather than ~=. In addition, it passes an explicit cross-validation partition to estimate the error rate, whereas the call above does not specify one and relies on the default of sequentialfs, which is 10-fold cross-validation.
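As a rough sketch of the cross-validated variant, the same criterion function can be passed to sequentialfs together with an explicit partition. This assumes the Statistics and Machine Learning Toolbox is available (classify, cvpartition, sequentialfs); the fixed seed and the 5-fold choice are illustrative assumptions, not taken from the documentation example.

```matlab
rng(0);                                   % fixed seed (assumption) so the run is repeatable
y = [zeros(500,1); ones(500,1)];
y = y(randperm(1000));
x = rand(1000,99);                        % 99 noise variables
x(:,100) = y + rand(1000,1)*0.1;          % one informative variable

% Criterion: number of misclassified test points for one train/test split
f = @(xtrain, ytrain, xtest, ytest) sum(ytest ~= classify(xtest, xtrain, ytrain));

c  = cvpartition(y, 'KFold', 5);          % stratified 5-fold partition (illustrative choice)
fs = sequentialfs(f, x, y, 'cv', c);      % logical row vector, true for selected columns
selected = find(fs);                      % indices of the selected variables
```

sequentialfs calls f once per fold with the training and test rows of the candidate columns, so the handle only ever has to deal with one split at a time.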

Related

How to randomly select x number of indices from a matrix in Matlab

I'm trying to generate a randomly scattered but limited number of 1's in a matrix of zeros efficiently.
Say I have a 10x10 matrix of zeros (zeros(10)) and I want to randomly place ten 1's so it looks like:
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 1 0 0 0 0
0 0 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 0
1 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0
0 1 0 0 0 0 0 1 0 0
0 0 0 0 1 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
How can I do this WITHOUT a for-loop and without manually plugging in each position (this example is a much smaller version of my real problem)?
My code so far:
% Generate zeros
M = zeros(10)
% Generate random indices
Rands = [randsample(10, 10) randsample(10, 10)]
Where the first column is intended to be the row indices and the second column the column indices.
Now I obviously can't just drop these indices into the row and column indices of M like this:
M(Rands(:,1), Rands(:,2)) = 1
How can I vectorise the changes to these random indices?

You can use randperm to randomly generate the linear indices to be filled with 1:
sz = [10 10]; % desired size
n = 10; % desired number of ones
M = zeros(sz);
M(randperm(prod(sz), n)) = 1;
Alternatively, you can use randperm and reshape in one line:
M = reshape(randperm(prod(sz))<=n, sz);

You can use sub2ind to convert subscripts to linear index:
M(sub2ind(size(M),Rands(:,1),Rands(:,2)))=1
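To see why the subscript form from the question does not work while the linear-index form does, here is a small sketch. It uses randperm as a stand-in for randsample(10, 10), and the names Mcross, Mpairs, countCross and countPairs are made up for illustration.

```matlab
n = 10;
r = randperm(n)';                 % row subscripts (stand-in for randsample(10,10))
c = randperm(n)';                 % column subscripts

Mcross = zeros(n);
Mcross(r, c) = 1;                 % assigns every combination of r and c

Mpairs = zeros(n);
Mpairs(sub2ind(size(Mpairs), r, c)) = 1;   % assigns only the pairs (r(k), c(k))

countCross = nnz(Mcross);         % n*n entries set
countPairs = nnz(Mpairs);         % n entries set
```

Because r and c are each permutations of 1:10 here, the sub2ind version places exactly one 1 in every row and every column. Drawing n linear indices with randperm(prod(sz), n) does not have that restriction.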

Matlab, why is strel non-symmetrical with respect to the angle?

I was attempting to apply a closing operation to an image using a line structuring element at 8 different directions. Initially I wanted to apply it to angles in the range [0 .. 360], but I later realised that my structuring element is symmetrical, so I thought of using the range [0 .. 180] instead. However, I then found that Matlab's structuring element function (strel) does not produce symmetrical results for angles that are 180 degrees apart. Consider:
>> strel('line', 11, 120)
ans =
Flat STREL object containing 9 neighbors.
Neighborhood:
1 0 0 0 0
1 0 0 0 0
0 1 0 0 0
0 1 0 0 0
0 0 1 0 0
0 0 0 1 0
0 0 0 1 0
0 0 0 0 1
0 0 0 0 1
>>
And:
>> strel('line', 11, 300)
ans =
Flat STREL object containing 9 neighbors.
Neighborhood:
1 0 0 0 0 0 0
0 1 0 0 0 0 0
0 1 0 0 0 0 0
0 0 1 0 0 0 0
0 0 0 1 0 0 0
0 0 0 0 1 0 0
0 0 0 0 0 1 0
0 0 0 0 0 1 0
0 0 0 0 0 0 1
I expect that the 2 structuring elements above should be symmetrical, since 300 = 120 + 180. Why is this not the case for Matlab's strel function?

Potatoes:
This is an interesting observation. If you happen to look into the implementation of the strel object in MATLAB, the lines to observe are these (in the MakeLineStrel sub-function):
theta = theta_d * pi / 180;
x = round((len-1)/2 * cos(theta));
y = -round((len-1)/2 * sin(theta));
The theta_d value is the angle in degrees you've specified, and len is the length of the line you've requested.
The values for x and y define the integer valued "end point" of the line, and the structuring element is constructed such that it is symmetric with respect to the origin.
Due to the rounding operation here, when you specify 120 for theta_d, the pair (x,y) will be (-2, -4), but when you specify 300, (x,y) will be (3, 4). This is the root cause for the discretized line being different due to the way the angles have been represented.
Given this understanding, it would be safest to actually print the neighborhood of the structuring element and ensure that it looks right, before using it in your operation.
Hope this helps.
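The three quoted lines can be run on their own to see where the two angles part ways. This is only a sketch that reuses those lines; the names raw and nhoodSize are mine, and the size formula is inferred from the neighborhoods printed in the question. For len = 11 the unrounded x value is mathematically -2.5 for 120 degrees and +2.5 for 300 degrees, so a tiny floating-point error in cos decides which way round goes. The answer above reports the rounded results as (-2, -4) and (3, 4).

```matlab
len = 11;
angles = [120 300];                       % degrees
theta = angles * pi / 180;

x = round((len-1)/2 * cos(theta));        % horizontal end-point offsets
y = -round((len-1)/2 * sin(theta));       % vertical end-point offsets

raw = (len-1)/2 * cos(theta);             % values before rounding, both close to +/-2.5
nhoodSize = [2*abs(y)+1; 2*abs(x)+1];     % row 1: heights, row 2: widths (inferred)

fprintf('%.17g\n', raw);                  % shows which side of 2.5 each value falls on
```

Printing with 17 significant digits makes the deviation from exactly 2.5 visible, which a default-format display would hide.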

Improve naive gauss elimination, when zero elements are known

I wrote naive gauss elimination without pivoting:
function [x] = NaiveGaussianElimination(A, b)
N = length(b);
x = zeros(N,1);
mulDivOp = 0;
subAddOp = 0;
for column=1:(N-1)
for row = (column+1):N
mul = A(row,column)/A(column,column);
A(row,:) = A(row,:)-mul*A(column,:);
b(row) = b(row)-mul*b(column);
mulDivOp = mulDivOp+N-column+2;
subAddOp = subAddOp +N-column+1;
end
end
for row=N:-1:1
x(row) = b(row);
for i=(row+1):N
x(row) = x(row)-A(row,i)*x(i);
end
x(row) = x(row)/A(row,row);
mulDivOp = mulDivOp + N-row + 1;
subAddOp = subAddOp + N-row;
end
x = x';
mulDivOp
subAddOp
return
end
but I am curious if I can reduce the number of multiplications/divisions and additions/subtractions in case I know which elements of matrix are 0:
For N = 10:
A =
96 118 0 0 0 0 0 0 0 63
154 -31 -258 0 0 0 0 0 0 0
0 -168 257 -216 0 0 0 0 0 0
0 0 202 24 308 0 0 0 0 0
0 0 0 -262 -36 -244 0 0 0 0
0 0 0 0 287 -308 171 0 0 0
0 0 0 0 0 197 229 -258 0 0
0 0 0 0 0 0 -62 -149 186 0
0 0 0 0 0 0 0 -43 255 -198
-147 0 0 0 0 0 0 0 -147 -220
(non-zero values are from randi). In general, non-zero elements are a_{1, N}, a_{N,1} and a_{i,j} when abs(i-j) <= 1.

Probably not. There are nice algorithms for reducing tridiagonal matrices (which these aren't, but they are close) to diagonal matrices. Indeed, this is one way in which the SVD of a matrix is produced, using orthogonal similarity transformations, not Gaussian elimination.
The problem is that when you use Gaussian elimination to remove the nonzero entries in the first column, you will have introduced additional nonzero entries in the other columns. The further you proceed, the more you destroy the structure of the matrix. It may be that Gaussian elimination is simply the wrong approach for the problem you are trying to solve, at least if you are trying to exploit the structure of the matrix.
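The fill-in described above can be checked directly on the matrix from the question. The sketch below performs only the first elimination step (clearing column 1) and records which entries became nonzero. The names before, B and fillIn are illustrative, and the eliminated entries are forced to exact zeros so rounding does not blur the pattern.

```matlab
A = [  96  118    0    0    0    0    0    0    0   63
      154  -31 -258    0    0    0    0    0    0    0
        0 -168  257 -216    0    0    0    0    0    0
        0    0  202   24  308    0    0    0    0    0
        0    0    0 -262  -36 -244    0    0    0    0
        0    0    0    0  287 -308  171    0    0    0
        0    0    0    0    0  197  229 -258    0    0
        0    0    0    0    0    0  -62 -149  186    0
        0    0    0    0    0    0    0  -43  255 -198
     -147    0    0    0    0    0    0    0 -147 -220];

before = (A ~= 0);                         % original sparsity pattern
B = A;
for row = 2:size(B,1)                      % eliminate column 1 only
    mul = B(row,1) / B(1,1);
    B(row,:) = B(row,:) - mul * B(1,:);
    B(row,1) = 0;                          % force an exact zero despite rounding
end
fillIn = (B ~= 0) & ~before;               % entries that became nonzero
[r, c] = find(fillIn);                     % positions of the new nonzeros
```

After this one step the new entries sit at (2,10) and (10,2). How the pattern develops over the remaining steps is worth inspecting the same way before deciding how much structure can be exploited.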

replace non-zero values with random numbers

I have a zero-one matrix in MATLAB as follows:
[0 0 0 1 1 1
0 1 1 0 0 0
1 0 0 0 0 1
1 1 1 0 0 0
0 0 0 1 0 1]
I want to build another matrix that has random values at the positions where the above matrix is 1 and zeros elsewhere. For instance, the desired random matrix could be:
[0 0 0 0.2 0.2 0.1
0 0.6 0.7 0 0 0
0.4 0 0 0 0 0.6
0.7 0.8 0.5 0 0 0
0 0 0 0.3 0 0.4]
I used two nested for loops to find the non-zero values in the first matrix and write random values at those positions in a new matrix.
Is there a MATLAB function that does this automatically, without the two nested for loops?

You can do it as follows:
A = ...
[0 0 0 1 1 1;
0 1 1 0 0 0;
1 0 0 0 0 1;
1 1 1 0 0 0;
0 0 0 1 0 1];
B = rand(size(A));
A(logical(A)) = B(logical(A));
A =
0 0 0 0.1320 0.2348 0.1690
0 0.3377 0.3897 0 0 0
0.9027 0 0 0 0 0.7317
0.9448 0.3692 0.4039 0 0 0
0 0 0 0.0598 0 0.4509
(I just used the basic rand function; adjust it as you need.)

You can slightly improve thewaywewalk's answer by generating only as many random numbers as you need. As a bonus, this approach lets you do everything in one line:
A(logical(A)) = rand(1,nnz(A));

If you're trying to replace the ones in matrix A with random numbers then you don't need any looping at all.
Here's one method.
a = double(rand(5,5)>.5); % Your binary matrix should be type double.
n = sum(a(:)); % Count the 1's.
a(a>0) = rand(1,n); % Replace the ones with rands.

Suppose 'l' is a matrix containing zeros and non-zeros. The same masking idea covers several related cases (nnz counts the selected elements, so it also works when l is a row vector):
Replace all the zeros in the matrix with random numbers:
l(l==0) = randn(1,nnz(l==0));
Replace all the positive values with random numbers:
l(l>0) = randn(1,nnz(l>0));
Replace all the negative values with random numbers:
l(l<0) = randn(1,nnz(l<0));
Replace all the NaN values with random numbers:
l(isnan(l)) = randn(1,nnz(isnan(l)));
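A quick way to convince yourself that mask-based assignment touches only the selected entries is to compare the sparsity pattern before and after. This sketch uses the matrix from the question; mask and R are illustrative names, and the result is written to a separate matrix R rather than overwriting A.

```matlab
A = [0 0 0 1 1 1
     0 1 1 0 0 0
     1 0 0 0 0 1
     1 1 1 0 0 0
     0 0 0 1 0 1];

mask = logical(A);                 % true where A is nonzero
R = zeros(size(A));
R(mask) = rand(nnz(mask), 1);      % one random value per nonzero entry

samePattern = isequal(R ~= 0, mask);   % zeros stay zero, ones become random values
```

rand draws from the open interval (0,1), so none of the replaced entries can land on zero and the pattern check is exact.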

Get the indexes of the boundary cells of a subset of a matrix. Matlab

Given a matrix where 1 is the current subset
test =
0 0 0 0 0 0
0 0 0 0 0 0
0 0 1 1 0 0
0 0 1 1 0 0
0 0 0 0 0 0
0 0 0 0 0 0
Is there a function or quick method to change the subset to the boundary of the current subset?
Eg. Get this subset from 'test' above
test =
0 0 0 0 0 0
0 1 1 1 1 0
0 1 0 0 1 0
0 1 0 0 1 0
0 1 1 1 1 0
0 0 0 0 0 0
In the end I just want to get the minimum of the cells surrounding a subset of a matrix. Sure, I could loop through and get the minimum of the boundary (cell by cell), but there must be a way to do it with the method I've shown above.
Note the subset WILL be connected, but may not be rectangular. This may be the big catch.
This is a possible subset.... (Would pad this with a NaN border)
test =
0 0 0 0 0 0
0 0 0 0 0 0
0 0 1 1 0 0
0 0 1 1 0 0
0 0 1 1 1 1
0 0 1 1 1 1
Ideas?

The basic steps I'd use are:
Perform a dilation on the shape to get a new area which is the shape plus its boundary
Subtract the original shape from the dilated shape to leave just the boundary
Use the boundary to index your data matrix, then take the minimum.
Dilation
What I want to do here is pass a 3x3 window over each cell and take the maximum value in that window:
[m, n] = size(A); % assuming A is your original shape matrix
APadded = zeros(m + 2, n + 2);
APadded(2:end-1, 2:end-1) = A; % pad A with zeroes on each side
ADilated = zeros(m + 2, n + 2); % this will hold the dilated shape.
for i = 1:m
for j = 1:n
mask = false(size(APadded)); % logical mask, so it can be used as an index
mask(i:i+2, j:j+2) = true; % 3x3 window centred on padded cell (i+1, j+1)
ADilated(i + 1, j + 1) = max(APadded(mask));
end
end
Shape subtraction
This is basically a logical AND and a logical NOT to remove the intersection:
ABoundary = ADilated & (~APadded);
At this stage you may want to remove the border we added to do the dilation, since we don't need it any more.
ABoundary = ABoundary(2:end-1, 2:end-1);
Find the minimum data point along the boundary
We can use our logical boundary to index the original data into a vector, then just take the minimum of that vector.
dataMinimum = min(data(ABoundary));
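The three steps above can also be sketched without the double loop by letting conv2 (core MATLAB, no toolbox) do the dilation. This is an alternative under stated assumptions rather than the code from this answer: data is a placeholder matrix, and the names dilated, boundary and boundaryMin are illustrative.

```matlab
A = logical([0 0 0 0 0 0
             0 0 0 0 0 0
             0 0 1 1 0 0
             0 0 1 1 0 0
             0 0 1 1 1 1
             0 0 1 1 1 1]);        % non-rectangular subset from the question
data = magic(6);                   % placeholder data matrix (assumption)

dilated  = conv2(double(A), ones(3), 'same') > 0;  % true within one cell of the subset
boundary = dilated & ~A;                           % neighbours that are not in the subset
boundaryMin = min(data(boundary));                 % minimum over the surrounding cells
```

With the 'same' option conv2 treats everything outside the matrix as zero, so no explicit padding is needed and the boundary is simply cut off where the subset touches the matrix edge.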

You should look at this as a morphology problem, not set theory. This can be solved pretty easily with imdilate() (requires the image package). You basically only need to subtract the image from its dilation with a 3x3 matrix of ones.
octave> test = logical ([0 0 0 0 0 0
0 0 0 0 0 0
0 0 1 1 0 0
0 0 1 1 0 0
0 0 1 1 1 1
0 0 1 1 1 1]);
octave> imdilate (test, true (3)) - test
ans =
0 0 0 0 0 0
0 1 1 1 1 0
0 1 0 0 1 0
0 1 0 0 1 1
0 1 0 0 0 0
0 1 0 0 0 0
It does not, however, pad with NaN. If you really want that, you could pad your original matrix with false, do the operation, and then check whether there are any true values in the border.
Note that you don't have to use logical(), in which case you'll have to use ones() instead of true(). But that takes more memory and has worse performance.
EDIT: since you are trying to do it without using any matlab toolbox, take a look at the source of imdilate() in Octave. For the case of logical matrices (which is your case) it's a simple usage of filter2(), which belongs to matlab core. That said, the following one-liner should work fine and be much faster:
octave> (filter2 (true (3), test) > 0) - test
ans =
0 0 0 0 0 0
0 1 1 1 1 0
0 1 0 0 1 0
0 1 0 0 1 1
0 1 0 0 0 0
0 1 0 0 0 0

One possible solution is to take the subset and add it to the original matrix, but ensure that each time you add it, you offset its position by +1 row, -1 row and +1 column, -1 column. The result will then be expanded by one row and column all around the original subset. You then use the original matrix to mask the original subset to zero.
Like this:
test_new = test + ...
[[test(2:end,2:end);zeros(1,size(test,1)-1)],zeros(size(test,1),1)] + ... %move subset up-left
[[zeros(1,size(test,1)-1);test(1:end-1,2:end)],zeros(size(test,1),1)] + ... %move down-left
[zeros(size(test,1),1),[test(2:end,1:end-1);zeros(1,size(test,1)-1)]] + ... %move subset up-right
[zeros(size(test,1),1),[zeros(1,size(test,1)-1);test(1:end-1,1:end-1)]]; %move subset down-right
test_masked = test_new.*~test; %mask with original matrix
result = test_masked;
result(result>1)=1; % ensure that there is only 1's, not 2, 3, etc.
The result for this on your test matrix is:
result =
0 0 0 0 0 0
0 1 1 1 1 0
0 1 0 0 1 0
0 1 0 0 1 1
0 1 0 0 0 0
0 1 0 0 0 0
Edited - it now grabs the corners as well, by moving the subset up and to the left, up and to the right, down then left and down then right.
I expect this would be a very quick way to achieve this - it doesn't have any loops, nor functions - just matrix operations.
