I have results from an experiment on parameter optimization for support vector regression. The first three columns hold the parameters and the last column holds the MSE, so I have a matrix of about 24,646 x 4. I would like to get the parameters corresponding to the smallest MSE. How can I plot this in MATLAB? The elements of the matrix are floating-point numbers. Thank you for your help.
Use the second output of min to obtain the index of the minimum value in the fourth column, and then use that index to get the parameters:
[~, ind] = min(matrix(:,4));
params = matrix(ind,1:3);
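As a quick sanity check, here is a minimal sketch of the same idea on a tiny made-up matrix. The variable names and values (results, minMSE, bestParams) are assumptions for illustration only, not the asker's data:

```matlab
% Hypothetical results: columns 1-3 are parameters, column 4 is the MSE
results = [1.0 0.1 0.5 0.42;
           2.0 0.2 0.5 0.17;
           4.0 0.1 1.0 0.29];

% Second output of min is the row index of the smallest MSE
[minMSE, ind] = min(results(:,4));

% Parameters stored on that row
bestParams = results(ind,1:3);
```

If several rows share the same smallest MSE, min returns the index of the first one.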
I'm trying to use the mnrfit function but I get the error
If Y is a column vector, it must contain positive integer category numbers.
My data is in a double and my Y values are floats, e.g. 0.6667. Is there a way that I can adjust my data to be able to use the mnrfit function?
Thanks in advance!
An unexperienced beginner
Y should be a "nominal outcome", i.e. non-continuous, to use mnrfit. We don't need to turn Y into integers, just categoricals. A categorical array is discrete as far as MATLAB is concerned, regardless of whether the categories are represented by double values.
X = rand(5,3); % Predictors (should be double or single)
Y = rand(5,1); % Response (doubles, will cause error)
B = mnrfit( X, Y )
% ERROR: If Y is a column vector, it must contain positive integer category numbers.
B = mnrfit( X, categorical(Y) )
% No error, regression matrix B is output successfully.
Be careful, if you're expecting a continuous response variable (hence why Y is a vector of doubles) then mnrfit may not be suitable in the first place!
Note that the valid data types are specified in the docs:
Y can be one of the following:
An n-by-k matrix, where Y(i,j) is the number of outcomes of the multinomial category j for the predictor combinations given by X(i,:). In this case, the number of observations are made at each predictor combination.
An n-by-1 column vector of scalar integers from 1 to k indicating the value of the response for each observation. In this case, all sample sizes are 1.
An n-by-1 categorical array indicating the nominal or ordinal value of the response for each observation. In this case, all sample sizes are 1.
I have a 36x256x2232 3D array in MATLAB created by M = ones(36,256,2232), and I want to reduce its size by summing every 3 consecutive rows. The resulting array should be 12x256x2232 and each element should have the value 3.
I tried using the reshape and sum functions, but I get a 1x256x2232 matrix.
How can I do this without using a for loop?
This should do it:
M = ones(36,256,2232);
reduced = reshape(sum(reshape(M, 3,[], 256,2232), 1),[], 256, 2232);
The first reshape makes a 4D array whose first dimension holds each group of 3 rows
sum reduces along that first dimension
The second reshape transforms the result back to 3D
You can also use squeeze, which removes singleton dimensions:
reduced = squeeze(sum(reshape(M, 3,[], 256,2232), 1));
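Because an array of ones cannot show whether the right rows are being grouped, here is a small sketch that runs the same reshape/sum/reshape pattern on distinct values and compares it with explicit sums. The array size and the names k, R and expected are assumptions chosen for illustration:

```matlab
% Small stand-in for the 36x256x2232 array, with distinct entries
M = reshape(1:24, 6, 2, 2);
k = 3;                                   % number of consecutive rows to sum

% Same pattern as above: split rows into groups of k, sum, restore 3D shape
R = reshape(sum(reshape(M, k, [], size(M,2), size(M,3)), 1), ...
            [], size(M,2), size(M,3));

% Reference result built from explicit row-block sums
expected = cat(1, sum(M(1:3,:,:), 1), sum(M(4:6,:,:), 1));

sameResult = isequal(R, expected);
```

This relies on MATLAB's column-major storage: splitting the first dimension into k-by-(rows/k) keeps consecutive rows together in each group.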
You can use the new-ish splitapply function (which is similar to accumarray but can handle data with multiple dimensions). This approach works even if the number of rows is not a multiple of the group size:
M = ones(4,5,2); % example data
n = 3; % group size
result = splitapply(@(x)sum(x,1), M, floor((0:size(M,1)-1).'/n)+1);
I want to reduce the dimension of my data to ndim dimensions in MATLAB. I am using pcares to reduce the dimension, but the results (i.e. residuals, reconstructed) have the same dimensions as the data and not ndim. How can I project the data onto ndim dimensions only?
[residuals,reconstructed] = pcares(X,ndim)
Sample code
MU = [0 0];
SIGMA = [4/3 2/3; 2/3 4/3];
X = mvnrnd(MU,SIGMA,1000);
[residuals,reconstructed] = pcares(X,1)
Now I expect the residuals to have 1 dimension, i.e. the data X projected onto the first principal component, since I specified pcares(X,1). But here both residuals and reconstructed have the same dimension of 2.
pcares is doing its job. If you read the documentation, you call the function this way:
[RESIDUALS,RECONSTRUCTED] = pcares(X,NDIM);
RESIDUALS returns the residuals for each data point by retaining the first NDIM dimensions of your data and RECONSTRUCTED is the reconstructed data using the first NDIM principal components.
If you want the actual projection vectors, you need to use pca instead. You'd call it this way:
[coeff,score] = pca(X);
In fact, this is what pcares does under the hood but it also reconstructs the data for you using the above outputs. coeff returns the principal coefficients for your data while score returns the actual projection vectors themselves. score is such that each column is a single projection vector. It should be noted that these are ordered with respect to dominance as you'd expect with PCA... and so the first column is the most dominant direction, second column second dominant direction, etc.
Once you call the above, you simply index into coeff and score to retain whatever components you want. In your case, you just want the first component, and so do this:
c = coeff(:,1);
s = score(:,1);
If you want to reconstruct the data given your projection vectors, referring to the second last line of code, it's simply:
[coeff,score] = pca(X);
n = size(X,1);
ndim = 1; %// For your case
reconstructed = repmat(mean(X,1),n,1) + score(:,1:ndim)*coeff(:,1:ndim)';
The above is basically what pcares does under the hood.
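For readers without the Statistics Toolbox, the same projection and reconstruction can be sketched with base functions only. This swaps in svd on the centered data in place of pca, so the columns of V below should match coeff only up to sign; the data set and variable names are assumptions for illustration:

```matlab
% Small illustrative data set (assumed), one observation per row
X  = [2 1; 3 4; 5 3; 6 7; 8 6];
n  = size(X,1);
mu = mean(X,1);
Xc = X - repmat(mu, n, 1);               % center each column

% Right singular vectors of the centered data act as principal directions
[~, ~, V] = svd(Xc, 'econ');

ndim = 1;
s = Xc * V(:,1:ndim);                    % n-by-ndim projected data
reconstructed = repmat(mu, n, 1) + s * V(:,1:ndim)';
residuals = X - reconstructed;           % same size as X
```

Here s is the reduced, n-by-ndim representation, while reconstructed and residuals live in the original space and so keep the size of X, which is the behaviour the question observed with pcares.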
I have a matrix A in Matlab of dimension mxn. I want to construct a vector B of dimension mx1 such that B(i)=1 if all elements of A(i,:) are equal and 0 otherwise. Any suggestion? E.g.
A=[1 2 3; 9 9 9; 2 2 2; 1 1 4]
B=[0;1;1;0]
One way with diff -
B = all(diff(A,[],2)==0,2)
Or with bsxfun -
B = all(bsxfun(@eq,A,A(:,1)),2)
Here's another example that's a bit more obfuscated, but also does the job:
B = sum(histc(A,unique(A),2) ~= 0, 2) == 1;
So how does this work? histc counts the frequency or occurrence of numbers in a dataset. What's cool about histc is that we can compute the frequency along a dimension independently, so what we can do is calculate the frequency of values along each row of the matrix A separately. The first parameter to histc is the matrix you want to compute the frequency of values of. The second parameter denotes the edges, or which values you are looking at in your matrix that you want to compute the frequencies of. We can specify all possible values by using unique on the entire matrix. The next parameter is the dimension we want to operate on, and I want to work along all of the columns so 2 is specified.
The result from histc will give us a M x N matrix where M is the total number of rows in our matrix A and N is the total number of unique values in A. Next, if a row contains all equal values, there should be only one value in this row where all of the values were binned at this location where the rest of the values are zero. As such, we determine which values in this matrix are non-zero and store this into a result matrix, then sum along the columns of the result matrix and see if each row has a sum of 1. If it does, then this row of A qualifies as having all of the same values.
Certainly not as efficient as Divakar's diff and bsxfun method, but an alternative since he took the two methods I would have used :P
Some more alternatives:
B = var(A,[],2)==0;
B = max(A,[],2)==min(A,[],2)
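A short sketch comparing three of the approaches above on the example matrix from the question; the names B1, B2 and B3 are only for illustration. The var variant is left out here because, for non-integer data, rounding in the mean may make the variance of an all-equal row slightly nonzero:

```matlab
A = [1 2 3; 9 9 9; 2 2 2; 1 1 4];

B1 = all(diff(A,[],2) == 0, 2);          % neighbouring differences are all zero
B2 = all(bsxfun(@eq, A, A(:,1)), 2);     % every element equals the first in its row
B3 = max(A,[],2) == min(A,[],2);         % row maximum equals row minimum

allAgree = isequal(B1, B2, B3);
```

All three return a logical column vector; wrap the result in double(...) if a numeric 0/1 vector is needed.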
I have two matrices A & B of size 250x4. The first column in each matrix has the X values and the second column has the Y values. I want to calculate the Euclidean distance between the X & Y of each row in the two matrices and save the result in a new vector C of size 250x1. For example, if the first row in A is A1x, A1y, A1n, A1m and the first row in B is B1x, B1y, B1n, B1m, I want to get the Euclidean distance [(A1x-B1x)^2 + (A1y-B1y)^2]^0.5 and save it in C1, and do the same for the rest of the 250 rows. Could anyone please advise how to do this in MATLAB?
Like this:
%// First extract only the x-y data from A and B
Axy = A(:,1:2);
Bxy = B(:,1:2);
%// Find all euclidean distances (row-wise)
C1 = sqrt(sum((Axy-Bxy).^2,2));
plus it handles higher dimension too
Use pdist2:
C1=diag(pdist2(A(:,1:2),B(:,1:2)));
Actually, pdist2 will give you a 250x250 matrix, because it calculates the distances between all pairs of rows. You need only the main diagonal, so calling diag on the result (as in the code above) will produce the wanted result.
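To illustrate why only the diagonal matters, here is a small sketch that builds the full pairwise distance matrix with base functions (standing in for the pdist2 output, which needs the Statistics Toolbox) and compares its diagonal with the direct row-wise computation. The two-row matrices and the names D, Cdiag and Crow are assumptions for illustration:

```matlab
% Two-row stand-ins for the 250x4 matrices; columns 3-4 are ignored
A = [0 0 9 9; 1 1 9 9];
B = [3 4 0 0; 4 5 0 0];
Axy = A(:,1:2);
Bxy = B(:,1:2);

% D(i,j) is the distance between row i of A and row j of B
dx = bsxfun(@minus, Axy(:,1), Bxy(:,1)');
dy = bsxfun(@minus, Axy(:,2), Bxy(:,2)');
D  = sqrt(dx.^2 + dy.^2);

Cdiag = diag(D);                          % matching rows only
Crow  = sqrt(sum((Axy - Bxy).^2, 2));     % direct row-wise distances
```

Both give the same vector, but the pairwise route computes n^2 distances to keep n of them, so the row-wise formula is likely the cheaper choice when only matching rows are needed.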