transform categorical predictors to numerical variable matlab

I am new to MATLAB.
I have a categorical input predictor (X) and a set of past results (Y, binary).
I would like to convert X to a numeric variable using the following method:
for each category, calculate the average of Y and replace the category value with that average.
For example:
X Y X'
1 1 1
2 0 0
3 1 0.5
1 1 1
2 0 0
3 0 0.5
Please help.

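You are looking for accumarray with the mean function, using Y as vals and X as subs (X must be a column vector of positive integers):
Xprime = accumarray( X, Y, [], @mean ); % one mean of Y per category
Xprime = Xprime( X ); % expand back to one value per observation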
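As a quick check, here is a minimal sketch that applies this to the example data from the question. The call to unique is an addition, on the assumption that category labels may not already be consecutive positive integers; the names idx and catMean are illustrative only.

```matlab
% Example data from the question (column vectors)
X = [1; 2; 3; 1; 2; 3];
Y = [1; 0; 1; 1; 0; 0];

% Map arbitrary category labels to indices 1..K (assumption: labels may
% not be consecutive positive integers, so accumarray cannot use X directly)
[~, ~, idx] = unique(X);

% Mean of Y within each category, then expand back to one value per row
catMean = accumarray(idx, Y, [], @mean);
Xprime  = catMean(idx);
```

For this data Xprime matches the X' column in the question's table.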

Related

How to reduce coefficients to their lowest possible integers using Matlab - Balancing Chemical Equations

I am attempting to develop a Matlab program to balance chemical equations. I am able to balance them via solving a system of linear equations. Currently my output is a column vector with the coefficients.
My problem is that I need to return the smallest integer values of these coefficients. For example, if [10, 20, 30] was returned, I want [1, 2, 3] to be returned instead.
What is the best way to accomplish this?
I want this program to be fully autonomous once it is fed a matrix with the linear system. Thus I can not play around with the values, I need to automate this from the code. Thanks!
% Chemical Equation in Matrix Form
Chem = [1 0 0 -1 0 0 0; 1 0 1 0 0 -3 0; 0 2 0 0 -1 0 0; 0 10 0 0 0 -1 0; 0 35 4 -4 0 12 1; 0 0 2 -1 -3 0 2]
%set x4 = 1 then Chem(:, 4) = b and
b = Chem(:, 4); % Arbitrarily set x4 = 1 and set its column equal to b
Chem(:,4) = [] % Delete the x4 column from Chem and shift over
g = 1; % Initialize variable for LCM
x = Chem\b % This is equivalent to the reduced row echelon form of
% Chem | b
% Below is my sad attempt at factoring the values, I divide by the smallest decimal to raise all the values to numbers greater than or equal to 1
for n = 1:numel(x)
g = x(n)*g
M = -min(abs(x))
y = x./M
end
I want code that will take some vector with coefficients, and return an equivalent coefficient vector with the lowest possible integer coefficients. Thanks!

I was able to find a solution without using integer programming. I converted the non-integer values to rational expressions, and used a built-in matlab function to extract the denominator of each of these expressions. I then used a built in matlab function to find the least common multiples of these values. Finally, I multiplied the least common multiple by the matrix to find my answer coefficients.
% Chemical Equation in Matrix Form
clear, clc
% Enter chemical equation as a linear system in matrix form as Chem
Chem = [1 0 0 -1 0 0 0; 1 0 1 0 0 -3 0; 0 2 0 0 -1 0 0; 0 10 0 0 0 -1 0; 0 35 4 -4 0 -12 -1; 0 0 2 -1 -3 0 -2];
% row reduce the system
C = rref(Chem);
% parametrize the system by setting the last variable xend (e.g. x7) = 1
x = [C(:,end);1];
% extract numerator and denominator from the rational expressions of these
% values
[N,D] = rat(x);
% take the least common multiple of the first pair, set this to the
% variable least
least = lcm(D(1),D(2));
% loop through taking the lcm of the previous values with the next value
% through x
for n = 3:numel(x)
least = lcm(least,D(n));
end
% give answer as column vector with the coefficients (now factored to their
% lowest possible integers)
coeff = abs(least.*x)
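The code above scales by the least common multiple of the denominators, which is enough when one entry is fixed to 1. For a vector that is already integer but not reduced, such as the [10, 20, 30] case in the question, a further division by the greatest common divisor is needed. A minimal sketch of that combined idea follows; the variable names are illustrative, and it assumes the entries are well approximated by rat and are not all zero.

```matlab
% Coefficient vector to reduce (example values from the question)
x = [10; 20; 30];

% Denominators of the rational approximation of each entry
[~, D] = rat(x);

% Least common multiple of all denominators
L = 1;
for k = 1:numel(D)
    L = lcm(L, D(k));
end

% Scale to integers, then divide out the greatest common divisor
ints = round(x * L);
g = 0;                      % gcd(0, a) returns abs(a), so 0 is a neutral start
for k = 1:numel(ints)
    g = gcd(g, ints(k));
end
coeff = abs(ints / g);
```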

Applying median filter to data with 2 axes

I have the following code:
x = VarName3;
y = VarName4;
x = (x/6000)/60;
plot(x, y)
Where VarName3 and VarName4 are 3000x1. I would like to apply a median filter to this in MATLAB. However, the problem I am having is that, if I use medfilt1, then I can only enter a single array of variables as the first argument. And for medfilt2, I can only enter a matrix as the first argument. But the data looks very obscured if I convert x and y into a matrix.
The x is time and y is a list of integers. I'd like to be able to filter out spikes and dips. How do I go about doing this? I was thinking of just eliminating the erroneous data points by direct manipulation of the data file. But then, I don't really get the effect of a median filter.

I found a solution using sort.
The median is the center element, so you can sort three elements and take the middle one as the median.
The sort function also returns, as a second output, the original index of each sorted element.
I used this index information to restore the matching value of X.
Here is my code sample:
%X - simulates time.
X = [1 2 3 4 5 6 7 8 9 10];
%Y - simulates data
Y = [0 1 2 0 100 1 1 1 2 3];
%Create three vectors:
Y0 = [0, Y(1:end-1)]; %Left elements [0 0 1 2 0 100 1 1 1 2]
Y1 = Y; %Center elements [0 1 2 0 100 1 1 1 2 3]
Y2 = [Y(2:end), 0]; %Right elements [1 2 0 100 1 1 1 2 3 0]
%Concatenate Y0, Y1 and Y2.
YYY = [Y0; Y1; Y2];
%Sort YYY:
%sortedYYY(2, :) equals medfilt1(Y)
%I(2, :) equals the index: value 1 for Y0, 2 for Y1 and 3 for Y2.
[sortedYYY, I] = sort(YYY);
%Median is the center of sorted 3 elements.
medY = sortedYYY(2, :);
%Corrected X index of medY
medX = X + I(2, :) - 2;
%Protect X from exceeding original boundaries.
medX = min(max(medX, min(X)), max(X));
Result:
medX =
1 2 2 3 6 7 7 8 9 9
>> medY
medY =
0 1 1 2 1 1 1 1 2 2

Use a sliding window on the data vector centred at a given time. The value of your filtered output at that time is the median value of the data in the sliding window. The size of the sliding window is an odd value, not necessarily fixed to 3.
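A minimal sketch of the sliding-window idea, assuming a MATLAB release that provides movmedian (medfilt1 from the Signal Processing Toolbox could be used in a similar way). The data here is a stand-in, not the asker's VarName3 and VarName4. Only y is filtered; the time vector x is left unchanged.

```matlab
% Stand-in data: time in x, samples in y, with one spike at index 5
x = 1:10;
y = [0 1 2 0 100 1 1 1 2 3];

% Odd window length; 3 here, larger odd values remove wider spikes
w = 3;

% Sliding-window median of y only; x is not modified
yFiltered = movmedian(y, w);
```

The filtered signal can then be plotted against the original time axis with plot(x, yFiltered).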

Matlab calculate 3D similarity transformation. fitgeotrans for 3D

How can I calculate a similarity transformation between 4 points in 3D in MATLAB?
I can calculate the transform matrix from
T*X = Xp,
but this gives me an affine matrix because of small errors in the point coordinates. How can I fit that matrix to a similarity transformation? I need something like fitgeotrans, but in 3D.
Thanks

If I am interpreting your question correctly, you seek to find all coefficients in a 3D transformation matrix that will best warp one point to another. All you really have to do is put this problem into a linear system and solve. Recall that warping one point to another in 3D is simply:
A*s = t
s = (x,y,z) is the source point, t = (x',y',z') is the target point and A would be the 3 x 3 transformation matrix that is formatted such that:
A = [a00 a01 a02]
    [a10 a11 a12]
    [a20 a21 a22]
Writing out the actual system of equations of A*s = t, we get:
a00*x + a01*y + a02*z = x'
a10*x + a11*y + a12*z = y'
a20*x + a21*y + a22*z = z'
The coefficients in A are what we need to solve for. Re-writing this in matrix form, we get:
[x y z 0 0 0 0 0 0]   [a00]   [x']
[0 0 0 x y z 0 0 0] * [a01] = [y']
[0 0 0 0 0 0 x y z]   [a02]   [z']
                      [a10]
                      [a11]
                      [a12]
                      [a20]
                      [a21]
                      [a22]
Given that you have four points, you would simply concatenate rows of the matrix on the left side and the vector on the right
[x1 y1 z1  0  0  0  0  0  0]   [a00]   [x1']
[ 0  0  0 x1 y1 z1  0  0  0]   [a01]   [y1']
[ 0  0  0  0  0  0 x1 y1 z1]   [a02]   [z1']
[x2 y2 z2  0  0  0  0  0  0]   [a10]   [x2']
[ 0  0  0 x2 y2 z2  0  0  0]   [a11]   [y2']
[ 0  0  0  0  0  0 x2 y2 z2]   [a12]   [z2']
[x3 y3 z3  0  0  0  0  0  0] * [a20] = [x3']
[ 0  0  0 x3 y3 z3  0  0  0]   [a21]   [y3']
[ 0  0  0  0  0  0 x3 y3 z3]   [a22]   [z3']
[x4 y4 z4  0  0  0  0  0  0]           [x4']
[ 0  0  0 x4 y4 z4  0  0  0]           [y4']
[ 0  0  0  0  0  0 x4 y4 z4]           [z4']
S * a = T
S would now be a matrix that contains your four source points in the format shown above, a is now a vector of the transformation coefficients in the matrix you want to solve (ordered in row-major format), and T would be a vector of target points in the format shown above.
To solve for the parameters, you simply have to use the mldivide operator or \ in MATLAB, which will compute the least squares estimate for you. Therefore:
a = S^{+} * T
Here S^{+} denotes the pseudo-inverse of S, since S is 12 x 9 and has no ordinary inverse.
As such, simply build your matrix like above, then use the \ operator to solve for your transformation parameters. When you're done, reshape a into a 3 x 3 matrix and transpose it, because a is ordered row-major while reshape fills column-major. Therefore:
S = ... ; %// Enter in your source points here like above
T = ... ; %// Enter in your target points in a right hand side vector like above
a = S \ T;
similarity_matrix = reshape(a, 3, 3).';
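To make the construction of S and T concrete, here is a minimal sketch. The points in P and the matrix Atrue are hypothetical values chosen only to generate test data, and the use of kron to build each 3 x 9 block is one possible way of producing the row pattern shown above. Like the approach above, it solves for a general 3 x 3 matrix with no translation term.

```matlab
% Hypothetical source points, one per row (4 points in 3D)
P = [2 -2 2; 2 3 0; -4 -2 5; -3 4 1];

% Hypothetical ground-truth matrix, used only to generate target points
Atrue = [0 -2 0; 2 0 0; 0 0 2];
Q = (Atrue * P.').';          % target points, one per row

% Build S: each point contributes the 3 x 9 block kron(eye(3), [x y z])
nPts = size(P, 1);
S = zeros(3 * nPts, 9);
for i = 1:nPts
    S(3*i-2:3*i, :) = kron(eye(3), P(i, :));
end

% Build T: stacked target coordinates [x1'; y1'; z1'; x2'; ...]
T = reshape(Q.', [], 1);

% Least-squares solve, then undo the row-major ordering of a
a = S \ T;
A = reshape(a, 3, 3).';
```

With noise-free points such as these, A reproduces Atrue up to rounding error.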
With regards to your error in small perturbations of each of the co-ordinates, the more points you have the better. Using 4 will certainly give you a solution, but it isn't enough to mitigate any errors in my opinion.
Minor Note: This (more or less) is what fitgeotrans does under the hood. It computes the best homography given a bunch of source and target points, and determines this using least squares.
Hope this answered your question!

The answer by @rayryeng is correct, given that you have a set of up to 3 points in a 3-dimensional space. If you need to transform m points in n-dimensional space (m>n), then you first need to add m-n coordinates to these m points such that they exist in m-dimensional space (i.e. the a matrix in @rayryeng's answer becomes a square matrix)... Then the procedure described by @rayryeng will give you the exact transformation of points, you then just need to select only the coordinates of the transformed points in the original n-dimensional space.
As an example, say you want to transform the points:
(2 -2 2) -> (-3 5 -4)
(2 3 0) -> (3 4 4)
(-4 -2 5) -> (-4 -1 -2)
(-3 4 1) -> (4 0 5)
(5 -4 0) -> (-3 -2 -3)
Notice that you have m=5 points which are n=3-dimensional. So you need to add coordinates to these points such that they are n=m=5-dimensional, and then apply the procedure described by @rayryeng.
I have implemented a function that does that (find it below). You just need to organize the points such that each of the source-points is a column in a matrix u, and each of the target points is a column in a matrix v. The matrices u and v are going to be, thus, 3 by 5 each.
WARNING:
the matrix A in the function may require A LOT of memory for moderately many points nP, because it has nP^4 elements.
To overcome this, for square matrices u and v, you can simply use T=v*inv(u) or T=v/u in MATLAB notation.
The code may run very slowly...
In MATLAB:
u = [2 2 -4 -3 5;-2 3 -2 4 -4;2 0 5 1 0]; % setting the set of source points
v = [-3 3 -4 4 -3;5 4 -1 0 -2;-4 4 -2 5 -3]; % setting the set of target points
T = findLinearTransformation(u,v); % calculating the transformation
You can verify that T is correct by:
I = eye(5);
uu = [u;I((3+1):5,1:5)]; % filling-up the matrix of source points so that you have 5-d points
w = T*uu; % calculating target points
w = w(1:3,1:5); % recovering the 3-d points
w - v % w should match v ... notice that the error between w and v is really small
The function that calculates the transformation matrix:
function [T,A] = findLinearTransformation(u,v)
% finds a matrix T (nP X nP) such that T * u(:,i) = v(:,i)
% u(:,i) and v(:,i) are n-dim col vectors; the amount of col vectors in u and v must match (and are equal to nP)
%
    if any(size(u) ~= size(v))
        error('findLinearTransform:u','u and v must be the same shape and size n-dim vectors');
    end
    [n,nP] = size(u); % n -> dimensionality; nP -> number of points to be transformed
    if nP > n % if the number of points to be transform exceeds the dimensionality of points
        I = eye(nP);
        u = [u;I((n+1):nP,1:nP)]; % then fill up the points to be transformed with the identity matrix
        v = [v;I((n+1):nP,1:nP)]; % as well as the transformed points
        [n,nP] = size(u);
    end
    A = zeros(nP*n,n*n);
    for k = 1:nP
        for i = ((k-1)*n+1):(k*n)
            A(i,mod((((i-1)*n+1):(i*n))-1,n*n) + 1) = u(:,k)';
        end
    end
    v = v(:);
    T = reshape(A\v, n, n).';
end

How to display coordinates within a matrix?

I would like to display the coordinates of a matrix element in terms of X and Y. For example,
if matrix = [ 0 0 5 0; 0 0 1 0; 0 0 0 1; 0 0 0 0]
say I want the coordinates of 5. How can I write code that says 5 is at 1 x and 3 y?
I don't want to display the element in the matrix, just the coordinates of that element.

Use find:
[y x] = find( matrix ~= 0 ); % gives you the x y coordinates of all non-zero elements
Note the order of y and x: MATLAB indexes in row-column order, so find returns the row (y) first and the column (x) second.
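To look up one specific value rather than every non-zero element, the same call can be given an equality test. A minimal sketch using the matrix from the question follows; the names M, row and col are illustrative.

```matlab
% Matrix from the question
M = [0 0 5 0; 0 0 1 0; 0 0 0 1; 0 0 0 0];

% find returns the row index first, then the column index
[row, col] = find(M == 5);

% Print only the position, not the element itself
fprintf('5 is at row %d, column %d\n', row, col);
```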

Matlab/Simulink: Convert Data Table (Measured) Into Lookup Table

I have simulated a magnetic system and I have 2 input variables and 1 output variable. The result looks like this:
myData = [...
0 0 1.1;...
0 1 1.2;...
0 2 1.2;...
1 0.1 2.1;...
1 0.9 2.2;...
1 2.05 2.2;...
3 0.1 3.1;...
3 1.2 3.2;...
3 1.9 3.2;...
];
Column 1 and 2 are the input values. Column 3 is the output variable:
x = myData(:,1);
y = myData(:,2);
z = myData(:,3);
I want to create a 2D lookup table in Simulink with x and y as inputs and z as an output. I do not get how to do this. It would be easy if the 2nd input variable would be evenly spaced like here:
x = [0 1 2];
y = [0 1 2];
z = [0 0 0; 1 2 3; 4 4 8]
In the Simulink lookup table block you would put:
In a nutshell:
How do I treat my data to be able to use a lookup table in Simulink?

The MATLAB/Simulink command you are looking for is set_param.
The MATLAB command you are looking for is mat2str.

You can set evenly spaced breakpoints:
x = [0 1 2];
y = [0 1 2];
Then interpolate z corresponding to x and y, using the raw data available. This will form an evenly spaced LUT.
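One possible way to do that interpolation is sketched below with scatteredInterpolant. The choice of linear interpolation with nearest-neighbour extrapolation is an assumption (some grid points fall just outside the measured region), and the names xb, yb, Z and tableString are illustrative.

```matlab
% Measured data from the question: columns are x, y, z
myData = [0 0 1.1; 0 1 1.2; 0 2 1.2; ...
          1 0.1 2.1; 1 0.9 2.2; 1 2.05 2.2; ...
          3 0.1 3.1; 3 1.2 3.2; 3 1.9 3.2];
x = myData(:,1);
y = myData(:,2);
z = myData(:,3);

% Evenly spaced breakpoints for the lookup table
xb = [0 1 2];
yb = [0 1 2];

% Linear interpolation inside the data, nearest neighbour outside it
% (assumption: avoids NaN at grid points just outside the measured region)
F = scatteredInterpolant(x, y, z, 'linear', 'nearest');

% ndgrid makes rows of Z follow xb and columns follow yb
[Xg, Yg] = ndgrid(xb, yb);
Z = F(Xg, Yg);

% String form of the table, e.g. for pasting into a block dialog
tableString = mat2str(Z, 4);
```

Here Z is a 3 x 3 table whose rows correspond to xb and whose columns correspond to yb.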