I would like to know how to compute:
E[Y | X_i, X_j]
in MATLAB.
Let's say
x = unifrnd(0,1,1000,2) and the model is y = x1 + x2.
Suppose that you have scatter plots of x1 vs. Y and x2 vs. Y. From these scatter plots, you compute the regression coefficients (i.e. f1 = a*x1 + b and f2 = c*x2 + d).
Now I want to compute the expected value of Y given x1 and x2.
My question is: how do I do that? Thanks in advance.
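One possible way to approach this, sketched under the assumption that the conditional mean is linear in x1 and x2 (which holds for the example model y = x1 + x2): fit a single regression on both inputs at once and evaluate the fitted plane. The name condMean is hypothetical, and rand is used in place of unifrnd only to avoid a toolbox dependency.

```matlab
n = 1000;
x = rand(n, 2);                 % uniform on [0,1], like unifrnd(0,1,n,2)
y = x(:,1) + x(:,2);            % model from the question

A = [ones(n,1) x];              % design matrix with an intercept column
b = A \ y;                      % least-squares fit: [intercept; coef of x1; coef of x2]

% Estimated E[Y | x1, x2] under the linearity assumption (hypothetical helper)
condMean = @(x1, x2) b(1) + b(2)*x1 + b(3)*x2;

disp(condMean(0.3, 0.6))
```

If the true relationship is not linear, the fitted plane is only an approximation of the conditional expectation.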
Related
I have a matrix whose three columns correspond to x, y and f values. I want to make a contour plot of f(x,y) in the x,y plane from these data with Octave/MATLAB.
Let's say, the matrix M is
x1 y1 f1
x2 y2 f2
x3 y3 f3
. . .
. . .
I found that the function contourf requires f to be a matrix (whereas I have a vector of values at the corresponding points).
How can I generate this plot?
The x, y, and z variables that you pass to contourf are all matrices of the same size. For every point you need an x, y, and z value. You can use meshgrid to make matrices that have all the combinations of x and y values.
This example is from the documentation for contourf. I added some comments to explain what is happening:
% Create a vector of x values
x = linspace(-2*pi,2*pi);
% Create a vector of y values
y = linspace(0,4*pi);
% Make matrices with all combinations of x and y values for plotting
[X,Y] = meshgrid(x,y);
Z = sin(X)+cos(Y);
contourf(X,Y,Z)
[Figure: filled contour plot produced by the code above]
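When the data come as scattered (x, y, f) rows rather than on a grid, one option is to interpolate them onto a meshgrid first. The sketch below assumes M is an N-by-3 matrix as in the question. The sample values in M are made up for illustration.

```matlab
% Hypothetical scattered data: columns are x, y, f
N = 200;
M = zeros(N, 3);
M(:,1) = 4*rand(N,1) - 2;
M(:,2) = 4*rand(N,1) - 2;
M(:,3) = M(:,1) .* exp(-M(:,1).^2 - M(:,2).^2);

% Regular grid covering the range of the data
xg = linspace(min(M(:,1)), max(M(:,1)), 50);
yg = linspace(min(M(:,2)), max(M(:,2)), 50);
[Xg, Yg] = meshgrid(xg, yg);

% Interpolate the scattered f values onto the grid
Fg = griddata(M(:,1), M(:,2), M(:,3), Xg, Yg);

contourf(Xg, Yg, Fg)
```

Grid points outside the convex hull of the data come back as NaN and are simply left blank in the plot.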
To plot the data in 3D for this model, y = a + a1*x1 + a2*x2, I do the following (the figure is shown at http://kr.mathworks.com/help/stats/regress.html); x1, x2, and y correspond to the X, Y, and Z axes, respectively:
scatter3(x1,x2,y,'filled')
hold on
x1fit = min(x1):100:max(x1);
x2fit = min(x2):10:max(x2);
[X1FIT,X2FIT] = meshgrid(x1fit,x2fit);
YFIT = b(1) + b(2)*X1FIT + b(3)*X2FIT + b(4)*X1FIT.*X2FIT;
mesh(X1FIT,X2FIT,YFIT)
xlabel('Weight')
ylabel('Horsepower')
zlabel('MPG')
view(50,10)
My question is: how can I plot the model with three variables in 3D, y = a + a1*x1 + a2*x2 + a3*x3?
I used the code below to get the linear model:
X2 = [ImageSize Resolution PSNR];
lm3 = regress(K_Number, X2);
The coefficients a1, a2, a3 correspond to the columns of X2.
I would create a function that solves the equation (functionSolver) based on three inputs (y, x1, x2).
Define a grid over the region you care about:
y = -100:1:100;
x1 = -50:0.05:25;
x2 = 10:0.5:100;
[outx, outy, outz] = functionSolver(x1, x2, y); % however you defined this
plot3(outx, outy, outz); % plots the output as defined on your grid
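For the linear model in the question, one way the solving step could look is to fix a value of y and solve for x3 over a grid of (x1, x2), which gives one plane per y level. This is only a sketch: the coefficient vector b and the chosen levels are hypothetical, and a3 is assumed to be nonzero.

```matlab
b = [2; 0.5; -1.5; 0.8];          % hypothetical [a; a1; a2; a3], with a3 ~= 0

x1 = linspace(-50, 25, 40);
x2 = linspace(10, 100, 40);
[X1, X2] = meshgrid(x1, x2);

levels = [-100 0 100];            % hypothetical y values to visualize
hold on
for k = 1:numel(levels)
    % Solve y = a + a1*x1 + a2*x2 + a3*x3 for x3 at this y level
    X3 = (levels(k) - b(1) - b(2)*X1 - b(3)*X2) / b(4);
    mesh(X1, X2, X3)
end
hold off
xlabel('x1'), ylabel('x2'), zlabel('x3')
view(3)
```

Each mesh is the set of (x1, x2, x3) points at which the model returns the corresponding y level.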
I have a variable y that depends on some variables x1 ∈ [x1_min,x1_max], x2 ∈ [x2_min,x2_max], x3 ∈ [x3_min,x3_max], and y can be a matrix as well, i.e. y = y(x1,x2,x3). I want to detect which of x1, x2, x3 is least relevant in determining the value of y.
I am using the following code in Matlab:
x = rand(1000,3); % x1, x2, x3 are the columns of x
y = fct(x); % A generic function of x1, x2, x3
[corr_mat, p_val] = corrcoef(x,y);
[i,j] = find(p_val > 0.5);
disp([i,j])
The problem is that the resulting indices strongly depend on the random samples (even if I increase the number of samples). How can I get a more precise measure?
As a simple alternative example, y=x1+x2+x3, with x1∈[50,80], x2∈[0,1], x3∈[0,1]. Clearly, the value of y depends much more on x1 than the other 2 variables. How do I quantify this dependence?
Thank you in advance.
EDIT: Here is what I mean by "quantification" or "relevance". I want to detect which variables cause only very small changes in y, i.e. in the previous example x2 and x3 make y vary less than x1 does.
You need to use the covariance rather than the correlation coefficient. The correlation coefficient is normalized by the standard deviation of each variable to give the same weight to all variables when they have different ranges, and this is exactly what you want to avoid.
x1 = 50+30*rand(1000,1);
x2 = rand(1000,1);
x3 = rand(1000,1);
y = x1+x2+x3;
c=cov([x1 x2 x3 y]);
c(1:3,4) % Covariances of x[1-3] and y
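To turn these covariances into a relative measure, one possibility is to divide them by the variance of y. For a purely additive model such as y = x1 + x2 + x3, the three covariances add up to var(y), so the ratios sum to 1. The name share is hypothetical, and the sketch assumes the inputs are sampled independently.

```matlab
n = 1000;
x1 = 50 + 30*rand(n,1);
x2 = rand(n,1);
x3 = rand(n,1);
y = x1 + x2 + x3;

c = cov([x1 x2 x3 y]);
share = c(1:3,4) / c(4,4);   % fraction of var(y) associated with each input
disp(share')
```

For independent uniform inputs on [a,b] the variance is (b-a)^2/12, i.e. 75 for x1 and about 0.083 for x2 and x3, so the share of x1 is close to 1. The small entries for x2 and x3 fluctuate noticeably between runs, because the sample covariance between independent inputs is only approximately zero.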
I am trying to plot something similar to below:
I am using Matlab. I achieved drawing contour plots. However I could not draw the discriminant. Can anyone show a sample Matlab code or give some idea to draw the discriminant?
If you know the probability density function of each Gaussian at a given point (x,y), say pdf1(x,y) and pdf2(x,y), then you can simply plot the contour line of f(x,y) := pdf1(x,y) > pdf2(x,y). That is, you define f to be 1 iff pdf1(x,y) > pdf2(x,y) and 0 otherwise. This way the only contour is placed along the curve where pdf1(x,y) == pdf2(x,y), which is the decision boundary (discriminant). If you prefer a "nicer" function, you can set f(x,y) = sgn( pdf1(x,y) - pdf2(x,y) ); its contour plot results in exactly the same discriminant.
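A minimal sketch of this idea, drawing the zero level of pdf1 - pdf2 directly instead of thresholding. The means, covariances, and the helper gauss are made-up illustrations, and the subtraction P - mu relies on implicit expansion (MATLAB R2016b or later, or Octave).

```matlab
% Hypothetical parameters of the two bivariate Gaussians
mu1 = [0 0];  S1 = [1 0; 0 1];
mu2 = [2 1];  S2 = [2 0.5; 0.5 1];

% Bivariate normal density evaluated at the rows of P (hypothetical helper)
gauss = @(P, mu, S) exp(-0.5*sum(((P - mu)/S) .* (P - mu), 2)) / (2*pi*sqrt(det(S)));

[X, Y] = meshgrid(linspace(-4, 6, 200), linspace(-4, 5, 200));
P = [X(:) Y(:)];
p1 = reshape(gauss(P, mu1, S1), size(X));
p2 = reshape(gauss(P, mu2, S2), size(X));

contour(X, Y, p1)                      % density contours of the first Gaussian
hold on
contour(X, Y, p2)                      % density contours of the second Gaussian
contour(X, Y, p1 - p2, [0 0], 'k')     % discriminant: where pdf1 == pdf2
hold off
```

Passing [0 0] as the level list asks contour for the single level 0, which is the curve on which the two densities are equal.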
Here is how I would solve this problem analytically: you equate the two discriminant functions
g1(x) = x' W1 x + w1' x + w10
g2(x) = x' W2 x + w2' x + w20
g1(x) = g2(x)
==> x' (W2 - W1) x + (w2 - w1)' x + w20 - w10 = 0
Then I write W2 - W1 as the matrix
W2 - W1 = [a b; c d]
which, by expanding the vector x = [x1 x2]', gives:
a x1^2 + (b+c) x1 x2 + d x2^2 + (w21-w11) x1 + (w22-w12) x2 + w20-w10 = 0
This is the equation of a conic section (an ellipse, parabola, or hyperbola, depending on a, b, c, d). When it is an ellipse with no cross term (b + c = 0), you can simplify it into the form below:
(x1 - a0)^2/h + (x2 - b0)^2/g = r^2
Or you can assume that you know the range of x1, for example x1 = -2:0.1:2, and then solve the resulting quadratic equation in x2 for each value of x1.
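The last option can be sketched as follows: for each x1, treat the boundary equation as a quadratic A*x2^2 + B*x2 + C = 0 and keep the real roots. All numeric values below are hypothetical stand-ins for W2 - W1, w2 - w1, and w20 - w10, and the sketch assumes d is nonzero.

```matlab
dW  = [1 0.2; 0.2 -0.5];   % hypothetical W2 - W1 = [a b; c d]
dw  = [0.5; -1];           % hypothetical w2 - w1
dw0 = -0.3;                % hypothetical w20 - w10

x1 = -2:0.1:2;

% Coefficients of the quadratic in x2 for every x1
A = dW(2,2);
B = (dW(1,2) + dW(2,1))*x1 + dw(2);
C = dW(1,1)*x1.^2 + dw(1)*x1 + dw0;

disc = B.^2 - 4*A*C;
ok = disc >= 0;                            % only these x1 give real solutions
x2a = (-B(ok) + sqrt(disc(ok))) / (2*A);   % first branch of the boundary
x2b = (-B(ok) - sqrt(disc(ok))) / (2*A);   % second branch of the boundary

plot(x1(ok), x2a, 'b.', x1(ok), x2b, 'r.')
xlabel('x1'), ylabel('x2')
```

Values of x1 with a negative discriminant have no boundary point, which is how gaps in the curve show up for ellipses and hyperbolas.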
I am trying to graphically find the intersections between two surfaces and the x-y plane. (Intersection of surface z1 with the x-y plane and intersection z2 with the x-y plane)
I have created arrays representing the surfaces z1 = 3+x+y and z2 = 4-2x-4y, and z3 for the x-y plane, using meshgrid. After looking everywhere, the only command I can find for intersecting arrays is intersect(A,B), where A and B are arrays. When I enter intersect(z1,z3), however, I get the error "A and B must be vectors, or 'rows' must be specified." When I try intersect (z1,z2,'rows'), I get back a 0-by-21 empty matrix. What am I doing wrong here?
My code:
x = -10:10;
y = -10:10;
[X,Y] = meshgrid(x,y);
z1 = 3+X+Y;
z2 = 4-2.*X-4.*Y;
z3 = 0.*X+0.*Y; %x-y plane
surf(X,Y,z1)
hold on
surf(X,Y,z2)
surf(X,Y,z3)
int1 = intersect(z1,z3,'rows');
int2 = intersect(z2,z3,'rows');
It sounds like you want the points where z1 = z2. To find these numerically, you have a couple of options.
1) Numerical root finding: fsolve is capable of solving systems of equations. You can formulate the surfaces as functions of one vector, [x;y], and solve for the vector that makes the two surfaces equal. An example using the initial guess x=1, y=1 follows:
z1 = @(x) 3 + x(1) + x(2);
z2 = @(x) 4 - 2*x(1) - 4*x(2);
f = @(x) z1(x) - z2(x); % zero where the two surfaces meet
x0 = [1;1];
xint = fsolve(f, x0); % not named "intersect", which would shadow the built-in function
2) Minimizing the error: If you are stuck with discrete data (arrays instead of functions), you can simply find the points where z1 - z2 is closest to zero. An easy starting point is to take the arrays Z1 and Z2 and find all points where the difference is nearly zero:
tol = 1e-3;
near_zero = abs(Z1 - Z2) < tol;
near_zero is a logical array that is true wherever the difference between Z1 and Z2 is smaller than tol. You can use it to index into the corresponding meshgrid arrays X and Y to find the coordinates of the intersection.
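The indexing step could look like the sketch below, which rebuilds the grids from the question so that it runs on its own. The names xi, yi, and zi are hypothetical.

```matlab
x = -10:10;
y = -10:10;
[X, Y] = meshgrid(x, y);
Z1 = 3 + X + Y;
Z2 = 4 - 2.*X - 4.*Y;

tol = 1e-3;
near_zero = abs(Z1 - Z2) < tol;

% Logical indexing picks out the grid points where the surfaces meet
xi = X(near_zero);
yi = Y(near_zero);
zi = Z1(near_zero);

plot3(xi, yi, zi, 'ro')
```

On a coarse grid only a few nodes land on the intersection, so a finer grid or a looser tol may be needed to trace the whole line.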
A simple way (no major function calls) to solve this is as follows:
x = -10:.1:10;
y = -10:.1:10;
[X,Y] = meshgrid(x,y);
z1 = 3+X+Y;
z2 = 4-2.*X-4.*Y;
z3 = z1 - z2;
[~,mn] = min(abs(z3));
The intersection is then given by the points (x, y(mn)): min works column by column, so mn(k) is the row index of the y value closest to the intersection for x(k).
This is, of course, a numerical approximation (since you wanted a numerical method), subject to boundary conditions that I haven't explored (you'll need to disregard minima that are still far from zero).
Note: if you're looking for an equation, consider performing a least-squares fit on the resulting data.
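A sketch of that least-squares step, assuming the intersection is a straight line so that a first-degree polyfit is appropriate. The names yint and p are hypothetical.

```matlab
x = -10:.1:10;
y = -10:.1:10;
[X, Y] = meshgrid(x, y);
z1 = 3 + X + Y;
z2 = 4 - 2.*X - 4.*Y;
z3 = z1 - z2;

[~, mn] = min(abs(z3));      % for each x, row index of the closest y
yint = y(mn);                % approximate intersection: points (x, yint)

p = polyfit(x, yint, 1);     % fit yint ~ p(1)*x + p(2)
disp(p)
```

Setting z1 = z2 by hand gives 3x + 5y = 1, i.e. y = 0.2 - 0.6x, so the fitted slope and intercept should land near -0.6 and 0.2, up to the grid resolution.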