Correlation dependent on samples - matlab

I have a variable y that depends on the variables x1 ∈ [x1_min,x1_max], x2 ∈ [x2_min,x2_max], x3 ∈ [x3_min,x3_max], i.e. y=y(x1,x2,x3); y can also be a matrix. I want to detect which of x1, x2, x3 is least relevant in determining the value of y.
I am using the following code in Matlab:
x = rand(1000,3); % x1, x2, x3 are the columns of x
y = fct(x); % A generic function of x1, x2, x3
[corr_mat, p_val] = corrcoef(x,y);
[i,j] = find(p_val > 0.5);
disp([i,j])
The problem is that the resulting indices strongly depend on the random samples (even if I increase the number of samples). How can I get a more precise measure?
As a simple alternative example, y=x1+x2+x3, with x1∈[50,80], x2∈[0,1], x3∈[0,1]. Clearly, the value of y depends much more on x1 than the other 2 variables. How do I quantify this dependence?
Thank you in advance.
EDIT: Here is what I mean by "quantification" or "relevance". I want to detect which variables cause only very small changes in y, i.e. in the previous example x2 and x3 make y vary less than x1 does.

You need to use the covariance, not the correlation coefficient. The correlation coefficient is the covariance divided by the standard deviations of both variables, which gives all variables the same weight even when they have different ranges, and this is exactly what you want to avoid.
x1 = 50+30*rand(1000,1);
x2 = rand(1000,1);
x3 = rand(1000,1);
y = x1+x2+x3;
c=cov([x1 x2 x3 y]);
c(1:3,4) % Covariances of x[1-3] and y
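To turn those covariances into numbers that are easier to compare, one option is to divide each by var(y). The following is only a sketch and assumes an additive model with independent inputs, as in the example above; the variable name share and the fixed seed are illustrative choices, not part of the original answer.

```matlab
rng(0);                          % fixed seed so the run is repeatable
n  = 1000;
x1 = 50 + 30*rand(n,1);          % x1 in [50,80]
x2 = rand(n,1);                  % x2 in [0,1]
x3 = rand(n,1);                  % x3 in [0,1]
y  = x1 + x2 + x3;

c     = cov([x1 x2 x3 y]);       % 4-by-4 sample covariance matrix
share = c(1:3,4) / c(4,4);       % cov(xi,y)/var(y); sums to 1 because y is a sum of the xi
disp(share.')
```

The first entry should be close to 1. The entries for x2 and x3 are tiny and can still fluctuate from sample to sample, because the sample covariance between x1 and x2 (or x3) can be of the same order as var(x2) itself.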

Related

How to make a 2D contour plot with given data point in Octave/MATLAB?

I have a matrix whose three columns correspond to x, y and f values. I want to make a contour plot of f(x,y) in the x,y plane from these data with Octave/MATLAB.
Let's say, the matrix M is
x1 y1 f1
x2 y2 f2
x3 y3 f3
. . .
. . .
I found that the function contourf requires f to be a matrix (whereas I have a vector with the corresponding points).
How to generate this plot?
The x, y, and z variables that you pass to contourf are all matrices of the same size. For every point you need an x, y, and z value. You can use meshgrid to make matrices that have all the combinations of x and y values.
This example is from the documentation for contourf. I added some comments to explain what is happening.
% Create a vector of x values
x = linspace(-2*pi,2*pi);
% Create a vector of y values
y = linspace(0,4*pi);
% Make matrices with all combinations of x and y values for plotting
[X,Y] = meshgrid(x,y);
Z = sin(X)+cos(Y);
contourf(X,Y,Z)
[Figure: filled contour plot of Z = sin(X)+cos(Y) produced by the code above.]
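When the data are scattered points, as in the three-column matrix M from the question, the f values first have to be interpolated onto such a grid. A minimal sketch using griddata follows; the sample data, the grid size of 60 and the names xq, yq, Xq, Yq and Fq are assumptions made for illustration.

```matlab
% Hypothetical scattered data: the columns of M are x, y and f
rng(1);
xy = 4*rand(300,2) - 2;                            % points in [-2,2] x [-2,2]
M  = [xy, xy(:,1).*exp(-xy(:,1).^2 - xy(:,2).^2)];

% Regular grid spanning the range of the data
xq = linspace(min(M(:,1)), max(M(:,1)), 60);
yq = linspace(min(M(:,2)), max(M(:,2)), 60);
[Xq,Yq] = meshgrid(xq,yq);

% Interpolate f onto the grid (linear by default; NaN outside the convex hull)
Fq = griddata(M(:,1), M(:,2), M(:,3), Xq, Yq);
contourf(Xq, Yq, Fq)
```

Grid points outside the convex hull of the data come back as NaN and are simply left blank in the plot.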

Expected Value of given two parameters

I would like to know how to compute:
E[Y | X_i, X_j]
in MATLAB.
Let's say
x = unifrnd(0,1,1000,2) and the model is y = x1 + x2.
Suppose that you have scatter plots of x1 vs. Y and x2 vs. Y. From these scatter plots, you compute the regression coefficients (i.e. f1 = a*x1 + b and f2 = c*x2 + d).
Now I want to compute the expected value of Y given x1 and x2.
My question is: how do I do that? Thanks in advance.

Manually Creating Spline Piecewise-Polynomial

Consider a function y(x) sampled in an array of values, represented by the arrays x and y. If I have another x value x0, I can evaluate y(x0) using spline
y0 = spline(x,y,x0);
Now, I can also write
pp = spline(x,y);
y0 = ppval(pp,x0);
MY QUESTION: If I already have the coefficient and x matrices, my_coefs (size(my_coefs) = [length(y),4]) and x, how can I create a piecewise polynomial My_pp such that pp.coefs = my_coefs and that y0 = ppval(My_pp,x0)?
OK, there is no "spline object", but rather a piecewise polynomial object. So, if my_coefs was obtained with the break points my_x, then the code needed is
my_spline = mkpp(my_x,my_coefs);
y0 = ppval(my_spline, x0);
In case the dimensions are confusing here, which they are:
my_coefs is (n-1)-by-4 (one row of cubic coefficients per interval)
my_x has n elements (the break points)
y0 has N elements
x0 has N elements
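A quick way to check the dimensions is a round trip: take the break points and coefficients out of an existing spline with unmkpp, rebuild the piecewise polynomial with mkpp, and compare against the direct evaluation. This is a sketch on made-up data (sin sampled at six points); err is just an illustrative name.

```matlab
x  = 0:5;                        % n = 6 break points
y  = sin(x);
pp = spline(x, y);

[my_x, my_coefs] = unmkpp(pp);   % my_x: 1-by-6, my_coefs: 5-by-4
my_pp = mkpp(my_x, my_coefs);    % rebuild the piecewise polynomial by hand

x0  = linspace(0, 5, 11);
y0  = ppval(my_pp, x0);
err = max(abs(y0 - spline(x, y, x0)));   % difference from the direct evaluation
disp(size(my_coefs))
```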

Using MATLAB's regress like polyfit

I have:
x = [1970:1:2000]
y = [data]
size(x) = [30,1]
size(y) = [30,1]
I want:
% Yl = k*x + m, where
p = polyfit(x,y,1); % p(1) is k, p(2) is m
For some reason I have to use "regress" for this.
Using k = regress(x,y) gives a seemingly random value, and I have no idea where it comes from. How do I do it?
The number of coefficients returned in "k" depends on the number of columns of the input X, so you will not get both m and k just by passing in your x and y directly. Note also that the response comes first: regress(y,X), not regress(x,y). From the docs:
b = regress(y,X) returns a p-by-1 vector b of coefficient estimates for a multilinear regression of the responses in y on the predictors in X. X is an n-by-p matrix of p predictors at each of n observations. y is an n-by-1 vector of observed responses.
It is not stated explicitly, but the example in the help docs using the built-in carsmall dataset shows you how to set this up. For your case, you'd want:
X = [ones(numel(x),1) x(:)]; % n-by-2: a column of ones (intercept), then x as a column
b = regress(y(:),X); % y(:) is n-by-1, b is 2-by-1
b(1) is then your m, and b(2) your k.
regress can also provide additional outputs, such as confidence intervals, residuals, statistics such as r-squared, etc. The input remains the same, you'd just change the outputs:
[b,bint,r,rint,stats] = regress(y,X);
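To see that the two approaches agree, the fit can be run both ways on synthetic data. This is a sketch only: the trend 2.5*x - 4000 and the noise are made up, and regress requires the Statistics and Machine Learning Toolbox.

```matlab
rng(2);
x = (1970:2000)';                        % column vector of years
y = 2.5*x - 4000 + randn(size(x));       % made-up linear trend plus noise

X = [ones(size(x)) x];                   % intercept column first, then x
b = regress(y, X);                       % b(1) = m (intercept), b(2) = k (slope)
p = polyfit(x, y, 1);                    % p(1) = k, p(2) = m

fprintf('regress: k = %.4f, m = %.2f\n', b(2), b(1));
fprintf('polyfit: k = %.4f, m = %.2f\n', p(1), p(2));
```

Note the reversed ordering: polyfit returns the highest power first, while the column order of X determines the order of the entries in b.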

MATLAB - Intersection of arrays

I am trying to graphically find the intersections between two surfaces and the x-y plane. (Intersection of surface z1 with the x-y plane and intersection z2 with the x-y plane)
I have created arrays representing the surfaces z1 = 3+x+y and z2 = 4-2x-4y, and z3 for the x-y plane, using meshgrid. After looking everywhere, the only command I seem able to use to find the intersections between arrays is intersect(A,B), where A and B are arrays. When I enter intersect(z1,z3), however, I get the error "A and B must be vectors, or 'rows' must be specified." When I try intersect(z1,z2,'rows'), I get back a 0-by-21 empty matrix. What am I doing wrong here?
My code:
x = -10:10;
y = -10:10;
[X,Y] = meshgrid(x,y);
z1 = 3+X+Y;
z2 = 4-2.*X-4.*Y;
z3 = 0.*X+0.*Y; %x-y plane
surf(X,Y,z1)
hold on
surf(X,Y,z2)
surf(X,Y,z3)
int1 = intersect(z1,z3,'rows');
int2 = intersect(z2,z3,'rows');
It sounds like you want the points where z1 = z2. To find these numerically, you have a couple of options.
1) Numerical rootfinding: fsolve is capable of solving systems of equations. You can formulate the surfaces as functions of one vector, [x;y] and solve for the vector that makes the two surfaces equal. An example using the initial guess x=1, y=1 follows:
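z1 = @(x) 3 + x(1) + x(2);
z2 = @(x) 4 - 2*x(1) - 4*x(2);
f = @(x) z1(x) - z2(x);
x0 = [1;1];
% One equation in two unknowns: fsolve returns a single point on the intersection line
xy_sol = fsolve(f, x0); % not named "intersect", which would shadow the built-in function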
2) Minimizing the error: If you are stuck with discrete data (arrays instead of functions) you can simply find the points where z1 - z2 is closest to zero. An easy starting point is to take the arrays Z1 and Z2 and find all points where the difference nears zero:
tol = 1e-3;
near_zero = abs(Z1 - Z2) < tol;
near_zero is a logical array that is true wherever the difference between Z1 and Z2 is smaller than tol. You can use it to index into the corresponding meshgrid arrays X and Y to find the coordinates of the intersection.
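The indexing step can be sketched as follows, using the grid from the question; the names xi, yi and zi are illustrative.

```matlab
[X,Y] = meshgrid(-10:10, -10:10);
Z1 = 3 + X + Y;
Z2 = 4 - 2*X - 4*Y;

tol = 1e-3;
near_zero = abs(Z1 - Z2) < tol;   % logical mask, same size as X and Y

xi = X(near_zero);                % x coordinates of the matching grid points
yi = Y(near_zero);                % y coordinates of the matching grid points
zi = Z1(near_zero);               % common height at those points
disp([xi yi zi])
```

On this coarse integer grid only the nodes that lie exactly on the line 3x + 5y = 1 are selected; a finer grid or a larger tol picks up more points.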
A simple way (no major function calls) to solve this is as follows:
x = -10:.1:10;
y = -10:.1:10;
[X,Y] = meshgrid(x,y);
z1 = 3+X+Y;
z2 = 4-2.*X-4.*Y;
z3 = z1 - z2;
[~,mn] = min(abs(z3));
The intersection is then approximated by the points (x, y(mn)): for each column of the grid (each x), mn holds the row index of the y value where |z1 - z2| is smallest.
This is, of course, a numerical approximation (since you wanted a numerical method), subject to boundary conditions which I haven't explored (you'll need to discard columns whose minimum is still far from zero).
Note: if you're looking for an equation, consider performing a least-squares fit on the resulting data.
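The least-squares step might look like the sketch below, which fits a straight line through the per-column minima. The cutoff of 0.5 used to discard columns without a near-zero point is an assumed value, and d, dmin and keep are illustrative names. For comparison, setting z1 = z2 by hand gives 3x + 5y = 1, i.e. y = 0.2 - 0.6x.

```matlab
x = -10:.1:10;
y = -10:.1:10;
[X,Y] = meshgrid(x,y);
d = (3 + X + Y) - (4 - 2.*X - 4.*Y);   % z1 - z2 on the grid

[dmin, mn] = min(abs(d));              % per column: smallest |z1 - z2| and its row index
keep = dmin < 0.5;                     % assumed cutoff: drop columns with no near-zero point
p = polyfit(x(keep), y(mn(keep)), 1);  % fit y = p(1)*x + p(2)
fprintf('y = %.3f*x + %.3f\n', p(1), p(2));
```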