Symmetric Regression In Stan - linear-regression

I have two vectors of data points (gene expression in tissue A and tissue B), and I want to see whether there is any systematic bias along the magnitude of expression (that is, whether Gene X has the same expression in A and B).
The idea was to build a simple regression model in Stan and see how much the posterior for the slope (beta) overlaps with 1.
model {
  for (n in 1:N) {
    y[n] ~ normal(alpha[i[n]] + beta[i[n]] * x[n], sigma[i[n]]);
  }
}
However, depending on which vector is x and which is y, I get different results: one slope is about 1 and the other is not (see the image, where x and y are swapped; the colored lines represent the regressions I get from the model, and the gray line has slope 1). As I found out, this is typical for regression methods like ordinary least squares, which makes sense if one value depends on the other. Here, however, there is no dependency and both vectors are "equal".
Now the question is: what would be an appropriate model to perform a symmetric regression in Stan?
Standardizing the data first and working without an intercept, as suggested by LukasNeugebauer, does not solve the problem.

I cheated a bit and found a solution:
When you rotate the coordinate system by 45 degrees, the new y-axis (y') represents the information of x and y in equal amounts. Therefore, assuming a variance only along the new y-axis involves both x and y.
x' = x*cos((pi/180)*45) + y*sin((pi/180)*45)
y' = -x*sin((pi/180)*45) + y*cos((pi/180)*45)
Fitting the model above to the rotated data (x', y') now gives symmetric results, where a slope of 0 in the rotated system corresponds to a slope of 1 in the original system.
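The effect of the rotation can be checked numerically. The sketch below is only an illustration in Python, not the Stan model: it uses an ordinary least-squares slope as a stand-in for the posterior mean of beta, and the data values and the helper names (`rotate_45`, `ols_slope`, `original_slope`) are made up for this example. Swapping the two vectors flips the sign of the slope in the rotated frame, so the slopes mapped back to the original frame are reciprocals of each other.

```python
import math

def rotate_45(xs, ys):
    """Rotate points by 45 degrees, following the x'/y' formulas above."""
    c = math.cos(math.pi / 4)
    s = math.sin(math.pi / 4)
    xr = [x * c + y * s for x, y in zip(xs, ys)]
    yr = [-x * s + y * c for x, y in zip(xs, ys)]
    return xr, yr

def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def original_slope(rotated_slope):
    """Map a slope in the rotated frame back to the original frame.

    A line at angle phi' in the rotated frame sits at phi' + 45 degrees in
    the original frame, so slope = tan(phi' + 45 deg) = (1 + b') / (1 - b').
    """
    return (1 + rotated_slope) / (1 - rotated_slope)

a = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up expression values, tissue A
b = [1.2, 1.9, 3.4, 3.8, 5.3]   # made-up expression values, tissue B

slope_ab = ols_slope(*rotate_45(a, b))   # A as x, B as y
slope_ba = ols_slope(*rotate_45(b, a))   # B as x, A as y
print(slope_ab, slope_ba)
print(original_slope(slope_ab) * original_slope(slope_ba))
```

A rotated slope near 0 in either direction then corresponds to an original slope near 1, which is the comparison the question asks for.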

Related

Finding length between a lot of elements

I have an image of a cytoskeleton. There are a lot of small objects inside, and I want to calculate the distance between all of them along every axis and get a matrix with all this data. I am trying to do this in MATLAB.
My final aim is to figure out whether there is any axis with a constant distance between the objects.
I've tried bwdist and connected components without any luck.
Do you have any other ideas?
So, the end goal is to globally stretch this image (linearly) in a certain direction so that the distances between nearest pairs end up as close to each other as possible, ideally the same? Or are you allowed to do more complex stretching? (Note that with an arbitrarily complex stretch you can always make it work :) )
If it is a linear global stretch, the distance in x' and y' is a simple multiplication of the old distance in x and y, applied to every pair of points. So, the final Euclidean distance ends up being sqrt((SX*x)^2 + (SY*y)^2), with SX being the stretch in x and SY the stretch in y; x and y are the distances along X and Y between a pair of points.
If you are interested in just "the same" part, solution is not so difficult:
Find all objects of interest and put their X and Y coordinates in a N*2 matrix.
Calculate distances between all pairs of objects in X and Y. You will end up with 2 matrices sized N*N (with 0 on the diagonal, symmetric and real, not sure what is the name for that type of matrix).
Find the minimum distance (say this is between A and B).
You probably already have this. Now:
Take C. Make N-1 transformations, each of which ends up with C->nearestToC = A->B. It is a simple system of equations: X1^2*SX^2 + Y1^2*SY^2 = X2^2*SX^2 + Y2^2*SY^2.
So, first say A->B = C->A, then A->B = C->B, then A->B = C->D etc etc. Make sure transformation is normalized => SX^2 + SY^2 = 1. If it cannot be found, the only valid transformation is SX = SY = 0 which means you don't have solution here. Obviously, SX and SY need to be real.
Note that this solution is unique except in case where X1 = X2 and Y1 = Y2. In this case, grab some other point than C to find this transformation.
For each transformation check the remaining points and find all nearest neighbours of them. If distance is always the same as these 2 (to a given tolerance), great, you found your transformation. If not, this transformation does not work and you should continue with the next one.
If you want a transformation that minimizes variations between distances (but doesn't require them to be nearly equal), I would do some optimization method and search for a minimum - I don't know how to find an exact solution otherwise. I would pick this also in case you don't have linear or global stretch.
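The equal-length condition used in the steps above, together with the normalization SX^2 + SY^2 = 1, has a closed-form solution. The sketch below is a minimal Python illustration under that reading; the function names and the two offsets are hypothetical. Writing a = X1^2 - X2^2 and b = Y2^2 - Y1^2, the condition becomes a*SX^2 = b*SY^2, so SX^2 = b / (a + b).

```python
import math

def equalizing_stretch(d1, d2):
    """Return (SX, SY) with SX^2 + SY^2 = 1 that makes two offsets equally long.

    d1 and d2 are (dx, dy) offsets between two pairs of points.
    Returns None when no real, normalized, unique solution exists.
    """
    x1, y1 = d1
    x2, y2 = d2
    a = x1 ** 2 - x2 ** 2
    b = y2 ** 2 - y1 ** 2
    if a + b == 0:
        return None          # degenerate: no solution, or not unique
    t = b / (a + b)          # t = SX^2, so SY^2 = 1 - t
    if t < 0 or t > 1:
        return None          # would need an imaginary stretch factor
    return math.sqrt(t), math.sqrt(1 - t)

def stretched_length(d, sx, sy):
    """Euclidean length of offset d after stretching by (sx, sy)."""
    return math.hypot(sx * d[0], sy * d[1])

pair_ab = (3.0, 1.0)   # hypothetical offset A -> B
pair_cd = (1.0, 2.0)   # hypothetical offset C -> D
sx, sy = equalizing_stretch(pair_ab, pair_cd)
print(sx, sy)
print(stretched_length(pair_ab, sx, sy), stretched_length(pair_cd, sx, sy))
```

Each candidate (SX, SY) found this way would still have to be checked against the remaining nearest-neighbour pairs, as described above.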
If I understand your question correctly, the first step is to obtain the center of mass of each object in the image as (x,y) coordinates. Then you can easily compute the distances between all points. I suggest looking at a histogram of those distances, which may tell you something about how the distances are distributed (for example, whether they are uniformly random or whether any patterns appear).
Obtaining the center of mass points is not an easy task. Consider converting the image into a binary one, or using some sort of background subtraction with blob detection and/or an edge detector.
To build the histogram you can use the histogram function.
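As a rough illustration of the distance-and-histogram step, here is a small Python sketch. The centroid coordinates are invented, and `pairwise_distances` and `histogram_counts` are hypothetical helper names, not MATLAB functions. A regular spacing between objects would show up as a spike in one bin.

```python
import math
from collections import Counter

def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    out = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            out.append(math.dist(points[i], points[j]))
    return out

def histogram_counts(values, bin_width):
    """Count how many values fall into each bin of the given width."""
    counts = Counter(int(v // bin_width) for v in values)
    return dict(sorted(counts.items()))

# Made-up centroids: three objects evenly spaced along x, one off-axis.
centroids = [(0.0, 0.0), (3.0, 0.0), (6.0, 0.0), (0.0, 4.0)]
dists = pairwise_distances(centroids)
counts = histogram_counts(dists, 1.0)
print(counts)
```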

How can I make all-in-one polynomial from multi-polynomial?

I'm not familiar with advanced math, so I don't know where to start.
I found an article like this and I am just following its description, but it is not easy for me.
I'm not sure how to make a single polynomial equation (or something like that) from the 4 polynomial equations above. Is this possible?
If yes, would you please help me get that polynomial (or a similar equation)? If not, could you tell me why?
UPDATE
I'd like to try as following
clear all ;
clc
ab = (H' * H)\H' * y;
y2 = H*ab;
Finally, I get some numbers like this.
So, is this meaningful?
As you can see from the red curve, something is wrong.
Did I miss anything?
All the article says is "you can combine multiple data sets into one to get a single polynomial".
You can also go in the other direction: subdivide your data set into pieces and get as many separate ones as you wish. (This is the idea behind n-fold cross-validation.)
You start with a collection of n points (x, y). (Keep it simple by having only one independent variable x and one dependent variable y.)
Your first step should be to plot the data, look at it, and think about what kind of relationship between the two would explain it well.
Your next step is to assume some form for the relationship between the two. People like polynomials because they're easy to understand and work with, but other, more complex relationships are possible.
One polynomial might be:
y = c0 + c1*x + c2*x^2 + c3*x^3
This is your general relationship between the dependent variable y and the independent variable x.
You have n points (x, y). Your function can't go through every point. In the example I gave there are only four coefficients. How do you calculate the coefficients for n >> 4?
That's where the matrices come in. You have n equations:
y(1) = c0 + c1*x(1) + c2*x(1)^2 + c3*x(1)^3
....
y(n) = c0 + c1*x(n) + c2*x(n)^2 + c3*x(n)^3
You can write these in matrix form:
y = H * c
where H is the n x 4 matrix whose i-th row is [1, x(i), x(i)^2, x(i)^3] and c is the vector of unknown coefficients.
Premultiply both sides by H' (the prime denotes "transpose"):
H' * y = H' * H * c
Do a standard matrix inversion or LU decomposition to solve for the unknown vector of coefficients c. These particular coefficients minimize the sum of squares of differences between the function evaluated at each point x and your actual value y.
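The normal-equations approach described above can be sketched in a few lines. This is a minimal Python illustration, not the MATLAB code from the question: `solve_linear` and `fit_polynomial` are hypothetical helpers, and the sample data are generated from a known cubic so the recovered coefficients can be checked by eye.

```python
def solve_linear(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_polynomial(xs, ys, degree=3):
    """Least-squares coefficients [c0, c1, ...] via H' * H * c = H' * y."""
    h = [[x ** p for p in range(degree + 1)] for x in xs]
    cols = degree + 1
    hth = [[sum(row[i] * row[j] for row in h) for j in range(cols)]
           for i in range(cols)]
    hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(cols)]
    return solve_linear(hth, hty)

# Synthetic data from y = 1 + 2x - 0.5x^2 + 0.25x^3 (an assumed example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1 + 2 * x - 0.5 * x ** 2 + 0.25 * x ** 3 for x in xs]
coeffs = fit_polynomial(xs, ys)
print(coeffs)
```

Forming H' * H explicitly is fine for a small, well-scaled problem like this one; for larger or badly scaled data a QR-based solver (such as MATLAB's backslash applied to H directly) is usually the safer choice.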
Update:
I don't know where this fixation with those polynomials comes from.
Your y vector? Wrong. Your H matrix? Wrong again.
If you must insist on using those polynomials, here's what I'd recommend: You have a range of x values in your plot. Let's say you have 100 x values, equally spaced between 0 and your max value. Those are the values to plug into your H matrix.
Use the polynomials to synthesize sets of y values, one for each polynomial.
Combine all of them into a single large problem and solve for a new set of coefficients. If you want a 3rd order polynomial, you'll only have four coefficients and one equation. It'll represent the least squares best approximation of all the synthesized data you created with your four polynomials.

Rotate a basis to align to vector

I have a matrix M of size NxP. Its P columns are mutually orthogonal (M is a basis). I also have a vector V of size N.
My objective is to transform the first vector of M into V and to update the others so that they remain orthogonal. I know that V and M share the same origin, so this is basically a rotation by a certain angle. I assume we can find a matrix T such that T*M = M'. However, I can't figure out the details of how to do it (with MATLAB).
Also, I know there might be an infinite number of transforms doing that, but I'd like to get the simplest one (in which the other vectors of M stay approximately the same, i.e. no rotation around the first vector).
A small picture to illustrate. In my actual case, N and P can be large integers (not necessarily 3):
Thanks in advance for your help!
[EDIT] Alternative solution to Gram-Schmidt (accepted answer)
I managed to get a correct solution by retrieving a rotation matrix R by solving an optimization problem minimizing the 2-norm between M and R*M, under the constraints:
V is orthogonal to R*M[1] ... R*M[P-1] (i.e V'*(R*M[i]) = 0)
R*M[0] = V
Due to the solver constraints, I couldn't indicate that R*M[0] ... R*M[P-1] are all pairwise orthogonal (i.e (R*M)' * (R*M) = I).
Luckily, it seems that with this problem and with my solver (CVX using SDPT3), the resulting R*M[0] ... R*M[P-1] are also pairwise orthogonal.
I believe you want to use the Gram-Schmidt process here, which finds an orthogonal basis for a set of vectors. If V is not orthogonal to M[0], you can simply change M[0] to V and run Gram-Schmidt, to arrive at an orthogonal basis. If it is orthogonal to M[0], instead change another, non-orthogonal vector such as M[1] to V and swap the columns to make it first.
Mind you, the vector V needs to be in the column space of M, or you will always have a different basis than you had before.
Matlab doesn't have a built-in Gram-Schmidt command, although you can use the qr command to get an orthogonal basis. However, this won't work if you need V to be one of the vectors.
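A hand-written Gram-Schmidt pass is short enough to sketch. The version below is an illustrative Python sketch, not MATLAB, and it makes two assumptions: the basis is stored as a list of column vectors, and the output is normalized to unit length. Putting V first guarantees that the first output vector points along V, while each later vector is adjusted only as much as orthogonality requires.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize vectors in order; the first direction is kept as is."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            proj = dot(w, q)                       # component along q
            w = [wi - proj * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis

# Hypothetical example: standard basis in 3D, with V replacing the first column.
m_columns = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
v = [1.0, 1.0, 0.0]
new_basis = gram_schmidt([v] + m_columns[1:])
for column in new_basis:
    print(column)
```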
Option # 1 : if you have some vector and after some changes you want to rotate matrix to restore its orthogonality then, I believe, this method should work for you in Matlab
http://www.mathworks.com/help/symbolic/mupad_ref/numeric-rotationmatrix.html
(edit by another user: above link is broken, possible redirect: Matrix Rotations and Transformations)
If it does not, then ...
Option # 2 : I did not do this in Matlab but a part of another task was to find Eigenvalues and Eigenvectors of the matrix. To achieve this I used SVD. Part of SVD algorithm was Jacobi Rotation. It says to rotate the matrix until it is almost diagonalizable with some precision and invertible.
https://math.stackexchange.com/questions/222171/what-is-the-difference-between-diagonalization-and-orthogonal-diagonalization
Approximate algorithm of Jacobi rotation in your case should be similar to this one. I may be wrong at some point so you will need to double check this in relevant docs :
1) change values in existing vector
2) compute angle between actual and new vector
3) create rotation matrix and ...
put Cosine(angle) on the diagonal of the rotation matrix
put minus Sin(angle) in the top right corner of the matrix
put Sin(angle) in the bottom left corner of the matrix
4) multiply the vector or matrix of vectors by the rotation matrix in a loop until your vector matrix is invertible and diagonalizable. Invertibility can be checked with the determinant (check for singularity), and orthogonality (the matrix is diagonalized) can be tested with this check: if the maximum value in the LU matrix is less than some constant, stop rotating. At this point the new matrix should contain only orthogonal vectors.
Unfortunately, I am not able to find exact pseudo code that I was referring to in the past but these links may help you to understand Jacobi Rotation :
http://www.physik.uni-freiburg.de/~severin/fulltext.pdf
http://web.stanford.edu/class/cme335/lecture7.pdf
https://www.nada.kth.se/utbildning/grukth/exjobb/rapportlistor/2003/rapporter03/maleko_mercy_03003.pdf

How do I determine the coefficients for a linear regression line in MATLAB? [closed]

I'm going to write a program where the input is a data set of 2D points and the output is the regression coefficients of the line of best fit, found by minimizing the mean squared error (MSE).
I have some sample points that I would like to process:
X Y
1.00 1.00
2.00 2.00
3.00 1.30
4.00 3.75
5.00 2.25
How would I do this in MATLAB?
Specifically, I need to get the following formula:
y = A + Bx + e
A is the intercept and B is the slope while e is the residual error per point.
Judging from the link you provided, and my understanding of your problem, you want to calculate the line of best fit for a set of data points. You also want to do this from first principles. This will require some basic Calculus as well as some linear algebra for solving a 2 x 2 system of equations. If you recall from linear regression theory, we wish to find the best slope m and intercept b such that for a set of points ([x_1,y_1], [x_2,y_2], ..., [x_n,y_n]) (that is, we have n data points), we want to minimize the sum of squared residuals between this line and the data points.
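In other words, we wish to minimize the cost function F(m,b,x,y):
F(m,b,x,y) = \sum_{i=1}^{n} (y_i - m x_i - b)^2
m and b are our slope and intercept for this best fit line, while x and y are vectors of x and y co-ordinates that form our data set.
This function is convex, so there is an optimal minimum that we can determine. The minimum can be determined by finding the derivative with respect to each parameter, and setting these equal to 0. We then solve for m and b. The intuition behind this is that we are simultaneously finding m and b such that the cost function is jointly minimized by these two parameters. In other words:
\frac{\partial F}{\partial m} = 0, \quad \frac{\partial F}{\partial b} = 0
OK, so let's find the first quantity, \partial F / \partial m:
\frac{\partial F}{\partial m} = \sum_{i=1}^{n} 2 (y_i - m x_i - b)(-x_i) = 0
We can drop the factor 2 from the derivative as the other side of the equation is equal to 0, and we can also do some distribution of terms by multiplying the -x_i term throughout:
-\sum_{i=1}^{n} x_i y_i + m \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = 0
Next, let's tackle the next parameter, \partial F / \partial b:
\frac{\partial F}{\partial b} = \sum_{i=1}^{n} 2 (y_i - m x_i - b)(-1) = 0
We can again drop the factor of 2 and distribute the -1 throughout the expression:
-\sum_{i=1}^{n} y_i + m \sum_{i=1}^{n} x_i + b \sum_{i=1}^{n} 1 = 0
Knowing that \sum_{i=1}^{n} 1 is simply n, we can simplify the above to:
-\sum_{i=1}^{n} y_i + m \sum_{i=1}^{n} x_i + b n = 0
Now, we need to simultaneously solve for m and b with the above two equations. This will jointly minimize the cost function which finds the best line of fit for our data points.
Doing some re-arranging, we can isolate m and b on one side of the equations and the rest on the other sides:
m \sum_{i=1}^{n} x_i^2 + b \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} x_i y_i
m \sum_{i=1}^{n} x_i + b n = \sum_{i=1}^{n} y_i
As you can see, we can formulate this into a 2 x 2 system of equations to solve for m and b. Specifically, let's re-arrange the two equations above so that it's in matrix form:
\begin{bmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & n \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix} = \begin{bmatrix} \sum x_i y_i \\ \sum y_i \end{bmatrix}
With regards to above, we can decompose the problem by solving a linear system: Ax = b. All you have to do is solve for x, which is x = A^{-1}*b. To find the inverse of a 2 x 2 system, given the matrix:
\begin{bmatrix} p & q \\ r & s \end{bmatrix}
The inverse is simply:
\frac{1}{ps - qr} \begin{bmatrix} s & -q \\ -r & p \end{bmatrix}
Therefore, by substituting our quantities into the above equation, we solve for m and b in matrix form, and it simplifies to this:
\begin{bmatrix} m \\ b \end{bmatrix} = \frac{1}{n \sum x_i^2 - (\sum x_i)^2} \begin{bmatrix} n & -\sum x_i \\ -\sum x_i & \sum x_i^2 \end{bmatrix} \begin{bmatrix} \sum x_i y_i \\ \sum y_i \end{bmatrix}
Carrying out this multiplication and solving for m and b individually (and multiplying numerator and denominator by -1), this gives:
m = \frac{\sum x_i \sum y_i - n \sum x_i y_i}{(\sum x_i)^2 - n \sum x_i^2}
b = \frac{\sum x_i y_i \sum x_i - \sum y_i \sum x_i^2}{(\sum x_i)^2 - n \sum x_i^2}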
As such, to find the best slope and intercept to best fit your data, you need to calculate m and b using the above equations.
Given your data specified in the link in your comments, we can do this quite easily:
%// Define points
X = 1:5;
Y = [1 2 1.3 3.75 2.25];
%// Get total number of points
n = numel(X);
%// Define the sums needed for the slope and intercept
sumxi = sum(X);
sumyi = sum(Y);
sumxiyi = sum(X.*Y);
sumxi2 = sum(X.^2);
%// Determine slope and intercept
m = (sumxi * sumyi - n*sumxiyi) / (sumxi^2 - n*sumxi2);
b = (sumxiyi * sumxi - sumyi * sumxi2) / (sumxi^2 - n*sumxi2);
%// Display them
disp([m b])
... and we get:
0.4250 0.7850
Therefore, the line of best fit that minimizes the error is:
y = 0.4250*x + 0.7850
However, if you want to use built-in MATLAB tools, you can use polyfit (credit goes to Luis Mendo for providing the hint). polyfit determines the line (or nth order polynomial curve rather...) of best fit by linear regression by minimizing the sum of squared errors between the best fit line and your data points. How you call the function is so:
coeff = polyfit(x,y,order);
x and y are the x and y points of your data while order determines the order of the line of best fit you want. As an example, order=1 means that the line is linear, order=2 means that the line is quadratic and so on. Essentially, polyfit fits a polynomial of order order given your data points. Given your problem, order=1. As such, given the data in the link, you would simply do:
X = 1:5;
Y = [1 2 1.3 3.75 2.25];
coeff = polyfit(X,Y,1)
coeff =
0.4250 0.7850
The way coeff works is that these are the coefficients of the regression line, starting from the highest order in decreasing value. As such, the above coeff variable means that the regression line was fitted as:
y = 0.4250*x + 0.7850
The first coefficient is the slope while the second coefficient is the intercept. You'll also see that this matches up with the link you provided.
If you want a visual representation, here's a plot of the data points as well as the regression line that best fits these points:
plot(X, Y, 'r.', X, polyval(coeff, X));
Here's the plot:
polyval takes an array of coefficients (usually produced by polyfit), and you provide a set of x co-ordinates and it calculates what the y values are given the values of x. Essentially, you are evaluating what the points are along the best fit line.
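To make the evaluation step concrete, here is a small Python sketch of what a polyval-style function does. It is an illustration written for this answer, not MATLAB's implementation: it takes the coefficients in decreasing power order, as polyfit returns them, and evaluates the polynomial with Horner's rule.

```python
def polyval(coeffs, x):
    """Evaluate a polynomial given coefficients in decreasing power order."""
    result = 0.0
    for c in coeffs:
        result = result * x + c    # Horner's rule: one multiply-add per term
    return result

coeff = [0.4250, 0.7850]           # slope and intercept from the fit above
fitted = [polyval(coeff, x) for x in [1, 2, 3, 4, 5]]
print(fitted)
```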
Edit - Extending to higher orders
If you want to extend so that you're finding the best fit for any nth order polynomial, I won't go into the details, but it boils down to constructing the following linear system. Given the relationship for the ith point between (x_i, y_i):
y_i = a_0 + a_1 x_i + a_2 x_i^2 + \dots + a_m x_i^m + e_i
You would construct the following linear system:
y_1 = a_0 + a_1 x_1 + a_2 x_1^2 + \dots + a_m x_1^m + e_1
\vdots
y_n = a_0 + a_1 x_n + a_2 x_n^2 + \dots + a_m x_n^m + e_n
Basically, you would create a vector of points y, and you would construct a matrix X such that each column denotes taking your vector of points x and applying a power operation to each column. Specifically, the first column is the zero-th power, the second column is the first power, the third column is the second power and so on. You would do this up until m, which is the order of the polynomial you want. The vector e would be the residual error for each point in your set.
Specifically, the formulation of the problem can be written in matrix form as:
y = X*a + e
Once you construct this matrix, you would find the parameters by least-squares by calculating the pseudo-inverse. How the pseudo-inverse is derived, you can read it up on the Wikipedia article I linked to, but this is the basis for minimizing a system by least-squares. The pseudo-inverse is the backbone behind least-squares minimization. Specifically:
a = (X^{T}*X)^{-1}*X^{T}*y
(X^{T}*X)^{-1}*X^{T} is the pseudo-inverse. X itself is a very popular matrix, which is known as the Vandermonde matrix and MATLAB has a command called vander to help you compute that matrix. A small note is that vander in MATLAB is returned in reverse order. The powers decrease from m-1 down to 0. If you want to have this reversed, you'd need to call fliplr on that output matrix. Also, you will need to append one more column at the end of it, which is the vector with all of its elements raised to the mth power.
I won't go into how you'd repeat your example for anything higher order than linear. I'm going to leave that to you as a learning exercise, but simply construct the vector y, the matrix X with vander, then find the parameters by applying the pseudo-inverse of X with the above to solve for your parameters.
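As a starting point for that exercise, the column layout of X can be sketched as follows. This is a Python illustration with a hypothetical `vandermonde` helper (not MATLAB's vander): the `increasing` flag chooses between powers running from 0 up to the order, as in the description above, and the reversed, decreasing order.

```python
def vandermonde(xs, order, increasing=True):
    """Build the design matrix whose columns are powers of xs up to `order`."""
    if increasing:
        powers = range(order + 1)          # 0, 1, ..., order
    else:
        powers = range(order, -1, -1)      # order, ..., 1, 0
    return [[x ** p for p in powers] for x in xs]

points = [1, 2, 3, 4, 5]
X_inc = vandermonde(points, 2)                     # columns: 1, x, x^2
X_dec = vandermonde(points, 2, increasing=False)   # columns: x^2, x, 1
for row in X_inc:
    print(row)
```

With X in hand, the parameters follow from the pseudo-inverse formula; the column order only determines the order in which the coefficients come back.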
Good luck!

Normalization in neural network with (x, y) output

I built a backpropagation neural network to learn from a dataset that consists of 7 continuous inputs and 2 outputs (x, y coordinates). My implementation choice was to use one hidden layer with 7 neurons, but I did it in such a way that I can try different combinations of hidden layers with variable number of hidden nodes.
The error measurement is the usual mean squared error, calculated as follows:
MSE(x,y) = 1/N * sum((X - x)^2 + (Y - y)^2)
where X and Y are the target values, x and y the predictions. I also have to compute an accuracy measure which is the mean euclidean distance of each point from the target points, that's basically the same as the MSE, but the values inside sum get square-rooted.
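For reference, the two measures can be written out as a short Python sketch. The function names and the two sample points are made up for illustration; targets and predictions are assumed to be lists of (x, y) pairs.

```python
import math

def mse(targets, preds):
    """Mean over points of the squared x error plus the squared y error."""
    n = len(targets)
    return sum((tx - px) ** 2 + (ty - py) ** 2
               for (tx, ty), (px, py) in zip(targets, preds)) / n

def mean_euclidean_distance(targets, preds):
    """Mean over points of the Euclidean distance to the target."""
    n = len(targets)
    return sum(math.hypot(tx - px, ty - py)
               for (tx, ty), (px, py) in zip(targets, preds)) / n

targets = [(0.0, 0.0), (1.0, 5.0)]   # made-up target coordinates
preds = [(3.0, 4.0), (1.0, 5.0)]     # made-up predictions
print(mse(targets, preds), mean_euclidean_distance(targets, preds))
```

Both measures add the x and y errors in their raw units, so the output with the larger range tends to dominate them.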
The inputs all lie in the interval [-2, +2], plus some outliers.
The output coordinates have completely unrelated distributions (x is normally distributed while y is uniformly distributed). The x range is small (say -1, +1 from the mean) while the y range varies more (say -10, +10 from the mean).
The behavior I get is that the net seems to predict the y output quite well, while the x output "flattens" toward y. I.e., the x values get closer to the y values, and the network doesn't adapt to predict x correctly.
My initial choice was to scale both inputs/outputs as a whole to the usual (0,1) interval but that didn't lead to good results. So I then chose to standardize each feature separately with their z-score, and scale the outputs in the (0,1) interval (I am using the sigmoid activation function so (0,1) seemed about right). But then this strange behavior appeared.
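The two scalings mentioned here can be sketched as follows. This is a Python illustration under the assumption that each column is scaled on its own, which the question does not state explicitly for the outputs; the helper names and sample values are hypothetical. Scaling x and y separately maps both the narrow x range and the wide y range onto the full (0,1) interval.

```python
import statistics

def zscore(values):
    """Standardize one feature column to zero mean and unit variance."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mu) / sd for v in values]

def minmax(values, lo=0.0, hi=1.0):
    """Rescale one column linearly so its minimum maps to lo and maximum to hi."""
    vmin, vmax = min(values), max(values)
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]

x_out = [-1.0, 0.0, 1.0]       # made-up narrow-range output
y_out = [-10.0, 5.0, 10.0]     # made-up wide-range output
x_scaled = minmax(x_out)       # scaled on its own range
y_scaled = minmax(y_out)       # scaled on its own range
feature = zscore([-2.0, 0.0, 2.0])
print(x_scaled, y_scaled, feature)
```

To report errors in the original units, the predictions would have to be mapped back with the inverse of the same per-column transform.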
So my questions are, how would you normalize such inputs/outputs? Is there a way to deal with such uncorrelated outputs? I had even thought about using two separate networks to predict one single output discarding the other, is that a good choice?
Could you also point me to some reading where output normalization is discussed? The literature talks a lot about normalizing the inputs, but no one seems to care about the outputs.