Linear-logarithmic regression in MATLAB: 2 Input-Parameters

Look at the following plot and ignore the solid lines please (just look at the dotted/dashed ones).
For each curve, g lies in [0, 255] (thus always positive), and each curve is concave and bijective.
I know from the process behind the measurements that increasing V flattens the corresponding curve.
The different curves result from varying V. The orange curve at the top corresponds to roughly V=100, the bottom curve (red/magenta) to V=180.
I have measured data with a lot more data points in the following form:
T[1] V[1] g[1]
T[2] V[1] g[2]
T[3] V[1] g[3]
...
T[N] V[1] g[N]
.......
T[1] V[N] g[1]
T[2] V[N] g[2]
T[3] V[N] g[3]
...
T[N] V[N] g[N]
Now I want a regression like this:
g = g(V, T)
which would yield the curve for a fixed V-value:
g = g(T), V=Vfix
Which regression function in MATLAB do you think would work best?
And how should I assume a "model" here?
I only know (from the process itself, and obviously from the plots) that each curve is some sort of linear curve at the beginning, passing over into a logarithmic curve, but I don't know how the value of V enters into it!?
Thanks a lot in advance for any advice.

@bjoern, for each fixed V, it seems that your curve is concave and takes only positive values. Therefore, my first choice would be to assume that Y = A X^r. The easiest way to estimate this is to apply log to both sides to get the linear regression log Y = log A + r log X (you will probably find 0 < r < 1). So, for each value of V, I would apply MATLAB's function regress to the values log Y and log X in order to estimate the parameters A and r. This functional form is called Cobb-Douglas and is very useful in economics: http://en.wikipedia.org/wiki/Cobb%E2%80%93Douglas_production_function.
For most curves the effect of V seems well behaved, but the blue curve behaves very strangely. I would say that, in general, the effect of V is to translate the points.
If the effect of V is really linear, maybe you can estimate Y = A V X^r. In that case you estimate log Y = log A + log V + r log X, so your dependent variable is log Y and your independent variables are log X and log V.
In both cases, I think MATLAB's regress does not automatically include the regression constant (A for us), so remember to also include a vector of ones, the size of your sample, as an independent variable.
Furthermore, if you really want to test whether the effect of V is linear, just estimate
log Y = log A + s log V + r log X, which is equivalent to Y = A V^s X^r.
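A sketch of that two-variable log-log fit (written in Python rather than MATLAB, purely for illustration; the data and the "true" parameter values A, s, r are made up):

```python
import numpy as np

# Made-up data following Y = A * V^s * X^r exactly (assumed parameter values).
rng = np.random.default_rng(0)
A_true, s_true, r_true = 2.0, -0.5, 0.7
X = rng.uniform(1.0, 100.0, size=200)
V = rng.uniform(100.0, 180.0, size=200)
Y = A_true * V**s_true * X**r_true

# Linearize: log Y = log A + s*log V + r*log X.
# The column of ones supplies the regression constant that `regress`
# does not add automatically.
M = np.column_stack([np.ones_like(X), np.log(V), np.log(X)])
logA, s_hat, r_hat = np.linalg.lstsq(M, np.log(Y), rcond=None)[0]
print(np.exp(logA), s_hat, r_hat)  # recovers A, s, r on noiseless data
```

In MATLAB the same thing is `regress(log(Y), [ones(size(X)), log(V), log(X)])`.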
I hope it helps.

Related

Symmetric Regression In Stan

I have two vectors of data points (gene expression in tissue A and B) and I want to see if there is any systematic bias along their magnitude (same expression of gene X in A and B).
The idea was to build a simple regression model in Stan and see how much the posterior for the slope (beta) overlaps with 1.
model {
  for (n in 1:N) {
    y[n] ~ normal(alpha[i[n]] + beta[i[n]] * x[n], sigma[i[n]]);
  }
}
However, depending on which vector is x and which is y, I get different results, where one slope is about 1 and the other is not (see the image, where x and y are swapped; the colored lines represent the regressions I get from the model, and gray is slope 1). As I found out, this is typical for regression methods like ordinary least squares, which makes sense when one variable depends on the other. Here, however, there is no dependency and both vectors are "equal".
Now the question is: what would be an appropriate model to perform a symmetric regression in Stan?
Following the suggestion from LukasNeugebauer, standardizing the data first and working without an intercept does not solve the problem.
I cheated a bit and found a solution:
When you rotate the coordinate system by 45 degrees, the new y-axis (y') carries the information of x and y in equal amounts. Therefore, assuming variance only along the new y-axis involves both x and y.
x' = x*cos((pi/180)*45) + y*sin((pi/180)*45)
y' = -x*sin((pi/180)*45) + y*cos((pi/180)*45)
The above model now gives symmetric results, where a slope of 0 in the rotated system corresponds to a slope of 1 in the old one.
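The symmetry of the rotated fit is easy to check numerically. A small sketch (plain least squares in Python instead of Stan, with made-up data) showing that swapping x and y only flips the sign of the slope in the rotated system:

```python
import numpy as np

# Toy data with noise (assumed example; stand-ins for the expression vectors).
rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = x + rng.normal(scale=0.3, size=500)

def rotated_slope(x, y):
    """Least-squares slope of y' on x' after a 45-degree rotation."""
    c = np.cos(np.pi / 4)
    s = np.sin(np.pi / 4)
    xr = x * c + y * s
    yr = -x * s + y * c
    A = np.column_stack([np.ones_like(xr), xr])
    return np.linalg.lstsq(A, yr, rcond=None)[0][1]

# Swapping x and y leaves x' unchanged and only flips the sign of y',
# so the fitted slope flips sign but keeps its magnitude: the fit is symmetric.
print(rotated_slope(x, y), rotated_slope(y, x))
```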

Solving integral in Matlab containing unknown variables?

The task is to create a cone hat in MATLAB by creating a developable surface with numerical methods. There are 3 parts, of which I have done 2. My question concerns part 3, where I need to calculate the smallest rectangular sheet of paper that can contain the hat, and the material waste of that sheet.
YOU CAN MAYBE SKIP THE LONG BACKGROUND AND GO TO LAST PARAGRAPH
BACKGROUND:
The cone hat can be created from a skewed cone with its tip located at (a, 0, b) and with a circular base:
x = R cos u,
y = R sin u,
z = 0,
0 <= u <= 2*pi,
with known values for R, a and b.
epsilon and eta are the curve's x- and y-values as the parameter u goes from 0 to 2*pi, and alpha is the angle of inclination of the curve at the point (epsilon, eta). Starting values at A:
u = 0, alpha = 0, epsilon = 0, eta = 0.
Curve stops at B where the parameter u has reached 2pi.
1.
I plotted the curve using Runge-Kutta 4 and showed that the tip is located at P = (0, sqrt(b^2 + (R-a)^2)).
2.
I showed that by using smaller intervals in RK4 I still get quite good accuracy, but the problem then is that the curve isn't smooth. Therefore, I used Hermite interpolation of epsilon and eta as functions of u in every interval to get a better curve.
3.
OK, so now I need to calculate the smallest rectangular sheet of paper that can contain the hat, and the material waste of that sheet. If the end angle alpha(2*pi) in the template is pi or pi/2, the material waste will be less. I now get values for R and a (R = 7.8 and a = 5.5), and my task is to calculate which height b the cone hat will get under the construction criterion alpha(2*pi) = pi (and then alpha(2*pi) = pi/2 for another hat size).
So I took the first equation above (the expression containing b) and rewrote it like an integral:
TO THE QUESTION
What I understand is that I need to solve this integral in matlab and then choose b so that alpha(2pi)-pi=0 (using the given criteria above).
The values for R and a are given, and t was defined as an interval earlier (in part 1, where I did the RK4). Once the integral is solved, I get f(b) = 0, which I should be able to solve with, for example, the secant method? But I'm not able to evaluate the integral with the MATLAB function 'integral', because I don't have the value of b, of course; that is what I am looking for. So how do I go about this? Is there a MATLAB function that can be used?
You can use the differential equation for alpha and test different values for b until the condition alpha(2pi)=pi is met. For example:
b0 = 1;                  % initial guess
b = fsolve(@find_b, b0)  % use fsolve or any root finder of your choice
The function to be solved is:
function [obj] = find_b(b)
    alpha0 = 0;           % initial value for alpha at u = 0
    uspan = [0 2*pi];     % range for u
    % Use the built-in ODE solver or any of your choice
    [u, alpha] = ode45(@(u, alpha) integrate_alpha(u, alpha, b), uspan, alpha0);
    alpha_final = alpha(end);   % last value of alpha (at u = 2*pi)
    obj = alpha_final - pi;     % residual to be driven to zero
end
And the integration can be done like this:
function [dalpha] = integrate_alpha(u, alpha, b)
    a = 1;   % use the right value here
    R = 1;   % use the right value here
    dalpha = (R - a*cos(u)) / sqrt(b^2 + (R - a*cos(u))^2);
end
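The same shooting idea can be sketched in Python with SciPy (R = 7.8 and a = 5.5 as quoted in the question; the bracket [1e-3, 50] for b is a guess):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

R, a = 7.8, 5.5  # values quoted in the question

def alpha_at_2pi(b):
    """Integrate dalpha/du = (R - a*cos(u)) / sqrt(b^2 + (R - a*cos(u))^2)."""
    def rhs(u, alpha):
        dal = (R - a * np.cos(u)) / np.sqrt(b**2 + (R - a * np.cos(u))**2)
        return [dal]
    sol = solve_ivp(rhs, [0.0, 2.0 * np.pi], [0.0], rtol=1e-10, atol=1e-12)
    return sol.y[0, -1]

# Root-find on the shooting residual alpha(2*pi) - pi to get b.
b_star = brentq(lambda b: alpha_at_2pi(b) - np.pi, 1e-3, 50.0)
print(b_star)
```

For b near 0 the integrand is close to 1 (so alpha(2*pi) is close to 2*pi > pi), and for large b it goes to 0, so a root is guaranteed inside the bracket.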

MATLAB I would like to minimize the difference fitting

I have a curve that looks like an exponential function, and I would like to fit it with this equation:
The goal is to find the values of A, T and d that minimize the deviation between the fitted curve and my initial curve.
I wrote a function that can do it, but it takes 10 seconds to run: 3 nested loops that test all the values I want, after which I calculate the RMSD (root mean square deviation) between my 2 curves and put the result in a vector min_RMSD; at the end I pick the minimum value of min_RMSD and it's done.
But this is surely not the best way.
Thank you for your help, ideas :)
MATLAB has a built-in fminsearch function that does pretty much exactly what you want. You define a function handle that computes the squared error between your data and the function fit, pass in your initial guesses for A, T and d, and get a result:
x0 = [A0, T0, d0];
fn = @(x) sum((x(1) * (1 - exp(-x(2) ./ (t - x(3)))) - y).^2);
V = fminsearch(fn, x0)
Here t is the x-data for your curve, y are the corresponding y-values, and A0, T0, d0 are the initial guesses for your parameters. fn computes the sum of squared errors between your model curve and y. There is no need to take the square root: minimizing the squared error also minimizes the RMSE itself, and computing square roots takes time.
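For reference, the same pattern with SciPy's Nelder-Mead method, which is the direct equivalent of fminsearch. The model form A*(1 - exp(-(t - d)/T)) and all numbers below are assumptions for the sketch; substitute your own equation:

```python
import numpy as np
from scipy.optimize import minimize

# Made-up noiseless data from an assumed saturating-exponential model.
t = np.linspace(0.5, 10.0, 50)
A_true, T_true, d_true = 3.0, 2.0, 0.2
y = A_true * (1 - np.exp(-(t - d_true) / T_true))

# Sum of squared errors, same role as the fminsearch handle fn.
sse = lambda p: np.sum((p[0] * (1 - np.exp(-(t - p[2]) / p[1])) - y) ** 2)

res = minimize(sse, x0=[2.5, 1.5, 0.1], method="Nelder-Mead",
               options={"xatol": 1e-10, "fatol": 1e-14,
                        "maxiter": 2000, "maxfev": 4000})
print(res.x)  # close to [3.0, 2.0, 0.2]
```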

How can I make all-in-one polynomial from multi-polynomial?

I'm not familiar with advanced math, so I don't know where to start.
I found an article like this and am just following its description, but it is not easy for me.
I'm not sure how to make just one polynomial equation (or something like that) out of the above 4 polynomial equations. Is this possible?
If yes, would you please help me work out how to get such a polynomial (or equation)? If not, would you let me know why?
UPDATE
I'd like to try as following
clear all;
clc
ab = (H' * H) \ (H' * y);
y2 = H * ab;
Finally I get some numbers like this.
So, is this what it means?
As you can see from the red curve, something is wrong.
What am I missing?
All the article says is "you can combine multiple data sets into one to get a single polynomial".
You can also go in the other direction: subdivide your data set into pieces and get as many separate ones as you wish. (This is called n-fold validation.)
You start with a collection of n points (x, y). (Keep it simple by having only one independent variable x and one dependent variable y.)
Your first step should be to plot the data, look at it, and think about what kind of relationship between the two would explain it well.
Your next step is to assume some form for the relationship between the two. People like polynomials because they're easy to understand and work with, but other, more complex relationships are possible.
One polynomial might be:
y = c0 + c1*x + c2*x^2 + c3*x^3
This is your general relationship between the dependent variable y and the independent variable x.
You have n points (x, y). Your function can't go through every point. In the example I gave there are only four coefficients. How do you calculate the coefficients for n >> 4?
That's where the matrices come in. You have n equations:
y(1) = c0 + c1*x(1) + c2*x(1)^2 + c3*x(1)^3
....
y(n) = c0 + c1*x(n) + c2*x(n)^2 + c3*x(n)^3
You can write these as one matrix equation:
y = H * c
where H is the n-by-4 matrix whose i-th row is [1, x(i), x(i)^2, x(i)^3].
Premultiply both sides by transpose(H) (written H' in MATLAB):
transpose(H) * y = transpose(H) * H * c
Do a standard matrix inversion or LU decomposition to solve this for the unknown vector of coefficients c. These particular coefficients minimize the sum of squared differences between the function evaluated at each point x and your actual value y.
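A numerical sketch of exactly those normal equations (Python for illustration; the cubic's coefficients are made up):

```python
import numpy as np

# Points generated from a known cubic: y = c0 + c1*x + c2*x^2 + c3*x^3.
x = np.linspace(-1.0, 1.0, 50)
c_true = np.array([1.0, -2.0, 0.5, 3.0])
y = c_true[0] + c_true[1]*x + c_true[2]*x**2 + c_true[3]*x**3

# H has one row [1, x, x^2, x^3] per data point (n rows, 4 columns).
H = np.column_stack([np.ones_like(x), x, x**2, x**3])

# Normal equations: transpose(H) * y = transpose(H) * H * c, solved for c.
c = np.linalg.solve(H.T @ H, H.T @ y)
print(c)  # recovers c_true on noiseless data
```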
Update:
I don't know where this fixation with those polynomials comes from.
Your y vector? Wrong. Your H matrix? Wrong again.
If you must insist on using those polynomials, here's what I'd recommend: You have a range of x values in your plot. Let's say you have 100 x values, equally spaced between 0 and your max value. Those are the values to plug into your H matrix.
Use the polynomials to synthesize sets of y values, one for each polynomial.
Combine all of them into a single large problem and solve for a new set of coefficients. If you want a 3rd-order polynomial, you'll have only four coefficients, and the result will be the least-squares best approximation of all the synthesized data you created from your four polynomials.
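That combine-and-refit step can be sketched as follows (Python; the four source cubics are invented for illustration):

```python
import numpy as np

# Four made-up source polynomials; each row of coefficients is c0..c3.
polys = [np.array([0.0, 1.0, 0.0, 0.10]),
         np.array([0.5, 0.8, 0.1, 0.00]),
         np.array([-0.2, 1.2, 0.0, 0.05]),
         np.array([0.1, 0.9, 0.2, 0.00])]

# Common x grid and its design matrix (one row [1, x, x^2, x^3] per point).
x = np.linspace(0.0, 10.0, 100)
H1 = np.column_stack([np.ones_like(x), x, x**2, x**3])

# Synthesize y values from every polynomial and stack them into one problem.
H = np.vstack([H1] * len(polys))
y = np.concatenate([H1 @ c for c in polys])

# One least-squares solve over the combined data gives a single cubic.
c_combined = np.linalg.lstsq(H, y, rcond=None)[0]
print(c_combined)
```

Because every polynomial is evaluated on the same x grid, the combined fit comes out as the average of the four coefficient vectors; with different grids per polynomial it would be a genuine weighted compromise.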

I have a linear line of best fit. I need data points that will not change my line of best fit

I'm giving a presentation about fitting lines of best fit. I have a simple linear line: y = 1x + 0. I'm trying to generate scattered data points that I can put in a scatter plot while keeping the line of best fit at the same equation: y = 1x + 0.
I'd love to learn this technique in either R or Excel, whichever is easier.
Thanks!
First generate a set of points x and y that fit your equation exactly. Then add an error vector to y that is orthogonal to x. To do this, generate a vector r of the same length as y, then compute the residuals from the least-squares fit of r on x: e = r - rhat, where rhat = bhat + ahat*x (bhat and ahat are the least-squares coefficients for this r, not your original b). Finally, add e to y.
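A quick numerical check of the trick, sketched in numpy (the x grid and random seed are arbitrary):

```python
import numpy as np

# Points exactly on the target line y = 1*x + 0.
rng = np.random.default_rng(3)
x = np.linspace(0.0, 10.0, 30)
y = 1.0 * x + 0.0

# Any random vector r; its least-squares residuals against [1, x] are
# orthogonal to both the intercept column and x.
r = rng.normal(size=x.size)
A = np.column_stack([np.ones_like(x), x])
rhat = A @ np.linalg.lstsq(A, r, rcond=None)[0]
e = r - rhat

# Adding e scatters the points but cannot change the fitted line.
bhat, ahat = np.linalg.lstsq(A, y + e, rcond=None)[0]
print(bhat, ahat)  # intercept stays ~0, slope stays ~1
```

The fit is unchanged because least squares projects y + e onto the column space of [1, x], and e was constructed to be orthogonal to that space.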