I have a two-parameter model that includes an integral, and I'm quite lost as to how to solve this.
My data consists of a set of given molecule sizes, r_m, and a calculated response, K.
The theoretical model is a function of this molecule size, r_m, but it also involves an integral over the pore size distribution of an adsorbent material (assumed Gaussian, to make it simpler). It has the following form:
\begin{equation}
K(r_m) = \frac{\int_{r_m}^{120} \exp\left(-\frac{1}{2}\left(\frac{r - r_p}{s_p}\right)^2\right) \left(1 - \left(\frac{r_m}{r}\right)^2\right) \mathrm{d}r}{\int_{0}^{120} \exp\left(-\frac{1}{2}\left(\frac{r - r_p}{s_p}\right)^2\right) \mathrm{d}r}
\end{equation}
r_p and s_p are the two parameters.
So far I've tried to solve this in MATLAB based on the following post:
https://de.mathworks.com/help/optim/examples/nonlinear-data-fitting.html
and this is my code so far:
Data = ...
[0.5 1
1.1 0.83
1.6 0.74
2.2 0.55
2.5 0.28
3.5 0];
r = Data(:,1);
K_exp = Data(:,2);
F = @(x,xdata) quad((exp(-1/2.*((xdata - x(1))/x(2)).^2).*(1 - (xdata(1)./xdata).^2)),xdata(1),120)./quad(exp(-1/2.*((xdata - x(1))/x(2)).^2),0,120);
x0 = [6 0.5] ;
[x,resnorm,~,exitflag,output] = lsqcurvefit(F,x0,r,K_exp)
Does this adaptation make sense? I'm not sure, on the one hand, how to properly include an integral in the fitting syntax, and on the other hand I don't know how to properly tell MATLAB that the lower bound of integration for the numerator should be the size of the molecule, r_m. What I'm imagining is that xdata is some sort of vector for the size, r, just as in the linked example for time, t. However, the lower limit of integration is not fixed but varies for each of the points I want to fit.
Any help is greatly appreciated!
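One way to handle the varying lower limit (a sketch, not from the original post, using integral since quad is deprecated) is to wrap the numerator in arrayfun so that every data point r_m gets its own integration bound; the upper limit 120 and the start values are taken from the question:
% Sketch only: x(1) = r_p, x(2) = s_p; r and K_exp as defined above.
gauss = @(r_, x) exp(-0.5*((r_ - x(1))./x(2)).^2); % unnormalized Gaussian PSD
Knum = @(rm, x) integral(@(r_) gauss(r_, x).*(1 - (rm./r_).^2), rm, 120); % numerator, lower limit = rm
F = @(x, xdata) arrayfun(@(rm) Knum(rm, x), xdata) ...
    ./ integral(@(r_) gauss(r_, x), 0, 120); % denominator is the same for all points
x0 = [6 0.5];
[x, resnorm, ~, exitflag, output] = lsqcurvefit(F, x0, r, K_exp)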
I am trying to make a small-angle approximation in MATLAB's symbolic toolbox. This is being used for the equations of motion in a spacecraft control simulation (and yes, I need to linearize; I can't leave them in their more exact form). For those unfamiliar, the small-quantity approximation does a few main things that I need. For small quantities delta and gamma,
delta times gamma is approximately 0
delta^2 is approximately 0 (same with higher powers)
sin(delta) is approximately delta
cos(delta) is approximately 1
I have tried using MATLAB's taylor function (link here), but it doesn't seem to be doing what I want except in a very specific scenario (which I am sure is coincidental anyway). A test case is presented below:
syms psiX psiY psiZ rGMag mu Ixx Iyy Izz
QLB = [1,psiZ,-psiY;-psiZ,1,psiX;psiY,-psiX,1]; %linearized version of the rotation matrix from the L frame to the B frame
rG_LVLH = [0;0;rGMag]; %magnitude of the rG vector expressed in the L frame
rG = QLB*rG_LVLH
G = 3*mu/rGMag^5 .* [rG(2)*rG(3)*(Izz-Iyy);rG(1)*rG(3)*(Ixx-Izz);rG(1)*rG(2)*(Iyy-Ixx)]; %gravity-gradient torque
The desired output of the above should have the G vector with a 0 in the third component and symbolic variables left in the other two. This particular example doesn't include a trigonometric example, but I can provide if necessary. Thanks.
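One way to make taylor behave like a small-quantity approximation (a sketch, not from the original post) is the standard bookkeeping-parameter trick: multiply every small quantity by a symbolic epsilon, expand to first order in epsilon, and then set epsilon = 1. Products and powers of small terms carry epsilon^2 or higher and are dropped automatically; sin(epsilon*delta) and cos(epsilon*delta) expand to delta and 1 the same way:
syms psiX psiY psiZ rGMag mu Ixx Iyy Izz epsilon real
% Scale each small angle by the bookkeeping parameter epsilon.
QLB = [1,epsilon*psiZ,-epsilon*psiY;-epsilon*psiZ,1,epsilon*psiX;epsilon*psiY,-epsilon*psiX,1];
rG = QLB*[0;0;rGMag];
G = 3*mu/rGMag^5 .* [rG(2)*rG(3)*(Izz-Iyy);rG(1)*rG(3)*(Ixx-Izz);rG(1)*rG(2)*(Iyy-Ixx)];
% Keep terms up to first order in epsilon, then remove the bookkeeping parameter.
Glin = subs(taylor(G, epsilon, 'Order', 2), epsilon, 1)
For the test case above, this leaves symbolic first-order terms in the first two components of G and a 0 in the third, as desired.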
I want to compare the error between y and yhat. y is generated using known values, which are the coefficients of a Moving Average model. yhat is generated using estimates of those coefficients. What statistics show how close the outputs are? In machine learning papers I have seen the standard deviation and the mean square error used as performance metrics, but I cannot understand how to apply them in this example. Any guidance will be very helpful. Thank you.
N = 100;
a1=0.2;
b1=0.5;
h = [1 a1 b1]; %channel coefficients
h_hat = [1 0.23 0.45];
data = rand(1,N);
y = filter(h,1,data); %transmitted signal through MA channel
yhat = filter(h_hat,1,data);
This is how to calculate the MSE:
MSE = mean((y - yhat).^2)
And here is the standard error of the mean:
err=y - yhat;
SE = std(err)/sqrt(length(err));
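If a scale-free number is preferable, one option (my addition, not part of the original answer) is to normalize the MSE by the signal power:
NMSE = mean((y - yhat).^2) / mean(y.^2) % 0 = perfect agreement, comparable across scales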
However, the metric you use should address your research question/hypothesis. It might be that SE or MSE is not the right choice. Without knowing what you are investigating, it is difficult to give any suggestions.
I want to get the offset in samples between two datasets in Matlab (getting them synced in time), a quite common issue. For this I use the cross-correlation function xcorr or the cross-covariance function xcov (both give similar results in most cases for this purpose). With artificial data it works fine, but I struggle with "real" data, even though it should be pretty much the same. Matlab always reports an offset of zero. I'm using this simple piece of code:
[crossCorr] = xcov(b, c);
[~, peakIndex] = max(crossCorr)
offset = peakIndex - length(b)
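A variant of the same computation (a sketch using only the question's b and c) lets xcov return the lag axis directly, which avoids the manual index arithmetic and also works when b and c differ in length:
[crossCorr, lagVec] = xcov(b, c); % lagVec holds the lag of each element
[~, peakIndex] = max(crossCorr);
offset = lagVec(peakIndex)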
I've posted a fully runnable example m-file with a downsampled data excerpt on pastebin:
Code with data on pastebin
EDIT: The downsampled excerpt seems not to be fully suitable for evaluating the effect. Here's a much larger sample with the original frequency, please use this one instead. Unfortunately it was too big for pastebin.
As the plot shows it should be no problem at all to get the offset via cross covariance. I also tried to scale the data nicer in order to avoid numerical problems, but that didn't change anything at all.
It would be great if someone could point out my mistake.
There's nothing wrong with your method in principle, I used exactly the same approach successfully for temporally aligning different audio recordings of the same signal.
However, it appears that for your time series, correlation (or covariance) is simply not the right measure to compare shifted versions – possibly because they contain components of a time scale comparable to the total length. An alternative is to use residual variance, i.e. the variance of the difference between shifted versions. Here is a (not particularly elegant) implementation of this idea:
lags = -1000 : 1000;
v = nan(size(lags));
for i = 1 : numel(lags)
lag = lags(i);
if lag >= 0
v(i) = var(b(1 + lag : end) - c(1 : end - lag));
else
v(i) = var(b(1 : end + lag) - c(1 - lag : end));
end
end
[~, ind] = min(v);
minlag = lags(ind);
For your (longer) data set, this results in minlag = 169. A plot of residual variance over lags shows a clear minimum at this lag.
Your data has a minor peak around 5 and a major peak around 101.
If I knew something about my data, then I could window around an acceptable range of offsets, as shown below.
Code for initial exploration:
figure; clc;
subplot(2,1,1)
plot(1:numel(b), b);
hold on
plot(1:numel(c), c, 'r');
legend('b','c')
subplot(2,1,2)
plot(crossCorr,'.b-')
hold on
plot(peakIndex,crossCorr(peakIndex),'or')
legend('crossCorr','peak')
Initial figure (the two series b and c on top, the cross-correlation with its peak marked below):
If you zoom into the first peak you can see that it is not only high around 5, but also polynomial "enough" to allow sub-element offsets. That is convenient.
Here is what the curve-fitting tool gives as the analytic form for a cubic:
Linear model Poly3:
f(x) = p1*x^3 + p2*x^2 + p3*x + p4
Coefficients (with 95% confidence bounds):
p1 = 8.515e-013 (8.214e-013, 8.816e-013)
p2 = -3.319e-011 (-3.369e-011, -3.269e-011)
p3 = 2.253e-010 (2.229e-010, 2.277e-010)
p4 = -4.226e-012 (-7.47e-012, -9.82e-013)
Goodness of fit:
SSE: 2.799e-024
R-square: 1
Adjusted R-square: 1
RMSE: 6.831e-013
You can note from the SSE that the cubic fits to roundoff.
To compute the root of the derivative (near x = 4), you can use the following Matlab code:
% Coefficients
p1 = 8.515e-013
p2 = -3.319e-011
p3 = 2.253e-010
p4 = -4.226e-012
% Linear model Poly3:
syms x
f = p1*x^3 + p2*x^2 + p3*x + p4
xz1 = fzero(@(y) double(subs(diff(f), x, y)), 4)
and you get the root at x = 4.01420240431444.
EDIT:
Hmmm. How about fitting a Gaussian mixture model to the cross-correlation? You sweep through a good range of component counts, you do between 10 and 30 repeats for each, and you find which component count has the best (lowest) BIC. So you fit a gmdistribution to the lower subplot of the first figure, then test the covariance at the means of the components in decreasing order.
I would try the offset at the means and just look at the sum-squared error, then pick the offset that has the lowest error.
Procedure:
compute the cross-correlation
fit the cross-correlation to a Gaussian mixture model
sweep a reasonable range of component counts (start with 1-10)
use a reasonable number of repeats (10 to 30, depending on run-to-run variation)
compute the Bayes Information Criterion (BIC) for each component count and pick the lowest, because it indicates a reasonable balance of error and parameter count
each component has a mean; evaluate that mean as a candidate offset and compute the sum-squared error (SSE) when you shift by that offset
pick the offset of the component that gives the best SSE (see the sketch below)
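A rough sketch of that procedure (my sketch, with all names hypothetical; it assumes the Statistics Toolbox and the crossCorr and b from the question, and treats the shifted cross-correlation as a density to resample lags from):
cc = crossCorr - min(crossCorr); % make the weights non-negative
cc = cc / sum(cc);
lagAxis = (1:numel(cc)) - length(b); % lag value of each element
samples = randsample(lagAxis, 1e4, true, cc); % resample lags according to cc
samples = samples(:); % fitgmdist expects a column
bestBIC = inf;
for k = 1:10 % sweep component counts
    gm = fitgmdist(samples, k, 'Replicates', 10, 'RegularizationValue', 1e-6);
    if gm.BIC < bestBIC % keep the model with the lowest BIC
        bestBIC = gm.BIC;
        bestGM = gm;
    end
end
candidateOffsets = sort(bestGM.mu) % candidate lags, to be ranked by SSE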
Let me know how well that works.
If the two signals are misaligned by a non-integer number of samples, e.g. 3.7 samples, then the xcorr method may find the maximum at 4 samples; it won't be able to find the accurate time shift. In this case, you could try a method called "unified change detection". The paper is available at:
http://www.phmsociety.org/node/1404/
Good Luck.
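For small sub-sample shifts there is also a simpler, widely used alternative (not the method from the paper): fit a parabola through the discrete correlation peak and its two neighbours, and take its vertex as the sub-sample lag estimate. A sketch, reusing crossCorr and b from the question and assuming the peak is not at an endpoint:
[~, p] = max(crossCorr);
y1 = crossCorr(p-1); y2 = crossCorr(p); y3 = crossCorr(p+1);
delta = 0.5*(y1 - y3) / (y1 - 2*y2 + y3); % fractional part of the peak, in (-0.5, 0.5)
subSampleOffset = (p + delta) - length(b)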
Could anyone here help me with the following problem?
The following code calculates the best polynomial fit to a given data set, that is, a polynomial of a specified degree.
Unfortunately, whatever the data set may be, MATLAB usually produces a totally wrong fit at degree 6 or higher. The fitted curve usually bends away from the data downwards, in an exponential-looking manner (see the example: degree = 8).
x=[1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5] % experimental x-values
y=[4.3 6.2 10.1 13.5 19.8 22.6 24.7 29.2] % experimental y-values
degree=8; % specify the degree
A = zeros(length(x),degree+1);
for exponent=0:degree;
for data=1:length(x);
A(data,exponent+1)=x(data).^exponent; % build the Vandermonde-type matrix A
end;
end;
a=inv((transpose(A)*A))*transpose(A)*y'; % a contains the coefficients of the polynomial
a=flipud(a);
fitpolynoom=polyval(a,x);
error=sum((y-fitpolynoom).^2); % fit error according to the least-squares criterion
figure;
plot(x,y,'black*',x,fitpolynoom,'g-');
error % displays the value of the fit-error in the Matlab command window
Thanks in advance.
First, some remarks: for least-squares fitting of polynomials in Matlab, you could use the existing polyfit function instead. Furthermore (this may depend on your application), you probably should not be fitting $8$th-degree polynomials at all, especially when you have only $8$ data points: a degree-$8$ polynomial has $9$ coefficients, so $8$ points cannot even determine it uniquely. In this answer, I assume you have good reasons to fit polynomials to your data (e.g., just for self-study purposes).
The issue is a numerical problem arising from matrix inversion. For solving equations of the type $Ax=b$ where $A$ is a square matrix, actually inverting $A$ is not recommended (see the blog post 'Don't invert that matrix' by John D. Cook). In the least-squares case, instead of
\begin{equation}
a = (A^\mathrm{T} A)^{-1} A^\mathrm{T} y^\mathrm{T}
\end{equation}
it is better to solve
\begin{equation}
(A^\mathrm{T} A)a = A^\mathrm{T} y^\mathrm{T}
\end{equation}
by other means. In your MATLAB code, you may replace
a=inv((transpose(A)*A))*transpose(A)*y';
by
a = (transpose(A) * A) \ (transpose(A) * y');
By this modification to your code, I obtained a fit going through the data points.
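A further refinement (my note, not part of the original answer): MATLAB's backslash operator solves the rectangular least-squares problem directly via a QR factorization, so A'*A never has to be formed at all, which is numerically even better conditioned:
a = A \ y'; % least-squares solution via QR, without forming A'*A
a = flipud(a);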
I am working towards my degree thesis in image processing, and I'm using the Matlab Image Processing Toolbox. I'm calculating the correlation of an image with the co-occurrence matrix using the Matlab function graycoprops. My problem is that I can't understand the meaning of the formula that defines the correlation property (see the previous link):
In particular, what are \mu_i, \mu_j, \sigma_i, \sigma_j, if i and j are gray levels of the image?
I would imagine it's the mean and standard deviation in the x and y directions. i probably corresponds to x, and j to y. That's just a guess, though.
EDIT: This is supported by looking at the function code. I highly recommend you check it out yourself (simply type edit graycoprops), but here's the relevant part:
function Corr = calculateCorrelation(glcm,r,c)
...
% Calculate the mean and standard deviation of a pixel value in the row
% direction. E.g., for glcm = [0 0;1 0], mr is 2 and Sr is 0.
mr = meanIndex(r,glcm);
Sr = stdIndex(r,glcm,mr);
% mean and standard deviation of pixel value in the column direction, e.g.,
% for glcm = [0 0;1 0] mc is 1 and Sc is 0.
mc = meanIndex(c,glcm);
Sc = stdIndex(c,glcm,mc);
I have had the same question, and the paper "Statistical Texture Measures Computed from Gray Level Coocurrence Matrices" by Fritz Albregtsen (2008) was of great help, as it gives a precise definition of all the formulas.
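For reference, the standard definitions (my summary of the usual Haralick conventions, not a quote from the paper), with p(i,j) the normalized GLCM, are:
\begin{equation}
\mu_i = \sum_{i,j} i \, p(i,j), \qquad \mu_j = \sum_{i,j} j \, p(i,j)
\end{equation}
\begin{equation}
\sigma_i^2 = \sum_{i,j} (i - \mu_i)^2 \, p(i,j), \qquad \sigma_j^2 = \sum_{i,j} (j - \mu_j)^2 \, p(i,j)
\end{equation}
\begin{equation}
\mathrm{Correlation} = \sum_{i,j} \frac{(i - \mu_i)(j - \mu_j)\, p(i,j)}{\sigma_i \sigma_j}
\end{equation}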