Matlab partial area under the curve - matlab

I want to plot the area above and below a particular value in x axis.
The problem i am facing is with discrete values. The code below for instance has an explicit X=10 so i have written it in such a way that i can find the index and calculate the values above and below that particular value but if i want to find the area under the curve above and below 4 this program will now work.
Though in the plot matlab does a spline fitting(or some sort of fitting for connecting discrete values) there is a value for y corresponding to x=4 that matlab computes i cant seem to store or access it.
%Example for Area under the curve and partial area under the curve using Trapezoidal rule of integration
clc;
close all;
clear all;
x=[0,5,10,15,20];% domain
y=[0,25,50,25,0];% Values
LP=log2(y);
plot(x,y);
full = trapz(x,y);% plot of the total area
I=find(x==10);% in our case will be the distance value up to which we want
half = trapz(x(1:I),y(1:I));%Plot of the partial area
How can we find the area under the curve for a value of ie x = 2 or 3 or 4 or 6 or 7 or ...

This is an elaboration of patrik's comment, "first interpolate and then integrate".
For the purpose of this answer I'll assume that the area in question is the area that can be seen in the plot, and since plot connects points by straight lines I assume that linear interpolation is adequate. Moreover, since the trapezoidal rule itself is based on linear interpolation, we only need interpolated values at the beginning and end of the interval.
Starting from the given points
x = [0, 5, 10, 15, 20];
y = [0, 25, 50, 25, 0];
and the integration interval limits, say
xa = 4;
xb = 20;
we first select the data points within the limits
ind = (x > xa) & (x < xb);
xw = x(ind);
yw = y(ind);
and then complete them by interpolation values at the edges:
ya = interp1(x, y, xa);
yb = interp1(x, y, xb);
xw = [xa, xw, xb];
yw = [ya, yw, yb];
Now we can simply apply trapezoidal integration:
area = trapz(xw, yw);

I think that you either need more samples, or to interpolate the data. Another alternative is to use a function handle. Then you need to know the function though. Example using linear interpolation follows.
x0 = [0;5;10;15;20];
y0 = [0,25,50,25,0];
x1 = 0:20;
y1 = interp1(x0,y0,x1,'linear');
xMax = 4;
partInt = trapz(x1(x1<=xMax),y1(x1<=xMax));
Some other kind of interpolation may be suitable, but that is hard to say without more information. Also, this interpolates from the beginning to x. However, I guess figuring out how to change the limits should be easy from here. This solution is different than the former, since it is less depending on the pyramid shape of the data. So to say, it is more general.

Related

Limit cycle in Matlab for 2nd order autonomous system

I wish to create a limit cycle in Matlab. A limit cycle looks something like this:
I have no idea how to do it though, I've never done anything like this in Matlab.
The equations to describe the limit cycle are the following:
x_1d=x_2
x_2d=-x_1+x_2-2*(x_1+2*x_2)x_2^2
It is to be centered around the equilibrium which is (0,0)
Can any of you help me?
If you use the partial derivatives of your function to make a vector field, you can then use streamlines to visualize the cycle that you are describing.
For example, the function f = x^2+y^2
Gives me partial derivatives dx = 2x, dy=2y For the visualization, I sample from the partial derivatives over a grid.
[x,y] = meshgrid(0:0.1:1,0:0.1:1);
dx = 2*x;
dy = 2*y;
To visualize the vector field, I use quiver;
figure;
quiver(x, y, dx, dy);
Using streamline, I can visualize the path a particle injected into the vector field would take. In my example, I inject the particle at (0.1, 0.1)
streamline(x,y, dx, dy, 0.1, 0.1);
This produces the following visualization
In your case, you can omit the quiver step to remove the hedgehog arrows at every grid point.
Here's another example that shows the particle converging to an orbit.
Edit: Your function specifically.
So as knedlsepp points out, the function you are interested in is a bit ambiguously stated. In Matlab, * represents the matrix product while .* represents the element-wise multiplication between matrices. Similarly, '^2' represents MM for a matrix M, while .^2 represents taking the element-wise power.
So,
[x_1,x_2] = meshgrid(-4:0.1:4,-4:0.1:4);
dx_1 = x_2;
dx_2 = -x_1+x_2-2*(x_1+2*x_2)*(x_2)^2;
figure; streamline(x_1,x_2, dx_1, dx_2, 0:0.1:4, 0:0.1:4);
Looks like
This function will not show convergence because it doesn't converge.
knedlsepp suggests that the function you are actually interested in is
dx_1 = -1 * x_2;
dx_2 = -1 * -x_1+x_2-2*(x_1+2*x_2).*(x_2).^2;
His post has a nice description of the rest.
This post shows the code to produce the integral lines of your vector field defined by:
dx/dt = y
dy/dt = -x+y-2*(x+2*y)*y^2.
It is important to properly vectorize this function. (i.e. Introducing dots at all the important places)
dxdt = #(x,y) y;
dydt = #(x,y) -x+y-2*(x+2*y).*y.^2;
[X,Y] = meshgrid(linspace(-4,4,100));
[sx,sy] = meshgrid(linspace(-3,3,20));
streamline(stream2(X, Y, ... % Points
dxdt(X,Y), dydt(X,Y),... % Derivatives
sx, sy)); % Starting points
axis equal tight
To get a picture more similar to yours, change the grid size and starting points:
[X,Y] = meshgrid(linspace(-1,1,100));
[sx,sy] = meshgrid(linspace(0,0.75,20),0.2);

What interpolation technique does Matlab plot function use to show the data?

It seems to be very basic question, but I wonder when I plot x values against y values, what interpolation technique is used behind the scene to show me the discrete data as continuous? Consider the following example:
x = 0:pi/100:2*pi;
y = sin(x);
plot(x,y)
My guess is it is a Lagrangian interpolation?
No, it's just a linear interpolation. Your example uses a quite long dataset, so you can't tell the difference. Try plotting a short dataset and you'll see it.
MATLAB's plot performs simple linear interpolation. For finer resolution you'd have to supply more sample points or interpolate between the given x values.
For example taking the sinus from the answer of FamousBlueRaincoat, one can just create an x vector with more equidistant values. Note, that the linear interpolated values coincide with the original plot lines, as the original does use linear interpolation as well. Note also, that the x_ip vector does not include (all) of the original points. This is why the do not coincide at point (~0.8, ~0.7).
Code
x = 0:pi/4:2*pi;
y = sin(x);
x_ip = linspace(x(1),x(end),5*numel(x));
y_lin = interp1(x,y,x_ip,'linear');
y_pch = interp1(x,y,x_ip,'pchip');
y_v5c = interp1(x,y,x_ip,'v5cubic');
y_spl = interp1(x,y,x_ip,'spline');
plot(x,y,x_ip,y_lin,x_ip,y_pch,x_ip,y_v5c,x_ip,y_spl,'LineWidth',1.2)
set(gca,'xlim',[pi/5 pi/2],'ylim',[0.5 1],'FontSize',16)
hLeg = legend(...
'No Interpolation','Linear Interpolation',...
'PChip Interpolation','v5cubic Interpolation',...
'Spline Interpolation');
set(hLeg,'Location','south','Fontsize',16);
By the way..this does also apply to mesh and others
[X,Y] = meshgrid(-8:2:8);
R = sqrt(X.^2 + Y.^2) + eps;
Z = sin(R)./R;
figure
mesh(Z)
No, Lagrangian interpolation with 200 equally spaced points would be an incredibly bad idea. (See: Runge's phenomenon).
The plot command simply connects the given (x,y) points by straight lines, in the order given. To see this for yourself, use fewer points:
x = 0:pi/4:2*pi;
y = sin(x);
plot(x,y)

How to make a vector that follows a certain trend?

I have a set of data with over 4000 points. I want to exclude grooves from them, ideally from the point from which they start. The data look for example like this:
The problem with this is the noise I get at the top of the plateaus. I have an idea, in which I would take an average value of the most common within some boundaries (again, ideally sth like the red line here:
and then I would construct a temporary matrix, which would fill up one by one with Y if they are less than this average. If the Y(i) would rise above average, the matrix would find its minima and compare it with the global minima. If the temporary matrix's minima wouldn't be sth like 80% of the global minima, it would be discarded as noise.
I've tried using mean(Y), interpolating and fitting it in a polynomial (the green line) - none of those method would cut it to the point I would be satisfied.
I need this to be extremely robust and it doesn't need to be quick. The top and bottom values can vary a lot, as well as the shape of the plateaus. The groove width is more or less the same.
Do you have any ideas? Again, the point is to extract the values that would make the groove.
How about a median filter?
Let's define some noisy data similar to yours, and plot it in blue:
x = .2*sin((0:9999)/1000); %// signal
x(1000:1099) = x(1000:1099) + sin((0:99)/50*pi); %// noise: spike
x(5000:5199) = x(5000:5199) - sin((0:199)/100*pi); %// noise: wider spike
x = x + .05*sin((0:9999)/10); %// noise: high-freq ripple
plot(x)
Now apply the median filter (using medfilt2 from the Image Processing Toolbox) and plot in red. The parameter k controls the filter memory. It should chosen to be large compared to noise variations, and small compared to signal variations:
k = 500; %// filter memory. Choose as needed
y = medfilt2(x,[1 k]);
hold on
plot(y, 'r', 'linewidth', 2)
In case you don't have the image processing toolbox and can't use medfilt2 a method that's more manual. Skip the extreme values, and do a curve fit with sin1 as curve type. Note that this will only work if the signal is in fact a sine wave!
x = linspace(0,3*pi,1000);
y1 = sin(x) + rand()*sin(100*x).*(mod(round(10*x),5)<3);
y2 = 20*(mod(round(5*x),5) == 0).*sin(20*x);
y = y1 + y2; %// A messy sine-wave
yy = y; %// Store the messy sine-wave
[~, idx] = sort(y);
y(idx(1:round(0.15*end))) = y(idx(round(0.15*end))); %// Flatten out the smallest values
y(idx(round(0.85*end):end)) = y(idx(round(0.85*end)));%// Flatten out the largest values
[foo goodness output] = fit(x.',y.', 'sin1'); %// Do a curve fit
plot(foo,x,y) %// Plot it
hold on
plot(x,yy,'black')
Might not be perfect, but it's a step in the right direction.

3D curvefitting

I have discrete regular grid of a,b points and their corresponding c values and I interpolate it further to get a smooth curve. Now from interpolation data, I further want to create a polynomial equation for curve fitting. How to fit 3D plot in polynomial?
I try to do this in MATLAB. I used Surface fitting toolbox in MATLAB (r2010a) to curve fit 3-dimensional data. But, how does one find a formula that fits a set of data to the best advantage in MATLAB/MAPLE or any other software. Any advice? Also most useful would be some real code examples to look at, PDF files, on the web etc.
This is just a small portion of my data.
a = [ 0.001 .. 0.011];
b = [1, .. 10];
c = [ -.304860225, .. .379710865];
Thanks in advance.
To fit a curve onto a set of points, we can use ordinary least-squares regression. There is a solution page by MathWorks describing the process.
As an example, let's start with some random data:
% some 3d points
data = mvnrnd([0 0 0], [1 -0.5 0.8; -0.5 1.1 0; 0.8 0 1], 50);
As #BasSwinckels showed, by constructing the desired design matrix, you can use mldivide or pinv to solve the overdetermined system expressed as Ax=b:
% best-fit plane
C = [data(:,1) data(:,2) ones(size(data,1),1)] \ data(:,3); % coefficients
% evaluate it on a regular grid covering the domain of the data
[xx,yy] = meshgrid(-3:.5:3, -3:.5:3);
zz = C(1)*xx + C(2)*yy + C(3);
% or expressed using matrix/vector product
%zz = reshape([xx(:) yy(:) ones(numel(xx),1)] * C, size(xx));
Next we visualize the result:
% plot points and surface
figure('Renderer','opengl')
line(data(:,1), data(:,2), data(:,3), 'LineStyle','none', ...
'Marker','.', 'MarkerSize',25, 'Color','r')
surface(xx, yy, zz, ...
'FaceColor','interp', 'EdgeColor','b', 'FaceAlpha',0.2)
grid on; axis tight equal;
view(9,9);
xlabel x; ylabel y; zlabel z;
colormap(cool(64))
As was mentioned, we can get higher-order polynomial fitting by adding more terms to the independent variables matrix (the A in Ax=b).
Say we want to fit a quadratic model with constant, linear, interaction, and squared terms (1, x, y, xy, x^2, y^2). We can do this manually:
% best-fit quadratic curve
C = [ones(50,1) data(:,1:2) prod(data(:,1:2),2) data(:,1:2).^2] \ data(:,3);
zz = [ones(numel(xx),1) xx(:) yy(:) xx(:).*yy(:) xx(:).^2 yy(:).^2] * C;
zz = reshape(zz, size(xx));
There is also a helper function x2fx in the Statistics Toolbox that helps in building the design matrix for a couple of model orders:
C = x2fx(data(:,1:2), 'quadratic') \ data(:,3);
zz = x2fx([xx(:) yy(:)], 'quadratic') * C;
zz = reshape(zz, size(xx));
Finally there is an excellent function polyfitn on the File Exchange by John D'Errico that allows you to specify all kinds of polynomial orders and terms involved:
model = polyfitn(data(:,1:2), data(:,3), 2);
zz = polyvaln(model, [xx(:) yy(:)]);
zz = reshape(zz, size(xx));
There might be some better functions on the file-exchange, but one way to do it by hand is this:
x = a(:); %make column vectors
y = b(:);
z = c(:);
%first order fit
M = [ones(size(x)), x, y];
k1 = M\z;
%least square solution of z = M * k1, so z = k1(1) + k1(2) * x + k1(3) * y
Similarly, you can do a second order fit:
%second order fit
M = [ones(size(x)), x, y, x.^2, x.*y, y.^2];
k2 = M\z;
which seems to have numerical problems for the limited dataset you gave. Type help mldivide for more details.
To make an interpolation over some regular grid, you can do like so:
ngrid = 20;
[A,B] = meshgrid(linspace(min(a), max(a), ngrid), ...
linspace(min(b), max(b), ngrid));
M = [ones(numel(A),1), A(:), B(:), A(:).^2, A(:).*B(:), B(:).^2];
C2_fit = reshape(M * k2, size(A)); % = k2(1) + k2(2)*A + k2(3)*B + k2(4)*A.^2 + ...
%plot to compare fit with original data
surfl(A,B,C2_fit);shading flat;colormap gray
hold on
plot3(a,b,c, '.r')
A 3rd-order fit can be done using the formula given by TryHard below, but the formulas quickly become tedious when the order increases. Better write a function that can construct M given x, y and order if you have to do that more than once.
This sounds like more of a philosophical question than specific implementation, specifically to bit - "how does one find a formula that fits a set of data to the best advantage?" In my experience that is a choice you have to make depending on what you're trying to achieve.
What defines "best" for you? For a data fitting problem you can keep adding more and more polynomial coefficients and making a better R^2 value... but will eventually "over fit" the data. A downside of high order polynomials is behavior outside the bounds of the sample data which you've used to fit your response surface - it can quickly go off in some wild direction which may not be appropriate for whatever it is you're trying to model.
Do you have insight into the physical behavior of the system / data you're fitting? That can be used as a basis for what set of equations to use to create a math model. My recommendation would be to use the most economical (simple) model you can get away with.

How to fit a curve by a series of segmented lines in Matlab?

I have a simple loglog curve as above. Is there some function in Matlab which can fit this curve by segmented lines and show the starting and end points of these line segments ? I have checked the curve fitting toolbox in matlab. They seems to do curve fitting by either one line or some functions. I do not want to curve fitting by one line only.
If there is no direct function, any alternative to achieve the same goal is fine with me. My goal is to fit the curve by segmented lines and get locations of the end points of these segments .
First of all, your problem is not called curve fitting. Curve fitting is when you have data, and you find the best function that describes it, in some sense. You, on the other hand, want to create a piecewise linear approximation of your function.
I suggest the following strategy:
Split manually into sections. The section size should depend on the derivative, large derivative -> small section
Sample the function at the nodes between the sections
Find a linear interpolation that passes through the points mentioned above.
Here is an example of a code that does that. You can see that the red line (interpolation) is very close to the original function, despite the small amount of sections. This happens due to the adaptive section size.
function fitLogLog()
x = 2:1000;
y = log(log(x));
%# Find section sizes, by using an inverse of the approximation of the derivative
numOfSections = 20;
indexes = round(linspace(1,numel(y),numOfSections));
derivativeApprox = diff(y(indexes));
inverseDerivative = 1./derivativeApprox;
weightOfSection = inverseDerivative/sum(inverseDerivative);
totalRange = max(x(:))-min(x(:));
sectionSize = weightOfSection.* totalRange;
%# The relevant nodes
xNodes = x(1) + [ 0 cumsum(sectionSize)];
yNodes = log(log(xNodes));
figure;plot(x,y);
hold on;
plot (xNodes,yNodes,'r');
scatter (xNodes,yNodes,'r');
legend('log(log(x))','adaptive linear interpolation');
end
Andrey's adaptive solution provides a more accurate overall fit. If what you want is segments of a fixed length, however, then here is something that should work, using a method that also returns a complete set of all the fitted values. Could be vectorized if speed is needed.
Nsamp = 1000; %number of data samples on x-axis
x = [1:Nsamp]; %this is your x-axis
Nlines = 5; %number of lines to fit
fx = exp(-10*x/Nsamp); %generate something like your current data, f(x)
gx = NaN(size(fx)); %this will hold your fitted lines, g(x)
joins = round(linspace(1, Nsamp, Nlines+1)); %define equally spaced breaks along the x-axis
dx = diff(x(joins)); %x-change
df = diff(fx(joins)); %f(x)-change
m = df./dx; %gradient for each section
for i = 1:Nlines
x1 = joins(i); %start point
x2 = joins(i+1); %end point
gx(x1:x2) = fx(x1) + m(i)*(0:dx(i)); %compute line segment
end
subplot(2,1,1)
h(1,:) = plot(x, fx, 'b', x, gx, 'k', joins, gx(joins), 'ro');
title('Normal Plot')
subplot(2,1,2)
h(2,:) = loglog(x, fx, 'b', x, gx, 'k', joins, gx(joins), 'ro');
title('Log Log Plot')
for ip = 1:2
subplot(2,1,ip)
set(h(ip,:), 'LineWidth', 2)
legend('Data', 'Piecewise Linear', 'Location', 'NorthEastOutside')
legend boxoff
end
This is not an exact answer to this question, but since I arrived here based on a search, I'd like to answer the related question of how to create (not fit) a piecewise linear function that is intended to represent the mean (or median, or some other other function) of interval data in a scatter plot.
First, a related but more sophisticated alternative using regression, which apparently has some MATLAB code listed on the wikipedia page, is Multivariate adaptive regression splines.
The solution here is to just calculate the mean on overlapping intervals to get points
function [x, y] = intervalAggregate(Xdata, Ydata, aggFun, intStep, intOverlap)
% intOverlap in [0, 1); 0 for no overlap of intervals, etc.
% intStep this is the size of the interval being aggregated.
minX = min(Xdata);
maxX = max(Xdata);
minY = min(Ydata);
maxY = max(Ydata);
intInc = intOverlap*intStep; %How far we advance each iteraction.
if intOverlap <= 0
intInc = intStep;
end
nInt = ceil((maxX-minX)/intInc); %Number of aggregations
parfor i = 1:nInt
xStart = minX + (i-1)*intInc;
xEnd = xStart + intStep;
intervalIndices = find((Xdata >= xStart) & (Xdata <= xEnd));
x(i) = aggFun(Xdata(intervalIndices));
y(i) = aggFun(Ydata(intervalIndices));
end
For instance, to calculate the mean over some paired X and Y data I had handy with intervals of length 0.1 having roughly 1/3 overlap with each other (see scatter image):
[x,y] = intervalAggregate(Xdat, Ydat, #mean, 0.1, 0.333)
x =
Columns 1 through 8
0.0552 0.0868 0.1170 0.1475 0.1844 0.2173 0.2498 0.2834
Columns 9 through 15
0.3182 0.3561 0.3875 0.4178 0.4494 0.4671 0.4822
y =
Columns 1 through 8
0.9992 0.9983 0.9971 0.9955 0.9927 0.9905 0.9876 0.9846
Columns 9 through 15
0.9803 0.9750 0.9707 0.9653 0.9598 0.9560 0.9537
We see that as x increases, y tends to decrease slightly. From there, it is easy enough to draw line segments and/or perform some other kind of smoothing.
(Note that I did not attempt to vectorize this solution; a much faster version could be assumed if Xdata is sorted.)