I have a set of data and I want to use the Curve Fitting Toolbox in MATLAB to plot a spline through the data. I have done this:
x =
Columns 1 through 10
0 1.2500 1.8800 2.5000 5.0000 6.2500 6.8800 7.1900 7.5000 10.0000
Columns 11 through 13
12.5000 15.0000 20.0000
y =
Columns 1 through 10
-85.9300 -78.8200 -56.9500 -34.5600 -33.5700 -39.6400 -41.9600 -49.2800 -66.6000 -66.6100
Columns 11 through 13
-59.1600 -48.7800 -41.5300
cftool
[breaks,coefs,l,k,d] = unmkpp(pp)
breaks =
Columns 1 through 10
0 1.2500 1.8800 2.5000 5.0000 6.2500 6.8800 7.1900 7.5000 10.0000
Columns 11 through 13
12.5000 15.0000 20.0000
coefs =
-4.8535 30.6309 -25.0170 -85.9300
-4.8535 12.4304 28.8095 -78.8200
-11.9651 3.2573 38.6927 -56.9500
3.0330 -18.9977 28.9337 -34.5600
-0.2294 3.7501 -9.1852 -33.5700
-11.6351 2.8899 -0.8852 -39.6400
-68.6157 -19.1004 -11.0978 -41.9600
130.6350 -82.9130 -42.7220 -49.2800
-6.3971 38.5776 -56.4659 -66.6000
1.6010 -9.4008 16.4760 -66.6100
-0.2967 2.6064 -0.5099 -59.1600
-0.2967 0.3814 6.9597 -48.7800
l =
12
k =
4
d =
1
Correct me if I am wrong: can the command [breaks,coefs,l,k,d] = unmkpp(pp) help me get the piecewise equations from the spline I obtained? If so, how do I interpret its output, and what is the significance of the values in coefs, k and d? Basically, I want to obtain an equation (or equations) describing the spline I obtained through the Curve Fitting Toolbox. Any help would be greatly appreciated. Thanks!
This tries to explain how you can pick apart and display splines generated in Matlab.
Generate mock data
xx = [1:10];
yy = cos(xx);
Fit the data with a cubic spline
pp = spline(xx,yy);
Interpolate with the piecewise polynomial, evaluating it over a finer grid in x
xxf = linspace(min(xx),max(xx),100);
yyf=ppval(pp,xxf);
Start by inspecting pp, which contains all of the information about the piecewise polynomial:
pp =
form: 'pp'
breaks: [1 2 3 4 5 6 7 8 9 10]
coefs: [9x4 double]
pieces: 9
order: 4
dim: 1
The function
[breaks,coefs,l,k,d] = unmkpp(pp)
merely unwraps the contents of structure pp, such that:
d = pp.dim;
l = pp.pieces;
breaks = pp.breaks;
coefs = pp.coefs;
k = pp.order;
Therefore it isn't necessary to call unmkpp if pp is a structure containing all of the info (as above), and you just want the coefficients and the breaks. Instead you can just type
breaks = pp.breaks;
coefs = pp.coefs;
and continue working with this information, as shown below.
Note that for a cubic spline, the order of the polynomials is 4, since the polynomials have the form
C(1)*X^(K-1) + C(2)*X^(K-2) + ... + C(K-1)*X + C(K)
with K = 4, and therefore each polynomial has 4 coefficients C. The highest order term X^3 is consistent with the spline being cubic.
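To turn the rows of coefs into readable equations, a loop like the following can print one cubic per piece. This is only a sketch using the same mock data as above; note that MATLAB's pp-form stores each piece in the local variable (x - breaks(j)), so that shift appears in every printed term.

```matlab
% Sketch: print each spline piece as an explicit cubic in (x - breaks(j)).
xx = 1:10;
yy = cos(xx);
pp = spline(xx, yy);
for j = 1:pp.pieces
    c  = pp.coefs(j, :);        % [c3 c2 c1 c0] for piece j
    x0 = pp.breaks(j);          % left end of piece j
    fprintf('%g <= x <= %g:  y = %.4f*(x-%g)^3 + %.4f*(x-%g)^2 + %.4f*(x-%g) + %.4f\n', ...
            x0, pp.breaks(j+1), c(1), x0, c(2), x0, c(3), x0, c(4));
end
```

Read the same way, the first row of coefs in the question describes y = -4.8535*(x-0)^3 + 30.6309*(x-0)^2 - 25.0170*(x-0) - 85.9300 on 0 <= x <= 1.25.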
To evaluate the piecewise polynomials:
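(1) choose the piece over which you want to evaluate the polynomial, defined by breaks
(2) pick the correct coefficients for that piece, stored in coefs.
Because these are piecewise polynomials, the coefficients of each piece are expressed in a local variable measured from the left break of that piece, i.e. X stands for (x - breaks(n)). Each piece is therefore evaluated from 0 to breaks(n+1) - breaks(n) and then shifted to the actual value of x. In this example the breaks are spaced 1 apart, so the local range is 0-1; for unevenly spaced breaks, such as those in the question, use the actual width of each piece. We evaluate the selected piece with the standard function polyval, which evaluates a polynomial with known coefficients over a range of interest.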
So we find the coefficients cf corresponding to the piece and evaluate the polynomial at points xev:
xev = linspace(0,1,100);
cf = pp.coefs(1,:);
yyp=polyval(cf,xev);
We keep some additional info for plotting:
br = pp.breaks(1:2); % find the breaks (beginning and end of stretch of interest)
xxp = linspace(br(1),br(2),100);
We can generalize this procedure. Thus for the nth piece (say #6):
n = 6;
cf = pp.coefs(n,:);
yyp2=polyval(cf,xev);
br = pp.breaks(n:n+1);
xxp2 = linspace(br(1),br(2),100);
Of course you can skip the above and just use ppval (a function dedicated to working with the spline family of functions), which will do the same for you, say for the 3rd piece:
br = pp.breaks(3:4); % limits of the piece
xxp3 = linspace(br(1),br(2),100);
yyp3=ppval(pp,xxp3);
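The agreement between polyval and ppval can also be checked on unevenly spaced breaks. The sketch below reuses the x and y values from the question and assumes that spline() is an acceptable stand-in for the cftool fit; n = 5 is an arbitrary choice of piece. The key point is that polyval has to be given the local coordinate x - breaks(n), not x itself.

```matlab
% Sketch: evaluate one piece by hand and compare against ppval.
x = [0 1.25 1.88 2.5 5 6.25 6.88 7.19 7.5 10 12.5 15 20];
y = [-85.93 -78.82 -56.95 -34.56 -33.57 -39.64 -41.96 -49.28 -66.60 -66.61 -59.16 -48.78 -41.53];
pp = spline(x, y);

n  = 5;                                   % piece to inspect (arbitrary)
br = pp.breaks(n:n+1);                    % limits of that piece
xq = linspace(br(1), br(2), 50);          % query points inside the piece

y_local = polyval(pp.coefs(n,:), xq - br(1));   % shift to the local coordinate
y_ref   = ppval(pp, xq);                        % reference evaluation
max_diff = max(abs(y_local - y_ref));           % should be at rounding level
```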
Finally we plot all of the polynomials evaluated above
plot(xx,yy,'.')
hold on
plot(xxf,ppval(pp,xxf),'k:')
plot(xxp,yyp,'g-','linewidth',2)
plot(xxp2,yyp2,'r-','linewidth',2) % <-- generated with polyval
plot(xxp3,yyp3,'c-','linewidth',2) % <-- generated with ppval
axis tight
I am using Matlab's backslash operator to solve a system of equations written as two matrices M1 and M2. These two matrices are square and tridiagonal, and so I have defined them as sparse. For example, with the dimensions of each being 5x5, they are defined as follows, with the values in each entry being dependent on some constant a:
N = 5;
a = 1e10;
M1 = spdiags([-a*ones(N,1)... % Sub diagonal
(1 + 2*a)*ones(N,1)... % Main Diagonal
-a*ones(N,1)],... % Super diagonal
-1:1,N,N);
M2 = spdiags([+a*ones(N,1)...
(1 - 2*a)*ones(N,1)...
+a*ones(N,1)],...
-1:1,N,N);
M_out = M1\M2;
So for example, M1 looks like the following in full form:
>> full(M1)
ans =
1.0e+10 *
2.0000 -1.0000 0 0 0
-1.0000 2.0000 -1.0000 0 0
0 -1.0000 2.0000 -1.0000 0
0 0 -1.0000 2.0000 -1.0000
0 0 0 -1.0000 2.0000
Now, if I examine the number of non-zero entries in the result M_out, then I can see they are all non-zero, which is fine:
>> nnz(M_out)
ans =
25
The problem is that I also need to do this for larger values of the constant a. However, if, for example, a=1e16 instead, then the off-diagonal entries of M_out are automatically set to zero, presumably because they have become too small:
>> nnz(M_out)
ans =
5
Is there a better way in Matlab of going about this problem of inverting sparse matrices? Or am I using the backslash operator in the wrong way?
If the size of your matrices doesn't grow too much, I recommend doing a full symbolic computation:
N = 5;
syms a
M1 = diag(-a*ones(N-1,1),-1) + diag((1 + 2*a)*ones(N,1),0) + diag(-a*ones(N-1,1),+1);
M2 = diag(+a*ones(N-1,1),-1) + diag((1 - 2*a)*ones(N,1),0) + diag(+a*ones(N-1,1),+1);
M_out = M1\M2;
M_num_1e10 = subs(M_out,a,1e10);
M_num_1e16 = subs(M_out,a,1e16);
vpa(M_num_1e10)
vpa(M_num_1e16)
In that case, you will need the Symbolic Math Toolbox. If you don't have it, I think you should consider migrating to Python and working with SymPy.
EDIT:
Considering the way you defined your problem, you need extended precision for your computations. The double precision isn't enough. For example, in double precision (1e16+1) has to be rounded to (1e16), in other words (1e16+1)-(1e16) is equal to zero. So your problem starts in the main diagonal of your matrices. MATLAB only provides extended precision through its symbolic toolbox.
If you want to stick with double precision, you may extend the precision yourself by relying on so-called double-double arithmetic. I say that you will have to do it yourself because I don't think there is an open source double-double library for MATLAB.
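The rounding described above can be seen directly in double precision. This is a small sketch with a = 1e16; the variable names are only for illustration.

```matlab
% Sketch: what double precision does to the matrix entries when a = 1e16.
a = 1e16;
lost      = (a + 1) - a;            % the added 1 is rounded away
spacing   = eps(a);                 % gap between a and the next larger double
same_diag = (1 + 2*a) == (2*a);     % main diagonal of M1 collapses to 2*a
same_neg  = (1 - 2*a) == (-2*a);    % main diagonal of M2 collapses to -2*a
```

If 1 + 2*a rounds to 2*a and 1 - 2*a rounds to -2*a, then M2 equals -M1 entry by entry, so M1\M2 is mathematically minus the identity matrix. That would be consistent with the five nonzero entries reported in the question.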
I have several numbers in an array and I would like to find the difference between each pair and sort by the lowest result (I don't want to repeat items). I tried using the command "perms" since it gets all the permutations
v = [120;124;130];
p = perms(v)
but it doesn't seem to work the way I would like. Does anyone have any other suggestions?
Example:
I have 3 numbers a=[120,124,130] (please note there could be hundreds of numbers) and it would find the differences between the numbers, then sort by the result. The calculations would look like the text below.
124-120 =4
130-124 =6
130-120 =10
So the final array b will look like the array below
b=
[124 120 4
130 124 6
130 120 10]
PS: I'm using Octave 3.8.1, which is like MATLAB
We can use the pdist function to compute pair-wise distances, then use a combination of ndgrid and tril to get indices into the original values of the vector. Finally we sort according to the distances:
v = [120;124;130];
D = pdist(v, 'cityblock');
[a,b] = ndgrid(1:numel(v), 1:numel(v));
out = sortrows([v(nonzeros(tril(a,-1))) v(nonzeros(tril(b,-1))) D(:)], 3)
For those who can't load the statistics toolbox (thanks go to @Amro):
v = [120;124.6;130];
%taken out from pdist.m from statistics package
order = nchoosek(1:rows(v),2);
Xi = order(:,1);
Yi = order(:,2);
X = v';
d = X(:,Xi) - X(:,Yi);
y = norm (d, "cols");
[a,b] = ndgrid(1:numel(v), 1:numel(v));
out = sortrows([v(nonzeros(tril(a,-1))) v(nonzeros(tril(b,-1))) y(:)], 3)
out=
124.6000 120.0000 4.6000
130.0000 124.6000 5.4000
130.0000 120.0000 10.0000
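A variant that needs neither pdist nor norm(d, "cols") is sketched below, assuming the goal is one row per pair holding the larger value, the smaller value and their absolute difference. It relies only on nchoosek and sortrows, which exist in both MATLAB and Octave.

```matlab
% Sketch: all unordered pairs, sorted by absolute difference.
v = [120; 124; 130];
pairs = nchoosek(1:numel(v), 2);      % each row is one pair of indices
p = v(pairs(:,1));
q = v(pairs(:,2));
out = sortrows([max(p,q) min(p,q) abs(p - q)], 3);
```

For the three values in the question this gives the rows 124 120 4, 130 124 6 and 130 120 10, matching the array b requested above.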
I am doing a project involving scientific computing. The following are three variables and their values I got after some experiments.
There is also an equation with three unknowns, a, b and c:
x=(a+0.98)/y+(b+0.7)/z+c
How do I get values of a,b,c using the above? Is this possible in MATLAB?
This sounds like a regression problem. Assuming that the unexplained errors in the measurements are Gaussian distributed, you can find the parameters via least squares. Basically, you'd rewrite the equation in the form ma + nb + oc = p, which gives 6 equations in 3 unknowns (a, b, c), and these parameters can then be found by least squares. Multiplying both sides of the equation by yz and moving the known terms to the right-hand side, we get:
za + yb + yzc = xyz - 0.98z - 0.7y
As such, m = z, n = y, o = yz, p = xyz - 0.98z - 0.7y. I'll leave it to you as an exercise to verify that my algebra is right. You can then form the matrix equation:
Ax = d
We would have 6 equations and we want to solve for x where x = [a b c]^{T} (here x is the vector of unknowns, not the measured variable x). To solve for x, you can employ what is known as the pseudoinverse to retrieve the parameters that minimize the squared error between d and the output Ax generated by these parameters from the same input data.
In other words:
x = A^{+}d
A^{+} is the pseudoinverse of the matrix A, and it is multiplied with the vector d.
To put our thoughts into code, we would define our input data, form the A matrix and d vector, where each row shared between them is one equation, and then find our parameters. You can use the mldivide (\) operator to do the job, since for an overdetermined system it returns the least-squares solution:
%// Define x y and z
x = [9.98; 8.3; 8.0; 7; 1; 12.87];
y = [7.9; 7.5; 7.4; 6.09; 0.9; 11.23];
z = [7.1; 5.6; 5.9; 5.8; -1.8; 10.8];
%// Define A matrix
A = [z y y.*z];
%// Define d vector: x*y*z - 0.98*z - 0.7*y
d = x.*y.*z - 0.98*z - 0.7*y;
%// Find parameters via least-squares
params = A\d;
params stores the parameters a, b and c, and we get:
params =
-37.7383
-37.4008
19.5625
If you want to double-check how close the values are, you can simply use the above expression in your post and compare with each of the values in x:
a = params(1); b = params(2); c = params(3);
out = (a+0.98)./y+(b+0.7)./z+c;
disp([x out])
9.9800 9.7404
8.3000 8.1077
8.0000 8.3747
7.0000 7.1989
1.0000 -0.8908
12.8700 12.8910
You can see that it's not exactly close, but the parameters you got would be the best in a least-squares error sense.
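As a side check on the idea that backslash and the pseudoinverse give the same least-squares answer, the sketch below compares A\d with pinv(A)*d and with the normal equations on a small made-up system. The numbers in A and d are hypothetical and are not the question's data.

```matlab
% Sketch: three ways to get the least-squares solution of an overdetermined system.
A = [1 0; 1 1; 1 2; 1 3];      % hypothetical 4x2 design matrix
d = [1; 3; 5; 8];              % hypothetical right-hand side

p_backslash = A \ d;                 % QR-based least squares
p_pinv      = pinv(A) * d;           % explicit pseudoinverse
p_normal    = (A' * A) \ (A' * d);   % normal equations

gap = max([norm(p_backslash - p_pinv), norm(p_backslash - p_normal)]);
```

For a full-rank A the three results should agree to rounding error; backslash is usually preferred because it avoids forming A'*A explicitly.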
Bonus - Fitting with RANSAC
You can see that some of the predicted values (right column in the output) are further off than others. This is because we used all points in your data to find the appropriate model. One technique that is used to minimize error and increase the robustness of the model estimation is RANSAC, or RANdom SAmple Consensus. The basic methodology behind RANSAC is that for a certain number of iterations, you take your data and randomly sample the smallest number of points necessary to find a model. Once you find this model, you find the overall error if you were to use these parameters to describe your data. You keep randomly choosing points, finding your model, and finding the error. The parameters from the iteration that produced the least error are the ones you keep to define the overall model.
As you can see above, one error that we can define is the sum of absolute differences between the true x points and the predicted x points. There are many other measures, such as the sum of squared errors, but let's stick with something simple for now. If you take a look at the above formulation, we need a minimum of three equations in order to define a, b and c. So for each iteration, we'd randomly select three points (without replacement, I might add), find our model, determine the error, and keep iterating, retaining the parameters with the least error.
Therefore, you could write a RANSAC algorithm like so:
%// Define cost and number of iterations
cost = Inf;
iterations = 50;
%// Set seed for reproducibility
rng(123);
%// Define x y and z
x = [9.98; 8.3; 8.0; 7; 1; 12.87];
y = [7.9; 7.5; 7.4; 6.09; 0.9; 11.23];
z = [7.1; 5.6; 5.9; 5.8; -1.8; 10.8];
for idx = 1 : iterations
%// Determine where we would need to sample
ind = randperm(numel(x), 3);
xs = x(ind); ys = y(ind); zs = z(ind); %// Sample
%// Define A matrix
A = [zs ys ys.*zs];
%// Define d vector
d = xs.*ys.*zs - 0.98*zs - 0.7*ys;
%// Find parameters via least-squares
params = A\d;
%// Determine error
a = params(1); b = params(2); c = params(3);
out = (a+0.98)./y+(b+0.7)./z+c;
err = sum(abs(x - out));
%// If error produced is less than current error
%// then save parameters
if err < cost
cost = err;
final_params = params;
end
end
When I run the above code, I get for my parameters:
final_params =
-38.1519
-39.1988
19.7472
Comparing this with our x, we get:
a = final_params(1); b = final_params(2); c = final_params(3);
out = (a+0.98)./y+(b+0.7)./z+c;
disp([x out])
9.9800 9.6196
8.3000 7.9162
8.0000 8.1988
7.0000 7.0057
1.0000 -0.1667
12.8700 12.8725
As you can see, several values are improved, especially the fourth and sixth points. Compare this to the previous version:
9.9800 9.7404
8.3000 8.1077
8.0000 8.3747
7.0000 7.1989
1.0000 -0.8908
12.8700 12.8910
You can see that the first and second values are worse off than in the previous version, but the other numbers are much closer to the true values.
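The comparison can be made more concrete by scoring both printed columns with the same error measures. The sketch below simply copies the two sets of predictions shown above (the names out_ls and out_ransac are made up for this illustration) and computes the sum of absolute differences and the sum of squared errors for each.

```matlab
% Sketch: score the two printed prediction columns against the measured x.
x          = [9.98; 8.3; 8.0; 7; 1; 12.87];
out_ls     = [9.7404; 8.1077; 8.3747; 7.1989; -0.8908; 12.8910];  % all-points fit
out_ransac = [9.6196; 7.9162; 8.1988; 7.0057; -0.1667; 12.8725];  % RANSAC fit

sad_ls     = sum(abs(x - out_ls));        % sum of absolute differences
sad_ransac = sum(abs(x - out_ransac));
sse_ls     = sum((x - out_ls).^2);        % sum of squared errors
sse_ransac = sum((x - out_ransac).^2);

fprintf('SAD: %.4f (all points) vs %.4f (RANSAC)\n', sad_ls, sad_ransac);
fprintf('SSE: %.4f (all points) vs %.4f (RANSAC)\n', sse_ls, sse_ransac);
```

On these printed values the RANSAC column has the smaller total under both measures, mostly because of the fifth point.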
Have fun!
I need to exclude some erroneous data from a matrix. I know which data is correct and I am trying to interpolate the values in between so I can get decent diagrams without such big errors. I must keep the matrix in this form and preserve its shape; I must only substitute the data that is marked as errors. Here is my work so far:
M=[0.1000
0.6000
0.7000
0.8000
0.9000
0.9500
1.0000
1.0500
1.1000
1.1500
1.2000
1.2500
1.3000
1.5000
1.7500
2.0000
2.2500
2.5000
3.0000];
CZ1=[ 9.4290
9.5000
9.3250
9.2700
9.2950
9.4350
9.6840
10.0690
10.1840
10.2220
10.2160
9.6160
9.6890
9.4880
9.5000
9.5340
9.3370
9.0990
8.5950];
N1=11;
Nn=13;
Mx1=M(N1);
Mx2=M(Nn);
Mx=[Mx1 Mx2]';
CN1=CZ1(N1);
CN2=CZ1(Nn);
CNy=[CN1 CN2]';
y1=interp1q(Mx,CNy,M(N1:Nn));
CNf=CZ1;
NEWRangeC=y1;
Cfa=changem(CZ1,[NEWRangeC], [CNf(N1:Nn)]);
figure
plot(M,CNf,'-*b',M,Cfa,'r')
As you can see, so far I have used points 11 and 13 and excluded point 12, interpolating that point between 11 and 13. This is working, but I want to make a modification.
My question is: how can I select the values that are errors and remove them, while interpolating across the gap between their neighbors? I want to use the M matrix values as my reference (not point indices as in my example).
Assuming you know which elements are incorrect, you can use MATLAB's interp1 function to interpolate them (this will only work if the M matrix is actually a vector):
error_indices = [11 13];
all_indices = 1:length(M)
% Get the indices where we have valid data
all_correct_indices = setdiff(all_indices, error_indices)
% the first two arguments are the available data.
% the third argument is the set of indices you want values for
M_new = interp1(all_correct_indices, M(all_correct_indices), all_indices)
The above interpolates values at all_indices -- including the missing elements. Where you already have valid data (all_correct_indices), Matlab will return that data. In other places, it will interpolate using the two nearest neighbors.
Try help interp1 for more information on how this function works.
Update - an example
x = 1:10; % all indices
y = x*10;
e = 3:7; % the unknown indices
s = setdiff(x, e); % the known indices
y_est = interp1(s, y(s), x)
y_est =
10 20 30 40 50 60 70 80 90 100
And we see that interp1 has interpolated all values from 30 to 70 linearly using the available data (specifically the adjacent points 20 and 80).
Well, you can start out by finding the elements that are errors, with the find command (this will return the indices). This should also work for matrices.
You can then grab the elements around each of the indices, and interpolate between, as you did.
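Since the question asks to use the M values themselves as the reference, one possible sketch is to flag the bad entries by their M value and interpolate CZ1 over M for just those entries. Here bad_M = 1.25 is a hypothetical flag matching point 12 from the question; with measured data an exact ismember match may need to be replaced by a tolerance comparison.

```matlab
% Sketch: replace flagged entries of CZ1 by linear interpolation over M.
M   = [0.1 0.6 0.7 0.8 0.9 0.95 1 1.05 1.1 1.15 1.2 1.25 1.3 1.5 1.75 2 2.25 2.5 3]';
CZ1 = [9.429 9.5 9.325 9.27 9.295 9.435 9.684 10.069 10.184 10.222 ...
       10.216 9.616 9.689 9.488 9.5 9.534 9.337 9.099 8.595]';

bad_M = 1.25;                        % hypothetical: M value(s) marked as errors
bad   = ismember(M, bad_M);          % logical mask with the same shape as M

CZ1_fixed      = CZ1;                % same size as the original, good data untouched
CZ1_fixed(bad) = interp1(M(~bad), CZ1(~bad), M(bad));   % linear by default

plot(M, CZ1, '-*b', M, CZ1_fixed, 'r')
```

Only the flagged entries change, so the vector keeps its length and all valid samples stay exactly as measured.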
Given two vectors containing numerical values, say for example
a=1.:0.1:2.;
b=a+0.1;
I would like to select only the differing values. For this Matlab provides the function setdiff. In the above example it is obvious that setdiff(a,b) should return 1. and setdiff(b,a) gives 2.1. However, due to computational precision (see the questions here or here) the result differs. I get
>> setdiff(a,b)
ans =
1.0000 1.2000 1.4000 1.7000 1.9000
Matlab provides a function which returns a lower limit to this precision error, eps. This allows us to estimate a tolerance like tol = 100*eps;
My question now, is there an intelligent and efficient way to select only those values whose difference is below tol? Or said differently: How do I write my own version of setdiff, returning both values and indexes, which includes a tolerance limit?
I don't like the way it is answered in this question, since matlab already provides part of the required functionality.
Introduction and custom function
In the general case of floating point precision issues, one is advised to use a tolerance when comparing against suspected zero values, and that tolerance must be a very small value. A somewhat more robust method uses a tolerance based on eps. Since setdiff compares values for exact equality, you can instead subtract the values yourself and treat absolute differences less than or equal to eps as zeros.
This forms the basis of a modified setdiff for floating point numbers shown here -
function [C,IA] = setdiff_fp(A,B)
%//SETDIFF_FP Set difference for floating point numbers.
%// C = SETDIFF_FP(A,B) for vectors A and B, returns the values in A that
%// are not in B with no repetitions. C will be sorted.
%//
%// [C,IA] = SETDIFF_FP(A,B) also returns an index vector IA such that
%// C = A(IA). If there are repeated values in A that are not in B, then
%// the index of the first occurrence of each repeated value is returned.
%// Get 2D matrix of absolute difference between each element of A against
%// each element of B
abs_diff_mat = abs(bsxfun(@minus,A,B.')); %//'
%// Compare each element against eps to "negate" the floating point
%// precision issues. Thus, we have a binary array of true comparisons.
abs_diff_mat_epscmp = abs_diff_mat<=eps;
%// Find indices of A that are exclusive to it
A_ind = ~any(abs_diff_mat_epscmp,1);
%// Get the unique (no repetitions, sorted) elements exclusive to A
%// for the final output, along with their indices
[C,IA] = intersect(A,unique(A(A_ind)));
return;
Example runs
Case1 (With integers)
This will verify that setdiff_fp works with integer arrays just the way setdiff does.
A = [2 5];
B = [9 8 8 1 2 1 1 5];
[C_setdiff,IA_setdiff] = setdiff(B,A)
[C_setdiff_fp,IA_setdiff_fp] = setdiff_fp(B,A)
Output
A =
2 5
B =
9 8 8 1 2 1 1 5
C_setdiff =
1 8 9
IA_setdiff =
4
2
1
C_setdiff_fp =
1 8 9
IA_setdiff_fp =
4
2
1
Case2 (With floating point numbers)
This is to show that setdiff_fp produces the correct results, while setdiff doesn't. Additionally, this will also test out the output indices.
A=1.:0.1:1.5
B=[A+0.1 5.5 5.5 2.6]
[C_setdiff,IA_setdiff] = setdiff(B,A)
[C_setdiff_fp,IA_setdiff_fp] = setdiff_fp(B,A)
Output
A =
1.0000 1.1000 1.2000 1.3000 1.4000 1.5000
B =
1.1000 1.2000 1.3000 1.4000 1.5000 1.6000 5.5000 5.5000 2.6000
C_setdiff =
1.2000 1.4000 1.6000 2.6000 5.5000
IA_setdiff =
2
4
6
9
7
C_setdiff_fp =
1.6000 2.6000 5.5000
IA_setdiff_fp =
6
9
7
For a tolerance of 1 epsilon, this should work:
a=1.0:0.1:2.0;
b=a+0.1;
b=[b b-eps b+eps];
c=setdiff(a,b)
The idea is to expand b to include also its closest values.
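The question asked for an explicit tolerance such as tol = 100*eps and for the indices as well as the values. A minimal sketch along the lines of the answers above, with tol exposed as a variable, could look like this (bsxfun is used so that it should also run on older MATLAB and Octave versions):

```matlab
% Sketch: set difference with a user-chosen absolute tolerance.
a   = 1.0:0.1:2.0;
b   = a + 0.1;
tol = 100*eps;                                       % tolerance from the question

near = abs(bsxfun(@minus, a(:), b(:).')) <= tol;     % numel(a)-by-numel(b) matches
ia   = find(~any(near, 2));                          % elements of a with no match in b
c    = a(ia);                                        % values of a not in b

near_ba = near.';                                    % same comparison seen from b
ib   = find(~any(near_ba, 2));
cb   = b(ib);                                        % values of b not in a
```

For these inputs c contains only 1.0 and cb only the last element of b (nominally 2.1). Recent MATLAB releases also provide ismembertol, which may serve the same purpose; its tolerance is scaled relative to the data by default, so check its documentation before relying on it.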