How to make a polynomial approximation in Scilab? - polynomial-math

I have a set of measurements that I want to approximate. I know I can do that with a 4th-degree polynomial, but I don't know how to find its five coefficients using Scilab.
For now, I have to use the user-friendly functions of OpenOffice Calc. To keep everything in Scilab, I'd like to know whether a built-in function exists, or whether a simple script can do the job.

There is no built-in polyfit function like in Matlab, but you can make your own:
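function cf = polyfit(x,y,n)
    // Design matrix: column 1 is all ones, column i+1 holds x.^i
    A = ones(length(x),n+1)
    for i=1:n
        A(:,i+1) = x(:).^i
    end
    // Least-squares solution; coefficients ordered from degree 0 to n
    cf = lsq(A,y(:))
endfunction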
This function accepts two vectors of equal size (they can be either row or column vectors; the colon operator makes sure they are column-oriented in the computation) and the degree of the polynomial.
It returns the column of coefficients, ordered from 0th to the nth degree.
The computational method is straightforward: set up the (generally, overdetermined) linear system that requires the polynomial to pass through every point. Then solve it in the sense of least squares with lsq (in practice, it seems that cf = A\y(:) performs identically, although the algorithm is a bit different there).
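The same idea can be sketched outside Scilab. The following is a minimal illustration in Python (standard library only), not the code from this answer: the names polyfit_lsq and polyval_ascending are made up for the example, and it solves the normal equations with Gaussian elimination instead of calling lsq, which is simpler to write but less robust numerically for high degrees or badly scaled data.

```python
def polyfit_lsq(x, y, degree):
    """Least-squares polynomial fit; returns coefficients from degree 0 upward."""
    m = degree + 1
    # Design (Vandermonde) matrix: row k is [1, x_k, x_k^2, ..., x_k^degree].
    rows = [[xk ** p for p in range(m)] for xk in x]
    # Normal equations (A^T A) c = A^T y, stored as an augmented matrix.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(m)]
           + [sum(r[i] * yk for r, yk in zip(rows, y))]
           for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    coeffs = [0.0] * m
    for i in reversed(range(m)):
        tail = sum(aug[i][j] * coeffs[j] for j in range(i + 1, m))
        coeffs[i] = (aug[i][m] - tail) / aug[i][i]
    return coeffs


def polyval_ascending(coeffs, t):
    """Evaluate the polynomial with Horner's rule (coefficients from degree 0 upward)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


# Same sample data as in this thread, fitted with a 4th-degree polynomial.
x = [-3, -1, 0, 1, 3, 5, 7]
y = [50, 74, 62, 40, 19, 35, 52]
cf = polyfit_lsq(x, y, 4)
print(cf)
print([polyval_ascending(cf, xk) for xk in x])
```

As in the Scilab function, the coefficients come back ordered from the constant term up to the highest degree.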
Example of usage of the Scilab polyfit function above:
x = [-3 -1 0 1 3 5 7];
y = [50 74 62 40 19 35 52];
n = 4;   // degree of the polynomial (must be defined here, outside the function)
cf = polyfit(x,y,n)
// now use these coefficients to plot the polynomial
t = linspace(min(x),max(x))';
A = ones(length(t),n+1);
for i=1:n
    A(:,i+1) = t.^i;
end
plot(x,y,'r*')
plot(t,A*cf)
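Output: a plot of the data points (red stars) together with the fitted polynomial curve (image not included).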

The ATOMS toolbox "stixbox" includes Matlab-compatible "polyfit" and "polyval" functions.
// Scilab 6.x.x needs these toolboxes:
atomsInstall(["stixbox";"makematrix";"distfun";"helptbx";"linalg"]) // install toolboxes
// POLYNOMIAL CURVE FITTING
// Needs the toolboxes above
x = [-3 -1 0 1 3 5 7];
y = [50 74 62 40 19 35 52];
plot(x,y,"."); // plot sample points only
pcoeff = polyfit(x,y,4); // calculate polynomial coefficients (4th-degree)
xp = linspace(-3,7,100); // generate a little more x-values for a smoother curve fitting
yp = polyval(pcoeff,xp); // calculate the y-values for the curve fitting
plot(xp, yp,"k"); // plot the curve fitting in black

Related

Calculating the root-mean-square-error between two matrices one of which contains NaN values

This is a part of a larger project so I will try to keep only the relevant parts (The variables and my attempt at the calculations)
I want to calculate the root mean squared error between Zi_cubic and Z_actual
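RMSE formula: $\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(Z_{\mathrm{cubic},i} - Z_{\mathrm{actual},i}\right)^{2}}$, where N is the number of elements compared.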
Given/already established variables
rng('default');
% Set up 2,000 random numbers between -1 & +1 as our x & y values
n=2000;
x = 2*(rand(n,1)-0.5);
y = 2*(rand(n,1)-0.5);
z = x.^5+y.^3;
% Interpolate to a regular grid
d = -1:0.01:1;
[Xi,Yi] = meshgrid(d,d);
Zi_cubic = griddata(x,y,z,Xi,Yi,'cubic');
Z_actual = Xi.^5+Yi.^3;
My attempt at a calculation
My approach is to:
1. Arrange Zi_cubic and Z_actual as column vectors (D1, D2)
2. Take the difference (D3)
3. Square each element in the difference (D4)
4. Sum up all the elements in D4 using nansum (D5)
5. Divide by the number of finite elements in D4 (D6)
6. Take the square root (D7)
D1 = reshape(Zi_cubic,[numel(Zi_cubic),1]);
D2 = reshape(Z_actual,[numel(Z_actual),1]);
D3 = D1 - D2;
D4 = D3.^2;
D5 = nansum(D4)
d6 = sum(isfinite(D4))
D6 = D5/d6
D7 = sqrt(D6)
Apparently this is wrong. I'm either misapplying the RMSE formula or I don't understand what I'm telling MATLAB to do.
Any help would be appreciated. Thanks in advance.
Your RMSE is fine (in my book). The only thing that seems possibly off is the meshgrid and griddata. Your inputs to griddata are vectors and you are asking for a matrix output. That is fine, but you're potentially undersampling your input space. In other words, you are giving n samples as inputs, but perhaps you are expected to give n^2 samples as inputs? Here's some sample code for a smaller n to demonstrate this effect more clearly:
rng('default');
% Set up n random numbers between -1 & +1 as our x & y values
n=100; % Reduced from 2000 because scatter is slow to plot
x = 2*(rand(n,1)-0.5);
y = 2*(rand(n,1)-0.5);
z = x.^5+y.^3;
S = 100;
subplot(1,2,1)
scatter(x,y,S,z)
%More data, more accurate ...
[x2,y2] = meshgrid(x,y);
z2 = x2.^5+y2.^3;
subplot(1,2,2)
scatter(x2(:),y2(:),S,z2(:))
The second plot should be a lot cleaner and thus will likely provide a more accurate estimate of Z_actual later on.
I also thought you might be running into some issues with floating point numbers and calculating RMSE but that appears not to be the case. Here's some alternative code which is how I would write RMSE.
d = Zi_cubic(:) - Z_actual(:);
mask = ~isnan(d);
n_valid = sum(mask);
rmse = sqrt(sum(d(mask).^2)/n_valid);
Notice that (:) linearizes the matrix. Also it is useful to try and use better variable names than D1-D7.
In the end though these are just suggestions and your code looks fine.
PS - I'm assuming that you are supposed to be using cubic interpolation as that is another place you could perhaps deviate from what's expected ...
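For readers who want to check the arithmetic outside MATLAB, here is a small sketch of the same masked RMSE in Python (standard library only). The function name rmse_ignoring_nan and the sample numbers are made up for illustration; it skips any position where either value is not finite, which matches the isfinite count in the question and is slightly stricter than the isnan mask above.

```python
import math


def rmse_ignoring_nan(estimate, reference):
    """RMSE over the positions where both values are finite numbers."""
    squared_errors = [
        (e - r) ** 2
        for e, r in zip(estimate, reference)
        if math.isfinite(e) and math.isfinite(r)
    ]
    if not squared_errors:
        return math.nan  # nothing valid to compare
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical data: the third estimate is missing (NaN) and is skipped.
estimate = [1.0, 2.5, math.nan, 4.0]
reference = [1.0, 2.0, 3.0, 5.0]
print(rmse_ignoring_nan(estimate, reference))
```

The key point is that the divisor is the number of valid pairs (3 here), not the total number of elements (4).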

Vectorize method for N period moving slope

Can someone please help me vectorize a moving slope calculation? I am trying to eliminate the for loop, but I am not sure how to do so.
>> pv = [18 19 20 20.5 20.75 21 21.05 21.07 21.07]'; %% price vector
>> slen = 3; %% slope length
function [slope] = slope(pv , slen)
svec = (1:1:slen)';
coef = [];
slope = zeros(size(pv));
for i = slen+1 : size(pv,1)
X = [ones(slen,1) svec];
y = pv( (i - (slen-1)) : i );
a = X\y;
slope(i,1) = a(2);
end
>> slp = slope(pv,3)
slp =
0
0
0
0.75
0.375
0.25
0.15
0.035
0.01
Thanks
EDIT: completely changing answer to make it scalable
function [slope] = calculate_slope(pv , slen) %% Note: bad practice to give a function and variable the same name
svec = (1:1:slen)';
X = [ones(slen,1) svec];
%% the following two lines create all the sliding windows of length slen (as a submatrix of a larger matrix)
c = repmat ( flipud(pv), 1, length(pv));
d = flipud(reshape(c(1:end-1), length(pv)-1, length(pv) + 1));
%% then run MATLAB solver with all windows simultaneously
least_sq_result = X\d( end - slen + 1:end, (slen+1):end);
slope = [zeros(slen-1, 1); least_sq_result(2,:)']; %% padding with zeros is optional
EDIT: fixed swapped indices
Finding the slope in a sliding window using least-squares regression is equivalent to first-order Savitzky-Golay filtering (using a differentiating filter). The concept of SG filtering is to perform local polynomial fits in a sliding window, then use the local model to smooth the signal or compute its derivative. When the data points are spaced equally in time (as they are here), the computation can be run very efficiently by pre-computing a set of filter coefficients, then convolving them with the data. This should be much faster than constructing a giant matrix and doing regression on it.
This is a pretty standard technique, and there's definitely existing MATLAB code floating around. Search for something like 'Savitzky-Golay differentiation'. Note that SG filters can also perform smoothing (the MATLAB built-in SG filtering functions do this), but you want the version that does differentiation.
Savitzky and Golay (1964). Smoothing and Differentiation of Data by Simplified Least Squares Procedures
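To make the "pre-computed filter coefficients" idea concrete, here is a minimal sketch in Python (standard library only). It assumes unit spacing between samples and a trailing window, as in the question; the function names are made up for this illustration. For a straight-line fit, the least-squares slope is a fixed weighted sum of the window values, so the weights only have to be computed once.

```python
def slope_filter_coefficients(window):
    """Weights w such that sum(w[k] * y[k]) is the least-squares slope over the window."""
    mean_pos = (window - 1) / 2
    centered = [k - mean_pos for k in range(window)]
    norm = sum(c * c for c in centered)
    return [c / norm for c in centered]


def moving_slope(values, window):
    """Slope of the least-squares line over each trailing window of unit-spaced samples."""
    weights = slope_filter_coefficients(window)
    slopes = [0.0] * len(values)  # positions without a full window stay at zero
    for end in range(window - 1, len(values)):
        segment = values[end - window + 1 : end + 1]
        slopes[end] = sum(w * v for w, v in zip(weights, segment))
    return slopes


pv = [18, 19, 20, 20.5, 20.75, 21, 21.05, 21.07, 21.07]
print(slope_filter_coefficients(3))
print(moving_slope(pv, 3))
```

For a window of 3 the weights are [-0.5, 0, 0.5], and from the fourth sample onward the slopes agree with the values listed in the question (0.75, 0.375, 0.25, ...) up to floating-point rounding. One difference: this sketch also fills in the third sample (slope 1.0), whereas the loop in the question starts at slen+1 and leaves that entry at zero. In MATLAB the same weights could be applied to the whole vector with a convolution instead of a loop.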

Making sense of CCA (Matlab implementation) 2

I am using CCA for my work and want to understand something.
This is my MATLAB code. I have only taken 100 samples to better understand the concepts of CCA.
clc;clear all;close all;
load carbig;
data = [Displacement Horsepower Weight Acceleration MPG];
data(isnan(data))=0;
X = data(1:100,1:3);
Y = data(1:100,4:5);
[wx,wy,~,U,V] = CCA(X,Y);
clear Acceleration Cylinders Displacement Horsepower MPG Mfg Model Model_Year Origin Weight when org
subplot(1,2,1),plot(U(:,1),V(:,1),'.');
subplot(1,2,2),plot(U(:,2),V(:,2),'.');
My plots are coming like this:
This shows that in the 1st figure (left) the transformed variables are highly correlated, with little scatter around the central axis, while in the 2nd figure (right) the scatter around the central axis is much larger.
As I understand it from here, CCA maximizes the correlation between the data in the transformed space. So I tried to design a matching score that should return a minimum value if the vectors are maximally correlated. I tried to match each vector U(i,:) with each vector V(j,:), with i,j going from 1 to 100.
%% Finding the difference between the projected vectors
for i=1:size(U,1)
cost = repmat(U(i,:),size(U,1),1)- V;
for j=1:size(U,1)
c(i,j) = norm(cost(j,:),size(U,2));
end
[~,idx(i)] = min(c(i,:));
end
Ideally idx should be like this :
idx = 1 2 3 4 5 6 7 8 9 10 ....
as they are maximally correlated. However my output comes something like this :
idx = 80 5 3 1 4 7 17 17 17 10 68 78 78 75 9 10 5 1 6 17 .....
I don't understand why this happens.
Am I wrong somewhere? Aren't the vectors supposed to be maximally correlated in the transformed CCA subspace?
If my above assumption is wrong, please point me out in the correct direction.
Thanks in advance.
First, let me port your code to R2014b:
load carbig;
data = [Displacement Horsepower Weight Acceleration MPG];
% Truncate the data, to follow-up with your sample code
data = data(1:100,:);
nans = sum(isnan(data),2) > 0;
[wx, wy, r, U, V] = canoncorr(data(~nans,1:3), data(~nans,4:5));
OK, now the trick is that the vectors which are maximally correlated in the CCA subspace are the column vectors U(:,1) with V(:,1) and U(:,2) with V(:,2), and not the row vectors U(i,:), as you are trying to compute. In the CCA subspace, vectors should be N-dimensional (here N=100), and not simple 2D vectors. That's the reason why visualization of CCA results is often quite complicated!
By the way, the correlations are given by the third output of canoncorr, which you (intentionally?) chose to skip in your code. If you check its content, you'll see that the correlations (i.e. the vectors) are well-ordered:
r =
0.9484 0.5991
It is hard to explain CCA better than the link you already provided. If you want to go further, you should probably invest in a book, like this one or this one.
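The difference between column-wise correlation and row-wise matching can be seen on a toy example. The sketch below is in Python (standard library only) with made-up numbers and hypothetical helper names; it is not CCA itself, it only shows that two columns can be clearly correlated across samples while the closest entry of one column to sample i of the other is not sample i.

```python
import math


def pearson(u, v):
    """Pearson correlation between two equally long columns of samples."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def nearest_rows(u, v):
    """For each sample i, the index j of the entry of v closest to u[i]."""
    return [min(range(len(v)), key=lambda j: abs(u[i] - v[j]))
            for i in range(len(u))]


# Hypothetical one-dimensional "canonical variables" for four samples.
u_col = [-1.5, -0.5, 0.5, 1.5]
v_col = [-1.4, 0.6, -0.6, 1.4]
print(pearson(u_col, v_col))       # correlation of about 0.75
print(nearest_rows(u_col, v_col))  # [0, 2, 1, 3], not [0, 1, 2, 3]
```

Correlation is a statement about the whole column of samples, so a high value does not force each row of U to be the nearest neighbour of the same row of V.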

Matlab 3D Plot of transfer function magnitude

How can I plot the amplitude of a transfer function in three dimensions (for instance, to check poles and zeros on the graph)?
Suppose this is my transfer function:
My code:
b = [6 -10 2];
a = [1 -3 2];
[x, y] = meshgrid(-3:0.1:3);
z = x+y*j;
res = (polyval(b, z))./(polyval(a,z));
surf(x,y, abs(res));
Is it correct? I'd also like to know whether it is possible to mark the unit circle on the plot.
I think it's correct. However, you're computing H(z^-1), not H(z). Is that what you want to do? For H(z), just reverse the entries in a from left to right (with fliplr), and do the same to b:
res = (polyval(fliplr(b), z))./(polyval(fliplr(a),z));
To plot the unit circle you can use rectangle. Seriously :-) It has a 'Curvature' property which can be set to generate a circle.
It's best if you use imagesc instead of surf to make the circle clearly visible. You will get a view from above, where color represents height (value of abs(H)):
imagesc(-3:0.1:3,-3:0.1:3, abs(res));
hold on
rectangle('curvature', [1 1], 'position', [-1 -1 2 2], 'edgecolor', 'w');
axis equal
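As a cross-check of what the surface represents, here is a minimal sketch in Python (standard library only) that evaluates the same ratio at individual complex points. It uses the convention of the question's code, where coefficients are ordered from the highest power down as polyval expects; the helper names are made up for the illustration. Exact poles are reported as infinity, because Python raises an error on division by zero where MATLAB would return Inf.

```python
import cmath
import math


def polyval(coeffs, z):
    """Horner evaluation; coefficients ordered from the highest power down."""
    result = 0
    for c in coeffs:
        result = result * z + c
    return result


def magnitude(b, a, z):
    """|polyval(b, z) / polyval(a, z)| at one complex point z."""
    denominator = polyval(a, z)
    if denominator == 0:
        return math.inf  # exact pole
    return abs(polyval(b, z) / denominator)


b = [6, -10, 2]
a = [1, -3, 2]

# Magnitude at eight points on the unit circle z = exp(j*theta).
for k in range(8):
    theta = 2 * math.pi * k / 8
    z = cmath.exp(1j * theta)
    print(round(theta, 3), magnitude(b, a, z))
```

With these coefficients the denominator z^2 - 3z + 2 factors as (z - 1)(z - 2), so under this convention the magnitude blows up at z = 1 (which lies on the unit circle) and at z = 2.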
I have never in my whole life heard of a 3D transfer function, it doesn't make sense. I think you are completely wrong: z does not represent a complex number, but the fact that your transfer function is a discrete one, rather than a continuous one (see the Z transform for more details).
The correct way to do this in MATLAB is to use the tf function, which requires the Control System Toolbox (note that I have assumed your discrete sample time to be 0.1s, adjust as required):
>> b = [6 -10 2];
a = [1 -3 2];
>> sys = tf(b,a,0.1,'variable','z^-1')
sys =
6 - 10 z^-1 + 2 z^-2
--------------------
1 - 3 z^-1 + 2 z^-2
Sample time: 0.1 seconds
Discrete-time transfer function.
To plot the transfer function, use the bode or bodeplot function:
bode(sys)
For the poles and zeros, simply use the pole and zero functions.

Arbitrary sampling over an interpolation

I have arbitrary points (8192, 4678, 1087.2, 600, 230.4, etc.) that I want to interpolate and resample at other defined points (100, 500.3, 802, 2045, 4399.5125, etc.).
I tried cubic spline interpolation, but it uses a fixed sampling step, and depending on the step it may not generate the values I need.
How would you do it?
If your points are x1=[...] and y1=[...] and you want to evaluate a spline at a new set of points x2=[...], then use:
y2 = spline(x1,y1,x2)
Example:
x1 = [0,2,4,6,8].'
y1 = [24,25,22,14,6].'
x2 = [2,2.5,3,3.5,4].'
y2 = spline(x1,y1,x2)
y2 =
25.0000
24.7227
24.1563
23.2617
22.0000
It all depends on the underlying physical phenomenon. There is a fine line between interpolating and just making up stuff.
I would probably first upsample & filter until I have a meaningful signal at a fixed sampling rate.
I would then use some interpolation method to estimate the signal at the goal points.
I would recommend doing this backwards.
Rather than generating a lot of points and hoping that the points that you need are there, calculate a formula for the interpolation (perhaps piecewise linear or something more complicated) and evaluate the function at the required points.
Assuming you have x = [1 2 3 4 10] and y = [11 22 13 24 11] your linear interpolation at point 6 would be:
24+(6-4) * (11-24) / (10-4)
It should not be too hard to generalize this.
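One possible generalization, sketched in Python (standard library only): sort the samples, locate the segment that contains the query point, and apply the same formula. The function name interp_linear is made up for this illustration, and the sketch assumes at least two samples with distinct x values and no extrapolation outside the sampled range.

```python
from bisect import bisect_right


def interp_linear(xs, ys, x):
    """Piecewise-linear interpolation at x; xs need not be sorted."""
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    if not sx[0] <= x <= sx[-1]:
        raise ValueError("x is outside the sampled range")
    # Index of the segment [sx[i], sx[i+1]] that contains x.
    i = min(bisect_right(sx, x) - 1, len(sx) - 2)
    t = (x - sx[i]) / (sx[i + 1] - sx[i])
    return sy[i] + t * (sy[i + 1] - sy[i])


x = [1, 2, 3, 4, 10]
y = [11, 22, 13, 24, 11]
print(interp_linear(x, y, 6))  # same as 24+(6-4)*(11-24)/(10-4), about 19.67
print([interp_linear(x, y, q) for q in (1.5, 4, 10)])
```

Because the samples are sorted inside the function, unsorted inputs like the ones in the question can be passed directly, and each query point is evaluated exactly where it is needed instead of on a fixed grid.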