Fit a line to a histogram - MATLAB

I am just wondering: how would I go about fitting a line to a histogram, using the z-counts as weights? An example of this is shown below, taken from Scatter plot with density in Matlab (although that post just discusses overlaying multiple plots).
My initial thought is to make an array consisting of each pixel from the density plot, repeated n times (where n is the number of counts in that pixel) to form a scatter plot, and then do a linear polyfit. This seems awfully redundant, though.

The other approach is to do a weighted least squares solution. You need the (x,y) location of each pixel and the number of counts n within each pixel. Then, I think that you'd do the weighted least-squares this way:
%gather your known data...have x,y, and n all in the same order as each other
A = [x(:) ones(length(x),1)]; %here are the x values from your histogram
b = y(:); %here are the y-values from your histogram
C = diag(n(:)); %counts from each pixel in your 2D histogram
%Define polynomial coefficients as p = [slope; y_offset];
%usual least-squares solution...written here for reference
% b = A*p; %remember, p = [slope; y_offset];
% p = inv(A'*A)*(A'*b); %remember, p = [slope; y_offset];
%We want to apply a weighting matrix, so incorporate the weighting matrix
% A' * C * b = A' * C * A * p;
p = (A' * C * A) \ (A' * C * b); %remember, p = [slope; y_offset]; note that C appears on both sides
The biggest uncertainty for me with this solution is whether the C matrix should be made up of n or n.^2; I can never remember. (Weighting with C = diag(n) is equivalent to repeating each (x,y) pixel n times, i.e. to the facsimile approach from the question, which suggests n rather than n.^2.) Hopefully, someone can correct me in the comments, if needed.
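For reference, MATLAB's built-in lscov solves this same weighted problem directly, taking the counts as a weight vector:
p = lscov(A, b, n(:)); %equivalent to solving the weighted normal equations above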

If you have the original data, which is a collection of (x,y) points, you simply do a polyfit on the original data:
p = polyfit(x(:),y(:),1); %linear fit
That will give you a best fit (in the least-squares sense) to the original data, which is what you want.
If you do not have the original data and only have the 2D histogram, the approach you defined (which basically recreates a facsimile of the original data) will give a similar answer to doing the polyfit on the original data.
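A minimal sketch of that facsimile approach, assuming x, y, and n are same-length arrays holding each pixel's coordinates and counts:
%replicate each pixel according to its count, then fit as usual
xRep = repelem(x(:), n(:)); %repelem requires R2015a or later
yRep = repelem(y(:), n(:));
p = polyfit(xRep, yRep, 1); %p(1) is the slope, p(2) the y-offset
This weights each pixel by its count in exactly the same way as the weighted least-squares solution above.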

Related

ifft and using a sum of square waves instead of the sum of sine waves to rebuild a signal

I know that ifft sums multiple sine waves up from data obtained by doing an fft on a signal. Is there a way to do an ifft using square waves instead of sine waves?
I'm not trying to get the original signal back, but trying to rebuild it using square waves from the data taken from the fft, instead of the normal sine-wave summation process.
See the simple example below: the signals I will be using are human audio signals about 60 seconds long, so I'm trying to see if I can use / alter the ifft command in some way.
PS: I'm using Octave 4.0, which is similar to Matlab.
clear all, clf reset, clc, tic
Fs = 200; % Sampling frequency
t=linspace(0,1,Fs);
freq=2;
%1 create signal
ya = .5*sin(freq*pi*2*t+pi);
%2 create frequency domain
ya_fft = fft(ya);
%3 rebuild signal
mag = abs(ya_fft);
phase = unwrap(angle(ya_fft));
ya_newifft=ifft(mag.*exp(1i*phase)); %1i avoids clashes if i is reused as a variable
ifft_sig_combined_L1=ifft(mag.*exp(1i*phase),Fs); %use Fs to get correct file length
% square wave
vertoffset=0.5;
A = 1; %square wave amplitude
T = 1/freq; % period of the signal
sq = mod(t * A / T, A) > A / 2; %renamed from 'square', which shadows the square() function from the signal package
sq = sq - vertoffset;
subplot(3,1,1);
plot(t,ya,'r')
title('original signal')
subplot(3,1,2);
plot(t,ifft_sig_combined_L1)
title('rebuilt signal')
subplot(3,1,3);
plot(t,sq)
title('rebuilt signal with square wave')
Define the basis vectors you want to use and let them be the columns of a matrix, A. If b is your signal, then just get the least squares solution to Ax = b. If A is square and full rank, then you will be able to represent b exactly.
Edit:
Think about what a matrix-vector product does: each column of the matrix is multiplied by the corresponding element of the vector (i.e., the n-th column of the matrix is multiplied by the n-th element of the vector) and the resulting products are summed together. (This would be a lot easier to illustrate if this site supported LaTeX.) In Matlab, a horrible but hopefully illustrative way to do this is
A = some_NxN_matrix;
x = some_Nx1_vector;
b = zeros( size(A,1), 1 );
for n = 1 : length(x)
b = b + A(:,n) * x(n);
end
(Of course, you would never actually do the above but rather b = A*x;.)
Now define whatever square waves you want to use and assign each to its own Nx1 vector. Call these vectors s_1, s_2, ..., s_M, where M is the number of square waves you are using. Now let
A = [s_1, s_2, ..., s_M];
According to your question, you want to represent your signal as a weighted sum of these square waves. (Note that this is exactly what a DFT does; it just uses orthogonal sinusoids rather than square waves.) To weight and sum these square waves, all you have to do is find the matrix-vector product A*x, where x is the vector of coefficients that weight each column (see the above paragraph). Now, if your signal is b and you want to find the x that will best sum the square waves in order to approximate b, then all you have to do is solve A*x = b. In Matlab, this is given by
x = A \ b;
The rest is just linear algebra. If a left-inverse of A exists (i.e., if A has dimensions M x N and rank N, with M > N), call it A^+ = inv(A'*A) * A'; then A^+ * A is an identity matrix and
A^+ * A * x = A^+ * b,
which implies that x = A^+ * b, which is what x = A \ b; will return in Matlab (in the least-squares sense). If A has dimensions M x N and rank M, with N > M, then the system is underdetermined and a left-inverse does not exist. In this case you have to use the pseudo-inverse to solve the system. Now suppose that A is NxN with rank N, so that both the left- and right-inverse exist. In this case, x will give an exact representation of b:
x = (A^-1) * b
A * x = A * (A^-1) * b = b
If you want an example of an A that uses square waves to get an exact representation of the input signal, check out the Haar transform. There is a function available here.
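As a rough illustration, here is a minimal sketch that builds an (unnormalized) Haar basis and represents a signal exactly as a weighted sum of its square-wave-like columns:
N = 8; %signal length, must be a power of 2
H = 1;
while size(H,1) < N %recursive construction of the unnormalized Haar matrix
H = [kron(H, [1 1]); kron(eye(size(H,1)), [1 -1])];
end
A = H'; %columns of A are the square-wave basis vectors
b = randn(N,1); %stand-in for your signal
x = A \ b; %weights for each basis vector
max(abs(A*x - b)) %~0, i.e. exact, since A is square and full rank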

Transform random draws from a bivariate normal into the unit square in Matlab

I have an nx2 matrix r in Matlab reporting n draws from a bivariate normal distribution
m1=0.3;
m2=-m1;
v1=0.2;
n=10000;
v2=2;
rho=0.5;
mu = [m1, m2];
sigma = [v1,rho*sqrt(v1)*sqrt(v2);rho*sqrt(v1)*sqrt(v2),v2];
r = mvnrnd(mu,sigma,n);
I want to normalise these draws to the unit square [0,1]^2
First option
rmax1=max(r(:,1));
rmin1=min(r(:,1));
rmax2=max(r(:,2));
rmin2=min(r(:,2));
rnew=zeros(n,2);
for i=1:n
rnew(i,1)=(r(i,1)-rmin1)/(rmax1-rmin1);
rnew(i,2)=(r(i,2)-rmin2)/(rmax2-rmin2);
end
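As an aside, this loop can be vectorized with bsxfun (the same function used in the answer below); a minimal sketch:
rnew = bsxfun(@minus, r, [rmin1, rmin2]);
rnew = bsxfun(@rdivide, rnew, [rmax1-rmin1, rmax2-rmin2]);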
Second option
rmin1, rmax1, rmin2, rmax2 may be quite variable due to the sampling process. An alternative is applying the 68–95–99.7 rule (here) and I am asking for some help on how to generalise it to a bivariate normal (in particular Step 1 below). Here's my idea
%Step 1: transform the draws in r into draws from a bivariate normal
%with variance-covariance matrix equal to the 2x2 identity matrix
%and mean equal to mu
%How?
%Let t be the transformed vector
%Step 2: apply the 68–95–99.7 rule to each column of t
tmax1=mu(1)+3*1;
tmin1=mu(1)-3*1;
tmax2=mu(2)+3*1;
tmin2=mu(2)-3*1;
tnew=zeros(n,2);
for i=1:n
tnew(i,1)=(t(i,1)-tmin1)/(tmax1-tmin1);
tnew(i,2)=(t(i,2)-tmin2)/(tmax2-tmin2);
end
%Step 3: discard potential values (very few) outside [0,1]
In your case the x and y coordinates of the random vector are correlated, so it's not just a transformation in x and in y independently. You first need to rotate your samples so that x and y become uncorrelated (then the covariance matrix will be diagonal; you don't need it to be the identity, since you normalize later anyway). Then you can apply the transformation you call "Second option" to the new x and y independently. In short, you need to diagonalize the covariance matrix.
As a side note, your code adds/subtracts 3 times 1 instead of 3 times the standard deviation. Also, you can avoid the for loop by using (e.g.) Matlab's bsxfun, which applies an operation between a matrix and a vector:
t = bsxfun(@minus,r,mean(r,1)); % center the data
[v, d] = eig(sigma); % find the directions for projection
t = t * v; % the projected data is uncorrelated
sigma_new = sqrt(diag(d)); % that's the std in the new coordinates
% now transform each coordinate independently
tmax1 = 3*sigma_new(1);
tmin1 = -3*sigma_new(1);
tmax2 = 3*sigma_new(2);
tmin2 = -3*sigma_new(2);
tnew = bsxfun(@minus, t, [tmin1, tmin2]);
tnew = bsxfun(@rdivide, tnew, [tmax1-tmin1, tmax2-tmin2]);
You still need to discard the few samples which are out of [0,1], as you wrote.

curve fitting: a number of curves with different length/number of points into one curve in 3D in Matlab

Say I've got a number of curves of different lengths (the number of points in each curve and the spacing between points both vary). Could I find a curve in 3D space that best fits this group of curves?
A code example in Matlab would be appreciated.
example data set:
the 1st curve has 10 points.
18.5860 18.4683 18.3576 18.2491 18.0844 17.9016 17.7709 17.6401 17.4617 17.2726
91.6178 91.5711 91.5580 91.5580 91.5701 91.6130 91.5746 91.5050 91.3993 91.2977
90.6253 91.1090 91.5964 92.0845 92.5565 93.0199 93.5010 93.9785 94.4335 94.8851
the 2nd curve has 8 points.
15.2091 15.0894 14.9765 14.8567 14.7360 14.6144 14.4695 14.3017
90.1138 89.9824 89.8683 89.7716 89.6889 89.6040 89.4928 89.3624
99.4393 99.9066 100.3802 100.8559 101.3340 101.8115 102.2770 102.7296
A desired curve is one that could represent these two existing curves.
I have thought of treating these curves as a scatter of points and fitting a line through them, but the code snippets I found online only produce a straight line.
Did I miss something, or could someone provide a hint? Thanks.
Hard to come up with a bulletproof solution without more details, but here's an approach that works for the sample data provided. I found the line of best fit for all the points, and then parameterized all the points along that line of best fit. Then I did least-squares polynomial fitting for each dimension separately. This produced a three-dimensional parametric curve that seems to fit the data just fine.
Note that curve-fitting approaches other than polynomial least squares might be better suited to some cases; just substitute the preferred fitting function for polyfit and polyval.
Hope this is helpful!
clear;
close all;
pts1=[18.5860 18.4683 18.3576 18.2491 18.0844 17.9016 17.7709 17.6401 17.4617 17.2726;
91.6178 91.5711 91.5580 91.5580 91.5701 91.6130 91.5746 91.5050 91.3993 91.2977;
90.6253 91.1090 91.5964 92.0845 92.5565 93.0199 93.5010 93.9785 94.4335 94.8851]';
pts2=[ 15.2091 15.0894 14.9765 14.8567 14.7360 14.6144 14.4695 14.3017;
90.1138 89.9824 89.8683 89.7716 89.6889 89.6040 89.4928 89.3624;
99.4393 99.9066 100.3802 100.8559 101.3340 101.8115 102.2770 102.7296]';
%Combine all of our curves into a single point cloud
X = [pts1;pts2];
%=======================================================
%We want to first find the line of best fit
%This line will provide a parameterization of the points
%See accepted answer to http://stackoverflow.com/questions/10878167/plot-3d-line-matlab
% calculate centroid
x0 = mean(X)';
% form matrix A of translated points
A = [(X(:, 1) - x0(1)) (X(:, 2) - x0(2)) (X(:, 3) - x0(3))];
% calculate the SVD of A
[~, S, V] = svd(A, 0);
% find the largest singular value in S and extract from V the
% corresponding right singular vector
[s, i] = max(diag(S));
a = V(:, i);
%=======================================================
a=a / norm(a);
%OK now 'a' is a unit vector pointing along the line of best fit.
%Now we need to compute a new variable, 't', for each point in the cloud
%This 't' value will parameterize the curve of best fit.
%Essentially what we're doing here is taking the dot product of each
%shifted point (contained in A) with the normal vector 'a'
t = A * a;
tMin = min(t);
tMax = max(t);
%This variable represents the order of our polynomial fit
%Use the smallest number that produces a satisfactory result
polyOrder = 8;
%Polynomial fit all three dimensions separately against t
pX = polyfit(t,X(:,1),polyOrder);
pY = polyfit(t,X(:,2),polyOrder);
pZ = polyfit(t,X(:,3),polyOrder);
%And that's our curve fit: (pX(t),pY(t),pZ(t))
%Now let's plot it.
tFine = tMin:.01:tMax;
fitXFine = polyval(pX,tFine);
fitYFine = polyval(pY,tFine);
fitZFine = polyval(pZ,tFine);
figure;
scatter3(X(:,1),X(:,2),X(:,3));
hold on;
plot3(fitXFine,fitYFine,fitZFine);
hold off;
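One caveat: a high-order polynomial fit can be badly conditioned when the t values are large. polyfit and polyval support centering and scaling for exactly this situation; a sketch of the change for the X dimension (the same applies to Y and Z):
[pX, ~, muX] = polyfit(t, X(:,1), polyOrder); %muX holds the mean and std of t
fitXFine = polyval(pX, tFine, [], muX); %evaluate with the same centering/scaling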

Calculating the essential matrix from two sets of corresponding points

I'm trying to reconstruct a 3d image from two calibrated cameras. One of the steps involved is to calculate the 3x3 essential matrix E from two sets of corresponding (homogeneous) points (more than the 8 required), p_a_orig and p_b_orig, and the two cameras' 3x3 internal calibration matrices K_a and K_b.
We start off by normalizing our points with
P_a = inv(K_a) * p_a_orig
and
P_b = inv(K_b) * p_b_orig
We also know the constraint
P_b' * E * P_a = 0
I'm following it this far, but how do you actually solve that last problem, i.e., find the nine values of the E matrix? I've read several different lecture notes on this subject, but they all leave out that crucial last step, likely because it is supposedly trivial math. I can't remember when I last did this, and I haven't been able to find a solution yet.
This equation is actually pretty common in geometry algorithms; essentially, you are trying to calculate the matrix X from the equation AXB = 0. To solve this, you vectorise the equation, which means
vec(A * X * B) = (B' ⊗ A) * vec(X) = 0
vec() means the vectorised form of a matrix, i.e., simply stack the columns of the matrix one over another to produce a single column vector. If you don't know the meaning of the scary-looking symbol ⊗, it's called the Kronecker product and you can read about it here; it's easy, trust me :-)
Now, say I call the matrix obtained by the Kronecker product of B^T and A as C.
Then, vec(X) is the null vector of the matrix C, and the way to obtain it is by doing the SVD decomposition of C^T * C (C transpose multiplied by C) and taking the last column of the matrix V. This last column is nothing but your vec(X). Reshape X to a 3 by 3 matrix. This is your essential matrix.
In case you find this maths too daunting to code, simply use the following code by Y.Ma et.al:
% p contains homogeneous coordinates of the first image, size 3 by n
% q contains homogeneous coordinates of the second image, size 3 by n
function [E] = essentialDiscrete(p,q)
n = size(p);
NPOINTS = n(2);
% set up matrix A such that A*[v1,v2,v3,s1,s2,s3,s4,s5,s6]' = 0
A = zeros(NPOINTS, 9);
if NPOINTS < 9
error('Too few measurements')
end
for i = 1:NPOINTS
A(i,:) = kron(p(:,i),q(:,i))';
end
r = rank(A);
if r < 8
warning('Measurement matrix rank deficient')
T0 = 0; R = [];
end
[U,S,V] = svd(A);
% pick the right singular vector corresponding to the smallest singular value
e = V(:,9);
e = (round(1.0e+10*e))*(1.0e-10);
% essential matrix
E = reshape(e, 3, 3);
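To tie this back to the question's notation, a hypothetical call with the normalized 3-by-n point sets would be:
E = essentialDiscrete(P_a, P_b); %p = points from the first image, q = from the second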
You can do several things:
The Essential matrix can be estimated using the 8-point algorithm, which you can implement yourself.
You can use the estimateFundamentalMatrix function from the Computer Vision System Toolbox, and then get the Essential matrix from the Fundamental matrix (see the sketch after this list).
Alternatively, you can calibrate your stereo camera system using the estimateCameraParameters function in the Computer Vision System Toolbox, which will compute the Essential matrix for you.
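For the second option, the Fundamental and Essential matrices are related through the calibration matrices as E = K_b' * F * K_a. A minimal sketch, with hypothetical matchedPointsA / matchedPointsB holding M-by-2 corresponding pixel coordinates:
F = estimateFundamentalMatrix(matchedPointsA, matchedPointsB, 'Method', 'Norm8Point');
E = K_b' * F * K_a; %undo the intrinsic calibration baked into F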

Calculate distance from point p to high dimensional Gaussian (M, V)

I have a high dimensional Gaussian with mean M and covariance matrix V. I would like to calculate the distance from point p to M, taking V into consideration (I guess it's the distance in standard deviations of p from M?).
Phrased differently, I take the ellipse one sigma away from M and would like to check whether p is inside that ellipse.
If V is a valid covariance matrix of a Gaussian, then it is symmetric positive definite and therefore defines a valid scalar product. By the way, so does inv(V).
Therefore, assuming that M and p are column vectors, you could define distances as:
d1 = sqrt((M-p)'*V*(M-p));
d2 = sqrt((M-p)'*inv(V)*(M-p));
The Matlab way to rewrite d2 (probably with some unnecessary parentheses) is:
d2 = sqrt((M-p)'*(V\(M-p)));
The nice thing is that when V is the unit matrix, then d1 == d2, and both correspond to the classical Euclidean distance. Whether you have to use d1 or d2 is left as an exercise (sorry, part of my job is teaching). Write the multidimensional Gaussian formula and compare it to the 1D case, since the 1D case is just a particular case of the multidimensional one (or perform some numerical experiments).
NB: in very high-dimensional spaces, or when testing very many points, you might find a cleverer / faster way using the eigenvectors and eigenvalues of V (i.e., the principal axes of the ellipsoid and their corresponding variances).
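As one such shortcut, a minimal sketch using a Cholesky factorization instead of the eigendecomposition, assuming the points to test are the columns of a matrix P and M is a column vector:
R = chol(V); %V = R'*R, with R upper triangular
D = bsxfun(@minus, P, M); %differences from the mean, one column per point
W = R' \ D; %whitened differences (a single triangular solve)
d2 = sqrt(sum(W.^2, 1)); %the d2 above, computed for all points at once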
Hope this helps.
A.
Consider computing the probability density of the point under the normal distribution:
M = [1 -1]; %# mean vector
V = [.9 .4; .4 .3]; %# covariance matrix
p = [0.5 -1.5]; %# 2d-point
prob = mvnpdf(p,M,V); %# probability P(p|mu,cov)
The function MVNPDF is provided by the Statistics Toolbox.
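Note that the density value alone does not directly answer the one-sigma-ellipse question; for that, compare the squared Mahalanobis distance (the d2 from the first answer) against 1. A sketch using the row-vector convention of this snippet:
isInside = (p - M) / V * (p - M)' <= 1; %same test as d2^2 <= 1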
Maybe I'm totally off, but isn't this the same as just asking for each dimension: Am I inside the sigma?
PSEUDOCODE:
foreach(dimension d)
(M(d) - sigma(d) < p(d) < M(d) + sigma(d)) ?
Because you want to know whether p is inside every dimension of your Gaussian. So actually this is just a spatial problem, and your Gaussian doesn't have anything to do with it (except for M and sigma, which are just distances).
In MATLAB you could try something like:
all(M - sigma < p & p < M + sigma) %chained comparisons don't work in MATLAB; split them with &
A plain Euclidean distance between the two points can be computed with norm:
norm(M - p)
Because M is just a point in space and p as well. Just 2 vectors.
And now the final one. You want to know the distance in a form of sigma's:
% create a distance vector and divide it by sigma (note the parentheses for correct precedence)
(M - p) ./ sigma
I think that will do the trick.