Suppose X ~ Γ(α, β). I would like to truncate the distribution of X from below at t = 0.5, then solve for the α and β that give the truncated distribution a mean of 1 and a variance of θ = 0.4.
MATLAB code:
t = 0.5; theta = 0.4;
syms alpha beta
EX = beta*( igamma(alpha+1,t/beta) / igamma(alpha,t/beta) );    % mean
EX2 = beta^2*( igamma(alpha+2,t/beta) / igamma(alpha,t/beta) ); % second moment
VarX = EX2 - EX^2;                                              % variance
cond1 = alpha > 0; cond2 = beta > 0; cond3 = EX == 1; cond4 = VarX == theta;
conds = [cond1 cond2 cond3 cond4]; vars = [alpha, beta];
sol = solve(conds, [alpha beta], 'ReturnConditions',true);
soln_alpha = vpa(sol.alpha)
soln_beta = vpa(sol.beta)
The above code returns a numeric answer only if the constraint α > 0 is relaxed. That numeric answer has a negative value of α, which is wrong, since both α (the shape parameter) and β (the scale parameter) must be strictly positive.
Based on your title, I take it you want to generate samples from a Gamma distribution with mean = 1 and variance = 0.4 but want the distribution truncated to [0, inf].
If X ~ Gamma(alpha,beta), then by definition it is nonnegative (see the Gamma distribution wiki or the MATLAB page). Both the shape and scale parameters must also be positive. Note: MATLAB uses the (k, theta) parameterization found on the wiki page.
MATLAB has implemented probability distribution objects which make a lot of things very convenient from a practitioner perspective (or anyone who uses numerical approaches).
alpha = 0.4;
beta = 0.5;
pd = makedist('Gamma',alpha,beta) % Define the distribution object
Generating samples is now very easy.
n = 1000; % Number of samples
X = random(pd,n,1); % Random samples of X ~ Gamma(alpha,beta)
All that is left is to identify the shape and scale parameters such that E[X] = 1 and Var(X) = 0.4.
You need to solve
alpha * beta = E[X],
alpha * (beta^2) = Var(X),
for alpha and beta. It is a system of two nonlinear equations with two unknowns.
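For the untruncated case this system actually has a closed-form solution: dividing the second equation by the first gives beta = Var(X)/E[X], and substituting back gives alpha = E[X]^2/Var(X). A quick sketch with the question's targets:
tgtmean = 1;   % target E[X]
tgtvar = 0.4;  % target Var(X)
beta = tgtvar/tgtmean     % 0.4
alpha = tgtmean^2/tgtvar  % 2.5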
However, truncation makes these moment equations obsolete; a numerical approach will still work fine.
LB = 0.5; % lower bound (X > LB)
UB = inf; % upper bound (X < UB)
pdt = truncate(pd,LB,UB) % Define truncated distribution object
Xt = random(pdt,n,1); % Random samples from the truncated distribution
pdt =
GammaDistribution
Gamma distribution
a = 0.4
b = 0.5
Truncated to the interval [0.5, Inf]
Fortunately, the mean and variance of a distribution object are accessible whether or not it is truncated.
mean(pdt) % compare to mean(pd)
var(pdt) % compare to var(pd)
You can numerically solve this problem to obtain your parameters with something like fmincon.
tgtmean = 1;
tgtvar = 0.4;
fh_mean = @(p) mean(truncate(makedist('Gamma',p(1),p(2)),LB,UB));
fh_var = @(p) var(truncate(makedist('Gamma',p(1),p(2)),LB,UB));
fh = @(p) (fh_mean(p)-tgtmean).^2 + (fh_var(p)-tgtvar).^2;
[p, fval] = fmincon(fh,[alpha;beta],[],[],[],[],[0;0],[inf;inf])
You can test the answer for validation:
pd_test = truncate(makedist('Gamma',p(1),p(2)),LB,UB);
mean(pd_test)
var(pd_test)
ans = 1.0377
ans = 0.3758
Note this seems ill-conditioned due to the desired truncation and target mean. This might be good enough depending on your application.
histogram(random(pd_test,n,1)) % Visually inspect distribution
Mean and variance combinations must be feasible under the base distribution (here, the Gamma distribution), and truncating further restricts the set of feasible targets. For example, it would be impossible to truncate X ~ Gamma() to the interval [5, 500] and seek a mean of 2 or a mean of 600.
MATLAB code verified with version R2017a.
Also note that solutions from nonlinear solvers like fmincon can be sensitive to the starting point for some problems. If the numerical approach is giving you trouble, it may be a feasibility issue (as alluded to above), or it may require running fmincon from multiple start points and keeping the best answer, as sketched below.
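A minimal multi-start sketch, assuming the fh, alpha, and beta defined above (the number of starts and the sampling range for the initial points are arbitrary choices):
nStarts = 20;                % arbitrary number of restarts
bestp = [alpha;beta]; bestfval = fh(bestp);
for ii = 1:nStarts
    p0 = 0.1 + 5*rand(2,1);  % random positive starting point (assumed range)
    [pTry,fvalTry] = fmincon(fh,p0,[],[],[],[],[0;0],[inf;inf]);
    if fvalTry < bestfval
        bestp = pTry; bestfval = fvalTry;
    end
end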
Related
I am trying to use the Metropolis-Hastings algorithm with a random-walk sampler to simulate samples from a function $$ in MATLAB, but something is wrong with my code. The proposal density is the uniform PDF on the ellipse 2s^2 + 3t^2 ≤ 1/4. Can I use the acceptance-rejection method to sample from the proposal density?
N = 5000;
alpha = @(x1,x2,y1,y2) (min(1,f(y1,y2)/f(x1,x2)));
X = zeros(2,N);
accept = false;
n = 0;
while n < 5000
    accept = false;
    while ~accept
        s = 1-rand*(2);
        t = 1-rand*(2);
        val = 2*s^2 + 3*t^2;
        % check acceptance
        accept = val <= 1/4;
    end
    % and then draw uniformly distributed points checking that u < alpha?
    u = rand();
    c = u < alpha(X(1,i-1),X(2,i-1),X(1,i-1)+s,X(2,i-1)+t);
    X(1,i) = c*s + X(1,i-1);
    X(2,i) = c*t + X(2,i-1);
    n = n+1;
end
figure;
plot(X(1,:), X(2,:), 'r+');
You may just want to use MATLAB's native implementation, mhsample.
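As a rough sketch of how that might look here (f stands for the target density from the question, which was not shown; the Gaussian step size 0.1 is an arbitrary stand-in for the ellipse proposal):
% Sketch only: assumes f(x1,x2) is the target density
pdf = @(x) f(x(1),x(2));
proprnd = @(x) x + 0.1*randn(1,2);  % symmetric random-walk proposal (stand-in)
smpl = mhsample([0.1, 0.1], 5000, 'pdf', pdf, 'proprnd', proprnd, 'symmetric', 1);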
Regarding your code, there are a few things missing:
- the function f (which alpha depends on),
- the loop variable i (it might be meant to be n, but n is not suited for indexing since it starts at zero).
Also, you should always preallocate memory in MATLAB when filling an array dynamically, i.e. X in your case.
To expand on the suggestions by @max, the code appears to work if you change the i indices to n and replace
n = 0;
with
n = 2;
X(:,1) = [.1,.1];
It would probably be better to assign X(:,1) to random values within your accept region (using the same code you use later), and/or include a burn-in period.
Depending upon what you are going to do with this, it may also make things cleaner to wrap the argument of sin in the f function to keep it within 0 to 2*pi (likely by shifting the value by 2*pi if it exceeds those bounds).
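Putting these suggestions together, a minimal corrected sketch is below. The question's target density was elided, so f here is a stand-in (a standard bivariate normal kernel) purely for illustration:
f = @(x1,x2) exp(-(x1.^2 + x2.^2)/2);  % stand-in target density (hypothetical)
N = 5000;
alpha = @(x1,x2,y1,y2) min(1, f(y1,y2)/f(x1,x2));
X = zeros(2,N);
X(:,1) = [0.1; 0.1];                   % arbitrary starting point
for n = 2:N
    % draw a step uniformly from the ellipse 2s^2 + 3t^2 <= 1/4 by rejection
    accept = false;
    while ~accept
        s = 1 - 2*rand;
        t = 1 - 2*rand;
        accept = (2*s^2 + 3*t^2 <= 1/4);
    end
    % Metropolis accept/reject for the proposed move
    c = rand() < alpha(X(1,n-1), X(2,n-1), X(1,n-1)+s, X(2,n-1)+t);
    X(1,n) = X(1,n-1) + c*s;
    X(2,n) = X(2,n-1) + c*t;
end
figure; plot(X(1,:), X(2,:), 'r+');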
I want to construct a 3-dimensional Poisson distribution in MATLAB with lambda parameters [0.4, 0.2, 0.6], truncated so that each component has support {0, 1, 2, 3, 4, 5}. The 3 components are independent.
This is what I do
clear
n=3; %number components of the distribution
supp_marginal=0:1:5;
suppsize_marginal=size(supp_marginal,2);
supp_temp=repmat(supp_marginal.',1,n);
supp_temp_cell=num2cell(supp_temp,1);
output_temp_cell=cell(1,n);
[output_temp_cell{:}] = ndgrid(supp_temp_cell{:});
supp=zeros(suppsize_marginal^n,n);
for h=1:n
    temp = output_temp_cell{h};
    supp(:,h) = temp(:);
end
suppsize=size(supp,1);
lambda_1=0.4;
lambda_2=0.2;
lambda_3=0.6;
pr_mass=zeros(suppsize,1);
for j=1:suppsize
    pr_mass(j) = (poisspdf(supp(j,1),lambda_1).*...
                  poisspdf(supp(j,2),lambda_2).*...
                  poisspdf(supp(j,3),lambda_3))/...
                 sum(poisspdf(supp(:,1),lambda_1).*...
                     poisspdf(supp(:,2),lambda_2).*...
                     poisspdf(supp(:,3),lambda_3));
end
When I compute the mean of the obtained distribution, I get lambda_1 and lambda_2 but not lambda_3.
lambda_empirical=sum(supp.*repmat(pr_mass,1,3));
Question: why do I not get lambda_3?
tl;dr: Truncation changes the distribution so different means are expected.
This is expected, as truncation itself changes the distribution and certainly shifts the mean. You can see this from the experiment below. Notice that for your chosen parameters, this just starts to become noticeable around lambda = 0.6.
Similar to the wiki page, this illustrates the difference between E[X] (the expectation of X without truncation; a fancy word for the mean) and E[X | LB ≤ X ≤ UB] (the expectation of X given that it lies in the interval [LB, UB]). This conditional expectation implies a different distribution than the unconditional distribution of X (~Poisson(lambda)).
% MATLAB R2018b
% Setup
LB = 0; % lowerbound
UB = 5; % upperbound
% Simple test to compare theoretical means with and without truncation
TestLam = 0.2:0.01:1.5;
Gap = zeros(size(TestLam(:)));
for jj = 1:length(TestLam)
    TrueMean = mean(makedist('Poisson','Lambda',TestLam(jj)));
    TruncatedMean = mean(truncate(makedist('Poisson','Lambda',TestLam(jj)),LB,UB));
    Gap(jj) = TrueMean - TruncatedMean;
end
plot(TestLam,Gap)
Notice that the gap with these truncation bounds and a lambda of 0.6 is still small, and it becomes negligible as lambda approaches zero.
lam = 0.6; % <---- try different values (must be greater than 0)
pd = makedist('Poisson','Lambda',lam)
pdt = truncate(pd,LB,UB)
mean(pd) % 0.6
mean(pdt) % 0.5998
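The truncated mean can also be verified directly from the definition E[X | LB ≤ X ≤ UB] = sum(x.*p(x))/sum(p(x)) over x = LB..UB, using LB, UB, and lam from above:
x = (LB:UB)';
p = poisspdf(x,lam);
sum(x.*p)/sum(p)  % matches mean(pdt), approx. 0.5998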
Other Resources:
1. Wiki for Truncated Distributions
2. What is a Truncated Distribution
3. MATLAB documentation for truncate(), makedist()
4. MATLAB: Working with Probability Distribution (Objects)
I asked this question in Math Stackexchange, but it seems it didn't get enough attention there so I am asking it here. https://math.stackexchange.com/questions/1729946/why-do-we-say-svd-can-handle-singular-matrx-when-doing-least-square-comparison?noredirect=1#comment3530971_1729946
I learned from some tutorials that SVD should be more stable than QR decomposition when solving least-squares problems, and that it can handle singular matrices. But the following example I wrote in MATLAB seems to support the opposite conclusion. I don't have a deep understanding of SVD, so if you could look at my questions in the old post on Math StackExchange and explain them to me, I would appreciate it a lot.
I use a matrix that has a large condition number (~1e+13). The result shows SVD gives a much larger error (0.8) than QR (~1e-27).
% we do a linear regression between Y and X
data= [
47.667483331 -122.1070832;
47.667483331001 -122.1070832
];
X = data(:,1);
Y = data(:,2);
X_1 = [ones(length(X),1),X];
%%
%SVD method
[U,D,V] = svd(X_1,'econ');
beta_svd = V*diag(1./diag(D))*U'*Y;
%% QR method (here one can also use the "\" operator, which gives the same result, as I tested; I just wrote out backward substitution to educate myself)
[Q,R] = qr(X_1)
% now do backward substitution
[nr nc] = size(R)
beta_qr = []
Y_1 = Q'*Y
for i = nc:-1:1
    s = Y_1(i)
    for j = nc:-1:i+1
        s = s - R(i,j)*beta_qr(j)
    end
    beta_qr(i) = s/R(i,i)
end
svd_error = 0;
qr_error = 0;
for i=1:length(X)
    svd_error = svd_error + (Y(i) - beta_svd(1) - beta_svd(2)*X(i))^2;
    qr_error = qr_error + (Y(i) - beta_qr(1) - beta_qr(2)*X(i))^2;
end
Your SVD-based approach is basically the same as the pinv function in MATLAB (see Pseudo-inverse and SVD). What you are missing though (for numerical reasons) is using a tolerance value such that any singular values less than this tolerance are treated as zero.
If you refer to edit pinv.m, you can see something like the following (I won't post the exact code here because the file is copyrighted to MathWorks):
[U,S,V] = svd(A,'econ');
s = diag(S);
tol = max(size(A)) * eps(norm(s,inf));
% .. use above tolerance to truncate singular values
invS = diag(1./s);
out = V*invS*U';
In fact pinv has a second syntax where you can explicitly specify the tolerance value pinv(A,tol) if the default one is not suitable...
So when solving a least-squares problem of the form minimize norm(A*x-b), you should understand that the pinv and mldivide solutions have different properties:
x = pinv(A)*b is characterized by the fact that norm(x) is smaller than the norm of any other solution.
x = A\b has the fewest possible nonzero components (i.e., it is sparse).
Using your example (note that rcond(A) is very small near machine epsilon):
data = [
47.667483331 -122.1070832;
47.667483331001 -122.1070832
];
A = [ones(size(data,1),1), data(:,1)];
b = data(:,2);
Let's compare the two solutions:
x1 = A\b;
x2 = pinv(A)*b;
First you can see how mldivide returns a solution x1 with one zero component (this is still a valid solution, because both equations can be satisfied by the intercept alone, as in b + a*0 = b):
>> sol = [x1 x2]
sol =
-122.1071 -0.0537
0 -2.5605
Next you see how pinv returns a solution x2 with a smaller norm:
>> nrm = [norm(x1) norm(x2)]
nrm =
122.1071 2.5611
Here is the error of both solutions which is acceptably very small:
>> err = [norm(A*x1-b) norm(A*x2-b)]
err =
1.0e-11 *
0 0.1819
Note that using mldivide, linsolve, or qr will give pretty much the same results:
>> x3 = linsolve(A,b)
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 2.159326e-16.
x3 =
-122.1071
0
>> [Q,R] = qr(A); x4 = R\(Q'*b)
x4 =
-122.1071
0
SVD can handle rank-deficiency. The diagonal matrix D has a near-zero element in your code, so you need to use the pseudoinverse for SVD, i.e. set the 2nd element of 1./diag(D) to 0 rather than the huge value (~10^14). You should find that SVD and QR have equally good accuracy in your example. For more information, see this document: http://www.cs.princeton.edu/courses/archive/fall11/cos323/notes/cos323_f11_lecture09_svd.pdf
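In code, that suggestion might look like the following (a sketch applied to the question's X_1 and Y, using a pinv-style tolerance):
[U,D,V] = svd(X_1,'econ');
d = diag(D);
tol = max(size(X_1)) * eps(norm(d,inf));  % pinv-style tolerance
dinv = 1./d;
dinv(d < tol) = 0;             % zero out reciprocals of near-zero singular values
beta_svd = V*diag(dinv)*U'*Y;  % error now comparable to the QR solution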
Try this SVD variant called block SVD: you just set the number of iterations according to the accuracy you want; usually 1 is enough. If you want all the factors (the code has a default number selected for factor reduction), then edit the k = ... line to the size of the matrix, if I recall my MATLAB correctly.
A= randn(100,5000);
A=corr(A);
% A is your correlation matrix
tic
k = 1000; % number of factors to extract
bsize = k + 50;
block = randn(size(A,2),bsize);
iter = 2; % could set via tolerance
[block,R] = qr(A*block,0);
for i=1:iter
    [block,R] = qr(A*(A'*block),0);
end
M = block'*A;
% Economy size dense SVD.
[U,S] = svd(M,0);
U = block*U(:,1:k);
S = S(1:k,1:k);
% Note SVD of a symmetric matrix is:
% A = U*S*U' since V=U in this case, S=eigenvalues, U=eigenvectors
V=real(U*sqrt(S)); %scaling matrix for simulation
toc
% reduced randomized matrix for simulation
sims = 2000;
randnums = randn(k,sims);
corrrandnums = V*randnums;
est_corr_matrix = corr(corrrandnums');
total_corrmatrix_difference = sum(sum(est_corr_matrix-A))
I ran through the algebra, which I had previously done for the Verlet method without the force; this led to the same code as you see below, except that the "+(2*F/D)" term was missing when I ignored the external force. Without the force, the algorithm worked accurately, as expected. However, for the following parameters:
m = 7 ; k = 8 ; b = 0.1 ;
params = [m,k,b];
(and step size h = 0.001)
a force far above something like 0.00001 is much too big. I suspect I've missed a trick with the algebra.
My question is whether someone can spot the flaw in the way I added a force term to my Verlet method.
% verlet.m
% uses the Verlet step algorithm to integrate the simple harmonic
% oscillator.
% stepsize h, for a second-order ODE
function vout = verlet(vinverletx,h,params,F)
    % vin is the particle vector (xn,yn)
    x0 = vinverletx(1);
    x1 = vinverletx(2);
    % find the verlet coefficients
    D = (2*params(1)) + (params(3)*h);
    A = (2/D)*((2*params(1)) - (params(2)*h^2));
    B = (1/D)*((params(3)*h) - (2*params(1)));
    x2 = (A*x1) + (B*x0) + (2*F/D);
    vout = x2;
    % vout is the particle vector (xn+1,yn+1)
end
As written in the answer to the previous question, the moment friction enters the equation the system is no longer conservative and the name "Verlet" no longer applies. It is still a valid discretization of
m*x''+b*x'+k*x = F
(with some slight error with large consequences).
The discretization employs the central difference quotients of first and second order
x'[k] = (x[k+1]-x[k-1])/(2*h) + O(h^2)
x''[k] = (x[k+1]-2*x[k]+x[k-1])/(h^2) + O(h^2)
resulting in
(2*m+b*h)*x[k+1] - 2*(2*m-h^2*k) * x[k] + (2*m-b*h)*x[k-1] = 2*h^2 * F[k] + O(h^4)
Error: As you can see, you are missing a factor h^2 in the term with F.
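So the fix is to replace (2*F/D) with (2*h^2*F/D) in the update. A corrected sketch of the function, keeping the same parameter layout params = [m,k,b]:
function vout = verlet(vinverletx,h,params,F)
    x0 = vinverletx(1);
    x1 = vinverletx(2);
    D = (2*params(1)) + (params(3)*h);
    A = (2/D)*((2*params(1)) - (params(2)*h^2));
    B = (1/D)*((params(3)*h) - (2*params(1)));
    x2 = (A*x1) + (B*x0) + (2*h^2*F/D);  % h^2 factor restored
    vout = x2;
end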
I have a state space system with matrices A,B,C and D.
I can either create a state-space system from it, sys1 = ss(A,B,C,D), or compute the transfer function matrix, sys2 = C*inv(z*I - A)*B + D.
However, when I draw the Bode plots of both systems, they are different, while they should be the same.
What is going wrong here? Does anyone have a clue? I know, by the way, that the Bode plot generated by sys1 is correct.
The system can be downloaded here: https://dl.dropboxusercontent.com/u/20782274/system.mat
clear all;
close all;
clc;
Ts = 0.01;
z = tf('z',Ts);
% Discrete system
A = [0 1 0; 0 0 1; 0.41 -1.21 1.8];
B = [0; 0; 0.01];
C = [7 -73 170];
D = 1;
% Set as state space
sys1 = ss(A,B,C,D,Ts);
% Compute transfer function
sys2 = C*inv(z*eye(3) - A)*B + D;
% Compute the actual transfer function
[num,den] = ss2tf(A,B,C,D);
sys3 = tf(num,den,Ts);
% Show bode
bode(sys1,'b',sys2,'r--',sys3,'g--');
Edit: I made a small mistake: the transfer function matrix is sys2 = C*inv(z*I - A)*B + D, instead of sys2 = C*inv(z*I - A)*B - D, which I wrote down before. The problem still holds.
Edit 2: I have noticed that when I compute the denominator, it is correct.
syms z;
collect(det(z*eye(3) - A),z)
Your assumption that sys2 = C*inv(z*I- A)*B + D is incorrect. The correct equivalent to your state-space system (A,B,C,D) is sys2 = C*inv(s*I- A)*B + D. If you want to express it in terms of z, you'll need to invert the relationship z = exp(s*T). sys1 is the correct representation of your state-space system. What I would suggest for sys2 is to do as follows:
sys1 = ss(mjlsCE.A,mjlsCE.B,mjlsCE.C,mjlsCE.D,Ts);
sys1_c = d2c(sys1);
s = tf('s');
sys2_c = sys1_c.C*inv(s*eye(length(sys1_c.A)) - sys1_c.A)*sys1_c.B + sys1_c.D;
sys2_d = c2d(sys2_c,Ts);
That should give you the correct result.
Due to inaccuracy of the inverse function, extra unobservable poles and zeros are added to the system. For this reason you need to compute the minimal realization of your transfer function matrix.
Meaning
% Compute transfer function
sys2 = minreal(C*inv(z*eye(3) - A)*B + D);
What you are noticing is actually a numerical instability regarding pole-zero pair cancellations.
If you run the following code:
A = [0, 1, 0; 0, 0, 1; 0.41, -1.21, 1.8] ;
B = [0; 0; 0.01] ;
C = [7, -73, 170] ;
D = 1 ;
sys_ss = ss(A, B, C, D) ;
sys_tf_simp = tf(sys_ss) ;
s = tf('s') ;
sys_tf_full = tf(C*inv(s*eye(3) - A)*B + D) ;
zero(sys_tf_simp)
zero(sys_tf_full)
pole(sys_tf_simp)
pole(sys_tf_full)
you will see that the transfer function formulated from the matrices directly has many more poles and zeros than the one formulated by MATLAB's tf function. You will also notice that every single pair of these "extra" poles and zeros is equal, meaning they would cancel with each other if you were to simplify the rational expression. MATLAB's tf presents the simplified form, with equal pole-zero pairs cancelled out. This is algebraically equivalent to the unsimplified form, but not numerically.
When you call bode on the unsimplified transfer function, MATLAB begins its numerical plotting routine with the pole-zero pairs not cancelled algebraically. If the computer were perfect, the result would be the same as in the simplified case. However, numerical error when evaluating the numerator and denominator effectively leaves some of the pole-zero pairs "uncancelled", and since many of these poles are in the far right half of the s-plane, they drastically influence the output behavior.
Check out this link for info on this same problem but from the perspective of design: http://ctms.engin.umich.edu/CTMS/index.php?aux=Extras_PZ
In your original code, you can think of the output drawn in green as what the naive designer wanted to see when he cancelled all his unstable poles with zeros, but the output drawn in red is what he actually got because in practice, finite-precision and real-world tolerances prevent the poles and zeros from cancelling perfectly.
Why is there an unobservable/uncontrollable pole? I think this issue comes only from the fact that the inverse of a transfer function matrix is inaccurate in MATLAB.
Note:
A is 3x3, and the minimal realization also has order 3.
What you did is the inverse of a transfer function matrix, not a symbolic or numeric matrix.
% Discrete system
Ts = 0.01;
A = [0 1 0; 0 0 1; 0.41 -1.21 1.8];
B = [0; 0; 0.01];
C = [7 -73 170];
D = 1;
z = tf('z', Ts)  % z is a discrete tf
A1 = z*eye(3) - A  % a tf matrix with a direct feedthrough matrix A
% invert it, multiply by C and B from left and right, and add D
G = D + C*inv(A1)*B
G is now a scalar (SISO) transfer function.
Without "minreal", G has order 9 (funny, I don't know how Matlab computes it, perhaps the "Adj(.)/det(.)" method). Matlab cannot cancel the common factors in the numerator and the denominator, because z is of class 'tf' rather than a symbolic variable.
Do you agree, or do I have a misunderstanding?