How to make a function that gives probabilities in a Geometric distribution? - matlab

I'm using MATLAB to make a function that returns the probability mass function (PMF) for a Geometric distribution when I enter the values of p, q, and the number of attempts (x) as the inputs.
My function:
function Probability = Geometric(p, q, x)
Probability = p*q^x-1
Now whenever I try to calculate the probability by typing in the values of p, q, and x, such as:
Geometric(0.5, 0.5, 1),
The exact error:
Geometric(0.5,0.5,1)
??? Undefined function or method 'Geometric' for input arguments of type 'double'.
I've tried changing functions, and reducing them to one input and one output.
I expect the probabilities to be calculated, but they just don't.

What's going wrong?
p*q^x-1 % Your original code
Your original code is taking q, raising it the xth power, multiplying it by p, then subtracting 1. This is equivalent to the following code which you certainly didn't intend.
(p*(q^x)) - 1 % What your code was doing written differently
Considering order of operations, the correction is easy.
p*q^(x-1) % Your corrected code
Another possible error source is your function is not saved as a standalone m-file "Geometric.m" which must also be on your MATLAB path (MATLAB has to "see" it). If you have your function file "MyFunction.m" stored in a folder, you can add that folder to MATLAB's visible path with one line (or manually navigate there). For more details, see how to create a function.
mypathtoMyFunction = 'C:\Users\SonnyJordan\Documents\SweetCode\FunctionFolder';
path(path,'mypathtoMyFunction')
A Full Solution (3 approaches)
From your parameterization of the Geometric distribution, you're wanting the support on {1, 2, 3, 4, ...}.
Two things. (1) I'd recommend an anonymous function for something like this. (2) There really isn't a need to separate p and q as separate variables since p + q = 1 and therefore one determines the other (i.e. q = 1-p).
Approach 1: Anonymous function
% MATLAB R2018b
geopmfh =#(p,k) p.*((1-p).^(k-1)); % Define pmf
k = 5; % Number of trials
p = 0.2; % Prob("Success" on trial)
geopmfh(p,k) % Probability
Above code is fully vectorized so you could pass it vectors and/or arrays of inputs.
A quick check to validate it is a valid probability mass function (pmf).
M = 500;
sum(geopmfh(p,[1:M])) % should return 1 if M large enough
Approach 2: Function (w/ error checking)
As an aside, making a function in MATLAB would make a lot of sense if you wanted to add error checking on the function inputs to ensure k is a positive integer and that p is between [0 1].
function [pmf] = geopmf(p,k)
%GEOPMF Calculates pmf for Geometric(p,k) distribution on {1,2,3,...}
% pmf = geopmf(p,k)
% p = n x d matrix of n d-dimensional success probabilities; must be [0,1]
% k = m x d matrix of m d-dimensional numbers of trials
% pmf = n x m matrix of probabilities
%
% Examples:
% k = 4; p = .5;
% pmf = geopmf(p,k) % pmf = 0.0625
% Input Error Checking ****************************************************
if isempty(p) | isempty(k), pmf = []; return, end
if nargin ~= 2, error('Function requires two inputs.'), end
if p < 0 | p > 1, error('p must be between 0 and 1.'), end
if k < 1 | ~isint(k), error('k must be positive integer & k > 0.'), end % with this parameterization
n = size(p,1); d = size(p,2);
m = size(k,1);
if isempty(p) | ~isnumeric(p) | ~ismatrix(p)
error('p must be non-empty numeric scalar, vector, or 2-D matrix.');
elseif isempty(k) | ~isnumeric(k) | ~ismatrix(k)
error('k must be non-empty numeric scalar, vector, or 2-D matrix.');
elseif size(k,2) ~=d
error('Rows of p and k must have same dimensions.');
end
% End (Input Error Checking) **********************************************
pmf = p.*((1-p).^(k-1));
end
Approach 3: MATLAB's built-in function
If you have the Statistics toolbox, MATLAB has a function for this already called geopdf but note it is parameterized according to the other "version" with support {0, 1, 2, ...} (see wiki page).
p.*((1-p).^k) % The other parameterization
geopdf(k,p) % Note order of inputs
You can correct for that by adjusting your input.
geopdf(k-1,p) % Subtract 1 trial
Code tested with MATLAB R2018b.

Related

Function differentiation

I have a function:
Where ||x|| is an Euclidean distance.
For given n (number of variables) I want Matlab to differentiate this function and then substitute real numbers into it.
What I truly don't understand how to do is:
1) How let Matlab create all these new n variables for later differentiation?
2) Each variable is a vector of dimension = d, i.e x=[x1, x2, ... xd], so later I want to differentiate function with respect to certain vector elements, for example with respect to x1, how can I do it?
EDIT 1: function should be differentiated at each x_i, where i=1:n
The derivative of a sum, is the sum of the derivatives of each element... so, we need to find the derivative only once (you can do this manually, if it is a simple function like in your toy-example, but we do this the general way, using the Symbolic Math Toolbox):
syms x y z % declaring 3 symolic variables
F = 1/(norm([x,y,z])); % declaring a function
f = diff(F,x) % calculate the derivative with regard to the symbolic variable x
f = -(abs(x)*sign(x))/(abs(x)^2 + abs(y)^2 + abs(z)^2)^(3/2)
you now have different options. You can use subs to evaluate the function f (simply assign numeric values to x,y, and z and call subs(f). Or, you create a (numeric) function handle by using matlabFunction (this is the way, I prefer)
fnc = matlabFunction(f); % convert to matlab function
Then, you just need to sum up the vector that you created (well, you need to sum up the sum of each of the two vector elements...)
% create arbitrary vector
n = 10;
x = rand(n+1,3);
% initialize total sum
SumFnc = 0;
% loop through elements
for i = 1:n
% calculate local sum
s = x(i,:)+x(i+1,:);
% call derivative-function + sum up
SumFnc = SumFnc + fnc(s(1),s(2),s(3));
end

Why is my third MATLAB function outputing only zeros when using ode45?

I need to model negative, positive and simple regulation of a gene for my systems biology class using MATLAB. The problem is that the functions for negative and simple regulation work but the positive regulation function is only outputting zeros.
My script is as follows:
% Simulation of simple regulation, negative autoregulation and positive
% autoregulation
% Define constants
global a b K n
a = 1;
b = 1;
K = 0.5;
n = 2; % Hill coefficient
% Simulation time
tspan = [0,10];
% Initial condition
X0 = 0;
% Run simulations
[t1,X1] = ode45(#autoregulation_f0,tspan,X0); % Simple regulation
[t2,X2] = ode45(#autoregulation_f1,tspan,X0); % Negative autoregulation
[t3,X3] = ode23(#autoregulation_f2,tspan,X0); % Positive autoregulation
% Plot results
figure;
plot(t1,X1,t2,X2,t3,X3);
legend('simple','negative','Location','southeast');
And my functions are:
function dxdt = autoregulation_f0(t,X)
global a b
dxdt = b - a*X;
end
function dxdt = autoregulation_f1(t,X)
global a b K n
dxdt = b/(1+(X^n)/(K^n)) - a*X;
end
function dxdt = autoregulation_f2(t,X)
global a b K n
dxdt = b*X.^n./(K.^n+X.^n) + a*X;
end
The third function "autoregulation_f2(t,X)" is the one that outputs zeros and therefore when plotting the graph I just get a straight line.
Does anyone know what could be causing this?
Thanks in advance!
It looks to be the correct result for the given function. Your provided dxdt has an X in every term. The initial X0=0 will result in dxdt=0, giving you no change in X. As a result you just end up with a flat line.

Matlab rref() function precision error after 12th column of hilbert matrices

My question may be a simple one but I could not think of a logical explanation for my question:
When I use
rref(hilb(8)), rref(hilb(9)), rref(hilb(10)), rref(hilb(11))
it gives me the result that I expected, a unit matrix.
However when it comes to the
rref(hilb(12))
it does not give a nonsingular matrix as expected. I used Wolfram and it gives the unit matrix for the same case so I am sure that it should have given a unit matrix. There may be a round off error or something like that but then 1/11 or 1/7 have also some troublesome decimals
so why does Matlab behave like this when it comes to 12?
It indeed seems like a precision error. This makes sense as the determinant of Hilbert matrix of order n tends to 0 as n tends to infinity (see here). However, you can use rref with tol parameter:
[R,jb] = rref(A,tol)
and take tol to be very small to get more precise results. For example, rref(hilb(12),1e-20)
will give you identity matrix.
EDIT- more details regarding the role of the tol parameter.
The source code of rref is provided at the bottom of the answer. The tol is used when we search for a maximal element in absolute value in a certain part of a column, to find the pivot row.
% Find value and index of largest element in the remainder of column j.
[p,k] = max(abs(A(i:m,j))); k = k+i-1;
if (p <= tol)
% The column is negligible, zero it out.
A(i:m,j) = zeros(m-i+1,1);
j = j + 1;
If all the elements are smaller than tol in absolute value, the relevant part of the column is filled by zeros. This seems to be where the precision error for rref(hilb(12)) occurs. By reducing the tol we avoid this issue in rref(hilb(12),1e-20).
source code:
function [A,jb] = rref(A,tol)
%RREF Reduced row echelon form.
% R = RREF(A) produces the reduced row echelon form of A.
%
% [R,jb] = RREF(A) also returns a vector, jb, so that:
% r = length(jb) is this algorithm's idea of the rank of A,
% x(jb) are the bound variables in a linear system, Ax = b,
% A(:,jb) is a basis for the range of A,
% R(1:r,jb) is the r-by-r identity matrix.
%
% [R,jb] = RREF(A,TOL) uses the given tolerance in the rank tests.
%
% Roundoff errors may cause this algorithm to compute a different
% value for the rank than RANK, ORTH and NULL.
%
% Class support for input A:
% float: double, single
%
% See also RANK, ORTH, NULL, QR, SVD.
% Copyright 1984-2005 The MathWorks, Inc.
% $Revision: 5.9.4.3 $ $Date: 2006/01/18 21:58:54 $
[m,n] = size(A);
% Does it appear that elements of A are ratios of small integers?
[num, den] = rat(A);
rats = isequal(A,num./den);
% Compute the default tolerance if none was provided.
if (nargin < 2), tol = max(m,n)*eps(class(A))*norm(A,'inf'); end
% Loop over the entire matrix.
i = 1;
j = 1;
jb = [];
while (i <= m) && (j <= n)
% Find value and index of largest element in the remainder of column j.
[p,k] = max(abs(A(i:m,j))); k = k+i-1;
if (p <= tol)
% The column is negligible, zero it out.
A(i:m,j) = zeros(m-i+1,1);
j = j + 1;
else
% Remember column index
jb = [jb j];
% Swap i-th and k-th rows.
A([i k],j:n) = A([k i],j:n);
% Divide the pivot row by the pivot element.
A(i,j:n) = A(i,j:n)/A(i,j);
% Subtract multiples of the pivot row from all the other rows.
for k = [1:i-1 i+1:m]
A(k,j:n) = A(k,j:n) - A(k,j)*A(i,j:n);
end
i = i + 1;
j = j + 1;
end
end
% Return "rational" numbers if appropriate.
if rats
[num,den] = rat(A);
A=num./den;
end

Solve matrix DAE system in matlab

The equations can be found here. As you can see it is set of 8 scalar equations closed to 3 matrix ones. In order to let Matlab know that equations are matrix - wise, I declare variable time dependent vector functions as:
syms t p1(t) p2(t) p3(t)
p(t) = symfun([p1(t);p2(t);p3(t)], t);
p = formula(p(t)); % allows indexing for vector p
% same goes for w(t) and m(t)...
Known matrices are declared as follows:
A = sym('A%d%d',[3 3]);
Fq = sym('Fq%d%d',[2 3]);
Im = diag(sym('Im%d%d',[1 3]));
The system is now ready to be modeled according to guide:
eqs = [diff(p) == A*w + Fq'*m,...
diff(w) == -Im*p,...
Fq*w == 0];
vars = [p; w; m];
At this point, when I try to reduce index (since it equals 2), I receive following error:
[DAEs,DAEvars] = reduceDAEIndex(eqs,vars);
Error using sym/reduceDAEIndex (line 95)
Expecting as many equations as variables.
The error would not arise if we had declared all variables as scalars:
syms A Im Fq real p(t) w(t) m(t)
Quoting symfun documentation (tips section):
Symbolic functions are always scalars, therefore, you cannot index into a function.
However it is hard for me to believe that it's not possible to solve these equations matrix - wise. Obviously, one can expand it to 8 scalar equations, but the multi body system concerned here is very simple and the aim is to be able to solve complex ones - hence the question: is it possible to solve matrix DAE in Matlab, and if so - what has to be fixed in order for this to work?
Ps. I have another issue with Matlab DAE solver: input variables (known coefficient functions) for my model are time variant. As far as example is concerned, they are constant in all domain, however for my problem they change in time. This problem has been brought out here. I would be grateful if you referred to it, should you have any solution.
Finally, I managed to find correct syntax for this problem. I made a mistake of treating matrix variables (such as A, Fq) as a single entity. Below I present code that utilizes matrix approach and solves this particular DAE:
% Define symbolic variables.
q = sym('q%d',[3 1]); % state variables
a = sym('a'); k = sym('k'); % constant parameters
t = sym('t','real'); % independent variable
% Define system variables and group them in vectors:
p1(t) = sym('p1(t)'); p2(t) = sym('p2(t)'); p3(t) = sym('p3(t)');
w1(t) = sym('w1(t)'); w2(t) = sym('w2(t)'); w3(t) = sym('w3(t)');
m1(t) = sym('m1(t)'); m2(t) = sym('m2(t)');
pvect = [p1(t); p2(t); p3(t)];
wvect = [w1(t); w2(t); w3(t)];
mvect = [m1(t); m2(t)];
% Define matrices:
mass = diag(sym('ms%d',[1 3]));
Fq = [0 -1 a;
0 0 1];
A = [1 0 0;
0 1 a;
0 a -q(1)*a] * k;
% Define sets of equations and aggregate them into one set:
set1 = diff(pvect,t) == A*wvect + Fq'*mvect;
set2 = mass*diff(wvect,t) == -pvect;
set3 = Fq*wvect == 0;
eqs = [set1; set2; set3];
% Close all system variables in one vector:
vars = [pvect; wvect; mvect];
% Reduce index of the system and remove redundnat equations:
[DAEs,DAEvars] = reduceDAEIndex(eqs,vars);
[DAEs,DAEvars] = reduceRedundancies(DAEs,DAEvars);
[M,F] = massMatrixForm(DAEs,DAEvars);
We receive very simple 2x2 ODE for two variables p1(t) and w1(t). Keep in mind that after reducing redundancies we got rid of all elements from state vector q. This means that all left variables (k and mass(1,1)) are not time dependent. If there had been time dependency of some variables within the system, the case would have been much harder to solve.
% Replace symbolic variables with numeric ones:
M = odeFunction(M, DAEvars,mass(1,1));
F = odeFunction(F, DAEvars, k);
k = 2000; numericMass = 4;
F = #(t, Y) F(t, Y, k);
M = #(t, Y) M(t, Y, numericMass);
% set the solver:
opt = odeset('Mass', M); % Mass matrix of the system
TIME = [1; 0]; % Time boundaries of the simulation (backwards in time)
y0 = [1 0]'; % Initial conditions for left variables p1(t) and w1(t)
% Call the solver
[T, solution] = ode15s(F, TIME, y0, opt);
% Plot results
plot(T,solution(:,1),T,solution(:,2))

Matlab internal rate of return

You'll have to be easy on me, I am new to matlab and SO. I am having an issue using the matlab solver to calculate internal rate of return(IRR). I saw that the financial toolbox in matlab had a function for this, however I don't believe I have it installed and did not want to get the trial version on their site.
Given the simple nature of my particular IRR calculation, I figured it would be easy enough to simply code in matlab. It is the same yearly cashflow, so what I put into matlab was as follows:
syms x k;
IRR = solve(investment == yrSavings* symsum((1+x)^-k,1, nYears));
It doesn't fail, and in fact gives a number. The only problem is the the result is incorrect! I type in the IRR manually and it never equals the investment. Using wolframalpha I found the actual solution, went back and manually typed in wolframalpha's answer, and the symsum function returned the correct result. I'm not sure what's up with the solver!
The way you have the formula written, the symbolic assumption is that you are using x as the iterator variable. I believe you want to use k. Try this:
syms x k;
IRR = solve(investment == yrSavings* symsum((1+x)^-k,k,1, nYears));
Another approach would be to use the MATLAB function roots to compute the discount factor and then convert that to an IRR. I happened to write such a function the other day, so I thought I might as well post it here for reference. It is heavily commented but the actual code is only three lines.
% Compute the IRR of a stream of cashflows represented as a vector. For example:
% > irr([-123400, 36200, 54800, 48100])
% ans = 0.059616
%
% If the provided stream of cashflows starts with a negative cashflow and all
% other cashflows are positive then `irr` will return a single scalar result.
%
% If multiple IRRs exist, all possible IRRs are returned as a column vector. For
% example:
% > irr([-1.0, 1.0, 1.1, 1.3, 1.0, -3.7])
% ans =
% 0.050699
% 0.824254
%
% If no IRRs exist, or if infinitely many IRRs exist, an empty array is
% returned. For example:
% > irr([1.0, 2.0, 3.0])
% ans = [](0x1)
%
% > irr([0.0])
% ans = [](0x1)
%
% Unlike Excel's IRR function no initial guess can be provided because all
% possible IRRs are returned anyway
%
% This function is not vectorized and will fail if called with a matrix argument
function irrs = irr(cashflows)
%% Overview
% The IRR is defined as the rate, r, such that:
%
% c0 + c1 / (1 + r) + c2 / (1 + r) ^ 2 + ... + cn / (1 + r) ^ n = 0
%
% where c0, c1, c2, ..., cn are the cashflows
%
% We define discount factors, d = 1 / (1 + r), so that the above becomes a
% polynomial in terms of the discount factors:
%
% c0 + c1 * d + c2 * d^2 + ... + cn * d^n = 0
%
% Such a polynomial will have n complex roots but between 0 and n real roots.
% In the important special case that c0 < 0 and c1, c2, ..., cn > 0 there
% will be exactly one real root.
%% Check that input is a vector, not a matrix
assert(isvector(cashflows), 'Input to irr must be a vector of cashflows');
%% Calculation of IRRs
% We use the built-in functions `roots` to compute all roots of the
% polynomial, which are the discount factors `d`. The user will provide a
% vector as [c0, c1, c2, ..., cn] but roots expects something of the form
% [cn, ..., c2, c2, c0] so we reverse the order of cashflows using (end:-1:1).
% At this stage `d` has n elements, most of which are likely complex.
d = roots(cashflows(end:-1:1));
% The function `roots` provides all roots, including complex ones. We are only
% interested in the real roots, so we filter out the complex roots here. There
% many also be spurious real roots less or equal to 0, which we also filter
% out. Now `rd` could have between 0 and n elements but is likely to have a
% single element
rd = d(imag(d) == 0.0 & real(d) > 0);
% We have solved everything in terms of discount factors, so we convert to
% rates by inverting the defining formula. Since the discount factors are
% real and positive, the IRRs are real and greater than -100%.
irrs = 1.0 ./ rd - 1.0;
end