I'd like to generate a random number in a certain interval using an exponential distribution. My problem is that if I use exprnd I can't control the interval; I can only give a mean value, and that doesn't suit my needs.
Is there another function or is there some trick that I have to use?
Does this help? (or have I misunderstood the problem?)
%# Set the parameters
T = 2000;  %# Number of observations to simulate
Mu = 0.5;  %# Mean of the exponential distribution (what expcdf/expinv expect)
LB = 0;    %# Lower bound of the truncation interval
UB = 1;    %# Upper bound of the truncation interval

%# Validate the parameters
if LB < 0 || UB < 0; error('Bounds must be non-negative'); end
if Mu <= 0; error('Mu must be positive'); end

%# Determine LB and UB in terms of cumulative probabilities
LBProb = expcdf(LB, Mu);
UBProb = expcdf(UB, Mu);

%# Simulate uniform draws from the interval LBProb to UBProb
Draw = LBProb + (UBProb - LBProb) .* rand(T, 1);

%# Convert the uniform draws to exponential draws using the inverse cdf
X = expinv(Draw, Mu);
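As a quick sanity check (a minimal sketch, assuming the Statistics Toolbox functions expcdf/expinv used above are available), you can confirm that all draws land inside [LB, UB] and that their histogram decays like an exponential over that interval:

assert(all(X >= LB & X <= UB));  %# every draw lies in the requested interval
hist(X, 50);                     %# should show a truncated exponential shape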
The exponential distribution is supported on [0,+\infty). You may want to remap it onto [0,1) using some measurable invertible map f, so that Y = f(X) is a random variable supported on [0,1).
Problem: you have to build such an f.
My suggestion is
f(x) = 2/pi * arctan(x).
The function arctan maps (-\infty,\infty) to (-pi/2,pi/2). Because you are considering only non-negative samples (X is exponentially distributed), you obtain values in [0,pi/2); thus, you have to rescale by 2/pi. Moreover, because the Maclaurin expansion of arctan is x + o(x), the transformed samples behave almost exactly like the original exponential samples close to the origin.
Now, if you sample from any exponential distribution (i.e. with any value of \lambda, preferably small) and evaluate f on the samples, you get values that concentrate where you want them (i.e. close to 0 and nearly exponentially distributed).
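A minimal MATLAB sketch of this idea (assuming exprnd from the Statistics Toolbox; the mean value 0.5 below is just an arbitrary example):

Mu = 0.5;                    % mean of the exponential distribution (arbitrary example)
X  = exprnd(Mu, 10000, 1);   % exponential samples on [0, +inf)
Y  = 2/pi * atan(X);         % remapped samples, now supported on [0, 1)
hist(Y, 50);                 % mass concentrates near 0, roughly exponential there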
Here's a suggestion:
Sample from the exponential distribution with lambda = 1 and reject any number outside of your intended interval. If your interval is [0,1], the probability of a draw landing in it is 1 - exp(-1) ≈ 0.63, which gives well over a 99% probability of obtaining a "good" number within 10 draws.
Another possibility is to choose a number n large enough that the probability of sampling something above n is negligible; for lambda = 1, n = 1000 would suffice. Then you just sample from the exponential and map the sample to your interval via a + (b-a)*(sample/n) (the rare draw above n can simply be rejected).
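A rough sketch of the rejection approach (assuming exprnd is available; lambda = 1 and the interval [0, 1] are the example values from above):

lambda  = 1;                  % rate parameter; exprnd expects the mean, i.e. 1/lambda
nWanted = 1000;               % number of accepted samples to produce
samples = zeros(nWanted, 1);
for k = 1:nWanted
    x = exprnd(1/lambda);
    while x > 1               % reject anything outside the target interval [0, 1]
        x = exprnd(1/lambda);
    end
    samples(k) = x;
end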
I'm trying to draw 5 numbers at random. Each of the numbers 1-35 has a fixed probability weight assigned to it. I'm wondering how, in MATLAB, to draw 5 weighted random numbers WITHOUT repeats. I'm also looking for how to compute 5 sets of those.
Although I would have suspected MATLAB has a built-in function for this, the documentation for randsample suggests otherwise:
Y = randsample(N,K,true,W) or randsample(POPULATION,K,true,W) returns a
weighted sample, using positive weights W, taken with replacement. W is
often a vector of probabilities. This function does not support weighted
sampling without replacement.
So instead, since you are only looking for a few numbers, looping isn't a terrible idea:
POP = 1:35;                    % the population to sample from
W = rand(1,35); W = W/sum(W);  % example weights, normalized to sum to 1
k = 5;                         % how many numbers to draw
mynumbers = zeros(1,k);
for i = 1:k
    mynumbers(i) = randsample(POP,1,true,W);  % draw one number with weights
    idx2remove = find(POP==mynumbers(i));     % locate it in the remaining population
    POP(idx2remove) = [];                     % remove it so it cannot be drawn again
    W(idx2remove) = [];                       % drop its weight as well
end
The entries in W are your weights. The vector POP is your number 1 through 35. The number k is how many you'd like to choose.
The loop randomly samples one number (with weights) at a time using MATLAB's randsample, then the selected number and corresponding weight are removed from POP and W.
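To get the 5 separate sets the question asks for, you can wrap this in an outer loop (a small sketch reusing the idea above; here W0 holds the fixed weights, and each row of mysets is one set of 5 distinct numbers):

W0 = rand(1,35); W0 = W0/sum(W0);   % the fixed weights (example values here)
nSets = 5;                          % number of sets to draw
k = 5;                              % numbers per set
mysets = zeros(nSets, k);
for s = 1:nSets
    POP = 1:35;                     % reset the population for every set
    W = W0;                         % reset the weights for every set
    for i = 1:k
        mysets(s,i) = randsample(POP,1,true,W);
        idx2remove = find(POP==mysets(s,i));
        POP(idx2remove) = [];
        W(idx2remove) = [];
    end
end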
For larger k I hope there's a better solution...
I thought randn returns a random number which belongs to a normal distribution with mean 0 and standard deviation 1. Therefore, I expect to get a number in the range (0, 1). But what I get is a number not in the range (0,1).
What am I doing wrong?
You are thinking of a uniform distribution. A normal distribution can, in theory, produce arbitrarily large values, just with very low probability.
randn draws from a distribution with mean 0 and standard deviation 1. The normal distribution is the bell-curve / Gaussian shape, with the highest probability density at the mean and the density falling off on a scale set by the standard deviation.
What you are looking for is rand, which samples from a uniform distribution, giving numbers between 0 and 1 with equal probability density everywhere in that interval.
You're confusing the normal distribution with the uniform distribution.
Another possible source of confusion:
A normal distribution with mean 0 and variance 1 is often denoted N(0,1). This is sometimes called the standard normal distribution and implies that samples are drawn from all real numbers, i.e., the range (−∞,+∞), with mean 0 and variance 1. The standard deviation is also 1 in this case, but the notation specifies the variance (many screw this up). The transformation N(μ,σ²) = μ + σ·N(0,1), where μ is the mean, σ² is the variance, and σ is the standard deviation, is very useful.
Similarly, a continuous uniform distribution over the open interval (0,1) is often denoted U(0,1). This is often called a standard uniform distribution and implies that samples are drawn uniformly from just the range (0,1). Similarly, the transformation U(a,b) = a + (b − a) U(0,1), where a and b represent the edges of a scaled interval, is useful.
Note that the 0's and 1's in these two cases do not represent the same things at all other than being parameters that describe each distribution. The ranges that these two distributions are sampled from are called the support.
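In MATLAB terms, the two transformations look like this (a small sketch; the values of mu, sigma, a, and b are arbitrary examples):

mu = 3; sigma = 2;                      % target mean and standard deviation
x_normal = mu + sigma*randn(1000,1);    % N(mu, sigma^2) samples built from N(0,1) draws

a = -5; b = 5;                          % target interval
x_uniform = a + (b - a)*rand(1000,1);   % U(a,b) samples built from U(0,1) draws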
I have trouble with the probability density function (p.d.f) of a log-normal distribution. I really need your help. As defined by wikipedia:
http://en.wikipedia.org/wiki/Log-normal_distribution#Probability_density_function
The probability density function of a log-normal distribution is:

f(x; μ, σ) = 1 / (x σ √(2π)) · exp( −(ln x − μ)² / (2σ²) ),  for x > 0.

My problem is: how do I define the x variable in MATLAB? Thanks for your help!
If you have the Stats toolbox, you can just use lognpdf:
y = lognpdf(x,mu,sigma);
It's a very simple function, though - fully vectorized, it's effectively just:
y = exp(-0.5*((log(x)-mu)./sigma).^2)./(x.*sqrt(2*pi).*sigma);
But you may want to check that x > 0 and sigma > 0. To recreate the plot from the Wikipedia article that you cited, you can do:
mu = 0;
sigma = [1;0.5;0.25];   % one curve per sigma value, as in the Wikipedia figure
x = 0:0.01:3;
y = lognpdf([x;x;x],mu,sigma(:,ones(1,length(x))));   % expand sigma to the size of x
figure; plot(x,y);
Since your question asks about defining x, maybe you're actually looking for log-normally distributed random variables, i.e., you want to sample randomly from the log-normal distribution? In that case you can use lognrnd:
r = lognrnd(mu,sigma);
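If you don't have the Statistics Toolbox, the same samples can be generated with exp and randn, since a log-normal variable is just the exponential of a normal one (a small sketch with example parameter values):

mu = 0; sigma = 0.5;                 % parameters of the underlying normal distribution
r = exp(mu + sigma*randn(1000,1));   % 1000 log-normally distributed samples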
I'm confused - you can do this in a one-liner:
fun = @(x,mu,sigma) (1./(x*sigma*sqrt(2*pi))).*exp( -(power((log(x)-mu),2))/(2*power(sigma,2)))
x is any value that satisfies x > 0. As for what the pdf tells you, Wikipedia says:
In probability theory, a probability density function (pdf), or
density of a continuous random variable, is a function that describes
the relative likelihood for this random variable to take on a given
value.
So any value x given to the log-normal pdf tells you the relative likelihood that the random variable could take that value.
Consider this toy example:
mu = 1;
sigma = 10;
x = logspace(-2,0,10);   % 10 points between 0.01 and 1
plot( x, fun(x,mu,sigma) )
From this plot, as x gets closer to zero, the relative likelihood of the random variable actually taking on that value increases. DISCLAIMER: I just threw that function together; it needs to be checked for accuracy, and the preceding was for illustration only.
I am performing weighted least squares regression as described on wiki: WLS
I need to solve this equation: $B = (X^T W X)^{-1} X^T W y$
I use SVD to find $(X^T W X)^{-1}$ and store it in a matrix. In addition, I store the matrix $H = (X^T W X)^{-1} X^T W$ and simply do the following for any new value of y: B = Hy. This way I save the cost of repeating the SVD and the matrix multiplications as the y's change.
W is a diagonal matrix and generally does not change. However, sometimes I change one or two elements on the diagonal of W. In that case I need to do the SVD again and recalculate the H matrix, which is clearly slow and time consuming.
My question is: if I know what changed in W and nothing changes in X, is there a more efficient method to recalculate $(X^T W X)^{-1}$?
Or, put differently, is there an efficient analytic method to find B given that only diagonal elements of W can change by a known amount?
There is such a method, in the case that the inverse you compute is a true inverse and not a generalised inverse (i.e. none of the singular values are 0). However, some caution in using it is recommended. If you were doing your sums in infinite precision, all would be well; with finite precision, and particularly with nearly singular problems -- where some of the singular values are very small -- these formulae can result in loss of precision.
I'll call the inverse you store C. If you add d (which can be positive or negative) to the m'th weight, then the modified C matrix, C~ say, and the modified H, H~, can be computed like this:
(' denotes transpose, and e_m is the row vector that is all 0 except for a 1 in the m'th slot)
Let
c = the m'th column of H, divided by the original m'th weight
a = m'th row of the data matrix X
f = e_m - a*H
gamma = 1/d + a*c
(so c is a column vector, while a and f are row vectors)
Then
C~ = C - c*c'/gamma
H~ = H + c*f/gamma
If you want to find the new B, B~ say, for a given y, it can be calculated via:
r = y[m] - a*B
B~ = B + (r/gamma) * c
The derivation of these formulae is straightforward but tedious matrix algebra; the matrix inversion lemma (Sherman-Morrison formula) comes in handy.
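For concreteness, here is a small MATLAB sketch of these formulae on randomly generated example data (the names X, w, y, m, d are mine, not from the question), including a check against recomputing everything from scratch:

n = 50; p = 4;
X = randn(n, p);                 % example design matrix
w = rand(n, 1) + 0.5;            % original positive weights (diagonal of W)
y = randn(n, 1);

C = inv(X' * diag(w) * X);       % the stored inverse (an SVD could be used instead)
H = C * X' * diag(w);            % the stored matrix H
B = H * y;

m = 7;                           % index of the weight that changes
d = 0.3;                         % amount added to the m'th weight

c     = H(:, m) / w(m);          % m'th column of H divided by the original m'th weight
a     = X(m, :);                 % m'th row of the data matrix (a row vector)
e_m   = zeros(1, n); e_m(m) = 1; % row vector with a 1 in the m'th slot
f     = e_m - a * H;
gamma = 1/d + a * c;

Cnew = C - (c * c') / gamma;     % C~
Hnew = H + (c * f) / gamma;      % H~
r    = y(m) - a * B;
Bnew = B + (r / gamma) * c;      % B~

w2 = w; w2(m) = w2(m) + d;       % recompute from scratch to verify
Cdirect = inv(X' * diag(w2) * X);
Bdirect = Cdirect * X' * diag(w2) * y;
disp(max(abs(Cnew(:) - Cdirect(:))));   % both differences should be ~1e-12 or smaller
disp(max(abs(Bnew - Bdirect)));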
I would like to use MATLAB to visualize the Central Limit Theorem in action. I would like to use rand() to produce 10 samples of uniform distribution U[0,1] and compute their average, then save it to a matrix 'Mat'.
I would then use a histogram to visualize the convergence in distribution. How would you do this and normalize that histogram so it is a valid probability density (instead of just counting the frequency of occurrence)?
To generate the samples I am doing something like:
Mat = rand(N,sizeOfVector) > rand(1);
But I guess I am going about it the wrong way.
To generate N samples of length sizeOfVector you start out with rand as you suggested, and then continue as follows (calling the array average instead of Mat for readability):
samples = rand(N,sizeOfVector);            %# N samples, each of length sizeOfVector
average = mean(samples,2);                 %# one average per sample, giving N values
binWidth = 3.49*std(average)*N^(-1/3);     %# Scott's rule for a good bin width for normal data
nBins = ceil((max(average)-min(average))/binWidth);
[counts,x] = hist(average,nBins);
normalizedCounts = counts/sum(counts);     %# bar heights now sum to 1
bar(x,normalizedCounts,1)
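Note that dividing by sum(counts) makes the bar heights sum to 1 (a probability mass over the bins), not a probability density. To get a valid density that you can compare directly with the normal limit predicted by the central limit theorem, divide by the bin width as well. A small sketch reusing the variables above (the overlaid curve uses the mean 0.5 and standard deviation sqrt(1/(12*sizeOfVector)) that the CLT predicts for averages of sizeOfVector uniform draws):

density = counts/(sum(counts)*(x(2)-x(1)));        % heights now integrate to ~1
bar(x, density, 1); hold on;
m  = 0.5;                                          % mean of U(0,1)
s  = sqrt(1/(12*sizeOfVector));                    % std of the average of sizeOfVector draws
xx = linspace(min(average), max(average), 200);
plot(xx, exp(-(xx-m).^2/(2*s^2))/(s*sqrt(2*pi)), 'r', 'LineWidth', 2);
hold off;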