Sequentially reading down a matrix column in MATLAB - matlab

My original problem was to create a scenario in which there is a line of a specific length (x=100) and a barrier at a specific position (pos=50). Multiple rounds of sampling are carried out, and in each round a specific number of random numbers (p) is generated. The numbers generated can fall either to the left or to the right of the barrier. The program outputs the difference between the largest number generated to the left of the barrier and the smallest number generated to the right. This is much clearer to see here:
In this example, the system has created 4 numbers (a,b,c,d). It will ignore a and d and output the difference between b and c. Essentially, it will output the smallest possible fragment of the line that still contains the barrier.
The code I have been using to do this is:
x = 100; % length of the grid
pos = 50; % position of the barrier
len1 = 0; % left edge of the grid
len2 = x; % right edge of the grid
sample = 1000; % number of samples to make
nn = 1:12 % number of points to generate (will loop over these)
len = zeros(sample, length(nn)); % array to record the results
for n = 1:length(nn) % For each number of pts to generate
numpts = nn(n);
for i = 1:sample % For each round of sampling,
p = round(rand(numpts,1) * x); % generate 'numpts' random points.
if any(p>pos) % If any are to the right of the barrier,
pright = min(p(p>pos)); % pick the smallest.
else
pright = len2;
end
if any(p<pos) % If any are to the left of the barrier,
pleft = max(p(p<pos)); % pick the largest.
else
pleft = len1;
end
len(i,n) = pright - pleft; % Record the length of the interval.
end
end
My current problem: I'd like to make this more complex. For example, I would like to be able to use more than just one random number count in each round. Specifically I would like to relate this to Poisson distributions with different mean values:
% Create poisson distributions for λ = 1:12
range = 0:20;
for l=1:12;
y = poisspdf(range,l);
dist(:,l) = y;
end
From this, I'd like to take 1000 samples for each λ, but within each round of 1000 samples the random number count is no longer the same for all 1000 samples. Instead it depends on the Poisson distribution. For example, for a mean value of 1, the probabilities are:
0 - 0.3678
1 - 0.3678
2 - 0.1839
3 - 0.0613
4 - 0.0153
5 - 0.0030
6 - 0.0005
7 - 0.0001
8 - 0.0000
9 - 0.0000
10 - 0.0000
11 - 0.0000
12 - 0.0000
So for the first round of 1000 samples, about 368 of them would be carried out generating 0 numbers, 368 generating 1 number, 184 generating 2 numbers and so on. The program will then repeat this using the new probabilities it gets from a mean value of 2, and so on. I'd then like to simply collect together all the fragment sizes (pright-pleft) into a column of a matrix, with one column for each value of λ.
I know I could do something like:
amount = dist*sample
This multiplies the Poisson distributions by the sample size to give how many samples of each random number count it should do. However, I'm really stuck on how to incorporate this into the for-loop and alter the code to tackle this new problem. I am also not sure how to read down a column of a matrix so that each probability value determines how many samples of each count it should do.
Any help would be greatly appreciated,
Anna.

If you have the Statistics Toolbox, you could generate a vector of random variables from a known pdf object using random. Better still, skip the PDF step and generate the random counts directly using poissrnd. poissrnd already returns integers, so the round call in the example is only a safeguard. Then call rand as you were doing already, and in your loop simply iterate over the generated matrix of Poisson-distributed counts.
Example:
x = 100; % length of the grid
pos = 50; % position of the barrier
len1 = 0; % left edge of the grid
len2 = x; % right edge of the grid
sample = 1000; % number of samples to make
lambda = 1:12; % lambdas
Rrnd = round(poissrnd(repmat(lambda,sample,1)));
len = zeros(size(Rrnd)); % array to record the results
for n = 1:numel(lambda) % For each lambda (one column of Rrnd per lambda)
for i = 1:sample % For each round of sampling,
numpts = Rrnd(i,n);
p = round(rand(numpts,1) * x); % generate 'numpts' random points.
len(i,n) = min([p(p>pos);len2]) - max([p(p<pos);len1]); % Record the length
end
end
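If the Statistics Toolbox is not available, the Poisson counts can also be drawn by inverse-CDF sampling from a probability table like the dist matrix in the question, which is one way of "reading down a column". The following is only a minimal sketch under assumptions that are not in the original posts: the support is cut off at kmax = 40, and the names pmf, cdf, counts and kmax are illustrative.

```matlab
lambda = 1:12;                       % Poisson means, as in the answer above
sample = 1000;                       % samples per lambda
kmax   = 40;                         % assumed cut-off (tail mass is negligible for lambda <= 12)
k      = (0:kmax)';                  % possible counts, as a column

counts = zeros(sample, numel(lambda));
for n = 1:numel(lambda)
    % Poisson pmf computed in log space, so poisspdf is not needed
    pmf = exp(-lambda(n) + k*log(lambda(n)) - gammaln(k + 1));
    cdf = cumsum(pmf);               % accumulate down the column
    cdf(end) = 1;                    % guard against round-off in the tail
    u = rand(1, sample);             % one uniform draw per sample
    % the drawn count is the number of cdf entries strictly below u
    counts(:, n) = sum(bsxfun(@lt, cdf, u), 1)';
end
```

The resulting counts matrix has the same shape as Rrnd, so it could be used in its place in the loop above.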

Related

How to use the randn function in Matlab to create an array of values (range 0-10) of size 1,000 that follows a Gaussian distribution? [duplicate]

Matlab has the function randn to draw from a normal distribution e.g.
x = 0.5 + 0.1*randn()
draws a pseudorandom number from a normal distribution of mean 0.5 and standard deviation 0.1.
Given this, is the following MATLAB code equivalent to sampling from a normal distribution truncated at 0 and 1?
while x <=0 || x > 1
x = 0.5 + 0.1*randn();
end
Using MATLAB's Probability Distribution Objects makes sampling from truncated distributions very easy.
You can use the makedist() and truncate() functions to define the object and then modify (truncate it) to prepare the object for the random() function which allows generating random variates from it.
% MATLAB R2017a
pd = makedist('Normal',0.5,0.1) % Normal(mu,sigma)
pdt = truncate(pd,0,1) % truncated to interval (0,1)
sample = random(pdt,numRows,numCols) % Sample from distribution `pdt`
Once the object is created (here it is pdt, the truncated version of pd), you can use it in a variety of function calls.
To generate samples, random(pdt,m,n) produces a m x n array of samples from pdt.
Further, if you want to avoid use of toolboxes, this answer from @Luis Mendo is correct (proof below).
figure, hold on
h = histogram(cr,'Normalization','pdf','DisplayName','@Luis Mendo samples');
X = 0:.01:1;
p = plot(X,pdf(pdt,X),'b-','DisplayName','Theoretical (w/ truncation)');
You need the following steps:
1. Draw a random value u from the uniform distribution on (0,1).
2. Assuming the normal distribution is truncated at a and b, and F is the cdf of the untruncated normal, compute
u_bar = F(a)*u + F(b)*(1-u)
3. Apply the inverse of F:
epsilon = F^{-1}(u_bar)
epsilon is then a random value from the truncated normal distribution.
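The three steps above can be sketched in base MATLAB as follows. This is an illustration rather than a reference implementation: the cdf F and its inverse Finv are written with erf and erfinv so that no toolbox is needed, the parameter values are taken from the question, and the sample size N is an assumption.

```matlab
m = 0.5;  s = 0.1;          % mean and std of the underlying normal (from the question)
a = 0;    b = 1;            % truncation bounds
N = 1e5;                    % number of samples (assumed)

% cdf of N(m, s^2) and its inverse, written with erf/erfinv (base MATLAB)
F    = @(x) 0.5 * (1 + erf((x - m) ./ (s * sqrt(2))));
Finv = @(q) m + s * sqrt(2) * erfinv(2*q - 1);

u       = rand(N, 1);                   % step 1: uniform draws
u_bar   = F(a) .* u + F(b) .* (1 - u);  % step 2: map into [F(a), F(b)]
epsilon = Finv(u_bar);                  % step 3: invert the cdf
```

Unlike the rejection loop, this draws every sample in a single pass, which matters most when the truncation interval covers only a small part of the distribution.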
Why don't you vectorize? It will probably be faster:
N = 1e5; % desired number of samples
m = .5; % desired mean of underlying Gaussian
s = .1; % desired std of underlying Gaussian
lower = 0; % lower value for truncation
upper = 1; % upper value for truncation
remaining = 1:N;
while remaining
result(remaining) = m + s*randn(1,numel(remaining)); % (pre)allocates the first time
remaining = find(result<=lower | result>upper);
end

Using polyarea to calculate the area of a subcycle

I was wondering how to use polyarea in MATLAB at different intervals. For example, I have disp=[1,2,3,4,5.....] and load = [3,4,5,6,7,8....]. I would like to calculate polyarea(disp,load) at every 40 rows (or intervals). disp and load are cyclic loading and displacement data, containing 1000+ rows like this. Any help is much appreciated!
EDIT 1: (based on m7913d's answer) The code does not seem to give the right answers. Is anything wrong with it?
data=xlsread('RE.xlsx');
time=data(:,1);
load=data(:,2);
disp=data(:,3);
duration = 40;
n = length(disp); % number of captured samples
nCycles = floor(n/duration); % number of completed cycles
areas = zeros(nCycles, 1); % initialise output (area of each cycle)
for i=1:nCycles % loop over the cycles
range = (i-1)*duration + (1:duration); % calculate the indexes corresponding with the ith cycle
areas(i) = polyarea(disp(range), load(range)); % calculate the area of the ith cycle
end
Assuming each cycle has the same known duration (duration = 40), you can calculate the area of each cycle as follows:
duration = 40;
n = length(A); % number of captured samples
nCycles = floor(n/duration); % number of completed cycles
areas = zeros(nCycles, 1); % initialise output (area of each cycle)
for i=1:nCycles % loop over the cycles
range = (i-1)*duration + (1:duration); % calculate the indexes corresponding with the ith cycle
areas(i) = polyarea(A(range), B(range)); % calculate the area of the ith cycle
end
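One way to check the loop is to run it on synthetic data whose cycle areas are known in advance. The sketch below is an assumption-laden test case, not part of the original answer: A and B are stand-ins for displacement and load, each cycle traces an ellipse with semi-axes k and 2k sampled at 40 points, and the names nCyc, amp and expected are illustrative.

```matlab
duration = 40;                                % samples per cycle, as above
nCyc     = 3;                                 % number of synthetic cycles (assumed)
t   = (0:duration*nCyc - 1)' * 2*pi/duration; % phase angle, one loop per 40 samples
amp = kron((1:nCyc)', ones(duration, 1));     % cycle k has amplitude k
A   = amp .* cos(t);                          % stand-in for displacement
B   = 2 * amp .* sin(t);                      % stand-in for load

areas = zeros(nCyc, 1);
for i = 1:nCyc
    range = (i-1)*duration + (1:duration);    % indexes of the ith cycle
    areas(i) = polyarea(A(range), B(range));  % area enclosed by the ith cycle
end

% Area of a 40-gon inscribed in an ellipse with semi-axes k and 2k
expected = 0.5 * duration * sin(2*pi/duration) * 2 * (1:nCyc)'.^2;
disp([areas, expected]);                      % the two columns should agree
```

If the same loop gives unexpected values on real data, it may be worth checking whether every cycle really spans exactly 40 rows, since a drifting cycle length would mix points from neighbouring loops.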
Further Reading
As this seems a basic question to me, it may be useful to have a look at the Getting Started tutorial of MATLAB.

Generating random numbers in matlab biased towards the boundaries

I want to generate biased random numbers in MATLAB. Let me explain a bit more what I mean by biased.
Let's say I have a defined upper bound and lower bound of 30 and 10 respectively.
I want to generate N random numbers biased towards the bounds, such that the probability of a number lying close to 10 or 30 (the extremes) is higher than the probability of it lying somewhere in the middle.
How can I do this?
Any help is much appreciated :)
% Upper bound
UB = 30
% Lower bound
LB = 10;
% Range
L = UB-LB;
% Std dev - you may want to relate it to L - maybe use sigma=sqrt(L)
sigma = L/6;
% Number of samples to generate
n = 1000000;
X = sigma*randn(1,n);
% Remove items that are above bounds - not sure if it's what you want.. if not comment the two following lines
X(X<-L) = [];
X(X>L) = [];
% Take values above zero for lower bounds, other values for upper bound
ii = X > 0;
X(ii) = LB + X(ii);
X(~ii) = UB + X(~ii);
% plot histogram
hist(X, 100);
I used a normal distribution here, but you can adapt this to use others. You can also change sigma.
To generate random numbers from an arbitrary distribution, you have to define its inverse cumulative distribution function. Let's say you called it myICDF. Once you have this function, you can generate random samples using myICDF(rand(n,m)).
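As a concrete illustration of the myICDF idea, the sketch below uses the arcsine distribution, whose density is U-shaped and therefore piles up near both bounds. Choosing this distribution is an assumption made for the example, not something the answers prescribe, and the names w, fracEdges and fracMiddle are illustrative.

```matlab
LB = 10;  UB = 30;                 % bounds from the question
n  = 1e5;                          % number of samples (assumed)

% Inverse cdf of the arcsine distribution, rescaled from [0,1] to [LB,UB]
myICDF = @(u) LB + (UB - LB) * sin(pi * u / 2).^2;

X = myICDF(rand(n, 1));            % samples concentrate near LB and UB

% Compare how many samples land in the outer fifths versus the middle fifth
w = (UB - LB) / 5;
fracEdges  = mean(X < LB + w | X > UB - w);
fracMiddle = mean(X > LB + 2*w & X < UB - 2*w);
fprintf('outer fifths: %.2f, middle fifth: %.2f\n', fracEdges, fracMiddle);
```

Any other invertible cdf with more mass at the ends could be swapped in by changing only the myICDF line.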

exponential random numbers with a bound in matlab

I want to pick values between, say, 50 and 150 using an exponential random number generator (a flat hazard function). How do I implement bounds on the built-in exponential random number function in matlab?
A quick way is to generate a sequence longer than you need, and throw out values outside your desired range.
dist = exprnd(100,1,1000);
%# mean of 100 ---^ ^---^--- 1x1000 random numbers
dist(dist<50 | dist>150) = []; %# will be shorter than 1000
If you don't have enough values after pruning, you can repeat and append onto the vector, or however else you want to do it.
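The repeat-and-append idea could look like the sketch below. It is written under a few assumptions: the names lo, hi, nWanted and batch are illustrative, and the exponential draws use -mu*log(rand) so that the snippet runs without the Statistics Toolbox (exprnd(mu, 1, nWanted) could be used for the batch instead).

```matlab
mu = 100;  lo = 50;  hi = 150;     % mean and bounds from the example above
nWanted = 1000;                    % number of bounded samples to collect

dist = [];
while numel(dist) < nWanted
    % exponential variates with mean mu, drawn by inverting the cdf
    batch = -mu * log(rand(1, nWanted));
    batch(batch < lo | batch > hi) = [];    % prune out-of-range values
    dist = [dist, batch];                   %#ok<AGROW> append the survivors
end
dist = dist(1:nWanted);            % trim the surplus
```

With these bounds a bit more than a third of each batch survives the pruning, so the loop usually finishes after a few passes.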
exprnd uses rand (see >> open exprnd.m), so you can bound the output of that instead by reversing the process and sampling uniformly within the range of uniform values that maps to [r1, r2].
sizeOut = [1, 1000]; % sample size
mu = 100; % parameter of exponential
r1 = 50; % lower bound
r2 = 150; % upper bound
r = exprndBounded(mu, sizeOut, r1, r2); % bounded output
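function r = exprndBounded(mu, sizeOut, r1, r2)
% Draw exponential variates with mean mu, restricted to [r1, r2]
minE = exp(-r1/mu); % uniform value that maps to r1
maxE = exp(-r2/mu); % uniform value that maps to r2
randBounded = minE + (maxE-minE).*rand(sizeOut); % uniform between the two
r = -mu .* log(randBounded); % invert the exponential cdf
end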
The drawn densities (using a non-parametric kernel estimator) look like the following for 20K samples

conditional generation of random numbers using matlab

I have a function that generates a matrix of normally distributed random numbers using normrnd.
values(vvvv)= normrnd(0,0.2);
The output from round 1 is:
ans =
0.0210 0.1445 0.5171 -0.1334 0.0375 -0.0165 Inf -0.3866 -0.0878 -0.3589
The output from round 2 is:
ans =
0.0667 0.0783 0.0903 -0.0261 0.0367 -0.0952 0.1724 -0.2723 Inf Inf
The output from round 3 is:
ans =
0.4047 -0.4517 0.4459 0.0675 0.2000 -0.3328 -0.1180 -0.0556 0.0845 Inf
the function will be repeated 20 times.
It is obvious that the function is completely random. What I seek is to add a condition.
What I need is: if any entry has a value between 0.2 and 0.3, that value will be fixed in the next rounds. Only the remaining entries will be subject to change by the random function.
I have found the rng(sd) which seeds the random number generator using the nonnegative integer sd so that rand, randi, and randn produce a predictable sequence of numbers.
How to set custom seed for pseudo-random number generator
but how can I make it affect only some entries of the matrix?
Another problem: it seems that rng is not available in MATLAB R2009.
How can I get something similar without getting into the complications of probability & statistics?
You can do this more directly than actually generating all these matrices, and it's pretty easy to do so, by thinking about the distribution of the final output.
The probability of a random variable distributed by N(0, .2) lying between .2 and .3 is p ~= .092.
Call the random variable of the final output of your matrix X, where you do this n (20) times. Then either (a) X lies between .2 and .3 and you stopped early, or (b) you didn't draw a number between .2 and .3 in the first n-1 draws and so you went with whatever you got on the nth draw.
The probability of (b) happening is just b=(1-p)^(n-1): the independent events of drawing outside [.2, .3], each of which has probability 1-p, happened n-1 times. Therefore the probability of (a) is 1-b.
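The two probabilities can be checked numerically. This is a small sketch using the parameters from the question; the helper Phi is an illustrative name, and it is written with erf so that the check runs without the Statistics Toolbox (normcdf would give the same values).

```matlab
mu = 0;  sigma = 0.2;              % parameters from the question
lower = 0.2;  upper = 0.3;         % interval in which a value gets fixed
n = 20;                            % number of rounds

% Normal cdf via erf (base MATLAB)
Phi = @(x) 0.5 * (1 + erf((x - mu) ./ (sigma * sqrt(2))));

p = Phi(upper) - Phi(lower);       % chance that one draw lands in [lower, upper]
b = (1 - p)^(n - 1);               % chance that none of the first n-1 draws does
fprintf('p = %.4f, b = %.4f, 1 - b = %.4f\n', p, b, 1 - b);
```

With these values b comes out at roughly 0.16, so most entries end up in case (a).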
If (b) happened, you just draw a number from normrnd. If (a) happened, you need the value of a normal variable, conditional on its being between .2 and .3. One way to do this is to find the cdf values for .2 and .3, draw uniformly from the range between there, and then use the inverse cdf to get back the original number.
Code that does this:
mu = 0;
sigma = .2;
upper = .3;
lower = .2;
n = 20;
sz = 15;
cdf_upper = normcdf(upper, mu, sigma);
cdf_lower = normcdf(lower, mu, sigma);
p = cdf_upper - cdf_lower;
b = (1-p) ^ (n - 1);
results = zeros(sz, sz);
mask = rand(sz, sz) > b; % mask value 1 means case (a), 0 means case (b)
num_a = sum(mask(:));
cdf_vals = rand(num_a, 1) * p + cdf_lower;
results(mask) = norminv(cdf_vals, mu, sigma);
results(~mask) = normrnd(mu, sigma, sz^2 - num_a, 1);
If you want to simulate this directly for some reason (which is going to involve a lot of wasted effort, but apparently you don't like "the complications of statistics" -- by the way, this is probability, not statistics), you can generate the first matrix and then replace only the elements that don't fall in your desired range. For example:
mu = 0;
sigma = .2;
n = 10;
m = 10;
num_runs = 20;
lower = .2;
upper = .3;
result = normrnd(mu, sigma, n, m);
for i = 1 : (num_runs - 1)
to_replace = (result < lower) | (result > upper);
result(to_replace) = normrnd(mu, sigma, sum(to_replace(:)), 1);
end
To demonstrate that these are the same, here's a plot of the empirical CDFs from doing this for 1x1 matrices 100,000 times. (That is, I ran both functions 100k times and saved the results, then used cdfplot to plot values on the x axis vs. the portion of the obtained values that are less than that on the y axis.)
They're identical. (Indeed, a K-S test for identity of distribution gives a p-value of .71.) But the direct way was a bunch faster to run.