Random number generation - code not working as it should - matlab

So I have to generate a random number (called 'p' here) between 0 and 90 whose frequency distribution is a cosine function (i.e., I should get more numbers between 0 and 45 than between 45 and 90).
I am working in MATLAB.
The code is as follows:
flag = 1;
while flag == 1
candidate = randi([0,90]);
if rand < cosd( candidate )
p = candidate;
flag = 2;
end
end
I am generating 20 such numbers, but most of them always end up at the higher end (45-90).
Of those 20 numbers, hardly 1 or 2 are below 45.
Is my code wrong?
EDIT: Okay, so I got the answer. I tried running the code separately as follows-
for i = 1:20
flag = 1;
while flag == 1
candidate = randi([0,90]);
if rand < cosd( candidate )
p = candidate;
flag = 2;
disp(p);
end
end
end
And I'm getting most of the values of p between 0 and 45. My original code had an external 'if' condition, which was the reason only higher values of 'p' were accepted. I used a while loop, and it took many more than 20 iterations to get 20 values of 'p'.
Here is my original code snippet:
while zz <=20
d = randi([0,359]);
flag = 1;
while flag == 1
c = randi([0,90]);
x = rand(1);
if x < cosd(c)
p = c;
flag = 2;
end
end
if 'external condition'
strike(zz) = d;
dip(zz) = p;
slip(zz) = round(i);
zz= zz+1;
end
end

If you just want the answer, read the last line. If you want to know why that answer is right, read the explanation.
Assume that you have a discrete distribution function like this:
f(0)=1;
f(1.5)=10;
f(4)=9;
So the cumulative function is:
F(0)=1;
F(1.5)=11;
F(4)=20;
Now we want a relative cumulative function. Since F(4)=20 (4 is the last item), we divide the cumulative function by 20. So it would be:
F'(0)=0.05
F'(1.5)=0.55
F'(4)=1.00
Now we generate a random number between 0 and 1. Each random number is treated as a value of F'(x); if F' does not take that value anywhere, we use the nearest larger value y for which F'(x)=y for some x, and return that x. For my example, based on the relative cumulative function:
If the random number is less than 0.05, our distribution-based random number is 0,
If the random number is between 0.05 and 0.55, our distribution-based random number is 1.5,
If it is more than 0.55, our distribution-based random number is 4.
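The lookup above can be sketched in a few lines of MATLAB. This is only an illustration of the three-value example; the variable names (vals, w, Fp, n) and the sample size are my own choices, not part of the question.

```matlab
vals = [0 1.5 4];            % the three values from the example
w    = [1 10 9];             % their unnormalized frequencies f
Fp   = cumsum(w)/sum(w);     % relative cumulative function: 0.05 0.55 1.00

n   = 1e5;                   % sample size (arbitrary)
r   = rand(n,1);
idx = zeros(n,1);
for k = 1:n
    % first entry of Fp that is >= r(k), i.e. the "nearest bigger" value
    idx(k) = find(r(k) <= Fp, 1, 'first');
end
samples = vals(idx);

% empirical frequencies, which should be close to 0.05, 0.50 and 0.45
freq = [mean(samples==0) mean(samples==1.5) mean(samples==4)];
disp(freq)
```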
We should do similar work with continuous distribution functions. The difference is that in the continuous world we use the integral instead of the cumulative sum. So for your question (with x in degrees) we have:
f(x)=cos(x) , 0<=x<=90
F(x)=sin(x)-sin(0)=sin(x) , 0<=x<=90
F'(x)=sin(x) , 0<=x<=90 (Because F(90)=1)
Now we generate a random number between 0 and 1 (like r). So we have:
F'(x)=r => sin(x)=r => x=arcsin(r)
Actually, you just need to generate a random number between 0 and 1 and calculate the arcsin of that (in degrees, since the angle runs from 0 to 90).
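As a quick check of that last line, here is a minimal sketch using asind, the degree version of arcsin. The sample size and the 45-degree split are assumptions chosen only to compare against the expected fraction sind(45), about 0.707. Unlike the randi-based loop in the question, this returns non-integer angles.

```matlab
n = 1e5;                 % sample size (arbitrary)
r = rand(n,1);           % uniform numbers between 0 and 1
p = asind(r);            % inverse of F'(x) = sind(x), result in degrees

fracLow = mean(p < 45);  % fraction of samples below 45 degrees
fprintf('fraction below 45: %.3f (expected %.3f)\n', fracLow, sind(45));
```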

Related

Find all numbers which are equal to the sum of the factorial of their digits

How can I find all numbers that are equal to the sum of the factorials of their digits
(e.g. 145 = 1! + 4! + 5! = 1 + 24 + 120 = 145) in MATLAB?
I want to chop off the digits, add the factorials of the digits together, and compare the sum with the original number. If the sum equals the original number, that number is one of the solutions and must be kept. I can't code my idea. How can I code it? Is this approach right?
Thanks
The main reason I am posting this answer is that I can't leave the use of eval in the previous answer without a decent alternative.
Here is a small function to check this for any given (integer) n:
isFact = @(n) n==sum(factorial(int2str(n)-'0'));
Explanation:
int2str(n)-'0': "chop off digits"
sum(factorial(...)): "add the factorial of the digits together"
n==...: "compare it with the original number"
You can now plug it in a loop to find all the numbers between 1 to maxInt:
maxInt = 100000; % just for the example
solution = false(1,maxInt); % preallocating memory
for k = 1:maxInt
solution(k) = isFact(k);
end
find(solution) % find all the TRUE indices
The result:
ans =
1 2 145 40585
The loop above was written to be simple. If you are looking for more efficiency and flexibility (such as not checking every number between 1 and maxInt, or checking an array of any shape), you can change it to:
% generating a set of random numbers with no repetitions:
Vec2Check = unique(randi(1000,1,1000)); % you can change that to any array
for k = 1:numel(Vec2Check)
if isFact(Vec2Check(k))
Vec2Check(k) = Vec2Check(k)+0.1;
end
end
solution = Vec2Check(Vec2Check>round(Vec2Check))-0.1
The addition of 0.1 serves as a 'flag' that marks the numbers for which isFact returns true. We then extract them by comparing the vector to its rounded version.
You can even go with a one-line solution:
solution = nonzeros(arrayfun(@(n) n.*(n==sum(factorial(int2str(n)-'0'))),Vec2Check))
The following snippet finds the numbers up to 1000 satisfying this condition.
numbers = [];
for i=1:1000
number_char = int2str(i);
sum = 0;
for j=1:length(number_char)
sum = sum+ factorial(eval(number_char(j)));
end
if (sum == i)
numbers(end+1) = i;
end
end
disp(numbers)
This should yield:
1 2 145
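Note that a number n has floor(log10(n))+1 digits, so the sum of the factorials of its digits is at most (floor(log10(n))+1)*9!. Once that bound is less than n, there is no number larger than n satisfying the condition.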
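To make that bound concrete, here is a sketch that first finds the largest digit count for which the bound can still be reached, and then checks every number up to the bound in a vectorized way. The names dmax, upper, f, s and m are mine, and peeling digits with mod/floor is a substitute for the int2str approach used above, not the method of either answer.

```matlab
% Largest digit count d for which d*9! can still reach a d-digit number
dmax = 1;
while dmax*factorial(9) >= 10^(dmax-1)
    dmax = dmax + 1;
end
dmax  = dmax - 1;               % 7 digits
upper = dmax*factorial(9);      % 7*9! = 2540160, nothing larger can qualify

n = (1:upper)';                 % every candidate
f = factorial(0:9)';            % f(d+1) holds d!
s = zeros(size(n));             % running sum of digit factorials
m = n;
while any(m > 0)
    d = mod(m,10);              % current last digit
    s = s + f(d+1).*(m > 0);    % skip numbers whose digits are used up
    m = floor(m/10);
end
solutions = n(s == n)'
```

Because the search covers everything up to the bound, the numbers it lists are all the solutions there are.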

How can I generate a distribution whose values are distributed according to an input array?

I have written the following script which should create an array which contains the probability mass of every number from 1 to N, defined following the robust soliton distribution. The values of delta, N and M are completely arbitrary.
N = 300; % length of the the array
in = [1:1:N]; % index array
delta = 0.5;
M = 70;
R = N/M;
t(1:M-1) = 1./(in(1:M-1)*M);
t(M) = log(R/delta)/M;
t(M+1:N) = 0;
What I'm trying to do now is using the arrays in and t in order to "generate" in some way a pdf which returns the numbers in the array in with the probability contained in the array t. I have already looked in the manual and I found the makedist function, but I didn't find an option which allowed me to use as arguments two input arrays. I don't really know where to look.
The numbers generated should be used to encode packets using LT codes (for didactic purposes, I'm just trying to understand how to build them).
It sounds like you would like to be able to randomly pick numbers i from 1:N with probabilities proportional to the value t(i).
First, let's restructure the unnormalized probabilities into an array that lists the ranges of each value; I.e.
t-> p {0,0.01,0.05,0.09, etc} I just used random values here.
Then what we can do is randomly pick a number from 0 to 1, and find the value of i associated with that random number. I.e. if we get 0.07, then the value of i would be 3 in my example because 0.07 is between 0.05 and 0.09 and the value i=3 has a 4% probability of being picked;
s = sum(t);
p = zeros(N,1);   % p(1) = 0 is the lower edge of the first range
for i = 2:N
    % lower edge of range i = lower edge of range i-1 + probability of i-1
    p(i) = p(i-1) + t(i-1)/s;
end
Now whenever we need a number from the distribution, we can use MATLAB's built-in find function
r = rand();
i = max(find(r-p>0)) % this could probably be optimized
What this does, by example, using the same r and p as above:
r-p = {0.07, 0.06, 0.02, -0.02, etc}
find(r-p>0) = {1,2,3}
so max(find(r-p>0)) = 3.
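Putting the pieces together for the arrays from the question, a minimal sketch could look like this. It uses cumsum to build the upper edges of the ranges instead of the loop above; c, nDraws and draws are names I made up for the illustration.

```matlab
% Setup copied from the question
N = 300; M = 70; delta = 0.5; R = N/M;
in = 1:N;
t = zeros(1,N);
t(1:M-1) = 1./(in(1:M-1)*M);
t(M) = log(R/delta)/M;

% Upper edge of each range; dividing by the last entry makes c(end) exactly 1
c = cumsum(t);
c = c/c(end);

nDraws = 1e4;                     % number of samples (arbitrary)
draws = zeros(nDraws,1);
for k = 1:nDraws
    % first range whose upper edge lies above the random number
    draws(k) = find(rand() < c, 1, 'first');
end
values = in(draws);               % the sampled entries of the index array
```

Entries with t(i) = 0 have zero-width ranges, so they are never picked.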

Bernoulli, geometric simulation in MATLAB

I am trying to do a simple Bernoulli simulation and also a simple geometric simulation in MATLAB, and since I am new to MATLAB it seems a bit difficult.
I have been using this to understand it better: http://www.academia.edu/1722549/Useful_distributions_using_MATLAB
but I haven't been able to make a good simulation so far. Can someone help me or show me a good tutorial? Thank you.
NEW EDIT:
This is my own answer that I tried to come up with. Is it correct?
If we want to simulate Bernoulli distribution in Matlab, we can simply use random number generator rand to simulate a Bernoulli experiment. In this case we try to simulate tossing a coin 4 times with p = 0.5:
>> p = 0.5;
>> rand(1,4) < p
ans =
1 1 1 0
The function rand returns values uniformly distributed between 0 and 1. With "<", every value that is less than 0.5 is a success and therefore prints as 1; every value equal to or greater than 0.5 is a failure and therefore prints as 0.
Our ans is 1 1 1 0, which means that 3 times we had a value less than 0.5 and 1 time we had a value greater than or equal to 0.5.
rand(1,n) < p gives the outcomes of n Bernoulli trials as a logical vector (1 for success); summing that vector gives the number of successes. Alternatively, you can use the binornd(N,p) function in MATLAB with N=1 to simulate a single Bernoulli trial. One small caveat is that rand(1,n) < p is considerably faster than binornd.
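A short sketch of that idea, assuming a fair coin and an arbitrary number of trials (the names x, nSuccess and phat are mine):

```matlab
p = 0.5;               % success probability
n = 1e5;               % number of Bernoulli trials (arbitrary)

x = rand(1,n) < p;     % logical vector, 1 = success, 0 = failure
nSuccess = sum(x);     % number of successes in the n trials
phat = nSuccess/n;     % empirical success rate, should be close to p

fprintf('successes: %d, empirical p: %.3f\n', nSuccess, phat);
```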
From Wikipedia and your link, you can answer the question on your own:
The Binomial distribution is the discrete probability distribution of the number of successes (k) in a sequence of n independent yes/no experiments. The Bernoulli distribution is a special case of the Binomial distribution where n=1.
function pdf = binopdf_(k,n,p)
m = 10000;
idx = 0;
for ii=1:m
idx = idx + double(nnz(rand(n,1) < p)==k);
end
pdf = idx/m;
end
For example, if I toss a fair coin (p=0.5) 20 times, how many tails will I get?
k = 0:20;
y_pdf = binopdf_(k,20,0.5);
y_cdf = cumsum(y_pdf);
figure;
subplot(1,2,1);
stem(k,y_pdf);
title('PDF');
subplot(1,2,2);
stairs(k,y_cdf);
axis([0 20 0 1]);
title('CDF');
Looking at the PDF, the most likely number of tails is 10, which is also the mean (n*p = 20*0.5).
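For comparison with the simulated values, the exact binomial probabilities can be computed from the standard formula nchoosek(n,k)*p^k*(1-p)^(n-k). This is a sketch with my own variable names, not part of the function above:

```matlab
n = 20; p = 0.5;
k = 0:n;

% exact probability of k successes in n trials
pmf = arrayfun(@(kk) nchoosek(n,kk), k) .* p.^k .* (1-p).^(n-k);

mu = sum(k.*pmf);           % mean number of successes, equals n*p
[~, imax] = max(pmf);
mostLikely = k(imax);       % value of k with the highest probability
fprintf('mean = %.2f, most likely k = %d\n', mu, mostLikely);
```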
The geometric distribution is the probability distribution of the number X of Bernoulli trials needed to get one success.
function pdf = geopdf_(k,p)
% Monte Carlo estimate of P(X = k), where X is the number of Bernoulli
% trials needed to get the first success.
m = 10000;
x = zeros(m,1);
for ii=1:m
    n = 1;
    while rand >= p   % keep tossing until the first success
        n = n + 1;
    end
    x(ii) = n;
end
pdf = zeros(size(k));
for jj=1:numel(k)
    pdf(jj) = nnz(x == k(jj))/m;
end
end
For example, how many times do we have to toss a fair coin (p=0.5) to get one tail?
k = 0:20;
y_pdf = geopdf_(k,0.5);
y_cdf = cumsum(y_pdf);
figure;
subplot(1,2,1);
stem(k,y_pdf)
title('PDF');
subplot(1,2,2);
stairs(k,y_cdf);
axis([0 20 0 1]);
title('CDF');
Looking at the plots, the PDF gives a probability of about 0.5 of getting a tail on the first trial, and the CDF gives a probability of about 0.75 of getting a tail within the first two trials, etc.
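Those numbers can be checked against the exact geometric probabilities, P(X = k) = (1-p)^(k-1)*p for k = 1, 2, ... A small sketch (variable names are mine):

```matlab
p = 0.5;
k = 1:20;

pmf = (1-p).^(k-1) * p;   % probability that the first success is on trial k
cdf = cumsum(pmf);        % probability of a success within the first k trials

fprintf('P(X=1) = %.2f, P(X<=2) = %.2f\n', pmf(1), cdf(2));
```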

project euler 23 MATLAB

I'm slowly working my way through problem 23 in Project Euler but I've run into a snag. Problem #23 involves finding the sum of all numbers that cannot be written as the sum of two abundant numbers.
First here's my code:
function [divisors] = SOEdivisors(num)
%SOEDIVISORS This function finds the proper divisors of a number using the sieve
%of eratosthenes
%check for primality
if isprime(num) == 1
divisors = [1];
%if not prime find divisors
else
divisors = [0 2:num/2]; %hard code a zero at one.
for i = 2:num/2
if divisors(i) %if divisors i ~= 0
%if the remainder is not zero it is not a divisor
if rem(num, divisors(i)) ~= 0
%remove that number and all its multiples from the list
divisors(i:i:fix(num/2)) = 0;
end
end
end
%add 1 back and remove all zeros
divisors(1) = 1;
divisors = divisors(divisors ~= 0);
end
end
This function finds abundant numbers
function [abundantvecfinal] = abundantnum(limitnum)
%ABUNDANTNUM creates a vector of abundant numbers up to a limit.
abundantvec = [];
%abundant number count
count = 1;
%test for abundance
for i = 1:limitnum
%find divisors
divisors = SOEdivisors(i);
%if sum of divisors is greater than number it is abundant, add it to
%vector
if i < sum(divisors)
abundantvec(count) = i;
count = count + 1;
end
end
abundantvecfinal = abundantvec;
end
And this is the main script
%This finds the sum of all numbers that cannot be written as the sum of two
%abundant numbers and under 28123
%get abundant numbers
abundant = abundantnum(28153);
%total non abundant numbers
total = 0;
%sums
sums = [];
%count moves through the newsums vector allowing for a new space for each
%new sum
count = 1;
%get complete list of possible sums under using abundant numbers under
%28123 then store them in a vector
for i = 1:length(abundant)
for x = 1:length(abundant)
%make sure it is not the same number being added to itself
if i ~= x
sums(count) = abundant(i) + abundant(x);
count = count + 1;
end
end
end
%setdiff function compares two vectors and removes all similar elements
total = sum(setdiff(1:28153, sums));
disp(total)
The first problem is that it gives me the wrong answer. I know that I'm getting the correct proper divisors and the correct abundant numbers, so the problem probably lies in the main script, almost certainly in the creation of the abundant sums. I was hoping someone might be able to find an error I haven't been able to.
Beyond that, the code is slow due to the multiple for loops, so I'm also looking for ways to do problems like this more efficiently.
Thanks!
Well, I don't have enough reputation to just comment. Why are you ruling out adding the same number to itself? The problem statement gives the example 12+12=24.
I also don't see a reason that x should ever be less than i. You don't need to sum the same two numbers twice.
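A sketch of the main script with both changes applied: x starts at i, and a number may be added to itself. To keep it self-contained and faster, it replaces the two helper functions from the question with a divisor-sum sieve and marks sums in a logical array instead of growing a vector; those choices and the names divsum and isSum are mine.

```matlab
limit = 28123;

% Sum of proper divisors of every n up to limit:
% each d is added to all of its multiples 2d, 3d, ...
divsum = zeros(1, limit);
for d = 1:floor(limit/2)
    idx = 2*d:d:limit;
    divsum(idx) = divsum(idx) + d;
end
abundant = find(divsum > (1:limit));

% Mark every n <= limit that is the sum of two abundant numbers
isSum = false(1, limit);
for i = 1:numel(abundant)
    s = abundant(i) + abundant(i:end);   % x starts at i, so 12+12 is included
    isSum(s(s <= limit)) = true;
end

total = sum(find(~isSum));
disp(total)
```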

Matlab: Cannot plot timeseries with repeated x values. How to get rid of repeated rows?

so I have a matrix Data in this format:
Data = [Date Time Price]
Now what I want to do is plot the Price against the Time, but my data is very large and has lines where there are multiple Prices for the same Date/Time, e.g. 1st, 2nd lines
29 733575.459548611 40.0500000000000
29 733575.459548611 40.0600000000000
29 733575.459548612 40.1200000000000
29 733575.45954862 40.0500000000000
I want to take an average of the prices with the same Date/Time and get rid of any extra lines. My goal is to do linear interpolation on the values, which is why I must have only one Price value per Time.
How can I do this? I did this (it reduces the matrix so that it keeps only the first line of each group of lines with repeated date/times), but I don't know how to take the average.
function [ C ] = test( DN )
[Qrows, cols] = size(DN);
C = DN(1,:);
for i = 1:(Qrows-1)
if DN(i,2) == DN(i+1,2)
%n = 1;
%while DN(i,2) == DN(i+n,2) && i+n<Qrows
% n = n + 1;
%end
% somehow take average;
else
C = [C;DN(i+1,:)];
end
end
[C,ia,ic] = unique(A,'rows') also returns index vectors ia and ic
such that C = A(ia,:) and A = C(ic,:)
If you pass as input A only the columns you do not want to average over (here: date and time), ic holds one value for every row, and the rows you want to combine share the same value.
Getting from there to the means you want is, for MATLAB beginners, probably more intuitive with a for loop: with logical indexing, e.g. DN(ic==n,3), you get a vector of all values you want to average (where n is the index of the date-time row they belong to). You need to do this for every distinct date-time combination.
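A sketch of that for-loop version, run on a small made-up matrix in the same [Date Time Price] layout (the data and the name Price are only for illustration):

```matlab
% Made-up sample data: [Date Time Price]
DN = [1 0.1 23;
      1 0.2 47;
      1 0.1 42;
      1 0.1 23];

% One row per distinct date-time pair; ic maps every row of DN to its group
[DateAndTime,~,ic] = unique(DN(:,1:2),'rows');

Price = zeros(size(DateAndTime,1),1);
for n = 1:size(DateAndTime,1)
    Price(n) = mean(DN(ic==n,3));   % average all prices of group n
end

C = [DateAndTime Price]
```

Here the group (1, 0.1) averages 23, 42 and 23, and the group (1, 0.2) keeps its single price of 47.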
A more vector-oriented way would be to use accumarray, which leads to a solution of your problem in two lines:
[DateAndTime,~,idx] = unique(DN(:,1:2),'rows');
Price = accumarray(idx,DN(:,3),[],@mean);
I'm not quite sure what you want the result to look like, but [DateAndTime Price] gives you the three-column format of the input again.
Note that if your input contains something like:
1 0.1 23
1 0.2 47
1 0.1 42
1 0.1 23
then applying unique(...,'rows') to the input before the above lines will give a different result for 1 0.1 than using the above directly: the latter calculates the mean of 23, 23 and 42, while in the former case one 23 is eliminated as a duplicate first, so the differing row with 42 gets a greater weight in the average.
Try the following:
[Qrows, cols] = size(DN);
% C is your result matrix
C = DN;
% this will give you the indexes where DN(i,2)==DN(i+1,2)
i = find(diff(DN(:,2))==0);
% replace C(i,:) with the average
C(i,:) = (DN(i,:)+DN(i+1,:))/2;
% delete the C(i+1,:) rows
C(i+1,:) = [];
Hope this works.
This should work if the repeated time values come in pairs (the average is calculated between i and i+1). Should you have time repeats of 3 or more then try to rethink how to change these steps.
Something like this would work, but I did not run the code so I can't promise there's no bugs.
newX = unique(DN(:,2));
newY = zeros(1,length(newX));
for ix = 1:length(newX)
    % all rows whose time equals the current unique time
    allOcurrences = find(DN(:,2)==newX(ix));
    % If there's duplicates, take their mean
    if numel(allOcurrences)>1
        newY(ix) = mean(DN(allOcurrences,3));
    else
        % If not, use the only Y value
        newY(ix) = DN(allOcurrences,3);
    end
end