My function called RollDice simulates the rolling of a given number of six sided dice a given number of times. The function has two input arguments, the number of dice (NumDice) that will be rolled in each experiment and the total number (NumRolls) of times that the dice will be rolled. The output of the function will be a vector SumDice of length NumRolls that contains the sum of the dice values in each experiment.
This is my code right now: how do I account for the SUM of the dice? Thanks!
function SumDice= RollDice(NumDice,NumRolls)
FACES= 6;
maxOut= FACES*NumDice;
count= zeros(1,maxOut);
for i = 1:NumRolls
outcome= 0;
for k= 1:NumDice
outcome= outcome + ceil(ranNumDice(1)*FACES);
end
count(outcome)= count(outcome) + 1;
end
bar(NumDice:maxOut, count(NumDice:length(count)));
message= sprintf('% NumDice rolls of % NumDice fair dice', NumRolls, NumDice);
title(message);
xlabel('sum of dice values'); ylabel('Count');
This is a simple and neat little problem (+1) and I enjoyed looking into it :-)
There were quite a few areas where I felt I could improve on your function. Rather than going through them one by one, I thought I'd just re-write the function how I'd do it, and we could go from there. I've written it as a script, but it can be turned into a function easily enough. Finally, I also generalized it a bit by allowing for the dice to have any number of faces (6 or otherwise). So, here it is:
%#Define the parameters
NumDice = 2;
NumFace = 6;
NumRoll = 50;
%#Generate the rolls and obtain the sum of the rolls
AllRoll = randi(NumFace, NumRoll, NumDice);
SumRoll = sum(AllRoll, 2);
%#Determine the bins for the histogram
Bins = (NumDice:NumFace * NumDice)';
%#Build the histogram
hist(SumRoll, Bins);
title(sprintf('Histogram generated from %d rolls of %d %d-sided dice', NumRoll, NumDice, NumFace));
xlabel(sprintf('Sum of %d dice', NumDice));
ylabel('Count');
I suggest you have a close look at my code and the documentation for each function I've used. The exercise may prove useful for you when tackling other problems in Matlab in the future. Once you've done that, if there is anything you don't understand, then please let me know in a comment and I'll try to help. Cheers!
ps, If you don't ever need to refer to the individual rolls again, you can of course convert the AllRoll and SumRoll line into a one-liner, ie: SumRoll = sum(randi(NumFace, NumRoll, NumDice), 2);. I think the two-liner is more readable personally, and I doubt it will make much difference to the efficiency of the code.
Related
I have to construct the following function in MATLAB and am having trouble.
Consider the function s(t) defined for t in [0,4) by
{ sin(pi*t/2) , for t in [0,1)
s(t) = { -(t-2)^3 , for t in [1,3)*
{ sin(pi*t/2) , for t in [3,4)
(i) Generate a column vector s consisting of 512 uniform
samples of this function over the interval [0,4). (This
is best done by concatenating three vectors.)
I know it has to be something of the form.
N = 512;
s = sin(5*t/N).' ;
But I need s to be the piecewise function, can someone provide assistance with this?
If I understand correctly, you're trying to create 3 vectors which calculate the specific function outputs for all t, then take slices of each and concatenate them depending on the actual value of t. This is inefficient as you're initialising 3 times as many vectors as you actually want (memory), and also making 3 times as many calculations (CPU), most of which will just be thrown away. To top it off, it'll be a bit tricky to use concatenate if your t is ever not as you expect (i.e. monotonically increasing). It might be an unlikely situation, but better to be general.
Here are two alternatives, the first is imho the nice Matlab way, the second is the more conventional way (you might be more used to that if you're coming from C++ or something, I was for a long time).
function example()
t = linspace(0,4,513); % generate your time-trajectory
t = t(1:end-1); % exclude final value which is 4
tic
traj1 = myFunc(t);
toc
tic
traj2 = classicStyle(t);
toc
end
function trajectory = myFunc(t)
trajectory = zeros(size(t)); % since you know the size of your output, generate it at the beginning. More efficient than dynamically growing this.
% you could put an assert for t>0 and t<3, otherwise you could end up with 0s wherever t is outside your expected range
% find the indices for each piecewise segment you care about
idx1 = find(t<1);
idx2 = find(t>=1 & t<3);
idx3 = find(t>=3 & t<4);
% now calculate each entry apprioriately
trajectory(idx1) = sin(pi.*t(idx1)./2);
trajectory(idx2) = -(t(idx2)-2).^3;
trajectory(idx3) = sin(pi.*t(idx3)./2);
end
function trajectory = classicStyle(t)
trajectory = zeros(size(t));
% conventional way: loop over each t, and differentiate with if-else
% works, but a lot more code and ugly
for i=1:numel(t)
if t(i)<1
trajectory(i) = sin(pi*t(i)/2);
elseif t(i)>=1 & t(i)<3
trajectory(i) = -(t(i)-2)^3;
elseif t(i)>=3 & t(i)<4
trajectory(i) = sin(pi*t(i)/2);
else
error('t is beyond bounds!')
end
end
end
Note that when I tried it, the 'conventional way' is sometimes faster for the sampling size you're working on, although the first way (myFunc) is definitely faster as you scale up really a lot. In anycase I recommend the first approach, as it is much easier to read.
I have tried to implement the algorithm described in here to find primitive roots for a prime number.
It works for small prime numbers, however as I try big numbers, it doesn't return correct answers anymore.
I then notice that a^(p-1)/pi tends to be a big number, it returns inf in MATLAB, so I thought factorizing (p-1) could help, but I am failing to see how.
I wrote a small piece of code in MATLABand here it is.
clear all
clc
%works with prime =23,31,37,etc.
prime=761; %doesn't work for this value
F=factor(prime-1); % the factors of prime-1
for i = 2: prime-1
a=i;
tag =1;
for j= 1 :prime-1
if (isprime(j))
p_i = j;
if(mod(a^((prime-1)/p_i),prime)== 1)
tag=0;
break
else
tag = tag +1;
end
end
end
if (tag > 1 )
a %it should only print the primitive root
break
end
end
Any input is welcome.
Thanks
What Matlab does in this case is it calculates a^((p-1)/p) before taking the modulus. As a^((p-1)/p) quite quickly becomes too large to handle, Matlab seems to resolve this by turning it into a floating point number, losing some resolution and yielding the wrong result when you take the modulus.
As mentioned by #rayreng, you could use an arbitrary precision toolbox to resolve this.
Alternatively, you could split the exponentiation into parts, taking the modulus at each stage. This should be faster, as it is less memory intensive. You could dump this in a function and just call that.
% Calculates a^b mod c
i = 0;
result = 1;
while i < b
result = mod(result*a, c);
i = i + 1;
end
I have a matrix time-series data for 8 variables with about 2500 points (~10 years of mon-fri) and would like to calculate the mean, variance, skewness and kurtosis on a 'moving average' basis.
Lets say frames = [100 252 504 756] - I would like calculate the four functions above on over each of the (time-)frames, on a daily basis - so the return for day 300 in the case with 100 day-frame, would be [mean variance skewness kurtosis] from the period day201-day300 (100 days in total)... and so on.
I know this means I would get an array output, and the the first frame number of days would be NaNs, but I can't figure out the required indexing to get this done...
This is an interesting question because I think the optimal solution is different for the mean than it is for the other sample statistics.
I've provided a simulation example below that you can work through.
First, choose some arbitrary parameters and simulate some data:
%#Set some arbitrary parameters
T = 100; N = 5;
WindowLength = 10;
%#Simulate some data
X = randn(T, N);
For the mean, use filter to obtain a moving average:
MeanMA = filter(ones(1, WindowLength) / WindowLength, 1, X);
MeanMA(1:WindowLength-1, :) = nan;
I had originally thought to solve this problem using conv as follows:
MeanMA = nan(T, N);
for n = 1:N
MeanMA(WindowLength:T, n) = conv(X(:, n), ones(WindowLength, 1), 'valid');
end
MeanMA = (1/WindowLength) * MeanMA;
But as #PhilGoddard pointed out in the comments, the filter approach avoids the need for the loop.
Also note that I've chosen to make the dates in the output matrix correspond to the dates in X so in later work you can use the same subscripts for both. Thus, the first WindowLength-1 observations in MeanMA will be nan.
For the variance, I can't see how to use either filter or conv or even a running sum to make things more efficient, so instead I perform the calculation manually at each iteration:
VarianceMA = nan(T, N);
for t = WindowLength:T
VarianceMA(t, :) = var(X(t-WindowLength+1:t, :));
end
We could speed things up slightly by exploiting the fact that we have already calculated the mean moving average. Simply replace the within loop line in the above with:
VarianceMA(t, :) = (1/(WindowLength-1)) * sum((bsxfun(#minus, X(t-WindowLength+1:t, :), MeanMA(t, :))).^2);
However, I doubt this will make much difference.
If anyone else can see a clever way to use filter or conv to get the moving window variance I'd be very interested to see it.
I leave the case of skewness and kurtosis to the OP, since they are essentially just the same as the variance example, but with the appropriate function.
A final point: if you were converting the above into a general function, you could pass in an anonymous function as one of the arguments, then you would have a moving average routine that works for arbitrary choice of transformations.
Final, final point: For a sequence of window lengths, simply loop over the entire code block for each window length.
I have managed to produce a solution, which only uses basic functions within MATLAB and can also be expanded to include other functions, (for finance: e.g. a moving Sharpe Ratio, or a moving Sortino Ratio). The code below shows this and contains hopefully sufficient commentary.
I am using a time series of Hedge Fund data, with ca. 10 years worth of daily returns (which were checked to be stationary - not shown in the code). Unfortunately I haven't got the corresponding dates in the example so the x-axis in the plots would be 'no. of days'.
% start by importing the data you need - here it is a selection out of an
% excel spreadsheet
returnsHF = xlsread('HFRXIndices_Final.xlsx','EquityHedgeMarketNeutral','D1:D2742');
% two years to be used for the moving average. (250 business days in one year)
window = 500;
% create zero-matrices to fill with the MA values at each point in time.
mean_avg = zeros(length(returnsHF)-window,1);
st_dev = zeros(length(returnsHF)-window,1);
skew = zeros(length(returnsHF)-window,1);
kurt = zeros(length(returnsHF)-window,1);
% Now work through the time-series with each of the functions (one can add
% any other functions required), assinging the values to the zero-matrices
for count = window:length(returnsHF)
% This is the most tricky part of the script, the indexing in this section
% The TwoYearReturn is what is shifted along one period at a time with the
% for-loop.
TwoYearReturn = returnsHF(count-window+1:count);
mean_avg(count-window+1) = mean(TwoYearReturn);
st_dev(count-window+1) = std(TwoYearReturn);
skew(count-window+1) = skewness(TwoYearReturn);
kurt(count-window +1) = kurtosis(TwoYearReturn);
end
% Plot the MAs
subplot(4,1,1), plot(mean_avg)
title('2yr mean')
subplot(4,1,2), plot(st_dev)
title('2yr stdv')
subplot(4,1,3), plot(skew)
title('2yr skewness')
subplot(4,1,4), plot(kurt)
title('2yr kurtosis')
I have to write some code in Matlab that simulates tossing a coin 150 times. I have to count how many times the coin lands on heads and create a vector that gives a running percentage of the heads.
Then I have to make a table of the number of trials, random 'flips", and the running percentages of heads. I assume random "flips" means heads or tails for that trial.
I also have to create a line graph with trials on the x-axis and probabilities (percentages) on the y-axis. I'm assuming the percentages are just the percentage of getting heads.
Sorry if this post was long. I figure giving the details now will make it easier to see what I was trying to do with the code. I didn't create the table or plot yet because I'm not even sure how to code for the actual problem.
NUM_TRIALS = 150;
trials = 1:NUM_TRIALS;
heads = 0;
t = rand(NUM_TRIALS,1);
percent_h = zeros(size(t));
for i = trials
if (t(i) < 0.5)
heads = heads + 1;
percent_h = heads./trials;
end
end
flips = t;
disp('Number of Trials, Random flips, Heads Percentage')
disp([trials', flips, percent_h'])
plot(trials,percent_h)
title('Trial Number vs. Percent Heads')
xlabel('Trial number')
ylabel('Percent Heads')
Your code is actually pretty close to answering your question, but there are a few issues that I see.
You should index t by the current trial number.
Likewise, percent_h should be indexed accordingly. This should be pre-allocated as well.
Not sure what z is supposed to represent...
To make the plot, just use plot. xlabel will give a label to the x axis, ylabel to the y axis. title will give a name to the plot.
You should divide by i, not trials.
So, your code should look something like this. There's a fair number of ways to simplify it, but I'll preserve your code as much as possible.
NUM_TRIALS = 150;
trials = 1:NUM_TRIALS;
heads = 0;
t = rand(NUM_TRIALS,1);
percent_h=zeros(size(t));
for i = trials
if (t(i) < 0.5)
heads = heads + 1;
end
percent_h(i) = heads/i;
end
plot(trials,percent_h)
xlabel('Trial Number')
ylabel('Percent Heads')
title ('Trial Number vs Percent Heads')
You can actually solve this more simply by taking advantage of a few other MATLAB functions, as hinted at by #PearsonArtPhoto. Firstly, you can use RANDI to generate the coin tosses as ones for a head. Then, you can use CUMSUM to get the cumulative number of heads. Dividing this element wise by 1:n gives you the cumulative fraction of heads.
n=150;
ishead = randi([0,1],1,n);
plot(cumsum(ishead)./(1:n));
Does anyone know how to make the following Matlab code approximate the exponential function more accurately when dealing with large and negative real numbers?
For example when x = 1, the code works well, when x = -100, it returns an answer of 8.7364e+31 when it should be closer to 3.7201e-44.
The code is as follows:
s=1
a=1;
y=1;
for k=1:40
a=a/k;
y=y*x;
s=s+a*y;
end
s
Any assistance is appreciated, cheers.
EDIT:
Ok so the question is as follows:
Which mathematical function does this code approximate? (I say the exponential function.) Does it work when x = 1? (Yes.) Unfortunately, using this when x = -100 produces the answer s = 8.7364e+31. Your colleague believes that there is a silly bug in the program, and asks for your assistance. Explain the behaviour carefully and give a simple fix which produces a better result. [You must suggest a modification to the above code, or it's use. You must also check your simple fix works.]
So I somewhat understand that the problem surrounds large numbers when there is 16 (or more) orders of magnitude between terms, precision is lost, but the solution eludes me.
Thanks
EDIT:
So in the end I went with this:
s = 1;
x = -100;
a = 1;
y = 1;
x1 = 1;
for k=1:40
x1 = x/10;
a = a/k;
y = y*x1;
s = s + a*y;
end
s = s^10;
s
Not sure if it's completely correct but it returns some good approximations.
exp(-100) = 3.720075976020836e-044
s = 3.722053303838800e-044
After further analysis (and unfortunately submitting the assignment), I realised increasing the number of iterations, and thus increasing terms, further improves efficiency. In fact the following was even more efficient:
s = 1;
x = -100;
a = 1;
y = 1;
x1 = 1;
for k=1:200
x1 = x/200;
a = a/k;
y = y*x1;
s = s + a*y;
end
s = s^200;
s
Which gives:
exp(-100) = 3.720075976020836e-044
s = 3.720075976020701e-044
As John points out in a comment, you have an error inside the loop. The y = y*k line does not do what you need. Look more carefully at the terms in the series for exp(x).
Anyway, I assume this is why you have been given this homework assignment, to learn that series like this don't converge very well for large values. Instead, you should consider how to do range reduction.
For example, can you use the identity
exp(x+y) = exp(x)*exp(y)
to your advantage? Suppose you store the value of exp(1) = 2.7182818284590452353...
Now, if I were to ask you to compute the value of exp(1.3), how would you use the above information?
exp(1.3) = exp(1)*exp(0.3)
But we KNOW the value of exp(1) already. In fact, with a little thought, this will let you reduce the range for an exponential down to needing the series to converge rapidly only for abs(x) <= 0.5.
Edit: There is a second way one can do range reduction using a variation of the same identity.
exp(x) = exp(x/2)*exp(x/2) = exp(x/2)^2
Thus, suppose you wish to compute the exponential of large number, perhaps 12.8. Getting this to converge acceptably fast will take many terms in the simple series, and there will be a great deal of subtractive cancellation happening, so you won't get good accuracy anyway. However, if we recognize that
12.8 = 2*6.4 = 2*2*3.2 = ... = 16*0.8
then IF you could efficiently compute the exponential of 0.8, then the desired value is easy to recover, perhaps by repeated squaring.
exp(12.8)
ans =
362217.449611248
a = exp(0.8)
a =
2.22554092849247
a = a*a;
a = a*a;
a = a*a;
a = a*a
362217.449611249
exp(0.8)^16
ans =
362217.449611249
Note that WHENEVER you do range reduction using methods like this, while you may incur numerical problems due to the additional computations necessary, you will usually come out way ahead due to the greatly enhanced convergence of your series.
Why do you think that's the wrong answer? Look at the last term of that sequence, and it's size, and tell me why you expect you should have an answer that's close to 0.
My original answer stated that roundoff error was the problem. That will be a problem with this basic approach, but why do you think 40 is enough terms for the appropriate mathematical ( as opposed to computer floating point arithmetic) answer.
100^40 / 40! ~= 10^31.
Woodchip has the right idea with range reduction. That's the typical approach people use to implement these kinds of functions very quickly. Once you get that all figured out, you deal with roundoff errors of alternating sequences, by summing adjacent terms within the loop, and stepping with k = 1 : 2 : 40 (for instance). That doesn't work here until you use woodchips's idea because for x = -100, the summands grow for a very long time. You need |x| < 1 to guarantee intermediate terms are shrinking, and thus a rewrite will work.