If I flip a coin 100 times, what is the probability that exactly 50 will be heads? My idea is to repeat the 100-flip experiment 1000 times, count how many of those repetitions give exactly 50 heads, and divide that count by 1000, the number of events.
I have to model this experiment in Matlab.
I understand that flipping a coin 100 times and retrieving the number of heads and adding a count to the number of exactly 50 heads is one event. But I do not know how to repeat that event 1000, or 10000 times.
Here is the code I have written so far:
total_flips=100;
heads=0;
tails=0;
n=0;
for z=1:1000
%tosses 100 coins
for r=1:100
%randomizes to choose 1 or 0, 0 being heads
coin=floor(2*rand(1));
if (coin==0)
heads=heads+1;
else
tails=tails+1;
end
end
if heads==50
n=n+1;
end
end
I have tried to encompass the for loop and the if statement within a for loop, but had no luck. How do I repeat it?
Although your problem is solved, here are some comments on your code:
1) You set the variable total_flips=100, but you do not use it in your inner for-loop, which runs from 1 to 100. It could run from 1 to total_flips instead.
2) Omitting for-loops: although this was not your question, your code can be optimized. You do not need a single for-loop for your problem:
repetitions = 1000;
total_flips = 100;
coin_flip_matrix = floor(2*rand(total_flips, repetitions)); % all coin flips: one column per repetition
num_of_heads = sum(coin_flip_matrix); % number of heads for each repetition (shaped: 1 x repetitions)
n = sum(num_of_heads == 50) % how often did we hit 50?
You don't need tails at all, and you need to set heads back to zero inside the outer for z=1:1000 loop.
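A minimal sketch of the loop with that fix applied, assuming the same 100 flips and 1000 repetitions as in the question. The names repetitions, estimate and exact are introduced here for illustration only, and the closed-form binomial value is added as a sanity check; it is not part of the original code.

```matlab
repetitions = 1000;
total_flips = 100;
n = 0;                          % number of repetitions with exactly 50 heads
for z = 1:repetitions
    heads = 0;                  % reset the counter for every repetition
    for r = 1:total_flips
        if rand < 0.5           % treat a draw below 0.5 as heads
            heads = heads + 1;
        end
    end
    if heads == 50
        n = n + 1;
    end
end
estimate = n / repetitions;     % simulated probability of exactly 50 heads

% Exact binomial probability C(100,50) / 2^100, computed through gammaln
% so that the large binomial coefficient is never formed explicitly.
exact = exp(gammaln(101) - 2*gammaln(51) - 100*log(2));
fprintf('simulated: %.4f, exact: %.4f\n', estimate, exact);
```

The exact value is roughly 0.0796, so with 1000 repetitions n should land somewhere around 80.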
I need to calculate Spearman's rank correlation (using the corr function) for pairs of vectors of different lengths (from 5-element to 20-element vectors, for example). There are usually more than 300 pairs for each length, and I track the progress with waitbar. I have noticed that 9-element vectors take an unusually long time, whereas other lengths (both greater and smaller) are processed very quickly. Since the formula is exactly the same, the problem must originate in the MATLAB function corr.
I wrote the following code to verify that the problem lies with the corr function and not with my other calculations, all of which (including corr) take place inside two or three nested for loops. The code repeats the timing 50 times to rule out accidental results.
The result is a bar graph, confirming that it takes a long time for MATLAB to calculate Spearman's rank correlation for 9-element vectors. Since my calculations are not that heavy, this problem does not cause an endless wait; it just increases the total time consumed by the whole process. Can someone tell me what causes the problem and how to avoid it?
Times1 = zeros(20,50);
for i = 5:20
for j = 1:50
tic
A = rand(i,2);
[r,p] = corr(A(:,1),A(:,2),'type','Spearman');
Times1(i,j) = toc;
end
end
Times2 = mean(Times1,2);
bar(Times2);
xticks(1:25);
xlabel('number of elements in vectors');
ylabel('average time');
After some investigation, I think I found the root of this very interesting problem. My tests have been conducted profiling every outer iteration using the built-in Matlab profiler, as follows:
res = cell(20,1);
for i = 5:20
profile clear;
profile on -history;
for j = 1:50
uni = rand(i,2);
corr(uni(:,1),uni(:,2),'type','Spearman');
end
profile off;
p = profile('info');
res{i} = p.FunctionTable;
end
The produced output looks like this:
The first thing I noticed is that the Spearman correlation for matrices with a number of rows less than or equal to 9 is computed in a different way than for matrices with 10 or more rows. For the former, the functions being internally called by the corr function are:
Function                   Number of Calls
-----------------------    -----------------
'factorial'                100
'tiedrank>tr'              100
'tiedrank'                 100
'corr>pvalSpearman'        50
'corr>rcumsum'             50
'perms>permsr'             50
'perms'                    50
'corr>spearmanExactSub'    50
'corr>corrPearson'         50
'corr>corrSpearman'        50
'corr'                     50
'parseArgs'                50
'parseArgs'                50
For the latter, the functions being internally called by the corr function are:
Function                   Number of Calls
-----------------------    -----------------
'tiedrank>tr'              100
'tiedrank'                 100
'corr>AS89'                50
'corr>pvalSpearman'        50
'corr>corrPearson'         50
'corr>corrSpearman'        50
'corr'                     50
'parseArgs'                50
'parseArgs'                50
Since the computation of the Spearman correlation for matrices with 10 or more rows seems to run smoothly and quickly and doesn't show any evidence of performance bottlenecks, I decided not to spend time investigating it and focused on the main concern: the small matrices.
I tried to understand the difference between the execution time of the whole process for a matrix with 5 rows and for one with 9 rows (the one notably showing the worst performance). This is the code I used:
res5 = res{5,1};
res5_tt = [res5.TotalTime];
res5_tt_perc = ((res5_tt ./ sum(res5_tt)) .* 100).';
res9_tt = [res{9,1}.TotalTime];
res9_tt_perc = ((res9_tt ./ sum(res9_tt)) .* 100).';
res_diff = res9_tt_perc - res5_tt_perc;
[~,res_diff_sort] = sort(res_diff,'descend');
tab = [cellstr(char(res5.FunctionName)) num2cell([res5_tt_perc res9_tt_perc res_diff])];
tab = tab(res_diff_sort,:);
tab = cell2table(tab,'VariableNames',{'Function' 'TT_M5' 'TT_M9' 'DIFF'});
And here is the result:
Function                   TT_M5                TT_M9                 DIFF
_______________________    _________________    __________________    __________________
'corr>spearmanExactSub'    7.14799963478685     16.2879721171023      9.1399724823154
'corr>pvalSpearman'        7.98185309750143     16.3043118970503      8.32245879954885
'perms>permsr'             3.47311716905926     8.73599255035966      5.26287538130039
'perms'                    4.58132952553723     8.77488502392486      4.19355549838763
'corr>corrSpearman'        15.629476293326      16.440893059217       0.811416765890929
'corr>rcumsum'             0.510550019981949    0.0152486312660671    -0.495301388715882
'factorial'                0.669357868472376    0.0163923929871943    -0.652965475485182
'parseArgs'                1.54242684137027     0.0309456171268161    -1.51148122424345
'tiedrank>tr'              2.37642998160463     0.041010720272735     -2.3354192613319
'parseArgs'                2.4288171135289      0.0486075856244615    -2.38020952790444
'corr>corrPearson'         2.49766877262937     0.0484657591710417    -2.44920301345833
'tiedrank'                 3.16762535118088     0.0543584195582888    -3.11326693162259
'corr'                     21.8214856092549     16.5664346332513      -5.25505097600355
Once the bottleneck was detected, I started analyzing the internal code (open corr) and I finally found the cause of the problem. Within the spearmanExactSub, this part of code is being executed (where n is the number of rows of the matrix):
n = arg1;
nfact = factorial(n);
Dperm = sum((repmat(1:n,nfact,1) - perms(1:n)).^2, 2);
All permutations of a vector whose values range from 1 to n are being computed. This is what drives up the computational complexity (and, obviously, the computation time) of the function: perms(1:n) returns a matrix with factorial(n) rows. Other operations, like the subsequent repmat of 1:n over factorial(n) rows and the ones below that point, make the situation worse. Now, long story short...
factorial(5) = 120
factorial(6) = 720
factorial(7) = 5040
factorial(8) = 40320
factorial(9) = 362880
Can you see why, between 5 and 9, your bar graph shows an "exponentially" (strictly speaking, factorially) increasing computation time?
On a side note, there is nothing you can do to solve this problem, unless you find another implementation of the Spearman correlation that doesn't have the same bottleneck or you implement your own.
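As a rough sketch of what "implement your own" could look like: the Spearman coefficient is the Pearson correlation of the ranks, so it can be obtained from tiedrank and corrcoef without ever enumerating permutations. The variable names below are illustrative. Note the assumption that only the coefficient is needed: the p-value that corrcoef can return is a t-based approximation, not the exact permutation p-value that corr computes for small samples.

```matlab
x = rand(9, 1);
y = rand(9, 1);

% Rank both vectors (ties receive averaged ranks), then take the
% ordinary Pearson correlation of the ranks.
rx = tiedrank(x);
ry = tiedrank(y);
c  = corrcoef(rx, ry);
rho = c(1, 2);              % Spearman's rank correlation coefficient

% Comparison with the built-in result (this call takes the slow exact path).
rho_builtin = corr(x, y, 'type', 'Spearman');
fprintf('manual: %.6f, corr: %.6f\n', rho, rho_builtin);
```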
I want to program an experiment that should consist of 10 trials (10 pictures) that are shown either on the left or the right side. At the same time, an odd or even number is shown on the opposite side. I want to measure reaction time and response (odd or even). I guess I am stuck with the trial structure.
Is it enough to just define the ntrials = length(pictures) or do I need a for loop for the variables (pic_position, number_position)?
This is my approach so far:
pic_pos = {'left' 'right'};
num_pos = {'left' 'right'};
evenodd = {'odd' 'even'};
ntrials = length(pictures);
for n = 1:length(pictures)
trials(ntrials).picture = pictures(n)
end
pictures = Shuffle(pictures);
for trial = 1:ntrials
currentnumber = num2str(numbers{trial})
switch trials(trial).num_pos
case 'right'
x = screencentrex + img_dist
case 'left'
x = screencentrex - img_dist
end;
Screen('TextSize', win, [25]);
DrawFormattedText(win, currentnumber, [x], 'center', [255 255 255]);
Screen('Flip', win);
WaitSecs(3);
Unfortunately it doesn't show me the number.
You don't necessarily need to loop over the position or number variables. Instead, you can generate the stimulus parameters for each trial in advance, for example using the Psychtoolbox function BalanceTrials:
[trialNumberPositions, trialNumberEvenOrOdd] = BalanceTrials(ntrials, 1, num_pos, evenodd);
This returns combinations of the levels of the factors 'num_pos' and 'evenodd', the orders of which are then randomized. So for example the number position for the trial number saved within the variable 'trial', in your example would be accessed as trialNumberPositions{trial}. Keep in mind that you have 4 unique combinations of evenodd and num_pos, so for your trial numbers to be balanced across conditions you would have a total number of trials that is a multiple of 4 (for example 12 trials total, rather than 10). I didn't include pic_pos because the pic_pos would always be whatever num_pos is not, as in your description the two stimuli would never be presented on the same side.
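For readers without Psychtoolbox at hand, the same balancing idea can be sketched in plain MATLAB. This is not how BalanceTrials is implemented; it is only an illustration, and reps, combos and the field names of trials are hypothetical names chosen here. It assumes 3 repetitions of each of the 4 combinations, giving 12 trials.

```matlab
num_pos = {'left', 'right'};
evenodd = {'odd', 'even'};
reps    = 3;                                   % hypothetical: 3 x 4 combinations = 12 trials

% Enumerate every (num_pos, evenodd) index pair, repeat, then shuffle the rows.
[p, e]  = ndgrid(1:numel(num_pos), 1:numel(evenodd));
combos  = repmat([p(:) e(:)], reps, 1);
combos  = combos(randperm(size(combos, 1)), :);

ntrials = size(combos, 1);
trials  = struct('num_pos', cell(1, ntrials), 'pic_pos', [], 'evenodd', []);
for k = 1:ntrials
    trials(k).num_pos = num_pos{combos(k, 1)};
    trials(k).pic_pos = num_pos{3 - combos(k, 1)};   % picture goes on the opposite side
    trials(k).evenodd = evenodd{combos(k, 2)};
end
```

Inside the trial loop, trials(trial).num_pos can then be used in the switch statement from the question.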
As to why your number isn't being displayed, it is hard to tell without more of the experiment script. But you are currently writing white text to the screen, is the background non-white?
I'm a beginner in Matlab and I'm trying to model the spread of an infectious disease using Matlab. However, I encounter some problems.
At first, I define the matrices that need to be filled and their initial status:
diseasematrix=zeros(20,20);
inirow=10;
inicol=10;
diseasematrix(inirow,inicol)=1; % The first place where a sick person is
infectionmatrix=zeros(20,20); % Infected people, initially all 0
healthymatrix=round(rand(20,20)*100); % Initial healthy population (randomly)
Rate=0.0001; % Rate of spread
Now, I want to make a plot where the spread of the disease is shown, using a for loop. But I'm stuck here...
for t=1:365
Zneighboursum=zeros(size(diseasematrix));
out_ZT = calc_ZT(Zneighboursum, diseasematrix);
infectionmatrix(t) = round((Rate).*(out_ZT));
diseasematrix(t) = diseasematrix(t-1) + infectionmatrix(t-1);
healthymatrix(t) = healthymatrix(t-1) - infectionmatrix(t-1);
imagesc(diseasematrix(t));
title(sprintf('Day %i',t));
drawnow;
end
This basically says that the infectionmatrix is calculated with the formula in the loop, and the diseasematrix is calculated by adding the people infected in the previous time step to the sick people of the previous time step. The healthy people that remain are calculated by subtracting the infected people from the healthy people of the previous time step. The variable out_ZT is the output of a function I made:
function [ZT] = calc_ZT(Zneighboursum, diseasematrix)
Zneighboursum = Zneighboursum + circshift(diseasematrix,[1 0]);
Zneighboursum = Zneighboursum + circshift(diseasematrix,[0 1]);
ZT=Zneighboursum;
end
This is to quantify the number of sick people around a central cell.
However, the result is not what I want. The plot does not evolve dynamically and the values don't seem to be right. Can anyone help me?
Thanks in advance!
There are several problems with the code:
1) (Rate).*(out_ZT) is more verbose than it needs to be. Rate is a scalar and out_ZT is a matrix; both .* and * accept a scalar operand, so Rate*out_ZT gives the same result.
2) infectionmatrix, diseasematrix and healthymatrix are all two-dimensional matrices, so indexing them with (t) addresses a single element rather than a whole time step. To keep every time step in memory you would need a three-dimensional array, but since you never use the stored history you can simply overwrite the old matrix.
3) You calculate infectionmatrix with round(), so it only holds integers. With such a small Rate every product is below 0.5 and is rounded to zero.
4) The value of Rate was too low to see any result, so I increased it to 0.01.
5) (Just a cautionary point) healthymatrix is never used to compute anything else in your code.
The code for the function is fine, so after debugging according to what I perceived, here's the code:
diseasematrix=zeros(20,20);
inirow=10;
inicol=10;
diseasematrix(inirow,inicol)=1; % The first place where a sick person is
infectionmatrix=zeros(20,20); % Infected people, initially all 0
healthymatrix=round(rand(20,20)*100); % Initial healthy population (randomly)
Rate=0.01;
for t=1:365
Zneighboursum=zeros(size(diseasematrix));
out_ZT = calc_ZT(Zneighboursum, diseasematrix);
infectionmatrix = (Rate*out_ZT);
diseasematrix = diseasematrix + infectionmatrix;
healthymatrix = healthymatrix - infectionmatrix;
imagesc(diseasematrix);
title(sprintf('Day %i',t));
drawnow;
end
There are several problems:
1) If you want to store a 2D matrix for every time step, you need a 3D array, so you have to replace myvariable(t) with myvariable(:,:,t).
2) Why did you use round? If you round a value < 0.5 the result will be 0, so nothing will change in your loop.
3) You need to define the initial condition (t=1) and then start your loop with t = 2.
diseasematrix=zeros(20,20);
inirow=10;
inicol=10;
diseasematrix(inirow,inicol)=1; % The first place where a sick person is
infectionmatrix =zeros(20,20); % Infected people, initially all 0
healthymatrix=round(rand(20,20)*100); % Initial healthy population (randomly)
Rate=0.01; % Rate of spread
for t=2:365
Zneighboursum=zeros(size(diseasematrix,1),size(diseasematrix,2));
out_ZT = calc_ZT(Zneighboursum, diseasematrix(:,:,t-1));
infectionmatrix(:,:,t) = (Rate).*(out_ZT);
diseasematrix(:,:,t) = diseasematrix(:,:,t-1) + infectionmatrix(:,:,t-1);
healthymatrix(:,:,t) = healthymatrix(:,:,t-1) - infectionmatrix(:,:,t-1);
imagesc(diseasematrix(:,:,t));
title(sprintf('Day %i',t));
drawnow;
end
IMPORTANT: circshift wraps the matrix around, so cells on one edge are counted as neighbours of the cells on the opposite edge. Keep this boundary effect in mind.
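If the wrap-around is not wanted, one alternative (an assumption about the intended model, not something the original code does) is to sum the neighbours with conv2, which pads with zeros outside the grid. The sketch below counts all four direct neighbours, whereas calc_ZT as posted only adds two shifted copies; the names kernel, neighboursum and wrapped are illustrative.

```matlab
diseasematrix = zeros(20, 20);
diseasematrix(1, 10) = 1;                    % one sick person on the top edge

% Four-neighbour sum without wrap-around: cells outside the grid count as 0.
kernel = [0 1 0; 1 0 1; 0 1 0];              % up, down, left and right neighbours
neighboursum = conv2(diseasematrix, kernel, 'same');

% The same four-neighbour sum built from circshift wraps around the edges.
wrapped = circshift(diseasematrix, [ 1  0]) + circshift(diseasematrix, [-1  0]) ...
        + circshift(diseasematrix, [ 0  1]) + circshift(diseasematrix, [ 0 -1]);

disp(neighboursum(20, 10));   % bottom edge is not a neighbour of the top edge
disp(wrapped(20, 10));        % with circshift it is
```

Both versions agree in the interior of the grid; they differ only for cells on the border.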
I want to double a parameter (population size) until it reaches a certain value, regardless of the number of loops (generations). Say I have the following loop:
population_size = 10; %initial population size
for i = 0:10, %no. of generations
%(*call function for model*)
population_size = (population_size*2);
gene_frequency = (gene_frequency*population_size)/population_size;
end
How would I do this in MATLAB?
As Yvon has suggested, use a while loop, which keeps looping as long as a condition holds. I can see that your population size is doubling, so you want the while loop to run until the population size is equal to, or exceeds, the target value.
I do have one question though: Your gene_frequency call seems useless. You are taking the variable, multiplying by population_size, then dividing by population_size.... and you'll just get the same number as you did last time. I'm going to leave this statement out as it doesn't contribute anything meaningful to your question.
As such:
population_size = 10; %initial population size
target_population = ... ;%// You place the population you want met here
while population_size < target_population %// NEW
%//(*call function for model*)
population_size = (population_size*2);
end
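A concrete run of that loop, with a target of 1000 picked purely for illustration (the real target is whatever your model needs) and a generations counter added here to show how many doublings it takes:

```matlab
population_size   = 10;      % initial population size
target_population = 1000;    % hypothetical target, chosen only for this example
generations       = 0;       % counts how many doublings were needed

while population_size < target_population
    % (*call function for model*)
    population_size = population_size * 2;
    generations = generations + 1;
end

fprintf('Reached %d after %d generations\n', population_size, generations);
% prints: Reached 1280 after 7 generations
```

Because the size doubles in one step, the loop stops at the first value at or above the target (1280 here), not at the target itself.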
Edit - July 30th, 2014
You have mentioned in your comments that you want to run this for 15 generations, but once the population size reaches its limit, it should remain the same for the rest of the generations. You can achieve this by combining a for loop with an if statement. We go through each generation and check the population size. As long as the population size is less than the target population, we double it. Once it finally reaches or exceeds the target population, it no longer doubles, but the loop keeps going until we have gone through the rest of the generations.
You can do this like so:
population_size = 10; %initial population size
target_population = ... ;%// You place the population you want met here
for gen = 1 : 15
%//(*call function for model*)
if (population_size < target_population)
population_size = (population_size*2);
end
end
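With the if statement, the last doubling can still overshoot the target. If the population should never exceed the limit, one option is to clamp the doubled value with min. This is a sketch under that assumption; the target of 1000 and the names num_generations and history are made up for the example.

```matlab
population_size   = 10;      % initial population size
target_population = 1000;    % hypothetical limit, chosen only for this example
num_generations   = 15;
history = zeros(1, num_generations);   % population size after each generation

for gen = 1:num_generations
    % (*call function for model*)
    % Double the population, but never go beyond the limit.
    population_size = min(population_size * 2, target_population);
    history(gen) = population_size;
end

disp(history);
```

Here the size goes 20, 40, 80, 160, 320, 640 and then stays at 1000 from generation 7 onwards.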
I have to write some code in Matlab that simulates tossing a coin 150 times. I have to count how many times the coin lands on heads and create a vector that gives a running percentage of the heads.
Then I have to make a table of the number of trials, random "flips", and the running percentages of heads. I assume random "flips" means heads or tails for that trial.
I also have to create a line graph with trials on the x-axis and probabilities (percentages) on the y-axis. I'm assuming the percentages are just the percentage of getting heads.
Sorry if this post was long. I figure giving the details now will make it easier to see what I was trying to do with the code. I didn't create the table or plot yet because I'm not even sure how to code for the actual problem.
NUM_TRIALS = 150;
trials = 1:NUM_TRIALS;
heads = 0;
t = rand(NUM_TRIALS,1);
percent_h = zeros(size(t));
for i = trials
if (t(i) < 0.5)
heads = heads + 1;
percent_h = heads./trials;
end
end
flips = t;
disp('Number of Trials, Random flips, Heads Percentage')
disp([trials', flips, percent_h'])
plot(trials,percent_h)
title('Trial Number vs. Percent Heads')
xlabel('Trial number')
ylabel('Percent Heads')
Your code is actually pretty close to answering your question, but there are a few issues that I see.
You should index t by the current trial number.
Likewise, percent_h should be indexed accordingly. This should be pre-allocated as well.
Not sure what z is supposed to represent...
To make the plot, just use plot. xlabel will give a label to the x axis, ylabel to the y axis. title will give a name to the plot.
You should divide by i, not trials.
So, your code should look something like this. There's a fair number of ways to simplify it, but I'll preserve your code as much as possible.
NUM_TRIALS = 150;
trials = 1:NUM_TRIALS;
heads = 0;
t = rand(NUM_TRIALS,1);
percent_h=zeros(size(t));
for i = trials
if (t(i) < 0.5)
heads = heads + 1;
end
percent_h(i) = heads/i;
end
plot(trials,percent_h)
xlabel('Trial Number')
ylabel('Percent Heads')
title ('Trial Number vs Percent Heads')
You can actually solve this more simply by taking advantage of a few other MATLAB functions, as hinted at by @PearsonArtPhoto. Firstly, you can use RANDI to generate the coin tosses, with a one standing for a head. Then, you can use CUMSUM to get the cumulative number of heads. Dividing this element-wise by 1:n gives you the cumulative fraction of heads.
n=150;
ishead = randi([0,1],1,n);
plot(cumsum(ishead)./(1:n));
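The question also asks for a table of trials, flips and running percentages. One possible sketch, assuming a MATLAB release that has the table type; the column names Trial, Flip and PercentHeads are chosen here for illustration and are not prescribed by the assignment.

```matlab
n = 150;
trial  = (1:n)';                          % trial numbers as a column
ishead = randi([0, 1], n, 1);             % 1 = heads, 0 = tails
percent_heads = 100 * cumsum(ishead) ./ trial;   % running percentage of heads

results = table(trial, ishead, percent_heads, ...
    'VariableNames', {'Trial', 'Flip', 'PercentHeads'});
disp(results(1:10, :));                   % show the first ten rows

plot(trial, percent_heads);
xlabel('Trial number');
ylabel('Percent heads');
title('Trial Number vs. Percent Heads');
```

Multiplying by 100 turns the running fraction into a percentage; drop the factor if a value between 0 and 1 is preferred.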