Generating 0 entities in a run - simulation

Suppose I have a time interval from 0 to 3600000 (one hour in milliseconds). I have to generate entities with an average of 3 per hour, and I use an exponential distribution. The mean interarrival time is 3600000/3, and that is the mean I use when sampling the distribution. If in a particular run I obtain 0 created entities, is that wrong, or can it be a correct result? Can anyone help me?

It's not an error to get zero. With exponential interarrival times and a rate of 3 per hour, the number of occurrences in an hour has a Poisson distribution with λ=3. The probability of getting n outcomes is
$P(N = n) = e^{-\lambda}\lambda^{n}/n!$
which for n=0 is just under 0.05. In other words, you would see a zero roughly one out of every 20 times.
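As a rough check of that figure, here is a minimal Python sketch (not AnyLogic code) that repeats the one-hour run many times with exponential interarrival times of mean 3600000/3 ms and counts how often no entity is created. The function names, the number of runs and the seed are assumptions chosen for illustration.

```python
import math
import random

HOUR_MS = 3_600_000      # one hour in milliseconds, as in the question
RATE_PER_HOUR = 3        # average number of entities per hour


def poisson_pmf(n, lam):
    """Probability of exactly n arrivals when the expected count is lam."""
    return math.exp(-lam) * lam ** n / math.factorial(n)


def count_arrivals(rng, horizon_ms=HOUR_MS, rate_per_hour=RATE_PER_HOUR):
    """Count arrivals in one run using exponential interarrival times."""
    mean_gap_ms = HOUR_MS / rate_per_hour
    # expovariate takes the rate, i.e. 1 / mean
    t = rng.expovariate(1.0 / mean_gap_ms)
    count = 0
    while t <= horizon_ms:
        count += 1
        t += rng.expovariate(1.0 / mean_gap_ms)
    return count


def fraction_of_empty_runs(runs=20000, seed=1):
    """Fraction of simulated one-hour runs that produce no entity at all."""
    rng = random.Random(seed)
    empty = sum(1 for _ in range(runs) if count_arrivals(rng) == 0)
    return empty / runs


if __name__ == "__main__":
    print("theory    :", poisson_pmf(0, RATE_PER_HOUR))
    print("simulation:", fraction_of_empty_runs())
```

The simulated fraction should land close to the theoretical value e^-3, so an occasional empty run is expected behaviour rather than a bug.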

Related

Source is producing too many agents

My arrival rate for my source is in hours, and I am using events to set its rate to a distribution at different times: sourceShoppers.set_rate(triangular(1, 5, 2));. However, the source is producing roughly 3 per second, as opposed to an average of 2 an hour.
Do this:
self.set_rate(triangular(1,5,2), PER_HOUR);
Note, however, that this draws a single random sample from the triangular distribution and sets the rate to that value, for instance 1.2 per hour. The arrivals will then follow a Poisson distribution with an average of 1.2 per hour until you change the rate again.
If you want further advice, you have to say what you want to achieve.

AnyLogic: How to block a line by a probability?

I'm modelling a simple production line with 5 processes, which I modelled as Services. I'm simulating one month, and during this month my line stops approximately 50 times due to machine breakdowns. A stop can last between 3 and 60 min, with an average of 12 min (following a triangular distribution). How can I implement this in the model? I'm trying to create an event but can't figure out what type of trigger I should use.
Have your services require a resource. If they are already seizing a resource like labor, that is ok, they can require more than one. On the resourcePool, there is an area called "Shifts, breaks, failures, maintenance..." Check "Failures/repairs:" and enter your downtime distribution there.
If you want to use a triangular distribution, you need min/MODE/max, not min/AVERAGE/max. If you really want an average of 12 minutes with a minimum of 3 and a maximum of 60, then this is not a triangular distribution. There is no mode that would give you an average of 12.
Average of a triangular distribution, where X is the mode:
( 3 + X + 60 ) / 3 = 12
This means X would have to be -27, which is not possible: the mode must lie between the minimum and the maximum, and a delay time cannot be negative. With a minimum of 3 and a maximum of 60, the smallest achievable average is ( 3 + 3 + 60 ) / 3 = 22.
Look at using a different distribution. The exponential distribution is often used for the time between failures (or the Poisson distribution for the number of failures per hour).
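The arithmetic above can be checked with a short Python sketch. It uses the fact that the mean of a triangular distribution is (min + mode + max) / 3; the function names are hypothetical and chosen only for this illustration.

```python
def required_mode(minimum, maximum, target_mean):
    """Mode a triangular distribution would need to reach target_mean."""
    # mean = (minimum + mode + maximum) / 3, solved for the mode
    return 3 * target_mean - minimum - maximum


def achievable_mean_range(minimum, maximum):
    """Smallest and largest mean, reached when the mode sits at either end."""
    lowest = (minimum + minimum + maximum) / 3
    highest = (minimum + maximum + maximum) / 3
    return lowest, highest


if __name__ == "__main__":
    print(required_mode(3, 60, 12))      # mode needed for a 12 min average
    print(achievable_mean_range(3, 60))  # means a triangular(3, ., 60) can have
```

Any target average outside the returned range cannot be produced by a triangular distribution with those bounds.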

Pedestrian arrival rate of 5 per h only 3 showing for 1 h during the simulation. Any reason why?

I'm trying to simulate pedestrian flow at the entrance of a hospital.
We are installing check-in platforms and I want to know how many platforms we should get according to the patient flow.
I'm using the AnyLogic Personal Learning Edition, and when I set an arrival rate of 5 per hour, only 3 pedestrians appear during the simulation.
I'm trying to understand how AnyLogic works and how it distributes the pedestrians according to the rate we set.
In the Personal Learning Edition, 1 h of model time equals 1 min of real time.
If you choose rate = 5, the pedSource block will generate pedestrians with an exponentially distributed interarrival time with mean = 1/rate = 1/5 hour, i.e. 12 minutes.
This means that the long-run average will be 5 arrivals per hour, but you won't get exactly 5 every hour, since the count is a random variable.
If you change the seed, you will get different arrivals. Click on Simulation: Main, where you can set a fixed seed or use a random seed.
If you really want exactly 5 per hour in a deterministic way, you need to change the arrival definition from rate to the inject function.
Then you can create an event that runs cyclically 5 times per hour, i.e. once every 12 minutes,
and have it call pedSource.inject(1);
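To see how ordinary it is to observe only 3 arrivals at a rate of 5 per hour, the sketch below evaluates the Poisson probabilities for the hourly count. It is plain Python rather than AnyLogic code, and the function names are assumptions made for this example.

```python
import math


def poisson_pmf(n, lam):
    """Probability of exactly n arrivals in an hour at rate lam per hour."""
    return math.exp(-lam) * lam ** n / math.factorial(n)


def poisson_cdf(n, lam):
    """Probability of n or fewer arrivals in an hour."""
    return sum(poisson_pmf(k, lam) for k in range(n + 1))


if __name__ == "__main__":
    rate = 5
    for n in range(11):
        print(n, round(poisson_pmf(n, rate), 4))
    print("P(N <= 3) =", round(poisson_cdf(3, rate), 4))
```

At rate 5, exactly 3 arrivals happens in about 14% of hours and 3 or fewer in about 27%, while exactly 5 arrivals happens in only about 18% of hours.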

Tableau running average

I have a column of numeric data and another column of dates. I'm trying to calculate a running average by week. I'm using a table calculation, Running Total on Average. This is not producing the running average I am expecting.
Example:
For the third week, the calculation takes the first week's average, the second week's average and the third week's average, and then averages those 3 numbers. What I want instead is to pool all the data from the first three weeks and THEN take one single average over the whole. Hope that makes sense.
This seems to have done it. Calculated field:
RUNNING_SUM(SUM([NPS]))/RUNNING_SUM(COUNT([NPS]))
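The difference between the two calculations can be illustrated outside Tableau with a small Python sketch. The weekly scores are made-up illustration data and the function names are hypothetical; the second function mirrors the running-sum over running-count structure of the calculated field.

```python
from itertools import accumulate

# Hypothetical NPS values grouped by week (illustration data only)
weekly_scores = [[10, 8], [6], [9, 9, 9, 9]]


def running_average_of_weekly_averages(weeks):
    """Average the weekly averages up to each week (the unwanted result)."""
    weekly_means = [sum(w) / len(w) for w in weeks]
    return [sum(weekly_means[: i + 1]) / (i + 1) for i in range(len(weekly_means))]


def running_pooled_average(weeks):
    """Running sum of values divided by running count of values."""
    running_sum = accumulate(sum(w) for w in weeks)
    running_count = accumulate(len(w) for w in weeks)
    return [s / c for s, c in zip(running_sum, running_count)]


if __name__ == "__main__":
    print(running_average_of_weekly_averages(weekly_scores))
    print(running_pooled_average(weekly_scores))
```

The two results agree only when every week contains the same number of rows; otherwise the average of averages gives sparse weeks too much influence.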

significant differences between means

Considering the picture below
each value X can be identified by the indices X_g_s_d_h
g = group g=[1:5]
s = subject number (variable for each g)
d = day number (variable for each s)
h = hour h=[1:24]
so X_1_3_4_12 means that the value X is referred to the
12th hour
of 4th day
of 3rd subject
of group 1
First I calculate the mean (hour by hour) over all the days of each subject. Doing that, the index d disappears and each subject is represented by a vector containing 24 values.
X_g_s_h will be the mean over the days of a subject.
Then I calculate the mean (subject by subject) of all the subjects belonging to the same group resulting in X_g_h. Each group is represented by 1 vector of 24 values
Then I calculate the mean over the hours for each group resulting in X_g. Each group now is represented by 1 single value
I would like to see if the means X_g are significantly different between the groups.
Can you tell me what the proper way is?
PS
The number of subjects differs between groups, and the number of days differs between subjects. I have more than 2 groups.
Thanks
Ok so I am posting an answer to summarize some of the problems you may have.
Same subjects in both groups
Not averaging:
1- If you have only one measure that is repeated every hour for a certain number of days, and you can assume it depends on neither the day nor the hour, then you can reshape your matrix into one column for each subject, per group, and perform a t-test with repeated measures.
2- If you cannot assume that your measure is independent of the hour, but it is independent of the day (say, the concentration of a drug after administration that completely vanishes before the next day's measurement), then you can perform a t-test with repeated measures for each hour (N hours), for a total of N tests.
3- If you cannot assume that your measure is independent of the day, but it is independent of the hour (say, a measure of the menstrual cycle, which we will assume is stable within each day but varies between days), then you can perform a t-test with repeated measures for each day (M days), for a total of M tests.
4- If you cannot assume that your measure is independent of either the day or the hour, then you can perform a t-test with repeated measures for each day and hour, for a total of N×M tests.
Averaging:
In the cases where you cannot assume independence, you can average over the dependent variables, which removes that variance but also lowers your statistical power and limits the interpretation.
In case 2, you can average over the hours to get a mean concentration and perform a t-test with repeated measures, so you have only 1 test. Here you lose the information about how the measure changed from hour 1 to N, and only test whether the mean concentration over the tested hours differs between groups.
In case 3, you can average over the days and test, for example, whether the mean estrogen level is higher in one group than in another, so you have only 1 test. Again you lose the information about how the measure changed between the different days.
In case 4, you can average over both hours and days, so you have only 1 test. Again you lose the information about how the measure changed between the different hours and days.
NOT same subjects in both groups
Paired tests are not possible. Follow the same logic but perform an unpaired test.
You need to perform a statistical test of the null hypothesis H0 that the data in the different groups are independent random samples from distributions with equal means. It is better to avoid the sequential 'mean' operations and simply regroup the data by g. If you assume normality and independence of the observations (as pointed out by @ASantosRibeiro below), then you can perform a two-sample t-test with ttest2 (http://www.mathworks.nl/help/stats/ttest2.html):
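% Dummy data in g_s_d_h format: 6 groups, 5 subjects, 4 days, 3 hours
X = randn(6,5,4,3);
nG = size(X,1);
% One column per group. MATLAB stores arrays column-major, so flatten the
% other dimensions into rows of a nG-by-60 matrix first, then transpose.
Y = reshape(X, nG, []).';
h = zeros(nG);
for i = 1:nG
    for j = i+1:nG
        % h(i,j) = 1 if H0 (equal means) is rejected at the default 5% level
        h(i,j) = ttest2(Y(:,i), Y(:,j));
    end
end
h = h + h.';  % the test is symmetric in i and j; no multiple-comparison correction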
If you want to take into account different weights for the observations, you need to calculate the t-value yourself (e.g., see http://support.sas.com/documentation/cdl/en/statug/63033/HTML/default/viewer.htm#statug_ttest_a0000000126.htm)
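As a starting point for a hand calculation, here is a minimal Python sketch of a two-sample t statistic computed directly from the data. It uses Welch's unequal-variance form, which copes with different group sizes, and it does not apply observation weights; adding weights would mean replacing the plain means and variances with weighted ones as described in the linked documentation. The function name and the sample data are assumptions for illustration.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for an unpaired, unequal-variance test."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance is the sample variance (divides by n - 1)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)

    se_a, se_b = var_a / n_a, var_b / n_b      # squared standard errors
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)

    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (n_a - 1) + se_b ** 2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    group_1 = [1, 2, 3, 4, 5]        # hypothetical observations
    group_2 = [2, 4, 6, 8, 10]
    print(welch_t(group_1, group_2))
```

The resulting t and degrees of freedom can then be compared against a t distribution to obtain a p-value.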