How to run Markov chain simulations but only return the results for the final year?

I want to simulate data for 10000 people over 50 steps, and repeat this 1000 times to find the long-run proportions.
That is going to be really big data (doing it for 5 steps already takes about 3-4 GB), so I want RStudio to run the simulations and only return the results for the 50th step (and dispose of the rest of the data).
Is this possible?
I am using the below right now to run my simulations:
Result <- replicate(n = 1000, lapply(Datareq2, function(state2) rmarkovchain(n = 2, object = reqObject2, t0 = state2)))
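
One memory-saving approach is to never store the intermediate steps at all: advance every chain in place and record only the state after step 50. Below is a minimal sketch of that idea, written in Python purely for illustration (the three states and the transition matrix P are made-up placeholders, since reqObject2 isn't shown):

    import numpy as np

    # Hypothetical three-state chain standing in for reqObject2; rows of P sum to 1.
    states = ["A", "B", "C"]
    P = np.array([[0.7, 0.2, 0.1],
                  [0.3, 0.5, 0.2],
                  [0.1, 0.3, 0.6]])
    cum = P.cumsum(axis=1)                 # cumulative rows for inverse-CDF sampling

    rng = np.random.default_rng(1)
    n_people, n_steps, n_reps = 10_000, 50, 1_000

    final_counts = np.zeros(len(states), dtype=np.int64)
    for _ in range(n_reps):
        s = rng.integers(len(states), size=n_people)   # step-0 state of each person
        for _ in range(n_steps):
            u = rng.random(n_people)[:, None]
            # index of the first cumulative probability >= u, i.e. the next state;
            # the previous states are overwritten, so steps 1..49 are never stored
            s = (u > cum[s]).sum(axis=1)
        final_counts += np.bincount(s, minlength=len(states))

    # long-run proportions over all people and replications, from step 50 only
    print(dict(zip(states, final_counts / final_counts.sum())))

If you stay in R, the analogous change is to keep only the last element of each simulated path inside the lapply, e.g. tail(rmarkovchain(n = 50, object = reqObject2, t0 = state2), 1), assuming rmarkovchain returns the full path as a vector; that way the intermediate states never accumulate in Result.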

Related

Why does every simulation I perform in Arena Simulation give the same results?

Every time I run a simulation with the same parameters in the Run window, I get exactly the same results. The results are only different if a different number of replications is set each time the run is started.
These are my settings in the Run window:
[screenshot of the Run window settings]
I have a lot of Process blocks, and most of them have normally distributed durations. Why are the results not different?
If it helps in any way, here is a photo of the constructed model:
[screenshot of the model]
Arena uses the same random number stream for your run unless you tell it to use another, so your first replication will look the same every time. The answer depends on the distributions you sample from; if you change the logic and sample at other times, the answer will change. Each following replication will have a unique answer based on the random number streams you are sampling from. Being able to execute exactly the same run makes it easier to find errors.
Arena provides (by default) ten different "streams" of (pseudo-)random numbers. If you don't ask the system to use a particular stream, it will use stream 10. For example, NORM(10,2) will use stream 10 to generate a random normally distributed number (mean 10, standard deviation 2), while NORM(10,2,4) will use stream 4 to generate a similarly distributed number.
By default, the 10 random number generators for the streams are initialised at the beginning of each run to 14561, 25971, 31131, 22553, 12121, 32323, 19991, 18765, 14327, and 32535 (from Arena help). At the end of one replication, the generators will not be re-initialised, so they will start the next replication with a new value.
You can control the random number generator initialisation with the SEEDS element.
As @Marlize says, this helps to ensure that you can reproduce a simulation result if you need to.
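
The effect is easy to reproduce outside Arena. Here is a toy illustration in Python (the seed 14561 is just borrowed from Arena's default list above; Python's generator is not Arena's): a stream re-initialised with the same seed repeats the first replication exactly, while replications that continue from the stream's current state each get fresh values.

    import random

    def replication(rng, n=3):
        # e.g. three normally distributed durations, mean 10, standard deviation 2
        return [round(rng.normalvariate(10, 2), 2) for _ in range(n)]

    run1 = random.Random(14561)      # stream initialised at the start of a run
    print(replication(run1))         # replication 1
    print(replication(run1))         # replication 2 continues the stream: new values

    run2 = random.Random(14561)      # a new run re-initialises the stream...
    print(replication(run2))         # ...so its replication 1 matches run 1 exactly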

AnyLogic - How to measure work in process inventory (WIP) within a simulation

I am currently working on a simple simulation that consists of 4 manufacturing workstations with different processing times and I would like to measure the WIP inside the system. The model is PennyFab2 in case anybody knows it.
So far, I have measured throughput and cycle time, and I am calculating WIP using Little's Law, but the results don't match the expectations. The cycle time is measured using the timeMeasureStart and timeMeasureEnd blocks, and the throughput by simply counting how many pieces flow through the end of the simulation.
Any ideas on how to directly measure WIP without using Little's law?
Thank you!
For Little's Law you count the arrivals, not the exits... but maybe it doesn't make a difference.
Otherwise, there are many ways:
You can count the number of agents inside your system using a RestrictedAreaStart block and the entitiesInside() function.
You can just have a variable that is incremented by 1 when something enters and decremented by 1 when something exits; a toy version of this counter is sketched below.
No matter what, you need to add the information to a dataset or a statistics object to get the mean number of agents in your system.
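
A toy sketch of that counter (Python here purely for illustration; in AnyLogic it would be a variable updated in the blocks' on-enter/on-exit actions and written to a DataSet or Statistics object, and the random arrival/departure counts are made up):

    import random

    random.seed(0)
    wip = 0
    samples = []                               # stands in for the DataSet / Statistics object

    for hour in range(1_000):                  # sample once per model hour
        wip += random.randint(0, 3)            # +1 for each entity that enters
        wip -= min(wip, random.randint(0, 3))  # -1 for each entity that exits
        samples.append(wip)

    print("mean WIP:", sum(samples) / len(samples))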
Little's Law defines the relationship between:
Work in Process (WIP)
Throughput (or Flow Rate)
Lead Time (or Flow Time)
specifically, WIP = Throughput x Lead Time. This means that if you have two of the three you can calculate the third.
Since you have a simulation model, you can record all three items explicitly, and this would be my advice.
Little's Law should then be used to validate if you are recording the 3 values correctly.
You can record them as follows.
WIP = Record the average number of items in your system
The simplest way would be to count the number of items that entered the system and subtract the number of items that left the system. You simply do this calculation every time unit that makes sense for the resolution of your model (hourly, daily, weekly, etc.) and save the values to a DataSet or Statistics object.
Lead Time = The time a unit takes from entering the system to leaving the system
If you are using the Process Modelling Library (PML), simply use the timeMeasureStart and timeMeasureEnd blocks; see the example model in the help file.
Throughput = the number of units out of the system per time unit
If you run the model and your average WIP is 10 units and on average a unit takes 5 days to exit the system, your throughput will be 10 units / 5 days = 2 units/day.
You can validate this by taking the total number of units that exited your system at the end of the simulation and dividing it by the number of time units your model ran.
If you run a model with the above characteristics for 10 days, you would expect 20 units to have exited the system.

Matlab: run program until condition is met

I'm currently modelling the dynamics of an ice sheet. I have made a script that plots the volume of an ice sheet through time (in steps of 500 years). The volume increases rapidly at first, but the curve flattens later on as the volume stops changing and the ice sheet reaches steady state; its shape is like y = ln(x). I thus have two output arrays: a) vol_time, with the time in steps of 500 years, and b) vol, with the corresponding volume. Now the program runs until a fixed time that I set (200,000 years), but I want it to run only until this steady state is reached. So my question is: how can I make the program run only until the volume changes by only 0.002% per 500 years?
Thanks
You can either wrap your ice-sheet calculation in a while loop, so the code repeats the calculation until the 0.002% condition is met, or loop through the whole 200,000 years.
Another option is to add an if check at the end of your ice-sheet calculation and put a break inside it, so the loop terminates as soon as the condition holds.
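
A minimal sketch of the first option (written in Python for brevity; the while / if-break structure carries over to MATLAB one-to-one, and step_500yr is a hypothetical stand-in for one 500-year update of your ice-sheet model):

    def run_to_steady_state(step_500yr, v0, tol=0.002 / 100, t_max=200_000):
        """Advance the model in 500-year steps until the relative volume
        change per step drops below tol (0.002%) or t_max years elapse."""
        vol_time, vol = [0], [v0]
        t, v = 0, v0
        while t < t_max:
            v_new = step_500yr(v)                 # one 500-year model step
            t += 500
            vol_time.append(t)
            vol.append(v_new)
            if v > 0 and abs(v_new - v) / v < tol:
                break                             # steady state reached
            v = v_new
        return vol_time, vol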

Average results from multiple simulations

I have a simulation with a lot of random components, so I would like to run many simulations and average the results (the result is determined by a variable called score).
How would you do this in Netlogo?
Currently I'm working on a program that will export the results to CSV; then I plan to use Python/Excel to average them. I don't like this because I want to run 100+ simulations (so there will be 100+ files)... I'm hoping there is a better solution.
EDIT: or an implementation of what I described would be welcome (I have to relearn enough Python/VBA to solve this, so it's going to take me some time).
This should be simple enough if you use BehaviorSpace.
In your experiment definition, put score in the "Measure runs using these reporters" textbox and uncheck "Measure runs at every step".
When you run your experiment, save your results using Table output. It will produce a CSV file that you can open in your spreadsheet application. From there, producing an average of the score column should be trivial.

Calculate the execution time of a script in advance with Matlab

Good morning,
I have a question about script execution time in Matlab. Is it possible to know in advance how long a script will take to execute before running it (an estimated time, for example)? I know that with the tic and toc commands, among others, it is possible to know the time at the end, but I don't know if it's possible to know it beforehand.
Thanks in advance,
It is not too hard to make an estimate of how long your calculation will take.
You already know how to record calculation times with tic and toc, so now you can do this:
Start with a small-scale test (for example, n = 1) and record the calculation time.
Multiply n by a constant k (I usually choose 2 or 10 for easy calculations) and record the calculation time.
Keep multiplying n by k until you find a consistent relation: 'if I multiply my input size by k, my calculation time changes like so ...'
Now you can extrapolate your estimated calculation time by:
calculating how many times you need to multiply the input size of your biggest small-scale example to reach your real data size
applying the consistent relation you found exactly that many times to the calculation time of your biggest small-scale example
Of course this combines well with some common sense: if you do certain things t times, they will take about t times as long. This is useful when you have to perform a certain calculation a million times. Just interrupt the loop after a minute or so; if it is still in the first ten calculations, you may want to give up!
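
A sketch of the whole procedure (a Python stand-in for the tic/toc workflow above; the estimate function and the toy workload are made up for illustration):

    import math
    import time

    def timed(workload, n):
        """One tic/toc-style measurement of workload(n), in seconds."""
        t0 = time.perf_counter()
        workload(n)
        return time.perf_counter() - t0

    def estimate(workload, n_target, n0=100_000, k=2, steps=4):
        """Time workload at n0, k*n0, ..., then extrapolate to n_target,
        assuming the last observed growth factor per multiplication by k
        has become the 'consistent relation'."""
        sizes = [n0 * k**i for i in range(steps)]
        times = [timed(workload, n) for n in sizes]
        growth = times[-1] / times[-2]            # time factor per factor of k
        m = math.log(n_target / sizes[-1], k)     # remaining multiplications by k
        return times[-1] * growth**m

    work = lambda n: sum(i * i for i in range(n))   # toy linear-time workload
    print(f"estimated time for n = 10^9: {estimate(work, 10**9):.1f} s")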