I'm estimating last-mile delivery costs in a large urban network using by-route distances. I have over 8000 customer agents and over 100 retail store agents plotted in a GIS map using lat/long coordinates. Each customer receives deliveries from its nearest store (by route). The goal is to get two distance measures in this network for each store:
d0_bar: the average distance from a store to all of its assigned customers
d1_bar: the average distance between all customers common to a single store
I've written a startup function with a simple foreach loop to assign each customer to a store based on by-route distance (customers have a parameter, "customer.pStore" of Store type). This function also adds, in turn, each customer to the store agent's collection of customers ("store.colCusts"; it's an array list with Customer type elements).
Next, I have a function that iterates through the store agent population and calculates the two average distance measures above (d0_bar & d1_bar) and writes the results to a txt file (see code below). The code works, fortunately. However, the problem is that with such a massive dataset, the process of iterating through all customers/stores and retrieving distances via the openstreetmap.org API takes forever. It's been initializing ("Please wait...") for about 12 hours. What can I do to make this code more efficient? Or, is there a better way in AnyLogic of getting these two distance measures for each store in my network?
Thanks in advance.
//for each store, record all customers assigned to it
for (Store store : stores)
{
distancesStore.print(store.storeCode + "," + store.colCusts.size() + "," + store.colCusts.size()*(store.colCusts.size()-1)/2 + ",");
//calculates average distance from store j to customer nodes that belong to store j
double sumFirstDistByStore = 0.0;
int h = 0;
while (h < store.colCusts.size())
{
sumFirstDistByStore += store.distanceByRoute(store.colCusts.get(h));
h++;
}
distancesStore.print((sumFirstDistByStore/store.colCusts.size())/1609.34 + ",");
//calculates average of distances between all customer nodes belonging to store j
double custDistSumPerStore = 0.0;
int loopLimit = store.colCusts.size();
int i = 0;
while (i < loopLimit - 1)
{
int j = 1;
while (j < loopLimit)
{
custDistSumPerStore += store.colCusts.get(i).distanceByRoute(store.colCusts.get(j));
j++;
}
i++;
}
distancesStore.print((custDistSumPerStore/(loopLimit*(loopLimit-1)/2))/1609.34);
distancesStore.println();
}
Firstly a few simple comments:
Have you tried timing a single distanceByRoute call? E.g. can you try running store.distanceByRoute(store.colCusts.get(0)); just to see how long a single call takes on your system. Routing is generally pretty slow, but it would be good to know what the speed limit is.
The first simple change is to use Java parallelism. Instead of using this:
for (Store store : stores)
{ ...
use this:
stores.parallelStream().forEach(store -> {
...
});
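This processes the stores in parallel using the standard Java Streams API. Because the loop body calls distancesStore.print several times per store, build each store's output as a single string and write it with one call; otherwise output from different stores can interleave.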
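A minimal sketch of that pattern in plain Java, outside AnyLogic. The StoreData class is a hypothetical stand-in for the Store agent, and the precomputed distances stand in for the distanceByRoute calls; neither comes from the original model. Each store's line is built in parallel, and the lines are written only after the parallel step has finished:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class ParallelStoreReport {
    static final double METRES_PER_MILE = 1609.34;

    /** Hypothetical stand-in for a Store agent: a code plus distances (in metres) to its customers. */
    static final class StoreData {
        final String code;
        final double[] customerDistances;

        StoreData(String code, double... customerDistances) {
            this.code = code;
            this.customerDistances = customerDistances;
        }
    }

    /** Builds the complete output line for one store, so it can be written with a single call. */
    static String reportLine(StoreData s) {
        double avgMetres = Arrays.stream(s.customerDistances).average().orElse(0.0);
        return s.code + "," + s.customerDistances.length + "," + (avgMetres / METRES_PER_MILE);
    }

    /** Computes the lines in parallel; collecting into a list keeps the original store order. */
    static List<String> buildReport(List<StoreData> stores) {
        return stores.parallelStream()
                     .map(ParallelStoreReport::reportLine)
                     .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<StoreData> stores = Arrays.asList(
                new StoreData("A", 1609.34, 3218.68),
                new StoreData("B", 804.67));
        // Write sequentially once the parallel computation is done.
        buildReport(stores).forEach(System.out::println);
    }
}
```

Whether the routing provider behind distanceByRoute tolerates concurrent requests is not something this sketch can confirm, so it is worth testing on a handful of stores first.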
It also looks like the second loop, where the average distance between customers is calculated, doesn't take account of mirroring. That is to say, distance a->b is equal to b->a. Hence, for example, 4 customers require 6 calculations: 1->2, 1->3, 1->4, 2->3, 2->4, 3->4. With 4 customers, however, your second while loop performs 9 calculations: i=0, j in {1,2,3}; i=1, j in {1,2,3}; i=2, j in {1,2,3}. That includes the self-distance 1->1 and counts some pairs twice, while you still divide by n*(n-1)/2 = 6. Starting the inner loop at j = i + 1 instead of j = 1 visits each pair exactly once, unless I am misunderstanding your intention.
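The counting argument above can be checked with a small standalone sketch. The class and method names here are illustrative, and the distance function is passed in as a parameter rather than calling AnyLogic's distanceByRoute:

```java
import java.util.List;
import java.util.function.ToDoubleBiFunction;

final class PairwiseAverage {
    /** Inner-loop iterations of the original loops, where j always restarts at 1. */
    static int originalLoopCount(int n) {
        int count = 0;
        for (int i = 0; i < n - 1; i++) {
            for (int j = 1; j < n; j++) {
                count++;
            }
        }
        return count;
    }

    /** Iterations when j starts at i + 1: each unordered pair is visited exactly once. */
    static int uniquePairCount(int n) {
        int count = 0;
        for (int i = 0; i < n - 1; i++) {
            for (int j = i + 1; j < n; j++) {
                count++;
            }
        }
        return count;
    }

    /** Average distance over all unordered pairs; returns 0 when there are fewer than two items. */
    static <T> double averagePairDistance(List<T> items, ToDoubleBiFunction<T, T> distance) {
        int n = items.size();
        if (n < 2) {
            return 0.0;
        }
        double sum = 0.0;
        for (int i = 0; i < n - 1; i++) {
            for (int j = i + 1; j < n; j++) {
                sum += distance.applyAsDouble(items.get(i), items.get(j));
            }
        }
        return sum / (n * (n - 1) / 2.0); // divide by the number of pairs, as a double
    }
}
```

For 4 customers, originalLoopCount returns 9 and uniquePairCount returns 6, matching the enumeration above.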
Generally, for long-running operations it is a good idea to include some traceln calls that show progress with the associated timing.
Please have a look at the above and post your results. With more information, additional performance improvements may be possible.
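As a rough illustration of progress reporting with timing, here is a standalone Java sketch. The ProgressTimer helper is hypothetical; it prints with System.out.println, and inside an AnyLogic model the same message could be passed to traceln instead:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class ProgressTimer {
    /**
     * Runs 'work' on every item and reports elapsed time after every 'every' items
     * and once more at the end. Returns the messages so callers can inspect them.
     */
    static <T> List<String> processWithProgress(List<T> items, Consumer<T> work, int every) {
        List<String> log = new ArrayList<>();
        long start = System.nanoTime();
        for (int i = 0; i < items.size(); i++) {
            work.accept(items.get(i));
            boolean last = (i + 1 == items.size());
            if ((i + 1) % every == 0 || last) {
                long elapsedMs = (System.nanoTime() - start) / 1_000_000;
                String msg = "processed " + (i + 1) + "/" + items.size() + " after " + elapsedMs + " ms";
                System.out.println(msg); // in AnyLogic, traceln(msg) would go here
                log.add(msg);
            }
        }
        return log;
    }
}
```

Dividing the elapsed time by the number of processed items gives a per-call estimate, which in turn tells you how long the full run over all stores and customer pairs would take.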
I am using a collection to represent available trucks in a system. I am using a 1 or 0 for a given index number, using a 1 to say that indexed truck is available. I am then trying to assign that index number to a customer ID. I am trying to randomly select an available truck from those listed as available. I am getting an error saying the left-hand side of an assignment must be a variable and highlighting the portion of the code reading Available_Trucks() = 1. This is the code:
agent.ID = randomWhere(Available_Trucks, Available_Trucks() = 1);
The way you are doing it won't work. When applied to a collection of integers, randomWhere returns an element of the collection (in this case 1 or 0).
So doing
randomWhere(Available_Trucks,at->at==1); //this is the right syntax
will always return 1, since that is the value of the chosen element. What you need is the index of an element that is equal to 1, and you will have to write a function to get that yourself. Something like this (probably not the best way, but it works): agent.ID=getRandomAvailableTruck(Available_Trucks);
The function getRandomAvailableTruck takes a collection (probably an ArrayList) as its argument and returns -1 if there is no available truck:
int availableTrucks=count(collection,c->c==1);
if(availableTrucks==0) return -1;
int rand=uniform_discr(1,availableTrucks);
int i=0;
int j=0;
while(i<rand){
if(collection.get(j)==1){
i++;
if(i==rand){
return j;
}
}
j++;
}
return -1;
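The same idea can be sketched in plain Java by first collecting the indices of the available trucks and then drawing one of them. The class name and the use of java.util.Random in place of AnyLogic's uniform_discr are assumptions made for this standalone example:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class TruckPicker {
    /**
     * Returns the index of a randomly chosen entry equal to 1,
     * or -1 if no entry is marked as available.
     */
    static int randomAvailableIndex(List<Integer> availability, Random rng) {
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < availability.size(); i++) {
            if (availability.get(i) == 1) {
                candidates.add(i); // remember the position, not the value
            }
        }
        if (candidates.isEmpty()) {
            return -1;
        }
        return candidates.get(rng.nextInt(candidates.size()));
    }
}
```

This trades a little memory for simpler control flow: there is no second counting pass and no nested exit condition.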
Another idea: instead of using 0 and 1 for the availability, store consecutive truck numbers (1, 2, 3, 4, 5, ...) and use 0 for a truck that is not available. For instance, if truck 3 is not available the array will be 1,2,0,4,5, and if it is available it will be 1,2,3,4,5.
In that case you can use
agent.ID=randomWhere(available_trucks,at->at>0);
But you will get an error if there is no available truck, so check for that.
That said, keeping availability in a separate collection of flags is poor practice. If your trucks are agents, it is much easier to store the availability in each truck.
Then you can just do
Truck truck=randomWhere(trucks,t->t.available==1);
if(truck!=null)
agent.ID=truck.ID;
I would like to know how to measure the throughput rate of a production line in AnyLogic.
Question: Is there a method to measure the time between departures of agents at the sink block? (I will calculate the throughput rate by inverting the time between departures.)
At the moment, I calculate the throughput from Little's law, using the average lead time and the WIP level of the line. I am not sure whether the throughput from this calculation will equal the inverse of the time between departures.
I hope you can help me figure it out.
Thanks in advance!
There is a function "time()" that returns the current model time in model time units. Using this function, you may know the times when agent A and agent B left the system, and calculate the difference between these times. You can do this by writing the code like below in the "On exit" field of the "sink" block:
statistic.add(time() - TimeOfPreviousAgent);
TimeOfPreviousAgent = time();
"TimeOfPreviousAgent" is a variable of "double" type;
"statistic" is a "Statistic" element used to collect the measurements
This approach of measuring time in the process flow is described in the tutorial Bank Office.
As an alternative, you can store leaving time of each agent into a collection. Then, you will need to iterate over the samples stored in the collection to find the difference between each pair of samples.
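A standalone sketch of the first approach, written in plain Java so that it can be tried outside a model. The DepartureTracker class is hypothetical; in AnyLogic its onExit logic would sit in the sink's "On exit" field with time() as the argument. Unlike the two-line snippet above, it does not record an interval for the very first departure, because there is no earlier departure to measure from:

```java
final class DepartureTracker {
    private double previousDeparture = Double.NaN; // NaN means no departure seen yet
    private double sumOfIntervals = 0.0;
    private int intervalCount = 0;

    /** Call once per departing agent with the current model time. */
    void onExit(double now) {
        if (!Double.isNaN(previousDeparture)) {
            sumOfIntervals += now - previousDeparture;
            intervalCount++;
        }
        previousDeparture = now;
    }

    /** Mean time between consecutive departures, or NaN before the second departure. */
    double meanTimeBetweenDepartures() {
        return intervalCount == 0 ? Double.NaN : sumOfIntervals / intervalCount;
    }

    /** Throughput estimate in agents per model time unit: the inverse of the mean interval. */
    double throughput() {
        return 1.0 / meanTimeBetweenDepartures();
    }
}
```

Over a long steady-state run, this estimate should approach the WIP divided by lead time figure from Little's law; during warm-up, or while the line is still filling, the two can differ.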
Not sure if this will help, but it builds on Tatiana's answer. In the agent's statechart you can create the variables TimeIn, TimeOut, and TimeInSystem. Then at the Statechart Entry Point have,
TimeIn = time();
And at the Final state have,
TimeOut = time();
TimeInSystem = TimeOut - TimeIn;
To observe these times for each individual agent you can use the following code,
System.out.println("I came in at " + TimeIn + " and exited at " + TimeOut + " and spent " + TimeInSystem + " seconds in the system");
Then, for statistical analysis, you can calculate the min, average, and max time in system over all agents. In Main, create the variables TotalTime, TotalAgentsServiced, AvgServiceTime, MaxServiceTime, and MinServiceTime, then add a function called, say, TrackAvgTimeInSystem with an argument NextAgent of type double. In the function body have,
TotalTime += NextAgent;
TotalAgentsServiced += 1;
AvgServiceTime = TotalTime/TotalAgentsServiced;
if(TotalAgentsServiced == 1)
{
MinServiceTime = NextAgent;
}
else if(NextAgent < MinServiceTime)
{
MinServiceTime = NextAgent;
}
if(NextAgent > MaxServiceTime)
{
MaxServiceTime = NextAgent;
}
Then within your agent's state charts, in the Final State call the function
get_Main().TrackAvgTimeInSystem(TimeInSystem);
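This then calculates the min, max, and average time in system over all agents (that is, the lead time, which you can combine with the WIP level through Little's law to get the throughput).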
I am looking for an algorithm that fairly samples p percent of users from an infinite list of users.
A naive algorithm looks something like this:
//This is naive.. what is a better way??
def userIdToRandomNumber(userId: Int): Float = (userId.toString.hashCode % 1000) / 1000.0f
//An event listener will call this every time a new event is received
def sampleEventByUserId(event: Event) = {
//Process all events for 3% percent of users
if (userIdToRandomNumber(event.user.userId) <= 0.03) {
processEvent(event)
}
}
There are issues with this code, though (hashCode may favor shorter strings, the modulo arithmetic discretizes the value so the sampled fraction is not exactly p, etc.).
What is the "more correct" way of finding a deterministic mapping of userIds to a random number for the function userIdToRandomNumber above?
Try the method(s) below instead of the hashCode. Even for short strings, the values of the characters as integers ensure that the sum goes over 100. Also, avoid the division, so you avoid rounding errors
def inScope(s: String, p: Double) = modN(s, 100) < p * 100
def modN(s: String, n: Int): Int = {
var sum = 0
for (c <- s) { sum += c }
sum % n
}
Here is a very simple mapping, assuming your dataset is large enough:
For every user, generate a random number x, say in [0, 1].
If x <= p, pick that user
This is a practically used method on large datasets, and gives you entirely random results!
I am hoping you can easily code this in Scala.
EDIT: In the comments, you mention deterministic. I am interpreting that to mean if you sample again, it gives you the same results. For that, simply store x for each user.
Also, this will work for any number of users (even infinite). You just need to generate x for each user once. The mapping is simply userId -> x.
EDIT2: The algorithm in your question is biased. Suppose p = 10%, and there are 1100 users (userIds 1-1100). The first 1000 userIds have a 10% chance of getting picked, the next 100 have a 100% chance. Also, the hashing will map user ids to new values, but there is still no guarantee that modulo 1000 would give you a uniform sample!
I have come up with a deterministic solution to randomly sample users from a stream that is completely random (assuming the random number generator is completely random):
def sample(x: AnyRef, percent: Double): Boolean = {
new Random(seed=x.hashCode).nextFloat() <= percent
}
//sample 3 percent of users
if (sample(event.user.userId, 0.03)) {
processEvent(event)
}
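An alternative that avoids both stored state and a seeded generator is to hash the user id with a cryptographic hash and scale the result to [0, 1). This is a different technique from the seeded Random above, shown here as a Java sketch; the HashSampler class and its method names are illustrative, and user ids are assumed to be available as strings:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HashSampler {
    /** Maps a user id to a fixed value in [0, 1) using the first 8 bytes of its SHA-256 digest. */
    static double unitInterval(String userId) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(userId.getBytes(StandardCharsets.UTF_8));
            long bits = 0L;
            for (int i = 0; i < 8; i++) {
                bits = (bits << 8) | (digest[i] & 0xFF);
            }
            // Keep the top 53 bits (what a double can represent exactly) and scale to [0, 1).
            return (bits >>> 11) / (double) (1L << 53);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }

    /** True for roughly 'fraction' of all user ids, and always the same answer for the same id. */
    static boolean sample(String userId, double fraction) {
        return unitInterval(userId) < fraction;
    }
}
```

Because the mapping depends only on the id, every event from a given user gets the same decision, and no per-user value needs to be stored.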
I have a project that requires simulating a market with 3 registers. Every second, some number of customers arrive at the registers, and we assume that each customer spends 4 seconds at a register before leaving. Now suppose the input lists every customer's arrival time, e.g. 0001122334455, which means that 3 customers arrive at second 0, 2 at second 1, etc. I need to find the total time required to serve all the customers, no matter how many there are, and also the average waiting time at the store.
Can someone come up with a pseudocode for this problem?
while(flag){
while(i<A.length-1){
if(fifo[tail].isEmpty()) fifo[tail].put(A[i] +4);
else{
temp= fifo[tail].peek();
fifo[tail].put(A[i]-temp+4);
i++;
}
if(tail == a-1){
tail=0;
}else tail++;
if(i>3){
for(int q =0; q<a; q++){
temp = fifo[q].peek();
if(temp==i){
fifo[q].get();
}
}
}
}
}
where A is the array containing the customers' arrival times from the input, and fifo is the array of registers, each with get, put, and peek (return the tail without removing it) methods. I have no clue, though, how to find the total time and the average waiting time.
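One possible way to get both numbers is to track when each register next becomes free, rather than modelling each queue explicitly. The Java sketch below rests on several assumptions that are not stated in the question: each character of the input is one customer's arrival second (a single digit), arrivals are listed in non-decreasing order, and each customer goes to whichever register frees up first. The class and field names are illustrative:

```java
import java.util.PriorityQueue;

final class RegisterSimulation {
    /** Holds the two requested figures. */
    static final class Result {
        final int totalTime;      // second at which the last customer finishes
        final double averageWait; // mean seconds spent waiting before service starts

        Result(int totalTime, double averageWait) {
            this.totalTime = totalTime;
            this.averageWait = averageWait;
        }
    }

    static Result simulate(String arrivals, int registers, int serviceTime) {
        // Min-heap of the times at which each register next becomes free.
        PriorityQueue<Integer> freeAt = new PriorityQueue<>();
        for (int r = 0; r < registers; r++) {
            freeAt.add(0);
        }
        int totalWait = 0;
        int lastFinish = 0;
        for (char c : arrivals.toCharArray()) {
            int arrival = c - '0';
            int start = Math.max(arrival, freeAt.poll()); // wait if no register is free yet
            int finish = start + serviceTime;
            totalWait += start - arrival;
            lastFinish = Math.max(lastFinish, finish);
            freeAt.add(finish);
        }
        double averageWait = arrivals.isEmpty() ? 0.0 : (double) totalWait / arrivals.length();
        return new Result(lastFinish, averageWait);
    }
}
```

Under these assumptions, simulate("0001122334455", 3, 4) finishes the last customer at second 20, with a total of 58 seconds of waiting spread over 13 customers.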
Recursion is still baffling me. I understand the basis of it and how it's supposed to work, but I am struggling with how to actually make it work. For my function, I'm given a cell array that has costume items and prices, as well as a budget (given as a double). I have to output a cell array of the items I can buy (in order from cheapest to most expensive) and output how much money I have leftover in my budget. There is a chance I will run out of money before I buy all of the items I need to, and a chance where I do buy everything I need. These would be my two terminating conditions. I have to use recursion and I am not allowed to use sort in this problem. So I am struggling a little. Mostly with figuring out the base case situation. I don't understand that bit. Or how to do recursion with two inputs and outputs. So basically my function looks like:
function[bought, money] = costumeParty(items, budget)
Here is what I have to output:
Test case:
Costume Items:
'Eyepatch' 8.94000000000000
'Adult-sized Teletubby Onesie' 2.89000000000000
'Cowboy Boots' 1.30000000000000
'Mermaid Tail' 1.75000000000000
'Life Vest' 8.10000000000000
'White Bedsheet With Eyeholes' 4.30000000000000
'Lizard Leggings' 0.650000000000000
'Gandalf Beard' 4.23000000000000
'Parachute Pants' 7.49000000000000
'Ballerina Tutu' 8.75000000000000
'Feather Boa' 1.69000000000000
'Groucho Glasses' 6.74000000000000
'80''s Leg Warmers' 5.08000000000000
'Cat Ear Headband' 6.36000000000000
'Ghostface Mask' 1.83000000000000
'Indoor Sunglasses' 2.25000000000000
'Vampire Fangs' 0.620000000000000
'Batman Utility Belt' 7.08000000000000
'Fairy Wand' 5.48000000000000
'Katana' 6.81000000000000
'Blue Body Paint' 5.70000000000000
'Superman Cape' 4.78000000000000
'Assorted Glow Sticks' 4.07000000000000
'Ash Ketchum''s Baseball Cap' 3.57000000000000
'Hipster Mustache' 6.47000000000000
'Camouflage Jacket' 8.73000000000000
'Two Chains Value Pack' 4.76000000000000
'Toy Pistol' 8.41000000000000
'Sushi Chef Headband' 2.59000000000000
'Pitchfork' 8.57000000000000
'Witch Hat' 4.27000000000000
'Dora''s Backpack' 4.13000000000000
'Fingerless Gloves' 0.270000000000000
'George Washington Wig' 7.35000000000000
'Clip-on Parrot' 4.32000000000000
'Christmas Stockings' 8.69000000000000
A lot of items sorry.
[costume1, leftover1] = costumeParty(costumeItems, 5);
costume1 => {'Fingerless Gloves'
'Vampire Fangs'
'Lizard Leggings'
'Cowboy Boots'
'Feather Boa' }
leftover1 => 0.47
What I have:
function[bought, money] = costumeParty(items, budget)
%// I commented these out, because I was unsure of using them:
%// item = [items(:,1)];
%// costumes = [item{:,:}];
%// price = [items{:,2}];
if budget == 0 %// One of the terminating conditions. I think.
money = budget;
bought ={};
%// Here is where I run into issues. I am trying to use recursion to find out the money leftover
else
money = costumeParty(items{:,2}) - costumeParty(budget);
%// My logic here was, costumeParty takes the second column of items and subtracts it from the budget, but it claims I have too many inputs. Any suggestions?
bought = {items(1,:)};
end
end
If I could get an example of how to do recursion with two inputs/outputs, that'd be great, but I couldn't seem to find any. Googling did not help. I'm just...baffled.
I did try to do something like this:
function[bought, money] = costumeParty(items, budget)
item = [items(:,1)];
costumes = [item{:,:}];
price = [items{:,2}];
if budget == 0
money = 0;
bought ={};
else
money = price - budget;
bought = {items(1,:)};
end
end
Unfortunately, that's not exactly recursive. Or, I don't think it is and that didn't really work anyway. One of the tricks to doing recursion is pretending the function is already doing what you want it to do (without you actually coding it in), but how does that work with two inputs and outputs?
Another attempt, because I'm going to figure this darn thing out somehow:
function[bought, money] = costumeParty(items, budget)
price = [items{:,2}]; %// Gives me the prices in a 1x36 double
if price <= budget %// If the price is less than the budget (which my function should calculate) you populate the list with these items
bought = [costumeParty(items,budget)];
else %// if not, keep going until you run out of budget money. Or something
bought = [costumeParty(items{:,2},budget)];
end
I think I need to figure out how to sort the prices first. Without using the sort function. I might just need a whole lesson on recursion. This stuff confuses me. I don't think it should be this hard .-.
I think I'm getting closer!
function[bought, money] = costumeParty(items, budget)
%My terminating conditions are when I run out of the budget and when buying
%the next item, would break my budget
price = [items{:,2}];
Costumes = [items(:,1)];
[~,c] = size(price);
bought = {};
Locate = [];
List = [];
for j = 1:c %// Need to figure out what to do with this
[Value, IND] = min(price(:));
List = [List price(IND)];
end
while budget >= 0
if Value < budget
bought = {Costumes(IND)};
money = budget - price(IND);
elseif length(Costumes) == length(items)
bought = {Costumes(IND)};
money = budget - price(IND);
else
bought=43; %// Arbitrary, ignore
budget = budget - price;
end
budget = budget - price;
end
duck = 32; %// Arbitrary, ignore
As I understand the question, recursion is needed to sort the items array; once you have a sorted array, you can decide which items, and how many, can be bought with the budget you have.
Therefore, you need to implement a classic recursive sorting algorithm. You can find several online; the idea is to split the list into sublists, sort each of them the same way, and so on.
Once that is implemented, you need a budget threshold that stops the buying.
Another approach is closer to what you started with and keeps the two inputs (items and budget). On every call, scan the whole list for the cheapest item, cross it off the list, and pass the next call an item list without that item and a budget reduced by that item's price. I don't see the need for recursion in this approach, though, since loops are more than enough here.
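To make the sort-then-threshold idea concrete, here is a sketch in Java rather than MATLAB. It uses merge sort as one example of a classic recursive sort, and the Item and Purchase classes are hypothetical containers for this illustration. The structure (base case, split, recurse, merge) carries over to MATLAB directly:

```java
import java.util.ArrayList;
import java.util.List;

final class BudgetShopper {
    static final class Item {
        final String name;
        final double price;
        Item(String name, double price) { this.name = name; this.price = price; }
    }

    /** Two outputs bundled together: what was bought and what is left of the budget. */
    static final class Purchase {
        final List<String> bought = new ArrayList<>();
        double leftover;
    }

    /** Recursive merge sort by price, without any library sort. */
    static List<Item> mergeSort(List<Item> items) {
        if (items.size() <= 1) {
            return new ArrayList<>(items); // base case: 0 or 1 items are already sorted
        }
        int mid = items.size() / 2;
        List<Item> left = mergeSort(items.subList(0, mid));
        List<Item> right = mergeSort(items.subList(mid, items.size()));
        List<Item> merged = new ArrayList<>(items.size());
        int l = 0, r = 0;
        while (l < left.size() && r < right.size()) {
            merged.add(left.get(l).price <= right.get(r).price ? left.get(l++) : right.get(r++));
        }
        while (l < left.size()) merged.add(left.get(l++));
        while (r < right.size()) merged.add(right.get(r++));
        return merged;
    }

    /** Buys from the cheapest item upward until the next one no longer fits the budget. */
    static Purchase buy(List<Item> items, double budget) {
        Purchase p = new Purchase();
        p.leftover = budget;
        for (Item item : mergeSort(items)) {
            if (item.price > p.leftover) break; // budget threshold reached
            p.leftover -= item.price;
            p.bought.add(item.name);
        }
        return p;
    }
}
```

The two terminating conditions from the question show up as the end of the loop (everything was bought) and the break (the next item would exceed the budget).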
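Edit: Code:
This is a sketch of the second approach (scan for the cheapest item, cross it off, recurse with the reduced budget). I have not run it, so treat it as an outline of the idea rather than a tested solution.
function main(items, budget)
% items is an N-by-2 cell array: name in column 1, price in column 2
[bought, moneyLeft] = itemslist(items, budget);
for i = 1:length(bought)
    disp(bought{i})
end
disp('Money left:');
disp(moneyLeft);
end
function [bought, moneyLeft] = itemslist(items, budget)
% Base cases: nothing left to buy, or the cheapest remaining item is too expensive
bought = {};
moneyLeft = budget;
if isempty(items)
    return
end
[minVal, minInd] = findmin([items{:,2}]);
if budget >= minVal
    name = items(minInd, 1);
    items(minInd, :) = [];                                  % cross the item off the list
    [rest, moneyLeft] = itemslist(items, budget - minVal);  % recurse on what remains
    bought = [name; rest];                                  % cheapest item first
end
end
function [minVal, minInd] = findmin(prices)
% Linear scan for the lowest price; start from the first element, not from 0
minVal = prices(1);
minInd = 1;
for i = 2:length(prices)
    if prices(i) < minVal
        minVal = prices(i);
        minInd = i;
    end
end
end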