Time taken by a reworked product - AnyLogic

I am building a fairly simple model to record the time a product takes to be manufactured.
I need guidance on how to segregate "virgin products" from "reworked products".
I have around four rework loops, and a product might visit one of them or all four. I am not sure how to classify an "agent" that enters a rework loop in AnyLogic. I can measure individual rework times per block, but I cannot get an overall view of how many agents entered rework at all and how many did not. I am using a DES approach.
Thanks

You need to turn your products into individual agents that flow through the DES blocks. Then you can access each individual product, flag it as "reworked", log individual times, and so on.
Suggest you check out a few DES example models; most use agents flowing through the blocks, so you can see how it is done.
btw, this flexibility is one of AnyLogic's most powerful features, and one its competitors cannot match, so it is worth learning :-)

Related

Evaluating an NLP classifier with annotated data

If we want to evaluate an NLP classifier with data annotated by two annotators who do not completely agree on the annotations, what is the procedure?
That is, should we compare the classifier output with just the portion of the data the annotators agreed on? Or with just one annotator's data? Or with both of them separately and then compute the average?
Taking the majority vote between annotators is common. Throwing out disagreements is also done.
Here's a blog post on the subject:
Suppose we have a bunch of annotators and we don’t have perfect agreement on items. What do we do? Well, in practice, machine learning evals tend to either (1) throw away the examples without agreement (e.g., the RTE evals, some biocreative named entity evals, etc.), or (2) go with the majority label (everything else I know of). Either way, we are throwing away a huge amount of information by reducing the label to artificial certainty. You can see this pretty easily with simulations, and Raykar et al. showed it with real data.
What's right for you depends heavily on your data and how the annotators disagree; for starters, why not evaluate only on the items they agree on, and then separately look at how the model's output compares on the items they didn't agree on?

AnyLogic: modeling a parking lot

I am trying to model a parking lot using the Road Traffic Library. However, the examples, mostly the gas station, only give me an idea of queueing problems, not a parking lot.
Can anyone guide me on how to set up the parking lot problem in AnyLogic? I.e. vehicles come into the parking lot, randomly select a space, hold the space for some period, and then exit.
I am really confused about how to set up the parking lot spaces in the model.
Thank you in advance.
I don't think you'll find an object that does that directly.
I'd probably start by defining the lot with nodes and paths and modelling the spaces as a resource: when a car comes in, it requests a resource from the lot pool.
To release it again you have different options. The easy way is just a random delay before the space is released; if you want to release a specific car, you'll need a function to pick it out.

Can I use Apache Mahout Taste for User Preferences matching?

I am trying to match objects based on predefined user preferences. A simple example would be finding the best matching vehicle.
Let's say a user 'Tom' is offered a rental vehicle for travel based on his predefined preferences. In this case, the predefined user preferences would be:
** Pre-defined user preferences for Tom:
PreferredVehicle (Make='ANY', Type='3-wheeler/4-wheeler',
Category='Sedan/Hatchback', AC/Non-AC='AC')
** while the 10 available vehicles are -
Vehicle1(Make='Toyota', Type='4-wheeler', Category='Hatchback', AC/Non-AC='AC')
Vehicle2(Make='Tata', Type='3-wheeler', Category='Transport', AC/Non-AC='Non-AC')
Vehicle3(Make='Honda', Type='4-wheeler', Category='Sedan', AC/Non-AC='AC')
...
and so on up to 'Vehicle10'.
All I want to do is choose a vehicle for Tom that best matches his preferences, and probably also give him choices in order, i.e. best match first.
Questions I have :
Can this be done with Mahout Taste?
If yes, can someone please point me to some example code where I can start quickly?
A recommender may not be the best tool for the job here, for a few reasons. First, I don't expect that the best answers are all that personal in this domain. If I wanted a Ford Focus, the best alternative you have is likely about the same for most every user. Second, there is not much of a discovery problem here. I'm searching for a vehicle that meets certain needs; I don't particularly want or need to find new and unknown vehicles, like I would for music. Finally you don't have much data per user; I assume most users have never rented before, and very few have even 3+ rentals.
Can you throw this data at a recommender anyway? Sure, try Mahout Taste (I'm the author). If you have the book Mahout in Action it will walk you through it. Since it's non-rating data, I can also recommend the successor project, Myrrix (http://myrrix.com) as it will be easier to set up and run. You can at least evaluate the results to see if it's anywhere near useful.
Either way, your work will just be to make a CSV file of "userID,vehicleID" pairs from your data and feed it in. Then it will give you vehicle IDs as recommendations for any user ID.
But, I imagine you will do much better to analyze what people picked when the car they wanted wasn't available, look at the difference, learn which attributes they are most and least likely to sacrifice, and learn to score the alternatives that way. This is entirely feasible since this data set is small, and because you have rich item attribute data.

Simulation with Arena

I want to simulate a supermarket with Arena to find the proper number of cashiers the market needs.
I want to start the simulation with one cashier, then increase the number of cashiers in subsequent simulations until the utilization of the cashiers is less than 70%.
Each cashier is a "resource module" and has a "process module" for its service time.
Should I make a separate model for each number of cashiers (for example, a model for a supermarket with one cashier, another model for a supermarket with two cashiers, and so on), or is there a better way?
It's a little more advanced, but it sounds like Arena's Process Analyzer would help you determine the number of cashiers needed.
The Process Analyzer assists in the evaluation of alternatives presented by the execution of different simulation model scenarios. This is useful to simulation model developers, as well as decision-makers.
The Process Analyzer is focused at post-model development comparison of models. The role of the Process Analyzer then is to allow for comparison of the outputs from validated models based on different model inputs.
via pelincec.isep.pw.edu.pl/doc/Simulation_Warsaw%20Part%205.pdf
A Google search for Arena Process Analyzer provides plenty of lecture notes, book references and examples:
https://www.google.com/search?q=arena+process+analyzer
Also, it sounds like this model isn't very complicated, so although it may be tedious, it'll probably be quicker to alter your model and run n simulations for each solution {1 cashier, 2 cashiers, ...}.
Also, if the model is indeed pretty simple, why not create multiple independent models in the same simulation file. For instance, one simulation file has three independent models of 1, 2 and 3 cashiers. The next has 4, 5 and 6 cashiers and so on. This would consolidate the statistics a little more and make analysis easier.
There are several ways to do this without making multiple models. A cashier is simply a resource, but it could also be an entity.
You can build your model to require a customer to be processed only when two entities are available: a register entity and a cashier entity. This could be done with a batch module.
Cashier entities would be created according to the schedule you would like to test, from minimum cashier availability to full cashier availability.
Register entities would probably be held constant, but you could make them variable according to a schedule, too.
Your batched entity would then go through the process module until a schedule calls for the cashier to "leave" the system; at that point you split the batch and destroy the cashier entity. The register entity loops back to the batch module to be grouped with another cashier or to wait.

How do I adapt my recommendation engine to cold starts?

I am curious what methods / approaches exist to overcome the "cold start" problem: when a new user or item enters the system, making recommendations is difficult due to the lack of information about this new entity.
I can think of making prediction-based recommendations (based on gender, nationality and so on).
You can cold start a recommendation system.
There are two types of recommendation systems: collaborative filtering and content-based. Content-based systems use metadata about the things you are recommending; the question is then which metadata is important. The second approach, collaborative filtering, doesn't care about the metadata: it just uses what people did or said about an item to make a recommendation. With collaborative filtering you don't have to worry about which terms in the metadata are important; in fact, you don't need any metadata at all to make the recommendation. The problem with collaborative filtering is that you need data.
Before you have enough data you can use content-based recommendations. You can provide recommendations based on both methods: at the beginning make them 100% content-based, then as you get more data start to mix in collaborative filtering.
That is the method I have used in the past.
Another common technique is to treat the content-based portion as a simple search problem: you put the metadata in as the text or body of your documents and then index them. You can do this with Lucene & Solr without writing any code.
If you want to know how basic collaborative filtering works, check out Chapter 2 of "Programming Collective Intelligence" by Toby Segaran
Maybe there are times you just shouldn't make a recommendation? "Insufficient data" should qualify as one of those times.
I just don't see how prediction recommendations based on "gender, nationality and so on" will amount to more than stereotyping.
IIRC, places such as Amazon built up their databases for a while before rolling out recommendations. It's not the kind of thing you want to get wrong; there are lots of stories out there about inappropriate recommendations based on insufficient data.
I'm working on this problem myself, but this paper from Microsoft on Boltzmann machines looks worthwhile: http://research.microsoft.com/pubs/81783/gunawardana09__unified_approac_build_hybrid_recom_system.pdf
This has been asked several times before (naturally, I cannot find those questions now :/), but the general conclusion was that it's better to avoid such recommendations. In various parts of the world the same names belong to different sexes, and so on...
Recommendations based on "similar users liked..." clearly must wait. You can give out coupons or other incentives to survey respondents if you are absolutely committed to doing predictions based on user similarity.
There are two other ways to cold-start a recommendation engine.
Build a model yourself.
Get your suppliers to fill key information into a skeleton model. (This also may require $ incentives.)
Lots of potential pitfalls in all of these, which are too common sense to mention.
As you might expect, there is no free lunch here. But think about it this way: recommendation engines are not a business plan. They merely enhance the business plan.
There are three things needed to address the Cold-Start Problem:
The data must have been profiled such that you have many different features (with product data the term used for 'feature' is often 'classification facets'). If you don't properly profile data as it comes in the door, your recommendation engine will stay 'cold' as it has nothing with which to classify recommendations.
MOST IMPORTANT: You need a user-feedback loop with which users can review the personalization engine's suggestions. For example, a Yes/No button for "Was This Suggestion Helpful?" should queue a review that moves the item from one training dataset (i.e. the "Recommend" training dataset) to the other (i.e. the "DO NOT Recommend" training dataset); see the sketch after this list.
The model used for (Recommend/DO NOT Recommend) suggestions should never be considered a one-size-fits-all recommendation. In addition to classifying the product or service to suggest to a customer, how the firm classifies each specific customer matters too. If it is functioning properly, you should expect that customers with different features will get different (Recommend/DO NOT Recommend) suggestions in a given situation. That is the 'personalization' part of personalization engines.