Architecture/Optimization of a scheduling problem on OptaPlanner - drools

I'd like to generate a plan with OptaPlanner for the following problem:
"Task" is the planning entity
It has a fixed duration
It has an employee (planning variable) which must match required skills
It has a workstation (planning variable) which must match required attributes
It has dependencies: some tasks must start after some others
Some tasks have a deadline
At first, I tried chained/shadow variables, based on the OptaPlanner Task Assignment example. On my first attempt, I kept the employee as the anchor and left workstation management to the solver:
I quickly saw a problem with this approach. Task A and Task B (and all the others) influence one another even though they are not on the same chain, and the previous element of a chain does not carry enough information to determine a task's start time. Worse, workstation changes are not tracked in this model, so the solutions are nonsense: the same workstation can be used by several tasks at once.
To fix the workstation tracking problem, I added workstations to the anchor and created an anchor for every employee/workstation combination.
This way I know the employee and workstation of each chain. However, this does not solve the problem that the start time of a task on one chain depends on tasks on other chains (e.g. Task A and Task D share the same workstation). Task insertions and removals would therefore have repercussions on other chains, which is not in the spirit of chained variables.
I ended up giving up on chained variables, as they do not seem to fit my problem.
So I modified my classes, and the start time of each task is now chosen entirely by OptaPlanner's solver, driven by pure Drools rules and a ValueRangeProvider.
When I have only the following rules:
No employee overlap (Hard)
No workstation overlap (Hard)
Skills and attributes requirements (Hard)
Tasks with deadlines ending before deadline (Soft, could be Medium)
Tasks ending as soon as possible (sum of squared end times, Soft)
I quickly get a solution that appears to be the best one.
However, when I add dependencies between tasks (with a hard rule that walks down a task's dependencies and checks that none of them ends after the task starts), the complexity seems to increase dramatically. For a few dozen tasks with only 2 operators and 3 workstations, a satisfying solution (one without unexpected holes between tasks) can take an hour to appear, with the following parameters:
Experiment length: 10,000 min
Granularity: 1 min
I also have a TaskDifficultyComparator to help the solver place the hardest tasks first.
This is very long for so few tasks, and it will get far worse once I add user availability. I even suspect it will never converge, since task end times will "jump" depending on availabilities.
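To make the dependency rule above concrete, here is a minimal plain-Java sketch of the check it performs. It is not my actual Drools rule and uses no OptaPlanner API: SketchTask, DependencyPenalty and hardPenalty are hypothetical names, and penalizing by minutes of overlap (rather than a flat count per broken dependency) is an assumption. It only looks at direct dependencies, since with non-negative durations the transitive ones are satisfied whenever every direct one is.

```java
import java.util.List;

// Hypothetical, simplified stand-in for the Task planning entity (not OptaPlanner API).
final class SketchTask {
    final String name;
    final int startMinute;      // start time picked from the value range (1 min granularity)
    final int durationMinutes;  // fixed duration
    final List<SketchTask> dependencies; // tasks that must end before this one starts

    SketchTask(String name, int startMinute, int durationMinutes, List<SketchTask> dependencies) {
        this.name = name;
        this.startMinute = startMinute;
        this.durationMinutes = durationMinutes;
        this.dependencies = dependencies;
    }

    int endMinute() {
        return startMinute + durationMinutes;
    }
}

final class DependencyPenalty {
    // Sums, over every direct dependency, the minutes by which the dependency
    // ends after the dependent task starts. Zero means the hard rule is satisfied.
    static int hardPenalty(List<SketchTask> tasks) {
        int penalty = 0;
        for (SketchTask task : tasks) {
            for (SketchTask dependency : task.dependencies) {
                penalty += Math.max(0, dependency.endMinute() - task.startMinute);
            }
        }
        return penalty;
    }
}
```

For example, if Task A runs from minute 0 to 30 and Task B depends on A but starts at minute 20, the sketch reports a penalty of 10.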
So my questions are:
Are there solutions leveraging chained variables that would suit my problem?
Are there any optimizations on the solver/rules/something else that could grant me a precious speed-up?

Related

Mutually Exclusive Bitbake Recipes/Tasks

I have several recipes whose do_compile task uses a lot of memory (lots of C++ templates). If I build the recipes at the same time, they exhaust the host machine's memory and the out-of-memory killer starts killing processes.
I've modified BB_NUMBER_PARSE_THREADS, BB_NUMBER_THREADS, PARALLEL_MAKE, and PARALLEL_MAKEINST numerous times, but it is not feasible to pick numbers for these variables that'll work well in all situations. For example, if I set BB_NUMBER_THREADS to 1 in order to get only one of these recipes to build at a time, I end up increasing the build time a lot when there are no changes (everything can be pulled from the cache). I don't feel like those are the right solution to my problem.
Is there any way to tell bitbake to only build one of these recipes' do_compile tasks at a time, but let other recipes' tasks build normally?
It isn't quite the answer you're looking for but you could have something like:
do_compile[lockfiles] = "${WORKDIR}/mylock"
which would require the task to take and hold the lock to execute, then you can be sure only one would run at a time.

Considering differences in the same materials - Stations

I am trying to simulate a manufacturing assembly process in which material items are processed along a single line. At the quality control stations, when a failure is detected, the object is sent to the repair area (by a pallet truck) and, once it is repaired, the same resource picks it up and puts it back at the start of the line. I have already modeled everything up to this point.
The problem is the following: a repaired object has to follow the same conveyor but without stopping at the stations. The object must enter a station and leave it with no delay (as the work done at the stations has already been completed).
I thought the best would be to consider the difference (repaired vs. not repaired) in the Agent, but then I can't work with it in the Main agent... I have also tried alternative solutions, for example defining variables in each station and using them in that station's delay and in the following station:
triangular( 109.1*delay_bec, delaytime_bec*delay_bec, 307.6*delay_bec)
Actions - in finished process:
if (delay_bec == 0) {
    delay_headers = 0;
    delay_bec = 1;
}
But nothing worked... Any help?
I thought the best would be to consider the difference (repaired vs. not repaired) in the Agent, but then I can't work with it in the Main agent...
This is actually the correct approach. You need to create a custom agent type, give it a boolean variable isRepaired (or similar), and then in the delay you can dynamically adjust the duration using that characteristic:
agent.isRepaired ? 0. : 100
This will delay normal agents by 100 time units and repaired agents not at all.
Obviously, you must make sure that the agents flowing through the flow blocks are of your custom agent type (see the help for how to do that).
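As a rough illustration of the idea, here is a plain-Java sketch that runs outside AnyLogic. MaterialItem, StationDelay, delayFor and sampleTriangular are hypothetical names; sampleTriangular is a hand-written sampler, not AnyLogic's built-in triangular function, and the mode of 200.0 is made up because the question does not give the value of delaytime_bec.

```java
import java.util.Random;

// Hypothetical custom agent type carrying the repaired flag.
final class MaterialItem {
    boolean isRepaired;

    MaterialItem(boolean isRepaired) {
        this.isRepaired = isRepaired;
    }
}

final class StationDelay {
    // Hand-written triangular sampler (inverse CDF), argument order: min, mode, max.
    static double sampleTriangular(double min, double mode, double max, Random rng) {
        double u = rng.nextDouble();
        double cut = (mode - min) / (max - min);
        if (u < cut) {
            return min + Math.sqrt(u * (max - min) * (mode - min));
        }
        return max - Math.sqrt((1.0 - u) * (max - min) * (max - mode));
    }

    // Delay time for one station: repaired items pass straight through,
    // all other items get the usual stochastic processing time.
    static double delayFor(MaterialItem agent, Random rng) {
        // 200.0 is an assumed mode; replace it with the real station value.
        return agent.isRepaired ? 0.0 : sampleTriangular(109.1, 200.0, 307.6, rng);
    }
}
```

In the model itself, the same ternary expression would go into the delay time field of each station's block, with agent referring to the item currently in the block.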

How to stop timeout in service block

I am modeling a ticket system with various SLAs. The model must contain several service blocks with different reaction times (from 2 to 32 hours). Only working hours should be taken into account in a service block, so its timeout should stop counting during non-working hours and on weekends. Could you please tell me how I can implement this?
Thank you very much in advance!
I can think of two answers, one simplified but works in many cases, the other more advanced and probably more accurate:
Simplified approach: I would set the model in hours and keep everything running as is without any stop. So, at the end of the simulation, if the total time is 100 hours and you know that you have 8 hours/day with 5 days/week, then you'd know the total duration is 2.5 weeks. Of course, this might have limitations or might become more complex later on if you want day-specific actions (e.g. you want to differentiate between Monday, Tuesday, etc.)
Advanced, more accurate approach: Create resources whose capacities are defined by a schedule and assign them to your services. Create a schedule and specify the working hours in that schedule. Check the link below to learn more about schedules. I call this the more advanced approach because you need to make sure the schedule is defined correctly and that all elements in the model are properly controlled (e.g. non-service blocks such as sources, delays, etc.).
https://help.anylogic.com/topic/com.anylogic.help/html/data/schedule.html?resultof=%22%73%63%68%65%64%75%6c%65%73%22%20%22%73%63%68%65%64%75%6c%22%20
I personally would use the first approach if the model is rather simple and modeling working hours is enough for analysis. Otherwise, I'd go for option 2.
Finally, another option I'd like to highlight is the "suspend/resume" functions. I am only adding this because you asked "how to stop timeout". So these functions specifically stop and resume timeout. But you'll need to define the times at which they are executed (through an event for example).
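Whichever option you pick, the underlying arithmetic is "count only working time". Here is a minimal plain-Java sketch of that calculation, independent of AnyLogic. WorkingHoursClock and deadline are hypothetical names, and the Monday to Friday, 09:00 to 17:00 calendar is an assumption you would replace with your own schedule.

```java
import java.time.DayOfWeek;
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;

final class WorkingHoursClock {
    // Assumed working calendar: Monday to Friday, 09:00 to 17:00.
    static final LocalTime OPEN = LocalTime.of(9, 0);
    static final LocalTime CLOSE = LocalTime.of(17, 0);

    static boolean isWorkingDay(LocalDateTime t) {
        DayOfWeek day = t.getDayOfWeek();
        return day != DayOfWeek.SATURDAY && day != DayOfWeek.SUNDAY;
    }

    // Returns the moment at which `sla` of working time has elapsed since `start`.
    // The timer is effectively paused outside working hours and on weekends.
    static LocalDateTime deadline(LocalDateTime start, Duration sla) {
        LocalDateTime cursor = start;
        Duration remaining = sla;
        while (true) {
            LocalDateTime open = cursor.toLocalDate().atTime(OPEN);
            LocalDateTime close = cursor.toLocalDate().atTime(CLOSE);
            if (isWorkingDay(cursor) && cursor.isBefore(close)) {
                LocalDateTime from = cursor.isBefore(open) ? open : cursor;
                Duration available = Duration.between(from, close);
                if (remaining.compareTo(available) <= 0) {
                    return from.plus(remaining);
                }
                remaining = remaining.minus(available);
            }
            // Nothing (more) can be counted today: jump to the next day's opening time.
            cursor = cursor.toLocalDate().plusDays(1).atTime(OPEN);
        }
    }
}
```

With these assumptions, a ticket with a 4-hour SLA that arrives on Friday 2024-01-05 at 15:00 is due on Monday 2024-01-08 at 11:00.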

Scheduling variable-sized work items efficiently

(I have also posted this question at math.stackexchange.com because I'm not sure where it should belong.)
I have a system with the following inputs:
Set of work items to be completed. These are variable sized. They do not have to be completed in any particular order.
Historical data as to how long work items have taken to complete in the past. However, past performance is no guarantee of future success! That is, once we come to actually execute a work item, we may find that it takes longer or shorter than it has previously.
There can be work items that I have never seen before and hence have no historical data about.
Work items further have a "classification" of "parallel" or "serial".
Set of "agents" which are capable of picking up a work item and working on it. The number of agents is fixed and known in advance. An agent can only work on one work item at a time.
Set of "servers" against which the agents execute work items. Servers have different capabilities. Specifically, they are capable of handling different numbers of agents simultaneously.
Rules:
If a server is being used to execute a "serial" work item, it cannot simultaneously be used to execute any other work item.
Provided a server isn't being used to execute any "serial" work items, it can simultaneously handle as many agents as it is capable of, all executing "parallel" work items.
There are a handful of work items which must be executed against a specific server (although any agent can do that). These work items are "parallel", if that matters. (It may be easier to ignore this rule for now!)
Requirement:
Given the inputs and rules above, I need to execute the set of work items "as quickly as possible". Since we cannot know how long a work item will take until it is complete, we cannot possibly hope to derive a perfect solution up front (I suppose), so "as quickly as possible" means not manifestly doing something stupid like just using one agent to execute each work item one by one!
Historically, I've had a very simple round-robin algorithm and simply sorted the work items by descending historical duration such that the longest running work items get scheduled sooner and, hopefully, at the end of the cycle I'm able to keep all agents and servers reasonably well loaded with short-duration work items. This has resulted in a pretty good "square" shape to the utilization graph with no long tail of long-duration work items hanging around at the end of the cycle.
This historical algorithm, however, has required me to pre-configure the number of agents and servers and pre-allocate work items to "pools" and assign pools to servers, and lots of other horrible stuff. I now need to support a dynamic number of agents and servers without having to reconfigure things. (Note that the number of servers will be fixed during a cycle - that is, the number will only change between cycles - but the number of agents may increase or decrease in the middle of the cycle.)
Once all work items are complete, we record how long each work item took to feed in to the next cycle and start again from the beginning!
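To make the rules concrete, here is a minimal Java sketch of one possible pull-based dispatcher; it is not the round-robin algorithm I have been using. All names (WorkItem, Server, Assignment, Dispatcher) are hypothetical. It assumes that never-seen work items are given a very large expected duration so they are scheduled first, and it makes no attempt to stop a serial item from waiting while parallel work keeps every server occupied. Because idle agents ask for work themselves, the number of agents never has to be configured.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class WorkItem {
    final String id;
    final boolean serial;         // true = "serial", false = "parallel"
    final double expectedSeconds; // historical duration; assumed huge when never seen
    final String requiredServer;  // null when any server will do

    WorkItem(String id, boolean serial, double expectedSeconds, String requiredServer) {
        this.id = id;
        this.serial = serial;
        this.expectedSeconds = expectedSeconds;
        this.requiredServer = requiredServer;
    }
}

final class Server {
    final String name;
    final int capacity;  // how many agents the server can handle at once
    int active;          // agents currently executing on this server
    boolean serialBusy;  // true while a serial item holds the server

    Server(String name, int capacity) {
        this.name = name;
        this.capacity = capacity;
    }

    boolean canTake(WorkItem item) {
        if (serialBusy) {
            return false; // a serial item excludes everything else
        }
        return item.serial ? active == 0 : active < capacity;
    }
}

final class Assignment {
    final WorkItem item;
    final Server server;

    Assignment(WorkItem item, Server server) {
        this.item = item;
        this.server = server;
    }
}

final class Dispatcher {
    private final List<WorkItem> pending = new ArrayList<>();
    private final List<Server> servers;

    Dispatcher(List<WorkItem> items, List<Server> servers) {
        pending.addAll(items);
        // Longest expected duration first, as in the historical approach.
        pending.sort(Comparator.comparingDouble((WorkItem w) -> w.expectedSeconds).reversed());
        this.servers = servers;
    }

    // Called by any agent that becomes idle; returns the longest pending item
    // that some server can accept right now, or empty if nothing fits.
    synchronized Optional<Assignment> next() {
        for (int i = 0; i < pending.size(); i++) {
            WorkItem item = pending.get(i);
            for (Server server : servers) {
                boolean allowed = item.requiredServer == null || item.requiredServer.equals(server.name);
                if (allowed && server.canTake(item)) {
                    pending.remove(i);
                    server.active++;
                    server.serialBusy = item.serial;
                    return Optional.of(new Assignment(item, server));
                }
            }
        }
        return Optional.empty();
    }

    // Called by the agent when the work item has finished.
    synchronized void done(Assignment assignment) {
        assignment.server.active--;
        if (assignment.item.serial) {
            assignment.server.serialBusy = false;
        }
    }
}
```

An agent that joins in the middle of a cycle simply starts calling next(), and one that leaves just stops calling it after reporting done() for its current item.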

WF performance with new 20,000 persisted workflow instances each month

Windows Workflow Foundation has a problem: it is slow when persisting WF instances.
I'm planning a project whose business layer will be based on WF exposed as WCF services. The project will have 20,000 new workflow instances created each month, and each instance could take up to 2 months to finish.
I was led to believe that, given WF's slowness when persisting, my project would be unattainable for performance reasons.
I have the following questions:
Is this true? Will my performance be crap with that load (given WF persistence speed limitations)?
How can I solve the problem?
We currently have two possible solutions:
1. Each new business process request (e.g. Give me a new driver's license) will be a new WF instance, and the number of persistence operations will be limited by forwarding all status request operations to saved state values in a separate database.
2. Have only a small number of workflow instances up at any given time, without any persistence whatsoever (only in case of system crashes etc.), by breaking each workflow step into a separate workflow, with that workflow handling every business process request instance in the system that is currently at that step (e.g. I'm submitting my driver's license request form, which is step one... we have 100 cases of that, and my step one workflow will handle every case simultaneously).
I'm very interested in a solution to this problem. If you want to discuss it, please feel free to mail me at nstjelja#gmail.com
The number of hydrated, executing workflows will be determined by environmental factors (memory, server throughput, etc.). Persistence issues really only come into play if you are loading and unloading workflows all the time, i.e. in real(ish) time; in that case workflow may not be the best solution.
In my current project we also use WF with persistence. We don't have quite the same volume (perhaps ~2000 instances/month), and they are usually not as long to complete (they are normally done within 5 minutes, in some cases a few days). We did decide to split up the main workflow in two parts, where the normal waiting state would be. I can't say that I have noticed any performance difference in the system due to this, but it did simplify it, since our system sometimes had problems matching incoming signals to the correct workflow instance (that was an issue in our code; not in WF).
I think that if I were to start a new project based on WF I would rather go for smaller workflows that are invoked in sequence, than to have big workflows handling the full process.
To be honest I am still investigating the performance characteristics of workflow foundation.
However if it helps, I have heard the WF team have made many performance improvements with the new release of WF 4.
Here are a couple of links that might help (if you haven't seen them already):
A Developer's Introduction to Windows Workflow Foundation (WF) in .NET 4 (discusses performance improvements)
Performance Characteristics of Windows Workflow Foundation (applies to WF 3.0)
WF on 3.5 had a performance problem. WF4 does not - 20000 WF instances per month is nothing. If you were talking per minute I'd be worried.