I'm trying to clean up an enterprise BI system that currently uses a prioritized FIFO scheduling algorithm (so a priority 4 report from Tuesday will be executed before priority 4 reports from Thursday and priority 3 reports from Monday). Additional details:
The queue is never empty, jobs are always being added
Jobs range in execution time from under a minute to upwards of 24 hours
There are 40-some identical app servers used to execute jobs
I think I could get OptaPlanner up and running for this scenario, with hard rules around priority and some soft rules around average time in the queue. I'm new to scheduling optimization, so my question is: what should I be looking for in this situation to decide whether OptaPlanner is going to help me or not?
The problem looks like a form of bin packing (and possibly job shop scheduling), which are NP-complete, so OptaPlanner will do better than a FIFO algorithm.
But is it really NP-complete? If all of these conditions are met, it might not be:
All 40 servers are identical. So running a priority report on server A instead of server B won't deliver a report faster.
All 40 servers are identical. So total duration (for a specific input set) is a constant.
Total makespan doesn't matter. So given 20 small jobs of 1 hour, 1 big job of 20 hours, and 2 machines, it's fine if all the small jobs are done after 10 hours and only then the big job starts, giving a total makespan of 30 hours. There's no desire to reduce the makespan to 20 hours.
"the average time in the queue" is debatable: do you care about how long the jobs are in the queue until they are started or until they are finished? If the total duration is a constant, this can be done by merely FIFO'ing the small jobs first or last (while still respecting priority of course).
There are no dependencies between jobs.
If all these conditions are met, OptaPlanner won't be able to do better than a correctly written greedy algorithm (which schedules the highest priority job that is the smallest/largest first). If any of these conditions aren't met (for example you buy 10 new servers which are faster), then OptaPlanner can do better. You just have to evaluate if it's worth spending 1 thread to figure that out.
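For illustration, here is a minimal sketch of such a greedy algorithm in Python. It assumes job durations are known (or estimated) up front and that a higher priority number means more urgent, as in the question. The names Job and greedy_schedule are made up for this example and are not part of OptaPlanner.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    priority: int     # higher number = more urgent (assumption taken from the question)
    duration: float   # estimated run time in hours (assumed known up front)


def greedy_schedule(jobs, server_count, smallest_first=True):
    """Return {job name: (server index, start, finish)} for identical servers."""
    sign = 1 if smallest_first else -1
    # Highest priority first; within a priority, shortest (or longest) job first.
    ordered = sorted(jobs, key=lambda j: (-j.priority, sign * j.duration))

    # Min-heap of (time the server becomes free, server index).
    servers = [(0.0, i) for i in range(server_count)]
    heapq.heapify(servers)

    plan = {}
    for job in ordered:
        free_at, idx = heapq.heappop(servers)   # earliest available server
        finish = free_at + job.duration
        plan[job.name] = (idx, free_at, finish)
        heapq.heappush(servers, (finish, idx))
    return plan


# The example from above: 20 small jobs of 1 hour, 1 big job of 20 hours, 2 machines.
jobs = [Job(f"small-{i}", 4, 1.0) for i in range(20)] + [Job("big", 4, 20.0)]
plan = greedy_schedule(jobs, server_count=2)
makespan = max(finish for _, _, finish in plan.values())
print(plan["big"], makespan)
```

With smallest-first ordering, the big job starts at hour 10 and the makespan is 30 hours, which matches the example above. Passing smallest_first=False schedules the big job first instead.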
If you use OptaPlanner, definitely take a look at real-time scheduling and daemon mode, to replan as new reports enter the system.
I am writing a school project about real-time systems and their scheduling algorithms. Particularly I am trying to compare several of these algorithms, namely RMS, EDF, and LLF under overloaded systems. But I am confused about how these systems deal with deadlines that are missed.
For example, consider the following set of tasks. I assume all tasks are periodic and their deadlines are equal to their periods.
Task 1:
Execution time: 2
Period: 5
Task 2:
Execution time: 2
Period: 6
Task 3:
Execution time: 2
Period: 7
Task 4:
Execution time: 2
Period: 8
It is not possible to make a feasible schedule with these tasks because the CPU utilization is over 100%, which means some deadlines will be missed and, more importantly, some tasks will not be completed at all. For the sake of comparison, I want to calculate a penalty (or cost) for each of the tasks which increases as more and more deadlines are missed. Here's where the questions and confusion start.
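To make the overload concrete: total utilization is the sum of execution time divided by period over all tasks. A quick check in Python with exact fractions, using only the numbers from the task set above:

```python
from fractions import Fraction

# (execution time, period) for tasks 1 to 4 from the example above.
tasks = [(2, 5), (2, 6), (2, 7), (2, 8)]

utilization = sum(Fraction(execution, period) for execution, period in tasks)
print(utilization, float(utilization))  # 533/420, roughly 1.27
```

Anything above 1 means that no scheduling algorithm on a single CPU can meet every deadline for this task set.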
Now I understand, for example, that in RMS, the first task will never miss since it has the highest priority, and the second task also never misses. On the other hand, the third task does miss a bunch of deadlines. Here is the first question:
1. Do we consider a task to be dropped in RMS if it misses its deadline and a new task is dispatched?
1.a) If we do consider it dropped how would I reflect this in my penalty calculations? Since the task is never completed it would seem redundant to calculate the time it took to complete the task after its deadline passed.
1.b) If we do not consider it to be dropped and the execution of the task continues even after its deadline passes by a whole period, what happens to the new task that is dispatched? Do we drop that task instead of the one we started already, or does it just domino onto the next one, and the next one, and so on? If that is the case, this means that when a schedule with the length of the LCM of the tasks' periods is made, there are some task 3 dispatches that are not completed at all.
Another confusion of the same nature concerns EDF. EDF fails after a certain time on several tasks. I understand that in the case of EDF I must continue executing the tasks even after their deadlines pass, which means all of the tasks will be completed even though they will not all meet their deadlines, hence the domino effect. Then the question becomes:
2. Do we drop any tasks at all? What happens to the tasks which are dispatched because the period resets but which cannot be executed because the previous instance of the same task is still running after missing its deadline in the period before?
I know it is a long post but any help is appreciated. Thank you. If you cannot understand any of the questions I may clarify them at your request.
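One way to compare the two interpretations (drop a late job at its deadline, or let it keep running and queue behind it) is to simulate both and count the misses. The following is only a sketch under stated assumptions: unit time steps, preemptive fixed priorities by period, deadline equal to period, and a drop_on_miss flag that I introduce purely to model the two policies. It is not a claim about what any particular RTOS does.

```python
from math import lcm


def simulate_rms(tasks, horizon, drop_on_miss):
    """Simulate preemptive rate-monotonic scheduling in unit time steps.

    tasks: list of (execution_time, period), with deadline == period.
    Returns the number of missed deadlines per task over [0, horizon].
    """
    # Rate-monotonic priority: the shorter the period, the higher the priority.
    by_priority = sorted(range(len(tasks)), key=lambda i: tasks[i][1])
    pending = [[] for _ in tasks]   # per task: unfinished jobs as [remaining, deadline]
    missed = [0] * len(tasks)

    for t in range(horizon + 1):
        # Any unfinished job whose deadline is now counts as one miss.
        for i, jobs in enumerate(pending):
            missed[i] += sum(1 for job in jobs if job[1] == t)
            if drop_on_miss:
                pending[i] = [job for job in jobs if job[1] != t]
        if t == horizon:
            break
        # Release a new job of every task whose period starts now.
        for i, (execution, period) in enumerate(tasks):
            if t % period == 0:
                pending[i].append([execution, t + period])
        # Run the oldest job of the highest-priority task for one time unit.
        for i in by_priority:
            if pending[i]:
                job = pending[i][0]
                job[0] -= 1
                if job[0] == 0:
                    pending[i].pop(0)
                break
    return missed


tasks = [(2, 5), (2, 6), (2, 7), (2, 8)]
horizon = lcm(*(period for _, period in tasks))   # 840 for this task set
for drop in (True, False):
    print("drop" if drop else "continue", simulate_rms(tasks, horizon, drop))
```

In both modes tasks 1 and 2 never miss, while tasks 3 and 4 do. The per-task miss counts could then feed whatever penalty function is chosen, for example a fixed cost per miss when dropping, or a cost proportional to lateness when continuing.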
Part one:
How can I run a production line only for 15 hours per day?
My logic
Part Two:
How to implement a machine malfunctioning 3 times in 10 days?
I achieved the malfunctioning of a machine using this logic.
if (countAssembler == 10) {
    self.suspend(agent);
    create_MyDynamicEvent(2, HOUR, agent);
}
But the malfunction is currently triggered by item count (i.e. countAssembler==10). I want it to occur after 3 days instead.
part 1
To make the services work for a period of time every day, you need resources to be used in those services, and the resources to be subjected to a schedule. The schedule is an object that you can find in the process modeling library and there's a ton of documentation and examples on how to use them. Review the help documentation.
part 2
There's a block in the process modeling library called downtime. Again, there's a ton of documentation and examples on how to use this for any model. Then you don't need any Java code.
I'm looking for an option in AutoSys to send an alert if dependent jobs don't start within 10 minutes of their average run time over, for example, the last 30 days. The dependent jobs do not have a fixed starting time. They might have run at varying times in the last 30 days, depending on when the jobs they wait on completed. Would it be possible in AutoSys to dynamically set the must-start time for jobs that have no start time but depend on starting conditions instead?
I use something like this in my critical jobs. You can try it too, if you want:
must_start_times: "19:01"
days_of_week: mo,tu,we,th,fr,sa,su
start_times: "19:00"
I am hitting a well-known problem, but I can't find a simple answer that tells me how to solve it.
I would appreciate it if you could point me to the feature I should look for in available queuing software, or to suitable algorithms if the solution requires programming in addition to the tools. Pointers to Python-supported tools would be especially helpful.
My problem is that, over the span of a day, I get jobs which deploy 10, 100 or 1000 tests (I exaggerate, but it helps make the point). Many jobs deploy 10 tests, some deploy 100 tests and one or two deploy 1000 tests.
I want to deploy the tests in such a manner that the delay in execution is spread in a fair manner between all jobs. Let me explain myself.
If the very large job takes 2 hours on an idle server, it would be acceptable if it completes after 4 hours.
If a small job takes 3 minutes on an idle server, it would be acceptable if it completes after 15 minutes.
I want the delay of running the jobs to be spread in a fair way, so jobs that started earlier don't get delayed too much. If it looks like a job is going to be delayed more than allowed, its priority will increase.
I think that prioritized queues may be the solution, so that dynamically changing the weight of a large queue makes it run faster when needed.
Is there queue software that knows how to do the above automatically? Let's say that I give each job some time limit and the queue software knows how to prioritize the tests from each queue so that no job is delayed too much.
Thanks.
Adding information following Jim's comments.
Not enough information to supply an answer. Is a job essentially just a list of tests? Can multiple tests for a single job be run concurrently? Do you always run all tests for a job? – Jim Mischel 14 hours ago
Each job deploys between 10 and 1000 tests.
Each test can run concurrently with all other tests from the same or other users without conflicts.
All tests that were deployed by a job are planned to run.
Additional info:
I've learned so far that prioritized queues are actually about applying weights to items in a single queue, where the items with the highest priority are pulled first. If two or more items share the highest priority, the first item to arrive is executed first.
When I pondered about Priority Queues it was more in the way of:
Multiple Queues, where each queue has a priority assigned to the entire queue.
The priority can be changed dynamically in runtime, based on some condition, e.g. setting a time limit on the execution of the entire queue.
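To make the multiple-queues idea concrete, here is a rough Python sketch in which each job is its own queue of tests and the next test is always taken from the job with the least slack (time left until its allowed completion, minus the work it still has to do). This is essentially least-slack-first scheduling. JobQueue, pick_next, and the idea of estimating remaining work from test durations are all assumptions made for illustration, not features of any existing queue software.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class JobQueue:
    name: str
    submitted_at: float      # minutes since some reference point
    allowed_minutes: float   # acceptable time from submission to completion
    tests: deque             # estimated durations (minutes) of the tests still to run

    def slack(self, now, workers):
        """Minutes to spare if the remaining tests were spread over all workers."""
        remaining_work = sum(self.tests) / workers
        deadline = self.submitted_at + self.allowed_minutes
        return deadline - now - remaining_work


def pick_next(queues, now, workers):
    """Return the job whose next test should run: the one with the least slack."""
    candidates = [q for q in queues if q.tests]
    if not candidates:
        return None
    return min(candidates, key=lambda q: q.slack(now, workers))


# A big job with 60 minutes of tests left and 4 hours allowed in total.
big = JobQueue("big", submitted_at=0, allowed_minutes=240, tests=deque([1.0] * 60))
# Small jobs: 3 minutes of tests, 15 minutes allowed.
small = JobQueue("small", submitted_at=60, allowed_minutes=15, tests=deque([0.5] * 6))
late_small = JobQueue("late-small", submitted_at=200, allowed_minutes=15, tests=deque([0.5] * 6))

print(pick_next([big, small], now=60, workers=1).name)        # small
print(pick_next([big, late_small], now=200, workers=1).name)  # big
```

Early on, the small job wins because it has only 12 minutes of slack against the big job's 120. Later, once the big job can no longer finish within its 4 hours, its slack goes negative and it takes precedence, which is the dynamic priority change described above.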
Hi, I got the following questions for homework, but I am unsure about my conclusions. Any help would be appreciated.
1) For what types of workloads does SJF have the same turnaround times as FIFO?
I think the only way this could happen is if the workloads are already sorted in SJF order before running FIFO.
2) For what types of workloads and time quanta does RR give the same response times as SJF?
This was a lot harder. The only case I could find was when the workloads are all the same length and the time quantum is greater than the length of the workloads.
Are these assumptions right, or am I missing something? Are there more possible workloads?
I think you're mostly correct on both counts.
For SJF/FIFO, if you're talking about turnaround times for each job from the time they enter the queue, they would have to enter the queue in shortest-job-first order.
However, if turnaround time is measured from the time the job starts running, they could come in any order.
For RR/SJF, you would need to ensure the jobs all run in a single quantum so that the round-robin nature was discounted. But again, it depends on whether response time is from job entry or job start.
It's more likely to be the former so the jobs would again have to come in in SJF order. I don't think they'd all actually have to be the same length.
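If you want to check these cases by hand, a small Python sketch like the following may help. It assumes all jobs arrive at time 0 in list order, that SJF is non-preemptive, and that response time means the time from arrival until a job first gets the CPU. The function names are just for this example.

```python
from collections import deque


def fifo_times(bursts):
    """Return (response, turnaround) lists for jobs run in list order."""
    response, turnaround, clock = [], [], 0
    for burst in bursts:
        response.append(clock)
        clock += burst
        turnaround.append(clock)
    return response, turnaround


def sjf_times(bursts):
    """Non-preemptive shortest-job-first; ties keep arrival order."""
    order = sorted(range(len(bursts)), key=lambda i: bursts[i])
    ordered_response, ordered_turnaround = fifo_times([bursts[i] for i in order])
    response, turnaround = [0] * len(bursts), [0] * len(bursts)
    for position, i in enumerate(order):
        response[i] = ordered_response[position]
        turnaround[i] = ordered_turnaround[position]
    return response, turnaround


def rr_times(bursts, quantum):
    """Round-robin with a fixed time quantum."""
    remaining = list(bursts)
    response = [None] * len(bursts)
    turnaround = [0] * len(bursts)
    queue, clock = deque(range(len(bursts))), 0
    while queue:
        i = queue.popleft()
        if response[i] is None:
            response[i] = clock          # first time this job gets the CPU
        run = min(quantum, remaining[i])
        clock += run
        remaining[i] -= run
        if remaining[i] > 0:
            queue.append(i)              # not finished: back of the queue
        else:
            turnaround[i] = clock
    return response, turnaround


print(fifo_times([1, 2, 4]) == sjf_times([1, 2, 4]))   # True: already in SJF order
print(fifo_times([4, 1, 2])[1], sjf_times([4, 1, 2])[1])
print(rr_times([1, 2, 4], quantum=4)[0] == sjf_times([1, 2, 4])[0])   # True
print(rr_times([1, 2, 4], quantum=1)[0] == sjf_times([1, 2, 4])[0])   # False
```

The third line shows jobs of different lengths where RR and SJF still give the same response times, because the jobs arrive in SJF order and each fits in a single quantum.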