Are jobs in Quartz executed as processes or as threads?
If they are executed as threads, will heavy or time-consuming jobs affect the performance of the Quartz scheduler?
If so, please suggest a solution.
If we execute 10 time-consuming jobs simultaneously, what is the effect?
I read the tutorials but didn't find a solution.
Please suggest a solution.
Thanks.
Read the documentation regarding Configuring the thread pool, which explains how the Quartz thread pool can be tuned to your needs. More specifically, the org.quartz.threadPool.threadCount configuration property can be set accordingly; as the documentation explains:
The number of threads available for concurrent execution of jobs. You
can specify any positive integer, although only numbers between 1 and
100 are practical. If you only have a few jobs that fire a few times a
day, then one thread is plenty. If you have tens of thousands of jobs,
with many firing every minute, then you want a thread count more like
50 or 100 (this highly depends on the nature of the work that your
jobs perform, and your systems resources).
In the specific example you mentioned of 10 jobs firing simultaneously: if you have configured the above property with 10 or more threads, each job will run concurrently on its own thread. If you have configured fewer, some jobs will start first and the others will wait for threads to become available. If no thread becomes available within a configured period of time, the misfire instructions you have set determine the action to be taken; usually that is to fire the delayed trigger as soon as possible, but this is also configurable.
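For illustration, a minimal sketch of setting that property programmatically (it can equally go in quartz.properties); the thread count of 15 and the class/instance names are just example values:

    import java.util.Properties;

    import org.quartz.Scheduler;
    import org.quartz.SchedulerException;
    import org.quartz.impl.StdSchedulerFactory;

    public class SchedulerSetup {
        public static Scheduler buildScheduler() throws SchedulerException {
            Properties props = new Properties();
            props.setProperty("org.quartz.scheduler.instanceName", "MyScheduler");
            // Enough threads for 10+ long-running jobs to execute concurrently.
            props.setProperty("org.quartz.threadPool.class", "org.quartz.simpl.SimpleThreadPool");
            props.setProperty("org.quartz.threadPool.threadCount", "15");
            props.setProperty("org.quartz.jobStore.class", "org.quartz.simpl.RAMJobStore");

            Scheduler scheduler = new StdSchedulerFactory(props).getScheduler();
            scheduler.start();
            return scheduler;
        }
    }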
This is purely for a non-eager pytest mode of operation. I want to know when Celery has "caught up" with all the outstanding work. Is there any way to find that information? My testing config has a celery_session_app and a single celery_session_worker in its own thread.
Check the number of entries in the Rabbit queue. This has problems because of pre-fetch. I can set prefetch to 1 and maybe solve it that way but I worry about race conditions. (I'm testing chords and some celery tasks queue other celery tasks)
Add a task to the "end" of the list and then .wait() on it to finish. This has problems for tasks that queue other tasks because the queue is being extended in the other thread so I can be at the end of the list when queued, but that quickly moves forward as tasks are queued behind it. I can work around this using .apply_async(countdown=3) but this is pretty much the definition of a race condition and I might need countdown=4 or I might need nothing and that is some number of seconds wasted on a test regardless.
Use signals (somehow). But what I really need is a worker_is_bored which does not exist and suffers from the same kind of race conditions mentioned above. Tasks queueing tasks could make it flash "bored" and right back to "busy".
time.sleep(N), but what should N be? (I'm running pytest -n 10, so how busy the machine is during tests is non-trivial.) And this wastes time, like countdown= above.
There are four different timeout options in the ActivityOptions, and two of those are mandatory without any default values: ScheduleToStartTimeout and StartToCloseTimeout.
What considerations should be made when selecting values for these timeouts?
As mentioned in the question, there are four different timeout options in ActivityOptions, and the differences between them may not be super clear to a new Cadence user. Let’s first briefly explain what those are:
ScheduleToStartTimeout: This configuration specifies the maximum
duration between the time the Activity is scheduled by a workflow and
it’s picked up by an activity worker to start executing it. In other
words, it configures the time a task spends in the queue.
StartToCloseTimeout: This one specifies the maximum time taken by
an activity worker from the time it fetches a task until it reports
the completion of it to the Cadence server.
ScheduleToCloseTimeout: This configuration specifies an end-to-end
timeout duration for an activity from the time it is scheduled by the
workflow until it is completed by an activity worker.
HeartbeatTimeout: If your activity is a heartbeating activity, this
configuration basically specifies the maximum duration the Cadence
server would wait for a heartbeat before assuming the activity worker
has failed.
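For reference, a minimal sketch of how these four timeouts are set, assuming the Cadence Java client's ActivityOptions builder; the durations are purely illustrative, not recommendations:

    import java.time.Duration;

    import com.uber.cadence.activity.ActivityOptions;

    public class TimeoutOptionsExample {
        static final ActivityOptions OPTIONS = new ActivityOptions.Builder()
                // Max time the task may wait in the queue before a worker picks it up.
                .setScheduleToStartTimeout(Duration.ofMinutes(5))
                // Max time a worker may take from picking the task up to reporting completion.
                .setStartToCloseTimeout(Duration.ofMinutes(1))
                // End-to-end limit from scheduling until completion.
                .setScheduleToCloseTimeout(Duration.ofMinutes(10))
                // Max gap between heartbeats before the server assumes the worker has failed.
                .setHeartbeatTimeout(Duration.ofSeconds(30))
                .build();
    }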
How to select a proper timeout value
Picking the StartToCloseTimeout is fairly straightforward once you know what it does. Essentially, you should make it long enough that the activity can complete under normal circumstances, so account for everything that can affect the time taken by an activity worker, such as the latency of your downstream dependencies (services, networking, etc.). On the other hand, you should keep this value as small as feasible to make your end-to-end system more responsive. If you can't make this timeout less than a couple of minutes (ideally 1 minute or less), you should consider setting a HeartbeatTimeout and implementing heartbeating in your activity.
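As a rough sketch of what that heartbeating could look like (assuming the Cadence Java client's static Activity.heartbeat helper; processAll, processItem and the item list are hypothetical placeholders):

    import java.util.List;

    import com.uber.cadence.activity.Activity;

    public class LongRunningActivityImpl {
        // Hypothetical activity method: processes items one by one and reports progress.
        public void processAll(List<String> items) {
            for (int i = 0; i < items.size(); i++) {
                processItem(items.get(i)); // placeholder for the real per-item work
                // Report progress so the server resets the HeartbeatTimeout clock.
                Activity.heartbeat(i);
            }
        }

        private void processItem(String item) {
            // ... real work goes here ...
        }
    }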
ScheduleToStartTimeout is also easy to understand, but it is more common to face issues caused by picking a less-than-ideal value here, so it's worth taking a moment to pay some extra attention to this configuration.
Basically, you should consider everything that can create a backlog in the activity task queue. Some common events that contribute to a backlog are:
Reduced worker pool throughput due to deployments, maintenance or
network-related issues.
Down-stream latency spikes that would increase the time it takes to
complete each activity task, which then reduces the throughput of the
worker pool.
A significant spike in the number of workflow instances that schedule
the activity; especially if one of the upstream services is also an
asynchronous queue/stream processor which can create its own backlog
and suddenly start processing it at a very high-volume.
Ideally, no activity should time out while waiting in the task queue, especially if the queue is backed up and the activity is configured to be retried, because the retries would add more activity tasks to the queue and make it harder to recover from the backlog, or even make it worse. On the other hand, there are many use cases where business requirements really do limit the total time the system can take to process an activity. Therefore, it's usually not a bad idea to aim for a high ScheduleToStartTimeout value as long as the business requirements allow. Depending on your use case, it might not make sense to keep your activity in the queue for more than a few minutes, or it might be perfectly fine to keep it there for several days before timing out.
I am hitting a well known problem, but I can't find a simple answer that tells me how to solve it.
I would appreciate you pointing me in the right direction: which feature should I look for in available queuing software, or which algorithms are suitable if the solution requires programming in addition to the tools? If you can point me to Python-supported tools, that would be helpful.
My problem is that over the span of the day I get jobs which deploy 10, 100 or 1000 tests (I exaggerate, but it helps make the point). Many jobs deploy 10 tests, some deploy 100 tests, and one or two deploy 1000 tests.
I want to deploy the tests in such a way that the delay in execution is spread fairly between all jobs. Let me explain myself.
If the very large job takes 2 hours on an idle server, it would be acceptable if it completes after 4 hours.
If a small job takes 3 minutes on an idle server, it would be acceptable if it completes after 15 minutes.
I want the delay of running the jobs to be spread in a fair way, so that jobs that started earlier don't get delayed too much. If it looks like a job is going to be delayed more than allowed, its priority will increase.
I think that priority queues may be the solution: dynamically changing the weight of a large queue would make it faster when needed.
Is there queuing software that knows how to do the above automatically? Let's say that I give each job some time limit; can the queue software prioritize the tests from each queue so that no job is delayed too much?
Thanks.
Adding information following Jim's comments.
Not enough information to supply an answer. Is a job essentially just a list of tests? Can multiple tests for a single job be run concurrently? Do you always run all tests for a job? – Jim Mischel 14 hours ago
Each job deploys between 10 and 1000 tests.
A test can run concurrently with all other tests from the same or other users without conflicts.
All tests that were deployed by a job are planned to run.
Additional info:
I've learned so far that prioritized queues are actually about applying weights to items in a single queue, where items with the highest priority are pulled first. If two or more items have the same highest priority, the first item to arrive is executed first.
When I pondered about Priority Queues it was more in the way of:
Multiple Queues, where each queue has a priority assigned to the entire queue.
The priority can be changed dynamically in runtime, based on some condition, e.g. setting a time limit on the execution of the entire queue.
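To illustrate the single-queue behaviour described above, here is a small self-contained sketch (not tied to any particular queuing product): items carry a priority plus an arrival sequence number, the highest priority is pulled first, and ties are broken by arrival order.

    import java.util.Comparator;
    import java.util.PriorityQueue;

    public class PrioritizedQueueDemo {
        record Item(String name, int priority, long arrival) {}

        public static void main(String[] args) {
            // Highest priority first; equal priorities fall back to arrival order.
            PriorityQueue<Item> queue = new PriorityQueue<>(
                    Comparator.comparingInt(Item::priority).reversed()
                            .thenComparingLong(Item::arrival));

            queue.add(new Item("small-job-test", 5, 0));
            queue.add(new Item("large-job-test", 1, 1));
            queue.add(new Item("another-small-job-test", 5, 2));

            while (!queue.isEmpty()) {
                System.out.println(queue.poll().name());
            }
            // Prints: small-job-test, another-small-job-test, large-job-test
        }
    }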
MarkLogic Scheduled Tasks cannot be configured to run at an interval less than a minute.
Is there any way I can execute an XQuery module at an interval of 1 second?
NOTE:
Considering the situation where the Task Server is fully loaded, I need to make sure that the task scheduled every second gets a Task Server thread whenever it needs one.
Please let me know if there is anything in MarkLogic that can be used to achieve this.
Wanting rapid-fire scheduled tasks may be a hint that the design needs rethinking.
Even running a task once a minute can be risky, and needs careful thought to manage the possibilities of overlapping tasks and runaway tasks. If the application design calls for a scheduled task to run once a second, I would raise that as a potentially serious problem. Back up a few steps, and if necessary ask a new question about the higher-level problem that led to looking at scheduled tasks.
There was a sub-question about managing queue priority for tasks. Task priorities can handle some of that. There are two priorities: normal and higher. The Task Server empties the higher-priority queue first, then the normal queue. But each queue is still a simple queue, and there's no way to change priorities after a task has been spawned. So if you always queue tasks with priority=higher, then they'll all be in the higher priority queue and they'll all run in order. You can play some games with techniques like using server fields as signals to already-running tasks. But wanting to reorder tasks within a queue could be another hint that the design needs rethinking.
If, after careful thought about all the pitfalls and dangers, I decided I needed a rapid-fire task of some kind.... I would probably do it using external requests. Pick any scripting language and write a simple while loop with an HTTP request to the MarkLogic cluster. Even so, spend some time thinking about overlapping requests and locking. What happens if the request times out on the client side? Will it keep running on the server? Will that lead to overlapping requests and require deadlock resolution? Could it lead to runaway resource consumption?
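A rough sketch of such an external driver, assuming Java 11's built-in HttpClient; the endpoint URL and the one-second period are placeholders, and a real MarkLogic app server would also need authentication configured:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.time.Duration;

    public class RapidFireDriver {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            // Hypothetical app-server endpoint that invokes the XQuery module.
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("http://localhost:8011/run-task.xqy"))
                    .timeout(Duration.ofMillis(900)) // keep it under the 1s period to limit overlap
                    .GET()
                    .build();

            while (true) {
                long start = System.currentTimeMillis();
                try {
                    HttpResponse<String> response =
                            client.send(request, HttpResponse.BodyHandlers.ofString());
                    System.out.println("status=" + response.statusCode());
                } catch (Exception e) {
                    // Log and keep going; a failed or timed-out call must not kill the loop.
                    System.err.println("request failed: " + e.getMessage());
                }
                long elapsed = System.currentTimeMillis() - start;
                Thread.sleep(Math.max(0, 1000 - elapsed)); // aim for roughly one call per second
            }
        }
    }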
Avoid any ideas that use xdmp:sleep. That will tie up a Task Server thread during the sleep period, and then you'll have two problems.
Is there a way to ensure that when a trigger's fire time arrives, there is always a thread available to run it on? Triggers have priority, but that is no guarantee, and giving the pool more threads than the maximum number of jobs is not an option. I have lots of jobs, but one of them must run at a specified time.
Use a separate scheduler for that job. That way the important job will always have a free thread, because no other job uses that scheduler.
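A minimal sketch of that idea with the Quartz API: a second StdSchedulerFactory gets its own small thread pool, and only the critical job is registered with it (CriticalJob and the cron expression are placeholders):

    import java.util.Properties;

    import org.quartz.CronScheduleBuilder;
    import org.quartz.Job;
    import org.quartz.JobBuilder;
    import org.quartz.JobDetail;
    import org.quartz.JobExecutionContext;
    import org.quartz.Scheduler;
    import org.quartz.Trigger;
    import org.quartz.TriggerBuilder;
    import org.quartz.impl.StdSchedulerFactory;

    public class DedicatedSchedulerExample {
        // Placeholder for the job that must fire exactly on time.
        public static class CriticalJob implements Job {
            public void execute(JobExecutionContext context) {
                // ... the time-critical work ...
            }
        }

        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            // A distinct instance name keeps this scheduler separate from the main one.
            props.setProperty("org.quartz.scheduler.instanceName", "CriticalScheduler");
            props.setProperty("org.quartz.threadPool.class", "org.quartz.simpl.SimpleThreadPool");
            props.setProperty("org.quartz.threadPool.threadCount", "1"); // reserved for the critical job
            props.setProperty("org.quartz.jobStore.class", "org.quartz.simpl.RAMJobStore");

            Scheduler critical = new StdSchedulerFactory(props).getScheduler();

            JobDetail job = JobBuilder.newJob(CriticalJob.class)
                    .withIdentity("criticalJob")
                    .build();
            Trigger trigger = TriggerBuilder.newTrigger()
                    .withIdentity("criticalTrigger")
                    .withSchedule(CronScheduleBuilder.cronSchedule("0 0 12 * * ?")) // placeholder fire time
                    .build();

            critical.scheduleJob(job, trigger);
            critical.start();
        }
    }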