Concurrent tasks workers with celery - mongodb

I have a mongodb collections which are 20 in number that i am using to store some data regarding a tasks that i am currently processing using cron jobs.
I have one worker per collection when using cron jobs. I want to improve this arrangement and i am looking into celery. I want to have at least 4 workers per collection since i have many records in each collection.
I want the jobs to be done as they come and not wait for the five minutes wait as its happening when using cron jobs.
Is this possible for me to have 4 workers per collection in celery in the way i have described?.

Celery workers will pick tasks as soon as a new task is initiated and will execute it, celery can use redis, or rabittMQ for storing the tasks queue. Any day you can scale the celery by running it distributed on multiple machines or by scaling up the machine and increasing the number of workers. https://www.slideshare.net/nicolasgrasset/scaling-up-task-processing-with-celery
Instead of using the crontab, use celery beat which is the task scheduler for celery.
There is no need of having collection wise celery workers.
Please go through the below celery documentation for understanding celery.
http://docs.celeryproject.org/en/latest/getting-started/introduction.html

Related

Queries regarding celery scalability

I have few questions regarding celery. Please help me with that.
Do we need to put the project code in every celery worker? If yes, if I am increasing the number of workers and also I am updating my code, what is the best way to update the code in all the worker instances (without manually pushing code to every instance everytime)?
Using -Ofair in celery worker as argument disable prefetching in workers even if have set PREFETCH_LIMIT=8 or so?
IMPORTANT: Does rabbitmq broker assign the task to the workers or do workers pull the task from the broker?
Does it make sense to have more than one celery worker (with as many subprocesses as number of cores) in a system? I see few people run multiple celery workers in a single system.
To add to the previous question, whats the performance difference between the two scenarios: single worker (8 cores) in a system or two workers (with concurrency 4)
Please answer my questions. Thanks in advance.
Do we need to put the project code in every celery worker? If yes, if I am increasing the number of workers and also I am updating my code, what is the best way to update the code in all the worker instances (without manually pushing code to every instance everytime)?
Yes. A celery worker runs your code, and so naturally it needs access to that code. How you make the code accessible though is entirely up to you. Some approaches include:
Code updates and restarting of workers as part of deployment
If you run your celery workers in kubernetes pods this comes down to building a new docker image and upgrading your workers to the new image. Using rolling updates this can be done with zero downtime.
Scheduled synchronization from a repository and worker restarts by broadcast
If you run your celery workers in a more traditional environment or for some reason you don't want to rebuild whole images, you can use some central file system available to all workers, where you update the files e.g. syncing a git repository on a schedule or by some trigger. It is important you restart all celery workers so they reload the code. This can be done by remote control.
Dynamic loading of code for every task
For example in omega|ml we provide lambda-style serverless execution of
arbitrary python scripts which are dynamically loaded into the worker process.
To avoid module loading and dependency issues it is important to keep max-tasks-per-child=1 and use the prefork pool. While this adds some overhead it is a tradeoff that we find is easy to manage (in particular we run machine learning tasks and so the little overhead of loading scripts and restarting workers after every task is not an issue)
Using -Ofair in celery worker as argument disable prefetching in workers even if have set PREFETCH_LIMIT=8 or so?
-O fair stops workers from prefetching tasks unless there is an idle process. However there is a quirk with rate limits which I recently stumbled upon. In practice I have not experienced a problem with neither prefetching nor rate limiting, however as with any distributed system it pays of to think about the effects of the asynchronous nature of execution (this is not particular to Celery but applies to all such such systems).
IMPORTANT: Does rabbitmq broker assign the task to the workers or do workers pull the task from the broker?
Rabbitmq does not know of the workers (nor do any of the other broker supported by celery) - they just maintain a queue of messages. That is, it is the workers that pull tasks from the broker.
A concern that may come up with this is what if my worker crashes while executing tasks. There are several aspects to this: There is a distinction between a worker and the worker processes. The worker is the single task started to consume tasks from the broker, it does not execute any of the task code. The task code is executed by one of the worker processes. When using the prefork pool (which is the default) a failed worker process is simply restarted without affecting the worker as a whole or other worker processes.
Does it make sense to have more than one celery worker (with as many subprocesses as number of cores) in a system? I see few people run multiple celery workers in a single system.
That depends on the scale and type of workload you need to run. In general CPU bound tasks should be run on workers with a concurrency setting that doesn't exceed the number of cores. If you need to process more of these tasks than you have cores, run multiple workers to scale out. Note if your CPU bound task uses more than one core at a time (e.g. as is often the case in machine learning workloads/numerical processing) it is the total number of cores used per task, not the total number of tasks run concurrently that should inform your decision.
To add to the previous question, whats the performance difference between the two scenarios: single worker (8 cores) in a system or two workers (with concurrency 4)
Hard to say in general, best to run some tests. For example if 4 concurrently run tasks use all the memory on a single node, adding another worker will not help. If however you have two queues e.g. with different rates of arrival (say one for low frequency but high-priority execution, another for high frequency but low-priority) both of which can be run concurrently on the same node without concern for CPU or memory, a single node will do.

What if i schedule tasks for celery to perform every minute and it is not able to complete it in time?

If I schedule the task for every minute and if it is not able to be getting completed in the time(one minute). Would the task wait in queue and it will go on like this? if this happens then after few hours it will be overloaded. Is there any solution for this kind of problems?
I am using beat and worker combination for this. It is working fine for less records to perform tasks. but for large database, I think this could cause problem.
Task is assign to queue (RabbitMQ for example).
Workers are queue consumers, more workers (or worker with high concurrency) - more tasks could be handled in parallel.
Your periodic task produce messages of the same type (I guess) and your celery router route them to the same queue.
Just set your workers to consume messages from that queue and that's all.
celery worker -A celeryapp:app -l info -Q default -c 4 -n default_worker#%h -Ofair
In the example above I used -c 4 for concurrency of four (eqv. to 4 consumers/workers). You can also start move workers and let them consume from the same queue with -Q <queue_name> (in my example it's default queue).
EDIT:
When using celery (the worker code) you are initiate Celery object. In Celery constructor you are setting your broker and backend (celery used them as part of the system)
for more info: http://docs.celeryproject.org/en/latest/getting-started/first-steps-with-celery.html#application

Scaling periodic tasks in celery

We have a 10 queue setup in our celery, a large setup each queue have a group of 5 to 10 task and each queue running on dedicated machine and some on multiple machines for scaling.
On the other hand, we have a bunch of periodic tasks, running on a separate machine with single instance, and some of the periodic tasks are taking long to execute and I want to run them in 10 queues instead.
Is there a way to scale celery beat or use it purely to trigger the task on a different destination "one of the 10 queues"?
Please advise?
Use celery routing to dispatch the task to where you need:

Number of celery tasks executed at a given point of time

I am trying to create a bunch of celery tasks asynchronously on the fly. Say there are 1000 tasks I start asynchronously and I have only one celeryd process running to execute tasks. How many threads will be created by celery to handle these tasks?
If there are multiple threads that celery starts automatically to process the task queue, how do I limit celery to execute only 100 threads at a given point of time.
Thanks.
Its starts as many as you specify with the CELERYD_OPTS concurrency parameter.
Which is also discussed here.

How do I coordinate a cluster of celery beat daemons?

I have a cluster of three machines. I want to run celery beat on those. I have a few related questions.
Celery has this notion of a persistent scheduler. As long as my schedule consists only of crontab entries and is statically defined by CELERYBEAT_SCHEDULE, do I need to persist it at all?
If I do, then do I have to ensure this storage is synchronized between all machines of the cluster?
Does djcelery.schedulers.DatabaseScheduler automatically take care of concurrent beat daemons? That is, if I just run three beat daemons with DatabaseScheduler, am I safe from duplicate tasks?
Is there something like DatabaseScheduler but based on MongoDB, without Django ORM? Like Celery’s own MongoDB broker and result backend.
Currently Celery doesn't support multiple concurrent celerybeat instances.
You have to ensure only a single scheduler is running for a schedule
at a time, otherwise you would end up with duplicate tasks. Using a
centralized approach means the schedule does not have to be
synchronized, and the service can operate without using locks.
http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html