Question: Usage of django celery.backend_cleanup - celery

There is not much documentation available on the actual usage of django celery.backend_cleanup.
Let's assume I have the following 4 tasks scheduled with different intervals.
Checking the DatabaseScheduler logs, I found that only Task1 is executing on its interval.
[2018-12-28 11:21:08,241: INFO/MainProcess] Writing entries...
[2018-12-28 11:24:08,778: INFO/MainProcess] Writing entries...
[2018-12-28 11:27:09,315: INFO/MainProcess] Writing entries...
[2018-12-28 11:28:32,948: INFO/MainProcess] Scheduler: Sending due TASK1(project_monitor_tasks)
[2018-12-28 11:30:13,215: INFO/MainProcess] Writing entries...
[2018-12-28 11:33:13,772: INFO/MainProcess] Writing entries...
[2018-12-28 11:36:14,316: INFO/MainProcess] Writing entries...
[2018-12-28 11:39:14,868: INFO/MainProcess] Writing entries...
[2018-12-28 11:42:15,397: INFO/MainProcess] Writing entries...
[2018-12-28 11:43:55,700: INFO/MainProcess] DatabaseScheduler: Schedule changed.
[2018-12-28 11:43:55,700: INFO/MainProcess] Writing entries...
[2018-12-28 11:45:15,997: INFO/MainProcess] Writing entries...
.....
....
[2018-12-28 17:16:28,613: INFO/MainProcess] Writing entries...
[2018-12-28 17:19:29,138: INFO/MainProcess] Writing entries...
[2018-12-28 17:22:29,625: INFO/MainProcess] Writing entries...
[2018-12-28 17:25:30,140: INFO/MainProcess] Writing entries...
[2018-12-28 17:28:30,657: INFO/MainProcess] Writing entries...
[2018-12-28 17:28:32,943: INFO/MainProcess] Scheduler: Sending due TASK1(project_monitor_tasks)
[2018-12-28 17:31:33,441: INFO/MainProcess] Writing entries...
[2018-12-28 17:34:34,009: INFO/MainProcess] Writing entries...
[2018-12-28 17:37:34,578: INFO/MainProcess] Writing entries...
[2018-12-28 17:40:35,130: INFO/MainProcess] Writing entries...
[2018-12-28 17:43:35,657: INFO/MainProcess] Writing entries...
[2018-12-28 17:43:50,716: INFO/MainProcess] DatabaseScheduler: Schedule changed.
[2018-12-28 17:43:50,716: INFO/MainProcess] Writing entries...
[2018-12-28 17:46:36,266: INFO/MainProcess] Writing entries...
[2018-12-28 17:49:36,809: INFO/MainProcess] Writing entries...
[2018-12-28 17:52:37,352: INFO/MainProcess] Writing entries...
Q1) Why are the other tasks, which run at different intervals such as 24, 8, and 10 hours, not executing? I'm assuming this is because the crontab of celery.backend_cleanup is set to every 4 hours, which is cleaning up queued tasks. Should I keep a larger interval for the celery.backend_cleanup task?
Q2) Why should we keep the celery.backend_cleanup task? Does it load new tasks on every cleanup?

Q1: We can't answer more without seeing the actual schedules, knowing your celery configuration, or seeing logs spanning more than twenty-four hours. The backend_cleanup job has no effect on the broker; its purpose is to clean up expired task results by deleting them from an RDBMS celery result backend, so it has no effect on whether a task executes properly.
Q2: See above. You should use this task if you are using an RDBMS / database result backend and you want expired celery results to be deleted from it.
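As a minimal sketch (not the asker's actual configuration; the broker URL and backend name are assumptions), these are the settings backend_cleanup works against. It only prunes expired task results from the result backend and never touches the broker queue or the beat schedule:
from celery import Celery

app = Celery('project', broker='redis://localhost:6379/0')  # assumed broker URL

# backend_cleanup deletes results older than result_expires from a
# database result backend such as django-celery-results.
app.conf.result_backend = 'django-db'    # assumed backend
app.conf.result_expires = 60 * 60 * 24   # results expire after one day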

Related

Celery: running a single celery beat + multiple celery workers at scale

I have a single celery beat instance running via:
celery -A app:celery beat --loglevel=DEBUG
and three workers running via:
celery -A app:celery worker -E --loglevel=ERROR -n n1
celery -A app:celery worker -E --loglevel=ERROR -n n2
celery -A app:celery worker -E --loglevel=ERROR -n n3
The same Redis DB is used as the message broker for all workers and beat.
All workers are started on the same machine for development purposes, while in production they will be deployed in different Kubernetes pods. The main idea of using multiple workers is to distribute 50-150 tasks between different Kube pods, each running on a 4-8 core machine. We expect that no pod will take more tasks than it has cores while any other worker still has fewer tasks than available cores, so that the maximum number of tasks is executed concurrently.
So far I am having trouble testing this locally.
Here the local beat triggers three tasks:
[2021-08-23 21:35:32,700: DEBUG/MainProcess] Current schedule:
<ScheduleEntry: task-5872-accrual Task5872Accrual() <crontab: 36 21 * * * (m/h/d/dM/MY)>
<ScheduleEntry: task-5872-accrual2 Task5872Accrual2() <crontab: 37 21 * * * (m/h/d/dM/MY)>
<ScheduleEntry: task-5872-accrual3 Task5872Accrual3() <crontab: 38 21 * * * (m/h/d/dM/MY)>
[2021-08-23 21:35:32,700: DEBUG/MainProcess] beat: Ticking with max interval->5.00 minutes
[2021-08-23 21:35:32,701: DEBUG/MainProcess] beat: Waking up in 27.29 seconds.
[2021-08-23 21:36:00,017: DEBUG/MainProcess] beat: Synchronizing schedule...
[2021-08-23 21:36:00,026: INFO/MainProcess] Scheduler: Sending due task task-5872-accrual (Task5872Accrual)
[2021-08-23 21:36:00,035: DEBUG/MainProcess] Task5872Accrual sent. id->96e671f8-bd07-4c36-a595-b963659bee5c
[2021-08-23 21:36:00,035: DEBUG/MainProcess] beat: Waking up in 59.95 seconds.
[2021-08-23 21:37:00,041: INFO/MainProcess] Scheduler: Sending due task task-5872-accrual2 (Task5872Accrual2)
[2021-08-23 21:37:00,043: DEBUG/MainProcess] Task5872Accrual2 sent. id->532eac4d-1d10-4117-9d7e-16b3f1ae7aee
[2021-08-23 21:37:00,043: DEBUG/MainProcess] beat: Waking up in 59.95 seconds.
[2021-08-23 21:38:00,027: INFO/MainProcess] Scheduler: Sending due task task-5872-accrual3 (Task5872Accrual3)
[2021-08-23 21:38:00,029: DEBUG/MainProcess] Task5872Accrual3 sent. id->68729b64-807d-4e13-8147-0b372ce536af
[2021-08-23 21:38:00,029: DEBUG/MainProcess] beat: Waking up in 5.00 minutes.
I expected each worker to take a single task to balance the load between workers, but unfortunately here is how they are distributed:
So I am not sure whether different workers synchronize with each other to distribute the load smoothly. If not, can I achieve that somehow? I tried searching Google, but the results are mostly about concurrency between tasks in a single worker. What should I do if I need to run more tasks concurrently than a single machine in the Kube cluster has cores for?
You should do two things in order to achieve what you want:
Run workers with the -O fair option. Example: celery -A app:celery worker -E --loglevel=ERROR -n n1 -O fair
Make workers prefetch as little as possible with worker_prefetch_multiplier=1 in your config.
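As a rough sketch (assuming the app module layout from the question; the broker URL is an assumption), the two changes together look like this:
from celery import Celery

app = Celery('app', broker='redis://localhost:6379/0')  # assumed broker URL
app.conf.worker_prefetch_multiplier = 1  # each worker reserves only one task at a time

# then start each worker with the fair optimization, for example:
#   celery -A app:celery worker -E --loglevel=ERROR -n n1 -O fair
With prefetching reduced to one and -O fair, the idea is that an idle worker picks up the next waiting task instead of a busy worker holding it in its prefetch buffer.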

How to remove all due tasks from celery scheduler DatabaseScheduler

My project has a lot of pending tasks (task.com-43) that execute every 5 seconds. I want to remove all my pending tasks.
→ celery -A Project beat --loglevel=debug --scheduler django_celery_beat.schedulers:DatabaseScheduler
celery beat v4.2.1 (windowlicker) is starting.
__ - ... __ - _
LocalTime -> 2018-12-30 08:44:30
Configuration ->
. broker -> redis://localhost:6379//
. loader -> celery.loaders.app.AppLoader
. scheduler -> django_celery_beat.schedulers.DatabaseScheduler
. logfile -> [stderr]@%DEBUG
. maxinterval -> 5.00 seconds (5s)
[2018-12-30 08:44:30,310: DEBUG/MainProcess] Setting default socket timeout to 30
[2018-12-30 08:44:30,311: INFO/MainProcess] beat: Starting...
[2018-12-30 08:44:30,312: DEBUG/MainProcess] DatabaseScheduler: initial read
[2018-12-30 08:44:30,312: INFO/MainProcess] Writing entries...
[2018-12-30 08:44:30,312: DEBUG/MainProcess] DatabaseScheduler: Fetching database schedule
[2018-12-30 08:44:30,348: DEBUG/MainProcess] Current schedule:
[2018-12-30 08:44:30,418: INFO/MainProcess] Scheduler: Sending due task task5.com-43 (project_monitor_tasks)
[2018-12-30 08:44:30,438: DEBUG/MainProcess] beat: Synchronizing schedule...
[2018-12-30 08:44:30,438: INFO/MainProcess] Writing entries...
[2018-12-30 08:44:30,455: DEBUG/MainProcess] project_monitor_tasks sent. id->d440432f-111d-4c96-ab4f-00923f4cf7e1
[2018-12-30 08:44:30,464: DEBUG/MainProcess] beat: Waking up in 4.93 seconds.
[2018-12-30 08:44:35,413: INFO/MainProcess] Scheduler: Sending due task task.com-43 (project_monitor_tasks)
[2018-12-30 08:44:35,414: DEBUG/MainProcess] project_monitor_tasks sent. id->ff0438ce-9fb9-4ab0-aa8a-8a7636c67d90
[2018-12-30 08:44:35,424: DEBUG/MainProcess] beat: Waking up in 4.98 seconds.
[2018-12-30 08:44:40,419: INFO/MainProcess] Scheduler: Sending due task task.com-43 (project_monitor_tasks)
[2018-12-30 08:44:40,420: DEBUG/MainProcess] project_monitor_tasks sent. id->d0022780-7d5f-4e7b-965e-9fda0d607cbe
[2018-12-30 08:44:40,431: DEBUG/MainProcess] beat: Waking up in 4.98 seconds.
[2018-12-30 08:44:45,425: INFO/MainProcess] Scheduler: Sending due task task.com-43 (project_monitor_tasks)
[2018-12-30 08:44:45,427: DEBUG/MainProcess] project_monitor_tasks sent. id->9b3eb775-60d5-4daa-a019-e0dfae932380
[2018-12-30 08:44:45,439: DEBUG/MainProcess] beat: Waking up in 4.98 seconds.
....
....
I'm using Redis as the backend database for project tasks. I tried purging Celery and flushing Redis, but it is still executing all the pending tasks.
ps auxww | grep 'celery worker' | awk '{print $2}' | xargs kill -9 ## Stopping all workers first
celery -A project purge
redis-cli FLUSHALL
service redis-server restart
One way to remove all the tasks is to delete them from the Periodic Task models, but first stop all your workers and purge all project tasks.
The answer to the question is here:
https://stackoverflow.com/a/33047721/10372434
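For illustration, a rough sketch of that ORM approach (this assumes django-celery-beat is installed and your Django settings are loaded; the task name is taken from the logs above):
from django_celery_beat.models import PeriodicTask

# delete the schedule entries that beat keeps re-sending
PeriodicTask.objects.filter(task='project_monitor_tasks').delete()
# or clear every periodic task:
# PeriodicTask.objects.all().delete()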

celery beat instantly stopping with resource error

The celery beat log looks like this; at the last line it just stopped and is not continuing or recovering anymore.
[2018-08-20 11:20:59,002: INFO/MainProcess] Scheduler: Sending due task check result delays every 10sec (notify_delay)
[2018-08-20 11:21:00,000: INFO/MainProcess] Scheduler: Sending due task load abnormal schedules (load_abnormal_schedules)
[2018-08-20 11:21:00,004: INFO/MainProcess] Scheduler: Sending due task check close schedule every 5sec (close_schedule)
[2018-08-20 11:21:05,000: INFO/MainProcess] Scheduler: Sending due task check close schedule every 5sec (close_schedule)
[2018-08-20 11:21:10,000: INFO/MainProcess] Scheduler: Sending due task check close schedule every 5sec (close_schedule)
[2018-08-20 11:21:14,002: INFO/MainProcess] Scheduler: Sending due task check result delays every 10sec (notify_delay)
[2018-08-20 11:21:15,000: INFO/MainProcess] Scheduler: Sending due task load abnormal schedules (load_abnormal_schedules)
[2018-08-20 11:21:15,003: INFO/MainProcess] Scheduler: Sending due task check close schedule every 5sec (close_schedule)
[2018-08-20 11:21:20,000: INFO/MainProcess] Scheduler: Sending due task check close schedule every 5sec (close_schedule)
[2018-08-20 11:21:25,000: INFO/MainProcess] Scheduler: Sending due task check close schedule every 5sec (close_schedule)
[2018-08-20 11:21:29,003: INFO/MainProcess] Scheduler: Sending due task check result delays every 10sec (notify_delay)
It runs inside a Docker container. When I checked via top, it shows a high CPU percentage:
120549 root 20 0 356016 150144 16388 S 23.4 1.0 3:36.33 celery
Then when I ssh into the container and try the celery beat command, the error below is initially returned:
root@4a298cc9c6e2:/usr/src/app# celery -A ghost beat -l info --pidfile=
celery beat v4.2.0 (windowlicker) is starting.
__ - ... __ - _
LocalTime -> 2018-08-20 11:32:51
Configuration ->
. broker -> amqp://ghost:**@ghost-rabbitmq:5672/ghost
. loader -> celery.loaders.app.AppLoader
. scheduler -> celery.beat.PersistentScheduler
. db -> celerybeat-schedule
. logfile -> [stderr]@%INFO
. maxinterval -> 5.00 minutes (300s)
[2018-08-20 11:32:51,526: INFO/MainProcess] beat: Starting...
[2018-08-20 11:32:51,535: ERROR/MainProcess] Removing corrupted schedule file 'celerybeat-schedule': error(11, 'Resource temporarily unavailable')
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/kombu/utils/objects.py", line 42, in __get__
return obj.__dict__[self.__name__]
KeyError: 'scheduler'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/celery/beat.py", line 476, in setup_schedule
self._store = self._open_schedule()
File "/usr/local/lib/python3.6/site-packages/celery/beat.py", line 466, in _open_schedule
return self.persistence.open(self.schedule_filename, writeback=True)
File "/usr/local/lib/python3.6/shelve.py", line 243, in open
return DbfilenameShelf(filename, flag, protocol, writeback)
File "/usr/local/lib/python3.6/shelve.py", line 227, in __init__
Shelf.__init__(self, dbm.open(filename, flag), protocol, writeback)
File "/usr/local/lib/python3.6/dbm/__init__.py", line 94, in open
return mod.open(file, flag, mode)
_gdbm.error: [Errno 11] Resource temporarily unavailable
Take note that I'm only using pure celery and not django-celery-beat
Every time you bring your Docker container up, celery beat wants to create a celerybeat.pid file, and if one already exists it raises an error. So you should add a command that deletes the stale pid file, wired up via an entrypoint script in your Dockerfile like this:
COPY entrypoint.sh /code/entrypoint.sh
RUN chmod +x /code/entrypoint.sh
ENTRYPOINT ["/code/entrypoint.sh"]
And you should create an entrypoint.sh file like the one below:
#!/bin/sh
# remove any stale pid files before handing control to the container command
rm -rf /code/badpanty/*.pid
exec "$@"
I hope it's helpful.

Mysterious unrelated task_id being returned for celery workflow.

I have a rather complex workflow (built dynamically) which looks something like this:
workflow = chain(
    signature('workflow.tasks.start_workflow', kwargs={}),
    chord(
        [
            signature('workflow.tasks.group_task', kwargs={}),
            signature('workflow.tasks.sample_task_2', kwargs={}),
            signature('workflow.tasks.sample_task_10', kwargs={})
        ],
        chain(
            signature('workflow.tasks.sample_task_3', kwargs={}),
            chord(
                [
                    signature('workflow.tasks.group_task', kwargs={}),
                    chain(
                        signature('workflow.tasks.sample_task_5', kwargs={}),
                        signature('workflow.tasks.sample_task_6', kwargs={}),
                    ),
                    chain(
                        signature('workflow.tasks.sample_task_7', kwargs={}),
                        signature('workflow.tasks.sample_task_8', kwargs={}),
                    )
                ],
                chain(
                    signature('workflow.tasks.sample_task_9', kwargs={}),
                    signature('workflow.tasks.end_workflow', kwargs={})
                )
            )
        )
    )
)
which celery then turns into this:
workflow.tasks.start_workflow() | celery.chain(
    [
        workflow.tasks.group_task(),
        workflow.tasks.sample_task_2(),
        workflow.tasks.sample_task_10()
    ],
    tasks=(
        workflow.tasks.sample_task_3(),
        celery.chain(
            [
                workflow.tasks.group_task(),
                workflow.tasks.sample_task_5() | workflow.tasks.sample_task_6(),
                workflow.tasks.sample_task_7() | workflow.tasks.sample_task_8()
            ], tasks=(
                workflow.tasks.sample_task_9(),
                workflow.tasks.end_workflow()
            )
        )
    )
)
Note how the tasks at the end of the chords are pushed into the "tasks" header. From what I've read, these tasks are stored in the main tasks header and are not put on the queue until the chord's header finishes executing.
When I try to display the task_id for the entire workflow (which I would expect to be one of the task_ids within the workflow), I get:
workflow= workflow.apply_async()
print workflow.id
>> 1d538872-79af-4585-aef8-ebfc06cd0b5b
The task id I get is not stored in celery_taskmeta or celery_tasksetmeta. It is not any task that gets executed within the workflow (see the worker log below). Any idea what this task_id represents and whether there is any way I can link it to any of the executing tasks?
I'd like to be able to traverse through the results and display a state for each task in the workflow. However, the task id I get back doesn't seem to relate to any of the tasks. Below is the worker log, and you'll see that the task id printed above is nowhere to be found! Any ideas? Thanks.
[2015-03-03 15:34:42,306: INFO/MainProcess] Received task: workflow.tasks.start_workflow[45b54d46-56cc-4c46-a126-d38ab8e8a2e8]
[2015-03-03 15:34:42,334: INFO/MainProcess] Received task: workflow.tasks.group_task[ccce5c5b-0946-499a-9879-613b79333419]
[2015-03-03 15:34:42,335: INFO/MainProcess] Received task: workflow.tasks.sample_task_2[3262ad97-c8ea-4b26-9bdc-f3a95fd41cf4]
[2015-03-03 15:34:42,335: INFO/MainProcess] Received task: workflow.tasks.sample_task_10[64286589-7665-4574-864a-69f3175ec281]
[2015-03-03 15:34:42,336: INFO/MainProcess] Received task: celery.chord_unlock[055938f3-5a4e-4c77-aa76-ab3399206c87] eta:[2015-03-03 15:34:43.335836+00:00]
[2015-03-03 15:34:42,363: INFO/MainProcess] Task workflow.tasks.start_workflow[45b54d46-56cc-4c46-a126-d38ab8e8a2e8] succeeded in 0.0562515768688s: None
[2015-03-03 15:34:42,391: INFO/MainProcess] Task workflow.tasks.group_task[ccce5c5b-0946-499a-9879-613b79333419] succeeded in 0.0555328750052s: True
[2015-03-03 15:34:43,402: INFO/MainProcess] Received task: celery.chord_unlock[055938f3-5a4e-4c77-aa76-ab3399206c87] eta:[2015-03-03 15:34:44.400298+00:00]
[2015-03-03 15:34:43,404: INFO/MainProcess] Task celery.chord_unlock[055938f3-5a4e-4c77-aa76-ab3399206c87] retry: Retry in 1s
[2015-03-03 15:34:45,323: INFO/MainProcess] Received task: celery.chord_unlock[055938f3-5a4e-4c77-aa76-ab3399206c87] eta:[2015-03-03 15:34:46.320054+00:00]
[2015-03-03 15:34:45,325: INFO/MainProcess] Task celery.chord_unlock[055938f3-5a4e-4c77-aa76-ab3399206c87] retry: Retry in 1s
[2015-03-03 15:34:47,299: INFO/MainProcess] Received task: celery.chord_unlock[055938f3-5a4e-4c77-aa76-ab3399206c87] eta:[2015-03-03 15:34:48.297891+00:00]
[2015-03-03 15:34:47,299: INFO/MainProcess] Task celery.chord_unlock[055938f3-5a4e-4c77-aa76-ab3399206c87] retry: Retry in 1s
[2015-03-03 15:34:47,390: INFO/MainProcess] Task workflow.tasks.sample_task_2[3262ad97-c8ea-4b26-9bdc-f3a95fd41cf4] succeeded in 5.05364968092s: True
[2015-03-03 15:34:47,392: INFO/MainProcess] Task workflow.tasks.sample_task_10[64286589-7665-4574-864a-69f3175ec281] succeeded in 5.05569092603s: True
[2015-03-03 15:34:48,426: INFO/MainProcess] Task celery.chord_unlock[055938f3-5a4e-4c77-aa76-ab3399206c87] succeeded in 0.0345057491213s: None
[2015-03-03 15:34:48,426: INFO/MainProcess] Received task: workflow.tasks.sample_task_3[89e6b3a6-1595-48e3-801d-28b36aafb581]
[2015-03-03 15:34:53,483: INFO/MainProcess] Received task: workflow.tasks.group_task[5e4f63b9-6968-4210-91f7-b89e939d1c9a]
[2015-03-03 15:34:53,484: INFO/MainProcess] Received task: workflow.tasks.sample_task_5[fc10ce62-5701-4c75-987e-7dac7b17bab6]
[2015-03-03 15:34:53,484: INFO/MainProcess] Received task: workflow.tasks.sample_task_7[9893dd87-844b-44a3-b5d8-bca086ee15ec]
[2015-03-03 15:34:53,485: INFO/MainProcess] Received task: celery.chord_unlock[018c1e9e-3b2e-4a4c-90ed-5265b01eb9fb] eta:[2015-03-03 15:34:54.484729+00:00]
[2015-03-03 15:34:53,490: INFO/MainProcess] Task workflow.tasks.sample_task_3[89e6b3a6-1595-48e3-801d-28b36aafb581] succeeded in 5.06310376804s: True
[2015-03-03 15:34:53,527: INFO/MainProcess] Task workflow.tasks.group_task[5e4f63b9-6968-4210-91f7-b89e939d1c9a] succeeded in 0.043258280959s: True
[2015-03-03 15:34:55,328: INFO/MainProcess] Received task: celery.chord_unlock[018c1e9e-3b2e-4a4c-90ed-5265b01eb9fb] eta:[2015-03-03 15:34:56.327396+00:00]
[2015-03-03 15:34:55,329: INFO/MainProcess] Task celery.chord_unlock[018c1e9e-3b2e-4a4c-90ed-5265b01eb9fb] retry: Retry in 1s
[2015-03-03 15:34:57,336: INFO/MainProcess] Received task: celery.chord_unlock[018c1e9e-3b2e-4a4c-90ed-5265b01eb9fb] eta:[2015-03-03 15:34:58.333722+00:00]
[2015-03-03 15:34:57,339: INFO/MainProcess] Task celery.chord_unlock[018c1e9e-3b2e-4a4c-90ed-5265b01eb9fb] retry: Retry in 1s
[2015-03-03 15:34:58,424: INFO/MainProcess] Received task: celery.chord_unlock[018c1e9e-3b2e-4a4c-90ed-5265b01eb9fb] eta:[2015-03-03 15:34:59.423050+00:00]
[2015-03-03 15:34:58,425: INFO/MainProcess] Task celery.chord_unlock[018c1e9e-3b2e-4a4c-90ed-5265b01eb9fb] retry: Retry in 1s
[2015-03-03 15:34:58,517: INFO/MainProcess] Received task: workflow.tasks.sample_task_8[aff7d810-9989-4dfe-8cca-1032efcf4624]
[2015-03-03 15:34:58,521: INFO/MainProcess] Received task: workflow.tasks.sample_task_6[b758014f-5837-4bed-9426-5c2e03af2c2f]
[2015-03-03 15:34:58,538: INFO/MainProcess] Task workflow.tasks.sample_task_7[9893dd87-844b-44a3-b5d8-bca086ee15ec] succeeded in 5.05185400695s: True
[2015-03-03 15:34:58,539: INFO/MainProcess] Task workflow.tasks.sample_task_5[fc10ce62-5701-4c75-987e-7dac7b17bab6] succeeded in 5.05522017297s: True
[2015-03-03 15:35:01,325: INFO/MainProcess] Received task: celery.chord_unlock[018c1e9e-3b2e-4a4c-90ed-5265b01eb9fb] eta:[2015-03-03 15:35:02.322996+00:00]
[2015-03-03 15:35:01,326: INFO/MainProcess] Task celery.chord_unlock[018c1e9e-3b2e-4a4c-90ed-5265b01eb9fb] retry: Retry in 1s
[2015-03-03 15:35:03,337: INFO/MainProcess] Received task: celery.chord_unlock[018c1e9e-3b2e-4a4c-90ed-5265b01eb9fb] eta:[2015-03-03 15:35:04.335374+00:00]
[2015-03-03 15:35:03,339: INFO/MainProcess] Task celery.chord_unlock[018c1e9e-3b2e-4a4c-90ed-5265b01eb9fb] retry: Retry in 1s
[2015-03-03 15:35:03,594: INFO/MainProcess] Task workflow.tasks.sample_task_6[b758014f-5837-4bed-9426-5c2e03af2c2f] succeeded in 5.0567153669s: True
[2015-03-03 15:35:03,595: INFO/MainProcess] Task workflow.tasks.sample_task_8[aff7d810-9989-4dfe-8cca-1032efcf4624] succeeded in 5.05580001394s: True
[2015-03-03 15:35:05,315: INFO/MainProcess] Task celery.chord_unlock[018c1e9e-3b2e-4a4c-90ed-5265b01eb9fb] succeeded in 0.0105995119084s: None
[2015-03-03 15:35:05,316: INFO/MainProcess] Received task: workflow.tasks.sample_task_9[2492e5e0-d6df-402c-b5a5-ab15d99b42ad]
[2015-03-03 15:35:10,336: INFO/MainProcess] Received task: workflow.tasks.end_workflow[4a2c0a15-77c9-417e-bd21-8a7f1d248981]
[2015-03-03 15:35:10,357: INFO/MainProcess] Task workflow.tasks.sample_task_9[2492e5e0-d6df-402c-b5a5-ab15d99b42ad] succeeded in 5.04111725814s: True
[2015-03-03 15:35:10,374: INFO/MainProcess] Task workflow.tasks.end_workflow[4a2c0a15-77c9-417e-bd21-8a7f1d248981] succeeded in 0.0367547438946s: None
I don't remember exactly where that identifier gets generated, but celery does generate task identifiers for internal use depending on what canvas primitives are used. The biggest problem I ran into was the chord_unlock tasks, which you have no control over.
This won't help you in the short term, but I created a patch a long time ago that allows you to pass a 'root_id' argument when creating a workflow; it is then propagated to all child tasks and can be used to link them all together as you describe. It also includes a 'parent_id' field to further assist with traversing and tracking workflows.
The idea was accepted, but the code hasn't landed in an official release yet. It is supposed to land in the next major release v3.2.0 and you can monitor it here: https://github.com/celery/celery/pull/1318
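Until that lands, a rough sketch of what you can already do with the stock result API (this is not the author's code and assumes a reasonably recent Celery): for a chain, apply_async() returns the AsyncResult of the last task, and each result links back to the previous step through .parent, so you can walk the workflow and print a state per node:
result = workflow.apply_async()

node = result
while node is not None:
    # chord headers may appear as GroupResult nodes, which have no single
    # state, hence the getattr fallback
    state = getattr(node, 'state', None)
    print(node.id, state)
    node = getattr(node, 'parent', None)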

How to use foreach (adding each record to Solr) inside foreachRDD in Spark Streaming?

I have a problem with the Spark Streaming design pattern for using foreachRDD.
I applied the design pattern from the guide:
http://spark.apache.org/docs/latest/streaming-programming-guide.html
- Guide sample code
dstream.foreachRDD(rdd => {
  rdd.foreachPartition(partitionOfRecords => {
    // ConnectionPool is a static, lazily initialized pool of connections
    val connection = ConnectionPool.getConnection()
    partitionOfRecords.foreach(record => connection.send(record))
    ConnectionPool.returnConnection(connection) // return to the pool for future reuse
  })
})
- My code
dsteam.foreachRDD(rdd => {
  rdd.foreachPartition(partitionOfRecords => {
    val connection = SolrConnectionPool.getConnection()
    partitionOfRecords.foreach(record => connection.add(makeSolrInputDocument(record)))
    SolrConnectionPool.returnConnection(connection)
  })
})
** Got error logs **
> log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
init cloudSolrServer ===== > org.apache.solr.client.solrj.impl.CloudSolrServer@157dbcd4
init cloudSolrServer ===== > org.apache.solr.client.solrj.impl.CloudSolrServer@6fa2fa45
init cloudSolrServer ===== > org.apache.solr.client.solrj.impl.CloudSolrServer@5802ffe7
...................... (skip)
14/10/17 13:22:01 INFO JobScheduler: Finished job streaming job 1413519720000 ms.0 from job set of time 1413519720000 ms
14/10/17 13:22:01 INFO JobScheduler: Starting job streaming job 1413519720000 ms.1 from job set of time 1413519720000 ms
14/10/17 13:22:01 INFO SparkContext: Starting job: foreachPartition at SbclogCep.scala:49
14/10/17 13:22:01 INFO DAGScheduler: Got job 1 (foreachPartition at SbclogCep.scala:49) with 1 output partitions (allowLocal=false)
14/10/17 13:22:01 INFO DAGScheduler: Final stage: Stage 1(foreachPartition at SbclogCep.scala:49)
-------------------------------------------
Time: 1413519730000 ms
-------------------------------------------
14/10/17 13:22:57 INFO SparkContext: Starting job: foreachPartition at SbclogCep.scala:49
14/10/17 13:22:57 INFO TaskSchedulerImpl: Cancelling stage 1
14/10/17 13:22:57 INFO JobScheduler: Starting job streaming job 1413519730000 ms.0 from job set of time 1413519730000 ms
14/10/17 13:22:57 INFO JobScheduler: Finished job streaming job 1413519730000 ms.0 from job set of time 1413519730000 ms
14/10/17 13:22:57 INFO JobScheduler: Starting job streaming job 1413519730000 ms.1 from job set of time 1413519730000 ms
14/10/17 13:22:57 ERROR JobScheduler: Error running job streaming job 1413519720000 ms.1
org.apache.spark.SparkException: Job aborted due to stage failure: All masters are unresponsive! Giving up.
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
14/10/17 13:22:57 INFO SparkContext: Job finished: foreachPartition at SbclogCep.scala:49, took 2.6276E-5 s
-------------------------------------------
Time: 1413519740000 ms
-------------------------------------------
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: All masters are unresponsive! Giving up.
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
foreachRDD is not working... What am I doing wrong?
Please help me.