This is not a question about how to capture logging on Celery workers. Is there any way to capture Celery logging on a producer? What I want is to capture every log that gets generated by Celery on the producer when I call task.delay(...) or task.apply_async(...).
EDIT:
I don't want to capture worker logs on the producer. I want to capture everything that happens in Celery from the time I call apply_async until the task is sent to the broker.
No, there is no way to capture worker logs on the producer. All you get is the exception, if one is thrown. Any logging happens on the worker side, so you have to examine the logs of that particular worker, or, if you use a centralised log system, look for the logs from that worker...
Update: it seems you want to capture any logging from Celery on the producer (client) side. As far as I know, Celery and the underlying transport handling library (Kombu) do not log anything there. I could be wrong, of course, but I can't remember seeing any logging, and I have read the Celery code (Kombu not so much, to be fair) many times...
A possible solution is to make Celery workers send logs to some centralised system that your Celery client can access...
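If you still want to see whatever does get emitted on the producer side, a minimal sketch is to raise the log level of the client-side loggers and hook Celery's publish signals, which fire in the producer process around the call to apply_async. The app name and broker URL below are hypothetical placeholders:

    import logging

    from celery import Celery
    from celery.signals import after_task_publish, before_task_publish

    # Hypothetical app and broker URL -- replace with your own.
    app = Celery('producer', broker='amqp://localhost')

    # Raise the client-side loggers; Celery/Kombu emit very little here,
    # but anything they do log will now be visible.
    logging.basicConfig(level=logging.DEBUG)
    for name in ('celery', 'kombu', 'amqp'):
        logging.getLogger(name).setLevel(logging.DEBUG)

    @before_task_publish.connect
    def on_before_publish(sender=None, routing_key=None, **kwargs):
        # Fires in the producer process, just before the message is sent.
        logging.getLogger('producer').info('publishing %s to %s', sender, routing_key)

    @after_task_publish.connect
    def on_after_publish(sender=None, routing_key=None, **kwargs):
        # Fires in the producer process, once the message has reached the broker.
        logging.getLogger('producer').info('published %s to %s', sender, routing_key)

This doesn't capture worker activity, only what happens between your apply_async call and the handoff to the broker.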
We have a requirement where we need to showcase the resiliency of a Kafka cluster. To prove this, we have a use case where we need to run a producer and a consumer (I am thinking kafka-console-producer and kafka-console-consumer), preferably via CLI commands and/or scripts, continuously for 24 hours. We are not concerned with the message size or contents; the messages can be as small as possible and hold any value, say the current timestamp.
How can I achieve this?
There's nothing preventing you from doing this, and the problem isn't unique to Kafka.
You can use nohup to run a script as a daemon; otherwise, the commands will terminate when the console session ends. You could also use cron to schedule any script, at a minimum interval of every minute...
Or you can write your own app with a simple while(true) loop (see the sketch below).
Regardless, you will want a process supervisor to truly ensure the command remains running at all times.
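For example, a minimal producer loop in Python (assuming the kafka-python package, brokers at localhost:9092, and a hypothetical topic named resiliency-test; adjust all three to your cluster):

    import time
    from datetime import datetime, timezone

    from kafka import KafkaProducer  # pip install kafka-python

    # Hypothetical broker address -- point this at your cluster.
    producer = KafkaProducer(bootstrap_servers='localhost:9092')

    while True:
        # Send the current timestamp as a tiny payload, once per second.
        payload = datetime.now(timezone.utc).isoformat().encode('utf-8')
        producer.send('resiliency-test', value=payload)
        producer.flush()
        time.sleep(1)

Run it under nohup or a supervisor (systemd, supervisord) so it survives the console session and keeps publishing for the full 24 hours; a matching console consumer on the same topic covers the read side.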
I am running a sink task using connect-standalone.sh and connect-standalone.properties. I am doing this from a shell script, and I am not sure how to stop the sink task once all the data has been consumed.
I tried various settings in the properties file, like connections.max.idle.ms=5000, but nothing stops the sink.
I don't want to try distributed mode, as it requires REST API calls. Any suggestions on how to stop the sink task once there are no more messages to consume?
When running in standalone, the only way to stop a connector is to stop the connect process you started with connect-standalone.sh.
If you often want to start and stop connectors, I'd recommend reconsidering distributed mode, as it makes the life cycle of connectors easy to manage via the REST API.
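For reference, the REST calls involved are small; a sketch using Python's requests, assuming the default Connect REST port 8083 and a hypothetical connector named my-sink:

    import requests  # pip install requests

    # Default Connect REST endpoint and a hypothetical connector name.
    CONNECT_URL = 'http://localhost:8083'
    CONNECTOR = 'my-sink'

    # Pause the connector (it can be resumed later with /resume)...
    requests.put(f'{CONNECT_URL}/connectors/{CONNECTOR}/pause').raise_for_status()

    # ...or remove it entirely once it has caught up.
    requests.delete(f'{CONNECT_URL}/connectors/{CONNECTOR}').raise_for_status()

In standalone mode there is no equivalent; killing the connect-standalone.sh process is the only way to stop the connector.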
I have a critical Kafka application that needs to be up and running all the time. The source topics are created by the Debezium Kafka Connect connector for the MySQL binlog. Unfortunately, many things can go wrong with this setup. A lot of the time the Debezium connectors fail and need to be restarted, and so do my apps (without throwing any exception, they just hang and stop consuming). My manual way of testing and discovering the failure is to check the Kibana logs and then consume the suspicious topic through the terminal. I can mimic this in code, but it is obviously not a best practice. I wonder whether the Kafka Streams API offers the ability to do such a health check, and to check other parts of the Kafka cluster?
Another point that bothers me is whether I can keep the stream alive and rejoin the topics when the connectors are up again.
You can check the Kafka Streams state to see if it is rebalancing/running, which would indicate healthy operation. However, if no data is getting into the topology, there will likely be no errors either, so you also need to look up the health of your upstream dependencies.
Overall, it sounds like you might want to invest some time into monitoring tools like Consul or Sensu, which can run local service health checks and send out alerts when services go down, or at the very least Elasticsearch alerting.
As far as Kafka health checking goes, you can do that in several ways (a sketch covering the socket and AdminClient checks follows this list):
Are the broker and ZooKeeper processes running? (SSH to the node and check the processes)
Are the broker and ZooKeeper ports open? (use a socket connection)
Are there important JMX metrics you can track? (Metricbeat)
Can you find an active controller broker? (use AdminClient#describeCluster)
Do at least the minimum number of brokers you require show up in the cluster metadata (which can also be obtained from AdminClient)?
Do the topics you use have the proper configuration (retention, min.insync.replicas, replication factor, partition count, etc.)? (again, use AdminClient)
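A minimal health-check sketch in Python, assuming the confluent-kafka package, brokers at a hypothetical localhost:9092, a hypothetical topic my-topic, and an example minimum broker count; it covers the port, controller, broker-count, and topic checks from the list above:

    import socket

    from confluent_kafka.admin import AdminClient  # pip install confluent-kafka

    BOOTSTRAP = 'localhost:9092'   # hypothetical broker address
    MIN_BROKERS = 3                # hypothetical minimum you consider healthy

    # 1) Is the broker port open at all? (cheap TCP-level check)
    host, port = BOOTSTRAP.split(':')
    with socket.create_connection((host, int(port)), timeout=5):
        pass  # raises OSError if the port is unreachable

    # 2) Ask the cluster for its metadata via the admin client.
    admin = AdminClient({'bootstrap.servers': BOOTSTRAP})
    metadata = admin.list_topics(timeout=10)

    assert metadata.controller_id != -1, 'no active controller known'
    assert len(metadata.brokers) >= MIN_BROKERS, 'not enough brokers in metadata'

    # 3) Spot-check a topic's partition count and replication.
    topic = metadata.topics.get('my-topic')   # hypothetical topic name
    assert topic is not None and len(topic.partitions) > 0, 'topic missing or empty'
    for p in topic.partitions.values():
        assert len(p.replicas) >= 2, 'partition %d is under-replicated' % p.id

Wrap checks like these into whatever alerting you already run (Consul, Sensu, a cron job) rather than leaving them as bare asserts.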
We are using Airflow (1.10.3) with the Celery executor (4.1.1 (latentcall)) and SQS as the broker. While debugging an issue, we tried our hand at the Celery CLI and found out that the SQS broker is not supported by any of the inspect commands or monitoring tools, e.g. Flower.
Is there any way we can monitor the tasks or events on celery workers?
We have tried the celery monitor as follows:
celery events -b sqs://
But it shows that no workers were discovered and no tasks were selected.
The celery inspect command help page shows:
Availability: RabbitMQ (AMQP) and Redis transports.
Please let me know if I am missing something, or whether it is even possible to monitor Celery workers with SQS.
The SQS transport does not provide support for monitoring/inspection (this is the main reason why I do not use it)... According to the latest documentation, Redis and RabbitMQ are the only broker types that support monitoring/inspection and remote control.
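If you need at least per-task visibility while keeping SQS as the broker, one workaround (not a replacement for events, inspect, or Flower) is to configure a result backend and poll task state from there. A minimal sketch, with a hypothetical Redis backend and a hypothetical task name:

    from celery import Celery

    # SQS stays the broker; the (hypothetical) Redis result backend stores
    # per-task state so you can poll it without worker events.
    app = Celery('tasks',
                 broker='sqs://',   # uses boto3/AWS credentials from the environment
                 backend='redis://localhost:6379/0')

    result = app.send_task('tasks.add', args=[2, 2])
    print(result.id, result.state)   # e.g. PENDING or SUCCESS
    print(result.get(timeout=30))    # blocks until a worker stores the result

This gives task-level status only; worker-level inspection and remote control still require a RabbitMQ or Redis broker.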
We use Celery with a RabbitMQ backend, and some of our servers hang with the error "[Errno 113] No route to host" (which may be a result of half of our servers being in the US and half in Europe).
I need to be sure that every task is delivered. Unfortunately, I have no idea how to retry tasks sent using send_task with a string identifier (the server that sends the tasks has no access to the code of the remote worker), like this:
send_task("remote1.tasks.add_data", args=[...], kwargs={}, queue="remote1")
Is it possible to retry such task?
send_task just sends a message to the broker. If the exception is raised on the server that calls send_task, the message probably never reached the broker, so there is no task to retry, only an exception to handle on the calling side.
Otherwise, if your workers randomly raise this exception because they can't reach the broker for some reason, you can probably solve it by setting the Celery configuration variable
CELERY_ACKS_LATE = True
"Late ack means the task messages will be acknowledged after the task has been executed, not just before, which is the default behavior."
This means that if something goes wrong during the execution of the task on the worker, the broker doesn't receive the ack and another worker will execute the task.
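On the calling side you can also ask Celery/Kombu to retry the publish itself when the broker connection fails, via the retry and retry_policy options documented for apply_async; as far as I know send_task forwards them as well. A minimal sketch, with the task name and queue taken from the question, a hypothetical broker URL, and example policy values:

    from celery import Celery

    app = Celery('producer', broker='amqp://localhost')  # hypothetical broker URL

    app.send_task(
        'remote1.tasks.add_data',
        args=[...],                 # same placeholder payload as in the question
        kwargs={},
        queue='remote1',
        retry=True,                 # retry the publish if the broker is unreachable
        retry_policy={
            'max_retries': 5,       # example values, tune for your latency
            'interval_start': 0.5,
            'interval_step': 1.0,
            'interval_max': 10.0,
        },
    )

If the publish still fails after the retries, the exception propagates to the caller, so you can catch it there and decide whether to re-queue or alert.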