Monitor Celery queue pending tasks with or without Flower - celery

I am trying to monitor my Celery queues so that if the number of tasks in a queue increases I can choose to spawn more workers.
How can I do this with or without Flower (the Celery monitoring tool)?
For example, I can get a list of all the workers like this:
curl -X GET http://localhost:5555/api/workers
{
    "celery#ip-172-0-0-1": {
        "status": true,
        "queues": [
            "tasks"
        ],
        "running_tasks": 0,
        "completed_tasks": 0,
        "concurrency": 1
    },
    "celery#ip-172-0-0-2": {
        "status": true,
        "queues": [
            "tasks"
        ],
        "running_tasks": 0,
        "completed_tasks": 5,
        "concurrency": 1
    },
    "celery#ip-172-0-0-3": {
        "status": true,
        "queues": [
            "tasks"
        ],
        "running_tasks": 0,
        "completed_tasks": 5,
        "concurrency": 1
    }
}
Similarly, I need a list of pending tasks per queue name so that I can start a worker on that queue.
Thanks for not downvoting this question.

Reserved tasks do not make sense here; they only cover the portion of tasks that have been received but are not yet running.
You could use rabbitmq-management to monitor the queues if you are using RabbitMQ as the broker. The Celery documentation also describes some ways to do the same thing.
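If you just need the number of messages sitting in a queue (i.e. not yet picked up by any worker), you can ask the broker directly from Celery itself. A minimal sketch, assuming a RabbitMQ broker at the default URL and the "tasks" queue from the question (adjust both for your setup):

from celery import Celery

# hypothetical broker URL; point this at your own broker
app = Celery("tasks", broker="amqp://guest:guest@localhost:5672//")

def pending_count(queue_name):
    """Count of messages waiting in the broker queue, i.e. not yet received by a worker."""
    with app.connection_or_acquire() as conn:
        # passive=True only inspects the queue, it never creates it
        return conn.default_channel.queue_declare(
            queue=queue_name, passive=True).message_count

print(pending_count("tasks"))

You could poll this in a loop and spawn an extra worker for a queue once the count crosses a threshold. With RabbitMQ the same numbers are available from rabbitmqctl list_queues name messages or from the management API (e.g. curl -u guest:guest http://localhost:15672/api/queues), and recent Flower versions also expose a queue-length endpoint (GET /api/queues/length), though for RabbitMQ that too relies on the management plugin; check your Flower version's API docs.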

Related

Pod is being terminated and created again due to scale-up and it's running twice

I have an application that runs some code and at the end sends an email with a report of the data. When I deploy pods on GKE, certain pods get terminated and a new pod is created due to autoscaling, but the problem is that the termination happens after my code has finished, and the email ends up being sent twice for the same data.
Here is the JSON file used by the deploy API:
{
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {
        "name": "$name",
        "namespace": "$namespace"
    },
    "spec": {
        "template": {
            "metadata": {
                "name": "********"
            },
            "spec": {
                "priorityClassName": "high-priority",
                "containers": [
                    {
                        "name": "******",
                        "image": "$dockerScancatalogueImageRepo",
                        "imagePullPolicy": "IfNotPresent",
                        "env": $env,
                        "resources": {
                            "requests": {
                                "memory": "2000Mi",
                                "cpu": "2000m"
                            },
                            "limits": {
                                "memory": "2650Mi",
                                "cpu": "2650m"
                            }
                        }
                    }
                ],
                "imagePullSecrets": [
                    {
                        "name": "docker-secret"
                    }
                ],
                "restartPolicy": "Never"
            }
        }
    }
}
And here is a screenshot of the pod events:
Any idea how to fix that?
Thank you in advance.
"Perhaps you are affected by this "Note that even if you specify .spec.parallelism = 1 and .spec.completions = 1 and .spec.template.spec.restartPolicy = "Never", the same program may sometimes be started twice." from doc. What happens if you increase terminationgraceperiodseconds in your yaml file? – "
#danyL
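For reference, terminationGracePeriodSeconds sits at the pod-spec level of the Job template; a minimal sketch against the JSON above (the value 120 is just an arbitrary example, not something from the question):

"spec": {
    "template": {
        "spec": {
            "terminationGracePeriodSeconds": 120,
            "priorityClassName": "high-priority",
            ...
        }
    }
}

Note that, per the docs quoted above, a Job can still occasionally start the same program twice, so making the email step idempotent (e.g. recording that a report was already sent before sending it again) is the more robust fix.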
My problem was that I had other jobs deploying pods on my nodes with higher priority, so Kubernetes was trying to terminate my running pods even though the job was already done and the email had already been sent. I fixed the problem by adjusting the resource requests and limits in all my JSON files. I don't know if it's the perfect solution, but for now it solved my problem.
Thank you all for your help.

JBPM created case and tasks not visible or accessible

I have successfully integrated Keycloak into jBPM for user management and can log in to Business Central and Case Management using Keycloak. I have also successfully configured the kie-server with Keycloak credentials and can deploy a stripped-down version of the IT Orders sample application on the running sample-server kie-server. In Business Central I can see that my container itorders_1.0.0-SNAPSHOT is up and running, and calling GET /kie-server/services/rest/server/containers gives the output below:
{
    "type": "SUCCESS",
    "msg": "List of created containers",
    "result": {
        "kie-containers": {
            "kie-container": [
                {
                    "container-id": "itorders_1.0.0-SNAPSHOT",
                    "release-id": {
                        "group-id": "itorders",
                        "artifact-id": "itorders",
                        "version": "1.0.0-SNAPSHOT"
                    },
                    "resolved-release-id": {
                        "group-id": "itorders",
                        "artifact-id": "itorders",
                        "version": "1.0.0-SNAPSHOT"
                    },
                    "status": "STARTED",
                    "scanner": {
                        "status": "DISPOSED",
                        "poll-interval": null
                    },
                    "config-items": [
                        {
                            "itemName": "KBase",
                            "itemValue": "",
                            "itemType": "BPM"
                        },
                        {
                            "itemName": "KSession",
                            "itemValue": "",
                            "itemType": "BPM"
                        },
                        {
                            "itemName": "MergeMode",
                            "itemValue": "MERGE_COLLECTIONS",
                            "itemType": "BPM"
                        },
                        {
                            "itemName": "RuntimeStrategy",
                            "itemValue": "PER_CASE",
                            "itemType": "BPM"
                        }
                    ],
                    "messages": [
                        {
                            "severity": "INFO",
                            "timestamp": {
                                "java.util.Date": 1598900747932
                            },
                            "content": [
                                "Release id successfully updated for container itorders_1.0.0-SNAPSHOT"
                            ]
                        }
                    ],
                    "container-alias": "itorders"
                }
            ]
        }
    }
}
I can get the case definitions using GET /kie-server/services/rest/server/queries/cases
{
    "definitions": [
        {
            "name": "Order for IT hardware",
            "id": "itorders.orderhardware",
            "version": "1.0",
            "case-id-prefix": "IT",
            "container-id": "itorders_1.0.0-SNAPSHOT",
            "adhoc-fragments": [
                {
                    "name": "Prepare hardware spec",
                    "type": "HumanTaskNode"
                }
            ],
            "roles": {
                "owner": 1
            },
            "milestones": [],
            "stages": []
        }
    ]
}
I can then do a POST /kie-server/services/rest/server/containers/itorders_1.0.0-SNAPSHOT/cases/itorders.orderhardware/instances, which correctly returns the case ID of the created case, e.g. IT-0000000014. The call returns HTTP status code 201.
However, when I do a GET /kie-server/services/rest/server/queries/cases/instances, no instances are returned, as shown below:
{
    "instances": []
}
When I create a case in the jBPM Case Management Showcase I get the green prompt showing that the case was successfully created; however, no open cases appear in the grid, even if I refresh the screen.
I can see the process instance associated with the case in the process instances view, including the diagram, which shows that "Prepare hardware spec" is the active, current activity. However, viewing the tasks associated with the process does not show any tasks. Similarly, the task inbox of the user I expect to claim the task is also empty.
Note that I am using token-based authentication with Keycloak and executed the above REST calls using Postman.
Why can I not view the case instance I created? Why can I not view the tasks associated with the process instance?
With the query GET /kie-server/services/rest/server/queries/cases/instances you can only see instances on which your user is set up as a potential owner. Make sure the user whose token you are using is set up as a potential owner of the case.
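In other words, the case file needs the user behind the Keycloak token assigned to a case role. A hedged sketch of a start payload that does that, using the "owner" role from the case definition above and a hypothetical user wbadmin; double-check the exact field names against your kie-server version's documentation:

POST /kie-server/services/rest/server/containers/itorders_1.0.0-SNAPSHOT/cases/itorders.orderhardware/instances

{
    "case-data": {},
    "case-user-assignments": {
        "owner": "wbadmin"
    },
    "case-group-assignments": {}
}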

Kafka rebalance the data in a topic due to slow(er) consumer

For example, say I have a topic with 4 partitions. I send 4k messages to this topic and each partition gets 1k messages. Due to outside factors, 3 of the consumers process all 1k of their respective messages. However, the 4th partition's consumer was only able to get through 200 messages, leaving 800 messages still to process. Is there a mechanism that allows me to "rebalance" the data in the topic, say by giving partitions 1-3 200 messages each of partition 4's data, leaving all partitions with 200 messages apiece to process?
I am not looking for a way to add additional nodes to the consumer group and have Kafka balance the partitions.
Added output from the partition reassignment tool:
Current partition replica assignment
{
    "version": 1,
    "partitions": [
        {
            "topic": "MyTopic",
            "partition": 0,
            "replicas": [
                0
            ],
            "log_dirs": [
                "any"
            ]
        },
        {
            "topic": "MyTopic",
            "partition": 1,
            "replicas": [
                0
            ],
            "log_dirs": [
                "any"
            ]
        },
        {
            "topic": "MyTopic",
            "partition": 4,
            "replicas": [
                0
            ],
            "log_dirs": [
                "any"
            ]
        },
        {
            "topic": "MyTopic",
            "partition": 3,
            "replicas": [
                0
            ],
            "log_dirs": [
                "any"
            ]
        },
        {
            "topic": "MyTopic",
            "partition": 2,
            "replicas": [
                0
            ],
            "log_dirs": [
                "any"
            ]
        },
        {
            "topic": "MyTopic",
            "partition": 5,
            "replicas": [
                0
            ],
            "log_dirs": [
                "any"
            ]
        }
    ]
}
Proposed partition reassignment configuration
{
    "version": 1,
    "partitions": [
        {
            "topic": "MyTopic",
            "partition": 3,
            "replicas": [
                0
            ],
            "log_dirs": [
                "any"
            ]
        },
        {
            "topic": "MyTopic",
            "partition": 0,
            "replicas": [
                0
            ],
            "log_dirs": [
                "any"
            ]
        },
        {
            "topic": "MyTopic",
            "partition": 5,
            "replicas": [
                0
            ],
            "log_dirs": [
                "any"
            ]
        },
        {
            "topic": "MyTopic",
            "partition": 2,
            "replicas": [
                0
            ],
            "log_dirs": [
                "any"
            ]
        },
        {
            "topic": "MyTopic",
            "partition": 4,
            "replicas": [
                0
            ],
            "log_dirs": [
                "any"
            ]
        },
        {
            "topic": "MyTopic",
            "partition": 1,
            "replicas": [
                0
            ],
            "log_dirs": [
                "any"
            ]
        }
    ]
}
The partition is assigned when a message is produced. Messages are never automatically moved between partitions. In general, each partition can have multiple consumers (with different consumer group ids) consuming at different paces, so the broker can't move messages between partitions based on the slowness of one consumer (group). There are a few things you can try, though:
more partitions, hoping for a fairer distribution of load (you can have more partitions than consumers)
have producers explicitly set the partition on each message, to produce a distribution between partitions that the consumers can better cope with (see the sketch after this list)
have consumers monitor their lag and actively unsubscribe from partitions when they fall behind, so as to let other consumers pick up the load.
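As a sketch of the second option, here is what explicitly setting the partition looks like with the kafka-python client (the client choice and the round-robin scheme are illustrative assumptions, not something from the question):

from kafka import KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")

# Spread records evenly across the topic's partitions instead of relying on
# the default partitioner (key hash / sticky), so no single partition lags behind.
NUM_PARTITIONS = 4
for i in range(4000):
    payload = b"msg-%d" % i
    producer.send("MyTopic", value=payload, partition=i % NUM_PARTITIONS)

producer.flush()

Explicit partitioning only helps if the uneven load comes from the partitioner (e.g. skewed keys); if one consumer is simply slower, the lag-monitoring option is the better fit.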
A couple of things you can do to improve performance:
Increase the number of partitions.
Increase the number of consumers in the consumer group consuming the partitions.
The first will rebalance the load across your partitions and the second will increase the parallelism of your consumers so that messages are consumed more quickly.
I hope this helps. You can refer to this link for more background:
https://xyu.io/2016/02/29/balancing-kafka-on-jbod/
Kafka consumers are part of consumer groups. A group has one or more consumers in it. Each partition gets assigned to one consumer.
If you have more consumers than partitions, then some of your consumers will be idle. If you have more partitions than consumers, more than one partition may get assigned to a single consumer.
Whenever a new consumer joins, a rebalance gets initiated and the new consumer is assigned some partitions previously assigned to other consumers.
For example, if there are 20 partitions all being consumed by one consumer, and another consumer joins, there'll be a rebalance.
During a rebalance, the consumer group "pauses".
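To see that behaviour concretely, here is a small kafka-python sketch (the topic name is taken from the question, everything else is assumed): start it once and the single consumer gets all partitions; start a second copy with the same group_id and a rebalance splits the partitions between the two processes.

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "MyTopic",
    bootstrap_servers="localhost:9092",
    group_id="my-consumer-group",   # same group id => partitions are shared
    auto_offset_reset="earliest",
)

for record in consumer:
    print(record.partition, record.offset, record.value)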

How to get percentage of job completion for a Spark Job?

I have been looking for a way to get the percentage of completion for a job, given the corresponding job ID.
Right now, the Spark JobServer UI shows the status for a running job as:
{
    "duration": "Job not done yet",
    "classPath": "jobserver.spark.sql.SparkJobServerClient",
    "startTime": "2017-11-13T11:22:46.030+05:30",
    "context": corresponding_context_name,
    "status": "RUNNING",
    "jobId": "ef16374c-f370-442c-9cea-25aa1b427a0a"
}
And immediately afterwards, the completion status would look like:
{
    "duration": "5.909 secs",
    "classPath": "jobserver.spark.sql.SparkJobServerClient",
    "startTime": "2017-11-13T11:22:46.030+05:30",
    "context": corresponding_context_name,
    "result": "2017-10-24-00-00-00,3120,9958,25.74,23.61,2.7,7195,4.31,4.54,8.84,13.41,9.96,8.11,6.77,5.59,4.68,3.96,3.39,2.94,15.5,4.94,2.61,0.45,1,4.6146717E7\n",
    "status": "FINISHED",
    "jobId": "ef16374c-f370-442c-9cea-25aa1b427a0a"
}
What I would like is to get the percentage of completion of the job while it is still in the processing stage, so that it can be shown on the frontend.
If anybody could help me with this, it would be very helpful.
P.S. This is my first post here; apologies for any formatting mistakes. Thanks.
This feature is not available in Spark JobServer.
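One possible workaround, outside of Spark JobServer itself: while the context is running you can query Spark's own monitoring REST API and derive a rough percentage from the task counts it reports. A sketch, assuming the Spark UI of the running context is reachable on port 4040 and using a placeholder application id:

import requests

SPARK_UI = "http://localhost:4040"           # Spark UI of the running context (placeholder)
APP_ID = "app-20171113112246-0000"           # placeholder; look yours up via /api/v1/applications

def job_progress():
    """Rough overall progress: completed tasks / total tasks across the app's jobs."""
    jobs = requests.get("%s/api/v1/applications/%s/jobs" % (SPARK_UI, APP_ID)).json()
    total = sum(j["numTasks"] for j in jobs)
    done = sum(j["numCompletedTasks"] for j in jobs)
    return 100.0 * done / total if total else 0.0

print("%.1f%% of tasks completed" % job_progress())

This is task-level progress rather than a true "percentage of the job", but it is usually good enough to drive a progress bar on a frontend.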

Data Factory Execution Order on Pipeline

I'm having some trouble with the execution order of scheduled pipelines in Data Factory.
My pipeline is as follows:
{
    "name": "Copy_Stage_Upsert",
    "properties": {
        "description": "",
        "activities": [
            {
                "type": "Copy",
                "typeProperties": {
                    "source": {
                        "type": "BlobSource"
                    },
                    "sink": {
                        "type": "SqlDWSink",
                        "writeBatchSize": 10000,
                        "writeBatchTimeout": "00:10:00"
                    }
                },
                "inputs": [
                    {
                        "name": "csv_extracted_file"
                    }
                ],
                "outputs": [
                    {
                        "name": "stage_table"
                    }
                ],
                "policy": {
                    "timeout": "01:00:00",
                    "retry": 2
                },
                "scheduler": {
                    "frequency": "Hour",
                    "interval": 1
                },
                "name": "Copy to stage table"
            },
            {
                "type": "Copy",
                "typeProperties": {
                    "source": {
                        "type": "SqlDWSource",
                        "sqlReaderQuery": "SELECT * from table WHERE id NOT IN (SELECT id from stage_table) UNION ALL SELECT * from stage_table"
                    },
                    "sink": {
                        "type": "SqlDWSink",
                        "writeBatchSize": 10000,
                        "writeBatchTimeout": "00:10:00"
                    }
                },
                "inputs": [
                    {
                        "name": "stage_table"
                    }
                ],
                "outputs": [
                    {
                        "name": "upsert_table"
                    }
                ],
                "policy": {
                    "timeout": "01:00:00",
                    "retry": 2
                },
                "scheduler": {
                    "frequency": "Hour",
                    "interval": 1
                },
                "name": "Copy"
            },
            {
                "type": "SqlServerStoredProcedure",
                "typeProperties": {
                    "storedProcedureName": "sp_rename_tables"
                },
                "inputs": [
                    {
                        "name": "upsert_table"
                    }
                ],
                "outputs": [
                    {
                        "name": "table"
                    }
                ],
                "scheduler": {
                    "frequency": "Hour",
                    "interval": 1
                },
                "name": "Rename tables"
            }
        ],
        "start": "2017-02-09T18:00:00Z",
        "end": "9999-02-06T15:00:00Z",
        "isPaused": false,
        "hubName": "",
        "pipelineMode": "Scheduled"
    }
}
For simplicity, imagine that I have one pipeline called A with three simple tasks:
Task 1, Task 2 and finally Task 3.
Scenario A
One execution of Pipeline A scheduled.
It runs as:
Task 1 -> Task 2 -> Task 3
Scenario B
Two or more executions of Pipeline A scheduled to be executed.
It runs as:
First Scheduled Pipeline Task 1 -> Second Scheduled Pipeline Task 1 -> First Scheduled Pipeline Task 2 -> Second Scheduled Pipeline Task 2 -> First Scheduled Pipeline Task 3 -> Second Scheduled Pipeline Task 3.
Is it possible to run the second scenario as:
First Scheduled Pipeline Task 1 -> First Scheduled Pipeline Task 2 -> First Scheduled Pipeline Task 3, Second Scheduled Pipeline Task 1 -> Second Scheduled Pipeline Task 2 -> Second Scheduled Pipeline Task 3
In other words, I need to finish the first scheduled pipeline before the second pipeline starts.
Thank you in advance!
It's possible. However, it will require some fake input and output datasets to enforce the dependency behaviour and create the chain as you describe. So possible, but a dirty hack!
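Very roughly, the idea is to give the last activity of a run a dummy output dataset and feed that same dataset in as an extra input to the first activity, so ADF v1's slice-dependency tracking forces the ordering. A sketch of just the first activity's inputs, where "chain_marker" is a hypothetical dummy dataset (you would still need to point it at the previous time slice, which is where the complication below comes in):

{
    "type": "Copy",
    "inputs": [
        {
            "name": "csv_extracted_file"
        },
        {
            "name": "chain_marker"
        }
    ],
    "outputs": [
        {
            "name": "stage_table"
        }
    ]
}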
This isn't a great solution, and it will become complicated if your outputs/downstream datasets in the second pipeline have different time slice intervals from the first. I recommend testing and understanding this.
Really ADF isn't designed to do what you want. It's not a tool to sequence things like steps in a SQL Agent job. ADF is for scale and parallel work streams.
As I understand from whispers from Microsoft peeps, there might be more event-driven scheduling coming soon in ADF. But I don't know for sure.
Hope this helps.