I would like to know if there is a REST API call to list archived Tasks in jBPM 6.x.
As far as I know, there are just two query calls available:
[GET] /query/runtime/task
[GET] /query/task
Neither of those can retrieve all archived tasks. I saw in the BPM Suite dashboard that all tasks can be listed somehow. I'm wondering if there is an API available for that operation.
To list completed tasks, try the following REST call:
[GET] /query/runtime/task?taskstatus=Completed
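As a rough illustration of the call above, here is a minimal Python sketch that only builds the request URL. The base URL (host, port, and the `business-central/rest` context) is an assumption and must be replaced with the values of your own installation; the helper name `task_query_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: placeholder base path; substitute the host, port, and
# application context of your own jBPM installation.
BASE_URL = "http://localhost:8080/business-central/rest"


def task_query_url(base_url, **params):
    """Build a /query/runtime/task URL with the given query parameters."""
    query = urlencode(params)
    url = f"{base_url.rstrip('/')}/query/runtime/task"
    return f"{url}?{query}" if query else url


if __name__ == "__main__":
    # Same query as in the answer: tasks whose status is Completed.
    print(task_query_url(BASE_URL, taskstatus="Completed"))
```

The resulting URL can then be fetched with any HTTP client, using whatever authentication your server requires.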
Related
I am trying to implement a REST client for Flink to submit jobs through Flink's REST services. I also want to integrate Flink with Kubernetes natively. I have decided to use “Application Mode” as the deployment mode, following the Flink documentation.
I have already implemented a job and packaged it as a jar, and I have tested it on standalone Flink. My aim is to move to Kubernetes and deploy my application in Application Mode via Flink's REST API.
I have already investigated the samples in the Flink documentation (Native Kubernetes), but I cannot find an example of running the same samples via the REST services (especially how to set --target kubernetes-application/kubernetes-session or other parameters).
In addition to samples, I checked out the Flink sources from GitHub and tried to find some sample implementation or get some clue.
I think the following are related to my case:
org.apache.flink.client.program.rest.RestClusterClient
org.apache.flink.kubernetes.KubernetesClusterDescriptorTest.testDeployApplicationCluster
But they are too complicated for me to work out the points below.
For application mode, is there any need to initialize a container to serve the Flink REST services before submitting a job? If so, is it the JobManager?
For application mode, how can I set the same command line parameters via the REST services?
For session mode, in the command line samples, kubernetes-session.sh is executed before job submission to initialize a JobManager container. How should I do this step via a REST client?
For session mode, how can I set the same command line parameters via the REST services? Although the command line samples pass the job .jar as a parameter, should I upload the jar before submitting the job?
Could you please provide me some clue/sample to continue my implementation?
Best regards,
Burcu
I suspect that if you study the implementation of the Apache Flink Kubernetes Operator you'll find some clues.
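For the session-mode part of the question only: once a session cluster's REST endpoint is reachable, Flink's REST API lets a client upload a jar (POST /jars/upload) and then run it (POST /jars/<jarid>/run). The sketch below is a minimal illustration of the second step, not a full client. The helper names are hypothetical, and the field names `entryClass`, `programArgs`, and `parallelism` should be checked against the REST API reference for your Flink version. It does not cover application mode or creating the cluster itself.

```python
import json
from urllib.parse import quote


def jar_id_from_upload(response_body):
    """Extract the jar id from the JSON returned by POST /jars/upload.

    The "filename" field holds a server-side path; the jar id used by
    later calls is its last path component.
    """
    filename = json.loads(response_body)["filename"]
    return filename.replace("\\", "/").rsplit("/", 1)[-1]


def run_request(rest_url, jar_id, entry_class, program_args="", parallelism=1):
    """Return (url, json_body) for POST /jars/<jar_id>/run."""
    url = f"{rest_url.rstrip('/')}/jars/{quote(jar_id, safe='')}/run"
    body = json.dumps({
        "entryClass": entry_class,      # main class inside the jar
        "programArgs": program_args,    # arguments passed to the job
        "parallelism": parallelism,
    })
    return url, body
```

The returned URL and body can be sent with any HTTP client as a POST with a JSON content type.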
I want to pass the runid of a Data Fusion pipeline to a function upon pipeline completion, but I am not able to find any run-time variable that holds this value. Please help!
As an update to the previous answer, the first thing to do is to obtain the details of the deployed pipelines in a given namespace. For this, the following endpoint should be queried: '/v3/namespaces/${NAMESPACE}/apps', where ${NAMESPACE} is the namespace where the pipeline is deployed.
This endpoint returns a list with the pipelines deployed on this namespace ${NAMESPACE} (not the pipeline JSON, just a high level description list). Once the pipeline list is obtained, to obtain the run metrics of a given pipeline, the following endpoint should be called: '/v3/namespaces/${NAMESPACE}/apps/${PIPELINE}/workflows/DataPipelineWorkflow/runs', where ${PIPELINE} is the name of the pipeline. This endpoint will return the details of all the runs for this pipeline. This is where the run_id can be obtained. The field containing the run_id is actually called runid in this list.
With the run_id, you can then obtain, for example, all the run logs by querying the endpoint '{CDAP_ENDPOINT}/v3/namespaces/{NAMESPACE}/apps/{PIPELINE}/workflows/DataPipelineWorkflow/runs/{run["runid"]}/logs?start={run["start"]}&stop={run["start"]}'. This endpoint is written as a Python format string, where run is a dictionary containing the details of a particular run.
As explained in the CDAP microservice guide, to call these endpoints, the CDAP endpoint must be obtained by running the command: gcloud beta data-fusion instances describe --project=${PROJECT} --location=${REGION} --format="value(apiEndpoint)" ${INSTANCE_ID}. The authentication token will also be needed and this can be found through running: gcloud auth print-access-token.
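Putting the steps above together, here is a minimal Python sketch that lists the runs of a pipeline and picks out a runid. The function names are hypothetical. It assumes the runs endpoint returns a JSON list of objects carrying the `runid` and `start` fields mentioned above, and that the token from gcloud auth print-access-token is sent as a bearer token.

```python
import json
import urllib.request


def runs_url(cdap_endpoint, namespace, pipeline):
    """URL listing all runs of a deployed pipeline."""
    return (f"{cdap_endpoint.rstrip('/')}/v3/namespaces/{namespace}"
            f"/apps/{pipeline}/workflows/DataPipelineWorkflow/runs")


def fetch_runs(cdap_endpoint, namespace, pipeline, token):
    """Call the runs endpoint with a bearer token and decode the JSON list."""
    request = urllib.request.Request(
        runs_url(cdap_endpoint, namespace, pipeline),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def latest_runid(runs):
    """Return the runid of the run with the greatest start time, or None."""
    if not runs:
        return None
    return max(runs, key=lambda run: run["start"])["runid"]
```

A typical use would be `latest_runid(fetch_runs(endpoint, "default", "my_pipeline", token))`, where `endpoint` and `token` come from the two gcloud commands and the namespace and pipeline names are placeholders.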
The correct answer has been provided by @Edwin Elia in the comment section:
Retrieving the run-id of a Data Fusion pipeline within its run or the predecessor pipeline's is not possible currently. Here is an enhancement that you can track that would make it possible.
When talking about retrieving the run_id value after pipeline completion you should be able to use the REST API from the CDAP documentation to get information on the run including the run-id.
I am working on a cloud service platform that consists of getting tasks from users, executing them, and giving back the results.
TL;DR
Is there a way to have a "task queue" where tasks can be inserted via a REST API and pulled automatically by a Google Kubernetes Engine cluster, with automatic scaling guaranteed?
Long description
Users can send tasks in parallel, and each task is time-consuming and needs to be performed on a GPU. So, setting up an auto-scaling GPU cluster is what I thought of.
More specifically, in my idea, users would send tasks/data through a REST API, the REST API would fill a task queue, and the task queue itself would feed tasks to workers on the auto-scaling GPU cluster. Of course, there are other details (authentication, database, storage, etc.) that have to be addressed but are not the point of my question.
For reasons I don't specify here, the project is already started on the Google Cloud Platform, so switching to AWS or other providers is not an option.
From what I understood, things seem a bit different from standard Docker-only clusters in AWS; that is, we have to use the Google Kubernetes Engine (GKE) to set up the auto-scaling cluster, even for "simple" GPU-enabled Docker containers.
Looking at the not-so-exhaustive documentation, I know that queues are used, but what I don't know is whether feeding tasks to the cluster is handled automatically. Also, the so-called "Task Queue" service has been deprecated.
Thank you!
At first I thought Cloud Tasks queues might be the answer to your troubles, but this post seems to promote Cloud Pub/Sub as a better alternative.
After a quick chat with the batch developers, the current solution (until the batch service becomes public) is to adopt a third-party queue system like Slurm.
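The flow described in the question (a REST API fills a queue, workers pull from it) can be sketched in-process with Python's standard library. This is only a local stand-in to show the shape of the pattern. A real deployment would replace `queue.Queue` with a managed queue such as Cloud Pub/Sub and the threads with GPU worker pods, and it says nothing about autoscaling. All names are hypothetical.

```python
import queue
import threading


def run_workers(tasks, handler, num_workers=2):
    """Feed tasks through a queue to a pool of worker threads.

    Returns the handler results in no particular order.
    """
    task_queue = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            task = task_queue.get()
            if task is None:              # sentinel: no more work
                return
            outcome = handler(task)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    for task in tasks:                    # the "REST API" side: enqueue tasks
        task_queue.put(task)
    for _ in threads:                     # one sentinel per worker
        task_queue.put(None)
    for thread in threads:
        thread.join()
    return results
```

For example, `run_workers(range(4), lambda n: n * n, num_workers=3)` squares each task on one of three workers.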
As per my understanding:
A custom resource is just an AWS Lambda function that runs whenever the stack is provisioned or updated or deleted.
A resource provider is plain old code where one writes hooks for all the Stack operations (update, create, delete, etc).
I can't see why anyone would use the former over the latter. Resource providers seem easier to write and test.
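To make the first description concrete, here is a minimal sketch of a Lambda-backed custom resource handler in Python. CloudFormation invokes the function with an event containing `RequestType` (Create, Update, or Delete) and a pre-signed `ResponseURL`, and the function must PUT a JSON result back to that URL. The default physical id and the `Handled` data key are hypothetical, and the sketch does no real provisioning work or error handling.

```python
import json
import urllib.request


def build_response(event, status, physical_id, data=None, reason=""):
    """Build the JSON document CloudFormation expects back."""
    return {
        "Status": status,                       # "SUCCESS" or "FAILED"
        "Reason": reason,
        "PhysicalResourceId": physical_id,
        "StackId": event["StackId"],
        "RequestId": event["RequestId"],
        "LogicalResourceId": event["LogicalResourceId"],
        "Data": data or {},
    }


def handler(event, context):
    """Lambda entry point: dispatch on RequestType, then report back."""
    request_type = event["RequestType"]         # Create, Update, or Delete
    physical_id = event.get("PhysicalResourceId", "example-resource")
    # Hypothetical work: a real handler would create, update, or delete
    # something here depending on request_type.
    body = build_response(event, "SUCCESS", physical_id,
                          data={"Handled": request_type})
    request = urllib.request.Request(
        event["ResponseURL"],
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": ""},           # keep the pre-signed URL valid
    )
    urllib.request.urlopen(request)
    return body
```

If the handler never sends this response, the stack operation waits until it times out, which is one of the practical pain points of custom resources.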
One historical reason is that custom resources were the only option until recently:
CloudFormation Release History
18 Nov 2019 Resource Provider announcement
Is there any open source job scheduler with a REST API, suitable for commercial use, that supports features like:
Tree like Job dependency
Hold & Release
Rerun failed steps
Parallelism
Help would be appreciated :)
NOTE: we are looking for an open source alternative to TWS, Control-M, and AutoSys.
JobScheduler would seem to meet your requirements:
Open Source see: Open Source and Commercial Licenses
Rest API see: Web Service Integration
Parallelism see: Organisation of Jobs and Job Chains
I think that these areas are also covered (I downloaded and trialled the application): See here
Tree like Job dependency
Hold & Release
Rerun failed steps
I'm not affiliated with SOS GmbH
ProActive Scheduler is an open source job scheduler.
It is part of OW2 organization
It is written in Java so it comes with a Java and a REST API
It provides workflows, which are sets of tasks with dependencies and more (loop, replicate, branch). Upon failures, you can control whether the task should be cancelled or restarted
Parallelism and distribution is at the heart of it, with features like for instance
Commercial Support is provided by Activeeon, the company behind ProActive (full disclosure: I work for Activeeon).
You might be interested in DKron
Dkron is a system service that runs scheduled jobs at given intervals or times, just like the cron unix service but distributed in several machines in a cluster. If a machine fails (the leader), a follower will take over and keep running the scheduled jobs without human intervention. Dkron is Open Source and freely available.
http://dkron.io/
While not open source, a more cost effective job scheduler solution with REST API and support for the features listed is ActiveBatch workload automation and job scheduling. I do work for the company (being up-front) but our customers love how they can easily extend their automated processes to connect to any application, any service, any server with our REST API adapter. You can get more information here: https://www.advsyscon.com/en-us/activebatch/rest-api-adapter