Fetching Cloud Data Fusion runtime info - google-cloud-data-fusion

I want to pass the runid of a Data Fusion pipeline to some function upon pipeline completion, but I am not able to find any runtime variable that holds this value. Please help!

As an update to the previous answer, the first thing to do is to obtain the details of the pipelines deployed in a given namespace. For this, query the endpoint '/v3/namespaces/${NAMESPACE}/apps', where ${NAMESPACE} is the namespace in which the pipeline is deployed.
This endpoint returns a list of the pipelines deployed in the namespace ${NAMESPACE} (not the pipeline JSON, just a high-level description list). Once the pipeline list is obtained, to get the run details of a given pipeline, call the endpoint '/v3/namespaces/${NAMESPACE}/apps/${PIPELINE}/workflows/DataPipelineWorkflow/runs', where ${PIPELINE} is the name of the pipeline. This endpoint returns the details of all the runs of this pipeline, and this is where the run_id can be obtained; the field containing it is actually called runid in this list.
With the run_id you can then obtain, for example, all the logs of a run by querying the endpoint '{CDAP_ENDPOINT}/v3/namespaces/{NAMESPACE}/apps/{PIPELINE}/workflows/DataPipelineWorkflow/runs/{run["runid"]}/logs?start={run["start"]}&stop={run["end"]}'. The previous snippet is Python string formatting, where run is a dictionary containing the details of a particular run.
As explained in the CDAP microservices guide, to call these endpoints the CDAP endpoint must first be obtained by running the command: gcloud beta data-fusion instances describe --project=${PROJECT} --location=${REGION} --format="value(apiEndpoint)" ${INSTANCE_ID}. An authentication token is also needed, which can be obtained by running: gcloud auth print-access-token.
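As an illustration, here is a minimal Python sketch of those calls (the namespace and pipeline name are placeholders, and the requests library is used purely for convenience):
import requests

# Placeholders: fill these in with the values returned by the two gcloud commands above.
CDAP_ENDPOINT = "https://..."   # value of apiEndpoint from the gcloud describe command
AUTH_TOKEN = "..."              # output of gcloud auth print-access-token

NAMESPACE = "default"           # assumption: the pipeline is deployed in the default namespace
PIPELINE = "my_pipeline"        # hypothetical pipeline name

headers = {"Authorization": "Bearer " + AUTH_TOKEN}

# High-level list of the pipelines deployed in the namespace.
apps = requests.get(f"{CDAP_ENDPOINT}/v3/namespaces/{NAMESPACE}/apps", headers=headers).json()

# All runs of one pipeline; each entry carries its run id in the 'runid' field.
runs_url = f"{CDAP_ENDPOINT}/v3/namespaces/{NAMESPACE}/apps/{PIPELINE}/workflows/DataPipelineWorkflow/runs"
runs = requests.get(runs_url, headers=headers).json()

for run in runs:
    print(run["runid"], run.get("status"))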

The correct answer has been provided by @Edwin Elia in the comment section:
Retrieving the run-id of a Data Fusion pipeline from within its own run, or from the run of a predecessor pipeline, is not currently possible. There is an enhancement request that you can track which would make it possible.
As for retrieving the run_id value after pipeline completion, you should be able to use the REST API from the CDAP documentation to get information on the run, including the run-id.


Services section of Azure YAML pipeline

I'm looking at an example of a YAML pipeline with a services section. Here is a sample:
The YAML schema doesn't have services defined.
Where can I get information about the services section of the pipeline?
Update: Per Bowman's answer, the services section is part of the job definition. In this scenario there is only one job, so the job keyword is omitted.
In the simplest case, a pipeline has a single job. In that case, you do not have to explicitly use the job keyword unless you are using a template. You can directly specify the steps in your YAML file.
Here is the reference in the official documentation:
https://learn.microsoft.com/en-us/azure/devops/pipelines/yaml-schema/jobs-job?view=azure-pipelines
services: # Container resources to run as a service container.
I think you checked for it directly at the top level, right? In fact, in this situation there is a hidden default job, and the definition of that job is also hidden. The services section sits under the job definition of that hidden job, not at the top level.
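For example, here is a minimal sketch of the single-job shorthand (the container resource name and image are made up); with an explicit jobs: list, the same services mapping would sit under the job entry instead:
resources:
  containers:
  - container: my_db          # hypothetical container resource
    image: postgres:14        # hypothetical image
    env:
      POSTGRES_PASSWORD: example

pool:
  vmImage: ubuntu-latest

# These are job-level properties; they can be written at the top level only
# because the single job is implicit in this shorthand form.
services:
  db: my_db

steps:
- script: echo "the db service container runs alongside this job"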

Run Kubectl DevOps task with run-time specified resource details

We're building out a release pipeline in Azure DevOps which pushes to a Kubernetes cluster. The first step in the pipeline is to run an Azure CLI script which sets up all the resources - this is an idempotent script so we can run it each time we run the release pipeline. Our intention is to have a standardised release pipeline which we can run against several clusters, existing and new.
The final step in the pipeline is to run the Kubectl task with the apply command.
However, this pipeline task requires the names of the resource group and cluster to be specified in advance (at the time of building the pipeline). But the point of the idempotent script in the first step is to ensure that the resources exist, and to create them if they don't.
So there's the possibility that neither the resource group nor the cluster will exist before the pipeline is run.
How can I achieve this in a DevOps pipeline if the Kubectl task requires a resource group and a cluster to be specified at design time?
The Kubectl task works with the Azure Resource Manager service connection type, and it requires you to fill in the Resource group field and the Kubernetes cluster field after you select the Azure subscription.
After testing, we found that these two fields support variables. You can therefore use variables in these two fields, and use a PowerShell task to set the variable values before the Kubectl task runs. See Set variables in scripts for details.
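In a YAML pipeline the same idea looks roughly like this (the resource names, service connection name and script body are placeholders; in a classic release pipeline you would instead type $(resourceGroupName) and $(clusterName) directly into the two task fields):
steps:
# Placeholder script: in practice the idempotent Azure CLI step would
# produce (or discover) these names and publish them as variables.
- powershell: |
    Write-Host "##vso[task.setvariable variable=resourceGroupName]demo-rg"
    Write-Host "##vso[task.setvariable variable=clusterName]demo-aks"
  displayName: Resolve cluster details

- task: Kubernetes@1
  displayName: kubectl apply
  inputs:
    connectionType: Azure Resource Manager
    azureSubscriptionEndpoint: my-azure-service-connection   # placeholder
    azureResourceGroup: $(resourceGroupName)
    kubernetesCluster: $(clusterName)
    command: apply
    arguments: -f manifests/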

In Terraform, is there a way to refresh the state of a resource using TF files without using CLI commands?

I have a requirement to refresh the state of a resource "ibm_is_image" using TF files, without using CLI commands.
I know that we can import the state of a resource using "terraform import". But I need to do the same using IaC in TF files.
How can I achieve this?
Example:
In workspace1, I create a resource "f5_custom_image" which later gets deleted from the command line. In workspace2, the same code in the TF file assumes that "f5_custom_image" already exists, and it fails to read the custom image resource. So my code has to refresh the Terraform state of this resource on every execution of "terraform apply":
resource "ibm_is_image" "f5_custom_image" {
depends_on = ["data.ibm_is_images.custom_images"]
href = "${local.image_url}"
name = "${var.vnf_vpc_image_name}"
operating_system = "centos-7-amd64"
timeouts {
create = "30m"
delete = "10m"
}
}
In Terraform's model, an object is fully managed by a single Terraform configuration and nothing else. Having an object be managed by multiple configurations or having an object be created by Terraform but then deleted later outside of Terraform is not a supported workflow.
Terraform is intended for managing long-lived architecture that you will gradually update over time. It is not designed to manage build artifacts like machine images that tend to be created, used, and then destroyed.
The usual architecture for this sort of use-case is to consider the creation of the image as a "build" step, carried out using some other software outside of Terraform, and then we use Terraform only for the "deploy" step, at which point the long-lived infrastructure is either created or updated to use the new image.
That leads to a build and deploy pipeline with a series of steps like this:
1. Use separate image build software to construct the image, and record the id somewhere from which it can be retrieved using a data source in Terraform.
2. Run terraform apply to update the long-lived infrastructure to make use of the new image. The Terraform configuration should include a data block to read the image id from wherever it was recorded in the previous step (see the sketch after this list).
3. If desired, destroy the image using software outside of Terraform once Terraform has completed.
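A rough sketch of step 2 for the question's example, assuming the image name is what gets recorded by the build step (the rest of the configuration is omitted):
# Look up the image produced by the separate build step instead of
# managing it with a resource block in this configuration.
data "ibm_is_image" "f5_custom_image" {
  name = "${var.vnf_vpc_image_name}"
}

# Long-lived resources then refer to data.ibm_is_image.f5_custom_image.id
# wherever an image id is required.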
When implementing a pipeline like this, it's optional but common to also consider a "rollback" process to use in case the new image is faulty:
1. Reset the recorded image id that Terraform is reading from back to the id that was stored prior to the new build step.
2. Run terraform apply to update the long-lived infrastructure back to using the old image.
Of course, supporting that would require retaining the previous image long enough to prove that the new image is functioning correctly, so the normal build and deploy pipeline would need to retain at least one historical image per run to roll back to. With that said, if you have a means to quickly recreate a prior image during rollback, then this special workflow isn't strictly needed: you can instead implement rollback by "rolling forward" to an image constructed with the prior configuration.
An example software package commonly used to prepare images for use with Terraform on other cloud vendors is HashiCorp Packer, but sadly it looks like it does not have IBM Cloud support and so you may need to look for some similar software that does support IBM Cloud, or write something yourself using the IBM Cloud SDK.

Azure DevOps passing Dynamically assigned variables between build hosts

I'm using Azure DevOps on a vs2017-win2016 build agent to provision some infrastructure using Terraform.
What I want to know is whether it is possible to pass the Terraform output of a host's dynamically assigned IP address to a second job running on a different build agent.
I'm able to pass these to build variables in the first job:
BASTION_PRIV_IP=x.x.x.x
BASTION_PUB_IP=1.1.1.1
But I'm unable to get these variables to appear so they can be consumed by the second build agent, which runs ubuntu-16.04.
I am able to pass any statically defined parameters, like an Azure resource group name that I define before the job starts; it's just the dynamically assigned ones that don't come through.
This is pretty easily done when you are using the YAML based builds.
It's important to know that variables are only available within the scope of current job by default.
However you can set a variable as an output variable for your job.
This output variable can then be mapped to a variable within second job (do note that you need to set the first job as a dependency for the second job).
Please see the following link for an example of how to get this to work
https://learn.microsoft.com/en-us/azure/devops/pipelines/process/variables?view=azure-devops&tabs=yaml%2Cbatch#set-a-multi-job-output-variable
It may also be doable in the visual designer type of build, but I couldn't get that to work in the quick test I did; maybe you can get something to work inspired by the linked example.
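A minimal sketch of that pattern, with made-up job, step and variable names (how the Terraform output is read is also a placeholder):
jobs:
- job: Provision
  pool:
    vmImage: vs2017-win2016
  steps:
  # Publish the dynamically assigned IP as an output variable of this job.
  - powershell: |
      $ip = terraform output bastion_public_ip   # placeholder: however you read the value
      Write-Host "##vso[task.setvariable variable=BASTION_PUB_IP;isOutput=true]$ip"
    name: tfout

- job: Deploy
  dependsOn: Provision
  pool:
    vmImage: ubuntu-16.04
  variables:
    bastionPubIp: $[ dependencies.Provision.outputs['tfout.BASTION_PUB_IP'] ]
  steps:
  - script: echo "Bastion IP is $(bastionPubIp)"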

AWS ECS get placement constraint after task creation

I am trying to create a CI build step that will stop and re-run my tasks when my docker containers changed.
The definition itself would be pointing at the latest tag in ECR, and so all I need is to run stop-task and then run-task.
Two of the parameters in the API as well as the UI are PlacementConstraints and PlacementStrategy.
Is there any way to get these from the API AFTER the task has been started, e.g. get them for a running task? describe-tasks doesn't seem to return this information.