I'm trying to parametrize a pipeline in Azure Data Factory in order to enable certain functionality across multiple environments. The idea is that the current environment is always available through a global parameter. I'd like to use this parameter to look up an array of environments to process data to. Example:
targetEnvs = [{ "dev": ["dev"], "test": ["dev", "test"], "acc": [], "prod": ["acc", "prod"] }]
Then one should be able to select the targetEnv array with something like targetEnvs[environment] or targetEnvs.environment. Subsequently a ForEach is used to execute some logic on these target environments.
I tried setting this up with targetEnvs as a pipeline parameter (with a default value mapping each environment directly to a target environment, as follows: {"dev": ["dev"], "test": ["test"]}). Then I have a Set Variable step that takes its value from the targetEnvs parameter.
I'm now looking for a way to use the current environment (stored in a global parameter) instead of hardcoding "dev" in the Set Variable expression, but I'm not sure how to do this.
The expressions I tried won't even start the pipeline.
Question: how do I select this attribute of the object? Any other suggestions on how to tackle this problem are welcome as well!
(Python analogy would be to have a dictionary target_envs and taking a value from it by using the key "current_env": target_envs[current_env].)
When I tried to access the object the same way you did, I got the same error. I took the parameter targetEnv (the given array) and a global parameter environment with the value dev.
You can use the following dynamic content to access the key value.
@pipeline().parameters.targetEnv[0][pipeline().globalParameters.environment]
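As a Python analogy of what that expression evaluates (a minimal sketch using the values from the question; the "dev" assignment stands in for the global parameter):

# The parameter is an array wrapping a single object, hence the [0] index before the key lookup.
target_env = [{"dev": ["dev"], "test": ["dev", "test"], "acc": [], "prod": ["acc", "prod"]}]
environment = "dev"  # stand-in for pipeline().globalParameters.environment
print(target_env[0][environment])  # ['dev']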
I have a multi-step Azure pipeline used to trigger the execution of a certain job based on keywords I have in Azure DevOps work items.
The first step executed is a PowerShell script that stores a comma-separated list of strings into a 'validTags' variable:
Write-Host "##vso[task.setvariable variable=validTags]$csTags"
After this step, I correctly see the list formatted as I expect:
string1,string2,string3
The 'validTags' variable is then passed as a parameter to another pipeline in which I should split this list and trigger separate jobs:
- template: run.yml
  parameters:
    tags: $(validTags)
    directory: 'path\to\tests'
    platforms: 'platform1,platform2'
In the 'run' pipeline I defined this 'tags' parameter:
parameters:
  - name: tags
    type: string
    default: 'someDefaultValue'
and I try to split the parameter:
- ${{ each t in split(parameters.tags, ',') }}:
  - script: |
      echo 'starting job for ${{ t }}'
but when I execute the pipeline, 't' still contains the full string (string1,string2,string3), not the split values.
I have noticed that if I perform the split on the "platforms" parameter, which is passed along with "tags" to the run.yml pipeline, it works. So it seems the problem is related to the fact that I am trying to split a string stored in an external variable?
Anyone with a similar issue? Any help on this is much appreciated.
Thanks
For those interested in the outcome of this issue:
I tested several possible alternate solutions, including the use of global variables and group variables, but without success.
I submitted a request to MSFT engineering support to get some insight on this and their response is:
The pipeline does not support splitting the runtime variable with template syntax ${{ }} currently, and we are not able to find other workarounds to meet your request. Sorry for the inconvenience. Hope you can understand.
So, to overcome the issue, I dropped the split at the pipeline level that I had initially planned, and instead passed the comma-separated string to the template and added the necessary processing there in PowerShell.
Another option would have been to perform all the operations from within the first PowerShell script step:
transform the 'run.yml' template into a separate pipeline
in the script, after getting the tags, loop over their values and trigger the 'run.yml' pipeline, passing the single tag as a parameter.
I avoided this solution to keep the operations separate and have more control over the execution flow.
I have many files in a blob container. However, I want to run a stored procedure only IF a certain file (e.g. SRManifest.csv) exists in the blob container. I used Get Metadata and an If Condition in Data Factory. Can you please help me with the dynamic expression for this? I tried this:
@bool(startswith(activity('Get Metadata1').output.childitems.ItemName, 'SRManifest.csv'))
It doesn't work.
Then I thought, what if I used @greaterOrEquals(activity('Get Metadata1').output.LastModified, adddays(utcnow(), -2))? But this checks the blob's last modified time against the past 2 days, not whether the file exists. Thank you.
Please see my diagram below.
I think I have understood your requirement differently.
I wanted to run a Stored procedure only IF a certain file (e.g. SRManifest.csv) exists on the blob Container
1. Change your Get Metadata activity to check for the existence of the sentinel file (SRManifest.csv).
2. Follow with an If activity, using the exists flag returned by Get Metadata as its condition, e.g. @activity('Get Metadata1').output.exists.
3. Put your stored procedure in the True branch of the If activity.
If you also need the file list passed to the stored procedure, then you'll need a Get Metadata activity with the childItems option inside the If's True branch.
Based on your diagram, since you are looping over all the blob names already, you can add a Boolean variable to the pipeline and set its default value to false.
Inside the ForEach activity, you only want to attempt to set the variable if its value is still false, and if the blob name is found, set it to true. Since Set Variable cannot be self-referential, do this inside the False branch of an If activity.
This will only attempt to process if the value is false (so the file name has not been found yet), and will do nothing if the value is true. Now set the variable based on your file name.
[NOTE: This value can be hard coded, parameterized, or based on a variable]
When you execute the pipeline, you'll see the Set Variable activity stops attempting once the value is set to true.
In the main pipeline, after the ForEach activity has completed, you can use the variable to set the condition of your final If activity. If the blob is never found, it will still be false, so put the Stored Procedure activity inside the True branch.
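As a plain Python analogy of that control flow (a minimal sketch; the blob names other than SRManifest.csv are made up):

# Stand-in for the Get Metadata childItems list the ForEach iterates over.
child_items = ["data_1.csv", "SRManifest.csv", "data_2.csv"]
found = False  # the pipeline's Boolean variable, default false

for item_name in child_items:                         # the ForEach activity
    if not found and item_name == "SRManifest.csv":   # the inner If plus Set Variable
        found = True

# After the loop, 'found' drives the final If activity; the stored procedure sits in its True branch.
print(found)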
I have a concat expression defined in the Function Name setting of an Azure Function activity in my pipeline, where it concatenates the API query with the current filename that I want to run through this function. When I debug the pipeline, it fails without giving me any feedback; it just says "AzureFunction failed:".
If I manually insert the string, it works fine.
the concat expression is:
@concat('HttpTrigger?filename=', variables('filename'))
I'm new to Azure, any way I can debug this?
Try this way:
@concat(variables('FirstName'), variables('LastName'))
You could use a Set Variable activity together with your Azure Function activity.
In the Set Variable activity, set the value of the variable.
Then refer to the variable in the Azure Function activity:
@concat('HttpTriggerJS1?name=', variables('name'))
I'm doing some experimentation with Kubeflow Pipelines and I'm interested in retrieving the run id to save along with some metadata about the pipeline execution. Is there any way I can do so from a component like a ContainerOp?
You can use kfp.dsl.EXECUTION_ID_PLACEHOLDER and kfp.dsl.RUN_ID_PLACEHOLDER as arguments for your component. At runtime they will be replaced with the actual values.
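A minimal sketch of how that might look (KFP v1 SDK; the component and pipeline names here are illustrative, not from the question):

from kfp import dsl
from kfp.components import func_to_container_op

@func_to_container_op
def log_ids(run_id: str, execution_id: str):
    # By the time the component runs, both placeholders have been substituted with real values.
    print(f"run_id={run_id} execution_id={execution_id}")

@dsl.pipeline(name='run-id-demo')
def run_id_demo():
    log_ids(dsl.RUN_ID_PLACEHOLDER, dsl.EXECUTION_ID_PLACEHOLDER)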
I tried to do this using the Python DSL, but it seems that isn't possible right now.
The only option that I found is to use the method that they used in this sample code. You basically declare a string containing {{workflow.uid}}. It will be replaced with the actual value during execution time.
You can also do this in order to get the pod name, it would be {{pod.name}}.
Since Kubeflow Pipelines relies on Argo, you can use Argo variables to get what you want.
For example,
from kfp import dsl
from kfp.components import func_to_container_op

@func_to_container_op
def dummy(run_id, run_name) -> str:
    # Both values arrive already substituted by Argo at runtime.
    return f'{run_id} {run_name}'

@dsl.pipeline(
    name='test_pipeline',
)
def test_pipeline():
    dummy('{{workflow.labels.pipeline/runid}}', '{{workflow.annotations.pipelines.kubeflow.org/run_name}}')
You will find that the placeholders will be replaced with the correct run_id and run_name.
For more argo variables: https://github.com/argoproj/argo-workflows/blob/master/docs/variables.md
To know what is recorded in the labels and annotations of a Kubeflow pipeline run, just get the corresponding workflow from k8s:
kubectl get workflow/XXX -oyaml
If you start the run from the SDK, create_run_from_pipeline_func returns a RunPipelineResult, which has a run_id attribute:
import kfp

client = kfp.Client(host)
result = client.create_run_from_pipeline_func(…)
result.run_id
Your component's container should have an environment variable called HOSTNAME that is set to its unique pod name, from which you derive all necessary metadata.
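A minimal sketch of reading it, assuming the component is Python-based (HOSTNAME being set to the pod name is standard Kubernetes behaviour; everything else here is illustrative):

import os

# Kubernetes sets HOSTNAME to the pod name inside the container.
pod_name = os.environ.get("HOSTNAME")
print(f"Running in pod: {pod_name}")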
I have a Rundeck job called "TEST".
I have an option called country
This option retrieves a list of name/value pairs from a remote URL, as:
[
  {"name": "FRANCE", "value": "FR"},
  {"name": "ITALY", "value": "IT"},
  {"name": "ALGERIA", "value": "DZ"}
]
I would like to use both the name and the value in a job step.
echo ${option.country.name}
echo ${option.country.value}
But this doesn't work, and I'm not able to get the name of the option.
Getting the value can be done using ${option.country}.
Is there any trick to get the option name?
Just for the record: maybe the best approach is to create a script step that reads the JSON and extracts the name. Alternatively, you can use the same string for both name and value, as in this example (of course that is not applicable to all cases).
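A sketch of what such a script step might look like in Python (assumptions: Python is available on the node, the URL below is a hypothetical stand-in for the real options endpoint, and the option is exposed to the step as the standard RD_OPTION_COUNTRY environment variable):

import json
import os
import urllib.request

OPTIONS_URL = "https://example.com/countries.json"  # hypothetical; use the job option's remote URL

# Rundeck exposes job options to script steps as RD_OPTION_<NAME> environment variables.
selected_value = os.environ["RD_OPTION_COUNTRY"]

with urllib.request.urlopen(OPTIONS_URL) as resp:
    countries = json.load(resp)

# Recover the display name that corresponds to the selected value.
name = next((c["name"] for c in countries if c["value"] == selected_value), None)
print(f"value={selected_value} name={name}")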