Airflow execution date macros returning the wrong value

I'm trying to get prev_execution_date using Airflow's macros, but only the current execution_date is returned, regardless of which macro I call.
This is the task code I'm using:
get_last_run = BashOperator(
    task_id='get_last_run',
    xcom_push=True,
    bash_command='echo "{{ execution_date }}, {{ prev_execution_date }}, {{ next_execution_date }}"',
    dag=dag,
)
And these are the returned values (exactly the same):
[2019-10-07 16:26:41,781] {{bash_operator.py:123}} INFO - 2019-10-07T16:23:44.864787+00:00, 2019-10-07T16:23:44.864787+00:00, 2019-10-07T16:23:44.864787+00:00
Does anyone know what's happening and why the macros aren't working properly?

Related

How to understand this Helm usage

I've seen the following Helm usage from time to time, but I don't understand how it works:
{{- include "common.tplvalues.render" ( dict "value" .Values.abc.service.annotations "context" $) | nindent 4 }}
First, where is the definition of common.tplvalues.render? I searched all over the internet, and it seems that this is related to Bitnami? Or is it a built-in function provided by Helm?
Second, I understand how to create a dictionary from reading this doc about template functions. However, why create a dictionary with a "value" key in this situation?
Third, the usage of the $ sign. The official doc about variables states that '$ - this variable will always point to the root context'. What is the 'root context'? Is this 'root context' related to the 'context' key of the dictionary?
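For what it's worth, common.tplvalues.render is not built into Helm; it is a named template shipped in Bitnami's common library chart (in _tplvalues.tpl). As a rough sketch of how such a helper is defined (based on the Bitnami chart; the exact body may differ between versions):

{{- define "common.tplvalues.render" -}}
    {{- if typeIs "string" .value }}
        {{- tpl .value .context }}
    {{- else }}
        {{- tpl (.value | toYaml) .context }}
    {{- end }}
{{- end -}}

In other words, dict builds the single map argument that include accepts: the "value" key carries the data to render, and the "context" key carries $, the root context, so that tpl can resolve references like .Values inside the rendered string.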

has function in the Helm template is not returning true

I have a values.yaml file containing a list variable like:
excludePort: [32104, 30119]
I am trying to use that in the helm template like:
{{ if has 32104 .Values.excludePort }}
But it does not seem to return true (the block after the condition is not executing). Any reason for it?
This is a problem caused by a variable type mismatch.
{{ kindOf (first .Values.excludePort) }}
output:
float64
You need to understand that when Helm renders templates, it first parses the values.yaml file into a map via Go, and numbers are deserialized to the float64 type by default; this is determined by the underlying implementation of the Go language.
See this: Go Decode, json package
So the elements in the excludePort array are of type float64, while the literal 32104 is of type int.
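To see that behavior outside Helm, here is a minimal Go sketch (my own illustration, not Helm code); Helm's values handling round-trips YAML through JSON, and encoding/json decodes untyped numbers as float64:

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Decode a JSON array into an untyped value, which is effectively
	// what happens when Helm loads values.yaml into map[string]interface{}.
	var v interface{}
	if err := json.Unmarshal([]byte(`[32104, 30119]`), &v); err != nil {
		panic(err)
	}
	for _, n := range v.([]interface{}) {
		fmt.Printf("%v is %T\n", n, n) // prints: 32104 is float64, 30119 is float64
	}
}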
To get the desired result, you need to write this:
{{- if has 32104.0 .Values.excludePort}}
Of course, this is not a good implementation, because floats introduce precision problems; it is best to use strings to solve it.
Like this:
values.yaml
excludePort: ["32104", "30119"]
template/xxx.yaml
{{- if has "32104" .Values.excludePort}}
...

Airflow parameters to Postgres Operator

I am trying to pass the execution date as a runtime parameter to the Postgres operator:
class MyPostgresOperator(PostgresOperator):
    template_fields = ('sql', 'parameters')

task = MyPostgresOperator(
    task_id='test_date',
    postgres_conn_id='redshift',
    sql="test_file.sql",
    parameters={'crunch_date': '{{ ds }}'},
    dag=dag,
)
Then I try to use this parameter in the SQL query to accept the value as passed by the DAG:
select
{{ crunch_date }} as test1,
The DAG sends the parameter correctly; however, the query just takes a null value instead of the execution date that is passed. Is there a way to have PostgreSQL with Redshift accept the correct value for this parameter?
You will have to update your SQL query as below:
select
{{ ds }} as test1,
You won't be able to use one templated field inside another. If you want to pass a param to the task and use it in a Jinja template, use the params parameter.
UPDATE:
But do note that params is not a templated field, and if you template it, it won't render, as nested templating won't work.
task = MyPostgresOperator(
    task_id='test_date',
    postgres_conn_id='redshift',
    sql="test_file.sql",
    params={'textstring': 'abc'},
    dag=dag,
)
where test_file.sql is:
select
{{ params.textstring }} as test1,
Check the 4th point in https://medium.com/datareply/airflow-lesser-known-tips-tricks-and-best-practises-cf4d4a90f8f to understand more about params.
You can use the Airflow macros inside the query string that is passed to Redshift.
Example:
PostgresOperator(
    task_id="run_on_redshift",
    dag=dag,
    postgres_conn_id=REDSHIFT_CONN_ID,
    sql="""
    UNLOAD ('select * from abc.xyz') TO 's3://path/{{ ds }}/' iam_role 's3_iam_role' DELIMITER AS '^' ALLOWOVERWRITE addquotes ESCAPE HEADER parallel off;
    """,
)
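As an aside, with the MyPostgresOperator subclass from the question (which adds parameters to template_fields), another option that may work is to let the database driver substitute the value instead of Jinja. PostgresOperator hands parameters to the driver (psycopg2), whose named-placeholder syntax is %(name)s, so test_file.sql would look like this (a sketch, assuming the Redshift connection goes through psycopg2):

select
    %(crunch_date)s as test1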

How to reference a DAG's execution date inside of a `KubernetesPodOperator`?

I am writing an Airflow DAG to pull data from an API and store it in a database I own. Following best practices outlined in We're All Using Airflow Wrong, I'm writing the DAG as a sequence of KubernetesPodOperators that run pretty simple Python functions as the entry point to the Docker image.
The problem I'm trying to solve is that this DAG should only pull data for the execution_date.
If I were using a PythonOperator (doc), I could use the provide_context argument to make the execution date available to the function. But judging from the KubernetesPodOperator's documentation, it seems that the Kubernetes operator has no argument that does what provide_context does.
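For contrast, a minimal sketch of that PythonOperator approach (Airflow 1.x style; pull_data is a placeholder callable of mine):

from airflow.operators.python_operator import PythonOperator

def pull_data(**context):
    # provide_context=True injects execution_date and the other template variables
    execution_date = context['execution_date']
    print(execution_date)

pull_task = PythonOperator(
    task_id='pull_data',
    python_callable=pull_data,
    provide_context=True,
    dag=dag,
)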
My best guess is that you could use the arguments parameter to pass in a date range, and since it's templated, you can reference it like this:
my_pod_operator = KubernetesPodOperator(
    # ... other args here
    arguments=['python', 'my_script.py', '{{ ds }}'],
    # arguments continue
)
And then you'd get the start date like you'd get any other argument provided to a Python file run as a script, by using sys.argv.
Is this the right way of doing it?
Thanks for the help.
Yes, that is the correct way of doing it.
Each operator has template_fields. All the parameters listed in template_fields can render Jinja2 templates and Airflow macros.
For the KubernetesPodOperator, if you check the docs, you will find:
template_fields = ['cmds', 'arguments', 'env_vars', 'config_file']
which means you can pass '{{ ds }}' to any of the four params listed above.
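And on the receiving end, a minimal sketch of what the entry-point script could look like (my_script.py here is the hypothetical script from the question):

# my_script.py
import sys

if __name__ == '__main__':
    ds = sys.argv[1]  # the rendered '{{ ds }}' value, e.g. '2019-10-07'
    print(f"Pulling data for {ds}")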

Using where() node to filter empty tags in Kapacitor

I'm using Kapacitor 1.3 and trying to use the following where node to keep measurements with an empty tag. Nothing is passing through, and I get the same result with == ''.
| where(lambda: 'process-cpu__process-name' =~ /^$/)
I can work around this issue by using a default value for missing tags and filtering on this default tag in the following nodes, but I am wondering if there is a better way to structure the initial where statement and avoid an extra node.
| default()
    .tag('process-cpu__process-name', 'system')
| where(lambda: "process-cpu__process-name" == 'system')
Sure it doesn't pass, because this:
'process-cpu__process-name'
is a string literal in TICKscript, not a reference to a tag or field, which would be:
"process-cpu__process-name"
So you obviously get a condition that is always false in this case.
It's quite a common mistake, though, especially for someone with previous experience in languages that tolerate both single and double quotes for plain strings. :-)
Also, there's a function available in TICKscript lambdas called strLength(); find the doc here.
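Putting that together, a minimal sketch of the corrected node, with double quotes so the tag is actually referenced (assuming the tag exists with an empty value):

| where(lambda: "process-cpu__process-name" == '')

or, equivalently, using strLength():

| where(lambda: strLength("process-cpu__process-name") == 0)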