Airflow unpause dag during manual trigger - airflow-2.x

I am using Airflow 2.2.2 and have a DAG which is scheduled to run every 10 minutes and is paused. I am trying to invoke it manually using the Airflow client. The DAG is not getting unpaused and the dag run stays in the queued state. Is it possible to unpause the DAG using the Airflow client when creating the DAG run, without invoking an additional API call?
api_instance = dag_run_api.DAGRunApi(api_client)
dag_run = DAGRun(
    logical_date=datetime.now(timezone(timedelta())),
    conf=request_data,
)
api_response = api_instance.post_dag_run(
    "airflow_testn", dag_run
)
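For reference, a minimal sketch of how this snippet can be wrapped with the apache-airflow-client Configuration and ApiClient; the host, credentials, and conf payload below are placeholders:

from datetime import datetime, timedelta, timezone

import airflow_client.client
from airflow_client.client.api import dag_run_api
from airflow_client.client.model.dag_run import DAGRun

# Placeholder host and credentials; adjust to your deployment and auth backend.
configuration = airflow_client.client.Configuration(
    host="http://localhost:8080/api/v1",
    username="admin",
    password="admin",
)

with airflow_client.client.ApiClient(configuration) as api_client:
    api_instance = dag_run_api.DAGRunApi(api_client)
    dag_run = DAGRun(
        logical_date=datetime.now(timezone(timedelta())),  # timezone-aware, as in the snippet above
        conf={"key": "value"},  # stand-in for request_data
    )
    api_response = api_instance.post_dag_run("airflow_testn", dag_run)
    print(api_response)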

You can add the parameter is_paused_upon_creation=False to your DAG definition:
(Optional[bool]) -- Specifies if the dag is paused when created for the first time. If the dag exists already, this flag will be ignored. If this optional parameter is not specified, the global config setting will be used.
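For illustration, a minimal sketch of a DAG defined with this flag, assuming the 10-minute schedule from the question; the dag id, start date, catchup setting, and task are placeholders:

from datetime import datetime
from airflow import DAG
from airflow.operators.dummy import DummyOperator

# is_paused_upon_creation is a DAG argument and only takes effect the first
# time the DAG is registered; an existing DAG keeps its current paused state.
with DAG(
    dag_id="airflow_testn",
    schedule_interval="*/10 * * * *",
    start_date=datetime(2021, 1, 1),
    catchup=False,
    is_paused_upon_creation=False,
) as dag:
    DummyOperator(task_id="placeholder")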

Related

Airflow PostgresOperator Safety -- execution_timeout not respected? How to kill process in DB if it's taking too long?

Working on getting Airflow implemented at my company but need to perform some safety checks prior to connecting to our prod dbs.
There is concern about stupid SQL being deployed and eating up too many resources. My thought was that an execution_timeout setting on a PostgresOp task would:
Fail the task
Kill the query process in the db
I have found neither to be true.
Code:
with DAG(
    # Arguments applied to instantiate this DAG. Update your values here
    # All parameters visible in airflow.models.dag
    dag_id=DAG_ID,
    default_args=DEFAULT_ARGS,
    dagrun_timeout=timedelta(minutes=20),
    start_date=days_ago(1),
    schedule_interval=None,
    tags=['admin'],
    max_active_runs=1
) as dag:
    kill_test = PostgresOperator(
        task_id="kill_test",
        execution_timeout=timedelta(seconds=10),
        postgres_conn_id="usa_db",
        sql="""
            SET application_name to airflow_test;
            <SELECT... intentionally long running query> ;
        """)
Airflow does not fail the task after the timeout.
Even when I manually fail the task in the UI, it does not kill the query in the Postgres db.
What is the deal here? Is there any way to put in safety measures to hard kill an Airflow initiated Postgres query in the db?
I'm not posting it all here, but I have checked:
The Airflow UI shows the task instance duration way over the execution timeout
pg_stat_activity confirms the query is running way over the execution timeout
I guess you are looking for the parameter runtime_parameters={'statement_timeout': '180000ms'} (see the Airflow example).
I don't know in which version this was added, but if you update your apache-airflow-providers-postgres package to the latest version you can use the mentioned parameter.
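For illustration, a sketch of the task from the question with that parameter added; the statement_timeout value mirrors the answer above and the sql is a stand-in for the long-running query:

from datetime import timedelta
from airflow.providers.postgres.operators.postgres import PostgresOperator

kill_test = PostgresOperator(
    task_id="kill_test",
    postgres_conn_id="usa_db",
    execution_timeout=timedelta(seconds=10),
    # Server-side safety net: Postgres cancels any statement that runs longer
    # than 180 seconds, independently of Airflow's execution_timeout.
    runtime_parameters={"statement_timeout": "180000ms"},
    sql="SELECT pg_sleep(300);",  # placeholder for the long-running query
)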

Airflow how to stop DAGS from triggering when unpaused

As expected, when an Airflow DAG is unpaused it runs the most recently missed schedule. For example, if I have an hourly DAG and I paused it at 2:50pm today and then unpaused it at 3:44pm, it automatically triggers the DAG with a run time of 3:00pm. Is there a way I can prevent this automatic triggering when unpausing a DAG? I am currently on Airflow 2.2.3. Thanks!

Specifying DAG queue via Airflow's UI TriggerDag parameters

thanks for reading this question.
I managed to set up an Airflow cluster following the official instructions here and to add workers hosted on remote machines. Everything seems to work fine (connections to Redis and Postgres are working, and DAG tasks are distributed and executed properly across the different workers).
Also, I can execute DAGs at a specific worker by subscribing each worker to an exclusive queue and hardcoding the queue parameter of each operator in the DAG.
My problem is that I would like to parametrize said execution queue at trigger time, e.g. via the Airflow CLI or the Trigger DAG button. I tried using Jinja templates and XComs, but neither option solved my problem, since Jinja templates don't seem to be rendered for operator parameters like queue, and XCom needs the ti parameter or Jinja templates.
I know some people have written plugins in order to simplify this task, but since all the information I found was prior to Airflow 2.x I wanted to know if there was already a solution for this problem.
Thank you so much in advance
Edit: I would like to do this: Triggering DAG via UI.
However,
task_test = BashOperator(
    task_id='test2',
    queue="{{ dag_run.conf['queue'] }}",
    bash_command="echo hi",
)
does not work, since the job gets queued at the literal string {{ dag_run.conf['queue'] }} instead of queue1.
I also tried the following approach, and it doesn't work either, as all jobs get scheduled on the default queue:
with DAG(
    'queue_execution_test',
    default_args=default_args,
    description='Test specific execution.',
    schedule_interval=None,  # So that it only runs on demand
    start_date=days_ago(2),
) as dag:

    run_on_queue = 'default'  # '{{ dag_run.conf["queue"] if dag_run else "default" }}'

    def parse_queue_parameter(ti, **kwargs):  # Task instance for XCom and kwargs for dagrun parameters
        try:
            ti.xcom_push(key='custom_queue', value=kwargs['dag_run'].conf['queue'])
        except:
            print("No queue defined")

    initial_task = PythonOperator(
        task_id='test1',
        queue=run_on_queue,
        provide_context=True,
        python_callable=parse_queue_parameter,
    )

    task_test = BashOperator(
        task_id='test2',
        queue=run_on_queue,
        bash_command="echo hi",
    )

    initial_task.set_downstream(task_test)

How to safely restart Airflow and kill a long-running task?

I have Airflow running in Kubernetes using the CeleryExecutor. Airflow submits and monitors Spark jobs using the DatabricksOperator.
My streaming Spark jobs have a very long runtime (they run forever unless they fail or are cancelled). When Airflow worker pods are killed while a streaming job is running, the following happens:
Associated task becomes a zombie (running state, but no process with heartbeat)
Task is marked as failed when Airflow reaps zombies
Spark streaming job continues to run
How can I force the worker to kill my Spark job before it shuts down?
I've tried killing the Celery worker with a TERM signal, but apparently that causes Celery to stop accepting new tasks and wait for current tasks to finish (docs).
You need to be clearer about the issue. If you are saying that the Spark cluster finishes the job as expected and on_kill is not called, that is expected behavior. As per the docs, the on_kill function is for cleaning up after the task gets killed:
def on_kill(self) -> None:
    """
    Override this method to cleanup subprocesses when a task instance
    gets killed. Any use of the threading, subprocess or multiprocessing
    module within an operator needs to be cleaned up or it will leave
    ghost processes behind.
    """
In your case, when you manually kill the job, it is doing what it has to do.
Now, if you want a cleanup to happen even after successful completion of the job, override the post_execute function.
As per the docs, post_execute is:
def post_execute(self, context: Any, result: Any = None):
    """
    This hook is triggered right after self.execute() is called.
    It is passed the execution context and any results returned by the
    operator.
    """

Apache Airflow through HTTP

I've been running into an issue where I can successfully trigger a DAG from Airflow's REST API commands (https://airflow.apache.org/api.html); however, the DAG run instances do not run. I'm calling POST /api/experimental/dags/dag_id/dag_runs, where dag_id is the DAG I'm running. The only thing that happens is that the DAG run immediately returns success. I triggered the DAG manually and I get running DAG instances (see the picture, 2nd DAG run). Note that the 2nd DAG run fails -- this should not affect the issue I am trying to fix.
[screenshot: DAG runs]
Fixed the issue -- I had to deal with the scheduler. I added 'depends_on_past': False and 'start_date': datetime(2019, 6, 1) and it got fixed.
The DAG runs created outside the scheduler still must occur after the start_date; if there are no existing runs already, you might want to set the schedule to @once and the start_date to a past date for which you want the execution_date to run. This will give you a successful run (once it completes) against which other manual runs can compare themselves for depends_on_past.
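For illustration, a sketch of that fix on an Airflow 1.x-style DAG (the experimental API dates this question to that era); the dag id, dates, and task are placeholders for the DAG being triggered over the REST API:

from datetime import datetime
from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

# depends_on_past is off and start_date lies in the past, so the @once
# schedule produces one run whose success later manual runs can build on.
default_args = {
    "depends_on_past": False,
    "start_date": datetime(2019, 6, 1),
}

dag = DAG(
    dag_id="dag_id",  # placeholder for the DAG triggered via the API
    default_args=default_args,
    schedule_interval="@once",
)

task = DummyOperator(task_id="placeholder", dag=dag)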