How do I make an Airflow DAG detect a failed check and then run a task with a trigger rule, using Postgres?
I have database inserts coming from Airbyte. When the data syncs, some table views go missing. I want to build an Airflow DAG that recreates the table views so they don't disappear. How do I write that DAG and run it with a trigger_rule? A rough sketch of what I'm after is below.
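Something like this minimal sketch, where a check task fails when the view is missing and a recreate task is gated by a trigger rule; the connection id, view name, and SQL are placeholders, not anything from a real setup:

# Hypothetical sketch: recreate a Postgres view whenever the check task fails.
# "postgres_default", my_view, and the SQL are placeholder values.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.postgres.hooks.postgres import PostgresHook

def check_view_exists():
    hook = PostgresHook(postgres_conn_id="postgres_default")
    row = hook.get_first(
        "SELECT 1 FROM information_schema.views WHERE table_name = 'my_view'"
    )
    if row is None:
        raise ValueError("my_view is missing")  # failing here fires the recreate task

def recreate_view():
    hook = PostgresHook(postgres_conn_id="postgres_default")
    hook.run("CREATE OR REPLACE VIEW my_view AS SELECT * FROM my_table")

with DAG(
    dag_id="recreate_missing_view",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@hourly",
    catchup=False,
) as dag:
    check = PythonOperator(task_id="check_view", python_callable=check_view_exists)
    recreate = PythonOperator(
        task_id="recreate_view",
        python_callable=recreate_view,
        trigger_rule="all_failed",  # only runs when the upstream check fails
    )
    check >> recreate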
I am running Airflow with a Postgres metadata database.
The web server became noticeably slow during operation.
The problem turned out to be data continually accumulating in the dag_run and log tables of the metadata database (it became fast again after I connected to Postgres and deleted the data directly).
Are there any Airflow options to clean the database periodically?
If there is no such option, we will try to delete the data directly from a DAG script.
I also find it strange that the web server slows down just because there is a lot of data. Does the web server fetch all of the data whenever a new page is opened?
You can purge old records by running:
airflow db clean [-h] --clean-before-timestamp CLEAN_BEFORE_TIMESTAMP [--dry-run] [--skip-archive] [-t TABLES] [-v] [-y]
(cli reference)
It is quite a common setup to include this command in a DAG that runs periodically.
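A minimal sketch of such a maintenance DAG, assuming a weekly schedule and a 30-day retention window (both are placeholders, not recommendations):

# Hypothetical maintenance DAG: purge metadata older than 30 days.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="airflow_db_cleanup",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@weekly",
    catchup=False,
) as dag:
    BashOperator(
        task_id="db_clean",
        bash_command=(
            "airflow db clean "
            "--clean-before-timestamp '{{ macros.ds_add(ds, -30) }}' "
            "--skip-archive --yes"
        ),
    )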
I have a problem replicating Postgres to ClickHouse. I can easily create a MaterializedPostgreSQL engine and it works fine. But we need synchronous commits to work, so that any write to our main Postgres database waits until the same write has been applied in ClickHouse before the transaction completes.
I tried adding the following settings to the PostgreSQL config, but this prevents ClickHouse from replicating:
synchronous_commit: "on"
synchronous_standby_names: "'*'"
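For reference, in a plain postgresql.conf the same settings are written without the YAML-style quoting shown above:

# postgresql.conf equivalents of the settings above
synchronous_commit = on
synchronous_standby_names = '*'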
I recently upgraded Composer/Airflow from 1.17/2.0 to 1.18/2.2 on GCP. After the upgrade I see the following warnings in the Airflow UI:
Airflow found incompatible data in the dag_run table in the metadatabase, and has moved them to _airflow_moved__2_2__dag_run during the database migration to upgrade. Please inspect the moved data to decide whether you need to keep them, and manually drop the _airflow_moved__2_2__dag_run table to dismiss this warning. Read more about it in Upgrading.
and
Airflow found incompatible data in the task_instance table in the metadatabase, and has moved them to _airflow_moved__2_2__task_instance during the database migration to upgrade. Please inspect the moved data to decide whether you need to keep them, and manually drop the _airflow_moved__2_2__task_instance table to dismiss this warning. Read more about it in Upgrading.
To remove the tables as instructed in the warning, I followed the GCP documentation on how to access the Airflow database. Once connected, I can see the tables listed, but the following attempt to remove them doesn't work, and no error message is produced:
DROP TABLE public._airflow_moved__2_2__dag_run
DROP TABLE public._airflow_moved__2_2__task_instance
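One possible explanation, assuming the statements were typed into psql exactly as shown: psql does not send a statement to the server until it sees a terminating semicolon, so both lines would sit silently in the input buffer, which would match the lack of any error message. The same statements with the terminators added:

DROP TABLE public._airflow_moved__2_2__dag_run;
DROP TABLE public._airflow_moved__2_2__task_instance;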
I am writing Airflow DAG integrity tests for the first time. I am running into an error where some of my operators/tasks reference Airflow Variables, e.g.:
test_var = Variable.get("AIRFLOW_VAR_TEST_VAR")
When I run an integrity test using pytest, I get the error below:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: variable
I can work around this by replacing the Variable.get call with a hard-coded value, but I am wondering if there is a better way to deal with this error?
Thanks,
You should run
AIRFLOW__CORE__UNIT_TEST_MODE=True airflow db reset
This will initialise and re-create the unit-test SQLite database from scratch.
Alternatively, you can run pytest with the Airflow-specific --with-db-init switch, which does the same.
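If you would rather avoid a database entirely in integrity tests, another option (a sketch, not part of the answer above) is to stub out Variable.get with a pytest fixture; the variable names and values here are hypothetical:

# conftest.py: stub Variable.get so DAG parsing never hits the metadata DB
import pytest
from airflow.models import Variable

FAKE_VARIABLES = {"AIRFLOW_VAR_TEST_VAR": "some-test-value"}  # made-up values

@pytest.fixture(autouse=True)
def stub_airflow_variables(monkeypatch):
    # Replace the classmethod with a plain function that reads the fake dict
    monkeypatch.setattr(
        Variable, "get", lambda key, *args, **kwargs: FAKE_VARIABLES[key]
    )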
I have Apache Airflow running on an Ubuntu 18.04.3 server. When I set it up, I used the generic SQLite database, which uses the SequentialExecutor. I did this just to play around and get used to the system. Now I'm trying to use the LocalExecutor, and will need to migrate my database from SQLite to the recommended PostgreSQL.
Does anybody know how to make this transition? All of the tutorials I've found entail setting up Airflow with PostgreSQL from the beginning. I know there are a ton of moving parts and I'm scared of messing up what I currently have running. Anybody who knows how to do this, or can point me at where to look, is much appreciated. Thanks!
Just to complete @lalligood's answer with some commands:
In the airflow.cfg file, look for sql_alchemy_conn and update it to point to your PostgreSQL server:
sql_alchemy_conn = postgresql+psycopg2://user:pass@host:port/database
For instance:
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost:5432/airflow
As the line above indicates, you need both a user and a database called airflow, so you need to create them. To do so, open your psql command line and type the following commands, which create a user and database called airflow and grant all privileges on the airflow database to the airflow user:
CREATE USER airflow;
CREATE DATABASE airflow;
GRANT ALL PRIVILEGES ON DATABASE airflow TO airflow;
Now you are ready to initialize the Airflow application using Postgres:
airflow initdb
If everything went right, access the psql command line again, connect to the airflow database with the \c airflow command, and type \dt to list all tables of that database. You should see a list of Airflow tables (currently 23).
Another option, rather than editing the airflow.cfg file, is to set the environment variable AIRFLOW__CORE__SQL_ALCHEMY_CONN to the PostgreSQL server you want.
Example: export AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@localhost:5432/airflow
Or you can set it in your Dockerfile.
See documentation here
I was able to get it working by doing the following four steps:
1. Assuming that you are starting from scratch, initialize your Airflow environment with the SQLite database. The key takeaway here is for it to generate the airflow.cfg file.
2. Update the sql_alchemy_conn line in airflow.cfg to point to your PostgreSQL server.
3. Create the airflow role + database in PostgreSQL. (Revoke all permissions on the airflow database from public, and ensure the airflow role owns the airflow database! See the sketch after this list.)
4. (Re)initialize Airflow (airflow initdb) and confirm that you see ~19 tables in the airflow database.
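A sketch of the psql statements for step 3, reusing the role/database names from earlier in this thread (the password is a placeholder):

CREATE ROLE airflow WITH LOGIN PASSWORD 'airflow';
CREATE DATABASE airflow OWNER airflow;
REVOKE ALL ON DATABASE airflow FROM PUBLIC;
GRANT ALL PRIVILEGES ON DATABASE airflow TO airflow;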