Airflow - DAG integrity tests - sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: variable - pytest

I am currently writing some Airflow DAG integrity tests for the first time. I am running into an error where some of my operators/tasks reference Airflow variables, e.g.:
test_var = Variable.get("AIRFLOW_VAR_TEST_VAR")
When I run an integrity test using pytest, I get the error below:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: variable
I can work around this by replacing the Variable.get call with a hard-coded value, but I am wondering if there is a better way to deal with this error?
Thanks,

You should run:
AIRFLOW__CORE__UNIT_TEST_MODE=True airflow db reset
This will initialise and re-create the unit-test SQLite database from scratch.
Alternatively, you can run pytest with the Airflow-specific --with-db-init switch, which does the same.
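If the variable only needs some value for the test, another common approach is to avoid the metadata database entirely: Airflow resolves Variable.get(key) from an environment variable named AIRFLOW_VAR_<KEY> before it ever queries the variable table, so a pytest monkeypatch fixture can satisfy the lookup. A minimal sketch, assuming the key is the literal string from the question (the fixture and test names are illustrative):

import pytest
from airflow.models import DagBag

@pytest.fixture(autouse=True)
def airflow_test_variables(monkeypatch):
    # Variable.get(key) checks the environment variable AIRFLOW_VAR_<KEY>
    # before querying the metadata database. The question passes the key
    # "AIRFLOW_VAR_TEST_VAR", so the prefix ends up doubled here:
    monkeypatch.setenv("AIRFLOW_VAR_AIRFLOW_VAR_TEST_VAR", "dummy-value")

def test_dags_import_without_errors():
    # Parsing the DAG folder is enough to trigger the Variable.get calls.
    dag_bag = DagBag(include_examples=False)
    assert dag_bag.import_errors == {}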


How is airflow database managed periodically?

I am running Airflow using Postgres.
The web server was slow during operation. The problem was caused by data continuously accumulating in the dag_run and log tables of the metadata DB (it became faster after connecting to Postgres and deleting the data directly).
Are there any Airflow options to clean the DB periodically?
If there is no such option, we will try to delete the data directly using a DAG script.
I also find it strange that the web server slows down just because there is a lot of data. Does the web server fetch all the data whenever a new window is opened?
You can purge old records by running:
airflow db clean [-h] --clean-before-timestamp CLEAN_BEFORE_TIMESTAMP [--dry-run] [--skip-archive] [-t TABLES] [-v] [-y]
(see the airflow db clean CLI reference)
It is a quite common setup to include this command in a DAG that runs periodically.
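As an illustration, such a DAG could look like the sketch below (airflow db clean is available in Airflow 2.3+). The DAG id, weekly schedule, and 90-day retention window are arbitrary choices, and the date arithmetic assumes GNU date:

from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="airflow_db_cleanup",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@weekly",
    catchup=False,
) as dag:
    # Archive and delete metadata rows older than 90 days without prompting.
    db_clean = BashOperator(
        task_id="db_clean",
        bash_command=(
            "airflow db clean "
            "--clean-before-timestamp \"$(date -d '-90 days' '+%Y-%m-%d')\" -y"
        ),
    )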

Apache Airflow Init Db

I am trying to initialize a database for my project, which is based on Apache Airflow. I am not too sure what happened, but I changed the sql_alchemy_conn value in my airflow.cfg file to sql_alchemy_conn = postgresql+psycopg2:////Users/gabeuy/airflow/airflow.db. When I saved the changes and ran the command airflow db init, the error occurred and did not allow me to initialize the DB.
I tried looking up different ways to change it and ensured that I had Postgres and psycopg2 installed, but running the command still resulted in an error. I was expecting it to run so that I could access the Airflow localhost with the DAGs.
Your sql_alchemy_conn is pointing to a local file path (indicating a SQLite DB), but the protocol is indicating a PostgreSQL DB. The error is telling you it's missing a password, which is required by PostgreSQL.
For PostgreSQL, the expected URL format is:
postgresql+psycopg2://<user>:<password>@<host>/<db>
And for a SQLite DB, the expected URL format is:
sqlite:////<absolute/path/to/airflow.db> (four slashes for an absolute path)
A SQLite DB is convenient for testing purposes: it is stored as a single file on your computer, which makes it easy to set up (airflow db init will generate the file if it doesn't exist). A PostgreSQL DB takes a bit more work to set up, but is generally advised for a production scenario.
For more information about Airflow database configuration, see: https://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-database.html.
And for more information about airflow db CLI commands, see: https://airflow.apache.org/docs/apache-airflow/stable/cli-and-env-variables-ref.html#db.
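Applied to the question, a working sql_alchemy_conn would look something like one of the lines below. The Postgres user, password, host, and database name are placeholders; note also that in Airflow 2.3+ this option lives in the [database] section of airflow.cfg rather than [core]:

# PostgreSQL (placeholders for credentials, host, and database name)
sql_alchemy_conn = postgresql+psycopg2://airflow_user:airflow_pass@localhost:5432/airflow

# or, to keep the local SQLite file from the question instead
sql_alchemy_conn = sqlite:////Users/gabeuy/airflow/airflow.db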

using executable in Liquibase changesets

I am using the executeCommand tag in my Liquibase changesets, and this in turn is configured to run the SQL files in Oracle Instant Client SQL*Plus.
When I run a liquibase update on my changelog XML, everything works fine and the update is successful. I can also see the changes to the table.
But when I try to make the update fail by introducing a syntax error into the SQL file referenced in the changeset, Liquibase still reports the update as successful. I expected it to throw SQL errors; the same SQL, when run separately in Toad, throws a syntax error. What should I do to get the error surfaced?
Datical has created a custom Liquibase change tag that executes SQL using the sqlplus command-line client. It was much more complicated than you might think.
Some of the issues we had to deal with:
We had to ensure that the SQL files always had certain statements in place, and never had certain other statements. This included things like setting the schema, ensuring that the only spool commands were ones we knew about, that the script had an EXIT command, and that a non-zero exit code was returned whenever there was a SQL error.
The sqlplus executable does not return an exit code (i.e. a non-zero exit code from the native process) in all cases, and instead will write errors to an error table in the database. The table where sqlplus writes errors is called sperrorlog, and this may be what you will need to look into.
I can't really go into all the details, but just know that what you are attempting to do is neither simple nor straightforward.
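If you control the SQL files, the usual way to make a syntax error actually fail the sqlplus process (and therefore the executeCommand call) is the "exit code on SQL error" point above: put SQL*Plus error-handling directives at the top of each script. A minimal header, as a sketch:

-- Abort the script and exit sqlplus with the Oracle error code on any SQL error
WHENEVER SQLERROR EXIT SQL.SQLCODE
-- Likewise fail on OS-level errors
WHENEVER OSERROR EXIT FAILURE

-- ... the actual DDL/DML of the changeset goes here ...

EXIT

Liquibase only sees the process exit code, so without these directives sqlplus can report the error inside its output and still exit 0, which is why the update is reported as successful.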

Enterprise library semantic logging block. SQLDatabase sink. Out of process

I am using the Enterprise Library Semantic Logging Application Block (out of process) with the SQL Database sink to store all the messages. After putting everything in place and doing a test run, I am getting the following error: could not find stored procedure 'dbo.WriteTraces'.
Has anybody faced a similar issue? Please suggest.
The out-of-process semantic logging assembly comes with some PowerShell scripts and .sql files. We have to edit these scripts (to change the DB name) and run them. This will generate the stored procedures and the associated table for us.
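For example, the setup is roughly a one-off run of the bundled SQL against your logging database, something like the line below. The script file name here is a placeholder and the server/database names are assumptions; use the actual files shipped in the out-of-process service package:

sqlcmd -S localhost -d SemanticLogging -i CreateSemanticLoggingDatabaseObjects.sql

Running the bundled scripts creates the associated table and the dbo.WriteTraces stored procedure that the SQL Database sink expects.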
I encountered this same error, but it was because we were trying to use a schema other than dbo for our logging database. Once we changed it back to dbo, that resolved the problem. We were using the out-of-process SemanticLogging-svc.exe, which, from what I can tell, assumes that dbo is the schema name.

Setting up clean database for each test when using postgresql

I'm trying to find a way to create a set of tests that need a clean database before each test case runs.
An in-memory DB does not seem to be an option because the DDL we use fails to execute in H2.
The database is created with evolutions, so it would be handy if I could use evolutions to generate a clean database for each test and then drop the database after the test has run.
Marko
As an option, you can try the Flashback (or Point-in-Time Recovery) feature to restore the DB's initial state, or consider writing tests that do not depend on each other.
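A common PostgreSQL-specific alternative, sketched here with placeholder database names, is to run the evolutions once into a template database and then recreate the test database from that template before each test; CREATE DATABASE ... TEMPLATE is a fast file-level copy:

-- one-time setup: apply the evolutions to a template database
CREATE DATABASE myapp_template;
-- (run the evolutions against myapp_template here)

-- before each test: rebuild the test DB from the template
DROP DATABASE IF EXISTS myapp_test;
CREATE DATABASE myapp_test TEMPLATE myapp_template;

Note that PostgreSQL requires that no other sessions are connected to the template database while the copy is made.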