Airflow initdb: slot_pool does not exist - PostgreSQL

I'm facing an issue with Airflow initialization on a Postgres backend.
Ubuntu : 18.04.1
Airflow : v1.10.6
Postgres : 10.10
Python 3.6
And when I run
airflow initdb
I get
[2019-11-22 10:17:23,564] {db.py:368} INFO - Creating tables
INFO [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO [alembic.runtime.migration] Will assume transactional DDL.
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
cursor, statement, parameters, context
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/default.py", line 581, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.UndefinedTable: relation "airflow.slot_pool" does not exist
LINE 2: FROM airflow.slot_pool
^
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 37, in <module>
args.func(args)
File "/usr/local/lib/python3.6/dist-packages/airflow/bin/cli.py", line 1131, in initdb
db.initdb(settings.RBAC)
File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 106, in initdb
upgradedb()
File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 377, in upgradedb
add_default_pool_if_not_exists()
File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 74, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 90, in add_default_pool_if_not_exists
if not Pool.get_pool(Pool.DEFAULT_POOL_NAME, session=session):
File "/usr/local/lib/python3.6/dist-packages/airflow/utils/db.py", line 70, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/airflow/models/pool.py", line 44, in get_pool
return session.query(Pool).filter(Pool.pool == pool_name).first()
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/query.py", line 3265, in first
ret = list(self[0:1])
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/query.py", line 3043, in __getitem__
return list(res)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/query.py", line 3367, in __iter__
return self._execute_and_instances(context)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/orm/query.py", line 3392, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 982, in execute
return meth(self, multiparams, params)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1101, in _execute_clauseelement
distilled_params,
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1250, in _execute_context
e, statement, parameters, cursor, context
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/util/compat.py", line 152, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
cursor, statement, parameters, context
File "/usr/local/lib/python3.6/dist-packages/sqlalchemy/engine/default.py", line 581, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation "airflow.slot_pool" does not exist
LINE 2: FROM airflow.slot_pool
^
[SQL: SELECT airflow.slot_pool.id AS airflow_slot_pool_id, airflow.slot_pool.pool AS airflow_slot_pool_pool, airflow.slot_pool.slots AS airflow_slot_pool_slots, airflow.slot_pool.description AS airflow_slot_pool_description
FROM airflow.slot_pool
WHERE airflow.slot_pool.pool = %(pool_1)s
LIMIT %(param_1)s]
[parameters: {'pool_1': 'default_pool', 'param_1': 1}]
(Background on this error at: http://sqlalche.me/e/f405)
I've tried deleting and recreating the database and user rights (with search_path set as described in the docs). My Postgres instance is accessible and correctly configured, as tables had previously been created by Airflow before the crash ;)
Any ideas?
I've also tried Airflow 1.10.2 and it works smoothly with the Postgres backend.

This might be because of the example DAGs. Try setting load_examples = False in airflow.cfg
and run airflow upgradedb or airflow resetdb.
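A sketch of the two steps, assuming the stock airflow.cfg layout:
# airflow.cfg, [core] section
load_examples = False
then
airflow upgradedb    # or airflow resetdb, if you can afford to rebuild the metadata DB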

Have you tried
airflow resetdb
airflow initdb: Initialize the metadata database
airflow resetdb: Burn down and rebuild the metadata database
airflow upgradedb: Apply missing migrations - this is also idempotent and safe, but it tracks migrations, so if your tables aren't in the state that Alembic thinks they are in, you would need to find that "state" and edit it.
Most importantly, if you have any connections or variables set, they will be deleted if you run resetdb.
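If you do run resetdb, it may be worth exporting what you can first; a rough sketch with the Airflow 1.10 CLI (the file name is a placeholder):
airflow variables -e variables_backup.json   # export variables to a JSON file
airflow connections --list                   # note down connections; re-add them afterwards
airflow resetdb
airflow variables -i variables_backup.json   # re-import variables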

Airflow uses SQLite as its default backend and creates its default tables (like slot_pool) there. So if you want to use another backend like Postgres, you need to create these default tables there as well, by running:
$ airflow initdb
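Before running it, make sure sql_alchemy_conn in airflow.cfg points at your Postgres database rather than the default SQLite file; a sketch with placeholder credentials:
sql_alchemy_conn = postgresql+psycopg2://airflow:airflow_pass@localhost:5432/airflow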

I know it's kind of late, but I experienced the same thing trying $ airflow db init against a MySQL 8 backend, and got a similar error.
It turned out that there was a mismatch in my configuration:
the database was named airflow_db, so I had
sql_alchemy_conn = mysql+mysqlconnector://airflow:******@localhost:3306/airflow_db, but a few lines down in airflow.cfg, I also had the line
sql_alchemy_schema = airflow
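One way to make the two settings consistent, sketched here with placeholder values: either remove the sql_alchemy_schema line entirely, or make it match the database that actually exists, then re-run airflow db init:
sql_alchemy_conn = mysql+mysqlconnector://airflow:******@localhost:3306/airflow_db
sql_alchemy_schema = airflow_db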

Related

Airflow "Something Bad Has Happened" Error: Session Table does not exist

I'm trying to create a simple test DAG to write a test query against an AWS EC2 Postgres instance behind a bastion host.
After adding this script in Airflow with touch pg_test.py and nano pg_test.py
from airflow import DAG
from airflow.operators.postgres_operator import PostgresOperator
from datetime import datetime

Query = """DROP TABLE IF EXISTS dataset.test;
create TABLE dataset.test as (select * from table
);"""

dag = DAG(
    'postgres_test_dag',
    schedule_interval='0 * * * *',
    start_date=datetime(2021, 3, 20),
    catchup=False,
)

create_table = PostgresOperator(
    task_id='create_table',
    dag=dag,
    postgres_conn_id='postgres_dwh',
    sql=Query,
)

create_table
and installing the airflow postgres provider with
`pip install apache-airflow-providers-postgres`
I'm getting the error below and I do not know how to solve it:
Something bad has happened.
Airflow is used by many users, and it is very likely that others had similar problems and you can easily find
a solution to your problem.
Consider following these steps:
* gather the relevant information (detailed logs with errors, reproduction steps, details of your deployment)
* find similar issues using:
* GitHub Discussions
* GitHub Issues
* Stack Overflow
* the usual search engine you use on a daily basis
* if you run Airflow on a Managed Service, consider opening an issue using the service support channels
* if you tried and have difficulty with diagnosing and fixing the problem yourself, consider creating a bug report.
Make sure however, to include all relevant details and results of your investigation so far.
Python version: 3.8.10
Airflow version: 2.2.4
Node: ip-XXXXXXXXX.ec2.internal
-------------------------------------------------------------------------------
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
self.dialect.do_execute(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
psycopg2.errors.UndefinedTable: relation "session" does not exist
LINE 2: FROM session
^
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1953, in full_dispatch_request
return self.finalize_request(rv)
File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 1970, in finalize_request
response = self.process_response(response)
File "/home/airflow/.local/lib/python3.8/site-packages/flask/app.py", line 2269, in process_response
self.session_interface.save_session(self, ctx.session, response)
File "/home/airflow/.local/lib/python3.8/site-packages/airflow/www/session.py", line 32, in save_session
return super().save_session(*args, **kwargs)
File "/usr/local/lib/python3.8/dist-packages/flask_session/sessions.py", line 553, in save_session
saved_session = self.sql_session_model.query.filter_by(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3429, in first
ret = list(self[0:1])
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3203, in __getitem__
return list(res)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
return self._execute_and_instances(context)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
ret = self._execute_context(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
self._handle_dbapi_exception(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
util.raise_(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
self.dialect.do_execute(
File "/home/airflow/.local/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedTable) relation "session" does not exist
LINE 2: FROM session
^
[SQL: SELECT session.id AS session_id_1, session.session_id AS session_session_id, session.data AS session_data, session.expiry AS session_expiry
FROM session
WHERE session.session_id = %(session_id_1)s
LIMIT %(param_1)s]
[parameters: {'session_id_1': '3c0eb88d-9042-4951-94a1-c6d127d02450', 'param_1': 1}]
(Background on this error at: http://sqlalche.me/e/13/f405)
I already tried rebooting the Airflow instance and reconnecting to the terminal, but it did not work. I suspect that it has something to do with the connection to Postgres (AWS EC2) and the internal Airflow Postgres database.
Do you have any suggestions? And maybe also a precise explanation of this issue? It would be much appreciated.
The problem is that Airflow is looking for a session table in the metadata database but isn't finding one. I can see in the migrations that this table was added fairly recently; you can run airflow db upgrade to apply the latest migrations, and that should fix your problem.
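A minimal sequence, assuming the Airflow 2.x CLI and that you run it with the same configuration (AIRFLOW_HOME / sql_alchemy_conn) the webserver uses:
airflow db check     # confirm the metadata database is reachable
airflow db upgrade   # apply any missing migrations, including the one creating the session table
Then restart the webserver and retry.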

Airflow scheduler failure

I have followed this tutorial in an attempt to build an Airflow cluster on localhost with my own DAGs. When I ran airflow scheduler after setting executor = CeleryExecutor in the config file, I received the following traceback:
Traceback (most recent call last):
File "/home/yurii/Tools/anaconda3/bin/airflow", line 28, in
args.func(args)
File"/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/airflow/bin/cli.py", line 839, in scheduler job.run()
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/airflow/jobs.py", line 200, in run
self._execute()
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/airflow/jobs.py", line 1309, in _execute
self._execute_helper(processor_manager)
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/airflow/jobs.py", line 1441, in _execute_helper
self.executor.heartbeat()
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/airflow/executors/base_executor.py", line 124, in heartbeat
self.execute_async(key, command=command, queue=queue)
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/airflow/executors/celery_executor.py", line 80, in execute_async
args=[command], queue=queue)
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/celery/app/task.py", line 573, in apply_async
**dict(self._get_exec_options(), **options)
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/celery/app/base.py", line 354, in send_task
reply_to=reply_to or self.oid, **options
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/celery/app/amqp.py", line 310, in publish_task
**kwargs
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/kombu/messaging.py", line 172, in publish
routing_key, mandatory, immediate, exchange, declare)
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/kombu/connection.py", line 449, in _ensured
return fun(*args, **kwargs)
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/kombu/messaging.py", line 188, in _publish
mandatory=mandatory, immediate=immediate,
File "/home/yurii/Tools/anaconda3/lib/python3.6/site-packages/librabbitmq/init.py", line 122, in basic_publish
mandatory or False, immediate or False,
TypeError: an integer is required (got type NoneType)
Some additional information:
I am using Airflow 1.8.0 along with Celery 3.1.25 and RabbitMQ 3.5.7 as a broker and backend, but also tried Airflow 1.9.0 with Celery 4.2.
Airflow with the SequentialExecutor works without any problems.
airflow test "dag_name" "task_name" "exec_date" runs successfully.
I am new to Airflow/Celery/RabbitMQ/SQL, so any help would be appreciated!
To add to the previous answer: using py-amqp involves either changing broker_url = amqp://XXXXX to broker_url = pyamqp://XXXXX, OR running
pip uninstall librabbitmq.
Additionally, you may need to rename the celery_result_backend variable to result_backend in your airflow.cfg; the celery_ prefix has been removed from the variables in the [celery] section of airflow.cfg in recent versions.
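A rough sketch of what the relevant [celery] settings might look like after the rename (the broker and result-backend URLs are placeholders for your own RabbitMQ and metadata database):
[celery]
broker_url = pyamqp://guest:guest@localhost:5672//
result_backend = db+postgresql://airflow:airflow@localhost:5432/airflow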
It seems you are using librabbitmq as the AMQP client library, which is not recommended by the Celery core team. Use py-amqp as the RabbitMQ transport and you should get rid of this error.

Librabbitmq 2.0.0 with Python 3 gives TypeError: can't pickle memoryview objects

I am using the latest master branch of the git repo https://github.com/celery/librabbitmq and installing librabbitmq==2.0.0 for Python 3.6 by following the instructions in the readme
Using the development version
You can clone the repository by doing the following:
$ git clone git://github.com/celery/librabbitmq.git
Then install it by doing the following:
$ cd librabbitmq
$ make install # or make develop
This works fine (after installing certain binaries needed for C compilation in the OS), but when I then make a small a+b add task and call it with add.delay(2,2), it fails with the following error. I looked it up and saw that Celery 4 uses JSON as the serializer, so it is clearly not because of pickle serialization.
Changing from librabbitmq to the pyamqp broker works normally.
Same exact situation on both macOS and Ubuntu 16.
[2018-04-30 23:40:02,956: CRITICAL/MainProcess] Unrecoverable error: SystemError(' returned a result with an error set',)
Traceback (most recent call last):
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/kombu/messaging.py", line 624, in _receive_callback
return on_m(message) if on_m else self.receive(decoded, message)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 570, in on_task_received
callbacks,
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/strategy.py", line 145, in task_message_handler
handle(req)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/worker.py", line 221, in _process_task_sem
return self._quick_acquire(self._process_task, req)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/kombu/async/semaphore.py", line 62, in acquire
callback(*partial_args, **partial_kwargs)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/worker.py", line 226, in _process_task
req.execute_using_pool(self.pool)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/request.py", line 531, in execute_using_pool
correlation_id=task_id,
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/concurrency/base.py", line 155, in apply_async
**options)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/billiard/pool.py", line 1486, in apply_async
self._quick_put((TASK, (result._job, None, func, args, kwds)))
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/concurrency/asynpool.py", line 813, in send_job
body = dumps(tup, protocol=protocol)
TypeError: can't pickle memoryview objects
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/worker.py", line 203, in start
self.blueprint.start(self)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/bootsteps.py", line 370, in start
return self.obj.start()
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 320, in start
blueprint.start(self)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/consumer/consumer.py", line 596, in start
c.loop(*c.loop_args())
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/celery/worker/loops.py", line 88, in asynloop
next(loop)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/kombu/async/hub.py", line 354, in create_loop
cb(*cbargs)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/kombu/transport/base.py", line 236, in on_readable
reader(loop)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/kombu/transport/base.py", line 218, in _read
drain_events(timeout=0)
File "/Users/somghosh/.virtualenvs/ctdb/lib/python3.6/site-packages/librabbitmq-2.0.0-py3.6-macosx-10.6-intel.egg/librabbitmq/__init__.py", line 227, in drain_events
self._basic_recv(timeout)
SystemError: returned a result with an error set
This library is not recommended for use as the RabbitMQ transport with Celery. Instead, please try py-amqp; it is better maintained and less buggy.
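For reference, a minimal sketch of a Celery app wired to the pure-Python py-amqp transport instead of librabbitmq (the module name, broker URL and task are placeholders for illustration):
from celery import Celery

# 'pyamqp://' selects the py-amqp transport explicitly, bypassing librabbitmq
app = Celery('tasks', broker='pyamqp://guest:guest@localhost:5672//')

@app.task
def add(x, y):
    return x + y
Uninstalling librabbitmq (pip uninstall librabbitmq) has the same effect, since kombu only prefers it when it is installed.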

Mongo connector with neo4j doc manager crashing

I'm using mongo-connector and neo4j_doc_manager to sync MongoDB's data to Neo4j. It used to work perfectly, but today it started giving the following error.
2016-07-29 17:18:59,558 [CRITICAL] mongo_connector.oplog_manager:549 - Exception during collection dump
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/mongo_connector/oplog_manager.py", line 501, in do_dump
upsert_all(dm)
File "/usr/local/lib/python2.7/site-packages/mongo_connector/oplog_manager.py", line 485, in upsert_all
dm.bulk_upsert(docs_to_dump(namespace), mapped_ns, long_ts)
File "/usr/local/lib/python2.7/site-packages/mongo_connector/util.py", line 38, in wrapped
reraise(new_type, exc_value, exc_tb)
File "/usr/local/lib/python2.7/site-packages/mongo_connector/util.py", line 32, in wrapped
return f(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 89, in bulk_upsert
tx.commit()
File "/usr/local/lib/python2.7/site-packages/py2neo/cypher/core.py", line 306, in commit
return self.post(self.__commit or self.__begin_commit)
File "/usr/local/lib/python2.7/site-packages/py2neo/cypher/core.py", line 261, in post
raise self.error_class.hydrate(error)
File "/usr/local/lib/python2.7/site-packages/py2neo/cypher/error/core.py", line 54, in hydrate
error_cls = getattr(error_module, title)
Neo4jOperationFailed: 'module' object has no attribute 'ConstraintValidationFailed'
2016-07-29 17:18:59,563 [ERROR] mongo_connector.oplog_manager:557 - OplogThread: Failed during dump collection cannot recover!
You're trying to insert data which doesn't match the constraints of your Neo4j schema (uniqueness or existence), and apparently the code doesn't know how to handle that type of error, though it does give its name:
ConstraintValidationFailed
You should enable some logging to see the data it is trying to insert, or the Cypher query it is trying to execute.

OpenERP Server Error Access denied

After installing Odoo, I went to the web panel, where it asked me to create a new database.
As I entered the details I got an error. I can change the master password successfully.
I already created the database via PuTTY, and there is no openerp-server.conf file under the /etc/ folder.
Odoo
OpenERP Server Error
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 500, in _handle_exception
return super(JsonRequest, self)._handle_exception(exception)
File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 517, in dispatch
result = self._call_function(**self.params)
File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 284, in _call_function
return self.endpoint(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 733, in __call__
return self.method(*args, **kw)
File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 376, in response_wrap
response = f(*args, **kw)
File "/usr/lib/python2.7/dist-packages/openerp/addons/web/controllers/main.py", line 714, in create
params['create_admin_pwd'])
File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 807, in proxy_method
result = dispatch_rpc(self.service_name, method, args)
File "/usr/lib/python2.7/dist-packages/openerp/http.py", line 100, in dispatch_rpc
result = dispatch(method, params)
File "/usr/lib/python2.7/dist-packages/openerp/service/db.py", line 62, in dispatch
security.check_super(passwd)
File "/usr/lib/python2.7/dist-packages/openerp/service/security.py", line 33, in check_super
raise openerp.exceptions.AccessDenied()
AccessDenied: Access denied.
Using the following command you will get the location of the .conf file:
locate openerp-server.conf
Now go to that path, open the file, and check whether the master password is the same as the one you gave while creating the new database.
@Danish the master password should be "admin"; it is required to create your new database.
The master password that you are using to create the database is not the same as what you have set for the PostgreSQL server.
@Danish
Check the tools -> config file and see which username and password have been set for accessing the database.
Check the pg_hba.conf file under /etc/postgresql/9.1/main, and make sure it allows connections for the database user specified in the tools -> config file, as in the sketch below.
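As a rough illustration (the user name is a placeholder), a pg_hba.conf line of this shape allows that database user to connect locally with a password; reload PostgreSQL after editing it:
host    all    odoo_user    127.0.0.1/32    md5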