Airflow - Switching to CeleryExecutor results in password authentication failed for user "airflow" exception - celery

I run a Docker container with Apache Airflow.
If I set executor = LocalExecutor, everything works fine. However, if I set executor = CeleryExecutor and run a DAG, I get the following exception:
[2020-07-13 04:17:41,065] {{celery_executor.py:266}} ERROR - Error fetching Celery task state, ignoring it:OperationalError('(psycopg2.OperationalError) FATAL: password authentication failed for user "airflow"\n')
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/airflow/executors/celery_executor.py", line 108, in fetch_celery_task_state
However, I do provide the following environment variables in the docker run call:
docker run --name test -it \
-p 8000:80 -p 5555:5555 -p 8080:8080 \
-v `pwd`:/app \
-e AWS_ACCESS_KEY_ID \
-e AWS_SECRET_ACCESS_KEY \
-e AWS_DEFAULT_REGION \
-e PYTHONPATH=/app \
-e ENVIRONMENT=local \
-e XCOMMAND \
-e POSTGRES_PORT=5432 \
-e POSTGRES_HOST=postgres \
-e POSTGRES_USER=project_user \
-e POSTGRES_PASSWORD=password \
-e DJANGO_SETTINGS_MODULE=config.settings.local \
-e AIRFLOW_DB_NAME=project_airflow_dev \
-e AIRFLOW_ADMIN_USER=project_user \
-e AIRFLOW_ADMIN_EMAIL=admin@project.com \
-e AIRFLOW_ADMIN_PASSWORD=password \
-e AIRFLOW__CORE__SQL_ALCHEMY_CONN=postgresql+psycopg2://project_user:password@postgres:5432/project_airflow_dev \
-e AIRFLOW__CORE__EXECUTOR=CeleryExecutor \
-e AIRFLOW__CELERY__BROKER_URL=redis://redis:6379/1 \
--network="project-network" \
--link project_cassandra_1:cassandra \
--link project_postgres_1:postgres \
--link project_redis_1:redis \
registry.dkr.ecr.us-east-2.amazonaws.com/airflow:v1.0
With LocalExecutor everything is fine: I can log into the admin UI, trigger the DAG, and get successful results. It's only when I switch to CeleryExecutor that I get this weird error about the "airflow" user, as if the AIRFLOW__CORE__SQL_ALCHEMY_CONN env var is not visible or used at all.
Any ideas?

Solution:
Adding the AIRFLOW__CELERY__RESULT_BACKEND env var fixed the issue. When it is unset, the Celery result backend apparently falls back to a default connection string using the "airflow" user instead of reusing AIRFLOW__CORE__SQL_ALCHEMY_CONN, which explains the error.
...
-e AIRFLOW__CELERY__RESULT_BACKEND=db+postgresql+psycopg2://project_user:password@postgres:5432/project_airflow_dev \
...
or edit airflow.cfg
[celery]
result_backend = db+postgresql://airflow:airflow@postgres/airflow
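Note that the airflow.cfg line above uses the generic default credentials; with the credentials from the question it would presumably read:
result_backend = db+postgresql://project_user:password@postgres:5432/project_airflow_dev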

Related

Why is my local file empty after mounting?

When I try to mount a volume for a PostgreSQL database, I see that my local directory is empty.
This is my code:
winpty docker run -it \
-e POSTGRES_USER="root" \
-e POSTGRES_PASSWORD="root" \
-e POSTGRES_DB="ny_taxi" \
-v /c/src/ny:/var/lib/postgresql/data \
-p 5432:5432 \
postgres:13
When I run that code on MINGW64, I see Docker produce a file named "ny;C", and it's empty.
Why is it empty, and why is it named "ny;C" instead of "ny"? How can I fix that problem?
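This looks like MSYS path conversion in Git Bash/MINGW64 mangling the -v argument (the stray "ny;C" name is a telltale sign). A sketch of two common workarounds, untested in this exact setup:
MSYS_NO_PATHCONV=1 winpty docker run -it \
-e POSTGRES_USER="root" \
-e POSTGRES_PASSWORD="root" \
-e POSTGRES_DB="ny_taxi" \
-v /c/src/ny:/var/lib/postgresql/data \
-p 5432:5432 \
postgres:13
Alternatively, double the leading slash so MSYS leaves the path alone:
-v //c/src/ny:/var/lib/postgresql/data \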

How to specify whether a connector is a source or a sink?

I am currently configuring Kafka Connect (with the debezium/connect Docker image). I successfully connected it to Kafka using environment variables:
docker run -it --rm --name AAAAAA-kafka-connect -p 8083:8083 \
-v aaaaa.jks:aaaaa.jks \
-v bbbbbb.jks:bbbbbb.jks \
-e LOG_LEVEL=INFO \
-e HOST_NAME="AAAAAA-kafka-connect" \
-e HEAP_OPTS="-Xms256m -Xmx2g" \
-e BOOTSTRAP_SERVERS="BBBBB:9092" \
-e CONNECT_CLIENT_ID="xxx-kafka-connect" \
-e CONNECT_SASL_JAAS_CONFIG="org.apache.kafka.common.security.scram.ScramLoginModule required username=\"...\" password=\"...\";" \
-e CONNECT_SECURITY_PROTOCOL="SASL_SSL" \
-e CONNECT_SASL_MECHANISM="PLAIN" \
-e CONNECT_SSL_TRUSTSTORE_LOCATION="bbbbbb.jks" \
-e CONNECT_SSL_TRUSTSTORE_PASSWORD="..." \
-e CONNECT_SSL_KEYSTORE_LOCATION="aaaaa.jks" \
-e CONNECT_SSL_KEYSTORE_PASSWORD="..." \
-e GROUP_ID="XXX.grp.kafka.connect" \
-e CONFIG_STORAGE_TOPIC="XXX.connect.configs.v1" \
-e OFFSET_STORAGE_TOPIC="XXX.connect.offsets.v1" \
-e STATUS_STORAGE_TOPIC="XXX.connect.statuses.v1" \
quay.io/debezium/connect:1.9
Now I have to create a source connector (PostgreSQL DB), and I want the data Kafka Connect grabs from the source to be sunk into a Kafka topic.
Where do I set the Kafka configuration of the sink, since there is no such config in the JSON config of the database connector?
Do I have to create a sink connector to the Kafka topic? If so, where do we specify whether a connector is a sink or a source?
PS: I have already created the Kafka topic where I want to put the data.
Feel free to ask questions.
Environment variables only modify the client parameters.
Source vs. sink is determined when you actually create the connector: you need a JSON config, and it will have a connector.class.
In the Kafka Connect API there are SinkTask and SourceTask classes.
Debezium is always a source. Sources write to Kafka; that doesn't make Kafka a sink. You need to install a new connector plugin to get a sink for your database, such as the JDBC connector from Confluent, which has classes for both sources and sinks.
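For illustration, here is roughly what creating a connector through the Connect REST API looks like; the connector.class alone decides whether it is a source or a sink (the name and connection details below are placeholders, not from the question):
curl -X POST http://localhost:8083/connectors \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-postgres-source",
    "config": {
      "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
      "database.hostname": "postgres",
      "database.port": "5432",
      "database.user": "postgres",
      "database.password": "postgres",
      "database.dbname": "mydb",
      "database.server.name": "myserver"
    }
  }'
A sink connector would instead name a sink class here, e.g. io.confluent.connect.jdbc.JdbcSinkConnector (which requires that plugin to be installed).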
OK, so you have to add CONNECT_PRODUCER_* or CONNECT_CONSUMER_* environment variables to supply the client config used by source or sink connectors, respectively!
Like this:
docker run -it --rm --name AAAAAA-kafka-connect -p 8083:8083 \
-v aaaaa.jks:aaaaa.jks \
-v bbbbbb.jks:bbbbbb.jks \
-e LOG_LEVEL=INFO \
-e HOST_NAME="AAAAAA-kafka-connect" \
-e HEAP_OPTS="-Xms256m -Xmx2g" \
-e BOOTSTRAP_SERVERS="BBBBB:9092" \
-e CONNECT_CLIENT_ID="xxx-kafka-connect" \
-e CONNECT_SASL_JAAS_CONFIG="org.apache.kafka.common.security.scram.ScramLoginModule required username=\"...\" password=\"...\";" \
-e CONNECT_SECURITY_PROTOCOL="SASL_SSL" \
-e CONNECT_SASL_MECHANISM="PLAIN" \
-e CONNECT_SSL_TRUSTSTORE_LOCATION="bbbbbb.jks" \
-e CONNECT_SSL_TRUSTSTORE_PASSWORD="..." \
-e CONNECT_SSL_KEYSTORE_LOCATION="aaaaa.jks" \
-e CONNECT_SSL_KEYSTORE_PASSWORD="..." \
-e GROUP_ID="XXX.grp.kafka.connect" \
-e CONFIG_STORAGE_TOPIC="XXX.connect.configs.v1" \
-e OFFSET_STORAGE_TOPIC="XXX.connect.offsets.v1" \
-e STATUS_STORAGE_TOPIC="XXX.connect.statuses.v1" \
-e CONNECT_PRODUCER_TOPIC_CREATION_ENABLE=false \
-e CONNECT_PRODUCER_SASL_JAAS_CONFIG="org.apache.kafka.common.security.scram.ScramLoginModule required username=\"...\" password=\"...\";" \
-e CONNECT_PRODUCER_SECURITY_PROTOCOL="SASL_SSL" \
-e CONNECT_PRODUCER_SASL_MECHANISM="PLAIN" \
-e CONNECT_PRODUCER_SSL_TRUSTSTORE_LOCATION="bbbbbb.jks" \
-e CONNECT_PRODUCER_SSL_TRUSTSTORE_PASSWORD="..." \
-e CONNECT_PRODUCER_SSL_KEYSTORE_LOCATION="aaaaa.jks" \
-e CONNECT_PRODUCER_SSL_KEYSTORE_PASSWORD="..." \
-e CONNECT_PRODUCER_CLIENT_ID="xxx-kafka-connect" \
quay.io/debezium/connect:1.9
The sink-or-source property comes from the connector.class used in the JSON definition of the connector. However, Debezium's CDC connectors can only be used as source connectors, capturing real-time change event records from external database systems (https://hevodata.com/learn/debezium-vs-kafka-connect/#:~:text=Debezium%20platform%20has%20a%20vast,records%20from%20external%20database%20systems.).

Install pgRouting in a Docker PostGIS/PostgreSQL container

I created a PostGIS database with Docker using the postgis image, as usual:
docker run -d \
--name mypostgres \
-p 5555:5432 \
-e POSTGRES_PASSWORD=postgres \
-v /data/postgres/data:/var/lib/postgresql/data \
-v /data/postgres/lib:/usr/lib/postgresql/10/lib \
postgis/postgis:10-3.0
Now I can see all the extensions in the database; it has PostGIS, which is fine, but it does not have pgRouting.
So I pulled another image:
docker pull pgrouting/pgrouting:11-3.1-3.1.3
and ran the same kind of command:
docker run -d \
--name pgrouting \
-p 5556:5432 \
-e POSTGRES_PASSWORD=postgres \
-v /data/pgrouting/data/:/var/lib/postgresql/data/ \
-v /data/postgres/lib/:/usr/lib/postgresql/11/lib/ \
pgrouting/pgrouting:11-3.1-3.1.3
But when I execute this command:
CREATE EXTENSION pgrouting;
I get this error message:
could not load library "/usr/lib/postgresql/11/lib/plpgsql.so": /usr/lib/postgresql/11/lib/plpgsql.so: undefined symbol: AllocSetContextCreate
I can't solve this problem. Can anyone help me?
Thanks a lot.
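A likely cause, judging from the undefined-symbol error: the second container mounts the PostgreSQL 10 library directory from the first setup (/data/postgres/lib) over the PostgreSQL 11 library path, so PostgreSQL 11 tries to load PostgreSQL 10 binaries. A sketch of a fix, assuming nothing else needs that mount, is to drop it and let the image use its own libraries:
docker run -d \
--name pgrouting \
-p 5556:5432 \
-e POSTGRES_PASSWORD=postgres \
-v /data/pgrouting/data/:/var/lib/postgresql/data/ \
pgrouting/pgrouting:11-3.1-3.1.3
Then connect and run CREATE EXTENSION pgrouting; again.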

Docker + Kong: [PostgreSQL error] failed to retrieve PostgreSQL server_version_num: connection refused

I'm currently running Docker 19.03.5 and trying to replicate the contents of this article, but I'm getting the following error in the third step:
First step:
docker network create kong-net
Second:
docker run -d --name kong-database \
--network=kong-net \
-p 5555:5432 \
-e "POSTGRES_USER=kong" \
-e "POSTGRES_DB=kong" \
postgres:9.6
Third:
docker run --rm \
--network=kong-net \
-e "KONG_DATABASE=postgres" \
-e "KONG_PG_HOST=kong-database" \
kong:latest kong migrations up
At this third step, if I use the verbose option, I can see the following error:
2019/12/02 15:51:25 [verbose] Kong: 1.4.0
Error:
/usr/local/share/lua/5.1/kong/cmd/migrations.lua:93: [PostgreSQL error] failed to retrieve
PostgreSQL server_version_num: connection refused
stack traceback:
[C]: in function 'assert'
/usr/local/share/lua/5.1/kong/cmd/migrations.lua:93: in function 'cmd_exec'
/usr/local/share/lua/5.1/kong/cmd/init.lua:87: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:87>
[C]: in function 'xpcall'
/usr/local/share/lua/5.1/kong/cmd/init.lua:87: in function </usr/local/share/lua/5.1/kong/cmd/init.lua:44>
/usr/local/bin/kong:9: in function 'file_gen'
init_worker_by_lua:48: in function <init_worker_by_lua:46>
[C]: in function 'xpcall'
init_worker_by_lua:55: in function <init_worker_by_lua:53>
2019/12/02 15:51:25 [verbose] no config file found at /etc/kong/kong.conf
2019/12/02 15:51:25 [verbose] no config file found at /etc/kong.conf
2019/12/02 15:51:25 [verbose] no config file, skip loading
2019/12/02 15:51:25 [verbose] prefix in use: /usr/local/kong
My docker logs -f --tail 10 kong-database:
PostgreSQL init process complete; ready for start up.
LOG: database system was shut down at 2019-12-02 12:22:46 UTC
LOG: MultiXact member wraparound protections are now enabled
LOG: autovacuum launcher started
LOG: database system is ready to accept connections
I'm running Ubuntu 18.04 and there are no other networks or containers running.
The article you're referring to is a bit outdated:
Note for Kong < 0.15: with Kong versions below 0.15 (up to 0.14), use the up sub-command instead of bootstrap. Also note that with Kong < 0.15, migrations should never be run concurrently; only one Kong node should be performing migrations at a time. This limitation is lifted for Kong 0.15, 1.0, and above.
Reference: https://hub.docker.com/_/kong
Kong docs: https://docs.konghq.com/install/docker
The instructions below should work.
Create a Docker network:
docker network create kong-net
Start a PostgreSQL container:
docker run -d --name kong-database \
--network=kong-net \
-p 5555:5432 \
-e "POSTGRES_USER=kong" \
-e "POSTGRES_DB=kong" \
-e "POSTGRES_PASSWORD=kong" \
postgres:12.2
Prepare your database:
docker run --rm \
--network=kong-net \
-e "KONG_DATABASE=postgres" \
-e "KONG_PG_HOST=kong-database" \
-e "KONG_PG_PASSWORD=kong" \
kong:2.0.3 kong migrations bootstrap
Start Kong:
docker run -d --name kong \
--network=kong-net \
--link kong-database:kong-database \
-e "KONG_DATABASE=postgres" \
-e "KONG_PG_HOST=kong-database" \
-e "KONG_PG_PASSWORD=kong" \
-e "KONG_PROXY_ACCESS_LOG=/dev/stdout" \
-e "KONG_ADMIN_ACCESS_LOG=/dev/stdout" \
-e "KONG_PROXY_ERROR_LOG=/dev/stderr" \
-e "KONG_ADMIN_ERROR_LOG=/dev/stderr" \
-e "KONG_ADMIN_LISTEN=0.0.0.0:8001, 0.0.0.0:8444 ssl" \
-p 8000:8000 \
-p 8443:8443 \
-p 8001:8001 \
-p 8444:8444 \
kong
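Once the Kong container is up, a quick sanity check (my addition, not from the original answer) is to hit the admin API, which should return HTTP 200 with node information:
curl -i http://localhost:8001/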

Hasura use SSL certificates for Postgres connection

I can run Hasura from the Docker image:
docker run -d -p 8080:8080 \
-e HASURA_GRAPHQL_DATABASE_URL=postgres://username:password@hostname:port/dbname \
-e HASURA_GRAPHQL_ENABLE_CONSOLE=true \
hasura/graphql-engine:latest
But I also have a Postgres instance that can only be accessed with three certificates:
psql "sslmode=verify-ca sslrootcert=server-ca.pem \
sslcert=client-cert.pem sslkey=client-key.pem \
hostaddr=$DB_HOST \
port=$DB_PORT \
user=$DB_USER dbname=$DB_NAME"
I don't see a configuration for Hasura that allows me to connect to a Postgres instance in such a way.
Is this something I'm supposed to pass into the database connection URL?
How should I do this?
You'll need to mount your certificates into the Docker container and then configure libpq (which is what Hasura uses underneath) to use the required certificates via its standard environment variables. It'll be something like this (I haven't tested this):
docker run -d -p 8080:8080 \
-v /absolute-path-of-certs-folder:/certs \
-e HASURA_GRAPHQL_DATABASE_URL=postgres://hostname:port/dbname \
-e HASURA_GRAPHQL_ENABLE_CONSOLE=true \
-e PGSSLMODE=verify-ca \
-e PGSSLCERT=/certs/client-cert.pem \
-e PGSSLKEY=/certs/client-key.pem \
-e PGSSLROOTCERT=/certs/server-ca.pem \
hasura/graphql-engine:latest
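Since PGSSLMODE, PGSSLCERT, PGSSLKEY, and PGSSLROOTCERT are standard libpq variables, you can verify them outside Hasura first; a rough check with psql, assuming the certificates are reachable at /certs wherever psql runs:
PGSSLMODE=verify-ca \
PGSSLROOTCERT=/certs/server-ca.pem \
PGSSLCERT=/certs/client-cert.pem \
PGSSLKEY=/certs/client-key.pem \
psql "host=$DB_HOST port=$DB_PORT user=$DB_USER dbname=$DB_NAME" -c 'SELECT 1'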