Connecting Apache Superset with PostgreSQL


Suppose I run Apache Superset in Docker and want it to connect to my local PostgreSQL server. I used the following URI but got an error:
postgresql+psycopg2://username:password@localhost:5432/mydb
The error is:
ERROR: {"error": "Connection failed!\n\nThe error message returned was:\n(psycopg2.OperationalError) could not connect to server: Connection refused\n\tIs the server running on host \"localhost\" (127.0.0.1) and accepting\n\tTCP/IP connections on port 5432?\ncould not connect to server: Cannot assign requested address\n\tIs the server running on host \"localhost\" (::1) and accepting\n\tTCP/IP connections on port 5432?\n\n(Background on this error at: http://sqlalche.me/e/e3q8)", "stacktrace": "Traceback (most recent call last):\n File \"/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py\", line 2265, in _wrap_pool_connect\n return fn()\n File \"/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py\", line 303, in unique_connection\n return _ConnectionFairy._checkout(self)\n File \"/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py\", line 760, in _checkout\n fairy = _ConnectionRecord.checkout(pool)\n File \"/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py\", line 492, in checkout\n rec = pool._do_get()\n File \"/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/impl.py\", line 238, in _do_get\n return self._create_connection()\n File \"/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py\", line 308, in _create_connection\n return _ConnectionRecord(self)\n File \"/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py\", line 437, in __init__\n self.__connect(first_connect_check=True)\n File \"/usr/local/lib/python3.6/site-packages/sqlalchemy/pool/base.py\", line 639, in __connect\n connection = pool._invoke_creator(self)\n File \"/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/strategies.py\", line 114, in ...
How can I solve it?

Instead of using localhost or 127.0.0.1, look up the server's address in pgAdmin:
Open pgAdmin; the servers are listed on the left.
Expand the Servers dropdown.
Right-click the server (the level above "Databases") and open Properties.
Go to the Connection tab; the Host name/address value is your replacement for "localhost".
Also make sure the final part of your connection string names your database, which sits one level below "Databases" in pgAdmin.

I encountered the same problem when connecting Superset to a local PostgreSQL database, and after consulting many sites this trick solved it. Instead of localhost, try putting this in the SQLAlchemy URI:
postgresql+psycopg2://user:password@host.docker.internal:5432/database
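As a rough illustration of the URI format in the answer above, here is a minimal Python sketch that assembles such a string. The function name build_postgres_uri and the credentials are made up for the example; the only pieces taken from the answer are the postgresql+psycopg2 scheme and the host.docker.internal host name. On Linux hosts that name may need to be mapped explicitly when the container is started.

```python
from urllib.parse import quote_plus


def build_postgres_uri(user: str, password: str, host: str, port: int, database: str) -> str:
    """Assemble a SQLAlchemy-style PostgreSQL URI for the psycopg2 driver.

    Hypothetical helper for illustration; not part of Superset or SQLAlchemy.
    """
    # Percent-encode the credentials so characters such as "@" or "/" in a
    # password cannot be mistaken for URI delimiters.
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Placeholder credentials; host.docker.internal is the host name suggested
# above for reaching the Docker host from inside a container.
uri = build_postgres_uri("user", "p@ssword", "host.docker.internal", 5432, "database")
print(uri)
```

Encoding the password matters mainly when it contains reserved characters; a plain alphanumeric password passes through unchanged.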

I understand that connecting a Docker container to a database on the host is bad practice, so I changed my approach: I now use the postgres image inside Docker and push my data to that PostgreSQL server.
Please let me know if I am wrong.

Related

Cannot connect to my PostgreSQL server remotely from chainlink node terminal

I have tried everything to connect my Chainlink node to my PostgreSQL database, with no luck. I have scoured the web for answers to no avail.
Here is the error message I am receiving:
[ERROR] failed to initialize database, got error failed to connect to `host=/tmp user=root database=`: dial error (dial unix /tmp/.s.PGSQL.5432: connect: no such file or directory)
Here is my .env file:
ROOT=/chainlink
LOG_LEVEL=debug
ETH_CHAIN_ID=42
MIN_OUTGOING_CONFIRMATIONS=2
LINK_CONTRACT_ADDRESS=0xa36085F69e2889c224210F603D836748e7dC0088
CHAINLINK_TLS_PORT=0
SECURE_COOKIES=false
GAS_UPDATER_ENABLED=true
ALLOW_ORIGINS=*
ETH_URL=wss://kovan.infura.io/ws/v3/id...
DATABASE_URL=https://chainlink-db-url://postgres:Password@chainlink-kovan:5432
I have tried every configuration of the connection string. I am also able to connect to the database via pgAdmin without any problem, and the databases are publicly accessible.
The postgresql database is on AWS.

Please change the syntax of your DATABASE_URL to:
DATABASE_URL=postgresql://"username":"password"@"public-ip-pg-server":5432/"database-name"
Replace the placeholders as follows:
"username": a newly configured user, because the default admin user postgres will not work here.
"password": the password of that user.
"public-ip-pg-server": the public IP address of your PostgreSQL server.
"database-name": the name of your database.
PS: remove all the " characters from the final value.
Here is the link to the official documentation: https://docs.chain.link/docs/connecting-to-a-remote-database/
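To make the required shape of the URL concrete, here is a small Python sketch that checks a connection string for the parts listed above. The helper check_database_url and the sample values (user chainlink, address 203.0.113.10, database chainlink_db) are hypothetical and only illustrate the format; this is not how Chainlink itself validates the setting.

```python
from typing import List
from urllib.parse import urlsplit


def check_database_url(url: str) -> List[str]:
    """Return a list of problems found in a PostgreSQL connection URL.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    problems = []
    if parts.scheme not in ("postgresql", "postgres"):
        problems.append(f"unexpected scheme: {parts.scheme!r}")
    if not parts.username:
        problems.append("missing username")
    if not parts.password:
        problems.append("missing password")
    if not parts.hostname:
        problems.append("missing host")
    if not parts.path.strip("/"):
        problems.append("missing database name")
    return problems


# Placeholder values in the format described above (quotes removed).
print(check_database_url("postgresql://chainlink:secret@203.0.113.10:5432/chainlink_db"))
# The same URL without a database name is reported as incomplete.
print(check_database_url("postgresql://chainlink:secret@203.0.113.10:5432"))
```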

I can't connect to MongoDB Atlas

I tried to connect with
mongo "mongodb+srv://cluster0.mi3o1.mongodb.net/test" --username cristian
in the shell.
But instead, it looks like it's trying to connect with:
mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
I am getting the error
Error: couldn't connect to server 127.0.0.1:27017, connection attempt failed: SocketException: Error connecting to 127.0.0.1:27017 :: caused by :: No connection could be made because the target machine actively refused it. :
connect@src/mongo/shell/mongo.js:374:17
@(connect):2:6
exception: connect failed
exiting with code 1
bash: mongodb+srv://cluster0.mi3o1.mongodb.net/test: No such file or directory
I have created the cluster, database access with admin role user cristian, whitelisted both my IP and all IPs 0.0.0.0/0, created a new database, loaded sample databases, opened ports 27015,27016,27017 and tested on portquiz.net.
I added a PRTSCN.
Please help!

Thank you very much, D. SM, for your help. It looks like it was a problem with my .bash_profile. When I copied and pasted the two paths, the terminal for some reason wrote the closing " on a separate line with some spaces in between. I rewrote the two lines without the extra space, like this:
alias mongod="/c/Program\ files/MongoDB/Server/4.4/bin/mongod.exe"
alias mongo="/c/Program\ Files/MongoDB/Server/4.4/bin/mongo.exe"
and it worked.

Unable to insert or retrieved data to MongoDB

I am trying to insert data into and read data from MongoDB.
The connection was set up following the instructions on mongodb.com:
try:
    client = MongoClient(
        'mongodb+srv://user:pw!@cluster0-nghj0.gcp.mongodb.net/test?retryWrites=true',
        ssl=True)
    print("connected")
except:
    print('failed')
I manually created a database and collection, messager.messager, and put some JSON documents in it.
When I try to use collection.find() or collection.insert_one(...):
db = client.messager
collection = db.messager
for i in collection.find():
    print(i)
it returns a timeout error:
File "/Users/anhnguyen/Documents/GitHub/GoogleCloud_Flask/comming soon/env/lib/python3.7/site-packages/pymongo/cursor.py", line 1225, in next
if len(self.__data) or self._refresh():
File "/Users/anhnguyen/Documents/GitHub/GoogleCloud_Flask/comming soon/env/lib/python3.7/site-packages/pymongo/cursor.py", line 1117, in _refresh
self.__session = self.__collection.database.client._ensure_session()
File "/Users/anhnguyen/Documents/GitHub/GoogleCloud_Flask/comming soon/env/lib/python3.7/site-packages/pymongo/mongo_client.py", line 1598, in _ensure_session
return self.__start_session(True, causal_consistency=False)
File "/Users/anhnguyen/Documents/GitHub/GoogleCloud_Flask/comming soon/env/lib/python3.7/site-packages/pymongo/mongo_client.py", line 1551, in __start_session
server_session = self._get_server_session()
File "/Users/anhnguyen/Documents/GitHub/GoogleCloud_Flask/comming soon/env/lib/python3.7/site-packages/pymongo/mongo_client.py", line 1584, in _get_server_session
return self._topology.get_server_session()
File "/Users/anhnguyen/Documents/GitHub/GoogleCloud_Flask/comming soon/env/lib/python3.7/site-packages/pymongo/topology.py", line 434, in get_server_session
None)
File "/Users/anhnguyen/Documents/GitHub/GoogleCloud_Flask/comming soon/env/lib/python3.7/site-packages/pymongo/topology.py", line 200, in _select_servers_loop
self._error_message(selector))
pymongo.errors.ServerSelectionTimeoutError: connection closed,connection closed,connection closed
Where did it go wrong?
Here is my Mongodb.com setup:

According to the PyMongo documentation on errors, you are hitting the following exception:
exception pymongo.errors.ServerSelectionTimeoutError(message='', errors=None)
Thrown when no MongoDB server is available for an operation
If there is no suitable server for an operation PyMongo tries for serverSelectionTimeoutMS (default 30 seconds) to find one, then throws this exception. For example, it is thrown after attempting an operation when PyMongo cannot connect to any server, or if you attempt an insert into a replica set that has no primary and does not elect one within the timeout window, or if you attempt to query with a Read Preference that the replica set cannot satisfy.
Check whether your network has good connectivity to the network where your MongoDB server resides.
It is also possible that the primary node of the replica set has become unresponsive. In that case you need to restart your cluster (if you have the access permissions).
Also, create the connection as follows:
mongo_conn = MongoClient('mongodb+srv://cluster0-nghj0.gcp.mongodb.net/test?retryWrites=true', username=your_username, password=pwd, authSource='admin', authMechanism='SCRAM-SHA-1')
The above is the best practice to follow. mongodb+srv URLs do not need ssl=True.
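For the connectivity check suggested in this answer, a plain TCP probe can serve as a first test before involving PyMongo at all. The sketch below uses only the Python standard library; the function name can_connect and the host in the usage example are placeholders. Note that with a mongodb+srv URL the actual member host names come from a DNS lookup, so the host you probe is an assumption you need to fill in yourself, and a successful TCP connection does not prove that authentication or TLS will work.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


if __name__ == "__main__":
    # Placeholder host; 27017 is the default MongoDB port.
    print(can_connect("localhost", 27017))
```

If the probe fails, the problem is likely at the network level (firewall, IP access list, DNS) rather than in the PyMongo call.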

Airflow psycopg2.OperationalError: FATAL: sorry, too many clients already

I have a four node clustered Airflow environment that's been working fine for me for a few months now.
ec2-instances
Server 1: Webserver, Scheduler, Redis Queue, PostgreSQL Database
Server 2: Webserver
Server 3: Worker
Server 4: Worker
Recently I've been working on a more complex DAG with a few dozen tasks, compared with the relatively small ones I was working on before. I'm not sure whether that's why I'm only now seeing this error, but I sporadically get it:
On the Airflow UI under the logs for the task:
psycopg2.OperationalError: FATAL: sorry, too many clients already
And on the Webserver (output from running airflow webserver) I get the same error too:
[2018-07-23 17:43:46 -0400] [8116] [ERROR] Exception in worker process
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 2158, in _wrap_pool_connect
return fn()
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 403, in connect
return _ConnectionFairy._checkout(self)
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 788, in _checkout
fairy = _ConnectionRecord.checkout(pool)
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 532, in checkout
rec = pool._do_get()
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 1193, in _do_get
self._dec_overflow()
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
compat.reraise(exc_type, exc_value, exc_tb)
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 187, in reraise
raise value
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 1190, in _do_get
return self._create_connection()
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 350, in _create_connection
return _ConnectionRecord(self)
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 477, in __init__
self.__connect(first_connect_check=True)
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/pool.py", line 671, in __connect
connection = pool._invoke_creator(self)
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/strategies.py", line 106, in connect
return dialect.connect(*cargs, **cparams)
File "/usr/local/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 410, in connect
return self.dbapi.connect(*cargs, **cparams)
File "/usr/local/lib64/python3.6/site-packages/psycopg2/__init__.py", line 130, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: FATAL: sorry, too many clients already
I can fix this by running sudo /etc/init.d/postgresql restart and restarting the DAG but then after about three runs I'll start seeing the error again.
I can't find any specifics on this issue in regards to Airflow but from other posts I've found such as this one they're saying it's because my client (I guess in this case that's Airflow) is trying to open up more connections to PostgreSQL than what PostgreSQL is configured to handle. I ran this command to find that my PostgreSQL can accept 100 connections:
[ec2-user@ip-1-2-3-4 ~]$ sudo su
root@ip-1-2-3-4
[/home/ec2-user]# psql -U postgres
psql (9.2.24)
Type "help" for help.
postgres=# show max_connections;
 max_connections
-----------------
 100
(1 row)
That post says I can increase PostgreSQL's max_connections, but I'm wondering whether I should instead set a value in my airflow.cfg file so that the number of connections Airflow is allowed to open matches my PostgreSQL max_connections. Does anyone know where I can set this value in Airflow? Here are the fields I think are relevant:
# The SqlAlchemy pool size is the maximum number of database connections
# in the pool.
sql_alchemy_pool_size = 5
# The SqlAlchemy pool recycle is the number of seconds a connection
# can be idle in the pool before it is invalidated. This config does
# not apply to sqlite.
sql_alchemy_pool_recycle = 3600
# The amount of parallelism as a setting to the executor. This defines
# the max number of task instances that should run simultaneously
# on this airflow installation
parallelism = 32
# The number of task instances allowed to run concurrently by the scheduler
dag_concurrency = 32
# When not using pools, tasks are run in the "default pool",
# whose size is guided by this config element
non_pooled_task_slot_count = 128
# The maximum number of active DAG runs per DAG
max_active_runs_per_dag = 32
Open to any suggestions for fixing this issue. Is this something related to my Airflow configuration or is it an issue with my PostgreSQL configuration?
Also, because I'm testing a new DAG I'll sometimes terminate the running tasks and start them over. Perhaps doing this is causing some of the processes to not die correctly and they're keeping dead connections open to PostgreSQL?

I ran into a similar issue. I changed max_connections in PostgreSQL to 10000 and sql_alchemy_pool_size in the Airflow config to 1000. Now I am able to run hundreds of tasks in parallel.
PS: My machine has 32 cores and 60 GB of memory, so it can take the load.

Quoting the airflow documentation:
sql_alchemy_max_overflow: The maximum overflow size of the pool. When the number of checked-out connections reaches the size set in pool_size, additional connections will be returned up to this limit. When those additional connections are returned to the pool, they are disconnected and discarded. It follows then that the total number of simultaneous connections the pool will allow is pool_size + max_overflow, and the total number of “sleeping” connections the pool will allow is pool_size. max_overflow can be set to -1 to indicate no overflow limit; no limit will be placed on the total number of concurrent connections. Defaults to 10.
It seems that the variables you'll want to set on your airflow.cfg are both sql_alchemy_pool_size and sql_alchemy_max_overflow. Your PostgreSQL max_connections must be equal to or greater than the sum of those two Airflow configuration variables, since Airflow can have at most sql_alchemy_pool_size + sql_alchemy_max_overflow open connections with your database.
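The arithmetic in this answer can be sketched in a few lines of Python. The function names below are hypothetical, and the processes parameter is an assumption that goes beyond the quoted documentation: it supposes that every Airflow process creating its own SQLAlchemy engine keeps its own pool, so the per-pool ceiling is multiplied by the number of such processes.

```python
from typing import Optional


def pool_connection_ceiling(pool_size: int, max_overflow: int) -> Optional[int]:
    """Maximum simultaneous connections for one pool, or None if unlimited."""
    if max_overflow == -1:
        # Per the quoted documentation, -1 means no overflow limit.
        return None
    return pool_size + max_overflow


def fits_within(max_connections: int, pool_size: int, max_overflow: int,
                processes: int = 1) -> bool:
    """Check whether the pools can never exceed PostgreSQL's max_connections.

    `processes` is an assumed count of independent pools (one per process).
    """
    ceiling = pool_connection_ceiling(pool_size, max_overflow)
    if ceiling is None:
        return False  # an unlimited pool can always exceed a finite limit
    return ceiling * processes <= max_connections


# Values from the question: pool size 5, default overflow 10, limit 100.
print(pool_connection_ceiling(5, 10))
print(fits_within(100, 5, 10, processes=4))
print(fits_within(100, 5, 10, processes=7))
```

With the values from the question, a single pool stays well below 100 connections; the limit only comes into reach once several pools are counted together.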

Heroku postgres connection: "Is the server running locally and accepting connections on Unix domain socket"

I am setting up a new dev environment, followed the django setup tutorial and am having issues. Here is what I get when I try to run syncdb
Running `python doccal/manage.py syncdb` attached to terminal... up, run.1
Traceback (most recent call last):
File "doccal/manage.py", line 14, in <module>
execute_manager(settings)
File "/app/.heroku/venv/lib/python2.7/site-packages/django/core/management/__init__.py", line 459, in execute_manager
utility.execute()
File "/app/.heroku/venv/lib/python2.7/site-packages/django/core/management/__init__.py", line 382, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/app/.heroku/venv/lib/python2.7/site-packages/django/core/management/base.py", line 196, in run_from_argv
self.execute(*args, **options.__dict__)
File "/app/.heroku/venv/lib/python2.7/site-packages/django/core/management/base.py", line 232, in execute
output = self.handle(*args, **options)
File "/app/.heroku/venv/lib/python2.7/site-packages/django/core/management/base.py", line 371, in handle
return self.handle_noargs(**options)
File "/app/.heroku/venv/lib/python2.7/site-packages/django/core/management/commands/syncdb.py", line 57, in handle_noargs
cursor = connection.cursor()
File "/app/.heroku/venv/lib/python2.7/site-packages/django/db/backends/__init__.py", line 306, in cursor
cursor = self.make_debug_cursor(self._cursor())
File "/app/.heroku/venv/lib/python2.7/site-packages/django/db/backends/postgresql_psycopg2/base.py", line 177, in _cursor
self.connection = Database.connect(**conn_params)
File "/app/.heroku/venv/lib/python2.7/site-packages/psycopg2/__init__.py", line 179, in connect
connection_factory=connection_factory, async=async)
psycopg2.OperationalError: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/run/postgresql/.s.PGSQL.5432"?
I have set up this same project before using the same steps and never had a problem. A few weeks ago I did get an email saying that Heroku was migrating away from shared databases, and I assume this is somehow involved.
I also noticed two new steps in the tutorial, namely installing dj-database-url and adding these lines to settings.py:
import dj_database_url
DATABASES = {'default': dj_database_url.config(default='postgres://localhost')}
I have tried to run this both with and without these lines and get the same issue regardless.
Another post suggested that the fix was to run:
heroku addons:add shared-database
I tried that and got a message that shared-database is deprecated and to use heroku-postgresql instead, but that had no effect.
Thanks for any help.

Configure HOST as localhost, like this:
'HOST': 'localhost',

The simplest solution to this error (not comprehensive, but one that gets you up and running as quickly as possible) is to revert the database settings in your settings.py file to:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': BASE_DIR / 'db.sqlite3',
    }
}
Then make migrations (i.e. python manage.py makemigrations and then python manage.py migrate).
Push your changes to GitHub. In your Heroku account, create a new app, go to the settings section of the new app, connect it to GitHub and deploy from GitHub. The database resources will then be provisioned for you (i.e. a PostgreSQL database will be created with the same schema and data as your local SQLite database).
