uWSGI, Flask, sqlalchemy, and postgres: SSL error: decryption failed or bad record mac - postgresql

I'm trying to setup an application webserver using uWSGI + Nginx, which runs a Flask application using SQLAlchemy to communicate to a Postgres database.
When I make requests to the webserver, every other response will be a 500 error.
The error is:
Traceback (most recent call last):
File "/var/env/argos/lib/python3.3/site-packages/sqlalchemy/engine/base.py", line 867, in _execute_context
context)
File "/var/env/argos/lib/python3.3/site-packages/sqlalchemy/engine/default.py", line 388, in do_execute
cursor.execute(statement, parameters)
psycopg2.OperationalError: SSL error: decryption failed or bad record mac
The above exception was the direct cause of the following exception:
sqlalchemy.exc.OperationalError: (OperationalError) SSL error: decryption failed or bad record mac
The error is triggered by a simple Flask-SQLAlchemy method:
result = models.Event.query.get(id)
uwsgi is being managed by supervisor, which has a config:
[program:my_app]
command=/usr/bin/uwsgi --ini /etc/uwsgi/apps-enabled/myapp.ini --catch-exceptions
directory=/path/to/my/app
stopsignal=QUIT
autostart=true
autorestart=true
and uwsgi's config looks like:
[uwsgi]
socket = /tmp/my_app.sock
logto = /var/log/my_app.log
plugins = python3
virtualenv = /path/to/my/venv
pythonpath = /path/to/my/app
wsgi-file = /path/to/my/app/application.py
callable = app
max-requests = 1000
chmod-socket = 666
chown-socket = www-data:www-data
master = true
processes = 2
no-orphans = true
log-date = true
uid = www-data
gid = www-data
The furthest that I can get is that it has something to do with uwsgi's forking. But beyond that I'm not clear on what needs to be done.

The issue ended up being uwsgi's forking.
When working with multiple processes with a master process, uwsgi initializes the application in the master process and then copies the application over to each worker process. The problem is if you open a database connection when initializing your application, you then have multiple processes sharing the same connection, which causes the error above.
The solution is to set the lazy configuration option for uwsgi, which forces a complete loading of the application in each process:
lazy
Set lazy mode (load apps in workers instead of master).
This option may have memory usage implications as Copy-on-Write semantics can not be used. When lazy is enabled, only workers will be reloaded by uWSGI’s reload signals; the master will remain alive. As such, uWSGI configuration changes are not picked up on reload by the master.
There's also a lazy-apps option:
lazy-apps
Load apps in each worker instead of the master.
This option may have memory usage implications as Copy-on-Write semantics can not be used. Unlike lazy, this only affects the way applications are loaded, not master’s behavior on reload.
This uwsgi configuration ended up working for me:
[uwsgi]
socket = /tmp/my_app.sock
logto = /var/log/my_app.log
plugins = python3
virtualenv = /path/to/my/venv
pythonpath = /path/to/my/app
wsgi-file = /path/to/my/app/application.py
callable = app
max-requests = 1000
chmod-socket = 666
chown-socket = www-data:www-data
master = true
processes = 2
no-orphans = true
log-date = true
uid = www-data
gid = www-data
# the fix
lazy = true
lazy-apps = true

As an alternative you might dispose the engine. This is how I solved the problem.
Such issues may happen if there is a query during the creation of the app, that is, in the module that creates the app itself. If that states, the engine allocates a pool of connections and then uwsgi forks.
By invoking 'engine.dispose()', the connection pool itself is closed and new connections will come up as soon as someone starts making queries again. So if you do that at the end of the module where you create your app, new connections will be created after the UWSGI fork.

I am running a flask app using gunicorn on Heroku. My application started exhibiting this problem when I added the --preload option to my Procfile. When I removed that option, my application resumed functioning as normal.

Not sure whether to add this as an answer to this question or ask a separate question and put this as an answer there. I was getting this exact same error for reasons that are slightly different from the people who have posted and answered. In my setup, I using gunicorn as a wsgi for a Flask application. In this application, I was offloading some intense database operations off to a celery worker. The error would come from the celery worker.
From reading a lot of the answers here and looking at the psycopg2 as well as sqlalchemy session documentation, it became apparent to me that it is a bad idea to share an SQLAlchemy session between separate processes (the gunicorn worker and the sqlalchemy worker in my case).
What ended up solving this for me was creating a new session in the celery worker function so it used a new session each time it was called and also destroying the session after every web request so flask used a session per request. The overall solution looked like this:
Flask_app.py
#app.teardown_appcontext
def shutdown_session(exception=None):
session.close()
celery_func.py
#celery_app.task(bind=True, throws=(IntegrityError))
def access_db(self,entity_dict, tablename):
with Session() as session:
try:
session.add(ORM_obj)
session.commit()
except IntegrityError as e:
session.rollback()
print('primary key violated')
raise e

Related

Program stalls on "DEBUG Starting new HTTPS connection (1): login.microsoftonline.com:443" only when executed via cron

I'm using the python-O365 library in order to access my O365 mailbox. The project requires me to execute the program in a docker container. If I start the program manually (as root), everything works fine, but if I try to start it via cron, it stalls on DEBUG Starting new HTTPS connection (1): login.microsoftonline.com:443, which I found out after activating logging.
The minimal code example that reproduces the error (with log):
import O365
from utils.credentials import get_credentials
import logging # We want to get additional information
logging.basicConfig(
filename='./easy_log_file.log',
filemode='a',
format='%(levelname)s %(message)s', # %(asctime)s %(pathname)s %(lineno)d
level=logging.DEBUG
)
filename = "o365_token.txt"
token_backend = O365.FileSystemTokenBackend(token_path = filename)
account = O365.Account(get_credentials(), token_backend=token_backend)
inbox = account.mailbox().inbox_folder()
messages = inbox.get_messages()
for message in messages:
logging.info(message)
logging.info("finished")
To start it via cron, I used the following command:
echo "15 21 * * * bash /workspace/daemon_start.sh >> /workspace/cronlogs/logs_daemon_mail.log" | crontab. If I start the program manually, the log continues like this:
DEBUG Starting new HTTPS connection (1): graph.microsoft.com:443
DEBUG https://graph.microsoft.com:443 "GET /v1.0/me/mailFolders/Inbox/messages?%24top=100&%24filter=isRead+eq+false HTTP/1.1" 200 None
DEBUG Received response (200) from URL https://graph.microsoft.com/v1.0/me/mailFolders/Inbox/messages?%24top=100&%24filter=isRead+eq+false
If the program is started via cron, sometimes the log continues like this:
DEBUG Incremented Retry for (url='/common/oauth2/v2.0/token'): Retry(total=2, connect=3, read=3, redirect=None, status=None)
WARNING Retrying (Retry(total=2, connect=3, read=3, redirect=None, status=None)) after connection broken by 'SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:1131)'))': /common/oauth2/v2.0/token
In order to resolve the issue, I added my proxy by using account = Account(credentials, token_backend=token_backend, proxy_server="proxy.my_proxy.com"). It's strange, that I would have to add it, for the container is already configured to use this proxy. When I tried it with this setting, I encountered the same issue, only that the log when started with cron was continued always and much faster.
Since I think, that cron simply starts the program and does not meddle with the connections, it doesn't make sense to me, that I get different outcomes by starting it manually or with cron.

MongoEngine connects to incorrect database during testing

Context
I'm creating Flask app connected to mongodb using MongoEngine via flask-mongoengine extension. I create my app using application factory pattern as specified in configuration instructions.
Problem
While running test(s), I specified testing database named datazilla_test which is passed to mongo instance via mongo.init_app(app). Even though my app.config['MONGODB_DB'] and mongo.app.config['MONGODB_DB'] instance has correct value (datazilla_test), this value is not reflected in mongo instance. Thus, when I run assertion assert mongo.get_db().name == mongo.app.config['MONGODB_DB'] this error is triggered AssertionError: assert 'datazzilla' == 'datazzilla_test'
Question
What am I doing wrong? Why database connection persist with default database datazzilla rather, than datazilla_test? How to fix it?
Source Code
# __init__.py
from flask_mongoengine import MongoEngine
mongo = MongoEngine()
def create_app(config=None):
app = Flask(__name__)
app.config['MONGODB_HOST'] = 'localhost'
app.config['MONGODB_PORT'] = '27017'
app.config['MONGODB_DB'] = 'datazzilla'
# override default config
if config is not None:
app.config.from_mapping(config)
mongo.init_app(app)
return app
# conftest.py
import pytest
from app import mongo
from app import create_app
#pytest.fixture
def app():
app = create_app({
'MONGODB_DB': 'datazzilla_test',
})
assert mongo.get_db().name == mongo.app.config['MONGODB_DB']
# AssertionError: assert 'datazzilla' == 'datazzilla_test'
return app
Mongoengine is already connected when your fixture is called, when you call app = create_app from your fixture, it tries to re-establish the connection but fails silently (as it sees that there is an existing default connection established).
This got reworked in the development version of mongoengine (see https://github.com/MongoEngine/mongoengine/pull/2038) but wasn't released yet (As of 04-JUN-2019). When that version gets out, you'll be able to disconnect any existing mongoengine connection by calling disconnect_all
In the meantime, you can either:
- check where the existing connection is created and prevent it
- try to disconnect the existing connection by using the following:
from mongoengine.connection import disconnect, _connection_settings
#pytest.fixture
def app():
disconnect()
del _connection_settings['default']
app = create_app(...)
...
But it may have other side effects
Context
Coincidentally, I figure-out fix for this problem. #bagerard answer is correct! It works for MongoClient where client's connect is set to True -this is/should be default value.
MongoClient(host=['mongo:27017'], document_class=dict, tz_aware=False, connect=False, read_preference=Primary())
If that is the case, then you have to disconnect database and delete connection settings as #bagerard explains.
Solution
However, if you change MongoClient connection to False, then you don't have to disconnect database and delete connection settings. At the end solution that worked for me was this solution.
def create_app(config=None):
...
app.config['MONGODB_CONNECT'] = False
...
Notes
As I wrote earlier. I found this solution coincidentally, I was trying to solve this problem MongoClient opened before fork. Create MongoClient only after forking. It turned out that it fixes both problems :)
P.S If there are any side effects I'm not aware of them at this point! If you find some then please share them in comments section.

How to share connection pool with multiprocessing Python

I'm trying to share a psycopg2.pool.(Simple/Threaded)ConnectionPool among multiple python processes. What is the correct way to approach this?
I am using Python 2.7 and Postgres 9.
I would like to provide some context. I want to use a connection pool is because I am running an arbitrary, but large (>80) number of processes that initially use the database to query results, perform some actions, then update the database with the results of the actions.
So far, I've tried to use the multiprocessing.managers.BaseManager and pass it to child processes so that the connections that are being used/unused are synchronized across the processes.
from multiprocessing import Manager, Process
from multiprocessing.managers import BaseManager
PASSWORD = 'xxxx'
def f(connection, connection_pool):
with connection.cursor() as curs: # ** LINE REFERENCED BELOW
curs.execute('SELECT * FROM database')
connection_pool.putconn(connection)
BaseManager.register('SimpleConnectionPool', SimpleConnectionPool)
manager = BaseManager()
manager.start()
conn_pool = manager.SimpleConnectionPool(5, 20, dbname='database', user='admin', host='xxxx', password=PASSWORD, port='8080')
with conn_pool.getconn() as conn:
print conn # prints '<connection object at 0x7f48243edb48; dsn:'<unintialized>', closed:0>
proc = Process(target=f, args=(conn, conn_pool))
proc.start()
proc.join()
** raises an error, 'Operational Error: asynchronous connection attempt underway'
If anyone could recommend a method to share the connection pool with numerous processes it would be greatly appreciated.
You don't. You can't pass a socket to another process. The other process must open the connection itself.

celeryd insists on checking amqp when its configured to use redis

I am trying to run celerdy + redis in my setup.
CELERYD_NODES="worker1"
CELERYD_NODES="worker1 worker2 worker3"
CELERY_BIN="/home/snijsure/.virtualenvs/mtest/bin/celery"
CELERYD_CHDIR="/home/snijsure/work/mytest/"
CELERYD_OPTS="--time-limit=300 --concurrency=8"
CELERYD_LOG_FILE="/var/log/celery/%N.log"
CELERYD_PID_FILE="/var/run/celery/%N.pid"
CELERYD_USER="celery"
CELERYD_GROUP="celery"
CELERY_CREATE_DIRS=1
export DJANGO_SETTINGS_MODULE="analytics.settings.local"
I have following in my base.py
BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'
BROKER_HOST = "localhost"
BROKER_BACKEND="redis"
REDIS_PORT=6379
REDIS_HOST = "localhost"
BROKER_USER = ""
BROKER_PASSWORD =""
BROKER_VHOST = "0"
REDIS_DB = 0
REDIS_CONNECT_RETRY = True
CELERY_SEND_EVENTS=True
CELERY_RESULT_BACKEND='redis'
CELERY_TASK_RESULT_EXPIRES = 10
CELERYBEAT_SCHEDULER="djcelery.schedulers.DatabaseScheduler"
CELERY_ALWAYS_EAGER = False
import djcelery
djcelery.setup_loader()
However when I start the celeryd using /etc/init.d/celerdy start
I see following messages in my log files
[2014-08-14 23:16:41,430: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**#127.0.0.1:5672//: [Errno 111] Connection refused.
Trying again in 32.00 seconds...
It seems like its trying to connect to amqp. Any ideas on why that is I have followed procedure outlined here
http://celery.readthedocs.org/en/latest/getting-started/brokers/redis.html
I am running version 3.1.13 (Cipater)
What am I doing wrong?
-Subodh
How do you start you celery worker? I encounter this error once because I didn't start it right. You should add -A option when execute "celery worker" so that celery will connect to the broker you configured in your Celery Obj. Otherwise celery will try to connect the default broker.
Your /etc/default/celeryd file looks ok.
You are using djcelery, however. I'd recommend you drop that. If you look at the Django setup guide and example project you will notice that there are no longer any INSTALLED_APPS required for celery. It appears that djcelery is now only recommended if you want to use the Django SQL database as a backend.
https://github.com/celery/celery/tree/3.1/examples/django/
http://celery.readthedocs.org/en/latest/django/first-steps-with-django.html#using-celery-with-django
I've just rebuilt against that pattern and I can confirm that it works ok, at least in terms of connecting to Redis rather than trying to use RabbitMQ (amqp).

Buildbot slaves priority

Problem
I have set up a latent slave in buildbot to help avoid congestion.
I've set up my builds to run either in permanent slave or latent one. The idea is the latent slave is waken up only when needed but the result is that buildbot randomly selectes one slave or the other so sometimes I have to wait for the latent slave to wake even if the permanent one is idle.
Is there a way to prioritize buildbot slaves?
Attempted solutions
1. Custom nextSlave
Following #david-dean suggestion, I've created a nextSlave function as follows (updated to working version):
from twisted.python import log
import traceback
def slave_selector(builder, builders):
try:
host = None
support = None
for builder in builders:
if builder.slave.slavename == 'host-slave':
host = builder
elif builder.slave.slavename == 'support-slave':
support = builder
if host and support and len(support.slave.slave_status.runningBuilds) < len(host.slave.slave_status.runningBuilds):
log.msg('host-slave has many running builds, launching build in support-slave')
return support
if not support:
log.msg('no support slave found, launching build in host-slave')
elif not host:
log.msg('no host slave found, launching build in support-slave')
return support
else:
log.msg('launching build in host-slave')
return host
except Exception as e:
log.err(str(e))
log.err(traceback.format_exc())
log.msg('Selecting random slave')
return random.choice(buildslaves)
And then passed it to BuilderConfig.
The result is that I get this in twistd.log:
2014-04-28 11:01:45+0200 [-] added buildset 4329 to database
But the build never starts, in the web UI it always appear as Pending and none of the logs I've put appear in twistd.log
2. Trying to mimic default behavior
I've having a look to buildbot code, to see how it is done by default.
in file ./master/buildbot/process/buildrequestdistributor.py, class BasicBuildChooser you have:
self.nextSlave = self.bldr.config.nextSlave
if not self.nextSlave:
self.nextSlave = lambda _,slaves: random.choice(slaves) if slaves else None
So I've set exactly that lambda function in my BuilderConfig and I'm getting exactly the same build not starting result.
You can set up a nextSlave function to assign slaves to a builder in a custom manner see: http://docs.buildbot.net/current/manual/cfg-builders.html#builder-configuration