how to solve 'Mysql2::Error: This connection is still waiting for a result' error with mysql2 and activerecord - eventmachine

Not a duplicate of this question with the same title.
I am using ActiveRecord with mysql2, and I am designing the app to handle 10 queries against the same ActiveRecord model/class at a time. Please note that I am using ActiveRecord exclusively and not issuing MySQL queries directly.
I receive the calls in Sinatra and then use ActiveRecord to fetch the data from the DB.
I don't want the calls to block, so I switched to mysql2, and I do NOT want to use em-synchrony.
But now I get the following error on subsequent simultaneous calls: "Mysql2::Error: This connection is still waiting for a result, try again once you have the result".
I am not establishing the connection with pool=10.
My class:
class User < ActiveRecord::Base
and the code that makes the call:
User.find(:all, :conditions => ["id = ?", userid])
The mysql2 doc says "To use the ActiveRecord driver (with or without rails), all you should need to do is have this gem installed and set the adapter in your database.yml to "mysql2". That was easy right? :)"
And that is exactly what I have done when I moved from mysql to mysql2.
Why am I getting this error?

Here is a fully working example:
require 'rubygems'
gem 'activerecord', '~> 3.1.0'
gem 'sinatra', '~> 1.3.1'
gem 'mysql2', '~> 0.3.11'

require 'active_record'
require 'sinatra/base'
require 'mysql2'

# Thin uses the EventMachine thread pool;
# you should have at least one connection per thread
# or you can expect errors.
EM::threadpool_size = 10

# Connect to the database.
ActiveRecord::Base.establish_connection(
  :adapter  => "mysql2",
  :database => "test",
  :username => "root",
  :encoding => 'utf8',
  # number of connections opened to the database
  :pool     => 10
)

class App < Sinatra::Base
  get '/db' do
    ActiveRecord::Base.connection.execute("SELECT SLEEP(1)")
  end
end

run App
To run it, save the file as "config.ru" and start it with thin in threaded mode:
thin start -e production --threaded
You can use ab to check that everything is working; I used a tool called siege:
siege -c 10 -r 1 http://localhost:3000/db
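For completeness, the same pool size can be set in database.yml instead of passing :pool to establish_connection; a minimal sketch that just mirrors the connection settings above:

production:
  adapter: mysql2
  database: test
  username: root
  encoding: utf8
  pool: 10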

You should use a ConnectionPool... somehow you have two connections in a race condition.
I don't use Sinatra, I use Rails, but I had the same problem and solved it like this:
# The original shared-connection hack, without a pool (kept here commented out):
#
# class ActiveRecord::Base
#   mattr_accessor :shared_connection
#   @@shared_connection = nil
#
#   def self.connection
#     @@shared_connection || retrieve_connection
#   end
# end
#
# ActiveRecord::Base.shared_connection = ActiveRecord::Base.connection

# The version that fixed it for me: wrap the connection in a
# ConnectionPool::Wrapper so access to it is serialized.
class ActiveRecord::Base
  mattr_accessor :shared_connection
  @@shared_connection = nil

  def self.connection
    @@shared_connection || ConnectionPool::Wrapper.new(:size => 1) { retrieve_connection }
  end
end

ActiveRecord::Base.shared_connection = ActiveRecord::Base.connection
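ConnectionPool::Wrapper comes from the connection_pool gem, so that gem needs to be in your Gemfile and required. For reference, a minimal sketch of the gem used on its own, outside ActiveRecord (the host and credentials are placeholders):

require 'connection_pool'
require 'mysql2'

# Ten lazily created clients, shared safely across threads.
MYSQL_POOL = ConnectionPool.new(size: 10, timeout: 5) do
  Mysql2::Client.new(host: 'localhost', username: 'root', database: 'test')
end

MYSQL_POOL.with do |client|
  client.query('SELECT 1')
end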

You have to use a connection pool. mysql2 allows the queries to be async, but you can still only send one query at a time to MySQL through one connection. If you send multiple queries through one connection, you get the "waiting for a result" error.
Use a connection pool and you should be fine.
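One simple way to make sure each concurrent request checks its own connection out of ActiveRecord's pool is with_connection; a minimal sketch, assuming the Sinatra route and the establish_connection(:pool => 10) setup from the first answer:

get '/db' do
  ActiveRecord::Base.connection_pool.with_connection do |conn|
    conn.execute("SELECT SLEEP(1)")
  end
end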

Related

How to connect searchkick (in a Rails app &/ Sidekiq job) to multiple elasticsearch clusters without stomping on global searchkick config?

Upon startup my app sets my (?global?) searchkick client to point at my default elasticsearch cluster:
Searchkick.client = Elasticsearch::Client.new(
  hosts: default_cluster, # this is the list of hosts in my default cluster
  retry_on_failure: true,
)
However, I am upgrading my cluster (again), and while that is happening I'd like my app's reads/searches,
/search?q="some term"
# =>
Model.search("some term")
to continue to work against the default_cluster.
Where it starts to get a bit tricky is that:
I'd also like (via some specific ?sidekiq background jobs?) to fill an alternate (alt) cluster's index, something like:
Model.connect_to(alternate_cluster) { |client|
  Searchkick.client = client
  Model.reindex
}
Without causing all other background jobs to interact with the alternate cluster.
And, of course:
I'd like some way to verify that the alternate_cluster is working well (i.e. for search) before making it my default_cluster. And presumably via some admin route:
/admin/search?q="some search term"&cluster=alternate
# =>
Model.connect_to(alternate_cluster) { |client|
  Searchkick.client = client
  Model.search("some term")
}
And finally:
I'd like to avoid having to reconnect before every search/reindex action; i.e., I'd prefer not to have the overhead of switching clients each time (also because that probably implies that long-running tasks that keep reconnecting to searchkick will be swapping back and forth from one cluster to the other):
Model.search("some term")
# =>
Model.connect_to(alternate_cluster) { |client|
  Searchkick.client = client
  Model.search("some term")
}
^ I don't want that
FWIW, the best I've been able to come up with so far is something like:
def self.connect_to(current_cluster, &block)
  previous_es_client = Searchkick.client
  current_es_client  = Elasticsearch::Client.new(
    hosts: current_cluster,
    retry_on_failure: true,
  )
  block.call(current_es_client)
rescue Exception => e
  logger.warn(e)
ensure
  Searchkick.client = previous_es_client
end
But, I suspect that will cause every other interaction within my system (via the same web-worker or other background jobs running in the same background-worker-instance) to (temporarily) point at the alternate cluster.
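To illustrate the concern: Searchkick.client looks like a single process-wide setting, so swapping it is visible to every other thread in the same worker. A rough, untested sketch of the race, with a placeholder cluster URL:

require 'searchkick'

default_client   = Searchkick.client
alternate_client = Elasticsearch::Client.new(hosts: ['http://alt-es:9200'], retry_on_failure: true)

other_job = Thread.new do
  sleep 0.1
  # Whichever client is assigned globally at this moment is the one this thread talks to.
  Searchkick.client.ping
end

Searchkick.client = alternate_client   # now the other thread's ping goes to the alternate cluster
other_job.join
Searchkick.client = default_client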
Thanks in advance for your assistance...

How to share connection pool with multiprocessing Python

I'm trying to share a psycopg2.pool.(Simple/Threaded)ConnectionPool among multiple python processes. What is the correct way to approach this?
I am using Python 2.7 and Postgres 9.
I would like to provide some context. The reason I want to use a connection pool is that I am running an arbitrary but large (>80) number of processes that initially query the database for results, perform some actions, and then update the database with the results of those actions.
So far, I've tried to use multiprocessing.managers.BaseManager and pass the pool to child processes so that the connections that are in use/unused are synchronized across the processes.
from multiprocessing import Manager, Process
from multiprocessing.managers import BaseManager
from psycopg2.pool import SimpleConnectionPool

PASSWORD = 'xxxx'

def f(connection, connection_pool):
    with connection.cursor() as curs:  # ** LINE REFERENCED BELOW
        curs.execute('SELECT * FROM database')
    connection_pool.putconn(connection)

BaseManager.register('SimpleConnectionPool', SimpleConnectionPool)
manager = BaseManager()
manager.start()

conn_pool = manager.SimpleConnectionPool(5, 20, dbname='database', user='admin',
                                         host='xxxx', password=PASSWORD, port='8080')

with conn_pool.getconn() as conn:
    print conn  # prints '<connection object at 0x7f48243edb48; dsn:'<unintialized>', closed:0>
    proc = Process(target=f, args=(conn, conn_pool))
    proc.start()
    proc.join()
** raises an error, 'Operational Error: asynchronous connection attempt underway'
If anyone could recommend a method to share the connection pool with numerous processes it would be greatly appreciated.
You don't. You can't pass a socket to another process. The other process must open the connection itself.

Using a config block at the top of the file throws error

I'm trying to set up a connection to the database (using Sequel) before the model is loaded. It has to happen in that order, but I am getting an error:
undefined method `configure' for main:Object (NoMethodError)
Here is the code. I don't see anything wrong with setting up the constants there, so perhaps it is something related to either the configure block or the config.ru.
require 'sinatra/base'
require 'sequel'
require 'slim'
require 'sass'
require 'sinatra/flash'
require './sinatra/auth'
configure :development do
  password = ENV["PGPASSWORD"]
  DB = Sequel.postgres('development', user: 'postgres', password: password, host: 'localhost')
end

configure :production do
  DB = Sequel.connect(ENV['DATABASE_URL'])
end
Here is the rackup file. I tried to do the connect statement in there as well, but failed (so far):
require 'sinatra/base'
require './main'
require './song'
require 'sequel'
map('/songs') { run SongController }
map('/') { run Website}
I don't understand why the configure block will not work.
Edit: I'm guessing that because the call to SongController is in config.ru, the connect statements need to be in there as well.
Edit: And further along, since this is a modular app, a config.yml is probably my best option.
You're using sinatra/base. That means you'll have to use a subclass:
require 'sinatra/base'
require 'sequel'
require 'slim'
require 'sass'
require 'sinatra/flash'
require './sinatra/auth'
class MyApp < Sinatra::Base
  configure :development do
    password = ENV["PGPASSWORD"]
    DB = Sequel.postgres('development', user: 'postgres', password: password, host: 'localhost')
  end

  configure :production do
    DB = Sequel.connect(ENV['DATABASE_URL'])
  end

  run! if app_file == $0
end
NB: You can just use require 'sinatra' and all the magic will be available without subclassing. Or, if you need a modular app, subclass Sinatra::Application and you will have all the magic included. See Sinatra's README for full coverage of the differences.
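For comparison, here is a classic-style sketch where a top-level configure block works the way the question expected (the songs table is hypothetical):

require 'sinatra'   # classic style: the DSL methods live at the top level
require 'sequel'

configure :development do
  DB = Sequel.postgres('development', user: 'postgres', password: ENV['PGPASSWORD'], host: 'localhost')
end

configure :production do
  DB = Sequel.connect(ENV['DATABASE_URL'])
end

get '/songs' do
  "#{DB[:songs].count} songs"
end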

uWSGI, Flask, sqlalchemy, and postgres: SSL error: decryption failed or bad record mac

I'm trying to set up an application webserver using uWSGI + Nginx, which runs a Flask application that uses SQLAlchemy to communicate with a Postgres database.
When I make requests to the webserver, every other response will be a 500 error.
The error is:
Traceback (most recent call last):
  File "/var/env/argos/lib/python3.3/site-packages/sqlalchemy/engine/base.py", line 867, in _execute_context
    context)
  File "/var/env/argos/lib/python3.3/site-packages/sqlalchemy/engine/default.py", line 388, in do_execute
    cursor.execute(statement, parameters)
psycopg2.OperationalError: SSL error: decryption failed or bad record mac

The above exception was the direct cause of the following exception:

sqlalchemy.exc.OperationalError: (OperationalError) SSL error: decryption failed or bad record mac
The error is triggered by a simple Flask-SQLAlchemy method:
result = models.Event.query.get(id)
uwsgi is being managed by supervisor, which has a config:
[program:my_app]
command=/usr/bin/uwsgi --ini /etc/uwsgi/apps-enabled/myapp.ini --catch-exceptions
directory=/path/to/my/app
stopsignal=QUIT
autostart=true
autorestart=true
and uwsgi's config looks like:
[uwsgi]
socket = /tmp/my_app.sock
logto = /var/log/my_app.log
plugins = python3
virtualenv = /path/to/my/venv
pythonpath = /path/to/my/app
wsgi-file = /path/to/my/app/application.py
callable = app
max-requests = 1000
chmod-socket = 666
chown-socket = www-data:www-data
master = true
processes = 2
no-orphans = true
log-date = true
uid = www-data
gid = www-data
The furthest that I can get is that it has something to do with uwsgi's forking. But beyond that I'm not clear on what needs to be done.
The issue ended up being uwsgi's forking.
When working with multiple processes with a master process, uwsgi initializes the application in the master process and then copies the application over to each worker process. The problem is if you open a database connection when initializing your application, you then have multiple processes sharing the same connection, which causes the error above.
The solution is to set the lazy configuration option for uwsgi, which forces a complete loading of the application in each process:
lazy
Set lazy mode (load apps in workers instead of master).
This option may have memory usage implications as Copy-on-Write semantics can not be used. When lazy is enabled, only workers will be reloaded by uWSGI’s reload signals; the master will remain alive. As such, uWSGI configuration changes are not picked up on reload by the master.
There's also a lazy-apps option:
lazy-apps
Load apps in each worker instead of the master.
This option may have memory usage implications as Copy-on-Write semantics can not be used. Unlike lazy, this only affects the way applications are loaded, not master’s behavior on reload.
This uwsgi configuration ended up working for me:
[uwsgi]
socket = /tmp/my_app.sock
logto = /var/log/my_app.log
plugins = python3
virtualenv = /path/to/my/venv
pythonpath = /path/to/my/app
wsgi-file = /path/to/my/app/application.py
callable = app
max-requests = 1000
chmod-socket = 666
chown-socket = www-data:www-data
master = true
processes = 2
no-orphans = true
log-date = true
uid = www-data
gid = www-data
# the fix
lazy = true
lazy-apps = true
As an alternative, you might dispose of the engine. This is how I solved the problem.
Such issues may happen if there is a query during the creation of the app, that is, in the module that creates the app itself. If that happens, the engine allocates a pool of connections and then uWSGI forks.
By invoking engine.dispose(), the connection pool itself is closed and new connections will come up as soon as someone starts making queries again. So if you do that at the end of the module where you create your app, new connections will be created after the uWSGI fork.
I am running a flask app using gunicorn on Heroku. My application started exhibiting this problem when I added the --preload option to my Procfile. When I removed that option, my application resumed functioning as normal.
Not sure whether to add this as an answer to this question or ask a separate question and put this as an answer there. I was getting this exact same error for reasons that are slightly different from the people who have posted and answered. In my setup, I was using gunicorn as a WSGI server for a Flask application. In this application, I was offloading some intense database operations to a celery worker. The error would come from the celery worker.
From reading a lot of the answers here and looking at the psycopg2 as well as SQLAlchemy session documentation, it became apparent to me that it is a bad idea to share an SQLAlchemy session between separate processes (the gunicorn worker and the celery worker in my case).
What ended up solving this for me was creating a new session in the celery worker function so it used a new session each time it was called, and also destroying the session after every web request so Flask used a session per request. The overall solution looked like this:
Flask_app.py

@app.teardown_appcontext
def shutdown_session(exception=None):
    session.close()

celery_func.py

@celery_app.task(bind=True, throws=(IntegrityError))
def access_db(self, entity_dict, tablename):
    with Session() as session:
        try:
            session.add(ORM_obj)
            session.commit()
        except IntegrityError as e:
            session.rollback()
            print('primary key violated')
            raise e

Config CarrierWave with Mongoid - GridFS

I am running into trouble trying to use CarrierWave for a file-upload REST API developed in Rails 3, with a MongoDB database.
What I would like to do is store some files (not only images but any file format) with MongoDB's GridFS system.
I read a lot of documentation recommending the CarrierWave gem.
But I get an error when I try to configure it.
My development environment:
The Gemfile:
source 'https://rubygems.org'
gem 'rails', '3.2.8'
# MongoDB
gem 'mongoid', :git => 'git://github.com/mongoid/mongoid.git'
gem 'carrierwave', :git => "git://github.com/jnicklas/carrierwave.git"
# gem 'carrierwave-mongoid', :require => 'carrierwave/mongoid'
gem 'mini_magick', :git => 'git://github.com/probablycorey/mini_magick.git'
gem 'bson_ext'
gem 'json'
The application.rb:
require File.expand_path('../boot', __FILE__)
# ActiveRecord will not be used with MongoDB
# require 'rails/all'
require "action_controller/railtie"
require "action_mailer/railtie"
require "active_resource/railtie"
require "rails/test_unit/railtie"
require "sprockets/railtie"
require "mongoid/railtie"
require "carrierwave"
# require "carrierwave/mongoid"
I define the database with a mongoid.yml (config/mongoid.yml) file:
development:
  sessions:
    default:
      database: lf_rest_api_development
      hosts:
        - localhost:27017
      options:
        consistency: :strong
  options:

test:
  sessions:
    default:
      database: lf_rest_api_test
      hosts:
        - localhost:27017
      options:
        consistency: :strong
And load it with an initializer (config/initializers/mongoid.rb):
Mongoid.load!("config/mongoid.yml")
I can execute the "rails server" command without problems up to this point. Then comes the last file, config/initializers/carrierwave.rb:
CarrierWave.configure do |config|
  config.grid_fs_database = Mongoid.database.name
  config.grid_fs_host = Mongoid.config.master.connection.host
  config.storage = :grid_fs
  config.grid_fs_access_url = "/files"
end
And then I get the following error when I run the "rails server" command:
=> Booting WEBrick
=> Rails 3.2.8 application starting in development on http://0.0.0.0:3000
=> Call with -d to detach
=> Ctrl-C to shutdown server
Exiting
/{API_path}/config/initializers/zcarrierwave.rb:4:in `block in <top (required)>': undefined method `database' for Mongoid:Module (NoMethodError)
[...]
My file model is defined as follows:
require 'carrierwave/orm/mongoid'
class File
  include Mongoid::Document

  store_in collection: "files", database: "lf_rest_api_developement", session: "default"

  key :filename, type: String
  key :content_type, type: String
  key :length, type: BigDecimal
  key :chunk_size, type: Integer, :default => 256
  key :upload_date, type: DateTime
  key :md5, type: String
  key :metadata, type: Array, :default => []

  mount_uploader :file, FileUploader

  index({ location: "2d" }, { min: -200, max: 200 })
end
The FileUploader is just an extension of CarrierWave uploader...
class FileUploader < CarrierWave::Uploader::Base
  storage :grid_fs
end
Sorry about the slow response. Firstly, the reason for your error is that Mongoid 3 no longer supports Mongoid.database. You can now find these configurations in the Mongoid::Config.sessions[:default] object.
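For example, the same settings can be read roughly like this (a sketch based on the mongoid.yml above; adjust the session name if yours differs):

Mongoid::Config.sessions[:default][:database]  # => "lf_rest_api_development"
Mongoid::Config.sessions[:default][:hosts]     # => ["localhost:27017"]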
BUT THIS AIN'T GONNA FIX YOUR PROBLEM! Mongoid 3 has no GridFS support at all. From mongoid docs:
No GridFS Support
GridFS is marketed as a core database feature, when in fact it is not. It is simply a pattern for storing chunked file data as documents in a collection, just like any other document. The implementation of this behaviour is handled in the client drivers, not in the core database itself, which can lead to discrepancies in how this is handled across platforms.
Even if having this behaviour in the client is acceptable, the effects of this on application performance where you are not just storing file data is quite large. Since files are stored as documents, they consume RAM just as any other document in the database would, and can easily cause memory consumption on your server to max out. There are also limitations in chunking the data, such as you do not have the ability to update a file - you must delete the file and replace it with a new one.
Given this, we did not prioritize any work with GridFS at the front, but there is a gem in the pipeline for those who can wait a bit to upgrade. In the meantime you have a few options...
So rather than seek other ways to store uploads in GridFS at the expense of performance, I would suggest just throwing them in a SQL database. If you're using Mongo as your only database, don't be put off by this option. It's not very difficult to get ActiveRecord and Mongoid working together side-by-side. But from my experience, uploading binary objects to any database may not perform well. I would personally use a filesystem for storage, with CarrierWave or Paperclip taking care of the management. Alternatively, I would suggest checking out some cheap cloud storage options. You can use something like aws-s3, a great service. It also has very well documented compatibility with CarrierWave.
If you are determined to use GridFS, I would check out the mongoid-grid_fs gem or check out some alternative ruby MongoDB drivers on the 10gen website.
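For reference, the mongoid-grid_fs gem's own API is quite small; a rough, untested sketch of direct usage (the file path is just a placeholder):

# Gemfile: gem 'mongoid-grid_fs'
grid_fs = Mongoid::GridFs

file = grid_fs.put('/path/to/report.pdf')  # store a file; returns a GridFS file document
grid_fs.get(file.id).data                  # read the stored bytes back
grid_fs.delete(file.id)                    # remove it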
This is my first time answering a question so I hope I'm doing this right.
I was struggling with the same issue uploading an image using CarrierWave in my Rails application with Mongoid 3. I believe I have a solution (at least I got it working locally on my laptop). Here is what I came up with:
Add the carrierwave-mongoid gem to your Gemfile with the branch mongoid-3.0. This gem uses mongoid-grid_fs:
# Image Uploading
gem "carrierwave-mongoid", :git => "git://github.com/jnicklas/carrierwave-mongoid.git", :branch => "mongoid-3.0"
Make an initializer for CarrierWave:
#config/initializers/carrierwave.rb
CarrierWave.configure do |config|
  config.storage = :grid_fs

  # Storage access url
  config.grid_fs_access_url = "/upload/grid"
end
I know I didn't set config.grid_fs_database or config.grid_fs_host. This seems to work locally (on my laptop); I haven't tried it with a remote GridFS database.
Mounting looks normal:
#app/models/user.rb
class User
  include Mongoid::Document
  mount_uploader :avatar, AvatarUploader
end
Uploader is also standard:
#app/uploaders/avatar_uploader
class AvatarUploader < CarrierWave::Uploader::Base
  include CarrierWave::MiniMagick

  def store_dir
    "#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
  end

  version :thumb do
    process :resize_to_limit => [200, 200]
  end
end
Create a controller for GridFS:
#app/controllers/gridfs_controller.rb
class GridfsController < ApplicationController
  def serve
    gridfs_path = env["PATH_INFO"].gsub("/upload/grid/", "")
    begin
      gridfs_file = Mongoid::GridFS[gridfs_path]
      self.response_body = gridfs_file.data
      self.content_type = gridfs_file.content_type
    rescue
      self.status = :not_found
      self.content_type = 'text/plain'
      self.response_body = ''
    end
  end
end
and add the route to the routes file:
#config/routes.rb
match "/upload/grid/*path" => "gridfs#serve"
Hope this helps.