I have a Python application that creates a number of threads for a job. Each thread connects to MongoDB and retrieves data. The number of allowed connections to MongoDB is 200, which I enforce with a semaphore. Once its query is done, each thread closes its MongoDB connection. But when I run the application I get the same error in every thread. The error is:
Traceback (most recent call last):
File "C:\Python34\lib\threading.py", line 921, in _bootstrap_inner
self.run()
File "C:\Python34\lib\threading.py", line 869, in run
self._target(*self._args, **self._kwargs)
File "C:/path/pytest/under_construction/testAlgo.py", line 95, in sample_thread
status=monObj.process_status(list_value1,list_value2,5,120,120)
File "C:\path\pytest\under_construction\mongo_lib.py", line 153, in process_status
result=self.mongo_result('Submission','find',q={})
File "C:\path\pytest\under_construction\mongo_lib.py", line 53, in mongo_result
result=list(_query[query_type.lower()](query_string[keys]))
File "C:\Python34\lib\site-packages\pymongo\cursor.py", line 1076, in __next__
if len(self.__data) or self._refresh():
File "C:\Python34\lib\site-packages\pymongo\cursor.py", line 1037, in _refresh
limit, self.__id))
File "C:\Python34\lib\site-packages\pymongo\cursor.py", line 933, in __send_message
res = client._send_message_with_response(message, **kwargs)
File "C:\Python34\lib\site-packages\pymongo\mongo_client.py", line 1205, in _send_message_with_response
response = self.__send_and_receive(message, sock_info)
File "C:\Python34\lib\site-packages\pymongo\mongo_client.py", line 1182, in __send_and_receive
return self.__receive_message_on_socket(1, request_id, sock_info)
File "C:\Python34\lib\site-packages\pymongo\mongo_client.py", line 1174, in __receive_message_on_socket
return self.__receive_data_on_socket(length - 16, sock_info)
File "C:\Python34\lib\site-packages\pymongo\mongo_client.py", line 1153, in __receive_data_on_socket
chunk = sock_info.sock.recv(length)
MemoryError
Code for creating the MongoDB connection:
client=MongoClient(mc_name,port)
I was thinking, is this error due to results of all threads accumulating at one port of machine running my application?
MongoClient is a thread-safe connection pool, so you should be creating a single instance that's shared by all the worker threads rather than having each thread create its own.
The connection pool size defaults to 100, but if you want to make it even larger you can use the maxPoolSize parameter to do that (e.g. maxPoolSize=200).
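As a rough illustration of the "one shared client" advice, here is a minimal sketch. The helper name run_workers is hypothetical, and the client is passed in as a plain argument so that the pattern is visible without a running server. The commented usage at the bottom assumes PyMongo is installed; mydb is a placeholder database name.

```python
import threading


def run_workers(client, n_threads, work):
    """Run work(client, index) on n_threads threads that all share one client."""
    results = [None] * n_threads

    def target(i):
        # Every thread uses the same client object; nothing is created per thread.
        results[i] = work(client, i)

    threads = [threading.Thread(target=target, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


# Assumed usage with PyMongo (mydb is a placeholder; mc_name and port are the
# values from the question):
#
#   from pymongo import MongoClient
#   client = MongoClient(mc_name, port, maxPoolSize=200)
#   run_workers(client, 50, lambda c, i: c.mydb.Submission.find_one())
```

With this layout the pool inside the client limits the number of open sockets, so a separate semaphore around connection creation should not be needed.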
We get the following error when we try to push backup using wal-e:
2020-07-16T21:18:55Z <Greenlet at 0x7f2a59fadc48: <wal_e.worker.upload.PartitionUploader object at 0x7f2a59f96cc0>([ExtendedTarInfo(submitted_path='/var/lib/postgres)> failed with PermissionError
wal_e.operator.backup WARNING MSG: blocking on sending WAL segments
DETAIL: The backup was not completed successfully, but we have to wait anyway. See README: TODO about pg_cancel_backup
STRUCTURED: time=2020-07-16T21:18:55.651073-00 pid=19697
wal_e.main CRITICAL MSG: An unprocessed exception has avoided all error handling
DETAIL: Traceback (most recent call last):
File "/var/lib/postgresql/virtualenvs/wal-e/lib/python3.5/site-packages/wal_e/operator/backup.py", line 197, in database_backup
**kwargs)
File "/var/lib/postgresql/virtualenvs/wal-e/lib/python3.5/site-packages/wal_e/operator/backup.py", line 500, in _upload_pg_cluster_dir
pool.put(tpart)
File "/var/lib/postgresql/virtualenvs/wal-e/lib/python3.5/site-packages/wal_e/worker/upload_pool.py", line 108, in put
self._wait()
File "/var/lib/postgresql/virtualenvs/wal-e/lib/python3.5/site-packages/wal_e/worker/upload_pool.py", line 65, in _wait
raise val
File "src/gevent/greenlet.py", line 766, in gevent._greenlet.Greenlet.run
File "/var/lib/postgresql/virtualenvs/wal-e/lib/python3.5/site-packages/wal_e/worker/upload.py", line 96, in __call__
gpg_key=self.gpg_key) as pl:
File "/var/lib/postgresql/virtualenvs/wal-e/lib/python3.5/site-packages/wal_e/pipeline.py", line 92, in __enter__
self.stdin = pipebuf.NonBlockBufferedWriter(stdin)
File "/var/lib/postgresql/virtualenvs/wal-e/lib/python3.5/site-packages/wal_e/pipebuf.py", line 225, in __init__
_setup_fd(self._fd)
File "/var/lib/postgresql/virtualenvs/wal-e/lib/python3.5/site-packages/wal_e/pipebuf.py", line 62, in _setup_fd
set_buf_size(fd)
File "/var/lib/postgresql/virtualenvs/wal-e/lib/python3.5/site-packages/wal_e/pipebuf.py", line 53, in set_buf_size
fcntl.fcntl(fd, fcntl.F_SETPIPE_SZ, OS_PIPE_SZ)
PermissionError: [Errno 1] Operation not permitted
It's not clear why the fcntl call would raise PermissionError.
PostgreSQL version: 9.6, Python: 3.5, WAL-E: 1.1.1 (also tried 1.0.3 and 1.1.0).
It was working previously and stopped working at some point (without any noticeable changes).
Well, I'm late to the game. See https://github.com/wal-e/wal-e/issues/270
I've worked around it by patching wal-e so that it no longer sets the pipe buffer size (the fcntl F_SETPIPE_SZ call at the bottom of the traceback).
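A gentler variant of that workaround is to keep the call but tolerate a refusal. The sketch below is not wal-e's code: the function names are hypothetical, and it assumes Linux, where the kernel can reject F_SETPIPE_SZ with EPERM (for example when an unprivileged process asks for more than the configured pipe limits allow). In that case the pipe simply keeps its default size.

```python
import fcntl
import os

# Linux-specific fcntl command. The numeric fallback 1031 is the Linux value,
# used only when this Python build does not export the constant.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)


def try_set_pipe_size(fd, size):
    """Ask the kernel for a pipe buffer of `size` bytes; return False if refused."""
    try:
        fcntl.fcntl(fd, F_SETPIPE_SZ, size)
        return True
    except OSError:  # PermissionError (EPERM) is a subclass of OSError
        return False


def open_pipe(preferred_size=1024 * 1024):
    """Create a pipe and try to enlarge it, keeping the default size on failure."""
    read_fd, write_fd = os.pipe()
    resized = try_set_pipe_size(write_fd, preferred_size)
    return read_fd, write_fd, resized
```

The pipe stays usable whether or not the resize succeeds, so the only cost of a refusal is a smaller buffer.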
Periodically all my Celery workers get stuck on something. I cannot figure out what is causing this, because inspect doesn't work while all the workers are busy.
celery inspect active
Error: No nodes replied within time constraint
Is it possible to get Celery status, like active tasks, even if nodes are doing something (that seems to be causing problems)? Can I somehow spin up a temporary worker just to get inspect output?
What other strategies are there for diagnosing this issue?
Celery 4.x. Redis backend.
This turned out to be a deadlock issue with Celery + gevent (evil monkey patch) + Sentry's Raven logger.
https://github.com/getsentry/raven-python/issues/305
To diagnose such issues:
You can start Celery workers with different queue and node name parameters (-Q, -n) and see when workers hang. Even if some worker groups are hung, the others may still respond to inspect queries.
The Celery log files may reveal the error:
2017-02-27 08:36:34,371 CRITI [celery.worker][DummyThread-6] Unrecoverable error: AttributeError("'NoneType' object has no attribute 'readline'",)
Traceback (most recent call last):
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/celery/worker/worker.py", line 203, in start
self.blueprint.start(self)
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/celery/bootsteps.py", line 370, in start
return self.obj.start()
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", line 318, in start
blueprint.start(self)
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/celery/bootsteps.py", line 119, in start
step.start(parent)
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/celery/worker/consumer/consumer.py", line 594, in start
c.loop(*c.loop_args())
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/celery/worker/loops.py", line 118, in synloop
connection.drain_events(timeout=2.0)
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/kombu/connection.py", line 301, in drain_events
return self.transport.drain_events(self.connection, **kwargs)
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/kombu/transport/virtual/base.py", line 961, in drain_events
get(self._deliver, timeout=timeout)
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/kombu/transport/redis.py", line 359, in get
ret = self.handle_event(fileno, event)
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/kombu/transport/redis.py", line 341, in handle_event
return self.on_readable(fileno), self
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/kombu/transport/redis.py", line 337, in on_readable
chan.handlers[type]()
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/kombu/transport/redis.py", line 714, in _brpop_read
**options)
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/redis/client.py", line 585, in parse_response
response = connection.read_response()
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/redis/connection.py", line 577, in read_response
response = self._parser.read_response()
File "/srv/pyramid/xxx/venv/lib/python3.5/site-packages/redis/connection.py", line 238, in read_response
response = self._buffer.readline()
AttributeError: 'NoneType' object has no attribute 'readline'
I'm using mongo-connector and neo4j_doc_manager to sync MongoDB data to Neo4j. It used to work perfectly, but today it started giving the following error.
2016-07-29 17:18:59,558 [CRITICAL] mongo_connector.oplog_manager:549 - Exception during collection dump
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/mongo_connector/oplog_manager.py", line 501, in do_dump
upsert_all(dm)
File "/usr/local/lib/python2.7/site-packages/mongo_connector/oplog_manager.py", line 485, in upsert_all
dm.bulk_upsert(docs_to_dump(namespace), mapped_ns, long_ts)
File "/usr/local/lib/python2.7/site-packages/mongo_connector/util.py", line 38, in wrapped
reraise(new_type, exc_value, exc_tb)
File "/usr/local/lib/python2.7/site-packages/mongo_connector/util.py", line 32, in wrapped
return f(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/mongo_connector/doc_managers/neo4j_doc_manager.py", line 89, in bulk_upsert
tx.commit()
File "/usr/local/lib/python2.7/site-packages/py2neo/cypher/core.py", line 306, in commit
return self.post(self.__commit or self.__begin_commit)
File "/usr/local/lib/python2.7/site-packages/py2neo/cypher/core.py", line 261, in post
raise self.error_class.hydrate(error)
File "/usr/local/lib/python2.7/site-packages/py2neo/cypher/error/core.py", line 54, in hydrate
error_cls = getattr(error_module, title)
Neo4jOperationFailed: 'module' object has no attribute 'ConstraintValidationFailed'
2016-07-29 17:18:59,563 [ERROR] mongo_connector.oplog_manager:557 - OplogThread: Failed during dump collection cannot recover!
You're trying to insert data that violates a constraint in your Neo4j schema (uniqueness or existence), and apparently the code doesn't know how to handle that type of error, though it does give its name:
ConstraintValidationFailed
Consider enabling logging to see the data it is trying to insert, or the Cypher query it is trying to execute.
We recently updated MongoDB from 2.6 to 3.0. Since then, we are having trouble using PyMongo in combination with Multiprocessing.
The issue is that sometimes an operation (e.g. find) within a process hangs for ~30 seconds and then throws an exception "ServerSelectionTimeoutError: No servers found yet".
The behavior seems independent from the input, as our script usually runs just fine for a few times and then hangs randomly.
The log files do not show any entries related to timeouts, nor did I find any useful information on the Internet about this issue.
The script is running in our test environment, meaning there are no replica sets involved and the Mongo instance is bound to localhost.
Here is the stack trace for completeness:
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "somescript.py", line 109, in run
self.find_incoming_cc()
File "somescript.py", line 370, in find_incoming_cc
{'_id': 1, 'cc': 1}
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 983, in next
if len(self.__data) or self._refresh():
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 908, in _refresh
self.__read_preference))
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 813, in __send_message
**kwargs)
File "/usr/local/lib/python2.7/dist-packages/pymongo/mongo_client.py", line 728, in _send_message_with_response
server = topology.select_server(selector)
File "/usr/local/lib/python2.7/dist-packages/pymongo/topology.py", line 121, in select_server
address))
File "/usr/local/lib/python2.7/dist-packages/pymongo/topology.py", line 97, in select_servers
self._error_message(selector))
ServerSelectionTimeoutError: No servers found yet
Now for the question: Is there any known issue/bug when using PyMongo with Multiprocessing? Is there a way to debug the exception?
Thanks for any help!
This is a bug in PyMongo 3.0.x. Bug report: https://jira.mongodb.org/browse/PYTHON-961
Workaround for this issue (tested with PyMongo 3.0.3):
Pass connect=False when initialising the MongoClient object:
MongoClient(uri, connect=False)
Or simply wait a few seconds before creating the MongoClient instance in the child process (e.g. time.sleep(2)):
import multiprocessing
import time

from pymongo import MongoClient

def start(uri):
    # Give the child process a moment before connecting.
    time.sleep(2)
    mclient = MongoClient(uri)
    mclient.db.collection.find_one()

if __name__ == '__main__':
    p = multiprocessing.Process(target=start, args=('mongodb://localhost:27017/',))
    p.start()
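A related pattern, separate from the two workarounds above, is to make sure each process builds its own client lazily instead of inheriting one from the parent. The sketch below is only an illustration: the PerProcess class is hypothetical, and the factory is whatever callable creates the client (with PyMongo that could be lambda: MongoClient(uri), which is an assumption here).

```python
import os


class PerProcess:
    """Hold one lazily created object per process, rebuilt after a fork."""

    def __init__(self, factory):
        self._factory = factory
        self._obj = None
        self._pid = None

    def get(self):
        pid = os.getpid()
        # A different PID means we are in a child process that inherited the
        # parent's object, so build a fresh one instead of reusing it.
        if self._obj is None or self._pid != pid:
            self._obj = self._factory()
            self._pid = pid
        return self._obj


# Assumed usage with PyMongo:
#
#   from pymongo import MongoClient
#   clients = PerProcess(lambda: MongoClient('mongodb://localhost:27017/'))
#   clients.get().db.collection.find_one()
```

Calling get() from worker code rather than at import time means no client exists until the process that needs it asks for one.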
In what circumstances would redis-py raise the following AttributeError exception?
Isn't redis-py built by design to raise only redis.exceptions.RedisError based exceptions?
What would be a reasonable handling logic?
Traceback (most recent call last):
File "c:\Python27\Lib\threading.py", line 551, in __bootstrap_inner
self.run()
File "c:\Python27\Lib\threading.py", line 504, in run
self.__target(*self.__args, **self.__kwargs)
File "C:\Users\Administrator\Documents\my_proj\my_module.py", line 33, in inner
ret = protected_func(*args, **kwargs)
File "C:\Users\Administrator\Documents\my_proj\my_module.py", line 104, in _listen
for message in _pubsub.listen():
File "C:\Users\Administrator\virtual_environments\my_env\lib\site-packages\redis\client.py", line 1555, in listen
r = self.parse_response()
File "C:\Users\Administrator\virtual_environments\my_env\lib\site-packages\redis\client.py", line 1499, in parse_response
response = self.connection.read_response()
File "C:\Users\Administrator\virtual_environments\my_env\lib\site-packages\redis\connection.py", line 306, in read_response
response = self._parser.read_response()
File "C:\Users\Administrator\virtual_environments\my_env\lib\site-packages\redis\connection.py", line 104, in read_response
response = self.read()
File "C:\Users\Administrator\virtual_environments\my_env\lib\site-packages\redis\connection.py", line 89, in read
return self._fp.readline()[:-2]
AttributeError: 'NoneType' object has no attribute 'readline'
This seems like an old question, but I faced the same problem recently.
My setup was using celery with redis as a broker. A ThreadPoolExecutor uses the shared celery object to batch tasks to workers. The batcher function waits for the submitted tasks to finish using celery.result.ResultSet.
After a quick investigation, I found that Celery uses a pub/sub mechanism somewhere to wait for tasks to finish. And that is the problem: pub/sub objects are not safe to share between threads, per the official README https://github.com/andymccurdy/redis-py#thread-safety
Honestly, I didn't try to prove my theory and fixed my problem by switching to a ProcessPoolExecutor instead.
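If staying with threads is preferable, another option (equally unproven for this particular case) is to give every thread its own pub/sub object instead of sharing one. The sketch below uses threading.local for that. The PerThread class is hypothetical, and the factory would be something like lambda: redis_client.pubsub(), which assumes an existing redis-py client named redis_client.

```python
import threading


class PerThread:
    """Hand out one lazily created object per thread."""

    def __init__(self, factory):
        self._factory = factory
        self._local = threading.local()

    def get(self):
        obj = getattr(self._local, "obj", None)
        if obj is None:
            # First call in this thread: create an object only this thread sees.
            obj = self._factory()
            self._local.obj = obj
        return obj


# Assumed usage with redis-py (redis_client is a placeholder name):
#
#   pubsubs = PerThread(lambda: redis_client.pubsub())
#   pubsubs.get().subscribe('some-channel')
```

This only helps for code you control. When a library such as Celery creates the pub/sub object internally, moving the work to separate processes, as described above, may be the simpler fix.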