Cursor id not valid at server even with no_cursor_timeout=True - mongodb

Traceback (most recent call last):
File "from_mongo.py", line 27, in <module>
for sale in pm.events.find({"type":"sale", "date":{"$gt":now-(_60delta+_2delta)}}, no_cursor_timeout=True, batch_size=100):
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 968, in __next__
if len(self.__data) or self._refresh():
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 922, in _refresh
self.__id))
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 838, in __send_message
codec_options=self.__codec_options)
File "/usr/local/lib/python2.7/dist-packages/pymongo/helpers.py", line 110, in _unpack_response
cursor_id)
pymongo.errors.CursorNotFound: cursor id '1025112076089406867' not valid at server
I have also experimented with larger and smaller batch sizes, and with no no_cursor_timeout at all. I have even managed to get this error on a very small collection (200 documents with just an id and a title). It seems to happen when the database is not responsive (heavy inserts). The setup is a cluster of 3 shards, each a 3-member replica set (9 MongoDB instances), running MongoDB 3.0.

Based on the line numbers in your traceback, it looks like you're using PyMongo 3, which was released last week. Are you using multiple mongos servers in a sharded cluster? If so, the error is probably a symptom of a critical new bug in PyMongo 3:
https://jira.mongodb.org/browse/PYTHON-898
It will be fixed in PyMongo 3.0.1, which we'll release within a week.

It just hit me: I thought I was using PyMongo 3.0, which has the no_cursor_timeout=True flag, when in fact I was using 2.8.
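For anyone in the same situation on PyMongo 2.x: the keyword there is timeout=False rather than no_cursor_timeout=True (the PyMongo 3 spelling). A minimal sketch, reusing the names from the question (pm, now, _60delta, _2delta; process() is just a placeholder for the real per-document work):
# PyMongo 2.x: timeout=False disables the server-side cursor timeout
# (PyMongo 3 renamed this keyword argument to no_cursor_timeout=True).
cursor = pm.events.find(
    {"type": "sale", "date": {"$gt": now - (_60delta + _2delta)}},
    timeout=False,
).batch_size(100)
try:
    for sale in cursor:
        process(sale)  # placeholder for the real work on each document
finally:
    # No-timeout cursors are never reaped by the server, so always close them explicitly.
    cursor.close()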

Related

pymongo atlas loadbalancer error ARM device

I get a LoadBalancerSupportMismatch error when accessing my online Mongo database/cluster from an ARM device (Jetson Xavier) running the Ubuntu 18.04 Jetson image it came with. The code works on a normal x86 PC; on the Jetson it runs under Python 3.6 (I use 3.8 on the PC).
My code is straightforward. I anonymized parts of it.
self.online_client = MongoClient(
    f"mongodb+srv://<user>:<password>@<dbname>.pkphq.mongodb.net/Xcontainers?retryWrites=true&w=majority")
self.cloud_coll = self.online_client[<dbname>][<collection>]
self.cloud_coll.insert_one(some_dict)
The error I get on the Jetson is:
File "/usr/local/lib/python3.6/dist-packages/pymongo/collection.py", line 1319, in find_one
for result in cursor.limit(-1):
File "/usr/local/lib/python3.6/dist-packages/pymongo/cursor.py", line 1207, in next
if len(self.__data) or self._refresh():
File "/usr/local/lib/python3.6/dist-packages/pymongo/cursor.py", line 1100, in _refresh
self.__session = self.__collection.database.client._ensure_session()
File "/usr/local/lib/python3.6/dist-packages/pymongo/mongo_client.py", line 1816, in _ensure_session
return self.__start_session(True, causal_consistency=False)
File "/usr/local/lib/python3.6/dist-packages/pymongo/mongo_client.py", line 1766, in __start_session
server_session = self._get_server_session()
File "/usr/local/lib/python3.6/dist-packages/pymongo/mongo_client.py", line 1802, in _get_server_session
return self._topology.get_server_session()
File "/usr/local/lib/python3.6/dist-packages/pymongo/topology.py", line 499, in get_server_session
None)
File "/usr/local/lib/python3.6/dist-packages/pymongo/topology.py", line 217, in _select_servers_loop
(self._error_message(selector), timeout, self.description))
pymongo.errors.ServerSelectionTimeoutError: The server is being accessed through a load balancer, but this driver does not have load balancing enabled, full error: {'ok': 0, 'errmsg': 'The server is being accessed through a load balancer, but this driver does not have load balancing enabled', 'code': 354, 'codeName': 'LoadBalancerSupportMismatch'}, Timeout: 30s, Topology Description: <TopologyDescription id: 61ee9d768a646fd4a74f0849, topology_type: Single, servers: [<ServerDescription ('containers-lb.pkphq.mongodb.net', 27017) server_type: Unknown, rtt: None, error=OperationFailure("The server is being accessed through a load balancer, but this driver does not have load balancing enabled, full error: {'ok': 0, 'errmsg': 'The server is being accessed through a load balancer, but this driver does not have load balancing enabled', 'code': 354, 'codeName': 'LoadBalancerSupportMismatch'}",)>]>
[INFO] [1643027861.848502]: No id to push measurement
My attempts at resolving the issue brought me to pretty much the only thing that seemed relevant: https://www.mongodb.com/community/forums/t/scala-driver-2-9-0-connection-fails-with-loadbalancersupportmismatch/126525/2 . There it appeared that the Scala driver was out of date and needed updating via sbt or Maven: http://mongodb.github.io/mongo-java-driver/4.3/driver-scala/getting-started/installation/
I set up the hardware quite recently and it's up to date, so it's a bit puzzling that the driver wouldn't be.
Looking into the sbt and Maven documentation, it seems unrelated at worst and very complicated at best as a way to get PyMongo working properly with MongoDB Atlas again.
Is there a better solution to make the load balancer issue go away, or to get my driver up to date?
Had the same issue; fixed it by upgrading pymongo to version 3.12, since this document describes that as the minimum version for serverless clusters (which is my case).
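As a quick sanity check before connecting, you can print the installed driver version (a minimal sketch; the 3.12 minimum is taken from the answer above):
import pymongo

# Load-balanced / serverless Atlas endpoints need a driver with load-balancer support;
# per the answer above, PyMongo 3.12 is the stated minimum for serverless clusters.
print(pymongo.version)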
It seems upgrading the cluster worked. I first used an M0 - M2 cluster; paying more for an M10 somehow fixed the issue.

Mongo connector error: Unable to process oplog document

I am new to neo4j-doc-manager and I am trying to use it to mirror a collection from my MongoDB into a graph in Neo4j, as per:
https://neo4j.com/developer/mongodb/
I have my MongoDB and Neo4j instances running locally and I'm using the following command:
mongo-connector -m mongodb://localhost:27017/axa -t http://<user_name>:<password>@localhost:7474/C:/Users/user_name/.Neo4jDesktop/neo4jDatabases/database-c791fa15-9a0d-4051-bb1f-316ec9f1c7df/installation-4.0.3/data/ -d neo4j_doc_manager
However I get an error:
2020-04-17 15:49:47,011 [ERROR] mongo_connector.oplog_manager:309 - **Unable to process oplog document** {'ts': Timestamp(1587118784, 2), 't': 9, 'h': 0, 'v': 2, 'op': 'i', 'ns': 'axa.talks', 'ui': UUID('3245621e-e204-49fc-8350-d9950246fa6c'), 'wall': datetime.datetime(2020, 4, 17, 10, 19, 44, 994000), 'o': {'session': {'title': '12 Years of Spring: An Open Source Journey', 'abstract': 'Spring emerged as a core open source project in early 2003 and evolved to a broad portfolio of open source projects up until 2015.'}, 'topics': ['keynote', 'spring'], 'room': 'Auditorium', 'timeslot': 'Wed 29th, 09:30-10:30', 'speaker': {'name': 'Juergen Hoeller', 'bio': 'Juergen Hoeller is co-founder of the Spring Framework open source project.', 'twitter': 'https://twitter.com/springjuergen', 'picture': 'http://www.springio.net/wp-content/uploads/2014/11/juergen_hoeller-220x220.jpeg'}}}
Traceback (most recent call last):
File "c:\users\user_name\pycharmprojects\axa_experience\venv\lib\site-packages\py2neo\core.py", line 258, in get
response = self.__base.get(headers=headers, redirect_limit=redirect_limit, **kwargs)
File "c:\users\user_name\pycharmprojects\axa_experience\venv\lib\site-packages\py2neo\packages\httpstream\http.py", line 966, in get
return self.__get_or_head("GET", if_modified_since, headers, redirect_limit, **kwargs)
File "c:\users\user_name\pycharmprojects\axa_experience\venv\lib\site-packages\py2neo\packages\httpstream\http.py", line 943, in __get_or_head
return rq.submit(redirect_limit=redirect_limit, **kwargs)
File "c:\users\user_name\pycharmprojects\axa_experience\venv\lib\site-packages\py2neo\packages\httpstream\http.py", line 452, in submit
return Response.wrap(http, uri, self, rs, **response_kwargs)
File "c:\users\user_name\pycharmprojects\axa_experience\venv\lib\site-packages\py2neo\packages\httpstream\http.py", line 489, in wrap
raise inst
**py2neo.packages.httpstream.http.ClientError: 404 Not Found**
Versions used:
Python - 3.8
mongoDB - 4.2.5
neo4j - 4.0.3
I would really appreciate any help in this regard.
I was having the same problem and I think the issue has to do with the version of py2neo. Mongo connector only seems to work with py2neo 2.0.7, but Neo4j 4.0 doesn't work with that version. This is where I got stuck and found no solution. Maybe using Neo4j 3.0 could fix it, but that wouldn't work for me as I need 4.0 for a Fabric database. I've recently started looking into APOC procedures for MongoDB instead. Hope this was helpful.
The doc-manager library you are using requires the Mongo REST API to work, and in newer versions it no longer works. If you want to use it, use a Mongo version < 3.2 (which still has the REST API active).

Airflow scheduler is throwing out an error - 'DisabledBackend' object has no attribute '_get_task_meta_for'

I am trying to install Airflow (distributed mode) in WSL. I have set up the Airflow webserver, Airflow scheduler, Airflow worker, Celery (3.1) and RabbitMQ.
While running the Airflow scheduler, it throws the error below even though the result backend is set up.
ERROR
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/airflow/executors/celery_executor.py", line 92, in sync
state = task.state
File "/usr/local/lib/python3.6/dist-packages/celery/result.py", line 398, in state
return self._get_task_meta()['status']
File "/usr/local/lib/python3.6/dist-packages/celery/result.py", line 341, in _get_task_meta
return self._maybe_set_cache(self.backend.get_task_meta(self.id))
File "/usr/local/lib/python3.6/dist-packages/celery/backends/base.py", line 288, in get_task_meta
meta = self._get_task_meta_for(task_id)
AttributeError: 'DisabledBackend' object has no attribute '_get_task_meta_for'
https://issues.apache.org/jira/browse/AIRFLOW-1840
This is the exact error I am getting, but I couldn't find a solution there.
Result backend and broker configuration:
result_backend = db+postgresql://postgres:****@localhost:5432/postgres
broker_url = amqp://rabbitmq_user_name:rabbitmq_password@localhost/rabbitmq_virtual_host_name
Please help; I have gone through almost all the documentation but couldn't find a solution.
I was facing the same issue on Celery 3.1.26.post2 (with RabbitMQ, PostgreSQL and Airflow). The reason is that the settings dictionary built in Celery's base.py (lib/python3.5/site-packages/celery/app/base.py) does not capture the Celery backend under the key CELERY_RESULT_BACKEND; it only captures it under the key result_backend.
So the solution is to go to the _get_config function in that same base.py file and, at the end of the function before it returns the dictionary s, add the line below.
s['CELERY_RESULT_BACKEND'] = s['result_backend']  # code to be added
return s
This solved the problem.
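If you want to confirm whether the result backend is actually reaching Celery before patching anything, a minimal sketch (the URLs are illustrative; substitute your own broker and backend settings) is to build the Celery app directly and inspect app.backend; seeing DisabledBackend there means the setting was never picked up:
from celery import Celery

# Illustrative connection strings only; use your real broker/backend values.
app = Celery(
    broker="amqp://rabbitmq_user:rabbitmq_password@localhost/rabbitmq_vhost",
    backend="db+postgresql://postgres:postgres@localhost:5432/postgres",
)
print(type(app.backend))  # DisabledBackend here means no result backend was configured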

PyMongo AutoReconnect: timed out

I work in an Azure environment. I have a VM that runs a Django application (Open edX) and a Mongo server on another VM instance (Ubuntu 16.04). Whenever I try to load anything in the application (where the data is fetched from the Mongo server), I would get an error like this one:
Feb 23 12:49:43 xxxxx [service_variant=lms][mongodb_proxy][env:sandbox] ERROR [xxxxx 13875] [mongodb_proxy.py:55] - Attempt 0
Traceback (most recent call last):
File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/mongodb_proxy.py", line 53, in wrapper
return func(*args, **kwargs)
File "/edx/app/edxapp/edx-platform/common/lib/xmodule/xmodule/contentstore/mongo.py", line 135, in find
with self.fs.get(content_id) as fp:
File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/gridfs/__init__.py", line 159, in get
return GridOut(self.__collection, file_id)
File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/gridfs/grid_file.py", line 406, in __init__
self._ensure_file()
File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/gridfs/grid_file.py", line 429, in _ensure_file
self._file = self.__files.find_one({"_id": self.__file_id})
File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/pymongo/collection.py", line 1084, in find_one
for result in cursor.limit(-1):
File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/pymongo/cursor.py", line 1149, in next
if len(self.__data) or self._refresh():
File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/pymongo/cursor.py", line 1081, in _refresh
self.__codec_options.uuid_representation))
File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/pymongo/cursor.py", line 996, in __send_message
res = client._send_message_with_response(message, **kwargs)
File "/edx/app/edxapp/venvs/edxapp/local/lib/python2.7/site-packages/pymongo/mongo_client.py", line 1366, in _send_message_with_response
raise AutoReconnect(str(e))
AutoReconnect: timed out
First I thought it was because my Mongo server lived in an instance outside of the Django application's virtual network, but I created a new Mongo server on an instance inside the same virtual network and still got these errors. Mind you, I do receive the data eventually, but I wouldn't expect timed-out errors if the connection were healthy.
If it helps, here's the Ansible playbook that I used to create the Mongo server: https://github.com/edx/configuration/tree/master/playbooks/roles/mongo_3_2
Also I have tailed the Mongo log file and this is the only line that would appear at the same time I would get the timed out error on the application server:
2018-02-23T12:49:20.890+0000 [conn5] authenticate db: edxapp { authenticate: 1, user: "user", nonce: "xxx", key: "xxx" }
mongostat and mongotop don't show anything out of the ordinary, and I also captured htop output (screenshot not reproduced here).
I don't know what else to look for or how to fix this issue.
I forgot to change the Mongo server IPs in the Django application settings to point to the new private IP address inside the virtual network instead of the public IP. After I changed that, I don't get the issue anymore.
If you are reading this: make sure you make that private IP static in Azure if you are using it in the Django application settings.
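For anyone debugging a similar setup, a minimal connectivity check against the private IP (the address and timeout below are illustrative, and serverSelectionTimeoutMS requires PyMongo 3+):
from pymongo import MongoClient
from pymongo.errors import ConnectionFailure

# Illustrative private VNet address; replace with the Mongo server's static private IP.
client = MongoClient("mongodb://10.0.0.5:27017/", serverSelectionTimeoutMS=5000)
try:
    client.admin.command("ping")
    print("Mongo server reachable over the private network")
except ConnectionFailure as exc:
    print("Mongo server not reachable:", exc)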

Celery: Remote workers frequently losing connection

I have a Celery broker running on a cloud server (Django app), and two workers on local servers in my office connected behind a NAT. The local workers frequently lose their connection and have to be restarted to re-establish it. Usually celeryd restart hangs the first time I try it, so I have to Ctrl+C and retry once or twice to get it back up and connected. The two most common errors in the workers' logs are:
[2014-08-03 00:08:45,398: WARNING/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer.py", line 278, in start
blueprint.start(self)
File "/usr/local/lib/python2.7/dist-packages/celery/bootsteps.py", line 123, in start
step.start(parent)
File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer.py", line 796, in start
c.loop(*c.loop_args())
File "/usr/local/lib/python2.7/dist-packages/celery/worker/loops.py", line 72, in asynloop
next(loop)
File "/usr/local/lib/python2.7/dist-packages/kombu/async/hub.py", line 320, in create_loop
cb(*cbargs)
File "/usr/local/lib/python2.7/dist-packages/kombu/transport/base.py", line 159, in on_readable
reader(loop)
File "/usr/local/lib/python2.7/dist-packages/kombu/transport/base.py", line 142, in _read
raise ConnectionError('Socket was disconnected')
ConnectionError: Socket was disconnected
[2014-03-07 20:15:41,963: CRITICAL/MainProcess] Couldn't ack 11, reason:RecoverableConnectionError(None, 'connection already closed', None, '')
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/kombu/message.py", line 93, in ack_log_error
self.ack()
File "/usr/local/lib/python2.7/dist-packages/kombu/message.py", line 88, in ack
self.channel.basic_ack(self.delivery_tag)
File "/usr/local/lib/python2.7/dist-packages/amqp/channel.py", line 1583, in basic_ack
self._send_method((60, 80), args)
File "/usr/local/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 50, in _send_method
raise RecoverableConnectionError('connection already closed')
How do I go about debugging this? Is the fact that the workers are behind a NAT an issue? Is there a good tool to monitor whether the workers have lost connection? At least with that, I could get them back online by manually restarting the worker.
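One minimal way to monitor connectivity from the broker side is a periodic ping broadcast (a sketch, assuming a standard Celery app object; the app name, broker URL and timeout are illustrative): workers that have silently dropped their connection simply won't reply.
from celery import Celery

app = Celery("proj", broker="amqp://guest:guest@cloud-broker//")  # illustrative broker URL

# Broadcast a ping; only workers still connected to the broker will answer.
replies = app.control.ping(timeout=2.0)
print(replies)  # e.g. [{'worker1@office': {'ok': 'pong'}}]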
Unfortunately yes, there is a problem with late acks in Celery+Kombu: the task handler tries to use a connection that has already been closed.
I worked around it like this:
CELERY_CONFIG = {
    'CELERYD_MAX_TASKS_PER_CHILD': 1,
    'CELERYD_PREFETCH_MULTIPLIER': 1,
    'CELERY_ACKS_LATE': True,
}
CELERYD_MAX_TASKS_PER_CHILD guarantees that the worker process is restarted after finishing each task.
As for the tasks that have already lost their connection, there is nothing you can do right now. Maybe it'll be fixed in version 4. I just make sure that the tasks are as idempotent as possible.
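For completeness, a minimal sketch of applying the workaround settings above to a Celery app (the app name and broker URL are illustrative; the dict simply restates the configuration from this answer):
from celery import Celery

app = Celery("proj", broker="amqp://guest:guest@cloud-broker//")  # illustrative broker URL

CELERY_CONFIG = {
    'CELERYD_MAX_TASKS_PER_CHILD': 1,   # recycle the worker process after every task
    'CELERYD_PREFETCH_MULTIPLIER': 1,   # don't prefetch more than one task at a time
    'CELERY_ACKS_LATE': True,           # acknowledge only after the task has run
}
app.conf.update(CELERY_CONFIG)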