MongoDB thinks it's running a replica set?

I'm running MongoDB (version 2.4) behind a Django application served by Apache.
In the past few days, I'm seeing the following error come up many times an hour in the Apache logs, on all sorts of different requests:
AutoReconnect: not master and slaveOk=false
I did not explicitly create a replica set, and to the best of my knowledge I am not running one. rs.status() says that we are not running with --replSet.
Mongo is run with:
'mongod --dbpath /srv/db/mongodb/ --fork --logpath /var/log/mongodb.log --logappend --auth'
There is one mongod process running on the server.
What's going on here?
Edit -
Here's the tail end of the stack trace, as requested.
File "/var/www/sefaria_dev/sefaria/texts.py", line 916, in parse_ref
shorthand = db.index.find_one({"maps": {"$elemMatch": {"from": pRef["book"]}}})
File "/usr/local/lib/python2.7/dist-packages/pymongo/collection.py", line 604, in find_one
for result in self.find(spec_or_id, *args, **kwargs).limit(-1):
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 904, in next
if len(self.__data) or self._refresh():
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 848, in _refresh
self.__uuid_subtype))
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 800, in __send_message
self.__uuid_subtype)
File "/usr/local/lib/python2.7/dist-packages/pymongo/helpers.py", line 98, in _unpack_response
raise AutoReconnect(error_object["$err"])
AutoReconnect: not master and slaveOk=false
rs.status() returns:
{
"ok" : 0,
"errmsg" : "not running with --replSet"
}
rs.conf() returns null.
I haven't seen any error in mongodb.log that corresponds to the errors in the Apache log.
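For reference, a minimal pymongo check of what the server itself reports about replica-set membership (a sketch assuming a reasonably recent pymongo 2.x and the same --auth credentials the app uses; user name and password below are placeholders):

from pymongo import MongoClient

client = MongoClient('localhost', 27017)
client.admin.authenticate('adminUser', 'adminPass')  # placeholder credentials
# 'ismaster' shows whether mongod believes it is part of a replica set;
# a standalone server reports ismaster: True and no 'setName' field.
print(client.admin.command('ismaster'))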

Related

Snakemake and cloud formation cluster error with local scratch space

I am having a problem using local scratch space on cfncluster and snakemake at the same time. My strategy is to write data to local scratch for each node in the cluster and then move the data to the NFS partition.
Unfortunately I am getting the following error (snakemake 4.0.0, cfncluster):
/shared/bin/bin/snakemake --rerun-incomplete -s /shared/scripts/sra_to_fa_cluster.py -j 1 -p --latency-wait 20 -k -c " qsub -cwd -V" -F
/shared/dbGAP/sra_toolkit/sratoolkit.2.8.2-1-ubuntu64/bin/fastq-dump --split-files --gzip --outdir /scratch/ /shared/dbGAP/sras2/test/SRR2135300.sra
Waiting at most 20 seconds for missing files.
Exception in thread Thread-1:
Traceback (most recent call last):
File "/shared/bin/lib/python3.6/site-packages/snakemake/dag.py", line 319, in check_and_touch_output
wait_for_files(expanded_output, latency_wait=wait)
File "/shared/bin/lib/python3.6/site-packages/snakemake/io.py", line 395, in wait_for_files
latency_wait, "\n".join(get_missing())))
OSError: Missing files after 20 seconds:
/scratch/SRR2135300_2.fastq.gz
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/shared/bin/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/shared/bin/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/shared/bin/lib/python3.6/site-packages/snakemake/executors.py", line 647, in _wait_for_jobs
active_job.callback(active_job.job)
File "/shared/bin/lib/python3.6/site-packages/snakemake/scheduler.py", line 287, in _proceed
self.get_executor(job).handle_job_success(job)
File "/shared/bin/lib/python3.6/site-packages/snakemake/executors.py", line 549, in handle_job_success
super().handle_job_success(job, upload_remote=False)
File "/shared/bin/lib/python3.6/site-packages/snakemake/executors.py", line 178, in handle_job_success
ignore_missing_output=ignore_missing_output)
File "/shared/bin/lib/python3.6/site-packages/snakemake/dag.py", line 323, in check_and_touch_output
"wait time with --latency-wait.", rule=job.rule)
snakemake.exceptions.MissingOutputException: Missing files after 20 seconds:
/scratch/SRR2135300_2.fastq.gz
This might be due to filesystem latency. If that is the case, consider to increase the wait time with --latency-wait.
This is similar to the error reported here:
https://bitbucket.org/snakemake/snakemake/issues/462/unhandled-missingoutputexception-in
Snakemake script is as follows:
rule all:
    input: expand("/shared/dbGAP/sras2/fastq.gz/{sample}_{end}.fastq.gz", sample=SAMPLES, end=END)

rule move:
    input: left="/scratch/{sample}_1.fastq.gz", right="/scratch/{sample}_2.fastq.gz"
    output: left="/shared/dbGAP/sras2/fastq.gz/{sample}_1.fastq.gz", right="/shared/dbGAP/sras2/fastq.gz/{sample}_2.fastq.gz"
    shell: "rsync --remove-source-files -av {input.left} {output.left}; rsync --remove-source-files -av {input.right} {output.right};"

rule get_fastq_files_from_sra_file:
    input: sras="/shared/dbGAP/sras2/test/{sample}.sra"
    output: left="/scratch/{sample}_1.fastq.gz", right="/scratch/{sample}_2.fastq.gz"
    shell: "/shared/dbGAP/sra_toolkit/sratoolkit.2.8.2-1-ubuntu64/bin/fastq-dump --split-files --gzip --outdir /scratch/ {input}"
My feeling is that Snakemake cannot "see" the node-local scratch from where it checks for output, so it reports the files as missing, but I am not sure how to solve this issue.
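One way to test that hypothesis without changing the cluster setup is to keep the node-local scratch entirely inside a single rule, so the only declared outputs live on the shared filesystem the head node can see. A rough sketch along those lines (paths copied from the rules above; the rule name is made up and the combination is untested):

# Dump to node-local scratch and move the results to NFS within one job,
# so Snakemake only has to check for the /shared outputs.
rule sra_to_fastq_via_scratch:
    input: "/shared/dbGAP/sras2/test/{sample}.sra"
    output:
        left="/shared/dbGAP/sras2/fastq.gz/{sample}_1.fastq.gz",
        right="/shared/dbGAP/sras2/fastq.gz/{sample}_2.fastq.gz"
    shell:
        "/shared/dbGAP/sra_toolkit/sratoolkit.2.8.2-1-ubuntu64/bin/fastq-dump --split-files --gzip --outdir /scratch/ {input} && "
        "rsync --remove-source-files -av /scratch/{wildcards.sample}_1.fastq.gz {output.left} && "
        "rsync --remove-source-files -av /scratch/{wildcards.sample}_2.fastq.gz {output.right}"

A larger --latency-wait only helps if the files eventually become visible on the shared filesystem, which is not the case for node-local /scratch.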

How to connect to remote MongoDB with mongo-connector?

How can I connect to a MongoDB cluster on Mongo Atlas using mongo-connector?
I have tried to connect to my cluster with the following commands:
First attempt
sudo mongo-connector -m "mongodb://g******:*********@rest-api-data-shard-00-00-xemv3.mongodb.net:27017,rest-api-data-shard-00-01-xemv3.mongodb.net:27017,rest-api-data-shard-00-02-xemv3.mongodb.net:27017/admin?ssl=true&replicaSet=rest-api-data-shard-0&authSource=admin" -a g****** -p "***********" -t http://localhost:9200 -d elastic2_doc_manager
Response:
Logging to mongo-connector.log.
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/local/lib/python2.7/site-packages/mongo_connector/util.py", line 90, in wrapped
func(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/mongo_connector/connector.py", line 263, in run
main_conn['admin'].authenticate(self.auth_username, self.auth_key)
File "/usr/local/lib/python2.7/site-packages/pymongo/database.py", line 1018, in authenticate
connect=True)
File "/usr/local/lib/python2.7/site-packages/pymongo/mongo_client.py", line 434, in _cache_credentials
raise OperationFailure('Another user is already authenticated '
OperationFailure: Another user is already authenticated to this database. You must logout first.
Second attempt:
sudo mongo-connector -m "mongodb://rest-api-data-shard-00-00-xemv3.mongodb.net:27017,rest-api-data-shard-00-01-xemv3.mongodb.net:27017,rest-api-data-shard-00-02-xemv3.mongodb.net:27017/admin?replicaSet=rest-api-data-shard-0" -a g********* -p "********" -t http://localhost:9200 -d elastic2_doc_manager
Response:
Logging to mongo-connector.log.
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/local/lib/python2.7/site-packages/mongo_connector/util.py", line 90, in wrapped
func(*args, **kwargs)
File "/usr/local/lib/python2.7/site-packages/mongo_connector/connector.py", line 263, in run
main_conn['admin'].authenticate(self.auth_username, self.auth_key)
File "/usr/local/lib/python2.7/site-packages/pymongo/database.py", line 1018, in authenticate
connect=True)
File "/usr/local/lib/python2.7/site-packages/pymongo/mongo_client.py", line 439, in _cache_credentials
writable_preferred_server_selector)
File "/usr/local/lib/python2.7/site-packages/pymongo/topology.py", line 210, in select_server
address))
File "/usr/local/lib/python2.7/site-packages/pymongo/topology.py", line 186, in select_servers
self._error_message(selector))
ServerSelectionTimeoutError: rest-api-data-shard-00-02-xemv3.mongodb.net:27017: [Errno 54] Connection reset by peer,rest-api-data-shard-00-00-xemv3.mongodb.net:27017: [Errno 54] Connection reset by peer,rest-api-data-shard-00-01-xemv3.mongodb.net:27017: [Errno 54] Connection reset by peer
Answered on the github issue. Solution:
In your first attempt, the problem is that you are specifying the username and password for MongoDB twice. Remove the -a g****** -p "***********" and it should work fine. If you need to authenticate to Elasticsearch, you need to use a mongo-connector config file and set the correct authentication options for the Python Elasticsearch client, e.g.:
{
  "mainAddress": "mongodb://user:pass@mongodb:27017,mongodb:27018,mongodb:27019/admin?ssl=true&replicaSet=name&authSource=admin",
  "verbosity": 1,
  "docManagers": [
    {
      "docManager": "elastic2_doc_manager",
      "targetURL": "http://localhost:9200",
      "args": {
        "clientOptions": {
          "http_auth": ["user", "secret"],
          "use_ssl": true
        }
      }
    }
  ]
}
In your second attempt, it looks like the problem is that you forgot to add ssl=true to the MongoDB connection string. That's why you're getting Connection reset by peer errors.
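If it helps, the corrected connection string can be sanity-checked directly with pymongo before handing it to mongo-connector (hostnames and credentials below are placeholders):

from pymongo import MongoClient

uri = ("mongodb://user:password@"
       "rest-api-data-shard-00-00-xemv3.mongodb.net:27017,"
       "rest-api-data-shard-00-01-xemv3.mongodb.net:27017,"
       "rest-api-data-shard-00-02-xemv3.mongodb.net:27017/admin"
       "?ssl=true&replicaSet=rest-api-data-shard-0&authSource=admin")

client = MongoClient(uri)
# Raises ServerSelectionTimeoutError if the ssl/auth settings are wrong.
print(client.admin.command('ismaster'))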

unable to run mongo-connector

I have installed mongo-connector on the MongoDB server.
I am executing it with the command:
mongo-connector -m [remote mongo server IP]:[remote mongo server port] -t [elastic search server IP]:[elastic search server Port] -d elastic_doc_manager.py
I also tried this, since mongo is running on the same server with the default port:
mongo-connector -t [elastic search server IP]:[elastic search server Port] -d elastic_doc_manager.py
I am getting the error:
Traceback (most recent call last):
File "/usr/local/bin/mongo-connector", line 9, in <module>
load_entry_point('mongo-connector==2.3.dev0', 'console_scripts', 'mongo-connector')()
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3.dev0-py2.7.egg/mongo_connector/util.py", line 85, in wrapped
func(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3.dev0-py2.7.egg/mongo_connector/connector.py", line 1037, in main
conf.parse_args()
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3.dev0-py2.7.egg/mongo_connector/config.py", line 118, in parse_args
option, dict((k, values.get(k)) for k in option.cli_names))
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3.dev0-py2.7.egg/mongo_connector/connector.py", line 820, in apply_doc_managers
module = import_dm_by_name(dm['docManager'])
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3.dev0-py2.7.egg/mongo_connector/connector.py", line 810, in import_dm_by_name
"Could not import %s." % full_name)
mongo_connector.errors.InvalidConfiguration: Could not import mongo_connector.doc_managers.elastic_doc_manager.py.
NOTE: I am using Python 2.7, mongo-connector 2.3, and Elasticsearch 2.2.
Any suggestions?
[edit]
After applying Val's suggestion:
2016-02-29 19:56:59,519 [CRITICAL] mongo_connector.oplog_manager:549 - Exception during collection dump
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3.dev0-py2.7.egg/mongo_connector/oplog_manager.py", line 501, in do_dump
upsert_all(dm)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3.dev0-py2.7.egg/mongo_connector/oplog_manager.py", line 485, in upsert_all
dm.bulk_upsert(docs_to_dump(namespace), mapped_ns, long_ts)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3.dev0-py2.7.egg/mongo_connector/util.py", line 32, in wrapped
return f(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/mongo_connector-2.3.dev0-py2.7.egg/mongo_connector/doc_managers/elastic_doc_manager.py", line 190, in bulk_upsert
for ok, resp in responses:
File "/usr/local/lib/python2.7/dist-packages/elasticsearch-1.9.0-py2.7.egg/elasticsearch/helpers/__init__.py", line 160, in streaming_bulk
for result in _process_bulk_chunk(client, bulk_actions, raise_on_exception, raise_on_error, **kwargs):
File "/usr/local/lib/python2.7/dist-packages/elasticsearch-1.9.0-py2.7.egg/elasticsearch/helpers/__init__.py", line 132, in _process_bulk_chunk
raise BulkIndexError('%i document(s) failed to index.' % len(errors), errors)
BulkIndexError: (u'2 document(s) failed to index.',..document_class=dict, tz_aware=False, connect=True, replicaset=u'mss'), u'local'), u'oplog.rs')
2016-02-29 19:56:59,835 [ERROR] mongo_connector.connector:302 - MongoConnector: OplogThread unexpectedly stopped! Shutting down
Hi Val,
I connected to another mongodb instance, which had only one database with a single collection of 30,000+ records, and I was able to execute it successfully. The previous mongodb instance had multiple databases (around 7), each with multiple collections (around 5 to 15 per database), and all of them held a fair amount of documents (ranging from 500 to 50,000).
Was mongo-connector failing because of the large amount of data residing in the mongo database?
I have further queries:
a. Is it possible to index only specific collections in mongodb, residing in different databases? I want to index only specific collections (not the entire database). How can I achieve this?
b. In Elasticsearch I can see duplicate indexes for one collection: the first is named after the database (as expected), the other one has the name mongodb_meta. Both hold the same data, and if I change the collection, the update happens in both.
c. Is it possible to configure the output index name or any other parameters somehow?
I think the only issue is that you have the .py extension on the doc manager (it was needed before mongo-connector 2.0); you simply need to remove it:
mongo-connector -m [remote mongo server IP]:[remote mongo server port] -t [elastic search server IP]:[elastic search server Port] -d elastic_doc_manager
I found this option to sync only a specific collection:
$ mongo-connector -m mongodbserver:27017 -t elasticserver:9200 -d elastic_doc_manager --oplog-ts oplogstatus.txt --namespace-set database.collection
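As far as I know, --namespace-set also accepts a comma-separated list, so several collections across different databases can be synced in one run (worth double-checking against the mongo-connector configuration docs):
mongo-connector -m mongodbserver:27017 -t elasticserver:9200 -d elastic_doc_manager --oplog-ts oplogstatus.txt --namespace-set db1.collection1,db2.collection2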
It started working after giving the below command with the --oplog-ts option:
mongo-connector -m localhost:27017 -t localhost:37017 -d mongo_doc_manager --oplog-ts oplogstatus.txt
But it's failing if I use a config file. Kindly advise how to resolve this issue.
C:\Dev\mongodb\mongo-connector>mongo-connector -c myconfig.json --oplog-ts oplogstatus.txt
Fatal Exception
Traceback (most recent call last):
File "C:\Program Files\Python\lib\site-packages\mongo_connector-2.5.0.dev0-py3.6.egg\mongo_connector\config.py", line 110, in parse_args
self.load_json(f.read())
File "C:\Program Files\Python\lib\site-packages\mongo_connector-2.5.0.dev0-py3.6.egg\mongo_connector\config.py", line 132, in load_json
parsed_config = json.loads(text)
File "C:\Program Files\Python\lib\json\__init__.py", line 319, in loads
return _default_decoder.decode(s)
File "C:\Program Files\Python\lib\json\decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Program Files\Python\lib\json\decoder.py", line 355, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Invalid \escape: line 6 column 21 (char 201)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Program Files\Python\lib\site-packages\mongo_connector-2.5.0.dev0-py3.6.egg\mongo_connector\util.py", line 90, in wrapped
func(*args, **kwargs)
File "C:\Program Files\Python\lib\site-packages\mongo_connector-2.5.0.dev0-py3.6.egg\mongo_connector\connector.py", line 1059, in main
conf.parse_args()
File "C:\Program Files\Python\lib\site-packages\mongo_connector-2.5.0.dev0-py3.6.egg\mongo_connector\config.py", line 112, in parse_args
reraise(errors.InvalidConfiguration, *sys.exc_info()[1:])
File "C:\Program Files\Python\lib\site-packages\mongo_connector-2.5.0.dev0-py3.6.egg\mongo_connector\compat.py", line 9, in reraise
raise exctype(str(value)).with_traceback(trace)
File "C:\Program Files\Python\lib\site-packages\mongo_connector-2.5.0.dev0-py3.6.egg\mongo_connector\config.py", line 110, in parse_args
self.load_json(f.read())
File "C:\Program Files\Python\lib\site-packages\mongo_connector-2.5.0.dev0-py3.6.egg\mongo_connector\config.py", line 132, in load_json
parsed_config = json.loads(text)
File "C:\Program Files\Python\lib\json\__init__.py", line 319, in loads
return _default_decoder.decode(s)
File "C:\Program Files\Python\lib\json\decoder.py", line 339, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "C:\Program Files\Python\lib\json\decoder.py", line 355, in raw_decode
obj, end = self.scan_once(s, idx)
mongo_connector.errors.InvalidConfiguration: Invalid \escape: line 6 column 21 (char 201)
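The JSONDecodeError ("Invalid \escape") usually means the config file contains an unescaped backslash, most likely from a Windows path, since backslashes inside JSON strings must be doubled (or replaced by forward slashes). A hypothetical example of what line 6 of myconfig.json might look like (the oplogFile key is only an illustration; whichever option holds the path needs the same treatment):
"oplogFile": "C:\Dev\mongodb\mongo-connector\oplogstatus.txt"
would need to become
"oplogFile": "C:\\Dev\\mongodb\\mongo-connector\\oplogstatus.txt"
or, equivalently,
"oplogFile": "C:/Dev/mongodb/mongo-connector/oplogstatus.txt"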
Try this.
pip install 'elastic2-doc-manager[elastic5]'
mongo-connector -m localhost:27017 -t localhost:9200 -d elastic2_doc_manager
Answered on the GitHub issue:
Your strategy seems sound to me. Here's how to do this:
1. Generate a mongo-connector timestamp file: run mongo-connector --no-dump and stop mongo-connector right after it starts up. Now you have an oplog.timestamp file pointing to the latest entry on the oplog.
2. Run mongodump on the primary. The dump already reflects all the changes that mongo-connector saw in the oplog.
3. Run mongorestore with the dump from (2) on the target MongoDB.
4. Restart mongo-connector. Pass in the file generated in (1) to the --oplog-ts option.
I'll add this to the wiki.
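A rough command-level sketch of those steps (hostnames, ports, paths and the doc manager are placeholders for the actual topology):
# 1. record the current oplog position, then stop mongo-connector once it is running
mongo-connector -m primary:27017 -t target:27017 -d mongo_doc_manager --no-dump
# 2. dump the source primary
mongodump --host primary:27017 --out /backup/dump
# 3. restore the dump onto the target MongoDB
mongorestore --host target:27017 /backup/dump
# 4. restart mongo-connector from the recorded position
mongo-connector -m primary:27017 -t target:27017 -d mongo_doc_manager --oplog-ts oplog.timestamp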

PyMongo and Multiprocessing: ServerSelectionTimeoutError

We recently updated MongoDB from 2.6 to 3.0. Since then, we are having trouble using PyMongo in combination with Multiprocessing.
The issue is that sometimes an operation (e.g. find) within a process hangs for ~30 seconds and then throws an exception "ServerSelectionTimeoutError: No servers found yet".
The behavior seems independent of the input, as our script usually runs just fine a few times and then hangs randomly.
The log files do not show any entries related to timeouts, nor did I find any useful information on the Internet about this issue.
The script is running in our test environment, meaning there are no replica sets involved and the Mongo instance is bound to localhost.
Here is the stack trace for completeness:
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "somescript.py", line 109, in run
self.find_incoming_cc()
File "somescript.py", line 370, in find_incoming_cc
{'_id': 1, 'cc': 1}
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 983, in next
if len(self.__data) or self._refresh():
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 908, in _refresh
self.__read_preference))
File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 813, in __send_message
**kwargs)
File "/usr/local/lib/python2.7/dist-packages/pymongo/mongo_client.py", line 728, in _send_message_with_response
server = topology.select_server(selector)
File "/usr/local/lib/python2.7/dist-packages/pymongo/topology.py", line 121, in select_server
address))
File "/usr/local/lib/python2.7/dist-packages/pymongo/topology.py", line 97, in select_servers
self._error_message(selector))
ServerSelectionTimeoutError: No servers found yet
Now for the question: Is there any known issue/bug when using PyMongo with Multiprocessing? Is there a way to debug the exception?
Thanks for any help!
It is a bug in pymongo 3.0.x. Bug report: https://jira.mongodb.org/browse/PYTHON-961
Workaround for this issue (tested with pymongo 3.0.3):
Pass connect=False when initialising the MongoClient object:
MongoClient(uri, connect=False)
Or simply wait a few seconds before creating the MongoClient instance in the child process (e.g. time.sleep(2)):
import time
import multiprocessing
from pymongo import MongoClient

def start(uri):
    time.sleep(2)  # give the parent's connection setup a moment before connecting from the child
    mclient = MongoClient(uri)
    mclient.db.collection.find_one()

if __name__ == '__main__':
    p = multiprocessing.Process(target=start, args=('mongodb://localhost:27017/',))
    p.start()
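For completeness, a small sketch of the first workaround; connect=False defers the actual connection until the first operation, which then happens inside the child process:

import multiprocessing
from pymongo import MongoClient

# No connection is made in the parent; the client connects lazily on first use,
# i.e. inside the child process after the fork.
client = MongoClient('mongodb://localhost:27017/', connect=False)

def find_one():
    print(client.db.collection.find_one())

if __name__ == '__main__':
    p = multiprocessing.Process(target=find_one)
    p.start()
    p.join()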

"pysolr.SolrError: [Reason: /solr4/update/]" when running mongo_connector.py

As a follow-on from this problem I was having before (How long does mongo_connector.py usually take?):
I was wondering if anyone else has had this problem when running the following:
$ python /usr/local/lib/python2.7/dist-packages/mongo-connector/mongo_connector.py -m localhost:27017 --docManager /usr/local/lib/python2.7/dist-packages/mongo-connector/doc_managers/solr_doc_manager.py -t http://localhost:8080/solr4
This is the error output I get:
2012-08-20 10:24:11,893 - INFO - Beginning Mongo Connector
2012-08-20 10:24:12,971 - INFO - Starting new HTTP connection (1): localhost
2012-08-20 10:24:12,974 - INFO - Finished 'http://localhost:8080/solr4/update/?commit=true' (post) with body 'u'<commit ' in 0.017 seconds.
2012-08-20 10:24:12,983 - ERROR - [Reason: /solr4/update/]
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/mongo-connector/mongo_connector.py", line 441, in <module>
auth_username=options.admin_name)
File "/usr/local/lib/python2.7/dist-packages/mongo-connector/mongo_connector.py", line 100, in __init__
unique_key=u_key)
File "/usr/local/lib/python2.7/dist-packages/mongo-connector/doc_managers/solr_doc_manager.py", line 54, in __init__
self.run_auto_commit()
File "/usr/local/lib/python2.7/dist-packages/mongo-connector/doc_managers/solr_doc_manager.py", line 95, in run_auto_commit
self.solr.commit()
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 802, in commit
return self._update(msg, waitFlush=waitFlush, waitSearcher=waitSearcher)
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 359, in _update
return self._send_request('post', path, message, {'Content-type': 'text/xml; charset=utf-8'})
File "/usr/local/lib/python2.7/dist-packages/pysolr.py", line 293, in _send_request
raise SolrError(error_message)
pysolr.SolrError: [Reason: /solr4/update/]
[Reason: /solr4/update/] is not really an error message I can even start to debug. Solr is working perfectly fine, and MongoDB is working perfectly fine. What could be causing this problem?
I have been following the instructions on this page up to now: http://loutilities.wordpress.com/2012/11/26/complementing-mongodb-with-real-time-solr-search/#comment-183. I've also seen on various websites that adding the following to my Solr's solrconfig.xml should make 'update' accessible, but this is already configured on my system:
<requestHandler name="/update" class="solr.XmlUpdateRequestHandler">
That's about all the information I have. Any hints as to what I might be doing wrong?
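One way to get more detail than "[Reason: /solr4/update/]" is to hit the same Solr URL that mongo_connector uses directly from pysolr; the URL below, and whether a core name (e.g. /solr4/collection1) needs to be appended, are assumptions about this setup:

import pysolr

# Same commit that solr_doc_manager's run_auto_commit() issues at startup;
# calling it directly should surface the underlying HTTP status and response body.
solr = pysolr.Solr('http://localhost:8080/solr4', timeout=10)
solr.commit()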