mongo-connector not connecting with Solr - Exception during collection dump - mongodb

I am connecting MongoDB with Solr, following this document for the integration:
https://blog.toadworld.com/2017/02/03/indexing-mongodb-data-in-apache-solr
DB.Collection: solr.wlslog
D:\path to solr\bin>
mongo-connector --unique-key=id -n solr.wlslog -m localhost:27017 -t http://localhost:8983/solr/wlslog -d solr_doc_manager
I get the following output and error:
2020-06-15 12:15:45,744 [ALWAYS] mongo_connector.connector:50 - Starting mongo-connector version: 3.1.1
2020-06-15 12:15:45,744 [ALWAYS] mongo_connector.connector:50 - Python version: 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)]
2020-06-15 12:15:45,745 [ALWAYS] mongo_connector.connector:50 - Platform: Windows-10-10.0.18362-SP0
2020-06-15 12:15:45,745 [ALWAYS] mongo_connector.connector:50 - pymongo version: 3.10.1
2020-06-15 12:15:45,755 [ALWAYS] mongo_connector.connector:50 - Source MongoDB version: 4.2.2
2020-06-15 12:15:45,755 [ALWAYS] mongo_connector.connector:50 - Target DocManager: mongo_connector.doc_managers.solr_doc_manager version: 0.1.0
2020-06-15 12:15:45,787 [CRITICAL] mongo_connector.oplog_manager:713 - Exception during collection dump
Traceback (most recent call last):
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\doc_managers\solr_doc_manager.py", line 292, in
batch = list(next(cleaned) for i in range(self.chunk_size))
StopIteration
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\oplog_manager.py", line 668, in do_dump
upsert_all(dm)
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\oplog_manager.py", line 651, in upsert_all
dm.bulk_upsert(docs_to_dump(from_coll), mapped_ns, long_ts)
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\util.py", line 33, in wrapped
return f(*args, **kwargs)
File "C:\Users\ancubate\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.8_qbz5n2kfra8p0\LocalCache\local-packages\Python38\site-packages\mongo_connector\doc_managers\solr_doc_manager.py", line 292, in bulk_upsert
batch = list(next(cleaned) for i in range(self.chunk_size))
RuntimeError: generator raised StopIteration
2020-06-15 12:15:45,801 [ERROR] mongo_connector.oplog_manager:723 - OplogThread: Failed during dump collection cannot recover! Collection(Database(MongoClient(host=['localhost:27017'], document_class=dict, tz_aware=False, connect=True, replicaset='rs0'), 'local'), 'oplog.rs')
2020-06-15 12:15:46,782 [ERROR] mongo_connector.connector:408 - MongoConnector: OplogThread <OplogThread(Thread-2, started 4936)> unexpectedly stopped! Shutting down
I searched the mongo-connector GitHub issues but did not find a solution:
Github-issue-870
Github-issue-898

Finally, the issue is resolved :)
My OS is Windows and I installed MongoDB in C:\Program Files\MongoDB\ (the system drive).
Before running mongo-connector, I had initiated a replica set for MongoDB with the command below, as described in the blog:
mongod --port 27017 --dbpath ../data/db --replSet rs0
Problem:
The problem was the --dbpath ../data/db directory, which resolved to C:\Program Files\MongoDB\Server\4.2\data\db. That directory itself had all permissions, but its parent directory C:\Program Files did not, because it is a protected system directory.
The actual error was (exception during collection dump):
2020-06-15 12:15:45,787 [CRITICAL] mongo_connector.oplog_manager:713 - Exception during collection dump
Solution:
I changed --dbpath to a directory outside the protected system directory, as below:
mongod --port 27017 --dbpath C:/data/db --replSet rs0
After that I ran the connection command from my question:
mongo-connector --unique-key=id -n solr.wlslog -m localhost:27017 -t http://localhost:8983/solr/wlslog -d solr_doc_manager
Successful mongo-connector log output:
2020-06-17 12:08:52,292 [ALWAYS] mongo_connector.connector:50 - Starting mongo-connector version: 3.1.1
2020-06-17 12:08:52,292 [ALWAYS] mongo_connector.connector:50 - Python version: 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)]
2020-06-17 12:08:52,293 [ALWAYS] mongo_connector.connector:50 - Platform: Windows-10-10.0.18362-SP0
2020-06-17 12:08:52,293 [ALWAYS] mongo_connector.connector:50 - pymongo version: 3.10.1
2020-06-17 12:08:52,310 [ALWAYS] mongo_connector.connector:50 - Source MongoDB version: 4.2.2
2020-06-17 12:08:52,311 [ALWAYS] mongo_connector.connector:50 - Target DocManager: mongo_connector.doc_managers.solr_doc_manager version: 0.1.0
I hope this answer is helpful for everyone :)

In my case, this didn't solve the problem.
I'm using Python 3.8, so for me it was actually due to https://docs.python.org/3/whatsnew/3.7.html#changes-in-python-behavior
PEP 479 is enabled for all code in Python 3.7, meaning that
StopIteration exceptions raised directly or indirectly in coroutines
and generators are transformed into RuntimeError exceptions.
(Contributed by Yury Selivanov in bpo-32670.)
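The behaviour described in that note can be reproduced without mongo-connector. The following is a minimal sketch, assuming Python 3.7 or later; first_n is a hypothetical helper that only mirrors the shape of the failing line (next() called inside a generator expression) and is not the actual solr_doc_manager.py code.

```python
def first_n(source, n):
    # Same shape as the failing line: next() is called inside a
    # generator expression, so an exhausted source raises StopIteration
    # inside the generator frame.
    return list(next(source) for _ in range(n))


message = None
try:
    # Ask for 5 items from an iterator that only has 2.
    first_n(iter([1, 2]), 5)
except RuntimeError as exc:
    # Under PEP 479 the StopIteration is converted to RuntimeError.
    message = str(exc)

print(message)
```

On Python 3.6 and earlier the same call would quietly return the two available items, which is presumably why the original code worked on older interpreters.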
Reading "How yield catches StopIteration exception?" led me to think initially that it was related to the yield doc statements, but actually the problem was the two lines calling next(), at line 292 and a few lines later in solr_doc_manager.py:
batch = list(next(cleaned) for i in range(self.chunk_size))
changed to:
batch = []
for i in range(self.chunk_size):
    for x in cleaned:
        batch.append(x)
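One caveat with the nested loop above: the inner for loop drains cleaned on its first pass, so the batch is no longer limited to chunk_size. A possible alternative, sketched here under the assumption that cleaned is an ordinary iterator, uses itertools.islice to take at most chunk_size items without ever calling next() directly. The names take_batch and iter_batches are hypothetical and not part of mongo-connector.

```python
from itertools import islice


def take_batch(docs, chunk_size):
    """Return up to chunk_size items from the iterator docs.

    Hypothetical helper: islice stops cleanly when docs is exhausted,
    so no StopIteration escapes into a generator frame.
    """
    return list(islice(docs, chunk_size))


def iter_batches(docs, chunk_size):
    """Yield successive batches until docs is exhausted."""
    it = iter(docs)
    batch = take_batch(it, chunk_size)
    while batch:
        yield batch
        batch = take_batch(it, chunk_size)


batches = list(iter_batches(range(5), 2))
print(batches)
```

In solr_doc_manager.py this would correspond to replacing each list(next(cleaned) for i in range(self.chunk_size)) with list(islice(cleaned, self.chunk_size)); treat that as a suggestion to verify against your installed version rather than a confirmed patch.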

Related

Mongoexport with cluster throws i/o timeout error

We just upgraded to mongo 3.0, but mongoexport gives us the following error: "Failed: read tcp 127.0.0.1:27020: i/o timeout" after outputting some documents (not always the same number). mongoexport is connecting to a sharded cluster of 4 standalone mongod servers with 3 mongod config servers.
[root#SRV]$ mongoexport --host localhost:27022,localhost:27021,localhost:27020 --db horus --collection users --type json --fields _id | wc -l
2015-03-09T12:41:19.198-0600 connected to: localhost:27022,localhost:27021,localhost:27020
2015-03-09T12:41:22.570-0600 Failed: read tcp 127.0.0.1:27020: i/o timeout
15322
The versions we are using are:
[root#MONGODB01-SRV]# mongo --version
MongoDB shell version: 3.0.0
[root#SRV]$ mongoexport --version
mongoexport version: 3.0.0
git version: e35a2e87876251835fcb60f5eb0c29baca04bc5e
[root#SRV]$ mongos --version
MongoS version 3.0.0 starting: pid=47359 port=27017 64-bit host=SRV (--help for usage)
git version: a841fd6394365954886924a35076691b4d149168
OpenSSL version: OpenSSL 1.0.1e-fips 11 Feb 2013
build sys info: Linux ip-10-181-61-91 2.6.32-220.el6.x86_64 #1 SMP Wed Nov 9 08:03:13 EST 2011 x86_64 BOOST_LIB_VERSION=1_49
We tried a 2.6 mongoexport on another server against our mongod 3.0 and mongos 3.0, and it works fine.
This is an old question, but I wanted to answer in case it helps someone. It might be caused by someone else writing to the collection you are accessing. I had a similar problem. After a lot of research I realised that a user with a higher role was trying to write at the same time; because that role had priority over mine, their requests completed and mine failed with an I/O exception.
Try closing the ports first, e.g. killall -9 node

Limit mongo-connector to a specific collection for Solr indexing

I am currently attempting to use mongo-connector to automatically feed db updates to Solr. It's working fine through the use of the following command -
mongo-connector -m localhost:27017 -t http://localhost:8983/solr -d mongo_connector/doc_managers/solr_doc_manager.py
However, it is indexing every collection in my MongoDB. I tried the -n option as follows -
mongo-connector -m localhost:27017 -t http://localhost:8983/solr -n feed_scraper_development.articles -d mongo_connector/doc_managers/solr_doc_manager.py
This fails with the following error -
2014-07-24 22:23:23,053 - INFO - Beginning Mongo Connector
2014-07-24 22:23:23,104 - INFO - Starting new HTTP connection (1): localhost
2014-07-24 22:23:23,110 - INFO - Finished 'http://localhost:8983/solr/admin/luke?show=schema&wt=json' (get) with body '' in 0.018 seconds.
2014-07-24 22:23:23,115 - INFO - OplogThread: Initializing oplog thread
2014-07-24 22:23:23,116 - INFO - MongoConnector: Starting connection thread MongoClient('localhost', 27017)
2014-07-24 22:23:23,126 - INFO - Finished 'http://localhost:8983/solr/update/?commit=true' (post) with body 'u'<commit ' in 0.006 seconds.
2014-07-24 22:23:23,129 - INFO - Finished 'http://localhost:8983/solr/select/?q=%2A%3A%2A&sort=_ts+desc&rows=1&wt=json' (get) with body '' in 0.003 seconds.
2014-07-24 22:23:23,337 - INFO - Finished 'http://localhost:8983/solr/select/?q=_ts%3A+%5B6038164010275176560+TO+6038164010275176560%5D&rows=100000000&wt=json' (get) with body '' in 0.207 seconds.
Exception in thread Thread-2:
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 808, in __bootstrap_inner
self.run()
File "build/bdist.macosx-10.9-intel/egg/mongo_connector/oplog_manager.py", line 141, in run
cursor = self.init_cursor()
File "build/bdist.macosx-10.9-intel/egg/mongo_connector/oplog_manager.py", line 582, in init_cursor
cursor = self.get_oplog_cursor(timestamp)
File "build/bdist.macosx-10.9-intel/egg/mongo_connector/oplog_manager.py", line 361, in get_oplog_cursor
timestamp = self.rollback()
File "build/bdist.macosx-10.9-intel/egg/mongo_connector/oplog_manager.py", line 664, in rollback
if doc['ns'] in rollback_set:
KeyError: 'ns'
Any help or clues would be greatly appreciated!
Extra information: Solr 4.9.0 | MongoDB 2.6.3 | mongo-connector 1.2.1
It works as advertised after deleting all the indexes in the data folder, restarting Solr, and re-running the command with the -n option.

mongo-connector not working with --unique-key

mongo-connector --unique-key=id --auto-commit-interval=0 -m localhost:27017 -t http://localhost:8983/solr -d /Library/Python/2.7/site-packages/mongo_connector/doc_managers/solr_doc_manager.py --admin-username admin --password bypass
I'm using the command above to connect MongoDB and Apache Solr, but I get the following error at the end:
2014-05-17 12:38:20,607 - INFO - Beginning Mongo Connector
2014-05-17 12:38:22,200 - INFO - Starting new HTTP connection (1): localhost
2014-05-17 12:38:22,439 - INFO - Finished 'http://localhost:8983/solr/admin/luke?show=schema&wt=json' (get) with body '' in 0.404 seconds.
2014-05-17 12:38:22,527 - INFO - OplogThread: Initializing oplog thread
2014-05-17 12:38:22,580 - INFO - MongoConnector: Starting connection thread MongoClient('localhost', 27017)
Exception in thread Thread-2:
Traceback (most recent call last):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 808, in __bootstrap_inner
self.run()
File "/Library/Python/2.7/site-packages/mongo_connector/oplog_manager.py", line 181, in run
dm.remove(entry)
File "/Library/Python/2.7/site-packages/mongo_connector/doc_managers/solr_doc_manager.py", line 192, in remove
self.solr.delete(id=str(doc[self.unique_key]),
KeyError: 'id'
Please help me.
Did you explicitly create a unique field called "id"? That is not the default field added by MongoDB; the default unique field is "_id".
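To illustrate the point, here is a small sketch with plain dicts standing in for MongoDB documents. docs_missing_key is a hypothetical helper, not part of mongo-connector, and the sample documents are made up. It lists the documents that lack the field passed as --unique-key, which is the situation in which doc[self.unique_key] would raise KeyError: 'id'.

```python
def docs_missing_key(docs, unique_key):
    """Return the documents that do not contain unique_key (hypothetical helper)."""
    return [doc for doc in docs if unique_key not in doc]


# Made-up sample documents: every MongoDB document has "_id",
# but only the second one also carries a custom "id" field.
sample = [
    {"_id": 1, "title": "a"},
    {"_id": 2, "id": "custom-2", "title": "b"},
]

missing_id = docs_missing_key(sample, "id")
missing_underscore_id = docs_missing_key(sample, "_id")
print(missing_id, missing_underscore_id)
```

If a check like this reports documents without "id", either add that field to every document or pass --unique-key=_id instead.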

Error during mongo startup

When I type mongo at the command prompt, the output is:
~$ mongo
MongoDB shell version: 2.2.3
connecting to: test
But when I type mongo 127.0.0.1:28017/stu1, the output is:
MongoDB shell version: 2.2.3
connecting to: 127.0.0.1:28017/stu1
Mon Mar 10 16:56:01 DBClientCursor::init call() failed
Mon Mar 10 16:56:01 Error: Error during mongo startup. :: caused by :: 10276
DBClientBase::findN: transport error: 127.0.0.1:28017 ns: admin.$cmd query: { whatsmyuri: 1 } src/mongo/shell/mongo.js:93
exception: connect failed
OS: Ubuntu 12.04
Please help me solve this error.
28017 is the default port for the HTTP admin interface, so if your config allows HTTP, you should use a web browser to access it at http://127.0.0.1:28017/
You're trying to use the mongo shell to access an HTTP service.
To access the database over HTTP, start the db with ~$ mongod --rest. Then open http://127.0.0.1:28017/stu1/coll_name in a web browser, which lists all the documents in your collection.

Error 109 when stopping Mongo DB running as a service (1.6.1)

I'm running MongoDB as a Windows service, and every second time I stop the service it reports "Error 109: The pipe has been ended". Here is the command line used to run the service:
"C:\Temp\mongodb\bin\mongod" --service --serviceUser --servicePassword --dbpath C:\temp\db --rest --logpath C:\temp\db\log\mongo.log --logappend --directoryperdb
This bug was fixed for version 2.1.0 ( https://jira.mongodb.org/browse/SERVER-2833 ) and then resurfaced ( https://jira.mongodb.org/browse/SERVER-6771 ). Hopefully it will be fixed again for version 2.2.0.