How to use Elasticsearch with MongoDB?

I have gone through many blogs and sites about configuring Elasticsearch for MongoDB to index collections in MongoDB, but none of them were straightforward.
Please explain a step-by-step process for installing Elasticsearch, which should include:
configuration
run in the browser
I am using Node.js with express.js, so please help accordingly.

This answer should be enough to get you set up to follow this tutorial on Building a functional search component with MongoDB, Elasticsearch, and AngularJS.
If you're looking to use faceted search with data from an API then Matthiasn's BirdWatch Repo is something you might want to look at.
So here's how you can set up a single-node Elasticsearch "cluster" to index MongoDB for use in a Node.js/Express app on a fresh EC2 Ubuntu 14.04 instance.
Make sure everything is up to date.
sudo apt-get update
Install NodeJS.
sudo apt-get install nodejs
sudo apt-get install npm
Install MongoDB - These steps are straight from MongoDB docs.
Choose whatever version you're comfortable with. I'm sticking with v2.4.9 because it seems to be the most recent version MongoDB-River supports without issues.
Import the MongoDB public GPG Key.
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
Update your sources list.
echo 'deb http://downloads-distro.mongodb.org/repo/ubuntu-upstart dist 10gen' | sudo tee /etc/apt/sources.list.d/mongodb.list
Get the 10gen package.
sudo apt-get install mongodb-10gen
Then pick your version if you don't want the most recent. If you are setting up your environment on a Windows 7 or 8 machine, stay away from v2.6 until they work some bugs out with running it as a service.
apt-get install mongodb-10gen=2.4.9
Prevent the version of your MongoDB installation being bumped up when you update.
echo "mongodb-10gen hold" | sudo dpkg --set-selections
Start the MongoDB service.
sudo service mongodb start
Your database files default to /var/lib/mongo and your log files to /var/log/mongo.
Create a database through the mongo shell and push some dummy data into it.
mongo YOUR_DATABASE_NAME
db.createCollection("YOUR_COLLECTION_NAME")
for (var i = 1; i <= 25; i++) db.YOUR_COLLECTION_NAME.insert( { x : i } )
Now convert the standalone MongoDB into a replica set.
First, shut down the process.
mongo YOUR_DATABASE_NAME
use admin
db.shutdownServer()
Now we're running MongoDB as a service, so we don't pass in the "--replSet rs0" option in the command line argument when we restart the mongod process. Instead, we put it in the mongod.conf file.
vi /etc/mongod.conf
Add these lines, subbing for your db and log paths.
replSet=rs0
dbpath=YOUR_PATH_TO_DATA/DB
logpath=YOUR_PATH_TO_LOG/MONGO.LOG
Now open up the mongo shell again to initialize the replica set.
mongo DATABASE_NAME
config = { "_id" : "rs0", "members" : [ { "_id" : 0, "host" : "127.0.0.1:27017" } ] }
rs.initiate(config)
rs.slaveOk() // allows read operations to run on secondary members.
Now install Elasticsearch. I'm just following this helpful Gist.
Make sure Java is installed.
sudo apt-get install openjdk-7-jre-headless -y
Stick with v1.1.x for now until the Mongo-River plugin bug gets fixed in v1.2.1.
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.1.1.deb
sudo dpkg -i elasticsearch-1.1.1.deb
curl -L http://github.com/elasticsearch/elasticsearch-servicewrapper/tarball/master | tar -xz
sudo mv *servicewrapper*/service /usr/local/share/elasticsearch/bin/
sudo rm -Rf *servicewrapper*
sudo /usr/local/share/elasticsearch/bin/service/elasticsearch install
sudo ln -s `readlink -f /usr/local/share/elasticsearch/bin/service/elasticsearch` /usr/local/bin/rcelasticsearch
Make sure /etc/elasticsearch/elasticsearch.yml has the following config options enabled if you're only developing on a single node for now:
cluster.name: "MY_CLUSTER_NAME"
node.local: true
Start the Elasticsearch service.
sudo service elasticsearch start
Verify it's working.
curl http://localhost:9200
If you see something like this then you're good.
{
"status" : 200,
"name" : "Chi Demon",
"version" : {
"number" : "1.1.2",
"build_hash" : "e511f7b28b77c4d99175905fac65bffbf4c80cf7",
"build_timestamp" : "2014-05-22T12:27:39Z",
"build_snapshot" : false,
"lucene_version" : "4.7"
},
"tagline" : "You Know, for Search"
}
Now install the Elasticsearch plugins so it can play with MongoDB.
bin/plugin --install com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb/1.6.0
bin/plugin --install elasticsearch/elasticsearch-mapper-attachments/1.6.0
These two plugins aren't necessary but they're good for testing queries and visualizing changes to your indexes.
bin/plugin --install mobz/elasticsearch-head
bin/plugin --install lukas-vlcek/bigdesk
Restart Elasticsearch.
sudo service elasticsearch restart
Finally index a collection from MongoDB.
curl -XPUT localhost:9200/_river/DATABASE_NAME/_meta -d '{
  "type": "mongodb",
  "mongodb": {
    "servers": [
      { "host": "127.0.0.1", "port": 27017 }
    ],
    "db": "DATABASE_NAME",
    "collection": "ACTUAL_COLLECTION_NAME",
    "options": { "secondary_read_preference": true },
    "gridfs": false
  },
  "index": {
    "name": "ARBITRARY INDEX NAME",
    "type": "ARBITRARY TYPE NAME"
  }
}'
Check that your index is in Elasticsearch
curl -XGET http://localhost:9200/_aliases
Check your cluster health.
curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
It's probably yellow with some unassigned shards. We have to tell Elasticsearch that we're only working with a single node, so it shouldn't expect replicas.
curl -XPUT 'localhost:9200/_settings' -d '{ "index" : { "number_of_replicas" : 0 } }'
Check cluster health again. It should be green now.
curl -XGET 'http://localhost:9200/_cluster/health?pretty=true'
Go play.

Using a river can present issues when your operation scales up. A river uses a ton of memory under heavy load. I recommend implementing your own Elasticsearch models, or, if you're using Mongoose, building the Elasticsearch indexing right into your models or using mongoosastic, which essentially does this for you (see the sketch below).
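For reference, here is a minimal sketch of the mongoosastic approach; the schema, field names, and Elasticsearch host are illustrative, not part of the setup above:
var mongoose = require('mongoose');
var mongoosastic = require('mongoosastic');

var ArticleSchema = new mongoose.Schema({
  title: String,
  body: String
});

// register the plugin; documents saved through Mongoose are indexed into Elasticsearch automatically
ArticleSchema.plugin(mongoosastic, { hosts: ['localhost:9200'] });

var Article = mongoose.model('Article', ArticleSchema);

// search goes through the model instead of a river
Article.search({ query_string: { query: 'some search terms' } }, function (err, results) {
  if (err) return console.error(err);
  console.log(results.hits.hits);
});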
Another disadvantage of the MongoDB River is that you'll be stuck using the MongoDB 2.4.x branch and Elasticsearch 0.90.x. You'll start to find that you're missing out on a lot of really nice features, and the MongoDB River project just doesn't produce a usable product fast enough to stay stable. That said, MongoDB River is definitely not something I'd go into production with. It has posed more problems than it's worth: it will randomly drop writes under heavy load, it will consume lots of memory, and there's no setting to cap that. Additionally, the river doesn't update in real time; it reads oplogs from MongoDB, and this can delay updates for as long as 5 minutes in my experience.
We recently had to rewrite a large portion of our project because it was a weekly occurrence that something went wrong with Elasticsearch. We had even gone as far as hiring a DevOps consultant, who also agrees that it's best to move away from River.
UPDATE:
Elasticsearch-mongodb-river now supports ES v1.4.0 and MongoDB v2.6.x. However, you'll still likely run into performance problems on heavy insert/update operations, as this plugin tries to read MongoDB's oplogs to sync. If a lot of operations have accumulated since the lock (or latch, rather) released, you'll notice extremely high memory usage on your Elasticsearch server. If you plan on having a large operation, the river is not a good option. The developers of Elasticsearch still recommend that you manage your own indexes by communicating directly with their API using the client library for your language, rather than using the river. That isn't really the purpose of a river. Twitter-river is a great example of how a river should be used: it's essentially a great way to source data from outside sources, but it is not very reliable for high traffic or internal use.
Also consider that mongodb-river falls behind in versions, as it's not maintained by the Elasticsearch organization but by a third party. Development was stuck on the v0.90 branch for a long time after the release of v1.0, and when a version for v1.0 was released it wasn't stable until Elasticsearch released v1.3.0. MongoDB versions also fall behind. You may find yourself in a tight spot when you're looking to move to a later version of either, especially with Elasticsearch under such heavy development and many highly anticipated features on the way. Staying up to date with the latest Elasticsearch has been very important for us, as we rely heavily on constantly improving our search functionality, which is a core part of our product.
All in all, you'll likely get a better product if you do it yourself. It's not that difficult; it's just another database to manage in your code, and it can easily be dropped into your existing models without major refactoring.
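As a rough illustration of the do-it-yourself approach, here is a sketch that indexes documents from a Mongoose save hook using the official elasticsearch Node.js client (the schema, index name, and type name are made up for the example):
var mongoose = require('mongoose');
var elasticsearch = require('elasticsearch');

var client = new elasticsearch.Client({ host: 'localhost:9200' });

var ArticleSchema = new mongoose.Schema({
  title: String,
  body: String
});

// push every saved document into the index
ArticleSchema.post('save', function (doc) {
  client.index({
    index: 'articles',
    type: 'article',
    id: doc._id.toString(),
    body: { title: doc.title, body: doc.body }
  }, function (err) {
    if (err) console.error('Indexing failed', err);
  });
});

// keep deletes in step with the index
ArticleSchema.post('remove', function (doc) {
  client.delete({ index: 'articles', type: 'article', id: doc._id.toString() }, function (err) {
    if (err) console.error('Removing from index failed', err);
  });
});

module.exports = mongoose.model('Article', ArticleSchema);
In production you'd probably batch these into bulk requests and add retries, but the idea is the same: your application owns the index instead of a river.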

A river is a good solution when you want near-real-time synchronization and a general-purpose setup.
If you already have data in MongoDB and want to ship it to Elasticsearch very easily as a "one-shot" import, you can try my Node.js package: https://github.com/itemsapi/elasticbulk.
It uses Node.js streams, so you can import data from anything that supports streams (e.g. MongoDB, PostgreSQL, MySQL, JSON files, etc.).
Example for MongoDB to Elasticsearch:
Install packages:
npm install elasticbulk
npm install mongoose
npm install bluebird
Create a script, e.g. script.js:
const elasticbulk = require('elasticbulk');
const mongoose = require('mongoose');
const Promise = require('bluebird');
mongoose.connect('mongodb://localhost/your_database_name', {
  useMongoClient: true
});
mongoose.Promise = Promise;

var Page = mongoose.model('Page', new mongoose.Schema({
  title: String,
  categories: Array
}), 'your_collection_name');

// stream query
var stream = Page.find({}, { title: 1, _id: 0, categories: 1 })
  .limit(1500000)
  .skip(0)
  .batchSize(500)
  .stream();

elasticbulk.import(stream, {
  index: 'my_index_name',
  type: 'my_type_name',
  host: 'localhost:9200',
})
.then(function(res) {
  console.log('Importing finished');
});
Ship your data:
node script.js
It's not extremely fast, but it works for millions of records (thanks to streams).

Here is another good option for migrating your MongoDB data to Elasticsearch: Monstache, a Go daemon that syncs MongoDB to Elasticsearch in real time.
It's available at: Monstache
Below are the initial steps to configure and use it.
Step 1:
C:\Program Files\MongoDB\Server\4.0\bin>mongod --smallfiles --oplogSize 50 --replSet test
Step 2 :
C:\Program Files\MongoDB\Server\4.0\bin>mongo
MongoDB shell version v4.0.2
connecting to: mongodb://127.0.0.1:27017
MongoDB server version: 4.0.2
Server has startup warnings:
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten]
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** WARNING: Access control is not enabled for the database.
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** Read and write access to data and configuration is unrestricted.
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten]
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** WARNING: This server is bound to localhost.
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** Remote systems will be unable to connect to this server.
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** Start the server with --bind_ip <address> to specify which IP
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** addresses it should serve responses from, or with --bind_ip_all to
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** bind to all interfaces. If this behavior is desired, start the
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten] ** server with --bind_ip 127.0.0.1 to disable this warning.
2019-01-18T16:56:44.931+0530 I CONTROL [initandlisten]
MongoDB Enterprise test:PRIMARY>
Step 3 : Verify the replication.
MongoDB Enterprise test:PRIMARY> rs.status();
{
"set" : "test",
"date" : ISODate("2019-01-18T11:39:00.380Z"),
"myState" : 1,
"term" : NumberLong(2),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"heartbeatIntervalMillis" : NumberLong(2000),
"optimes" : {
"lastCommittedOpTime" : {
"ts" : Timestamp(1547811537, 1),
"t" : NumberLong(2)
},
"readConcernMajorityOpTime" : {
"ts" : Timestamp(1547811537, 1),
"t" : NumberLong(2)
},
"appliedOpTime" : {
"ts" : Timestamp(1547811537, 1),
"t" : NumberLong(2)
},
"durableOpTime" : {
"ts" : Timestamp(1547811537, 1),
"t" : NumberLong(2)
}
},
"lastStableCheckpointTimestamp" : Timestamp(1547811517, 1),
"members" : [
{
"_id" : 0,
"name" : "localhost:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 736,
"optime" : {
"ts" : Timestamp(1547811537, 1),
"t" : NumberLong(2)
},
"optimeDate" : ISODate("2019-01-18T11:38:57Z"),
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"electionTime" : Timestamp(1547810805, 1),
"electionDate" : ISODate("2019-01-18T11:26:45Z"),
"configVersion" : 1,
"self" : true,
"lastHeartbeatMessage" : ""
}
],
"ok" : 1,
"operationTime" : Timestamp(1547811537, 1),
"$clusterTime" : {
"clusterTime" : Timestamp(1547811537, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
MongoDB Enterprise test:PRIMARY>
Step 4.
Download Monstache from "https://github.com/rwynn/monstache/releases".
Unzip the download and adjust your PATH variable to include the path to the folder for your platform.
Go to cmd and type "monstache -v"
# 4.13.1
Monstache uses the TOML format for its configuration. Create the configuration file for the migration, named config.toml.
Step 5.
My config.toml -->
mongo-url = "mongodb://127.0.0.1:27017/?replicaSet=test"
elasticsearch-urls = ["http://localhost:9200"]
direct-read-namespaces = [ "admin.users" ]
gzip = true
stats = true
index-stats = true
elasticsearch-max-conns = 4
elasticsearch-max-seconds = 5
elasticsearch-max-bytes = 8000000
dropped-collections = false
dropped-databases = false
resume = true
resume-write-unsafe = true
resume-name = "default"
index-files = false
file-highlighting = false
verbose = true
exit-after-direct-reads = false
index-as-update=true
index-oplog-time=true
Step 6.
D:\15-1-19>monstache -f config.toml

I found mongo-connector useful. It is from MongoDB Labs (MongoDB Inc.) and can now be used with Elasticsearch 2.x.
Elastic 2.x doc manager: https://github.com/mongodb-labs/elastic2-doc-manager
mongo-connector creates a pipeline from a MongoDB cluster to one or more target systems, such as Solr, Elasticsearch, or another MongoDB cluster. It synchronizes data in MongoDB to the target then tails the MongoDB oplog, keeping up with operations in MongoDB in real-time. It has been tested with Python 2.6, 2.7, and 3.3+. Detailed documentation is available on the wiki.
https://github.com/mongodb-labs/mongo-connector
https://github.com/mongodb-labs/mongo-connector/wiki/Usage%20with%20ElasticSearch
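For reference, a typical setup looks roughly like this; the pip extras and doc manager name are taken from the project wiki linked above, so double-check them against the current docs, and note that the source MongoDB instance must be running as a replica set:
pip install 'mongo-connector[elastic2]'
mongo-connector -m localhost:27017 -t localhost:9200 -d elastic2_doc_manager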

Here's how to do this on MongoDB 3.0. I used this nice blog:
Install mongodb.
Create data directories:
$ mkdir RANDOM_PATH/node1
$ mkdir RANDOM_PATH/node2
$ mkdir RANDOM_PATH/node3
Start the mongod instances:
$ mongod --replSet test --port 27021 --dbpath node1
$ mongod --replSet test --port 27022 --dbpath node2
$ mongod --replSet test --port 27023 --dbpath node3
Configure the Replica Set:
$ mongo
config = {_id: 'test', members: [ {_id: 0, host: 'localhost:27021'}, {_id: 1, host: 'localhost:27022'}]};
rs.initiate(config);
Installing Elasticsearch:
a. Download and unzip the latest Elasticsearch distribution
b. Run bin/elasticsearch to start the es server.
c. Run curl -XGET http://localhost:9200/ to confirm it is working.
Installing and configuring the MongoDB River:
$ bin/plugin --install com.github.richardwilly98.elasticsearch/elasticsearch-river-mongodb
$ bin/plugin --install elasticsearch/elasticsearch-mapper-attachments
Create the “River” and the Index:
curl -XPUT 'http://localhost:9200/_river/mongodb/_meta' -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "mydb",
    "collection": "foo"
  },
  "index": {
    "name": "name",
    "type": "random"
  }
}'
Test on browser:
http://localhost:9200/_search?q=home

Since mongo-connector now appears dead, my company decided to build a tool for using Mongo change streams to output to Elasticsearch.
Our initial results look promising. You can check it out at https://github.com/electionsexperts/mongo-stream. We're still early in development, and would welcome suggestions or contributions.

Related

Getting error message while replicating database in mongodb windows

I am trying to replicate a database in MongoDB 4.0 across two different Windows servers. I followed all the steps of MongoDB replication. While I was configuring the secondary database from the primary through the below command
rsconf={_id:"myset",
members:[{_id:0,host:"PrimaryServerIP:27017"},{_id:1,host:"SecondaryServerIP:27017"}]}
Then while I was reconfiguring the rsconf through
rs.reconfig(rsconf)
I was getting the below error message
"Support for replication protocol version 0 was removed in MongoDB 4.0. Downgrade to MongoDB version 3.6 and upgrade your protocol version to 1 before upgrading your MongoDB version"
I tried to update the replication protocol by downgrading the MongoDB to 3.6 through the below command
rsconf = rs.conf()
rsconf.protocolVersion=1
rs.reconfig(rsconf)
But I am still getting the same error message. Is there any way to update the protocol version to 1 in MongoDB 4.0?
Also, I tried the same replication steps in MongoDB 3.6 and it is working fine. Any help will be highly appreciated.
I had a similar issue and when I added protocolVersion to my new config it worked.
rsconf = {
  _id: "myset",
  protocolVersion: NumberLong(1),
  members: [
    { _id: 0, host: "PrimaryServerIP:27017" },
    { _id: 1, host: "SecondaryServerIP:27017" }
  ]
}
I was initially having this issue trying to reset my rs config to blank so I could start over; I actually did this in the mongo shell
cfg = {
"_id" : "rs0",
"version" : 2,
"protocolVersion" : NumberLong(1),
"members" : [
{
"_id" : 0,
"host" : "myhost:27017"
}
]
}
rs.reconfig(cfg, {force:true})

mongodb show dbs listDatabases failed

I am new to MongoDB.
I just installed MongoDB on my Mac
after watching this YouTube video:
https://www.youtube.com/watch?v=pWbMrx5rVBE&t=369s
In the mongo shell, I entered show dbs and I get weird output.
Kindly help me to understand and solve this.
> show dbs
2017-09-11T02:45:34.298+0530 E QUERY [thread1] Error: listDatabases failed:{
"ok" : 0,
"errmsg" : "unable to open cursor at URI statistics:table:collection-2-2362555297355466682. reason: No such file or directory",
"code" : 43,
"codeName" : "CursorNotFound"
} :
_getErrorWithCode#src/mongo/shell/utils.js:25:13
Mongo.prototype.getDBs#src/mongo/shell/mongo.js:62:1
shellHelper.show#src/mongo/shell/utils.js:769:19
shellHelper#src/mongo/shell/utils.js:659:15
#(shellhelp2):1:1
After some research, I found the solution and am sharing all I learned so no new learner struggles like me.
My mistake:
I was starting mongodb using:
mongod --config "c:\MongoDB\Mongod.cfg"
and starting the mongo shell by just running
mongo
Starting MongoDB and the Shell
Then I understood what each command does :
|*| Start Mongo DB with default config :
mongod
|*| Start Mongo DB with config file :
mongod -f "c:\MongoDB\Mongods.cfg"
|Or|
mongod --config "c:\MongoDB\Mongod.cfg"
|*| Start Mongo DB with config flags :
mongod --dbpath "c:\mongodb\data\nameMdb" --logpath "c:\mongodb\log\nameMdbLog.log" --directoryperdb --logappend
So, correspondingly, we should start the mongo shell with matching options :
|*| Start Mongo shell with default config :
mongo
|*| Start Mongo shell with localhost config flags :
mongo --host localhost --port 28888
|*| Start Mongo shell with public config flags and user details :
mongo --username <user> --password <pass> --host <Host.IP.Adrs> --port 28888
|*| Creating Config File is explained here with example :
https://github.com/mongodb/mongo/blob/master/rpm/mongod.conf
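For illustration, a minimal YAML config file mirroring the flags above might look like this (the paths are the same example paths used earlier; adjust them to your machine):
# c:\MongoDB\Mongod.cfg
systemLog:
  destination: file
  path: c:\mongodb\log\nameMdbLog.log
  logAppend: true
storage:
  dbPath: c:\mongodb\data\nameMdb
  directoryPerDB: true
net:
  bindIp: 127.0.0.1
  port: 27017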
I had a similar issue when configuring MongoDB for replication.
When I ran the commands below:
mongo
show dbs
I get the error:
> show dbs
uncaught exception: Error: listDatabases failed:{
"topologyVersion" : {
"processId" : ObjectId("60ddea05beb1d89d4d139546"),
"counter" : NumberLong(0)
},
"ok" : 0,
"errmsg" : "not master and slaveOk=false",
"code" : 13435,
"codeName" : "NotPrimaryNoSecondaryOk"
} :
_getErrorWithCode#src/mongo/shell/utils.js:25:13
Mongo.prototype.getDBs/<#src/mongo/shell/mongo.js:147:19
Mongo.prototype.getDBs#src/mongo/shell/mongo.js:99:12
shellHelper.show#src/mongo/shell/utils.js:937:13
shellHelper#src/mongo/shell/utils.js:819:15
#(shellhelp2):1:1
Here's how I fixed it:
The issue was that I had enabled the replication feature in the /etc/mongod.conf file without initializing the replication, so MongoDB could not tell which replica was the primary replica or secondary replica.
All I had to do was to comment out the replication feature in the /etc/mongod.conf file since I was not yet ready to set up replication:
#replication
# replSetName: my-replica-set-name
After which I restarted the MongoDB server:
sudo systemctl restart mongod
This time the command ran fine.
I experienced this issue today, installing the latest version of MongoDB with Homebrew and then launching the mongo shell and entering the command "show dbs". I tested this multiple times and spent some time researching it. The symptoms match an issue that was reported here: https://jira.mongodb.org/browse/SERVER-20753 where the issue was described as building WiredTiger separately from MongoDB and using an out of date version of WiredTiger.
While this is not the case for what you and I have experienced (note that Homebrew is currently installing 3.4.9 and WiredTiger 2.9.2), I guessed that it could be a similar mismatch between WiredTiger and MongoDB, so I decided to try installing a different version.
I ended up installing the latest "dev" version using this Homebrew command:
brew install mongodb --devel
This installs MongoDB 3.5.13 and WiredTiger 3.0.0 which do not have the issue. Note that 3.4.9 was released the day you reported this issue and 3.5.13 was released the next day, although 3.4.9 is still the current community edition listed here: https://docs.mongodb.com/manual/tutorial/install-mongodb-on-os-x/
In my case, the problem was the replica set host name.
I had changed my hostname in the Sharing preferences, and it no longer matched the host name in the previously installed replSet config.
You can see it with rs.conf() in the mongo shell, like this:
"members" : [
{
"_id" : 0,
"host" : "MBA.local:27777",
"arbiterOnly" : false,
"buildIndexes" : true,
"hidden" : false,
"priority" : 1,
"tags" : {
},
"slaveDelay" : NumberLong(0),
"votes" : 1
}
],
See the host: section and change it with the rs.reconfig({object}) command in the mongo shell,
or just add the line below to the /etc/hosts file.
127.0.0.1 MBA.local
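For example, a minimal sketch of fixing the host in the existing config from the mongo shell (the hostname and port here are just the ones from this example; use your machine's current hostname):
cfg = rs.conf()
cfg.members[0].host = "MBA.local:27777"  // set this to your current hostname:port
rs.reconfig(cfg, { force: true })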

How to migrate from MMAPv1 to WiredTiger with minimal downtime without mongodump/mongorestore

Most guidelines recommend using mongodump/mongorestore, but for large production databases the downtime can be very long.
You can use replication and an additional server for this, or the same server if the load allows.
You need 3 running MongoDB instances:
The server you want to update (remember that WiredTiger has been supported since 3.0).
A second instance of MongoDB, which can be run on an additional server. The database will be temporarily copied to it by replication.
And a third instance of MongoDB, the arbiter, which doesn't store data and only participates in the election of the primary server. The arbiter can be run on the additional server on a separate port.
In any case, you need to back up your database. You can run “mongodump” without parameters and a “./dump” directory will be created with the database dump. You can use the “--gzip“ parameter to compress the result.
mongodump --gzip
Just in case, the command to restore:
mongorestore --gzip
It should be run in the same directory that contains the “./dump” dir, and the “--gzip“ parameter should be added if it was used with “mongodump”.
Begin configuring from the additional server. My target system is Linux RedHat without Internet access, so I download and install MongoDB via RPM manually. Add this section to /etc/mongod.conf:
replication:
  oplogSizeMB: 10240
  replSetName: REPLICA
Check that the net section looks like this to allow access from other servers:
net:
  bindIp: 0.0.0.0
  port: 27017
and run:
service mongod start
Run the third MongoDB instance - arbiter. It can work on the additional server on a different port. Create a temporary directory for the arbiter database:
mkdir /tmp/mongo
chmod 777 -R /tmp/mongo
and run:
mongod --dbpath /tmp/mongo --port 27001 --replSet REPLICA \
--fork --logpath /tmp/mongo/db1.log
Now configure the main server. Edit /etc/mongod.conf
replication:
  oplogSizeMB: 10240
  replSetName: REPLICA
and restart MongoDB on the main server:
service mongod restart
It’s important! After restarting the main server, read operations may be unavailable. I was getting the following error:
{ "ok" : 0, "errmsg" : "node is recovering", "code" : 13436 }
So as quickly as possible you need to connect to MongoDB on the main server via “mongo” console and run the following command to configure replication:
rs.initiate(
  {
    _id: "REPLICA",
    members: [
      { _id: 0, host: "<IP address of main server>:27017", priority: 1.0 },
      { _id: 1, host: "<IP address of additional server>:27017", priority: 0.5 },
      { _id: 2, host: "<IP address of additional server(the arbiter)>:27001", arbiterOnly: true, priority: 0.5 }
    ]
  }
)
After this operation all actions with MongoDB will be available and data synchronization will be started.
I don’t recommend using rs.initiate() on the main server without parameters, as most tutorials do, because the name of the main server will be configured by default as the DNS name from /etc/hostname. That's not very convenient for me because I use IP addresses for communication in my projects.
To check the synchronization progress you can call from “mongo” console:
rs.status()
Result example:
{
"set" : "REPLICA",
"date" : ISODate("2017-01-19T14:30:34.292Z"),
"myState" : 1,
"term" : NumberLong(1),
"heartbeatIntervalMillis" : NumberLong(2000),
"members" : [
{
"_id" : 0,
"name" : "<IP address of main server>:27017",
"health" : 1.0,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 165,
"optime" : {
"ts" : Timestamp(6377323060650835, 3),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2017-01-19T14:30:33.000Z"),
"infoMessage" : "could not find member to sync from",
"electionTime" : Timestamp(6377322974751490, 1),
"electionDate" : ISODate("2017-01-19T14:30:13.000Z"),
"configVersion" : 1,
"self" : true
},
{
"_id" : 1,
"name" : "<IP address of additional server>:27017",
"health" : 1.0,
"state" : 5,
"stateStr" : "STARTUP2",
"uptime" : 30,
"optime" : {
"ts" : Timestamp(0, 0),
"t" : NumberLong(-1)
},
"optimeDate" : ISODate("1970-01-01T00:00:00.000Z"),
"lastHeartbeat" : ISODate("2017-01-19T14:30:33.892Z"),
"lastHeartbeatRecv" : ISODate("2017-01-19T14:30:34.168Z"),
"pingMs" : NumberLong(3),
"syncingTo" : "<IP address of main server>:27017",
"configVersion" : 1
},
{
"_id" : 2,
"name" : "<IP address of additional server (the arbiter)>:27001",
"health" : 1.0,
"state" : 7,
"stateStr" : "ARBITER",
"uptime" : 30,
"lastHeartbeat" : ISODate("2017-01-19T14:30:33.841Z"),
"lastHeartbeatRecv" : ISODate("2017-01-19T14:30:30.158Z"),
"pingMs" : NumberLong(0),
"configVersion" : 1
}
],
"ok" : 1.0
}
Once the “stateStr” of the additional server changes from “STARTUP2” to “SECONDARY”, our servers are synchronized.
While we wait for the synchronization to finish, we need to modify the client applications a little bit so they can work with all servers in the replica set.
If you use the ConnectionString, you should replace it with something like:
mongodb://<IP address of main server>:27017,<IP address of additional server>:27017,<IP address of additional server (the arbiter)>:27001/?replicaSet=REPLICA
If you use the legacy C++ mongo-cxx-driver, as I do, you should use mongo::DBClientReplicaSet instead of mongo::DBClientConnection and list all three servers in the connection parameters, including the arbiter.
There is a third option: you can simply change the IP of the MongoDB server in the clients after switching PRIMARY and SECONDARY, but it's not a very clean solution.
After the synchronization has finished and the additional server's status has settled as SECONDARY, we need to switch PRIMARY and SECONDARY by executing the following commands in the “mongo” console on the main server. This is important because the commands will not work on the additional server.
cfg = rs.conf()
cfg.members[0].priority = 0.5
cfg.members[1].priority = 1
cfg.members[2].priority = 0.5
rs.reconfig(cfg)
Then check server status by executing:
rs.status()
Stop the MongoDB on the main server
service mongod stop
and simply delete the entire contents of the database directory. This is safe, because we have a working copy on the additional server, and we made a backup at the beginning. Be careful: MongoDB doesn’t create the database directory itself. If you've deleted it, you need not only to recreate it:
mkdir /var/lib/mongo
but also to set up the owner:
chown -R mongod:mongod /var/lib/mongo
Check that the wiredTiger storage engine is configured in /etc/mongod.conf. Since 3.2 it is used by default:
storage:
  ...
  engine: wiredTiger
  ...
And run MongoDB:
service mongod start
The main server will get the configuration from the secondary server automatically and data will be synced back to WiredTiger storage.
After the synchronization is finished, switch the PRIMARY server back. This operation should be performed on the additional server because it is the PRIMARY now.
cfg = rs.conf()
cfg.members[0].priority = 1
cfg.members[1].priority = 0.5
cfg.members[2].priority = 0.5
rs.reconfig(cfg)
Return to the old version of the database clients or change the ConnectionString back.
Now turn off replication if necessary. Remove the 2 other replica members from the main server:
rs.remove("<IP address of additional server>:27017")
rs.remove("<IP address of additional server (the arbiter)>:27001")
Remove the whole “replication” section from /etc/mongod.conf and restart MongoDB:
service mongod restart
After this, we get the following warning when connecting via the “mongo” console:
2017-01-19T12:26:51.948+0300 I STORAGE [initandlisten] ** WARNING: mongod started without --replSet yet 1 documents are present in local.system.replset
2017-01-19T12:26:51.948+0300 I STORAGE [initandlisten] ** Restart with --replSet unless you are doing maintenance and no other clients are connected.
2017-01-19T12:26:51.948+0300 I STORAGE [initandlisten] ** The TTL collection monitor will not start because of this.
To get rid of it, you need to remove the “local” database. In the default state there is only one collection, “startup_log”, in this database, so you can do this without fear via the “mongo” console:
use local
db.dropDatabase()
and restart MongoDB:
service mongod restart
If you remove the “local” database before removing the “replication” section from /etc/mongod.conf, it is immediately recreated, so I could not get away with only one MongoDB restart.
On the additional server perform the same action:
remove “replication“ section from /etc/mongod.conf
restart MongoDB
drop the “local“ database
restart again
As for the arbiter, just stop it and remove its data:
pkill -f /tmp/mongo
rm -r /tmp/mongo

MongoDB shell authentication error

I created a user and password for a database called student:
db.createUser({user:'Catalin',pwd:'Catalin',roles:[{role:'userAdmin',db:'student'}]});
Then I restarted the MongoDB server with this command:
mongod --auth --dbpath C:\data\db
In another terminal, I connected to the server with:
mongo
then queried the server with :
> db.getUsers()
[
{
"_id" : "student.Catalin",
"user" : "Catalin",
"db" : "student",
"roles" : [
{
"role" : "userAdmin",
"db" : "student"
}
]
}
]
went to the student database:
> use student
switched to db student
Entered my username and password successfully via this command:
> db.auth('Catalin','Catalin');
1
and when I want to view my collections I get an error. Why?
> show collections
2016-03-07T15:54:41.166+0300 E QUERY [thread1] Error: listCollections failed:
{
"ok" : 0,
**"errmsg" : "not authorized on student to execute command { listCollectio
ns: 1.0, filter: {} }",**
"code" : 13
} :
_getErrorWithCode#src/mongo/shell/utils.js:23:13
DB.prototype._getCollectionInfosCommand#src/mongo/shell/db.js:746:1
DB.prototype.getCollectionInfos#src/mongo/shell/db.js:758:15
DB.prototype.getCollectionNames#src/mongo/shell/db.js:769:12
shellHelper.show#src/mongo/shell/utils.js:695:9
shellHelper#src/mongo/shell/utils.js:594:15
#(shellhelp2):1:1
"errmsg" : "not authorized on student to execute command { listCollectio
ns: 1.0, filter: {} }",
P.S. : I'm using mongoDB 3.2
The userAdmin built-in role only provides the ability to create and modify roles and users on a database. If you need access to the data, you would need to assign database roles, or other roles that have database access, such as dbOwner.
Please see Built-in Roles for more detailed information.
You may also find these useful:
Manage users and roles.
Enable client access control.
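For example, one option (just a sketch; pick whichever role fits your needs) is to grant the existing user a role with data access from the mongo shell:
use student
db.auth('Catalin','Catalin')
db.grantRolesToUser('Catalin', [ { role: 'dbOwner', db: 'student' } ])
show collections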
If you are using MongoDB version 5 or earlier, pay attention to this point. You should not write anything in the mongod.conf file at the /etc/mongod.conf path. Use the following two commands to update the dependencies.
install dependencies :
sudo apt install libgconf-2-4
sudo add-apt-repository universe

How to convert a MongoDB replica set to a stand alone server

Consider that I have 4 members in my replica set and the config is as follows:
{
"_id": "rs_0",
"version": 5,
"members" : [
{"_id": 1, "host": "127.0.0.1:27001"},
{"_id": 2, "host": "127.0.0.1:27002"},
{"_id": 3, "host": "127.0.0.1:27003"},
{"_id": 4, "host": "127.0.0.1:27004"}
]
}
I am able to connect to all members using
mongo --port <port>
There is documentation on Convert a Standalone to a Replica Set, but can anyone tell me how to convert back to a standalone from a replica set?
Remove all secondary hosts from the replica set (rs.remove('host:port')), restart the mongo daemon without the replSet parameter (editing /etc/mongo.conf), and the secondary hosts will start in standalone mode again.
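A quick sketch of that first step from the mongo shell on the primary (host names are illustrative):
rs.remove('secondary-1.example.net:27017')
rs.remove('secondary-2.example.net:27017')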
The primary host is the tricky one, because you can't remove it from the replica set with rs.remove.
Once you have only the primary node left in the replica set, you should exit the mongo shell and stop mongo. Then you edit /etc/mongo.conf, remove the replSet parameter, and start mongo again.
Once you start mongo you are already in standalone mode, but the mongo shell will print a message like:
2015-07-31T12:02:51.112+0100 [initandlisten] ** WARNING: mongod started without --replSet yet 1 documents are present in local.system.replset
To remove the warning you can follow one of 2 procedures:
1) Dropping the local db and restarting mongo:
use local
db.dropDatabase();
/etc/init.d/mongod restart
2) Or, if you don't want to be so radical, you can do:
use local
db.system.replset.find()
and it will prompt a message like:
{ "_id" : "replicaSetName", "version" : 1, "members" : [ { "_id" : 0, "host" : "hostprimary:mongoport" } ] }
then you will erase it using:
db.system.replset.remove({ "_id" : "replicaSetName", "version" : 1, "members" : [ { "_id" : 0, "host" : "hostprimary:mongoport" } ] })
and it will probably prompt:
WriteResult({ "nRemoved" : 1 })
Now you can restart mongo and the warning should be gone, and you will have your mongo in standalone mode without warnings.
Just remove a host from replica set (rs.remove('host:port')), relaunch it without replSet parameter and it's standalone again.
On an Ubuntu Machine
Stop your mongo server
open /etc/mongod.conf
Comment out the replication and replSetName lines:
#replication:
#replSetName: rs0
Start your mongo server and go to mongo shell
drop local database
use local
db.dropDatabase()
Restart mongo
The MongoDB documentation suggests the following to perform maintenance on a replica set member, which brings the replica set member into standalone mode for further operations. With a little modification it can be made permanently standalone:
If the node in question is the only node in a shard, drain the chunks to other shards as per the MongoDB documentation here, or else the sharded database will break, i.e.:
Make sure the balancer is enabled by connecting to mongos and running sh.startBalancer(timeout, interval)
For the shard in question, go to the admin database and run db.adminCommand( { removeShard: "mongodb0" } )
Check the draining status by repeating the above removeShard command, and wait for draining to complete
If the node in question is the primary, run rs.stepDown(300)
Stop the node by running db.shutdownServer()
Change the yaml config by (see the config sketch after this list):
commenting out replication.replSetName (--replSet on the command line)
commenting out sharding.clusterRole for a shard or config server (--shardsvr and --configsvr on the command line)
commenting out net.port, or changing it to a different port (--port on the command line)
Start the mongod instance
If the change is permanent, go to the other mongod instances and run rs.remove("host:port")
After this, the node in question should be up and running in standalone mode.
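For reference, a rough sketch of what the edited YAML config might look like for that step (the set name, cluster role, and port values are illustrative):
#replication:
#  replSetName: rs0
#sharding:
#  clusterRole: shardsvr
net:
  port: 27217  # temporarily moved off the original port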
Follow the steps below:
Go to the mongo shell on the secondary servers.
Stop the secondary servers by using the commands below:
use admin
db.shutdownServer()
Go to the Linux shell on the secondary servers and type the command below:
sudo service mongod stop
Starting the MongoDB servers again -
Go to the Linux shell on the secondary servers and type the command below:
sudo service mongod start
Starting the MongoDB replication -
Go to the primary and type the commands below to start the replication:
a] rs.initiate()
b] rs.add("Secondary-1:port no")
c] rs.add("Secondary-2:port no")
d] rs.add({ "_id" : 3, "host" : "Hidden_member:port no", "priority" : 0, "hidden" : true })
e] rs.status()