MongoDB Not Responding on db.collection.createIndex()

Currently I'm using MongoDB for my big data project. I have installed MongoDB on a CentOS 7 server with 32 GB RAM connected to 12 TB of NFS storage. So far, these are my database statistics:
web-analyzer 43.933 GB
web-crawler 109.900 GB
web-crawler2 339.788 GB
The problem is, whenever I run createIndex() on my collection, MongoDB always ends up not responding (I cannot execute db.collection.count() or 'show dbs / collections' commands), so I terminate the job using CTRL + C. After that I cannot shut down MongoDB using 'kill <pid>' or 'mongod --shutdown', so I have to reboot the server. Does anyone know the cause of this problem and how to solve it?
This is my 'top' output for the mongod process:
Before running the query:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2896 mongod 20 0 0.965t 99220 70964 S 0.3 0.3 0:00.33 mongod
After running createIndex():
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2896 mongod 20 0 0.965t 2.465g 2.411g S 2.7 7.9 0:26.03 mongod
Thank you :)

By default, building an index is a blocking operation that locks the database until it completes. When you already have a lot of data in your collection, this can take a very long time. But you can build the index in the background by using the background: true option.
db.collection.createIndex({ keyfield: 1, otherkeyfield: 1 }, { background: true });
It will take longer overall, but the database will keep responding while the index builds.
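If you want to check from a second shell whether a background build is still making progress (rather than assuming the server has hung), db.currentOp() lists the active operations. A minimal sketch, assuming the progress message contains "Index Build" as it does on this MongoDB version:
// run from another mongo shell while the index build is in progress
db.currentOp({ msg: /Index Build/ })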
For more information about creating indexes, please refer to the documentation.

Related

mongo MongoCursorNotFoundException in long-running query loop

I have a simple query loop that gets a MongoCursorNotFoundException after processing about 44,000 of 96,945 documents in around 93 minutes.
MongoIterable<MasterDocument> query = masterCollection.find().noCursorTimeout(true);
for (MasterDocument masterDocument : query) { ... do some stuff ... }
The "do some stuff" part takes a while, which is why the entire loop takes so long.
My problem is that I get this exception after handling maybe half of the documents in the collection.
I am running both the client application and the mongod server locally on my Windows 10 laptop, accessing the server via localhost.
The server log shows lots of messages like this:
{"t":{"$date":"2021-01-04T20:21:35.510-08:00"},"s":"I", "c":"COMMAND", "id":51803, "ctx":"conn27","msg":"Slow query","attr":{"type":"command","ns":"master_database.MasterCollection","command":{"find":"MasterCollection","filter":{"hashCode":1753339282},"$db":"master_database","lsid":{"id":{"$uuid":"6a252f51-2c6e-4c01-ae03-1a80aab109e0"}}},"planSummary":"COLLSCAN","keysExamined":0,"docsExamined":96944,"cursorExhausted":true,"numYields":96,"nreturned":0,"queryHash":"DBC59907","planCacheKey":"DBC59907","reslen":121,"locks":{"ReplicationStateTransition":{"acquireCount":{"w":97}},"Global":{"acquireCount":{"r":97}},"Database":{"acquireCount":{"r":97}},"Collection":{"acquireCount":{"r":97}},"Mutex":{"acquireCount":{"r":1}}},"storage":{},"protocol":"op_msg","durationMillis":147}}
The last of these messages is followed by:
{"t":{"$date":"2021-01-04T20:21:35.521-08:00"},"s":"I", "c":"NETWORK", "id":22944, "ctx":"conn27","msg":"Connection ended","attr":{"remote":"127.0.0.1:58990","connectionId":27,"connectionCount":14}}
{"t":{"$date":"2021-01-04T20:21:35.522-08:00"},"s":"I", "c":"NETWORK", "id":22944, "ctx":"conn26","msg":"Connection ended","attr":{"remote":"127.0.0.1:58989","connectionId":26,"connectionCount":13}}
{"t":{"$date":"2021-01-04T20:21:35.922-08:00"},"s":"I", "c":"-", "id":20883, "ctx":"conn25","msg":"Interrupted operation as its client disconnected","attr":{"opId":310196}}
I have tried:
Using "noCursorTimeout(true)" on the query cursor (as shown above)
Starting the server with "mongod --setParameter localLogicalSessionTimeoutMinutes=240". The latter seems to have caused additional log messages that say "error":"Location13111: wrong type for field (expireAfterSeconds) long != int"
I am using mongod 4.4 and the latest MongoDB Java API.
You may need to increase the default cursor idle timeout to a bigger value on all shards and mongos instances.
Check the parameter (the default is 10 min = 600000 ms):
use admin
db.runCommand({ getParameter: 1, cursorTimeoutMillis: 1 })
and update it to a bigger value:
use admin
db.runCommand({ setParameter: 1, cursorTimeoutMillis: 600000000 })
Also, the COLLSCAN in your logs indicates that your query does not use an index; you may need to create one on "hashCode", as sketched below.
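A minimal sketch of creating that index from the mongo shell, assuming the namespace shown in your log (master_database.MasterCollection):
use master_database
db.MasterCollection.createIndex({ hashCode: 1 })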
Thanks for the response.
It turned out that my application ran to completion once I started mongod with "--setParameter localLogicalSessionTimeoutMinutes=240", despite the error message that I saw in the console log.
You are absolutely right that I should have an index on "hashCode". (I had one before but forgot to recreate it after recreating the collection.)

Are these correct parameters for pgbouncer.ini and postgresql.conf?

I have a pgbouncer.ini file with the configuration below:
[databases]
test_db = host=localhost port=5432 dbname=test_db
[pgbouncer]
logfile = /var/log/postgresql/pgbouncer.log
pidfile = /var/run/postgresql/pgbouncer.pid
listen_addr = 0.0.0.0
listen_port = 5433
unix_socket_dir = /var/run/postgresql
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
admin_users = postgres
#pool_mode = transaction
pool_mode = session
server_reset_query = RESET ALL;
ignore_startup_parameters = extra_float_digits
max_client_conn = 25000
autodb_idle_timeout = 3600
default_pool_size = 250
max_db_connections = 250
max_user_connections = 250
and in my postgresql.conf file I have:
max_connections = 2000
Does the max_connections setting in my postgresql.conf affect performance badly? Or does it not matter because the connections are already handled by PgBouncer?
One more question: in the PgBouncer configuration, is listen_addr = 0.0.0.0 correct, or should it be listen_addr = *?
Is it better to set default_pool_size on PgBouncer equal to the number of CPU cores available on this server?
Should default_pool_size, max_db_connections, and max_user_connections all be set to the same value?
So the idea of using PgBouncer is to pool connections when you can't afford a higher max_connections in PostgreSQL itself.
NOTE: Please DO NOT set max_connections to a number like 2000 just like that.
Let's start with an example: if you have a connection limit of 20 and your app or organization wants 1000 connections at a given time, that is where a pooler comes into the picture; in this specific case you want those 20 server connections to serve the 1000 coming in from the application.
To understand how it actually works, let's take a step back and look at what happens when you do not have a connection pooler and rely only on the PostgreSQL setting for max connections, which in our case is 20.
When a connection comes in from a client/application, the main PostgreSQL process (the postmaster, the parent process) spawns a child for it. So each new connection spawns a child process under the main postgres process, like so:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
24379 postgres 20 0 346m 148m 122m R 61.7 7.4 0:46.36 postgres: sysbench sysbench ::1(40120)
24381 postgres 20 0 346m 143m 119m R 62.7 7.1 0:46.14 postgres: sysbench sysbench ::1(40124)
24380 postgres 20 0 338m 137m 121m R 57.7 6.8 0:46.04 postgres: sysbench sysbench ::1(40122)
24382 postgres 20 0 338m 129m 115m R 57.4 6.5 0:46.09 postgres: sysbench sysbench ::1(40126)
So once a connection request is sent, it is received by the postmaster process, which creates a child process at the OS level under the main parent process. This connection then has an unlimited lifespan unless it is closed by the application or you have a timeout set for idle connections in PostgreSQL.
Here is where managing connections on a given amount of compute can become very costly once they exceed a certain limit. Meaning: n connections have a given compute cost when served, and beyond some point the OS can no longer handle a huge number of connections and you get contention at different levels (memory, CPU, I/O).
What if you could reuse the already-spawned child processes (backends) when they are not doing any work? You would save the time of creating a new child process (backend) as well as the additional cost (which can vary). This is where a pool of connections that are always open and serve different client requests comes in; this is called pooling.
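To see those backend processes from inside the database rather than with top, the standard pg_stat_activity view shows one row per backend; an illustrative query, not specific to this setup:
SELECT pid, usename, state, query FROM pg_stat_activity;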
So basically you have only n server connections available, but the pooler can manage n+i client connections to serve the requests.
This is where PgBouncer helps reuse connections. It can be configured with three pooling modes: session pooling, statement pooling, and transaction pooling. Basically, the bouncer returns a connection to the pool once the statement-level or transaction-level work is done; only with session pooling does it keep the connection until the client disconnects.
So, basically, lower the number of connections at the PostgreSQL config level and tune the settings in pgbouncer.ini; a rough sketch follows below.
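As a rough illustration only (the numbers are placeholders, not tuned recommendations, and depend entirely on your hardware and workload), the shape of the change looks something like this:
# postgresql.conf: keep the hard server-side limit modest
max_connections = 300
# pgbouncer.ini: let the pooler absorb the large client count
[pgbouncer]
pool_mode = transaction
max_client_conn = 25000
default_pool_size = 250
max_db_connections = 250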
To answer the second part:
One more question: in the PgBouncer configuration, is listen_addr = 0.0.0.0 correct, or should it be listen_addr = *?
It depends on your deployment: whether PgBouncer runs standalone, on the database server itself, etc.
Basically, if it is on the server itself and you want it to accept incoming connections from everywhere, use "*"; if you want only local connections to be allowed, use "127.0.0.1".
For the rest of your questions check this link: pgbouncer docs
I have tried to share a little of what I know; feel free to ask if anything was unclear, or to correct me if anything was stated incorrectly.

ERROR: child process failed, exited with error number 51 MongoDB

I'm getting this error while restarting MongoDB. I am using Mongo 3.2.4 and doing this setup on a new machine:
Starting mongod... about to fork child process, waiting until server is ready for connections.
forked process: 19438
ERROR: child process failed, exited with error number 51
mongod(_ZN5mongo19MmapV1ExtentManager4initEPNS_16OperationContextE+0x4A8) [0x1040278]
mongod(_ZN5mongo26MMAPV1DatabaseCatalogEntryC1EPNS_16OperationContextENS_10StringDataES3_bb+0x187) [0x1036dc7]
mongod(_ZN5mongo12MMAPV1Engine23getDatabaseCatalogEntryEPNS_16OperationContextENS_10StringDataE+0x14E) [0x103a1de]
mongod(_ZN5mongo14DatabaseHolder6openDbEPNS_16OperationContextENS_10StringDataEPb+0x133) [0xac92a3]
----- END BACKTRACE -----
For me this error occurred due to incorrect ownership of some files in my data directory. I fixed this using the following command:
sudo chown -R mongodb: /path/to/db/directory
Where mongodb was the database user in my case.
This is resolved by inserting the following lines into /etc/security/limits.conf:
mongodb soft nofile 65535
mongodb hard nofile 90000
mongodb soft nproc 65535
mongodb hard nproc 90000
We need to add the user account used to run the Mongo service. Generally, it is the mongodb user.
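As a quick sanity check (assuming a Linux host and a running mongod), you can confirm which limits actually apply to the process:
cat /proc/$(pidof mongod)/limits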
In my case the problem was that I hadn't created the appropriate folders specified in the config files.

mongoimport loading only 1000 rows on sharding

I have a MongoDB sharding setup configured like this:
6 config servers
3 shard servers (with replicas)
6 routers
For example:
s1 -> s2 (first shard with replica set: primary s1, secondary s2)
s3 -> s4 (second shard with replica set: primary s3, secondary s4)
s5 -> s6 (third shard with replica set: primary s5, secondary s6)
Config servers and routers run on all servers, i.e. s1 to s6.
I am not able to import data into one of the empty sharded collections; the data is in CSV format.
I am running mongoimport in the background, and the nohup output looks like this:
2017-01-10T17:13:18.444+0530 [........................] dbname.collectionname 364.0 KB/46.1 MB (0.8%)
mongoimport is stuck; how do I fix this?
I first tried to run mongoimport on s2 without success, then tried it on s1, also without success.
Following are the errors from the router and config server logs:
HostnameCanonicalizationWorker
[rsBackgroundSync] we are too stale to use **** as a sync source
REPL [ReplicationExecutor] could not find member to sync from
REPL [ReplicationExecutor] The liveness timeout does not match callback handle, so not resetting it.
REPL [rsBackgroundSync] too stale to catch up -- entering maintenance mode

MongoDB's mongosniff won't start?

mongosniff is for looking at what's sent to the MongoDB server, but on a Mac with OS X Snow Leopard it says
error finding device: no suitable device found
This is while mongod is running fine. Is there something that can make it work?
Update: thanks. After running it as root, for some reason it is not reporting any activity while a mongo shell is running different queries on this same machine. One time I had an error and it was reported once, and that was it... isn't it supposed to report each operation?
Have you tried running it as root? I get that error when I don't have permission to monitor network traffic.
Edit in response to your update:
You have to specify the network interface to sniff. Run ifconfig to see what your local interface is called and then use:
sudo mongosniff --source NET lo
Running as root works for me:
bobk-mbp:~ bobk$ sudo mongosniff --source NET lo0
sniffing... 27017
10.78.4.213:14303 -->> 10.78.4.213:27017 admin.$cmd 58 bytes id:2447 9287
query: { ismaster: 1 } ntoreturn: -1 ntoskip: 0
10.78.4.213:27017 <<-- 10.78.4.213:14303 87 bytes id:2789 10121 - 9287
reply n:1 cursorId: 0
{ ismaster: true, maxBsonObjectSize: 16777216, ok: 1.0 }
...also, ifconfig shows that the LOOPBACK is on lo0 on my system.