MongoDB queries are very slow

I have a multi-threaded app that executes hundreds of transactions per second, but after a while performance drops and queries take too long to execute. I tried tuning the connection with the following options to avoid errors, but this resulted in poor performance.
My queries vary across inserts, deletes, and multi-updates; no collection exceeds 100,000 documents.
I have a cluster of 4 VMs for MongoDB on Azure, each with 4 cores and 28 GB of RAM. I built the cluster using the Bitnami production image (https://azure.microsoft.com/en-us/marketplace/partners/bitnami/production-mongodbdefault/).
private static MongoClientOptions options = MongoClientOptions.builder()
        .connectionsPerHost(1000)
        .threadsAllowedToBlockForConnectionMultiplier(15)
        .connectTimeout(60 * 1000)
        .maxWaitTime(60 * 1000)
        .socketTimeout(60 * 1000)
        .build();
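For reference, connectionsPerHost(1000) combined with a blocked-thread multiplier of 15 allows up to 15,000 threads to queue for connections, which can easily overwhelm 4-core nodes. A more conservative configuration might look like the sketch below; the specific numbers are assumptions and should be validated under real load:

private static MongoClientOptions conservativeOptions = MongoClientOptions.builder()
        .connectionsPerHost(100)                          // smaller pool per host (assumed value)
        .threadsAllowedToBlockForConnectionMultiplier(5)  // at most 500 threads may wait for a connection
        .connectTimeout(10 * 1000)                        // fail fast when a node is unreachable
        .maxWaitTime(10 * 1000)                           // fail fast when the pool is exhausted
        .socketTimeout(60 * 1000)
        .build();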
I am not using any indexes and here is my app flow:
I am using the DB collections for queuing and processing messages. Each job is saved in a separate collection, with a document for each message that looks like this (status is 'ready' for messages at this stage):
{
    "_id": ObjectId("57cd303743ffe80f3728fcf5"),
    "_class": "com.mongodb.BasicDBObject",
    "job_id": "57cd3031d9991f8639487013",
    "priority": 1,
    "title": "1",
    "sender_id": "sender 1",
    "account_id": "57c2d556d9991fbc15897275",
    "schedule_date": ISODate("2016-09-05T08:43:00Z"),
    "utf8": false,
    "content": "text to be sent",
    "number": "962799000001",
    "status": "ready",
    "user_id": "57c2d602d9991fbc1589727b",
    "adv": true,
    "number_of_sms_msgs": 1,
    "uuid": "57cd3031d9991f8639487013_57cd303743ffe80f3728fcf5",
    "msg_id": "1955559517"
}
Then I move batches from each job, according to their priorities, from status 'ready' to 'queued', and add them to an in-memory queue to be processed:
List<DBObject> batch = scaffoldingRepository.findPageNoSort(dataType, page, next_batch_size, query, null);
if (batch != null && !batch.isEmpty()) {
    BasicDBList ids = new BasicDBList();
    for (final DBObject msg : batch) {
        msg.put("status", "queued");
        msg.put("uuid", job_id + "_" + msg.get("_id"));
        ids.add(new ObjectId(msg.get("_id").toString()));
    }
    // Mark the whole batch as queued in a single multi-update.
    BasicDBObject search = new BasicDBObject();
    search.put("_id", new BasicDBObject("$in", ids));
    BasicDBObject update = new BasicDBObject();
    update.put("$set", new BasicDBObject("status", "queued"));
    scaffoldingRepository.updateObjects(search, update, dataType);
}
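Because the poller above selects documents by status with no index, every batch fetch is a full collection scan (the $in update is fine, since it hits the built-in _id index). A minimal sketch of adding an index with the legacy driver; the exact key set is an assumption based on the flow described here, so verify the real query shape with explain():

// dataType is the collection name used by the repository above;
// db is assumed to be the com.mongodb.DB handle the repository wraps.
DBCollection jobCollection = db.getCollection(dataType);
// Index the fields the queue poller filters and prioritises on (assumed keys).
jobCollection.createIndex(new BasicDBObject("status", 1).append("priority", 1));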
Then another thread sends the actual messages from the in-memory queue and updates each message's status separately (sent/failed); I also add a lookup document for each message in a separate collection (job_index) so I can find it once the sender sends me back the final status.
Finally, I get a final result from the sender about the message (delivered/undelivered), update the message accordingly, and then remove its lookup document from the job_index collection.
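Both of these steps look up a single message by uuid, so an index on uuid matters as well. A minimal sketch, assuming a DBCollection handle named msgCollection and the field names shown in the sample document above:

// Create the uuid index once; creating an index that already exists is a no-op.
msgCollection.createIndex(new BasicDBObject("uuid", 1));

// Update one message's final status by its uuid ("<job_id>_<_id>" per the sample document).
BasicDBObject query = new BasicDBObject("uuid", uuid);
BasicDBObject update = new BasicDBObject("$set", new BasicDBObject("status", "delivered"));
msgCollection.update(query, update);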
UPDATE:
I noticed this error in the Java logs:
com.mongodb.MongoSocketReadTimeoutException: Timeout while receiving message
at com.mongodb.connection.InternalStreamConnection.translateReadException(InternalStreamConnection.java:475)
at com.mongodb.connection.InternalStreamConnection.receiveMessage(InternalStreamConnection.java:226)
at com.mongodb.connection.UsageTrackingInternalConnection.receiveMessage(UsageTrackingInternalConnection.java:105)
at com.mongodb.connection.DefaultConnectionPool$PooledConnection.receiveMessage(DefaultConnectionPool.java:438)
at com.mongodb.connection.WriteCommandProtocol.receiveMessage(WriteCommandProtocol.java:262)
at com.mongodb.connection.WriteCommandProtocol.execute(WriteCommandProtocol.java:104)
at com.mongodb.connection.UpdateCommandProtocol.execute(UpdateCommandProtocol.java:64)
at com.mongodb.connection.UpdateCommandProtocol.execute(UpdateCommandProtocol.java:37)
at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:168)
at com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:289)
at com.mongodb.connection.DefaultServerConnection.updateCommand(DefaultServerConnection.java:143)
at com.mongodb.operation.UpdateOperation.executeCommandProtocol(UpdateOperation.java:76)
at com.mongodb.operation.BaseWriteOperation$1.call(BaseWriteOperation.java:142)
at com.mongodb.operation.BaseWriteOperation$1.call(BaseWriteOperation.java:134)
at com.mongodb.operation.OperationHelper.withConnectionSource(OperationHelper.java:232)
at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:223)
at com.mongodb.operation.BaseWriteOperation.execute(BaseWriteOperation.java:134)
at com.mongodb.operation.BaseWriteOperation.execute(BaseWriteOperation.java:61)
at com.mongodb.Mongo.execute(Mongo.java:827)
at com.mongodb.Mongo$2.execute(Mongo.java:810)
at com.mongodb.DBCollection.executeWriteOperation(DBCollection.java:333)
at com.mongodb.DBCollection.updateImpl(DBCollection.java:495)
at com.mongodb.DBCollection.update(DBCollection.java:455)
at com.mongodb.DBCollection.update(DBCollection.java:432)
at com.mongodb.DBCollection.update(DBCollection.java:522)
at com.mongodb.DBCollection.updateMulti(DBCollection.java:552)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)
And here is my MongoDB config:
# mongod.conf
# for documentation of all options, see:
#   http://docs.mongodb.org/manual/reference/configuration-options/

# Where and how to store data.
storage:
  dbPath: /opt/bitnami/mongodb/data/db
  journal:
    enabled: true
  #engine:
  mmapv1:
    smallFiles: true
  # wiredTiger:

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /opt/bitnami/mongodb/logs/mongodb.log

# network interfaces
net:
  port: 27017
  bindIp: 0.0.0.0
  unixDomainSocket:
    enabled: true
    pathPrefix: /opt/bitnami/mongodb/tmp

# replica set options
replication:
  replSetName: replicaset

# process management options
processManagement:
  fork: false
  pidFilePath: /opt/bitnami/mongodb/tmp/mongodb.pid

# set parameter options
setParameter:
  enableLocalhostAuthBypass: true

# security options
security:
  authorization: disabled
  #keyFile: replace_me

# profiling
#operationProfiling:
#  slowOpThresholdMs: 100
#  mode: slowOp
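A side note on this config: the active mmapv1 block suggests the nodes may be running the MMAPv1 storage engine, whose database- or collection-level locking (depending on version) degrades badly under concurrent multi-updates like the ones above. If re-syncing the data is feasible, WiredTiger's document-level concurrency is usually a better fit for this write pattern; a sketch of the storage stanza under that assumption:

storage:
  dbPath: /opt/bitnami/mongodb/data/db
  journal:
    enabled: true
  engine: wiredTiger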

A few tips:
- Did you check the slow-running queries in the MongoDB profiler? (See the sketch after this list.)
- Did you try indexing the documents (using the profiler output from the step above)?
- Which version of MongoDB are you using, and with which storage engine?
- Have you set up replication? If yes, please revisit the write concern: https://docs.mongodb.com/manual/core/replica-set-write-concern/
- Can you check whether MongoDB's in-memory storage engine can help: https://docs.mongodb.com/manual/core/inmemory/
- You can find a few more important tips here: https://docs.mongodb.com/manual/administration/analyzing-mongodb-performance/
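As a starting point for the first two tips, here is a minimal sketch of enabling the profiler and reading slow operations from the Java driver; the 100 ms threshold and the database name are assumptions to adjust for your setup:

// Enable profiling level 1: record operations slower than slowms (here 100 ms).
DB db = mongoClient.getDB("yourdb"); // "yourdb" is a placeholder
db.command(new BasicDBObject("profile", 1).append("slowms", 100));

// Slow operations are then recorded in the system.profile collection.
DBCursor slowOps = db.getCollection("system.profile")
        .find(new BasicDBObject("millis", new BasicDBObject("$gt", 100)));
for (DBObject op : slowOps) {
    System.out.println(op); // inspect ns, query shape, and millis to decide which indexes to add
}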

Related

MongoDB Replication addition failing

I'm trying to add 2 secondaries to a MongoDB replica set after successful initialization, but unfortunately it is failing.
repset_init.js file contents:
rs.add( { host: "10.0.1.170:27017" } )
rs.add( { host: "10.0.2.157:27017" } )
rs.add( { host: "10.0.3.88:27017" } )
Command I executed for the replica set addition:
mongo -u xxxxx -p yyyy --authenticationDatabase admin --port 27017 repset_init.js
The command hangs in the terminal, and below is the log output:
{"t":{"$date":"2021-06-09T12:29:34.939+00:00"},"s":"I", "c":"REPL", "id":21393, "ctx":"conn2","msg":"Found self in config","attr":{"hostAndPort":"MongoD-1:27017"}}
{"t":{"$date":"2021-06-09T12:29:34.939+00:00"},"s":"I", "c":"COMMAND", "id":51803, "ctx":"conn2","msg":"Slow query","attr":{"type":"command","ns":"local.system.replset","appName":"MongoDB Shell","command":{"replSetReconfig":{"_id":"Shard_0","version":2,"protocolVersion":1,"writeConcernMajorityJournalDefault":true,"members":[{"_id":0,"host":"MongoD-1:27017","arbiterOnly":false,"buildIndexes":true,"hidden":false,"priority":1.0,"tags":{},"slaveDelay":0,"votes":1},{"host":"10.0.2.157:27017","_id":1.0}],"settings":{"chainingAllowed":true,"heartbeatIntervalMillis":2000,"heartbeatTimeoutSecs":10,"electionTimeoutMillis":10000,"catchUpTimeoutMillis":-1,"catchUpTakeoverDelayMillis":30000,"getLastErrorModes":{},"getLastErrorDefaults":{"w":1,"wtimeout":0},"replicaSetId":{"$oid":"60c0b3566991d93637465f55"}}},"lsid":{"id":{"$uuid":"263568b4-ec31-4ea6-8f72-69cec80c1a7c"}},"$db":"admin"},"numYields":0,"reslen":38,"locks":{"ParallelBatchWriterMode":{"acquireCount":{"r":3}},"ReplicationStateTransition":{"acquireCount":{"w":5}},"Global":{"acquireCount":{"r":1,"w":4}},"Database":{"acquireCount":{"w":2,"W":1}},"Collection":{"acquireCount":{"w":2}},"Mutex":{"acquireCount":{"r":2}}},"flowControl":{"acquireCount":2,"timeAcquiringMicros":3},"storage":{},"protocol":"op_msg","durationMillis":151}}
{"t":{"$date":"2021-06-09T12:29:34.940+00:00"},"s":"I", "c":"REPL", "id":21215, "ctx":"ReplCoord-1","msg":"Member is in new state","attr":{"hostAndPort":"10.0.2.157:27017","newState":"STARTUP"}}
{"t":{"$date":"2021-06-09T12:29:34.941+00:00"},"s":"I", "c":"REPL", "id":4508702, "ctx":"conn2","msg":"Waiting for the current config to propagate to a majority of nodes"}
{"t":{"$date":"2021-06-09T12:33:55.701+00:00"},"s":"I", "c":"CONTROL", "id":20712, "ctx":"LogicalSessionCacheReap","msg":"Sessions collection is not set up; waiting until next sessions reap interval","attr":{"error":"ShardingStateNotInitialized: sharding state is not yet initialized"}}
{"t":{"$date":"2021-06-09T12:33:55.701+00:00"},"s":"I", "c":"CONTROL", "id":20714, "ctx":"LogicalSessionCacheRefresh","msg":"Failed to refresh session cache, will try again at the next refresh interval","attr":{"error":"ShardingStateNotInitialized: sharding state is not yet initialized"}}
{"t":{"$date":"2021-06-09T12:34:35.029+00:00"},"s":"I", "c":"CONNPOOL", "id":22572, "ctx":"MirrorMaestro","msg":"Dropping all pooled connections","attr":{"hostAndPort":"10.0.2.157:27017","error":"ShutdownInProgress: Pool for 10.0.2.157:27017 has expired."}}
Additional details:
Shard_0:PRIMARY> rs.printSlaveReplicationInfo()
WARNING: printSlaveReplicationInfo is deprecated and may be removed in the next major release. Please use printSecondaryReplicationInfo instead.
source: 10.0.2.157:27017
syncedTo: Thu Jan 01 1970 00:00:00 GMT+0000 (UTC) 1623243005 secs (450900.83 hrs) behind the primary
I am able to reach the node via port 27017:
telnet 10.0.2.157 27017
Trying 10.0.2.157...
Connected to 10.0.2.157.
Escape character is '^]'.
My config file:
net:
  bindIp: 0.0.0.0
  port: 27017
  ssl: {}
processManagement:
  fork: "true"
  pidFilePath: /var/run/mongodb/mongod.pid
replication:
  replSetName: Shard_0
security:
  authorization: enabled
  keyFile: /etc/zzzzzkey.key
setParameter:
  authenticationMechanisms: SCRAM-SHA-256
sharding:
  clusterRole: shardsvr
storage:
  dbPath: /data/dbdata
  engine: wiredTiger
systemLog:
  destination: file
  path: /data/log/mongodb.log
I'm initializing the replica set using the command below:
mongo --host 127.0.0.1 --port {{mongod_port}} --eval 'printjson(rs.initiate())'
Not sure what is causing this issue. Could you please help me?
The command looks a bit strange:
{
    "replSetReconfig": {
        "_id": "Shard_0",
        "members": [
            { "_id": 0, "host": "MongoD-1:27017", "arbiterOnly": false, "hidden": false, "priority": 1.0, "slaveDelay": 0, "votes": 1 },
            { "_id": 1.0, "host": "10.0.2.157:27017" }
        ]
    }
}
Why do you name your replica set Shard_0? Are you trying to set up a sharded cluster?
You add _id: 0, host: "MongoD-1:27017" and _id: 1.0, host: "10.0.2.157:27017", which is inconsistent: you mixed an IP address with a hostname, and the member _id values "0" and "1.0" mix integer and floating-point forms, which is also confusing.
What do your config files look like, and how did you start the MongoDB services?

Error node not found when setup replica set in mongodb

I'm extremely frustrated trying to set up a MongoDB replica set from scratch.
I have 2 machines running Debian with MongoDB installed. When I try to use rs.add() to add a member to the replica set, I get an error, although I can still connect to MongoDB with:
mongo --host 13.212.31.212:27017
Here is the error message:
rs0:PRIMARY> rs.add("13.212.31.212:27017")
{
"operationTime" : Timestamp(1597144435, 1),
"ok" : 0,
"errmsg" : "Quorum check failed because not enough voting nodes responded; required 2 but only the following 1 voting nodes responded: 192.168.0.59:27017; the following nodes did not respond affirmatively: 13.212.31.212:27017 failed with Received heartbeat from member with the same member ID as ourself: 0",
"code" : 74,
"codeName" : "NodeNotFound",
"$clusterTime" : {
"clusterTime" : Timestamp(1597144440, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
Here is the mongod.conf:
# Where and how to store data.
storage:
  dbPath: /var/lib/mongodb
  journal:
    enabled: true
#  engine:
#  mmapv1:
#  wiredTiger:

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# network interfaces
net:
  port: 27017
  bindIp: 127.0.0.1,172.26.2.229

# how the process runs
processManagement:
  timeZoneInfo: /usr/share/zoneinfo
What am I doing wrong?
This is the descriptive error message:
the following 1 voting nodes responded: 192.168.0.59:27017; the following nodes did not respond affirmatively: 13.212.31.212:27017 failed with Received heartbeat from member with the same member ID as ourself: 0"
That is telling you that 192.168.0.59:27017 and 13.212.31.212:27017 are the same node (most likely its private and public addresses), and you can't add the same node twice.

How to avoid a java.util.concurrent.ThreadPoolExecutor error with Prisma

I am currently using Prisma on a large-scale project. When performing a complex query, the Prisma cluster often errors out with the following message in the Docker logs (I'm redacting the error for readability):
{"key":"error/unhandled","requestId":"local:ck7ditux500570716cl5f8x3r","clientId":"default$default","payload":{"exception":"java.util.concurrent.RejectedExecutionException: Task slick.basic.BasicBackend$DatabaseDef$$anon$3#552b85a4 rejected from slick.util.AsyncExecutor$$anon$1$$anon$2#1d4391f7[Running, pool size = 9, active threads = 9, queued tasks = 1000, completed tasks = 43440]","query":"query ($where: TaskWhereUniqueInput!)
{\n task(where: $where) {\n workflow {\n tasks {\n id\n state\n parentReq\n frozen\n parentTask {\n id\n state\n }\n }\n }\n }\n}\n","variables":
"{\"where\":{\"id\":\"ck6twx873bs550874ne867066\"}}",
"code":"0","stack_trace":"java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)\\n
java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
...
"message":"Task slick.basic.BasicBackend$DatabaseDef$$anon$3#552b85a4 rejected from slick.util.AsyncExecutor$$anon$1$$anon$2#1d4391f7[Running, pool size = 9, active threads = 9, queued tasks = 1000, completed tasks = 43440]"}}
This is a frequent error when handling large queries. Has anyone come up with a way to configure Prisma or do internal batch operations to avoid this concurrency error?
Option 1:
I ran into this issue when running a Prisma mutation in a loop over a lot of data, which seemed to create too many concurrent database operations.
My solution was to throttle the requests. Note that async callbacks passed to .map() all start at once, so a plain loop with await is what actually spaces the operations out:

for (const thing of thingToLoop) {
  // Wait one second before each Prisma operation so they run sequentially
  // instead of being fired concurrently.
  await new Promise(resolve => setTimeout(resolve, 1000));
  // Prisma operation here.
}
Option 2:
I've read other posts that say you can also set a limit on the number of concurrent connections.
See: https://v1.prisma.io/docs/1.25/prisma-server/database-connector-POSTGRES-jgfr/#overview
Essentially, this restricts the number of connections when too many are being created. See the following Prisma config as an example:
PRISMA_CONFIG: |
  managementApiSecret: __YOUR_MANAGEMENT_API_SECRET__
  port: 4466
  databases:
    default:
      connector: postgres
      migrations: __ENABLE_DB_MIGRATIONS__
      host: __YOUR_DATABASE_HOST__
      port: __YOUR_DATABASE_PORT__
      user: __YOUR_DATABASE_USER__
      password: __YOUR_DATABASE_PASSWORD__
      connectionLimit: __YOUR_CONNECTION_LIMIT__

Fiware Cygnus: no data has been persisted in MongoDB

I am trying to use Cygnus with MongoDB, but no data has been persisted in the database.
Here is the notification received by Cygnus:
15/07/21 14:48:01 INFO handlers.OrionRestHandler: Starting transaction (1437482681-118-0000000000)
15/07/21 14:48:01 INFO handlers.OrionRestHandler: Received data ({ "subscriptionId" : "55a73819d0c457bb20b1d467", "originator" : "localhost", "contextResponses" : [ { "contextElement" : { "type" : "enocean", "isPattern" : "false", "id" : "enocean:myButtonA", "attributes" : [ { "name" : "ButtonValue", "type" : "", "value" : "ON", "metadatas" : [ { "name" : "TimeInstant", "type" : "ISO8601", "value" : "2015-07-20T21:29:56.509293Z" } ] } ] }, "statusCode" : { "code" : "200", "reasonPhrase" : "OK" } } ]})
15/07/21 14:48:01 INFO handlers.OrionRestHandler: Event put in the channel (id=1454120446, ttl=10)
Here is my agent configuration:
cygnusagent.sources = http-source
cygnusagent.sinks = OrionMongoSink
cygnusagent.channels = mongo-channel
#=============================================
# source configuration
# channel name where to write the notification events
cygnusagent.sources.http-source.channels = mongo-channel
# source class, must not be changed
cygnusagent.sources.http-source.type = org.apache.flume.source.http.HTTPSource
# listening port the Flume source will use for receiving incoming notifications
cygnusagent.sources.http-source.port = 5050
# Flume handler that will parse the notifications, must not be changed
cygnusagent.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.OrionRestHandler
# URL target
cygnusagent.sources.http-source.handler.notification_target = /notify
# Default service (service semantic depends on the persistence sink)
cygnusagent.sources.http-source.handler.default_service = def_serv
# Default service path (service path semantic depends on the persistence sink)
cygnusagent.sources.http-source.handler.default_service_path = def_servpath
# Number of channel re-injection retries before a Flume event is definitely discarded (-1 means infinite retries)
cygnusagent.sources.http-source.handler.events_ttl = 10
# Source interceptors, do not change
cygnusagent.sources.http-source.interceptors = ts gi
# TimestampInterceptor, do not change
cygnusagent.sources.http-source.interceptors.ts.type = timestamp
# GroupinInterceptor, do not change
cygnusagent.sources.http-source.interceptors.gi.type = com.telefonica.iot.cygnus.interceptors.GroupingInterceptor$Builder
# Grouping rules for the GroupingInterceptor, put the right absolute path to the file if necessary
# See the doc/design/interceptors document for more details
cygnusagent.sources.http-source.interceptors.gi.grouping_rules_conf_file = /home/egm_demo/usr/fiware-cygnus/conf/grouping_rules.conf
# ============================================
# OrionMongoSink configuration
# sink class, must not be changed
cygnusagent.sinks.mongo-sink.type = com.telefonica.iot.cygnus.sinks.OrionMongoSink
# channel name from where to read notification events
cygnusagent.sinks.mongo-sink.channel = mongo-channel
# FQDN/IP:port where the MongoDB server runs (standalone case) or comma-separated list of FQDN/IP:port pairs where the MongoDB replica set members run
cygnusagent.sinks.mongo-sink.mongo_hosts = 127.0.0.1:27017
# a valid user in the MongoDB server (or empty if authentication is not enabled in MongoDB)
cygnusagent.sinks.mongo-sink.mongo_username =
# password for the user above (or empty if authentication is not enabled in MongoDB)
cygnusagent.sinks.mongo-sink.mongo_password =
# prefix for the MongoDB databases
#cygnusagent.sinks.mongo-sink.db_prefix = kura
# prefix for the MongoDB collections
#cygnusagent.sinks.mongo-sink.collection_prefix = button
# true if collection names are based on a hash, false for human readable collections
cygnusagent.sinks.mongo-sink.should_hash = false
# ============================================
# mongo-channel configuration
# channel type (must not be changed)
cygnusagent.channels.mongo-channel.type = memory
# capacity of the channel
cygnusagent.channels.mongo-channel.capacity = 1000
# amount of bytes that can be sent per transaction
cygnusagent.channels.mongo-channel.transactionCapacity = 100
Here is my rule:
{
    "grouping_rules": [
        {
            "id": 1,
            "fields": [
                "button"
            ],
            "regex": ".*",
            "destination": "kura",
            "fiware_service_path": "/kuraspath"
        }
    ]
}
Any ideas of what I have missed? Thanks in advance for your help!
This configuration parameter is wrong:
cygnusagent.sinks = OrionMongoSink
According to your configuration, it must be mongo-sink (I mean, you are configuring a Mongo sink named mongo-sink in lines such as cygnusagent.sinks.mongo-sink.type).
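Applied to the configuration above, the declaration would read:

cygnusagent.sinks = mongo-sink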
In addition, I would recommend not using the grouping rules feature; it is an advanced feature for sending data to a collection different from the default one, and at a first stage I would play with the default behaviour. Thus, my recommendation is to leave the path to the file in cygnusagent.sources.http-source.interceptors.gi.grouping_rules_conf_file, but comment out all the JSON within it :)

Dropwizard: Unrecognized field at: database. Did you mean?: metrics, server, logging

I cannot start my Dropwizard application after adding database details to my application configuration file (server.yml).
server.yml (app config file)
server:
  applicationConnectors:
    - type: http
      port: 8080
  adminConnectors:
    - type: http
      port: 9001

database:
  # the name of your JDBC driver
  driverClass: org.postgresql.Driver
  # the username
  user: dbuser
  # the password
  password: pw123
  # the JDBC URL
  url: jdbc:postgresql://localhost/database
  # any properties specific to your JDBC driver:
  properties:
    charSet: UTF-8
  # the maximum amount of time to wait on an empty pool before throwing an exception
  maxWaitForConnection: 1s
  # the SQL query to run when validating a connection's liveness
  validationQuery: "/* MyService Health Check */ SELECT 1"
  # the timeout before a connection validation query fails
  validationQueryTimeout: 3s
  # the minimum number of connections to keep open
  minSize: 8
  # the maximum number of connections to keep open
  maxSize: 32
  # whether or not idle connections should be validated
  checkConnectionWhileIdle: false
  # the amount of time to sleep between runs of the idle connection validation, abandoned cleaner and idle pool resizing
  evictionInterval: 10s
  # the minimum amount of time a connection must sit idle in the pool before it is eligible for eviction
  minIdleTime: 1 minute
When I run the Dropwizard application, it fails with this error:
* Unrecognized field at: database
  Did you mean?:
    - metrics
    - server
    - logging
In addition to the code given in the Dropwizard example, you need to add a setter for the database property:
@Valid
@NotNull
@JsonProperty("database")
private DataSourceFactory database = new DataSourceFactory();

public DataSourceFactory getDataSourceFactory() {
    return database;
}

public void setDatabase(DataSourceFactory database) {
    this.database = database;
}
In your application configuration Java class, you have to add the matching property for "database". If the properties you're specifying are the standard ones (which they look to be, good!), then you can stick with the DataSourceFactory type:
import com.fasterxml.jackson.annotation.JsonProperty;
import io.dropwizard.Configuration;
import io.dropwizard.db.DataSourceFactory;
import javax.validation.Valid;
import javax.validation.constraints.NotNull;

public class ExampleConfiguration extends Configuration {
    @Valid
    @NotNull
    @JsonProperty
    private DataSourceFactory database = new DataSourceFactory();

    public DataSourceFactory getDataSourceFactory() {
        return database;
    }

    public void setDatabase(DataSourceFactory database) {
        this.database = database;
    }
}
Example here: http://www.dropwizard.io/0.9.0/docs/manual/jdbi.html
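For completeness, the factory is typically consumed in the Application's run() method, as in the JDBI manual linked above. A minimal sketch; the class name and the "postgresql" label are placeholders:

import io.dropwizard.Application;
import io.dropwizard.jdbi.DBIFactory;
import io.dropwizard.setup.Environment;
import org.skife.jdbi.v2.DBI;

public class ExampleApplication extends Application<ExampleConfiguration> {
    @Override
    public void run(ExampleConfiguration configuration, Environment environment) {
        // Builds a managed connection pool from the "database" section of server.yml.
        final DBIFactory factory = new DBIFactory();
        final DBI jdbi = factory.build(environment, configuration.getDataSourceFactory(), "postgresql");
        // Build DAOs on top of jdbi and register resources with environment.jersey() here.
    }
}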