Performance drop in upsert after delete with replica set - mongodb

I need your help understanding a performance problem.
We have a system that stores sets of documents (1k-4k docs per set) in batches. Documents have this structure: {_id: ObjectId(), RepositoryId: UUID(), data...}
where the RepositoryId is the same for every document in the set. We also have unique indexes on {_id: 1, RepositoryId: 1} and {RepositoryId: 1, ...}.
The use case is: first, delete all documents with the same RepositoryId:
db.collection.deleteMany(
    { RepositoryId: UUID("SomeGUID") },
    { writeConcern: { w: "majority", j: true } }
)
Then re-insert the documents in batches (300 items per batch) with the same RepositoryId we deleted before:
db.collection.insertMany(
    [ { RepositoryId: UUID(), data... }, ... ],
    {
        writeConcern: { w: 1, j: false },
        ordered: false
    }
)
The issue is that the first few (3-5) batches take much longer than the rest (first batch: 10 s, 8th batch: 0.1 s). There is also this entry in the log file:
{
"t": {
"$date": "2023-01-19T15:49:02.258+01:00"
},
"s": "I",
"c": "COMMAND",
"id": 51803,
"ctx": "conn64",
"msg": "Slow query",
"attr": {
"type": "command",
"ns": "####.$cmd",
"command": {
"update": "########",
"ordered": false,
"writeConcern": {
"w": 1,
"fsync": false,
"j": false
},
"txnNumber": 16,
"$db": "#####",
"lsid": {
"id": {
"$uuid": "6ffb319a-6003-4221-9925-710e9e2aa315"
}
},
"$clusterTime": {
"clusterTime": {
"$timestamp": {
"t": 1674139729,
"i": 5
}
},
"numYields": 0,
"reslen": 11550,
"locks": {
"ParallelBatchWriterMode": {
"acquireCount": {
"r": 600
}
},
"ReplicationStateTransition": {
"acquireCount": {
"w": 601
}
},
"Global": {
"acquireCount": {
"w": 600
}
},
"Database": {
"acquireCount": {
"w": 600
}
},
"Collection": {
"acquireCount": {
"w": 600
}
},
"Mutex": {
"acquireCount": {
"r": 600
}
}
},
"flowControl": {
"acquireCount": 300,
"timeAcquiringMicros": 379
},
"readConcern": {
"level": "local",
"provenance": "implicitDefault"
},
"writeConcern": {
"w": 1,
"j": false,
"wtimeout": 0,
"provenance": "clientSupplied"
},
"storage": {
},
"remote": "127.0.0.1:52800",
"protocol": "op_msg",
"durationMillis": 13043
}
}
}
}
Is there some background process running after the delete that affects the insert performance of the first batches? This was not a problem until we switched from a standalone server to a single-instance replica set (needed for transaction support in another part of the app). This particular case does not require transactions, but we cannot host two MongoDB instances with different setups. The database is exclusive to this operation; nothing else runs against it (isolated test environment). How can we fix it?
The issue is reproducible. When there is a gap of a few minutes between test runs, the problem does not appear on the first run, but subsequent runs are affected.
Running on a machine with a Ryzen 7 PRO 4750U, 32 GB RAM and a Samsung 970 EVO M.2 SSD. MongoDB version 5.0.5.

In that log entry, timeAcquiringMicros indicates that this operation waited while attempting to acquire a lock.
flowControl is a throttling mechanism that delays writes on the primary node when the secondary nodes are lagging, with the intent of letting them catch up before they get so far behind that consistency is lost.
Waiting on the flowControl lock suggests there was a backlog of operations still being replicated to the secondaries; they were a bit behind, so the new writes were being slowed.
See Replication Lag and Flow Control for more detail.
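To confirm that flow control is what is throttling the writes, you can look at the flowControl section of serverStatus. A minimal sketch with pymongo (the connection URI is a placeholder; enableFlowControl is a server parameter you can flip for an experiment, but disabling it only removes the throttle, it does not remove the underlying replication lag):

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder URI

# flowControl counters on the primary: isLagged / targetRateLimit show
# whether writes are currently being throttled.
print(client.admin.command("serverStatus")["flowControl"])

# Replication state of each member (optimes show how far a secondary lags).
print(client.admin.command("replSetGetStatus")["members"])

# Experiment only: turn flow control off entirely.
client.admin.command({"setParameter": 1, "enableFlowControl": False})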

Related

Improving sorting efficiency

My code is here:
https://github.com/Dyfused/ExplodeX/blob/master/labyrinth-mongodb/src/main/kotlin/explode2/labyrinth/mongo/LabyrinthMongo.kt#L87,L259
document structures
Set example:
{
"musicName": "Ignotus Afterburn",
"musicComposer": "Arcaea Sound Team",
"introduction": "yay",
"coinPrice": 0,
"noterName": "official",
"noterUserId": "f6fe9c4d-98e6-450a-937c-d64848eacc40",
"chartIds": [
"l54uw6y79g1pspqcsvok31ga"
],
"publishTime": {
"$date": {
"$numberLong": "1640966400000"
}
},
"category": 0,
"hidden": false,
"reviewing": false
}
GameRecord example:
{
"playerId": "92623248-b291-430a-81c5-d1175308f902",
"playedChartId": "qsx8ky1c94f9ez1b8rssbsr2",
"score": 994486,
"detail": {
"perfect": 1308,
"good": 7,
"miss": 1
},
"uploadTime": {
"$date": {
"$numberLong": "1"
}
},
"r": 0
}
Program logic
When the sort type is 'DESCENDING_BY_PLAY_COUNT', the query just gets stuck and eventually returns something that looks wrong.
Lines 101 to 234 are only filtering; they have nothing to do with sorting.
I want to sort the sets by the number of related records. So I first 'lookup' the related records and then take the size of the resulting array. But this seems to have a serious efficiency problem, which I cannot figure out how to resolve.
Relevant pipeline additions:
SearchSort.DESCENDING_BY_PLAY_COUNT -> {
    pipeline += lookup("GameRecords", "chartIds", "playedChartId", "playRecords")
    pipeline += addFields(Field(
        "playCount",
        MongoOperator.size.from("\$playRecords")
    ))
    pipeline += sort(descending(SongSetWithPlayCount::playCount))
}
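One way to lighten this is to count the matching records inside the $lookup instead of materialising the whole playRecords array. A rough sketch in pymongo (the database name and the "Sets" collection name are placeholders; "GameRecords" and the field names come from the code above; an index on GameRecords.playedChartId is assumed, and depending on the server version the $expr match inside the sub-pipeline may or may not be able to use it, so check with explain()):

from pymongo import MongoClient

client = MongoClient()
db = client["explode"]  # placeholder database name

pipeline = [
    # Look up only the count of matching game records, not the records themselves.
    {"$lookup": {
        "from": "GameRecords",
        "let": {"ids": "$chartIds"},
        "pipeline": [
            {"$match": {"$expr": {"$in": ["$playedChartId", "$$ids"]}}},
            {"$count": "n"},
        ],
        "as": "playCountDoc",
    }},
    # playCountDoc is [] when there are no plays, so default to 0.
    {"$addFields": {"playCount": {"$ifNull": [{"$arrayElemAt": ["$playCountDoc.n", 0]}, 0]}}},
    {"$sort": {"playCount": -1}},
]

for song_set in db["Sets"].aggregate(pipeline):  # "Sets" is a placeholder collection name
    print(song_set["musicName"], song_set["playCount"])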

MongoDB update_one vs update_many - Improve speed

I have a collection of ca. 10,000 docs, where each doc has the following format:
{
"_id": {
"$oid": "631edc6e207c89b932a70a26"
},
"name": "Ethereum",
"auditInfoList": [
{
"coinId": "1027",
"auditor": "Fairyproof",
"auditStatus": 2,
"reportUrl": "https://www.fairyproof.com/report/Covalent"
}
],
"circulatingSupply": 122335921.0615,
"cmcRank": 2,
"dateAdded": "2015-08-07T00:00:00.000Z",
"id": 1027,
"isActive": 1,
"isAudited": true,
"lastUpdated": 1662969360,
"marketPairCount": 6085,
"quotes": [
{
"name": "USD",
"price": 1737.1982544180462,
"volume24h": 14326453277.535921,
"marketCap": 212521748520.66168,
"percentChange1h": 0.62330307,
"percentChange24h": -1.08847937,
"percentChange7d": 10.96517745,
"lastUpdated": 1662966780,
"percentChange30d": -13.49374496,
"percentChange60d": 58.25153862,
"percentChange90d": 42.27475921,
"fullyDilluttedMarketCap": 212521748520.66,
"marketCapByTotalSupply": 212521748520.66168,
"dominance": 20.0725,
"turnover": 0.0674117,
"ytdPriceChangePercentage": -53.9168
}
],
"selfReportedCirculatingSupply": 0,
"slug": "ethereum",
"symbol": "ETH",
"tags": [
"mineable",
"pow",
"smart-contracts",
"ethereum-ecosystem",
"coinbase-ventures-portfolio",
"three-arrows-capital-portfolio",
"polychain-capital-portfolio",
"binance-labs-portfolio",
"blockchain-capital-portfolio",
"boostvc-portfolio",
"cms-holdings-portfolio",
"dcg-portfolio",
"dragonfly-capital-portfolio",
"electric-capital-portfolio",
"fabric-ventures-portfolio",
"framework-ventures-portfolio",
"hashkey-capital-portfolio",
"kenetic-capital-portfolio",
"huobi-capital-portfolio",
"alameda-research-portfolio",
"a16z-portfolio",
"1confirmation-portfolio",
"winklevoss-capital-portfolio",
"usv-portfolio",
"placeholder-ventures-portfolio",
"pantera-capital-portfolio",
"multicoin-capital-portfolio",
"paradigm-portfolio",
"injective-ecosystem"
],
"totalSupply": 122335921.0615
}
I'm pulling an updated version of it and, to avoid duplicates, I'm doing the following using 'update_one':
for doc in new_doc_list:
    CRYPTO_TEMPORARY_LIST.update_one(
        {"name": doc["name"]},
        {"$set": {
            "lastUpdated": doc["lastUpdated"]
        }},
        upsert=True)
The problem is that it's too slow.
I'm trying to figure out how to improve the speed by using update_many, but I can't figure out how to set it up.
I basically want to update every document by name. Completely replacing the doc, rather than just the "lastUpdated" field, would be even better.
Thanks guys <3
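One approach that usually helps here is batching the upserts into a single bulk_write instead of ~10,000 individual update_one round trips. A minimal sketch with pymongo, reusing new_doc_list and CRYPTO_TEMPORARY_LIST from the question (if the new documents carry their own _id, drop it or make sure it matches the existing one, otherwise the replace will fail):

from pymongo import ReplaceOne

# One ReplaceOne per document, all sent in a single bulk_write call, so the
# whole refresh is one batched operation instead of thousands of round trips.
ops = [
    ReplaceOne({"name": doc["name"]}, doc, upsert=True)
    for doc in new_doc_list
]

if ops:
    result = CRYPTO_TEMPORARY_LIST.bulk_write(ops, ordered=False)
    print(result.upserted_count, result.modified_count)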

Poor write performance with MongoDB 5.0.8 in a PSA (Primary-Secondary-Arbiter) setup

I am struggling with write performance in MongoDB 5.0.8 in a PSA (Primary-Secondary-Arbiter) deployment when one data-bearing member goes down.
I am aware of the "Mitigate Performance Issues with PSA Replica Set" page and the procedure to temporarily work around this issue.
However, in my opinion, the manual intervention described there should not be necessary during operation. So what can I do to ensure that the system continues to run efficiently even if a node fails? In other words, the equivalent of MongoDB 4.x with the option "enableMajorityReadConcern=false".
As I understand it, the problem has something to do with the defaultRWConcern. When configuring a PSA replica set in MongoDB you are forced to set the defaultRWConcern; otherwise the following message appears when rs.addArb is called:
MongoServerError: Reconfig attempted to install a config that would
change the implicit default write concern. Use the setDefaultRWConcern
command to set a cluster-wide write concern and try the reconfig
again.
So I did
db.adminCommand({
    "setDefaultRWConcern": 1,
    "defaultWriteConcern": { "w": 1 },
    "defaultReadConcern": { "level": "local" }
})
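(As a side note, you can verify which default the cluster actually applies with the getDefaultRWConcern admin command. A quick sketch with pymongo; the URI is a placeholder:)

from pymongo import MongoClient

client = MongoClient("mongodb://primary-host:27017")  # placeholder URI

# "inMemory": True returns the default read/write concern this node is
# currently applying, rather than only the persisted cluster-wide setting.
print(client.admin.command({"getDefaultRWConcern": 1, "inMemory": True}))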
I would expect this configuration to cause no lag when reading from or writing to a PSA system with only one data-bearing node available.
But I observe "slow query" messages in the mongod log, like this one:
{
"t": {
"$date": "2022-05-13T10:21:41.297+02:00"
},
"s": "I",
"c": "COMMAND",
"id": 51803,
"ctx": "conn149",
"msg": "Slow query",
"attr": {
"type": "command",
"ns": "<db>.<col>",
"command": {
"insert": "<col>",
"ordered": true,
"txnNumber": 4889253,
"$db": "<db>",
"$clusterTime": {
"clusterTime": {
"$timestamp": {
"t": 1652430100,
"i": 86
}
},
"signature": {
"hash": {
"$binary": {
"base64": "bEs41U6TJk/EDoSQwfzzerjx2E0=",
"subType": "0"
}
},
"keyId": 7096095617276968965
}
},
"lsid": {
"id": {
"$uuid": "25659dc5-a50a-4f9d-a197-73b3c9e6e556"
}
}
},
"ninserted": 1,
"keysInserted": 3,
"numYields": 0,
"reslen": 230,
"locks": {
"ParallelBatchWriterMode": {
"acquireCount": {
"r": 2
}
},
"ReplicationStateTransition": {
"acquireCount": {
"w": 3
}
},
"Global": {
"acquireCount": {
"w": 2
}
},
"Database": {
"acquireCount": {
"w": 2
}
},
"Collection": {
"acquireCount": {
"w": 2
}
},
"Mutex": {
"acquireCount": {
"r": 2
}
}
},
"flowControl": {
"acquireCount": 1,
"acquireWaitCount": 1,
"timeAcquiringMicros": 982988
},
"readConcern": {
"level": "local",
"provenance": "implicitDefault"
},
"writeConcern": {
"w": 1,
"wtimeout": 0,
"provenance": "customDefault"
},
"storage": {},
"remote": "10.10.7.12:34258",
"protocol": "op_msg",
"durationMillis": 983
}
The collection involved here is under substantial load, with about 1,000 reads and 1,000 writes per second from different (concurrent) clients.
MongoDB 4.x with "enableMajorityReadConcern=false" performed normally here and I did not notice any loss of performance in my application. MongoDB 5.x does not manage that, and in my application data is piling up that I cannot write away in a performant manner.
So my question is whether I can get the MongoDB 4.x behaviour back. A write guarantee from the single data-bearing node that is still available in the failure scenario would be fine for me, but having to manually reconfigure the faulty node during a failure should really be avoided.
Thanks for any advice!
In the end we changed the setup to a PSS layout.
This was also recommended in the MongoDB Community Forum.
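For reference, converting the PSA layout to PSS amounts to removing the arbiter and adding a data-bearing member. In mongosh this is usually done with rs.remove() and rs.add(); the sketch below shows the equivalent replSetReconfig calls via pymongo (host names are placeholders, and the new member still needs an initial sync before it is a usable secondary):

from pymongo import MongoClient

client = MongoClient("mongodb://primary-host:27017")  # placeholder URI

# Step 1: remove the arbiter (recent versions only allow one voting-member
# change per reconfig, so do this separately from adding the new member).
cfg = client.admin.command("replSetGetConfig")["config"]
cfg["members"] = [m for m in cfg["members"] if not m.get("arbiterOnly")]
cfg["version"] += 1
client.admin.command({"replSetReconfig": cfg})

# Step 2: add the new data-bearing member (placeholder host).
cfg = client.admin.command("replSetGetConfig")["config"]
next_id = max(m["_id"] for m in cfg["members"]) + 1
cfg["members"].append({"_id": next_id, "host": "new-secondary-host:27017"})
cfg["version"] += 1
client.admin.command({"replSetReconfig": cfg})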

MongoDB Query doesn't return with a sort

I have the query:
db.changes.find(
    {
        $or: [
            { _id: ObjectId("60b1e8dc9d0359001bb80441") },
            { _oid: ObjectId("60b1e8dc9d0359001bb80441") },
        ],
    },
    { _id: 1 }
);
which returns almost instantly.
But the moment I add a sort, the query doesn't return; it just keeps running. The longest I could tolerate it running was over 30 minutes, so I'm not entirely sure whether it eventually returns.
db.changes
    .find(
        {
            $or: [
                { _id: ObjectId("60b1e8dc9d0359001bb80441") },
                { _oid: ObjectId("60b1e8dc9d0359001bb80441") },
            ],
        },
        { _id: 1 }
    )
    .sort({ _id: -1 });
I have the following indexes:
[
    { "_oid": 1 },
    { "_id": 1 }
]
and this is what db.currentOp() returns:
{
"host": "xxxx:27017",
"desc": "conn387",
"connectionId": 387,
"client": "xxxx:55802",
"appName": "MongoDB Shell",
"clientMetadata": {
"application": {
"name": "MongoDB Shell"
},
"driver": {
"name": "MongoDB Internal Client",
"version": "4.0.5-18-g7e327a9017"
},
"os": {
"type": "Linux",
"name": "Ubuntu",
"architecture": "x86_64",
"version": "20.04"
}
},
"active": true,
"currentOpTime": "2021-09-24T15:26:54.286+0200",
"opid": 71111,
"secs_running": NumberLong(23),
"microsecs_running": NumberLong(23860504),
"op": "query",
"ns": "myDB.changes",
"command": {
"find": "changes",
"filter": {
"$or": [
{
"_id": ObjectId("60b1e8dc9d0359001bb80441")
},
{
"_oid": ObjectId("60b1e8dc9d0359001bb80441")
}
]
},
"sort": {
"_id": -1.0
},
"projection": {
"_id": 1.0
},
"lsid": {
"id": UUID("38c4c09b-d740-4e44-a5a5-b17e0e04f776")
},
"$readPreference": {
"mode": "secondaryPreferred"
},
"$db": "myDB"
},
"numYields": 1346,
"locks": {
"Global": "r",
"Database": "r",
"Collection": "r"
},
"waitingForLock": false,
"lockStats": {
"Global": {
"acquireCount": {
"r": NumberLong(2694)
}
},
"Database": {
"acquireCount": {
"r": NumberLong(1347)
}
},
"Collection": {
"acquireCount": {
"r": NumberLong(1347)
}
}
}
}
This wasn't always a problem; it has only started recently. I've also rebuilt the indexes, and nothing seems to work. I've tried using .explain(), and that also doesn't return.
Any suggestions would be welcome. In my situation, it's going to be much easier to make changes to the DB than to change the query.
This is happening due to the way Mongo chooses what's called a "winning plan". I recommend reading my other answer, which explains this behavior. It will be interesting to see whether the MongoDB team considers this specific behavior a feature or a bug.
Basically, the $or operator has some special qualities, as documented:
When evaluating the clauses in the $or expression, MongoDB either performs a collection scan or, if all the clauses are supported by indexes, MongoDB performs index scans. That is, for MongoDB to use indexes to evaluate an $or expression, all the clauses in the $or expression must be supported by indexes. Otherwise, MongoDB will perform a collection scan.
It seems that adding the sort disrupts the use of this quality, meaning you're suddenly running a collection scan.
What I recommend is using the aggregation pipeline instead of the query language; I personally find its behavior more stable, and it might work there. If not, perhaps just do the sorting in application code.
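An aggregation-pipeline equivalent of the query, as a sketch in pymongo (the namespace myDB.changes is taken from the currentOp output above; whether this avoids the collection scan still depends on the available indexes, so check with explain):

from bson import ObjectId
from pymongo import MongoClient

client = MongoClient()
oid = ObjectId("60b1e8dc9d0359001bb80441")

# Same filter, sort and projection as the original find(), expressed as a pipeline.
pipeline = [
    {"$match": {"$or": [{"_id": oid}, {"_oid": oid}]}},
    {"$sort": {"_id": -1}},
    {"$project": {"_id": 1}},
]
print(list(client["myDB"]["changes"].aggregate(pipeline)))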
The server can use a separate index for each branch of the $or, but to avoid an in-memory sort, the indexes used have to return the documents in the sort order, so that a merge sort can be used instead.
For this query, an index on {_id: 1} would find documents matching the first branch and return them in the proper order. For the second branch, an index on {_oid: 1, _id: 1} would do the same.
If you have both of those indexes, the server should be able to find the matching documents quickly and return them without needing to perform an explicit sort.
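A minimal sketch of creating that compound index and re-running the query with pymongo (the namespace myDB.changes comes from the currentOp output; building the index can take a while on a large collection):

from bson import ObjectId
from pymongo import ASCENDING, DESCENDING, MongoClient

client = MongoClient()
changes = client["myDB"]["changes"]

# Compound index so the {_oid: ...} branch of the $or also yields documents
# already ordered by _id, letting the server merge-sort instead of sorting
# in memory.
changes.create_index([("_oid", ASCENDING), ("_id", ASCENDING)])

oid = ObjectId("60b1e8dc9d0359001bb80441")
cursor = changes.find(
    {"$or": [{"_id": oid}, {"_oid": oid}]},
    {"_id": 1},
).sort("_id", DESCENDING)
print(list(cursor))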

MongoDB system.profile collection: no data for "insert" operations?

I've configured my MongoDB 2.0.2 instance (update: also tried this with a v2.2.0 instance) to log all operations to the system.profile collection (i.e., db.setProfilingLevel(2)) and am trying to see exactly what data is being inserted by an application when it calls save() for a new doc.
I can see the 'insert' operations in the system.profile collection, but it doesn't include the data that's being inserted. Why is that?
In contrast, update operations recorded in system.profile have an 'updateobj' property which shows the data.
Here's an example from a 2.2.0 instance. As you can see, the profile log includes an entry for the update with 'updateobj' data. The insert, however, has no information about what was inserted.
> use test;
switched to db test
> db.getProfilingStatus();
{ "was" : 2, "slowms" : 100 }
> show collections;
cartoons
system.indexes
system.profile
> db.foobar.insert({ "blah": true });
> db.foobar.update({ "blah": true }, { $set: { blerg: 1 } });
> db.system.profile.find({ ns:"test.foobar" });
{
"ts": ISODate("2012-09-25T20:37:40.287Z"),
"op": "insert",
"ns": "test.foobar",
"keyUpdates": 0,
"numYield": 0,
"lockStats": {
"timeLockedMicros": {
"r": NumberLong(0),
"w": NumberLong(2028)
},
"timeAcquiringMicros": {
"r": NumberLong(0),
"w": NumberLong(10)
}
},
"millis": 2,
"client": "127.0.0.1",
"user": ""
}{
"ts": ISODate("2012-09-25T20:38:11.454Z"),
"op": "update",
"ns": "test.foobar",
"query": {
"blah": true
},
"updateobj": {
"$set": {
"blerg": 1
}
},
"nscanned": 1,
"moved": true,
"nmoved": 1,
"nupdated": 1,
"keyUpdates": 0,
"numYield": 0,
"lockStats": {
"timeLockedMicros": {
"r": NumberLong(0),
"w": NumberLong(1797)
},
"timeAcquiringMicros": {
"r": NumberLong(0),
"w": NumberLong(9)
}
},
"millis": 1,
"client": "127.0.0.1",
"user": ""
}
Apologies for misleading you originally; it turns out that this is intentional (my original response related to this being a bug in logging slow ops). The idea behind not recording the inserted document is that doing so would effectively double the write load, since you would be writing the same information (actually a little more) twice.
Since profiling is usually used to troubleshoot performance issues, this has not been implemented as the default. However, it has been requested as an option:
https://jira.mongodb.org/browse/SERVER-3848
As you can see, it is not yet scheduled for a version, but votes and comments outlining why this would be useful do help when deciding what gets implemented.
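As a workaround, if the instance is a replica set member, the full inserted document can be read back from the oplog entry for the insert (op "i", with the document in the o field). A sketch with pymongo, using the test.foobar namespace from the example above; this does not work on a standalone mongod:

from pymongo import DESCENDING, MongoClient

client = MongoClient()

# Insert operations appear in the oplog with op == "i" and the complete
# inserted document in the "o" field.
oplog = client["local"]["oplog.rs"]
for entry in oplog.find({"op": "i", "ns": "test.foobar"}).sort("$natural", DESCENDING).limit(5):
    print(entry["o"])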