I'm using MongoDB v4.2, and I'm trying to limit the number of documents scanned using maxScan. I've tried limit, but I believe that pulls all matching documents and then slices the array; what I actually want is to stop MongoDB from scanning past the first 5 docs.
Here is the error I get:
db.movies.find({title: 'Godfather'}).maxScan(5)
Error: error: {
"operationTime" : Timestamp(1598049657, 1),
"ok" : 0,
"errmsg" : "Failed to parse: { find: \"content_movies\", filter: { title: \"Godfather\" }, maxScan: 5.0, lsid: { id: UUID(\"de0fad49-6cd1-425f-896a-77aa7229e4f0\") }, $clusterTime: { clusterTime: Timestamp(1598049547, 1), signature: { hash: BinData(0, 98F9B39F0F6B9E8088947EF37506EA8B17F8AFAA), keyId: 6838904097895088131 } }, $db: \"PRODUCTION\" }. Unrecognized field 'maxScan'.",
"code" : 9,
"codeName" : "FailedToParse",
"$clusterTime" : {
"clusterTime" : Timestamp(1598049657, 1),
"signature" : {
"hash" : BinData(0,"/oI+65SAR7fEGyp9yilR+PFG3KQ="),
"keyId" : NumberLong("6838904097895088131")
}
}
}
Appreciate your help.
maxScan is removed in 4.2.
MongoDB removes the deprecated option maxScan for the find command and the mongo shell helper cursor.maxScan(). Use either the maxTimeMS option for the find command or the helper cursor.maxTimeMS() instead.
I actually want to stop mongo from scanning past the first 5 docs
Use limit for this.
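For example, with the collection and filter from your question:

db.movies.find({ title: 'Godfather' }).limit(5)

The limit is applied on the server, so for a plain find like this the query stops scanning for more results once it has returned 5 matching documents; it does not pull everything and slice afterwards.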
I have a collection of 9 million records. I found an index for which, if I try to fetch all of its documents, MongoDB throws the error below.
Error: error: {
"ok" : 0,
"errmsg" : "invalid bson type in element with field name '_contract_end_date' in object with unknown _id",
"code" : 22,
"codeName" : "InvalidBSON",
"operationTime" : Timestamp(1585753324, 14),
"$clusterTime" : {
"clusterTime" : Timestamp(1585753324, 14),
"signature" : {
"hash" : BinData(0,"2fEF+tGQoHsjvCCWph9YhkVajCs="),
"keyId" : NumberLong("6756221167083716618")
}
}
}
So I tried to rename the field to contract_end_date using the $rename operator. When I tried updateMany, it threw the same error, but updateOne worked (both commands are sketched below).
That isn't much help, though: I just see a success message, and the 100-odd docs behind that index are not actually updated. I wonder how I can see the corrupted docs, so I can identify their other fields and track down the application that is corrupting them.
Sample doc: it's a pretty simple, flat structure; each doc has around 50 fields and no nested docs.
{
_id:
sys_contract_end_date:
customer_name:
location:
owner:
retailer:
seller:
}
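Roughly, those commands look like this (a sketch; the collection name here is a placeholder, and the source field name is taken from the error message above):

// Rename across all affected documents -- fails with the same InvalidBSON error
db.mycollection.updateMany(
    { "_contract_end_date": { $exists: true } },
    { $rename: { "_contract_end_date": "contract_end_date" } }
)

// Rename one document at a time -- reports success, but the documents do not actually change
db.mycollection.updateOne(
    { "_contract_end_date": { $exists: true } },
    { $rename: { "_contract_end_date": "contract_end_date" } }
)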
I'm trying to update the value of a field with the value of another field in the same document. The MongoDB docs say this is possible using an aggregation pipeline, as described here.
However, even the sample code from the docs results in a TypeMismatch (code 14) error.
command:
db.members.update(
    { },
    [
        { $set: { status: "Modified", comments: [ "$misc1", "$misc2" ] } },
        { $unset: [ "misc1", "misc2" ] }
    ],
    { multi: true }
)
result:
WriteCommandError({
"operationTime" : Timestamp(1561779602, 1),
"ok" : 0,
"errmsg" : "BSON field 'update.updates.u' is the wrong type 'array', expected type 'object'",
"code" : 14,
"codeName" : "TypeMismatch",
"$clusterTime" : {
"clusterTime" : Timestamp(1561779602, 1),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
})
Is this an actual bug in MongoDB, or am I missing something?
I think what you are facing is a MongoDB version issue.
According to the official documentation:
Update with Aggregation Pipeline
Starting in MongoDB 4.2, the db.collection.update() can use an
aggregation pipeline for the update. The pipeline can consist of the
following stages:
$addFields and its alias $set
$project and its alias $unset
$replaceRoot and its alias $replaceWith.
You can see that this support is only available from MongoDB version 4.2 onwards, and that's why it is throwing that error.
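As a quick check (a sketch, assuming you have access to the mongo shell on that deployment), confirm which server version you are actually connected to:

// Prints the server version; the pipeline form of update requires 4.2 or newer
db.version()

// The same information via a command
db.runCommand({ buildInfo: 1 }).version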
We have two MongoDB clusters that do not interact with each other. We used to run the dataSize command (https://docs.mongodb.com/manual/reference/command/dataSize/) to record the storage used for each specified ID, and both clusters were running smoothly. Recently, one cluster's secondary server failed and we restarted that cluster. Since then, the dataSize command has stopped working for it and responds with a "couldn't find valid index containing key pattern" error.
Example of the error returned:
rs0:PRIMARY> db.runCommand({ dataSize: "dudubots.channel_tdata", keyPattern: { "c_id_s": 1 }, min: { "c_id_s": 1 }, max: { "c_id_s": 4226 } });
{
"estimate" : false,
"ok" : 0,
"errmsg" : "couldn't find valid index containing key pattern",
"operationTime" : Timestamp(1553510158, 20),
"$clusterTime" : {
"clusterTime" : Timestamp(1553510158, 20),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
The other cluster is running smoothly and no error is given:
rs0:PRIMARY> db.runCommand({ dataSize: "dudubots.channel_tdata", keyPattern: { "c_id_s": 1 }, min: { "c_id_s": 3015 }, max: { "c_id_s": 3017 } })
{
"estimate" : false,
"size" : 6075684,
"numObjects" : 3778,
"millis" : 1315,
"ok" : 1
}
The field c_id_s is indeed indexed on both clusters. We don't understand why this cluster fails to run the command.
We have found the problem: the index had actually been changed. The dataSize command requires the index to be ascending, but on this cluster it had been changed to descending.
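For example (a sketch; the collection and key names are taken from the question, and the exact index spec should be confirmed with getIndexes()), recreating the ascending index makes dataSize work again:

// Drop the descending index and recreate it ascending
db.channel_tdata.dropIndex({ c_id_s: -1 })
db.channel_tdata.createIndex({ c_id_s: 1 })

// dataSize can now find a valid index for the key pattern
db.runCommand({
    dataSize: "dudubots.channel_tdata",
    keyPattern: { c_id_s: 1 },
    min: { c_id_s: 1 },
    max: { c_id_s: 4226 }
})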
In SSH session 1, I ran an operation to create a partial index in MongoDB as follows:
db.scores.createIndex(
    { event_time: 1, "writes.k": 1 },
    { background: true,
      partialFilterExpression: {
          "writes.l_at": null,
          "writes.d_at": null
      } });
The index is quite large and creation takes 30+ minutes. While it was still running, I started SSH session 2.
In SSH session 2 to the cluster, I listed the indexes on my scores collection, and it looks like the new index is already there...
db.scores.getIndexes()
[
...,
{
"v" : 1,
"key" : {
"event_time" : 1,
"writes.k" : 1
},
"name" : "event_time_1_writes.k_1",
"ns" : "leaderboard.scores",
"background" : true,
"partialFilterExpression" : {
"writes.l_at" : null,
"writes.d_at" : null
}
}
]
When trying to count with a hint on this index, I get the error below:
db.scores.find().hint('event_time_1_writes.k_1').count()
2019-02-06T22:35:38.857+0000 E QUERY [thread1] Error: count failed: {
"ok" : 0,
"errmsg" : "error processing query: ns=leaderboard.scoresTree: $and\nSort: {}\nProj: {}\n planner returned error: bad hint",
"code" : 2,
"codeName" : "BadValue"
} : _getErrorWithCode#src/mongo/shell/utils.js:25:13
DBQuery.prototype.count#src/mongo/shell/query.js:383:11
#(shell):1:1
I have never seen this before, but I need confirmation: is it failing because the index build is still running?
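In case it matters, a way to check whether the build is still in progress would be something like this (a sketch using db.currentOp(); I'm assuming in-progress builds report a "msg" field containing "Index Build"):

// List in-progress operations that look like index builds
db.currentOp(true).inprog
    .filter(function (op) { return op.msg && op.msg.indexOf("Index Build") !== -1; })
    .forEach(function (op) { printjson({ opid: op.opid, msg: op.msg, progress: op.progress }); });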
Thanks!
I am using MongoDB 2.6.4 and am still getting an error:
uncaught exception: aggregate failed: {
"errmsg" : "exception: aggregation result exceeds maximum document size (16MB)",
"code" : 16389,
"ok" : 0,
"$gleStats" : {
"lastOpTime" : Timestamp(1422033698000, 105),
"electionId" : ObjectId("542c2900de1d817b13c8d339")
}
}
Reading various pieces of advice, I came across the suggestion of saving the result into another collection using $out. My query now looks like this:
db.audit.aggregate([
    { $match: { "date": { $gte: ISODate("2015-01-22T00:00:00.000Z"),
                          $lt:  ISODate("2015-01-23T00:00:00.000Z") } } },
    { $unwind: "$data.items" },
    { $out: "tmp" }
])
But now I am getting a different error:
uncaught exception: aggregate failed:
{"errmsg" : "exception: insert for $out failed: { lastOp: Timestamp 1422034172000|25, connectionId: 625789, err: \"insertDocument :: caused by :: 11000 E11000 duplicate key error index: duties_and_taxes.tmp.agg_out.5.$_id_ dup key: { : ObjectId('54c12d784c1b2a767b...\", code: 11000, n: 0, ok: 1.0, $gleStats: { lastOpTime: Timestamp 1422034172000|25, electionId: ObjectId('542c2900de1d817b13c8d339') } }",
"code" : 16996,
"ok" : 0,
"$gleStats" : {
"lastOpTime" : Timestamp(1422034172000, 26),
"electionId" : ObjectId("542c2900de1d817b13c8d339")
}
}
Does anyone have a solution?
The error is due to the $unwind step in your pipeline.
When you unwind on a field with n elements, n copies of the same document are produced, all with the same _id; each copy contains one of the elements from the array that was unwound. See the demonstration below of the records after an unwind operation.
Sample demo:
> db.t.insert({"a":[1,2,3,4]})
WriteResult({ "nInserted" : 1 })
> db.t.aggregate([{$unwind:"$a"}])
{ "_id" : ObjectId("54c28dbe8bc2dadf41e56011"), "a" : 1 }
{ "_id" : ObjectId("54c28dbe8bc2dadf41e56011"), "a" : 2 }
{ "_id" : ObjectId("54c28dbe8bc2dadf41e56011"), "a" : 3 }
{ "_id" : ObjectId("54c28dbe8bc2dadf41e56011"), "a" : 4 }
>
Since all these documents have the same _id, you get a duplicate key exception (caused by the identical _id value across all the unwound documents) when they are inserted into the new collection named tmp.
The pipeline will fail to complete if the documents produced by the
pipeline would violate any unique indexes, including the index on the
_id field of the original output collection.
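If you want to keep the $out approach, one possible workaround (a sketch, assuming you do not need to preserve the original _id values and that fresh ones may be generated on insert) is to project the fields you need without _id before the $out stage:

db.audit.aggregate([
    { $match: { "date": { $gte: ISODate("2015-01-22T00:00:00.000Z"),
                          $lt:  ISODate("2015-01-23T00:00:00.000Z") } } },
    { $unwind: "$data.items" },
    // Keep only the fields you need and drop _id so the unwound copies no longer collide
    { $project: { _id: 0, date: 1, "data.items": 1 } },
    { $out: "tmp" }
])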
To solve your original problem, you could instead set the allowDiskUse option to true. It allows the aggregation to use disk space whenever it needs to.
Optional. Enables writing to temporary files. When set to true,
aggregation operations can write data to the _tmp subdirectory in the
dbPath directory. See Perform Large Sort Operation with External Sort
for an example.
as in:
db.audit.aggregate([
    { $match: { "date": { $gte: ISODate("2015-01-22T00:00:00.000Z"),
                          $lt:  ISODate("2015-01-23T00:00:00.000Z") } } },
    { $unwind: "$data.items" }  // note, the pipeline ends here
],
{
    allowDiskUse: true
});