MongoDB PartialFilterExpression filter issue - mongodb

I have a collection of items where a document looks something like this:
{
"source" : "rest",
"serviceCode" : "fluff",
"fluff" : "puff",
"systemEntryTime" : ISODate("2018-05-16T09:04:00.585Z")
}
I have a TTL index that expires documents after two weeks:
{
"v" : 1,
"key" : {
"systemEntryTime" : 1
},
"name" : "systemEntryTime_1",
"ns" : "storage.item",
"expireAfterSeconds" : NumberLong(1209600)
}
Now I want certain documents where source = "ftp" to have a different TTL. For this purpose I created the following index with a partialFilterExpression:
{
"v" : 1,
"key" : {
"systemEntryTime" : 1,
"source" : 1
},
"name" : "systemEntryTime_1_source_1",
"ns" : "storage.item",
"expireAfterSeconds" : NumberLong(1),
"partialFilterExpression" : {
"source" : {
"$eq" : "ftp"
}
}
}
Unfortunately this is not working. What am I doing wrong here? I have experimented with dropping the old index and using only this one, but no documents are dropped according to the TTL (or any documents at all, for that matter).
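One thing worth checking: MongoDB documents TTL as a single-field index feature, and a compound index such as { systemEntryTime: 1, source: 1 } ignores expireAfterSeconds. A minimal sketch of a single-field partial TTL index is given here. The index name and the one-day TTL are made-up values for illustration, db.item assumes the shell is already on the storage database, and depending on the server version this index may conflict with the existing systemEntryTime_1 index on the same key, so treat it as a starting point rather than a confirmed fix.

```javascript
// Sketch only: the index name and the TTL value are illustrative assumptions.
// TTL is honoured only on single-field indexes, so the key holds just the date field.
const ftpTtlKey = { systemEntryTime: 1 };
const ftpTtlOptions = {
  name: "systemEntryTime_ftp_ttl",      // hypothetical name
  expireAfterSeconds: 24 * 60 * 60,     // hypothetical TTL for ftp documents (one day)
  partialFilterExpression: { source: { $eq: "ftp" } }
};

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.item.createIndex(ftpTtlKey, ftpTtlOptions);
}
```

Note that the TTL monitor runs in the background at intervals, so documents are not removed at the exact second they expire.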

Related

Slow query when sorting and filtering

I'm using MongoDB version 3.6.5. I would like to query a collection and then sort the result by date. I'm working on what I think is a pretty large dataset: the collection currently holds 195064301 documents, and it's growing.
Running the filter or the sort as separate queries works perfectly:
db.getCollection('logs').find({session: ObjectId("5af3baa173faa851f8b0090c")})
db.getCollection('logs').find({}).sort({date: 1})
Each result is returned in less than 1 second, but if I try to do both in a single query like so:
db.getCollection('logs').find({session: ObjectId("5af3baa173faa851f8b0090c")}).sort({date: 1})
it takes about 5 minutes to return the data. I thought it was an index problem, but as far as I can tell, the indexes seem fine:
> db.logs.getIndexes();
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "client1.logs"
},
{
"v" : 2,
"key" : {
"session" : 1
},
"name" : "session_1",
"ns" : "client1.logs",
"background" : true
},
{
"v" : 2,
"key" : {
"date" : 1
},
"name" : "date_1",
"ns" : "client1.logs",
"background" : true
},
{
"v" : 2,
"key" : {
"user" : 1
},
"name" : "user_1",
"ns" : "client1.logs",
"background" : true
}
]
I'm still new to MongoDB. I ran these queries directly in the console, and I also tried the reIndex() method, but nothing really helped.
So I'm hoping there is a solution to this.
Thanks.
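A likely cause, offered as a guess since no explain() output is shown: with only single-field indexes on session and date, no one index can serve both the filter and the sort. A compound index with the equality field first and the sort field second usually covers this shape of query. The sketch below assumes the same logs collection and reuses the background option already present on the existing indexes.

```javascript
// Sketch: equality field (session) first, sort field (date) second.
const sessionDateKey = { session: 1, date: 1 };

// Runs only inside the mongo shell, where `db`, `ObjectId` and `printjson` are defined.
if (typeof db !== "undefined") {
  db.logs.createIndex(sessionDateKey, { background: true });

  // Print the plan and execution statistics to see which index the query uses.
  printjson(
    db.logs.find({ session: ObjectId("5af3baa173faa851f8b0090c") })
      .sort({ date: 1 })
      .explain("executionStats")
  );
}
```

Building an index on a collection of this size takes a while, so it is worth trying on a copy or during a quiet period first.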

MongoDB duplicate index does not throw error

I am new to MongoDB and am trying to make it throw an error when I insert another document with the same indexed value. According to this answer MongoDB should throw an error.
The steps I took are:
1.) I added an index on the Name field and verified that it was created:
> db.room.getIndexes()
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "test.room"
},
{
"v" : 1,
"key" : {
"Name" : 1
},
"name" : "Name_1",
"ns" : "test.room"
}
]
2.) I tried to add a document with the same name and was able to add it:
> db.room.find().pretty()
{
"_id" : 1,
"ModifiedDate" : ISODate("2017-02-12T10:59:35.394Z"),
"CreatedDate" : ISODate("2017-02-12T10:59:35.394Z"),
"Name" : "Sample"
}
{
"_id" : 2,
"ModifiedDate" : ISODate("2017-02-12T10:59:39.474Z"),
"CreatedDate" : ISODate("2017-02-12T10:59:39.474Z"),
"Name" : "Sample"
}
I am using C# MongoDB Driver 2.4.
You have to specify that the index you are creating is unique, otherwise MongoDB will not enforce it. You can do that with the C# driver using the CreateIndexOptions class.
roomCollection.Indexes
.CreateOne(
Builders<Room>.IndexKeys.Ascending(r => r.Name),
new CreateIndexOptions() { Unique = true });
Note that index creation will fail if there are currently duplicate names in the collection.
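Since duplicates block the creation of a unique index, it can help to find them first. The helper below is a hypothetical, plain JavaScript sketch (not part of any driver) that works on documents already loaded into an array; it is fed the two documents from the question with the date fields left out.

```javascript
// Returns the values of `field` that occur in more than one document.
function findDuplicateValues(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  return [...counts].filter(([, n]) => n > 1).map(([value]) => value);
}

// The two documents from the question (dates omitted).
const rooms = [
  { _id: 1, Name: "Sample" },
  { _id: 2, Name: "Sample" }
];

const duplicates = findDuplicateValues(rooms, "Name");
console.log(duplicates); // [ 'Sample' ]
```

For a large collection the same check is better done on the server, for example with an aggregation that groups by Name and keeps groups with a count above one.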

mongodb insert really slow

I use MongoDB to manage device log data. Right now the collection has over one million documents. Each document contains more than 30 fields, including embedded fields. Inserting new documents has become really slow: an insert takes more than 1000 ms. From the slow operations log, I get entries like this:
{
"op" : "insert",
"ns" : "xxx.LogDeviceReport",
"query" : {
"_id" : ObjectId("xxxx"),
"deviceId" : ObjectId("xxxx"),
"en" : "xxxxxx",
"status" : 1,
'other fields, more than 30 fields...'
...
...
},
"ninserted" : 1,
"keyUpdates" : 0,
"writeConflicts" : 0,
"numYield" : 0,
"locks" : {
"Global" : {
"acquireCount" : {
"w" : NumberLong(2)
}
},
"MMAPV1Journal" : {
"acquireCount" : {
"w" : NumberLong(3)
}
},
"Database" : {
"acquireCount" : {
"w" : NumberLong(2)
}
},
"Collection" : {
"acquireCount" : {
"W" : NumberLong(1)
},
"acquireWaitCount" : {
"W" : NumberLong(1)
},
"timeAcquiringMicros" : {
"W" : NumberLong(1477481)
}
},
"oplog" : {
"acquireCount" : {
"w" : NumberLong(1)
}
}
},
"millis" : 977,
"execStats" : {
},
"ts" : ISODate("2016-08-02T22:01:01.270Z"),
"client" : "xxx.xxx.xxxx",
"allUsers" : [
{
"user" : "xxx",
"db" : "xxx"
}
],
"user" : "xx#xx"
}
I checked the index, like this:
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "xxx.LogDeviceReport"
},
{
"v" : 1,
"key" : {
"time" : 1
},
"name" : "time_1",
"ns" : "xxx.LogDeviceReport",
"expireAfterSeconds" : 604800,
"background" : true
}
]
There is only the _id index and a TTL index on time; there are no other indexes.
I guess the 'query' slows the operation. The MongoDB docs say that only _id is checked for uniqueness, but the log shows all fields in 'query'. Does that matter?
If that is not the reason, what makes it so slow? Can anyone help me?
If you are using MongoDB 3.0 or later, consider using WiredTiger as the storage engine instead of MMAPv1, which is what you are using now.
I personally saw a 4x improvement when inserting up to 156000 documents in a single go:
MMAPv1 took around 40 minutes, and after I switched to WiredTiger the same task completed in 10 minutes.
Please check this link from the MongoDB blog for more information.
Note: WiredTiger is only available from MongoDB 3.0 onwards.
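The profiler entry in the question also hints at where the time goes: the Collection lock shows an exclusive (W) acquisition with a timeAcquiringMicros of 1477481, which suggests the insert spent most of its time waiting for the lock rather than writing. A small sketch for pulling that number out of an entry is below; the function name is made up for illustration, and the NumberLong values are written as plain numbers so the snippet runs outside the shell.

```javascript
// Sums every timeAcquiringMicros value found under `locks` in a profiler entry.
function totalLockWaitMicros(profileEntry) {
  let total = 0;
  for (const lock of Object.values(profileEntry.locks || {})) {
    for (const micros of Object.values(lock.timeAcquiringMicros || {})) {
      total += micros;
    }
  }
  return total;
}

// Trimmed copy of the entry from the question.
const entry = {
  millis: 977,
  locks: {
    Global: { acquireCount: { w: 2 } },
    Collection: {
      acquireCount: { W: 1 },
      acquireWaitCount: { W: 1 },
      timeAcquiringMicros: { W: 1477481 }
    }
  }
};

const waitedMicros = totalLockWaitMicros(entry);
console.log(waitedMicros / 1000); // lock wait in milliseconds
```

MMAPv1 locks at the collection level, while WiredTiger uses document-level concurrency control, which is consistent with the improvement reported above.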

Getting the physical location of extents of Mongodb indexes

I've gone through the MongoDB docs trying to find a way to get the on-disk (physical) location of MongoDB extents.
While presentation1 and presentation2 helped me understand the fundamentals of the storage internals and gave me an idea of how virtual addresses are mapped, they said little about how the virtual-to-physical mapping is done.
The showDiskLoc() method gives us the file and offset for each index, but there is no data about the extent number or its physical location.
db.system.indexes.find().showDiskLoc()
{ "v" : 1, "key" : { "_id" : 1 }, "name" : "id", "ns" : "mydb1.abc", "$diskLoc" : { "file" : 0, "offset" : 28848 } }
{ "v" : 1, "key" : { "x" : 1 }, "name" : "x_1", "ns" : "mydb1.abc", "$diskLoc" : { "file" : 0, "offset" : 28976 } }
{ "v" : 1, "key" : { "x" : 14 }, "name" : "x_14", "ns" : "mydb1.abc", "$diskLoc" : { "file" : 0, "offset" : 29104 } }
Can anyone help me get the physical address of each index extent in the file?

can't shard a collection on mongodb

I have a db ("mydb") on mongo that contains 2 collections (c1 and c2). c1 is already hash sharded. I want to shard the second collection the same way, but I get the following error:
use mydb
sh.shardCollection("mydb.c2", {"LOG_DATE": "hashed"})
{
"proposedKey" : {
"LOG_DATE" : "hashed"
},
"curIndexes" : [
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "mydb.c1",
"name" : "_id_"
}
],
"ok" : 0,
"errmsg" : "please create an index that starts with the shard key before sharding."
So I ran:
db.c2.ensureIndex({LOG_DATE: 1})
sh.shardCollection("mydb.c2", {"LOG_DATE": "hashed"})
I get the same error, but it now shows the new index:
"proposedKey" : {
"LOG_DATE" : "hashed"
},
"curIndexes" : [
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "mydb.c2",
"name" : "_id_"
},
{
"v" : 1,
"key" : {
"LOG_DATE" : 1
},
"ns" : "mydb.c2",
"name" : "LOG_DATE_1"
}
],
"ok" : 0,
"errmsg" : "please create an index that starts with the shard key before sharding."
Just to be sure, I ran:
db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "mydb.c1", "name" : "_id_" }
{ "v" : 1, "key" : { "timestamp" : "hashed" }, "ns" : "mydb.c1", "name" : "timestamp_hashed" }
{ "v" : 1, "key" : { "_id" : 1 }, "ns": "mydb.c2", "name" : "_id_" }
{ "v" : 1, "key" : { "LOG_DATE" : 1 }, "ns" : "mydb.c2", "name" : "LOG_DATE_1" }
I tried the same commands again on admin, and they failed with the same error.
Then I tried on admin without "hashed", and it worked:
db.runCommand({shardCollection: "mydb.c2", key: {"LOG_DATE": 1}})
Problem: now my collection is sharded on a key that is not hashed, and I can't change it (error: "already sharded").
What was wrong with what I did?
How can I fix this?
Thanks in advance
Thomas
The initial problem was that you did not have a hashed index on the field you proposed as the shard key; this is what the error message is about. After the first error message, you created this index:
{
"v" : 1,
"key" : {
"LOG_DATE" : 1
},
"ns" : "mydb.c2",
"name" : "LOG_DATE_1"
}
With this you still just have an ordinary index, not a hashed one. If you had run this:
db.c2.ensureIndex({LOG_DATE: "hashed"})
instead of this:
db.c2.ensureIndex({LOG_DATE: 1})
then it would have been a hashed index. As you can see in the output of db.system.indexes.find(), the other collection has a hashed index on timestamp; I assume this is the shard key for that collection.
So if you have no data in the c2 collection:
db.c2.drop()
db.createCollection('c2')
db.c2.ensureIndex({LOG_DATE: "hashed"})
sh.shardCollection("mydb.c2", {"LOG_DATE": "hashed"})
This will work properly.
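To make the mismatch concrete, here is a simplified model of the check behind the error message. It is a hypothetical helper, not the server's actual logic: it only tests whether some index begins with the proposed shard key, including the index type (1 versus "hashed"). The index lists mirror the curIndexes output from the question.

```javascript
// True when some index's leading fields match the proposed shard key,
// including the index type (1 vs "hashed").
function hasIndexStartingWith(indexes, shardKey) {
  const wanted = Object.entries(shardKey);
  return indexes.some((index) => {
    const have = Object.entries(index.key);
    return wanted.every(([field, type], i) =>
      i < have.length && have[i][0] === field && have[i][1] === type);
  });
}

const proposedKey = { LOG_DATE: "hashed" };

// Indexes on c2 after ensureIndex({LOG_DATE: 1}): ordinary index only.
const withOrdinaryIndex = [{ key: { _id: 1 } }, { key: { LOG_DATE: 1 } }];
// Indexes on c2 after ensureIndex({LOG_DATE: "hashed"}).
const withHashedIndex = [{ key: { _id: 1 } }, { key: { LOG_DATE: "hashed" } }];

console.log(hasIndexStartingWith(withOrdinaryIndex, proposedKey)); // false
console.log(hasIndexStartingWith(withHashedIndex, proposedKey));   // true
```

The ordinary index does satisfy a proposed key of { LOG_DATE: 1 }, which matches what the question reports: sharding without "hashed" went through.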