mongodb - indexes disappear after a few minutes

I have tested this on MongoDB 3.4 and 3.6:
Create one or more indexes in a collection:
rs1:PRIMARY> db.coll.createIndex({checkinDate:1}, {background:1})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1,
"operationTime" : Timestamp(1518162276, 2),
"$clusterTime" : {
"clusterTime" : Timestamp(1518162276, 2),
"signature" : {
"hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
"keyId" : NumberLong(0)
}
}
}
Now list the indexes.
rs1:PRIMARY> db.coll.getIndexes()
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "cico.coll"
},
{
"v" : 2,
"key" : {
"checkinDate" : 1
},
"name" : "checkinDate_1",
"ns" : "cico.coll",
"background" : 1
}
]
Wait for some time (a few minutes).
List the indexes again:
rs1:PRIMARY> db.coll.getIndexes()
[
{
"v" : 2,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "cico.coll"
}
]
I have no clue why the indexes I created are getting deleted. Any help is appreciated.

The index option background is of type boolean; try:
db.coll.createIndex({ checkinDate: 1 }, { background: true })

This turned out to be a non-issue: there was a program running in the background that was dropping all the indexes periodically (for a completely unrelated reason).
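For anyone debugging the same symptom before the culprit is found, a minimal diagnostic sketch (assuming the db.coll collection from above; the loop count and interval are arbitrary) is to log the index names on a schedule and correlate the moment they vanish with cron jobs, application deploys, or the mongod log:
// Run in the mongo shell: record the index names once a minute for two hours
// so the exact time of the drop can be matched against scheduled jobs.
for (var i = 0; i < 120; i++) {
    print(new Date().toISOString() + "  " +
          db.coll.getIndexes().map(function (ix) { return ix.name; }).join(", "));
    sleep(60 * 1000);   // sleep() is built into the mongo shell
}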

Related

Mongo Case Insensitive index (collation) not getting results

I'm trying to use a case-insensitive index with Mongoid 5. I even dropped down to the Mongo console to run the experiments, but it won't return the results at all:
> db.charges.getIndexes()
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "paymentsapi_development.charges"
}
]
> db.charges.createIndex({'details.email':1},{name:'emails_index',collation:{locale:'en',strength:1}})
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
> db.charges.getIndexes()
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "paymentsapi_development.charges"
},
{
"v" : 1,
"key" : {
"details.email" : 1
},
"name" : "emails_index",
"ns" : "paymentsapi_development.charges",
"collation" : {
"locale" : "en",
"strength" : 1
}
}
]
> db.charges.find({'details.email':'lolo#lolo.com'}).count()
1
> db.charges.find({'details.email':'LoLo#LoLo.cOm'}).collation({locale:'en',strength:1}).count()
0
>
I'm basing this on the Collation documentation (https://docs.mongodb.com/manual/reference/collation/) and the Case Insensitive Index documentation (https://docs.mongodb.com/manual/core/index-case-insensitive/).
I already checked the db version and db.adminCommand( { setFeatureCompatibilityVersion: <version> } ) as described at https://docs.mongodb.com/manual/reference/command/setFeatureCompatibilityVersion/, and tried everything else I could think of.
Any help is very much appreciated.
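For reference, the linked case-insensitive index page pairs a collation of strength 2 on the index with an identical collation on the query. A minimal sketch of that documented pattern, run against a hypothetical scratch collection (charges_scratch is a placeholder name), looks like this:
// Sketch of the documented case-insensitive pattern (strength 2 per the docs);
// the index collation and the query collation must match exactly.
db.charges_scratch.createIndex(
    { "details.email": 1 },
    { name: "emails_ci", collation: { locale: "en", strength: 2 } }
)
db.charges_scratch.find({ "details.email": "LoLo#LoLo.cOm" })
    .collation({ locale: "en", strength: 2 })
    .count()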

mongodb insert really slow

I use MongoDB to manage device log data. Right now the collection has over one million documents. Each document contains more than 30 fields, some of which are embedded documents. Inserting new documents has become really slow: a single insert costs more than 1000 ms. From the slow query ops I get logs like this:
{
"op" : "insert",
"ns" : "xxx.LogDeviceReport",
"query" : {
"_id" : ObjectId("xxxx"),
"deviceId" : ObjectId("xxxx"),
"en" : "xxxxxx",
"status" : 1,
'other fields, more than 30 fields...'
...
...
},
"ninserted" : 1,
"keyUpdates" : 0,
"writeConflicts" : 0,
"numYield" : 0,
"locks" : {
"Global" : {
"acquireCount" : {
"w" : NumberLong(2)
}
},
"MMAPV1Journal" : {
"acquireCount" : {
"w" : NumberLong(3)
}
},
"Database" : {
"acquireCount" : {
"w" : NumberLong(2)
}
},
"Collection" : {
"acquireCount" : {
"W" : NumberLong(1)
},
"acquireWaitCount" : {
"W" : NumberLong(1)
},
"timeAcquiringMicros" : {
"W" : NumberLong(1477481)
}
},
"oplog" : {
"acquireCount" : {
"w" : NumberLong(1)
}
}
},
"millis" : 977,
"execStats" : {
},
"ts" : ISODate("2016-08-02T22:01:01.270Z"),
"client" : "xxx.xxx.xxxx",
"allUsers" : [
{
"user" : "xxx",
"db" : "xxx"
}
],
"user" : "xx#xx"
}
I checked the index, like this:
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "xxx.LogDeviceReport"
},
{
"v" : 1,
"key" : {
"time" : 1
},
"name" : "time_1",
"ns" : "xxx.LogDeviceReport",
"expireAfterSeconds" : 604800,
"background" : true
}
]
There is only the _id index and a TTL index on time, no other indexes.
I guessed that the 'query' part slows the operation down. The MongoDB docs say that only _id is checked for uniqueness, but the log shows all the fields under 'query'; does that matter?
If that is not the reason, what makes it so slow? Can anyone help me?
If you are using MongoDB 3.0 or later, consider using the WiredTiger storage engine instead of MMAPv1, which is what your deployment is using.
I personally saw a 4x improvement when inserting up to 156000 documents in a single go.
MMAPv1 took around 40 minutes, and when I switched to WiredTiger the same task completed in 10 minutes.
Please check the MongoDB blog for more information; a short sketch of how to check and switch the engine follows below.
Note: WiredTiger is only available from MongoDB 3.0 onwards.
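A quick way to confirm which engine a deployment is running, plus a rough outline of a switch, is sketched here. This is only a sketch: data files are not compatible between engines, so a standalone server needs a dump and restore (a replica set member can instead be re-synced), and the paths shown are placeholders.
// In the mongo shell: report the active storage engine
db.serverStatus().storageEngine.name     // "mmapv1" or "wiredTiger"

// Rough outline for a standalone mongod (paths are placeholders):
//   mongodump --out /backup/dump
//   mongod --storageEngine wiredTiger --dbpath /data/wiredtiger
//   mongorestore /backup/dump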

What index should be added in MongoDB to support an $elemMatch query on an embedded document

Suppose we have the following document:
{
embedded:[
{
email:"abc#abc.com",
active:true
},
{
email:"def#abc.com",
active:false
}]
}
What index should be used to support an $elemMatch query on the email and active fields of the embedded documents?
Update on the question:
db.foo.aggregate([{"$match":{"embedded":{"$elemMatch":{"email":"abc#abc.com","active":true}}}},{"$group":{_id:null,"total":{"$sum":1}}}],{explain:true});
Running this query, I get the following output from explain on the aggregate:
{
"stages" : [
{
"$cursor" : {
"query" : {
"embedded" : {
"$elemMatch" : {
"email" : "abc#abc.com",
"active" : true
}
}
},
"fields" : {
"_id" : 0,
"$noFieldsNeeded" : 1
},
"planError" : "InternalError No plan available to provide stats"
}
},
{
"$group" : {
"_id" : {
"$const" : null
},
"total" : {
"$sum" : {
"$const" : 1
}
}
}
}
],
"ok" : 1
}
I think MongoDB is internally not using the index for this query.
Thanks in advance :)
Update with the output of db.foo.stats():
db.foo.stats()
{
"ns" : "test.foo",
"count" : 2,
"size" : 480,
"avgObjSize" : 240,
"storageSize" : 8192,
"numExtents" : 1,
"nindexes" : 3,
"lastExtentSize" : 8192,
"paddingFactor" : 1,
"systemFlags" : 0,
"userFlags" : 1,
"totalIndexSize" : 24528,
"indexSizes" : {
"_id_" : 8176,
"embedded.email_1_embedded.active_1" : 8176,
"name_1" : 8176
},
"ok" : 1
}
db.foo.getIndexes();
[
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "test.foo"
},
{
"v" : 1,
"key" : {
"embedded.email" : 1,
"embedded.active" : 1
},
"name" : "embedded.email_1_embedded.active_1",
"ns" : "test.foo"
},
{
"v" : 1,
"key" : {
"name" : 1
},
"name" : "name_1",
"ns" : "test.foo"
}
]
Should you decide to stick with that data model and those queries, here's how to create indexes that match the query:
You can simply index "embedded.email", or use a compound key on the embedded fields, i.e. something like:
> db.foo.ensureIndex({"embedded.email" : 1 });
- or -
> db.foo.ensureIndex({"embedded.email" : 1, "embedded.active" : 1});
Indexing boolean fields is often not too useful, since their selectivity is low.
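To check whether the compound index is actually selected, one option (a sketch against the same db.foo collection) is to run explain on the plain find() form of the $match predicate; depending on the server version, the output should show either a BtreeCursor on embedded.email_1_embedded.active_1 or an IXSCAN stage, rather than a full collection scan:
// Verify index selection for the $elemMatch predicate on its own
db.foo.find(
    { embedded: { $elemMatch: { email: "abc#abc.com", active: true } } }
).explain()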

MongoDB does not combine 1d and 2d indexes; geo queries scan all documents irrespective of the filters applied to limit the number of records

Below is the output from explain for one of the queries:
{
"cursor" : "GeoSearchCursor",
"isMultiKey" : false,
"n" : 0,
"nscannedObjects" : **199564**,
"nscanned" : 199564,
"nscannedObjectsAllPlans" : **199564**,
"nscannedAllPlans" : **199564**,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 0,
"nChunkSkips" : 0,
"millis" : 1234,
"indexBounds" : {
},
"server" : "MongoDB",
"filterSet" : false
}
This query scans all 199564 records, whereas the constraints applied in the query's filter should narrow it down to only a few hundred records.
Pointers would be much appreciated.
Adding the query and the indexes applied:
Query
{
"isfeatured" : 1 ,
"status" : 1 ,
"isfesturedseq" : 1 ,
"loc_long_lat" : {
"$near" : [ 76.966438 , 11.114906]
} ,
"city_id" : "40" ,
"showTime.0" : { "$exists" : true}}
Indexes
{
"v" : 1,
"key" : {
"_id" : 1
},
"name" : "_id_",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"loc_long_lat" : "2d"
},
"name" : "loc_long_lat_2d",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"georand" : "2d"
},
"name" : "georand_2d",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"city_id" : 1
},
"name" : "city_id_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"endDatetime" : 1
},
"name" : "endDatetime_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"movieid" : 1
},
"name" : "movieid_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"theaterid" : 1
},
"name" : "theaterid_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"status" : 1
},
"name" : "status_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"isfeatured" : 1
},
"name" : "isfeatured_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"isfesturedseq" : 1
},
"name" : "isfesturedseq_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"is_popular" : 1
},
"name" : "is_popular_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"loc_name" : 1
},
"name" : "loc_name_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"est_city_id" : 1
},
"name" : "est_city_id_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"isfeatured" : 1,
"status" : 1,
"city_id" : 1
},
"name" : "isfeatured_1_status_1_city_id_1",
"ns" : "test_live.movies_theater_map",
"background" : true
},
{
"v" : 1,
"key" : {
"movieid" : 1,
"endDatetime" : 1,
"city_id" : 1,
"status" : 1
},
"name" : "movieid_1_endDatetime_1_city_id_1_status_1",
"ns" : "test_live.movies_theater_map",
"background" : 2
},
{
"v" : 1,
"key" : {
"movieid" : 1,
"endDatetime" : 1,
"city_id" : 1,
"status" : 1,
"georand" : 1
},
"name" : "movieid_1_endDatetime_1_city_id_1_status_1_georand_1",
"ns" : "test_live.movies_theater_map",
"background" : 2
},
{
"v" : 1,
"key" : {
"rand" : 1
},
"name" : "rand_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"isfeatured" : 1,
"city_id" : 1,
"status" : 1
},
"name" : "isfeatured_1_city_id_1_status_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"movieid" : 1,
"city_id" : 1
},
"name" : "movieid_1_city_id_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"loc_long_lat" : 1,
"is_popular" : 1,
"movieid" : 1,
"status" : 1
},
"name" : "loc_long_lat_1_is_popular_1_movieid_1_status_1",
"ns" : "test_live.movies_theater_map"
},
{
"v" : 1,
"key" : {
"status" : 1,
"city_id" : 1,
"theaterid" : 1,
"endDatetime" : 1
},
"name" : "status_1_city_id_1_theaterid_1_endDatetime_1",
"ns" : "test_live.movies_theater_map",
"background" : true
}
The $near operator uses a 2d or 2dsphere index to return documents in order from nearest to furthest. For a 2d index, at most 100 documents are returned. Your query scanned every document because there were no matching documents, and every document, from nearest to furthest, had to be scanned to check whether it matched all the conditions.
I would suggest the following to improve the query (a rough sketch follows the list):
Use the $maxDistance option, which is specified in radians for legacy coordinates, to limit the maximum number of documents scanned.
Use a 2dsphere index, ideally with GeoJSON points instead of legacy coordinates. You can have compound indexes with prefix keys in front of a 2dsphere key, so you could index the query in part on all the other conditions to reduce the number of documents that need to be scanned. What version of MongoDB are you using? You may not have all of these features available with an old version.
Use limit to cap the maximum number of documents scanned. However, when the query has fewer results than the value of limit, you'll still scan every document.
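A rough sketch of those suggestions combined, assuming the location data were migrated to GeoJSON points stored in a hypothetical field loc_geo (the existing loc_long_lat holds legacy pairs) and that the server is new enough to support compound 2dsphere indexes:
// Compound index with prefix keys in front of the 2dsphere key, as suggested above
db.movies_theater_map.createIndex({ city_id: 1, status: 1, loc_geo: "2dsphere" })

db.movies_theater_map.find({
    city_id: "40",
    status: 1,
    isfeatured: 1,
    loc_geo: {
        $near: {
            $geometry: { type: "Point", coordinates: [ 76.966438, 11.114906 ] },
            $maxDistance: 5000          // metres when querying GeoJSON points
        }
    }
}).limit(50)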

can't shard a collection on mongodb

I have a db ("mydb") on mongo that contains 2 collections (c1 and c2). c1 is already hash-sharded, and I want to shard the second collection the same way. I get the following error:
use mydb
sh.shardCollection("mydb.c2", {"LOG_DATE": "hashed"})
{
"proposedKey" : {
"LOG_DATE" : "hashed"
},
"curIndexes" : [
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "mydb.c1",
"name" : "_id_"
}
],
"ok" : 0,
"errmsg" : "please create an index that starts with the shard key before sharding."
So I did
db.c2.ensureIndex({LOG_DATE: 1})
sh.shardCollection("mydb.c2", {"LOG_DATE": "hashed"})
Same error, but it shows the new index:
"proposedKey" : {
"LOG_DATE" : "hashed"
},
"curIndexes" : [
{
"v" : 1,
"key" : {
"_id" : 1
},
"ns" : "mydb.c2",
"name" : "_id_"
},
{
"v" : 1,
"key" : {
"LOG_DATE" : 1
},
"ns" : "mydb.c2",
"name" : "LOG_DATE_1"
}
],
"ok" : 0,
"errmsg" : "please create an index that starts with the shard key before sharding."
Just to be sure, I ran:
db.system.indexes.find()
{ "v" : 1, "key" : { "_id" : 1 }, "ns" : "mydb.c1", "name" : "_id_" }
{ "v" : 1, "key" : { "timestamp" : "hashed" }, "ns" : "mydb.c1", "name" : "timestamp_hashed" }
{ "v" : 1, "key" : { "_id" : 1 }, "ns": "mydb.c2", "name" : "_id_" }
{ "v" : 1, "key" : { "LOG_DATE" : 1 }, "ns" : "mydb.c2", "name" : "LOG_DATE_1" }
I tried the same commands again on the admin database and they failed with the same error.
Then I tried on admin without "hashed" and it worked:
db.runCommand({shardCollection: "mydb.c2", key: {"LOG_DATE": 1}})
Problem: now my collection is sharded on a key that is not hashed, and I can't change it (error: "already sharded").
What was wrong with what I did ?
How can I fix this ?
Thanks in advance
Thomas
The problem initially was that you did not have a hashed index on the key you proposed to use for sharding; that is what the error message is about. After the first error message, you created this index:
{
"v" : 1,
"key" : {
"LOG_DATE" : 1
},
"ns" : "mydb.c2",
"name" : "LOG_DATE_1"
}
That is still just an ordinary index, not a hashed one. If you had done this:
db.c2.ensureIndex({LOG_DATE: "hashed"})
instead of this:
db.c2.ensureIndex({LOG_DATE: 1})
then you would have a hashed index. As you can see in the output of db.system.indexes.find(), on the other collection you have a hashed index on timestamp; I assume that is the shard key for that collection.
So if you have no data in the c2 collection:
db.c2.drop()
db.createCollection('c2')
db.c2.ensureIndex({LOG_DATE: "hashed"})
sh.shardCollection("mydb.c2", {"LOG_DATE": "hashed"})
This will work properly.
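To verify the outcome afterwards (a small sketch; sh.status() summarises sharded collections with their shard keys):
db.c2.getIndexes()    // should now include a { "LOG_DATE" : "hashed" } index
sh.status()           // mydb.c2 should be listed with shard key { "LOG_DATE" : "hashed" }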