MongoDB - ensureIndex for record

Is it possible to use ensureIndex within a record rather than across the whole collection?
E.g., my database structure is:
{ "_id" : "com.android.hello",
"rating" : [
[ { "user" : "BBFE7F461E10BEE10A92784EFDB", "value" : "4" } ],
[ { "user" : "BBFE7F461E10BEE10A92784EFDB", "value" : "4" } ]
]
}
It is a rating system, and I don't want a user to rate the same application (com.android.hello) more than once. If I use ensureIndex on the user field, the user can vote on only one application: when I try to vote on a different application (com.android.hi), I get a duplicate key error.

No, you cannot do this. A unique index enforces uniqueness between documents in a collection, not between the elements of an array inside a single document. You will need to redesign your schema for this to work, for example:
{
"_id" : "com.android.hello",
"rating": {
"user" : "BBFE7F461E10BEE10A92784EFDB",
"value" : "4"
}
}
And then just store multiple...
(I realize you didn't provide the full document though)

ensureIndex creates an index that applies to the whole collection. If you want it for only a few records, you may have to keep two collections and apply ensureIndex to one of them.

As @Derick said, no. However, it is possible to make sure a user can only vote once, atomically:
var res=db.votes.update(
{_id: 'com.android.hello', 'rating.user': {$nin:['BBFE7F461E10BEE10A92784EFDB']}},
{$push:{rating:{user:'BBFE7F461E10BEE10A92784EFDB',value:4}}},
{upsert:true}
);
if (res['upserted'] || res['n'] > 0) {
print('voted');
} else {
print('nope');
}
I was a bit concerned that $push would not work in upsert but I tested this as working.
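The guard in the filter can be sketched as a small helper, written in plain JavaScript so it can be checked without a server. The name buildVoteOnce is hypothetical, and $ne is used here in place of the single-element $nin above. One caveat worth verifying on your own deployment: with upsert: true, when the user has already voted the filter matches nothing, so the server attempts an insert with the same _id and the shell should report a duplicate key error rather than n = 0.

```javascript
// Hypothetical helper (not a MongoDB API): builds the arguments for
// db.votes.update(filter, update, options) so that a user is pushed
// into `rating` at most once per application document.
function buildVoteOnce(appId, userId, value) {
  return {
    // Match the app only if no rating element already has this user.
    filter: { _id: appId, 'rating.user': { $ne: userId } },
    // Append the vote; applied only when the filter matched (or on upsert).
    update: { $push: { rating: { user: userId, value: value } } },
    // Create the app document on the first vote, as in the answer above.
    options: { upsert: true }
  };
}

const vote = buildVoteOnce('com.android.hello', 'BBFE7F461E10BEE10A92784EFDB', 4);
console.log(JSON.stringify(vote.filter));
// In the shell: db.votes.update(vote.filter, vote.update, vote.options)
```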

Related

Mongo query keys order

When MongoDB parses a query, it sorts the keys alphabetically. Is there a way to make it keep the order given by the user?
Example :
db.user.explain().find({name: 'test user', active: true})
In the above query, mongo will parse the query to
"$and" : [
{
"active" : {
"$eq" : true
}
},
{
"name" : {
"$eq" : "test user"
}
}
]
While parsing, MongoDB considers the "active" key first and then "name".
I want the query to look at the "name" key first and then "active", like:
"$and" : [
{
"name" : {
"$eq" : "test user"
}
},
{
"active" : {
"$eq" : true
}
}
]
Is there a setting or config option for this?

As you have noticed, parsedQuery in explain() shows the fields in alphabetical order. This order does not matter when there is no suitable index, because every document is loaded from storage into memory and evaluated. Even if you renamed the fields to "aname" and "nactive", the execution time would be the same without an index. What matters is to create an index whose leading fields are the fields you search on. In your case you could create an index on:
{name:1}
or
{name:1, active:1}
Since "active" looks like a boolean field with very low selectivity, adding it may not make much difference to the search unless it is a "covered query", e.g. db.test.find({name:"Test",active:true},{name:1,_id:0}), where the search happens only in memory.
Remember:
Reading from disk into memory is much more expensive than searching in memory, so even if your keys are deliberately renamed to satisfy the alphabetical order, there is no benefit if you don't create an index and the mongod process performs a COLLSCAN over the full collection.
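As a rough illustration of why the key order inside the query object is irrelevant, here is a simplified model of how equality filters line up with the leading fields of a compound index. The function leadingIndexFieldsMatched is hypothetical and not part of MongoDB; the real query planner considers much more than this.

```javascript
// Hypothetical helper, not part of MongoDB: counts how many leading
// fields of a compound index have a condition in the query.
// The order of keys inside the query object plays no role.
function leadingIndexFieldsMatched(indexKeys, query) {
  const indexFields = Object.keys(indexKeys);
  let matched = 0;
  while (matched < indexFields.length &&
         Object.prototype.hasOwnProperty.call(query, indexFields[matched])) {
    matched++;
  }
  return matched;
}

const index = { name: 1, active: 1 };
console.log(leadingIndexFieldsMatched(index, { active: true, name: 'test user' })); // 2
console.log(leadingIndexFieldsMatched(index, { name: 'test user' }));               // 1
console.log(leadingIndexFieldsMatched(index, { active: true }));                    // 0
```

In this model a query on "active" alone cannot use the {name:1, active:1} index prefix, while both orderings of name and active match it fully.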

Add object to object array if an object property is not given yet

Use Case
I've got a collection band_profiles and a collection band_profiles_history. The history collection is supposed to store a band_profile snapshot every 24 hours, so I am using MongoDB's recommended format for historical tracking: each month+year is its own document, and in an object array I store the bandProfile snapshot along with the current day of the month.
My models:
A document in band_profiles_history looks like this:
{
"_id" : ObjectId("599e3bc406955db4cbffe0a8"),
"month" : 7,
"tag_lowercased" : "9yq88gg",
"year" : 2017,
"values" : [
{
"_id" : ObjectId("599e3bc41c073a7418fead91"),
"profile" : {
"_id" : ObjectId("5989a65d0f39d9fd70cde1fe"),
"tag" : "9YQ88GG",
"name_normalized" : "example name1",
},
"day" : 1
},
{
"_id" : ObjectId("599e3bc41c073a7418fead91"),
"profile" : {
"_id" : ObjectId("5989a65d0f39d9fd70cde1fe"),
"tag" : "9YQ88GG",
"name_normalized" : "new name",
},
"day" : 2
}
]
}
And a document in band_profiles:
{
"_id" : ObjectId("5989a6190f39d9fd70cddeb1"),
"tag" : "9V9LRGU",
"name_normalized" : "example name",
"tag_lowercased" : "9v9lrgu",
}
This is how I upsert my documents into band_profiles_history at the moment:
BandProfileHistory.update(
{ tag_lowercased: tag, year, month},
{ $push: {
values: { day, profile }
}
},
{ upsert: true }
)
My problem:
I only want to insert ONE snapshot per day. Right now it always pushes a new object into the values array, whether or not an object for that day already exists. How can I make it push the object only if there is none for the current day yet?

Putting mongoose aside for a moment:
There is an operator, $addToSet, that adds an element to an array only if it doesn't already exist.
Caveat:
If the value is a document, MongoDB determines that the document is a duplicate if an existing document in the array matches the to-be-added document exactly; i.e. the existing document has the exact same fields and values and the fields are in the same order. As such, field order matters and you cannot specify that MongoDB compare only a subset of the fields in the document to determine whether the document is a duplicate of an existing array element.
Since you are trying to add an entire document, you are subject to this restriction.
So I see the following solutions for you:
Solution 1:
Read the array, check whether it contains the element you want, and if not, push it to the values array with $push.
This has the disadvantage of NOT being an atomic operation, meaning that you could end up with duplicates anyway. This could be acceptable if you ran a periodic clean-up job to remove duplicates from this field on each document.
It's up to you to decide if this is acceptable.
Solution 2:
Assuming you are putting the field _id in the subdocuments of your values field, stop doing it. Assuming mongoose is doing this for you (because it does, from what I understand) stop it from doing it like it says here: Stop mongoose from creating _id for subdocument in arrays.
Next, you need to ensure that the fields in the document always have the same order, because order matters when comparing documents in the $addToSet operation, as stated in the citation above.
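To make the field-order caveat concrete, here is a simplified in-memory model of the duplicate check, in plain JavaScript. Both function names are hypothetical, and JSON.stringify is only a rough stand-in for the server's exact-match comparison (it preserves key order, which is the property that matters here).

```javascript
// Simplified model of the duplicate check described above: two
// sub-documents count as the same only if they have the same fields,
// in the same order, with the same values.
function isExactDuplicate(a, b) {
  return JSON.stringify(a) === JSON.stringify(b);
}

// Hypothetical in-memory equivalent of $addToSet on an array of sub-documents.
function addToSetLike(values, candidate) {
  if (!values.some(function (v) { return isExactDuplicate(v, candidate); })) {
    values.push(candidate);
  }
  return values;
}

const values = [{ day: 1, profile: { tag: '9YQ88GG' } }];
addToSetLike(values, { day: 1, profile: { tag: '9YQ88GG' } }); // same order: ignored
addToSetLike(values, { profile: { tag: '9YQ88GG' }, day: 1 }); // other order: added
console.log(values.length); // 2
```

An auto-generated _id on each sub-document would make every candidate differ, which is why Solution 2 starts by removing it.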
Solution 3:
Change the schema of your band_profiles_history to something like:
{
"_id" : ObjectId("599e3bc406955db4cbffe0a8"),
"month" : 7,
"tag_lowercased" : "9yq88gg",
"year" : 2017,
"values" : {
"1": { "_id" : ObjectId("599e3bc41c073a7418fead91"),
"profile" : {
"_id" : ObjectId("5989a65d0f39d9fd70cde1fe"),
"tag" : "9YQ88GG",
"name_normalized" : "example name1"
}
},
"2": {
"_id" : ObjectId("599e3bc41c073a7418fead91"),
"profile" : {
"_id" : ObjectId("5989a65d0f39d9fd70cde1fe"),
"tag" : "9YQ88GG",
"name_normalized" : "new name"
}
}
}
}
Notice that the day field became the key for the subdocuments on the values. Notice also that values is now an Object instead of an Array.
Now you can run an update query that updates values.<day> only if values.<day> doesn't exist yet.
Personally I don't like this as it is using the fact that JSON doesn't allow duplicate keys to support the schema.
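A minimal sketch of such an update for the object-keyed schema, built as plain JavaScript values so it can be checked without a server. The helper name buildDaySnapshotUpdate is hypothetical. The sketch leaves out upsert on purpose: combined with upsert: true, a filter that matches nothing leads to an insert attempt, so a unique index on tag_lowercased, year and month would be needed as well (an assumption, not something the question mentions).

```javascript
// Hypothetical helper: builds filter and update for a schema where
// `values` is an object keyed by day of month.
function buildDaySnapshotUpdate(tag, year, month, day, profile) {
  const path = 'values.' + day; // e.g. "values.3"
  const filter = { tag_lowercased: tag, year: year, month: month };
  filter[path] = { $exists: false }; // match only if the day is not stored yet
  const update = { $set: {} };
  update.$set[path] = { profile: profile };
  return { filter: filter, update: update };
}

const op = buildDaySnapshotUpdate('9yq88gg', 2017, 7, 3, { tag: '9YQ88GG' });
console.log(JSON.stringify(op.filter));
// In the shell: db.band_profiles_history.update(op.filter, op.update)
```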

First of all, sadly, MongoDB does not support uniqueness of a field within an array of a document. There is a major bug that has been open for 7 years and is still not closed (which is a shame, in my opinion).
What you can do from here is limited, and all of it is at the application level. I had the same problem and solved it at the application level. Do something like this:
First, read your document by document _id and values.day.
If the read in step 1 returns null, there is no entry in the values array for the given day, so you can push the new value (I assume band_profile_history already has a record with that _id value).
If the read in step 1 returns a document, the values array already has an entry for the given day. In that case you can use a $set operation with the positional $ operator.
As others said, this is not atomic, but since you are handling the problem at the application level, you can make the whole block of code synchronized. Of the 3 queries below, 2 will run against MongoDB each time:
db.getCollection('band_profiles_history').find({"_id": "1", "values.day": 3})
If it returns null:
db.getCollection('band_profiles_history').update({"_id": "1"}, {$push: {"values": {<your new band profile history for given day>}}})
If it does not return null:
db.getCollection('band_profiles_history').update({"_id": "1", "values.day": 3}, {$set: {"values.$": {<your new band profile history for given day>}}})
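The decision between the two updates can be sketched as one function in plain JavaScript. The name chooseSnapshotUpdate is hypothetical, and the sketch assumes the read is done with findOne, since find returns a cursor rather than null. As noted above, the read and the write are still two separate operations, so this is not atomic.

```javascript
// Hypothetical helper mirroring the steps above. `existing` is the result of
// findOne({_id: id, "values.day": day}): null when the day has no entry yet.
function chooseSnapshotUpdate(existing, id, day, snapshot) {
  const entry = Object.assign({ day: day }, snapshot);
  if (existing === null) {
    // No entry for this day: append a new one.
    return { filter: { _id: id }, update: { $push: { values: entry } } };
  }
  // Entry exists: overwrite the matched array element in place.
  return {
    filter: { _id: id, 'values.day': day },
    update: { $set: { 'values.$': entry } }
  };
}

const first = chooseSnapshotUpdate(null, '1', 3, { profile: { tag: '9YQ88GG' } });
console.log(Object.keys(first.update)[0]); // $push
```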

To check whether the field is absent:
{ field: {$exists: false} }
or, if it is an array, whether it is empty:
{ field: {$eq: []} }
Mongoose also supports field: {type: Date}, so you could store a date instead of counting days and only update entries for the current date.

I have big database on mongodb and can't find and use my info

This is my code:
db.test.find()
{
"_id" : ObjectId("4d3ed089fb60ab534684b7e9"),
"title" : "Sir",
"name" : {
"_id" : ObjectId("4d3ed089fb60ab534684b7ff"),
"first_name" : "Farid"
},
"addresses" : [
{
"city" : "Baku",
"country" : "Azerbaijan"
},{
"city" : "Susha",
"country" : "Azerbaijan"
},{
"city" : "Istanbul",
"country" : "Turkey"
}
]
}
I want to output only the cities, or only the countries. How can I do it?

I'm not 100% sure about your code example, because if you 'find' by ID there's no need to search by anything else... but I wonder whether the following can help:
db.test.insert({name:'farid', addresses:[
{"city":"Baku", "country":"Azerbaijan"},
{"city":"Susha", "country":"Azerbaijan"},
{"city" : "Istanbul","country" : "Turkey"}
]});
db.test.insert({name:'elena', addresses:[
{"city" : "Ankara","country" : "Turkey"},
{"city":"Baku", "country":"Azerbaijan"}
]});
Then the following will show all countries:
db.test.aggregate(
{$unwind: "$addresses"},
{$group: {_id:"$country", countries:{$addToSet:"$addresses.country"}}}
);
result will be
{ "result" : [
{ "_id" : null,
"countries" : [ "Turkey", "Azerbaijan"]
}
],
"ok" : 1
}
Maybe there are other ways, but that's one I know.
With 'cities' you might want to take more care (because I know cities with the same name in different countries...).
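One way to keep same-named cities apart is to group cities by country. The pipeline below is a sketch that has not been run against the original data, and groupCitiesByCountry is a hypothetical in-memory equivalent in plain JavaScript, included only to show what the grouping computes (the server does not guarantee the order of $addToSet results).

```javascript
// Pipeline sketch: one output document per country, with its distinct cities.
const citiesByCountryPipeline = [
  { $unwind: '$addresses' },
  { $group: { _id: '$addresses.country', cities: { $addToSet: '$addresses.city' } } }
];
// In the shell: db.test.aggregate(citiesByCountryPipeline)

// Hypothetical in-memory equivalent of the pipeline above.
function groupCitiesByCountry(docs) {
  const grouped = {};
  docs.forEach(function (doc) {
    doc.addresses.forEach(function (address) {        // plays the role of $unwind
      const cities = grouped[address.country] || (grouped[address.country] = []);
      if (cities.indexOf(address.city) === -1) {      // plays the role of $addToSet
        cities.push(address.city);
      }
    });
  });
  return grouped;
}

const people = [
  { name: 'farid', addresses: [
    { city: 'Baku', country: 'Azerbaijan' },
    { city: 'Susha', country: 'Azerbaijan' },
    { city: 'Istanbul', country: 'Turkey' }
  ] },
  { name: 'elena', addresses: [
    { city: 'Ankara', country: 'Turkey' },
    { city: 'Baku', country: 'Azerbaijan' }
  ] }
];
console.log(JSON.stringify(groupCitiesByCountry(people)));
```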

Based on your question, there may be two underlying issues here:
First, it looks like you are trying to query a Collection called "test". Often times, "test" is the name of an actual database you are using. My concern, then, is that you are trying to query the database "test" to find any collections that have the key "city" or "country" on any of the internal documents. If this is the case, what you actually need to do is identify all of the collections in your database, and search them individually to see if any of these collections contain documents that include the keys you are looking for.
(For more information on how the db.collection.find() method works, check the MongoDB documentation here: http://docs.mongodb.org/manual/reference/method/db.collection.find/#db.collection.find)
Second, if this is actually what you are trying to do, all you need to do for each collection is define a query that only returns the key of the document you are looking for. If you get more than 0 results from the query, you know documents have the "city" key. If the query returns no results, you can ignore that collection. One caveat here is if data about "city" is in embedded documents within a collection. If this is the case, you may actually need to have some idea of which embedded documents may contain the key you are looking for.

Update collection based on data in another collection in MongoDB

I have two MongoDB collections. questions:
{
"_id" : "8735574",
"title" : "...",
"owner" : {
"user_id" : 950690
},
}
{
"_id" : "8736808",
"title" : "...",
"owner" : {
"user_id" : 657258
},
}
and users:
{
"_id" : 950690,
"updated" : SomeDate,
...
}
{
"_id" : 657258,
"updated" : SomeDate,
...
}
The entries in users have to be regularly created or updated based on questions. So I would like to get all the user_ids from questions that either do not have an entry in users at all, or whose entry in users was updated more than, e.g., one day ago.
To achieve this, I could read all user_ids from questions and then manually drop all users from the result that do not have to be updated. But this seems to be a bad solution, as it reads a lot of unnecessary data. Is there a way to solve this differently? Some kind of collection join would be great, but I know that this does not (really) exist in MongoDB. Any suggestions?
PS: Nesting these collections into a single collection is no solution as users has to be referenced from elsewhere as well.

Unfortunately there is no good way of doing this. Since you don't have access to the indexes, you cannot do this client side without reading out all the data and manipulating it manually, so that is the only way.
The join from users to questions could be done by querying the users collection and then doing an $in on the questions collection but that's really the only optimisation that can be made.
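The client-side part can be sketched in plain JavaScript. This variant goes in the opposite direction to the one just described (ids from questions first, then an $in on users), which is an assumption made for illustration; userIdsNeedingRefresh is a hypothetical name, and the dates and the id 111 are made-up sample values.

```javascript
// Hypothetical helper: given the distinct owner ids read from `questions`
// and the matching documents read from `users`, return the ids that are
// missing from `users` or whose `updated` is older than maxAgeMs.
function userIdsNeedingRefresh(questionUserIds, userDocs, now, maxAgeMs) {
  const updatedById = new Map(userDocs.map(function (u) { return [u._id, u.updated]; }));
  return questionUserIds.filter(function (id) {
    const updated = updatedById.get(id);
    return updated === undefined || now.getTime() - updated.getTime() > maxAgeMs;
  });
}

// In the shell the two inputs could come from:
//   var ids = db.questions.distinct('owner.user_id');
//   var docs = db.users.find({_id: {$in: ids}}).toArray();
const ids = [950690, 657258, 111];
const docs = [
  { _id: 950690, updated: new Date('2012-01-09T12:00:00Z') },
  { _id: 657258, updated: new Date('2012-01-01T00:00:00Z') }
];
const oneDayMs = 24 * 60 * 60 * 1000;
console.log(userIdsNeedingRefresh(ids, docs, new Date('2012-01-10T00:00:00Z'), oneDayMs));
```

With these sample values, 950690 was updated 12 hours before the reference time and is skipped, while 657258 (stale) and 111 (missing) are returned.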

What is the real purpose of $ref (DBRef) in MongoDB

I want to use MongoDB for my app, and while thinking about design issues I came up with a question: what are the advantages/purposes of DBRef?
For example:
> names = ['apple', 'banana', 'orange', 'peach', 'pineapple']
[ "apple", "banana", "orange", "peach", "pineapple" ]
> for (i=0; i<5; i++) {
... db.fruits.insert({_id:i, name:names[i]})
... }
> db.fruits.find()
{ "_id" : 0, "name" : "apple" }
{ "_id" : 1, "name" : "banana" }
{ "_id" : 2, "name" : "orange" }
{ "_id" : 3, "name" : "peach" }
{ "_id" : 4, "name" : "pineapple" }
and I want to store those fruits in a basket collection:
> db.basket.insert({_id:1, items:[ {$ref:'fruits', $id:1}, {$ref:'fruits', $id:3} ] })
> db.basket.insert({_id:2, items:[{fruit_id: 1}, {fruit_id: 3}]})
> db.basket.find()
{ "_id" : 1, "items" : [ DBRef("fruits", 1), DBRef("fruits", 3) ] }
{ "_id" : 2, "items" : [ { "fruit_id" : 1 }, { "fruit_id" : 3 } ] }
What is the real difference between those two techniques? To me it looks like with DBRef you just have to insert more data without any advantages... Please correct me if I'm wrong.

Basically, a DBRef is a self-describing ObjectID which a client-side helper (it exists in all drivers, I think) can use to fetch related rows easily within your application.
They are not:
JOINs
Cascadeable relations
Server-side relations
Resolved Server-side
They are also not used within Map Reduce; the functionality was taken out due to complications with sharding.
It is not always great to use them, though. For one, if you already know which collection the row relates to, they take quite a bit of space compared with just storing the ObjectID. Moreover, because of how they are resolved, each related record has to be lazy loaded one by one instead of forming a range to query all related rows in one go, so they can increase the number of queries you make to the database and, in turn, the number of cursors.
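For comparison, with manual references like {fruit_id: 1} the related rows can be fetched in one query. A minimal sketch in plain JavaScript, where batchFilterFor is a hypothetical helper and the basket document is the one from the question:

```javascript
// Hypothetical helper: turns manual references such as {fruit_id: 1}
// into a single filter suitable for db.fruits.find(...).
function batchFilterFor(items) {
  return { _id: { $in: items.map(function (item) { return item.fruit_id; }) } };
}

const basket = { _id: 2, items: [{ fruit_id: 1 }, { fruit_id: 3 }] };
console.log(JSON.stringify(batchFilterFor(basket.items))); // {"_id":{"$in":[1,3]}}
// In the shell: db.fruits.find(batchFilterFor(basket.items))
```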

From "MongoDB: The Definitive Guide" DBRefs aren't necessary and storing a MongoID is more lightweight, but DBRefs offer some interesting functionality like the following:
Loading each DBRef in a document:
var note = db.notes.findOne({"_id":20});
note.references.forEach(function(ref) {
printjson(db[ref.$ref].findOne({"_id": ref.$id}));
});
They're also helpful if the references are stored across different collections and databases as the DBRef contains that info. If you use a MongoID you'd have to remember which DB and collection the MongoID is in reference to.
In your example, a basket document's items array might contain references to the fruits collection but also to a vegetables collection. A DBRef would actually be handy in this case.
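The mixed-collection case can be illustrated with an in-memory stand-in, in plain JavaScript. The vegetables data is made up and resolveRef is a hypothetical function; against a real server the lookup would be db[ref.$ref].findOne({_id: ref.$id}), as in the snippet above.

```javascript
// In-memory stand-ins for two collections; the vegetables data is made up.
const collections = {
  fruits: [{ _id: 1, name: 'banana' }, { _id: 3, name: 'peach' }],
  vegetables: [{ _id: 1, name: 'carrot' }]
};

// Hypothetical resolver: looks up a {$ref, $id} pair in the collection it names.
function resolveRef(store, ref) {
  const docs = store[ref.$ref] || [];
  return docs.find(function (doc) { return doc._id === ref.$id; }) || null;
}

// The same _id (1) points at different documents depending on $ref.
const items = [{ $ref: 'fruits', $id: 1 }, { $ref: 'vegetables', $id: 1 }];
const names = items.map(function (ref) { return resolveRef(collections, ref).name; });
console.log(names.join(', ')); // banana, carrot
```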
