MongoDB: how to persist selected document from collection - mongodb

I'm new to MongoDB and I'm not how to best solve my fairly basic problem.
I have a Collection of "emoji" Documents in my database. At any given time, there is one (and only one) "selected" emoji Document. This is determined and updated by the application. How can I persist the information of which one is selected to the database?
Approach 1:
Add a new Collection to hold this kind of metadata of the emoji collection? I'm thinking it would hold a single document with a reference to the currently selected emoji document. This seems to hurt the OO design. A whole collection, with a single document, to hold a single property. But it does have flexibility to add more metadata.
Approach 2:
Add a new boolean field to each emoji Document indicating whether or not it is the current selected emoji. This seems like a lot of extra info to track for each Document, when only one should have a true value. I would also be concerned with maintaining consistency.
I know I'm not the first person to have this issue, but I couldn't find a solution this is as a general case. Thanks!

MongoDB is schemaless so you can just add the boolean field to the currently selected emoji and remove it when the selection changes. You should add a parse unique index to make querying this field faster. You could set the field using this syntax:
db.emojis.update({name:"b"},{$set:{selected:true}})
And simply unset it like this:
db.emojis.update({name:"b"},{$unset:{selected:""}})
You could create the following parse unique index to ensure there is only ever one field with selected:true
db.emojis.createIndex( { selected: 1 } , { sparse: true, unique: true } )

Related

Does length of indexed field matter while searching?

The chat app schema that I have is something like below.
1. conversations {participants[user_1, user_2], convsersation_id}
2. messages {sender: user_1, sonversation_id, timestamps}
I want to map this relationship using existing _id:ObjectId which is already indexed.
But if I want to get all conversation of user_1 I have to first search in which conversation that user is involed and get that conversation's _id and again search for the messages in messages using that conversation _id.
So my questions are -
Does length of indexed field (here _id) matters while searching?
Should I create another shorter indexed fields?.
Also if there is any better alternative schema please suggest.
I would suggest you to maintain the data as sub documents instead of array. The advantage you have is you can build another index (only) on conversation_id field, which you want to query to know the user's involvement
When you maintain it as array, you cannot index the converstaion_id field separately, instead you will have to build a multi key index, which indexes all the elements of the array (sender and timestamps fields) which you are never going to use for querying and it also increases the index size
Answering you questions:
Does length of indexed field (here _id) matters while searching? - Not really
Should I create another shorter indexed fields? - Create sub-document and index converstaion_id
Also if there is any better alternative schema please suggest. - Maintain the array fields as sub-documents

_id field compared to index

I'm planning to add a Collection to a mongodb database that will have a text field that should be unique for each Document. Lookups from this Collection will almost always be based on this field. This field can contain as many as 100+ chars.
My question is, should this field be the _id field, or should I just add an index for it? What would the performance impact for either approach be?
I suggest you to use your unique text as _id.
It will reduce data size and eliminate an index. Here is the reference. 9th page will guide you.

Mongo filed name change not updating indexes

While I undertand a foreign key constraint would not make sense for a NoSql database, should it not ensure that it updates the indexes if it allows me to rename fields?
http://www.mongodb.org/display/DOCS/Updating#Updating-%24rename
{ $rename : { old_field_name : new_field_name } }
but if I had
db.mycollections.ensureIndex({old_field_name:1});
wouldn't it be great if the index was updated automatically?
Is it that since system.indexes is simply just another table and such a automatic update would imply a foreign key constraint of sorts, the index update is not done? Or am I missing certain flags?
It doesn't do it.
The answer to your question "wouldn't it be great if the index was updated automatically?" is, "no, not really".
If you think that renaming fields is a good idea you can add the new index yourself at the same time. You'll likely have lots of other changes to do in your code to reflect a rename on a field (queries, updates, map reduce operations, ...) so why do you think it should single out index recreation as something that should happen automatically on what is a very rare operation when it's just one thing out of many that you'd need to do, manually?
If you care about this feature, go request it, 10Gen are incredibly responsive to suggestions, but I wouldn't be surprised if the answer was "why is this important?"
Quoting Mike O' Brien:
The $rename operator is analogous to doing a $set/$unset in a single atomic operation. It's a shortcut for situations where you need to take a value and move it into another field, without the need to do it in 2 steps (one to fetch the value of the field, another to set the new one).
Doing a $rename does mean the data is changing. If I use $rename to rename a field named "x" to "y", but the field named "y" already existed in the document, the old value for "y" is overwritten, and the field "x" will no longer exist anymore. If either "x" or "y" is indexed, then the operation will update those indexes to reflect the final values resulting from the operation. The same applies when using rename to move a field from within an embedded document up to the top level (e.g. renaming "a.b" to "c") or vice versa.
The behavior suggested in the SO question (i.e., renaming a field maintains the relationship between the field it was moved to and its value in the index) then things can get really confusing and make it difficult to reason about what the "correct" expected behavior is for certain operations. For example, if I create an index on field "A" in a collection, rename "A" to "B" in one of the documents, and then do things like:
update({"A" : }, {"$set":{"B":}}) // find a document where A= and set its value of B to
update({"B" : }, {"$set":{"A":}}) // find a document where B= and set its value of A to
Should these be equivalent?
In general, having the database maintain indexes across a collection by field name is a design decision that keeps behavior predictable and simple.

The fastest way to show Documents with certain property first in MongoDB

I have collections with huge amount of Documents on which I need to do custom search with various different queries.
Each Document have boolean property. Let's call it "isInTop".
I need to show Documents which have this property first in all queries.
Yes. I can easy do sort in this field like:
.sort( { isInTop: -1 } );
And create proper index with field "isInTop" as last field in it. But this will be work slowly, as indexes in mongo works best with unique fields.
So is there is solution to show Documents with field "isInTop" on top of each query?
I see two solutions here.
First: set Documents wich need to be in top the _id from "future". As you know, ObjectId contains timestamp. So I can create ObjectId with timestamp from future and use natural order
Second: create separate collection for Ducuments wich need to be in top. And do queries in it first.
Is there is any other solutions for this problem? Which will work fater?
UPDATE
I have done this issue with sorting on custom field which represent rank.
Using the _id field trick you mention has the problem that at some point in time you will reach the special time, and you can't change the _id field (without inserting a new document and removing the old one).
Creating a special collection which just holds the ones you care about is probably the best option. It gives you the ability to logically (and to some extent, physically) separate the documents.
Newly introduced in mongodb there is also support for a "sparse" index which may fulfill your needs as well. You could only set the "isInTop" field when you want it to be special, and then create a sparse index on it which would not have the problems you would normally have with a single indexed boolean field (in btrees).

MongoDB: Speed of field ("inside record") search in comporation with speed of search in "global scope"

My question may be not very good formulated because I haven't worked with MongoDB yet, so I'd want to know one thing.
I have an object (record/document/anything else) in my database - in global scope.
And have a really huge array of other objects in this object.
So, what about speed of search in global scope vs search "inside" object? Is it possible to index all "inner" records?
Thanks beforehand.
So, like this
users: {
..
user_maria:
{
age: "18",
best_comments :
{
goodnight:"23rr",
sleeptired:"dsf3"
..
}
}
user_ben:
{
age: "18",
best_comments :
{
one:"23rr",
two:"dsf3"
..
}
}
So, how can I make it fast to find user_maria->best_comments->goodnight (index context of collections "best_comment") ?
First of all, your example schema is very questionable. If you want to embed comments (which is a big if), you'd want to store them in an array for appropriate indexing. Also, post your schema in JSON format so we don't have to parse the whole name/value thing :
db.users {
name:"maria",
age: 18,
best_comments: [
{
title: "goodnight",
comment: "23rr"
},
{
title: "sleeptired",
comment: "dsf3"
}
]
}
With that schema in mind you can put an index on name and best_comments.title for example like so :
db.users.ensureIndex({name:1, 'best_comments.title:1})
Then, when you want the query you mentioned, simply do
db.users.find({name:"maria", 'best_comments.title':"first"})
And the database will hit the index and will return this document very fast.
Now, all that said. Your schema is very questionable. You mention you want to query specific comments but that requires either comments being in a seperate collection or you filtering the comments array app-side. Additionally having huge, ever growing embedded arrays in documents can become a problem. Documents have a 16mb limit and if document increase in size all the time mongo will have to continuously move them on disk.
My advice :
Put comments in a seperate collection
Either do document per comment or make comment bucket documents (say,
100 comments per document)
Read up on Mongo/NoSQL schema design. You always query for root documents so if you end up needing a small part of a large embedded structure you need to reexamine your schema or you'll be pumping huge documents over the connection and require app-side filtering.
I'm not sure I understand your question but it sounds like you have one record with many attributes.
record = {'attr1':1, 'attr2':2, etc.}
You can create an index on any single attribute or any combination of attributes. Also, you can create any number of indices on a single collection (MongoDB collection == MySQL table), whether or not each record in the collection has the attributes being indexed on.
edit: I don't know what you mean by 'global scope' within MongoDB. To insert any data, you must define a database and collection to insert that data into.
Database 'Example':
Collection 'table1':
records: {a:1,b:1,c:1}
{a:1,b:2,d:1}
{a:1,c:1,d:1}
indices:
ensureIndex({a:ascending, d:ascending}) <- this will index on a, then by d; the fact that record 1 doesn't have an attribute 'd' doesn't matter, and this will increase query performance
edit 2:
Well first of all, in your table here, you are assigning multiple values to the attribute "name" and "value". MongoDB will ignore/overwrite the original instantiations of them, so only the final ones will be included in the collection.
I think you need to reconsider your schema here. You're trying to use it as a series of key value pairs, and it is not specifically suited for this (if you really want key value pairs, check out Redis).
Check out: http://www.jonathanhui.com/mongodb-query