Fast related tag search in MongoDB - mongodb

I've got two document sets:
Wikis and WikiTags. Since I want flexible editing of tag names, I don't want to embed the tag itself in the wiki document. Instead, I store a list of wiki_tag_ids inside each wiki document.
I wonder what the best way is to find related tags using this schema. By related tags I mean tags that exist in other wikis tagged with the selected tags.
Maybe I should store related tags in the WikiTag document?

I suggest you store the WikiTag inside the Wiki document. MongoDB lets you easily update or delete a single document in a nested array, which gives you the 'flexible editing of tag names' you are after.
A collection like this:
wikis
{
_id,
wikiTags: [{_id, name, ...}],
...
}
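So, for example, if you want to update the name of the nested WikiTag with _id = SomeTagId, you can run:
db.wikis.update( {'wikiTags._id':SomeTagId},
{$set:{'wikiTags.$.name':"New Tag Name"}},
false, // upsert: do not create a wiki if none matches
true ) // multi: update every wiki that contains the tag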
If you want to delete an item from the nested array, use $pull ($unset would only set the matched element to null and leave it in the array). To add a new item, use $push or $addToSet.
So I guess you can see now that any operation on a nested array is easy to do. And if performance is an issue, use embedding.
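With tags embedded as in the schema above, the related-tags lookup from the question can be sketched in plain JavaScript. This is only an illustration over an in-memory array, not a real MongoDB query: relatedTags and the sample data are hypothetical, and it assumes "related" means tags that co-occur with all selected tags. On a real collection the same steps would roughly map to an aggregation using $match, $unwind and $group.

```javascript
// Hypothetical sample data following the embedded schema: each wiki carries its tags.
const wikis = [
  { _id: 1, wikiTags: [{ _id: 'a', name: 'mongodb' }, { _id: 'b', name: 'schema' }] },
  { _id: 2, wikiTags: [{ _id: 'a', name: 'mongodb' }, { _id: 'c', name: 'index' }] },
  { _id: 3, wikiTags: [{ _id: 'a', name: 'mongodb' }, { _id: 'b', name: 'schema' }, { _id: 'c', name: 'index' }] },
  { _id: 4, wikiTags: [{ _id: 'd', name: 'solr' }] },
];

// Return the tags that co-occur with ALL selected tag ids, most frequent first.
function relatedTags(docs, selectedIds) {
  const selected = new Set(selectedIds);
  const counts = new Map(); // tag _id -> { _id, name, count }
  for (const doc of docs) {
    const ids = new Set(doc.wikiTags.map((t) => t._id));
    // Keep only wikis tagged with every selected tag.
    if (!selectedIds.every((id) => ids.has(id))) continue;
    for (const tag of doc.wikiTags) {
      if (selected.has(tag._id)) continue; // a selected tag is not related to itself
      const entry = counts.get(tag._id) || { _id: tag._id, name: tag.name, count: 0 };
      entry.count += 1;
      counts.set(tag._id, entry);
    }
  }
  return [...counts.values()].sort((x, y) => y.count - x.count);
}

console.log(relatedTags(wikis, ['a', 'b']));
```

Computing this on demand keeps the data consistent. Storing precomputed related tags in each tag would be faster to read but has to be refreshed whenever a wiki's tags change.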
Hope this helps.

Related

MongoDB Bulk Find and Replace of ObjectId on a single Document

We have two documents that have been merged, and they now share one ObjectId.
There exists a configuration document that may have references to the old ObjectId. The old ObjectId can appear all over this document, which is full of nested arrays and lists.
We want to do a simple find and replace on this document, preferably without replacing the entire document itself.
Is there a generic way to find every field that has ObjectIdA as its value and replace it with ObjectIdB?
There's no way to do that, no. You need to perform updates on all possible paths explicitly.
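One way to produce those explicit paths is to walk the document on the client and build a single $set from every match. The sketch below is only an illustration: buildReplaceUpdate is a hypothetical helper, the sample document is made up, and ids are plain strings so the code runs without a driver.

```javascript
// Walk a plain document and collect the dotted path of every value equal to oldId.
// Ids are plain strings here to keep the sketch self-contained; with real ObjectId
// values the comparison would need ObjectId's equals() instead of ===.
function buildReplaceUpdate(doc, oldId, newId) {
  const set = {};
  function walk(value, path) {
    if (value === oldId) {
      set[path] = newId;
    } else if (Array.isArray(value)) {
      value.forEach((item, i) => walk(item, path ? `${path}.${i}` : String(i)));
    } else if (value !== null && typeof value === 'object') {
      for (const [key, child] of Object.entries(value)) {
        if (path === '' && key === '_id') continue; // the document's own _id cannot be changed
        walk(child, path ? `${path}.${key}` : key);
      }
    }
  }
  walk(doc, '');
  return Object.keys(set).length ? { $set: set } : null;
}

// Hypothetical configuration document with references scattered at several depths.
const config = {
  _id: 'cfg1',
  owner: 'ObjectIdA',
  groups: [{ members: ['ObjectIdA', 'x'] }, { lead: { ref: 'ObjectIdA' } }],
};
const update = buildReplaceUpdate(config, 'ObjectIdA', 'ObjectIdB');
console.log(update);
```

The resulting object can be passed as the update argument for that one document, so only the matching fields are rewritten. The paths contain array positions, so the update is only safe if the arrays have not been reordered between the read and the write.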

MongoDB - Tag based search with autocomplete

I am looking to implement a tag search feature and was looking for some advice in terms of efficiency. I am new to MongoDB so I am unsure of best practices for performance.
Okay, so I want to create a link-sharing app in which users tag links based on their content. For instance, a funny dog image would be tagged with "funny" and "dog". A link would have a:
title,
url,
user_id,
tags: array of tags
Now in order for me to allow users to search for links I need a list of all the tags used. For usability this needs to have auto-complete functionality. So I researched a bit and tested out using a collection of tags where I index the tag value e.g. "funny" and then use a regex.
db.tags.find({value:/^search/})
With a collection of 600,000 documents it searched for all documents beginning with "s" in 63 milliseconds. As the length of the search term increases the execution time decreases.
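The query above is fast because the pattern is anchored at the start and case-sensitive, which lets MongoDB use the index on value. When the prefix comes from user input, it should be escaped first so that characters such as "." or "(" are matched literally. A minimal sketch, where escapeRegex and prefixPattern are hypothetical helper names and the tag list is made up:

```javascript
// Escape regex metacharacters so user input is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a case-sensitive pattern anchored at the start of the string.
function prefixPattern(input) {
  return new RegExp('^' + escapeRegex(input));
}

// In-memory stand-in for the tag values; a real query would pass the
// pattern as the value condition, e.g. { value: prefixPattern(userInput) }.
const tagValues = ['search', 'sea.rch', 'research', 's', 'seattle'];
const pattern = prefixPattern('sea');
console.log(tagValues.filter((t) => pattern.test(t)));
```

Adding the case-insensitive flag or dropping the leading ^ would still return correct results, but the query could no longer be limited to a narrow range of the index.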
Now comes the part I'm unsure of. Say, for instance, I want to find all the links which have the tags "funny" and "dog" (I need to use an intersection). How should I store the tags? Should I store the object id of each tag? Can I index these object ids? Is there another way to structure the whole database?
Also, I'd like to be able to suggest tags based on the tags the user has already entered. I was thinking of just having a related field in the tag document, for instance:
tag
----
id
value
related: [{
tag_id
count
}]
(again unsure as it would suggest tags that could be related to one of the already entered tags and not to another. With an intersect this would return no results.)
Any advice would be much appreciated.
Create a text index on the tag array. This will enable you to search quickly for funny, dog, and funny or dog.
https://docs.mongodb.com/manual/core/index-text/
db.tags.createIndex( { tags: "text" }, {background:true} )
As to the related tags, I don't think that you want to reference the _id values. You can probably embed an array of related tags such as:
relatedTags: [{tag1}, {tag2}]
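The related list with counts from the question can be derived from the links themselves by counting which tags appear together. The sketch below is an in-memory illustration with made-up data; buildRelated and suggest are hypothetical names. To address the concern about suggestions that relate to only one of the entered tags, suggest keeps a candidate only if it has co-occurred with every entered tag.

```javascript
// Hypothetical links with their tags stored as an array of strings.
const links = [
  { title: 'Dog fails', tags: ['funny', 'dog'] },
  { title: 'Cat fails', tags: ['funny', 'cat'] },
  { title: 'Puppy tricks', tags: ['funny', 'dog', 'cute'] },
];

// For each tag, count how often every other tag appears on the same link.
function buildRelated(docs) {
  const related = new Map(); // tag -> Map(otherTag -> count)
  for (const { tags } of docs) {
    for (const tag of tags) {
      if (!related.has(tag)) related.set(tag, new Map());
      const counts = related.get(tag);
      for (const other of tags) {
        if (other !== tag) counts.set(other, (counts.get(other) || 0) + 1);
      }
    }
  }
  return related;
}

// Suggest tags related to EVERY entered tag, ranked by the lowest pairwise count.
function suggest(related, entered) {
  const [first, ...rest] = entered;
  const result = [];
  for (const [tag, count] of related.get(first) || []) {
    if (entered.includes(tag)) continue;
    const others = rest.map((t) => (related.get(t) || new Map()).get(tag) || 0);
    const score = Math.min(count, ...others);
    if (score > 0) result.push({ tag, score });
  }
  return result.sort((a, b) => b.score - a.score);
}

console.log(suggest(buildRelated(links), ['funny', 'dog']));
```

Pairwise counts are an approximation: a suggested tag has appeared with each entered tag, but not necessarily on one link that carries all of them, so the final intersect query can still come back empty.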

MongoDB: how to persist selected document from collection

I'm new to MongoDB and I'm not sure how best to solve my fairly basic problem.
I have a Collection of "emoji" Documents in my database. At any given time, there is one (and only one) "selected" emoji Document. This is determined and updated by the application. How can I persist the information of which one is selected to the database?
Approach 1:
Add a new Collection to hold this kind of metadata of the emoji collection? I'm thinking it would hold a single document with a reference to the currently selected emoji document. This seems to hurt the OO design. A whole collection, with a single document, to hold a single property. But it does have flexibility to add more metadata.
Approach 2:
Add a new boolean field to each emoji Document indicating whether or not it is the current selected emoji. This seems like a lot of extra info to track for each Document, when only one should have a true value. I would also be concerned with maintaining consistency.
I know I'm not the first person to have this issue, but I couldn't find a solution for this as a general case. Thanks!
MongoDB is schemaless, so you can just add the boolean field to the currently selected emoji and remove it when the selection changes. You should add a sparse unique index to make querying this field faster. You could set the field using this syntax:
db.emojis.update({name:"b"},{$set:{selected:true}})
And simply unset it like this:
db.emojis.update({name:"b"},{$unset:{selected:""}})
You could create the following sparse unique index to ensure there is only ever one document with selected:true
db.emojis.createIndex( { selected: 1 } , { sparse: true, unique: true } )
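With that index in place, the order of operations matters when the selection changes: the old flag has to be removed before the new one is set, otherwise two documents would briefly hold selected:true and the second write would be rejected as a duplicate key. A small in-memory sketch of that sequence, where select and the sample array are hypothetical stand-ins for the collection:

```javascript
// In-memory stand-in for the emojis collection.
const emojis = [{ name: 'a', selected: true }, { name: 'b' }, { name: 'c' }];

// Move the selection: clear the old flag first, then set the new one.
function select(docs, name) {
  const target = docs.find((d) => d.name === name);
  if (!target) throw new Error(`no emoji named ${name}`);
  for (const d of docs) delete d.selected; // like $unset on the current selection
  target.selected = true;                  // like $set on the new selection
  return target;
}

select(emojis, 'b');
console.log(emojis.filter((d) => d.selected).map((d) => d.name));
```

Against a real database these are two separate writes, so there is a short window in which no emoji is selected. The application should be prepared for that.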

Mongo DB best practice for a query WHERE IN with tags and childs

Well, I'm planning a collection schema and I have doubts about how to embed data inside the tags field.
I have a collection named products:
products
->_id
->product_name
->tags
I then have a tags collection:
tags
->_id
->tag_name
and I have a tags_childs collection:
tags_childs
->_id
-> id_tag
->tag_child_name
Now I need to save tags and tags_childs into the products collection, so I thought it would be good to save a field like this in the products collection:
tags: [[_id_tag]:[_id_tag_child,_id_tag_child, ... etc],[_id_tag]:[_id_tag_child,_id_tag_child, ... etc]]
But I don't think this is the right way, because I need to be able to query the products collection, filtering by tags and child tags.
So for example i need to filter products by:
+product_name: 'roastbeef'
+ tag:'hot'
+ child tag : 'sauce'
or filter by:
+product_name: 'roastbeef'
+ tag: 'hot'
+ child tag:'sauce'
+ child tag:'knife'
+ tag:'dinner'
Parent tags are always required when filtering, while child tags are optional.
How should I structure the collection so that this type of query is possible?
Reading over this, I described your schema in a bit more detail here:
Schema Advice
There really isn't a reason to keep this pivot table, as in the end they are all just tags. With the schema suggested above you can still search for multiple tags in your database, and still show all your available tags and their relationships. This cleans up your entire DB rather than muddying it with an extra table that MongoDB just doesn't need.
I'd highly recommend going through this e-book again to strengthen your understanding of the NoSQL format that Mongo gives you.
Mongo Free eBook PDF
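To make the idea of embedding concrete, here is one possible shape, which is an assumption for illustration and not necessarily the schema linked above: each product embeds its parent tags, and each parent tag carries its child tag names. The filter from the question then becomes a check that every requested parent tag exists and contains all requested child tags. filterProducts and the sample data are hypothetical. On a real collection this would roughly correspond to combining $elemMatch conditions with $all.

```javascript
// Hypothetical products with parent tags embedded, each holding its child tags.
const products = [
  { product_name: 'roastbeef', tags: [{ name: 'hot', children: ['sauce', 'knife'] }, { name: 'dinner', children: [] }] },
  { product_name: 'roastbeef', tags: [{ name: 'hot', children: ['sauce'] }] },
  { product_name: 'salad', tags: [{ name: 'cold', children: ['sauce'] }] },
];

// filters example: [{ tag: 'hot', children: ['sauce'] }, { tag: 'dinner' }]
// The parent tag is required in each filter, child tags are optional.
function filterProducts(docs, productName, filters) {
  return docs.filter((doc) =>
    doc.product_name === productName &&
    filters.every(({ tag, children = [] }) =>
      // The parent tag and all requested child tags must sit in the SAME embedded entry.
      doc.tags.some((t) => t.name === tag && children.every((c) => t.children.includes(c)))
    )
  );
}

const wanted = [{ tag: 'hot', children: ['sauce', 'knife'] }, { tag: 'dinner' }];
console.log(filterProducts(products, 'roastbeef', wanted));
```

Storing names instead of ids makes the filter a single lookup on products, at the cost of updating every product when a tag is renamed.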

How do I store tags associated with a document in Solr and retrieve their frequency in aggregate?

I am using Solr to index documents from a wiki. Each document has a unique id, title, body content and some other fields.
Firstly, I wanted to know the declaration in the Solr schema to store the multivalued field "tags" to hold n of these strings, attached to the document. Each document can have a set of tags applied on it.
Second, the use case: how exactly would I retrieve all distinct tags across the entire Solr instance, with their number of occurrences, so that I can build a tag cloud of the most popular tags?
thanks
Amit