ElasticSearch indexing and references to other documents - mongodb

I have an ElasticSearch instance indexing a MongoDB database using the river by richardwilly98.
There are two types of documents that are indexed:
documents referencing users
documents representing users
When these objects are added to MongoDB, richardwilly98's river generates something like the following:
document = {'user': {"$id": "5159a004c87126641f4f9530"}}
user_document = {'_id': "5159a004c87126641f4f9530", 'username': 'bob'}
If I perform a search for 'bob', I'd like any documents that reference the bob document to be returned. At the moment this doesn't happen because the username field is not related to the referencing documents in any way.
Is it possible to do this? Does ElasticSearch have object references?
Thanks - let me know if I haven't been clear.

If each document belongs to no more than one user, you can index documents as children of users and then use the has_parent filter to perform the search. However, if a single document can belong to more than one user, you will have to search in two steps: first find the user, then issue a second search to find the documents that reference that user.
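The two-step search described above can be sketched in plain Python over in-memory lists. This is only an illustration of the logic, not river or Elasticsearch code: the function name and the `title` field are hypothetical, and in a real setup each step would be a separate search request.

```python
def find_referencing_documents(users, documents, username):
    """Two-step lookup: resolve user ids by name, then collect referencing documents."""
    # Step 1: find the ids of all users whose username matches.
    user_ids = {u["_id"] for u in users if u["username"] == username}
    # Step 2: keep the documents whose 'user' reference points at one of those ids.
    return [d for d in documents if d["user"]["$id"] in user_ids]


users = [{"_id": "5159a004c87126641f4f9530", "username": "bob"}]
documents = [
    # 'title' is a hypothetical field, added only to tell the documents apart.
    {"title": "first", "user": {"$id": "5159a004c87126641f4f9530"}},
    {"title": "second", "user": {"$id": "000000000000000000000000"}},
]
matches = find_referencing_documents(users, documents, "bob")
```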

Elasticsearch supports a _parent field [1]. The MongoDB river supports custom mappings [2], so _parent can now be used.
[1] http://www.elasticsearch.org/guide/reference/mapping/parent-field/
[2] https://github.com/richardwilly98/elasticsearch-river-mongodb/issues/64
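Once the parent/child mapping is in place, the search body for "documents whose parent user is bob" might look roughly like the sketch below. It only builds the request body as a Python dict. The helper name is hypothetical, the parent type name `user` is an assumption about the mapping, and the exact syntax depends on the Elasticsearch version (older releases expose has_parent as a filter rather than a query).

```python
import json


def has_parent_query(parent_type, field, value):
    """Build a search body matching children whose parent matches field == value."""
    return {
        "query": {
            "has_parent": {
                # Name of the parent type as declared in the mapping (assumed here).
                "parent_type": parent_type,
                # Query that is run against the parent documents.
                "query": {"match": {field: value}},
            }
        }
    }


body = has_parent_query("user", "username", "bob")
print(json.dumps(body, indent=2))
```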

Related

query in mongodb atlas to verify the existence of multiple specific documents in a collection

I have a mongodb collection called employeeInformation, in which I have two documents:
{"name1":"tutorial1"}, {"name2":"tutorial2"}
When I run db.employeeInformation.find(), both documents are displayed. My question: is there a query I can run to confirm that the collection contains only those two specified documents? I tried db.employeeInformation.find({"name1":"tutorial1"}, {"name2":"tutorial2"}), but I only got the id of the first object, the one with key "name1". I know this is easy with 2 documents just by looking at the results of .find(), but when I insert many (hundreds of) documents into the collection, I want a way to verify that the collection contains all and only those documents (note that I will always have the objects themselves as text). Ideally this query should also work in the MongoDB Atlas console/interface.
db.collection.count()
will give you the number of documents in the collection once you have inserted them.
Thanks,
Neha
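A count alone cannot tell whether the collection holds exactly the expected documents. One possible client-side check, sketched here under assumptions, is to compare the fetched documents with the expected ones as multisets while ignoring the generated _id. The function names are hypothetical. With pymongo the first argument could come from `collection.find({}, {"_id": 0})`, and the values are assumed to be JSON-serializable.

```python
import json
from collections import Counter


def canonical(doc):
    """Serialize a document with sorted keys, ignoring the generated _id."""
    without_id = {k: v for k, v in doc.items() if k != "_id"}
    return json.dumps(without_id, sort_keys=True)


def contains_exactly(actual_docs, expected_docs):
    """True if both sides hold the same documents (all and only), duplicates included."""
    return Counter(map(canonical, actual_docs)) == Counter(map(canonical, expected_docs))


expected = [{"name1": "tutorial1"}, {"name2": "tutorial2"}]
actual = [{"_id": 1, "name2": "tutorial2"}, {"_id": 2, "name1": "tutorial1"}]
ok = contains_exactly(actual, expected)
```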

Ways to refer to a database, collection, and document in MongoDB?

use <databasename> sets the variable db to the database
specified by <databasename>, so the database can be referred to
by the variable db.
I wonder if a collection or document can also be
referred to by a variable, and if yes, how?
Every object in a MongoDB server has an identifier _id. If I am correct, a database, a collection and a document are objects.
How is the identifier of an object used in practice?
Both a database and a collection have a name. So we can refer to a
collection via its name e.g. mydb.mycollection.
Does a document also have a name?
Thanks.
A collection consists of several documents, so a document as such does not have a name. The _id distinguishes the documents. To fetch a particular document, you can filter on its _id or on the data stored in it.
Referencing the db, then the collection in the selected db, and then filtering will give you the required document.
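In a driver, the database and the collection can each be held in an ordinary variable, and a document is then fetched by filtering on _id. The sketch below assumes an object that behaves like pymongo's MongoClient (subscript access for databases and collections, plus `find_one`). The function and argument names are hypothetical.

```python
def fetch_document(client, db_name, collection_name, doc_id):
    """Walk from client to database to collection, then filter on _id."""
    database = client[db_name]               # comparable to `use <databasename>` in the shell
    collection = database[collection_name]   # comparable to db.mycollection
    # Returns the matching document, or None if no document has this _id.
    return collection.find_one({"_id": doc_id})
```

With pymongo installed and a server running, a call might look like `fetch_document(MongoClient(), "mydb", "mycollection", some_id)`.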

Conditional based (selective documents from mongo) indexing to elasticsearch from mongo-connector

Use case
I need to index selected documents from a mongo collection. Selection is based on document field values. E.g., the Profile collection has multiple documents, but I only need to index documents whose age > 25 and country is "US".
I am using mongo-connector to index the collection (in elasticsearch).
Please suggest approaches that can be taken.
Need to index selective documents
Using elasticsearch you decide which fields to index through mapping. So in your case, you can set all the fields except age to index: no
Elastic Search Mapping
Depending on the version of your mongo-connector, you could also use namespaces to select which documents to index. See the Configuration Options of mongo-connector. However, I don't think that conditional index is possible... at least I have never seen it.
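If the connector cannot filter, one workaround is to apply the condition in your own code before documents are sent to Elasticsearch. The sketch below is not a mongo-connector feature. It only shows the selection rule from the question (age > 25 and country "US") as a Python predicate, alongside the equivalent MongoDB filter document. All names are hypothetical.

```python
# Equivalent MongoDB filter document for the same rule.
PROFILE_FILTER = {"age": {"$gt": 25}, "country": "US"}


def should_index(profile):
    """Return True only for profiles with age > 25 and country 'US'."""
    age = profile.get("age")
    return age is not None and age > 25 and profile.get("country") == "US"


def select_for_indexing(profiles):
    """Keep only the profiles that satisfy the selection rule."""
    return [p for p in profiles if should_index(p)]


profiles = [
    {"name": "a", "age": 30, "country": "US"},
    {"name": "b", "age": 22, "country": "US"},
    {"name": "c", "age": 40, "country": "DE"},
    {"name": "d", "country": "US"},  # missing age: not indexed
]
selected = select_for_indexing(profiles)
```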

Design MongoDb Schema For My Social

I'm new to MongoDB. I just want to create a simple project to test the performance of MongoDB.
The project is like a simple CMS:
it has users, blogs, and comments, and users can have friends.
So I designed my database like this:
user
{
  _ID:
  name:
  birth_day:
  sex:
  friends: [id_1, Id_2]
}
blogs
{
  title:
  owner:
  tags_fiends:
  comments: [
    {"_id": "", "content": "", "date_created": ""},
    {"_id": "", "content": "", "date_created": ""}
  ],
  "like": ["_id", "_id"]
}
How many collections are needed for this database? Can I use one collection for both users and blogs? Thanks in advance.
Because MongoDB is a schema-less (schema-free) database, you can build any kind of structure within a document. The following are supported:
individual elements
nested arrays
nested documents
There are a couple of things to consider during schema design that make it useful to keep users and blogs in separate collections. For example, if you store something in a nested array, you can create an index to speed up searches within that array, but a compound multikey index (an index over array content) can cover at most one array field per document. So if you store friends, blogs, posts, and tags all in arrays, a single compound index can include only one of them.
It is also important to know that each document has a size limit, which is currently 16MB.
In your scenario, I would make Users a collection and reference it by _id from the blog collection.
In practise, you could make the Blogs an attribute of User, the only constraint being the max doc size of 16MB - but that's a lot of blogs (text).
To get round that (assuming you need to), a separate Blog collection referencing the user _id would be fine. You may need to denormalise the user name too if that's not your _id. This would mean you can get all the blogs for a user in a single query.
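The two-collection layout suggested above can be sketched with plain Python dicts standing in for documents. The ids, the sample values, and the helper name are hypothetical. The owner's name is copied into each blog (denormalised) so that listing a user's blogs needs only the blog collection.

```python
# Users collection: one document per user, friends referenced by _id.
users = [
    {"_id": "u1", "name": "alice", "friends": ["u2"]},
    {"_id": "u2", "name": "bob", "friends": ["u1"]},
]

# Blogs collection: each blog references its owner by _id and
# carries a denormalised copy of the owner's name.
blogs = [
    {"_id": "b1", "title": "Hello", "owner": {"_id": "u1", "name": "alice"}, "comments": []},
    {"_id": "b2", "title": "Notes", "owner": {"_id": "u2", "name": "bob"}, "comments": []},
    {"_id": "b3", "title": "More", "owner": {"_id": "u1", "name": "alice"}, "comments": []},
]


def blogs_for_user(blogs, user_id):
    """Single pass over the blog collection, comparable to one find() on owner._id."""
    return [b for b in blogs if b["owner"]["_id"] == user_id]


alice_titles = [b["title"] for b in blogs_for_user(blogs, "u1")]
```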

Mongoid: retrieving documents whose _id exists in another collection

I am trying to fetch the documents from a collection based on the existence of a reference to these documents in another collection.
Let's say I have two collections Users and Courses and the models look like this:
User: {_id, name}
Course: {_id, name, user_id}
Note: this is just a hypothetical example and not an actual use case. So let's assume that duplicates are fine in the name field of Course. Let's think of Course as CourseRegistrations.
Here, I am maintaining a reference to User in Course, with user_id holding the _id of the User. Note that it's stored as a string.
Now I want to retrieve all users who are registered for a particular set of courses.
I know that it can be done with two queries: first run a query to get the user_id field from the Course collection for the set of courses, then query the User collection using $in with the user ids retrieved by the previous query. But this may not be good if the number of documents is in the tens of thousands or more.
Is there a better way to do this in just one query?
What you are describing is a typical SQL join, but that's not possible in MongoDB. As you already suggested, you can do it in two separate queries.
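The two-query approach can be sketched as follows, with a list of dicts standing in for the Course collection. The helper names and sample data are hypothetical. The sketch assumes, as in the question, that user_id is stored as a string and that it matches the type of User's _id. If the _id values are ObjectIds, the strings would need converting first.

```python
def user_ids_for_courses(courses, course_names):
    """Query 1: collect the distinct user_id values for the given course names."""
    return sorted({c["user_id"] for c in courses if c["name"] in course_names})


def users_filter(user_ids):
    """Query 2: build the $in filter to run against the User collection."""
    return {"_id": {"$in": list(user_ids)}}


courses = [
    {"_id": "c1", "name": "math", "user_id": "u1"},
    {"_id": "c2", "name": "math", "user_id": "u2"},
    {"_id": "c3", "name": "art", "user_id": "u1"},
    {"_id": "c4", "name": "music", "user_id": "u3"},
]
ids = user_ids_for_courses(courses, {"math", "art"})
query = users_filter(ids)
```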
There is one more way to handle it. It's not exactly a solution, but it is a valid workaround in NoSQL databases: store the most frequently accessed fields inside the same collection.
You can store some of the user collection's fields inside the course collection as an embedded field.
Course : {
  _id : 'xx',
  name : 'yy',
  user : {
    fname : 'r',
    lname : 'v',
    pic : 's'
  }
}
This is a good approach if the subset of fields you intend to retrieve from the user collection is small. You might be wondering about the redundant user data stored in the course collection, but that's exactly what makes MongoDB powerful. It's a one-time insert, but your queries will be a lot faster.
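As a rough sketch of this embedding, the helper below copies a small subset of user fields into a course document at write time. The function name, the `email` field, and the default field list are hypothetical. The embedded field names follow the example above.

```python
def embed_user(course, user, fields=("fname", "lname", "pic")):
    """Return a copy of the course with a subset of the user's fields embedded."""
    embedded = {f: user[f] for f in fields if f in user}
    # The original course dict is left untouched; only the copy gains 'user'.
    return {**course, "user": embedded}


user = {"_id": "u1", "fname": "r", "lname": "v", "pic": "s", "email": "r@example.com"}
course = {"_id": "xx", "name": "yy"}
course_doc = embed_user(course, user)
```

The trade-off is that embedded copies must be updated whenever the corresponding user fields change.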