Elasticsearch and subsequent MongoDB queries

I am implementing search functionality using Elasticsearch.
I receive "username" set returned by Elasticsearch after which I need to query a collection in MongoDB for latest comment of each user in the "username" set.
Question: Lets say I receive ~100 usernames everytime I query Elasticsearch what would be the fastest way to query MongoDB to get the latest comment of each user. Is querying MongoDB 100 times in a for loop using .findOne() the only option?
(Note - Because latest comment of a user changes very often, I dont want to store it in Elasticsearch as that will trigger retrieve-change-reindex process for the entire document far too frequently)

This answer assumes the following schema for your documents, stored in a comments collection:
{
"_id" : ObjectId("5788b71180036a1613ac0e34"),
"username": "abc",
"comment": "Best"
}
Assuming usernames is the list of users you get from Elasticsearch, you can run the following aggregation:
var a = [
    { $match: { "username": { $in: usernames } } },   // only the users returned by Elasticsearch
    { $sort: { _id: -1 } },                           // newest documents first
    {
        $group: {
            _id: "$username",
            latestcomment: { $first: "$comment" }     // first comment per user after the sort = latest
        }
    }
];
db.comments.aggregate(a)
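For reference, here is a minimal sketch of running the same aggregation from application code with the official Node.js driver; the connection string, database and collection names, and the usernames array coming back from Elasticsearch are all placeholders.

const { MongoClient } = require('mongodb');

// Hypothetical helper: usernames is assumed to be the array returned by Elasticsearch.
async function latestComments(usernames) {
    const client = await MongoClient.connect('mongodb://localhost:27017');  // placeholder URI
    try {
        const comments = client.db('test').collection('comments');          // placeholder names
        return await comments.aggregate([
            { $match: { username: { $in: usernames } } },  // only users returned by Elasticsearch
            { $sort: { _id: -1 } },                        // newest first (ObjectId embeds a timestamp)
            { $group: { _id: '$username', latestcomment: { $first: '$comment' } } }
        ]).toArray();
    } finally {
        await client.close();
    }
}

A single round trip like this avoids issuing ~100 separate queries per search.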

You can try this:
db.foo.find().sort({_id:1}).limit(100);
Sorting on _id with 1 sorts ascending (old to new) and -1 sorts descending (new to old).
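For contrast, the per-user loop the question asks about would look roughly like the sketch below in the shell (collection and field names follow the first answer); it issues one query per username, which is why the single aggregation above is usually the faster option.

// One query per user: find that user's newest comment by sorting on _id descending.
usernames.forEach(function (u) {
    var doc = db.comments.find({ username: u }).sort({ _id: -1 }).limit(1).toArray()[0];
    if (doc) print(u + ': ' + doc.comment);
});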

Related

MongoDB Compass is not exporting all data for a collection

While trying to export a collection from MongoDB Compass, it's not exporting all data; it only exports fields that are present in all documents. For example, if document 1 has
{
"Name": "Alex",
"__v": 0
}
and if Document 2 has
{
"Name": "Joe",
"ID" : 07
"__v": 0
}
then when trying to export the collection, it only exports the Name field. I'm trying to export all fields through MongoDB Compass. Is there any other way to export all data through code or a script?
EDIT: The solution is to update to a newer version of Compass; while exporting data, if a field name is not present in the list, there is an option to add any field that Compass missed.
MongoDB Compass has had known issues with exporting and importing data for a long time, and it seems they are not in a hurry to improve it!
When you try to export data using Compass, it uses some sample documents to select the fields, and if you are unlucky, you will miss some fields.
SOLUTION:
Use the MongoDB Compass Aggregation tab to find all the fields that exist across all documents:
[
    { $project: { arrayofkeyvalue: { $objectToArray: '$$ROOT' } } },
    { $unwind: '$arrayofkeyvalue' },
    { $group: { _id: null, allkeys: { $addToSet: '$arrayofkeyvalue.k' } } }
]
Add the fields from the 1st step to the Export Full Collection (Select Fields).
Export it!
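If it is easier, the same field list can be printed from the mongo shell and pasted into Compass; a rough sketch (the collection name is a placeholder, and $objectToArray needs MongoDB 3.4 or newer):

// Collect every field name that appears in any document and print a comma-separated list.
var result = db.mycollection.aggregate([
    { $project: { arrayofkeyvalue: { $objectToArray: '$$ROOT' } } },
    { $unwind: '$arrayofkeyvalue' },
    { $group: { _id: null, allkeys: { $addToSet: '$arrayofkeyvalue.k' } } }
]).toArray();
print(result[0].allkeys.join(','));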

db.getCollection(...).find(...).aggregate is not a function

(MongoDB shell version: 2.6.12)
I have a collection logs and a collection users. logs has a field userId which represents the user who sent the log. users has a field _id.
I use Robo 3T to run queries against data on a remote server.
Now, I'm interested in the logs whose url is /subscribe, and I want to see the user information behind them. So I write the following query:
db.getCollection('logs').find({ "url" : "/subscribe" }).aggregate({
$lookup:{
from:"users",
localField:"userId",
foreignField:"_id",
as:"logs_users"
}
})
But I get an error:
Does anyone know how to solve this?
Edit 1: I got a new error:
You should not use aggregate after find; instead, use aggregate directly, with a $match stage to find the documents and then a $lookup stage:
db.getCollection('logs').aggregate([
    {
        $match: { "url": "/subscribe" }
    },
    {
        $lookup: {
            from: "users",
            localField: "userId",
            foreignField: "_id",
            as: "logs_users"
        }
    }
])
Update: To use $lookup, your MongoDB version must be 3.2 or greater; this operator (which performs joins) does not work in older versions of MongoDB.
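Since the shell in the question is 2.6.12, one hedged workaround sketch is to do the join client-side: fetch the matching logs, collect their userIds, and look the users up with a second query using $in.

// Client-side "join" sketch for MongoDB < 3.2, where $lookup is not available.
var logs = db.getCollection('logs').find({ url: '/subscribe' }).toArray();
var userIds = logs.map(function (l) { return l.userId; });              // ids referenced by the logs
var users = db.getCollection('users').find({ _id: { $in: userIds } }).toArray();

// Attach the matching user to each log, mirroring the logs_users array $lookup would produce.
var usersById = {};
users.forEach(function (u) { usersById[u._id] = u; });
logs.forEach(function (l) { l.logs_users = usersById[l.userId] ? [usersById[l.userId]] : []; });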

How to perform queries with large $in data set in mongo db

I have simple query like this: {"field": {$nin: ["value1","value2","valueN"]}}.
The problem is the large number of unique values to exclude (using the $nin operator): about 50000 unique values to filter and about 1Kb of query length.
Question: Is there an elegant and performant way to do such operations?
Example.
The collection daily_stat has 56M docs, and each day adds another 100K docs. Example document:
{
"day": "2020-04-15",
"username": "uniq_name",
"total": 12345
}
I run the following query:
{
"date": "2020-04-15",
"username": {
$nin: [
"name1",
"name2",
"...",
"name50000"
]
}
}
MongoDB version: 3.6.12
I would say the big $nin array is the elegant solution. If there is an index on field then it will also be performant -- but only in terms of quickly excluding those docs not to be returned in the cursor. If you have, say, 10 million docs in a collection and you do a find() to exclude 50000, you are still dragging 9,950,000 records out of the DB and across the wire; that is non-trivial.
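Building on that point, a compound index covering both fields in the example should let the server satisfy the equality part from the index before applying the exclusion; a sketch (the example document uses day while the query shows date, so adjust to whichever field your schema actually has):

// Equality field first, excluded field second.
db.daily_stat.createIndex({ day: 1, username: 1 })

// excludedUsernames is a placeholder for the ~50000-element exclusion array.
// The daily query then only scans the index entries for that day.
db.daily_stat.find({ day: "2020-04-15", username: { $nin: excludedUsernames } })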
If you can find a pattern in the values you pass in, you can try a regex instead. Example:
db.persons.find({'field':{$nin:[/san/i]}},{_id:0,"field":1})
More details on regex: https://docs.mongodb.com/manual/reference/operator/query/regex/

Limit Document Insertion for Embedded Mongo Document

I am curious to find out if I can create/enforce some limitations on a MongoDB document. I want to limit MongoDB embedded documents to a certain number of records (10). I am creating a password check system that will query Mongo and check whether the user's new password either a) matches their current password, or b) matches one of their 10 stored old passwords. If there is no match, then the DB will be updated with the newest password, and the old passwords document will be updated with the previous current password. However, I want to limit this to 10 records and overwrite the oldest record, so there are only ever 10 passwords in the oldPasswords document.
Does this make sense? And is it possible to enforce such a limit? The mock object would look like the following:
_id: "",
username: "User",
currentPassword: "pass"
oldPasswords:{
password1: "pass1",
password2: "pass2",
password3: "pass3",
password4: "pass4",
password5: "pass5",
password6: "pass6",
password7: "pass7",
password8: "pass8",
password9: "pass9",
}
As a sidebar: is this the best way to handle the passwords in Mongo? I have read their modeling documents, and it appears that a one-to-many relationship like this is best kept in an embedded document, unless the embedded document continues to grow; at that point, it seems the old passwords would be best kept in their own document and referenced.
Any help would be greatly appreciated!
If you can switch the old passwords to an array instead of an object, you can use $slice.
db.passwords.update(
    { _id: 1 },
    {
        $push: {
            oldpasswords: {
                $each: ["passabc"],   // the password being pushed
                $slice: -10           // keep only the last 10 elements
            }
        }
    }
)
That should keep only the last 10 passwords in your array.
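Under that array-based layout, a stored document and a duplicate-password check might look roughly like this (field names follow the mock object above; password hashing is left out of the sketch):

// Array-based shape for the same data.
{
    _id: 1,
    username: "User",
    currentPassword: "pass",
    oldpasswords: ["pass1", "pass2", "pass3", "pass4", "pass5",
                   "pass6", "pass7", "pass8", "pass9", "pass10"]
}

// A non-null result means the candidate password matches the current one or a stored old one.
db.passwords.findOne({
    _id: 1,
    $or: [
        { currentPassword: "candidate" },
        { oldpasswords: "candidate" }   // equality on an array field matches any element
    ]
})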

Mongodb Aggregation Framework Output Collection Indexes

I'm building an API with a particular endpoint that returns various statistics for the entire database.
For this, I have an aggregation pipeline that takes 1 second to complete.
Instead of running this aggregation for every request, I want to store the results in a collection c as the aggregated data changes rarely and is accessed frequently.
I will also define a few indexes on c, as I need to return only documents that match some criteria passed to the endpoint.
When the source data is changed, I'd run the aggregation again and replace the contents of collection c.
In MongoDB 3.0, the docs for the $out operator of the aggregation pipeline state that:
The $out operation does not change any indexes that existed on the previous collection
I'm confused, does this mean that MongoDB won't update the indexes on c when its contents are replaced?
P.S.: I know that MapReduce might be an alternative; I tried that first, but I did not manage to get the results I wanted; my current approach works and given the approaching deadline I'd like to simply "cache" the aggregated data instead of reimplementing this from scratch.
EDIT
What I'm asking is if the indexes will reflect the new documents after the replacement of the collection or if they will be "stale".
The indexes will be updated when you execute your aggregate query.
$out will create the collection when your aggregate query succeeds.
Mongo will update the collection created by $out when you execute your aggregation again.
When the collection is updated, the indexes associated with the collection are also updated.
You can test this by following the steps below.
Create a small collection, say 'books':
{ "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 }
Run your aggregation with an output collection (step 2):
db.books.aggregate( [{ $group : { _id : "$author", books: { $push: "$title" } } },{ $out : "authors" }] )
Create an index on books in the new collection authors:
db.authors.createIndex({books:1})
Query your authors collection and look for the winning plan:
db.authors.find({books:'The Banquet'}).explain()
Add another record:
db.books.insert({ "_id" : 7101, "title" : "Wings of Fire", "author" : "APJ Abdul Kalam", "copies" : 1 })
Execute the aggregation from step 2 again.
Now do a find for the new book we added:
db.authors.find({books:'Wings of Fire'}).explain()
You will find that the winning plan uses an IXSCAN, which means the index was used for this search, so the index was updated by Mongo for the new record.
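For reference, the winning plan can also be read directly in the shell rather than scanning the whole explain output:

// An IXSCAN stage in the winning plan confirms the books index was used.
printjson(db.authors.find({ books: 'Wings of Fire' }).explain().queryPlanner.winningPlan);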
MongoDB will preserve the existing indexes. From the docs:
Replace Existing Collection
If the collection specified by the $out operation already exists, then upon completion of the aggregation, the $out stage atomically replaces the existing collection with the new results collection. The $out operation does not change any indexes that existed on the previous collection. If the aggregation fails, the $out operation makes no changes to the pre-existing collection.
Reference:
http://docs.mongodb.org/manual/reference/operator/aggregation/out/
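Putting this together for the original question, the cached collection can be refreshed by simply re-running the pipeline with $out whenever the source data changes; the indexes on c only need to be created once. The collection, field, and stage names below are placeholders for whatever the real pipeline uses.

// One-time setup: indexes on the cached collection survive later $out replacements.
db.c.createIndex({ someCriteriaField: 1 })   // placeholder field matching the endpoint's filters

// Refresh step, run whenever the source data changes.
db.sourceData.aggregate([
    /* ...the existing statistics stages go here... */
    { $out: "c" }
])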