Search and update in an array of objects in MongoDB

I have a collection in MongoDB containing each user's search history, where each document is stored like this:
"_id": "user1"
searchHistory: {
"product1": [
{
"timestamp": 1623482432,
"query": {
"query": "chocolate",
"qty": 2
}
},
{
"timestamp": 1623481234,
"query": {
"query": "lindor",
"qty": 4
}
},
],
"product2": [
{
"timestamp": 1623473622,
"query": {
"query": "table",
"qty": 1
}
},
{
"timestamp": 1623438232,
"query": {
"query": "ike",
"qty": 1
}
},
]
}
Here the _id of the document acts as a foreign key to the user document in another collection.
I have a backend running on Node.js, and this function is used to store a new search history entry in the record:
exports.updateUserSearchCount = function (userId, productId, searchDetails) {
  // Build a dynamic key so the entry lands under searchHistory.<productId>
  const key = `searchHistory.${productId}`;
  const addToSetData = {
    [key]: { timestamp: new Date().getTime(), query: searchDetails }
  };
  // Upsert so a document is created the first time a user searches
  return client
    .db("mydb")
    .collection("userSearchHistory")
    .updateOne({ _id: userId }, { $addToSet: addToSetData }, { upsert: true });
};
Now I want to get the search history of a user based on the query value alone, using db.find().
I want something like this:
db.find({"_id": "user1", "searchHistory.somewildcard.query": "some query"})
I need a wildcard that replaces ".somewildcard." so the search runs across all products searched.
I saw a suggestion that the document should be stored like this instead:
"_id": "user1"
searchHistory: [
{
"key": "product1",
"value": [
{
"timestamp": 1623482432,
"query": {
"query": "chocolate",
"qty": 2
}
}
]
}
]
However, if I store the document like this, then adding search history to an existing document becomes a tedious and confusing task.
What should I do?

It's always a bad idea to save values as keys, for exactly the reason you're facing: it heavily limits querying on that field. The trade-off, of course, is that it makes updates much easier.
I personally recommend you do not save these searches in nested form at all; this will cause scaling issues quite quickly. Even assuming these fields are indexed, you will start seeing performance problems once the arrays grow too large (a few hundred searches).
So my personal recommendation is for you to save it in a new collection like so:
{
"user_id": "1",
"key": "product1",
"timestamp": 1623482432,
"query": {
"query": "chocolate",
"qty": 2
}
}
Now querying a specific user, a specific product, or even a query substring is easily supported by creating some basic indexes, as sketched below. An "update" in this case is just an insert of a new document, which is also much faster.
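For example, here is a minimal sketch of the indexes and queries this flat layout supports, assuming the new collection is named userSearches (the collection name and exact index choices are assumptions, not part of the original design):
// Hypothetical collection name: userSearches
db.userSearches.createIndex({ user_id: 1, timestamp: -1 })    // per-user history, newest first
db.userSearches.createIndex({ user_id: 1, key: 1 })           // per-user, per-product lookups
db.userSearches.createIndex({ user_id: 1, "query.query": 1 }) // lookups by the search text

// All searches by a user that match a given query string
db.userSearches.find({ user_id: "1", "query.query": "chocolate" })

// "Updating" the history is just inserting a new document
db.userSearches.insertOne({
  user_id: "1",
  key: "product1",
  timestamp: Date.now(),
  query: { query: "lindor", qty: 4 }
})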
If you still prefer to keep the nested structure, then I recommend you switch to the key/value structure you posted. As you mentioned, updates become slightly more tedious, but you can still do them quite easily, using arrayFilters to update a specific element or just $push to add a new search (see the sketch below).
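For instance, here is a rough sketch of both update styles against the key/value layout, reusing the userSearchHistory collection from your code (the search values are only placeholders):
// Append a new search to an existing product entry via the positional operator
db.userSearchHistory.updateOne(
  { "_id": "user1", "searchHistory.key": "product1" },
  { "$push": { "searchHistory.$.value": { "timestamp": Date.now(), "query": { "query": "truffle", "qty": 1 } } } }
)

// Or target the element explicitly with arrayFilters
db.userSearchHistory.updateOne(
  { "_id": "user1" },
  { "$push": { "searchHistory.$[p].value": { "timestamp": Date.now(), "query": { "query": "truffle", "qty": 1 } } } },
  { "arrayFilters": [{ "p.key": "product1" }] }
)

// And the wildcard-style read you wanted now works, because dotted paths traverse both arrays
db.userSearchHistory.find({ "_id": "user1", "searchHistory.value.query.query": "truffle" })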

Related

MongoDB: how to store multi-relations, like a graph

I have to store some users and their group relations like below.
So I am planning to create a collection like this:
UserGroupRelation collection:
{
  "user": String,
  "Group": String
}
Example documents for a super-admin user:
{
"user":"Adminuser-1",
"Group":"Group1"
}
{
"user":"Adminuser-1",
"Group":"Group2"
}
{
"user":"Adminuser-1",
"Group":"Group3"
}
where user & Group column is indexed and I will run below kind of query
1. Whenever I want to check whether a given user has access to a given group:
db.UserGroupRelation.find( { user: "Adminuser-1", Group: "Group2" })
2. Delete all the associations whenever a group is deleted:
db.UserGroupRelation.deleteMany({ Group: "Group2" })
3. Find all the users of a group:
db.UserGroupRelation.find( { Group: "Group2" })
4. Find the hierarchy? With my approach I am not able to do this.
But with this approach I am duplicating a lot of data, and in production I may have 10,000 groups and 1 million users, so there could be performance issues. With this model I am also not able to maintain a hierarchy like SuperAdmin -> SubAdmin -> user within the same group.
I looked at MongoDB's tree-structure patterns, but they do not fit this requirement. Is there a better way to handle this in MongoDB?
This is the structure your graphic requirements show. It does still lead to repetition, though, so you may need to change it. Read up on one-to-many relationships.
{
  "superAdmin_ID": "001",
  "groups": [
    {
      "_id": "0",
      "groupNumber": "1",
      "users": [
        { "_userKey": "1023", "userName": "Fred" },
        { "_userKey": "1024", "userName": "Steve" }
      ],
      "subAdmin": { "_adminKey": "55230", "adminName": "Maverick" }
    },
    {
      "_id": "1",
      "groupNumber": "2",
      "users": [
        { "_userKey": "1023", "userName": "Fred" },
        { "_userKey": "4026", "userName": "Ella" }
      ],
      "subAdmin": { "_adminKey": "55230", "adminName": "Maverick" }
    },
    {
      "_id": "2",
      "groupNumber": "3",
      "users": [
        { "_userKey": "7026", "userName": "James" }
      ],
      "subAdmin": { "_adminKey": "77780", "adminName": "Chloe" }
    }
  ]
}
You can also make subAdmin an array if you need more than one subAdmin within a group.
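As a rough illustration (not a prescription) of how the access checks from the question could look against this nested layout, assuming the collection is named adminGroups:
// 1. Check whether a given user belongs to a given group
db.adminGroups.find({
  groups: { $elemMatch: { groupNumber: "2", "users._userKey": "1023" } }
})

// 3. List all the users of a group
db.adminGroups.aggregate([
  { $unwind: "$groups" },
  { $match: { "groups.groupNumber": "2" } },
  { $project: { _id: 0, users: "$groups.users" } }
])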

MongoDB document setup and aggregation

I'm pretty new to MongoDB, and while preparing data to be consumed I got into aggregation... what a powerful little thing this database has! I got really excited and started to test some things :)
I'm saving time entries per companyId and employeeId... each of those can have many entries... they are normally sorted by date, but one date can have several entries (multiple registrations on the same day).
I'm trying to come up with a good schema so I can easily get my data exactly how I need it, and as a newbie I would rather ask for guidance and check whether I'm on the right path.
My output should look like this:
[{
"company": "474A5D39-C87F-440C-BE99-D441371BF88C",
"employee": "BA75621E-5D46-4487-8C9F-C0CE0B2A7DE2",
"name": "Bruno Alexandre":
"registrations": [{
"id": 1448364,
"spanned": false,
"spannedDay": 0,
"date": "2019-01-17",
"timeStart": "09:00:00",
"timeEnd": "12:00:00",
"amount": {
"days": 0.4,
"hours": 2,
"km": null,
"unit": "days and hours",
"normHours": 5
},
"dateDetails": {
"week": 3,
"weekDay": 4,
"weekDayEnglish": "Thursday",
"holiday": false
},
"jobCode": {
"id": null,
"isPayroll": true,
"isFlex": false
},
"payroll": {
"guid": null
},
"type": "Sick",
"subType": "Sick",
"status": "APP",
"reason": "IS",
"group": "LeaveAndAbsence",
"note": null,
"createdTimeStamp": "2019-01-17T15:53:55.423Z"
}, /* more date entries */ ]
}, /* other employees */ ]
What is the best way to add the data to a collection?
Is it more efficient to create one document per company/employee and add all registration entries inside that document (it could get really big as time passes), or is it better to have one document per company/employee/date and add all daily events in that document instead?
Regarding aggregation, I'm still new to all this, but I imagine I could simply call:
RegistrationsModel.aggregate([
{
$match: {
date: { $gte: new Date('2019-01-01'), $lte: new Date('2019-01-31') },
company: '474A5D39-C87F-440C-BE99-D441371BF88C'
}
},
{
$group: {
_id: '$employee',
name: { '$first': '$name' }
}
},
{
// ... get all registrations as an Array ...
},
{
$sort: {
'registrations.date': -1
}
}
]);
P.S. I'm taking the Aggregation course to start getting familiar with all of it.
Is it more efficient if I create a document per company/employee and add all registration entries inside that document (it could get really big as time passes)... or is it better to have one document per company/employee/date and add all daily events in that document instead?
From what I understand of document-oriented databases, I would say the aim is to have all the data you need, in a specific context, grouped inside one document.
So what you need to do is identify what data you're going to need (staying close to the features you want to implement) and build your data structure according to that. Be sure to identify future features too, because the more you prepare your data structure for them, the less tricky it will be to scale your database to your needs.
Your aggregation query looks OK! One way to fill in the missing stage is sketched below.
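Assuming each registration lives in its own document with company, employee, name and date fields (this is only a sketch, not the definitive pipeline), you could sort before grouping and $push the whole document to collect the registrations array:
RegistrationsModel.aggregate([
  {
    $match: {
      date: { $gte: new Date('2019-01-01'), $lte: new Date('2019-01-31') },
      company: '474A5D39-C87F-440C-BE99-D441371BF88C'
    }
  },
  // sort first so each employee's registrations come out newest-first
  { $sort: { date: -1 } },
  {
    $group: {
      _id: '$employee',
      name: { $first: '$name' },
      registrations: { $push: '$$ROOT' } // collect the matched documents as an array
    }
  }
]);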

Querying the most recent posts in a MongoDB collection

Rather new to MongoDB/Mongoose/Node. I'm trying to make a query to retrieve the most recent posts (for example, the 10 most recent) across all documents in a collection.
I tried querying this a few different ways:
MessageboardModel.find({"posts": {"time": {"$gte": ISODate("2014-07-02T00:00:00Z")}}} ...
I tried the above just to reach the nested time property, but everything I tried throws an error. I'm definitely missing something here...
Here is an example document in the collection:
{
"_id": {
"$oid": "5c435d493dcf9281500cd177"
},
"movie": 433249,
"posts": [
{
"replies": [],
"_id": {
"$oid": "5c435d493dcf9281500cd142"
},
"username": "Username1",
"time": {
"$date": "2019-01-19T17:24:25.204Z"
},
"post": "This is a post title",
"content": "Content here."
},
{
"replies": [],
"_id": {
"$oid": "5c435d493dcf9281500cd123"
},
"username": "Username2",
"time": {
"$date": "2019-01-12T17:24:25.204Z"
},
"post": "This is another post made earlier",
"content": "Content here."
}
],
"__v": 0
}
There are many documents in the collection. I want to get, say, the 10 most recent posts across all of the documents in the entire collection.
Any help?
You can try using an aggregation query.
Steps:
1. Match the specific document(s).
2. Unwind the posts array with $unwind, so each array element becomes its own document.
3. Sort by the time field from the posts.
4. Project only the fields you need, if you want to limit what is returned.
5. Add a limit for how many posts you want.
<YOUR_MODEL>.aggregate([
  // you may add find conditions here; otherwise keep {} or remove the $match stage
  { $match: { movie: 433249 } },
  // turn each element of the posts array into its own document
  { $unwind: "$posts" },
  // sort by the post time; -1 gives the most recent first (use 1 for ascending)
  { $sort: { "posts.time": -1 } },
  // keep only the posts field (remove or modify this stage if you want every field)
  { $project: { posts: 1 } },
  // you want only the 10 most recent posts
  { $limit: 10 }
]).exec();
Let me know if you're still having any issues or getting any errors; happy to help.

How to multi-sort MongoDB entries with dynamic keys, on two sub-options?

I'm trying to sort this in MongoDB with mongojs on a find():
{
"songs": {
"bNppHOYIgRE": {
"id": "bNppHOYIgRE",
"title": "Kygo - ID (Ultra Music Festival Anthem)",
"votes": 1,
"added": 1428514707,
"guids": [
"MzM3NTUx"
]
},
"izJzdDPH9yw": {
"id": "izJzdDPH9yw",
"title": "Benjamin Francis Leftwich - Atlas Hands (Samuraii Edit)",
"votes": 1,
"added": 1428514740,
"guids": [
"MzM3NTUx"
]
},
"Yifz3X_i-F8": {
"id": "Yifz3X_i-F8",
"title": "M83 - Wait (Kygo Remix)",
"votes": 0,
"added": 1428494338,
"guids": []
},
"nDopn_p2wk4": {
"id": "nDopn_p2wk4",
"title": "Syn Cole - Miami 82 (Kygo Remix)",
"votes": 0,
"added": 1428494993,
"guids": []
}
}
}
and I want to sort the entries in songs by votes ascending and added descending.
I have tried
db.collection(coll).find().sort({votes:1}, function(err, docs) {});
but that doesn't work.
If this is an operation that you're going to be doing often, I would strongly consider changing your schema. If you make songs an array instead of a map, then you can perform this query using aggregation.
db.coll.aggregate([{ "$unwind": "$songs" }, { "$sort": { "songs.votes": 1, "songs.added": -1 }}]);
And if you put each of these songs in a separate songs collection, then you could perform the query with a simple find() and sort().
db.songs.find().sort({ "votes": 1, "added": -1 });
With your current schema, however, all of this logic would need to live in your application, and it would get messy. A possible solution is to fetch the documents and, while iterating through the cursor, iterate through the keys of each document's songs object, adding the subdocuments to an array; once you have all of them in the array, sort it by votes and added (a sketch follows below).
It is possible, but unnecessarily complex. And, of course, you wouldn't be able to take advantage of indexes, which would have an impact on your performance.
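Here is a minimal sketch of that application-side approach with mongojs, assuming the collection holding these documents is called playlists (the collection name is an assumption):
// Flatten every document's songs map into one array, then sort in the application
db.collection("playlists").find({}, function (err, docs) {
  if (err) throw err;

  var songs = [];
  docs.forEach(function (doc) {
    Object.keys(doc.songs || {}).forEach(function (key) {
      songs.push(doc.songs[key]); // each value already carries its own id
    });
  });

  // votes ascending, then added descending
  songs.sort(function (a, b) {
    return (a.votes - b.votes) || (b.added - a.added);
  });

  console.log(songs);
});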
You already include the key inside the subdocument, so I would really recommend you reconsider your schema.

Mongo using indexes with sort

I'm trying to optimize a MongoDB query. I have an index on from_account_id, to_account_id, and created_at, but the following query does a full collection scan.
{
"ts": {
"$date": "2012-03-18T20:29:27.038Z"
},
"op": "query",
"ns": "heroku_app2281692.transactions",
"query": {
"$query": {
"$or": [
{
"from_account_id": {
"$oid": "4f55968921fcaf0001000005"
}
},
{
"to_account_id": {
"$oid": "4f55968921fcaf0001000005"
}
}
]
},
"$orderby": {
"created_at": -1
}
},
"ntoreturn": 25,
"nscanned": 2643718,
"responseLength": 20,
"millis": 10499,
"client": "10.64.141.77",
"user": "heroku_app2281692"
}
If I don't use the $or and only query from_account_id or to_account_id with an order on it, it's fast.
What's the best way to get the desired effect? Should I be keeping the account ids (both from and to) in one field, like an array? Or perhaps there is a better way. Thanks!
Unfortunately, as you have discovered, an $or clause can make life difficult for the optimizer.
To work around this you have a couple of options, among them:
Divide your query into two and manually merge the results.
Change your data model to allow efficient querying. For example, you might add a "referenced_accounts" field that is an array of all the accounts referenced in the transaction; a sketch follows below.
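As a rough sketch of that second option, assuming the hypothetical referenced_accounts field is written alongside the existing fields (the second ObjectId below is made up for illustration):
// Store both parties in one array field when the transaction is written
db.transactions.insertOne({
  from_account_id: ObjectId("4f55968921fcaf0001000005"),
  to_account_id: ObjectId("4f55968921fcaf0001000099"),
  referenced_accounts: [
    ObjectId("4f55968921fcaf0001000005"),
    ObjectId("4f55968921fcaf0001000099")
  ],
  created_at: new Date()
})

// A compound multikey index supports both the equality match and the sort
db.transactions.createIndex({ referenced_accounts: 1, created_at: -1 })

// No $or needed, so the index can serve the whole query
db.transactions.find({ referenced_accounts: ObjectId("4f55968921fcaf0001000005") })
  .sort({ created_at: -1 })
  .limit(25)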