How to remove related documents after removing by TTL? - mongodb

In my MongoDB database, I have three different collections: A, B and AtoB.
A and B are different types of entities, and AtoB links them as follows:
A:
{
_id: ObjectId,
timestamp: Date,
keyA: string
}
B:
{
_id: ObjectId,
timestamp: Date,
keyB: number
}
AtoB:
{
_id: ObjectId,
aId: ObjectId, // Points to a document from A
bId: ObjectId // Points to a document from B
}
I created a TTL index on A, so its documents are deleted once their timestamp is more than an hour old.
Is it possible somehow to remove all the related AtoB documents, based on the _id of the removed A documents?
In other words, is it possible to not only remove the A documents using the TTL, but also remove the documents related to the ones that were removed?
Thanks

In a word - no, not in this set up.
The options you have:
set up a changestream worker to delete links
set up a cron job to clean up links collection every minute
embed AtoB links into A documents
I would recommend the latter, but it really depends on how feasible the change is for the rest of your application. Having a dedicated lookup collection is really an RDBMS practice. It has very niche use cases in the Mongo universe.
Your A documents will look like this:
{
_id: ObjectId,
timestamp: Date,
keyA: string,
bIds: [
ObjectId,
ObjectId,
...
]
}
When the document's TTL expires, the document is removed together with all of its links at once.
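If embedding is not feasible, the cron-style cleanup from the list of options could look roughly like the following. This is only a minimal sketch, not code from the answer: the helper name `findOrphanedLinks` is hypothetical, plain arrays and string ids stand in for the collections and ObjectIds, and the commented shell call assumes a connected `db` handle.

```javascript
// Hypothetical helper: returns the AtoB links whose aId no longer exists in A.
// Plain arrays stand in for the collections; string ids stand in for ObjectIds.
function findOrphanedLinks(aDocs, links) {
  const liveIds = new Set(aDocs.map((doc) => doc._id));
  return links.filter((link) => !liveIds.has(link.aId));
}

// A documents that survived the TTL pass (a2 has already expired).
const remainingA = [{ _id: 'a1' }, { _id: 'a3' }];
const atoB = [
  { _id: 'l1', aId: 'a1', bId: 'b1' },
  { _id: 'l2', aId: 'a2', bId: 'b1' },
  { _id: 'l3', aId: 'a3', bId: 'b2' },
];

const orphans = findOrphanedLinks(remainingA, atoB);
console.log(orphans.map((link) => link._id)); // prints [ 'l2' ]

// Against a real deployment (assumption: `db` is a connected shell handle),
// the periodic job could issue something along the lines of:
//   db.AtoB.deleteMany({ aId: { $nin: db.A.distinct('_id') } })
```

For a large A collection, pulling every _id into the filter may become expensive, so this sweep is probably best suited to modest data sizes.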


Sorting nested objects in MongoDB

So I have documents that follow this schema:
{
_id: String,
globalXP: {
xp: {
type: Number,
default: 0
},
level: {
type: Number,
default: 0
}
},
guilds: [{ _id: String, xp: Number, level: Number }]
}
So basically users have their own global XP and xp based on each guild they are in.
Now I want to make a leaderboard for all the users that have a certain guildID in their document.
What's the most efficient way to fetch all the user documents that have the guild _id in their guilds array and how do I sort them afterwards?
I know it might be messy as hell, but bear with me here.
If I've understood correctly, you only need this line of code:
var find = await model.find({"guilds._id":"your_guild_id"}).sort({"globalXP.level":-1})
This query returns all documents whose guilds array contains the specific _id, sorted by the player's global level.
This way, the highest level is displayed first.
Here is an example of how the query works. Please check whether it works as you expected.
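If the leaderboard is meant to rank users by their XP inside the given guild rather than by their global level, an aggregation pipeline is one possible alternative. The sketch below is an assumption about that requirement, not part of the answer above; `guildLeaderboardPipeline` is a hypothetical helper name, and `model` in the trailing comment refers to the Mongoose model from the question.

```javascript
// Hypothetical helper: builds an aggregation pipeline that ranks the members
// of one guild by their XP in that guild (not by globalXP).
function guildLeaderboardPipeline(guildId) {
  return [
    // Keep only users that belong to the guild.
    { $match: { 'guilds._id': guildId } },
    // Emit one document per guild membership...
    { $unwind: '$guilds' },
    // ...and keep only the membership for the requested guild.
    { $match: { 'guilds._id': guildId } },
    // Highest per-guild XP first.
    { $sort: { 'guilds.xp': -1 } },
  ];
}

const pipeline = guildLeaderboardPipeline('your_guild_id');
console.log(JSON.stringify(pipeline, null, 2));

// With Mongoose (assumption: `model` is the compiled model from the question):
//   const leaderboard = await model.aggregate(pipeline);
```

After the `$unwind`, each result carries a single `guilds` object, so the per-guild `xp` and `level` can be read directly from it.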

Creating compound indexes that will match queries in MongoDB

For our app I'm using the free tier (for now) on MongoDB Atlas.
Our documents have, among other fields, a start time, which is a Date object, and a userId, which is an int.
export interface ITimer {
id?: string,
_id?: string, // id property assigned by mongo
userId?: number,
projectId?: number,
description?: string,
tags?: number[],
isBillable?: boolean,
isManual?: boolean,
start?: Date,
end?: Date,
createdAt?: Date,
updatedAt?: Date,
createdBy?: number,
updatedBy?: number
};
I'm looking for an index that will match the following query:
let query: FilterQuery<ITimer> = {
start: {$gte: start, $lte: end},
userId: userId,
};
Where start and end parameters are date objects or ISOStrings passed to define a range of days.
Here I invoke the query, sorting results:
let result = await collection.find(query).sort({start: 1}).toArray();
It seems simple enough that the following index would match the above query:
{
key: {
start: 1,
userId: 1,
},
name: 'find_between_dates_by_user',
background: true,
unique: false,
},
But using mongodb-compass to monitor the collection, I see this index is not used.
Moreover, the MongoDB documentation specifically states that if an index matches a query completely, then no documents have to be examined, and the results are based on the indexed information alone.
Unfortunately, for every query I run, I see that documents ARE examined, meaning my index does not match.
Any suggestions? I feel this should be very simple and straightforward, but maybe I'm missing something.
Attached is a screenshot from mongodb-compass 'explaining' the query and its execution.
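One thing that might be worth trying, based on MongoDB's documented Equality, Sort, Range guideline for compound indexes, is to put the equality field (userId) before the range field (start). Separately, a query is only answered from the index alone when its projection is limited to indexed fields. The sketch below is an assumption about what could help, not a verified fix for the collection in the question; the index name and the sample values are made up for illustration.

```javascript
// Candidate index: equality field first, then the sort/range field.
const indexSpec = {
  key: { userId: 1, start: 1 },
  name: 'find_by_user_between_dates', // hypothetical name
};

// Same filter shape as in the question, with sample values.
const query = {
  userId: 42,
  start: { $gte: new Date('2021-01-01'), $lte: new Date('2021-01-31') },
};

// Returning only indexed fields (and excluding _id) is what allows a query
// to be served without fetching the documents themselves.
const coveredProjection = { _id: 0, userId: 1, start: 1 };

console.log(Object.keys(indexSpec.key)); // prints [ 'userId', 'start' ]

// With the Node.js driver (assumption: `collection` is the timers collection):
//   await collection.createIndex(indexSpec.key, { name: indexSpec.name });
//   await collection.find(query).project(coveredProjection)
//     .sort({ start: 1 }).explain('executionStats');
```

Comparing the `explain` output before and after the change should show whether the planner picks the new index and how many documents it examines.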

MongoDB schema design: reference by ID vs. reference by name?

With this simple example
(using shortened ObjectIds to make it easier to read)
Tag documents:
{
_id: ObjectId('0001'),
name: 'JavaScript',
// other data
},
{
_id: ObjectId('0002'),
name: 'MongoDB',
// other data
},
...
Assume that we need a separate tag collection, e.g. because we need to store some information on each tag.
If reference by ID:
// a book document
{
_id: ObjectId('9876'),
title: 'MEAN Web Development',
tags: [ObjectId('0001'), ObjectId('0002'), ...]
}
If reference by name:
{
_id: ObjectId('9876'),
title: 'MEAN Web Development',
tags: ['JavaScript', 'MongoDB', ...]
}
It's known that "reference by ID" is feasible.
I'm thinking that with "reference by name", a query for a book's info only needs to look in the book collection: we would know the tags' names without a join ($lookup) operation, which should be faster.
If the app checks the tags before creating or modifying a book, this should also be feasible, and faster.
I'm still not very sure:
Is there any downside to "reference by name"?
Will "reference by name" be slower for "finding all books with a given tag"? Maybe ObjectId is somehow special?
Thanks.
I would say it depends on what your use case is for tags. As you say, it will be more expensive to do a $lookup to retrieve tag names if you reference by id. On the other hand, if you expect that tag names may change frequently, all documents in the book collection containing that tag will need to be updated every change.
The ObjectID is simply a 12-byte value, which is autogenerated by the driver if no _id is present in inserted documents. See the MongoDB docs for more info. The only "special behavior" is the fact that _id has an index by default. An index will speed up lookups in general, but indexes can be created on any field, not just _id.
In fact, the _id does not need to be an ObjectID. It is perfectly legal to have documents with integer _id values for instance:
{
_id: 1,
name: 'Javascript'
},
{
_id: 2,
name: 'MongoDB'
},
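The rename cost mentioned above can be made concrete with a small sketch. It assumes tags are stored by name in each book's `tags` array; `renameTagUpdate` is a hypothetical helper, the extra book titles are made up, and the array only stands in for a real collection.

```javascript
// Hypothetical helper: builds the arguments for an updateMany call that
// renames a tag stored by name inside each book's `tags` array.
function renameTagUpdate(oldName, newName) {
  return {
    filter: { tags: oldName },               // matches books whose array contains oldName
    update: { $set: { 'tags.$': newName } }, // positional operator rewrites the matched element
  };
}

// In-memory stand-in for the book collection, to show how many documents a rename touches.
const books = [
  { _id: 1, title: 'MEAN Web Development', tags: ['JavaScript', 'MongoDB'] },
  { _id: 2, title: 'Another book', tags: ['JavaScript'] },
  { _id: 3, title: 'Unrelated book', tags: ['MongoDB'] },
];
const touched = books.filter((book) => book.tags.includes('JavaScript')).length;

const { filter, update } = renameTagUpdate('JavaScript', 'JS');
console.log(touched, JSON.stringify(filter), JSON.stringify(update));

// Against a real collection (assumption: `db.books` exists):
//   db.books.updateMany(filter, update)
```

With reference by ID, the same rename would be a single update in the tag collection, at the price of a `$lookup` whenever tag names are needed alongside books.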

Is it better to save id of a document in another document as ObjectId or String

Let's take a simple "bad" example: let's assume I have 2 collections, 'person' and 'address', and that in 'address' I want to store the '_id' of the person the address is associated with. Is there any benefit to storing this "referential key" as an ObjectId vs. a string in the 'address' collection?
I feel like storing it as a string should not hurt, but I have not worked with Mongo for very long and do not know if it will hurt down the road if I follow this pattern.
I read the post here: Store _Id as object or string in MongoDB?
It says that ObjectId is faster, and I assume that's true if you are fetching/updating using the ObjectId in the parent collection (e.g. fetching/updating the 'person' collection using person._id as an ObjectId), but I couldn't find anything that suggests the same holds when searching by the string representation of the id in another collection (in our example, searching the address collection by person._id as a string).
Your feedback is much appreciated.
Regardless of performance, you should store the "referential key" in the same format as the _id field that you are referring to. That means that if your referred document is:
{ _id: ObjectID("68746287..."), value: 'foo' }
then you'd refer to it as:
{ _id: ObjectID(…parent document id…), subDoc: ObjectID("68746287...") }
If the document that you're pointing to has a string as an ID, then it'd look like:
{ _id: "derick-address-1", value: 'foo' }
then you'd refer to it as:
{ _id: ObjectID(…parent document id…), subDoc: "derick-address-1" }
Besides that, because you're talking about persons and addresses, it might make more sense to not have them in two documents altogether, but instead embed the document:
{ _id: ObjectID(…parent document id…),
'name' : 'Derick',
'addresses' : [
{ 'type' : 'Home', 'street' : 'Victoria Road' },
{ 'type' : 'Work', 'street' : 'King William Street' },
]
}
As for using a string as the document id: in a Meteor collection, you can generate the document id either with Random.id() as a string or with Meteor.Collection.ObjectID() as an ObjectId.
In this discussion thread, Mongodb string id vs ObjectId, there is one good summary:
ObjectId Pros
it has an embedded timestamp in it.
it's the default Mongo _id type; ubiquitous
interoperability with other apps and drivers
ObjectId Cons
it's an object, and a little more difficult to manipulate in practice.
there will be times when you forget to wrap your string in new ObjectId()
it requires server side object creation to maintain _id uniqueness
- which makes generating them client-side by minimongo problematic
String Pros
developers can create domain specific _id topologies
String Cons
developer has to ensure uniqueness of _ids
findAndModify() and getNextSequence() queries may be invalidated
All of the information above is based on the Meteor framework. For MongoDB itself, it is better to use ObjectId; the reasons are in the question linked in your question.
Storing it as an ObjectId is beneficial. It is faster, as an ObjectId is 12 bytes in size, compared to its string form, which takes 24 bytes.
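The size difference and the embedded timestamp mentioned in this thread can be checked with a few lines of plain JavaScript. This is a small sketch assuming Node.js (for `Buffer`); the hex value is only an illustrative sample, and no MongoDB driver is involved.

```javascript
// Sample 24-character hex form of an ObjectId (illustrative value).
const hex = '507f1f77bcf86cd799439011';

// Stored as an ObjectId it occupies 12 bytes; as a hex string, 24 characters.
const asBytes = Buffer.from(hex, 'hex');
console.log(asBytes.length, hex.length); // prints 12 24

// The first 4 bytes are a big-endian Unix timestamp in seconds.
const createdAt = new Date(parseInt(hex.slice(0, 8), 16) * 1000);
console.log(createdAt.toISOString()); // prints 2012-10-17T21:13:27.000Z
```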
Also, you should try to denormalize your collections so that you don't need two collections (the opposite of the RDBMS approach).
Something like this might be better in general:
{ _id : "1",
person : {
Name : "abc",
age: 20
},
address : {
street : "1st main",
city: "Bangalore",
country: "India"
}
}
But again, it depends on your use case. This might be not suitable sometimes.
Hope that helps! :)

How to store related records in mongodb?

I have a number of associated records such as below.
Parent records
{
_id:234,
title: "title1",
name: "name1",
association:"assoc1"
},
Child record
{
_id:21,
title: "child title",
name: "child name1",
},
I want to store such records into MongoDb. Can anyone help?
Regards.
Even though MongoDB doesn't support joins the way a relational database does, you can organize data in several different ways:
1) First of all, you can inline (or embed) related documents. This is useful if you have some hierarchy of documents, e.g. a post and its comments. In this case you can do it like so:
{
_id: <post_id>,
title: 'asdf',
text: 'asdf asdf',
comments: [
{<comment #1>},
{<comment #2>},
...
]
}
In this case, all related data will be in the same document. You can fetch it with one query, but pushing new comments to a post can cause the document to be moved on disk, and frequent updates will increase disk load and space usage.
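Appending a comment to the embedded array is typically done with the `$push` update operator. The sketch below only builds the update and mimics its effect on a plain object; `addCommentUpdate` is a hypothetical helper, and the `db.posts` collection in the trailing comment is an assumption.

```javascript
// Hypothetical helper: builds the update that appends one comment
// to a post's embedded `comments` array.
function addCommentUpdate(postId, comment) {
  return {
    filter: { _id: postId },
    update: { $push: { comments: comment } },
  };
}

// In-memory stand-in for what the server does with $push.
const post = { _id: 1, title: 'asdf', text: 'asdf asdf', comments: [] };
const { filter, update } = addCommentUpdate(1, { text: 'yep!' });
if (post._id === filter._id) {
  post.comments.push(update.$push.comments);
}
console.log(post.comments.length); // prints 1

// Against a real collection (assumption: `db.posts` exists):
//   db.posts.updateOne(filter, update)
```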
2) Referencing is another technique you can use: in each document, you can put a special field that contains the _id of the parent/related object:
{
_id: 1,
type: 'post',
title: 'asdf',
text: 'asdf asdf'
},
{
_id: 2,
type: 'comment',
text: 'yep!',
parent_id: 1
}
In this case you store posts and comments in the same collection, therefore you have to store an additional type field. MongoDB doesn't support constraints or any other way of checking data consistency. This means that if you delete the post with _id=1, the comment with _id=2 is left with a broken link in parent_id.
You can separate posts and comments into different collections, or even databases, by using database references; see your driver documentation for more details.
Both solutions can store tree-structured data, but in different ways.
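Because the server does not enforce references, the application has to remove dependent documents itself. One way to do that for the single-collection layout is sketched below; `cascadeDeleteFilter` and `matches` are hypothetical helpers, the array stands in for the collection, and `db.content` in the trailing comment is an assumed collection name.

```javascript
// Hypothetical helper: filter matching a post and every comment that
// references it, for the single-collection layout with a `type` field.
function cascadeDeleteFilter(postId) {
  return { $or: [{ _id: postId }, { parent_id: postId }] };
}

// Minimal in-memory matcher for this one filter shape (not a general query engine).
function matches(doc, filter) {
  return filter.$or.some((clause) =>
    Object.entries(clause).every(([field, value]) => doc[field] === value)
  );
}

const docs = [
  { _id: 1, type: 'post', title: 'asdf', text: 'asdf asdf' },
  { _id: 2, type: 'comment', text: 'yep!', parent_id: 1 },
  { _id: 3, type: 'post', title: 'other', text: 'other text' },
];

const deleteFilter = cascadeDeleteFilter(1);
const remaining = docs.filter((doc) => !matches(doc, deleteFilter));
console.log(remaining.map((doc) => doc._id)); // prints [ 3 ]

// Against a real collection (assumption: `db.content` holds both types):
//   db.content.deleteMany(deleteFilter)
```

Since the post and its comments live in one collection here, a single `deleteMany` covers both; with separate collections, two delete calls would be needed.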