What does it mean when the operation of an oplog just has the id? - mongodb

I am investigating an issue where some data seems to be disappearing, and while looking at the oplog entries for a document affected by the scenario, I noticed a weird operation that I am not sure how to interpret.
{
  lsid: {
    id: new UUID("foo"),
    uid: Binary(Buffer.from("foo", "hex"), 0)
  },
  txnNumber: Long("27"),
  op: 'u',
  ns: 'db.foo',
  o: { _id: ObjectId("foo") },
  o2: { _id: ObjectId("foo") },
  ...
}
What exactly does o: { _id: ObjectId("foo") } do to the document?

The format of the oplog is undocumented and can change between minor versions, so relying on it containing specific data in a certain form is unreliable.
If you really need to know what that structure means, it will require asking the MongoDB developers or delving into the source code.
If you just need to know what operations occurred on the node, use Change Streams.

Debugging my code, I found that this operation was indeed the cause of my bug: running an update operation that specifies only the _id removes all data from the document except the _id.
I found the issue in my code (a Ruby app using the mongoid gem) that was generating the operation: calling model.atomically without passing a block deletes all fields except the _id.
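For anyone hitting the same symptom, the behaviour follows from MongoDB's replacement-update semantics: an update document that contains no $ operators replaces the stored document rather than merging into it. A minimal in-memory sketch of that rule (the `applyReplacement` helper is just a stand-in for what the server does, not real driver code):

```javascript
// Stand-in for MongoDB's replacement-style update: when the update
// document contains no $ operators, it REPLACES the stored document.
// The _id is preserved; every other field is dropped.
function applyReplacement(stored, replacement) {
  return Object.assign({ _id: stored._id }, replacement);
}

const before = { _id: "abc", title: "Hello", views: 42 };
const after = applyReplacement(before, { _id: "abc" });

console.log(after); // { _id: 'abc' } — title and views are gone
```

This is exactly the shape seen in the oplog entry above: an `o` field containing only `{ _id: ... }` is a full replacement by a document with no other fields.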

Related

MongoDB track which user made changes

We're running a microservice architecture with multiple systems having access to shared collections in a MongoDB (yes, questionable design, but we're transitioning away from it).
We're trying to find a way to track which change in the oplog was made by which "user"/service (each microservice uses different credentials). If we find any invalid changes in our DB, this would make it super easy to find out which system is misbehaving.
Any ideas?
The only ones I came up with so far would always have to change client side code, which I would like to avoid.
If you can maintain an editors field such that every operation includes { $addToSet: { editors: { by: userId, at: new Date() } } }, then the corresponding oplog entries will carry this information, which you can cross-check against your access control list.
Note: you might have to change inserts to upserts, or account for another case: .insert({ ..., editors: [{ by: userId, at: new Date() }] })
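To make the two shapes concrete, here is a sketch of helpers that stamp each write with the credential that performed it (the function and field names are illustrative, not from any existing codebase):

```javascript
// Sketch: build the update/insert payloads that record which service
// performed each write, so the audit entry shows up in the oplog.
function stampUpdate(update, userId, now) {
  // Merge the audit entry into whatever update the service was sending.
  return Object.assign({}, update, {
    $addToSet: { editors: { by: userId, at: now } }
  });
}

function stampInsert(doc, userId, now) {
  // For plain inserts, seed the editors array directly.
  return Object.assign({}, doc, {
    editors: [{ by: userId, at: now }]
  });
}

const now = new Date("2020-01-01");
const update = stampUpdate({ $set: { status: "paid" } }, "billing-service", now);
console.log(update.$addToSet.editors.by); // prints billing-service
```

Note that `stampUpdate` as written would clobber a pre-existing `$addToSet` clause in the original update; a production version would need to merge rather than overwrite.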

Incremental field to existing collection

I have a collection containing around 100k documents. I want to add an auto incrementing "custom_id" field to my documents, and keep adding my documents by incrementing that field from now on.
What's the best approach for this? I've seen some examples in the official document (http://docs.mongodb.org/manual/tutorial/create-an-auto-incrementing-field/) however they're only for adding new documents, not for updating an existing collection.
Example code I created based on the link above to increment my counter:
function incrementAndGetNext(counter, callback) {
  counters.findAndModify({
    name: counter
  }, [["_id", 1]], {
    $inc: { count: 1 }
  }, {
    "new": true
  }, function (err, doc) {
    if (err) return console.log(err);
    callback(doc.value);
  });
}
On the above code counters is db.counters collection and I have this document there:
{_id:"...",name:"post",count:"0"}
Would love to know.
Thank you.
P.S. I'm using native mongojs driver for js
Well, using the link you mentioned, I'd rather use the counters collection approach.
The counters collection approach has some drawbacks, including:
It always generates multiple requests (two): one to get the sequence number, another to do the insertion using the id obtained from the sequence.
If you are using MongoDB's sharding features, the document responsible for storing a counter's state may be used heavily, and every request for it will hit the same server.
However, it should be appropriate for most uses.
The approach you mentioned (the "optimistic loop") should not break, in my opinion, and I don't see why you would have a problem with it. However, I would not recommend it. What happens if you execute the code on multiple mongo clients, and one has a lot of latency while the others keep taking IDs? I would not like to run into that kind of problem... Furthermore, there are at least two requests per successful operation, with no upper bound on the number of retries before a success.
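For the backfill part of the question (adding custom_id to the existing 100k documents), the counter logic itself is simple enough to sketch without a driver; the in-memory `counters` object below stands in for the db.counters document. One caveat about the document shown in the question: count must be stored as a number, not the string "0", because $inc only applies to numeric values.

```javascript
// Sketch of the counters approach with an in-memory stand-in for
// the counters collection ($inc plus "new: true" in findAndModify).
const counters = { post: 0 };

function incrementAndGetNext(counter) {
  counters[counter] += 1;    // what { $inc: { count: 1 } } does server-side
  return counters[counter];  // "new: true" returns the post-update value
}

// Backfill: walk the existing documents once and assign ids in order.
const existingDocs = [{ _id: "a" }, { _id: "b" }, { _id: "c" }];
existingDocs.forEach(function (doc) {
  doc.custom_id = incrementAndGetNext("post");
});

console.log(existingDocs.map(d => d.custom_id)); // [ 1, 2, 3 ]
```

Against the real database, the same loop would iterate a cursor over the collection and issue one findAndModify on counters plus one update per document, which is slow for 100k documents but only needs to run once.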

Where to store users followings/followers? User's document of Followings collection? Fat document VS. polymorphic documents

What I'm talking about is:
Meteor.users.findOne() =
{
  _id: "...",
  ...
  followers: {
    users: Array[], // ["someUserId1", "someUserId2"]
    pages: Array[]  // ["somePageId1", "somePageId2"]
  }
}
vs.
Followings.findOne() =
{
  _id: "...",
  followeeId: "...",
  followeeType: "user",
  followerId: "..."
}
I find the second one totally inefficient because I need to use smartPublish to publish a user's followers.
Meteor.smartPublish('userFollowers', function(userId) {
  var cursors = [],
      followings = Followings.find({followeeId: userId});
  followings.forEach(function(following) {
    cursors.push(Meteor.users.find({_id: following.followerId}));
  });
  return cursors;
});
And I can't filter users inside iron-router. I cache subscriptions, so there may be more users than I need.
I want to do something like this:
data: function() {
  return {
    users: Meteor.users.find({_id: {$in: Meteor.user().followers.users}})
  };
},
A bad thing about using nested arrays inside the document is that if I add an item to followers.users, the whole array will be sent back to the client.
So what do you think? Is it better to keep such data inside the user document, even if it becomes fat? Maybe it's the 'Meteor way' of solving such problems.
I think it's a better idea to keep it nested inside the user document. Storing it in a separate collection leads to a lot of unnecessary duplication, and every time the publish function is run you have to scan the entire collection again. If you're worrying about the arrays growing too large, in most cases, don't (generally, a full-text novel only takes a few hundred kb). Plus, if you're publishing your user document already, you don't have to pull any new documents into memory; you already have everything you need.
This MongoDB blog post seems to advocate a similar approach (see one-to-many section). It might be worth checking out.
You seem to be aware of the pros and cons of each option. Unfortunately, your question is mostly opinion based.
Generally, if your follower arrays will be small in size and don't change often, keep them embedded.
Otherwise a dedicated collection is the way to go.
For that case, you might want to take a look at https://atmospherejs.com/cottz/publish which seems very efficient in what it does and very easy to implement syntactically.
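Either way, with the embedded variant the write side stays a single update on the user document. A sketch of the two update payloads, using the field paths from the question ($addToSet avoids duplicate followers, $pull removes one):

```javascript
// Sketch: the update documents for following/unfollowing with the
// embedded design. These would be applied in Meteor as e.g.
//   Meteor.users.update(followeeId, followUpdate(this.userId));
function followUpdate(followerId) {
  return { $addToSet: { "followers.users": followerId } };
}

function unfollowUpdate(followerId) {
  return { $pull: { "followers.users": followerId } };
}
```

Because $addToSet is idempotent, retrying a failed follow request is safe with this shape.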

Mongo for Meteor data design: opposite of normalizing?

I'm new to Meteor and Mongo. Really digging both, but I want to get feedback on something. I am porting an app I made with Django over to Meteor and want to handle certain kinds of relations in a way that makes sense in Meteor. Granted, I am more used to thinking about things in a Postgres way. So here goes.
Let's say I have three related collections: Locations, Beverages and Inventories. For this question though, I will only focus on the Locations and the Inventories. Here are the models as I've currently defined them:
Location:
{
  _id: "someID",
  beverages: [
    {
      _id: "someID",
      fillTo: "87",
      name: "Beer",
      orderWhen: "87",
      startUnits: "87"
    }
  ],
  name: "Second",
  number: "102",
  organization: "The Second One"
}
Inventories:
{
  _id: "someID",
  beverages: [
    { name: "Diet Coke", units: "88" }
  ],
  location: "someID",
  timestamp: 1397622495615,
  user_id: "someID"
}
But here is my dilemma: I often need to retrieve one or many Inventories documents and render the "fillTo", "orderWhen" and "startUnits" per beverage. Doing things the MongoDB way, it looks like I should actually embed these properties in each Inventory as I store it. But that feels really non-DRY (and dirty).
On the other hand, it seems like a lot of effort & querying to render a table for each Inventory taken. I would need to go get each Inventory, then lookup "fillTo", "orderWhen" and "startUnits" per beverage per location then render these in a table (I'm not even sure how I'd do that well).
TIA for the feedback!
If you only need this for rendering purposes (i.e. no further queries), then you can use the transform hook like this:
var myAwesomeCursor = Inventories.find({/* selector */}, {
  transform: function (doc) {
    _.each(doc.beverages, function (bev) {
      // use whatever method you want to receive these data,
      // possibly from some cache or even another collection
      // bev.fillTo = ...
      // bev.orderWhen = ...
      // bev.startUnits = ...
    });
    return doc; // the transform must return the (modified) document
  }
});
Now the myAwesomeCursor can be passed to each helper, and you're done.
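For the lookup inside the transform, the join itself is just a merge by beverage name; a plain-JavaScript sketch (the helper name and sample data are illustrative):

```javascript
// Sketch of the per-beverage lookup the transform needs: merge the
// fillTo/orderWhen/startUnits defined on the location into the
// inventory's beverages, matching by name.
function mergeBeverageDefs(inventoryBevs, locationBevs) {
  var byName = {};
  locationBevs.forEach(function (def) { byName[def.name] = def; });
  return inventoryBevs.map(function (bev) {
    var def = byName[bev.name] || {};
    return Object.assign({}, bev, {
      fillTo: def.fillTo,
      orderWhen: def.orderWhen,
      startUnits: def.startUnits
    });
  });
}

var merged = mergeBeverageDefs(
  [{ name: "Beer", units: "88" }],
  [{ name: "Beer", fillTo: "87", orderWhen: "87", startUnits: "87" }]
);
console.log(merged[0].fillTo); // prints 87
```

Building the `byName` index once per location keeps the merge linear instead of quadratic when a location defines many beverages.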
In your case you might find denormalizing the inventories so they are a property of locations could be the best option, especially since they are a one-to-many relationship. In MongoDB and several other document databases, denormalizing is often preferred because it requires fewer queries and updates. As you've noticed, joins are not supported and must be done manually. As apendua mentions, Meteor's transform callback is probably the best place for the joins to happen.
However, the inventories may contain many beverage records and could cause the location records to grow too large over time. I highly recommend reading this page in the MongoDB docs (and the rest of the docs, of course). Essentially, this is a complex decision that could eventually have important performance implications for your application. Both normalized and denormalized data models are valid options in MongoDB, and both have their pros and cons.

How to store an object in MongoDB that has a key that starts with $

I want to save changes made to my document. The easiest way to do this is to store the actual changes made to a document. What I mean:
var changes = {
  $set: {
    text: 'Some text.'
  }
};

db.posts.update({ _id: _id }, changes);

db.changes.insert({
  postid: _id,
  changes: changes
});
However I'm getting the error (with good reason):
Error: key $set must not start with '$'
What's the easiest way to store changes?
Or perhaps I'm approaching the problem wrong and you have a better solution. I want users to be able to see a log of changes people make to any post or, in fact, anything. I'm not going to write a function for every type of change. Editing the text is just one of many ways to make changes to a document.
Another option, with many reservations, is to store your change log as a JSON string. The content would not be as easily searched, of course, but you retain the simplicity of storing your original data as a string and decoding the JSON on retrieval. If you are simply storing a change log, this approach might work.
db.changes.insert({
  postid: _id,
  changes: JSON.stringify(changes)
});
This is a limitation of MongoDB. There are certain reserved characters, one of them being $, because of how querying works: when using operators, there would otherwise be ambiguity between the document in the collection and the document used for updating.
I would recommend stripping out the $ symbols and instead using these words in place of the operators you are trying to use:
CREATE
SET
DELETE
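Another common way to do the stripping is to escape the reserved characters on write and reverse the mapping on read, so the stored structure still mirrors the original update document. A minimal sketch, assuming the fullwidth substitutes ＄ and ． cannot appear in your real keys:

```javascript
// Escape "$" (and ".", which is also reserved in key names) before
// storing, and reverse the mapping when reading the change log back.
function escapeKeys(value) {
  if (Array.isArray(value)) return value.map(escapeKeys);
  if (value === null || typeof value !== "object") return value;
  const out = {};
  for (const key of Object.keys(value)) {
    const safe = key.replace(/\$/g, "\uFF04").replace(/\./g, "\uFF0E");
    out[safe] = escapeKeys(value[key]);
  }
  return out;
}

function unescapeKeys(value) {
  if (Array.isArray(value)) return value.map(unescapeKeys);
  if (value === null || typeof value !== "object") return value;
  const out = {};
  for (const key of Object.keys(value)) {
    const orig = key.replace(/\uFF04/g, "$").replace(/\uFF0E/g, ".");
    out[orig] = unescapeKeys(value[key]);
  }
  return out;
}

const changes = { $set: { text: "Some text." } };
const stored = escapeKeys(changes);  // safe to insert into db.changes
console.log(unescapeKeys(stored));   // round-trips back to the original
```

Note the sketch only handles plain objects and arrays; values like Date or ObjectId would need to be passed through untouched in a production version.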