Meteor: reactively publishing data from different collections - mongodb

I'm trying to build a home automation system with Meteor, and I would like to do the following.
I have a collection with all the different live values I'm reading from various sources. Each document is the current value of, for example, a sensor.
Now I want to create a second collection called Thing. In this collection I'd like to add all my "Things", for example "Room temperature living room", with the data for that thing. One attribute should be a reference to one of the live values.
Now I want to publish and subscribe to the Thing collection with Meteor, because on the web interface it doesn't matter which live value is behind the thing.
Here is where, in my opinion, the complicated part starts.
How can I publish the data to the client so that I get a reactive update when the live value behind a thing changes, even though it lives in a different collection than the Thing collection?
My idea is to do this via one subscription to one Thing document, and to receive through this subscription the updates to the corresponding document in the live values collection.
Is this workable?
Does somebody have an idea how I can handle this?
I've heard about meteor-reactive-publish, but I'm not sure whether that is the solution; I've also heard that it needs a lot of server resources.
Thanks for your help.

So basically you want to merge documents from different collections on the server side into one reactive collection on the client side.
You should use observeChanges, provided by Meteor collections, as described in the docs.
With it you can observe the changes on your server-side collections and publish them to your aggregated client-side collection, like this:
// Inside a Meteor.publish function on the server:
Meteor.publish('thingsPub', function () {
  var self = this;

  // Get the data from some kind of sensor
  var cursor = SomeSensor.find({/* your query */});

  // Observe changes on the cursor and publish them to the
  // 'things' collection on the client. Note that the
  // observeChanges callbacks receive (id, fields), not a document.
  var observer = cursor.observeChanges({
    added: function (id, fields) {
      self.added('things', id, fields);
    },
    changed: function (id, fields) {
      self.changed('things', id, fields);
    },
    removed: function (id) {
      self.removed('things', id);
    }
  });

  // Mark the publication as ready
  self.ready();

  // Stop the observer when the subscription stops
  self.onStop(function () {
    observer.stop();
  });
});
With this, the things collection on the client will reactively contain the data from all the sensors.
Hope it helps you.
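On the client, the counterpart can be as small as this (a sketch; the publication name thingsPub matches the server code above and is otherwise arbitrary):

// Declaring the collection in client code only; no server MongoDB collection is needed
Things = new Mongo.Collection('things');
Meteor.subscribe('thingsPub');

// Things.find() is now a reactive cursor that updates whenever a sensor value changes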

How to persist aggregate/read model from "EventStore" in a database?

Trying to implement Event Sourcing and CQRS for the first time, but got stuck when it came to persisting the aggregates.
This is where I'm at now:
I've set up "EventStore" and a stream, "foos"
Connected to it from node-eventstore-client
I subscribe to events with catchup
This is all working fine.
With the help of the eventAppeared event handler function I can build the aggregate whenever events occur. This is great, but what do I do with it?
Let's say I build an aggregate that is a list of Foos:
[
  {
    id: 'some aggregate uuidv5 made from barId and bazId',
    barId: 'qwe',
    bazId: 'rty',
    isActive: true,
    history: [
      {
        id: 'some event uuid',
        data: {
          isActive: true,
        },
        timestamp: 123456788,
        eventType: 'IsActiveUpdated'
      },
      {
        id: 'some event uuid',
        data: {
          barId: 'qwe',
          bazId: 'rty',
        },
        timestamp: 123456789,
        eventType: 'FooCreated'
      }
    ]
  }
]
To follow CQRS I will build the above aggregate within a Read Model, right? But how do I store this aggregate in a database?
I guess just a NoSQL database should be fine for this, but I definitely need a db, since I will put a gRPC API in front of this and other read models / aggregates.
But how do I actually go from having built the aggregate to persisting it in the db?
I once tried following this tutorial https://blog.insiderattack.net/implementing-event-sourcing-and-cqrs-pattern-with-mongodb-66991e7b72be which was super simple, since you'd use mongodb both as the event store and to hold a view of the aggregate that you update when new events come in. It had its flaws and limitations (the aggregation pipeline), which is why I've now turned to "EventStore" for the event store part.
But how to persist the aggregate, which is currently just built and stored in code/memory from events in "EventStore"...?
I feel this may be a silly question, but do I have to loop over each item in the array and insert each item in the db table/collection, or is there a way to dump the whole array/aggregate there at once?
What happens after? Do you create a materialized view per aggregate and query against that?
I'm open to picking the best db for this, whether that is postgres/other rdbms, mongodb, cassandra, redis, table storage etc.
Last question. For now I'm just using a single stream "foos", but at this level I expect new events to happen quite frequently (every couple of seconds or so) but as I understand it you'd still persist it and update it using materialized views right?
So given that barId and bazId in combination can be used for grouping events, instead of a single stream I'd think more specialized streams such as foos-barId-bazId would be the way to go, to try and reduce the frequency of incoming new events to a point where recreating materialized views will make sense.
Is there a general rule of thumb saying not to recreate/update/refresh materialized views once the update frequency rises above a certain limit? Then the only other alternative would be querying from a normal table/collection?
Edit:
In the end I'm trying to make a gRPC api that has just 2 rpcs - one for getting a single foo by id and one for getting all foos (with optional field for filtering by status - but that is not so important). The simplified proto would look something like this:
rpc GetFoo(FooRequest) returns (Foo);
rpc GetFoos(FoosRequest) returns (FoosResponse);

message FooRequest {
  string id = 1; // uuid
}

// If the optional status field is not specified, return all foos
message FoosRequest {
  // If this field is specified, only return the Foos whose isActive matches it
  FooStatus status = 1;
  enum FooStatus {
    UNKNOWN = 0;
    ACTIVE = 1;
    INACTIVE = 2;
  }
}

message FoosResponse {
  repeated Foo foos = 1;
}

message Foo {
  string id = 1; // uuid
  string bar_id = 2; // uuid
  string baz_id = 3; // uuid
  bool is_active = 4;
  repeated Event history = 5;
  google.protobuf.Timestamp last_updated = 6;
}

message Event {
  string id = 1; // uuid
  google.protobuf.Any data = 2;
  google.protobuf.Timestamp timestamp = 3;
  string eventType = 4;
}
The incoming events would look something like this:
{
  id: 'some event uuid',
  barId: 'qwe',
  bazId: 'rty',
  timestamp: 123456789,
  eventType: 'FooCreated'
}

{
  id: 'some event uuid',
  isActive: true,
  timestamp: 123456788,
  eventType: 'IsActiveUpdated'
}
As you can see, there is no uuid in the events to make GetFoo(uuid) possible in the gRPC API, which is why I'll generate a uuidv5 from the barId and bazId; combined, they form a valid uuid. I'm creating it in the projection / aggregate you see above.
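Something like this, using the uuid npm package (the namespace value is just an illustrative constant; any fixed uuid works, as long as it never changes):

const { v5: uuidv5 } = require('uuid');

// Made-up namespace for the example
const FOO_NAMESPACE = '1b671a64-40d5-491e-99b0-da01ff1f3341';

// Deterministic: the same barId/bazId pair always yields the same id
function fooId(barId, bazId) {
  return uuidv5(barId + ':' + bazId, FOO_NAMESPACE);
}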
Also, the GetFoos rpc will either return all foos (if the status field is left undefined), or alternatively return the foos whose isActive matches the status field (if specified).
Yet I can't figure out how to continue from the catchup subscription handler.
I have the events stored in "EventStore" (https://eventstore.com/). Using a catch-up subscription, I have built an aggregate/projection with an array of Foos in the form I want them. But to be able to get a single Foo by id from my gRPC API, I guess I'll need to store this entire aggregate/projection in a database of some sort, so the API can connect and fetch the data? And every time a new event comes in I'll need to apply it to the database as well, or how does this work?
I think I've read every resource I can possibly find on the internet, but still I'm missing some key pieces of information to figure this out.
The gRPC part is not so important; it could be REST, I guess. My big question is how to make the aggregated/projected data available to the API service (possibly more APIs will need it as well). I guess I will need to store the aggregated/projected data, with the generated uuid and the history field, in a database so that it can be fetched by uuid from the API service. But what database, and how does the storing process work, starting from the catchup event handler where I build the aggregate?
I know exactly how you feel! This is basically what happened to me when I first tried to do CQRS and ES.
I think you have a couple of gaps in your knowledge which I'm sure you will rapidly plug. You hydrate an aggregate from the event stream as you are doing. That IS your aggregate persisted. The read model is something different. Let me explain...
Your read model is the thing you run queries against and that provides data for display to a UI, for example. Your aggregates are not (directly) involved in that. In fact, they should be encapsulated, meaning that you can't 'see' their state from the outside, i.e. no getters and setters, with the exception of the aggregate ID, which would have a getter.
This article gives you a helpful overview of how it all fits together: CQRS + Event Sourcing – Step by Step
The idea is that when an aggregate changes state it can only do so via an event it generates. You store that event in the event store. That event is also published so that read models can be updated.
Also, looking at your aggregate, it looks more like a typical read model object or DTO. An aggregate is interested in functionality, not properties. So you would expect to see public void functions for issuing commands to the aggregate, but not public properties like isActive or history.
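As a sketch of what that encapsulation might look like (illustrative JavaScript, not tied to any particular framework; the event names are taken from your example):

class FooAggregate {
  constructor(id, history = []) {
    this.id = id;                   // the only state exposed to the outside
    this._isActive = false;         // private by convention
    this._uncommittedEvents = [];
    history.forEach((e) => this._apply(e)); // rehydrate from past events
  }

  // A command: public, returns nothing, exposes no state
  deactivate() {
    if (!this._isActive) return;    // nothing to do
    this._record({ eventType: 'IsActiveUpdated', data: { isActive: false } });
  }

  _record(event) {
    this._apply(event);
    this._uncommittedEvents.push(event); // to be appended to the event store
  }

  _apply(event) {
    if (event.eventType === 'IsActiveUpdated') {
      this._isActive = event.data.isActive;
    }
  }

  // The repository calls this to persist newly generated events
  flushUncommittedEvents() {
    const events = this._uncommittedEvents;
    this._uncommittedEvents = [];
    return events;
  }
}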
I hope that makes sense.
EDIT:
Here are some more practical suggestions.
"To follow CQRS I will build the above aggregate within a Read Model, right? "
You do not build aggregates in the read model. They are separate things on separate sides of the CQRS side of the equation. Aggregates are on the command side. Queries are done against read models which are different from aggregates.
Aggregates have public void functions and no getter or setters (with the exception of the aggregate id). They are encapsulated. They generate events when their state changes as a result of a command being issued. These events are stored in an event store and are used to recover the state of an aggregate. In other words, that is how an aggregate is stored.
The events go on to be published so the event handlers and other processes can react to them and update the read model and or trigger new cascading commands.
"Last question. For now I'm just using a single stream "foos", but at this level I expect new events to happen quite frequently (every couple of seconds or so) but as I understand it you'd still persist it and update it using materialized views right?"
Every couple of seconds is very likely to be fine. I'm more concerned about the "persist and update using materialised views" part. I don't know exactly what you mean by that, but it doesn't sound like you have the right idea. Views should be very simple read models, with no need for the complex relations you find in an RDBMS, and are therefore highly optimised and fast to read.
There can be a lot of confusion around all the terminology and jargon used in DDD, CQRS and ES. I think in this case the confusion lies in what you think an aggregate is. You mention that you would like to persist your aggregate as a read model. As @Codescribler mentioned, at the sink end of your event stream there isn't a concept of an aggregate. Concretely, in ES, commands are applied to aggregates in your domain by loading the previous events pertaining to that aggregate, rehydrating the aggregate by folding each previous event onto it, and then applying the command, which generates more events to be persisted in the event store.
Downstream, a subscribing process receives all the events in order and builds a read model based on the events and the data they contain. The confusion here is that this read model, at this end, is not an aggregate per se. It might very well look exactly like your aggregate at the domain end, or it might be a read model that doesn't use all the events and/or all the event data.
For example, you may choose to use every bit of information and build a read model that looks exactly like the aggregate hydrated up to the newest event (likely your source of confusion). You may instead have another process that builds a read model that only tallies a specific type of event. You might even subscribe to multiple streams and "join" them into one big read model.
As for how to store it, this is really up to you. It seems to me like you are taking the events and rebuilding your aggregate plus a history of events in an in-memory structure. This, of course, doesn't scale, which is why you want to store it at rest in a database. I wouldn't use the in-memory structure, since you would need to do a lot of state diffing when you flush to the database. You should modify the database directly in response to each individual event. Ideally, you also transactionally store the stream position (a checkpoint) with that modification, so you don't process the same event again in the case of a failure.
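To make that concrete, here is a rough sketch of such an event handler against MongoDB (the collection and field names are assumptions for illustration, and the id is the uuidv5 derived from barId and bazId as described in the question):

// Apply one incoming event directly to the read model, then record the
// checkpoint so that a restarted catch-up subscription resumes rather
// than reprocessing. The caller derives `id` (uuidv5 of barId/bazId)
// from the stream name or the event payload.
async function handleEvent(db, id, event, streamPosition) {
  const foos = db.collection('foos_read_model');

  if (event.eventType === 'FooCreated') {
    await foos.updateOne(
      { _id: id },
      {
        $setOnInsert: { barId: event.barId, bazId: event.bazId },
        $push: { history: event },
        $set: { lastUpdated: new Date(event.timestamp) }
      },
      { upsert: true }
    );
  } else if (event.eventType === 'IsActiveUpdated') {
    await foos.updateOne(
      { _id: id },
      {
        $set: { isActive: event.isActive, lastUpdated: new Date(event.timestamp) },
        $push: { history: event }
      }
    );
  }

  // Checkpoint for the projection as a whole
  await db.collection('checkpoints').updateOne(
    { _id: 'foos-projection' },
    { $set: { position: streamPosition } },
    { upsert: true }
  );
}

With a read model shaped like this, GetFoo becomes a findOne by _id, and GetFoos a find with an optional isActive filter.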
Hope this helps a bit.

Add an extra field to a collection on the client side only, without affecting the server side

Is there a way to add or inject an additional field into a collection, then access that newly inserted field on the client side and display it in a template, without affecting the server side? I don't want to touch the APIs or the database, since it's simply a count of something, like a customer having a total of 8 deliveries.
I was doing this code, where I'm subscribing to a collection and then trying to update the collection on the client side, but obviously I have no rights to update it:
Meteor.subscribe("all_customers", function () {
Customer.find({}).forEach(function(data) {
var delivery_count = Delivery.find({ customerID: data._id }).count();
Customer.update( { _id: data._id } , { $push: { deliveries: delivery_count } } );
});
}),
And then doing this one, where I try to manipulate the documents by inserting a new key-value pair, but nothing at all is displayed:
Meteor.subscribe("all_customers", function () {
Customer.find({}).forEach(function(data) {
var delivery_count = Delivery.find({ customerID: data._id }).count();
data.deliveries = delivery_count;
});
}),
My main objective is basically to be able to sort by the count of deliveries in the EasySearch package without touching the APIs or the database, but the package only sorts on an existing field. Or is there a way to do this with the package alone?
Forgive me if this question sounds dumb, but I hope someone can help.
It's completely possible; for that you can use low-level publications, described here: https://docs.meteor.com/api/pubsub.html
You have 2 solutions, one standard, the other hacky:
You follow the documentation and transform your documents before sending them to the client (see the sketch after this list). Your server data is not affected and you can enhance the data (even do joins) from your publication. It's cool, but you can't push it too far, as it might collide and conflict with another publication for the same collection on the same page. The solution is well explained here: How to publish a view/transform of a collection in Meteor?
You can use a client-only collection of custom objects containing only the data you need. You still have a server-side observer on the source collection, but the client receives ready-made objects to treat and display without any calculation. You need more code, but this way you don't mess with the classic server-client collections. It's a bit overkill, so ping me if you want a detailed example.
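For the first option, a rough sketch of such a publication could look like this (the names all_customers, Customer, Delivery and deliveries are taken from your snippets; I'm assuming the Customer collection is named 'customers' in Mongo; note that the count is computed when a customer document is added or changed, so it won't update reactively when only the Delivery collection changes):

// Server: publish customers with an extra computed 'deliveries' field.
// Nothing is written to MongoDB; the field exists only in the published copy.
Meteor.publish("all_customers", function () {
  var self = this;
  var handle = Customer.find({}).observeChanges({
    added: function (id, fields) {
      fields.deliveries = Delivery.find({ customerID: id }).count();
      self.added("customers", id, fields);
    },
    changed: function (id, fields) {
      self.changed("customers", id, fields);
    },
    removed: function (id) {
      self.removed("customers", id);
    }
  });
  self.ready();
  self.onStop(function () {
    handle.stop();
  });
});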

Meteor Pub / Sub behaviour

I'm currently implementing a realtime search function in my app and I've come across some behaviour which I'm confused about.
The background: I have two publications of the same MongoDB collection on my server, named posts.
The first publication sends the latest 50 posts to the Minimongo collection Posts.
The second publication sends the posts matching whatever search is entered by the user to the Minimongo collection PostsSearch, as per below.
// client
Posts = new Mongo.Collection('posts');
PostsSearch = new Mongo.Collection('postsSearch');

// server
Meteor.publish('postsPub', function (options, search) {
  return Posts.find(search, options);
});

Meteor.publish('postsSearchPub', function (options, search) {
  var self = this;
  var subHandle = Posts.find(search, options).observeChanges({
    added: function (id, fields) {
      self.added("postsSearch", id, fields);
    }
  });
  self.ready();
});
My question is, we know from the docs:
If you pass a name when you create the collection, then you are
declaring a persistent collection — one that is stored on the server
and seen by all users. Client code and server code can both access the
same collection using the same API.
However, this isn't the case with PostsSearch. When a user starts searching on the client, the functionality works perfectly as expected: the correct documents are sent to the client.
However, I do not see a postsSearch collection in my MongoDB database, and likewise PostsSearch isn't populated on any client other than my own.
How is this happening? What does self.added("postsSearch", id, fields); do such that it is able to send documents down the wire to the client but not to the MongoDB database?
According to this doc, self.added("postsSearch", id, fields); informs the client side that a document has been added to the postsSearch collection.
And according to Meteor.publish:
Alternatively, a publish function can directly control its published record set by calling the functions added (to add a new document to the published record set), ...
So I'm guessing that self.added does both of these operations: Adds a document to the published record set, and informs the client (that has subscribed to the current publication) of this addition.
Now if you look at Meteor.subscribe:
When you subscribe to a record set, it tells the server to send records to the client. The client stores these records in local Minimongo collections, with the same name as the collection argument used in the publish handler's added, changed, and removed callbacks. Meteor will queue incoming records until you declare the Mongo.Collection on the client with the matching collection name.
This suggests 2 things:
You have to subscribe in order to receive the data from the server-side database.
Some kind of client-side code must exist in order to create a client-only postsSearch collection (this is because, as you said, this collection doesn't exist in the server-side database).
The 2nd point can be achieved quite easily, for example:
if (Meteor.isClient) {
  postsSearch = new Mongo.Collection('postsSearch');
}
In the above example, the postsSearch collection will exist only on the client and not on the server, because it is declared in client code only. (Note that new Mongo.Collection(null) would not work here; an unnamed local collection is not wired up to pub/sub at all.)
And regarding the 1st point: being subscribed to postsSearchPub will automatically send data for the postsSearch collection to the client (even though said collection doesn't exist in the server-side database; this is because of the explicit call to self.added).
Something to check out: according to this doc, self.ready(); calls the onReady callback of the subscription. It would be useful to see what is in that callback; perhaps the client-only postsSearch collection is defined there?
From the doc:
this.added(collection, id, fields)
Call inside the publish function.
Informs the subscriber that a document has been added to the record set.
This means that the line self.added("postsSearch", id, fields); emulates an insert into the client's PostsSearch collection, although obviously no real insert has taken place.
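As an illustration of that point, a publication can even send documents that are backed by no collection at all (a minimal sketch):

// Nothing here touches MongoDB; the 'time' document exists only in the
// publication's record set and in the subscribing clients' Minimongo.
Meteor.publish('serverTime', function () {
  this.added('time', 'now', { ts: new Date() });
  this.ready();
});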
Concerning the absence of the MongoDB collection, it could be related to MongoDB's laziness, which creates a collection only on first insert; not sure though.

Meteor: difference between names for collections, variables, publications, and subscriptions?

In the Discover Meteor examples, what's the difference between "posts" and "Posts"? Why is it that when we do an insert from the server we use "posts", but when querying from the browser we use "Posts"? Wouldn't the system be confused by the case differences?
I see the variable assignment of the client Posts to the server posts in posts.js. Is it a conventional notation to capitalize the client name and use lower case for the server?
Posts = new Meteor.Collection('posts')
Why does server/fixtures.js use "Posts"? I was under the assumption that we query "Posts" in the browser (client) and use "posts" on the server, like we did in meteor mongo. So why are we now using Posts on the server?
Let's distinguish between the different names you might have to deal with when programming Meteor:
Variable names, such as Posts = new Meteor.Collection(...). These are used only so your code knows how to access this variable. Meteor doesn't know or care what it is, although the convention is to capitalize.
Collection names, such as new Meteor.Collection("posts"). This maps to the name of a MongoDB collection (on the server) or a minimongo collection (on the client).
Publication and subscription names, used in Meteor.publish("foo", ...) or Meteor.subscribe("foo"). These have to match up for the client to subscribe to some data on the server.
There are two things you need to match up in the Meteor data model:
Names of publications and their corresponding subscriptions
(usually) Names of collections on the client and server, if using the default collection model
A subscription name always needs to match the name of a publication. However, the collections that are sent for a given subscription needn't have anything to do with the subscription name. In fact, one can send multiple cursors in one publication, or one collection over different publications, or even multiple subscriptions per publication, and they appear merged as one on the client. You can also have different collection names on the server and client; read on...
Let's review the different cases:
Simple subscription model. This is the one you usually see in straightforward Meteor demos.
On client and server,
Posts = new Meteor.Collection("posts");
On server only:
Meteor.publish("postsPub", function() {
return Posts.find()
});
On client only:
Meteor.subscribe("postsPub")
This synchronizes the Posts collection (which is named posts in the database) using the publication called postsPub.
Multiple collections in one publication. You can send multiple cursors over for a single publication, using an array.
On client and server:
Posts = new Meteor.Collection("posts");
Comments = new Meteor.Collection("comments");
On server only:
Meteor.publish("postsAndComments", function() {
return [
Posts.find(),
Comments.find()
];
});
On client only:
Meteor.subscribe("postsAndComments");
This synchronizes the Posts collection as well as the Comments collection using a single publication called postsAndComments. This type of publication is well-suited for relational data; for example, where you might want to publish only certain posts and the comments associated only with those posts. See a package that can build these cursors automatically.
Multiple publications for a single collection. You can use multiple publications to send different slices of data for a single collection which are merged by Meteor automatically.
On server and client:
Posts = new Meteor.Collection("posts");
On server only:
Meteor.publish("top10Posts", function() {
return Posts.find({}, {
sort: {comments: -1},
limit: 10
});
});
Meteor.publish("newest10Posts", function() {
return Posts.find({}, {
sort: {timestamp: -1},
limit: 10
});
});
On client only:
Meteor.subscribe("top10Posts");
Meteor.subscribe("newest10Posts");
This pushes both the 10 posts with the most comments and the 10 newest posts on the site to the user, who sees both sets of data merged into a single Posts collection. If one of the newest posts is also a post with the most comments or vice versa, the Posts collection will contain fewer than 20 items. This is an example of how the data model in Meteor allows you to do powerful data merging operations without implementing the details yourself.
Multiple subscriptions per publication. You can get multiple sets of data from the same publication using different arguments.
On server and client:
Posts = new Meteor.Collection("posts");
On server only:
Meteor.publish("postsByUser", function(user) {
return Posts.find({
userId: user
});
});
On client only:
Meteor.subscribe("postsByUser", "fooUser");
Meteor.subscribe("postsByUser", "barUser");
This causes the posts by fooUser and barUser to both show up in the posts collection. This model is convenient when you have several different computations that look at different slices of your data and may be updated dynamically. Note that when you subscribe inside a Deps.autorun(...), Meteor automatically calls stop() on any previous subscription handle with the same name, but if you are using these subscriptions outside of an autorun you will need to stop them yourself. As of right now, you can't have two subscriptions with the same name inside an autorun computation, because Meteor can't tell them apart.
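For example, to manage such subscriptions manually, keep the handles returned by Meteor.subscribe:

var fooSub = Meteor.subscribe("postsByUser", "fooUser");
var barSub = Meteor.subscribe("postsByUser", "barUser");

// ...later, when the data is no longer needed:
fooSub.stop();
barSub.stop();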
Pushing arbitrary data over a publication. You can completely customize publications to not require the same collection names on the server and client. In fact, the server can publish data that isn't backed by a collection at all. To do this, you can use the API for the publish functions.
On server only:
Posts = new Meteor.Collection("posts");

Meteor.publish("newPostsPub", function () {
  var sub = this;
  var subHandle = null;
  subHandle = Posts.find({}, {
    sort: { timestamp: -1 },
    limit: 10
  })
  .observeChanges({
    added: function (id, fields) {
      sub.added("newposts", id, fields);
    },
    changed: function (id, fields) {
      sub.changed("newposts", id, fields);
    },
    removed: function (id) {
      sub.removed("newposts", id);
    }
  });
  sub.ready();
  sub.onStop(function () {
    subHandle.stop();
  });
});
On client only:
NewPosts = new Meteor.Collection("newposts");
Meteor.subscribe("newPostsPub");
This synchronizes the newest 10 posts from the Posts collection on the server (called posts in the database) to the NewPosts collection on the client (called newposts in minimongo) using the publication/subscription called newPostsPub. Note that observeChanges differs from observe, which can do a bunch of other things.
The code seems complicated, but when you return a cursor inside a publish function, this is basically the code that Meteor generates behind the scenes. Writing publications this way gives you a lot more control over what is and isn't sent to the client. Be careful though, as you must manually stop observe handles and mark when the subscription is ready. For more information, see Matt DeBergalis' description of this process (however, that post is out of date). Of course, you can combine this with the other pieces above to get very nuanced and complicated publications.
Sorry for the essay :-) but many people get confused about this and I thought it would be useful to describe all the cases.
You decide the naming conventions, and meteor doesn't care.
Posts becomes a collection of documents from the mongo server. You find posts by calling Posts.find({author: 'jim'}). In the example you wrote, Meteor is being told to internally call that collection 'posts'. Hopefully this is easy to remember if the names are similar...
There needs to be a way to express and track what info is available to clients. Sometimes there may be multiple sets of information, of varying detail. Example: a summary for a title listing, but full detail for a particular document. These are often also named 'posts', so it can be initially confusing:
Meteor.publish "posts", -> # on server
Posts.find()
and then
dbs.subscriptions.posts = Meteor.subscribe 'posts' # on client
Publication and subscription names must match, but it could all be named like this:
PostsDB = new Meteor.Collection('postdocumentsonserver')
so in mongo you'd need to type
db.postdocumentsonserver.find()
but otherwise you never need to care about 'postdocumentsonserver'. Then
Meteor.publish "post_titles", ->
PostsDB.find({},{fields:{name:1}})
matching
dbs.subscriptions.post_titles = Meteor.subscribe 'post_titles'

When does node-mongodb-native hit the database?

I have trouble understanding when exactly the database is hit when using node-mongodb-native. I couldn't find any reference on that. As everything is callback-based, it gives me the feeling that every single call hits the database... For example, are these two snippets any different in terms of how many times the database is hit?
// ---- 1
db.collection('bla', function (err, coll) {
  coll.findOne({ blo: 'bli' }, function (err, doc) {
    coll.count(function (err, count) {
      console.log(doc, count);
    });
  });
});

// ---- 2
db.collection('bla', function (err, coll) {
  coll.findOne({ blo: 'bli' }, function (err, doc) {
    db.collection('bla', function (err, coll) {
      coll.count(function (err, count) {
        console.log(doc, count);
      });
    });
  });
});
I was basically wondering whether I can cache instances of collections and cursors. For example, why not fetch the collections I need only once, at server start, and reuse the same instances indefinitely?
I'd really like to understand how the whole thing works, so I'd appreciate a good link explaining things in detail.
Looking at the source code of the node.js driver for collections, it seems it will not ping MongoDB upon creation of a collection object unless you have strict mode on: https://github.com/mongodb/node-mongodb-native/blob/master/Readme.md#strict-mode
The source code I looked at (https://github.com/mongodb/node-mongodb-native/blob/master/lib/mongodb/db.js#L446) reinforces the idea that if strict mode is not on, the driver just creates a new node.js collection object and runs the callback.
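In other words, something like this (a sketch; strict mode is the option described in the linked readme):

// With strict off (the default), the callback fires without any round trip;
// the driver only builds a local collection object.
db.collection('bla', function (err, coll) { /* no query sent */ });

// With strict on, the driver first verifies that 'bla' exists on the
// server, so this call does hit the database.
db.collection('bla', { strict: true }, function (err, coll) {
  // err is set if the collection does not already exist
});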
However, findOne and count will break the "lazy" behaviour and force the driver to query the database in order to get results.
Note: count on a collection won't compute a "true" count of all items in the collection. Instead it gathers this information from the collection metadata.
So for the first snippet you should see two queries run: one for the findOne and one for the count. The same goes for the second snippet, since fetching the collection again after the findOne should not trigger a query to MongoDB.
After some googling, I found this link about best practices for node-mongodb-native. It is answered by Christian Kvalheim, who seems to be the maintainer of the library. He says:
"You can safely store the collection objects if you wish and reuse them"
So even though the call to collection might hit the database when made in strict mode, the client-side collection instance itself can safely be reused.
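A pattern like this should therefore be safe (a sketch in the same callback style, with error handling kept minimal):

// Fetch collection handles once at startup and reuse them everywhere.
// Outside strict mode this is a local operation anyway, so the cache
// mainly avoids recreating wrapper objects on every request.
var collections = {};

db.collection('bla', function (err, coll) {
  if (err) throw err;
  collections.bla = coll; // safe to store and reuse indefinitely
});

// Later, anywhere in the app: only these calls actually hit the database.
collections.bla.findOne({ blo: 'bli' }, function (err, doc) {
  console.log(doc);
});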