I have a core data model with blog groups, blogs, and posts. A blog group has a to-many relationship to blogs, and each blog has a to-many relationship to posts. A post has an attribute "hasBeenRead." Both the blog and the blog group have an attribute "numberUnreadPosts."
I'd like to know the best practice for propagating the number of unread posts up through each relationship. For instance, if I read a post, I'd like to decrease the number of unread posts by one in both the blog and the blog group. Thanks!
There are a couple of ways to do this.
KVO observer
Your BlogGroup could watch for changes to the Blog entity's -numberUnreadPosts property and update itself when it changes.
Likewise, your Blog can watch for changes to the Post entity's -hasBeenRead property and update itself when it changes, which will in turn propagate up to the BlogGroup.
The problem with this design is that it assumes that the BlogGroup and Blog entities are both in memory (because you would turn the observer on in the -awakeFromFetch method). This may not always be the case and I find it best not to rely on that situation.
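As a minimal sketch of this approach in Swift, assuming NSManagedObject subclasses named Post, Blog and BlogGroup with the attributes from the question (the relationship names blog, group, posts and blogs are assumptions). Note that the observers are started in awakeFromFetch, which is exactly the limitation described above:

import CoreData

final class Post: NSManagedObject {
    @NSManaged var hasBeenRead: Bool
    @NSManaged var blog: Blog?
}

final class Blog: NSManagedObject {
    @NSManaged var numberUnreadPosts: Int32
    @NSManaged var posts: Set<Post>
    @NSManaged var group: BlogGroup?
    private var observations: [NSKeyValueObservation] = []

    override func awakeFromFetch() {
        super.awakeFromFetch()
        // Watch every post's hasBeenRead flag and recount when one changes.
        observations = posts.map { post in
            post.observe(\.hasBeenRead) { [weak self] _, _ in
                guard let self = self else { return }
                self.numberUnreadPosts = Int32(self.posts.filter { !$0.hasBeenRead }.count)
            }
        }
    }
}

final class BlogGroup: NSManagedObject {
    @NSManaged var numberUnreadPosts: Int32
    @NSManaged var blogs: Set<Blog>
    private var observations: [NSKeyValueObservation] = []

    override func awakeFromFetch() {
        super.awakeFromFetch()
        // Watch each blog's unread count and keep the group total in sync.
        observations = blogs.map { blog in
            blog.observe(\.numberUnreadPosts) { [weak self] _, _ in
                guard let self = self else { return }
                self.numberUnreadPosts = self.blogs.reduce(Int32(0)) { $0 + $1.numberUnreadPosts }
            }
        }
    }
}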
Propagate the update
When a Post changes its -hasBeenRead property you can override the setter and have it notify its parent (Blog) about the change. Blog would then update its own unread count and tell the BlogGroup that it has updated.
This design is far more consistent and is unlikely to fail. However, it can have unforeseen consequences because of the ripple: when you change a post, a number of objects get fetched into memory just to be updated.
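A sketch of the propagation route, building on the subclasses above. Since Swift cannot easily override an @NSManaged setter, the example funnels reads through a markAsRead() helper (an assumed name); in Objective-C you would override -setHasBeenRead: instead:

extension Post {
    func markAsRead() {
        guard !hasBeenRead else { return }
        hasBeenRead = true
        blog?.postWasRead()                     // tell the parent Blog
    }
}

extension Blog {
    func postWasRead() {
        numberUnreadPosts -= 1
        group?.unreadCountDidChange(by: -1)     // ripple up to the BlogGroup
    }
}

extension BlogGroup {
    func unreadCountDidChange(by delta: Int32) {
        numberUnreadPosts += delta
    }
}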
Or don't worry about it
A third option is that only the post actually stores the value. You could then provide a convenience method on both the Blog and the BlogGroup that merely counts the unread posts in the objects below.
This is pretty simple to do, but it is not an observable property, so it may not work in your design.
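A sketch of this derived-count option, again using the subclasses above (unreadCount is an assumed name):

extension Blog {
    var unreadCount: Int {
        // Count unread posts on demand instead of storing the number.
        return posts.filter { !$0.hasBeenRead }.count
    }
}

extension BlogGroup {
    var unreadCount: Int {
        return blogs.reduce(0) { $0 + $1.unreadCount }
    }
}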
The design of your application will determine which design works better for you. If you know that the BlogGroup and Blog will always be realized when you are working with a post then option one is a better solution imho.
This sounds like a job for Fetched Properties
Without using fetched properties (which I haven't done - maybe it's the right way), I'd go upwards from Posts. Create a fetch request for the Post entity with a predicate for hasBeenRead == NO. Iterating over the resulting array, use the inverse relationships to identify the Blog owning each Post and the BlogGroup owning each Blog, and build two dictionaries.
The first dictionary is keyed by the owning Blog's unique name: if there is no entry for that key, create and store [NSNumber numberWithInt:1]; otherwise increment what's there. The second dictionary is the same but keyed by the owning BlogGroup's id. The unread count for either a Blog or a BlogGroup is then the value stored for the relevant key.
You would need to do the complete count once when your program starts, and after that follow the relationships and update your counts whenever you change a post from unread to read.
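Roughly, in Swift (this assumes the same subclasses as above, plus a unique name attribute on Blog; the second dictionary is keyed by the group's Core Data objectID as a stand-in for "the BlogGroup's id"):

import CoreData

func unreadCounts(in context: NSManagedObjectContext) throws -> (perBlog: [String: Int], perGroup: [NSManagedObjectID: Int]) {
    // One fetch for every unread post, then walk the inverse relationships.
    let request = NSFetchRequest<Post>(entityName: "Post")
    request.predicate = NSPredicate(format: "hasBeenRead == NO")

    var perBlog: [String: Int] = [:]
    var perGroup: [NSManagedObjectID: Int] = [:]
    for post in try context.fetch(request) {
        guard let blog = post.blog else { continue }
        perBlog[blog.name, default: 0] += 1
        if let group = blog.group {
            perGroup[group.objectID, default: 0] += 1
        }
    }
    return (perBlog, perGroup)
}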
I'm creating a discussion system using Parse.com
In my [simplified] system, there are Posts, Categorys, and Comments.
As you probably imagined, Posts can belong to one or more Categorys and can have multiple Comments.
However, often users will want to see all the Posts in a Category. If I set up my database like this
Post (name, content, categories)
Category(name)
I am worried that querying for all the Posts in a Category will be very inefficient (since it will have to check the categories field of every Post).
However, if I design the database like
Post (name, content)
Category(name, posts)
it will be inefficient for me to query which Categorys a Post belongs to, since it will have to search all the posts arrays in all the Categorys.
I'm sure this must be a common Database design dilemma but I am still new at this. What is the best way to approach and solve this problem?
What you're looking for is a bi-directional, many-to-many relationship between Post and Category. With Parse, there are at least three approaches you can take.
You can add a column as a PFRelation to the Post table. You can ask a Post for its categories relation, create a query from that and run it. Conversely, if you have a Category you can create a Post query with a where clause on the categories key (a sketch follows below). PFRelations are good if you will have big collections.
If you think better in terms of a relational model, just create a "join" table called CategoryPosts. It would have two pointer columns, one for the Post and another for the Category. This is also very efficient.
Lastly, you could add an array column to either class. Since all of the results are loaded at once, this works best for smaller collections.
These options are described in a little more detail in the Parse Relations Documentation.
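A sketch of the PFRelation option using the Parse iOS SDK in Swift (class and column names are taken from the question; in real code the related objects need to be saved before being added to a relation):

import Parse

let category = PFObject(className: "Category")
category["name"] = "General"

let post = PFObject(className: "Post")
post["name"] = "Hello world"
// Assumes `category` already exists on the server (relation targets need an objectId).
post.relation(forKey: "categories").add(category)
post.saveInBackground()

// All Categorys of a Post: query the relation itself.
post.relation(forKey: "categories").query().findObjectsInBackground { categories, error in
    print(categories ?? [], error as Any)
}

// All Posts in a Category: constrain the relation column on Post.
let postsInCategory = PFQuery(className: "Post")
postsInCategory.whereKey("categories", equalTo: category)
postsInCategory.findObjectsInBackground { posts, error in
    print(posts ?? [], error as Any)
}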
So I'm working with a database that has multiple collections, and some of the data overlaps between collections. In particular, I have a collection called app-launches which contains a field called userId, and one called users where the _id of a particular object is actually the same as the userId in app-launches. Is it possible to group the two collections together so I can analyze the data? Or maybe match the userId in app-launches with the _id in users?
There is no definite answer to your question, Jeffrey, and none of the experts here can tell you to choose one technique over another just from this information.
After going through various web pages on the internet and the Mongo documentation, and getting to know the design patterns used in Mongo over a period of time, how I would design it depends on a few things, which I'll try to explain here in short.
1. If you have a one-to-one relation, always prefer embedding over linking, e.g. a User and its address (assuming the user has only one address). You get atomicity (without worrying about transactions) and can easily fetch the record without the back-and-forth needed to bring in the other information, as in the case of linking (like a DBRef).
2. If you have a one-to-many relation, consider whether you can do it with embedding (prefer this, for the benefits explained in point 1). Embedding helps if you always want the information together, e.g. Post/Comments where the requirement is to get a post and all of its comments by postId, say. But think of a situation where you need to get all the comments (and their related posts) that contain some specific tags; in that case you should prefer linking, because if you go the embedding route you would end up fetching a post's entire collection of comments and then filtering out the desired ones. Both shapes are sketched just after this list.
3. For many-to-many relations, I would prefer two separate entities plus another collection linking them, e.g. Product-Category.
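To make the Post/Comments example from point 2 concrete, here are the two document shapes sketched as Swift Codable value types (field names are illustrative; the same structures apply to the raw BSON documents):

import Foundation

// Embedded: the post document carries its comments; one read returns everything.
struct EmbeddedPost: Codable {
    let _id: String
    let title: String
    let comments: [EmbeddedComment]
}
struct EmbeddedComment: Codable {
    let text: String
    let tags: [String]
}

// Linked: comments live in their own collection and point back at the post,
// so "all comments with tag X" is a single query on the comments collection.
struct LinkedPost: Codable {
    let _id: String
    let title: String
}
struct LinkedComment: Codable {
    let _id: String
    let postId: String        // reference back to the owning post
    let text: String
    let tags: [String]
}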
I have a quite common use case - a list of comments. Each comment has an author.
I'm storing a reference from each comment to its author, since an author can make multiple comments.
Now I'm working with ReactiveMongo and want to keep the database access asynchronous, but in this case I don't know how. I access the database asynchronously to get the comments, but then for each comment I have to get the author, and so far the only way I know is to loop through the comments and get each user synchronously:
val userOption: Option[JsObject] = Await.result(usersCollection.find(Json.obj("id" -> userId)).one[JsObject], timeout)
//...
Other than that, I could:
Get each user asynchronously, but then I have to introduce some machinery to wait until all users are fetched before returning the response, and my code is likely to become a mess.
Store the complete user object - or at least what I need for the comment (picture, name and such) - in each comment. This redundancy could become troublesome to manage, since each time a user changes something (relevant to the data stored in the comments) I would have to go through all of that user's comments in the database and modify them.
What is the correct pattern to apply here?
I tackled this exact problem a while ago.
There are no joins in mongo.
You have to manually take care of the join.
Your options are:
Loop through each comment entry and query mongo for the user. This is what you're doing.
Get all user ids from the comments, query mongo for the users matching those ids, then take care of matching each user to its comment. This is just what you did, but a little more optimized.
Embed the user in comments or comments in users. I wouldn't recommend this; it is probably not the right fit for comments/users.
Think about what set of data you need from the user when displaying a comment, and embed just that info in the comment.
I ended up going with the last option.
We embedded the user id, first and last name in each comment.
This info is unlikely to change (possibly not even allowed to change after creation?).
If it can change then it is not too hard to tailor the update-user method to update the related comments with the new info (we did that too).
So now no join is needed.
The repository in the CommonDomain only exposes "GetById()". So what do I do if my Handler needs a list of Customers, for example?
Taking your question at face value: if you need to perform operations on multiple aggregates, you would just provide the ID of each aggregate in your command (which the client would obtain from the query side), then get each aggregate from the repository.
However, looking at one of your comments in response to another answer I see what you are actually referring to is set based validation.
This very question has raised quite a lot of debate about how to do this, and Greg Young has written a blog post on it.
The classic question is: how do I check that the username hasn't already been used when processing my CreateUserCommand? I believe the suggested approach is to assume that the client has already done this check by asking the query side before issuing the command. When the user aggregate is created, the UserCreatedEvent will be raised and handled by the query side. There, the insert will fail (either because of a check or a unique constraint in the DB), and a compensating command would be issued, which would delete the newly created aggregate and perhaps email the user telling them the username is already taken.
The main point is, you assume that the client has done the check. I know this approach is difficult to grasp at first - but it's the nature of eventual consistency.
Also you might want to read this other question which is similar, and contains some wise words from Udi Dahan.
In the classic event sourcing model, queries like get all customers would be carried out by a separate query handler which listens to all events in the domain and builds a query model to satisfy the relevant questions.
If you need to query customers by last name, for instance, you could listen to all customer created and customer name change events and just update one table of last-name to customer-id pairs. You could hold other information relevant to the UI that is showing the data, or you could simply hold IDs and go to the repository for the relevant customers in order to work further with them.
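A sketch of such a projection in Swift; the event and type names here are illustrative assumptions, not taken from any particular framework:

import Foundation

struct CustomerCreated { let id: UUID; let lastName: String }
struct CustomerNameChanged { let id: UUID; let newLastName: String }

final class CustomerLastNameView {
    // last name -> customer ids; in practice this would be a table in the read store.
    private var index: [String: Set<UUID>] = [:]
    private var lastNames: [UUID: String] = [:]

    func handle(_ event: CustomerCreated) {
        lastNames[event.id] = event.lastName
        index[event.lastName, default: []].insert(event.id)
    }

    func handle(_ event: CustomerNameChanged) {
        if let old = lastNames[event.id] {
            index[old]?.remove(event.id)
        }
        lastNames[event.id] = event.newLastName
        index[event.newLastName, default: []].insert(event.id)
    }

    func customers(withLastName name: String) -> Set<UUID> {
        return index[name] ?? []
    }
}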
You don't need a list of customers in your handler. Each aggregate MUST be processed in its own transaction. If you want to show this list to the user - just build an appropriate view.
Your command needs to contain the id of the aggregate root it should operate on.
This id will be looked up by the client sending the command, using a view in your read model. This view will be populated with data from the events that your AR emits.
I have a form where users create Person records. Each Person can have several attributes -- height, weight, etc. But they can also have lists of associated data such as interests, favorite movies, etc.
I have a single form where all this data is collected. To me it seems like I should POST all of this data in a single request. But is that RESTful? My reading suggests that the interests, favorite movies and other lists should be added in separate POST requests. But I don't think that makes sense because one of those could fail and then there would be a partial insert of the Person and it may be missing their interests or favorite movies.
I'd say that it depends entirely upon the addressability and uniqueness of the dependent data.
If your user-associated data is dependent upon the user (i.e., a "distinct" string, e.g. an attribute such as a string representing an (unvalidated) name of a movie), then it should be included in the POST creation of the user representation; however, if the data is independent of the user (where the data can be addressed independently of the user, e.g. a reference, such as a movie from a set of movies) then it should be added independently.
The reasoning behind this is that reference addition when bundled with the original POST implies transactionality; that is, if another user deletes the movie reference for the "favorite" movie between when it is chosen on the client and when the POST goes through, the user add will (should by that design) fail, whereas if the "favorite" movie is not associative but is just an attribute, there's nothing to fail on (attributes (presumably) cannot be invalidated by a third party).
And again, this goes very much to your specific needs, but I fall on the side of allowing the partial inserts and indicating the failures. The proper way to handle this sort of thing if you really want to not allow partial inserts is to just implement transactions on the back end; they're the only way to truly handle a situation where a critical associated resource is removed mid-process.
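As a sketch of that split (the field names, /people and /movies URIs are hypothetical): dependent data rides along in the Person POST, while independently addressable resources would be linked separately.

import Foundation

struct NewPerson: Codable {
    let name: String
    let heightCm: Int
    let interests: [String]            // dependent data: plain attributes, created with the person
    let favoriteMovieTitles: [String]  // still dependent if these are just unvalidated strings
}

let payload = NewPerson(name: "Ada",
                        heightCm: 170,
                        interests: ["chess", "cycling"],
                        favoriteMovieTitles: ["Metropolis"])
let body = try? JSONEncoder().encode(payload)   // body of POST /people

// If movies are addressable resources of their own (e.g. /movies/42), the client would
// instead link them after creation, e.g. POST /people/{id}/favorite-movies with the
// movie's URI, and handle that failure independently of the person creation.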
The real restriction in REST is that for a modifiable resource that you GET, you can also turn around and PUT the same representation back to change its state. Or POST. Since it's reasonable (and very common) to GET resources that are big bundles of other things, it's perfectly reasonable to PUT big bundles of things, too.
Think of resources in REST very broadly. They can map one-to-one with database rows, but they don't have to. An addressable resource can embed other addressable resources, or include links to them. As long as you're honoring your representation and the semantics of the underlying protocol's operations (i.e. HTTP GET POST PUT etc.), REST doesn't have anything to say about other design considerations that might make your life easier or harder.
I don't think there is a problem with adding all the data in one request, as long as it's inherently associated with the main resource (i.e. the person in your case). If interests, favorite movies, etc. are resources of their own, they should also be handled as such.