I'm using Firestore to build a social app modeled after Facebook. I see lots of posts that resemble Twitter with the followed/followers approach, but that doesn't feel like a great fit for an app that uses friends.
My data model would have a top-level collection ("posts") and each post would have the poster's user ID ("UserID"). To build the timeline, I'm thinking of simply using the new-ish 'in' query:
var timelineQuery = postsRef
    .where('UserID', whereIn: [user_123, user_456])
    .orderBy('timestamp');
'in' queries are limited to an array size of 10, but my app is pretty specialized, so my average user will likely only have a couple of friends even if I get 1M+ users. For the rare user who has 40 or 50 friends, I could run this query 4 or 5 times, putting the more active friends in the first query and updating the timeline as subsequent results come in.
Is this a reasonable approach? I don't see any examples using it, so I'm guessing I'm missing something.
Yup, this sounds like a reasonable approach.
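A minimal sketch of the chunking described in the question, written in Python rather than Dart so it runs on its own. The in-memory list and the `query_posts` helper are hypothetical stand-ins for the "posts" collection and a single 'in' query; the limit of 10 is taken from the question.

```python
import heapq
from itertools import islice

IN_LIMIT = 10  # per-query cap on 'in' values, as stated in the question


def chunked(ids, size=IN_LIMIT):
    """Yield successive lists of at most `size` IDs."""
    it = iter(ids)
    while batch := list(islice(it, size)):
        yield batch


def query_posts(posts, user_ids):
    """Hypothetical stand-in for one 'in' query, ordered by timestamp."""
    wanted = set(user_ids)
    matches = (p for p in posts if p["UserID"] in wanted)
    return sorted(matches, key=lambda p: p["timestamp"])


def build_timeline(posts, friend_ids):
    """Run one query per chunk, then merge the already-sorted results."""
    results = [query_posts(posts, batch) for batch in chunked(friend_ids)]
    return list(heapq.merge(*results, key=lambda p: p["timestamp"]))


# Usage: 25 friends need 3 queries (10 + 10 + 5).
friends = [f"user_{i}" for i in range(25)]
posts = [{"UserID": f"user_{i}", "timestamp": 100 - i} for i in range(30)]
timeline = build_timeline(posts, friends)
```

In a real client the per-chunk queries would run concurrently and the timeline would be re-merged as each result arrives, which is what the question proposes.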
The common alternative is to invert the data model by building an explicit "wall" for each user in the database. In that scenario you'd find all of a user's followers when they post, and then write the new post to the "wall" of each follower. This makes writes more complex and slower, but makes reads a lot simpler and very scalable.
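To make the inverted model concrete, here is a rough sketch of fan-out on write using plain in-memory dictionaries. The `WallStore` class and its method names are made up for illustration; in Firestore each wall would be its own collection or subcollection.

```python
from collections import defaultdict


class WallStore:
    """Toy in-memory model of per-user walls (hypothetical, not a Firestore API)."""

    def __init__(self):
        self.friends = defaultdict(set)  # user ID -> IDs of that user's friends
        self.walls = defaultdict(list)   # user ID -> posts copied onto their wall

    def add_friendship(self, a, b):
        self.friends[a].add(b)
        self.friends[b].add(a)

    def publish(self, author, text, timestamp):
        """Write path: one extra write per friend of the author."""
        post = {"UserID": author, "text": text, "timestamp": timestamp}
        for friend in self.friends[author]:
            self.walls[friend].append(post)
        return post

    def timeline(self, user):
        """Read path: a single lookup on the user's own wall, newest first."""
        return sorted(self.walls[user], key=lambda p: p["timestamp"], reverse=True)


store = WallStore()
store.add_friendship("alice", "bob")
store.add_friendship("alice", "carol")
store.publish("alice", "hello", timestamp=1)
store.publish("bob", "hi back", timestamp=2)
```

The cost moves from the read (several 'in' queries) to the write (one copy per friend), which is the trade-off described above.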
I am currently watching a tutorial on how to create an Instagram clone in Swift and want to understand the data model for the comments.
What is the purpose of using a model for the comments like:
post-comment (key = post-id) and comments
over something like this, where every comment has the post-id in it?
Without knowing what exactly they're building, and the types of queries they need to support for the app, one can only guess that this post-comments collection satisfies the need for a query to find out which comments are a part of which posts, while still allowing queries that search all posts or all comments. You should find the part of the tutorial that queries this collection to find out what it's trying to do.
This tutorial might be kind of old, because this sort of thing would be a little bit easier to express today using collection group queries.
I am trying to shift towards a serverless architecture for building REST APIs. I come from a Ruby on Rails background.
I have successfully understood and adopted services such as API Gateway, Cognito, RDS and Lambda functions; however, I am struggling to put it all together in an optimal way.
My case is the following. I have a simple user-based platform where there are multiple resources related to application members, say a blog application.
I have used Cognito for authentication and Aurora as the database service for keeping things like articles and likes.
Since the database and Cognito user pool are decoupled, it is hard for me to do things like:
Fetching users that liked particular article
Fetching users comments
It seems problematic to me because I need to pass some unique Cognito user identifier (retrieved during the authorization phase in API Gateway) to the Lambda function, which will then save the database record with an external reference to this user. On the other hand, if I want to fetch particular users, I must first fetch their identifiers from my relational database and then request the user details from the Cognito user pool. I lack a standard way of accessing the current user in my Lambda functions, as well as a mechanism for easily associating a database record with that user.
I have not found any convincing recommended patterns for designing such applications, even though it seems like a very common problem, and I am having a hard time judging whether my approach is correct.
I would appreciate some comments on which patterns to consider when designing a simple user-based platform and on the pitfalls of my solution. Any articles and examples would also be very helpful.
Thanks in advance.
These sound like standard problems associated with distributed, independent databases. You can no longer delegate all relationships to the database and get a result aggregating them in some way. You have to do the work yourself by calling one database, then the other.
For a case like this:
Fetching users that liked particular article
You would look up the "likes" database to determine user IDs of those who liked it, then look up the "users" database to determine user details such as name and avatar.
Most patterns follow standard database advice, e.g. in the above example, you could follow the performance-oriented pattern of de-normalising - store user data such as name and avatar against each "like", as long as you feel the extra storage and burden of keeping it consistent is justified by the reduction in queries (probably too many Likes to justify this).
Another important practice is using bulk queries to avoid N+1 queries. This is what Rails does with the includes syntax, but you may have to do it yourself here. In my example, it should only take two queries because the second query should get all required user data in one go, by querying for users matching the list of user IDs.
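As a rough sketch of the two-query pattern: the first query reads user IDs from the relational "likes" table, the second resolves all of them in one batch. Here SQLite stands in for Aurora and a dictionary stands in for the user store; the table layout, `USER_POOL`, and `fetch_users_by_ids` are all assumptions for illustration, and whether your user store offers an equivalent batch lookup is something you'd need to check.

```python
import sqlite3

# Hypothetical stand-in for the user store: user ID -> profile attributes.
USER_POOL = {
    "u1": {"name": "Ann", "avatar": "ann.png"},
    "u2": {"name": "Bob", "avatar": "bob.png"},
    "u3": {"name": "Cy", "avatar": "cy.png"},
}


def fetch_users_by_ids(user_ids):
    """One bulk lookup for all IDs, instead of one lookup per ID (N+1)."""
    return {uid: USER_POOL[uid] for uid in user_ids if uid in USER_POOL}


def users_who_liked(conn, article_id):
    # Query 1: the relational database only knows the user IDs.
    rows = conn.execute(
        "SELECT user_id FROM likes WHERE article_id = ? ORDER BY user_id",
        (article_id,),
    ).fetchall()
    user_ids = [row[0] for row in rows]
    # Query 2: resolve every ID in a single batch.
    profiles = fetch_users_by_ids(user_ids)
    return [{"id": uid, **profiles[uid]} for uid in user_ids if uid in profiles]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE likes (article_id INTEGER, user_id TEXT)")
conn.executemany(
    "INSERT INTO likes VALUES (?, ?)", [(1, "u1"), (1, "u3"), (2, "u2")]
)
result = users_who_liked(conn, 1)
```

Application code would call only `users_who_liked`, which keeps the two-store detail inside the data layer.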
Finally, I'd suggest you try to abstract things. This kind of code gets messy fast, so be sure to build a well-encapsulated data layer that isolates application code from dealing with the mess of multiple databases.
I am working on an application which will have users who create posts, and other users can like or comment on any post.
I am trying to figure out the best way to design the DB tables for this. I have read the Anypic tutorial on the parse.com site. They keep all comments and likes in a table called "Activity", which makes sense: you can query any type of activity (like/comment) from a separate table without having to touch the "posts" table.
My question is: in this scenario, how do I fetch all posts that the current user created, along with the likes and comments on each of those posts?
The Anypic app by Parse makes a separate request to fetch the number of likes on each post, which I think is not ideal. I am new to NoSQL data stores, so if someone could help me out with suggestions on how to structure the data, that would be great.
Also, how bad is it to store all likes/comments as an array in the post itself? I think this won't scale but I might be wrong.
Thanks
In terms of Parse, I would use an afterSave Cloud Function to update the Post anytime a like/comment is added.
Have a look at the documentation here. In the simplest case, you just create an afterSave for the Activity class that increments the like/comment count.
In a more advanced scenario you might want to handle update/delete too. E.g. if someone can change their 'like' to 'not like' you would need to look at the before/after value and increase/decrease the counter as needed.
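The counter logic itself is small. Below is a sketch of it in Python, not actual Parse Cloud Code (which is JavaScript); the `type`, `post_id` and `like_count` field names and the `after_save` function are hypothetical. `before` is `None` for a newly created activity and `after` is `None` for a deleted one.

```python
def count_delta(before, after):
    """Change to a post's like count caused by one Activity write."""
    was_like = before is not None and before.get("type") == "like"
    is_like = after is not None and after.get("type") == "like"
    return int(is_like) - int(was_like)


def after_save(posts, before, after):
    """Apply the delta to the denormalized counter stored on the post."""
    activity = after or before
    delta = count_delta(before, after)
    if delta:
        posts[activity["post_id"]]["like_count"] += delta


posts = {"p1": {"like_count": 0}}
like = {"post_id": "p1", "type": "like"}
dislike = {"post_id": "p1", "type": "dislike"}

after_save(posts, None, like)       # new like: count goes up
created = posts["p1"]["like_count"]
after_save(posts, like, dislike)    # like changed to dislike: count goes down
changed = posts["p1"]["like_count"]
after_save(posts, dislike, None)    # a non-like is deleted: count unchanged
deleted = posts["p1"]["like_count"]
```

A comment counter would follow the same shape with a different `type` check.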
I am a fan of storing extra 'redundant' data, and no-sql/document-db systems work well for that. This is based on the idea that writes are done infrequently compared to the number of reads, so doing some extra work during/after the write has less impact and makes the app work more smoothly.
I have been investigating a graph database and I have found neo4j and although this seems ideal I have also come across Mongodb.
Mongodb is not an official graph database but I wondered if it could be used for my scenario.
I am writing an application where users can have friends and those friends can have friends etc, the typical social part of a social network.
I was wondering in my situation whether Mongodb might suffice. How easy would it be to implement or do I really need to focus on REAL graph databases?
I do notice foursquare are using Mongodb so I presume it supports their infrastructure.
But how easy would it be to find all friends of my friends that also have friends in common, for example?
Although it wouldn't be impossible, MongoDB would not be a good fit for this scenario.
The reason is that MongoDB does not do JOINs. When you need a query which spans multiple documents, you need a separate query for each document.
In your example, each user document would have an array with the _id's of their friends. To find "all friends of the friends of UserA who are also friends of UserB" would mean that you would:
find userA and get his friends-array
find all users in that array and get their friend-arrays
find all users in these arrays who have UserB in their friends-array
These are three queries you have to perform. Between each of these queries, the result set has to be sent to the application, the application has to formulate a new query and send it back to the database. The result-set returned from the 2nd query can be quite large, which means that the 3rd query could take a while.
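The three round trips can be sketched like this, with a dictionary standing in for the users collection. The `find_users` helper and the sample data are made up; each call to it represents one query sent to the database.

```python
# Hypothetical users collection: _id -> document with a friends array.
USERS = {
    "A": {"friends": ["C", "D"]},
    "B": {"friends": ["E", "F"]},
    "C": {"friends": ["A", "E", "G"]},
    "D": {"friends": ["A", "F", "G"]},
    "E": {"friends": ["B", "C"]},
    "F": {"friends": ["B", "D"]},
    "G": {"friends": ["C", "D"]},
}

round_trips = 0


def find_users(ids):
    """Stand-in for one query fetching user documents by _id."""
    global round_trips
    round_trips += 1
    return {uid: USERS[uid] for uid in ids if uid in USERS}


def friends_of_friends_who_know(user_a, user_b):
    # Query 1: find userA and get the friends array.
    a = find_users([user_a])[user_a]
    # Query 2: find all users in that array and collect their friends arrays.
    friends = find_users(a["friends"])
    candidate_ids = {fid for doc in friends.values() for fid in doc["friends"]}
    candidate_ids.discard(user_a)
    # Query 3: of those candidates, keep the ones who list userB as a friend.
    candidates = find_users(candidate_ids)
    return sorted(uid for uid, doc in candidates.items() if user_b in doc["friends"])


result = friends_of_friends_who_know("A", "B")  # ["E", "F"]
```

Note that the candidate set built after the second query is the part that can grow large, since it is the union of every friend's friends array.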
tl;dr: Use the right tool for the job. When your data is graph-based and you want to do graph-based queries on it, use a graph database.
You likely want an actual graph database as opposed to MongoDB. Try using the TinkerPop graph technology stack to get started. Using Blueprints (which is like JDBC for graphs) you can see the performance of MongoDB as a graph (using the Blueprints MongoDB implementation) versus Neo4j, Titan, or any number of other graph implementations.
I'm trying to model a simple, experimental app as I learn Symfony and Doctrine.
My data model requires some flexibility, so I'm currently looking into the possibility of using either an EAV model or a document store in MongoDB.
Here's my basic requirements:
Users will be able to store and share their favourite things (TV prog, website, song etc).
The list of possible 'things' a user can store is unknown. For example, a user may want to store their favourite animal.
Users can share their favourite things with other users. However, a user can decide what he / she shares with each other user. For example, a user may share their favourite movie with one user, but not another.
A typical user will log in and view all the favourite things from their list of friends, depending on what those friends have decided to share. The user will also update their own favourite things, and those changes will be reflected when other users view that user's profile. Finally, the user may change which friends can see which of their favourite things.
I've worked a lot with Magento, which uses the EAV model extensively. However, I'm adding another layer of complexity by restricting which users can see what information.
I'm instantly drawn to MongoDB as the schemaless format gives me the flexibility I require. However, I'm not sure how easy (or efficient) it will be to access the data once it's saved. I'm also concerned about how changes to the data will be managed, e.g. a user changes their favourite film.
I'm hoping someone can point me in the right direction. This is purely a demo app I'm building to further my knowledge, but I'm treating it like a real-world app where data access times are super-important.
Modelling this kind of app in a traditional relational DB makes me sweat when I think about the crazy number of joins I'd need to get the data for one user.
Thanks for reading this far, and please let me know if I can provide any more information.
Regards,
Fish
You need to choose a model based on how you need to access the data.
If you just need to filter out some values when viewing the user profile, a single document for each user would work quite well, with each favorite within that having a list of authorized user/group IDs that is applied in the application code. Both read and write are single operations on a known document in this case, so will be fast.
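For illustration, a single-document-per-user layout with the filter applied in application code might look like the sketch below. The field names (`favorites`, `kind`, `value`, `shared_with`) are assumptions, not anything prescribed by MongoDB.

```python
# One document per user; each favorite carries its own list of allowed viewers.
profile = {
    "_id": "fish",
    "favorites": [
        {"kind": "movie", "value": "Alien", "shared_with": ["ann", "bob"]},
        {"kind": "animal", "value": "otter", "shared_with": ["ann"]},
        {"kind": "song", "value": "Heroes", "shared_with": []},
    ],
}


def visible_favorites(profile, viewer_id):
    """Filter one already-loaded profile document down to what the viewer may see."""
    is_owner = viewer_id == profile["_id"]
    return {
        fav["kind"]: fav["value"]
        for fav in profile["favorites"]
        if is_owner or viewer_id in fav["shared_with"]
    }


for_ann = visible_favorites(profile, "ann")
for_bob = visible_favorites(profile, "bob")
```

Changing a favorite or its sharing list is then a single update to one known document.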
If you need views across multiple profiles though, your main document should probably be the favorite. You'll need to set up the right indexes, but performance shouldn't be a problem.
Actually, the permissions you describe don't add that much complexity to an EAV schema - as long as attributes can have multiple values the permissions list is just one more attribute.
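A small sketch of that idea, using SQLite as a stand-in relational store: the permission list is stored as repeated `shared_with` rows on the same entity. The single `eav` table and the attribute names are assumptions chosen for the example, not Magento's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (entity_id INTEGER, attribute TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [
        # Entity 1: a favorite movie shared with two users.
        (1, "owner", "fish"), (1, "kind", "movie"), (1, "value", "Alien"),
        (1, "shared_with", "ann"), (1, "shared_with", "bob"),
        # Entity 2: a favorite animal shared with one user.
        (2, "owner", "fish"), (2, "kind", "animal"), (2, "value", "otter"),
        (2, "shared_with", "ann"),
    ],
)


def favorites_shared_with(conn, owner, viewer):
    """Return (kind, value) pairs of the owner's favorites the viewer may see."""
    sql = """
        SELECT k.value, v.value
        FROM eav AS o
        JOIN eav AS s ON s.entity_id = o.entity_id
                     AND s.attribute = 'shared_with' AND s.value = ?
        JOIN eav AS k ON k.entity_id = o.entity_id AND k.attribute = 'kind'
        JOIN eav AS v ON v.entity_id = o.entity_id AND v.attribute = 'value'
        WHERE o.attribute = 'owner' AND o.value = ?
        ORDER BY k.value
    """
    return conn.execute(sql, (viewer, owner)).fetchall()


for_ann = favorites_shared_with(conn, "fish", "ann")
for_bob = favorites_shared_with(conn, "fish", "bob")
```

The permission check costs one more self-join, on top of the one join per attribute that any EAV read already needs.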