How to structure chat messages in a Supabase database? - postgresql

There are many questions about how to structure chat in Firebase, but I can't figure out how to structure it in Supabase.
As I see it, the options are:
case 1:
Store every message of every chat room in one table. In this case, billions of message rows will be inserted.
case 2:
Create a table per chat room, named after the unique id of the room shared by its two users. In this case, many tables will be created in the project.
case 3:
A suggestion of some other structure.
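For reference, here is roughly what case 1 could look like with supabase-js; the table name, columns, and client setup are just what I have in mind, not anything Supabase prescribes:

import { createClient } from '@supabase/supabase-js'

const supabase = createClient('https://your-project.supabase.co', 'public-anon-key')

// Case 1: one `messages` table; every row carries the id of its chat room.
// Assumed columns: id (pk), chat_room_id, sender_id, content, created_at.
async function sendMessage(chatRoomId: string, senderId: string, content: string) {
  const { error } = await supabase
    .from('messages')
    .insert({ chat_room_id: chatRoomId, sender_id: senderId, content })
  if (error) throw error
}

// Load the latest 50 messages of one room; an index on (chat_room_id, created_at)
// is what keeps this lookup fast as the table grows.
async function loadMessages(chatRoomId: string) {
  const { data, error } = await supabase
    .from('messages')
    .select('*')
    .eq('chat_room_id', chatRoomId)
    .order('created_at', { ascending: false })
    .limit(50)
  if (error) throw error
  return data
}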

Related

List all xmpp chat rooms with their subject names and last message

From the XEP-0030 doc, I know I can list all rooms for the user, but is it possible to filter chat rooms by some metadata?
Scenario: as a user, I want to open a tab with 'apples' and see all chat rooms with some id=apple (some metadata).
As a user, I want to switch to the banana tab and see chat rooms with id=banana.
Is it possible? Maybe some other way? It has to be a chat for many people.
Apparently that isn't explicitly possible. If you develop the client, there are some somewhat dirty alternatives:
A) Use the room name to set the tags.
For example, if the user wants to name the room "Last Marvel movies" with tags "film" and "comic", your client could create the room with name Last_Marvel_movies-film-comic.
Later, in the "search rooms by topic" feature, your client gets the list of rooms, searches for -whatever in each room name, and removes the -* suffix when showing the room names.
B) Use the room description to set the tags.
Each room can have a "description". You can put whatever text you want there.
The problem is: your client would need to get the list of rooms (as in case A), then ask each room for its description, and then filter by topic.
All this assumes ejabberd; I have no idea whether other servers with a MUC service offer other methods.
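As a sketch of option A, the client-side filtering could look something like this (the naming convention and function are invented for illustration):

// Rooms are named "<display_name>-tag1-tag2", e.g. "Last_Marvel_movies-film-comic".
function roomsForTag(roomNames: string[], tag: string): string[] {
  return roomNames
    .filter((name) => name.split('-').slice(1).includes(tag))   // keep rooms tagged with `tag`
    .map((name) => name.split('-')[0].replace(/_/g, ' '))       // strip the "-*" part for display
}

// roomsForTag(['Last_Marvel_movies-film-comic', 'Fruit_chat-banana'], 'film')
// => ['Last Marvel movies']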

PostgreSQL Array Contains vs JOIN (performance)

I have a model in which a person can receive a gift for attending one event or receive multiple gifts for attending multiple events. The gift to person or multiple gifts to person is considered one transaction in both cases. I'm using PostgreSQL to implement this model.
For example,
if you attend a certain event, you will receive a gift (a single transaction of gift to person).
And another example: you attend a set of events, therefore you receive a set of gifts for these events (in a single transaction of gifts to person).
So, in the majority of cases, only one gift to one person will be transacted, but there will be a few cases of the second example.
To handle these cases, I have two options:
the first one is to use a Postgres array and query with the array "contains" operator,
and the second one is to create a new transaction_events table and join to it to query by event.
I want to know which option is more performant and which option the community recommends, taking into account that most transactions will contain only one event and also that I cannot change the transactions model.
The second option will perform better, and it has the added benefit that you can have foreign key constraints to enforce data integrity.
In most cases, it is a good idea to avoid composite types like arrays or JSON in the database.
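For what it's worth, a rough sketch of the second option with node-postgres; the transactions and events table and column names are assumptions about your model:

import { Pool } from 'pg'

const pool = new Pool() // connection settings come from the usual PG* environment variables

// Junction table: one row per (transaction, event) pair, with foreign keys
// guaranteeing that both sides exist.
async function createJunctionTable() {
  await pool.query(`
    CREATE TABLE IF NOT EXISTS transaction_events (
      transaction_id bigint NOT NULL REFERENCES transactions (id),
      event_id       bigint NOT NULL REFERENCES events (id),
      PRIMARY KEY (transaction_id, event_id)
    )`)
  // The primary key index leads with transaction_id, so add one for lookups by event.
  await pool.query(`
    CREATE INDEX IF NOT EXISTS transaction_events_event_id_idx
      ON transaction_events (event_id)`)
}

// "Which transactions involved this event?" is then a plain indexed join.
async function transactionsForEvent(eventId: number) {
  const { rows } = await pool.query(
    `SELECT t.*
       FROM transactions t
       JOIN transaction_events te ON te.transaction_id = t.id
      WHERE te.event_id = $1`,
    [eventId],
  )
  return rows
}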

How to get a subcollection after collectionGroup()?

I'm building a chat app on iOS using Firestore. I can't figure out how to get a subcollection after doing a collectionGroup() query.
The database structure is
users (collection)
  some_user_info
  conversations (collection)
    some_conversation_info
    messages (collection)
If A and B have a conversation and A sends a message to B, what I did was create a conversation with the same id for both A and B, but store the message only under A (whoever sends it, owns it).
So when fetching all messages between A and B, I have to do
db.collectionGroup("conversations").whereField("id", isEqualTo: conversationId)
But it seems there is no way to fetch the messages collection after a group query on conversations. Is there any workaround?
Thank you!
Since a collection group, by definition, includes an unlimited number of individual collections of the same name, it doesn't make sense to simply address another single subcollection under that. If you want to query a specific messages subcollection under a specific document that comes back from a collection group query, you can certainly build that in your code, but you will need that document's DocumentReference to do so.
Or maybe you want to do a collection group query on messages instead; you can do that as long as you're able to filter it with fields on the documents you're looking for.
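For that second approach, here is a sketch with the Firebase JS/TypeScript SDK (not the Swift one from the question); the conversationId and createdAt fields on each message document are assumptions, fields you would have to store yourself:

import { initializeApp } from 'firebase/app'
import { getFirestore, collectionGroup, query, where, orderBy, getDocs } from 'firebase/firestore'

const app = initializeApp({ /* your Firebase config */ })
const db = getFirestore(app)

// Collection group query over every `messages` subcollection at once,
// narrowed to one conversation by a field stored on each message document.
async function messagesForConversation(conversationId: string) {
  const q = query(
    collectionGroup(db, 'messages'),
    where('conversationId', '==', conversationId),
    orderBy('createdAt'),
  )
  const snapshot = await getDocs(q)
  return snapshot.docs.map((doc) => doc.data())
}

Firestore will typically prompt you, with a link in the error message, to create the required collection-group index the first time such a query runs.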

Database modeling with mean.js and mongoose

I am a classical developer who normally builds relational DBs for my web applications.
I want to learn the new way and build an application with mean.js and MongoDB. I used the yo generator from meanjs.org to get started.
When I model my data, I always fall back into classic relational modeling, and I think this is not what the "new way" of app building is all about.
So my question is: what is the best practice for modeling my sample data model?
My learning sample is an app in which you have a specific given list of music albums (like the best 50 jazz albums of all time) and the user checks in and rates the music.
I have a CRUD module for adding and editing the albums the user should listen to. This results in an ordered list of albums.
I have a CRUD module for users, generated by the yo generator.
A user can now see the list and mark the albums he has already heard. He should be able to give a rating and a comment.
So the question is: where to store the user's listenTo info? In the relational world, I would introduce a new foreign-key table with a relation from user to album and model properties like rating and comment in that table. I don't think this is how things should work in the MongoDB world, is it?
I could add the listenTo information to each album. I would then have a list of users and comments on each album. But then I need to ensure that when the list is requested, only the information of the current user is present, so I would have to filter on a property of a sub-sub-document. Feels strange.
Or I could copy the album list for each newly created user, but then I would need to write code that updates every user's object whenever I edit the original list.
What would you recommend?
When I think of Data Modeling, I break things down into the following relationships:
1 <--> 1
1 <--> Few/Many (A finite number, say a list of a user's phone numbers)
1 <--> Very Many
The general rule of thumb with MongoDB is you should embed wherever possible. So for 1 <--> 1 and 1 <--> Few/Many if the document size is something small, you should embed the collection inside the user document.
It's important to think about the use case here. If we want to track all songs that the user likes or listens to, this could potentially be hundreds or thousands, so we probably want to store this information in a separate collection and contain an indexed reference to the user there.
In the case of tracking if a user listens to the song, I would probably structure it like this in your use case:
{
  _id: ObjectID,        // The identifier of the document
  user_id: ObjectID,    // The user who listened to the song
  song_id: ObjectID,    // The id of the song
  count: number,        // The number of times the user listened
  rating: number,       // The number of stars the user rated the song
  favorite: boolean,    // If the user marked the song as a favorite
  last_listened: Date   // The last time the user listened
}
With an index on { user_id: 1, song_id: 1 }.
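In Mongoose terms (which mean.js uses), that could look roughly like the following; the Listen/User/Album model names are placeholders for your own:

import { Schema, model } from 'mongoose'

// One document per (user, album) pair, mirroring the shape sketched above.
const listenSchema = new Schema({
  user_id: { type: Schema.Types.ObjectId, ref: 'User', required: true },
  song_id: { type: Schema.Types.ObjectId, ref: 'Album', required: true },
  count: { type: Number, default: 0 },        // times listened
  rating: { type: Number, min: 1, max: 5 },   // optional star rating
  favorite: { type: Boolean, default: false },
  last_listened: Date,
})

// The compound index from above; unique so each (user, album) pair exists only once.
listenSchema.index({ user_id: 1, song_id: 1 }, { unique: true })

export const Listen = model('Listen', listenSchema)

// Listing what the current user has heard is then a single indexed query:
// Listen.find({ user_id: currentUserId }).populate('song_id')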
Here is a really good reference on how to approach your problem:
https://docs.mongodb.com/manual/applications/data-models-relationships/

Structuring a Cassandra database

I don't understand one thing about Cassandra. Say I have a website similar to Facebook, where people can share, like, comment, upload images and so on.
Now, let's say I want to get all of the things my friends did:
Username1 liked your comment
Username2 updated his profile picture
And so on.
So after a lot of reading, I guess what I would need to do is create a new column family for every single thing, for example: user_likes, user_comments, user_shares. Basically anything you can think of. And even after I do that, I would still need to create secondary indexes for most of the columns just so I could search the data? And even so, how would I know which users are my friends? Would I need to first get all of my friends' ids and then search through all of those column families for each user id?
EDIT
OK, so I did some more reading and now I understand things a little better, but I still can't really figure out how to structure my tables, so I will set a bounty. I want a clear example of how my tables should look if I want to store and retrieve data in this kind of order:
All
Likes
Comments
Favourites
Downloads
Shares
Messages
So let's say I want to retrieve the ten last uploaded files of all my friends or the people I follow; this is how it would look:
John uploaded song AC/DC - Back in Black 10 mins ago
And everything like comments and shares would be similar to that...
Now, probably the biggest challenge would be to retrieve the 10 last things of all categories together, so the list would be a mix of all of them...
I don't need an answer with fully detailed tables; I just need a really clear example of how I would structure and retrieve data like I would do in MySQL with joins.
With SQL, you structure your tables to normalize your data and use indexes and joins to query. With Cassandra, you can't do that, so you structure your tables to serve your queries, which requires denormalization.
You want to query items which your friends uploaded. One way to do this is to keep a row per user in a friendUploads column family, and write to it whenever a friend of that user uploads something.
friendUploads { # column family
  userid { # row key
    timestamp-upload-id : null # column name : no value
  }
}
as an example,
friendUploads {
  userA {
    12313-upload5 : null
    12512-upload6 : null
    13512-upload8 : null
  }
}
friendUploads {
  userB {
    11313-upload3 : null
    12512-upload6 : null
  }
}
Note that upload6 is duplicated into two different rows, as whoever did upload6 is a friend of both user A and user B.
Now, to build the friend-uploads display for a user, do a getSlice with a limit of 10 on that user's row. This will return the first 10 items, sorted by column name.
To put the newest items first, use a reverse comparator that sorts larger timestamps before smaller ones.
The drawback of this scheme is that when user A uploads a song, you have to do N writes to update the friendUploads rows, where N is the number of people who are friends of user A.
For the value associated with each timestamp-upload-id key, you can store enough information to display the result (probably as a JSON blob), or you can store nothing and fetch the upload information using the upload id.
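The above is written in terms of the old Thrift column-family API; for what it's worth, roughly the same first scheme in today's CQL with the Node.js cassandra-driver might look like this (table and column names are made up):

import { Client } from 'cassandra-driver'

const client = new Client({
  contactPoints: ['127.0.0.1'],
  localDataCenter: 'datacenter1',
  keyspace: 'social',
})

// One partition per user, newest first -- the CQL equivalent of a wide row
// with a reversed comparator.
async function createTable() {
  await client.execute(`
    CREATE TABLE IF NOT EXISTS friend_uploads (
      user_id     uuid,
      uploaded_at timeuuid,
      upload_id   uuid,
      payload     text,
      PRIMARY KEY (user_id, uploaded_at)
    ) WITH CLUSTERING ORDER BY (uploaded_at DESC)`)
}

// Fan-out on write: when someone uploads, insert one row per friend.
async function recordUpload(friendIds: string[], uploadId: string, payload: string) {
  await Promise.all(friendIds.map((friendId) =>
    client.execute(
      `INSERT INTO friend_uploads (user_id, uploaded_at, upload_id, payload)
       VALUES (?, now(), ?, ?)`,
      [friendId, uploadId, payload],
      { prepare: true },
    ),
  ))
}

// The "getSlice with a limit of 10": read one partition, already sorted newest-first.
async function latestFriendUploads(userId: string) {
  const result = await client.execute(
    'SELECT upload_id, payload FROM friend_uploads WHERE user_id = ? LIMIT 10',
    [userId],
    { prepare: true },
  )
  return result.rows
}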
To avoid duplicating writes, you can use a structure like,
userUploads { # column family
  userid { # row key
    timestamp-upload-id : null # column name : no value
  }
}
This stores the uploads for a particular user. Now, when you want to display the uploads of user B's friends, you have to do N queries, one for each friend of user B, and merge the results in your application. This is slower to query, but faster to write.
Most likely, if users can have thousands of friends, you would use the first scheme, and do more writes rather than more queries, as you can do the writes in the background after the user uploads, but the queries have to happen while the user is waiting.
As an example of denormalization, look at how many writes Twitter's Rainbird does when a single click occurs. Each write is used to support a single query.
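And for the second scheme, the read-side merge in the application could look roughly like this, again sketched with the Node.js driver against a hypothetical user_uploads table:

import { Client } from 'cassandra-driver'

const client = new Client({
  contactPoints: ['127.0.0.1'],
  localDataCenter: 'datacenter1',
  keyspace: 'social',
})

// user_uploads(user_id, uploaded_at, upload_id, payload) holds each user's OWN
// uploads, written exactly once -- no fan-out on write.
async function latestUploadsOfFriends(friendIds: string[]) {
  // One query per friend, run concurrently...
  const perFriend = await Promise.all(friendIds.map((friendId) =>
    client.execute(
      `SELECT toTimestamp(uploaded_at) AS uploaded_ts, upload_id, payload
         FROM user_uploads WHERE user_id = ? LIMIT 10`,
      [friendId],
      { prepare: true },
    ),
  ))

  // ...then merge in the application and keep the 10 newest overall.
  return perFriend
    .flatMap((result) => result.rows)
    .sort((a, b) => b['uploaded_ts'].getTime() - a['uploaded_ts'].getTime())
    .slice(0, 10)
}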
In some regards, you "can" treat NoSQL as a relational store. In others, you can denormalize to make things faster. For instance, PlayOrm's @OneToMany stores the many like so:
user1 -> friend.user23, friend.user25, friend.user56, friend.user87
This is the wide-row approach, so when you find your user, you have all the foreign keys to his friends. Each row can be a different length. You may also have a reverse reference stored as well, so the user might have references to the people who marked him as a friend but whom he did not mark back (let's call them buddies), so you might have
user1 -> friend.user23, friend.user25, buddy.user29, buddy.user37
Notice that if designed right, you may NOT need to "search" for the data. That said, with PlayOrm you can still do Scalable SQL and do joins (you just have to figure out how to partition your tables so it can scale to trillions of rows).
A row can have millions of columns in it, or it could have just 10. We are actually in the process of updating a lot of the documentation on PlayOrm and the NoSQL patterns this month, so if you keep an eye on that, you can also learn more about general NoSQL there as well.
Dean
Think of each DB query as a request to a service running on another machine. Your goal is to minimize the number of these requests (because each request requires a network round trip).
Here is the main difference from the RDBMS paradigm: in SQL you would typically use joins and secondary indexes. In Cassandra, joins aren't possible, since related data would reside on different servers. Things like materialized views are used in Cassandra for the same purpose (to fetch all related data with a single query).
I'd recommend reading this article:
http://maxgrinev.com/2010/07/12/do-you-really-need-sql-to-do-it-all-in-cassandra/
And looking into the twissandra sample project: https://github.com/twissandra/twissandra
It is a nice collection of optimization techniques for the kind of project you described.
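As a small illustration of the materialized-view point (all names invented; note that materialized views are disabled by default in recent Cassandra releases, so many teams simply maintain the second table themselves):

import { Client } from 'cassandra-driver'

const client = new Client({
  contactPoints: ['127.0.0.1'],
  localDataCenter: 'datacenter1',
  keyspace: 'social',
})

async function setup() {
  // Base table, keyed by the upload itself (a timeuuid, so it carries the time).
  await client.execute(`
    CREATE TABLE IF NOT EXISTS uploads (
      upload_id timeuuid PRIMARY KEY,
      user_id   uuid,
      payload   text
    )`)

  // Server-maintained copy of the same data, keyed the way the read wants it,
  // so "all uploads of this user, newest first" is a single query -- no join.
  await client.execute(`
    CREATE MATERIALIZED VIEW IF NOT EXISTS uploads_by_user AS
      SELECT user_id, upload_id, payload
      FROM uploads
      WHERE user_id IS NOT NULL AND upload_id IS NOT NULL
      PRIMARY KEY (user_id, upload_id)
      WITH CLUSTERING ORDER BY (upload_id DESC)`)
}

async function uploadsOfUser(userId: string) {
  const result = await client.execute(
    'SELECT upload_id, payload FROM uploads_by_user WHERE user_id = ? LIMIT 10',
    [userId],
    { prepare: true },
  )
  return result.rows
}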