PostgreSQL: effective way to store a list of IDs

In my PostgreSQL database I have two tables, board and cards, with a one-to-many relationship between them (one board can have multiple cards).
A user can hold a few cards on the board. To implement this, I would typically create another table, called for example cards_on_hold, with a one-to-many relationship, and place the IDs of the held cards into it. To fetch this data for a board I'd use a JOIN between board and cards_on_hold.
Is there a more effective way in PostgreSQL to store the IDs of cards on hold? For example, is there some feature to store this list inline in the board table? I'll need to use this list of IDs later in an IN clause to filter the card set.

Postgres does support arrays of integers (assuming your ids are integers):
http://www.postgresql.org/docs/9.1/static/arrays.html
However, manipulating that data is a bit harder than with a separate table. For example, with a separate table you can add a uniqueness guarantee so that you won't have duplicate IDs (assuming you'd want that). To achieve the same thing with an array you would have to create a stored procedure to detect duplicates (on insert, for example), and that would be hard (if possible at all) to make as efficient as a simple unique constraint. Not to mention that you lose the consistency guarantee, because you can't put a foreign key constraint on the elements of such an array.
So in general, consistency would be an issue with an inline list. At the same time, I doubt you would get any noticeable performance gain. After all, arrays should not be used as an "aggregated foreign key", IMHO.
All in all: I suggest you stick with a separate table.
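For concreteness, here is a minimal sketch (using psycopg2; the table and column names such as board.id and cards.id are assumptions) of the separate-table approach next to the integer[] alternative queried with = ANY.

```python
# A minimal sketch (psycopg2; table and column names are assumptions) contrasting
# a separate cards_on_hold table with real constraints and an integer[] column.
import psycopg2

conn = psycopg2.connect("dbname=app")  # assumed connection settings
cur = conn.cursor()

# Separate table: uniqueness and referential integrity are enforced by the database.
cur.execute("""
    CREATE TABLE IF NOT EXISTS cards_on_hold (
        board_id integer NOT NULL REFERENCES board(id),
        card_id  integer NOT NULL REFERENCES cards(id),
        PRIMARY KEY (board_id, card_id)  -- no duplicate holds per board
    )
""")

# Fetch the held cards for one board with a JOIN.
cur.execute("""
    SELECT c.*
    FROM cards c
    JOIN cards_on_hold h ON h.card_id = c.id
    WHERE h.board_id = %s
""", (42,))

# Inline alternative: an integer[] column on board. Easy to read back, but nothing
# prevents duplicate or dangling ids, since arrays can't carry a foreign key.
cur.execute("ALTER TABLE board ADD COLUMN cards_on_hold integer[]")
cur.execute("SELECT * FROM cards WHERE id = ANY(%s)", ([11, 24, 50],))

conn.commit()
```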

Related

How to create/update a many-to-many relation where order/index matters?

Let's say that you're creating a music app. You have a table of playlists and a table of songs. How would you model the relationship, including the order of songs in a playlist, in a SQL environment?
Requirements:
Each playlist can have multiple songs
Song order in the playlist matters
Each song has its own rich information (artist, album, etc.)
On the client side, this is easy: just keep an array of song IDs on the playlist and fetch the song information from those. If the order changes, just update the array and push a new one. Computationally wasteful, but very easy to reason about, and there's no chance of a duplicated index entry.
In the relational database world, you'd normally use a junction table for a many-to-many relationship, where each row maps a playlist_id to a song_id. You could add a column for the index, but then when you update the order of a playlist you have to rewrite the indexes of all the rows.
id   playlist_id   song_id   index
1    1             50        1
2    1             24        2
3    1             21        3
4    2             12        1
I'm struggling to find an answer to this question.
For my specific situation, I'm currently using Supabase and their JavaScript SDK, which talks to a hosted PostgreSQL database, and everything is done from a client-side app with queries. I don't know how to write an SQL function that would deal with this. It all seems like it'd be very complex compared to just pushing a new array each time, even though it's the "correct" way. It doesn't look like PostgreSQL supports an array of foreign keys yet, so is there a better way?
In the relational database world, you'd normally use a junction table for a many-to-many relationship.
Yes, and that junction table would be 'all key' -- that is, in this case its key would be {playlist_id, song_id}.
Presumably (you don't say this) a song can appear on many playlists, at a different position in each. Furthermore, you can't infer index from any other table; index is held only in this table. Also (I guess) a user might resequence their playlist by shuffling the same set of songs.
Adding index means you no longer have a pure junction table, but you do still have a table one of whose keys is {playlist_id, song_id}. There might be an alternative key {playlist_id, index}; or, if a user inadvertently shuffles two songs to the same position on a playlist, that might be allowed, and you'd use song_id to resolve ties. (That's what I'd do, to keep it simple. In a manufacturing step, that's like saying I need 4 bolts and 2 clips here, but it doesn't matter which order you attach them.)
What you don't want or need is an additional id column in this table. It does nothing but get in the way of the true key(s), and, as you say, shuffling the order would give a maintenance headache. Perhaps you're not aware that any table can have a composite key.
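As a concrete illustration of the above, here is a minimal sketch (psycopg2; the playlists, songs and playlist_songs names are hypothetical, and "position" stands in for the question's "index"). In a Supabase app the same statements could be wrapped in a Postgres function and called through the SDK.

```python
# A minimal sketch (psycopg2; table and column names are assumptions) of the
# junction-plus-order table: (playlist_id, song_id) is the real key, and
# "position" carries the ordering.
import psycopg2

conn = psycopg2.connect("dbname=music")  # assumed connection settings
cur = conn.cursor()

cur.execute("""
    CREATE TABLE IF NOT EXISTS playlist_songs (
        playlist_id integer NOT NULL REFERENCES playlists(id),
        song_id     integer NOT NULL REFERENCES songs(id),
        position    integer NOT NULL,
        PRIMARY KEY (playlist_id, song_id)
    )
""")

# Reorder playlist 1 by rewriting its rows from the client-supplied sequence.
new_order = [24, 50, 21]  # hypothetical song ids, in the desired order
cur.execute("DELETE FROM playlist_songs WHERE playlist_id = %s", (1,))
cur.executemany(
    "INSERT INTO playlist_songs (playlist_id, song_id, position) VALUES (%s, %s, %s)",
    [(1, song_id, pos) for pos, song_id in enumerate(new_order, start=1)],
)
conn.commit()  # the delete and re-insert happen in one transaction

# Read the playlist back in order, with full song details joined in.
cur.execute("""
    SELECT s.*
    FROM playlist_songs ps
    JOIN songs s ON s.id = ps.song_id
    WHERE ps.playlist_id = %s
    ORDER BY ps.position
""", (1,))
```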

DynamoDB update attribute value among related items

As the DynamoDB documentation says, it's recommended that we use only one table to model all our entities.
You should maintain as few tables as possible in a DynamoDB application. Most well-designed applications require only one table.
Now suppose that we have a product entity and a user entity; using only one table, we have a schema like this:
In DynamoDB it's recommended that we keep related data together, which is why the user data is "duplicated" on the product entry.
My question is: if one day I update the user's name, will DynamoDB automatically update the copy of that user on my product entry, or does this kind of update have to be made manually?
In DynamoDB it is recommended to keep items in denormalized form to get the benefits of DynamoDB. That said, the table is designed with the application layer in mind: you decide how the application needs to fetch its results, then shape the items so that a read from the single table returns everything needed to build the entity with all its mappings satisfied. Hence the table has attributes that can hold values from the other, related entity; the alternative is to store only the relationship values (keys) that keep the connection to the related items.
In the above scenario, you can keep the user details in one place and, when creating the product item, store only the primary key of the user in it. That way, if the username or other user details change in the future, there won't be any problem. (DynamoDB will not propagate such a change to denormalized copies automatically; that update has to be done by your application.)
In DynamoDB, using a sort key for the table keeps related items together, and there is also provision for composite sort keys to deal with one-to-many relations.
Best practices for using sort keys:
https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-sort-keys.html
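To make that concrete, here is a minimal sketch (boto3; the single-table name and the PK/SK attribute layout are assumptions) where the product item stores only a reference to the user, so a rename touches one item.

```python
# A minimal sketch (boto3; table name and key attributes are assumptions) of keeping
# only the user's key on the product item instead of a copy of the user's details.
import boto3

table = boto3.resource("dynamodb").Table("app_table")  # assumed single-table design

# User item: the user's details live here and only here.
table.put_item(Item={"PK": "USER#123", "SK": "PROFILE", "name": "Alice"})

# Product item: keeps just a reference to its owner, not a copy of the name.
table.put_item(Item={
    "PK": "PRODUCT#987", "SK": "DETAILS",
    "owner_pk": "USER#123", "title": "Widget",
})

# Renaming the user touches a single item; product reads follow owner_pk to get it.
table.update_item(
    Key={"PK": "USER#123", "SK": "PROFILE"},
    UpdateExpression="SET #n = :n",
    ExpressionAttributeNames={"#n": "name"},   # "name" is a reserved word
    ExpressionAttributeValues={":n": "Alice Smith"},
)
```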

Sorting Cassandra using individual components of Composite Keys

I want to store a list of users in a Cassandra column family (wide rows).
The columns in the CF will have composite keys of the pattern id:updated_time:name:score.
After inserting all the users, I need to query them in a different sort order each time.
For example, if I sort by updated_time, I should be able to fetch the 10 most recently updated users.
And if I sort by score, I should be able to fetch the top 10 users by score.
Does Cassandra support this?
Kindly help me in this regard...
I need to query users in a different sort order each time... Does Cassandra support this?
It does not. Unlike an RDBMS, you cannot make arbitrary queries and expect reasonable performance. Instead you must design your data model so that the queries you anticipate making will be efficient:
The best way to approach data modeling for Cassandra is to start with your queries and work backwards from there. Think about the actions your application needs to perform, how you want to access the data, and then design column families to support those access patterns.
So rather than having one column family (table) for your data, you might want several, with cross-references between them. That is, you might have to denormalise your data.
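For illustration, here is a minimal sketch (Python cassandra-driver and CQL; the keyspace, table names and the single 'bucket' partition key are simplifying assumptions) of that query-first approach: the same user data goes into one table per sort order.

```python
# A minimal sketch (cassandra-driver; keyspace/table names and the single-partition
# 'bucket' key are assumptions) of one table per query, each clustered in the order
# that query needs.
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("app")  # assumed contact point and keyspace

session.execute("""
    CREATE TABLE IF NOT EXISTS users_by_updated_time (
        bucket text, updated_time timestamp, user_id uuid, name text, score int,
        PRIMARY KEY (bucket, updated_time, user_id)
    ) WITH CLUSTERING ORDER BY (updated_time DESC, user_id ASC)
""")
session.execute("""
    CREATE TABLE IF NOT EXISTS users_by_score (
        bucket text, score int, user_id uuid, name text, updated_time timestamp,
        PRIMARY KEY (bucket, score, user_id)
    ) WITH CLUSTERING ORDER BY (score DESC, user_id ASC)
""")

# Writes go to both tables (denormalised); reads hit the table built for the query.
recent10 = session.execute(
    "SELECT * FROM users_by_updated_time WHERE bucket = 'all' LIMIT 10")
top10 = session.execute(
    "SELECT * FROM users_by_score WHERE bucket = 'all' LIMIT 10")
```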

Composite _ID and using MongoDB as a composite bucket store, via C#

I am building an eCommerce system that uses composite bucket hashing to efficiently group similar items. Without going into why I chose this system, suffice it to say it solves several key problems facing distributed eCommerce systems.
There are 11 buckets, all of them ints, which represent various values. Let's call these fields A to K. The 12th field, L, is an array of product IDs. Think of this all as a hierarchy with the leaf level (L) being data.
I ran some initial tests in MongoDB where I stored this data as individual documents. However, this is not efficient, because a given set of A to K could have many L values, so those can be stored as an array instead.
This gives me two options:
1. Insert a meaningless _id document id, and put an index on A - K to ensure uniqueness. I already ran some tests on indexes, and indexing more than the first 2 columns impacts speed substantially.
2. Make A - K a composite _id, and have one document data field: L.
I know #2 is a highly unconventional use of MongoDB. Are there any technical reasons why I shouldn't do this? If not, using the official C# driver, how would I perform this insert?
If you went for option #2, you could perhaps create your own optimised composite id (using A-K) using a Custom Id Generator.
Did you run your tests on compound keys?
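The question asks about the official C# driver; purely to illustrate the document shape of option #2, here is a minimal sketch in PyMongo (the database, collection and bucket values are made up). Note that with an embedded document as _id, field order matters for equality, so it should always be built in the same order.

```python
# A minimal sketch (PyMongo, not the C# driver the question asks about; names and
# values are made up) of option #2: the eleven bucket fields A-K form an embedded
# document used as _id, and L is the array of product ids.
from pymongo import MongoClient

coll = MongoClient()["shop"]["buckets"]  # assumed database/collection names

composite_id = {"A": 1, "B": 7, "C": 3, "D": 0, "E": 2, "F": 5,
                "G": 9, "H": 4, "I": 8, "J": 6, "K": 1}  # hypothetical bucket values

coll.insert_one({"_id": composite_id, "L": [1001, 1002, 1003]})

# Appending more product ids to an existing bucket without creating duplicates:
coll.update_one({"_id": composite_id},
                {"$addToSet": {"L": {"$each": [1004, 1005]}}})
```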

Non-relational database: key-value or flat table

My application needs configurable columns, and the titles of these columns are configured at the beginning. In a relational database I would have created generic columns in the table, like CodeA, CodeB, etc., because that makes it easy to query on these columns (CodeA = 11) and to display the values (if the column stores a code and a value). But now I am using a non-relational database, Datastore (and I am new to it): should I follow the same old approach, or should I use a collection (key-value pair) type of structure?
There will be a lot of filters on these columns. Please advise.
What you've just described is one of the classic scenarios for a Key-Value database. The limitation here is that you will not have many of the set-based tools you're used to.
Most K-V databases are really good at loading one "record", or a small set thereof. However, they tend not to be any good at loading anything that may require a join. Given that you're using App Engine, you probably appreciate this limitation, but it's worth stating.
As an important note, not all K-V databases will allow you to "select by any column". Many K-V stores actually only allow selection by a primary key. If you take a look at MongoDB, you'll find that you can query on any column, which sounds like a necessary feature here.
I would suggest using key/value pairs, where the keys act as your column names and the values hold their data.
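As a sketch of that key/value idea on Datastore (using the legacy App Engine ndb library; the kind and property names are made up), an Expando model lets each entity carry whatever columns were configured, and those dynamic properties remain filterable:

```python
# A minimal sketch (legacy App Engine ndb; kind and property names are assumptions)
# of configurable columns stored as dynamic key/value properties on a Datastore entity.
from google.appengine.ext import ndb

class ConfigurableRecord(ndb.Expando):
    pass  # properties are defined at runtime by the column configuration

rec = ConfigurableRecord()
rec.CodeA = 11            # configured column names become dynamic properties
rec.CodeB = "approved"
rec.put()

# Filtering on a configured column uses GenericProperty, since the property
# is not declared on the model class.
matches = ConfigurableRecord.query(ndb.GenericProperty("CodeA") == 11).fetch()
```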