I'm building an application where I would like to use the multi-tenant strategy of creating a schema for each client. Would it be appropriate to store all users in a users table within a single schema that includes a reference column for their respective schemas?
Db_app_01
/schema_public
/schema_public/table_users
/schema_client_1
Where in table_users I have:
|user_id|username|password|schema_id|
-------------------------------------
|1      |user1   |*       |1        |
My thinking is that I could then easily query the correct schema, since the schema_id would be available in the main users table, which is used for authentication.
Your approach looks fine to me, as long as there are not too many different users. When the number of tables and schemas goes into the 10000s, metadata queries will become sluggish, and it won't be much fun any more.
I wouldn't construct dynamic queries out of the schema_id, explicitly referencing the appropriate schema.
Rather, I would set search_path appropriately.
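A minimal sketch of the search_path approach, assuming the application has already looked up the tenant's schema name from the users table. The function name search_path_statement, the name pattern and the fallback to public are illustrative assumptions, not part of the answer above. Schema names cannot be sent as bind parameters, so the sketch validates the name against a strict pattern before quoting it.

```python
import re

# Conservative whitelist (an assumption): letters, digits and underscores only.
_SCHEMA_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def search_path_statement(schema_name: str) -> str:
    """Build a SET search_path statement for one tenant schema."""
    if not _SCHEMA_NAME.fullmatch(schema_name):
        raise ValueError(f"unsafe schema name: {schema_name!r}")
    # Tenant schema first, then public for the shared tables such as users.
    return f'SET search_path TO "{schema_name}", public'


print(search_path_statement("schema_client_1"))
```

The statement would be run once per connection or transaction after authentication, so that later unqualified queries resolve to the tenant's tables.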
I am unsure how to design security policies for a following system including counters in postgres/supabase. My database includes two tables:
Users:
uuid|name|follower_counter
------------------------------
xyz |tobi| 1
Following-Relationship
follower| following
---------------------------
uuid_1 | uuid_2
Once a user follows a different user, I would like to use a postgres function/transaction to
Insert a new following-follower relationship
Update the followed users' counter
BEGIN
create follower_relationship(follower_id, following_id);
update increment_counter_of_followed_person(following_id);
END;
The constraint should be that the users table (e.g. the name column) can only be altered by the user owning the row. However, the follower_counter should be open to changes from users who start following that user.
What is the best security policy design here? Should I add column-level security, or should I move the counters to a different table?
Do I have to pass parameters to the "block transaction" to ensure that the update and insert functions are called with the needed rights? With which rights should I call the block function?
It might be better to take a different approach to solve this problem. Instead of having a column dedicated to counting the followers, I would recommend actually counting the number of followers when you query the users. Since you already have Following-Relationship table, we just need to count the rows within the table where following or follower is the querying user.
When you have a counter, it might be hard to keep the counter accurate. You have to make sure the number gets decremented when someone unfollows. What if someone blocks a user? What if a user was deleted? There could be a lot of situations that could throw off the counter.
If you count the number of followings/followers on the fly, you don't need to worry about those situations at all.
The obvious concern you might have with this approach is performance, but you should not worry too much about it. Postgres is a powerful database that has been battle-tested for decades, and with a proper index in place, it can easily perform these queries on the fly.
The easiest way of doing this in Supabase would be to create a view like the following. Once you create a view, you can query it from your Supabase client just like a typical table!
create or replace view profiles as
select
  users.id,
  users.name,
  (select count(*) from following_relationship where followed_user_id = users.id) as follower_count,
  (select count(*) from following_relationship where following_user_id = users.id) as following_count
from users;
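To see the view and the "proper index" working together, here is a small runnable sketch. It uses SQLite through Python's sqlite3 module purely as a stand-in for Postgres so that it runs anywhere. The index names and the sample rows are made up for illustration, and SQLite needs a plain create view instead of create or replace view.

```python
import sqlite3

# SQLite stands in for Postgres here; the SQL is otherwise the same idea.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table users (id text primary key, name text);
    create table following_relationship (
        following_user_id text,  -- the user who follows
        followed_user_id  text   -- the user being followed
    );
    -- One index per lookup direction keeps both counts cheap.
    create index idx_followed on following_relationship (followed_user_id);
    create index idx_following on following_relationship (following_user_id);

    create view profiles as
    select
        users.id,
        users.name,
        (select count(*) from following_relationship
          where followed_user_id = users.id) as follower_count,
        (select count(*) from following_relationship
          where following_user_id = users.id) as following_count
    from users;

    -- Sample data (made up): uuid_1 follows uuid_2.
    insert into users values ('uuid_1', 'tobi'), ('uuid_2', 'anna');
    insert into following_relationship values ('uuid_1', 'uuid_2');
""")

rows = conn.execute(
    "select name, follower_count, following_count from profiles order by name"
).fetchall()
print(rows)
```

Because the counts are computed from the relationship rows, an unfollow or a deleted user changes the result without any counter to keep in sync.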
As can be seen in the following screenshot, the current project database (PostgreSQL), named default, has these 4 schemas: public, appcompany1, appcompany2 and appcompany3.
They share some common tables. Right now, when I want to fetch data for customers, I write a query like this:
query getCustomerList {
customer {
customer_id
...
...
}
}
And it fetches the required data from public schema.
But according to the requirements, depending on user interactions in front-end, that query will be executed for appcompanyN (N=1,2,3,..., any positive integer). How do I achieve this goal?
NOTE: Whenever the user creates a new company, a new schema is created for that company. So the total number of schemas is not limited to 4.
I suspect that you see a problem where none actually exists.
Everything is much simpler than maybe it seems.
A. Where are all those tables?
There are a lot of schemas with identical (or almost identical) objects inside them.
All tables are registered in hasura.
Hasura can't register different tables with the same name, so by default names will be [schema_name]_[table_name] (except for public)
So table customer will be registered as:
customer (from public)
appcompany1_customer
appcompany2_customer
appcompany3_customer
It's possible to customize entity name in GraphQL-schema with "Custom GraphQL Root Fields".
B. The problem
But according to the requirements, depending on user interactions in front-end, that query will be executed for appcompanyN (N=1,2,3,..., any positive integer). How do I achieve this goal?
There are identical objects that differ only in their schema-name prefix.
So the solutions are trivial.
1. Dynamic GraphQL query
Application stores templates of GraphQL-queries and replaces prefix with real schema name before request.
E.g.
query getCustomerList{
[schema]_customer{
}
}
substitute [schema] with appcompany1, appcompany2, appcompanyZ and execute.
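A minimal sketch of the template substitution, assuming the template is stored as a plain string in the application. The names QUERY_TEMPLATE and render_query and the validation pattern are hypothetical; the customer_id field is taken from the question. Since the schema name ends up inside the query text, the sketch rejects anything that is not a simple identifier, and it drops the prefix for public because those tables are registered without one.

```python
import re

# Template using the [schema] placeholder from the example above.
QUERY_TEMPLATE = """
query getCustomerList {
  [schema]_customer {
    customer_id
  }
}
"""

# Only simple identifiers are accepted (an assumption about schema naming).
_SCHEMA_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def render_query(template: str, schema: str) -> str:
    """Replace the [schema] placeholder with a validated schema name."""
    if not _SCHEMA_NAME.fullmatch(schema):
        raise ValueError(f"unsafe schema name: {schema!r}")
    # Tables in public are registered without a prefix.
    prefix = "" if schema == "public" else f"{schema}_"
    return template.replace("[schema]_", prefix)


print(render_query(QUERY_TEMPLATE, "appcompany1"))
```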
2. SQL view for all data
If tables are 100% identical then it's possible to create an sql view as:
CREATE VIEW ALL_CUSTOMERS
AS
SELECT 'public' as schema,* FROM public.customer
UNION ALL
SELECT 'appcompany1' as schema,* FROM appcompany1.customer
UNION ALL
SELECT 'appcompany2' as schema,* FROM appcompany2.customer
UNION ALL
....
SELECT 'appcompanyZ' as schema,* FROM appcompanyZ.customer
This way: no need for dynamic query, no need to register all objects in all schemas.
You need only to register view with combined data and use one query
query getCustomerList($schema: String) {
  all_customers(where: {schema: {_eq: $schema}}) {
    customer_id
  }
}
About both solutions: it's hard to call them elegant.
I myself dislike them both ;)
So decide yourself which is more suitable in your case.
Let's say I have an app where users can make posts. I store these in a single DynamoDB table using the following design:
+--------+--------+---------------------------+
| PK | SK | (Attributes) |
+--------+--------+---------------------------+
| UserId | UserId | username, profile, etc... | <-- user item
| UserId | PostId | body, timestamp, etc... | <-- post item
+--------+--------+---------------------------+
When a user makes a post, my Lambda function receives the following data:
{
"userId": <UserId>,
"body": <Body>,
etc...
}
My question is, should I first verify that the user exists before adding the post to the table by using dynamodb.get({PK: userId, SK: userId})? This would make sure there won't be any orphaned posts, but the function will then require both a read and a write unit.
One idea I have is to just write the post, potentially allowing orphaned posts. Then, I could have another Lambda function that runs periodically to find and remove any orphans.
This is obviously a simple case, but imagine a more complex system where objects have multiple relationships. It seems it could easily get very costly to check for relationship existence in these cases.
"Then, I could have another Lambda function that runs periodically to find and remove any orphans." <-- This could get very expensive over time, especially if you plan to do this by scanning the table.
I develop a system built on DynamoDB that has similar relationships, and I validate relationships before saving data because I do not want to have garbage data in my tables.
One option to consider is implicitly testing for the existence of a valid user via authentication & authorization. If a user has passed your auth tests, then you know that they exist, so you can add their posts with confidence.
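For comparison, the explicit "validate before saving" check from the question can be sketched as follows. The function save_post and the InMemoryTable class are hypothetical; the class only mimics the get_item(Key=...) and put_item(Item=...) calls of a boto3 Table resource so the example runs without AWS. Note that the read and the write here are two separate requests, so the check is not atomic.

```python
class InMemoryTable:
    """Tiny stand-in for a boto3 Table resource, just enough for this sketch."""

    def __init__(self):
        self.items = {}

    def get_item(self, Key):
        item = self.items.get((Key["PK"], Key["SK"]))
        return {"Item": item} if item is not None else {}

    def put_item(self, Item):
        self.items[(Item["PK"], Item["SK"])] = Item


def save_post(table, user_id, post_id, body):
    """Write a post only if the user item (PK == SK == user_id) exists."""
    user = table.get_item(Key={"PK": user_id, "SK": user_id})
    if "Item" not in user:
        return False  # the post would have been orphaned
    table.put_item(Item={"PK": user_id, "SK": post_id, "body": body})
    return True


table = InMemoryTable()
table.put_item(Item={"PK": "user-1", "SK": "user-1", "username": "user1"})
saved = save_post(table, "user-1", "post-1", "hello")
rejected = save_post(table, "user-2", "post-1", "hello")
print(saved, rejected)
```

This costs one read per post, which is exactly the overhead the auth-based approach avoids.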
Forgive my ignorance, but I'm wondering if there is a way to specify metadata for a table in PostgreSQL without storing it as a field in that table. For instance, I might want to add a description for the table, its creation time, etc.
I know I can do this using extra tables, but I'd prefer not to, to be honest. I've dug through the official PostgreSQL docs, but there's nothing there besides looking in information_schema.tables, where I guess I'm not allowed to modify anything.
Any clues? Otherwise, I guess I'll have to create a few more tables to handle this.
Thanks!
There's the comment field:
COMMENT ON TABLE my_table IS 'Yup, it''s a table';
In current versions the comment field is limited to a single text string. There's been discussion of allowing composite types or records, but AFAIK no agreement on any workable design.
You can shove JSON into the comments if you want. It's a bit dirty, since it'll show up as the Description column in \d+ output in psql, etc, but it'll work.
craig=> COMMENT ON TABLE test IS 'Some table';
COMMENT
craig=> \d+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+----------------------+----------+-------+------------+-------------
public | test | table | craig | 8192 bytes | Some table
You can get the comment from SQL with:
SELECT pg_catalog.obj_description('test'::regclass, 'pg_class');
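If you go the JSON-in-a-comment route, a minimal sketch of building the statement might look like this. The helper name comment_statement is hypothetical, the table name is assumed to come from trusted code rather than user input, and the quoting assumes standard_conforming_strings is on (the default in current versions) so that backslashes in the JSON are not reinterpreted.

```python
import json


def comment_statement(table: str, metadata: dict) -> str:
    """Build a COMMENT ON TABLE statement carrying JSON metadata."""
    payload = json.dumps(metadata, sort_keys=True)
    # Double any single quotes so the payload is a valid SQL string literal.
    literal = payload.replace("'", "''")
    return f"COMMENT ON TABLE {table} IS '{literal}';"


stmt = comment_statement("test", {"description": "Yup, it's a table"})
print(stmt)
```

Reading it back is then a matter of passing the text returned by obj_description to json.loads.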
Comments can also be added on other objects, like columns, data types, functions, etc.
If that doesn't fit your needs you're pretty much stuck with a side table for metadata.
People regularly request table metadata like creation time, etc, but nobody tends to step up with a workable plan for an implementation and the time and enthusiasm to carry it through to the finish. In any case the most common request is "last modified time", which is pretty horrible from a performance point of view and difficult to get right in the face of multi-version concurrency control, transaction isolation rules, etc.
A bit of background. I have a base application and most clients use it as standard. However some clients have small code and database customisations.
Each of these clients has their own branch and maintenance can be tricky.
I want to consolidate all these into a single database structure (not a single database - we aren't doing multi-tenancy) to enable upgrades to be applied in a much more uniform fashion.
I'm still at the proof of concept stage, but the route I was going down would be to have the standard objects stay in the schema they currently exist in (mostly dbo) and have the custom objects reside in a schema for each client.
For example, I could have dbo.users and client1.users which has some additional columns. If I set the default schema for the client to be "client1" then the following query
SELECT * FROM users
will return data from the client1 schema or the dbo schema depending on which login is connected.
This is absolutely perfect for what I'm trying to achieve.
The problem I'm running into is with Views.
I have many views which are in the dbo schema and refer to the Users table. No matter which user I connect to the database as, these views always select from dbo.users.
So I'm guessing the question I have is:
Can I prefix the tables in the view with some variable like "DEFAULT"? e.g.
SELECT u.username, u.email, a.level
FROM DEFAULT.users u INNER JOIN accessLevels a ON u.accessID = a.accessID
If this isn't possible and I'm totally barking up the wrong tree, do you have any suggestions as to how I can achieve what I'm setting out to do?
Many thanks.
Just reference the name of the schema in which the views reside...
Select a.*, b.*
from schema1.TABLEA A
join schema2.TABLEB B on A.ID = B.ID