I want to implement a multi-tenant solution with one web server and one database shared across all tenants. According to this blog post from AWS, this is the "pooled" multi-tenancy model.
I'm using NestJS and Sequelize. If Sequelize is not a good fit for this, I could also switch to another library like TypeORM if necessary.
How can this be implemented? I have no idea how to use a different connection (i.e. a different database user) for each HTTP request, and I also don't know a good way to set a runtime context variable on the connection.
What I have currently is that every HTTP request contains a tenant-id header. This should be used for all queries.
There is also the concept of scopes in Sequelize, but that is implemented on the client side, not in the database itself, and it is specific to Sequelize. I would prefer a solution that is independent of Sequelize and maybe more specific to PostgreSQL.
Is there any way to implement this with sequelize? A hint or a basic approach would be sufficient.
This approach seems similar: https://learn.microsoft.com/en-us/microsoft-365/education/deploy/design-multi-tenant-architecture.
I'm studying how to create a similar architecture, but I will use the "silo" model (a physical database per tenant). I think that first you need to create an internal database called "catalog" that contains the user information (does this user already have a login? if so, select that information), including credentials such as the tenant-id. Regarding Sequelize, I guess it is necessary to use raw queries to run ROLE/GRANT/DATABASE statements etc., and migrations to create the same database structure for each new client.
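A rough sketch of that provisioning step using Sequelize raw queries might look like this (the connection URL, the role/database naming scheme, and the password handling are placeholder assumptions; a real implementation would also record the tenant in the catalog and run the shared migrations):

import { Sequelize } from 'sequelize';

// Administrative connection used only for provisioning (URL is an assumption).
const admin = new Sequelize('postgres://admin:secret@localhost:5432/postgres');

// Create an isolated role and database for a new tenant (silo model).
// Identifiers cannot be bound as query parameters, so validate them first.
async function provisionTenant(tenantId: string): Promise<void> {
  if (!/^[a-z0-9_]+$/.test(tenantId)) {
    throw new Error('invalid tenant id');
  }
  await admin.query(`CREATE ROLE "tenant_${tenantId}" LOGIN PASSWORD 'changeme'`);
  await admin.query(`CREATE DATABASE "db_${tenantId}" OWNER "tenant_${tenantId}"`);
  // Next: insert the tenant into the central "catalog" database and run the
  // shared migrations against db_<tenantId> so every tenant has the same structure.
}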
In a Sails.js project, I would like to use a different database endpoint for READ operations (an AWS RDS read replica) than the default datastore, which I will keep using for WRITE operations.
As explained here, it is possible in Sails.js to set the datastore on a per-model basis, but what about setting an alternative datastore on a per-request basis, or directly for all read operations?
One way to accomplish this in a Sails-y fashion would be to have "read only" models. This does a couple of things: first, it makes it very clear which datastore you are working with in your controllers if you have, say, a User model and a UserRead model (it does double your models, but that is a small price to pay in memory costs); second, it means Sails can easily manage your read-only and write databases, you just have to be conscious of using the proper model in the proper context.
To keep things really light in your duplicate "read only" model (and so you don't have to change two models every time something changes), you could just extend your original model and simply change the datastore. Something like this should work:
UserRead.js
// `_` is lodash, which Sails exposes as a global by default.
module.exports = _.merge({}, require('./User'), { datastore: 'defaultRead' });
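For this to work, the alternative datastore has to be defined in the app configuration. A minimal sketch for config/datastores.js, assuming a Postgres primary plus an RDS read replica (the URLs are placeholders):

// config/datastores.js
module.exports.datastores = {
  // Default datastore: the primary instance, used for writes.
  default: {
    adapter: 'sails-postgresql',
    url: 'postgresql://user:pass@primary-host:5432/app',
  },
  // Read-only datastore: the RDS read replica, used by the *Read models.
  defaultRead: {
    adapter: 'sails-postgresql',
    url: 'postgresql://user:pass@replica-host:5432/app',
  },
};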
Situation:
For our SaaS API we use schema-based multitenancy, which means every customer (~tenant) has its own separate schema within the same (postgres) database, without interfering with other customers. Each schema consists of the same underlying entity-model.
Every time a new customer is registered with the system, a new isolated schema is automatically created within the db. This means the schema is created at runtime and is not known in advance. The customer's schema is named after the customer's domain.
For every request that arrives at our API, we extract the user's tenancy-affiliation from the JWT and determine which db-schema to use to perform the requested db-operations for this tenant.
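For illustration, that extraction step could be a small NestJS middleware like the following sketch (the tenant claim name and the tenantSchema request property are assumptions for illustration):

import { Injectable, NestMiddleware, UnauthorizedException } from '@nestjs/common';
import { Request, Response, NextFunction } from 'express';
import { decode } from 'jsonwebtoken';

@Injectable()
export class TenantMiddleware implements NestMiddleware {
  use(req: Request, res: Response, next: NextFunction) {
    const token = (req.headers.authorization ?? '').replace('Bearer ', '');
    // Signature verification is assumed to happen in an auth guard upstream.
    const payload = decode(token) as { tenant?: string } | null;
    if (!payload?.tenant) {
      throw new UnauthorizedException('JWT carries no tenant affiliation');
    }
    // Make the tenant's schema name available to downstream handlers.
    (req as any).tenantSchema = payload.tenant;
    next();
  }
}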
Problem:
After having established a connection to a (postgres) database via TypeORM (e.g. using createConnection), our only option for setting the schema for a db operation is to resort to the createQueryBuilder:
const orders = await this.entityManager
.createQueryBuilder()
.select()
.from(`${tenantId}.orders`, 'order') // <--- setting schema-prefix here
.where("order.priority = 4")
.getMany();
This means we are forced to use the QueryBuilder, as it does not seem possible to set the schema when working with the EntityManager API (or the Repository API).
However, we want/need to use these APIs, because they are much simpler to write, require less code and are also less error-prone, since they do not rely on writing queries "manually" employing a string-based syntax.
Question:
In case of TypeORM, is it possible to somehow set the db-schema when working with the EntityManager or repositories?
Something like this?
// set schema when instantiating manager
const manager = connection.createEntityManager({ schema: tenantDomain });
// should find all matching "order" entities within the schema
const orders = await manager.find(Order, { priority: 4 });
// should find a matching "item" entity within the schema using the same manager
const item = await manager.findOne(Item, { id: 321 });
Notes:
The db-schema needs to be set in a request-scoped way to avoid setting the schema for other requests, which may belong to other customers. Setting the schema for the whole connection is not an option.
We are aware that one could create a whole new connection and set the schema for this connection, but we want to reuse the existing connection. So simply creating a new connection to set the schema is not an option.
To answer my own question:
At the moment there is no way to instantiate TypeORM repositories with different schemas at runtime without creating new connections.
So the only two options a developer is left with for schema-based multi-tenancy are:
Setting up new connections to connect with different schemas within the same db at runtime, e.g. see NestJS Request Scoped Multitenancy for Multiple Databases. However, one should definitely strive to reuse connections and be aware of connection limits (a sketch of this approach follows after this list).
Abandoning the idea of working with the Repository API and reverting to using createQueryBuilder (or executing SQL queries via query()).
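To illustrate option 1, here is a hedged NestJS/TypeORM sketch of a request-scoped provider that reuses one connection per schema instead of opening a new one per request (the provider token, the env var, and the tenantSchema request property are assumptions):

import { Provider, Scope } from '@nestjs/common';
import { REQUEST } from '@nestjs/core';
import { Request } from 'express';
import { Connection, createConnection, getConnectionManager } from 'typeorm';

// Reuse an existing connection for the schema if one was already created.
async function getTenantConnection(schema: string): Promise<Connection> {
  const manager = getConnectionManager();
  if (manager.has(schema)) {
    const existing = manager.get(schema);
    return existing.isConnected ? existing : existing.connect();
  }
  return createConnection({
    name: schema,
    type: 'postgres',
    url: process.env.DATABASE_URL, // assumed env var
    schema,                        // the tenant's schema
    entities: [/* shared entity model */],
  });
}

// Request-scoped provider: resolves the tenant connection per incoming request.
export const TENANT_CONNECTION: Provider = {
  provide: 'TENANT_CONNECTION',
  scope: Scope.REQUEST,
  inject: [REQUEST],
  useFactory: (req: Request) =>
    // tenantSchema is assumed to be set earlier, e.g. from the JWT.
    getTenantConnection((req as any).tenantSchema),
};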
For further research, here are some TypeORM GitHub issues that track the idea of changing the schema for existing connections or repositories at runtime (similar to what is requested in the OP):
Multi-tenant architecture using schema. #4786 proposes something like this.photoRepository.useSchema('customer1').find()
Handling of database schemas #3067 proposes something like getConnection().changeDefaultSchema('myschema')
Run-time change of schema #4473
Add an ability to set postgresql schema per call #2439
P.S. If TypeORM decides to support the idea discussed in the OP, I will try to update this answer.
Here is a global overview of the issues with schema-based multitenancy, along with a complete walkthrough of a GitHub repo for it.
Most of the time, you may want to use Postgres row-level security (row security policies) instead. It gives most of the benefits of schema-based multitenancy (especially for developer experience), without the issues related to the multiplication of connections.
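A hedged sketch of that pattern with node-postgres (the orders table, the tenant_id column, and the app.current_tenant setting are assumptions): a policy filters every query by a per-transaction setting that the application sets before querying.

import { Pool } from 'pg';

const pool = new Pool(); // connection details come from the usual PG* env vars

// One-time setup, e.g. in a migration (tenant_id is assumed to be text):
//   ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
//   CREATE POLICY tenant_isolation ON orders
//     USING (tenant_id = current_setting('app.current_tenant'));
// Note: the table owner bypasses RLS unless FORCE ROW LEVEL SECURITY is set.

async function findOrders(tenantId: string) {
  const client = await pool.connect();
  try {
    await client.query('BEGIN');
    // set_config(..., true) scopes the setting to this transaction only,
    // so concurrent requests on other connections are unaffected.
    await client.query("SELECT set_config('app.current_tenant', $1, true)", [tenantId]);
    const { rows } = await client.query('SELECT * FROM orders WHERE priority = $1', [4]);
    await client.query('COMMIT');
    return rows;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}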
Since commenting does not work for me, here is a hint from the NestJS documentation:
https://docs.nestjs.com/techniques/database#async-configuration
I am not using NestJS, but I am reading the docs at the moment to decide whether it's a fitting framework for us. We have an app where only some modules are multi-tenant with a schema per tenant, so using TypeOrmModule.forRootAsync(dynamicCreatedDbConfig) might be an option for me too.
This may help you if you have an interceptor or middleware that prepares the dynamicCreatedDbConfig data before...
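A minimal sketch of that async configuration, assuming the connection details are produced dynamically (the env vars are placeholders). Note that the factory runs once at bootstrap, so this helps with dynamic configuration, not with switching tenants per request:

import { Module } from '@nestjs/common';
import { TypeOrmModule } from '@nestjs/typeorm';

@Module({
  imports: [
    TypeOrmModule.forRootAsync({
      // The factory can build the config at runtime, e.g. from values that an
      // interceptor, middleware, or config service prepared beforehand.
      useFactory: async () => ({
        type: 'postgres' as const,
        url: process.env.DATABASE_URL,      // assumed env var
        schema: process.env.TENANT_SCHEMA,  // assumed env var
        autoLoadEntities: true,
      }),
    }),
  ],
})
export class AppModule {}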
I am working on developing a set of assemblies that encapsulate parts of our domain and that will be shared by many applications. Using the example of an order management system, one such assembly will contain all of the core operations an application can perform to/with an order. We are applying a simple version of CQS/CQRS, so all operations that change the state of the "system" are represented as public commands, such as CancelOrderCommand, ShipOrderCommand and CreateOrderCommand. The command handlers are internal to the assembly.
The question I am struggling to answer is how to best expose the read model to consuming code?
The read model will be used by consuming code to perform queries. I don't know all of the ways the read model will be used, so the interface needs to be flexible enough to allow any query.
What complicates it for me is that I not only need to expose my aggregate root, but there are also several "lookup" lists of related data that client applications may use. For example, each order has an associated OrderType, which is data-driven (i.e., not an enum) and contains several properties that drive some of our business rules controlling which operations can/cannot be performed, etc. It is easy to manage this relationship inside my module; however, a client application that allows order creation will most likely need to display the list of possible OrderTypes to the user. As a result, I need to expose not only the list of Order aggregates but also the supporting list of OrderTypes (and other lookup lists) from my read model.
How is this typically done?
I'm not sure what else to explain that will help trigger a solution, so please ask away...
I have never seen a CQRS-based implementation expose a full dataset for ad-hoc querying, so this is an interesting situation! In a typical CQRS scenario you would expose very specific queries, because you may want to raise events when they are called (for caching, for example - see this post for more details on that).
However, since this is your design, let's not worry about "typical" or "correct" CQRS; I guess you just need a solution! One of the best new mechanisms I have seen for exposing data for flexible querying is the Open Data Protocol (OData). It allows consumers to implement their own filtering, sorting and paging over a data source you expose.
Most implementations of this seem to deal with relational data. If you are dealing with a relational data source, then OData might be a nice way to go. I suspect from your comment "expose my aggregate root" that you might be using a document database? If so, there is one example I have seen of OData services on top of MongoDB: http://bloggingabout.net/blogs/vagif/archive/2012/10/11/mongodb-odata-provider-now-supports-arrays-and-nested-collections.aspx.
I hope that helps, OData is definitely worth looking into. It seems to be growing really quickly and is getting good support on both server and client technology platforms.
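To give a flavor of the consumer side, here is a tiny hand-rolled sketch of OData-style $skip/$top/$orderby handling over an in-memory list (Express; the data is a stand-in, and a real OData library would also give you $filter, metadata, and so on):

import express from 'express';

const app = express();
const orders = [{ id: 1, priority: 4 }, { id: 2, priority: 1 }]; // stand-in read model

// e.g. GET /orders?$orderby=priority&$skip=0&$top=10
app.get('/orders', (req, res) => {
  const { $orderby, $skip, $top } = req.query as Record<string, string | undefined>;
  let result = [...orders];
  if ($orderby) {
    result.sort((a: any, b: any) => (a[$orderby] > b[$orderby] ? 1 : -1));
  }
  const skip = Number($skip ?? 0);
  const top = Number($top ?? result.length);
  result = result.slice(skip, skip + top);
  res.json(result);
});

app.listen(3000);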
I'm using Entity Framework 5 on an ASP.NET MVC 4 web site I'm developing.
Because I am using shared hosting that charges for the number of databases I use, I would like to run a test site alongside my production site.
I have two problems:
1) I use Code First and Database Migrations. The migration classes seem to embed the dbo schema inside the table names.
How can I change the schema according to a test/production flag?
2) How can I change the schema from which EF selects data?
Thank you,
Ido.
Both migrations and EF take the schema from the mapping, so if you want to change the schema you must update your mapping to use:
modelBuilder.Entity<MyEntity>().ToTable("MyTable", "MySchema");
and control the value of MySchema from configuration, but this is really a bad idea. One day you will forget to change the value and break your production. Use a local database for development and testing.
As already said: use identical databases (structurally) for development, test and production.
The goal of schemas is to group database objects, as we do with namespaces in e.g. C#, or to simplify permissions for groups of database objects - not to identify database stages. By using them for the latter, you also make it much harder, if not impossible, to use schemas appropriately. See for instance this MSDN white paper.
It is much easier to use database naming conventions to indicate each database's purpose.
I have come across a requirement to access a set of databases on a MongoDB server using the TurboGears framework. I need to list the databases and allow the user to select one and move on. As far as I can tell, TurboGears does support multiple databases, but they need to be specified beforehand in development.ini.
Is there a way to just connect to the db server (or to a particular database first) and then get the list of databases and select one on the fly?
For SQLAlchemy you can achieve something like that using a smarter Session.
Just subclass the sqlalchemy.orm.Session class and override the get_bind(self, mapper=None, clause=None) method.
That method is called each time the session has to decide which engine to use and is expected to return the engine itself. You can then store a list of engines wherever you prefer and return the correct one.
When using Ming/MongoDB, the same can probably be achieved by subclassing ming.Session in model/session.py and overriding the ming.Session.db property to return the right database.