JPA: how to map some entities to a different schema of another database instance?

JPA: is there a way to map some entities to a schema of another database instance? e.g.,
@Entity
public class Foo {
}
@Entity
@Table(schema="schema1")
public class Bar {
}
The Bar entity is mapped to schema1 of the same database instance. Is there a way in JPA to map it to a schema in a remote database instance? This would be useful for sharing entities among multiple applications.
Can the "catalog" be used for this purpose?

What do you mean by 'remote database'?
If you use @Table(schema = "myschema", name = "bar"), Hibernate will qualify all queries with the schema name (e.g. the JPQL query SELECT b FROM Bar b will ultimately translate to SELECT * FROM myschema.bar). If the database user you're using to connect to the DB has access to myschema.bar (whatever kind of DB object that is), then the query will work; if not, it will fail.
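For illustration, here is a minimal sketch of that mapping (the schema name myschema and table name bar are just the placeholders used above). Note that @Table also has a catalog attribute, but it is likewise resolved over the persistence unit's single JDBC connection, so it does not reach a separate server either.

import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

// Mapped to myschema.bar: every generated SQL statement is
// qualified with the schema name, e.g. SELECT ... FROM myschema.bar
@Entity
@Table(schema = "myschema", name = "bar")
public class Bar {

    @Id
    private Long id;

    // getters/setters omitted
}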
If you mean 'a remote DB that is a separate server', then, of course, you can only connect to the DB using one JDBC connection per persistence context. If that's your scenario, perhaps you should consult the docs of the RDBMS for ways to connect two DB instances (in Oracle, for example, you could use database links and synonyms).
Make sure that you understand the implications, though, as such a solution introduces its own class of problems (including the fact that you suddenly have implicit distributed transactions in your system).
As a side note, I'm not sure how such an approach is 'useful for sharing entities among multiple applications' or why one would even think 'sharing entities among multiple applications' is somehow useful, but I'd seriously think through the idea of integrating multiple application via shared/linked DBs. It usually introduces more problems than it solves.

If I understand what you mean correctly, you should use two (or more) different persistence contexts.
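As a minimal sketch (the persistence unit names main-pu and remote-pu are placeholders for units you would declare in persistence.xml, each with its own connection settings), every unit gets its own EntityManagerFactory and therefore its own database connection:

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class TwoPersistenceContexts {

    public static void main(String[] args) {
        // one factory per persistence unit; each unit can point at a
        // different database instance or schema in persistence.xml
        EntityManagerFactory mainEmf = Persistence.createEntityManagerFactory("main-pu");
        EntityManagerFactory remoteEmf = Persistence.createEntityManagerFactory("remote-pu");

        EntityManager mainEm = mainEmf.createEntityManager();
        EntityManager remoteEm = remoteEmf.createEntityManager();

        // entities mapped in main-pu are managed by mainEm,
        // entities mapped in remote-pu by remoteEm

        mainEm.close();
        remoteEm.close();
        mainEmf.close();
        remoteEmf.close();
    }
}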

Related

TypeORM: Dynamically set database schema for EntityManager (or repositories) at runtime?

Situation:
For our SaaS API we use schema-based multitenancy, which means every customer (~tenant) has its own separate schema within the same (postgres) database, without interfering with other customers. Each schema consists of the same underlying entity-model.
Every time a new customer is registered to the system, a new isolated schema is automatically created within the DB. This means the schema is created at runtime and is not known in advance. The customer's schema is named after the customer's domain.
For every request that arrives at our API, we extract the user's tenancy-affiliation from the JWT and determine which db-schema to use to perform the requested db-operations for this tenant.
Problem
After having established a connection to a (postgres) database via TypeORM (e.g. using createConnection), our only option for setting the schema for a DB operation is to resort to createQueryBuilder:
const orders = await this.entityManager
  .createQueryBuilder()
  .select()
  .from(`${tenantId}.orders`, 'order') // <--- setting schema-prefix here
  .where("order.priority = 4")
  .getMany();
This means we are forced to use the QueryBuilder, as it does not seem possible to set the schema when working with the EntityManager API (or the Repository API).
However, we want/need to use these APIs, because they are much simpler to write, require less code, and are less error-prone, since they do not rely on writing queries "manually" with a string-based syntax.
Question
In case of TypeORM, is it possible to somehow set the db-schema when working with the EntityManager or repositories?
Something like this?
// set schema when instantiating manager
const manager = connection.createEntityManager({ schema: tenantDomain });
// should find all matching "order" entities within schema
const orders = manager.find(Order, { priority: 4 })
// should find a matching "item" entity within schema using same manager
const item = manager.findOne(Item, { id: 321 })
Notes:
The db-schema needs to be set in a request-scoped way to avoid setting the schema for other requests, which may belong to other customers. Setting the schema for the whole connection is not an option.
We are aware that one could create a whole new connection and set the schema for this connection, but we want to reuse the existing connection. So simply creating a new connection to set the schema is not an option.
To answer my own question:
At the moment there is no way to instantiate TypeORM repositories with different schemas at runtime without creating new connections.
So the only two options a developer is left with for schema-based multi-tenancy are:
Setting up new connections to connect with different schemas within the same DB at runtime. E.g. see NestJS Request Scoped Multitenancy for Multiple Databases. However, one should definitely strive to reuse connections and be aware of connection limits.
Abandoning the idea of working with the Repository API and reverting to using createQueryBuilder (or executing SQL queries via query()).
For further research, here are some TypeORM GitHub issues that track the idea of changing the schema for existing connections or repositories at runtime (similar to what is requested in the OP):
Multi-tenant architecture using schema #4786 proposes something like this.photoRepository.useSchema('customer1').find()
Handling of database schemas #3067 proposes something like getConnection().changeDefaultSchema('myschema')
Run-time change of schema #4473
Add an ability to set postgresql schema per call #2439
P.S. If TypeORM decides to support the idea discussed in the OP, I will try to update this answer.
Here is a global overview of the issues with schema-based multitenancy, along with a complete walkthrough and a GitHub repo for it.
Most of the time, you may want to use Postgres row-level security policies instead. This gives most of the benefits of schema-based multitenancy (especially on the developer-experience side), without the issues related to the multiplication of connections.
Since commenting does not work for me, here is a hint from the NestJS documentation:
https://docs.nestjs.com/techniques/database#async-configuration
I am not using NestJS but am reading the docs at the moment to decide if it's a fitting framework for us. We have an app where only some modules have multi-tenancy with a schema per tenant, so using TypeOrmModule.forRootAsync(dynamicCreatedDbConfig) might be an option for me too.
This may help you if you have an interceptor or middleware which prepares the dynamicCreatedDbConfig data before...

What's the point of running an EF migration when you can run SQL directly in the database?

How to create View (SQL) from Entity Framework in ABP Framework
I'm not allowed to post comments because of reputation. I'm just trying to get more information on connecting an existing database to Entity Framework without having to switch to a code-first development style. See the selected answer's response (he told the OP to basically do the same thing he was going to do in the DB, but with EF, and then added an extra step where EF "...ignores..." the previous instructions...).
I want to create tables and design the database directly in SQL, and have the C# library just read/write the table values (kind of like how Dapper functions, where it isn't replacing your database, just working alongside it).
The tutorials don't talk about how to integrate your existing database with your project. They either brush over the subject, ignore it completely, or discuss how to replace it.
I don't want to do any EF migrations (I don't want/need to destroy/create the database every time I decide to run, duplicate, or transfer the project). Any and all database back-tracking (backup/restore) should be done with and through SQL (within my work environment).
Just to be clear on exactly what I'm trying to learn:
How does somebody who specializes in database administration (building the database schema, managing and monitoring data, with an existing database full of data) connect it to a project to fetch data (again, specifically referencing Dapper's Query functionality)?
I want to design and integrate microservices; some may share the same database connection or rely on another. But I simply want to read data into clean, strongly-typed entity classes, and maybe deal with insert/update somewhere else if I have to.
I would prefer to use Dapper instead of EF, but ABP is so heavily integrated with EF's design that it's more of a headache to avoid it than to just go along with it.
You should be able to map EF under ABP the same way as any other project using DB-first configuration.
The consistent approach I use for EF: (DB-First)
Define entities to match the table/view structure.
Define configuration classes extending EntityTypeConfiguration<TEntity> with the associated ToTable(), HasKey(), and any HasMany/HasRequired/HasOptional for relationships as needed.
In DbContext.OnModelCreating: modelBuilder.Configurations.AddFromAssembly(GetType().Assembly); to load all entity configurations (assuming the DbContext is in the same assembly as the models/configurations; otherwise substitute GetType().Assembly to point at the entity assembly).
Turn off Migrations. In DbContext constructor: Database.SetInitializer<MyDbContext>(null);
EF offers a lot more than simply mapping tables to classes. By mapping relationships between entities, EF can help generate optimized queries for retrieving data across those related entities. This can allow you to flatten data structures without returning unnecessary data, replace the need for views, and generally reduce the amount of data coming across the wire from the database to the application server.

Is it good practice to use AccessBean or SQL to fetch data from OOTB table in IBM WCS

I want to get data from multiple OOTB WCS tables for which there is no OOTB REST resource available. I am using multiple access beans in a databean to get data from the tables. Is this a good practice, or should we use ServerJDBCHelperAccessBean to make a single query with joins to hit the database? I understand that access beans are cached, but there are also techniques to cache SQL.
Is there any other reason to use access beans instead of ServerJDBCHelperAccessBean when fetching data from multiple tables, or should we use ServerJDBCHelperAccessBean and get the data in a single SQL query with joins?
And which of the above approaches will be more expensive?
Thanks
Ankit
There is no hard and fast rule for choosing between the above two methods for database interactions. The developer has to make a logical choice.
AccessBeans
Caching is one of the advantages of access beans. It is a good performance improvement and is achieved by caching the home objects, as the lookup for home objects is costly. Another point in favour of access beans is the handling of optimistic updates. Your case is to get data (not to update/insert), so you are safe here.
Session Bean
Like access beans, session beans are another way of reading data from the DB when you want to get data from multiple tables. The session bean must extend the BaseJDBCHelper class and implement SessionBean:
public class TestSessionBean extends com.ibm.commerce.base.helpers.BaseJDBCHelper
        implements SessionBean {

    public Object fetchResults() throws javax.naming.NamingException, SQLException {
        Object results = null;
        try {
            // get a connection from the WebSphere Commerce data source
            makeConnection();
            PreparedStatement prepStatement = getPreparedStatement("sql to execute");
            ResultSet rs = executeQuery(prepStatement, false);
            // read the rows you need from rs into results here,
            // while the connection is still open
        } finally {
            closeConnection();
        }
        return results;
    }
}
Using ServerJDBCHelperAccessBean
This is used when you have to make a DB transaction outside of EJBs. Keep in mind that it is highly recommended to use EJBs for updates/deletes to keep the overall integrity.
In your case, as far as I understand, it is a select involving multiple tables and you are not too concerned about the data being perfectly in sync (i.e. you are OK with missing a piece of data that was updated nanoseconds ago). Hence you can go ahead with the second or third approach.
A good reference:
http://deepakpadmakumar.blogspot.com.au/2012/05/session-beans-and-entity-beans-in-wcs.html
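For completeness, here is a rough sketch of the ServerJDBCHelperAccessBean approach. Treat the method name executeParameterizedQuery (and the placeholder SQL and column names) as assumptions to be verified against the Javadoc of your WCS version, not as a definitive API reference.

import java.util.List;

import com.ibm.commerce.base.objects.ServerJDBCHelperAccessBean;

public class JoinedDataFetcher {

    // Fetch data from several OOTB tables with one joined query.
    // NOTE: executeParameterizedQuery is the method name as I recall it;
    // verify it against the ServerJDBCHelperAccessBean Javadoc of your
    // WCS version before relying on it.
    public List fetchJoinedData(Long memberId) throws Exception {
        ServerJDBCHelperAccessBean jdbcHelper = new ServerJDBCHelperAccessBean();
        String sql = "SELECT T1.COL_A, T2.COL_B FROM TABLE1 T1, TABLE2 T2 "
                   + "WHERE T1.SOME_ID = T2.SOME_ID AND T1.MEMBER_ID = ?";
        return jdbcHelper.executeParameterizedQuery(sql, new Object[] { memberId });
    }
}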

Is it possible to use a single transaction (in EF) with two different contexts pointing at different schemas?

I'm currently designing an application where I need to use two different database schemas (on the same instance): one as the application base, and the other to customize the application and the fields for each customer.
Since I read something about the Repository pattern, and as I understand it's possible to use two different contexts without losing efficiency, I'm now asking whether I can use a single database transaction across two schemas with Entity Framework, as I'm currently doing directly on the database (SQL Server 2008-2012).
Sorry for my English, and thanks in advance!
If your connection strings are the same (which in your case they will be, as you only have different schemas for the different contexts), then you are OK with this approach.
Basically you will have two different contexts that will be connected via the same connection string to the database and which will represent two different schemas.
using (var scope = new TransactionScope())
{
    using (var contextSO = new ContextSchemaOne())
    {
        // Add, remove, change entities from context schema one
        contextSO.SaveChanges();
    }

    using (var contextST = new ContextSchemaTwo())
    {
        // Add, remove, change entities from context schema two
        contextST.SaveChanges();
    }

    scope.Complete();
}
I wasn't very successful in the past with this approach, and we switched to one context per database.
Further reading: Entity Framework: One Database, Multiple DbContexts. Is this a bad idea?
Maybe it's better to read something about unit of work before taking a decision about this.
You will have to do something like this: Preparing for multiple EF contexts on a unit of work - TransactionScope

two persistence units for two schemas?

I'm using Oracle DB as the RDBMS, and I want to access two database schemas from my JSF2 application.
So I think I must use two <persistence-unit> elements in my persistence.xml?
If accessing two database schemas just means that some of the entities should be in a different schema, that can easily be done with the @Table annotation:
@Entity
@Table(schema="someotherschemathandefault")
public class EntityInOtherSchema {
...
}
If those schemas need different credentials for access (or different datasources to be used), then defining two persistence units is the way to go.
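A small sketch of that case (unit names, URL, and credentials below are placeholders): the standard javax.persistence.jdbc.* properties can be set per persistence unit in persistence.xml, or passed programmatically as shown here.

import java.util.HashMap;
import java.util.Map;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;

public class TwoSchemasTwoUnits {

    public static void main(String[] args) {
        // first unit uses the settings from persistence.xml as-is
        EntityManagerFactory defaultEmf =
                Persistence.createEntityManagerFactory("default-pu");

        // second unit overrides the connection properties to use the
        // credentials of the other schema
        Map<String, String> props = new HashMap<String, String>();
        props.put("javax.persistence.jdbc.url", "jdbc:oracle:thin:@dbhost:1521:ORCL");
        props.put("javax.persistence.jdbc.user", "otherschema_user");
        props.put("javax.persistence.jdbc.password", "secret");
        EntityManagerFactory otherEmf =
                Persistence.createEntityManagerFactory("other-pu", props);

        // ... obtain EntityManagers from each factory as needed ...

        otherEmf.close();
        defaultEmf.close();
    }
}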