Setting default schema for Vertica database - JPA

I am building a web application using Play! with a Vertica database as the back-end. The JDBC connection string for Vertica contains the server and database name, but my tables are under a specific schema (say "dev_myschema"), so I have to refer to a table as "dev_myschema.mytable". There is an exact copy of all these tables in a production schema as well (say "prod_myschema") with real data.
I would like to set this schema name in the configuration file so that it is easy to switch between these two schemas. For now, I have a getConnection method in a helper class that calls DB.getConnection() and sets the configured schema as the default schema for that connection object. However, this does not help in the model classes, where the schema is hard-coded in the entity annotations (@Entity @Table(name = "dev_myschema.mytable")).
Is there a way by which I can specify the schema name in the configuration file and have it read by the connection method as well as the model annotations?
Thanks.

Eugene got it almost correct, but was missing an underscore. The correct Vertica SQL syntax to set the default schema is:
set search_path to dev_myschema
As Eugene suggested, if you are using low-level JDBC, as soon as you create your Connection object you can do:
conn.createStatement().executeUpdate("set search_path to " + schemaName);
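
For reference, here is a minimal sketch of that approach as a small helper class; the Vertica driver class name, connection URL, credentials, and the system property used for the schema are illustrative assumptions, not something from the original post:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class VerticaConnectionHelper {

    // Returns a connection whose default schema is the configured one.
    public static Connection getConnection() throws Exception {
        // Hypothetical connection details; replace with your own configuration.
        String url = "jdbc:vertica://dbhost:5433/mydb";
        String schema = System.getProperty("app.schema", "prod_myschema");

        Class.forName("com.vertica.jdbc.Driver");
        Connection conn = DriverManager.getConnection(url, "dbuser", "dbpassword");

        // Make the configured schema the default for this connection.
        try (Statement stmt = conn.createStatement()) {
            stmt.executeUpdate("set search_path to " + schema);
        }
        return conn;
    }
}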

As far as I'm aware (and I just scanned the 4.1.7 documentation), there is no way as of yet to set a schema as the default.

According to the SQL guide, the default schema is the first one found in your search path. Maybe you could exploit that and make sure your copy is found first.

The way I handle this issue is by executing a "set search path" command if I am using my development schema. So, as soon as your Vertica connection object is created, execute the following command:
"set search path to dev_myschema"
In my application code, I just have my Vertica object check an environment/config variable, and if the "dev schema" setting is present, it executes that statement upon establishing the connection. My production config doesn't have that setting, so it will just use the default schema in that case and not incur the additional overhead of executing that statement every time.
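
As a rough sketch of that conditional (the environment variable name here is made up for illustration, and only the development configuration would define it):

import java.sql.Connection;
import java.sql.Statement;

public class SchemaConfigurer {

    // Override the search_path only when a dev schema is configured;
    // production connections keep their default schema.
    public static void applyConfiguredSchema(Connection conn) throws Exception {
        String devSchema = System.getenv("DEV_SCHEMA"); // hypothetical setting
        if (devSchema != null && !devSchema.isEmpty()) {
            try (Statement stmt = conn.createStatement()) {
                stmt.executeUpdate("set search_path to " + devSchema);
            }
        }
    }
}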

In Vertica 7.0, an admin can set it at the user level by issuing the command below:
alter user user_name search_path schema1,schema2;

Related

TypeORM: Dynamically set database schema for EntityManager (or repositories) at runtime?

Situation:
For our SaaS API we use schema-based multitenancy, which means every customer (~tenant) has its own separate schema within the same (postgres) database, without interfering with other customers. Each schema consists of the same underlying entity-model.
Every time a new customer is registered to the system, a new isolated schema is automatically created within the db. This means the schema is created at runtime and not known in advance. The customer's schema is named according to the customer's domain.
For every request that arrives at our API, we extract the user's tenancy-affiliation from the JWT and determine which db-schema to use to perform the requested db-operations for this tenant.
Problem
After having established a connection to a (postgres) database via TypeORM (e.g. using createConnection), our only chance to set the schema for a db-operation is to resort to the createQueryBuilder:
const orders = await this.entityManager
  .createQueryBuilder()
  .select()
  .from(`${tenantId}.orders`, 'order') // <--- setting schema-prefix here
  .where("order.priority = 4")
  .getMany();
This means we are forced to use the QueryBuilder, as it does not seem to be possible to set the schema when working with the EntityManager API (or the Repository API).
However, we want/need to use these APIs, because they are much simpler to write, require less code and are also less error-prone, since they do not rely on writing queries "manually" employing a string-based syntax.
Question
In case of TypeORM, is it possible to somehow set the db-schema when working with the EntityManager or repositories?
Something like this?
// set schema when instantiating manager
const manager = connection.createEntityManager({ schema: tenantDomain });
// should find all matching "order" entities within schema
const orders = manager.find(Order, { priority: 4 })
// should find a matching "item" entity within schema using same manager
const item = manager.findOne(Item, { id: 321 })
Notes:
The db-schema needs to be set in a request-scoped way to avoid setting the schema for other requests, which may belong to other customers. Setting the schema for the whole connection is not an option.
We are aware that one could create a whole new connection and set the schema for this connection, but we want to reuse the existing connection. So simply creating a new connection to set the schema is not an option.
To answer my own question:
At the moment there is no way to instantiate TypeORM repositories with different schemas at runtime without creating new connections.
So the only two options that a developer is left with for schema-based multi-tenancy are:
Setting up new connections to connect with different schemas within the same db at runtime. E.g. see NestJS Request Scoped Multitenancy for Multiple Databases. However, one should definitely strive for reusing connections and be aware of connection limits.
Abandoning the idea of working with the RepositoryApi and reverting to using createQueryBuilder (or executing SQL queries via query()).
For further research, here are some TypeORM GitHub issues that track the idea of changing the schema for existing connections or repositories at runtime (similar to what is requested in the OP):
Multi-tenant architecture using schema. #4786 proposes something like this.photoRepository.useSchema('customer1').find()
Handling of database schemas #3067 proposes something like getConnection().changeDefaultSchema('myschema')
Run-time change of schema #4473
Add an ability to set postgresql schema per call #2439
P.S. If TypeORM decides to support the idea discussed in the OP, I will try to update this answer.
Here is a global overview of the issues with schema-based multitenancy, along with a complete walkthrough and a GitHub repo for it.
Most of the time, you may want to use Postgres Row Security Policy instead. It gives most of the benefits of schema-based multitenancy (especially on developer experience), without the issues related to the multiplication of connections.
Since commenting does not work for me, here is a hint from the documentation of NestJS:
https://docs.nestjs.com/techniques/database#async-configuration
I am not using NestJS but am reading the docs at the moment to decide if it's a fitting framework for us. We have an app where only some modules have multi-tenancy with a schema per tenant, so using TypeOrmModule.forRootAsync(dynamicCreatedDbConfig) might be an option for me too.
This may help you if you have an interceptor or middleware that prepares the dynamicCreatedDbConfig data before...

Trouble with Multi-Tenant Schema Generator Example

We are attempting to use CFE to generate one schema for each tenant as outlined in the CodeFluent blog post (http://blog.codefluententities.com/2014/12/04/multi-tenant-using-multiple-schema/). In this scenario, we are expecting each generated schema to be identical, and we are using the ICodeFluentPersistence Hook system to identify the company for a user and then properly set the schema to be used. All of that works fine, but when we run the code to generate the multiple schemas (https://github.com/SoftFluent/CodeFluent-Entities/tree/master/Extensions/SoftFluent.MultiTenantGenerator), it removes the constraints. I then tried to see if there was an issue with my configuration, but running the sample program from GitHub produces the same results. After running the sample program, the primary key was not present in the contoso schema, even though it was properly defined in the dbo schema (and in the model).
Has anyone used the CFE Multi-Schema generator or have any insight into what the issue may be?
Thanks for your response, but I am not sure that I agree. The whole reason (at least for me) to use the multi-tenant generator is to create as many database schemas as needed (one per client) from a single CFE model. The idea that you would lose the constraints in all but one of them didn't feel right, so I did a bit more investigation and found in "Microsoft SQL Server 2012 Internals" by Kalen Delaney and Craig Freeman (through Google Books) that constraint names only need to be unique within their schema, not across the whole database.
I was in fact able to do a quick test to prove this out by creating two identical tables with identical PK names in two different schemas.
So it would appear to me that CFE should be able to create the two identical schemas from the same model, and this seems to point to a deficiency in the SQL Server diff engine.
The multi-schema generator loads the model and changes it dynamically to modify the schema of the entities. Then it calls the standard code production process with only the database producers (SQL Server, Oracle, etc.).
So if you want to generate 2 different schemas (dbo and contoso) against an empty database, the process is the following:
Generate the database for the dbo schema from a blank database
Generate the database for the contoso schema from the previously generated database
Before creating a constraint, the SQL Server diff engine drops any existing constraint with the same name. In fact, SQL Server does not allow 2 constraints to have the same name (I can't find a page on MSDN with more details about that). So in your case the existing PK is dropped when you generate the contoso schema, because the name of the PK is the same as the one that exists in the dbo schema. Maybe this can be improved, but the diff engine tries to generate code that works for SQL Server 2000 to SQL Server 2016.
Workarounds
You can generate each schema in a different database, so the diff engine will generate the code you expect. Then you can run the generated scripts on the production database. Not the easiest way, but it should work.
You can use the patch producer to replace the name of the schema in the file. For SQL files you should use the SqlServerPatchProducer as explained in the Knowledge Base:
namespace Sample
{
    public class SqlServerPatchProducer : SqlServerProducer
    {
        public SqlServerPatchProducer()
        {
        }

        protected override void RunProceduresScript()
        {
            string path = GetPath(Project.DefaultNamespace + "_procedures.sql");
            ProduceFrom(path, "before");
            SearchAndReplaceProducer.ProducePatches(Project, null, this, null, ProductionFlags, Element);
            Utilities.RunFileScript(path, Database, OutputEncoding);
            ProduceFrom(path, "after");
        }
    }
}

JPA table capitalization inconsistent

We're using a very basic JPA implementation that should create tables consistently from our models.
I believe we're using EclipseLink or TopLink (whichever one is the default with the latest Netbeans/Glassfish). The problem is, the tables are created with inconsistent capitalization and with the columns out of order. For me, it creates the "User" table as "user", and for other members of my team it creates "USER".
I've tried using the @Table annotation (@Table(name="USer")), but it doesn't work.
How do we get EclipseLink to generate consistent table names? Frankly this seems like a rather amateurish mistake for a framework like this.
Sub-question: the reason this is a problem is that EclipseLink has no built-in way of managing schema/data migrations, as far as I know. The way we're handling it is by writing a bunch of INSERT INTOs to bootstrap the objects we need in our database, and dropping and recreating the tables every time the schema changes. I know this is not the best practice for propagating schema changes -- does anyone know how this is typically handled in a standard JPA implementation?
Thanks.
By default EclipseLink uses all upper case for the table name, so the class User would become USER.
If you specify an #Table annotation with name="USer", then the table will be created as "USer".
Perhaps you are using your own scripts to create the tables, or your database is changing the case based on the OS or its own settings. What database are you using?
If you enable logging in EclipseLink, it will show the exact DDL that it is executing (if it is executing DDL).
In EclipseLink 2.4 there is also a "create-or-extend-tables" DDL generation option to alter existing tables.
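
To tie those pieces together, here is a small sketch; the entity, persistence-unit name, and chosen property values are illustrative assumptions, although eclipselink.ddl-generation and eclipselink.logging.level are standard EclipseLink persistence-unit properties:

import java.util.HashMap;
import java.util.Map;
import javax.persistence.Entity;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Id;
import javax.persistence.Persistence;
import javax.persistence.Table;

// With an explicit @Table name, EclipseLink emits this name in the DDL
// instead of the default upper-cased class name (USER).
@Entity
@Table(name = "USer")
class User {
    @Id
    private Long id;
}

public class EclipseLinkSetup {

    public static EntityManagerFactory createFactory() {
        Map<String, String> props = new HashMap<>();
        // Extend existing tables instead of dropping and recreating them (EclipseLink 2.4+).
        props.put("eclipselink.ddl-generation", "create-or-extend-tables");
        // Log the SQL/DDL that EclipseLink actually executes.
        props.put("eclipselink.logging.level", "FINE");
        // "example-unit" is a hypothetical persistence-unit name.
        return Persistence.createEntityManagerFactory("example-unit", props);
    }
}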
We never found a good answer for this. Luckily, we found a workaround: the method we were using to update the tables didn't care about capitalization.

Change Schema of Entity Framework

I'm using Entity Framework 5 on an ASP.NET MVC 4 web site I'm developing.
Because I am using shared hosting, which charges for the number of databases I use, I would like to run a test site alongside my production site.
I have two problems:
1) I use Code First and Database Migrations. The migration classes seem to embed the dbo schema inside the table names.
How can I change the schema according to the test/production flag?
2) How can I change the schema from which EF selects data?
Thank you,
Ido.
Both migrations and EF take the schema from the mapping, so if you want to change the schema you must update your mapping to use:
modelBuilder.Entity<MyEntity>().ToTable("MyTable", "MySchema");
and control the value of MySchema from configuration, but this is a really bad idea. One day you will forget to change the value and break your production. Use a local database for development and testing.
As already said: use identical databases (structurally) for development, test and production.
The goal of schemas is to group database objects, like we do with namespaces in e.g. C#, or to simplify permissions for groups of database objects, not to identify database stages. By using them for the latter you also make it much harder, if not impossible, to use schemas appropriately. See for instance this MSDN white paper.
It is much easier to use some database name conventions to indicate their purpose.

Does the Entity Framework use a default initial catalog, and what assumptions does it make?

I did (pretty much) everything correctly in a new EF project, but I forgot to use the named connection string in the EF context class, so it used the default.
It created a new database inside the SQL Express default data directory, and it worked perfectly.
When I realised my mistake (after wondering for ages why no files were showing up in the app_data folder), I renamed the class to use the named connection string, and then I kept getting the following error:
Unable to complete operation. The supplied SqlConnection does not specify an initial catalog.
I know how to fix this, but, EF is like magic to me! I can't believe it works as well as it does and I am just curious as to what it uses by default / is there a list anywhere of "assumptions" that EF uses on your behalf if you specify nothing?
By default it uses a database with the same name as your context, but once you specify a custom named connection string you must provide the name of the database to use, either via the Initial Catalog or the Database parameter.
I had the same problem but I figured it out by changing the web.config. You need to specify an initial catalog and Integrated Security, and make a user instance.
Here is a working web.config entry:
add name="MovieDBContext"
connectionString="Data Source=.\SQLEXPRESS;AttachDbFilename=|DataDirectory|Movies.mdf;Initial Catalog=Movies;Integrated Security=SSPI;User Instance=true"
providerName="System.Data.SqlClient"