How to implement multi-tenancy using spring-data-mongodb

I am new to multi-tenancy with MongoDB using spring-data-mongodb. We need to use spring-data-mongodb for REST APIs and for scheduled tasks (we have more than one scheduler in our application) in the same codebase, in a thread-safe way. Will autowiring mongoTemplate make the application thread-safe, given that the same mongoTemplate will be accessed from both the schedulers and the APIs? Please point me to good practice for such a situation.
Regards
Kris

MongoTemplate itself is thread-safe: you can call it from multiple threads at the same time and it will correctly send the different requests to MongoDB.
But that doesn't guarantee consistency: if the scheduler is running and executes multiple updates in the same task, an API call can possibly get some updated records and some other records that aren't updated yet.
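For the thread-safety part of the question, sharing a single autowired MongoTemplate between a scheduler and a REST controller is fine. A minimal sketch, assuming a made-up devices collection and Device class (and @EnableScheduling on the configuration):

import java.util.List;

import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.core.query.Update;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

// Plain mapped document; fields match the hypothetical "devices" collection.
class Device {
    String id;
    long lastSeen;
    boolean stale;
}

// Both beans get the same MongoTemplate instance injected. That is safe:
// MongoTemplate keeps no per-request mutable state.
@Component
class StaleDeviceScheduler {

    private final MongoTemplate mongoTemplate;

    StaleDeviceScheduler(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    @Scheduled(fixedRate = 60_000)
    void markStaleDevices() {
        mongoTemplate.updateMulti(
                new Query(Criteria.where("lastSeen").lt(System.currentTimeMillis() - 60_000)),
                new Update().set("stale", true),
                "devices");
    }
}

@RestController
class DeviceController {

    private final MongoTemplate mongoTemplate;

    DeviceController(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    @GetMapping("/devices/stale")
    List<Device> staleDevices() {
        // Safe to run concurrently with the scheduler, but as noted above it
        // may observe a scheduler run that is only partially applied.
        return mongoTemplate.find(
                new Query(Criteria.where("stale").is(true)), Device.class, "devices");
    }
}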
By the way: multi-tenancy is having data from multiple organisational entities in the same database. I'm not sure how that links to your question, did you mean multi-threading?

If you use different databases, then you can't use an autowired MongoTemplate.
For autowiring, there must be a single instance, but since the database connection string is a dependency of a MongoTemplate, there must be a single database as well.
You could go for an approach where you do not auto-wire the MongoTemplate directly, but use some sort of factory pattern to create the correct MongoTemplate for the current tenant. See Making spring-data-mongodb multi-tenant for some examples. (It's an old question, but its answers get updated every now and then).
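To make the factory idea concrete, here is a rough sketch (not taken from that question; the client URL, the database-per-tenant naming, and how the current tenant is resolved are all assumptions):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import org.springframework.data.mongodb.core.MongoTemplate;

// Hands out one MongoTemplate per tenant, each bound to its own database.
// Whether the tenant id comes from an HTTP header, a JWT claim, or the
// scheduler's own configuration is up to the application.
public class TenantTemplateFactory {

    private final MongoClient client = MongoClients.create("mongodb://localhost:27017");
    private final Map<String, MongoTemplate> templates = new ConcurrentHashMap<>();

    public MongoTemplate templateFor(String tenantId) {
        // computeIfAbsent keeps this thread-safe when API threads and
        // schedulers ask for the same tenant concurrently.
        return templates.computeIfAbsent(tenantId,
                id -> new MongoTemplate(client, "tenant_" + id));
    }
}

Each MongoTemplate returned by the factory is again thread-safe, so schedulers and API calls for the same tenant can share it.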
Or you could go with an infrastructural solution, and deploy separate instances of your application, one for each tenant, e.g. on Kubernetes.

Related

How to implement batch insert using spring-data-jdbc

Is it possible to implement batch insert using spring-data-jdbc somehow? Or can I get access to a JdbcTemplate using this Spring Data implementation?
There is currently no support for batch operations.
There are two issues requesting that feature which one might want to follow: https://jira.spring.io/browse/DATAJDBC-328 and https://jira.spring.io/browse/DATAJDBC-314
If one is working with Spring Data JDBC, there will always be a NamedParameterJdbcTemplate in the application context, so one can get that injected in order to perform batch operations without any additional configuration.
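A minimal sketch of such a batch insert (the table, its columns, and the Person bean are made up):

import java.util.List;

import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;
import org.springframework.jdbc.core.namedparam.SqlParameterSource;
import org.springframework.jdbc.core.namedparam.SqlParameterSourceUtils;
import org.springframework.stereotype.Repository;

@Repository
public class PersonBatchRepository {

    private final NamedParameterJdbcTemplate template;

    public PersonBatchRepository(NamedParameterJdbcTemplate template) {
        this.template = template;
    }

    public int[] insertAll(List<Person> people) {
        // Sends the whole list as one JDBC batch instead of one statement per row.
        SqlParameterSource[] batch = SqlParameterSourceUtils.createBatch(people.toArray());
        return template.batchUpdate(
                "INSERT INTO person (id, name) VALUES (:id, :name)", batch);
    }

    // Simple bean whose getters supply the named parameters above.
    public static class Person {
        private final Long id;
        private final String name;

        public Person(Long id, String name) {
            this.id = id;
            this.name = name;
        }

        public Long getId() { return id; }
        public String getName() { return name; }
    }
}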

Optimistic Locking in Spring Data JDBC

I noticed that Spring Data JDBC doesn't seem to implement optimistic locking (something like JPA's @Version annotation).
I was thinking of creating a @Modifying query which considers the version field and returns a boolean, so I can check manually whether the update was successful or not. But I'm afraid this approach is limited to simple entities, not aggregates spanning multiple tables.
What's the best way to implement optimistic locking for aggregates?
It depends on your situation. If you just have 7 aggregates, of which 5 are single-entity aggregates, go for the @Modifying solution for the single-entity aggregates (sketched below) and write custom methods for the other 2.
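A sketch of that @Modifying variant for a single-entity aggregate (the account table, its version column, and the method name are illustrative, not something Spring Data JDBC provides out of the box):

import org.springframework.data.annotation.Id;
import org.springframework.data.jdbc.repository.query.Modifying;
import org.springframework.data.jdbc.repository.query.Query;
import org.springframework.data.repository.CrudRepository;
import org.springframework.data.repository.query.Param;

// Minimal single-entity aggregate with a manually managed version column.
class Account {
    @Id Long id;
    long balance;
    long version;
}

public interface AccountRepository extends CrudRepository<Account, Long> {

    // Returns true only if the row still carried the expected version;
    // false means another transaction updated (and re-versioned) it first,
    // so the caller has to reload and retry, or report a conflict.
    @Modifying
    @Query("UPDATE account SET balance = :balance, version = version + 1 " +
           "WHERE id = :id AND version = :expectedVersion")
    boolean updateIfVersionMatches(@Param("id") Long id,
                                   @Param("balance") long balance,
                                   @Param("expectedVersion") long expectedVersion);
}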
If you have more aggregates consisting of more than one class, consider properly implementing it and submitting a PR. The issue is already there: https://jira.spring.io/projects/DATAJDBC/issues/DATAJDBC-219
The main code changes will be in SqlGenerator, which would need to add a where clause for aggregate roots if they have a version attribute.
If you are interested in doing a PR and need more assistance, please leave a comment on the issue.

Can couchbase be used as the underlying JobRepository for spring-batch?

We have a requirement where we have to read a batch of an entity type from the database, submit info about each entity to a service which will call back later with some data to update in the caller entity, and then save all the caller entities with the updated data. We thought of using spring-batch; however, we use Couchbase as our database, which is eventually consistent and has no support for transactions.
I was going through the spring-batch documentation and I came across the Spring Batch Meta-Data ERD diagram here :
https://docs.spring.io/spring-batch/4.1.x/reference/html/index-single.html#metaDataSchema
With the above information in mind, my question is:
Can Couchbase be used as the underlying job-repository for spring-batch? What are the things I should keep in mind if it's possible to use it? Any links to example implementations would be welcome.
The JobRepository needs to be transactional in order for Spring Batch to work properly. Here is an excerpt from the Transaction Configuration for the JobRepository section of the reference documentation:
The behavior of the framework is not well defined if the repository methods are not transactional.
Since Couchbase has no support for transactions as you mentioned, it is not possible to use it as an underlying datasource for the JobRepository.

Issue Insert/Update EF Core DbContext in Azure QueueTrigger Function (Multi-threading)

I'm getting a PK violation exception when using an EF Core 2.1 DbContext in an Azure QueueTrigger function. I guess it's due to the nature of DbContext not being thread-safe, and the Azure Function running different instances in parallel. I have read quite a few posts, but I can't find a good approach to solve this.
Here is my scenario (producer-consumer pattern):
I have a scheduled Azure Function that is calling an API to get Projects from different external systems. To get all the required info for a project, I need to run different queries against other external services, so I'm decoupling this into another Azure Function: the scheduled function just queues a message per Project, like “Sync Project ID 101”.
Another QueueTrigger function fires every time a message is queued, which means different instances running in parallel. This function must gather all the data of a specific Project, and that means more calls to other external services / APIs to (in some way) aggregate all the info about a Project. IMHO it's good to do it that way, as I can process multiple Projects in parallel, and I can scale the function if I need to.
Once I have all this Project info, I want to persist it in a SQL DB using EF Core (and here comes the issue).
Project data includes the Users in the Project, and each user has a specific GUID as PK (coming from the external system). That means I can have repeated user IDs across different function instances, and here is the problem: when I try to persist user info in a SQL table, I can get a PK duplication exception, as multiple function instances can try to insert the same user at the same time (when instance A checks whether the user exists, it gets false, but another instance B is actually adding this user, so when instance A tries the insert, it fails).
I guess I could lock the DbContext somehow, but I'm not sure that is a good idea, as I also have a website running queries against the SQL DB (read-only queries for now, but there could be updates in the future too).
Another idea could be to send the entire Project info to another queue / blob file, and have another function in singleton mode that inserts the data into SQL.
I've created this project, which simplifies my scenario but is enough to reproduce the issue and understand the problem:
https://github.com/luismanez/queuetrigger-efcore-multithreading
Any other ideas or recommended approaches? (open to change the architecture if find something better)
Many thanks!
A "more easy" way could be to do some kind of upsert in the database. There is a sample of how to do that with EF Core: https://www.flexlabs.org/2018/02/adding-upsert-support-for-entity-framework-core

.NETCore PostgreSQL "A command is already in progress"

I am developing a Web API on .NET Core that accesses data in a PostgreSQL DB.
I am having trouble with PostgreSQL's lack of MARS support: Npgsql cannot run multiple commands simultaneously on the same connection (as described in EntityFramework DbContext lifecycle + Postgres: "An operation is already in progress."). For asynchronous scenarios, this is blocking.
Unfortunately, I cannot find any solution to this.
At the moment, I inject my DB context with:
services.AddEntityFrameworkNpgsql().AddDbContextPool<DBApiContext>(opt => opt.UseNpgsql("connectionString"));
I use EntityFramework.
Just for people who got stuck on this: in my case, a simple code change at the end of the method fixed it:
reader.Close();
where reader is an NpgsqlDataReader.
For whoever may be interested: my problem was all about service scope and dependency injection.
The requestor was not a transient service, so every request, even parallel ones, tried to access the DB through the same instance.
PostgreSQL doesn't support MARS, so the second request was rejected.
You need a transient service requesting the access, so that every invocation uses a different DB handler.