Spring Batch very slow when using 2 datasources - one for Spring Batch and another for the App - spring-data

I modified this sample batch job provided by Spring to use two custom datasources instead of the single one auto-configured by Boot. Both datasources point to the same MySQL server, running locally, but to different schemas: one schema for the Batch/Task tables and another for the app tables. Performance was dramatically worse than the same job running with the default Boot-configured datasource or with ONE custom datasource.
Here is the timing I got; I can't figure out why the third case takes so long:
Default Boot configured datasource - 1 second
One custom datasource (for both Batch/Task and App) - 1 second
Two custom datasources (one each for Batch/Task and App) - 90 seconds !!!
Do I need to set any connection pool settings on the custom datasources when using two of them? I tried a few, but it didn't help.
Here is the properties file:
spring.application.name=fileIngest
spring.datasource.url=jdbc:mysql://localhost:3306/test-scdf?useSSL=false
spring.datasource.username=<user>
spring.datasource.password=<pwd>
spring.datasource.driverClassName=org.mariadb.jdbc.Driver
app.datasource.url=jdbc:mysql://localhost:3306/test?useSSL=false
app.datasource.username=<user>
app.datasource.password=<pwd>
app.datasource.driverClassName=org.mariadb.jdbc.Driver
Here are relevant portions of my datasource config as recommended here.
#Bean(name = "springDataSource") // for Batch/Task tables
public DataSource dataSource(#Qualifier("springDataSourceProperties")DataSourceProperties springDataSourceProperties) {
return DataSourceBuilder.create().driverClassName(springDataSourceProperties.getDriverClassName()).
url(springDataSourceProperties.getUrl()).
password(springDataSourceProperties.getPassword()).
username(springDataSourceProperties.getUsername()).
build();
}
#Bean(name = "appDataSource") // for App tables
#Primary
public DataSource appDataSource(#Qualifier("appDataSourceProperties") DataSourceProperties appDataSourceProperties) {
DataSource ds = DataSourceBuilder.create().driverClassName(appDataSourceProperties.getDriverClassName()).
url(appDataSourceProperties.getUrl()).
password(appDataSourceProperties.getPassword()).
username(appDataSourceProperties.getUsername()).
build();
I just inject the appropriate datasource into the BatchConfiguration as needed.
@Configuration
@EnableBatchProcessing
public class BatchConfiguration extends DefaultBatchConfigurer {
    ...
    @Override
    @Autowired
    public void setDataSource(@Qualifier("springDataSource") DataSource batchDataSource) {
        super.setDataSource(batchDataSource);
    }

    @Bean
    public BatchDataSourceInitializer batchDataSourceInitializer(@Qualifier("springDataSource") DataSource batchDataSource,
                                                                 ResourceLoader resourceLoader) {
        BatchProperties batchProperties = new BatchProperties();
        batchProperties.setInitializeSchema(DataSourceInitializationMode.ALWAYS);
        return new BatchDataSourceInitializer(batchDataSource, resourceLoader, batchProperties);
    }

Related

PostgreSQL connections still idle after close in JDBC

I am new to Spring Boot.
I have a Spring Boot application whose database is PostgreSQL; I am using JdbcTemplate and have two datasource connections.
The code works fine, but in pgAdmin's Server Status dashboard I observed that the connection pool (say 10 connections) sits in idle mode.
Earlier, when I was working with a single datasource, I observed the same thing, but I solved it by setting some properties in the application.properties file.
For example:
spring.datasource.hikari.minimum-idle=somevalue
spring.datasource.hikari.idle-timeout=somevalue
How do I achieve the same with multiple datasources?
properties file
spring.datasource.jdbcUrl=jdbc:postgresql://localhost:5432/stsdemo
spring.datasource.username=postgres
spring.datasource.password=********
spring.datasource.driver-class-name=org.postgresql.Driver
spring.seconddatasource.jdbcUrl=jdbc:postgresql://localhost:5432/postgres
spring.seconddatasource.username=postgres
spring.seconddatasource.password=********
spring.seconddatasource.driver-class-name=org.postgresql.Driver
DbConfig
@Configuration
public class DbConfig {

    @Bean(name = "db1")
    @Primary
    @ConfigurationProperties(prefix = "spring.datasource")
    public DataSource firstDatasource() {
        return DataSourceBuilder.create().build();
    }

    @Bean(name = "jdbcTemplate1")
    public JdbcTemplate jdbcTemplate1(@Qualifier("db1") DataSource ds) {
        return new JdbcTemplate(ds);
    }

    @Bean(name = "db2")
    @ConfigurationProperties(prefix = "spring.seconddatasource")
    public DataSource secondDatasource() {
        return DataSourceBuilder.create().build();
    }

    @Bean(name = "jdbcTemplate2")
    public JdbcTemplate jdbcTemplate2(@Qualifier("db2") DataSource ds) {
        return new JdbcTemplate(ds);
    }
}
That's to be expected.
"Closing" a connection (i.e. calling close()) that was obtained from a pool only returns it to the pool. The pool will not immediately close the physical connection to the database, to avoid costly reconnects (which is the whole point of using a connection pool).
An "idle" connection is also not a real problem in Postgres.
Connections that stay "idle in transaction" for a long time, however, would be a problem.
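As for tuning the pools per datasource (a side note, not part of the answer above): since DataSourceBuilder.create().build() yields a HikariDataSource when HikariCP is on the classpath, and @ConfigurationProperties binds directly onto that bean, the Hikari settings can be placed under each datasource's own prefix, next to the jdbcUrl entries already shown:

spring.datasource.minimum-idle=somevalue
spring.datasource.idle-timeout=somevalue
spring.seconddatasource.minimum-idle=somevalue
spring.seconddatasource.idle-timeout=somevalue

The spring.datasource.hikari.* prefix only applies to the Boot auto-configured datasource; with manually built datasources the properties bind straight onto the HikariDataSource setters.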

Axon Framework - Configuring Multiple EventStores in Axon Configuration

We have a use case wherein each aggregate root should have a different event store. We are using the following configuration, where currently we have only one event store configured:
@Configuration
@EnableDiscoveryClient
public class AxonConfig {

    private static final String DOMAIN_EVENTS_COLLECTION_NAME = "coll-capture.domainEvents";
    //private static final String DOMAIN_EVENTS_COLLECTION_NAME_TEST =
    //        "coll-capture.domainEvents-test";

    @Value("${mongodb.database}")
    private String databaseName;

    @Value("${spring.application.name}")
    private String appName;

    @Bean
    public RestTemplate restTemplate() {
        CloseableHttpClient httpClient = HttpClientBuilder.create().build();
        HttpComponentsClientHttpRequestFactory clientHttpRequestFactory =
                new HttpComponentsClientHttpRequestFactory(httpClient);
        return new RestTemplate(clientHttpRequestFactory);
    }

    @Bean
    @Profile({"uat", "prod"})
    public CommandRouter springCloudHttpBackupCommandRouter(DiscoveryClient discoveryClient,
                                                            Registration localInstance,
                                                            RestTemplate restTemplate,
                                                            @Value("${axon.distributed.spring-cloud.fallback-url}") String messageRoutingInformationEndpoint) {
        return new SpringCloudHttpBackupCommandRouter(discoveryClient,
                localInstance,
                new AnnotationRoutingStrategy(),
                serviceInstance -> appName.equalsIgnoreCase(serviceInstance.getServiceId()),
                restTemplate,
                messageRoutingInformationEndpoint);
    }

    @Bean
    public Repository<TestEnquiry> testEnquiryRepository(EventStore eventStore) {
        return new EventSourcingRepository<>(TestEnquiry.class, eventStore);
    }

    @Bean
    public Repository<Test2Enquiry> test2enquiryRepository(EventStore eventStore) {
        return new EventSourcingRepository<>(Test2Enquiry.class, eventStore);
    }

    @Bean
    public EventStorageEngine eventStorageEngine(MongoClient client) {
        MongoTemplate mongoTemplate = new DefaultMongoTemplate(client, databaseName)
                .withDomainEventsCollection(DOMAIN_EVENTS_COLLECTION_NAME);
        return new MongoEventStorageEngine(mongoTemplate);
    }
}
Now we want to configure "DOMAIN_EVENTS_COLLECTION_NAME_TEST" (just as an example) in the EventStorageEngine as well. How can we achieve support for multiple event stores, and select which collection each tracking processor should read from?
If you are going the route of segregating the event streams, then combining them from an event handling perspective may indeed become a necessity. Especially when you have several bounded contexts, segregating the event streams into distinct storage solutions is reasonable.
If you want to define which [message source / event store] is used by a TrackingEventProcessor, you will have to deal with the EventProcessingConfigurer. More specifically, you should invoke the EventProcessingConfigurer#registerTrackingEventProcessor(String, Function<Configuration, StreamableMessageSource<TrackedEventMessage<?>>>) method. The first String parameter is the name of the processor you want to configure as being "tracking". The second parameter defines a Function which gives you the message source to be used by this TrackingEventProcessor (TEP). It is here where you should provide the event store you want this TEP to ingest events from.
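A minimal sketch of that registration (the processor name "test-processor" and the testEventStore qualifier are hypothetical, assuming a second EventStore bean backed by the test collection exists):

@Autowired
public void configureProcessors(EventProcessingConfigurer configurer,
                                @Qualifier("testEventStore") EventStore testEventStore) {
    // The lambda is the Function<Configuration, StreamableMessageSource<TrackedEventMessage<?>>>
    // described above: it hands this tracking processor the store it should stream events from.
    configurer.registerTrackingEventProcessor("test-processor",
            configuration -> testEventStore);
}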
Pairing them up at a later stage could also occur of course, which is also supported by Axon Framework. This boils down to a specific form of StreamableMessageSource implementation.
More specifically, you can use the MultiStreamableMessageSource, where you can connect any number of StreamableMessageSources together.
Note that Axon's EmbeddedEventStore is in essence an implementation of a StreamableMessageSource. Once you have constructed the MultiStreamableMessageSource, you will of course have to specify it as the messageSource for your TrackingEventProcessors.
One last note: know that this solution can only be used when you are using TrackingEventProcessors, as those are the only Event Processors provided by Axon that ingest a StreamableMessageSource as the source of their events.
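And a rough sketch of the pairing-up variant (assuming an Axon version in which MultiStreamableMessageSource and its builder are available; the source names and qualifiers are hypothetical):

@Autowired
public void configureCombinedProcessor(EventProcessingConfigurer configurer,
                                       @Qualifier("mainEventStore") EventStore mainEventStore,
                                       @Qualifier("testEventStore") EventStore testEventStore) {
    MultiStreamableMessageSource combined = MultiStreamableMessageSource.builder()
            .addMessageSource("main", mainEventStore)
            .addMessageSource("test", testEventStore)
            .build();
    // A single tracking processor now ingests events from both underlying stores.
    configurer.registerTrackingEventProcessor("combined-processor", configuration -> combined);
}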

Problems while connecting to two MongoDBs via Spring

I'm trying to connect to two different MongoDBs with Spring (1.5.2; we included Spring in an internal framework, therefore it is not the latest version yet), and this already works partially but not fully. More precisely, I found a strange behavior which I will describe below after showing my setup.
So this is what I have done so far:
Project structure
backend
config
domain
customer
internal
repository
customer
internal
service
In config I have my Mongo configurations.
I created one base class which extends AbstractMongoConfiguration. This class holds fields for database, host, etc., which are filled with the properties from an application.yml. It also holds a couple of methods for creating the MongoClient and SimpleMongoDbFactory.
Furthermore, there are two custom configuration classes, one for each MongoDB. Both extend the base class.
Here is how they are coded:
Primary Connection
@Primary
@EntityScan(basePackages = "backend.domain.customer")
@Configuration
@EnableMongoRepositories(
        basePackages = {"backend.repository.customer"},
        mongoTemplateRef = "customerDataMongoTemplate")
@ConfigurationProperties(prefix = "customer.mongodb")
public class CustomerDataMongoConnection extends BaseMongoConfig {

    public static final String TEMPLATE_NAME = "customerDataMongoTemplate";

    @Override
    @Bean(name = CustomerDataMongoConnection.TEMPLATE_NAME)
    public MongoTemplate mongoTemplate() {
        MongoClient client = getMongoClient(getAddress(), getCredentials());
        SimpleMongoDbFactory factory = getSimpleMongoDbFactory(client, getDatabaseName());
        return new MongoTemplate(factory);
    }
}
The second configuration class looks pretty similar. Here it is:
@EntityScan(basePackages = "backend.domain.internal")
@Configuration
@EnableMongoRepositories(
        basePackages = {"backend.repository.internal"},
        mongoTemplateRef = InternalDataMongoConnection.TEMPLATE_NAME
)
@ConfigurationProperties(prefix = "internal.mongodb")
public class InternalDataMongoConnection extends BaseMongoConfig {

    public static final String TEMPLATE_NAME = "internalDataMongoTemplate";

    @Override
    @Bean(name = InternalDataMongoConnection.TEMPLATE_NAME)
    public MongoTemplate mongoTemplate() {
        MongoClient client = getMongoClient(getAddress(), getCredentials());
        SimpleMongoDbFactory factory = getSimpleMongoDbFactory(client, getDatabaseName());
        return new MongoTemplate(factory);
    }
}
As you can see, I use @EnableMongoRepositories to define which repository should use which connection.
My repositories are defined just like it is described in the Spring documentation.
However, here is one example which is located in package backend.repository.customer:
public interface ContactHistoryRepository extends MongoRepository<ContactHistoryEntity, String> {
public ContactHistoryEntity findById(String id);
}
The problem is that with this setup my backend always uses only the primary connection. Interestingly, when I remove the bean name for the MongoTemplate (just @Bean), the backend then uses the secondary connection (InternalDataMongoConnection). This is true for all defined repositories.
My question is: how can I get my backend to really use both connections? Have I perhaps missed another parameter/configuration?
Since this is a pretty extensive post, I apologise if I forgot to mention something. Please ask for missing information in the comments.
I found the answer.
In my package structure there was an empty configuration class (from a colleague) with the annotations @Configuration and @EnableMongoRepositories. This triggered the automatic wiring process of Spring Data and therefore led to the problems I reported above.
I simply deleted the class and now it works as it should!

Overridden RabbitSourceConfiguration (app starters) does not work with Spring Cloud Edgware

I'm testing an upgrade of my Spring Cloud DataFlow services from Spring Cloud Dalston.SR4/Spring Boot 1.5.9 to Spring Cloud Edgware/Spring Boot 1.5.9. Some of my services extend source (or sink) components from the app starters. I've found this does not work with Spring Cloud Edgware.
For example, I have overridden org.springframework.cloud.stream.app.rabbit.source.RabbitSourceConfiguration and bound my app to my overridden version. This has previously worked with Spring Cloud versions going back almost a year.
With Edgware, I get the following (whether the app is run standalone or within dataflow):
***************************
APPLICATION FAILED TO START
***************************
Description:
Field channels in org.springframework.cloud.stream.app.rabbit.source.RabbitSourceConfiguration required a bean of type 'org.springframework.cloud.stream.messaging.Source' that could not be found.
Action:
Consider defining a bean of type 'org.springframework.cloud.stream.messaging.Source' in your configuration.
I get the same behaviour with the 1.3.0.RELEASE and 1.2.0.RELEASE of spring-cloud-starter-stream-rabbit.
I override RabbitSourceConfiguration so I can set a header mapper on the AmqpInboundChannelAdapter, and also to perform a connectivity test prior to starting up the container.
My subclass is bound to the Spring Boot application with @EnableBinding(HeaderMapperRabbitSourceConfiguration.class). A cut-down version of my subclass is:
public class HeaderMapperRabbitSourceConfiguration extends RabbitSourceConfiguration {

    public HeaderMapperRabbitSourceConfiguration(final MyHealthCheck healthCheck,
                                                 final MyAppConfig config) {
        // ...
    }

    @Bean
    @Override
    public AmqpInboundChannelAdapter adapter() {
        final AmqpInboundChannelAdapter adapter = super.adapter();
        adapter.setHeaderMapper(new NotificationHeaderMapper(config));
        return adapter;
    }

    @Bean
    @Override
    public SimpleMessageListenerContainer container() {
        if (config.performConnectivityCheckOnStartup()) {
            if (LOGGER.isInfoEnabled()) {
                LOGGER.info("Attempting connectivity with ...");
            }
            final Health health = healthCheck.health();
            if (health.getStatus() == Status.DOWN) {
                LOGGER.error("Unable to connect .....");
                throw new UnableToLoginException("Unable to connect ...");
            } else if (LOGGER.isInfoEnabled()) {
                LOGGER.info("Connectivity established with ...");
            }
        }
        return super.container();
    }
}
You really should never do stuff like healthCheck.health(); within a @Bean definition. The application context is not yet fully baked or started; it may, or may not, work depending on the order in which beans are created.
If you want to prevent the app from starting, add a bean that implements SmartLifecycle, put the bean in a late phase (high value) so it's started after everything else, and then put your code in start(). autoStartup must be true.
In this case, it's being run before the stream infrastructure has created the channel.
Some ordering might have changed from the earlier release but, in any case, performing activity like this in a #Bean definition is dangerous.
You just happened to be lucky before.
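A minimal sketch of that SmartLifecycle approach (MyHealthCheck and UnableToLoginException are the asker's own types and are assumed here; the bean name and phase value are illustrative):

@Component
public class ConnectivityStartupCheck implements SmartLifecycle {

    private final MyHealthCheck healthCheck; // assumed: the asker's existing health check
    private volatile boolean running;

    public ConnectivityStartupCheck(MyHealthCheck healthCheck) {
        this.healthCheck = healthCheck;
    }

    @Override
    public void start() {
        // Runs after the context, including the stream infrastructure, has started.
        if (healthCheck.health().getStatus() == Status.DOWN) {
            throw new UnableToLoginException("Unable to connect ...");
        }
        running = true;
    }

    @Override
    public void stop() {
        running = false;
    }

    @Override
    public void stop(Runnable callback) {
        stop();
        callback.run();
    }

    @Override
    public boolean isRunning() {
        return running;
    }

    @Override
    public boolean isAutoStartup() {
        return true; // must be true, as noted above
    }

    @Override
    public int getPhase() {
        return Integer.MAX_VALUE; // late phase so this starts after everything else
    }
}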
EDIT
I just noticed your @EnableBinding is wrong; it should be Source.class. I can't see how that would ever have worked - that's what creates the bean for the channels field of type Source.
This works fine for me after updating stream and the binder to 1.3.0.RELEASE...
@Configuration
public class MySource extends RabbitSourceConfiguration {

    @Bean
    @Override
    public AmqpInboundChannelAdapter adapter() {
        AmqpInboundChannelAdapter adapter = super.adapter();
        adapter.setHeaderMapper(new MyMapper());
        return adapter;
    }
}
and
@SpringBootApplication
@EnableBinding(Source.class)
public class DemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}
If that doesn't work, please edit the question to show your POM.

How to load spring application context even if Cassandra down

When using
@Configuration
@EnableCassandraRepositories(basePackages = {"com.foo"})
public class CassandraConfig {

    @Bean
    public CassandraClusterFactoryBean cluster() {
        final CassandraClusterFactoryBean cluster = new CassandraClusterFactoryBean();
        cluster.setContactPoints(nodesRead);
        cluster.setPort(port);
        return cluster;
    }
}
where in the com.foo package there is an interface that extends CrudRepository.
Is there a way to make it so that at startup time an exception is not thrown if the database is down?
Ideally, we start up, and any time a method on the repository is called, it first attempts to connect to the database and, if the database is still down, returns an error saying it can't connect.
The behavior I currently observe is that NoHostAvailableException is thrown and the web container does not start up.
I was able to come up with a solution. I removed the @EnableCassandraRepositories(basePackages={"com.foo"}) annotation and defined a bean in my config that returns my repository. Removing @EnableCassandraRepositories allows lazy loading of the repository. This new bean lets me instantiate my repository using the RepositoryFactorySupport getRepository() method. I annotated this bean as lazy and made sure references to the bean were also lazy.
Assume my repository looks like the following
public interface IBarRepository extends CrudRepository<Bar, BarKey>{}
My Config file now looks like
@Configuration
public class CassandraConfig {

    @Bean
    @Lazy(value = true)
    public IBarRepository barRepository() throws Exception {
        final RepositoryFactorySupport support = new CassandraRepositoryFactory(cassandraTemplate());
        return support.getRepository(IBarRepository.class);
    }

    @Bean
    @Lazy(value = true)
    public CassandraClusterFactoryBean cluster() {
        final CassandraClusterFactoryBean cluster = new CassandraClusterFactoryBean();
        cluster.setContactPoints(nodesRead);
        cluster.setPort(port);
        return cluster;
    }

    // More beans down here defining things like cluster, mappingContext, session, etc.
}