Axon Framework: how to configure multiple databases? - spring-data-jpa

I'm using MySQL as the event store, so axon-server-connector is excluded from the classpath. My use case is described below.
Spring Boot 2.1.7.RELEASE and Axon 4.3.2.
I plan to have three databases: one for the Axon event store, one for projection writing, and one for projection reading.
@Configuration
public class DataSourceConfiguration {

    @Primary
    @Bean("axonMaster")
    @ConfigurationProperties("spring.datasource.hikari.axon-master")
    public DataSource axon() {
        return DataSourceBuilder.create().type(HikariDataSource.class).build();
    }

    @Bean("projectionRead")
    @ConfigurationProperties("spring.datasource.hikari.projection-read")
    public DataSource projectionRead() {
        return DataSourceBuilder.create().type(HikariDataSource.class).build();
    }

    @Bean("projectionWrite")
    @ConfigurationProperties("spring.datasource.hikari.projection-write")
    public DataSource projectionWrite() {
        return DataSourceBuilder.create().type(HikariDataSource.class).build();
    }
}
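For reference, a minimal sketch of the matching configuration properties (the database names and credentials below are hypothetical). Note that when a HikariDataSource is bound directly through DataSourceBuilder like this, the property key is jdbc-url rather than url:

spring.datasource.hikari.axon-master.jdbc-url=jdbc:mysql://localhost:3306/axon_events
spring.datasource.hikari.axon-master.username=axon
spring.datasource.hikari.axon-master.password=secret
spring.datasource.hikari.projection-read.jdbc-url=jdbc:mysql://localhost:3306/projection_read
spring.datasource.hikari.projection-write.jdbc-url=jdbc:mysql://localhost:3306/projection_write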
I tried to configure multiple datasources with Spring Data JPA. The primary one is shown below.
@Configuration
@EnableTransactionManagement
@EnableJpaRepositories(entityManagerFactoryRef = "axonEntityManagerFactory",
        basePackages = "org.axonframework.eventsourcing.eventstore.jpa") // (1)
public class AxonEventStoreConfig {

    @Primary
    @Bean(name = "axonEntityManagerFactory")
    public LocalContainerEntityManagerFactoryBean entityManagerFactory(
            EntityManagerFactoryBuilder builder, @Qualifier("axonMaster") DataSource axonMaster) {
        return builder
                .dataSource(axonMaster)
                .packages("org.axonframework.eventsourcing.eventstore.jpa")
                .persistenceUnit("axonMaster") // (2)
                .build();
    }

    @Primary
    @Bean(name = "axonPlatformTransactionManager") // (3)
    public PlatformTransactionManager transactionManager(
            @Qualifier("axonEntityManagerFactory") EntityManagerFactory axonEntityManagerFactory) {
        return new JpaTransactionManager(axonEntityManagerFactory);
    }
}
Questions about this part are:
(1) Is it enough to set basePackages to org.axonframework.eventsourcing.eventstore.jpa? Do I also need to add the token package org.axonframework.eventhandling.tokenstore.jpa and the saga package? I will use the saga store.
(2) Are the packages here the same as the previous ones? What should the persistenceUnit name be?
The example project is uploaded to GitHub: https://github.com/sincosmos/axon-multiple-databases
I cannot get the application to run.
Am I doing everything right? I referred to the example at https://github.com/AxonIQ/giftcard-demo, but the multiple-databases version of it is based on Axon 2.0, where the Axon command bus needs to be configured as well.
The goal seems simple: configure multiple databases (one of them for the event store) in an Axon Framework application. But even after spending several days on it, I still haven't gotten it to work.
Could anyone please give me some suggestions or help? I would be very grateful.
/**************************** update 20200521 ****************************/
I made progress after reading Allard's answer; I can now configure multiple databases for the application. The source code is uploaded to GitHub: https://github.com/sincosmos/axon-multiple-databases.git
In particular, the database configuration for the Axon event store is shown below.
@Configuration
@EnableTransactionManagement
public class AxonEventStoreConfig {

    @Bean("axonMaster")
    @ConfigurationProperties("spring.datasource.hikari.axon-master")
    public DataSource axon() {
        return DataSourceBuilder.create().type(HikariDataSource.class).build();
    }

    @Bean(name = "axonEntityManagerFactory")
    public LocalContainerEntityManagerFactoryBean entityManagerFactory(
            EntityManagerFactoryBuilder builder, @Qualifier("axonMaster") DataSource axonMaster) {
        return builder
                .dataSource(axonMaster)
                .persistenceUnit("axonMaster")
                .properties(jpaProperties())
                .packages("org.axonframework.eventhandling.tokenstore",
                        "org.axonframework.modelling.saga.repository.jpa",
                        "org.axonframework.eventsourcing.eventstore.jpa")
                .build();
    }

    /**
     * Is it right to provide EntityManagerProvider like this ???
     * For the Axon event store.
     *
     * @param entityManagerFactory
     * @return
     */
    @Bean
    public EntityManagerProvider entityManagerProvider(
            @Qualifier("axonEntityManagerFactory") LocalContainerEntityManagerFactoryBean entityManagerFactory) {
        return () -> entityManagerFactory.getObject().createEntityManager();
    }

    private Map<String, Object> jpaProperties() {
        Map<String, Object> props = new HashMap<>();
        props.put("hibernate.physical_naming_strategy", SpringPhysicalNamingStrategy.class.getName());
        props.put("hibernate.implicit_naming_strategy", SpringImplicitNamingStrategy.class.getName());
        props.put("hibernate.hbm2ddl.auto", "update");
        props.put("hibernate.show_sql", "true");
        return props;
    }
}
After I start the application (the event store tables are created automatically on first run), the database connection pool for Axon is exhausted very quickly. The logs are pasted below for reference.
Hibernate: select tokenentry0_.segment as col_0_0_ from token_entry tokenentry0_ where tokenentry0_.processor_name=? order by tokenentry0_.segment ASC
Hibernate: select min(domaineven0_.global_index)-1 as col_0_0_ from domain_event_entry domaineven0_
Hibernate: select tokenentry0_.segment as col_0_0_ from token_entry tokenentry0_ where tokenentry0_.processor_name=? order by tokenentry0_.segment ASC
19:59:57.223 [EventProcessor[com.baeldung.axon.querymodel]-0] WARN o.a.e.TrackingEventProcessor - Fetch Segments for Processor 'com.baeldung.axon.querymodel' failed: no transaction is in progress. Preparing for retry in 1s
Hibernate: select tokenentry0_.segment as col_0_0_ from token_entry tokenentry0_ where tokenentry0_.processor_name=? order by tokenentry0_.segment ASC
Hibernate: select min(domaineven0_.global_index)-1 as col_0_0_ from domain_event_entry domaineven0_
Hibernate: select tokenentry0_.segment as col_0_0_ from token_entry tokenentry0_ where tokenentry0_.processor_name=? order by tokenentry0_.segment ASC
19:59:58.293 [EventProcessor[com.baeldung.axon.querymodel]-0] WARN o.a.e.TrackingEventProcessor - Fetch Segments for Processor 'com.baeldung.axon.querymodel' failed: no transaction is in progress. Preparing for retry in 2s
Hibernate: select tokenentry0_.segment as col_0_0_ from token_entry tokenentry0_ where tokenentry0_.processor_name=? order by tokenentry0_.segment ASC
Hibernate: select min(domaineven0_.global_index)-1 as col_0_0_ from domain_event_entry domaineven0_
Hibernate: select tokenentry0_.segment as col_0_0_ from token_entry tokenentry0_ where tokenentry0_.processor_name=? order by tokenentry0_.segment ASC
20:00:00.361 [EventProcessor[com.baeldung.axon.querymodel]-0] WARN o.a.e.TrackingEventProcessor - Fetch Segments for Processor 'com.baeldung.axon.querymodel' failed: no transaction is in progress. Preparing for retry in 4s
Hibernate: select tokenentry0_.segment as col_0_0_ from token_entry tokenentry0_ where tokenentry0_.processor_name=? order by tokenentry0_.segment ASC
Hibernate: select min(domaineven0_.global_index)-1 as col_0_0_ from domain_event_entry domaineven0_
Hibernate: select tokenentry0_.segment as col_0_0_ from token_entry tokenentry0_ where tokenentry0_.processor_name=? order by tokenentry0_.segment ASC
20:00:04.465 [EventProcessor[com.baeldung.axon.querymodel]-0] WARN o.a.e.TrackingEventProcessor - Fetch Segments for Processor 'com.baeldung.axon.querymodel' failed: no transaction is in progress. Preparing for retry in 8s
Hibernate: select tokenentry0_.segment as col_0_0_ from token_entry tokenentry0_ where tokenentry0_.processor_name=? order by tokenentry0_.segment ASC
Hibernate: select min(domaineven0_.global_index)-1 as col_0_0_ from domain_event_entry domaineven0_
Hibernate: select tokenentry0_.segment as col_0_0_ from token_entry tokenentry0_ where tokenentry0_.processor_name=? order by tokenentry0_.segment ASC
20:00:12.531 [EventProcessor[com.baeldung.axon.querymodel]-0] WARN o.a.e.TrackingEventProcessor - Fetch Segments for Processor 'com.baeldung.axon.querymodel' failed: no transaction is in progress. Preparing for retry in 16s
20:00:17.327 [HikariPool-1 housekeeper] DEBUG com.zaxxer.hikari.pool.HikariPool - HikariPool-1 - Pool stats (total=15, active=15, idle=0, waiting=0)
20:00:22.178 [HikariPool-2 housekeeper] DEBUG com.zaxxer.hikari.pool.HikariPool - HikariPool-2 - Pool stats (total=10, active=0, idle=10, waiting=0)
20:00:23.092 [HikariPool-3 housekeeper] DEBUG com.zaxxer.hikari.pool.HikariPool - HikariPool-3 - Pool stats (total=10, active=0, idle=10, waiting=0)
The status of these connections is "Sleep" when I check them in MySQL Workbench. Changing the connection pool size does not help. I also checked the JVM stack, and no deadlocks were found. I set the datasource leakDetectionThreshold to 10000, but as you can see, no datasource leak information was printed.
Can you help with this?
/**************************** update 20200522 ****************************/
It turns out that "javax.persistence.TransactionRequiredException: no transaction is in progress" happens when the event processor tries to access the MySQL event store. I configured transaction managers for each datasource, but the error persists. I have no idea what is going on...

When using multiple databases, you probably won't be able to rely on autoconfiguration anymore, as Spring and Axon wouldn't know which of the databases you want them to use.
Axon doesn't use an EntityManager directly. Instead, all components require an EntityManagerProvider. You may be able to use that to your advantage.
If you want all Axon components to use a certain database, simply define an EntityManagerProvider bean that returns the EntityManager that connects to that database. Spring manages the EntityManager completely, so you only need a single instance for all your EntityManager sessions.
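For example, here is a minimal sketch of such a bean, assuming the axonEntityManagerFactory bean from the question. Rather than calling createEntityManager() yourself, you can ask Spring for its shared, transaction-aware EntityManager proxy, so that Axon's components participate in whatever Spring transaction is active:

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import org.axonframework.common.jpa.EntityManagerProvider;
import org.axonframework.common.jpa.SimpleEntityManagerProvider;
import org.springframework.orm.jpa.SharedEntityManagerCreator;

@Bean
public EntityManagerProvider entityManagerProvider(
        @Qualifier("axonEntityManagerFactory") EntityManagerFactory emf) {
    // The shared EntityManager is a proxy that delegates to the EntityManager
    // bound to the current Spring transaction, instead of opening a new,
    // unmanaged one on every call.
    EntityManager shared = SharedEntityManagerCreator.createSharedEntityManager(emf);
    return new SimpleEntityManagerProvider(shared);
}

Note that the lambda in the question creates a brand-new EntityManager on each call, outside any Spring transaction, which would explain the "no transaction is in progress" warnings.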
If you want different components to use different EntityManagers (e.g. Event Store in one database, Tokens and Sagas in another), then you will need to configure these components yourself. Sometimes, it's just easiest to copy the bean definitions from the AutoConfiguration classes and adapt them to suit your needs. See https://github.com/AxonFramework/AxonFramework/tree/master/spring-boot-autoconfigure/src/main/java/org/axonframework/springboot/autoconfig
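As a rough sketch of that approach (the wiring below is loosely adapted from Axon's JPA auto-configuration; the bean names, qualifiers, and injected Serializer are assumptions to adapt to your setup), the event storage engine, token store, and Axon's own TransactionManager can all be pointed at the dedicated event-store database:

import org.axonframework.common.jpa.EntityManagerProvider;
import org.axonframework.common.transaction.TransactionManager;
import org.axonframework.eventhandling.tokenstore.TokenStore;
import org.axonframework.eventhandling.tokenstore.jpa.JpaTokenStore;
import org.axonframework.eventsourcing.eventstore.EventStorageEngine;
import org.axonframework.eventsourcing.eventstore.jpa.JpaEventStorageEngine;
import org.axonframework.serialization.Serializer;
import org.axonframework.spring.messaging.unitofwork.SpringTransactionManager;
import org.springframework.transaction.PlatformTransactionManager;

@Bean
public TransactionManager axonTransactionManager(
        @Qualifier("axonPlatformTransactionManager") PlatformTransactionManager txManager) {
    // Bridge Axon's TransactionManager abstraction to the Spring transaction
    // manager that owns the event-store EntityManagerFactory.
    return new SpringTransactionManager(txManager);
}

@Bean
public EventStorageEngine storageEngine(EntityManagerProvider entityManagerProvider,
                                        TransactionManager axonTransactionManager,
                                        Serializer serializer) {
    return JpaEventStorageEngine.builder()
            .entityManagerProvider(entityManagerProvider)
            .transactionManager(axonTransactionManager)
            .eventSerializer(serializer)
            .snapshotSerializer(serializer)
            .build();
}

@Bean
public TokenStore tokenStore(EntityManagerProvider entityManagerProvider,
                             Serializer serializer) {
    return JpaTokenStore.builder()
            .entityManagerProvider(entityManagerProvider)
            .serializer(serializer)
            .build();
}

With no other EventBus bean defined, Axon's auto-configuration should pick up this EventStorageEngine and wrap it in an EmbeddedEventStore.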
Lastly, the entities that you need to scan depend on the components that you expect to use. Spring Boot autoconfiguration will scan the following Axon packages by default (if you don't specify any @EntityScan yourself):
org.axonframework.eventhandling.tokenstore (for tokens)
org.axonframework.modelling.saga.repository.jpa (for sagas)
org.axonframework.eventsourcing.eventstore.jpa (for the event store)
Note that the @EnableJpaRepositories annotation is used to scan for @Repository classes. Axon doesn't use those, so there is no point scanning Axon packages for them. Axon does define entities, so @EntityScan does make sense.
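For instance, a minimal sketch of such an @EntityScan (com.example.projections is a hypothetical package holding your own entities):

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.autoconfigure.domain.EntityScan;

@SpringBootApplication
@EntityScan(basePackages = {
        "org.axonframework.eventhandling.tokenstore",
        "org.axonframework.modelling.saga.repository.jpa",
        "org.axonframework.eventsourcing.eventstore.jpa",
        "com.example.projections" // hypothetical: your own entities
})
public class Application {
    public static void main(String[] args) {
        SpringApplication.run(Application.class, args);
    }
}

Keep in mind that once you declare your own @EntityScan, the defaults above no longer apply, so list every Axon package you need alongside your own.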

Related

How to correctly GROUP BY on JDBC sources

I have a Kafka stream with user_id and want to produce another stream with user_id and the number of records for that user in a JDBC table.
The following is how I tried to achieve this (I'm new to Flink, so please correct me if that's not how things are supposed to be done). The issue is that Flink ignores all updates to the JDBC table after the job has started.
As far as I understand, the answer to this is to use lookup joins, but Flink complains that lookup joins are not supported on temporal views. I also tried versioned views, without much success.
What would be the correct approach to achieve what I want?
CREATE TABLE kafka_stream (
    user_id STRING,
    event_time TIMESTAMP(3) METADATA FROM 'timestamp',
    WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
    'connector' = 'kafka',
    -- ...
)
-- NEXT SQL --
CREATE TABLE jdbc_table (
    user_id STRING,
    checked_at TIMESTAMP,
    PRIMARY KEY(user_id) NOT ENFORCED
) WITH (
    'connector' = 'jdbc',
    -- ...
)
-- NEXT SQL --
CREATE TEMPORARY VIEW checks_counts AS
SELECT user_id, count(*) AS num_checks
FROM jdbc_table
GROUP BY user_id
-- NEXT SQL --
INSERT INTO output_kafka_stream
SELECT
    kafka_stream.user_id,
    checks_counts.num_checks
FROM kafka_stream
LEFT JOIN checks_counts ON kafka_stream.user_id = checks_counts.user_id

PostgreSQL: prevent lock on self-table UPDATE with LEFT JOIN

I'm on PostgreSQL 9.3. I'm the only one working on the database, and my code runs queries sequentially for unit tests.
Most of the time the following UPDATE query runs without problems, but sometimes it causes locks on the PostgreSQL server. The query then seems to never end, while it normally takes only 3 seconds.
I should point out that the query runs in a unit-test context, i.e. the data is exactly the same whether the lock happens or not, and the code is the only process that updates the data.
I know there can be lock problems in PostgreSQL when an UPDATE query updates its own table, especially when a LEFT JOIN is used.
I also know that for an UPDATE a LEFT JOIN can be replaced with NOT EXISTS, but in my case the LEFT JOIN is much faster because there is little data to update, while a NOT EXISTS would visit nearly all candidate rows.
So my question is: which PostgreSQL commands (like an explicit LOCK on the table) or options (like SELECT FOR UPDATE) should I use to ensure my query runs without a never-ending lock?
Query:
-- for each place of scenario #1, update all owners that
-- are different from scenario #0
UPDATE t_territories AS upt
SET id_owner = diff.id_owner
FROM (
    -- list of owners in the source that are different from the target
    SELECT trg.id_place, src.id_owner
    FROM t_territories AS trg
    LEFT JOIN t_territories AS src
        ON (src.id_scenario = 0)
        AND (src.id_place = trg.id_place)
    WHERE (trg.id_scenario = 1)
        AND (trg.id_owner IS DISTINCT FROM src.id_owner)
    -- FOR UPDATE -- SQL error: FOR UPDATE cannot be applied to the nullable side of an outer join
) AS diff
WHERE (upt.id_scenario = 1)
AND (upt.id_place = diff.id_place)
Table structure:
CREATE TABLE t_territories
(
    id_scenario integer NOT NULL,
    id_place integer NOT NULL,
    id_owner integer,
    CONSTRAINT t_territories_pk PRIMARY KEY (id_scenario, id_place),
    CONSTRAINT t_territories_fkey_owner FOREIGN KEY (id_owner)
        REFERENCES t_owner (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE RESTRICT
)
I think your query was blocked by another query. You can find that query with:
SELECT
    COALESCE(blockingl.relation::regclass::text, blockingl.locktype) AS locked_item,
    now() - blockeda.query_start AS waiting_duration,
    blockeda.pid AS blocked_pid,
    blockeda.query AS blocked_query,
    blockedl.mode AS blocked_mode,
    blockinga.pid AS blocking_pid,
    blockinga.query AS blocking_query,
    blockingl.mode AS blocking_mode
FROM pg_catalog.pg_locks blockedl
JOIN pg_stat_activity blockeda ON blockedl.pid = blockeda.pid
JOIN pg_catalog.pg_locks blockingl ON (
    ((blockingl.transactionid = blockedl.transactionid) OR
     (blockingl.relation = blockedl.relation AND blockingl.locktype = blockedl.locktype))
    AND blockedl.pid != blockingl.pid)
JOIN pg_stat_activity blockinga ON blockingl.pid = blockinga.pid
    AND blockinga.datid = blockeda.datid
WHERE NOT blockedl.granted
AND blockinga.datname = current_database()
I found this query at http://big-elephants.com/2013-09/exploring-query-locks-in-postgres/
You can also take an ACCESS EXCLUSIVE lock to prevent any other query from reading or writing t_territories (LOCK must run inside a transaction; the lock is released at COMMIT or ROLLBACK):
LOCK t_territories IN ACCESS EXCLUSIVE MODE;
More info about locks: https://www.postgresql.org/docs/9.1/static/explicit-locking.html

Spring Batch partitioning performance issue

We have a Spring Batch job where we are trying to process around 10 million records. Doing this in a single thread would be very slow, and we have an SLA to meet.
To improve performance, we developed a POC where the master step creates partitions, each partition representing one unique prod id. The number of prod ids can range anywhere from 500 to 4500; in the POC we have 500 unique prod ids. Each partition is given a prod id, and the step works on it. All of this works fine end to end.
What we noticed is that the master step takes more than 5 minutes to send the partition info to the step execution requests. That is, there is a gap of more than 5 minutes between the master step generating the partitions and the step being executed for the first partition.
What might be causing this slowness? What does the Spring Batch framework do during those 5 minutes?
Here are the 3 SELECTs that are executed many times during those 5 minutes:
SELECT JOB_EXECUTION_ID, START_TIME, END_TIME, STATUS, EXIT_CODE, EXIT_MESSAGE, CREATE_TIME, LAST_UPDATED, VERSION, JOB_CONFIGURATION_LOCATION from BATCH_JOB_EXECUTION where JOB_INSTANCE_ID = ? order by JOB_EXECUTION_ID desc;
SELECT JOB_EXECUTION_ID, KEY_NAME, TYPE_CD, STRING_VAL, DATE_VAL, LONG_VAL, DOUBLE_VAL, IDENTIFYING from BATCH_JOB_EXECUTION_PARAMS where JOB_EXECUTION_ID = ?;
SELECT STEP_EXECUTION_ID, STEP_NAME, START_TIME, END_TIME, STATUS, COMMIT_COUNT, READ_COUNT, FILTER_COUNT, WRITE_COUNT, EXIT_CODE, EXIT_MESSAGE, READ_SKIP_COUNT, WRITE_SKIP_COUNT, PROCESS_SKIP_COUNT, ROLLBACK_COUNT, LAST_UPDATED, VERSION from BATCH_STEP_EXECUTION where JOB_EXECUTION_ID = ? order by STEP_EXECUTION_ID;
Take a look at your job repository's configuration. Once the Partitioner has created the ExecutionContexts for each slave step, the master creates a StepExecution for each before sending it to the slave to be processed. So that lag is probably due to the insertion of all of those StepExecutions into your job repository. As a follow-up, make sure you're using the latest versions; there was an optimization done to that not too long ago (batch-inserting the executions instead of doing it one by one).
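For context, here is a hypothetical sketch of what such a prod-id Partitioner might look like; each entry in the returned map becomes a StepExecution row that the job repository must insert before any slave step starts, which is where the lag described above comes from:

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;

public class ProdIdPartitioner implements Partitioner {

    private final List<String> prodIds;

    public ProdIdPartitioner(List<String> prodIds) {
        this.prodIds = prodIds;
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        // One ExecutionContext per prod id; with ~500 ids this means ~500
        // StepExecutions persisted to BATCH_STEP_EXECUTION up front.
        // gridSize is ignored because the partition count is data-driven.
        Map<String, ExecutionContext> partitions = new HashMap<>(prodIds.size());
        for (String prodId : prodIds) {
            ExecutionContext context = new ExecutionContext();
            context.putString("prodId", prodId);
            partitions.put("partition-" + prodId, context);
        }
        return partitions;
    }
}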

Entity Framework with OData (Web API) sends an ORDER BY clause to the SQL query by default

I am using Web API with OData, and I have an entity defined in EF 5.0.
I am sending a very simple request to the controller:
$.ajax({
    url: "/odata/Details?$top=10",
    type: "GET",
    dataType: 'json',
    success: function (data) {
        viewModel.list(data.value);
    }
});
Now the code on my controller:
[Queryable]
public override IQueryable<Area> Get()
{
    return db.Area.AsQueryable();
}
The query I see using SQL Profiler:
SELECT TOP (@p__linq__1)
    [Project1].[id] AS [id1],
    [Project1].[name] AS [name1],
    [Project1].[pucrhase] AS [pucrhase1],
    [Project1].[sale] AS [sale1]
FROM Area
ORDER BY [Project1].[id] DESC, [Project1].[name] ASC, [Project1].[pucrhase] ASC,
    [Project1].[sale] ASC,N',@p__linq__1 int,@p__linq__1=10
I have not requested any ordering; EF adds the ORDER BY clause to the query by itself, and the clause it adds contains all columns of the table. The table has 3 million records, and the query is timing out because it is ordering by all columns.
I tested by removing the ORDER BY, and the query took less than a second to finish.
So the questions are:
How do I stop Entity Framework (with Web API OData support) from adding an ORDER BY clause to the SQL query?
How do I remove the ORDER BY clause from the SQL query that Entity Framework (Web API OData) runs on the server?
Any help is appreciated.
I'm not sure if this is the right answer, but I'm assuming the OData service is attempting to maintain a stable sort order by ordering on all properties within your model.
Therefore, try:
[Queryable(EnsureStableOrdering = false)]

How to do a safe "SELECT FOR UPDATE" with a WHERE condition over multiple tables on a DB2?

Problem
On a DB2 (version 9.5) the SQL statement
SELECT o.Id FROM Table1 o, Table2 x WHERE [...] FOR UPDATE WITH RR
gives me the error message SQLSTATE=42829 (The FOR UPDATE clause is not allowed because the table specified by the cursor cannot be modified).
Additional info
I need to specify WITH RR because I'm running on isolation level READ_COMMITTED, but I need my query to block while another process is running the same query.
Solution so far...
If I instead query like this:
SELECT t.Id FROM Table t WHERE t.Id IN (
    SELECT o.Id FROM Table1 o, Table2 x WHERE [...]
) FOR UPDATE WITH RR
everything works fine.
New problem
But now I occasionally get deadlock exceptions when multiple processes perform this query simultaneously.
Question
Is there a way to formulate the FOR UPDATE query without introducing a place where a deadlock can occur?
First, you do not need to specify WITH RR for isolation level READ_COMMITTED; WITH RR results in the isolation level SERIALIZABLE. Specifying WITH RS (Read Stability) is enough.
To propagate the FOR UPDATE WITH RS to the inner SELECT, you additionally have to specify USE AND KEEP UPDATE LOCKS.
So the complete statement looks like this:
SELECT t.Id FROM Table t WHERE t.Id IN (
    SELECT o.Id FROM Table1 o, Table2 x WHERE [...]
) FOR UPDATE WITH RS USE AND KEEP UPDATE LOCKS
I ran some tests on DB2 via JDBC, and it worked without deadlocks.