Are Spring Batch 3 database structure changes compatible with Spring Batch 2? - spring-batch

We migrated from Spring Batch 2.1.7 to Spring Batch 3.0.6, but got this JBoss startup error:
org.springframework.jdbc.BadSqlGrammarException: PreparedStatementCallback; bad SQL grammar [SELECT E.JOB_EXECUTION_ID, E.START_TIME, E.END_TIME, E.STATUS, E.EXIT_CODE, E.EXIT_MESSAGE, E.CREATE_TIME, E.LAST_UPDATED, E.VERSION, E.JOB_INSTANCE_ID, E.JOB_CONFIGURATION_LOCATION from BATCH_JOB_EXECUTION E, BATCH_JOB_INSTANCE I where E.JOB_INSTANCE_ID=I.JOB_INSTANCE_ID and I.JOB_NAME=? and E.END_TIME is NULL order by E.JOB_EXECUTION_ID desc]; nested exception is java.sql.SQLSyntaxErrorException: ORA-00904: "E"."JOB_CONFIGURATION_LOCATION": invalid identifier
...which was apparently caused by the table structure changes Spring Batch 3 makes relative to Spring Batch 2: the new code expects columns that don't exist in our old schema.
To get things moving, our DBA team wrote a script to update (rather than recreate) the tables, based on the create-table script our developer team found in one of the Spring Batch jars, since we need to keep the job history. This is all working so far, but here is our issue:
We can't migrate all our systems forward to Spring Batch 3. We have to leave the older ones in Spring Batch 2 for a while.
Are these Spring Batch 3 table structure changes backward compatible with Spring Batch 2?
Based on analysis by our DBA team and our batch run results so far, they appear to be, but I'm asking whether this was intentional on Spring's part, i.e. when Spring altered the table structure for Spring Batch 3, did you INTENTIONALLY make it backward compatible?
I just want to make sure there isn't some subtle difference that will break our system badly down some rarely-used logic path, i.e. at statement execution time (rather than at JBoss startup time).
Ben Ethridge

They are not backwards compatible. The way job parameters are stored is different. The migration scripts did not remove the old columns (they just added the net-new ones). That doesn't mean you couldn't come up with a schema that works for both versions (it looks like that's what you have), but as for our intent: it was identified as a breaking change when we added non-identifying parameters.
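If you ever need to re-run that upgrade on another environment, the ALTER script can also be applied programmatically. A minimal sketch using Spring's ResourceDatabasePopulator, assuming a hypothetical script location (point it at the migration DDL the answer refers to, or at the script your DBA team wrote):

```java
import javax.sql.DataSource;
import org.springframework.core.io.ClassPathResource;
import org.springframework.jdbc.datasource.init.ResourceDatabasePopulator;

public class BatchSchemaUpgrader {

    // Applies the upgrade DDL to the existing batch tables instead of recreating them,
    // so the job history is preserved.
    public static void upgrade(DataSource dataSource) {
        ResourceDatabasePopulator populator = new ResourceDatabasePopulator();
        // Hypothetical classpath location for the ALTER script.
        populator.addScript(new ClassPathResource("db/upgrade-batch-2.1-to-3.0-oracle.sql"));
        populator.setContinueOnError(false); // fail fast if an ALTER statement breaks
        populator.execute(dataSource);
    }
}
```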

Related

Axon Framework: Table tokenentry doesn't exist

I'm new to development with Axon Framework. My problem is that when I run my microservice (a client connecting to Axon Server), this error message
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Table 'banking2.tokenentry' doesn't exist
is displayed in the console and my microservice fails to start!
It pretty much depends on how you have configured your project and how you want to set up the tables.
A common approach for enterprises is to have scripts (in the form of migrations) that run and create the tables. If that is the case, you have to provide your own script to create this table.
If you are using Hibernate, for instance, you can set the hibernate.hbm2ddl.auto property to ask it to create the tables for you (among other options; better check the Hibernate docs).
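For the Hibernate route, here is a minimal sketch of a Spring JPA configuration that sets hibernate.hbm2ddl.auto so missing tables (such as the token entry table) get created. The package name is a placeholder, and in a Spring Boot application the usual shortcut is the spring.jpa.hibernate.ddl-auto property:

```java
import java.util.Properties;
import javax.sql.DataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean;
import org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter;

@Configuration
public class JpaConfig {

    @Bean
    public LocalContainerEntityManagerFactoryBean entityManagerFactory(DataSource dataSource) {
        LocalContainerEntityManagerFactoryBean emf = new LocalContainerEntityManagerFactoryBean();
        emf.setDataSource(dataSource);
        // Placeholder package: your own entities plus whatever framework entities you rely on.
        emf.setPackagesToScan("com.example.banking");
        emf.setJpaVendorAdapter(new HibernateJpaVendorAdapter());
        Properties jpaProperties = new Properties();
        // "update" asks Hibernate to create or alter missing tables; see the Hibernate docs
        // for the other values (validate, create, create-drop, none).
        jpaProperties.setProperty("hibernate.hbm2ddl.auto", "update");
        emf.setJpaProperties(jpaProperties);
        return emf;
    }
}
```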

Running Camunda with Spring Boot & MongoDB

Has anyone been able to get Camunda to run with Spring Boot and MongoDB?
I tried several approaches and always hit a brick wall.
What I tried:
1. JPA / Hibernate OGM
I was able to initiate a connection to Mongo after creating my own CamundaDatasourceConfiguration and ProcessEngineConfigurationImpl.
It failed when Camunda tried to get table metadata. I couldn't disable this behavior.
2. JDBC driver for Mongo by Progress
I set up the JDBC URL and the Progress driver class.
Camunda then gets stuck during the startup process and never reaches the point where Jetty is fully started, i.e. the "Jetty started on port XYZ" message in the log.
3. Camunda with Postgres and a Mongo FDW
FDW is a mechanism for Postgres to interface with an external data source. This way an application can work with Postgres over JDBC while the FDW takes care of reading and writing the data to the external source, be it a file, MongoDB, etc.
After realizing 1 and 2 don't work, I started working on 3.
Has anyone succeeded in doing this and can share how?
So, I ran into the same problem and decided to share my findings with you.
Currently it is not possible to run the Camunda engine with a NoSQL database.
In this Camunda forum post, one of the people at Camunda also says it is not possible to run the engine completely without a database.
And the official Camunda docs contain a list of all supported environments. Currently only SQL databases are listed:
https://docs.camunda.org/manual/7.10/introduction/supported-environments/
But in some earlier blog posts they mentioned that they want to build some proof-of-concept examples using NoSQL databases, so we can expect these databases to be supported in the future, but not at the moment.
(Note: the Flowable engine is doing the same proof of concepts; they mentioned that they want to be able to use NoSQL databases by the end of next year.)

How to deploy changes to a Cassandra CQL schema

We have an application that uses Cassandra as its database. How should we deploy schema changes in a live production environment?
In development we are just blowing the database away and recreating it with a 'database.cql' script kept in version control. This clearly isn't a solution in production.
In the relational world I would either use a sequence of upgrade scripts and apply them in order, or use a tool to interactively compare the staging and production databases and make the appropriate schema changes.
How do I solve the same problem in Cassandra?
Here's one I've started and have been using for a while.
https://github.com/heartysoft/aedes
It supports multiple environments and versioning. Since we're Windows based, it's mainly PowerShell, but there's no reason a bash script couldn't be written to do the equivalent. The PowerShell script itself is extremely simple; it requires PowerShell v3+. Usage is pretty easy:
aedes.ps1 192.168.40.4 [-u username -p password -env dev]
will look for schema files in the ..\schema folder. Schema files are expected to have an n_ prefix (where n is the version number). Environment-specific files have a .env.cql postfix. So, if the files are:
1_people.dev.cql
1_people.prod.cql
2_people_some_indexes.cql
3_jobs.dev.cql
3_jobs.prod.cql
4_jobs_something_changed.cql
If you run it for prod, the .prod.cql files and the files without an "env" part will be applied in order. You can also specify a $start version to control where to start applying from (e.g. if start is specified as 3, then anything with 1_ and 2_ will be skipped).
It's pretty basic but seems to work quite well. We just have Cassandra downloaded (not installed) on the "applier machine" (which could be your own machine, i.e. not part of a cluster) and have cqlsh on the PATH for easier application. I did (and do) have plans for more features, but it's working nicely as is for the time being.
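If you're not on Windows (or just prefer a driver over cqlsh), the same naming convention is easy to reproduce in code. A rough Java sketch, not part of aedes, using the DataStax 3.x driver; the statement splitting is deliberately naive and only suitable for simple DDL scripts:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CqlSchemaApplier {

    // Applies n_*.cql files from schemaDir in version order, honouring the
    // ".env.cql" postfix and a start version, following the convention above.
    public static void apply(String host, String env, int start, Path schemaDir) throws Exception {
        try (Cluster cluster = Cluster.builder().addContactPoint(host).build();
             Session session = cluster.connect()) {

            List<Path> scripts;
            try (Stream<Path> files = Files.list(schemaDir)) {
                scripts = files
                        .filter(p -> p.getFileName().toString().matches("\\d+_.*\\.cql"))
                        // keep environment-neutral files plus the ones for the requested env
                        .filter(p -> {
                            String name = p.getFileName().toString();
                            return !name.matches(".*\\.\\w+\\.cql") || name.endsWith("." + env + ".cql");
                        })
                        // skip everything below the requested start version
                        .filter(p -> versionOf(p) >= start)
                        .sorted(Comparator.comparingInt(CqlSchemaApplier::versionOf))
                        .collect(Collectors.toList());
            }

            for (Path script : scripts) {
                // naive split on ';' -- fine for plain DDL, not for statements containing ';' in strings
                for (String stmt : new String(Files.readAllBytes(script)).split(";")) {
                    if (!stmt.trim().isEmpty()) {
                        session.execute(stmt.trim());
                    }
                }
            }
        }
    }

    private static int versionOf(Path p) {
        return Integer.parseInt(p.getFileName().toString().split("_")[0]);
    }
}
```

Running it for prod with a start of 3 would apply 3_jobs.prod.cql and 4_jobs_something_changed.cql, matching the behaviour described above.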
Since there wasn't an existing tool, I ended up writing one.
It is called cql-migrate, and provides incremental updates to a deployed Cassandra schema.
[update] Since writing this, I have found a couple more options: one for Rails and another for Go.

Customizing DB2 Command Line Processor

I have written a program in Java that reads .csv files and stores them in a database table, but the performance of the storing operation is very slow. When I use the DB2 Command Line Processor there is a drastic difference in performance: it is very fast. So I am trying to customize the DB2 Command Line Processor to my requirements. I searched on Google but only found topics on how to use it. I would like to get clear on the following subjects before I start.
Is "DB2 Command Line Processor" open source?
Which programming language is used?
Is there an alternative to the DB2 Command Line Processor with open source code in Java?
Is there a way to call the DB2 Command Line Processor from a Java program?
It may be worth investigating the Java program: the slow run times may be related to how often you are committing the data (i.e. you may be running in auto-commit mode, committing after every insert).
Committing every 500 inserts may be a lot faster than committing after every record;
see DB2 autocommit for details on auto-commit.
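To illustrate, a minimal sketch of batching and committing every 500 rows with plain JDBC; the table, the columns, and the naive CSV parsing are placeholders:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class CsvLoader {

    private static final int COMMIT_INTERVAL = 500;

    public static void load(String csvFile, String jdbcUrl, String user, String pass) throws Exception {
        try (Connection con = DriverManager.getConnection(jdbcUrl, user, pass);
             BufferedReader in = new BufferedReader(new FileReader(csvFile))) {
            con.setAutoCommit(false); // no longer commits after every single insert
            try (PreparedStatement ps = con.prepareStatement(
                    "INSERT INTO MYSCHEMA.MYTABLE (COL1, COL2) VALUES (?, ?)")) { // placeholder table/columns
                String line;
                int count = 0;
                while ((line = in.readLine()) != null) {
                    String[] fields = line.split(",", -1); // naive split, no quoted fields
                    ps.setString(1, fields[0]);
                    ps.setString(2, fields[1]);
                    ps.addBatch();              // send rows in batches instead of one round trip per row
                    if (++count % COMMIT_INTERVAL == 0) {
                        ps.executeBatch();
                        con.commit();           // commit every 500 rows, not every record
                    }
                }
                ps.executeBatch();
                con.commit();                   // flush and commit the final partial batch
            }
        }
    }
}
```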
1) The DB2 CLP (command line processor) is part of DB2. It is not open source, and it is included in all editions (Express-C, Express, Workgroup, Extended) and in the Data Server Client. The latter is free to download and install on all clients.
2) The best way to use the capabilities of the DB2 CLP is via scripts, such as bash scripts or Windows scripts.
You can also call the DB2 CLP from another program, such as a Java application (via Runtime).
3) There are open source shells for databases; however, you are mixing two things: a shell, which is normally a black screen where you type commands, and a driver used to query a database from a program you develop yourself.
4) Again, via Runtime: http://docs.oracle.com/javase/6/docs/api/java/lang/Runtime.html (a small sketch follows below).
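A small sketch of point 4, using ProcessBuilder (which builds on the same Runtime facility); it assumes the db2 executable is on the PATH and the CLP environment has been initialized (e.g. via db2cmd on Windows, or by sourcing db2profile on Linux/UNIX):

```java
import java.io.File;

public class Db2ClpRunner {

    // Runs a CLP script from Java and returns the CLP exit code (0 = success).
    public static int runScript(File sqlScript) throws Exception {
        // -t: statements end with ';', -v: echo commands, -f: read from file
        ProcessBuilder pb = new ProcessBuilder("db2", "-tvf", sqlScript.getAbsolutePath());
        pb.inheritIO();                    // forward CLP output to this process's console
        Process process = pb.start();
        return process.waitFor();
    }
}
```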
Finally, the best option is to use a JDBC driver, in order to do things directly and not through a lot of tiers. You should check your Java code; the reading is probably not efficient. Also check the properties of the DB2 Java driver.
One more thing: if you want the fastest option, try using LOAD to insert the data into the database. It does not write to the transaction log. You can call LOAD from a Java application (remember to load the DB2 environment before executing any command).

Database migrations: manage with build script or automatic on app startup?

I'm in the process of developing a deployment system for a new web app and I'm wondering where the best point in the process to manage database migrations is (the question of how to do the migrations is another problem entirely).
It seems there are two ways to go:
1. Use a migration script that can either be run manually from the command line or as part of the automatic deployment/build process
2. Run the migrations when the app starts up (I'm using ASP.NET so this can be done easily enough without causing a long-running user request)
Does anyone have any suggestions/insight/experience with these approaches? Any other suggestions?
I can see why #1 might be more attractive - it gives me complete control over when the DB is updated. However, I quite like #2 as it allows me to quickly iterate between deployments and reduces the manual process. #2 could also be used on my development machine to allow even quicker iterations. Hmm, starting to think having both might be a good thing...
We have a sales-force system with ~100 clients, and we update the database at application startup (true, ours is a desktop application). I like this approach: it's safe and iterative when the starting point is indeterminate (is the client database new, or only updated to version x.y.z?).
But on the server side I prefer your #1 option: we create a SQL query file on our virtual machine (based on a copy of the original database) and run that query against the real server.
So IMHO:
Disconnected clients: startup, iterative scripts
Server: query created on a VM based on the actual, real database
I'm interested in this problem too, and found some (half-)frameworks such as RikMigrations. After some googling, a good starting place for DB versioning/migration frameworks is the .NET Database Migration Tool Roundup. Not necessarily the documentation, but the team blogs can be interesting.
I like option #1 better as it seems much more flexible. In lieu of actually performing migrations on each app start, I think I would verify that the database schema (version number?) matches the code, and if not, throw a warning or error about a mismatched database schema.
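The question is about ASP.NET, but the check itself is language-agnostic; here is a minimal sketch in Java against a hypothetical SCHEMA_VERSION bookkeeping table that the migration scripts are assumed to keep up to date:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import javax.sql.DataSource;

public class SchemaVersionCheck {

    // The version the deployed code expects; bump it with every migration script (placeholder value).
    private static final int EXPECTED_SCHEMA_VERSION = 42;

    public static void verify(DataSource dataSource) throws Exception {
        try (Connection con = dataSource.getConnection();
             Statement st = con.createStatement();
             // SCHEMA_VERSION is a hypothetical one-row table written by the migration scripts
             ResultSet rs = st.executeQuery("SELECT MAX(version) FROM SCHEMA_VERSION")) {
            int actual = rs.next() ? rs.getInt(1) : -1;
            if (actual != EXPECTED_SCHEMA_VERSION) {
                throw new IllegalStateException(
                        "Database schema version " + actual + " does not match expected "
                        + EXPECTED_SCHEMA_VERSION + "; run the migration script before starting the app.");
            }
        }
    }
}
```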
I'd prefer option #1 for a number of reasons. First, integration tests usually require your DB schema to be up to date, and launching a web site just to upgrade the schema is a huge time waster. Second, you cannot change the database schema while your site is running (say, to add a couple of indexes to speed things up).
As for the production side of things, upgrading your database transactionally as part of an MSI-style installation is much better than attempting to upgrade at each app startup, since with startup upgrades you can end up with desynchronized database and application versions.
And if you're looking for a migration framework, take a look at Wizardby.
If the application ever has to run on a customer's machine, then migrating at startup can prevent a lot of support calls - assuming you can do a seamless migration without user intervention (I hope you aren't normally running your web app with permission to modify the database).
If the application always runs under your control, automatic migration is less of an issue - but it can still be a good feature, especially if you want to minimize downtime and manual deployment steps.