Is it possible to implement a Spring Batch job repository using any of the latest versions of MongoDB with transactional support? - mongodb

I have gone through several implementations of a Spring Batch job repository using MongoDB, but couldn't find a stable one that supports transactions in the job repository. I have also read a note that MongoDB is not recommended for the job repository because it does not support transactions. So I need to know whether the job repository can be implemented with any of the latest versions of MongoDB with transactional support.

I have also read a note that MongoDB is not recommended for the job repository because it does not support transactions.
MongoDB added support for multi-document transactions in version 4.0. There is a feature request against Spring Batch to use MongoDB as a job repository (https://github.com/spring-projects/spring-batch/issues/877), but this feature has not been implemented yet.
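For context, multi-document transactions are available from a plain MongoDB client today. Here is a minimal PyMongo sketch, independent of Spring Batch; the connection string, database, and collection names are made-up placeholders:

    from pymongo import MongoClient

    # Multi-document transactions need MongoDB 4.0+ and a replica set
    # (4.2+ for sharded clusters).
    client = MongoClient("mongodb://localhost:27017/?replicaSet=rs0")
    db = client["batch_demo"]

    with client.start_session() as session:
        with session.start_transaction():
            # Both writes commit or roll back together.
            db.job_executions.insert_one(
                {"jobName": "demoJob", "status": "STARTED"}, session=session)
            db.step_executions.insert_one(
                {"jobName": "demoJob", "step": "step1"}, session=session)

A Spring Batch job repository built on top of this would still need the feature tracked in the issue above; the sketch only shows that the database-side capability exists.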

Related

AWS Glue - version control and setting up for continuous integration

We are in the process of setting up a CI/CD process for an AWS Glue ETL pipeline. The existing ETL process contains the following AWS Glue components: crawlers, registered tables in the catalog, jobs, triggers, and workflows.
Obviously, the first step is to set up a code repository and link the existing artifacts from the different components mentioned above to it, which would ideally let developers perform check-ins and pull requests from the tool (something similar to ADF and Databricks). However, as far as we have explored, AWS Glue does not integrate with any source code repository that directly provides this feature, unless we are missing something.
Hence, what is the method to set up the environment for CI (I'm not yet talking about CD)? The link below gives a reference for CI/CD:
https://aws.amazon.com/blogs/big-data/implement-continuous-integration-and-delivery-of-serverless-aws-glue-etl-applications-using-aws-developer-tools/
However, it mentions at the beginning that the ETL job code and the AWS CloudFormation template file for deploying the ETL jobs are both committed to version control, so it is not clear how this works for the ongoing, regular commits from the developers.
However, as far as we have explored, AWS Glue does not integrate with any source code repository that directly provides this feature, unless we are missing something.
Correct, Glue does not have built-in version control integration.
I develop (Python and CloudFormation) locally in VS Code and use its Git integration. I use a container if I want to test something locally, but Glue also has Dev Endpoints for similar tasks.
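To sketch what the CI side can look like once the job scripts live in a repository: a pipeline step can upload the checked-in script to S3 and update the Glue job definition through the API. A minimal boto3 example, where the bucket, job name, IAM role, and script path are hypothetical placeholders:

    import boto3

    # Placeholder names - replace with your bucket, job, role, and script path.
    SCRIPT_BUCKET = "my-glue-scripts"
    JOB_NAME = "etl-job"
    GLUE_ROLE = "arn:aws:iam::123456789012:role/GlueJobRole"

    s3 = boto3.client("s3")
    glue = boto3.client("glue")

    # 1. Push the script from the repo checkout to S3.
    s3.upload_file("jobs/etl_job.py", SCRIPT_BUCKET, "jobs/etl_job.py")

    # 2. Point the existing Glue job at the uploaded script.
    glue.update_job(
        JobName=JOB_NAME,
        JobUpdate={
            "Role": GLUE_ROLE,
            "Command": {
                "Name": "glueetl",
                "ScriptLocation": f"s3://{SCRIPT_BUCKET}/jobs/etl_job.py",
            },
        },
    )

Crawlers, triggers, and workflows are usually easier to keep in the CloudFormation template, which is the approach the linked blog post describes.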

Apache Kafka patch release process

How does Kafka release patch updates?
How do users get to know about Kafka patch updates?
Kafka is typically distributed as a zip/tar archive containing the binaries used to start, stop, and manage Kafka. To keep track of releases, you may want to:
Subscribe to https://kafka.apache.org/downloads by generating a feed for it.
Subscribe to any feeds that give you updates.
Write a script that periodically checks https://downloads.apache.org/kafka/ for new Kafka releases and notifies you or downloads them (a sketch follows below).
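A minimal sketch of such a check, assuming the directory listing at https://downloads.apache.org/kafka/ keeps its current format of one directory per version (e.g. 3.7.0/):

    import re
    import urllib.request

    DOWNLOADS_URL = "https://downloads.apache.org/kafka/"
    KNOWN_VERSION = "3.6.1"  # the last version you acted on; persist this somewhere

    # Fetch the directory listing and pull out the version-shaped directory names.
    html = urllib.request.urlopen(DOWNLOADS_URL).read().decode("utf-8")
    versions = sorted(
        set(re.findall(r'href="(\d+\.\d+\.\d+)/"', html)),
        key=lambda v: tuple(map(int, v.split("."))),
    )

    if versions and versions[-1] != KNOWN_VERSION:
        print(f"New Kafka release available: {versions[-1]} (known: {KNOWN_VERSION})")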
Kafka version numbers typically follow the major.minor.patch format.
Every time there is a new Kafka release, we need to download the latest archive, reuse the old configuration files (making changes if required), and start Kafka from the new binaries. The upgrade process is fully documented in the Upgrading section at https://kafka.apache.org/documentation
For production environments, we have several options:
1. Using a managed Kafka service (e.g. on AWS, Azure, Confluent, etc.)
In this case, we need not worry about patching and security updates to Kafka because they are taken care of by the service provider. On AWS, you will typically get notifications in the console about when your Kafka update is scheduled.
It is easy to get started with a managed Kafka service for production environments.
2. Using self-hosted Kafka on Kubernetes (e.g. using Strimzi)
If you are running Kafka in a Kubernetes environment, you can use the Strimzi operator and helm upgrade to move to the version you require. You first need to refresh the chart information from the repository using helm repo update; a short sketch follows.
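As an illustration, the two helm steps wrapped in a small Python script; the release name, namespace, and chart version are assumptions to replace with your own:

    import subprocess

    RELEASE = "strimzi"                       # assumed Helm release name
    NAMESPACE = "kafka"                       # assumed namespace
    CHART = "strimzi/strimzi-kafka-operator"  # Strimzi operator chart
    TARGET_VERSION = "0.39.0"                 # assumed chart version to upgrade to

    # Refresh chart metadata from the configured repositories.
    subprocess.run(["helm", "repo", "update"], check=True)

    # Upgrade the operator; the operator then rolls the brokers according to
    # the Kafka version set in the Kafka custom resource.
    subprocess.run(
        ["helm", "upgrade", RELEASE, CHART,
         "--version", TARGET_VERSION, "--namespace", NAMESPACE],
        check=True,
    )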
Managed services and Kubernetes operators make this easy; manually managing Kafka clusters is comparatively more work.

CI/CD of a database to multiple instances - Azure DevOps

I have written a step to deploy a database build to the respective database. But I can only deploy to one database at a time, i.e. one step per database. Is it possible to deploy the same database build to multiple databases in one step of a release?
Is it possible to deploy the same database build to multiple databases in one step of a release?
AFAIK, there is no built-in task to do it. But you could create a PowerShell/batch script that loops over your target databases and runs sqlpackage.exe to deploy the DACPAC to each of them within a single release step (a sketch is shown below):
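A minimal sketch of that loop, shown here in Python for illustration (a PowerShell foreach over the same sqlpackage.exe arguments works the same way); the DACPAC path, server, and database names are placeholders:

    import subprocess

    DACPAC = r"C:\build\drop\MyDatabase.dacpac"  # placeholder path to the built DACPAC
    SERVER = "myserver.database.windows.net"     # placeholder target server
    DATABASES = ["CustomerDb1", "CustomerDb2", "CustomerDb3"]

    for db in DATABASES:
        # Publish the same DACPAC into each target database in turn.
        subprocess.run(
            ["sqlpackage.exe",
             "/Action:Publish",
             f"/SourceFile:{DACPAC}",
             f"/TargetServerName:{SERVER}",
             f"/TargetDatabaseName:{db}"],
            check=True,
        )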
You could check the similar thread Deployment to several databases using SQL Server Data Tools and Team Foundation Server for some details.
Besides, there are many extensions in the Marketplace that can do it, so choose one that meets your requirements:
https://marketplace.visualstudio.com/search?term=sql&target=AzureDevOps&category=All%20categories&visibilityQuery=all&sortBy=Relevance
Hope this helps.

Spring Cloud Config server concurrency control

I have multiple consuming app instances connecting to a Spring Cloud Config server. The config server gets its configuration from an SVN repo.
I just wanted to understand how a config server instance manages possibly concurrent requests.
Thanks.
That's a bug in the SVN support (the Git implementation has a synchronized method): https://github.com/spring-cloud/spring-cloud-config/issues/128

How to manage database context changes in production / CI

I've spent the past few months developing a Web API solution that I'm ready to push up to Azure and hook into an Azure SQL Database. It was built with EF Code First.
I'm wondering what standard approaches there are to making changes to the database while in production. I've been using database initializers up to this point, but they all blow away the data and re-seed.
I have a feeling this question is too broad for a concise answer, so I'd like to ask: what terminology / processes / resources should a developer look into when designing a continuous integration workflow for a solution built with EF Code First and ASP.NET WebAPI, hosted as an Azure Service and hooked up to Azure SQL?
On the subject of database migrations, there is an interesting article on the ASP.NET site: Strategies for Database Development and Deployment.
Also, since you are using EF Code First, you can use Code First Migrations for database changes. This will allow you to better manage the changes you make to the database.
I'm not sure how far you want to go with continuous integration, but since you are using Azure it might be worth having a look at Continuous delivery to Windows Azure by using Team Foundation Service. Although that relies on TFS in the cloud, it is of course also possible to configure something similar with, for example, Jenkins; however, this requires a bit more work.
I use this technique:
1- Create a clone database for your development environment if it doesn't exist.
2- Make the necessary changes in your dev environment and dev database.
3- Deploy to your staging environment.
4- If you added static data that should also exist in your prod database, use a tool like SQLDataExaminer to find the data differences and execute the inserts, updates, and deletes for the corresponding rows. Use Schema Compare in VS2012 to find schema differences between your dev and prod environments, selecting dev as the source and prod as the target, and execute the resulting script against prod.
5- Swap the environments.