I am planning to run Spring Batch on Azure as a serverless workload and am looking at Cosmos DB to store and manipulate the data.
Can I still use the Spring Batch metadata tables with Cosmos DB? If not, where should the Spring Batch metadata be stored?
How can we schedule a batch job on Azure? Is there a complete working example?
Cosmos DB is not a supported database in Spring Batch, but you may be able to use one of the supported types if the SQL variant is close enough.
Please refer to the Non-standard Database Types in a Repository section of the documentation for more details and a sample.
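For illustration, here is a minimal sketch of that approach, assuming the database can be reached through a JDBC DataSource; the choice of "db2" as the closest dialect is purely illustrative, and the class name is made up:

    import javax.sql.DataSource;

    import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
    import org.springframework.batch.core.repository.JobRepository;
    import org.springframework.batch.core.repository.support.JobRepositoryFactoryBean;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.transaction.PlatformTransactionManager;

    @Configuration
    @EnableBatchProcessing
    public class JobRepositoryConfig {

        // Force the job repository onto the closest supported SQL dialect instead of
        // letting Spring Batch try (and fail) to auto-detect the database type.
        @Bean
        public JobRepository jobRepository(DataSource dataSource,
                                           PlatformTransactionManager transactionManager) throws Exception {
            JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
            factory.setDataSource(dataSource);
            factory.setTransactionManager(transactionManager);
            factory.setDatabaseType("db2"); // illustrative: pick whichever supported type matches best
            factory.afterPropertiesSet();
            return factory.getObject();
        }
    }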
Related
Is it possible to use Cosmos DB as a job repository for Spring Batch?
If that is not possible, can we go with an in-memory DB to handle our Spring Batch jobs?
The job itself is triggered on message arrival in a remote queue. We use a variation of the process indicator pattern in our current Spring Batch job to keep track of the "chunks" being processed, and saveState is disabled on our readers. The reader always uses a DB query to avoid picking up the same chunks and to prevent duplicate processing.
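For context, a rough sketch of that kind of process-indicator reader; the BATCH_INPUT table, PROCESSED column and configuration class are made up for the example:

    import java.util.Map;
    import javax.sql.DataSource;

    import org.springframework.batch.item.database.JdbcCursorItemReader;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.jdbc.core.ColumnMapRowMapper;

    @Configuration
    public class ReaderConfig {

        // Only rows still flagged as unprocessed are picked up, so a redelivered
        // message re-runs the job without re-reading chunks that already completed.
        @Bean
        public JdbcCursorItemReader<Map<String, Object>> unprocessedRecordReader(DataSource dataSource) {
            JdbcCursorItemReader<Map<String, Object>> reader = new JdbcCursorItemReader<>();
            reader.setDataSource(dataSource);
            reader.setSql("SELECT ID, PAYLOAD FROM BATCH_INPUT WHERE PROCESSED = 'N'");
            reader.setRowMapper(new ColumnMapRowMapper());
            reader.setSaveState(false); // restart state comes from the PROCESSED flag, not the execution context
            return reader;
        }
    }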
We don't commit the message on the queue until all records for that job are processed, so if the node dies and comes back up in the middle of processing, the same message is redelivered, which takes care of job restarts. Given all this, we have a choice of either coming up with a way to implement a Cosmos job repository or simply using the in-memory one and plugging in an afterJob listener to clean up the in-memory job data, to ensure Java memory is not exhausted in production. Any recommendations?
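If you go the in-memory route, here is a rough sketch of the cleanup idea, assuming the map-based MapJobRepositoryFactoryBean; the queue-listener class and method names are made up. Note that clearing inside afterJob itself can be too early, because Spring Batch still persists the final execution status after that callback returns, so this sketch clears once the launcher call has completed:

    import org.springframework.batch.core.Job;
    import org.springframework.batch.core.JobParameters;
    import org.springframework.batch.core.JobParametersBuilder;
    import org.springframework.batch.core.launch.JobLauncher;
    import org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean;

    public class QueueTriggeredJobRunner {

        private final JobLauncher jobLauncher;
        private final Job job;
        private final MapJobRepositoryFactoryBean repositoryFactory;

        public QueueTriggeredJobRunner(JobLauncher jobLauncher, Job job,
                                       MapJobRepositoryFactoryBean repositoryFactory) {
            this.jobLauncher = jobLauncher;
            this.job = job;
            this.repositoryFactory = repositoryFactory;
        }

        // Invoked when a message arrives on the remote queue (listener wiring not shown).
        public void onMessage(String messageId) throws Exception {
            JobParameters params = new JobParametersBuilder()
                    .addString("messageId", messageId)
                    .toJobParameters();
            try {
                jobLauncher.run(job, params);
            } finally {
                // Drop all in-memory execution metadata once the run is over,
                // so the map-based repository never grows in production.
                repositoryFactory.clear();
            }
        }
    }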
Wanted to share that Azure Cosmos DB has just released v3 of the Spring Data connector for the SQL API:
The Spring on Azure team, in partnership with the Azure Cosmos DB team, are proud to have just made the Spring Data Azure Cosmos DB v3 generally available. This is the latest version of Azure Cosmos DB’s SQL API Spring Data connector.
Also, spring.io has an example batch-based microservices solution (Spring Cloud Data Flow) that could serve as a reference for your solution.
Additional Information:
Spring Data Azure Cosmos DB v3 for Core (SQL) API: Release notes and resources (link)
A well-written third-party blog post that is very helpful:
Introduction to Spring Data Azure Cosmos DB (link)
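To give a feel for the v3 connector, here is a minimal sketch assuming the azure-spring-data-cosmos dependency is on the classpath; the "orders" container, the entity fields and the repository are made up for the example:

    import org.springframework.data.annotation.Id;

    import com.azure.spring.data.cosmos.core.mapping.Container;
    import com.azure.spring.data.cosmos.core.mapping.PartitionKey;
    import com.azure.spring.data.cosmos.repository.CosmosRepository;

    // Entity persisted through the Cosmos DB SQL API.
    @Container(containerName = "orders")
    public class Order {

        @Id
        private String id;

        @PartitionKey
        private String customerId;

        private double amount;

        // getters and setters omitted for brevity
    }

    // Spring Data repository; query methods are derived from the method name.
    interface OrderRepository extends CosmosRepository<Order, String> {
        Iterable<Order> findByCustomerId(String customerId);
    }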
I have created the Spring Cloud Task tables (i.e. TASK_EXECUTION, TASK_TASK_BATCH) with the prefix MYTASK_ and the Spring Batch tables with the prefix MYBATCH_ in an Oracle database.
The default tables are also present in the same schema; they were created automatically or by another teammate.
I have bound my Oracle database service to the SCDF server deployed on PCF.
How can I tell my Spring Cloud Data Flow server to use the tables created with my prefix to render data on the dashboard?
Currently, the SCDF dashboard uses the tables with the default prefix to render data, and that works fine. I want it to use my tables to render the dashboard screens.
I am using Data Flow server version 1.7.3, deployed on PCF using a manifest.yml.
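For reference, prefixes like these are normally set in the task/batch applications themselves through the standard Spring Boot / Spring Cloud Task properties, as in the illustrative snippet below; the question here is whether the SCDF server can be pointed at them as well:

    # In the task/batch application's application.properties (illustrative values)
    spring.cloud.task.tablePrefix=MYTASK_
    spring.batch.table-prefix=MYBATCH_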
There's an open story to add this enhancement via spring-cloud/spring-cloud-dataflow#2048.
Feel free to consider contributing, or share use-case details in the issue.
Currently, in spring-cloud-dataflow and spring-cloud-skipper we use Flyway to manage the database schemas, and it is not possible to prefix table names. Trying to support this would add too much complexity, and I'm not even sure it would be possible.
I'm looking for the best way to synchronise an on-premises MongoDB with an Azure DocumentDB. The idea is that the synchronisation runs at a predetermined interval, for example every 2 hours.
I'm using .NET and C#.
I was thinking that I could create a Windows Service that retrieves the documents from the Azure DocumentDB collections and inserts them into my on-premises MongoDB.
But I'm wondering if there is any better way.
Per my understanding, you could use the Azure Cosmos DB Data Migration Tool to export documents from collections to a JSON file, then pick up the exported file(s) and insert/update them into your on-premises MongoDB. Moreover, there is a tutorial about using the Windows Task Scheduler to back up DocumentDB that you could follow.
When executing the export operation, you can export to a local file or to Azure Blob Storage. For exporting to a local file, you could leverage the FileTrigger from the Azure WebJobs SDK Extensions to monitor file additions/changes under a particular directory, then pick up the newly created local file and insert its contents into your MongoDB. For exporting to Blob storage, you could also work with the WebJobs SDK and use the BlobTrigger to react to the new blob file and do the insertion. For the blob approach, you could follow How to use Azure blob storage with the WebJobs SDK.
In an ASP.NET Core application in Azure, I am using some stored procedures in a Cosmos DB database.
I know how to create these stored procedures with C# code, but I do not know when to deploy them (in my web app, by PowerShell, in a separate command-line application...).
When bootstrapping your DocumentClient, upsert your stored procedures for unpartitioned collections to simplify deployment. For partitioned collections, you have to delete/create the stored procedures - which can be problematic if you are upgrading your system in flight.
I tried to store the Spring Batch metadata tables in a Mongo database, but it is not working correctly. I referred to and used the GitHub project mentioned below to configure the JobRepository to store job data in MongoDB. This GitHub project was last updated 3 years ago and looks discontinued.
https://github.com/vfouzdar/springbatch-mongoDao
https://jbaruch.wordpress.com/2010/04/27/integrating-mongodb-with-spring-batch/
Currently my application uses in-memory tables for Spring Batch and the functional part is done, but I want the job data to be stored in MongoDB.
I have already used MySQL for Spring Batch job data, but in the current application I don't want MySQL.
If anybody has any other solution or link that can help me, please share.