A file prepared by one Spring Batch job is not accessible to another for deletion - spring-batch

I have a requirement where I have to prepare a file using one job, and another job, which runs once a day, sends the file to an external system and deletes or moves it from the location. When this second job tries to delete or move the file, it can't access it.
I tried setting writable to true when the file is created, running the jobs at separate times (one job at a time), and adding the "delete" as a step to the same job. Nothing worked.
I am using file.delete() and have also tried Files.deleteIfExists().
I suspect the first job is not assigning proper permissions, but I don't know a way to set permissions in Spring Batch.

Are these jobs run by the same user, i.e. with the same user and permissions?
Also, what is the actual error message? Does it say permission denied? If so, it is likely an OS restriction, not a Spring Batch/Java limitation. One way to surface the underlying cause is shown in the sketch below.
An easier solution would be to add a step to the first job that sends the files as part of that job, and to drop the separate job that just transfers the files.
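Note that File.delete() only returns false on failure, whereas java.nio.file.Files.delete(Path) throws an IOException naming the cause (for example AccessDeniedException). A minimal sketch, with a placeholder path:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class DeleteWithDiagnostics {
    public static void main(String[] args) {
        Path file = Paths.get("/data/outbound/report.csv"); // placeholder path
        try {
            // Unlike File.delete(), Files.delete() throws an exception that
            // names the reason: AccessDeniedException, NoSuchFileException, etc.
            Files.delete(file);
        } catch (IOException e) {
            System.err.println("Could not delete " + file + ": " + e);
        }
    }
}
```

The exception message should tell you whether this is a permissions problem or a file still being held open.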

Answering my own question 😀. Hope it helps someone.
The issue was that the last ItemWriter was holding the resources because I was using a composite writer. When using a CompositeItemWriter, the delegates' beforeStep/afterStep methods are "hidden"; you have to call them explicitly. I went with the approach of writing a custom writer that explicitly calls writer.close().
Adding an afterStep method and calling super.close() should also work, though I have not tried that out.
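A minimal sketch of that custom-writer approach (assuming the List-based write signature of Spring Batch 4.x; class and field names are illustrative):

```java
import java.util.List;

import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.StepExecutionListener;
import org.springframework.batch.item.ItemStreamWriter;
import org.springframework.batch.item.ItemWriter;

// Wraps a delegate writer and closes it when the step ends, so the file
// handle is released before another job tries to delete or move the file.
public class ClosingWriter<T> implements ItemWriter<T>, StepExecutionListener {

    private final ItemStreamWriter<T> delegate;

    public ClosingWriter(ItemStreamWriter<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void beforeStep(StepExecution stepExecution) {
        delegate.open(stepExecution.getExecutionContext());
    }

    @Override
    public void write(List<? extends T> items) throws Exception {
        delegate.write(items);
    }

    @Override
    public ExitStatus afterStep(StepExecution stepExecution) {
        delegate.close(); // release the file handle explicitly
        return stepExecution.getExitStatus();
    }
}
```

Alternatively, if the delegates implement ItemStream, registering each of them on the step (for example via the step builder's stream(...) method) lets Spring Batch call open/close on them itself.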

Related

Using Azure Data Factory output in Logic App

I have a Logic App that initially runs on a recurrence trigger and kicks off an ADF pipeline, which outputs a folder of files.
Then, I use a List Blobs action to pull one specific file
from the newly made folder and place its path on a queue.
And once a message is placed on that queue, it triggers the run of
another ADF pipeline.
The issue is that I have not seen a way to get the output of the first ADF pipeline onto the queue. I have tried to cheat within the List Blobs action that follows the first ADF pipeline by explicitly searching for the name of the output folder, because it will be the same every time.
However, even after the first ADF pipeline has run and produced the folder, on the first run of the Logic App the List Blobs action can't find the folder and says the file path is not found.
Only after I run the Logic App a second time is the folder finally found, which is not at all optimal. How can I fix this? I would prefer to keep everything in one Logic App. Are there other Azure tools that could help in addition?
I don't have the details of the implementation, but I am wondering: is the message written by the first pipeline only used as a signal for the second pipeline? If that's the case, why can't you call the second pipeline on completion of the first one? Or maybe these pipelines are in different ADFs?
I also suggest you read up on Event triggers and see if you can use them.

Deploying Batch with CloudFormation

I've been able to create a Compute Environment, a Job Queue and about a dozen Job Definitions using CloudFormation. Great!
Unless I'm missing something, there doesn't seem to be a resource type to actually submit my Job Definitions using CloudFormation. :(
At first, I thought I had it figured out because you can create CloudWatch Events that trigger a Job Submission. However, I notice that the Event Rule in CloudFormation does not have support for Batch like the CLI/SDK does. Lame!
Anyone else deploying Batch with CloudFormation? How are you submitting jobs? I guess I can create a Custom Resource, but that seems harder than it should be.
Does https://docs.aws.amazon.com/batch/latest/userguide/batch-cwe-target.html solve your problem?
AWS Batch jobs are available as CloudWatch Events targets. Using simple rules that you can quickly set up, you can match events and submit AWS Batch jobs in response to them.
When you create a new rule, add the batch job as a target.
The easiest way would be to create a Lambda function. You can create it via CloudFormation and capture your requirement in the function code; see the sketch below.
Or, as you mentioned, you can create a custom resource.
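For illustration, a rough sketch of such a Lambda handler using the AWS SDK for Java v1 (the job name, queue, and definition are placeholders):

```java
import java.util.Map;

import com.amazonaws.services.batch.AWSBatch;
import com.amazonaws.services.batch.AWSBatchClientBuilder;
import com.amazonaws.services.batch.model.SubmitJobRequest;
import com.amazonaws.services.batch.model.SubmitJobResult;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.RequestHandler;

// Lambda handler that submits an AWS Batch job when invoked.
public class SubmitBatchJobHandler implements RequestHandler<Map<String, Object>, String> {

    private final AWSBatch batch = AWSBatchClientBuilder.defaultClient();

    @Override
    public String handleRequest(Map<String, Object> input, Context context) {
        SubmitJobRequest request = new SubmitJobRequest()
                .withJobName("my-job")                   // placeholder
                .withJobQueue("my-job-queue")            // placeholder
                .withJobDefinition("my-job-definition"); // placeholder
        SubmitJobResult result = batch.submitJob(request);
        return result.getJobId();
    }
}
```

The function itself (AWS::Lambda::Function) and the IAM role that grants it batch:SubmitJob can be declared in the same CloudFormation template.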

In Spring Batch, what is the scope of an ItemReader without scope="..."?

If I have a web application, with an application context that loads everything for my webapp and all my job configuration files, and a job contains a simple ItemReader without scope="step", the reader is a singleton, right? So if I launch my job twice from a controller via a SimpleJobLauncher, I will use the same bean, right? Unless I put scope="step", in order to have one bean per step execution?
On the other hand, if I launch the job from a CommandLineJobRunner, I will have two distinct application contexts, so two different beans, right?
Are my assertions valid?
Thanks
Yes, that is correct. By default, every bean instance in a Spring context is a singleton.
However, most readers and writers have state. For instance, a FlatFileItemReader can only run once; after that it points to the end of the file and its close method has been called. Therefore, if you simply start the job again, it will not work, since the FlatFileItemReader is closed.
For such cases, you will need to define them with scope="step".
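For reference, a minimal sketch of a step-scoped reader in Java config (assuming Spring Batch 4+, where FlatFileItemReaderBuilder exists; the bean name and jobParameters key are illustrative):

```java
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;

@Configuration
public class ReaderConfig {

    // One reader instance per step execution instead of one shared singleton;
    // the input file is resolved from job parameters at execution time.
    @Bean
    @StepScope
    public FlatFileItemReader<String> reader(
            @Value("#{jobParameters['inputFile']}") String inputFile) {
        return new FlatFileItemReaderBuilder<String>()
                .name("reader")
                .resource(new FileSystemResource(inputFile))
                .lineMapper((line, lineNumber) -> line) // pass lines through as-is
                .build();
    }
}
```

With scope="step" (or @StepScope), each step execution gets a fresh instance, so relaunching the job never reuses a closed reader.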

Talend Force run order of joblets

My company has a couple of joblets that we put in new jobs to do things like initializing variables, getting system information from the database, and sending out error/warning emails. The issue we are running into is that if we start creating the components of a job and then realize that we forgot to include these three joblets, we basically have to re-create the job to ensure that the joblets are added first so they run first.
Is there any way to force these joblets to run first and possibly also in a certain order before moving on to the contents of the job being created? Please let me know if there is any information you may need that I'm missing as I have only been using Talend for a few days. The rest of the team has not been using it too much longer than I have, so they do not have the answer I'm looking for either. Thanks in advance!
In joblets you can use the Trigger_Input and Trigger_Output components as connection points for "on subjob OK" triggers. This lets you connect joblets and other components in a job with triggers, thus enforcing execution order.
But you cannot get an "on subjob OK" trigger from a tPreJob. I am thinking of triggering from a tPreJob to a tWarn ("on component OK") and then from the tWarn to the joblet ("on subjob OK").

ScheduledExecutorService: modify one or more running tasks

I have a program that loads a few tasks from a file prepared by the user and starts executing them according to the schedule given in the file.
Example: taskFile.txt
Task1: run every hour
Task2: run every 2 seconds
...
TaskN: run every monday at 10:00
This first part is OK; I solved it using ScheduledExecutorService and I am very satisfied. The tasks are loaded and run as they should.
Now, let's imagine that the user, via the GUI (at runtime), decides that Task2 should run every minute, and that he wants to remove Task3.
I cannot find any way to access one specific task in the pool in order to remove or modify it.
So I cannot update tasks at runtime. When the user changes a task, I can only modify taskFile.txt and restart the application in order to reload all tasks from the newly updated file.
Do you know any way to access a single task in order to modify or delete it?
Or failing that, a way to remove one given task, so I can insert a new one into the pool with the modifications the user wants?
Thanks
This is not elegant, but works.
Let's suppose you need 10 threads, and sometimes you need to manage a specific thread.
Instead of having one pool with 10 threads, use 10 pools with one thread each, keep them in your favourite data structure, and act on pool_1 when you want to modify thread_1.
It is then possible to remove the old Runnable from its pool and put in a new one with the needed changes.
Otherwise, anything put into a shared pool becomes anonymous and is not directly manageable.
If somebody has a better solution...
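A minimal sketch of that one-pool-per-task idea (the registry class and fixed-rate scheduling are illustrative assumptions, not from the original post):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Each named task gets its own single-thread scheduler, so it can be
// replaced or removed individually at runtime.
public class TaskRegistry {

    private final Map<String, ScheduledExecutorService> pools = new ConcurrentHashMap<>();

    // Schedule (or replace) a task that runs at a fixed rate.
    public void schedule(String name, Runnable task, long period, TimeUnit unit) {
        remove(name); // drop any previous version of this task
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        pool.scheduleAtFixedRate(task, 0, period, unit);
        pools.put(name, pool);
    }

    // Remove a task entirely, e.g. when the user deletes Task3.
    public void remove(String name) {
        ScheduledExecutorService pool = pools.remove(name);
        if (pool != null) {
            pool.shutdownNow(); // interrupt the running task, if any
        }
    }
}
```

So registry.schedule("Task2", task2, 1, TimeUnit.MINUTES) changes Task2's period, and registry.remove("Task3") drops Task3. A single shared pool can achieve the same thing if you keep the ScheduledFuture returned by scheduleAtFixedRate and call cancel() on it when the task changes.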