Spring Batch JSR 352: how to prevent a partitioned job from leaving threads alive that prevent the process from ending - spring-batch

Let me explain how my app is set up. It is a standalone, command-line-started app that runs a main method, which in turn calls start on a JobOperator, passing the appropriate params. I understand that start is an async call, and once I call it, unless I block somehow in my main, the main thread dies.
The problem I have run into is that when I run a partitioned job, it appears to leave a few threads alive, which prevents the process from ending. When I run a non-partitioned job, the process ends normally once the job has completed.
Is this normal and/or expected behavior? Is there a way to tell the partition threads to die? It seems that the partition threads are blocked waiting on something after the job has completed, when they should not be.
I know that I could monitor the batch status in main and end the process myself, but as I stated in another question, polling adds a ton of chatter to the db and is not ideal.
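For illustration, this is the polling approach I mean; a minimal sketch using the JSR 352 JobOperator API (the class name, sleep interval, and use of System.exit are arbitrary), where every status check is another round trip to the job repository:
import java.util.Properties;
import javax.batch.operations.JobOperator;
import javax.batch.runtime.BatchRuntime;
import javax.batch.runtime.BatchStatus;

public class PollingMain {
    public static void main(String[] args) throws InterruptedException {
        JobOperator jobOperator = BatchRuntime.getJobOperator();
        long executionId = jobOperator.start("partitionTest", new Properties());
        BatchStatus status;
        do {
            Thread.sleep(1000L); // each iteration queries the job repository: the DB chatter
            status = jobOperator.getJobExecution(executionId).getBatchStatus();
        } while (status != BatchStatus.COMPLETED && status != BatchStatus.FAILED
                && status != BatchStatus.STOPPED && status != BatchStatus.ABANDONED);
        System.exit(0); // force the JVM down despite any lingering pool threads
    }
}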
An example of my job spec:
<job id="partitionTest" xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="1.0">
<step id="onlyStep">
<partition>
<plan partitions="2">
<properties partition="0">
<property name="partitionNumber" value="1"></property>
</properties>
<properties partition="1">
<property name="partitionNumber" value="2"></property>
</properties>
</plan>
</partition>
<chunk item-count="2">
<reader id="reader" ref="DelimitedFlatFileReader">
<properties>
<!-- Reads in from file Test.csv -->
<property name="fileNameAndPath" value="#{jobParameters['inputPath']}/CSVInput#{partitionPlan['partitionNumber']}.csv" />
<property name="fieldNames" value="firstName, lastName, city" />
<property name="fullyQualifiedTargetClass" value="com.test.transactionaltest.Member" />
</properties>
</reader>
<processor ref="com.test.partitiontest.Processor" />
<writer ref="FlatFileWriter" >
<properties>
<property name="appendOn" value="true"/>
<property name="fileNameAndPath" value="#{jobParameters['outputPath']}/PartitionOutput.txt" />
<property name="fullyQualifiedTargetClass" value="com.test.transactionaltest.Member" />
</properties>
</writer>
</chunk>
</step>
</job>
Edit:
Ok, reading a bit more about this issue and looking into the Spring Batch code, it appears there is a bug (at least in my opinion) in JsrPartitionHandler. Specifically, the handle method creates a ThreadPoolTaskExecutor locally, but that thread pool is never cleaned up properly. A shutdown/destroy should be called before the method returns in order to perform the cleanup; otherwise the threads are left in memory and out of scope.
Please correct me if I am wrong here, but that definitely seems like what the problem is.
I am going to try to make a change regarding it and see how it plays out. I'll update after I have done some testing.
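To make the shape of the fix concrete, here is a minimal sketch (not the actual Spring Batch source) of the cleanup I would expect around the locally created executor:
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

// Sketch only: the method name and parameter are illustrative.
void handlePartitions(Runnable[] partitionWork) {
    ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
    taskExecutor.initialize();
    try {
        for (Runnable work : partitionWork) {
            taskExecutor.execute(work);
        }
        // ... wait for all partitions to finish ...
    } finally {
        // Without this, the pool's non-daemon worker threads stay parked in
        // memory and keep the JVM from exiting after the job completes.
        taskExecutor.shutdown();
    }
}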

I have confirmed this issue to be a bug (still in my opinion, at the moment) in the spring-batch-core lib.
I have created a ticket over at the Spring Batch JIRA site. There is a simple Java project attached to the ticket that reproduces the issue I am seeing. If anyone else runs into the problem, they should refer to that ticket.
I have found a temporary workaround that just uses a wait/notify scheme; once added, the pooled threads do shut down. I'll add each of the classes below and try to explain what I did.
In the main thread/class, this is code that lived in the main method (or a method called from main):
// Guarded block: the condition is checked while holding the lock, so a
// notification cannot slip in between the check and the wait().
synchronized (this) {
    while (!ThreadNotifier.instance(this).getNotify()) {
        try {
            System.out.println("WAIT THREAD IS =======" + Thread.currentThread().getName());
            wait();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}
This is the ThreadNotifier class
public class ThreadNotifier {

    private static ThreadNotifier tn = null;
    private boolean notification = false;
    private final Object o;

    private ThreadNotifier(Object o) {
        this.o = o;
    }

    public static ThreadNotifier instance(Object o) {
        if (tn == null) {
            tn = new ThreadNotifier(o);
        }
        return tn;
    }

    public void setNotify(boolean value) {
        synchronized (o) {
            // Set the flag inside the lock so the waiting thread always sees it.
            notification = value;
            System.out.println("NOTIFY THREAD IS =======" + Thread.currentThread().getName());
            o.notify();
        }
    }

    public boolean getNotify() {
        return notification;
    }
}
And lastly, this is the job listener I used to provide the notification back:
public class PartitionWorkAround implements JobListener {

    @Override
    public void beforeJob() throws Exception {
        // nothing to do
    }

    @Override
    public void afterJob() throws Exception {
        // The singleton was already created with main's lock object,
        // so the argument here is ignored.
        ThreadNotifier.instance(null).setNotify(true);
    }
}
This is the best I could come up with until the issue is fixed. For reference, I used what I learned about guarded blocks to figure out a way to do this.

Related

Spring Batch repeat step infinitely

I want to ask if there is a way to make a Spring Batch job always be running, doing the same step over and over again, but with a delay between loops if the reader didn't find anything to do.
For example, my job reads from a database and then does some updates on the list it got. I want the job to do the same thing again: if it finds new lines in the database it does the updates, otherwise it should wait some seconds and then do the same read again and again.
My solution works, but I don't know if it is best practice.
I made my step call itself as its next step, which causes an infinite loop.
Then, in my reader, if it finds data in the database it continues processing; otherwise I do a Thread.sleep():
<job id="jobUpdate" xmlns="http://www.springframework.org/schema/batch">
<step id="updates" next="updates">
<tasklet>
<chunk reader="reader.." processor="processor..."
writer="writer..." commit-interval="1" />
</tasklet>
</step>
</job>
// My reader's waiting code, used when the list is empty.
if (myList.isEmpty()) {
    try {
        Thread.sleep(constantes.WAIT_TIME_BATCH_RERUN);
        System.out.println("im sleeping");
    } catch (InterruptedException e) {
        throw new RuntimeException(e);
    }
}
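For completeness, here is a minimal sketch of where that wait could sit inside a custom reader; MyLine, MyLineDao, and the wait constant are stand-ins for your own types:
import java.util.ArrayList;
import java.util.List;
import org.springframework.batch.item.ItemReader;

public class PollingItemReader implements ItemReader<MyLine> {

    private static final long WAIT_TIME_BATCH_RERUN = 5000L; // ms between polls (illustrative)

    private final MyLineDao dao; // hypothetical DAO that fetches unprocessed rows
    private List<MyLine> myList = new ArrayList<>();

    public PollingItemReader(MyLineDao dao) {
        this.dao = dao;
    }

    @Override
    public MyLine read() throws Exception {
        if (myList.isEmpty()) {
            myList = new ArrayList<>(dao.findUnprocessedLines());
        }
        if (myList.isEmpty()) {
            Thread.sleep(WAIT_TIME_BATCH_RERUN); // nothing new: wait before ending
            return null; // ends this step execution; next="updates" starts it again
        }
        return myList.remove(0);
    }
}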

How do I make data calls from different Blazor components simultaneously?

I'm new to Blazor and trying to make a page with several separate components to handle a massive form. Each individual component covers a part of the form.
The problem I'm facing is that each of my components needs access to data from the back-end, and not every component uses the same data. When the page loads, each component makes an attempt to fetch data from the server, which causes a problem with Entity Framework:
A second operation started on this context before a previous operation completed. This is usually caused by different threads using the same instance of DbContext.
This is obviously caused by the fact that my components are initialized at the same time and all make their attempts to load the data simultaneously. I was under the impression that, the way DI is set up in Blazor, this wouldn't be a problem, but it is.
Here are the components in my template:
<CascadingValue Value="this">
    <!-- BASE DATA -->
    <CharacterBaseDataView />
    <!-- SPECIAL RULES -->
    <CharacterSpecialRulesView />
</CascadingValue>
Here is how my components are initialized:
protected async override Task OnInitializedAsync()
{
    CharacterDetailsContext = new EditContext(PlayerCharacter);
    await LoadCharacterAsync();
}

private async Task LoadCharacterAsync()
{
    PlayerCharacter = await PlayerCharacterService.GetPlayerCharacterAsync(ViewBase.CharacterId.Value);
    CharacterDetailsContext = new EditContext(PlayerCharacter);
}
When two components with the above code are in the same view, the mentioned error occurs. I tried using the synchronous version OnInitialized() and simply discarding the task, but that didn't fix the error.
Is there some other way to call the data so that this issue doesn't occur? Or am I going about this the wrong way?
You've hit a common problem in using async operations in EF: two or more operations trying to use the same context at once.
Take a look at the MS Docs article about EF DbContexts; there's a section further down specific to Blazor. It explains the use of a DbContextFactory and CreateDbContext to create a context per unit of work, i.e. one context per operation, so that two async operations each have a separate context.
Initially, to solve the threading issues, I used DbContextFactory to create a context for each operation; however, this resulted in database inconsistency issues across components, and I realised I needed change tracking across components.
Therefore, I instead keep my DbContext scoped and do not create a new context for each operation.
I then adapted my OnInitializedAsync() methods to check whether the calls to the database have completed before making these calls through my injected services. This works really well for my app:
@code {
    static Semaphore semaphore;

    // code omitted for brevity

    protected override async Task OnInitializedAsync()
    {
        try
        {
            // First, open the global semaphore.
            semaphore = Semaphore.OpenExisting("GlobalSemaphore");
            while (!semaphore.WaitOne(TimeSpan.FromTicks(1)))
            {
                await Task.Delay(TimeSpan.FromSeconds(1));
            }
            // If the while loop is exited or skipped, previous service calls have completed.
            ApplicationUsers = await ApplicationUserService.Get();
        }
        finally
        {
            try
            {
                semaphore.Release();
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.Message);
            }
        }
    }
}

Spring Batch Jsr 352, manage processor skip outside/before skip listener

I am trying to find a way to manage a skip scenario in the process listener (it could be the read or write listener as well). What I have found is that the skip listener seems to be executed after the process listener's onProcessError method. This means that I might be handling the error in some way without knowing that it is an exception that is going to be skipped.
Is there some way to know, outside the skip listener, that a particular exception is being skipped? Something that could be pulled into the process listener, or possibly elsewhere.
The best approach I found was just to add a property to the step and then wire the step context in where I needed it.
<step id="firstStep">
<properties> <property name="skippableExceptions" value="java.lang.IllegalArgumentException"/> </properties>
</step>
This is not a perfect solution, but the skippable exceptions only seem to be set in StepFactoryBean and Tasklet, and are not directly accessible.
For the code in my listeners:
@Inject
StepContext stepContext;

// ...

Properties p = stepContext.getProperties();
String exceptions = p.getProperty("skippableExceptions");
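Putting the two together, a process listener can then check the property before deciding how to handle the error; a minimal sketch, with the matching deliberately kept to a simple class-name comparison:
import java.util.Properties;
import javax.batch.api.chunk.listener.ItemProcessListener;
import javax.batch.runtime.context.StepContext;
import javax.inject.Inject;

public class SkipAwareProcessListener implements ItemProcessListener {

    @Inject
    StepContext stepContext;

    @Override
    public void beforeProcess(Object item) throws Exception {
    }

    @Override
    public void afterProcess(Object item, Object result) throws Exception {
    }

    @Override
    public void onProcessError(Object item, Exception ex) throws Exception {
        Properties p = stepContext.getProperties();
        String exceptions = p.getProperty("skippableExceptions", "");
        boolean skippable = false;
        for (String name : exceptions.split(",")) {
            if (name.trim().equals(ex.getClass().getName())) {
                skippable = true;
                break;
            }
        }
        if (skippable) {
            // The exception is configured as skippable: handle it quietly here
            // instead of treating it as a real failure.
        }
    }
}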

Store huge data at Chunk Level in Spring batch

I am new to Spring and do not have much knowledge of it, so please help me solve this. My use case: we are using Spring Batch with chunk-oriented processing to process data.
At the end of each processed chunk (i.e., once the commit interval is met and the values are passed to the writer), the list of values has to be stored so that, once the whole step is completed, the stored list of values can be used to write them to a CSV file. If any failure happened during chunk processing, then the list of values should not be written to the file.
Is there any way to store this huge amount of data at the chunk level and then finally process it in a next step/tasklet, or in any other way?
Don't store all the data in memory; that is bad practice for a batch application.
An alternative is to create a standard read/process/write step where you write each processed chunk to your CSV file.
When a job error occurs, stop the job and delete your CSV file (you will get the same result as not writing it at all).
I think you can reach your goals this way without memory issues.
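A minimal sketch of the delete-on-failure part as a JobExecutionListener; the output path is illustrative and would normally come from configuration:
import java.io.File;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;

public class DeleteCsvOnFailureListener implements JobExecutionListener {

    private static final String OUTPUT_CSV = "/data/out/output.csv"; // illustrative path

    @Override
    public void beforeJob(JobExecution jobExecution) {
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        if (jobExecution.getStatus() == BatchStatus.FAILED) {
            // Deleting the partial file leaves the same end state as never writing it.
            new File(OUTPUT_CSV).delete();
        }
    }
}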
I would suggest a different approach, since from my point of view you are trying to make Spring Batch work in a way it was not designed to:
1. Process the data chunk by chunk and write every chunk to CSV using FlatFileItemWriter.
2. Use a file name that marks it as temporary.
3. Wrap your step with a listener and use the OnProcessError hook.
4. When hitting OnProcessError, log the failed item.
5. Add a conditional flow for success and failure (see here).
6. In case of failure, delete the temp file.
7. In case of success, rename the file.
You may use SystemCommandTasklet or implement your own tasklet for steps 6 and 7; a sketch of such a tasklet follows below.
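For steps 6 and 7, a custom tasklet could be as small as this sketch (paths are illustrative; the delete variant would use Files.deleteIfExists instead of Files.move):
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class RenameTempCsvTasklet implements Tasklet {

    private final Path tempFile = Paths.get("/data/out/output.csv.tmp"); // illustrative
    private final Path finalFile = Paths.get("/data/out/output.csv");    // illustrative

    @Override
    public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
        // Step 7: promote the temp file to its final name once the job succeeded.
        Files.move(tempFile, finalFile, StandardCopyOption.REPLACE_EXISTING);
        return RepeatStatus.FINISHED;
    }
}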
Your listener will look similar to the one below:
@Component
public class PromoteUpdateCountToJobContextListener implements StepListener {

    @OnProcessError
    public ExitStatus processError(Object item, Exception e) {
        String failureMessage = String.format("Failed to process due to item %s",
                item.toString());
        Logger.error(failureMessage);
        return ExitStatus.FAILED;
    }
}
Your job XML will be similar to:
<batch:job>
    <batch:step id="processData">
        <batch:tasklet transaction-manager="transactionManager">
            <batch:chunk reader="someReader"
                         writer="yourFlatFileItemWriter"/>
        </batch:tasklet>
        <batch:next on="*" to="renameTempCsv" />
        <batch:next on="FAILED" to="deleteTempCsv" />
        <batch:listeners>
            <batch:listener ref="lineCurserListener" />
        </batch:listeners>
    </batch:step>
    <batch:step id="deleteTempCsv">
        <batch:tasklet ref="deleteTempCsvTasklet"/>
    </batch:step>
    <batch:step id="renameTempCsv">
        <batch:tasklet ref="renameTempCsvTasklet"/>
    </batch:step>
</batch:job>

schedule a trigger every minute, if job still running then standby and wait for the next trigger

I need to schedule a trigger to fire every minute; the next minute, if the job is still running, the trigger should not fire and should wait another minute to check. If the job has finished, the trigger should fire.
Thanks
In Quartz 2, you'll want to use the DisallowConcurrentExecution attribute on your job class. Then make sure that you set up a key using something similar to TriggerBuilder.Create().WithIdentity("SomeTriggerKey"), as DisallowConcurrentExecution uses it to determine whether your job is already running.
[DisallowConcurrentExecution]
public class MyJob : IJob
{
...
}
I didn't find anything about Monitor.Enter or something like that, but thanks anyway.
The other answer is that the job should implement the StatefulJob interface. As a StatefulJob, another instance will not run as long as one is already running.
Thanks again.
IStatefulJob is the key here. Creating your own locking mechanisms may cause problems with the scheduler, as you are then taking part in its threading.
If you're using Quartz.NET, you can do something like this in your Execute method:
object execution_lock = new object();

public void Execute(JobExecutionContext context)
{
    // If the lock is already held, a previous run is still executing: skip.
    if (!Monitor.TryEnter(execution_lock, 1)) return;
    try
    {
        // do work
    }
    finally
    {
        Monitor.Exit(execution_lock);
    }
}
I pulled this off the top of my head, so maybe some names are wrong, but that's the idea: lock on some object while you're executing, and if upon execution the lock is already held, then a previous job is still running and you simply return.
EDIT: the Monitor class is in the System.Threading namespace.
If you are using the Spring Quartz integration, you can set the 'concurrent' property to 'false' on MethodInvokingJobDetailFactoryBean:
<bean id="positionFeedFileProcessorJobDetail" class="org.springframework.scheduling.quartz.MethodInvokingJobDetailFactoryBean">
<property name="targetObject" ref="xxxx" />
<property name="targetMethod" value="xxxx" />
<property name="concurrent" value="false" /> <!-- This will not run the job if the previous method is not yet finished -->
</bean>
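To cover the every-minute part of the question, that job detail can be paired with a cron trigger; a sketch, assuming Spring's CronTriggerFactoryBean and an illustrative bean name:
<bean id="everyMinuteTrigger" class="org.springframework.scheduling.quartz.CronTriggerFactoryBean">
    <property name="jobDetail" ref="positionFeedFileProcessorJobDetail" />
    <!-- Fire at second 0 of every minute; with concurrent="false" a run that is
         still in progress simply delays the next execution. -->
    <property name="cronExpression" value="0 * * * * ?" />
</bean>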