Skip exceptions in spring-batch and commit error in database

I'm using Spring Batch to write a batch process and I'm having issues handling exceptions.
I have a reader that fetches items with a specific state from a database. The reader passes each item to the processor step, which can throw MyException. When this exception is thrown I want to skip the item that caused it and continue reading the next one.
The issue here is that I need to change the state of that item in the database so it's not fetched again by the reader.
This is what I tried:
return this.stepBuilderFactory.get("name")
.<Input, Output>chunk(1)
.reader(reader())
.processor(processor())
.faultTolerant()
.skipPolicy(skipPolicy())
.writer(writer())
.build();
In my SkipPolicy class I have the following code:
public boolean shouldSkip(Throwable throwable, int skipCount) throws SkipLimitExceededException {
if (throwable instanceof MyException) {
// log the issue
// update the item that caused the exception in the database so the reader doesn't return it again
return true;
}
return false;
}
With this code the exception is skipped and my reader is called again; however, the update made in the SkipPolicy is not committed (or is rolled back), so the reader fetches the item and tries to process it again.
I also tried with an ExceptionHandler:
return this.stepBuilderFactory.get("name")
.<Input, Output>chunk(1)
.reader(reader())
.processor(processor())
.faultTolerant()
.skip(MyException.class)
.exceptionHandler(myExceptionHandler())
.writer(writer())
.build();
In my ExceptionHandler class I have the following code:
public void handleException(RepeatContext context, Throwable throwable) throws Throwable {
if (throwable.getCause() instanceof MyException) {
// log the issue
// update the item that caused the exception in the database so the reader doesn't return it again
} else {
throw throwable;
}
}
With this solution the state is changed in the database; however, it doesn't call the reader, instead it calls the processor's process method again, getting into an infinite loop.
I imagine I can use a listener in my step to handle the exceptions, but I don't like that solution because I would have to duplicate a lot of code, assuming this exception can be thrown in different steps/processors of my code.
What am I doing wrong?
EDIT: After a lot of tests and trying different listeners like SkipListener, I couldn't achieve what I wanted: Spring Batch always rolls back my UPDATE.
Debugging this is what I found:
Once my listener is invoked and I update my item, the program enters the write method of the class FaultTolerantChunkProcessor (line #327).
That method runs the following code (copied from GitHub):
try {
doWrite(outputs.getItems());
} catch (Exception e) {
status = BatchMetrics.STATUS_FAILURE;
if (rollbackClassifier.classify(e)) {
throw e;
}
/*
* If the exception is marked as no-rollback, we need to
* override that, otherwise there's no way to write the
* rest of the chunk or to honour the skip listener
* contract.
*/
throw new ForceRollbackForWriteSkipException(
"Force rollback on skippable exception so that skipped item can be located.", e);
}
The method doWrite (line #151) inside the class SimpleChunkProcessor will try to write the list of output items; however, in my case the list is empty, so line #159 (method writeItems) throws an IndexOutOfBoundsException, causing the ForceRollbackForWriteSkipException and the rollback I'm suffering from.
If I override the class FaultTolerantChunkProcessor and avoid writing the items when the list is empty, then everything works as intended: the update is committed and the program skips the error and calls the reader again.
I don't know if this is actually a bug or it's caused by something I'm doing wrong in my code.

A SkipListener is better suited to your use case than an ExceptionHandler in my opinion, as it gives you access to the item that caused the exception. With the exception handler, you need to carry the item in the exception or the repeat context.
Moreover, the skip listener lets you know in which phase the exception happened (i.e. in read, process or write), while with the exception handler you need to find a way to detect that yourself. If the skipping code is the same for all phases, you can call the same method that updates the item's status from all the methods of the listener.
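For illustration, here is a minimal sketch of such a listener for the step above. The ItemStatusDao and its markAsFailed method are made-up placeholders for whatever update you run against the database; also note that if the chunk transaction is rolled back, that update may need to run in its own transaction (e.g. REQUIRES_NEW) in order to survive the rollback:
public class MySkipListener implements SkipListener<Input, Output> {

    private final ItemStatusDao itemStatusDao; // hypothetical DAO that flags the item in the database

    public MySkipListener(ItemStatusDao itemStatusDao) {
        this.itemStatusDao = itemStatusDao;
    }

    @Override
    public void onSkipInRead(Throwable t) {
        // nothing to update: the item could not even be read
    }

    @Override
    public void onSkipInProcess(Input item, Throwable t) {
        // log the issue and mark the item so the reader does not pick it up again
        itemStatusDao.markAsFailed(item, t);
    }

    @Override
    public void onSkipInWrite(Output item, Throwable t) {
        // same idea for failures that happen while writing
    }
}
It would be registered on the fault-tolerant step with .listener(mySkipListener()) next to .skipPolicy(skipPolicy()).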

Related

How can I force a build process to fail through a Unity editor script?

I want to force the build process to fail if some validation conditions are not met.
I've tried using an IPreprocessBuildWithReport with no success:
using UnityEditor.Build;
using UnityEditor.Build.Reporting;
public class BuildProcessor : IPreprocessBuildWithReport
{
public int callbackOrder => 0;
public void OnPreprocessBuild(BuildReport report)
{
// Attempt 1
// Does not compile because the 'BuildSummary.result' is read only
report.summary.result = BuildResult.Failed;
// Attempt 2
// Causes a log in the Unity editor, but the build still succeeds
throw new BuildFailedException("Forced fail");
}
}
Is there any way to programmatically force the build process to fail?
I'm using Unity 2018.3.8f1.
As of 2019.2.14f1, the correct way to stop the build is to throw a BuildFailedException.
Other exception types do not interrupt the build.
Derived exception types do not interrupt the build.
Logging errors most certainly do not interrupt the build.
This is how Unity handles exceptions in PostProcessPlayer:
try
{
postprocessor.PostProcess(args, out props);
}
catch (System.Exception e)
{
// Rethrow exceptions during build postprocessing as BuildFailedException, so we don't pretend the build was fine.
throw new UnityEditor.Build.BuildFailedException(e);
}
Just for clarity, this will NOT stop the build:
// This is not precisely a BuildFailedException. So the build will go on and succeed.
throw new CustomBuildFailedException();
...
public class CustomBuildFailedException : BuildFailedException {}
You can use OnValidate(), which seems to be exactly what you're looking for.
Let's say you want to make sure a reference to a UI Text component is not null before building. In the script that should have the text reference, you add:
private void OnValidate()
{
if (text == null)
{
Debug.LogError("Text reference is null!");
}
}
Having Debug.LogError calls during the build process actually causes the build to fail.

Eclipse (JDT) - performFinish method in wizard

I need to do something with wizards in Eclipse, so I checked how JDT has implemented wizards and found this code I don't understand.
It ignores the wizard's scheduling rule (returned from getSchedulingRule) in case the code is called from an already executing Job (it uses the scheduling rule of that Job instead). So if the wizard needs the scheduling rule of the entire workspace but the current thread is already executing some job, then the scheduling rule of that job is used instead, which can cause problems when the new runnable is executed in the workspace. I added some comments to the code so it is clearer.
Could any Eclipse expert explain why the try block is implemented as it is (and not just using getSchedulingRule)?
NewElementWizard
/**
* Returns the scheduling rule for creating the element.
* @return returns the scheduling rule
*/
protected ISchedulingRule getSchedulingRule() {
return ResourcesPlugin.getWorkspace().getRoot(); // lock all by default
}
/*
* @see Wizard#performFinish
*/
@Override
public boolean performFinish() {
IWorkspaceRunnable op= new IWorkspaceRunnable() {
@Override
public void run(IProgressMonitor monitor) throws CoreException, OperationCanceledException {
try {
finishPage(monitor);
} catch (InterruptedException e) {
throw new OperationCanceledException(e.getMessage());
}
}
};
try {
//TODO: I need an explanation of this block. The wizard should be used
// from the UI thread, so Job.getJobManager().currentJob()
// means there may be a Job currently being executed by the UI thread.
// Ok, now if there is a job, its scheduling rule is used, ignoring getSchedulingRule.
// This could be to ensure that this new runnable isn't executed until this thread finishes
// its current Job. Ok, but if the current Job's rule isn't as powerful as this wizard needs, what then?
// It will cause an error when executing op, because the runnable will not have enough access
// due to ignoring getSchedulingRule...
ISchedulingRule rule= null;
Job job= Job.getJobManager().currentJob();
if (job != null)
rule= job.getRule();
IRunnableWithProgress runnable= null;
if (rule != null)
runnable= new WorkbenchRunnableAdapter(op, rule, true);
else
runnable= new WorkbenchRunnableAdapter(op, getSchedulingRule());
getContainer().run(canRunForked(), true, runnable);
} catch (InvocationTargetException e) {
handleFinishException(getShell(), e);
return false;
} catch (InterruptedException e) {
return false;
}
return true;
}
I'm not sure I can explain all of this but a key thing to note is that
Job.getJobManager().currentJob();
only returns the current job in the current thread. Since performFinish is normally run in the UI thread this would not be an ordinary background job. UIJob jobs run in the UI thread. It looks to me like this code is trying to pick up the rule from some UI job that the wizard or associated code has already started.
The true arguments on the call:
new WorkbenchRunnableAdapter(op, rule, true)
will cause WorkbenchRunnableAdapter to call
Job.getJobManager().transferRule(fRule, thread);
if the thread changes. I think this means the code is trying to keep the same rule in use throughout the execution of the runnable and whatever job was previously running.

Verticles and uncaught exceptions

Consider the scenario where one of the verticles throws an uncaught exception.
What happens next?
If the verticle is removed from the system, is there some mechanism, similar to Erlang supervisors, to restart the verticle?
The documentation is not very clear about this aspect.
Update based on comments:
What interests me the most is the situation where an exception is thrown from the processing handlers of a received message (through the bus).
Regards
I have answered part of my own question (with the help of a test program).
When an exception is thrown in an event handler, it is caught by Vert.x and swallowed (ignored). The event handler will process the next message.
Update: The app can register an exception handler and have all uncaught Throwables delivered to it. There you can perform additional general processing.
Update 2: Use Vertx.exceptionHandler to register the handler.
Vert.x is all about the same style of asynchronous programming, which mainly relies on callback handlers.
To handle the deployment failure case, you first have to go the programmatic way, i.e. deploy your verticle programmatically through, let's say, a deployment verticle, providing a completion handler that will be populated with the deployment result. Below is a sample in Java (since you haven't opted for a specific language, I will go with what I know best) where:
MainVerticle: is your deployment verticle (used mainly to deploy other verticles)
some.package.MyVerticle: is your real verticle, note that I used the id here and not an instance.
public class MainVerticle extends AbstractVerticle {
public void start() {
vertx.deployVerticle("some.package.MyVerticle", res -> {
if (res.succeeded()) {
// Do whatever if deployment succeeded
} else {
// Handle deployment failure here...
}
});
}
}
Now when it comes to 'messaging failures', it would be harder to highlight a specific case since it can occur at many places and on behalf of both messaging ends.
If you want to register a failure case handler when sending a message, you can instantiate a MessageProducer<T> representing the stream it can be written to, then register an exception handler on it:
EventBus eb = vertx.eventBus();
MessageProducer<String> sender = eb.sender("someAddress");
sender.exceptionHandler(e -> {
System.out.println("An error occured" + e.getCause());
});
sender.write("Hello...");
On the other side, you can handle failure case when reading the received messages pretty much the same way, but using a MessageConsumer<T> this time:
EventBus eb = vertx.eventBus();
MessageConsumer<String> receiver = eb.consumer("someAddress");
receiver.exceptionHandler(e -> {
System.out.println("An error occured while readeing data" + e.getCause());
}).handler(msg -> {
System.out.println("A message has been received: " + msg.body());
});
To add a bit to the previous answer, if you want to react to all uncaught exceptions, register handler on vertx object, as follows:
vertx.exceptionHandler(new Handler<Throwable>() {
@Override
public void handle(Throwable event) {
// do what you meant to do on uncaught exception, e.g.:
System.err.println("Error");
someLogger.error(event + " throws exception: " + event.getStackTrace());
}
});
I ran into something similar to this. When an exception happens as part of processing a message in a Verticle, I just wanted to reply with the Exception.
The idea is to just bubble up the exceptions all the way back to the entry point in the app where a decision can be made about what to do with the failure, while capturing the entire stack along the way.
To accomplish it I wrote this function:
protected ReplyException buildReplyException(Throwable cause, String message)
{
ReplyException ex = new ReplyException(ReplyFailure.RECIPIENT_FAILURE, -1, message);
ex.initCause(cause);
return ex;
}
Which I then use to build handlers, or reply handlers, like this:
reply -> {
if (reply.succeeded()) {
message.reply(true);
} else {
message.reply(buildReplyException(reply.cause(), ""));
}
});
This way the sender of the original message will get a failed response whose cause is an exception with the full stack trace populated on it.
This approach worked very well for me to handle errors while processing messages.
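On the sending side, the failure then shows up in the reply handler. Here is a minimal sketch assuming a made-up address name and payload; in Vert.x 3 you pass the reply handler to send, while in Vert.x 4+ you would use request instead:
vertx.eventBus().send("some.address", payload, ar -> {
    if (ar.succeeded()) {
        System.out.println("Reply: " + ar.result().body());
    } else {
        // ar.cause() is the ReplyException built above, carrying the original failure as its cause
        ar.cause().printStackTrace();
    }
});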

Good design to Insert multiple records

I am working on a program that reads from a file and inserts line by line into an Oracle 11g database using JTA/EclipseLink 2.3.x JPA with container-managed transactions.
I've developed the code below, but I'm bothered by the fact that failed lines need to be identified and fixed manually.
public class CreateAccount {
@PersistenceContext(unitName="filereader")
private EntityManager em;
private ArrayList<String> unprocessed;
public void upload(){
//reading the file into unprocessed
for (String s : unprocessed) {
this.process(s);
}
}
private void process(String s){
//Setting the entity with appropriate properties.
//Validate the entity
em.persist(account);
}
}
This first version takes a few seconds to commit 5000 rows to the database, as it seems to take advantage of caching the prepared statement. This works fine when all entities to persist are valid. However, I am concerned that even if I validate the entity, it can still fail for various unexpected reasons, and when any entity throws an exception during commit, I cannot find the particular record that caused it, and all entities are rolled back.
I tried another approach that starts a new transaction and commits for each line, without using container-managed transactions, using the following code around process(String s):
for (String s : unprocessedLines) {
try {
em.getTransaction().begin();
this.process(s);
em.getTransaction().commit();
} catch (Exception e) {
// Any exception that a line caused can be caught here
e.printStackTrace();
}
}
The second version works well for logging erroneous lines, as exceptions caused by individual lines are caught and handled, but it takes over 300s to commit the same 5000 lines to the database. That is not reasonable when a large file is being processed.
Is there any workaround that I could check and insert record quickly and at the same time being notified of any failed lines?
Well, this is more of a guess, but why don't you try to keep the transaction and commit it as one batch? You'll get the rollback exception and at the same time keep the speed:
try {
em.getTransaction().begin();
for (String s : unprocessedLines) {
this.process(s);
}
em.getTransaction().commit();
} catch (RollbackException exc) {
// here you have your rollback reason
} finally {
if (em.getTransaction().isActive()) {
em.getTransaction().rollback(); // well of course you should declare em.getTransaction() as a variable above instead of constantly invoking it as I do :-)
}
}
My solution turned out to be a binary search, starting with a block of a reasonable size, e.g. last = first + 1023, to minimize the depth of the tree.
However, note that this works only if the error is deterministic, and it is worse than committing each record individually if the error rate is very high.
private boolean batchProcess(int first, int last){
try {
em.getTransaction().begin();
for (int i = first; i <= last; i++) {
this.process(unprocessedLines.get(i));
}
em.getTransaction().commit();
return true;
} catch (Exception e) {
e.printStackTrace();
if (em.getTransaction().isActive()) {
em.getTransaction().rollback();
}
if (first == last) {
failedLine.add(unprocessedLines.get(first));
} else {
int mid = (first + last) / 2 + 1;
batchProcess(first, mid - 1);
batchProcess(mid, last);
}
return false;
}
}
For container-managed transactions, one may need to do the binary search outside the context of the transaction, otherwise there will be a RollbackException because the container has already decided to roll back the transaction.

How to end a job when no input read

We read most of our data from a DB. Sometimes the result set is empty, and in that case we want the job to stop immediately and not hand over to a writer. We don't want to create a file if there is no input.
Currently we achieve this goal with a StepListener that returns a certain String, which is the input for a transition to either the next business step or a delete step, which deletes the file we created before (the file contains no real data).
I'd like the job to end once the reader realizes that there is no input.
New edit (more elegant way)
This approach elegantly moves to the next step or ends the batch application when the file is not found, and prevents unwanted steps (and their listeners) from executing.
-> Check for the presence of the file in a tasklet, say FileValidatorTasklet.
-> When the file is not found, set some exit status (enum or final String); here we have set EXIT_CODE.
sample tasklet
public class FileValidatorTasklet implements Tasklet {
static final String EXIT_CODE = "SOME_EXIT_CODE";
static final String EXIT_DESC = "SOME_EXIT_DESC";
@Override
public RepeatStatus execute(StepContribution stepContribution, ChunkContext chunkContext) throws Exception {
boolean isFileFound = false;
//do file check and set isFileFound
if(!isFileFound){
stepContribution.setExitStatus(new ExitStatus(EXIT_CODE, EXIT_DESC));
}
return RepeatStatus.FINISHED;
}
}
-> In the job configuration of this application, after executing FileValidatorTasklet, check for the presence of the EXIT_CODE.
-> Provide the alternate path for this job if the code is found, else the normal flow of the job. (Here we simply terminate the job if the EXIT_CODE is found, else continue with the next steps.)
sample config
public Job myJob(JobBuilderFactory jobs) {
return jobs.get("offersLoaderJob")
.start(fileValidatorStep).on(EXIT_CODE).end() // if EXIT_CODE is found , then end the job
.from(fileValidatorStep) // else continue the job from here, after this step
.next(step2)
.next(finalStep)
.end()
.build();
}
Here we have taken advantage of conditional step flow in Spring Batch.
We have to define two separate paths from step A. The flow is like A->B->C or A->D->E.
Old answer:
I have been through this and hence I am sharing my approach. It's better to
throw new RuntimeException("msg");
It will start to terminate the Spring application, rather than terminating exactly at that point. All methods like close() (in reader/writer) will be called and the destroy methods of all the beans will be called.
Note: While executing this in a listener, remember that by this point all the beans will have been initialized and their initialization code (like afterPropertiesSet()) will have executed.
I think the above is the correct way, but if you are willing to terminate at exactly that point, you can try
System.exit(1);
It would likely be cleaner to use a JobExecutionDecider: based on the read count from the StepExecution, set a new FlowExecutionStatus and route it to the end of the job.
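A minimal sketch of that idea (the status name "EMPTY" and the step names are made up):
public class EmptyInputDecider implements JobExecutionDecider {

    @Override
    public FlowExecutionStatus decide(JobExecution jobExecution, StepExecution stepExecution) {
        // stepExecution belongs to the step that just finished, i.e. the reading step
        if (stepExecution.getReadCount() == 0) {
            return new FlowExecutionStatus("EMPTY");
        }
        return FlowExecutionStatus.COMPLETED;
    }
}
and wire it into the job flow, for example:
return jobBuilderFactory.get("myJob")
        .start(readingStep())
        .next(emptyInputDecider()).on("EMPTY").end()
        .from(emptyInputDecider()).on("*").to(nextStep())
        .end()
        .build();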
Joshua's answer addresses the stopping of the job instead of transitioning to the next business step.
Your file writer might still create the file unnecessarily. You can create something like a LazyItemWriter with a delegate (FlatFileItemWriter), which only calls delegate.open (once) when its write method is actually invoked. Of course you then have to make sure delegate.close() is called only if the delegate was previously opened. This ensures that no empty file is created, and deleting it is no longer a concern.
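A minimal sketch of that idea, assuming Spring Batch 4.x where ItemStreamWriter#write takes a List (the class name LazyItemWriter is made up):
public class LazyItemWriter<T> implements ItemStreamWriter<T> {

    private final FlatFileItemWriter<T> delegate;
    private ExecutionContext executionContext;
    private boolean opened = false;

    public LazyItemWriter(FlatFileItemWriter<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public void open(ExecutionContext executionContext) {
        // remember the context, but don't open the delegate yet, so no file gets created
        this.executionContext = executionContext;
    }

    @Override
    public void write(List<? extends T> items) throws Exception {
        if (!items.isEmpty() && !opened) {
            delegate.open(executionContext); // the file is only created on the first real write
            opened = true;
        }
        if (!items.isEmpty()) {
            delegate.write(items);
        }
    }

    @Override
    public void update(ExecutionContext executionContext) {
        if (opened) {
            delegate.update(executionContext);
        }
    }

    @Override
    public void close() {
        if (opened) {
            delegate.close();
        }
    }
}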
I have the same question as the OP. I am using all annotations, and if the reader returns null when no results (in my case a File) are found, then the Job bean fails to be initialized with an UnsatisfiedDependencyException, and that exception is thrown to stdout.
If I create a Reader and return it without a File specified, then the Job will be created. After that an ItemStreamException is thrown, but it goes to my log, as I am past the Job autowiring and inside the Step at that point. That seems preferable, at least for what I am doing.
Any other solution would be appreciated.
NiksVij's answer works for me; I implemented it like this:
@Component
public class FileValidatorTasklet implements Tasklet {
private final ImportProperties importProperties;
@Autowired
public FileValidatorTasklet(ImportProperties importProperties) {
this.importProperties = importProperties;
}
@Override
public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
String folderPath = importProperties.getPathInput();
String itemName = importProperties.getItemName();
File currentItem = new File(folderPath + File.separator + itemName);
if (currentItem.exists()) {
contribution.setExitStatus(new ExitStatus("FILE_FOUND", "FILE_FOUND"));
} else {
contribution.setExitStatus(new ExitStatus("NO_FILE_FOUND", "NO_FILE_FOUND"));
}
return RepeatStatus.FINISHED;
}
}
and in the Batch Configuration:
@Bean
public Step fileValidatorStep() {
return this.stepBuilderFactory.get("step1")
.tasklet(fileValidatorTasklet)
.build();
}
@Bean
public Job tdZuHostJob() throws Exception {
return jobBuilderFactory.get("tdZuHostJob")
.incrementer(new RunIdIncrementer())
.listener(jobCompletionNotificationListener)
.start(fileValidatorStep()).on("NO_FILE_FOUND").end()
.from(fileValidatorStep()).on("FILE_FOUND").to(testStep()).end()
.build();
}