Can I restart a job and process only the skipped items after I have corrected the mistakes in the file? I'm reading the documentation and currently can't find this possibility. You can restart a job if it has failed, but I'm thinking of restarting a job after it has completed with some skipped items. If this cannot be achieved with configuration, what would be a good way to implement it myself?
What I have done in a case similar to yours is to log each skipped item to a file.
Then I created a second job that loads the file and processes all the logged items.
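For illustration, here is a minimal sketch of the first part: a SkipListener that appends every skipped item to a file so a second job can re-process it later. The item type MyItem, the file location, and the CSV serialization are assumptions, not part of the original answer.

```java
// Minimal sketch: append each skipped item to a file for later re-processing.
// MyItem and its toCsvLine() method are hypothetical placeholders.
import org.springframework.batch.core.SkipListener;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class FileLoggingSkipListener implements SkipListener<MyItem, MyItem> {

    private final Path skipFile = Path.of("skipped-items.csv"); // hypothetical location

    @Override
    public void onSkipInRead(Throwable t) {
        // For read skips there is no parsed item; log the cause instead.
        append("READ_SKIP," + t.getMessage());
    }

    @Override
    public void onSkipInProcess(MyItem item, Throwable t) {
        append(item.toCsvLine());
    }

    @Override
    public void onSkipInWrite(MyItem item, Throwable t) {
        append(item.toCsvLine());
    }

    private void append(String line) {
        try {
            Files.writeString(skipFile, line + System.lineSeparator(),
                    StandardCharsets.UTF_8, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new IllegalStateException("Could not log skipped item", e);
        }
    }
}
```

The listener can be registered on a fault-tolerant step (e.g. via .faultTolerant().listener(...) in the step builder), and the second job is then an ordinary chunk-oriented job whose reader points at skipped-items.csv.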
There are some jobs scheduled with either a SimpleTrigger or a CronTrigger, and now I want to unschedule and delete those jobs. A job can be running or may have already completed its execution. If an unscheduled or already executed job is deleted there is no real impact, but what happens to a running job if it is unscheduled with unscheduleJob() or deleted directly with the deleteJob() method of Quartz?
And if the running job is halted mid-execution when unscheduleJob() or deleteJob() is called, is there any way to let the job complete its current execution before unscheduling or deleting it, to avoid any malfunctioning or bad data?
I tried checking for conflicting jobs and also made use of a SchedulerListener, but didn't get any information.
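To make the scenario concrete, here is a rough sketch (not from the original thread; the job and trigger keys are hypothetical): unschedule the trigger first so no new executions fire, poll getCurrentlyExecutingJobs() until the current execution has finished, then delete the job. This is plain polling, not a built-in Quartz feature.

```java
// Sketch: stop future firings, wait for the in-flight execution, then delete.
import org.quartz.JobKey;
import org.quartz.Scheduler;
import org.quartz.SchedulerException;
import org.quartz.TriggerKey;

public class GracefulJobRemoval {

    public static void removeWhenIdle(Scheduler scheduler, JobKey jobKey, TriggerKey triggerKey)
            throws SchedulerException, InterruptedException {
        // Stop future firings; an execution already in progress keeps running.
        scheduler.unscheduleJob(triggerKey);

        // Poll until the job is no longer among the currently executing jobs.
        while (scheduler.getCurrentlyExecutingJobs().stream()
                .anyMatch(ctx -> ctx.getJobDetail().getKey().equals(jobKey))) {
            Thread.sleep(1000);
        }

        // Now it is safe to remove the job and any remaining triggers.
        scheduler.deleteJob(jobKey);
    }
}
```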
Thanks in Advance!!!
I am new to spring-batch and have a few questions:
I have a question about restart. As per the documentation, the restart feature is enabled by default. What I am not clear about is whether I need to write any extra code for a restart. If so, I am thinking of adding a scheduled job that looks at failed executions and restarts them?
I understand spring-batch-admin is deprecated. However, we cannot use spring-cloud-data-flow right now. Is there any other alternative to monitor and restart jobs on demand?
The restart feature you mention only determines whether a job is restartable or not. It doesn't mean Spring Batch will restart a failed job for you automatically.
Instead, it provides the following building blocks so that developers can achieve this on their own:
JobExplorer to find out the id of the job execution that you want to restart
JobOperator to restart a job execution given a job execution id
Also, a restartable job can only be restarted if its status is FAILED. So if you want to restart a running job that stopped because of a server breakdown, you first have to find this running job and update its job execution status, and all of its step execution statuses, to FAILED in order to restart it. (See this for more information.) One solution is to implement a SmartLifecycle that uses the above building blocks to achieve this goal, as in the sketch below.
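A minimal sketch of those two building blocks (the bean wiring, the scan depth, and the job name are assumptions for illustration):

```java
// Find the last FAILED execution of a job with JobExplorer and restart it with JobOperator.
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobInstance;
import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.batch.core.launch.JobOperator;

public class FailedJobRestarter {

    private final JobExplorer jobExplorer;
    private final JobOperator jobOperator;

    public FailedJobRestarter(JobExplorer jobExplorer, JobOperator jobOperator) {
        this.jobExplorer = jobExplorer;
        this.jobOperator = jobOperator;
    }

    public void restartLastFailedExecution(String jobName) throws Exception {
        // Look at the most recent job instances (10 here is an arbitrary depth).
        for (JobInstance instance : jobExplorer.getJobInstances(jobName, 0, 10)) {
            for (JobExecution execution : jobExplorer.getJobExecutions(instance)) {
                if (execution.getStatus() == BatchStatus.FAILED) {
                    // Restarts with the same job parameters; processing resumes
                    // from the last committed chunk of each failed step.
                    jobOperator.restart(execution.getId());
                    return;
                }
            }
        }
    }
}
```

A method like this could be invoked from a scheduled task, as the question suggests, to periodically sweep for failed executions and restart them.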
I am writing a Spring Batch job for chunk processing.
I have a single Job and a single Step. Within that Step I have chunks which are dynamically sized.
SingleJob -> SingleStep -> chunk_1..chunk_2..chunk_3...so on
The following is the case I am trying to implement:
Say today I ran the Job and only chunk_2 failed while the rest of the chunks ran successfully. Now tomorrow I want to run/restart ONLY the failed chunks, i.e. in this case chunk_2. (I don't want to run the whole Job/Step or the other successfully completed chunks.)
I see that Spring Batch stores metadata and uses it to restart Jobs, but I could not figure out whether it is possible to restart a specific chunk as discussed above.
Am I missing a concept? If this is possible, any pseudocode, theoretical explanation, or reference would help.
I appreciate your response
That's how Spring Batch works in a restart scenario: it will continue where it left off in the previous failed run.
So in your example, if in the first run chunk_1 was processed correctly and chunk_2 failed, the next job execution will restart at chunk_2 (see the sketch below).
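For that resume-at-the-failed-chunk behaviour, the reader has to store its position in the step execution context and the job must be relaunched with the same JobParameters (i.e. the same job instance). A sketch in Spring Batch 4 style, where the file name, column names and item type are placeholder assumptions:

```java
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.FileSystemResource;

@Configuration
public class RestartableStepConfig {

    @Bean
    public FlatFileItemReader<MyItem> reader() {
        // MyItem, the file and the column names are placeholders.
        BeanWrapperFieldSetMapper<MyItem> mapper = new BeanWrapperFieldSetMapper<>();
        mapper.setTargetType(MyItem.class);
        return new FlatFileItemReaderBuilder<MyItem>()
                .name("myItemReader")            // key under which the read position is saved
                .resource(new FileSystemResource("input.csv"))
                .delimited()
                .names("id", "value")
                .fieldSetMapper(mapper)
                .saveState(true)                 // the default; required so a restart resumes mid-file
                .build();
    }

    @Bean
    public Step step(StepBuilderFactory steps, ItemWriter<MyItem> writer) {
        return steps.get("step")
                .<MyItem, MyItem>chunk(100)      // chunk size is arbitrary here
                .reader(reader())
                .writer(writer)
                .build();
    }
}
```

When the job is relaunched the next day with the same JobParameters, Spring Batch creates a new execution for the same instance and the step skips the chunks that were already committed in the failed run.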
When a Spring Batch DB failure happens or the server is shut down, a Spring Batch job that was running at that time is left in an unknown STARTED state.
In Spring Batch Admin, we do not see an option to restart the job, hence we are not able to resume it.
How can we restart the job from the last successful commit?
Old discussions suggest that this has to be dealt with manually by updating the metadata tables. I was able to manually update the end time and status in the batch step execution and batch job execution tables. Is that really the best option? It may not be practical to do this manually in a production environment.
As mentioned in the Aborting a Job section of the reference documentation, when a server failure happens, the job repository has no way of knowing that the process running the job died, so manual intervention is required.
How can we restart the job from the last successful commit?
Change the status of the job execution to FAILED and restart the job instance; it should continue from where it left off. A sketch of doing this through the Spring Batch APIs follows.
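One way to avoid hand-editing the metadata tables is to make the same change through the Spring Batch APIs. This is a hedged sketch in Spring Batch 4 style (where execution dates are java.util.Date); the bean wiring and the way the execution id is obtained are assumptions:

```java
// Mark a "stuck" STARTED execution as FAILED via the APIs, then restart it.
import java.util.Date;

import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.ExitStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.explore.JobExplorer;
import org.springframework.batch.core.launch.JobOperator;
import org.springframework.batch.core.repository.JobRepository;

public class StuckJobRecovery {

    private final JobExplorer jobExplorer;
    private final JobRepository jobRepository;
    private final JobOperator jobOperator;

    public StuckJobRecovery(JobExplorer jobExplorer, JobRepository jobRepository, JobOperator jobOperator) {
        this.jobExplorer = jobExplorer;
        this.jobRepository = jobRepository;
        this.jobOperator = jobOperator;
    }

    public void failAndRestart(long executionId) throws Exception {
        JobExecution execution = jobExplorer.getJobExecution(executionId);

        // Mark any step that never finished as FAILED.
        for (StepExecution stepExecution : execution.getStepExecutions()) {
            if (stepExecution.getStatus().isRunning()) {
                stepExecution.setStatus(BatchStatus.FAILED);
                stepExecution.setEndTime(new Date());
                jobRepository.update(stepExecution);
            }
        }

        // Mark the job execution itself as FAILED so it becomes restartable.
        execution.setStatus(BatchStatus.FAILED);
        execution.setExitStatus(ExitStatus.FAILED);
        execution.setEndTime(new Date());
        jobRepository.update(execution);

        // Restart: committed chunks are skipped, processing resumes after the last commit.
        jobOperator.restart(executionId);
    }
}
```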
I have set up a Druid cluster (10 nodes) and am ingesting Kafka data using the indexing service. However, I found that many tasks failed, like below, but some data already exists in the segments, and I am not sure whether all the data has been pushed into the segments.
failed task lists
Besides that, I checked the logs of some failed tasks and found no fatal error messages. I have posted the log file; please help me understand what caused the tasks to fail. Thanks so much.
one log of failed tasks
There are two questions I want to ask: one is how to confirm that all consumed data has been pushed into the segments, and the other is what caused the task failures.
This looks like a Hadoop issue where multiple threads are trying to write to the same file at the same time; you need to set overwrite=false.
Check whether you are running multiple ingestion tasks for the same segments (see the sketch below).
You can refer to the link below for further debugging:
https://community.hortonworks.com/questions/139150/no-lease-on-file-inode-5425306-file-does-not-exist.html
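As a starting point for the "multiple ingestion tasks" check, here is a hedged sketch that lists the currently running tasks from the Overlord API; the host and port are placeholders, and whether this check applies depends on your setup.

```java
// List the currently running ingestion tasks from the Druid Overlord API.
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class RunningTasksCheck {

    public static void main(String[] args) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                        URI.create("http://overlord-host:8090/druid/indexer/v1/runningTasks"))
                .GET()
                .build();

        // The response is a JSON array of task descriptors; look for more than one
        // task targeting the same datasource with overlapping intervals.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

If several tasks report the same datasource and overlapping intervals, they may be competing for the same segment files, which matches the "No lease on file" symptom described in the linked thread.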