When TaskExecutor concurrencyLimit is less than the number of flow steps, the job is blocked - spring-batch

I'm using Spring Batch 4.1.2.RELEASE and have a problem with parallel steps (using a split flow). When the split flow's concurrencyLimit (or ThreadPoolTaskExecutor.corePoolSize) is less than the number of steps in the split flow, the job never finishes and no exception is thrown.
I know the workaround is to increase the concurrencyLimit or decrease the number of steps in each flow, but I want to understand whether the problem lies in the interaction between the job's TaskExecutor and the split flow's TaskExecutor, or whether my code is wrong.
Leaving the split flow aside, I found that if the number of (as simple as possible) jobs submitted to the jobLauncher exceeds its TaskExecutor's concurrency limit (assume 1), the jobs are executed one by one. That is the expected result.
@Bean
public TaskExecutor taskExecutor() {
    SimpleAsyncTaskExecutor executor = new SimpleAsyncTaskExecutor("tsk-Exec-");
    executor.setConcurrencyLimit(2);
    return executor;
}
@SpringBootApplication
@EnableBatchProcessing
@EnableWebMvc
public class BatchJobApplication {

    public static void main(String[] args) {
        SpringApplication.run(BatchJobApplication.class, args);
    }
}
The code below creates a single job containing a split flow with 4 tasklet steps.
@Autowired
private TaskExecutor taskExecutor;

public JobExecution experiment(Integer flowId) {
    String dateFormat = LocalDate.now(ZoneId.of("+8")).format(DateTimeFormatter.BASIC_ISO_DATE);
    JobBuilder job1 = this.jobBuilderFactory.get("Job_" + flowId + "_" + dateFormat);
    List<TaskletStep> taskletSteps = Lists.newArrayList();
    for (int i = 0; i < 4; i++) {
        taskletSteps.add(this.stepBuilderFactory.get("step:" + i).tasklet(
                (contribution, chunkContext) -> {
                    Thread.sleep(3000);
                    return RepeatStatus.FINISHED;
                }).build());
    }
    JobExecution run = null;
    FlowBuilder.SplitBuilder<SimpleFlow> splitFlow = new FlowBuilder<SimpleFlow>("splitFlow").split(taskExecutor);
    FlowBuilder<SimpleFlow> lastFlowNode = null;
    for (TaskletStep taskletStep : taskletSteps) {
        SimpleFlow singleNode = new FlowBuilder<SimpleFlow>("async-fw-" + taskletStep.getName()).start(taskletStep).build();
        lastFlowNode = splitFlow.add(singleNode);
    }
    Job build = job1.start(lastFlowNode.end()).build().build();
    JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
    jobParametersBuilder.addDate("parameterGenerated", new Date());
    try {
        run = jobLauncher.run(build, jobParametersBuilder.toJobParameters());
    } catch (JobExecutionAlreadyRunningException e) {
        e.printStackTrace();
    } catch (JobRestartException e) {
        e.printStackTrace();
    } catch (JobInstanceAlreadyCompleteException e) {
        e.printStackTrace();
    } catch (JobParametersInvalidException e) {
        e.printStackTrace();
    }
    return run;
}
Now it blocks:
2019-07-29 18:08:10.321 INFO 24416 --- [ job-Exec-1] o.s.b.c.l.support.SimpleJobLauncher : Job: [FlowJob: [name=Job_2124_20190729]] launched with the following parameters: [{parameterGenerated=1564394890193}]
2019-07-29 18:08:13.392 DEBUG 24416 --- [ job-Exec-1] cTaskExecutor$ConcurrencyThrottleAdapter : Entering throttle at concurrency count 0
2019-07-29 18:08:13.393 DEBUG 24416 --- [ job-Exec-1] cTaskExecutor$ConcurrencyThrottleAdapter : Entering throttle at concurrency count 1
2019-07-29 18:08:13.393 DEBUG 24416 --- [ tsk-Exec-2] cTaskExecutor$ConcurrencyThrottleAdapter : Concurrency count 2 has reached limit 2 - blocking
2019-07-29 18:08:13.425 INFO 24416 --- [ tsk-Exec-1] o.s.batch.core.job.SimpleStepHandler : Executing step: [step:3]
2019-07-29 18:08:16.466 DEBUG 24416 --- [ tsk-Exec-1] cTaskExecutor$ConcurrencyThrottleAdapter : Returning from throttle at concurrency count 1
2019-07-29 18:08:16.466 DEBUG 24416 --- [ tsk-Exec-2] cTaskExecutor$ConcurrencyThrottleAdapter : Entering throttle at concurrency count 1
2019-07-29 18:08:16.466 DEBUG 24416 --- [ tsk-Exec-2] cTaskExecutor$ConcurrencyThrottleAdapter : Concurrency count 2 has reached limit 2 - blocking
2019-07-29 18:08:16.484 INFO 24416 --- [ tsk-Exec-3] o.s.batch.core.job.SimpleStepHandler : Executing step: [step:2]
2019-07-29 18:08:19.505 DEBUG 24416 --- [ tsk-Exec-3] cTaskExecutor$ConcurrencyThrottleAdapter : Returning from throttle at concurrency count 1
2019-07-29 18:08:19.505 DEBUG 24416 --- [ tsk-Exec-2] cTaskExecutor$ConcurrencyThrottleAdapter : Entering throttle at concurrency count 1
2019-07-29 18:08:19.506 DEBUG 24416 --- [ tsk-Exec-4] cTaskExecutor$ConcurrencyThrottleAdapter : Concurrency count 2 has reached limit 2 - blocking

I think I found the answer yesterday evening.
When the TaskExecutor used in the split flow has a small concurrencyLimit (or ThreadPoolTaskExecutor.corePoolSize), it is very likely that all of its threads end up blocked in future.get(), so no thread is left free to actually run a tasklet step:
//SplitState.java:114
results.add(task.get());
In addition, the threads created by the TaskExecutor in the JobLauncher do not have to wait for future results, so that executor always has enough free threads to accept jobs and never has to wait on any condition.
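For what it's worth, here is a minimal sketch of the workaround mentioned in the question (my own illustration, not a confirmed fix from the Spring Batch team): size the executor passed to .split(...) so its concurrency limit covers all parallel flows.

// Sketch: the split in the example has 4 flows, so the executor it uses needs
// a concurrency limit of at least 4; otherwise the split can stall with the
// permitted threads waiting in SplitState's task.get() (see the answer above).
@Bean
public TaskExecutor taskExecutor() {
    SimpleAsyncTaskExecutor executor = new SimpleAsyncTaskExecutor("tsk-Exec-");
    executor.setConcurrencyLimit(4); // >= number of flows added to the split
    return executor;
}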

Related

Spring @KafkaListener with topicPattern: handle runtime topic creation

I'm using Spring @KafkaListener with a topicPattern. If, during the runtime of this application, I create a new topic matching the pattern and start publishing to it, the listener application simply ignores those messages. In other words, it only picks up the topics matching the pattern at startup and listens to those.
What's the easiest way to "refresh" that? Thanks!
By default, new topics will be picked up within 5 minutes, according to the metadata.max.age.ms setting: https://kafka.apache.org/documentation/#consumerconfigs_metadata.max.age.ms
The period of time in milliseconds after which we force a refresh of metadata even if we haven't seen any partition leadership changes to proactively discover any new brokers or partitions.
You can reduce it to speed things up at the expense of increased traffic.
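If the consumer is built from Spring Boot's auto-configured consumer factory, the interval can also be lowered globally via standard consumer properties (a sketch on my part; the EDIT below sets the same property directly on the listener instead):

# application.yml - refresh consumer metadata every 60s instead of 5 minutes,
# so newly created topics matching the pattern are discovered sooner.
spring:
  kafka:
    consumer:
      properties:
        metadata.max.age.ms: 60000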
EDIT
This shows it working as expected...
@SpringBootApplication
public class So71386069Application {

    private static final Logger log = LoggerFactory.getLogger(So71386069Application.class);

    public static void main(String[] args) {
        SpringApplication.run(So71386069Application.class, args);
    }

    @KafkaListener(id = "so71386069", topicPattern = "so71386069.*",
            properties = "metadata.max.age.ms:60000")
    void listen(String in) {
        System.out.println(in);
    }

    @Bean
    public NewTopic topic() {
        return TopicBuilder.name("so71386069").partitions(1).replicas(1).build();
    }

    @Bean
    ApplicationRunner runner(KafkaAdmin admin) {
        return args -> {
            try (AdminClient client = AdminClient.create(admin.getConfigurationProperties())) {
                IntStream.range(0, 10).forEach(i -> {
                    try {
                        Thread.sleep(30_000);
                        String topic = "so71386069-" + i;
                        log.info("Creating {}", topic);
                        client.createTopics(Collections.singleton(
                                TopicBuilder.name(topic).partitions(1).replicas(1).build())).all().get();
                    }
                    catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    catch (ExecutionException e) {
                        e.printStackTrace();
                    }
                });
            }
        };
    }
}
2022-03-07 15:41:07.131 INFO 33630 --- [o71386069-0-C-1] o.s.k.l.KafkaMessageListenerContainer : so71386069: partitions assigned: [so71386069-0]
2022-03-07 15:41:34.007 INFO 33630 --- [ main] com.example.demo.So71386069Application : Creating so71386069-0
2022-03-07 15:42:04.193 INFO 33630 --- [ main] com.example.demo.So71386069Application : Creating so71386069-1
...
2022-03-07 15:42:07.590 INFO 33630 --- [o71386069-0-C-1] o.s.k.l.KafkaMessageListenerContainer : so71386069: partitions revoked: [so71386069-0]
...
2022-03-07 15:42:07.599 INFO 33630 --- [o71386069-0-C-1] o.s.k.l.KafkaMessageListenerContainer : so71386069: partitions assigned: [so71386069-0, so71386069-1-0, so71386069-0-0]
2022-03-07 15:42:34.378 INFO 33630 --- [ main] com.example.demo.So71386069Application : Creating so71386069-2
2022-03-07 15:43:04.554 INFO 33630 --- [ main] com.example.demo.So71386069Application : Creating so71386069-3
...
2022-03-07 15:43:08.403 INFO 33630 --- [o71386069-0-C-1] o.s.k.l.KafkaMessageListenerContainer : so71386069: partitions revoked: [so71386069-0, so71386069-1-0, so71386069-0-0]
...
2022-03-07 15:43:08.411 INFO 33630 --- [o71386069-0-C-1] o.s.k.l.KafkaMessageListenerContainer : so71386069: partitions assigned: [so71386069-0, so71386069-3-0, so71386069-2-0, so71386069-1-0, so71386069-0-0]
...
I think that's how it is by design. The Kafka client always has to subscribe to a topic before being able to get messages.
In this case, on startup the Kafka client/consumer subscribes once to the topics matching the pattern, and that is what it carries on with.
But this is really an interesting question. The easiest and simplest answer is "restart the client/consumer". However, I will keep an eye on other answers to learn about any other ideas.
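If restarting the consumer is acceptable, a rough sketch of that approach (my own illustration, assuming the listener id "so71386069" from the example above and Spring for Apache Kafka's KafkaListenerEndpointRegistry / MessageListenerContainer):

@Autowired
private KafkaListenerEndpointRegistry registry;

// Stop and restart the listener container so the consumer re-subscribes to the
// topic pattern and picks up topics created after startup.
public void restartListener() {
    MessageListenerContainer container = registry.getListenerContainer("so71386069");
    container.stop();
    container.start();
}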

While using Thread.sleep on a ParallelFlux, it does not wait for the sleeping thread to complete and executes onComplete()

While using parallel on a Flux, I pause a thread for some time using Thread.sleep, but the problem is that the Flux does not wait for the sleep to finish and onComplete is executed on subscribe.
List<String> str = new ArrayList<>();
str.add("spring");
str.add("webflux");
str.add("example");
AtomicInteger num = new AtomicInteger();
ParallelFlux<Object> names = Flux.fromIterable(str)
        .log()
        .parallel(2)
        .runOn(Schedulers.boundedElastic())
        .map(s -> {
            if (s.equalsIgnoreCase("webflux")) {
                try {
                    System.out.println("waiting...");
                    Thread.sleep(1000);
                    System.out.println("done...");
                } catch (InterruptedException e) {
                    // TODO Auto-generated catch block
                    e.printStackTrace();
                }
            }
            return s + " " + num.incrementAndGet();
        });
names.subscribe(s -> {
    System.out.println("value " + s + " thread : " + Thread.currentThread().getName());
});
Output:
19:35:24.870 [main] INFO reactor.Flux.Iterable.1 - | onSubscribe([Synchronous Fuseable] FluxIterable.IterableSubscription)
19:35:24.896 [main] INFO reactor.Flux.Iterable.1 - | request(256)
19:35:24.897 [main] INFO reactor.Flux.Iterable.1 - | onNext(spring)
19:35:24.898 [main] INFO reactor.Flux.Iterable.1 - | onNext(webflux)
19:35:24.898 [main] INFO reactor.Flux.Iterable.1 - | onNext(example)
waiting...
value spring 1 thread : boundedElastic-1
value example 2 thread : boundedElastic-1
19:35:24.899 [main] INFO reactor.Flux.Iterable.1 - | onComplete()
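For what it's worth (this is my own note, not part of the original post): subscribe() is non-blocking, so the main thread can reach the end of the program while the boundedElastic workers are still sleeping. A quick way to see every value in a throwaway test like this, instead of the subscribe call above, is to block the main thread until the parallel pipeline completes:

// Merge the rails back into a single Flux and block until every element
// (including the delayed "webflux" one) has been emitted.
names.sequential()
     .doOnNext(s -> System.out.println("value " + s + " thread : " + Thread.currentThread().getName()))
     .blockLast();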

How to handle UnknownProducerIdException

We are having some trouble with Spring Cloud and Kafka: sometimes our microservice throws an UnknownProducerIdException. This is caused when the transactional.id.expiration.ms parameter has expired on the broker side.
My question: is it possible to catch that exception and retry the failed message? If so, what would be the best way to handle it?
I have taken a look at:
- https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=89068820
- Kafka UNKNOWN_PRODUCER_ID exception
We are using Spring Cloud Hoxton.RELEASE and Spring Kafka 2.2.4.RELEASE.
We are using AWS's managed Kafka solution, so we can't set a new value for the property I mentioned above.
Here is some trace of the exception:
2020-04-07 20:54:00.563 ERROR 5188 --- [ad | producer-2] o.a.k.c.p.internals.TransactionManager : [Producer clientId=producer-2] The broker returned org.apache.kafka.common.errors.UnknownProducerIdException: This exception is raised by the broker if it could not locate the producer metadata associated with the producerId in question. This could happen if, for instance, the producer's records were deleted because their retention time had elapsed. Once the last records of the producerId are removed, the producer's metadata is removed from the broker, and future appends by the producer will return this exception. for topic-partition test.produce.another-2 with producerId 35000, epoch 0, and sequence number 8
2020-04-07 20:54:00.563 INFO 5188 --- [ad | producer-2] o.a.k.c.p.internals.TransactionManager : [Producer clientId=producer-2] ProducerId set to -1 with epoch -1
2020-04-07 20:54:00.565 ERROR 5188 --- [ad | producer-2] o.s.k.support.LoggingProducerListener : Exception thrown when sending a message with key='null' and payload='{...}' to topic <some-topic>:
To reproduce this exception:
- I used the Confluent Docker images and set the environment variable KAFKA_TRANSACTIONAL_ID_EXPIRATION_MS to 10 seconds so I wouldn't have to wait too long for this exception to be thrown.
- In another process, send messages one by one, at 10-second intervals, to the topic the Java application listens to.
Here is a code example:
File Bindings.java
import org.springframework.cloud.stream.annotation.Input;
import org.springframework.cloud.stream.annotation.Output;
import org.springframework.messaging.MessageChannel;
import org.springframework.messaging.SubscribableChannel;

public interface Bindings {

    @Input("test-input")
    SubscribableChannel testListener();

    @Output("test-output")
    MessageChannel testProducer();
}
File application.yml (don't forget to set the environment variable KAFKA_HOST):
spring:
  cloud:
    stream:
      kafka:
        binder:
          auto-create-topics: true
          brokers: ${KAFKA_HOST}
          transaction:
            producer:
              error-channel-enabled: true
          producer-properties:
            acks: all
            retry.backoff.ms: 200
            linger.ms: 100
            max.in.flight.requests.per.connection: 1
            enable.idempotence: true
            retries: 3
            compression.type: snappy
            request.timeout.ms: 5000
            key.serializer: org.apache.kafka.common.serialization.StringSerializer
          consumer-properties:
            session.timeout.ms: 20000
            max.poll.interval.ms: 350000
            enable.auto.commit: true
            allow.auto.create.topics: true
            auto.commit.interval.ms: 12000
            max.poll.records: 5
            isolation.level: read_committed
          configuration:
            auto.offset.reset: latest
      bindings:
        test-input:
          # contentType: text/plain
          destination: test.produce
          group: group-input
          consumer:
            maxAttempts: 3
            startOffset: latest
            autoCommitOnError: true
            queueBufferingMaxMessages: 100000
            autoCommitOffset: true
        test-output:
          # contentType: text/plain
          destination: test.produce.another
          group: group-output
          producer:
            acks: all
debug: true
The listener handler:
@SpringBootApplication
@EnableBinding(Bindings.class)
public class PocApplication {

    private static final Logger log = LoggerFactory.getLogger(PocApplication.class);

    public static void main(String[] args) {
        SpringApplication.run(PocApplication.class, args);
    }

    @Autowired
    private BinderAwareChannelResolver binderAwareChannelResolver;

    @StreamListener(Topics.TESTLISTENINPUT)
    public void listen(Message<?> in, String headerKey) {
        final MessageBuilder builder;
        MessageChannel messageChannel;
        messageChannel = this.binderAwareChannelResolver.resolveDestination("test-output");
        Object payload = in.getPayload();
        builder = MessageBuilder.withPayload(payload);
        try {
            log.info("Event received: {}", in);
            if (!messageChannel.send(builder.build())) {
                log.error("Something happend trying send the message! {}", in.getPayload());
            }
            log.info("Commit success");
        } catch (UnknownProducerIdException e) {
            log.error("UnkownProducerIdException catched ", e);
        } catch (KafkaException e) {
            log.error("KafkaException catched ", e);
        } catch (Exception e) {
            System.out.println("Commit failed " + e.getMessage());
        }
    }
}
Regards
} catch (UnknownProducerIdException e) {
log.error("UnkownProducerIdException catched ", e);
To catch exceptions there, you need to set the sync Kafka producer property (https://cloud.spring.io/spring-cloud-static/spring-cloud-stream-binder-kafka/3.0.3.RELEASE/reference/html/spring-cloud-stream-binder-kafka.html#kafka-producer-properties). Otherwise, the error comes back asynchronously.
You should not "eat" the exception there; it must be thrown back to the container so the container will roll back the transaction.
Also,
}catch (Exception e) {
System.out.println("Commit failed " + e.getMessage());
}
The commit is performed by the container after the stream listener returns to the container so you will never see a commit error here; again, you must let the exception propagate back to the container.
The container will retry the delivery according to the consumer binding's retry configuration.
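As a rough sketch of where those settings live (property names per the linked binder documentation; the exact placement is an assumption based on the configuration posted in the question):

spring:
  cloud:
    stream:
      kafka:
        bindings:
          test-output:
            producer:
              sync: true        # surface producer errors synchronously in the listener thread
      bindings:
        test-input:
          consumer:
            maxAttempts: 3      # container-side redelivery attempts once the exception propagates back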
You can probably also use a callback to handle the exception. I'm not sure about the Spring library for Kafka, but if you use the plain Kafka client you can do something like this:
producer.send(record, new Callback() {
    public void onCompletion(RecordMetadata metadata, Exception e) {
        if (e != null) {
            e.printStackTrace();
            if (e.getClass().equals(UnknownProducerIdException.class)) {
                logger.info("UnknownProducerIdException caught");
                while (--retry >= 0) {
                    send(topic, partition, msg);
                }
            }
        } else {
            logger.info("The offset of the record we just sent is: " + metadata.offset());
        }
    }
});

Spring Batch - "Commit failed while step execution data was already updated" error in custom writer

After the writer class finishes, the error below occurs in the Spring Batch execution: all of the read data is rolled back and the chunk data is not inserted into the DB.
What does "Commit failed while step execution data was already updated" mean? And why does this log occur?
Could the problem be the number of insert transactions in the writer? If there is only one insert transaction (with the call to 'recordDataService.insertRecordData(recordData);' commented out), there is no problem.
My Code :
@Override
public void write(List<? extends RecordInfo> items) {
    log.info("########################################################");
    log.info("write");
    log.info("########################################################");
    recordInfoService.insertRecordInfo(recordInfos.get(0));
    for (RecordData recordData : recordInfos.get(0).getRecordDataList()) {
        recordDataService.insertRecordData(recordData);
    }
}
The log from each writer execution:
2015-07-17 00:05:38.995 INFO 42558 --- [ main] c.s.c.b.j.c.CollectRecordItemWriter : ########################################################
2015-07-17 00:05:38.995 INFO 42558 --- [ main] c.s.c.b.j.c.CollectRecordItemWriter : write
2015-07-17 00:05:38.995 INFO 42558 --- [ main] c.s.c.b.j.c.CollectRecordItemWriter : ########################################################
2015-07-17 16:34:26.921 INFO 41111 --- [ main] o.s.jdbc.support.SQLErrorCodesFactory : SQLErrorCodes loaded: [DB2, Derby, H2, HSQL, Informix, MS-SQL, MySQL, Oracle, PostgreSQL, Sybase, Hana]
2015-07-16 02:30:25.734 INFO 60636 --- [ main] o.s.batch.core.step.tasklet.TaskletStep : Commit failed while step execution data was already updated. Reverting to old version.
The execution log: everything was rolled back
+++++++++++++++++++++++++++++++++++++++++++++++++++++
Step collectRecordStep
WriteCount: 0
ReadCount: 14
ReadSkipCount: 0
Commits: 1
SkipCount: 0
Rollbacks: 14
Filter: 0
+++++++++++++++++++++++++++++++++++++++++++++++++++++

Async Spring Batch job fails to process file

I'm trying to process a file and load it into a database using Spring Batch, right after the file is uploaded. However, the job completes right after it is started, and I'm not too sure of the exact reason. I think it's not doing what it should in tasklet.execute. Below is the DEBUG output:
22:25:09.823 [http-nio-127.0.0.1-8080-exec-2] DEBUG o.s.b.c.c.a.SimpleBatchConfiguration$ReferenceTargetSource - Initializing lazy target object
22:25:09.912 [SimpleAsyncTaskExecutor-1] INFO o.s.b.c.l.support.SimpleJobLauncher - Job: [FlowJob: [name=moneyTransactionImport]] launched with the following parameters: [{targetFile=C:\Users\test\AppData\Local\Temp\tomcat.1435325122308787143.8080\uploads\test.csv}]
22:25:09.912 [SimpleAsyncTaskExecutor-1] DEBUG o.s.batch.core.job.AbstractJob - Job execution starting: JobExecution: id=95, version=0, startTime=null, endTime=null, lastUpdated=Tue Sep 16 22:25:09 BST 2014, status=STARTING, exitStatus=exitCode=UNKNOWN;exitDescription=, job=[JobInstance: id=52, version=0, Job=[transactionImport]], jobParameters=[{targetFile=C:\Users\test\AppData\Local\Temp\tomcat.1435325122308787143.8080\uploads\test.csv}]
22:25:09.971 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.c.job.flow.support.SimpleFlow - Resuming state=transactionImport.step with status=UNKNOWN
22:25:09.972 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.c.job.flow.support.SimpleFlow - Handling state=transactionImport.step
22:25:10.018 [SimpleAsyncTaskExecutor-1] INFO o.s.batch.core.job.SimpleStepHandler - Executing step: [step]
22:25:10.019 [SimpleAsyncTaskExecutor-1] DEBUG o.s.batch.core.step.AbstractStep - Executing: id=93
22:25:10.072 [SimpleAsyncTaskExecutor-1] DEBUG o.s.batch.core.scope.StepScope - Creating object in scope=step, name=scopedTarget.reader
22:25:10.117 [SimpleAsyncTaskExecutor-1] DEBUG o.s.batch.core.scope.StepScope - Registered destruction callback in scope=step, name=scopedTarget.reader
22:25:10.136 [SimpleAsyncTaskExecutor-1] WARN o.s.b.item.file.FlatFileItemReader - Input resource does not exist class path resource [C:/Users/test/AppData/Local/Temp/tomcat.1435325122308787143.8080/uploads/test.csv]
22:25:10.180 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.repeat.support.RepeatTemplate - Starting repeat context.
22:25:10.181 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.repeat.support.RepeatTemplate - Repeat operation about to start at count=1
22:25:10.181 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.c.s.c.StepContextRepeatCallback - Preparing chunk execution for StepContext: org.springframework.batch.core.scope.context.StepContext#5d85b879
22:25:10.181 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.c.s.c.StepContextRepeatCallback - Chunk execution starting: queue size=0
22:25:12.333 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.repeat.support.RepeatTemplate - Starting repeat context.
22:25:12.333 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.repeat.support.RepeatTemplate - Repeat operation about to start at count=1
22:25:12.334 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.repeat.support.RepeatTemplate - Repeat is complete according to policy and result value.
22:25:12.334 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.c.s.item.ChunkOrientedTasklet - Inputs not busy, ended: true
22:25:12.334 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.core.step.tasklet.TaskletStep - Applying contribution: [StepContribution: read=0, written=0, filtered=0, readSkips=0, writeSkips=0, processSkips=0, exitStatus=EXECUTING]
22:25:12.337 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.core.step.tasklet.TaskletStep - Saving step execution before commit: StepExecution: id=93, version=1, name=step, status=STARTED, exitStatus=EXECUTING, readCount=0, filterCount=0, writeCount=0 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=1, rollbackCount=0, exitDescription=
22:25:12.358 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.repeat.support.RepeatTemplate - Repeat is complete according to policy and result value.
22:25:12.358 [SimpleAsyncTaskExecutor-1] DEBUG o.s.batch.core.step.AbstractStep - Step execution success: id=93
22:25:12.419 [SimpleAsyncTaskExecutor-1] DEBUG o.s.batch.core.step.AbstractStep - Step execution complete: StepExecution: id=93, version=3, name=step, status=COMPLETED, exitStatus=COMPLETED, readCount=0, filterCount=0, writeCount=0 readSkipCount=0, writeSkipCount=0, processSkipCount=0, commitCount=1, rollbackCount=0
22:25:12.442 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.c.job.flow.support.SimpleFlow - Completed state=transactionImport.step with status=COMPLETED
22:25:12.443 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.c.job.flow.support.SimpleFlow - Handling state=transactionImport.COMPLETED
22:25:12.443 [SimpleAsyncTaskExecutor-1] DEBUG o.s.b.c.job.flow.support.SimpleFlow - Completed state=transactionImport.COMPLETED with status=COMPLETED
22:25:12.445 [SimpleAsyncTaskExecutor-1] DEBUG o.s.batch.core.job.AbstractJob - Job execution complete: JobExecution: id=95, version=1, startTime=Tue Sep 16 22:25:09 BST 2014, endTime=null, lastUpdated=Tue Sep 16 22:25:09 BST 2014, status=COMPLETED, exitStatus=exitCode=COMPLETED;exitDescription=, job=[JobInstance: id=52, version=0, Job=[transactionImport]], jobParameters=[{targetFile=C:\Users\test\AppData\Local\Temp\tomcat.1435325122308787143.8080\uploads\test.csv}]
22:25:12.466 [SimpleAsyncTaskExecutor-1] INFO o.s.b.c.l.support.SimpleJobLauncher - Job: [FlowJob: [name=transactionImport]] completed with the following parameters: [{targetFile=C:\Users\test\AppData\Local\Temp\tomcat.1435325122308787143.8080\uploads\test.csv}] and the following status: [COMPLETED]
My config is as follows:
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    @Inject
    private TransactionRepository transactionRepository;

    @Inject
    private JobRepository jobRepository;

    @Bean
    @StepScope
    public FlatFileItemReader<MoneyTransaction> reader(@Value("#{jobParameters[targetFile]}") String file) {
        FlatFileItemReader<MoneyTransaction> reader = new FlatFileItemReader<>();
        reader.setResource(new ClassPathResource(file));
        reader.setLineMapper(new DefaultLineMapper<MoneyTransaction>() {
            {
                setLineTokenizer(new DelimitedLineTokenizer() {
                    {
                        setNames(new String[]{"Number", "Date", "Account", "Payee", "Cleared", "Amount", "Category", "Subcategory", "Memo"});
                    }
                });
                setFieldSetMapper(new BeanWrapperFieldSetMapper<MoneyTransaction>() {
                    {
                        setTargetType(MoneyTransaction.class);
                    }
                });
            }
        });
        reader.setStrict(false);
        reader.setLinesToSkip(1);
        return reader;
    }

    @Bean
    public ItemProcessor<MoneyTransaction, Transaction> processor() {
        return new TransactionProcessor();
    }

    @Bean
    public RepositoryItemWriter writer() {
        RepositoryItemWriter writer = new RepositoryItemWriter();
        writer.setRepository(transactionRepository);
        writer.setMethodName("save");
        return writer;
    }

    @Bean
    public Step step(StepBuilderFactory stepBuilderFactory, ItemReader<MoneyTransaction> reader,
                     ItemWriter<Transaction> writer, ItemProcessor<MoneyTransaction, Transaction> processor) {
        return stepBuilderFactory.get("step")
                .<MoneyTransaction, Transaction>chunk(100)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }

    @Bean
    public SimpleAsyncTaskExecutor taskExecutor() {
        SimpleAsyncTaskExecutor executor = new SimpleAsyncTaskExecutor();
        executor.setConcurrencyLimit(1);
        return executor;
    }

    @Bean
    public SimpleJobLauncher jobLauncher() {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(jobRepository);
        jobLauncher.setTaskExecutor(taskExecutor());
        return jobLauncher;
    }
}
And I save the file, and start processing in the following way:
public JobExecution processFile(String name, MultipartFile file) {
    if (!file.isEmpty()) {
        try {
            byte[] bytes = file.getBytes();
            String rootPath = System.getProperty("catalina.home");
            File uploadDirectory = new File(rootPath.concat(File.separator).concat("uploads"));
            if (!uploadDirectory.exists()) {
                uploadDirectory.mkdirs();
            }
            File uploadFile = new File(uploadDirectory.getAbsolutePath() + File.separator + file.getOriginalFilename());
            BufferedOutputStream stream =
                    new BufferedOutputStream(new FileOutputStream(uploadFile));
            stream.write(bytes);
            stream.close();
            return startImportJob(uploadFile, "transactionImport");
        } catch (Exception e) {
            logger.error(String.format("Error processing file '%s'.", name), e);
            throw new MoneyException(e);
        }
    } else {
        throw new MoneyException("There was no file to process.");
    }
}

/**
 * @param file
 */
private JobExecution startImportJob(File file, String jobName) {
    logger.debug(String.format("Starting job to import file '%s'.", file));
    try {
        Job job = jobs.get(jobName).incrementer(new MoneyRunIdIncrementer()).flow(step).end().build();
        return jobLauncher.run(job, new JobParametersBuilder().addString("targetFile", file.getAbsolutePath()).toJobParameters());
    } catch (JobExecutionAlreadyRunningException e) {
        logger.error(String.format("Job for processing file '%s' is already running.", file), e);
        throw new MoneyException(e);
    } catch (JobParametersInvalidException e) {
        logger.error(String.format("Invalid parameters for processing of file '%s'.", file), e);
        throw new MoneyException(e);
    } catch (JobRestartException e) {
        logger.error(String.format("Error restarting job, for processing file '%s'.", file), e);
        throw new MoneyException(e);
    } catch (JobInstanceAlreadyCompleteException e) {
        logger.error(String.format("Job to process file '%s' has already completed.", file), e);
        throw new MoneyException(e);
    }
}
I'm kind of stumped at the moment, and any help would be greatly appreciated.
Thanks.
Found the issue. The problem was with the type of resource ClassPathResource(file), combined with the fact that I was setting the strict property to false.
reader.setResource(new ClassPathResource(file));
I should have used
reader.setResource(new FileSystemResource(file));
Which makes complete sense, as I wasn't uploading the file as a class path resource.
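For completeness, a sketch of what the corrected reader bean might look like (same mapping as in the question's configuration, only the resource type changed; leaving strict at its default of true would also make a missing file fail the step instead of silently completing with readCount=0):

@Bean
@StepScope
public FlatFileItemReader<MoneyTransaction> reader(@Value("#{jobParameters[targetFile]}") String file) {
    FlatFileItemReader<MoneyTransaction> reader = new FlatFileItemReader<>();
    // The uploaded file lives on the file system, not on the classpath.
    reader.setResource(new FileSystemResource(file));
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer();
    tokenizer.setNames(new String[]{"Number", "Date", "Account", "Payee", "Cleared", "Amount", "Category", "Subcategory", "Memo"});
    BeanWrapperFieldSetMapper<MoneyTransaction> fieldSetMapper = new BeanWrapperFieldSetMapper<>();
    fieldSetMapper.setTargetType(MoneyTransaction.class);
    DefaultLineMapper<MoneyTransaction> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(tokenizer);
    lineMapper.setFieldSetMapper(fieldSetMapper);
    reader.setLineMapper(lineMapper);
    reader.setLinesToSkip(1);
    return reader;
}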