Reading Multiple Excel Files Using Spring Batch Extension - spring-batch

I am trying to read multiple Excel files using Spring-Batch-Excel. In my scenario I don't know in advance how many files the client will process, i.e. if the data is very large, the Excel file will be split into multiple files like records1.xls, records2.xls, records3.xls.
Is there any kind of MultiResourceItemReader available in Spring-Batch-Excel? I tried to set multiple resources at runtime and also tried to use the pattern records*.xls, but PoiItemReader didn't allow me to do that. I am using PoiItemReader for this.

To read multiple Excel files:
package com.abc.ingestion.job.dci;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.extensions.excel.RowMapper;
import org.springframework.batch.extensions.excel.streaming.StreamingXlsxItemReader;
import org.springframework.batch.extensions.excel.support.rowset.DefaultRowSetFactory;
import org.springframework.batch.extensions.excel.support.rowset.StaticColumnNameExtractor;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.file.MultiResourceItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;

@Configuration
@EnableBatchProcessing
public class BatchConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    // Create an input folder in resources
    @Value("input/DCI*.xlsx")
    private Resource[] inputResources;

    @Bean
    public MultiResourceItemReader<CouncilMapper> multiResourceItemReader() {
        MultiResourceItemReader<CouncilMapper> resourceItemReader = new MultiResourceItemReader<>();
        resourceItemReader.setResources(inputResources);
        resourceItemReader.setDelegate(reader());
        return resourceItemReader;
    }

    private RowMapper<CouncilMapper> excelRowMapper() {
        return new Mapper();
    }

    @SuppressWarnings({ "rawtypes", "unchecked" })
    @Bean
    public StreamingXlsxItemReader<CouncilMapper> reader() {
        final String[] COLUMNS = {"Reg_Type", "RegUnder", "registration_no", "registration_date", "course",
                "Other_Course", "LRegDate", "council_name", "full_name", "CatName", "Other_Category", "father_name",
                "mother_name", "gender", "nationality", "date_of_birth", "place_of_birth", "permanent_address",
                "business_address", "current_city", "current_state", "permanent_city", "mobile_number",
                "OfficialTelephone", "email", "aadhar_number", "PanNo", "IsDeleted", "CreatedDate", "UpdatedDate",
                "speciality_name"};
        var factory = new DefaultRowSetFactory();
        factory.setColumnNameExtractor(new StaticColumnNameExtractor(COLUMNS));
        StreamingXlsxItemReader<CouncilMapper> reader = new StreamingXlsxItemReader<>();
        reader.setLinesToSkip(1);
        reader.setRowSetFactory(factory);
        reader.setRowMapper(excelRowMapper());
        return reader;
    }

    @Bean
    ItemWriter<CouncilMapper> writer() {
        return new Writer();
    }

    @Bean
    public Job readFilesJob() {
        return jobBuilderFactory
                .get("readFilesJob")
                .incrementer(new RunIdIncrementer())
                .start(excelFileStep())
                .build();
    }

    @Bean
    public Step excelFileStep() {
        return stepBuilderFactory.get("excelFileStep")
                .<CouncilMapper, CouncilMapper>chunk(5)
                .reader(multiResourceItemReader())
                .writer(writer())
                .build();
    }
}
Mapper Class
package com.abc.ingestion.job.dci;

import java.util.HashMap;
import java.util.Map;
import java.util.stream.IntStream;

import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.batch.extensions.excel.RowMapper;
import org.springframework.batch.extensions.excel.support.rowset.RowSet;

public class Mapper implements RowMapper<CouncilMapper> {

    @Override
    public CouncilMapper mapRow(RowSet rowSet) throws Exception {
        var rowSetMetaData = rowSet.getMetaData();
        String[] columnNames = rowSetMetaData.getColumnNames();
        String[] rowData = rowSet.getCurrentRow();
        var mapper = new ObjectMapper();
        Map<String, String> excelData = new HashMap<>();
        IntStream.range(0, columnNames.length).forEach(index -> excelData.put(columnNames[index], rowData[index]));
        return mapper.convertValue(excelData, CouncilMapper.class);
    }
}
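Note that CouncilMapper itself is not shown above. For the ObjectMapper.convertValue call to bind anything, it needs properties whose names match the column headers; a minimal hypothetical sketch:
// Hypothetical sketch: one String property per entry in the COLUMNS array,
// so that Jackson can bind the map produced in Mapper.mapRow.
public class CouncilMapper {
    private String registration_no;
    private String full_name;
    // ... remaining fields, one per column header

    public String getRegistration_no() { return registration_no; }
    public void setRegistration_no(String registration_no) { this.registration_no = registration_no; }
    public String getFull_name() { return full_name; }
    public void setFull_name(String full_name) { this.full_name = full_name; }
    // ... getters and setters for the remaining fields
}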


Error: Field job in com.example.partioner.DemoApplication required a bean of type 'org.springframework.batch.core.Job' that could not be found

I am trying a Spring Batch database partitioning program, but I get this message when I try to run the batch:
APPLICATION FAILED TO START
Description: Field job in com.example.partioner.DemoApplication required a bean of type 'org.springframework.batch.core.Job' that could not be found.
The injection point has the following annotations: - @org.springframework.beans.factory.annotation.Autowired(required=true)
Action: Consider defining a bean of type 'org.springframework.batch.core.Job' in your configuration.
This is my main class:
package com.example.partioner;

import java.util.Date;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
@EnableBatchProcessing
public class DemoApplication implements CommandLineRunner {

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job job;

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }

    @Override
    public void run(String... args) throws Exception {
        System.out.println("STATUS STARTED===================");
        JobParameters jobParameters = new JobParametersBuilder()
                .addString("JobId", String.valueOf(System.currentTimeMillis()))
                .addDate("date", new Date())
                .addLong("time", System.currentTimeMillis())
                .toJobParameters();
        JobExecution execution = jobLauncher.run(job, jobParameters);
        System.out.println("STATUS :: " + execution.getStatus());
    }
}
This is my JobConfiguration class:
package com.example.config;

import java.util.HashMap;
import java.util.Map;

import javax.sql.DataSource;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.item.database.BeanPropertyItemSqlParameterSourceProvider;
import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.database.JdbcPagingItemReader;
import org.springframework.batch.item.database.Order;
import org.springframework.batch.item.database.support.MySqlPagingQueryProvider;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.SimpleAsyncTaskExecutor;

import com.example.mapper.CustomerRowMapper;
import com.example.model.Customer;
import com.example.partitioner.ColumnRangePartitioner;

@Configuration
public class JobConfiguration {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private DataSource dataSource;

    @Bean
    public ColumnRangePartitioner partitioner() {
        ColumnRangePartitioner columnRangePartitioner = new ColumnRangePartitioner();
        columnRangePartitioner.setColumn("id");
        columnRangePartitioner.setDataSource(dataSource);
        columnRangePartitioner.setTable("customer");
        return columnRangePartitioner;
    }

    @Bean
    @StepScope
    public JdbcPagingItemReader<Customer> pagingItemReader(
            @Value("#{stepExecutionContext['minValue']}") Long minValue,
            @Value("#{stepExecutionContext['maxValue']}") Long maxValue) {
        System.out.println("reading " + minValue + " to " + maxValue);
        Map<String, Order> sortKeys = new HashMap<>();
        sortKeys.put("id", Order.ASCENDING);
        MySqlPagingQueryProvider queryProvider = new MySqlPagingQueryProvider();
        queryProvider.setSelectClause("id, firstName, lastName, birthdate");
        queryProvider.setFromClause("from customer");
        queryProvider.setWhereClause("where id >= " + minValue + " and id < " + maxValue);
        queryProvider.setSortKeys(sortKeys);
        JdbcPagingItemReader<Customer> reader = new JdbcPagingItemReader<>();
        reader.setDataSource(this.dataSource);
        reader.setFetchSize(10);
        reader.setRowMapper(new CustomerRowMapper());
        reader.setQueryProvider(queryProvider);
        return reader;
    }

    @Bean
    @StepScope
    public JdbcBatchItemWriter<Customer> customerItemWriter() {
        JdbcBatchItemWriter<Customer> itemWriter = new JdbcBatchItemWriter<>();
        itemWriter.setDataSource(dataSource);
        itemWriter.setSql("INSERT INTO NEW_CUSTOMER VALUES (:id, :firstName, :lastName, :birthdate)");
        itemWriter.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>());
        itemWriter.afterPropertiesSet();
        return itemWriter;
    }

    @Bean
    public Step slaveStep() {
        return stepBuilderFactory.get("slaveStep")
                .<Customer, Customer>chunk(10)
                .reader(pagingItemReader(null, null))
                .writer(customerItemWriter())
                .build();
    }

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1")
                .partitioner(slaveStep().getName(), partitioner())
                .step(slaveStep())
                .gridSize(4)
                .taskExecutor(new SimpleAsyncTaskExecutor())
                .build();
    }

    @Bean
    public Job job() {
        return jobBuilderFactory.get("job")
                .start(step1())
                .build();
    }
}
This is my partitioner class:
package com.example.partitioner;

import java.util.HashMap;
import java.util.Map;

import javax.sql.DataSource;

import org.springframework.batch.core.partition.support.Partitioner;
import org.springframework.batch.item.ExecutionContext;
import org.springframework.jdbc.core.JdbcOperations;
import org.springframework.jdbc.core.JdbcTemplate;

public class ColumnRangePartitioner implements Partitioner {

    private JdbcOperations jdbcTemplate;
    private String table;
    private String column;

    public void setTable(String table) {
        this.table = table;
    }

    public void setColumn(String column) {
        this.column = column;
    }

    public void setDataSource(DataSource dataSource) {
        jdbcTemplate = new JdbcTemplate(dataSource);
    }

    @Override
    public Map<String, ExecutionContext> partition(int gridSize) {
        int min = jdbcTemplate.queryForObject("SELECT MIN(" + column + ") FROM " + table, Integer.class);
        int max = jdbcTemplate.queryForObject("SELECT MAX(" + column + ") FROM " + table, Integer.class);
        int targetSize = (max - min) / gridSize + 1;
        Map<String, ExecutionContext> result = new HashMap<>();
        int number = 0;
        int start = min;
        int end = start + targetSize - 1;
        while (start <= max) {
            ExecutionContext value = new ExecutionContext();
            result.put("partition" + number, value);
            if (end >= max) {
                end = max;
            }
            value.putInt("minValue", start);
            value.putInt("maxValue", end);
            start += targetSize;
            end += targetSize;
            number++;
        }
        return result;
    }
}
I don't understand the reason for this message and can't find a solution. I think I have added all the necessary annotations. I am a beginner and I hope you can help me.
Your DemoApplication is in the package com.example.partioner, while your job configuration class JobConfiguration is in the package com.example.config.
In order for Spring Boot to find your job, you need to move your JobConfiguration class to the same package as your main class DemoApplication, or to a package underneath it.
Please refer to the Structuring Your Code section of the reference documentation.
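If moving the class is not convenient, a common alternative (a sketch of standard Spring Boot usage, not something from the answer above) is to widen component scanning from the main class so that com.example.config is included:
// Scan the common parent package so that com.example.config is picked up too.
@SpringBootApplication(scanBasePackages = "com.example")
@EnableBatchProcessing
public class DemoApplication implements CommandLineRunner {
    // ... rest unchanged
}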

Spring Batch: process different types in one step

I have a batch job that reads records from a file. I want to convert said records to PojoA (all strings). I want to run each record through a validator to ensure all fields are present. I then want to transform PojoA to PojoB. The issue I have is that I am unable to change the type of object mid-step.
return getStepBuilder("downloadData")
        .<PojoA, PojoA>chunk(1000)
        .reader(pojoAReader())
        .processor(pojoAValidator)
        .writer(pojoAWriter)
        .processor(pojoAToPojoBTransformer) // <- issue here, <PojoA, PojoB>
        .writer(pojoBWriter)
        .build();
The reason PojoB exists is that PojoA is all strings; I want to persist all records regardless of whether they're invalid. PojoB has the accurate data types, e.g. dates and numbers.
I think I need another step that deals with this, but how do I pass the PojoAs to step 2?
You cannot declare two processors/writers like:
.processor(pojoAValidator)
.writer(pojoAWriter)
.processor(pojoAToPojoBTransformer) // <- issue here, <PojoA, PojoB>
.writer(pojoBWriter)
You need to use a composite processor/writer for that.
Here is a quick example for a composite processor with processor1 (Integer -> Integer) then processor2 (Integer -> String):
import java.util.Arrays;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.support.CompositeItemProcessor;
import org.springframework.batch.item.support.ListItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.AnnotationConfigApplicationContext;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing
public class MyJob {

    @Autowired
    private JobBuilderFactory jobs;

    @Autowired
    private StepBuilderFactory steps;

    @Bean
    public ItemReader<Integer> itemReader() {
        return new ListItemReader<>(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10));
    }

    @Bean
    public ItemWriter<String> itemWriter() {
        return items -> {
            for (String item : items) {
                System.out.println("item = " + item);
            }
        };
    }

    @Bean
    public ItemProcessor<Integer, Integer> itemProcessor1() {
        return item -> item + 1;
    }

    @Bean
    public ItemProcessor<Integer, String> itemProcessor2() {
        return String::valueOf;
    }

    @Bean
    public ItemProcessor<Integer, String> compositeItemProcessor() {
        CompositeItemProcessor<Integer, String> compositeItemProcessor = new CompositeItemProcessor<>();
        compositeItemProcessor.setDelegates(Arrays.asList(itemProcessor1(), itemProcessor2()));
        return compositeItemProcessor;
    }

    @Bean
    public Step step() {
        return steps.get("step")
                .<Integer, String>chunk(5)
                .reader(itemReader())
                .processor(compositeItemProcessor())
                .writer(itemWriter())
                .build();
    }

    @Bean
    public Job job() {
        return jobs.get("job")
                .start(step())
                .build();
    }

    public static void main(String[] args) throws Exception {
        ApplicationContext context = new AnnotationConfigApplicationContext(MyJob.class);
        JobLauncher jobLauncher = context.getBean(JobLauncher.class);
        Job job = context.getBean(Job.class);
        jobLauncher.run(job, new JobParameters());
    }
}
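For the writer side of the original question, the same delegation idea applies via CompositeItemWriter. A minimal sketch, assuming the step's output type is PojoB and that pojoBRawWriter() and pojoBTypedWriter() are hypothetical writer beans for the two persistence targets:
import java.util.Arrays;
import org.springframework.batch.item.support.CompositeItemWriter;

@Bean
public CompositeItemWriter<PojoB> compositeItemWriter() {
    // Each chunk is handed to every delegate in order: first the raw
    // persister, then the typed persister.
    CompositeItemWriter<PojoB> writer = new CompositeItemWriter<>();
    writer.setDelegates(Arrays.asList(pojoBRawWriter(), pojoBTypedWriter()));
    return writer;
}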

HikariCP for MongoDB in Spring Boot

I am looking to create a connection pool for MongoDB in Spring Boot. I am currently making use of Spring Data Mongo repositories to connect to the DB and collections, but I'm unsure how to create the connection pool.
Here is the current implementation:
PersonRepository.java
package com.test.TestAPI.repository;

import org.springframework.data.mongodb.repository.MongoRepository;
import org.springframework.stereotype.Repository;

import com.test.TestAPI.dto.Person;

@Repository
public interface PersonRepository extends MongoRepository<Person, String> {
}
PersonService
package com.test.TestAPI.service.impl;

import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import com.test.TestAPI.repository.PersonRepository;
import com.test.TestAPI.dto.Person;

@Service
public class PersonService {

    @Autowired
    private PersonRepository personRepo;

    public List<Person> findAllPersons() {
        return personRepo.findAll();
    }

    public Person createPerson(Person person) {
        return personRepo.save(person);
    }
}
PersonController
package com.test.TestAPI.controller;

import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import com.test.TestAPI.service.impl.PersonService;
import com.test.TestAPI.dto.Person;

import io.swagger.annotations.Api;
import io.swagger.annotations.ApiOperation;

@RestController
@RequestMapping(path = "/v1/personController")
@Api(value = "Controller for Person document")
public class PersonController {

    @Autowired
    PersonService service;

    @GetMapping("/getAllPersons")
    @ApiOperation(produces = MediaType.APPLICATION_JSON_VALUE, httpMethod = "GET", response = List.class,
            value = "getAllPersons from the database", notes = "Sample note")
    public ResponseEntity<List<Person>> getAllPersons() {
        List<Person> personList = service.findAllPersons();
        return new ResponseEntity<List<Person>>(personList, HttpStatus.OK);
    }
}
SimpleCommandLineConfig
package com.test.TestAPI.config;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.core.annotation.Order;
import org.springframework.stereotype.Component;

import com.test.TestAPI.repository.PersonRepository;
import com.test.TestAPI.dto.Person;

@Component
@Order(3)
public class SimpleCommandLineConfig implements CommandLineRunner {

    @Autowired
    PersonRepository repo;

    @Override
    public void run(String... args) throws Exception {
        System.out.println("third command line runner");
        System.out.println(repo.save(new Person("rr", "rr", 4)));
    }
}
App.java
package com.test.TestAPI.main;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.ComponentScan;
import org.springframework.data.mongodb.repository.config.EnableMongoRepositories;

@SpringBootApplication
@ComponentScan(basePackages = {"com.test.TestAPI"})
@EnableMongoRepositories(basePackages = {"com.test.TestAPI.repository"})
public class App {

    public static void main(String[] args) throws Exception {
        SpringApplication.run(App.class, args);
    }
}
Also, I would like to know whether Spring Data repositories (like a custom repository for MongoDB) take care of the connection pool mechanism. How does connection pooling happen in that case? Could you please help me with this?
I think I found an answer for this. Just like we use JDBC templates for an RDBMS, Spring provides MongoTemplate, which forms a connection pool based on the given DB configuration.
Here is a sample implementation:
MongoClientFactory.java
@Bean
public MongoClientFactoryBean mongo() throws Exception {
    MongoClientFactoryBean mongo = new MongoClientFactoryBean();
    mongo.setHost("localhost");
    MongoClientOptions clientOptions = MongoClientOptions.builder().applicationName("FeddBackAPI_DB")
            .connectionsPerHost(2000)
            .connectTimeout(4000)
            //.maxConnectionIdleTime(1000000000)
            .maxWaitTime(3000)
            .retryWrites(true)
            .socketTimeout(4000)
            .sslInvalidHostNameAllowed(true) // this is very risky
            .build();
    mongo.setMongoClientOptions(clientOptions);
    return mongo;
}
DataSourceConfig.java (another config class; the same config class could also be used to define all the beans):
package com.fmr.FeedBackAPI.config;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Import;
import org.springframework.core.env.Environment;
import org.springframework.data.mongodb.MongoDbFactory;
import org.springframework.data.mongodb.core.MongoOperations;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.SimpleMongoDbFactory;

import com.mongodb.Mongo;
import com.mongodb.MongoClient;
import com.zaxxer.hikari.HikariDataSource;

@Configuration
@Import(value = MongoClientFactory.class)
public class DataSourceConfig {

    @Autowired
    Mongo mongo;

    @Autowired
    Environment env;

    @Bean
    public String test() {
        System.out.println("mongo" + mongo);
        return "rer";
    }

    private MongoTemplate mongoTemplate() {
        MongoDbFactory factory = new SimpleMongoDbFactory((MongoClient) mongo, "mongo_test");
        MongoTemplate template = new MongoTemplate(factory);
        return template;
    }

    @Bean
    @Qualifier(value = "customMongoOps")
    public MongoOperations mongoOps() {
        MongoOperations ops = mongoTemplate();
        return ops;
    }

    @Bean
    public MongoDbFactory factory() {
        MongoDbFactory factory = new SimpleMongoDbFactory((MongoClient) mongo, "mongo_test");
        return factory;
    }

    // @Bean
    // public GridFsTemplate gridFsTemplate() {
    //     return new GridFsTemplate(mongo, converter);
    // }

    @Bean
    public javax.sql.DataSource dataSource() {
        HikariDataSource ds = new HikariDataSource();
        ds.setMaximumPoolSize(100);
        // ds.setDriverClassName(env.getProperty("spring.datasource.driver-class-name"));
        // ds.setJdbcUrl(env.getProperty("spring.datasource.url"));
        // ds.setUsername(env.getProperty("spring.datasource.username"));
        // ds.setPassword(env.getProperty("spring.datasource.password"));
        ds.addDataSourceProperty("cachePrepStmts", true);
        ds.addDataSourceProperty("prepStmtCacheSize", 250);
        ds.addDataSourceProperty("prepStmtCacheSqlLimit", 2048);
        ds.addDataSourceProperty("useServerPrepStmts", true);
        return ds;
    }
}
Now autowire the template/MongoOperations in your DAO implementation or service and use it:
package com.fmr.FeedBackAPI.service.impl;

import java.util.ArrayList;
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.data.mongodb.core.MongoOperations;
import org.springframework.stereotype.Service;

import com.fmr.FeedBackAPI.dto.ConfigDTO;

import static org.springframework.data.mongodb.core.query.Criteria.where;
import static org.springframework.data.mongodb.core.query.Query.query;

@Service
public class PlanConfigService {

    @Autowired
    @Qualifier(value = "customMongoOps")
    MongoOperations mongoOps;

    public List<ConfigDTO> createConfigDTOList(List<ConfigDTO> configDTOList) {
        List<ConfigDTO> configList = new ArrayList<>();
        if (!mongoOps.collectionExists(ConfigDTO.class)) {
            mongoOps.createCollection("ConfigDTO_table");
        }
        // create the configDTOList
        mongoOps.insert(configDTOList, ConfigDTO.class);
        configList = mongoOps.findAll(ConfigDTO.class, "ConfigDTO_table");
        return configList;
    }

    public List<ConfigDTO> createConfigDTO(ConfigDTO configDTO) {
        List<ConfigDTO> configList = new ArrayList<>();
        if (!mongoOps.collectionExists(ConfigDTO.class)) {
            mongoOps.createCollection("ConfigDTO_table");
        }
        // create the configDTO
        mongoOps.save(configDTO);
        configList = mongoOps.find(query(where("planId").is(configDTO.getPlanId())), ConfigDTO.class, "ConfigDTO_table");
        return configList;
    }
}
Here is the application.properties (this is the default instance running locally):
spring.data.mongodb.host=localhost
spring.data.mongodb.port=27017
spring.data.mongodb.database=mongo_test
spring.data.mongodb.repositories=true
#
spring.datasource.url=jdbc:mongodb://localhost:27017/mongo_test
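As a side note: with newer MongoDB drivers, where MongoClientOptions has been replaced by MongoClientSettings, pool sizing can also be expressed through the standard MongoDB URI options (the values below are illustrative, not from the answer above):
# Hypothetical equivalent using standard MongoDB URI options
spring.data.mongodb.uri=mongodb://localhost:27017/mongo_test?maxPoolSize=100&minPoolSize=10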

How to archive a processed file in a multi-threaded step in Spring Batch?

I'm using a multi-threaded step while reading files from the resources. Let's say I have several files to be processed and multiple threads are processing the same file, so I'm not sure at which point in time all my files get processed.
Once a file is successfully processed, I need to archive/delete it. Can someone guide me on what I should use?
Here is my sample code.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;
import java.util.stream.Stream;

import javax.sql.DataSource;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecutionListener;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.StepExecution;
import org.springframework.batch.core.annotation.AfterStep;
import org.springframework.batch.core.annotation.BeforeStep;
import org.springframework.batch.core.configuration.annotation.BatchConfigurer;
import org.springframework.batch.core.configuration.annotation.DefaultBatchConfigurer;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.core.launch.support.RunIdIncrementer;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.NonTransientResourceException;
import org.springframework.batch.item.ParseException;
import org.springframework.batch.item.UnexpectedInputException;
import org.springframework.batch.support.transaction.ResourcelessTransactionManager;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.Resource;
import org.springframework.core.task.SimpleAsyncTaskExecutor;
import org.springframework.core.task.TaskExecutor;

import com.iana.spring.batch.dao.GenericDAO;
import com.iana.spring.batch.listener.BatchJobCompletionListener;

@Configuration
public class BatchConfig {

    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    @Autowired
    private JobLauncher jobLauncher;

    @Autowired
    private Job processJob;

    @Value("classpath*:/final/HYUMER_SI_*.txt")
    private Resource[] inputFiles;

    @Autowired
    @Qualifier("test2DataSource")
    private DataSource test2DataSource;

    public void saveFileLog(String fileLog) throws Exception {
        String query = "INSERT INTO FILE_LOG(LOG_INFO) VALUES (?)";
        new GenericDAO().saveOrUpdate(test2DataSource, query, false, fileLog);
    }

    // This job runs every 5 seconds
    //@Scheduled(fixedRate = 150000000)
    public void fixedRatedCallingMethod() {
        try {
            JobParameters jobParameters = new JobParametersBuilder()
                    .addLong("time", System.currentTimeMillis())
                    .toJobParameters();
            jobLauncher.run(processJob, jobParameters);
            System.out.println("I have been scheduled with Spring scheduler");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    /* In case of a multiple-DataSource configuration, we need to add the following code.
     * - It is a good practice to mark the Spring Batch database as @Primary to get the benefits of all the default
     *   functionality implemented by Spring Batch statistics.
     * - All insert and update batch job running statistics will be maintained by Spring Batch itself.
     * - No need to write any extra line of code.
     * Error: To use the default BatchConfigurer the context must contain no more than one DataSource, found 2
     */
    @Bean
    BatchConfigurer configurer(@Qualifier("testDataSource") DataSource dataSource) {
        return new DefaultBatchConfigurer(dataSource);
    }

    @Bean
    public Job processJob() throws Exception {
        return jobBuilderFactory.get("processJob")
                .incrementer(new RunIdIncrementer())
                .listener(listener())
                .flow(orderStep1())
                .end()
                .build();
    }

    @Bean
    public TaskExecutor taskExecutor() {
        SimpleAsyncTaskExecutor asyncTaskExecutor = new SimpleAsyncTaskExecutor("spring_batch");
        asyncTaskExecutor.setConcurrencyLimit(20);
        return asyncTaskExecutor;
    }

    @Bean
    public ItemReader<String> batchItemReader() {
        Queue<String> dataList = new LinkedList<String>();
        return new ItemReader<String>() {
            @BeforeStep
            public void beforeStep(StepExecution stepExecution) {
                System.err.println("in before step...");
                try {
                    if (inputFiles != null) {
                        for (int i = 0; i < inputFiles.length; i++) {
                            String fileName = inputFiles[i].getFile().getAbsolutePath();
                            try (Stream<String> stream = Files.lines(Paths.get(fileName))) {
                                stream.forEach(s -> dataList.add(s));
                            } catch (IOException e) {
                                e.printStackTrace();
                            }
                        }
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
                System.out.println("fileList Size::" + dataList.size());
            }

            @Override
            public synchronized String read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
                System.out.println("--> in item reader.........");
                String fileName = null;
                if (dataList.size() > 0) {
                    fileName = dataList.remove();
                    file_reading_cnt++;
                }
                return fileName;
            }

            @AfterStep
            public void afterStep(StepExecution stepExecution) {
                System.err.println("in after step..." + file_reading_cnt);
            }
        };
    }

    volatile int file_reading_cnt = 0;

    @Bean
    public ItemWriter<String> batchItemWriter() {
        return new ItemWriter<String>() {
            @Override
            public void write(List<? extends String> fileList) throws Exception {
                System.out.println("----- in item writer.........");
                fileList.forEach(data -> {
                    try {
                        saveFileLog(data);
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                });
            }
        };
    }

    /**
     * To create a step, the reader, processor and writer are passed serially.
     */
    @Bean
    public Step orderStep1() throws Exception {
        return stepBuilderFactory.get("orderStep1").<String, String>chunk(20)
                .reader(batchItemReader())
                .writer(batchItemWriter())
                .taskExecutor(taskExecutor())
                .throttleLimit(20)
                .build();
    }

    @Bean
    public JobExecutionListener listener() {
        return new BatchJobCompletionListener();
    }

    @Bean
    public ResourcelessTransactionManager transactionManager() {
        return new ResourcelessTransactionManager();
    }
}
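No accepted answer is reproduced here. One possible approach (a sketch under the assumption that archiving can wait until the whole job finishes, by which point every thread is done reading) is a JobExecutionListener that moves the input files on successful completion; ArchivingJobListener and the archive folder below are assumptions, not part of the code above:
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobExecutionListener;
import org.springframework.core.io.Resource;

// Sketch: move the input files to an archive folder after a successful run.
public class ArchivingJobListener implements JobExecutionListener {

    private final Resource[] inputFiles;
    private final Path archiveDir = Paths.get("archive"); // hypothetical target folder

    public ArchivingJobListener(Resource[] inputFiles) {
        this.inputFiles = inputFiles;
    }

    @Override
    public void beforeJob(JobExecution jobExecution) {
        // nothing to do before the job
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        if (jobExecution.getStatus() != BatchStatus.COMPLETED) {
            return; // keep the files in place for a retry
        }
        try {
            Files.createDirectories(archiveDir);
            for (Resource resource : inputFiles) {
                Path source = resource.getFile().toPath();
                Files.move(source, archiveDir.resolve(source.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}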

Spring Boot & MongoDB: how to remove the '_class' column?

When inserting data into MongoDB, Spring Data adds a custom "_class" field. Is there a way to eliminate the "_class" field when using Spring Boot and MongoDB?
Or do I need to create a custom type mapper?
Any hints or advice?
Dave's answer is correct. However, we generally recommend not doing this (that's why it's enabled by default in the first place), as you effectively throw away the ability to persist type hierarchies, or even a simple property set to e.g. Object. Assume the following type:
@Document
class MyDocument {
    private Object object;
}
If you now set object to a value, it will be happily persisted, but there's no way you can read the value back into its original type.
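A hypothetical illustration (assuming MyDocument has an id plus getters/setters, and CustomerId is some custom type):
MyDocument doc = new MyDocument();
doc.setObject(new CustomerId(42L)); // persisted as a plain nested document, with no type hint
mongoTemplate.save(doc);

// Without _class there is nothing telling the mapper the nested value was a CustomerId:
MyDocument loaded = mongoTemplate.findById(doc.getId(), MyDocument.class);
Object value = loaded.getObject(); // comes back as a generic Document/Map, not a CustomerId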
A more up-to-date answer to that question, working with embedded Mongo DB for test cases:
I quote from http://mwakram.blogspot.fr/2017/01/remove-class-from-mongo-documents.html
Spring Data MongoDB adds _class in the Mongo documents to handle the polymorphic behavior of Java inheritance. If you want to remove _class, just drop the following config class into your code.
package com.waseem.config;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.MongoDbFactory;
import org.springframework.data.mongodb.core.convert.DbRefResolver;
import org.springframework.data.mongodb.core.convert.DefaultDbRefResolver;
import org.springframework.data.mongodb.core.convert.DefaultMongoTypeMapper;
import org.springframework.data.mongodb.core.convert.MappingMongoConverter;
import org.springframework.data.mongodb.core.mapping.MongoMappingContext;

@Configuration
public class MongoConfig {

    @Autowired MongoDbFactory mongoDbFactory;
    @Autowired MongoMappingContext mongoMappingContext;

    @Bean
    public MappingMongoConverter mappingMongoConverter() {
        DbRefResolver dbRefResolver = new DefaultDbRefResolver(mongoDbFactory);
        MappingMongoConverter converter = new MappingMongoConverter(dbRefResolver, mongoMappingContext);
        converter.setTypeMapper(new DefaultMongoTypeMapper(null));
        return converter;
    }
}
Here is a slightly simpler approach:
@Configuration
public class MongoDBConfig implements InitializingBean {

    @Autowired
    @Lazy
    private MappingMongoConverter mappingMongoConverter;

    @Override
    public void afterPropertiesSet() throws Exception {
        mappingMongoConverter.setTypeMapper(new DefaultMongoTypeMapper(null));
    }
}
You can remove the _class with the following code, which you can use in your Mongo configuration class:
@Bean
public MongoTemplate mongoTemplate(MongoDatabaseFactory databaseFactory, MappingMongoConverter converter) {
    converter.setTypeMapper(new DefaultMongoTypeMapper(null));
    return new MongoTemplate(databaseFactory, converter);
}
I think you need to create a @Bean of type MongoTemplate and set the type converter explicitly. Details (non-Boot, but just extract the template config): http://www.mkyong.com/mongodb/spring-data-mongodb-remove-_class-column/
Similar to RZet's answer, but avoiding inheritance:
@Configuration
public class MongoConfiguration {

    @Autowired
    private MappingMongoConverter mappingMongoConverter;

    // remove _class
    @PostConstruct
    public void setUpMongoEscapeCharacterConversion() {
        mappingMongoConverter.setTypeMapper(new DefaultMongoTypeMapper(null));
    }
}
A simple way (also works for ReactiveMongoTemplate):
@Configuration
public class MongoDBConfig {

    @Autowired
    private MongoClient mongoClient;

    @Value("${spring.data.mongodb.database}")
    private String dbName;

    @Bean
    public ReactiveMongoTemplate reactiveMongoTemplate() {
        ReactiveMongoTemplate template = new ReactiveMongoTemplate(mongoClient, dbName);
        MappingMongoConverter converter = (MappingMongoConverter) template.getConverter();
        converter.setTypeMapper(new DefaultMongoTypeMapper(null));
        converter.afterPropertiesSet();
        return template;
    }
}
Add a converter to remove _class:
@Bean
public MongoTemplate mongoTemplate() {
    MappingMongoConverter converter =
            new MappingMongoConverter(mongoDbFactory(), new MongoMappingContext());
    converter.setTypeMapper(new DefaultMongoTypeMapper(null));
    return new MongoTemplate(mongoDbFactory(), converter);
}
The accepted answer above uses a number of deprecated dependencies. For example, if you check the code, it mentions MongoDbFactory, which is deprecated in the latest Spring release. If you happen to be using MongoDB with Spring Data in 2020, this solution is dated. For instant results, check this snippet of code. Works 100%.
Just create a new AppConfig.java file and paste this block of code. You'll see the "_class" property disappear from the MongoDB document.
package com.reddit.redditmain.Configuration;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.MongoDatabaseFactory;
import org.springframework.data.mongodb.core.convert.DbRefResolver;
import org.springframework.data.mongodb.core.convert.DefaultDbRefResolver;
import org.springframework.data.mongodb.core.convert.DefaultMongoTypeMapper;
import org.springframework.data.mongodb.core.convert.MappingMongoConverter;
import org.springframework.data.mongodb.core.mapping.MongoMappingContext;

@Configuration
public class AppConfig {

    @Autowired
    MongoDatabaseFactory mongoDbFactory;

    @Autowired
    MongoMappingContext mongoMappingContext;

    @Bean
    public MappingMongoConverter mappingMongoConverter() {
        DbRefResolver dbRefResolver = new DefaultDbRefResolver(mongoDbFactory);
        MappingMongoConverter converter = new MappingMongoConverter(dbRefResolver, mongoMappingContext);
        converter.setTypeMapper(new DefaultMongoTypeMapper(null));
        return converter;
    }
}
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.data.mongodb.MongoDatabaseFactory;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.convert.DefaultMongoTypeMapper;
import org.springframework.data.mongodb.core.convert.MappingMongoConverter;

@Configuration
public class MongoConfigWithAuditing {

    @Bean
    @Primary
    public MongoTemplate mongoTemplate(MongoDatabaseFactory mongoDatabaseFactory, MappingMongoConverter mappingMongoConverter) {
        // this is to avoid saving _class to the db
        mappingMongoConverter.setTypeMapper(new DefaultMongoTypeMapper(null));
        MongoTemplate mongoTemplate = new MongoTemplate(mongoDatabaseFactory, mappingMongoConverter);
        return mongoTemplate;
    }
}
Spring Boot 3 with reactive Mongo:
package es.dmunozfer.trading.config;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.autoconfigure.mongo.MongoProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.ReactiveMongoDatabaseFactory;
import org.springframework.data.mongodb.config.AbstractReactiveMongoConfiguration;
import org.springframework.data.mongodb.core.convert.DefaultMongoTypeMapper;
import org.springframework.data.mongodb.core.convert.MappingMongoConverter;
import org.springframework.data.mongodb.core.convert.MongoCustomConversions;
import org.springframework.data.mongodb.core.mapping.MongoMappingContext;
import org.springframework.data.mongodb.repository.config.EnableReactiveMongoRepositories;

import lombok.extern.slf4j.Slf4j;

@Slf4j
@Configuration
@EnableReactiveMongoRepositories("es.dmunozfer.trading.repository")
public class MongoConfig extends AbstractReactiveMongoConfiguration {

    @Autowired
    private MongoProperties mongoProperties;

    @Override
    protected String getDatabaseName() {
        return mongoProperties.getDatabase();
    }

    @Bean
    @Override
    public MappingMongoConverter mappingMongoConverter(ReactiveMongoDatabaseFactory databaseFactory,
            MongoCustomConversions customConversions, MongoMappingContext mappingContext) {
        MappingMongoConverter converter = super.mappingMongoConverter(databaseFactory, customConversions, mappingContext);
        converter.setTypeMapper(new DefaultMongoTypeMapper(null));
        return converter;
    }
}
I'm leaving this answer here in case someone wants to remove the _class from Kotlin; it also updates things a bit, since the previous answers use several deprecated dependencies.
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.Configuration
import org.springframework.data.mongodb.MongoDatabaseFactory
import org.springframework.data.mongodb.core.convert.DbRefResolver
import org.springframework.data.mongodb.core.convert.DefaultDbRefResolver
import org.springframework.data.mongodb.core.convert.DefaultMongoTypeMapper
import org.springframework.data.mongodb.core.convert.MappingMongoConverter
import org.springframework.data.mongodb.core.mapping.MongoMappingContext

@Configuration
internal class SpringMongoConfig {

    @Bean
    fun mappingMongoConverter(factory: MongoDatabaseFactory, context: MongoMappingContext): MappingMongoConverter {
        val dbRefResolver: DbRefResolver = DefaultDbRefResolver(factory)
        val mappingConverter = MappingMongoConverter(dbRefResolver, context)
        mappingConverter.setTypeMapper(DefaultMongoTypeMapper(null))
        return mappingConverter
    }
}
}