How to trim Strings before inserting to MongoDB using MongoTemplate? - mongodb

Basically, I am fetching data from another source and creating my db collections. But some of the data has spaces in the end, causing issues in frontend when used later.
Is there a generic way to trim all String fields of all Collections before inserting and updating to MongoDB using spring mongoTemplate configuration/code.
I don't want to write logic specific to each type of collection and each field in it. Also, is it a good practice to put such logic at DB repositories level?

Try to use AbstractMongoEventListener. For example, create create your own implementation of AbstractMongoEventListener:
#Component
class SaveMongoEventListener extends AbstractMongoEventListener<Object> {
#Override
public void onBeforeConvert(BeforeConvertEvent<Object> event) {
Object source = event.getSource();
for (Field field : source.getClass().getFields()) {
if (field.getType().isAssignableFrom(String.class)) {
try {
String value = (String) field.get(source);
field.setAccessible(true);
field.set(value != null ? value.trim(): value, source);
} catch (IllegalAccessException e) {
e.printStackTrace();
}
}
}
}
}
This implementation will trim all strings in your object before converting it to MongoDB object. The listener should work for all your collections. Do not forget to register this listener in Spring context.
To trim string after loading from MongoDB, you should do the same in onAfterConvert event.

Related

Any way I can change in runtime mongo document name

In the project we need to change collection name suffix everyday based on date.
So one day collection is named:
samples_22032019
and in the next day it is
samples_23032019
Everyday I need to change suffix and recompile spring-boot application because of this. Is there any way I can change this so the collection/table can be calculated dynamically based on current date? Any advice for MongoRepository?
Considering the below is your bean. you can use #Document annotation with spring expression language to resolve suffix at runtime. Like show below,
#Document(collection = "samples_#{T(com.yourpackage.Utility).getDateSuffix()}")
public class Samples {
private String id;
private String name;
}
Now have your date change function in a Utility method which spring can resolve at runtime. SpEL is handy in such scenarios.
package com.yourpackage;
public class Utility {
public static final String getDateSuffix() {
//Add your real logic here, below is for representational purpose only.
return DateTime.now().toDate().toString();;
}
}
HTH!
Make a cron job to run daily and generateNewName for your collection and execute the below code. Here I am getting collection using MongoDatabse than by using MongoNamespace we can rename the collection.
To get old/new collection name you can write a separate method.
#Component
public class RenameCollectionTask {
#Scheduled(cron = "${cron}")
public void renameCollection() {
// creating mongo client object
final MongoClient client = new MongoClient(HOST_NAME, PORT);
// selecting the mongo database
final MongoDatabase database = client.getDatabase("databaseName");
// selecting the mongo collection
final MongoCollection<Document> collection = database.getCollection("oldCollectionName");
// creating namespace
final MongoNamespace newName = new MongoNamespace("databaseName", "newCollectionName");
// renaming the collection
collection.renameCollection(newName);
System.out.println("Collection has been renamed");
// closing the client
client.close();
}
}
To assign the name of the collection you can refer this so that every time restart will not be required.
The renameCollection() method has the following limitations:
1) It cannot move a collection between databases.
2) It is not supported on sharded collections.
3) You cannot rename the views.
Refer this for detail.

How to use BeanWrapperFieldSetMapper to map a subset of fields?

I have a Spring batch application where BeanWrapperFieldSetMapper is used to map fields using a prototype object. However, the CSV file that is being read (via a FlatFileItemReader) contains one (indicator) field that determines the mapping of another field. If the indicator field has a value of Y, then the value of the another field should be mapped to property foo otherwise it should be mapped to property bar.
I know that I can use a custom FieldSetMapper to do this, but then I have to code the mapping all of the other fields (of which there are a quite a few). Alternatively, I could do this post reading via an ItemProcessor but then my domain (prototype) object must have a property representing the indicator field (which I prefer not to do since it is not really part of the business domain).
Is it possible to perhaps use a custom FieldSetMapper to only map these custom fields and delegate the other mappings to BeanWrapperFieldSetMapper? Or is there some other better way to solve for this?
Here is my current attempt to use a custom FieldSetMapper and delegate to BeanWrapperFieldSetMapper:
public class DelegatedFieldSetMapper extends BeanWrapperFieldSetMapper<MyProtoClass> {
#Override
public MyProtoClass mapFieldSet(FieldSet fieldSet) throws BindException {
String indicator = fieldSet.readString("indicator");
Properties fieldProperties = fieldSet.getProperties();
if (indicator.equalsIgnoreCase("y")) {
fieldProperties.put("test.foo", fieldSet.readString("value");
} else {
fieldProperties.put("test.bar", fieldSet.readString("value");
}
fieldProperties.remove("indicator");
Set<Object> keys = fieldProperties.keySet();
List<String> names = new ArrayList<String>();
List<String> values = new ArrayList<String>();
for (Object key : keys) {
names.add((String) key);
values.add((String) fieldProperties.getProperty((String) key));
}
DefaultFieldSet domainObjectFieldSet = new DefaultFieldSet(names.toArray(new String[names.size()]), values.toArray(new String[values.size()]));
return super.mapFieldSet(domainObjectFieldSet);
}
}
However, a FlatFileParseException is thrown. The relevant parts of the batch config class are as follows:
#Configuration
#EnableBatchProcessing
public class BatchConfiguration {
#Value("${file}")
private File file;
#Bean
#Scope("prototype")
public MyProtoClass () {
return new MyProtoClass();
}
#Bean
public ItemReader<MyProtoClass> reader(LineMapper<MyProtoClass> lineMapper) {
FlatFileItemReader<MyProtoClass> flatFileItemReader = new FlatFileItemReader<MyProtoClass>();
flatFileItemReader.setResource(new FileSystemResource(file));
final int NUMBER_OF_HEADER_LINES = 1;
flatFileItemReader.setLinesToSkip(NUMBER_OF_HEADER_LINES);
flatFileItemReader.setLineMapper(lineMapper);
return flatFileItemReader;
}
#Bean
public LineMapper<MyProtoClass> lineMapper(LineTokenizer lineTokenizer, FieldSetMapper<MyProtoClass> fieldSetMapper) {
DefaultLineMapper<MyProtoClass> lineMapper = new DefaultLineMapper<MyProtoClass>();
lineMapper.setLineTokenizer(lineTokenizer);
lineMapper.setFieldSetMapper(fieldSetMapper);
return lineMapper;
}
#Bean
public LineTokenizer lineTokenizer() {
DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
lineTokenizer.setNames(new String[] {"value", "test.bar", "test.foo", "indicator"});
return lineTokenizer;
}
#Bean
public FieldSetMapper<MyProtoClass> fieldSetMapper(PropertyEditor emptyStringToNullPropertyEditor) {
BeanWrapperFieldSetMapper<MyProtoClass> fieldSetMapper = new DelegatedFieldSetMapper();
fieldSetMapper.setPrototypeBeanName("myProtoClass");
Map<Class<String>, PropertyEditor> customEditors = new HashMap<Class<String>, PropertyEditor>();
customEditors.put(String.class, emptyStringToNullPropertyEditor);
fieldSetMapper.setCustomEditors(customEditors);
return fieldSetMapper;
}
Finally, the CSV flat file look like this:
value,bar,foo,indicator
abc,,,y
xyz,,,n
Let's say that BatchWorkObject is the class to be mapped.
Here's a sample code in Spring Boot style that needs only your custom logic to be added.
new BeanWrapperFieldSetMapper<BatchWorkObject>(){
{
this.setTargetType(BatchWorkObject.class);
}
#Override
public BatchWorkObject mapFieldSet(FieldSet fs)
throws BindException {
BatchWorkObject tmp= super.mapFieldSet(fs);
// your custom code here
return tmp;
}
});
The code actually accomplishes what is desired except for one issue that results in the FlatFileParseException. The DelegatedFieldSetMapper contains the issue as follows:
DefaultFieldSet domainObjectFieldSet = new DefaultFieldSet(names.toArray(new String[names.size()]), values.toArray(new String[values.size()]));
To resolve, change to:
DefaultFieldSet domainObjectFieldSet = new DefaultFieldSet(values.toArray(new String[values.size()]), names.toArray(new String[names.size()]));
Write your own FieldSetMapper with a set of prepared delegates inside.
Those delegates are pre-built for every different kind of fields mapping.
In your object route to correct delegate based on indicator field (with a Classifier, for example).
I can't see any other way, but this solution is quite easy and straightforward to maintain.
Processing based on the input format/data can be done using a custom implementation of ItemProcessor which is either changing values in the same entity (that was populated by IteamReader) or creates a new one output entity.

Is there a way to use Solr's streaming API with spring data solr?

I have a use case where I need to fetch the ids of my entire solr collection. For that, with solrj, I use the Streaming API like this :
CloudSolrServer server = new CloudSolrServer("zkHost1:2181,zkHost2:2181,zkHost3:2181");
SolrQuery query = new SolrQuery("*:*");
server.queryAndStreamResponse(tmpQuery, handler);
Where handler is a class that implements StreamingResponseCallback, ommited in my code for brevity.
Now, the Spring data repositories abstraction give me the ability to search by pages, by cursors, but I can't seem to find a way to handle the streaming use case.
Is there a workaround ?
SolrTemplate allows to access the underlying SolrClient in a callback style. So you could use that one to work around the current limitations.
The result conversion using the MappingSolrConverter available via the SolrTemplate is broken at the moment (I need to check why) - but you get the idea of how to do it.
solrTemplate.execute(new SolrCallback<Void>() {
#Override
public Void doInSolr(SolrClient solrClient) throws SolrServerException, IOException {
SolrQuery sq = new SolrQuery("*:*");
solrClient.queryAndStreamResponse("collection1", sq, new StreamingResponseCallback() {
#Override
public void streamSolrDocument(SolrDocument doc) {
// the bean conversion fails atm
// ExampleSolrBean bean = solrTemplate.getConverter().read(ExampleSolrBean.class, doc);
System.out.println(doc);
}
#Override
public void streamDocListInfo(long numFound, long start, Float maxScore) {
// do something useful
}
});
return null;
}
});

Ormlite and PostgreSQL - Error inserting text array with custom persister

I have been working to setup Ormlite as the primary data access layer between a PostgreSQL database and Java application. Everything has been fairly straightforward, until I started messing with PostgreSQL's array types. In my case, I have two tables that make use of text[] array type. Following the documentation, I created a custom data persister as below:
public class StringArrayPersister extends StringType {
private static final StringArrayPersister singleTon = new StringArrayPersister();
private StringArrayPersister() {
super(SqlType.STRING, new Class<?>[]{String[].class});
}
public static StringArrayPersister getSingleton() {
return singleTon;
}
#Override
public Object javaToSqlArg(FieldType fieldType, Object javaObject) {
String[] array = (String[]) javaObject;
if (array == null) {
return null;
} else {
String join = "";
for (String str : array) {
join += str +",";
}
return "'{" + join.substring(0,join.length() - 1) + "}'";
}
}
#Override
public Object sqlArgToJava(FieldType fieldType, Object sqlArg, int columnPos) {
String string = (String) sqlArg;
if (string == null) {
return null;
} else {
return string.replaceAll("[{}]","").split(",");
}
}
}
And then in my business object implementation, I set up the persister class on the column likeso:
#DatabaseField(columnName = TAGS_FIELD, persisterClass = StringArrayPersister.class)
private String[] tags;
When ever I try inserting a new record with the Dao.create statement, I get an error message saying tags is of type text[], but got character varying... However, when querying existing records from the database, the business object (and text array) load just fine.
Any ideas?
UPDATE:
PostGresSQL 9.2. The exact error message:
Caused by: org.postgresql.util.PSQLException: ERROR: column "tags" is
of type text[] but expression is of type character varying Hint: You
will need to rewrite or cast the expression.
I've not used ormlite before (I generally use MyBatis), however, I believe the proximal issue is this code:
private StringArrayPersister() {
super(SqlType.STRING, new Class<?>[]{String[].class});
}
SqlType.String is mapped to varchar in SQL in the ormlite code, and so therefore I believe is the proximal cause of the error you're getting. See ormlite SQL Data Types info for more detail on that.
Try changing it to this:
private StringArrayPersister() {
super(SqlType.OTHER, new Class<?>[]{String[].class});
}
There may be other tweaks necessary as well to get it fully up and running, but that should get you passed this particular error with the varchar type mismatch.

Using an Analyzer within a custom FieldBridge

I have a List getter method that I want to index (tokenized) into a number of fields.
I have a FieldBridge implementation that iterates over the list and indexes each string into a field with the index appended to the field name to give a different name for each.
I have two different Analyzer implementations (CaseSensitiveNGramAnalyzer and CaseInsensitiveNGramAnalyzer) that I want to use with this FieldBridge (to make a case-sensitive and a case-insensitive index of the field).
This is the FieldBridge I want to apply the Analyzers to:
public class StringListBridge implements FieldBridge
{
#Override
public void set(String name, Object value, Document luceneDocument, LuceneOptions luceneOptions)
{
List<String> strings = (List<String>) value;
for (int i = 0; i < strings.size(); i++)
{
addStringField(name + 1, strings.get(i), luceneDocument, luceneOptions);
}
}
private void addStringField(String fieldName, String fieldValue, Document luceneDocument, LuceneOptions luceneOptions)
{
Field field = new Field(fieldName, fieldValue, luceneOptions.getStore(), luceneOptions.getIndex(), luceneOptions.getTermVector());
field.setBoost(luceneOptions.getBoost());
luceneDocument.add(field);
}
}
Is it possible to apply an Analyzer to a field that uses a FieldBridge?
If so, can this be done with annotations, or does it have to be done programatically?
If the latter, can I inject the Analyzer as a parameter?
I am thinking along the lines of the following, but am not at all familiar with field token streams etc.:
private void addStringField(String fieldName, String fieldValue, Document luceneDocument, LuceneOptions luceneOptions)
{
Field field = new Field(fieldName, fieldValue, luceneOptions.getStore(), luceneOptions.getIndex(), luceneOptions.getTermVector());
field.setBoost(luceneOptions.getBoost());
try
{
field.setTokenStream(new CaseSensitiveNGramAnalyzer().reusableTokenStream(fieldName, new StringReader(fieldValue)));
}
catch (IOException e)
{
e.printStackTrace();
}
luceneDocument.add(field);
}
Is this a sane approach?
EDIT I have tried specifying the Analyzer and FieldBridge within a #Field annotation (without including the above analyzer code) as follows, but it appears to be using the default analyzer rather than those specified with analyzer = .
#Fields({
#Field(name="content-nocase",
index = Index.TOKENIZED,
analyzer = #Analyzer(impl = CaseInsensitiveNgramAnalyzer.class),
bridge = #FieldBridge(impl = StringListBridge.class)),
#Field(name = "content-case",
index = Index.TOKENIZED,
analyzer = #Analyzer(impl = CaseSensitiveNgramAnalyzer.class),
bridge = #FieldBridge(impl = StringListBridge.class)),
})
public List<String> getContents()
The solution atm is via a custom scoped analyzer or using #AnalyzerDiscriminator together with #AnalyzerDef. This is also discussed on the Hibernate Search forum - https://forum.hibernate.org/viewtopic.php?f=9&t=1016667
I managed to get this working. Hibernate Search appears not to use the specified Analyzer when both analyzer = and bridge = are specified, at least if the specified bridge creates multiple fields.
Manually passing the TokenStream from the desired analyzer to the generated Fields in the bridge got me the expected result:
private void addStringField(String fieldName, String fieldValue, Document luceneDocument, LuceneOptions luceneOptions)
{
Field field = new Field(fieldName, fieldValue, luceneOptions.getStore(), luceneOptions.getIndex(), luceneOptions.getTermVector());
field.setBoost(luceneOptions.getBoost());
// manually apply token stream from analyzer, as hibernate search does not
// apply the specified analyzer properly
try
{
field.setTokenStream(analyzer.reusableTokenStream(fieldName, new StringReader(fieldValue)));
}
catch (IOException e)
{
e.printStackTrace();
}
luceneDocument.add(field);
}
ParameterizedBridge is implemented to specify which analyzer to use (analyzer is instantiated and stored in a field before this method is called).