Writable Classes in mapreduce - class

How can I pass the values from the HashSet (the docid and offset) to the reducer's Writable, so that the map Writable and the reduce Writable are connected?
The mapper (LineIndexMapper) works fine, but in the reducer (LineIndexReducer) I get an error saying it can't take a String as an argument when I write this:
context.write(key, new IndexRecordWritable("some string"));
even though I have a public String toString() in the reduce Writable too.
I suspect the HashSet in the reducer's Writable (IndexRecordWritable.java) isn't receiving the values correctly.
I have the code below.
IndexMapRecordWritable.java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
public class IndexMapRecordWritable implements Writable {
private LongWritable offset;
private Text docid;
public LongWritable getOffsetWritable() {
return offset;
}
public Text getDocidWritable() {
return docid;
}
public long getOffset() {
return offset.get();
}
public String getDocid() {
return docid.toString();
}
public IndexMapRecordWritable() {
this.offset = new LongWritable();
this.docid = new Text();
}
public IndexMapRecordWritable(long offset, String docid) {
this.offset = new LongWritable(offset);
this.docid = new Text(docid);
}
public IndexMapRecordWritable(IndexMapRecordWritable indexMapRecordWritable) {
this.offset = indexMapRecordWritable.getOffsetWritable();
this.docid = indexMapRecordWritable.getDocidWritable();
}
@Override
public String toString() {
StringBuilder output = new StringBuilder();
output.append(docid);
output.append(offset);
return output.toString();
}
@Override
public void write(DataOutput out) throws IOException {
}
@Override
public void readFields(DataInput in) throws IOException {
}
}
IndexRecordWritable.java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.HashSet;
import org.apache.hadoop.io.Writable;
public class IndexRecordWritable implements Writable {
// Save each index record from maps
private HashSet<IndexMapRecordWritable> tokens = new HashSet<IndexMapRecordWritable>();
public IndexRecordWritable() {
}
public IndexRecordWritable(
Iterable<IndexMapRecordWritable> indexMapRecordWritables) {
}
@Override
public String toString() {
StringBuilder output = new StringBuilder();
return output.toString();
}
@Override
public void write(DataOutput out) throws IOException {
}
@Override
public void readFields(DataInput in) throws IOException {
}
}

Here is my answer, based on a few assumptions. Going by the pre-condition and post-condition comments in the reducer class, the final output is a text file containing the key and the file names separated by commas.
In that case, you don't really need the IndexRecordWritable class. You can simply write to your context using
context.write(key, new Text(valueBuilder.substring(0, valueBuilder.length() - 1)));
with the class declaration line as
public class LineIndexReducer extends Reducer<Text, IndexMapRecordWritable, Text, Text>
Don't forget to set the correct output class in the driver.
That should serve the purpose according to the post-condition in your reducer class. But if you really want to write a Text-IndexRecordWritable pair to your context, there are two ways to approach it:
with a String as the argument (based on your attempt to pass a string, although your IndexRecordWritable constructor is not designed to accept strings), and
with a HashSet as the argument (based on the HashSet initialised in the IndexRecordWritable class).
Since the constructor of your IndexRecordWritable class is not designed to accept a String, you cannot pass one. Hence the error saying you can't use a string as an argument. PS: if you want to pass a String, you must add another constructor to your IndexRecordWritable class, as below:
// Save each index record from maps
private HashSet<IndexMapRecordWritable> tokens = new HashSet<IndexMapRecordWritable>();
// to save the string
private String value;
public IndexRecordWritable() {
}
public IndexRecordWritable(
HashSet<IndexMapRecordWritable> indexMapRecordWritables) {
/***/
}
// to accept string
public IndexRecordWritable (String value) {
this.value = value;
}
but that won't be valid if you want to use the HashSet. So approach #1 is out: you can't pass a string.
That leaves approach #2: passing a HashSet as the argument, since you want to make use of the HashSet. In this case, you must build the HashSet in your reducer before passing it to IndexRecordWritable in context.write.
To do this, your reducer must look like this:
@Override
protected void reduce(Text key, Iterable<IndexMapRecordWritable> values, Context context) throws IOException, InterruptedException {
//StringBuilder valueBuilder = new StringBuilder();
HashSet<IndexMapRecordWritable> set = new HashSet<>();
for (IndexMapRecordWritable val : values) {
// Hadoop reuses the same value object on every iteration, so store a copy
set.add(new IndexMapRecordWritable(val.getOffset(), val.getDocid()));
//valueBuilder.append(val);
//valueBuilder.append(",");
}
//write the key and the adjusted value (removing the last comma)
//context.write(key, new IndexRecordWritable(valueBuilder.substring(0, valueBuilder.length() - 1)));
context.write(key, new IndexRecordWritable(set));
//valueBuilder.setLength(0);
}
and your IndexRecordWritable.java must have this.
// Save each index record from maps
private HashSet<IndexMapRecordWritable> tokens = new HashSet<IndexMapRecordWritable>();
// to save the string
//private String value;
public IndexRecordWritable() {
}
public IndexRecordWritable(
HashSet<IndexMapRecordWritable> indexMapRecordWritables) {
/***/
tokens.addAll(indexMapRecordWritables);
}
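One caveat when collecting reducer values into a collection: Hadoop hands the reducer the same value object on every iteration and only refills its fields, so storing the reference itself leaves you with repeated references to the last record. The plain-Java sketch below imitates that behaviour without Hadoop on the classpath. Entry, ReuseDemo and reusingIterable are hypothetical stand-ins, not part of the code above. It also gives the record value-based equals and hashCode, which a HashSet needs in order to recognise duplicates.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for IndexMapRecordWritable: a mutable (docid, offset) pair.
final class Entry {
    String docid;
    long offset;

    Entry(String docid, long offset) {
        this.docid = docid;
        this.offset = offset;
    }

    // Value-based equality so a HashSet treats equal records as duplicates.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Entry)) return false;
        Entry other = (Entry) o;
        return offset == other.offset && docid.equals(other.docid);
    }

    @Override
    public int hashCode() {
        return Objects.hash(docid, offset);
    }
}

final class ReuseDemo {
    // Imitates the reducer's value iterator: one shared object, refilled on every next().
    static Iterable<Entry> reusingIterable(String[] docids, long[] offsets) {
        Entry shared = new Entry("", 0L);
        return () -> new Iterator<Entry>() {
            private int i = 0;

            @Override
            public boolean hasNext() {
                return i < docids.length;
            }

            @Override
            public Entry next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.docid = docids[i];
                shared.offset = offsets[i];
                i++;
                return shared;
            }
        };
    }

    // Stores the iterator's object directly: every element is the same reference.
    static List<Entry> collectWithoutCopy(Iterable<Entry> values) {
        List<Entry> out = new ArrayList<>();
        for (Entry e : values) out.add(e);
        return out;
    }

    // Copies each record before storing it, so the set keeps distinct values.
    static Set<Entry> collectWithCopy(Iterable<Entry> values) {
        Set<Entry> out = new HashSet<>();
        for (Entry e : values) out.add(new Entry(e.docid, e.offset));
        return out;
    }
}
```

The uncopied variant uses a List only to make the aliasing visible; mutating an object after it has been added to a HashSet is worse still, because its hash code changes underneath the set.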
Remember, this is not what the description of your reducer asks for, where it says:
POST-CONDITION: emit the output a single key-value where all the file names are separated by a comma ",". <"marcello", "a.txt#3345,b.txt#344,c.txt#785">
If you still choose to emit (Text, IndexRecordWritable), remember to process the HashSet in IndexRecordWritable to get the output into the desired format.
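Note also that write and readFields are empty in both classes of the question, so no field data would actually travel from map to reduce. A rough sketch of how such methods are usually filled in follows. It relies only on java.io.DataInput and java.io.DataOutput, so it runs without Hadoop; Posting and PostingList are hypothetical stand-ins for IndexMapRecordWritable and IndexRecordWritable, and the docid#offset format is taken from the post-condition quoted above. In the real classes the same bodies would sit in the Writable overrides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical stand-in for IndexMapRecordWritable.
final class Posting {
    String docid = "";
    long offset;

    Posting() {}

    Posting(String docid, long offset) {
        this.docid = docid;
        this.offset = offset;
    }

    // Fields must be written and read back in the same order.
    void write(DataOutput out) throws IOException {
        out.writeUTF(docid);
        out.writeLong(offset);
    }

    void readFields(DataInput in) throws IOException {
        docid = in.readUTF();
        offset = in.readLong();
    }

    @Override
    public String toString() {
        return docid + "#" + offset;
    }
}

// Hypothetical stand-in for IndexRecordWritable; a List keeps the order deterministic.
final class PostingList {
    final List<Posting> postings = new ArrayList<>();

    // Length-prefixed encoding: element count first, then each element.
    void write(DataOutput out) throws IOException {
        out.writeInt(postings.size());
        for (Posting p : postings) p.write(out);
    }

    void readFields(DataInput in) throws IOException {
        postings.clear(); // the instance may be reused, so drop any old state first
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            Posting p = new Posting();
            p.readFields(in);
            postings.add(p);
        }
    }

    // Comma-separated output, with no trailing comma to trim.
    @Override
    public String toString() {
        StringJoiner joiner = new StringJoiner(",");
        for (Posting p : postings) joiner.add(p.toString());
        return joiner.toString();
    }

    // Serializes to a byte array and reads it back, as a quick self-check.
    static PostingList roundTrip(PostingList original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            original.write(new DataOutputStream(bytes));
            PostingList copy = new PostingList();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a HashSet instead of a List the encoding is the same, but the order of the comma-separated entries is then unspecified.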

Spring batch ItemReader locale, import a double with comma

I want to import the following file with Spring Batch
key;value
A;9,5
I model it with the bean
class CsvModel
{
String key
Double value
}
The code shown here is Groovy, but the language is irrelevant to the problem.
@Bean
@StepScope
FlatFileItemReader<CsvModel> reader2()
{
// set the locale for the tokenizer, but this doesn't solve the problem
def locale = Locale.getDefault()
def fieldSetFactory = new DefaultFieldSetFactory()
fieldSetFactory.setNumberFormat(NumberFormat.getInstance(locale))
def tokenizer = new DelimitedLineTokenizer(';')
tokenizer.setNames([ 'key', 'value' ].toArray() as String[])
// and assign the fieldSetFactory to the tokenizer
tokenizer.setFieldSetFactory(fieldSetFactory)
def fieldMapper = new BeanWrapperFieldSetMapper<CsvModel>()
fieldMapper.setTargetType(CsvModel.class)
def lineMapper = new DefaultLineMapper<CsvModel>()
lineMapper.setLineTokenizer(tokenizer)
lineMapper.setFieldSetMapper(fieldMapper)
def reader = new FlatFileItemReader<CsvModel>()
reader.setResource(new FileSystemResource('output/export.csv'))
reader.setLinesToSkip(1)
reader.setLineMapper(lineMapper)
return reader
}
Setting up a reader is well known; what was new for me was the first code block: setting up a numberFormat / locale / fieldSetFactory and assigning it to the tokenizer. However, this doesn't work; I still receive the exception
Field error in object 'target' on field 'value': rejected value [5,0]; codes [typeMismatch.target.value,typeMismatch.value,typeMismatch.float,typeMismatch]; arguments [org.springframework.context.support.DefaultMessageSourceResolvable: codes [target.value,value]; arguments []; default message [value]]; default message [Failed to convert property value of type 'java.lang.String' to required type 'float' for property 'value'; nested exception is java.lang.NumberFormatException: For input string: "9,5"]
at org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper.mapFieldSet(BeanWrapperFieldSetMapper.java:200) ~[spring-batch-infrastructure-4.1.2.RELEASE.jar:4.1.2.RELEASE]
at org.springframework.batch.item.file.mapping.DefaultLineMapper.mapLine(DefaultLineMapper.java:43) ~[spring-batch-infrastructure-4.1.2.RELEASE.jar:4.1.2.RELEASE]
at org.springframework.batch.item.file.FlatFileItemReader.doRead(FlatFileItemReader.java:180) ~[spring-batch-infrastructure-4.1.2.RELEASE.jar:4.1.2.RELEASE]
So the question is: how do I import floats in the locale de_AT (we write our decimals with a comma, like this: 3,141592)? I could avoid this problem with a FieldSetMapper, but I want to understand what's going on here and avoid the unnecessary mapper class.
And even the FieldSetMapper solution doesn't obey locales out of the box; I have to read a string and convert it to a double myself:
class PnwExportFieldSetMapper implements FieldSetMapper<CsvModel>
{
private nf = NumberFormat.getInstance(Locale.getDefault())
@Override
CsvModel mapFieldSet(FieldSet fieldSet) throws BindException
{
def model = new CsvModel()
model.key = fieldSet.readString(0)
model.value = nf.parse(fieldSet.readString(1)).doubleValue()
return model
}
}
The class DefaultFieldSet has a function setNumberFormat, but when and where do I call this function?
Unfortunately, this seems to be a bug. I have the same problem and debugged into the code.
The BeanWrapperFieldSetMapper does not use the methods of DefaultFieldSetFactory, which would do the right conversion, but instead just uses FieldSet.getProperties and does the conversion itself.
So I see the following options: provide the BeanWrapperFieldSetMapper with either PropertyEditors or a ConversionService, or use a different mapper.
Here is a sketch of a ConversionService:
private static class CS implements ConversionService {
@Override
public boolean canConvert(Class<?> sourceType, Class<?> targetType) {
return sourceType == String.class && targetType == double.class;
}
@Override
public boolean canConvert(TypeDescriptor sourceType, TypeDescriptor targetType) {
return sourceType.equals(TypeDescriptor.valueOf(String.class)) &&
targetType.equals(TypeDescriptor.valueOf(double.class)) ;
}
@Override
public <T> T convert(Object source, Class<T> targetType) {
return (T)Double.valueOf(source.toString().replace(',', '.'));
}
@Override
public Object convert(Object source, TypeDescriptor sourceType, TypeDescriptor targetType) {
return Double.valueOf(source.toString().replace(',', '.'));
}
}
and use it:
final BeanWrapperFieldSetMapper<IBISRecord> mapper = new BeanWrapperFieldSetMapper<>();
mapper.setTargetType(YourClass.class);
mapper.setConversionService(new CS());
...
new FlatFileItemReaderBuilder<IBISRecord>()
.name("YourReader")
.delimited()
.delimiter(";")
.includedFields(fields)
.names(names)
.fieldSetMapper(mapper)
.saveState(false)
.resource(resource)
.build();
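The replace(',', '.') conversion in the sketch above works for plain values such as 9,5, but it ignores the locale and would misread input that contains grouping separators. A locale-aware variant could delegate to java.text.NumberFormat instead. The following is a minimal, stdlib-only sketch of that parsing step; LocaleDoubleParser and parseStrict are hypothetical names, and wiring the parser into a ConversionService or PropertyEditor is left out.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

// Hypothetical helper: parses decimal text according to a fixed locale.
final class LocaleDoubleParser {
    private final Locale locale;

    LocaleDoubleParser(Locale locale) {
        this.locale = locale;
    }

    // Parses the whole string or fails; trailing garbage is not silently ignored.
    double parseStrict(String text) {
        // NumberFormat instances are not thread-safe, so create one per call.
        NumberFormat format = NumberFormat.getInstance(locale);
        String trimmed = text.trim();
        ParsePosition position = new ParsePosition(0);
        Number parsed = format.parse(trimmed, position);
        if (parsed == null || position.getIndex() != trimmed.length()) {
            throw new NumberFormatException(
                    "Not a number in locale " + locale + ": " + text);
        }
        return parsed.doubleValue();
    }
}
```

For the file in the question, new LocaleDoubleParser(Locale.forLanguageTag("de-AT")).parseStrict("9,5") would be the call that replaces the manual comma substitution.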

How can I ignore a "$" in a DocumentContent to save in MongoDB?

My problem is that if I save a document with a $ inside the content, MongoDB gives me an exception:
java.lang.IllegalArgumentException: Invalid BSON field name $ xxx
I would like MongoDB to ignore the $ character in the content.
My application is written in Java. I read the content of the file and put it as a string into an object. After that, the object is saved with a MongoRepository class.
Does anyone have any ideas?
Example content
Edit: I heard MongoDB has the same problem with dots. For our Spring Boot application we found a workaround for dots, but not for dollars.
How to configure mongo converter in spring to encode all dots in the keys of map being saved in mongo db
If you are using Spring Boot, you can extend the MappingMongoConverter class and override the methods that do the escaping/unescaping.
@Component
public class MappingMongoConverterCustom extends MappingMongoConverter {
protected @Nullable
String mapKeyDollarReplacemant = "characters_to_replace_dollar";
protected @Nullable
String mapKeyDotReplacement = "characters_to_replace_dot";
public MappingMongoConverterCustom(DbRefResolver dbRefResolver, MappingContext<? extends MongoPersistentEntity<?>, MongoPersistentProperty> mappingContext) {
super(dbRefResolver, mappingContext);
}
@Override
protected String potentiallyEscapeMapKey(String source) {
if (!source.contains(".") && !source.contains("$")) {
return source;
}
if (mapKeyDotReplacement == null && mapKeyDollarReplacemant == null) {
throw new MappingException(String.format(
"Map key %s contains dots or dollars but no replacement was configured! Make "
+ "sure map keys don't contain dots or dollars in the first place or configure an appropriate replacement!",
source));
}
String result = source;
if(result.contains(".")) {
result = result.replaceAll("\\.", mapKeyDotReplacement);
}
if(result.contains("$")) {
result = result.replaceAll("\\$", mapKeyDollarReplacemant);
}
//add any other replacements you need
return result;
}
@Override
protected String potentiallyUnescapeMapKey(String source) {
String result = source;
if(mapKeyDotReplacement != null) {
result = result.replaceAll(mapKeyDotReplacement, "\\.");
}
if(mapKeyDollarReplacemant != null) {
result = result.replaceAll(mapKeyDollarReplacemant, "\\$");
}
//add any other replacements you need
return result;
}
}
If you go with this approach make sure you override the default converter from AbstractMongoConfiguration like below:
@Configuration
public class MongoConfig extends AbstractMongoConfiguration{
@Bean
public DbRefResolver getDbRefResolver() {
return new DefaultDbRefResolver(mongoDbFactory());
}
@Bean
@Override
public MappingMongoConverter mappingMongoConverter() throws Exception {
MappingMongoConverterCustom converter = new MappingMongoConverterCustom(getDbRefResolver(), mongoMappingContext());
converter.setCustomConversions(customConversions());
return converter;
}
.... whatever you might need extra ...
}
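The two overrides in the converter boil down to a reversible string substitution. The sketch below isolates that logic in plain Java so it can be checked without a MongoDB instance. It uses String.replace, which treats both arguments literally, rather than replaceAll, which interprets the first argument as a regular expression and the second as a replacement pattern. The class name MapKeyEscaper and the choice of the fullwidth characters U+FF0E and U+FF04 as replacements are assumptions made for illustration.

```java
// Hypothetical helper: escapes the characters MongoDB rejects in field names.
final class MapKeyEscaper {
    // Assumed replacements: fullwidth full stop and fullwidth dollar sign.
    private static final String DOT_REPLACEMENT = "\uFF0E";
    private static final String DOLLAR_REPLACEMENT = "\uFF04";

    private MapKeyEscaper() {}

    // Replaces every '.' and '$' so the key is acceptable as a field name.
    static String escape(String key) {
        return key.replace(".", DOT_REPLACEMENT).replace("$", DOLLAR_REPLACEMENT);
    }

    // Reverses escape(); only lossless if original keys never contain the replacements.
    static String unescape(String key) {
        return key.replace(DOT_REPLACEMENT, ".").replace(DOLLAR_REPLACEMENT, "$");
    }
}
```

Whatever replacement strings are chosen, they must not occur in real keys, otherwise unescaping turns them into dots or dollars that were never there.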

GWT celltable. How to set cell css style after changing value in EditTextCell

Based on the user's entry in an editable cell, I would like to apply a specific style. What I am trying to do is very basic validation.
I have already tried overriding getCellStyleNames in an anonymous new Column() {}. That works on the initial render, based on the model values, but what would work after the user changes the value of that cell?
Please help.
You're very close.
Overriding getCellStyleNames in the new Column() {} is the first half of the solution.
The second half:
yourColumn.setFieldUpdater(new FieldUpdater<DataType, String>() {
@Override
public void update(int index, DataType object, String value) {
// the following line will apply the correct css
// based on the current cell value
cellTable.redrawRow(index);
}
});
Hope this helps!
The following code is a trivial but complete example.
A CellTable with two columns is defined. Each cell in the first column displays a simple question. Each cell in the second column is an editable cell that lets you enter your answer to the question shown in the first column. If your answer is correct, the text of the answer is styled bold and black. Otherwise, the text is styled red with normal font weight.
Source code of the trivial GWT app:
import com.google.gwt.cell.client.Cell;
import com.google.gwt.cell.client.EditTextCell;
import com.google.gwt.cell.client.FieldUpdater;
import com.google.gwt.core.client.EntryPoint;
import com.google.gwt.user.cellview.client.CellTable;
import com.google.gwt.user.cellview.client.Column;
import com.google.gwt.user.cellview.client.TextColumn;
import com.google.gwt.user.client.ui.RootPanel;
import java.util.Arrays;
import java.util.List;
public class CellTableExample implements EntryPoint {
private static class Question {
private final String question;
private final String correctAnswer;
private String userProvidedAnswer;
public Question(String question, String correctAnswer) {
this.question = question;
this.correctAnswer = correctAnswer;
this.userProvidedAnswer = "";
}
public String getQuestion() {
return question;
}
public String getCorrectAnswer() {
return correctAnswer;
}
public String getUserProvidedAnswer() {
return userProvidedAnswer;
}
public void setUserProvidedAnswer(String userProvidedAnswer) {
this.userProvidedAnswer = userProvidedAnswer;
}
}
private static final List<Question> questionList = Arrays.asList(
new Question("Which city is capital of England?", "London"),
new Question("Which city is capital of Japan?", "Tokyo"));
@Override
public void onModuleLoad() {
final CellTable<Question> cellTable = new CellTable<>();
TextColumn<Question> questionCol = new TextColumn<Question>() {
@Override
public String getValue(Question object) {
return object.getQuestion();
}
};
Column<Question, String> ansCol = new Column<Question, String>(new EditTextCell()) {
@Override
public String getValue(Question object) {
return object.getUserProvidedAnswer();
}
@Override
public String getCellStyleNames(Cell.Context context, Question object) {
if (object.getUserProvidedAnswer().equalsIgnoreCase(object.getCorrectAnswer())) {
return "correct-answer";
} else {
return "wrong-answer";
}
}
};
ansCol.setFieldUpdater(new FieldUpdater<Question, String>() {
@Override
public void update(int index, Question object, String value) {
object.setUserProvidedAnswer(value);
cellTable.redrawRow(index);
}
});
cellTable.addColumn(questionCol, "Question");
cellTable.addColumn(ansCol, "Your Answer");
cellTable.setRowData(0, questionList);
RootPanel.get().add(cellTable);
}
}
Companion css file:
.correct-answer {
font-weight: bold;
color: black;
}
.wrong-answer {
font-weight: normal;
color: red;
}
Screenshot1: Right after the app started. The column of answers was empty.
Screenshot2: After I entered answers. Apparently I answered the first one correctly but not the second one.
Use setCellStyleNames inside the render method:
Column<MyType, String> testColumn = new Column<MyType, String>(new EditTextCell()) {
@Override
public String getValue(MyType object) {
return object.getValue();
}
@Override
public void render(Context context, MyType object, SafeHtmlBuilder sb) {
if(object.isValid())
setCellStyleNames("validStyleNames");
else
setCellStyleNames("invalidStyleNames");
super.render(context, object, sb);
}
};
The correct approach is to override the getCellStyleNames method in the anonymous class:
new Column<Model, String>(new EditTextCell())
You have to return the name of the CSS class as a string.

Singleton class with updated parameters in java

public class ThreadSafeSingleton implements Serializable {
@Override
public String toString() {
return "ThreadSafeSingleton [i=" + i + ", str=" + str + "]";
}
int i;
String str;
private static ThreadSafeSingleton instance;
public int getI() {
return i;
}
public void setI(int i) {
this.i = i;
}
public String getStr() {
return str;
}
public void setStr(String str) {
this.str = str;
}
private ThreadSafeSingleton(){
}
public static synchronized ThreadSafeSingleton getInstance(int i,String str){
if(instance == null){
synchronized (ThreadSafeSingleton.class) {
if(instance == null){
instance = new ThreadSafeSingleton();
}
}
}
instance.setI(i);
instance.setStr(str);
return instance;
}
public Object readResolve(){
System.out.println("readResolve executed");
return getInstance(this.i,this.str);
}
public static void main(String[] args) throws IOException, Exception {
FileOutputStream fos = new FileOutputStream(
"B://Serilization//text1.txt");
ObjectOutputStream oos = new ObjectOutputStream(fos);
ThreadSafeSingleton obj = new ThreadSafeSingleton();
obj.setI(1);
obj.setStr("katrina kaif");
oos.writeObject(obj);
System.out.println("serilization done");
FileInputStream fis = new FileInputStream("B://Serilization//text1.txt");
ObjectInputStream ois = new ObjectInputStream(fis);
ThreadSafeSingleton copy=(ThreadSafeSingleton) ois.readObject();
System.out.println("copy "+copy);
System.out.println("deserilization done");
}
}
In the above code I have a singleton class with an int i and a String str attribute, and I have implemented the Serializable interface. My requirement is this: I serialize the class with some attribute values on one JVM, and when I deserialize it on another JVM I should get the same instance of my singleton class, but with its attributes updated to the values I provided during serialization.
Searching the internet, the solution I found was to use the readResolve method, where you can write logic that sets the attributes to the values provided during serialization. If you look at my readResolve, I have written "return getInstance(this.i,this.str);". Here I have used the "this" keyword, which means a current object is being used, hence my question.
My doubt is whether this code creates a new object, since "this" refers to a current object that is separate from the one created in the getInstance(int i,String str) method. Can anybody please explain whether this breaks the singleton?
You may want to read up on Java serialization: readObject() vs. readResolve(). When readResolve() is called, your object has already been deserialized from the stream and fully created. The this reference is, in that case, the object that the deserialization process constructed, complete with the i and str values from the stream. So a second object does exist briefly, but because readResolve() returns the result of getInstance(...), that temporary object is discarded and callers only ever see the singleton. Passing this.i and this.str to getInstance(...) only copies the values from the stream into the JVM's single instance (creating it first if it does not exist yet).
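To make that concrete, here is a small self-contained sketch that serializes a singleton to a byte array and reads it back within one JVM. CanonicalSingleton, update, toBytes and fromBytes are hypothetical names, and the eagerly created INSTANCE is a simplification of the lazy getInstance in the question. The point it illustrates is that readResolve runs on the temporary deserialized object and hands back the canonical instance after copying the state across.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical singleton whose state is refreshed from the serialized form.
final class CanonicalSingleton implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final CanonicalSingleton INSTANCE = new CanonicalSingleton();

    private int i;
    private String str;

    private CanonicalSingleton() {}

    static CanonicalSingleton getInstance() {
        return INSTANCE;
    }

    synchronized void update(int i, String str) {
        this.i = i;
        this.str = str;
    }

    synchronized int getI() {
        return i;
    }

    synchronized String getStr() {
        return str;
    }

    // Runs on the temporary object that deserialization has already built.
    // `this` carries the values from the stream; the return value replaces it.
    private Object readResolve() {
        INSTANCE.update(this.i, this.str);
        return INSTANCE;
    }

    static byte[] toBytes(CanonicalSingleton source) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static CanonicalSingleton fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (CanonicalSingleton) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If the instance is updated to (1, "first"), serialized, changed to (2, "second") and then deserialized, fromBytes returns the very same object as getInstance(), and its state is back to (1, "first").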

Get and Set attribute values of a class using aspectJ

I am using AspectJ to add some fields to an existing class and also annotate it.
I am using load-time weaving.
Example: I have a class Customer to which I am adding 3 String attributes. My issue is that I have to set some values, and also get them, before my business call.
I am trying the approach below.
In my .aj file I have added the code below. My problem is in the Around pointcut: how do I get and set the attributes?
public String net.customers.PersonCustomer.getOfflineRiskCategory() {
return OfflineRiskCategory;
}
public void net.customers.PersonCustomer.setOfflineRiskCategory(String offlineRiskCategory) {
OfflineRiskCategory = offlineRiskCategory;
}
public String net.customers.PersonCustomer.getOnlineRiskCategory() {
return OnlineRiskCategory;
}
public void net.customers.PersonCustomer.setOnlineRiskCategory(String onlineRiskCategory) {
OnlineRiskCategory = onlineRiskCategory;
}
public String net.customers.PersonCustomer.getPersonCommercialStatus() {
return PersonCommercialStatus;
}
public void net.customers.PersonCustomer.setPersonCommercialStatus(String personCommercialStatus) {
PersonCommercialStatus = personCommercialStatus;
}
@Around("execution(* net.xxx.xxx.xxx.DataMigration.populateMap(..))")
public Object invoke(ProceedingJoinPoint joinPoint) throws Throwable {
Object arguments[] = joinPoint.getArgs();
if (arguments != null) {
HashMap<String, String> hMap = (HashMap) arguments[0];
PersonCustomer cus = (PersonCustomer) arguments[1];
return joinPoint.proceed();
}
return joinPoint.proceed();
}
If anyone has ideas please let me know.
regards,
FT
First suggestion: I would avoid mixing code-style AspectJ with annotation-style. I.e., instead of @Around, use around.
Second, instead of getting the arguments from the joinPoint, you should bind them in the pointcut:
Object around(Map map, PersonCustomer cust) :
execution(* net.xxx.xxx.xxx.DataMigration.populateMap(Map, PersonCustomer)) && args(map, cust) {
...
return proceed(map, cust);
}
Now, to answer your question: you also need to use intertype declarations to add new fields to your class, so do something like this:
private String net.customers.PersonCustomer.OfflineRiskCategory;
private String net.customers.PersonCustomer.OnlineRiskCategory;
private String net.customers.PersonCustomer.PersonCommercialStatus;
Note that the private keyword here means private to the aspect, not to the class that you declare it on.