Hazelcast is calling loadAll() TWICE for the same key - distributed-computing

As far as I'm aware, when using a MapLoader, Hazelcast calls loadAllKeys() once, on the member that owns the partition to which the map's name hashes.
loadAll(Collection<Long> keys) is then called only on the members owning the partitions of the keys returned by loadAllKeys(). After this, the values are distributed around the cluster as needed.
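For reference, the MapLoader contract described above is roughly the following (a sketch of the Hazelcast 3.x interface, not a verbatim copy of the Hazelcast sources):
import java.util.Collection;
import java.util.Map;

// Rough shape of the Hazelcast 3.x MapLoader contract: loadAllKeys() supplies the initial
// key set, and loadAll(keys) is then asked for the values of the keys a partition owns.
public interface MapLoader<K, V> {
    V load(K key);                          // load a single missing entry on demand
    Map<K, V> loadAll(Collection<K> keys);  // bulk-load values during initial population
    Iterable<K> loadAllKeys();              // enumerate all keys for the initial load
}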
I'm performing a basic test with one node, one map and one record in my persistent store.
What I'm finding is that loadAllKeys() is correctly called once; however, loadAll(Collection<Long> keys) is called twice. Why is this the case?
My implementation of loadAll(Collection<Long> keys) is as follows:
@Override
public synchronized Map<Long, MyCacheEntry> loadAll(Collection<Long> keys) {
    return myCacheRepository.loadMyCacheDataForKeys(keys)
            .stream()
            .collect(Collectors.toMap(MyCacheEntity::getId,
                    entity -> gson.fromJson(entity.getValue(), MyCacheEntry.class)));
}
This means that I am doing two lookups against my persistent storage instead of one. Since I have only one record in my database, I would expect loadAll(Collection<Long> keys) to be called only once.
What is happening here?
My crude test Hazelcast/Spring configuration is as follows:
@Configuration
public class HazelcastConfiguration {

    private final MyMapStore myMapStore;

    @Inject
    HazelcastConfiguration(@Lazy MyMapStore myMapStore) {
        this.myMapStore = myMapStore;
    }

    @PreDestroy
    public void shutdown() {
        Hazelcast.shutdownAll();
    }

    @Bean
    public HazelcastInstance hazelcastInstance() {
        Config config = new Config();
        config.getGroupConfig().setName("MyGroup");

        NetworkConfig networkConfig = config.getNetworkConfig();
        networkConfig.setPortAutoIncrement(false);

        JoinConfig joinConfig = networkConfig.getJoin();
        joinConfig.getMulticastConfig().setEnabled(false);
        joinConfig.getTcpIpConfig().setEnabled(true).setMembers(Collections.singletonList("127.0.0.1"));

        MapConfig mapConfig = new MapConfig("MyMap");
        mapConfig.setBackupCount(1);
        mapConfig.setAsyncBackupCount(1);
        mapConfig.setInMemoryFormat(InMemoryFormat.OBJECT);
        mapConfig.setTimeToLiveSeconds(0);

        EntryListenerConfig entryListenerConfig = new EntryListenerConfig();
        entryListenerConfig.setImplementation(new MyCacheEntryListener());
        mapConfig.addEntryListenerConfig(entryListenerConfig);

        MapStoreConfig mapStoreConfig = new MapStoreConfig();
        mapStoreConfig.setInitialLoadMode(MapStoreConfig.InitialLoadMode.EAGER);
        mapStoreConfig.setWriteDelaySeconds(1);
        mapStoreConfig.setImplementation(myMapStore);
        mapConfig.setMapStoreConfig(mapStoreConfig);

        config.addMapConfig(mapConfig);
        return Hazelcast.newHazelcastInstance(config);
    }
}

Related

Kafka state-store sharing across sub-topologies

I am trying to create a custom joining consumer to join multiple events.
I have created a topology which has four sub-topologies (subtopology-0, subtopology-1, subtopology-2, subtopology-3), not in the exact order described by topology.describe().
I have created a state-store in three of the sub-topologies (subtopology-0, subtopology-1, subtopology-2) and am trying to attach all of the state-stores created in the different sub-topologies using .connectProcessorAndStateStores("PROCESS2", "COUNTS"), as per the Kafka developer guide https://kafka.apache.org/0110/documentation/streams/developer-guide
Here is the code snippet of how I am creating and attaching processors to the topology.
class StreamCustomizer implements KafkaStreamsInfrastructureCustomizer {

    public void someMethod(StreamsBuilder builder) {
        Topology topology = builder.build();
        topology.addProcessor("Processor1", new Processor() {...}, "state-store-1")
                .addStateStore(store1, ...);
        topology.addProcessor("Processor2", new Processor() {...}, "state-store-1")
                .addStateStore(store1, ...);
        topology.addProcessor("Processor3", new Processor() {...}, "state-store-1")
                .addStateStore(store1, ...);
        topology.addProcessor("Processor4", new Processor4() {...}, "Processor1", "Processor2", "Processor3")
                .connectProcessorAndStateStores("Processor4", "state-store-1", "state-store-2", "state-store-3");
    }
}
This is how the processor is defined for all the sub-topologies described above:
new Processor() {
    private ProcessorContext context;
    private KeyValueStore<K, V> store;

    public void init(ProcessorContext context) {
        this.context = context;
        store = (KeyValueStore<K, V>) context.getStateStore("store-name");
    }
}
This is how Processor4 is written, with all the state-stores retrieved from the context store in the init method:
new Processor4() {
    private KeyValueStore<K, V> store1;
    private KeyValueStore<K, V> store2;
    private KeyValueStore<K, V> store3;
}
I am observing strange behaviour: with the above code, store1, store2, and store3 are all re-initialized and none of the keys stored in their respective sub-topologies (1, 2, 3) are preserved. However, the same code works, i.e. all state-stores preserve the key-values stored in their respective sub-topologies, when the state-stores are declared at class level:
class StreamCustomizer implements KafkaStreamsInfrastructureCustomizer {
    private KeyValueStore<K, V> store1;
    private KeyValueStore<K, V> store2;
    private KeyValueStore<K, V> store3;
}
and then, in the processor implementation, the state-store is simply initialized in the init method:
new Processor() {
    private ProcessorContext context;

    public void init(ProcessorContext context) {
        this.context = context;
        store1 = (KeyValueStore<K, V>) context.getStateStore("store-name-1");
    }
}
Can someone please assist in finding the reason, or point out if there is anything wrong in this topology? Also, I have read that state-stores can be shared within the same sub-topology.
Hard to say (the code snippets are not really clear); however, if you share state you effectively merge sub-topologies. Thus, if you do it correctly, you would end up with a single sub-topology containing all your processors.
As long as you see 4 sub-topologies, the state stores are not shared yet, i.e. they are not connected correctly.
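For illustration, a minimal sketch of what "connected correctly" can look like with the plain Topology API, assuming Kafka Streams 2.x (the topic, store, and processor names are made up, and a trivial put-through processor stands in for the real ones). Attaching one store builder to several processors is what pulls them into a single sub-topology:
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.Stores;

public class SharedStoreTopologySketch {

    public static Topology build() {
        Topology topology = new Topology();
        topology.addSource("Source", "input-topic");
        topology.addProcessor("Processor1", SharedStoreTopologySketch::newProcessor, "Source");
        topology.addProcessor("Processor2", SharedStoreTopologySketch::newProcessor, "Source");

        // One store builder, attached to both processors: Kafka Streams then places both
        // processors (and the source) in the same sub-topology, so they read and write the
        // same physical store instead of independent ones.
        topology.addStateStore(
                Stores.keyValueStoreBuilder(
                        Stores.persistentKeyValueStore("shared-store"),
                        Serdes.String(), Serdes.String()),
                "Processor1", "Processor2");
        return topology;
    }

    private static Processor<String, String> newProcessor() {
        return new Processor<String, String>() {
            private KeyValueStore<String, String> store;

            @Override
            @SuppressWarnings("unchecked")
            public void init(ProcessorContext context) {
                store = (KeyValueStore<String, String>) context.getStateStore("shared-store");
            }

            @Override
            public void process(String key, String value) {
                store.put(key, value); // both processors see each other's writes
            }

            @Override
            public void close() { }
        };
    }
}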

Detecting abandoned processes in Kafka Streams 2.0

I have this case: users collect orders as order lines. I implemented this with a Kafka topic containing order-change events; they are merged, stored in a local key-value store, and broadcast to a second topic as order versions.
I need to somehow react to abandoned orders, i.e. ones that were started but have not changed for at least the last x hours.
A simple solution could be to scan the local store every y minutes and post an event changing the order status to Abandoned. It seems I cannot access the store from outside a processor, though... and it is also not very elegant coding. Any suggestions are welcome.
--edit
I cannot just add punctuation to the merge/validation transformer, because its output is different and should be routed elsewhere, like in this image (a single Kafka Streams app):
so the "abandoned orders processor/transformer" will be a no-op for its input (the only trigger here is time). Another thing is that in such a case (as in the image) my transformer gets a ForwardingDisabledProcessorContext upon initialization, so I cannot emit any messages from the punctuator. I could just pass a kafkaTemplate bean into it and produce new messages directly, but then the whole processor/transformer is just an empty shell used only to access the local store...
This is a snippet of the code I used:
public class AbandonedOrdersTransformer implements ValueTransformer<OrderEvent, OrderEvent> {

    private ProcessorContext context;
    private KeyValueStore<String, Order> stateStore;

    @Override
    public void init(ProcessorContext processorContext) {
        this.context = processorContext;
        stateStore = (KeyValueStore<String, Order>) processorContext.getStateStore(KafkaConfig.OPENED_ORDERS_STORE);

        // main scheduler
        this.context.schedule(TimeUnit.MINUTES.toMillis(5), PunctuationType.WALL_CLOCK_TIME, (timestamp) -> {
            KeyValueIterator<String, Order> iter = this.stateStore.all();
            while (iter.hasNext()) {
                KeyValue<String, Order> entry = iter.next();
                if (OrderStatuses.NEW.equals(entry.value.getStatus()) &&
                        (timestamp - entry.value.getLastChanged().getTime()) > TimeUnit.HOURS.toMillis(4)) {
                    // SEND ABANDON EVENT "event"
                    context.forward(entry.key, event);
                }
            }
            iter.close();
            context.commit();
        });
    }

    @Override
    public OrderEvent transform(OrderEvent orderEvent) {
        // do nothing
        return null;
    }

    @Override
    public void close() {
        // do nothing
    }
}
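For context on the ForwardingDisabledProcessorContext mentioned above: in Kafka Streams 2.x the context handed to a ValueTransformer disallows forward(), while a Transformer attached via KStream.transform() may forward from a punctuator. Below is a rough sketch of that variant only; the topic names and the detector class are assumptions, not from the original post, and the store is presumed to be registered with the builder elsewhere:
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Transformer;
import org.apache.kafka.streams.processor.ProcessorContext;

public class AbandonedOrdersSketch {

    // Hypothetical Transformer variant of the snippet above: the same wall-clock punctuator
    // idea, but context.forward() is permitted because this is a Transformer, not a
    // ValueTransformer.
    public static class AbandonedOrdersDetector
            implements Transformer<String, OrderEvent, KeyValue<String, OrderEvent>> {

        private ProcessorContext context;

        @Override
        public void init(ProcessorContext context) {
            this.context = context;
            // schedule the punctuator here exactly as in the snippet above and call
            // context.forward(key, abandonEvent) for orders that have timed out
        }

        @Override
        public KeyValue<String, OrderEvent> transform(String key, OrderEvent value) {
            return null; // no-op for regular records; only the punctuator emits
        }

        @Override
        public void close() { }
    }

    // DSL wiring; "orders-topic" and "abandoned-orders-topic" are made-up names, and
    // KafkaConfig.OPENED_ORDERS_STORE is assumed to be registered with the builder.
    public static Topology buildTopology() {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, OrderEvent> orders = builder.stream("orders-topic");
        orders.transform(AbandonedOrdersDetector::new, KafkaConfig.OPENED_ORDERS_STORE)
              .to("abandoned-orders-topic");
        return builder.build();
    }
}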

@Transactional and serializable level problem

I have a problem with isolation levels in JPA. For example, I have the following code:
@Transactional(isolation = Isolation.SERIALIZABLE)
public void first() {
    Obj obj = new Obj();
    obj.setName("t");
    objDAO.save(obj);
    second();
}

@Transactional(propagation = Propagation.REQUIRES_NEW, isolation = Isolation.SERIALIZABLE)
public void second() {
    List<Obj> objs = objDAO.findAll();
}
In my opinion the second method should not see uncommitted changes from the first method, so the new object with name "t" should not be visible until commit (but it is).
If I am wrong, then please give me an example in JPA where it won't be visible. Many thanks for any advice.
If your methods are inside one class it will not work, because the container will treat this as a single transaction. The container doesn't know that you want to create a new transaction.
From the Spring reference:
Note: In proxy mode (which is the default), only 'external' method calls coming in through the proxy will be intercepted. This means that 'self-invocation', i.e. a method within the target object calling some other method of the target object, won't lead to an actual transaction at runtime even if the invoked method is marked with @Transactional!
If you want to call method second() in new transaction you can try this:
@Autowired
private ApplicationContext applicationContext;

@Transactional(isolation = Isolation.SERIALIZABLE)
public void first() {
    Obj obj = new Obj();
    obj.setName("t");
    objDAO.save(obj);
    applicationContext.getBean(getClass()).second();
}

@Transactional(propagation = Propagation.REQUIRES_NEW, isolation = Isolation.SERIALIZABLE)
public void second() {
    List<Obj> objs = objDAO.findAll();
}

How to use BeanWrapperFieldSetMapper to map a subset of fields?

I have a Spring Batch application where BeanWrapperFieldSetMapper is used to map fields using a prototype object. However, the CSV file that is being read (via a FlatFileItemReader) contains one (indicator) field that determines the mapping of another field. If the indicator field has a value of Y, then the value of the other field should be mapped to property foo; otherwise it should be mapped to property bar.
I know that I can use a custom FieldSetMapper to do this, but then I have to code the mapping of all the other fields (of which there are quite a few). Alternatively, I could do this after reading, via an ItemProcessor, but then my domain (prototype) object must have a property representing the indicator field (which I prefer not to do, since it is not really part of the business domain).
Is it possible to perhaps use a custom FieldSetMapper to only map these custom fields and delegate the other mappings to BeanWrapperFieldSetMapper? Or is there some other better way to solve for this?
Here is my current attempt to use a custom FieldSetMapper and delegate to BeanWrapperFieldSetMapper:
public class DelegatedFieldSetMapper extends BeanWrapperFieldSetMapper<MyProtoClass> {

    @Override
    public MyProtoClass mapFieldSet(FieldSet fieldSet) throws BindException {
        String indicator = fieldSet.readString("indicator");
        Properties fieldProperties = fieldSet.getProperties();
        if (indicator.equalsIgnoreCase("y")) {
            fieldProperties.put("test.foo", fieldSet.readString("value"));
        } else {
            fieldProperties.put("test.bar", fieldSet.readString("value"));
        }
        fieldProperties.remove("indicator");
        Set<Object> keys = fieldProperties.keySet();
        List<String> names = new ArrayList<String>();
        List<String> values = new ArrayList<String>();
        for (Object key : keys) {
            names.add((String) key);
            values.add(fieldProperties.getProperty((String) key));
        }
        DefaultFieldSet domainObjectFieldSet = new DefaultFieldSet(names.toArray(new String[names.size()]), values.toArray(new String[values.size()]));
        return super.mapFieldSet(domainObjectFieldSet);
    }
}
However, a FlatFileParseException is thrown. The relevant parts of the batch config class are as follows:
@Configuration
@EnableBatchProcessing
public class BatchConfiguration {

    @Value("${file}")
    private File file;

    @Bean
    @Scope("prototype")
    public MyProtoClass myProtoClass() {
        return new MyProtoClass();
    }

    @Bean
    public ItemReader<MyProtoClass> reader(LineMapper<MyProtoClass> lineMapper) {
        FlatFileItemReader<MyProtoClass> flatFileItemReader = new FlatFileItemReader<MyProtoClass>();
        flatFileItemReader.setResource(new FileSystemResource(file));
        final int NUMBER_OF_HEADER_LINES = 1;
        flatFileItemReader.setLinesToSkip(NUMBER_OF_HEADER_LINES);
        flatFileItemReader.setLineMapper(lineMapper);
        return flatFileItemReader;
    }

    @Bean
    public LineMapper<MyProtoClass> lineMapper(LineTokenizer lineTokenizer, FieldSetMapper<MyProtoClass> fieldSetMapper) {
        DefaultLineMapper<MyProtoClass> lineMapper = new DefaultLineMapper<MyProtoClass>();
        lineMapper.setLineTokenizer(lineTokenizer);
        lineMapper.setFieldSetMapper(fieldSetMapper);
        return lineMapper;
    }

    @Bean
    public LineTokenizer lineTokenizer() {
        DelimitedLineTokenizer lineTokenizer = new DelimitedLineTokenizer();
        lineTokenizer.setNames(new String[] {"value", "test.bar", "test.foo", "indicator"});
        return lineTokenizer;
    }

    @Bean
    public FieldSetMapper<MyProtoClass> fieldSetMapper(PropertyEditor emptyStringToNullPropertyEditor) {
        BeanWrapperFieldSetMapper<MyProtoClass> fieldSetMapper = new DelegatedFieldSetMapper();
        fieldSetMapper.setPrototypeBeanName("myProtoClass");
        Map<Class<String>, PropertyEditor> customEditors = new HashMap<Class<String>, PropertyEditor>();
        customEditors.put(String.class, emptyStringToNullPropertyEditor);
        fieldSetMapper.setCustomEditors(customEditors);
        return fieldSetMapper;
    }
}
Finally, the CSV flat file looks like this:
value,bar,foo,indicator
abc,,,y
xyz,,,n
Let's say that BatchWorkObject is the class to be mapped.
Here's sample code, in Spring Boot style, that needs only your custom logic to be added.
new BeanWrapperFieldSetMapper<BatchWorkObject>() {
    {
        this.setTargetType(BatchWorkObject.class);
    }

    @Override
    public BatchWorkObject mapFieldSet(FieldSet fs) throws BindException {
        BatchWorkObject tmp = super.mapFieldSet(fs);
        // your custom code here
        return tmp;
    }
};
The code actually accomplishes what is desired except for one issue that results in the FlatFileParseException. The DelegatedFieldSetMapper contains the issue as follows:
DefaultFieldSet domainObjectFieldSet = new DefaultFieldSet(names.toArray(new String[names.size()]), values.toArray(new String[values.size()]));
To resolve, change it to the following (DefaultFieldSet's two-argument constructor takes the token values first and the column names second, so the two arrays were passed in the wrong order):
DefaultFieldSet domainObjectFieldSet = new DefaultFieldSet(values.toArray(new String[values.size()]), names.toArray(new String[names.size()]));
Write your own FieldSetMapper with a set of prepared delegates inside.
Those delegates are pre-built, one for each different kind of field mapping.
In your mapper, route to the correct delegate based on the indicator field (with a Classifier, for example), roughly as sketched below.
I can't see any other way, but this solution is quite easy and straightforward to maintain.
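A minimal sketch of that idea, using the MyProtoClass and column names from the question; the two delegates would be ordinary BeanWrapperFieldSetMappers configured for the foo and bar variants, and a plain conditional stands in for a Classifier (the class and field names here are illustrative):
import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

// Hypothetical routing mapper: picks one of two pre-configured delegate mappers based on
// the "indicator" column and lets the delegate do the actual binding.
public class RoutingFieldSetMapper implements FieldSetMapper<MyProtoClass> {

    private final FieldSetMapper<MyProtoClass> fooMapper; // delegate that binds "value" to foo
    private final FieldSetMapper<MyProtoClass> barMapper; // delegate that binds "value" to bar

    public RoutingFieldSetMapper(FieldSetMapper<MyProtoClass> fooMapper,
                                 FieldSetMapper<MyProtoClass> barMapper) {
        this.fooMapper = fooMapper;
        this.barMapper = barMapper;
    }

    @Override
    public MyProtoClass mapFieldSet(FieldSet fieldSet) throws BindException {
        FieldSetMapper<MyProtoClass> delegate =
                "y".equalsIgnoreCase(fieldSet.readString("indicator")) ? fooMapper : barMapper;
        return delegate.mapFieldSet(fieldSet);
    }
}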
Processing based on the input format/data can be done using a custom implementation of ItemProcessor, which either changes values in the same entity (the one populated by the ItemReader) or creates a new output entity.
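If that route is taken, it could look roughly like this (a sketch only; RawRecord, its accessors, and MyProtoClass's setFoo/setBar are assumptions for illustration, not from the original post):
import org.springframework.batch.item.ItemProcessor;

// Hypothetical processor: the reader maps each line into an intermediate RawRecord that
// still carries the indicator, and this processor builds the real domain object from it.
public class IndicatorRoutingProcessor implements ItemProcessor<RawRecord, MyProtoClass> {

    @Override
    public MyProtoClass process(RawRecord item) {
        MyProtoClass result = new MyProtoClass();
        if ("y".equalsIgnoreCase(item.getIndicator())) {
            result.setFoo(item.getValue()); // assumed accessors
        } else {
            result.setBar(item.getValue());
        }
        return result;
    }
}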

Can't insert new entry into deserialized AutoBean Map

When I try to insert a new entry into a deserialized Map instance I get no exception, but the Map is not modified. This EntryPoint code demonstrates it. Am I doing anything wrong?
public class Test2 implements EntryPoint {

    public interface SomeProxy {
        Map<String, List<Integer>> getStringKeyMap();
        void setStringKeyMap(Map<String, List<Integer>> value);
    }

    public interface BeanFactory extends AutoBeanFactory {
        BeanFactory INSTANCE = GWT.create(BeanFactory.class);
        AutoBean<SomeProxy> someProxy();
    }

    @Override
    public void onModuleLoad() {
        SomeProxy proxy = BeanFactory.INSTANCE.someProxy().as();
        proxy.setStringKeyMap(new HashMap<String, List<Integer>>());
        proxy.getStringKeyMap().put("k1", new ArrayList<Integer>());
        proxy.getStringKeyMap().put("k2", new ArrayList<Integer>());

        String payload = AutoBeanCodex.encode(AutoBeanUtils.getAutoBean(proxy)).toString();
        proxy = AutoBeanCodex.decode(BeanFactory.INSTANCE, SomeProxy.class, payload).as();

        // insert a new entry into a deserialized map
        proxy.getStringKeyMap().put("k3", new ArrayList<Integer>());

        System.out.println(proxy.getStringKeyMap().keySet()); // the keySet is [k1, k2] :-( where is k3?
    }
}
Shouldn't AutoBeanCodex.encode(AutoBeanUtils.getAutoBean(proxy)).toString() be getPayload()?
I'll check the code later, and I don't know if that is causing the issue, but it did stand out as different from my typical approach.
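In other words, something along these lines (a sketch of the suggestion only; AutoBeanCodex.encode returns a Splittable, whose getPayload() yields its JSON string):
// Sketch: serialize via getPayload() instead of toString().
String payload = AutoBeanCodex.encode(AutoBeanUtils.getAutoBean(proxy)).getPayload();
SomeProxy decoded = AutoBeanCodex.decode(BeanFactory.INSTANCE, SomeProxy.class, payload).as();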
Collection classes such as java.util.Set and java.util.List are tricky because they operate in terms of Object instances. To make collections serializable, you should specify the particular type of objects they are expected to contain through normal type parameters (for example, Map<Foo,Bar> rather than just Map). If you use raw collections or maps you will get bloated code and be vulnerable to denial of service attacks.
Source: http://www.gwtproject.org/doc/latest/DevGuideServerCommunication.html#DevGuideSerializableTypes