How does Chronicle Queue use memory-mapped files to avoid garbage collection? - memory-mapped-files

I have a question about how the Chronicle queue avoids garbage collection:
I understand that Chronicle Queue uses a memory-mapped file so that it can save an object to main memory or disk rather than to the JVM heap. However, when a processor deserializes an object from main memory, it still needs to create a new instance. So where exactly does Chronicle Queue avoid garbage collection?
See the case below, which comes from a Chronicle GitHub example. When it performs a write/read operation, it still needs to create a new instance with MyObject me = new MyObject(), and "me" will eventually be garbage collected.
import net.openhft.chronicle.queue.ExcerptAppender;
import net.openhft.chronicle.queue.ExcerptTailer;
import net.openhft.chronicle.queue.impl.single.SingleChronicleQueue;
import net.openhft.chronicle.queue.impl.single.SingleChronicleQueueBuilder;
import net.openhft.chronicle.wire.Marshallable;
import java.io.IOException;
import java.nio.file.Files;

public class Example {

    static class MyObject implements Marshallable {
        String name;
        int age;

        @Override
        public String toString() {
            return Marshallable.$toString(this);
        }
    }

    public static void main(String[] args) throws IOException {
        // will write the .cq4 file to the working directory
        SingleChronicleQueue queue = SingleChronicleQueueBuilder.builder()
                .path(Files.createTempDirectory("queue").toFile()).build();
        ExcerptAppender appender = queue.acquireAppender();
        ExcerptTailer tailer = queue.createTailer();

        MyObject me = new MyObject();
        me.name = "rob";
        me.age = 40;

        // write 'MyObject' to the queue
        appender.writeDocument(me);

        // read 'MyObject' from the queue
        MyObject result = new MyObject();
        tailer.readDocument(result);

        System.out.println(result);
    }
}

You can reuse the object you deserialise into.
// created once
MyObject result = new MyObject();
// this can be called multiple times with the same object
tailer.readDocument(result);
The String is also pooled, reducing garbage.
This way you can write and read millions of messages but only create one or two objects.
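To make the reuse pattern concrete, here is a minimal sketch that continues the question's main method (reusing its appender and tailer); the loop bound and field values are illustrative only:
// continues the question's example: appender and tailer already exist
MyObject toWrite = new MyObject();
MyObject toRead = new MyObject();
for (int i = 0; i < 1_000_000; i++) {
    toWrite.name = "rob";
    toWrite.age = i;
    appender.writeDocument(toWrite); // serialises straight into the memory-mapped file
    tailer.readDocument(toRead);     // deserialises by overwriting toRead's fields in place
}
Both objects are allocated once, outside the loop, which is where the garbage saving comes from.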

Related

invalid large-object descriptor: 0 with hibernate and postgres

We have an n-tier application where we read BLOB objects stored in a Postgres database.
At times, when we try to access a BLOB through an input stream, we get "org.postgresql.util.PSQLException: ERROR: invalid large-object descriptor: 0". From reading other posts, this exception comes up whenever we try to access the BLOB outside the transaction (after the transaction is committed).
But in our case we get this exception even though the transaction is active. The BLOB is read within the transaction.
Any pointers as to why this exception is occurring even though the transaction is active?
Your description of the problem lacks specifics, but in my code this error showed up when I tried to use the Large Object outside the data access method. As in your case, the object was created in the method. This is consistent with what other people have noticed in this forum: the Large Object exists only within the data access method (or transaction). I needed byte[], so I converted the Large Object within the method, wrapped it up in a Data Transfer Object, and was able to use it in other layers. These are the relevant code snippets:
// This is the Data Access Class
@Named
public class SupportDocsDAO {

    protected ResultSet resultSet;
    private LargeObject lob;
    // SupportDocs is an Entity class in the Data Transfer Objects package
    private SupportDocs supportDocsDTO;

    public LargeObject getLob() {
        return lob;
    }

    public void setLob(LargeObject lob) {
        this.lob = lob;
    }

    public SupportDocs getSupportDocsDTO() {
        return supportDocsDTO;
    }

    public void setSupportDocsDTO(SupportDocs supportDocsDTO) {
        this.supportDocsDTO = supportDocsDTO;
    }

    //.... other code

    public SupportDocs fetchSupportDocForDescr(SupportDocs supportDocs1) {
        Session session = HibernateUtil.getSessionFactory().openSession();
        session.doWork(new Work() {
            @Override
            public void execute(java.sql.Connection connection) throws SQLException {
                java.sql.PreparedStatement ps = null;
                try {
                    LargeObjectManager lobm =
                            connection.unwrap(org.postgresql.PGConnection.class).getLargeObjectAPI();
                    ps = connection.prepareCall("{call ret_lo_supportdocs_id(?)}");
                    ps.setInt(1, supportDocs1.getSuppDocId());
                    ps.execute();
                    resultSet = ps.getResultSet();
                    while (resultSet.next()) {
                        supportDocsDTO.setFileNameDoc(resultSet.getString("filenamedoc"));
                        supportDocsDTO.setExtensionSd(resultSet.getString("extensionsd"));
                        long oid = resultSet.getLong("suppdoc_oid");
                        setLob(lobm.open(oid, LargeObjectManager.READ));
                        // This is the conversion of the Large Object into byte[]
                        supportDocsDTO.setSuppDocImage(lob.read(lob.size()));
                        System.out.println("object size: " + lob.size());
                    }
                    // other code, catch, cleanup with finally, and return supportDocsDTO
This works without problems. I can recreate images and videos from the obtained byte[].

getting statestore data from called function in kafka streams

In Kafka Streams' Processor API, can I pass the processor context from init() to another function, as shown below, and get the context back with the state store in process()?
public void init(ProcessorContext context) {
    this.context = context;
    String resourceName = "config.properties";
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    Properties props = new Properties();
    try (InputStream resourceStream = loader.getResourceAsStream(resourceName)) {
        props.load(resourceStream);
    } catch (IOException e) {
        e.printStackTrace();
    }
    dataSplitter.timerMessageSource(props, context); // can I pass context like this?
    this.context.schedule(1000);

    // retrieve the key-value store named "patient"
    kvStore = (KeyValueStore<String, PatientDataSummary>) this.context.getStateStore("patient");

    // I want to get the value of the state store filled by the called function timerMessageSource(),
    // as the data to be put in the state store is generated in timerMessageSource().
    // Is there any way I can get that by using the context or similar?
}
The usage of ProcessorContext is somewhat limited and you cannot call each method it provides at arbitrary times. Thus, it depends on how you use it -- in general, you can pass it around as you wish (it will always be the same object throughout the lifetime of the processor).
If I understand your question correctly, you register a punctuation and use your dataSplitter within the punctuation callback, and you want to modify the store there. That is absolutely possible -- you can either put the store into a class member, similar to what you do with the context, or use the context object to get the store within the punctuate callback.
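A minimal sketch of the class-member approach, using the older Processor API that matches the question's context.schedule(long) call (the store name "patient" comes from the question; the Long value type is a stand-in for the poster's PatientDataSummary):
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.state.KeyValueStore;

public class PatientProcessor implements Processor<String, String> {

    private ProcessorContext context;
    private KeyValueStore<String, Long> kvStore; // the poster's PatientDataSummary would go here

    @Override
    @SuppressWarnings("unchecked")
    public void init(ProcessorContext context) {
        this.context = context;
        // keep the store in a field so process() and punctuate() can both reach it
        this.kvStore = (KeyValueStore<String, Long>) context.getStateStore("patient");
        context.schedule(1000);
    }

    @Override
    public void process(String key, String value) {
        // normal per-record handling; the same kvStore field is usable here
    }

    @Override
    public void punctuate(long timestamp) {
        // a helper like dataSplitter.timerMessageSource(...) can write through
        // this same field (or fetch the store from the context it was handed)
        kvStore.put("last-punctuate", timestamp);
    }

    @Override
    public void close() { }
}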

Proper way to write a spring-batch ItemReader

I'm constructing a Spring Batch job that modifies a given number of records. The list of record IDs is an input parameter of the job. For example, one job might be: modify the records with IDs {1,2,3,4} and set parameters X and Y on related tables.
Since I'm unable to pass a potentially very long input list (typical case: 50K records) to my ItemReader, I only pass a MyJobID, which the ItemReader then uses to load the target ID list.
Problem is, the resulting code appears "wrong" (although it works) and not in the spirit of Spring Batch. Here's the reader:
@Scope(value = "step", proxyMode = ScopedProxyMode.INTERFACES)
@Component
public class MyItemReader implements ItemReader<Integer> {

    @Autowired
    private JobService jobService;

    private List<Integer> itemsList;
    private Long jobId;

    @Autowired
    public MyItemReader(@Value("#{jobParameters['jobId']}") final Long jobId) {
        this.jobId = jobId;
        this.itemsList = null;
    }

    @Override
    public Integer read() throws Exception, UnexpectedInputException, ParseException, NonTransientResourceException {
        // First pass: Load the list.
        if (itemsList == null) {
            itemsList = new ArrayList<Integer>();
            MyJob myJob = (MyJob) jobService.loadById(jobId);
            for (Integer i : myJob.getTargedIdList()) {
                itemsList.add(i);
            }
        }
        // Serve one at a time:
        if (itemsList.isEmpty()) {
            return null;
        } else {
            return itemsList.remove(0);
        }
    }
}
I tried to move the first part of the read() method to the constructor, but the @Autowired reference is null at that point. Afterwards (in the read method) it is initialized.
Is there a better way to write the ItemReader? I would like to move the "load" step out of read(). Or is this the best solution for this scenario?
Thank you.
Generally, your approach is not "wrong", but it is probably not ideal.
Firstly, you could move the initialisation to an init method annotated with @PostConstruct. This method is called after all @Autowired fields have been injected:
@PostConstruct
public void afterPropertiesSet() throws Exception {
    itemsList = new ArrayList<Integer>();
    MyJob myJob = (MyJob) jobService.loadById(jobId);
    for (Integer i : myJob.getTargedIdList()) {
        itemsList.add(i);
    }
}
But there is still the problem that you load all the data at once. If you have a billion records to process, this could blow up the memory.
So what you should do is load only a chunk of your data into memory, then return the items one by one in your read method. If all entries of a chunk have been returned, load the next chunk and return its items one by one again. If there is no other chunk to be loaded, return null from the read method.
This ensures that you have a constant memory footprint regardless of how many records you have to process.
(If you have a look at FlatFileItemReader, you see that it uses a BufferedReader to read the data from the disk. While that has nothing to do with Spring Batch, it is the same principle: it reads a chunk of data from the disk, returns that, and if more data is needed, it reads the next chunk.)
The next problem is restartability. What happens if the job crashes after doing 90% of the work? How can the job be restarted so it processes only the missing 10%?
This is actually a feature that Spring Batch provides; all you have to do is implement the ItemStream interface with its methods open(), update(), and close().
If you consider these two points -- load data in chunks instead of all at once, and implement the ItemStream interface -- you'll end up with a reader that is in the Spring spirit. A sketch combining both points follows below.
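A minimal sketch of such a reader, assuming a hypothetical paging method jobService.loadTargetIdPage(jobId, offset, pageSize) on the poster's JobService; the page size and the ExecutionContext key name are also made up for illustration:
import org.springframework.batch.item.ExecutionContext;
import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.ItemStream;
import org.springframework.batch.item.ItemStreamException;
import java.util.ArrayList;
import java.util.List;

public class ChunkedIdReader implements ItemReader<Integer>, ItemStream {

    private static final String CURRENT_INDEX_KEY = "chunkedIdReader.currentIndex";
    private static final int PAGE_SIZE = 1000; // illustrative chunk size

    private final JobService jobService; // poster's service, assumed to expose a paging query
    private final Long jobId;

    private List<Integer> currentPage = new ArrayList<>();
    private int pageOffset = 0; // index into currentPage
    private int readCount = 0;  // total items returned so far

    public ChunkedIdReader(JobService jobService, Long jobId) {
        this.jobService = jobService;
        this.jobId = jobId;
    }

    @Override
    public Integer read() {
        if (pageOffset >= currentPage.size()) {
            // hypothetical paging method: at most PAGE_SIZE ids starting at offset readCount
            currentPage = jobService.loadTargetIdPage(jobId, readCount, PAGE_SIZE);
            pageOffset = 0;
            if (currentPage.isEmpty()) {
                return null; // no more data -> end of step
            }
        }
        readCount++;
        return currentPage.get(pageOffset++);
    }

    @Override
    public void open(ExecutionContext executionContext) throws ItemStreamException {
        // on restart, resume after the items that were already processed
        readCount = executionContext.getInt(CURRENT_INDEX_KEY, 0);
    }

    @Override
    public void update(ExecutionContext executionContext) throws ItemStreamException {
        // called before each chunk commit; persists the restart position
        executionContext.putInt(CURRENT_INDEX_KEY, readCount);
    }

    @Override
    public void close() throws ItemStreamException {
        currentPage = new ArrayList<>();
    }
}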

How to get a static List from a Message Driven Bean?

I am using a Message Driven Bean for storing messages in a list as you can see in the code given below:
/**
 *
 * @author sana-naeem
 */
@MessageDriven(mappedName = "jms/Queue-0", activationConfig = {
    @ActivationConfigProperty(propertyName = "acknowledgeMode", propertyValue = "Auto-acknowledge"),
    @ActivationConfigProperty(propertyName = "destinationType", propertyValue = "javax.jms.Queue")
})
public class MyMessageBean implements MessageListener {

    private static ArrayList<String> list = new ArrayList<String>();

    public MyMessageBean() {
    }

    @Override
    public void onMessage(Message message) {
        try {
            if (message instanceof TextMessage) {
                TextMessage msg = (TextMessage) message;
                list.add("Messages: " + msg.getText());
            } else {
                System.out.println("No Text!!!");
            }
        } catch (JMSException ex) {
            System.out.println("JMS.Exception....!!!");
        }
    }

    public static ArrayList<String> getList() {
        return list;
    }

    public void setList(ArrayList<String> list) {
        this.list = new ArrayList<String>();
    }
}
Now the problem is that when I access the getter method from another Java class, it displays list size = 0.
Can I please know why this is happening?
I want to get that list in another Java class.
If there is something wrong with the Queue, kindly let me know how to fix it.
It was working fine before.
Actually, previously I was using a Servlet to send messages, but now I am using a simple Java class with some initial context parameters defined...; so now the list is not working as expected...
Any advice or suggestion would be highly appreciated.
Thank you!
You shouldn't be using mutable static fields in an MDB. When you think MDB, think stateless, because that's how the container treats them. Here is a good reference:
http://www.coderanch.com/t/312086/EJB-JEE/java/Clarification-static-fields
Unfortunately, there isn't one single answer to your question. Without knowing anything about the rest of your application -- how it is designed, what components you are using, how they interact, how it will be deployed, etc. -- it is difficult to give advice. But the first thing that comes to mind, given the limited information I have, is that you can create a separate object to wrap a private ArrayList and store a reference to that object in an appropriate context. Your MDB can then access that object. In that case, you'd have methods in the object to modify/read the ArrayList, and you'd synchronize these methods internally on the underlying ArrayList because the methods will be accessible by multiple threads. A sketch follows after the points below.
But this is a brute-force approach, not knowing anything else about your application. And depending on how your application is designed and deployed, this might not be the right advice. But the point here is that:
You need to separate your stateful object from what is intended to be a stateless object.
You need to make it accessible from an appropriate context.
You need to synchronize modify/read methods on the underlying POJO (ArrayList) if the containing object will be accessible by multiple threads.
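For illustration only, a minimal sketch of such a wrapper (the class and method names are made up, and how you publish it -- JNDI, an application-scoped bean, etc. -- depends on your deployment):
import java.util.ArrayList;
import java.util.List;

public class MessageStore {

    private final List<String> messages = new ArrayList<>();

    public void add(String message) {
        // synchronized because onMessage() may run on multiple threads
        synchronized (messages) {
            messages.add(message);
        }
    }

    public List<String> snapshot() {
        // return a defensive copy so readers never see a half-updated list
        synchronized (messages) {
            return new ArrayList<>(messages);
        }
    }
}
The MDB would then call add() from onMessage(), and other classes would read via snapshot() instead of reaching into a static field.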

Can't insert new entry into deserialized AutoBean Map

When I try to insert a new entry into a deserialized Map instance, I get no exception but the Map is not modified. The EntryPoint code below demonstrates it. Am I doing anything wrong?
public class Test2 implements EntryPoint {

    public interface SomeProxy {
        Map<String, List<Integer>> getStringKeyMap();
        void setStringKeyMap(Map<String, List<Integer>> value);
    }

    public interface BeanFactory extends AutoBeanFactory {
        BeanFactory INSTANCE = GWT.create(BeanFactory.class);
        AutoBean<SomeProxy> someProxy();
    }

    @Override
    public void onModuleLoad() {
        SomeProxy proxy = BeanFactory.INSTANCE.someProxy().as();
        proxy.setStringKeyMap(new HashMap<String, List<Integer>>());
        proxy.getStringKeyMap().put("k1", new ArrayList<Integer>());
        proxy.getStringKeyMap().put("k2", new ArrayList<Integer>());

        String payload = AutoBeanCodex.encode(AutoBeanUtils.getAutoBean(proxy)).toString();
        proxy = AutoBeanCodex.decode(BeanFactory.INSTANCE, SomeProxy.class, payload).as();

        // insert a new entry into the deserialized map
        proxy.getStringKeyMap().put("k3", new ArrayList<Integer>());

        System.out.println(proxy.getStringKeyMap().keySet()); // the keySet is [k1, k2] :-( where is k3?
    }
}
Shouldn't AutoBeanCodex.encode(AutoBeanUtils.getAutoBean(proxy)).toString(); be getPayload()?
I'll check the code later, and I don't know if that is causing the issue. But it did stand out as different from my typical approach.
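For reference, the suggested change would look like this (AutoBeanCodex.encode returns a Splittable, whose getPayload() yields the JSON string; whether this fixes the map behaviour is untested here):
String payload = AutoBeanCodex.encode(AutoBeanUtils.getAutoBean(proxy)).getPayload();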
Collection classes such as java.util.Set and java.util.List are tricky because they operate in terms of Object instances. To make collections serializable, you should specify the particular type of objects they are expected to contain through normal type parameters (for example, Map<Foo,Bar> rather than just Map). If you use raw collections or maps, you will get bloated code and be vulnerable to denial-of-service attacks.
Source: http://www.gwtproject.org/doc/latest/DevGuideServerCommunication.html#DevGuideSerializableTypes