Hazrlcast: write-behind works unstable - persistence

I have a process that populates data into map with persistence periodically. To be more exact there are two nodes: storage node with persistence enabled and cache maps defined and a lite client node started with 'lite' option and no map defined. Connection between nodes does look good. During testing I found out that only around a half of populated data is actually flown into persistence though all data is in cache. I can confirm this by browsing the cache map and via JMX stats. I can't indicate dependencies on the data or time it is populated.
Could someone please advise where the investigation should start from?

This is only my own fault. I did not provide 'lite' option to the populator properly so data is distributed between nodes and only written on the storage side as lite client does not have any persistence set up. I did not remove the question to prevent someone else from this silly failure.

Related

Retaining and Migrating Actor / Service State

I've been looking at using service fabric as a platform for a new solution that we are building and I am getting hung up on data / stage management. I really like the concept of reliable services and the actor model and as we have started to prototype out some things it seems be working well.
With that beings said I am getting hung up on state management and how I would use it in a 'real' project. I am also a little concerned with how the data feels like a black box that I can't interrogate or manipulate directly if needed. A couple scenarios I've thought about are:
How would I share state between two developers on a project? I have an Actor and as long as I am debugging the actor my state is maintained, replicated, etc. However when I shut it down the state is all lost. More importantly someone else on my team would need to set up the same data as I do, this is fine for transactional data - but certain 'master' data should just be constant.
Likewise I am curious about how I would migrate data changes between environments. We periodically pull production data down form our SQL Azure instance today to keep our test environment fresh, we also push changes up from time to time depending on the requirements of the release.
I have looked at the backup and restore process, but it feels cumbersome, especially in the development scenario. Asking someone to (or scripting the) restore on every partition of every stateful service seems like quite a bit of work.
I think that the answer to both of these questions is that I can use the stateful services, but I need to rely on an external data store for anything that I want to retain. The service would check for state when it was activated and use the stateful service almost as a write-through cache. I'm not suggesting that this needs to be a uniform design choice, more on a service by service basis - depending on the service needs.
Does that sound right, am I overthinking this, missing something, etc?
Thanks
Joe
If you want to share Actor state between developers, you can use a shared cluster. (in Azure or on-prem). Make sure you always do upgrade-style deployments, so state will survive. State is persisted if you configure the Actor to do so.
You can migrate data by doing a backup of all replica's of your service and restoring them on a different cluster. (have the service running and trigger data-loss). It's cumbersome yes, but at this time it's the only way. (or store state externally)
Note that state is safe in the cluster, it's stored on disk and replicated. There's no need to have an external store, provided you do regular state backups and keep them outside the cluster. Stateful services can be more than just caches.

Bluemix Session Cache: Trigger to evict cached data

I create Java web app on IBM Bluemix. This application shares session object among instances via Session Cache Service.
I understand how to program my application with session cache. But I could not find any descriptions if the total amount of cached data exceeds cache space (e.g. for starter plan, I can use 1GB cache space.).
These are my questions.
Q1. Are there any trigger to remove cached data from cache space?
Q2. After exceeding cache space, what data will be removed? Is there any cache strategy such as Least Recently Used, Least Frequently Used and so on?
The Session Cache service on IBM Bluemix is based on WebSphere Extreme Scale. Hence a lot of background information is provided in the Knowledge Center of WebSphere Extreme Scale. The standard Liberty profile for the Session Cache uses a Least Recently Used (LRU) algorithm to manage the space. I haven't tried it yet, but the first linked document describes how to monitor the cache and obtain statistics.

How to ensure that parallel queries to ext. system are executed only once and then cached

Server frameworks: Scala, Play 2.2, ReactiveMongo, Heroku
I think I have quite interesting brain teaser for you:
In my trip-planning application I want to display weather forecast on a map(similar to this). I'm using a paid REST service to query weather data. To speed up user experience and reduce costs I plan to cache weather data for each location for one hour.
There are a few not-so obvious things to consider:
It might require to query up to 100 location for weather to display one weather map
Weather must be queried in parallel because it would take too long to query it in serial fashion considering network latency
However launching 100 threads for each user request is not an option as well (imagine just 5 users looking at a map at one time)
The solution is to have let's say 50 workers that query weather for user requests
Multiple users might be viewing the same portion of map
There is a possible racing condition where one location is queried multiple times.
However it should be queried only once and then cached.
The application is running in clustered environment meaning there will be several play instances.
Coming from a Java EE background I can come up with a pretty good solution using the Java EE stack.
However I wonder how to do this using something more natural to Scala/Play stack: Akka. There is an example (google "heroku scala akka") for similar problem but it doesn't solve one issue: Racing condition when multiple users query the same data at once.
How would you implement this?
EDIT: I have decided that the requirement to ensure that weather data is updated only once is not necessary. The situation would happen far too infrequently to be a real problem and all proposed solutions would bring too much overhead and complexity to the system to be viable.
Thanks everyone for your time and effort. I hope answers to this question will help someone in the future with similar problem.
In Akka you can choose from multiple routing strategies. ConsistentHashingRoutingLogic could serve you well in this situation. Since actors are single-threaded you can easily maintain a cache in each actor. This routing logic will assure that two equal messages will always hit the same actor.
Each actor can work in the following way:
1. check local cache (for example apache commons LRUMap)
- if found, return
2. check global cache (distributed memcache or any other key-value store)
- if found, store the result in the local cache and return
3. query the REST service
4. store the result in the global and local caches
You can have a look at this question, which I based my answer on.
I decided that I'll post my JMS solution as well.
Controller that processes the request for weather does following:
Query the DB for weather data. If there are NO locations with out-of-date data reply immediately. Otherwise continue:
Start listening on a topic (explained later).
For each location: Check whether the weather for the location isn't being updated.
If not send a weather update request message to queue.
Certain amount of workers (50?) listen to that queue.
Worker first marks the location weather as being updated
Worker retrieves updated weather and updates the DB.
Worker sends a message to a topic with weather data for that location.
When controller receives (via topic) weather updates for all out-of-date locations, combine it with up-to-date locations and reply.

Caching strategy to reduce load on web application server

What is a good tool for applying a layer of caching between a webserver and an application server.
Basic Requirements:
The application server needs a way to remove items from the cache and put items in the cache with an expiration date.
The webserver needs a way to pull items out of the cache in a very light-weight, fast manner without requiring thread allocation on the application server.
It does not neccessarily need to be a distributed cache (accessible from multiple machines), but it wouldn't hurt.
Strategies I have considered:
Static file caching. Request comes in, gets hashed, if a file exists we serve it, if not we route the request to the app server. Is high I/O a problem or file locking problems due to concurrency? Is it accurate that the file system is actually very fast due to kernel level caching in memory.
Using a key-value DB like mongodb, or redis. This would store the finished HTML/JSON fragments in db. The webserver would be equipped to read from the DB and route to the app server if needed. The app server would be equipped to insert/remove from the DB.
A memory cache like memcached or Varnish (don't know much about Varnish). My only concern with memcached is that I'm going to want to cache 3 - 10 gigabytes of data at any given time, which is more than I can safely allocate in memory. Does memcached have a method to spill to the filesystem?
Any thoughts on some techniques and pitfalls when trying this type of caching layer?
You can also use GigaSpaces XAP in memory data grid for caching and even hosting your web application. You can choose just the caching option or combine the power of two and gain single management of your environment along other things.
Unlike the key value pair approach you suggested, using GigaSpaces XAP you'll be able to have complex queries such as SQL, object based temples and much more. In your caching scenario you should check out more specifically the local cache related features.
Local Cache
Web Container
Disclaimer, I am a developer in GigaSpaces.
Eitan
Just to answer this from the POV of using Coherence (http://coherence.oracle.com/):
1. The application server needs a way to remove items from the cache and put items in the cache with an expiration date.
// remove one item from cache
cache.remove(key);
// remove multiple items from cache
cache.keySet().removeAll(keylist);
2. The webserver needs a way to pull items out of the cache in a very light-weight, fast manner without requiring thread allocation on the application server.
// access one item from cache
Object value = cache.get(key);
// access multiple items from cache
Map mapKV = cache.getAll(keylist);
3. It does not neccessarily need to be a distributed cache (accessible from multiple machines), but it wouldn't hurt.
Elastic. Just add nodes. Auto-discovery. Auto-load-balancing. No data loss. No interruption. Every time you add a node, you get more data capacity and more throughput.
Automatic high availability (HA). Kill a process, no data loss. Kill a server, no data loss.
A memory cache like memcached or Varnish (don't know much about Varnish). My only concern with memcached is that I'm going to want to cache 3 - 10 gigabytes of data at any given time, which is more than I can safely allocate in memory. Does memcached have a method to spill to the filesystem?
Use both RAM and flash. Transparently. Easily handle 10s or even 100s of gigabytes per Coherence node (e.g. up to a TB or more per physical server).
For the sake of full disclosure, I work at Oracle. The opinions and views expressed in this post are my own, and do not necessarily reflect the opinions or views of my employer.

Continuation of a process after a system crash/restart - Drools Flow

I've been playing with examples I downloaded with the book Drools JBoss Rules 5.0. To my relief they work :) Drools Flow has been my point of interest as a possible workflow engine replacement.
As I'm trying to wrap my head around things, I've been wondering how a premature death of a rulesflow process gets restarted? What I'm mean is say a process is bouncing from one node to another like expected, then the containing process dies due to a crash, restart or whatever. Is the current node/place of the ruleflow process retained, and can it just continue from that point on system restart? If so how?
The group I work for is very Java EE centric with JBoss being our favorite application server. I see examples of Drools leveraging Spring's persistence and bean lookup support.
Are there examples of doing the same with JBoss?
If you persist the state of the process instances and tasks in the database. Even if the VM was down and restart again, you can retrieve the process instances.
Use the
To create the session
ksession = JPAKnowledgeService.newStatefulKnowledgeSession(kbase,null,env)
To load the session with the session id.
ksession = JPAKnowledgeService.loadStatefulKnowledgeSession( sessionId, kbase,
You only need to know the session id. Session information will be store in SessionInfo table. Download the example project below.
http://dl.dropbox.com/u/2634115/drools-test.zip
The example is using Btm with H2 database, it also work well with mysql-connector-java-5.1.13 with Btm. Note that the process that are complete will be automatically deleted from the database.
You are looking at the basic concept of Process Migration. During what is known as strong migration, a process can be stopped on one machine and the entire state of the process migrated to another machine (including the program counter and all existing stacks). Before you go thinking that this is completely insane, think about this from a JVM perspective. Since you're application is already being run in virtual hardware; it isn't hard to stop the application and pick it back up where it left off since it is completely virtualized.
If you would like another example, look at VMWare; an entire machine can be paused and migrated to another machine and started again. It's very interesting stuff and usually relates mainly to Distributed Computing where you might have hundreds of agents that need to migrate from machine to machine as some go down for maintenance.
I realize that I didn't give an example of this through JBoss; but giving a background on what exactly you're looking for can give you a much better insight into what to look for going forward.