I am building a Java web app on IBM Bluemix. The application shares session objects among instances via the Session Cache service.
I understand how to program my application against the session cache, but I could not find any description of what happens when the total amount of cached data exceeds the cache space (e.g. the starter plan provides 1 GB of cache space).
These are my questions.
Q1. Is there any trigger that removes cached data from the cache space?
Q2. Once the cache space is exceeded, which data is removed? Is there a cache strategy such as Least Recently Used or Least Frequently Used?
The Session Cache service on IBM Bluemix is based on WebSphere eXtreme Scale, so a lot of background information is available in the WebSphere eXtreme Scale Knowledge Center. The standard Liberty profile for the Session Cache uses a Least Recently Used (LRU) algorithm to manage the space. I haven't tried it yet, but the first linked document describes how to monitor the cache and obtain statistics.
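To make the LRU idea concrete, here is a minimal sketch in plain Java. It is not the WebSphere eXtreme Scale implementation: the class name `LruCache` is hypothetical, and it bounds the cache by entry count, whereas the service bounds it by space.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of LRU eviction; not the Session Cache service's own code.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true: iteration order runs from least to most recently accessed
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Invoked after each insertion; returning true evicts the least recently used entry
        return size() > maxEntries;
    }
}
```

With a capacity of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "b" is the entry that was touched least recently.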
Google Cloud CDN recommends using versioned URLs for static objects.
If I enable Google Storage versioning, will Cloud CDN fetch the fresh object instead of the cached one (before its normal expiration time) after I update an object in Storage?
By definition, a cache (in the CDN or elsewhere) prevents any extra communication with the backend until the cached entry expires.
In other words, the cache will never ask the backend before the entry expires. In addition, Cloud Storage isn't aware that an additional layer caches its data and stores it for a period of time.
So, by design, the answer is no: versioning changes nothing in how the CDN manages its cache.
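For the versioned URLs mentioned in the question, one common approach is to put a content hash into the object name, so an updated object gets a new URL and the stale cache entry is simply never requested again. A minimal sketch, assuming a hypothetical `VersionedUrl` helper and a naming scheme of `name.<hash>.ext` (neither is part of Cloud CDN or Cloud Storage):

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; the naming scheme is an assumption, not a Cloud CDN requirement.
class VersionedUrl {
    // Builds a URL such as <baseUrl>/app.<hash>.css from the object name and its content
    static String of(String baseUrl, String objectName, byte[] content) {
        String hash = shortHash(content);
        int dot = objectName.lastIndexOf('.');
        String versioned = dot < 0
                ? objectName + "." + hash
                : objectName.substring(0, dot) + "." + hash + objectName.substring(dot);
        return baseUrl + "/" + versioned;
    }

    // First 4 bytes of the SHA-256 digest, rendered as 8 lowercase hex characters
    static String shortHash(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            StringBuilder hex = new StringBuilder();
            for (int i = 0; i < 4; i++) {
                hex.append(String.format("%02x", digest[i] & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen
            throw new IllegalStateException(e);
        }
    }
}
```

The page that references the object has to be regenerated with the new URL on each update, which is the price of never having to wait for the cache to expire.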
I am building a REST-based application, and I create tokens after a user is successfully authenticated. I want to know where to store the tokens: in the DB or in a cache (Ehcache). Which method is best in which scenario?
If the tokens are in the DB, we have to fetch the token from the DB to authenticate, whereas a cache gives the best performance. I am a little confused about which method should be used in which scenario.
My application will have thousands of visitors at the same time.
A cache is temporary storage that trades higher memory usage for lower latency. If you have no way of reconstructing a token once it has been evicted from the cache, then keeping tokens only in the cache is not an option. In that case, store them in the DB and cache them if you can measure a performance benefit.
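The "DB as source of truth, cache in front" arrangement can be sketched as a cache-aside lookup. All names here (`TokenDao`, `TokenService`, `findUserIdByToken`) are hypothetical, and a `ConcurrentHashMap` stands in for Ehcache so the example stays self-contained:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the real database access layer.
interface TokenDao {
    Optional<String> findUserIdByToken(String token);
}

class TokenService {
    private final TokenDao dao;
    // Stand-in for Ehcache or another cache; entries may disappear at any time
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    TokenService(TokenDao dao) {
        this.dao = dao;
    }

    Optional<String> authenticate(String token) {
        String cached = cache.get(token);
        if (cached != null) {
            return Optional.of(cached); // cache hit: no database round trip
        }
        // Cache miss: the database remains the source of truth
        Optional<String> fromDb = dao.findUserIdByToken(token);
        fromDb.ifPresent(userId -> cache.put(token, userId));
        return fromDb;
    }

    // Simulates the cache dropping an entry
    void evict(String token) {
        cache.remove(token);
    }
}
```

If an entry is evicted, the next request for that token falls through to the database and repopulates the cache, so eviction costs latency but never loses a token.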
I have a process that periodically populates data into a map with persistence. To be more exact, there are two nodes: a storage node with persistence enabled and cache maps defined, and a lite client node started with the 'lite' option and no maps defined. The connection between the nodes looks good. During testing I found that only about half of the populated data actually reaches persistence, although all of the data is in the cache. I can confirm this by browsing the cache map and via JMX stats. I cannot identify any dependency on the data itself or on the time it is populated.
Could someone please advise where the investigation should start?
This was entirely my own fault. I did not pass the 'lite' option to the populator properly, so the data was distributed between the nodes and only written to persistence on the storage side, because the lite client does not have any persistence set up. I am leaving the question up to save someone else from the same silly mistake.
What is a good tool for applying a layer of caching between a web server and an application server?
Basic Requirements:
The application server needs a way to remove items from the cache and put items in the cache with an expiration date.
The webserver needs a way to pull items out of the cache in a very light-weight, fast manner without requiring thread allocation on the application server.
It does not necessarily need to be a distributed cache (accessible from multiple machines), but it wouldn't hurt.
Strategies I have considered:
Static file caching. A request comes in and gets hashed; if a file exists we serve it, and if not we route the request to the app server. Are high I/O or file locking under concurrency a problem? Is it accurate that the file system is actually very fast due to kernel-level caching in memory?
Using a key-value DB like MongoDB or Redis. This would store the finished HTML/JSON fragments in the DB. The web server would be equipped to read from the DB and route to the app server if needed. The app server would be equipped to insert into and remove from the DB.
A memory cache like memcached or Varnish (don't know much about Varnish). My only concern with memcached is that I'm going to want to cache 3 - 10 gigabytes of data at any given time, which is more than I can safely allocate in memory. Does memcached have a method to spill to the filesystem?
Any thoughts on some techniques and pitfalls when trying this type of caching layer?
You can also use the GigaSpaces XAP in-memory data grid for caching and even for hosting your web application. You can choose just the caching option, or combine the two and gain single management of your environment, among other things.
Unlike the key-value approach you suggested, GigaSpaces XAP supports complex queries such as SQL, object-based templates and much more. For your caching scenario, look specifically at the local cache features.
Local Cache
Web Container
Disclaimer: I am a developer at GigaSpaces.
Eitan
Just to answer this from the POV of using Coherence (http://coherence.oracle.com/):
1. The application server needs a way to remove items from the cache and put items in the cache with an expiration date.
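// put one item into the cache, expiring after 60 seconds (value in milliseconds)
cache.put(key, value, 60000L);
// remove one item from cache
cache.remove(key);
// remove multiple items from cache
cache.keySet().removeAll(keylist);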
2. The webserver needs a way to pull items out of the cache in a very light-weight, fast manner without requiring thread allocation on the application server.
// access one item from cache
Object value = cache.get(key);
// access multiple items from cache
Map mapKV = cache.getAll(keylist);
3. It does not necessarily need to be a distributed cache (accessible from multiple machines), but it wouldn't hurt.
Elastic. Just add nodes. Auto-discovery. Auto-load-balancing. No data loss. No interruption. Every time you add a node, you get more data capacity and more throughput.
Automatic high availability (HA). Kill a process, no data loss. Kill a server, no data loss.
A memory cache like memcached or Varnish (don't know much about Varnish). My only concern with memcached is that I'm going to want to cache 3 - 10 gigabytes of data at any given time, which is more than I can safely allocate in memory. Does memcached have a method to spill to the filesystem?
Use both RAM and flash. Transparently. Easily handle 10s or even 100s of gigabytes per Coherence node (e.g. up to a TB or more per physical server).
For the sake of full disclosure, I work at Oracle. The opinions and views expressed in this post are my own, and do not necessarily reflect the opinions or views of my employer.
I activated the Wicket DebugBar in order to trace my session size. As I navigate the web site, the indicated session size is stable at about 25k.
At the same time, the pagemap serialized on disk grows continuously, by about 25k for each page view.
What does that mean? From what I understand, the pagemap on disk keeps all the pages. But why does the session always stay at about 25k?
What is the impact on a big website? If I have 1000 parallel web sessions, will the web server need 25 MB to hold them and the disk 250 MB (10 pages * 25k * 1000)?
I will run some load tests to check.
The debug bar value is telling you the size of your session in memory. As you browse to another page, the old page is serialized to the session store. This provides, among other things, back button support without killing your memory footprint.
So, to answer your first question, the size on disk grows because it is holding historical data while your session stays about the same because it is holding active data.
To answer your second question, it's been some time since I have looked at it, but I believe the disk session store is capped at 10MB or so. Furthermore, you can change the behavior of the session store to meet your needs, but that's a whole different discussion.
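For a rough feel of the numbers, the estimate from the question can be written out as below. This is only a back-of-the-envelope sketch: the class `SessionFootprint` is hypothetical, sizes use decimal units (1 KB = 1,000 bytes), and the per-session cap is the approximate 10MB figure recalled above, not a verified Wicket default.

```java
// Hypothetical back-of-the-envelope estimate; not part of Wicket.
class SessionFootprint {
    static final long KB = 1_000L;     // decimal units, matching the rough figures in the question
    static final long MB = 1_000L * KB;

    // Memory: each session holds roughly one active page
    static long memoryBytes(int sessions, long pageBytes) {
        return sessions * pageBytes;
    }

    // Disk: historical pages per session, limited by an assumed per-session cap
    static long diskBytes(int sessions, int pagesPerSession, long pageBytes, long perSessionCapBytes) {
        long perSession = Math.min(pagesPerSession * pageBytes, perSessionCapBytes);
        return sessions * perSession;
    }
}
```

With 1000 sessions, 25 KB pages and 10 pages per session, this gives 25 MB of memory and 250 MB of disk, the same figures as in the question. If every session grew until it hit the assumed 10 MB cap, the disk usage would top out at about 10 GB.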
See this Wiki page, which describes the storage mechanisms in Wicket 1.5. It is a bit different from 1.4, but there is no such document for 1.4.
Update: the Wiki page has been moved to the guide: https://ci.apache.org/projects/wicket/guide/7.x/guide/internals.html#pagestoring