I am building a REST-based application and I create tokens after a user is successfully authenticated. Now I want to know where to store the tokens: in the DB or in a cache (Ehcache). Which method is best in which scenario?
If the tokens are in the DB then we have to fetch the token from the DB to authenticate, whereas the cache gives the best performance, but I am a little confused about which method should be used in which scenario.
My application would have thousands of visitors at a time.
A cache is about temporary storage, trading higher memory usage for lower latency. If you have no way of reconstructing a token in case it is evicted from the cache, then keeping tokens only in the cache is not an option. In that case you should store them in the DB and additionally cache them if you can measure a performance benefit.
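For illustration, here is a minimal cache-aside sketch in Java. It is only a sketch under assumptions: TokenLookup, TokenStore and loadUserIdByToken are made-up names, and a plain ConcurrentHashMap stands in for Ehcache (any cache with get/put would work the same way).

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class TokenLookup {
    // stand-in for the Ehcache region; any cache with get/put works the same way
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final TokenStore db; // hypothetical DAO backed by the database

    TokenLookup(TokenStore db) { this.db = db; }

    String findUserForToken(String token) {
        String userId = cache.get(token);
        if (userId == null) {                  // cache miss: fall back to the DB
            userId = db.loadUserIdByToken(token);
            if (userId != null) {
                cache.put(token, userId);      // populate the cache for later requests
            }
        }
        return userId;
    }

    interface TokenStore {
        String loadUserIdByToken(String token);
    }
}

On a hit the DB is never touched; on a miss you pay one DB read, and subsequent requests for the same token are served from memory.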
We have a scenario where we need to aggregate data from several services and show it in the UI. When an agent logs in, we need to show the cases assigned to that agent. Case information needs to be aggregated from several microservices. There would be around 1K cases assigned to an agent at a time, and all of them need to be shown so that the agent can sort by certain case data.
What would be the best approach to showing the data in this scenario? Should we make API calls to several services for each case, aggregate, and show the result? Or are there better approaches to achieve this?
No. You certainly should not call multiple APIs to aggregate data at runtime. Even if you call the APIs in parallel, the latency will be huge.
You need to pre-aggregate the case details and cache them in a distributed caching system (e.g. Redis or memcached), keeping them up to date via a streaming platform (e.g. Kafka). Also store the pre-aggregated case details in a persistent database. Basically, it's a kind of materialized view.
Caching will enable you to serve the case details to the user fast, without any noticeable latency, and streaming will help you keep the cache and DB aggregations updated in near-real time. Storing the materialized view in the database saves you from keeping everything in memory. You can use an LRU cache, so only the recently used data stays in the cache. If you need to show case data that is not in the cache, you read it from the database and store it in the cache for future requests.
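To make the "LRU cache in front of the database" part concrete, here is a rough Java sketch using a LinkedHashMap in access order; CaseViewCache, CaseRepository, loadAggregatedCase and the capacity are assumptions for the example, and in production a distributed cache such as Redis would play this role instead of an in-process map.

import java.util.LinkedHashMap;
import java.util.Map;

class CaseViewCache {
    private final int capacity;
    private final CaseRepository repository; // hypothetical store of pre-aggregated case views
    private final Map<String, String> lru;

    CaseViewCache(int capacity, CaseRepository repository) {
        this.capacity = capacity;
        this.repository = repository;
        // accessOrder=true turns the map into an LRU: the least recently used entry is evicted first
        this.lru = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > CaseViewCache.this.capacity;
            }
        };
    }

    synchronized String getCaseView(String caseId) {
        String view = lru.get(caseId);
        if (view == null) {                       // not in cache: read the materialized view from the DB
            view = repository.loadAggregatedCase(caseId);
            if (view != null) {
                lru.put(caseId, view);            // keep it for subsequent requests
            }
        }
        return view;
    }

    interface CaseRepository {
        String loadAggregatedCase(String caseId);
    }
}

Entries fall out of the map least-recently-used first once the capacity is exceeded, mirroring what an LRU eviction policy in Redis or memcached would do.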
I recommend you read these two Martin Kleppmann articles here and here
I created a Java web app on IBM Bluemix. This application shares session objects among instances via the Session Cache Service.
I understand how to program my application with the session cache, but I could not find any description of what happens if the total amount of cached data exceeds the cache space (e.g. on the starter plan, I can use 1 GB of cache space).
These are my questions.
Q1. Is there any trigger that removes cached data from the cache space?
Q2. After the cache space is exceeded, which data will be removed? Is there an eviction strategy such as Least Recently Used, Least Frequently Used, and so on?
The Session Cache service on IBM Bluemix is based on WebSphere Extreme Scale. Hence a lot of background information is provided in the Knowledge Center of WebSphere Extreme Scale. The standard Liberty profile for the Session Cache uses a Least Recently Used (LRU) algorithm to manage the space. I haven't tried it yet, but the first linked document describes how to monitor the cache and obtain statistics.
What is a good tool for applying a layer of caching between a webserver and an application server?
Basic Requirements:
The application server needs a way to remove items from the cache and put items in the cache with an expiration date.
The webserver needs a way to pull items out of the cache in a very light-weight, fast manner without requiring thread allocation on the application server.
It does not necessarily need to be a distributed cache (accessible from multiple machines), but it wouldn't hurt.
Strategies I have considered:
Static file caching. A request comes in and gets hashed; if a corresponding file exists we serve it, if not we route the request to the app server (a rough sketch of this appears after this list). Are high I/O or file-locking problems a concern under concurrency? Is it accurate that the file system is actually very fast thanks to kernel-level caching in memory?
Using a key-value DB like MongoDB or Redis. This would store the finished HTML/JSON fragments in the DB. The webserver would be equipped to read from the DB and route to the app server if needed. The app server would be equipped to insert into/remove from the DB.
A memory cache like memcached or Varnish (don't know much about Varnish). My only concern with memcached is that I'm going to want to cache 3 - 10 gigabytes of data at any given time, which is more than I can safely allocate in memory. Does memcached have a method to spill to the filesystem?
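Purely to make the first strategy concrete, here is a rough Java sketch (with made-up names) of a file-backed page cache keyed by a hash of the request path; whether kernel-level page caching makes the file reads fast enough is exactly the open question above.

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;

class FilePageCache {
    private final Path cacheDir;

    FilePageCache(Path cacheDir) { this.cacheDir = cacheDir; }

    // returns the cached body, or null when the request must be routed to the app server
    byte[] get(String requestPath) throws Exception {
        Path file = cacheDir.resolve(hash(requestPath));
        return Files.exists(file) ? Files.readAllBytes(file) : null;
    }

    // write to a temp file first, then move atomically so readers never see a partial page
    void put(String requestPath, byte[] body) throws Exception {
        Path tmp = Files.createTempFile(cacheDir, "page", ".tmp");
        Files.write(tmp, body);
        Files.move(tmp, cacheDir.resolve(hash(requestPath)), StandardCopyOption.ATOMIC_MOVE);
    }

    // hash the request path so it can be used as a flat file name
    private static String hash(String requestPath) throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(requestPath.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        return hex.toString();
    }
}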
Any thoughts on some techniques and pitfalls when trying this type of caching layer?
You can also use the GigaSpaces XAP in-memory data grid for caching and even for hosting your web application. You can choose just the caching option, or combine the two and gain single management of your environment, among other things.
Unlike the key-value approach you suggested, with GigaSpaces XAP you'll be able to run complex queries such as SQL, object-based templates and much more. For your caching scenario you should specifically check out the local cache related features.
Local Cache
Web Container
Disclaimer: I am a developer at GigaSpaces.
Eitan
Just to answer this from the POV of using Coherence (http://coherence.oracle.com/):
1. The application server needs a way to remove items from the cache and put items in the cache with an expiration date.
// remove one item from cache
cache.remove(key);
// remove multiple items from cache
cache.keySet().removeAll(keylist);
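The fragments above assume a cache handle obtained from Coherence. For the "put items in the cache with an expiration date" part of requirement 1, the NamedCache/CacheMap API also has (as far as I remember) a three-argument put that takes a time-to-live in milliseconds; a sketch follows, with an arbitrary cache name and placeholder key/value variables:

import com.tangosol.net.CacheFactory;
import com.tangosol.net.NamedCache;

// obtain the named cache ("example" is an arbitrary name)
NamedCache cache = CacheFactory.getCache("example");
// put one item with a time-to-live of one hour; the entry expires after the TTL
cache.put(key, value, 60 * 60 * 1000L);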
2. The webserver needs a way to pull items out of the cache in a very light-weight, fast manner without requiring thread allocation on the application server.
// access one item from cache
Object value = cache.get(key);
// access multiple items from cache
Map mapKV = cache.getAll(keylist);
3. It does not necessarily need to be a distributed cache (accessible from multiple machines), but it wouldn't hurt.
Elastic. Just add nodes. Auto-discovery. Auto-load-balancing. No data loss. No interruption. Every time you add a node, you get more data capacity and more throughput.
Automatic high availability (HA). Kill a process, no data loss. Kill a server, no data loss.
A memory cache like memcached or Varnish (don't know much about Varnish). My only concern with memcached is that I'm going to want to cache 3 - 10 gigabytes of data at any given time, which is more than I can safely allocate in memory. Does memcached have a method to spill to the filesystem?
Use both RAM and flash. Transparently. Easily handle 10s or even 100s of gigabytes per Coherence node (e.g. up to a TB or more per physical server).
For the sake of full disclosure, I work at Oracle. The opinions and views expressed in this post are my own, and do not necessarily reflect the opinions or views of my employer.
Can people give me examples of why they would use Core Data in an application?
I ask this because most apps are just clients to a central server where an API of some sort gives you the information you need.
In my case I'm writing a timesheet application for a web app which has an API, and I'm debating whether there is any value in replicating the data structure from my server in Core Data (SQLite),
e.g.
Project has many timesheets
Employee has many timesheets
It seems to me that I can just connect to the API on every call for lists of projects or existing timesheets for example.
I realize that for some kind of offline mode you could store the data locally in Core Data, but this creates way more problems, because you now have a big problem syncing that data back to the web server when you get a connection again, e.g. the project selected for a timesheet no longer exists.
Can any experienced developer shed some light on their experiences of when Core Data is the best-practice approach?
EDIT
I realise, of course, that there is value in local persistence, but the key-value storage of user defaults seems to cover most applications I can think of.
You shouldn't think of Core Data simply as an SQLite database. It's not JUST an SQLite database. Sure, SQLite is an option, but there are other options as well, such as in-memory stores and, as of iOS 5, a whole slew of custom data stores. The biggest benefit with Core Data is persistence, obviously. But even if you are using an in-memory data store, you get the benefits of a very well structured object graph, and all of the heavy lifting with regard to pulling information out of or putting information into the data store is handled by Core Data for you, without you necessarily needing to concern yourself with what is backing that data store.

Sure, today you don't care too much about persistence, so you could use an in-memory data store. What happens if tomorrow, or in a month, or a year, you decide to add a feature that would really benefit from persistence? With Core Data, you simply change or add a persistent data store, and all of your methods to get information out or in remain unchanged. The overhead for that sort of addition is minimal compared to accessing SQLite or some other data store directly. IMHO, that's the biggest benefit: abstraction. And, in essence, abstraction is one of the most powerful things behind OOP.

Granted, building the data model just for in-memory storage could be overkill for your app, depending on how involved the app is. But, just as a side note, you may want to consider what is faster: requesting information from your web service every time you want to perform some action, or requesting the information once, storing it in memory, and acting on that stored value for the remainder of the session. An in-memory data store wouldn't persist beyond that particular session.
Additionally, with Core Data you get a lot of other great features like saving, fetching, and undo/redo.
There are basically two kinds of apps. Those that provide you with local functionality (games, professional applications, navigation systems...) and those that grant access to a remote service.
Your app seems to be in the second category. If you access remote services, your users will want to access new or real-time data (you don't want to read two-week-old Facebook posts), but in some cases local caching makes sense (e.g. reading your mail when you're on a train with an unstable network).
I assume that the value of accessing cached entries when not connected to a network is pretty low for your customers (internal or external) compared to the importance of accessing real-time-data. So local storage might be not necessary at all.
If you don't have hundreds of entries in your timesheet, "normal" serialization (the NSCoding protocol) might be enough. If you only access some "dashboard" data, you will be able to get along with simple request/response caching (NSURLCache can do a lot of things...).
Core Data does make more sense if you have complex data structures which should be synchronized with a server. This adds a lot of synchronization logic to your project as well as complexity from Core Data integration (concurrency, thread-safety, in-app-conflicts...).
If you want to create a "client"-app with a server driven user experience, local storage is not necessary at all so my suggestion is: Keep it as simple as possible unless there is a real need for offline storage.
It's ideal if you want to store data locally on the phone.
Seriously though, if you can't see a need for it for your timesheet app, then don't worry about it and don't use it.
Solving the sync problems that you would have with an "offline" mode comes down to the design of your app. For example: don't allow projects to be deleted. Why would you? Wouldn't you want to go back in time and look at previous data for particular projects? Instead, just put a marker on the project to show it as inactive, along with the date/time it was made inactive. If the data being synced from the device is for that project and is dated before the date/time it was marked inactive, then it's fine to sync. Otherwise, display a message and the user will have to sort it out.
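A tiny Java sketch of that rule, with hypothetical types and field names, just to make the date check explicit:

import java.time.Instant;

class SyncPolicy {
    // a project is never deleted, only marked inactive at some point in time
    static boolean canSyncTimesheet(Instant timesheetDate, Instant projectInactiveSince) {
        if (projectInactiveSince == null) {
            return true;                               // project still active: always sync
        }
        // only accept timesheets dated before the project was made inactive
        return timesheetDate.isBefore(projectInactiveSince);
    }
}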
Whether you need to store some data locally or not depends purely on your application's design: is it a real application in its own right, or a thin GUI client around your web service? Apart from an "offline" mode, the other reason to cache server data on the client side might be to take traffic load off your server. Just think about what it means for your server to send the whole timesheet data to the client every time, versus only the changes. Yes, it means more implementation on both sides, but in some cases it has serious advantages.
EDIT: example added
You have 1,000 records per user in your timesheet application and one record is roughly 1 KB. In this case, every time a user starts your application it has to fetch ~1 MB of data from your server. If you cache the data locally, the server can tell you that, say, two records were updated since your last sync, so you only have to download 2 KB. Now scale this up to several tens of thousands of users and you will immediately notice the difference in server bandwidth and CPU usage.
I am building a large application, and I usually use a simple session to store private global information. However, as the application could be rather large, I believe this could be a problem due to the amount of session memory it could consume.
Is there a better way to store such variables?
For example, when the user logs in I want to store data about that user and display it where needed without having to query the database each time.
Sessions are the way to go here, they are intended to persist information about the current session across requests. There is no other object in the ASP.NET framework that has this intention.
You could use the Cache, or store in the Application collection, but then the responsibility of uniquely identifying the individual session data is up to you.
What's also up to you is handling when the session terminates, and freeing up the instances that are stored in those collections (Cache or Application).
It's really a bad idea to start to ask these questions based on what you might "think" will happen. This is a form of premature optimization, and you should avoid it. Rather, use Sessions, as they were intended for this purpose, then measure where your bottlenecks are and address them, should performance be an issue when testing.
Use cookies - they will work irrespective of your load-balanced environment.
Other options include:
1) Writing your session values to a SQL database - you can configure your ASP.NET app to use SQL Server for session state - but this has its own problems, as sessions never time out (so you need to handle this explicitly in code).
2) If you are not using SQL Server, you will basically face a problem when you have too many users and you implement load balancing on your web servers - a user can be sent to a different web server within the same session (and in-process session state would not work).
There is a workaround for this too - it's called sticky sessions - where the load balancer guarantees that a user always hits the same web server within a session.
3) With the .NET 2.0 provider model, you can even write your own session-state provider - so you could, for example, read/write session data in your own XML files on your web server or a shared server :-)
So there are many ways to solve this; however, the simplest and most cost-effective solution is to use cookies.
You might use the Cache. It has a built-in mechanism to free up entries when memory is running out...
Definitely use cookies for this. The best approach is to make yourself a cookie wrapper class that will do all the heavy lifting for you - checking whether a cookie is null, accessing the HttpContext, etc. No need to mess up your code with all that; just abstract it all out into cookies.cs or .vb.
SetCookieValue(someValue, cookieName); //there will be some expiration concerns here as well
myValue = GetCookieValue(cookieName);
Christian Weiss has a good strategy.
If you think your data is too large for the Session, I would consider a database of some sort, combined with caching, so that you don't make unnecessary calls.
If it is per-user-session data you're storing, using the ASP.NET Session is definitely your best bet. If you're most worried about memory usage then you can use MSSQL mode. The data has to live somewhere and the choice of which session mode to use is dependent on your environment and the usage patterns of your users.
Don't assume there will be a problem with the size of session state until you see such a problem and have tried to solve it. For instance, it's possible that, although the application as a whole may use a large amount of session state, that any given user may not use that much in the course of a session.
It's also possible that changing from the default session-state provider to the SQL provider or the state server provider would ease a memory issue.
You can use Cache, but Cache is application-wide. You would need to qualify Cache entries with the user id or session id: Cache[userID + ".MyCacheEntry"].
Do not, under any circumstances, use static variables to store this data. As suggested by your subject line, they are application-wide, not per-user.