mybatis memcached cache failover - memcached

We're considering using memcached as a distributed cache for mybatis (the MyBatis-Memcached
integration module).
Does anyone know how to configure it so that the memcached servers are not single points of failure? Currently, if I configure multiple memcached servers, the cache requests are hashed out to the servers, but each server is a single point of failure (i.e. if one goes down, the app will fail).
We would like the MyBatis client to treat the cache as lost if one of the memcached servers goes down, keep working, and rebuild the cache on another available memcached server.
Does anyone have any experience with this?
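For what it's worth, the Java memcached clients that integrations like this are typically built on expose a failure-mode setting that comes close to the behaviour described. A minimal sketch with spymemcached, assuming you can control how the client is constructed (the host names are placeholders):

    import java.io.IOException;
    import net.spy.memcached.AddrUtil;
    import net.spy.memcached.ConnectionFactoryBuilder;
    import net.spy.memcached.FailureMode;
    import net.spy.memcached.MemcachedClient;

    public class FailoverClientSketch {
        public static MemcachedClient build() throws IOException {
            // Redistribute: when a node is unreachable, its keys are re-hashed
            // onto the remaining live nodes instead of failing the operation,
            // so the cache is simply rebuilt there on subsequent misses.
            ConnectionFactoryBuilder builder = new ConnectionFactoryBuilder()
                    .setFailureMode(FailureMode.Redistribute);
            return new MemcachedClient(builder.build(),
                    AddrUtil.getAddresses("cache1:11211 cache2:11211"));
        }
    }

Whether the MyBatis-Memcached module lets you pass this setting through its own configuration is something I can't confirm; the sketch only shows the client-level knob.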

Related

Distributed nodes to sync data without a single point of failure

Recently I've been reading about methods to horizontally scale a server.
The majority of methods, if not all, require a central server to sync the data (Redis is widely used here). That means that multiple nodes rely on one central server, one point of failure, to have data synced between them.
I know that I can have multiple instances of Redis and those nodes can connect to the backup Redis servers and keep syncing the data in case the main Redis server fails, but I don't like the idea of having "2 layers" (a server layer and a Redis layer) to achieve this. Is there any piece of software, or method, to keep multiple nodes synced with each other while only being connected to each other?
I mean, like having the 3 Redis servers installed on the same 3 server machines. If one server goes down, it goes down along with its Redis server, so the other 2 server machines connect to the fallback Redis servers installed on their own machines; but instead of this hackish setup, have a piece of software that simply connects the nodes, syncs the data and so on.
I've been thinking about a system, but one problem arises. A problem that also extends to the "having multiple Redis instances" approach.
Imagine that I have a system with X Redis servers in different datacenters, and also X servers in different datacenters. I don't want them in the same datacenter, to avoid single points of failure. Now, let's say that half of the servers lose the connection to the other half, not because machines go down, but because of connection failures (an ISP problem and such). The slaves are going to think that the master Redis server is dead because the connection dropped and they cannot reconnect, and the same goes for the part of the servers that also lost the connection: they are going to think the master Redis is dead, and connect and sync data using the slaves.
Now we have a scenario of one master Redis server with X servers syncing data between them, and some slave Redis servers with the other X servers syncing data between them.
With this, the whole distributed system built to avoid a single point of failure is a failure by itself, because it still has a single point of failure that, instead of making people lose data, will make the system hold unsynced garbage data.
And as it will be realtime data, a "save timestamps and resync the data when the connection recovers in a few minutes" approach is not a solution.
What do you think? Any solution?

Is it possible to handle memcached memory overflow?

I have a serious problem: my memcached memory is overflowing and the server is going down.
So how do I handle memcached so that, if its memory is getting full, it just returns an error message and does not store anything more?
Memcached is a distributed cache system and can run on different servers. What is your server? Is it Couchbase, or is it AWS ElastiCache? You can use memcached on many different servers and providers, and when you create those servers you need to configure them and set the amount of memory you want memcached to use. For example, in the company I work for, the test environment uses Couchbase but production uses Amazon ElastiCache.
Memcached uses an LRU (least recently used) algorithm to make room for new objects when its memory is full. You should not have a problem with full memory as far as memcached itself is concerned. Problems can arise from full memory, but not because memcached cannot handle it; exceptions and other problems can come from other parts of the system, which is quite normal when your memory is full. If you configure the server correctly, memcached usually does not throw exceptions.
Is the server that runs memcached the same as the server that runs your application? Memcached can be put on another server, and in this way you can prevent its memory from becoming full.
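For reference, the memcached daemon itself evicts least-recently-used items when it reaches its memory limit (the -m startup option); if it is started with -M it returns an error on writes instead of evicting. Either way, the client can check whether a store actually succeeded. A rough sketch with the spymemcached Java client (the key and expiry are made up):

    import net.spy.memcached.MemcachedClient;

    public class StoreCheckSketch {
        // Assumes an already-configured MemcachedClient.
        public static void storeOrLog(MemcachedClient client, String key, Object value)
                throws Exception {
            // set() returns a future; false means the item was not stored
            // (for example, the server rejected it because memory is exhausted).
            boolean stored = client.set(key, 3600, value).get();
            if (!stored) {
                System.err.println("memcached did not store key: " + key);
            }
        }
    }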

spring batch remote partitioning remote step performance

We are using remote partitioning in our POC, where we process around 20 million records. To process these records, the slave needs some static metadata, which is around 5,000 rows. Our current POC uses EhCache to load this metadata into the slave once from the DB and put it in the cache, so subsequent calls just get the data from the cache for better performance.
Now, since we are using remote partitioning, each slave has approximately 20 MDPs/threads, so each message listener first calls the DB to get the metadata; basically 20 threads are hitting the DB at the same time on each remote machine. We have 2 machines for now but will grow to 4.
My question is: is there a better way to load this metadata only once, for example before the job starts, and make it accessible to all remote slaves?
Or can we use a step listener in the remote step? I don't think this is a good idea, as it will be executed for each remote step execution, but I need expert thoughts on this.
You could set up an EhCache server running as a separate application, or use another caching product instead, such as Hazelcast. If commercial products are an option for you, Coherence might also work.
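As a rough illustration of the Hazelcast suggestion: load the metadata once (for example in a step that runs before the partitions are sent out) into a distributed map, and let every slave thread read from that map instead of the DB. A minimal sketch, with the map name and types invented for the example:

    import java.util.Map;
    import com.hazelcast.core.Hazelcast;
    import com.hazelcast.core.HazelcastInstance;
    import com.hazelcast.core.IMap;

    public class MetadataCacheSketch {
        private final HazelcastInstance hz = Hazelcast.newHazelcastInstance();

        // Run once before the job starts (e.g. on the master) to warm the cache.
        public void loadMetadata(Map<String, Object> rowsFromDb) {
            IMap<String, Object> map = hz.getMap("staticMetadata");
            map.putAll(rowsFromDb);
        }

        // Any slave thread on any machine in the cluster reads the same map.
        public Object lookup(String key) {
            IMap<String, Object> map = hz.getMap("staticMetadata");
            return map.get(key);
        }
    }

With only 5,000 rows, the whole map could also simply be loaded into each JVM once at startup; the distributed map mainly saves you the 20-threads-hitting-the-DB problem on every machine.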

What's the memcached server

I'm a beginner learning memcached. The memcached server confuses me most. Can I see it as a single server computer, just like a web server? I'm also confused about the relationship between the memcached server and client: are they located on different computers?
I agree with most of what @phihag has answered, but I must clarify some things.
Memcached stores data according to a key (phihag called it an id, not to be confused with database ids). Data can be of various sizes, so you can store small bits (like 1 record pulled from the database) or huge chunks of data (like hundreds of records, or entire finished HTML pages).
Memcached is not typically used on the same machine as the application server; the reason is that it is designed to be used over TCP (it would be accessible via local sockets if it were designed to work on the same server) and it was designed as a pooling server.
The pooling part is interesting: you can have 10 machines running Memcached, each allocating a maximum of 10 GB of RAM for this purpose. 10*10 = 100 GB of RAM space.
When you write a value into Memcached, only one of the servers (chosen randomly or via some algorithm) stores it. When you try to read a value from Memcached, only the server that stored it will send it to you.
So indeed you can put the database/memcached/application/fileserver all on the same machine, and typically you do that for your development sandbox. But you can also put each on a separate machine, and any other combination of the two.
If you only need one Memcached server you will probably be OK with hosting it on the same machine as the application code.
If you start using a front-end cache server such as varnish or you configure NginX as a front-end cache server, you will have to configure some Memcached servers to store the data that these front-end cache servers are caching.
If you distribute your database into multiple servers and file servers into a CDN, that means that your application handles a lot of data in a short period of time so you'll need a lot of RAM space that couldn't be available in one single application server.
And since extending a memory pool for a Memcached server is as easy as adding the IP of the new server to the list, you will be scaling horizontally across many servers (which is Memcached's typical use).
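To make the pooling concrete: every client is given the same server list and hashes each key to exactly one node, so a later read goes back to whichever node stored the value. A small sketch with the spymemcached Java client (host names and the key are placeholders):

    import net.spy.memcached.AddrUtil;
    import net.spy.memcached.MemcachedClient;

    public class PooledCacheSketch {
        public static void main(String[] args) throws Exception {
            // One logical cache spread over three nodes; the key decides
            // which node holds the entry.
            MemcachedClient cache = new MemcachedClient(
                    AddrUtil.getAddresses("cache1:11211 cache2:11211 cache3:11211"));

            cache.set("user:42", 300, "serialized user record"); // stored on one node
            Object value = cache.get("user:42");                 // fetched from that node
            System.out.println(value);

            cache.shutdown();
        }
    }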
The memcached server is a program which manages the data that memcached stores (not to be confused with a machine, which may also be called server). In theory, it can run on any computer. However, it is typically run on the same machine that the main application runs on.
The application then uses its memcached client to talk to the memcached server and ask for cached content. This is faster than querying data from a traditional database because
A memcached server just maps IDs to values, and never needs to scan an entire table
The memcached protocol is simpler. The server doesn't need to parse SQL, and the client doesn't need to craft it.
Since memcached does not require the reliability of a database (think of backups, fault isolation, clustering, security, etc.), it can be run on the same machine that the application runs on. While you could run a database on the same machine that the application runs on, doing so is frowned upon for the above reasons.
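To give an idea of how little the server has to parse compared to SQL, this is roughly what a store and a fetch look like in memcached's plain text protocol over a telnet session to port 11211 (the key and value are invented):

    set greeting 0 300 5
    hello
    STORED
    get greeting
    VALUE greeting 0 5
    hello
    END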

Alternatives to JMS for queuing

We have a REST web service that receives requests from external systems and makes updates to our DB accordingly. I'm looking to implement a caching/queuing solution for the requests that come in, as we've had some DB server challenges lately, and have lost some messages when the DB server went down.
Before I start putting together a simple persistent file-based queue, I want to see if there are any good alternatives to JMS, as its use is restricted in our environment.
Current platforms:
JBoss 4.3
RichFaces 3.3
Spring 3.0.5
RESTEasy
** UPDATES **
Per skaffman's question below, my requirements for clustering, transactions, etc.
Clustering: Our web and app servers are all clustered, so the queue(s) will need to be able to process items from all cluster nodes. However, our commits are essentially atomic, so ordering and synchronization issues are extremely minimal. Thread and cluster-safety is not really a factor. Separate/Independent queues on each cluster would be sufficient.
Transactions: Again, due to the atomic nature of our data, transactional needs are minimal/not required outside of each individual request.
Security: Moderate concern, but I would anticipate that to be handled by our regular security on the Web Service. I wouldn't anticipate anything reading or writing to the queue(s) other than the web-app itself. That would only be necessary in instances of high volume or when the DB is unavailable.
Thanks,
Mike
For one project we did use a queue (HornetQ), but it was embedded in the WAR and deployable on Tomcat because the customer did not want WebLogic or JBoss application servers; if your restriction policy extends to your application architecture as well, such a solution would be forbidden.
For another project we did not use any JMS implementation; we made the processing asynchronous by using a message database and the Service Activator of the spring-integration framework to consume the events.
That way any message publisher just inserts a row in a DB table, and the Service Activator triggers the event and calls any other service (Spring, web service, etc.).
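A stripped-down version of that database-backed queue pattern, using only Spring's JdbcTemplate and a scheduled poller instead of spring-integration (the table and column names are invented for the example):

    import java.util.Map;
    import org.springframework.jdbc.core.JdbcTemplate;
    import org.springframework.scheduling.annotation.Scheduled;

    public class DbQueueSketch {
        private final JdbcTemplate jdbc;

        public DbQueueSketch(JdbcTemplate jdbc) {
            this.jdbc = jdbc;
        }

        // The REST endpoint only enqueues; the business tables are updated later.
        public void publish(String payload) {
            jdbc.update("insert into request_queue (payload, processed) values (?, false)", payload);
        }

        // A poller plays the role of the Service Activator: it drains pending
        // rows and hands them to the real service.
        @Scheduled(fixedDelay = 5000)
        public void drain() {
            for (Map<String, Object> row : jdbc.queryForList(
                    "select id, payload from request_queue where processed = false")) {
                process((String) row.get("payload"));
                jdbc.update("update request_queue set processed = true where id = ?", row.get("id"));
            }
        }

        private void process(String payload) {
            // call the existing service that performs the actual DB update
        }
    }

The poller simply retries unprocessed rows on its next run, so a failure during processing does not lose the request, as long as the queue table itself is reachable.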