Can I group Memcached keys so that I can flush a group rather than the whole Memcached server?

I use Memcached to store content lists under many different key combinations. When a user edits the content, I must refresh the cache, but it is hard to tell which particular lists to refresh, and flushing the entire Memcached server is not a good idea either. So my question is: can I group Memcached keys so that I can flush a group rather than the whole Memcached server?

Memcached does not natively support flushing a group of keys. What you could try is to group your Memcached keys into namespaces. Check the Memcached wiki for more information.
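To make the namespace idea concrete, here is a minimal sketch of the versioned-namespace trick described on the memcached wiki, assuming the spymemcached Java client. The NamespacedCache class and the "ns:<group>" version keys are illustrative, not something memcached itself provides.

    import net.spy.memcached.MemcachedClient;

    // Sketch of the namespace-versioning trick: every key in a "group" is prefixed
    // with the group's current namespace version, and "flushing" the group just
    // bumps that version so the old keys are never read again.
    public class NamespacedCache {
        private final MemcachedClient mc;

        public NamespacedCache(MemcachedClient mc) {
            this.mc = mc;
        }

        // Current version of the group's namespace; created lazily on first use.
        private String namespaceVersion(String group) {
            Object v = mc.get("ns:" + group);
            if (v == null) {
                // add() only succeeds for the first caller, so concurrent initializers agree.
                mc.add("ns:" + group, 0, Long.toString(System.currentTimeMillis()));
                v = mc.get("ns:" + group);
            }
            return v.toString();
        }

        // Every real key is qualified with the group's current namespace version.
        private String qualify(String group, String key) {
            return group + ":" + namespaceVersion(group) + ":" + key;
        }

        public void set(String group, String key, int ttlSeconds, Object value) {
            mc.set(qualify(group, key), ttlSeconds, value);
        }

        public Object get(String group, String key) {
            return mc.get(qualify(group, key));
        }

        // "Flush" the group by bumping its namespace version; stale entries simply
        // stop being read and eventually fall out via TTL or LRU eviction.
        public void flushGroup(String group) {
            mc.set("ns:" + group, 0, Long.toString(System.currentTimeMillis()));
        }
    }

The cost is one extra Memcached round trip per lookup for the version key; flushing a group is a single set and never touches the other groups.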
If you happen to be using Spring Boot, you could try the auto-configuration library for the Memcached cache; that library supports clearing out a cache group.

Memcached does not have support for range queries so unfortunately you cannot flush a subset of keys.

Related

Apache Ignite with PostgreSQL

Objective: To scale an existing application where PostgreSQL is used as the data store.
How can Apache Ignite help: We have an application with many modules, and all the modules use some shared tables. So we have only one PostgreSQL master database, and it is already on large AWS SSD machines. We already have Redis for caching, but the limitation of Redis, as we see it, is that partial updates and querying on secondary indexes are not easy.
Our use case:
We have two big tables: one is member and the second is subscription. It is a many-to-many relationship where one member is subscribed to multiple groups, and we maintain the subscriptions in the subscription table.
The member table has around 40 million rows, and its size is roughly 40M x 1.5KB + more ~= 60GB.
Challenge
The challenge is that we can't archive this data, since every member is active and there are frequent updates and reads on this table.
My thought:
As per what I read in the documentation, Apache Ignite can provide a caching layer on top of the PostgreSQL tables.
Now, I have a couple of questions from an implementation point of view.
Will Apache Ignite fit our use case? If yes, then:
Will Apache Ignite keep all 60GB of data in RAM, or can we distribute the RAM load across multiple machines?
For updating the PostgreSQL tables we use Python and SQLAlchemy (ORM). Will there have to be a separate call to Apache Ignite to update the same record in memory, or is there any way Apache Ignite can sync it immediately from the database?
Is there enough support for Python?
Is there REST API support for interacting with Apache Ignite? I would like to avoid an ODBC connection.
What if this load doubles within the next year?
A quick answer is much appreciated. Thanks in advance.
Yes it should fit your case.
Apache Ignite has optional persistence, meaning it can also store the data on disk, but if you employ it for caching only it will happily keep everything in RAM.
There are two approaches. You can do your updates through Apache Ignite (which will propagate them to PostgreSQL), or you can do your updates in PostgreSQL and have Apache Ignite fetch them on first use (pull from PostgreSQL). The latter only works for new records, as you can imagine. There is no support for propagating data from PostgreSQL to Apache Ignite; I guess you could build something like that with triggers, but it is untested. A minimal write-through sketch follows after this answer.
There is a third-party Python client, but I haven't tried it. Apache Ignite only has built-in native clients for C++/C#/Java for now; other platforms can only connect through JDBC/ODBC/REST and use only a fraction of the functionality.
There is a REST API, and it has improved recently.
120GB doesn't sound like anything scary as far as Apache Ignite is concerned.
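To make the first approach above concrete (update through Ignite and let it propagate to PostgreSQL), here is a minimal read-through/write-through sketch against Ignite's CacheStore API. The Member POJO, the cache name, and the SQL in the comments are placeholders for your schema; Ignite also ships a CacheJdbcPojoStore that can generate this mapping declaratively.

    import javax.cache.Cache;
    import javax.cache.configuration.FactoryBuilder;
    import org.apache.ignite.cache.store.CacheStoreAdapter;
    import org.apache.ignite.configuration.CacheConfiguration;

    public class MemberCacheSetup {

        // Placeholder value type standing in for a row of the member table.
        public static class Member implements java.io.Serializable {
            long id;
            String name;
        }

        // Read-through / write-through store: cache misses load from PostgreSQL,
        // cache.put() writes back to PostgreSQL.
        public static class MemberStore extends CacheStoreAdapter<Long, Member> {
            @Override public Member load(Long key) {
                // SELECT id, name FROM member WHERE id = ?   (called on a cache miss)
                return null; // placeholder
            }

            @Override public void write(Cache.Entry<? extends Long, ? extends Member> entry) {
                // INSERT ... ON CONFLICT (id) DO UPDATE ...  (called on cache.put)
            }

            @Override public void delete(Object key) {
                // DELETE FROM member WHERE id = ?            (called on cache.remove)
            }
        }

        public static CacheConfiguration<Long, Member> cacheConfiguration() {
            CacheConfiguration<Long, Member> ccfg = new CacheConfiguration<>("members");
            ccfg.setCacheStoreFactory(FactoryBuilder.factoryOf(MemberStore.class));
            ccfg.setReadThrough(true);   // misses are loaded from PostgreSQL
            ccfg.setWriteThrough(true);  // puts are propagated to PostgreSQL
            return ccfg;
        }
    }

With this wiring, application writes go through the cache and Ignite keeps PostgreSQL in sync; the reverse direction (PostgreSQL changed behind Ignite's back) is still not covered, as noted above.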
In addition to alamar's answer:
You can store your data in memory on many machines, as Ignite supports partitioned caches that are split into partitions and distributed across machines. You can configure data collocation and the number of backups; see the configuration sketch after this answer.
There is an interesting memory model in Apache Ignite that allows you to persist data to disk quickly. As the Ignite developers have said, a database behind the cluster will be slower than Ignite's native persistence, because communication goes through external protocols.
In our company we have a huge Ignite cluster that keeps far more data than that in RAM.
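To make the partitioning, backups, and native-persistence points concrete, here is a minimal configuration sketch against the Ignite 2.x Java API; the cache name, region size, and backup count are example values, not recommendations.

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;
    import org.apache.ignite.cache.CacheMode;
    import org.apache.ignite.configuration.CacheConfiguration;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.IgniteConfiguration;

    public class PartitionedCacheExample {
        public static void main(String[] args) {
            // Optional native persistence: cached data also goes to disk, so a 60GB
            // data set does not have to fit entirely in the RAM of one node.
            DataStorageConfiguration storage = new DataStorageConfiguration();
            storage.getDefaultDataRegionConfiguration()
                   .setPersistenceEnabled(true)
                   .setMaxSize(8L * 1024 * 1024 * 1024); // 8GB of RAM per node for this region

            IgniteConfiguration cfg = new IgniteConfiguration()
                    .setDataStorageConfiguration(storage);

            // PARTITIONED mode spreads entries across the cluster; setBackups(1)
            // keeps one extra copy of every partition on another node.
            CacheConfiguration<Long, String> ccfg = new CacheConfiguration<>("members");
            ccfg.setCacheMode(CacheMode.PARTITIONED);
            ccfg.setBackups(1);

            try (Ignite ignite = Ignition.start(cfg)) {
                ignite.cluster().active(true); // activation is required when persistence is enabled
                IgniteCache<Long, String> cache = ignite.getOrCreateCache(ccfg);
                cache.put(1L, "member-1");
                System.out.println(cache.get(1L));
            }
        }
    }

With setBackups(1), a single node failure does not lose in-memory data, and adding nodes spreads the 60GB (or a future 120GB) across more RAM.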

Zookeeper for Data Storage?

I want an external config store for some of my services, and the data can be in formats like JSON, YML, or XML. The use case is that I can save my configs, change them dynamically, and read them very frequently. Is Zookeeper a good solution for this? Also, my configs are at most 500MB.
The reason Zookeeper is under consideration is that it offers synchronization and versioning (as I will be changing configs a lot), and it can notify the depending services of changes to the config. Kindly tell me whether Zookeeper can serve as the data store and would be best for this use case, and offer any other suggestion if possible.
Zookeeper may be used as a data store, but:
The size of a single znode should not be larger than 1MB.
Fetching a huge number of nodes from Zookeeper takes time, so you need to use caches. You can use the Curator PathChildrenCache recipe. If your znodes form a tree structure, you can use TreeCache, but be aware that TreeCache had memory leaks in various 2.x versions of Curator.
Zookeeper notifications are a nice feature, but if you have a pretty big cluster you might end up with too many watchers, which puts stress on your Zookeeper cluster.
Please find more information about Zookeeper failure reasons.
So, generally speaking, Zookeeper can be used as a data store if the data is organized as key/value pairs and each value doesn't exceed 1MB. In order to get fast access to the data you should use caches on the application side: see the Curator PathChildrenCache recipe (a minimal sketch follows below).
Alternatives are etcd and Consul.
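Here is a minimal sketch of the PathChildrenCache recipe mentioned above, using the Curator Java API; the connection string and the /config/my-service path are examples, not something mandated by Zookeeper or Curator.

    import java.nio.charset.StandardCharsets;
    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.cache.PathChildrenCache;
    import org.apache.curator.framework.recipes.cache.PathChildrenCacheEvent;
    import org.apache.curator.retry.ExponentialBackoffRetry;

    public class ConfigWatcher {
        public static void main(String[] args) throws Exception {
            // Connect to the ensemble (connection string is an example).
            CuratorFramework client = CuratorFrameworkFactory.newClient(
                    "zk1:2181,zk2:2181,zk3:2181", new ExponentialBackoffRetry(1000, 3));
            client.start();

            // Local, automatically refreshed view of all children of /config/my-service;
            // frequent reads are served from this in-process cache instead of Zookeeper.
            PathChildrenCache cache = new PathChildrenCache(client, "/config/my-service", true);
            cache.getListenable().addListener((c, event) -> {
                if (event.getType() == PathChildrenCacheEvent.Type.CHILD_ADDED
                        || event.getType() == PathChildrenCacheEvent.Type.CHILD_UPDATED) {
                    System.out.println(event.getData().getPath() + " -> "
                            + new String(event.getData().getData(), StandardCharsets.UTF_8));
                }
            });
            cache.start(PathChildrenCache.StartMode.BUILD_INITIAL_CACHE);
        }
    }

Reads then hit the in-process cache, and the listener fires whenever a config znode is added or updated, which covers the "notify the depending service" requirement.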

Propagate change in distributed in-memory cache

I have an application deployed on a cluster of 1000 commodity boxes. On startup, each instance of the application loads a non-trivial amount of data from the database and uses it as a cache. Over the course of a day, around 20% of this cached data needs to be updated.
What are efficient ways of near-simultaneously updating the in-memory data across the entire cluster? I thought of JMX and Zookeeper, but I'm not sure whether they would really be efficient/fast enough.
Well, assuming you're using Memcached-style consistent hashing, go a step further and have each cache replicate to its closest successor. This can lessen the problem but not entirely alleviate it; still, it's a simple solution. Gossip + CRDTs are another option: Dynamo and Riak use a combination of gossip, consistent hashing, and CRDTs.
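To make "replicate to the closest successor" concrete, here is a minimal consistent-hash ring sketch in Java; the hash function, virtual-node count, and class names are illustrative choices, not something from the question.

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.util.Map;
    import java.util.TreeMap;

    // Minimal consistent-hash ring with virtual nodes. primary(key) owns the entry,
    // successorOf(key) is the next distinct node clockwise and holds the replica.
    public class HashRing {
        private static final int VNODES = 100;
        private final TreeMap<Long, String> ring = new TreeMap<>();

        public void addNode(String node) {
            for (int i = 0; i < VNODES; i++)
                ring.put(hash(node + "#" + i), node);
        }

        public String primary(String key) {
            Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
            return (e != null ? e : ring.firstEntry()).getValue();
        }

        // First node clockwise from the key that is not the primary: the replica target.
        public String successorOf(String key) {
            String primary = primary(key);
            for (Map.Entry<Long, String> e : ring.tailMap(hash(key), false).entrySet())
                if (!e.getValue().equals(primary)) return e.getValue();
            for (Map.Entry<Long, String> e : ring.entrySet())
                if (!e.getValue().equals(primary)) return e.getValue();
            return primary; // single-node ring: no distinct successor exists
        }

        // 32 bits of an MD5 digest, widened to a non-negative long for the ring positions.
        private static long hash(String s) {
            try {
                byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
                return ((long) (d[3] & 0xFF) << 24) | ((d[2] & 0xFF) << 16)
                     | ((d[1] & 0xFF) << 8) | (d[0] & 0xFF);
            } catch (Exception ex) {
                throw new IllegalStateException(ex);
            }
        }
    }

A client would write each updated entry to both primary(key) and successorOf(key), and read from the successor when the primary is unreachable, which is the lightweight redundancy the answer describes.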

How can I do Memcached HA?

I use Memcached to store about 5MB of data. About forty percent of the data is updated every second, which causes about 280 qps between the Memcached client and server, with gets and sets each making up half of the queries. Besides handling that traffic, I also face the HA problem.
Before choosing Memcached I also looked at Redis, but it is single-threaded and does not seem likely to perform well with data persistence enabled. Also, the Redis client does not seem as easy or as rich as Memcached's.
But how can I do HA with Memcached? How should I keep data duplicated between the master and slave Memcached servers? And when a Memcached server crashes, a data consistency problem follows. Are there already some good tools for Memcached HA, or is there a better NoSQL database to use instead of Memcached?
Maybe you can use repcached (repcached.lab.klab.org) or magent (https://code.google.com/p/memagent/).

How to rebalance MongoDB after adding a shard

How do I rebalance MongoDB data after adding a new shard?
I do not think it does it automatically.
I could not find anything; is there support for this?
Also, the key for my shards is unique.
Thanks,
When you add a new shard, MongoDB should automatically start re-balancing data according to your shard key.
However, it will not start re-balancing until there is a certain amount of data, so if you don't have a few gigs of data, you probably won't see any re-balancing.
If you have lots of data and you add a shard and it never starts migrating data, then it's time to go to the google group and start asking questions. As it stands, lots of people are using auto-sharding and it is definitely re-balancing automatically.
I'm not saying you haven't found a bug, just that you'll need a lot more data to back that up.
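If you want to check whether the balancer has actually moved anything, one way (sketched here with the modern MongoDB Java driver; the mongos host name is an example) is to count chunks per shard in the config database and look at the balancer status.

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.model.Accumulators;
    import com.mongodb.client.model.Aggregates;
    import java.util.Arrays;
    import org.bson.Document;

    // Run against a mongos router; it reads the cluster metadata in the config database.
    public class ChunkDistribution {
        public static void main(String[] args) {
            try (MongoClient client = MongoClients.create("mongodb://mongos-host:27017")) {
                // Chunk count per shard: roughly equal counts mean the balancer has done its job.
                MongoCollection<Document> chunks =
                        client.getDatabase("config").getCollection("chunks");
                for (Document d : chunks.aggregate(Arrays.asList(
                        Aggregates.group("$shard", Accumulators.sum("chunks", 1)))))
                    System.out.println(d.toJson());

                // Whether the balancer is enabled and currently running
                // (balancerStatus is available on reasonably recent MongoDB versions).
                Document status = client.getDatabase("admin")
                        .runCommand(new Document("balancerStatus", 1));
                System.out.println(status.toJson());
            }
        }
    }

If the chunk counts per shard stay heavily skewed even with several gigabytes of data, that is the point at which it is worth raising the question on the mailing list.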