I think there are many requirements for this, for example adding a new post to a thread. Why has no developer added this? Redis could do it, but Redis's goals are not the same as memcached's, and it is comparatively rare for Redis to be used purely as a cache.
Memcached is a hash table. Depending on your needs, you can easily build a scalable set implementation or just use it as a set itself.
Memcached is all about O(1) operations. It's a multi-threaded app whose priority is predictability. Any operation that can't be done in constant time isn't a suitable use case for memcached.
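One way to see the "build a set on top of it" point: a set can be layered on any O(1) get/set key-value interface by encoding membership in the key itself. A minimal sketch in Python, using a plain dict to stand in for a memcached client (the `kv` interface and key scheme here are illustrative assumptions, not memcached's API):

```python
# A plain dict stands in for a memcached client: both expose O(1) get/set.
kv = {}

def set_add(name, member):
    # Encode membership directly in the key: one O(1) write per member.
    kv["set:%s:%s" % (name, member)] = 1

def set_contains(name, member):
    # Membership test is a single O(1) read.
    return ("set:%s:%s" % (name, member)) in kv

def set_remove(name, member):
    kv.pop("set:%s:%s" % (name, member), None)

set_add("online_users", "alice")
set_add("online_users", "bob")
set_remove("online_users", "bob")
print(set_contains("online_users", "alice"))  # True
print(set_contains("online_users", "bob"))    # False
```

Note that with real memcached you could not enumerate the members of such a set: enumeration is not O(1), which is exactly the kind of operation the answer says memcached is not suited for.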
I understand Redis is fast, and I find that I can implement many things using just Redis, but at the expense of making multiple queries. For example, if I use Mongo, I might have a model/schema like:
Chatrooms (Mongo)
_id: ObjectId
name: string
users: array
With Redis I need something a bit more complex:
chatrooms:<<id>> where id needs to be manually generated
name
chatrooms:<<id>>:users to store the set of users
Retrieving chatroom details with Mongo is straightforward; with Redis I need to incur 2 queries in this case, and in a more complex use case maybe more.
So I wonder, from a performance point of view, which is more efficient? From a development point of view, it's definitely simpler to use Mongo, for example.
Redis supports Lua scripting via the EVAL/EVALSHA commands. What you call multiple queries can be collapsed into a single call, since Lua scripts are executed inside Redis itself; you can then invoke a cached Lua script by its SHA from your Redis client.
In terms of performance, it will depend on your own code, but you can expect Redis to beat Mongo, since Redis is an in-memory store that is persisted to disk asynchronously (it takes snapshots over time or when certain conditions are met).
As for simplicity, there's nothing simpler than working with data structures like sets, hashes, or lists, but simplicity is a subjective concept anyway...
About Lua scripts
Pay attention to the fact that Lua scripts are executed atomically, and since Redis processes one command at a time (an atomic script behaves like a single command), you should not implement heavy operations in Lua, because they can degrade overall Redis performance.
Usually you implement Lua scripts to write data atomically, like "set a string key, add a member to some set, and remove it from who knows what sorted set". Since you don't want any chance of corrupting your data, you use Lua scripting or the MULTI command to implement atomic operations.
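The shape of that multi-key write can be sketched as follows, with a plain Python dict standing in for Redis and a lock standing in for the atomicity that EVAL or MULTI/EXEC gives you (the function and key names are illustrative, not redis-py API):

```python
import threading

store = {"strings": {}, "sets": {}, "zsets": {}}
lock = threading.Lock()  # stands in for the atomicity of a Lua script / MULTI-EXEC

def atomic_update(key, value, set_name, member, zset_name):
    # All three writes happen together, like one Lua script executing in Redis:
    # set a string key, add a member to a set, remove it from a sorted set.
    with lock:
        store["strings"][key] = value
        store["sets"].setdefault(set_name, set()).add(member)
        store["zsets"].setdefault(zset_name, {}).pop(member, None)

store["zsets"]["recently_seen"] = {"alice": 3.0}
atomic_update("user:alice:status", "online", "online_users", "alice", "recently_seen")
print(store["strings"]["user:alice:status"])      # online
print("alice" in store["sets"]["online_users"])   # True
print("alice" in store["zsets"]["recently_seen"])  # False
```

The point of doing this inside one script (or one lock, in the sketch) is that no reader can ever observe the string set but the sorted set not yet updated.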
That's one tricky question - you're basically asking to compare performance between two databases and contrast that with ease of use.
In terms of ease of use, I definitely believe MongoDB is more suitable for newcomers and, perhaps, developers who just want to use a database w/o understanding the pros/cons/strengths/weaknesses of each technology. MongoDB is an amazing piece of tech that is extremely accessible to a wide range of engineers at different levels - that's both its power and its downfall IMO: while you can do a lot with it, it is too generic for my taste to be really effective (sort of like a new Oracle).
Redis, OTOH, provides the basic building blocks - you need to understand how it works and how to put the blocks together to get at the needed result. This requires more effort for a developer - both for learning the Redis way and then implementing the solution with it.
So while MongoDB is easier to start using, I think Redis is more flexible in the long term and allows you to have fine-grained control over every aspect of your database.
In terms of performance, I'm positive that Redis will outdo MongoDB any time. Yes, performance is affected by the "number of calls", but that's hardly a conclusive measure. The example provided (chatrooms) can easily be encapsulated in one or a few "calls" (e.g. use a Hash to contain all your objects by id), and that's just one approach. Even assuming the OP's proposed "schema" and making multiple calls to get the same data from Redis that one call gets from MongoDB, I'm sure the data will come back faster from Redis than from any other database, simply because Redis serves from RAM whereas MongoDB is disk-based.
I suggest that in order to choose the right tech/solution, you take a minute to define and consider the requirements - if you need performance, Redis is your best bet.
It depends on what you want to index. If you just want to index by <id>, I don't see why you cannot get all the data in one query from redis. What you can do is to encode the name and users into, say, a JSON string, and retrieve it with just one query.
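The one-query idea can be sketched like this, with a dict standing in for Redis string keys (in real Redis you would SET/GET the JSON string under chatrooms:&lt;id&gt;; the helper names are mine, not a client API):

```python
import json

kv = {}  # stands in for Redis string keys

def save_chatroom(room_id, name, users):
    # Encode the whole chatroom as one JSON value under one key.
    kv["chatrooms:%s" % room_id] = json.dumps({"name": name, "users": users})

def load_chatroom(room_id):
    # One GET recovers everything; no second query for the user set.
    return json.loads(kv["chatrooms:%s" % room_id])

save_chatroom(42, "general", ["alice", "bob"])
room = load_chatroom(42)
print(room["name"])   # general
print(room["users"])  # ['alice', 'bob']
```

The trade-off is that you lose the ability to index or atomically update individual fields (e.g. adding one user to the set) without rewriting the whole value.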
In general, performance comparison is tricky: fewer queries don't necessarily mean faster. I normally prefer Redis- or memcached-style in-memory caching when the data is not complex and does not require indexes over many fields.
I am using a triple store database for one of my projects (a semantic search engine for healthcare), and it works pretty well. I am considering giving it a performance boost by putting a layer of key-value store above the triple store, since triple store querying is slow due to our deep semantic processing.
This is how I am planning to improve performance:
1) Running Hadoop job for all query terms every day by querying triple store.
2) Caching these results in a key value store in a cluster.
3) When user searches for a query term, instead of searching triple store, key value store will be searched first. Triple store will be searched only when query term not found in key value store.
The key-value pair I plan to save maps a String to a list of POJOs; I can save the value as a BLOB.
I am unsure which key-value store to use. I am looking mainly for failover and load-balancing support. All I need is a simple key-value store that provides the above features; I do not need to sort/search within values or any other functionality.
Please correct me if I am wrong. I am assuming memcached and Redis will be faster since they are in-memory, but I do not know whether the Java clients of Redis (JRedis) or memcached (spymemcached) support failover. I am not sure whether to go with in-memory or persistent storage. I am also considering Voldemort, Cassandra, and HBase. Overall the key-values will be around 2GB to 4GB in size. Any pointers on this will be really helpful.
I am very new to nosql and key value stores. Please let me know if you need any more details.
Have you gone over the memcached tutorial articles below? They explain the load-balancing aspects (memcached instances balance load based on your key hash) and discuss how spymemcached handles connectivity failures:
Use Memcached for Java enterprise performance, Part 1: Architecture and setup http://www.javaworld.com/javaworld/jw-04-2012/120418-memcached-for-java-enterprise-performance.html
Use Memcached for Java enterprise performance, Part 2: Database-driven web apps http://www.javaworld.com/javaworld/jw-05-2012/120515-memcached-for-java-enterprise-performance-2.html
For enterprise-grade failover and cross-data-center replication support on top of memcached, you could use Couchbase, which offers these features; the product evolved from a memcached base.
Before you build infrastructure to load your cache, you might just try adding memcached on top of your existing system. First, measure your current performance well. I suggest JMeter or similar tools. Here's the workflow in your application: Check memcached, if it's there, you're done. If not, run the query against the triple store and save the results in memcached. This will improve performance if you have queries that are repeated. Memcached will use the memory you give it efficiently, throwing away things that don't get used very often. Failover is handled by your application (if it's not in memcached, you use your existing infrastructure).
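The workflow in that paragraph is the classic cache-aside pattern. A minimal sketch, with a dict standing in for memcached and a hypothetical query_triple_store function standing in for the existing slow path:

```python
cache = {}          # stands in for memcached
backend_calls = []  # records how often the slow path is taken

def query_triple_store(term):
    # Hypothetical stand-in for the slow SPARQL query against the triple store.
    backend_calls.append(term)
    return ["result-for-%s" % term]

def search(term):
    # 1. Check the cache; on a hit we're done.
    if term in cache:
        return cache[term]
    # 2. On a miss, fall back to the triple store and populate the cache.
    result = query_triple_store(term)
    cache[term] = result
    return result

search("diabetes")         # miss: hits the triple store
search("diabetes")         # hit: served from cache
print(len(backend_calls))  # 1
```

Failover falls out naturally: if the cache lookup fails for any reason, the application simply takes the slow path it already had.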
We use a triple store and cache data in the memcache provided by Google App Engine, and it works fine; it reduced the overhead of running SPARQL queries against the triple store.
Only Cassandra has the features you mention plus full CQL support, which helps with maintenance; otherwise you should perhaps look in another direction:
Write heavy, replicated, bigger-than-memory key-value store
Since you just want to cache data in front of your triple store, going with disk-based or replicated/distributed key-value stores seems pointless. All you essentially need is to cache data in front of your queries, right on the machines where those queries run. No "key-value stores", just a vanilla Java caching solution.
In 2016 the best cache for Java is Caffeine.
I use Redis for the front end of my web app, but for the back end I want to use Riak, and it is not clear whether it supports atomic increments under eventual consistency. I would like to implement counters, though they don't have to be as fast as Redis. If Riak cannot do it, what can? Besides Redis, of course.
Thanks
This is not possible. There is no way to lock a Riak key-value pair in order to prevent other processes from incrementing it at the same time. Use a different product.
Riak does support atomic commits, but only for a single key/value. What you need to bear in mind is that when you write to Riak you're writing to a cluster that is eventually consistent. There is a chance that you're going to write a value to one node at the same time that something else is writing a value to the same location in another node. Assuming that allow_mult is set to true on the bucket, this will result in a conflict at read time which will need to be resolved by your application.
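One common way to get correct counters despite such conflicts is to keep one sub-counter per node and resolve conflicting siblings by taking the per-node maximum (the G-Counter CRDT idea). A sketch of the merge logic, independent of any Riak client API:

```python
def increment(counter, node_id, amount=1):
    # Each node only ever increments its own entry.
    counter = dict(counter)
    counter[node_id] = counter.get(node_id, 0) + amount
    return counter

def merge(a, b):
    # Sibling resolution: per-node maximum is safe because entries only grow.
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}

def value(counter):
    # The counter's value is the sum over all nodes' entries.
    return sum(counter.values())

# Two replicas diverge, then their siblings are merged at read time.
base = {}
replica1 = increment(increment(base, "node-a"), "node-a")  # node-a counted twice
replica2 = increment(base, "node-b")                       # node-b counted once
print(value(merge(replica1, replica2)))  # 3
```

This is exactly the kind of application-level conflict resolution the answer describes: with allow_mult set to true, your read handler merges the siblings instead of losing an increment.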
I need a recommendation for a key-value store. Here's my criteria:
Doesn't have to be persistent but needs to support lots of records (records are small, 100-1000 bytes)
Insert (put) will happen only occasionally, always in large datasets (bulk)
Get will be random and needs to be fast
Clients will be in Ruby and, perhaps Java
It should be relatively easy to set up, with as little maintenance needed as possible
Redis sounds like the right thing to use here. It's all in memory so it's very fast (The GET and SET operations are both O(1)) and it supports both Ruby and Java clients.
Aerospike would be a perfect fit, for the following reasons:
Key-value based, with clients available in Java and Ruby.
Throughput: better than Redis/Mongo/Couchbase or any other NoSQL solution. See this http://www.aerospike.com/blog/use-1-aerospike-server-not-12-redis-shards/. I have personally seen it work fine with more than 300k read TPS and 100k write TPS concurrently.
Automatic and efficient data sharding, re-balancing, and distribution using RIPEMD160.
Highly available in case of failover and/or network partitions.
Open-sourced from version 3.0.
Can be used in caching mode with no persistence.
Supports LRU and TTL.
Little or no maintenance.
An AVL-Tree will give you O(log n) on insert, remove, search and most everything else.
1 and 3 both scream a database engine.
If your number of records isn't insane and you only have one client using this thing at the same time, I would personally recommend sqlite, which works with both Java and Ruby (also would pass #5). Otherwise go with a real database system, like MySql (since you're not on the Microsoft stack).
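The bulk-insert/random-get pattern in the question maps well onto sqlite. A minimal sketch using Python's standard sqlite3 module (an in-memory database here; pass a file path instead for a persistent store):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a filename for a persistent store
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v BLOB)")

# Occasional bulk insert: executemany inside one transaction is fast.
records = [("key:%d" % i, b"value-%d" % i) for i in range(1000)]
with conn:
    conn.executemany("INSERT INTO kv VALUES (?, ?)", records)

# Random gets: the primary-key index makes point lookups fast.
row = conn.execute("SELECT v FROM kv WHERE k = ?", ("key:123",)).fetchone()
print(row[0])  # b'value-123'
```

Wrapping the bulk insert in a single transaction (the `with conn:` block) matters: committing per row is orders of magnitude slower.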
I am trying to understand what the need would be for a solution like memcached. It may seem like a silly question, but what does it bring to the table if all I need is to cache objects? Won't a simple hashmap do?
Quoting from the memcached web site, memcached is:

Free & open source, high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.

Memcached is an in-memory key-value store for small chunks of arbitrary data (strings, objects) from results of database calls, API calls, or page rendering. Memcached is simple yet powerful. Its simple design promotes quick deployment, ease of development, and solves many problems facing large data caches. Its API is available for most popular languages.

At heart it is a simple key/value store.
A key word here is distributed. In general, quoting from the memcached site again:

Memcached servers are generally unaware of each other. There is no crosstalk, no synchronization, no broadcasting. The lack of interconnections means adding more servers will usually add more capacity as you expect. There might be exceptions to this rule, but they are exceptions and carefully regarded.
I would highly recommend reading the detailed description of memcache.
Where are you going to put this hashmap? That's what memcached is doing for you. Any structure you build in PHP lives only until the request ends; if you put things in a persistent cache, you can fetch them back for other requests instead of rebuilding the data.
I know this question is rather old, but in addition to being able to share a cache across multiple servers, there is another aspect not mentioned in the other answers: value expiration.
If you store values in a HashMap bound to the application context, it will keep growing in size unless you expire items somehow. Memcached expires objects lazily for maximum performance.
When an item is added to memcached, it can have an expiration time, for instance 600 seconds. After the item expires it just remains in memory, but if a client asks for it, memcached purges it and returns null.
Similarly, when memcached's memory is full, it looks for the first expired item of adequate size and evicts it to make room for the new item. Finally, if the cache is full and there is no expired item to evict, it replaces the least recently used items.
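The expiration-plus-eviction behaviour described above can be sketched with an OrderedDict: entries carry a TTL, expired entries are purged lazily on access, and when the cache is full the least recently used entry is evicted. This is a toy model of the idea, not memcached's actual slab-based implementation:

```python
import time
from collections import OrderedDict

class TinyCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # key -> (value, expires_at); order = recency

    def set(self, key, value, ttl):
        if key not in self.items and len(self.items) >= self.capacity:
            # Cache full: evict the least recently used entry.
            self.items.popitem(last=False)
        self.items[key] = (value, time.monotonic() + ttl)
        self.items.move_to_end(key)

    def get(self, key):
        if key not in self.items:
            return None
        value, expires_at = self.items[key]
        if time.monotonic() >= expires_at:
            # Lazy expiration: purge only when the key is asked for.
            del self.items[key]
            return None
        self.items.move_to_end(key)  # mark as recently used
        return value

cache = TinyCache(capacity=2)
cache.set("a", 1, ttl=600)
cache.set("b", 2, ttl=600)
cache.get("a")              # touch "a", so "b" becomes least recently used
cache.set("c", 3, ttl=600)  # cache full: evicts "b"
print(cache.get("b"))       # None
print(cache.get("a"))       # 1
```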
Using a fully fledged cache system usually also lets you replicate the cache across many servers, or simply scale out to handle many parallel requests, all while remaining acceptably fast in its replies.
There is an (old) article that compares different caching systems used with PHP:
https://www.percona.com/blog/2006/08/09/cache-performance-comparison/
Basically, file caching is faster than memcached.
So, to answer the question, I believe you would get better performance using a file-based cache system.
Here are the results from the tests of the article:
Cache Type Cache Gets/sec
Array Cache 365000
APC Cache 98000
File Cache 27000
Memcached Cache (TCP/IP) 12200
MySQL Query Cache (TCP/IP) 9900
MySQL Query Cache (Unix Socket) 13500
Selecting from table (TCP/IP) 5100
Selecting from table (Unix Socket) 7400