Why does memcached not support "multi set"?

Can anyone explain why the memcached folks decided to support multi get but not multi set?
By multi I mean an operation involving more than one key (see the protocol at http://code.google.com/p/memcached/wiki/NewCommands).
So you can get multiple keys in one shot (the basic advantage being the standard saving from fewer round trips), but why can't you do bulk sets?
My theory is that memcached was meant to handle fewer sets, and those individually (e.g. on a cache read miss). But I still do not see how multi-set really conflicts with the general philosophy of memcached.
I looked at the client features at http://code.google.com/p/memcached/wiki/NewCommonFeatures and it seems that some clients potentially do support "Multi-Set" (why only in the binary protocol?). I am using spymemcached for Java, btw.

It's not supported in the text protocol because it'd be very, very complicated to express, no clients would support it, and it would provide very little that you can't already do from the text protocol.
It's supported in the binary protocol because it's a trivial use case of binary operations.
spymemcached supports it implicitly -- just do a bunch of sets and magic happens:
http://dustin.github.com/2009/09/23/spymemcached-optimizations.html
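As a small sketch of what that looks like in practice with spymemcached (host, keys and values here are placeholders): you simply issue the sets back to back, and the client batches them onto the wire for you, while multi-get is an explicit API call.

```java
import java.net.InetSocketAddress;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.TimeUnit;
import net.spy.memcached.MemcachedClient;

public class BulkExample {
    public static void main(String[] args) throws Exception {
        MemcachedClient client =
                new MemcachedClient(new InetSocketAddress("localhost", 11211));

        // "Multi-set": just issue the sets one after another. Each set() is
        // asynchronous, so the client can pack many of them into the same
        // write to the socket (the optimization described in the link above).
        client.set("user:1", 3600, "alice");
        client.set("user:2", 3600, "bob");
        client.set("user:3", 3600, "carol");

        // Multi-get is explicit in the API: one logical round trip for all keys.
        Map<String, Object> values =
                client.getBulk(Arrays.asList("user:1", "user:2", "user:3"));
        System.out.println(values);

        // Graceful shutdown so queued operations get a chance to flush.
        client.shutdown(10, TimeUnit.SECONDS);
    }
}
```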

I don't know a lot about memcached internals, but I assume writes have to be blocking, atomic operations. I assume that by allowing multiple set operations to be batched, you could block all reads for a long time (or risk a get occurring while only half of a write had been applied). Forcing writes to be done individually allows them to be interleaved fairly with gets.

I would imagine that the restriction against using multi sets is to avoid collisions when writing cached values to the memcache.
For an object cache, I can't foresee a case where you would need transactional-type writes. That use case seems less suited to a caching layer and better suited to the underlying database.
If sets come in interleaved from different clients, it is most likely the case that for one key, the last one wins, or is at least close enough, until the cache is invalidated and a newer value is written.
As Gian mentions, there don't seem to be any good reasons to block reads from the cache while several or many writes to the cache happen.

Event Sourcing - How to query inside a command?

We would like to be able to read state inside a command use case.
We could get the state from the event store for the specific aggregate, but what about querying aggregates by a field (not the id), or performing more complicated queries that are not a good fit for the event store?
The approach we were considering was to use our read model for those cases as well, not only for query use cases.
This might be inconsistent, so a solution could be to store the latest version of the aggregate in both the write and read models, in order to be able to tell whether the state is correct or stale.
Does this make sense, and if so, when we need to get state by id, should we use the event store or the read model?
If you want the absolute latest state of an event-sourced aggregate, you're going to have to read the latest snapshot (assuming that you are snapshotting) and then replay events since that snapshot from the event store. You can be aggressive about snapshotting (conceivably even saving a snapshot after every command), but you're giving away some write performance to make the read faster.
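As a rough sketch of that snapshot-plus-replay read path (the store interfaces, event type and aggregate below are hypothetical stand-ins, not any particular framework's API):

```java
import java.util.List;

// Hypothetical interfaces, only to illustrate the shape of the read path.
interface Event { }
record Snapshot(long version, OrderAggregate state) { }
interface SnapshotStore { Snapshot loadLatest(String aggregateId); }
interface EventStore { List<Event> loadSince(String aggregateId, long version); }

class OrderAggregate {
    private long version;
    // ...domain state elided...

    void apply(Event e) { version++; /* mutate state per event type */ }
    long version() { return version; }

    // Latest state = last snapshot + every event recorded after it.
    static OrderAggregate load(String id, SnapshotStore snapshots, EventStore events) {
        Snapshot snap = snapshots.loadLatest(id);
        OrderAggregate agg = (snap != null) ? snap.state() : new OrderAggregate();
        long from = (snap != null) ? snap.version() : 0L;
        for (Event e : events.loadSince(id, from)) {
            agg.apply(e);
        }
        return agg;
    }
}
```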
Updating the read model directly is conceivably possible, though that level of coupling is something that should be considered very carefully. Note also that you will very likely need some sort of two-phase commit to ensure that the read model is only updated when the write model is updated and vice versa. I strongly suggest considering why you're using CQRS/ES in this project, because you are quite possibly undermining that reason by doing this sort of thing.
In general, if you need a query for processing a particular command, that query will likely always be the same, i.e. you don't need free-form query support. In that case, you can often have a read model that's tuned for exactly that query and which only cares about events that could affect it: often a fairly small subset of the events. The finer-grained the read model, the easier it is to keep in sync (if it ignores 99% of events, for instance, it can't really fall that far behind).
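To illustrate what such a narrowly-scoped read model might look like, here is a minimal sketch; the event and field names are invented for the example:

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// A read model that answers exactly one question ("which customer owns this
// email address?") and therefore only has to handle the one event that can
// change the answer. Every other event type is simply ignored.
class EmailOwnershipProjection {
    private final Map<String, String> customerIdByEmail = new ConcurrentHashMap<>();

    // Called by whatever mechanism feeds events to projections.
    void on(Object event) {
        if (event instanceof CustomerEmailChanged e) {
            customerIdByEmail.put(e.email(), e.customerId());
        }
        // All other events are irrelevant to this query.
    }

    Optional<String> ownerOf(String email) {
        return Optional.ofNullable(customerIdByEmail.get(email));
    }

    record CustomerEmailChanged(String customerId, String email) { }
}
```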
Needing to make complex queries as part of command processing could also be a sign that your aggregate boundaries aren't right and could do with a re-examination.
Does this make sense
Maybe. Let's start with
This might be inconsistent
Yup, it might be. So what?
We typically respond to a query by sending an unlocked copy of the answer. In other words, it's possible that the actual information in the write model will change after this response is dispatched but before the response arrives at its destination. The client will be looking at a copy of the answer taken from the past.
So we might reasonably ask how much better it is to get information no more than one minute old compared to information no more than five minutes old. If the difference in value is pennies, then you should probably deploy the five minute version. If the difference is millions of dollars, then you're in a good position to negotiate a real budget to solve the problem.
For processing a command in our own write model, that kind of inconsistency isn't usually acceptable or wise. But neither of the two common answers requires keeping the read and write models synchronized. The most common answer is to just work with the write model alone. The less common answer is to grab a snapshot out of a cache, and then apply any additional events to it to bring it up to date. The latter approach is "just" a performance optimization (first rule: don't.)
The variation that trips everyone up is trying to process a command somewhere else, enforcing a consistency rule on our data here. Once again, you need a really clear picture of how valuable the consistency is to the business. If it's really important, that may be a signal that the information in question shouldn't be split into two different piles - you may be working with the wrong underlying data model.
Possibly useful references
Pat Helland Data on the Outside Versus Data on the Inside
Udi Dahan Race Conditions Don't Exist

Is it practical to use one table for reading purpose only in a relational database?

I know this question would not be ideal in a real database world; however, I am building a REST API to serve a result that potentially needs to join almost every table (I do use normalization, for sure).
So is it OK to have one single table holding the metadata used by the read API, with that table also getting updated whenever data is updated in the other tables? I am using PostgreSQL, by the way.
This is not very clear, so I will state my understanding of the question and give you what I see as the tradeoffs.
First, it sounds to me like you want to effectively materialize a metadata table and have it live-updated when other tables update. This is not really what the MATERIALIZED VIEW support in PostgreSQL is for.
You can use a trigger to update the data whenever something changes. Because of the way PostgreSQL handles things, this leads to more disk and CPU activity, but will probably add more of the latter than the former. So if you are heavily CPU-bound, that will pose more problems than if you are I/O-bound.
Using triggers in this way adds a fair bit of complexity to your database and may reduce write scaling a bit, but if the data is seldom written and read frequently, it may be a clear win.
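As a hedged illustration of the trigger approach (the table and column names, items and api_read_model, are invented for the example, and the ON CONFLICT upsert assumes PostgreSQL 9.5 or later):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ReadTableTrigger {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mydb", "app", "secret");
             Statement st = conn.createStatement()) {

            // Keep the flat read-only table in sync whenever the source table changes.
            st.execute(
                "CREATE OR REPLACE FUNCTION sync_api_read_model() RETURNS trigger AS $$\n" +
                "BEGIN\n" +
                "  INSERT INTO api_read_model (item_id, title, updated_at)\n" +
                "  VALUES (NEW.id, NEW.title, now())\n" +
                "  ON CONFLICT (item_id) DO UPDATE\n" +
                "    SET title = EXCLUDED.title, updated_at = EXCLUDED.updated_at;\n" +
                "  RETURN NEW;\n" +
                "END;\n" +
                "$$ LANGUAGE plpgsql");

            st.execute(
                "CREATE TRIGGER items_sync AFTER INSERT OR UPDATE ON items\n" +
                "FOR EACH ROW EXECUTE PROCEDURE sync_api_read_model()");
        }
    }
}
```

Every write to the source table then pays the extra upsert, which is where the "may reduce write scaling a bit" tradeoff comes from.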
So in answer to your question, yes it is practical in at least some cases. Whether it is practical in your case, that will be for you to decide.

Key Value storage without a file system?

I am working on an application where we are writing lots and lots of key-value pairs. In production the database size will run into hundreds of terabytes, even multiple petabytes. The keys are 20 bytes and the values are at most 128 KB, and very rarely smaller than 4 KB. Right now we are using MongoDB. The performance is not very good, because obviously there is a lot of overhead going on here. MongoDB writes to the file system, which writes to LVM, which further writes to a RAID 6 array.
Since our requirement is very basic, I think using a general-purpose database system is hurting the performance. I was thinking of implementing a simple database system where we could put the documents (or 'values') directly on the raw drive (actually the RAID array), and store the keys (and a pointer to where the value lives on the raw drive) in a fast in-memory database backed by an SSD. This would also speed up reads, as there would be no fragmentation (as opposed to using a filesystem).
Although a document is rarely deleted, we would still have to maintain a pool of free space available on the device (something that the filesystem would have provided).
My question is: will this really provide any significant improvement? Also, are there any document storage systems that do something like this? Or anything similar that we could use as a starting point?
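To make the scheme I'm describing concrete, here is a toy sketch of the idea (the class names are made up; free-space management, index durability and write alignment for raw devices are all glossed over):

```java
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Values are appended to the raw device; an in-memory map holds
// key -> (offset, length). Free-space reuse, crash recovery and the
// SSD-backed index are omitted from this illustration.
class RawDeviceStore implements AutoCloseable {
    private record Location(long offset, int length) { }

    private final FileChannel device;
    private final AtomicLong writeCursor = new AtomicLong(0);
    private final Map<String, Location> index = new ConcurrentHashMap<>();

    RawDeviceStore(String devicePath) throws Exception {
        this.device = new RandomAccessFile(devicePath, "rw").getChannel();
    }

    void put(String key, byte[] value) throws Exception {
        long offset = writeCursor.getAndAdd(value.length);
        device.write(ByteBuffer.wrap(value), offset);
        index.put(key, new Location(offset, value.length));
    }

    byte[] get(String key) throws Exception {
        Location loc = index.get(key);
        if (loc == null) return null;
        ByteBuffer buf = ByteBuffer.allocate(loc.length());
        device.read(buf, loc.offset());
        return buf.array();
    }

    @Override public void close() throws Exception { device.close(); }
}
```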
Apache Cassandra jumps to mind. It's the current NoSQL solution of choice where massive scaling is concerned. It sees production usage at several large companies with massive scaling requirements. Having worked a little with it, I can say that it requires a bit of time to rethink your data model to fit how it arranges its storage engine. The famously cited article "WTF is a supercolumn" gives a sound introduction to this. Caveat: Cassandra really only makes sense when you plan on storing huge datasets and distribution with no single point of failure is a mission-critical requirement. With the way you've explained your data, it sounds like a fit.
Also, have you looked into Redis at all, at least for storing the key references? Your memory requirements far outstrip what a single instance would be able to handle, but Redis can also be configured to shard. It isn't its primary use case, but it sees production use at both Craigslist and Groupon.
Also, have you done everything possible to optimize Mongo, especially investigating how you could improve indexing? Mongo does save out to disk, but should be relatively performant when optimized to keep the hottest portion of the set in memory, if possible.
Is it possible to cache this data if it's not too transient?
I would totally caution you against rolling your own with this. Just a fair warning. That's not a knock at you or anyone else; it's just that I've personally had to maintain custom "data indexes" written by in-house developers who got in way over their heads before. At my job we have a massive on-disk key-value store that is a major performance bottleneck in our system, and it was written by a developer who has since left the company. It's frustrating to be stuck with such a solution amid the exciting NoSQL opportunities of today. Projects like the ones I cited above get to take advantage of the whole strength of the open-source community to prove and optimize their use. That isn't something you will be able to attain working on your own solution unless you make a massive investment of time, effort and promotion. At the very least I'd encourage you to look at all your NoSQL options and maybe find a project you can contribute to rather than rolling your own. Writing a database server itself is definitely a nontrivial task that needs a huge team, especially with the requirements you've given (but should you end up doing so, I wish you luck! =) )
Late answer, but for future reference I think Spider does this

NoSQL databases: what about read consistency?

From what I can make out, NoSQL databases might be a good option for high-intensity data-read applications, but are a less good fit if you also need to do a lot of data updates and transactionality is very important to you (what with there being no ACID compliance). Right? Too simplistic, maybe.
But anyway, supposing I'm at least partly right, I'm now concerned about how NoSQL databases maintain a "read consistent" view of the data that you're either reading or writing. Or do they? And if they don't, isn't that a really big problem?
I mean, if the data that you're reading (or updating) is changing as you read it, then you're potentially going to get an inconsistent/dirty result set. Coming from an Oracle RDBMS background, where all this is just handled for you, I find it confusing how the lack of read consistency could be anything but a big problem. It could well be that I'm missing some key point about all this. Can someone set me straight?
I am a developer on the Oracle NoSQL Database and will answer your question relative to that particular NoSQL system.
The Oracle NoSQL Database API allows the programmer to specify -- with each API call -- the level of read consistency. The four possible values, ranging from strictest to loosest, are Absolute, Time, Version, and None. Absolute says to always read from the replication master so that the most current value is returned. Time says that the system can return a value from any replica that is within a certain time delta of the master (e.g. read the value from any replica that is within 2 seconds of the master). Every read and write call to the system returns a "version handle"; this handle may be passed into any read call when Consistency.Version is specified, and it tells the system to read from any replica that is at least as up to date as that version, which is useful for read-modify-write (aka CAS) scenarios. The last value, Consistency.None, says that any replica can be used (i.e. there is no consistency guarantee).
I hope this is helpful.
Charles Lamb
A NoSQL database can be read-consistent, although it's generally not a big problem if it's not strictly so; check out the CAP theorem. There's been quite a lot of research done in this area; I recommend reading Amazon's Dynamo paper for a quick view of some of the problems and solutions faced by distributed systems like NoSQL databases.
MongoDB allows the application to choose how strongly a write must be acknowledged (and therefore how quickly replicas become consistent) using a "write concern". This concept allows your application to block until a certain condition is met for a given write.
By way of example, you can consider any write successful so long as the operation is communicated to a master server. Alternatively, you can block until a write has been propagated to a majority of nodes in your replica set. In this way, you can mix performance/consistency to taste.
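As a small sketch with the MongoDB Java driver (the connection string, database, collection and document fields here are placeholders):

```java
import com.mongodb.WriteConcern;
import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import org.bson.Document;

public class WriteConcernExample {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoCollection<Document> fast = client
                    .getDatabase("app")
                    .getCollection("events")
                    .withWriteConcern(WriteConcern.W1);        // acknowledged by the primary only

            MongoCollection<Document> safe = fast
                    .withWriteConcern(WriteConcern.MAJORITY);  // block until a majority of the replica set has it

            fast.insertOne(new Document("type", "page_view"));     // fastest, weakest guarantee
            safe.insertOne(new Document("type", "order_placed"));  // slower, survives primary failover
        }
    }
}
```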
It depends on the NoSQL database you are using as each implements a different strategy. You can read, for example, Riak's explanation of their "eventual consistency" model or Lars Hofhansel's writeup on ACID in HBase

Disadvantages of CouchDB

I've very recently fallen in love with CouchDB. I'm pretty excited by its enormous benefits and by its beauty. Now I want to make sure that I haven't missed any show-stopping disadvantages.
What comes to your mind? Attached is a list of points that I have collected. Is there anything to add?
Blog posts from as late as 2010 claim "not mature enough" (whatever that's worth).
Slower than in-memory DBMS.
In-place updates require server-side logic (update handlers).
Trades disk vs. speed: Databases can become huge compared to other DBMS (compaction functionality exists, though).
"Only" eventual consistency.
Temporary views on large datasets are very slow.
Replication of large databases may fail.
Map/reduce paradigm requires rethinking (only for completeness).
The only point that worries me is #3 (in-place updates), because it's quite inconvenient.
The data is in JSON
Which means that documents are quite large (big data, network bandwidth, speed), and having descriptive key names actually hurts, since they add to the document size.
No built-in full-text search
Although there are ways around this: couchdb-lucene, elasticsearch
plus some more:
It doesn't support transactions
It means that enforcing uniqueness of one field across all documents is not safe, for example, enforcing that a username is unique. Another consequence of CouchDB's inability to support the typical notion of a transaction is that things like inc/decrementing a value and saving it back are also dangerous. There aren't many instances that we would want to simply inc/decrement some value where we couldn't just store the individual documents separately and aggregate them with a view.
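For example, the "store individual documents and aggregate them with a view" workaround for counters can look roughly like this; a hedged sketch against CouchDB's plain HTTP API via java.net.http, where the database name, design document and view are invented, and the view is assumed to map emit(doc.counter, doc.delta) with a _sum reduce:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CouchCounter {
    private static final HttpClient HTTP = HttpClient.newHttpClient();
    private static final String DB = "http://localhost:5984/counters";

    // Instead of read-modify-write on a single counter document (racy without
    // transactions), record each increment as its own tiny document...
    static void increment(String counter) throws Exception {
        String doc = "{\"counter\":\"" + counter + "\",\"delta\":1}";
        HttpRequest req = HttpRequest.newBuilder(URI.create(DB))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(doc))
                .build();
        HTTP.send(req, HttpResponse.BodyHandlers.ofString());
    }

    // ...and let the view (map: emit(doc.counter, doc.delta), reduce: _sum)
    // do the aggregation at read time.
    static String total(String counter) throws Exception {
        HttpRequest req = HttpRequest.newBuilder(URI.create(
                DB + "/_design/counters/_view/total?key=%22" + counter + "%22"))
                .GET()
                .build();
        return HTTP.send(req, HttpResponse.BodyHandlers.ofString()).body();
    }
}
```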
Relational data
If the data makes a lot of sense in 3rd normal form, and we try to follow that form in CouchDB, we are going to run into a lot of trouble. A possible way to solve this problem is with view collation, but we might constantly be fighting with the system. If the data can be reformatted to be much more denormalized, then CouchDB will work fine.
Data warehouse
The problem with this is that temporary views in CouchDB on large datasets are really slow. Using CouchDB and permanent views could work quite well. However, in most cases, a column-oriented database of some sort is a much better tool for the data-warehousing job.
But CouchDB Rocks!
But don't let that discourage you: NoSQL DBs that are written in Erlang (CouchDB, Riak) are the best, since Erlang is meant for distributed systems. Have fun with Couch!
Two more things that make me cry when using CouchDB (though it's awesome):
It is not designed for frequently updated data
It doesn't have built-in fulltext search
Lack of reader ACLs (does exist for writers, however)
As an old Lotus Domino pro, I was looking at CouchDB as an alternative for a new project I'm kicking off, and I found the reader restrictions to be very weak in Couch vs. Domino. In my app, security is an important consideration, and Couch would require a middleware layer to handle reader security.
If you have a database in which it's okay for all defined users to see all the documents, then Couch looks like an interesting platform.
If restricting reads is needed, then you'll need to look at a middleware solution or consider another alternative.
Note to the CouchDB developers: improve the platform's security options. I realize they will diminish performance when used, but note that and make the option available.
Now back to determining which database to use...
Currently no support for ad-hoc queries (might change with the advent of UnQL)
Lack of binary protocol support for faster communication
It's nothing to do with CouchDB itself, but being a relative newcomer on the scene means that most sysadmins are still unfamiliar with it and won't allow it anywhere near "their" data centers. If you're in a situation where you're deploying to an environment you don't control yourself, this can be quite the battle.
Lack of support for data archiving - no official support for data archiving is provided with the CouchDB open-source distribution.
Deleting records from the database is not straightforward
No option to set an expiry (TTL) flag on documents