AWS EC2 Spot Instances fit for edge services? - rest

Spot instances can randomly get shut down by Amazon. Does this mean that they would not work well as edge services (e.g. REST services)? Using an Elastic Load Balancer (ELB) plus some persistent EC2 nodes (plus the spot instances), would this work well if the client retried a few times upon failure? Or could they get numerous 404s, even with a few retries?

You will have a little bit of an impact if you decide to use spot instances in this scenario. The key will be getting the load balancer to recognize that the instance is out of service quickly. Also, not using sticky sessions can reduce the chance that they would get repeated 504 (Gateway timeout) errors.

Spot instances are a bit tricky to grok. On one hand they can give you compute power for a very low price, but on the other hand you might lose these instances with minimal notice.
One thing you can do is set a "max bid" that reflects the risk of losing the instances, not only the price you are willing to pay. Since you pay the market price rather than your bid, and only for as long as the market price stays at or below your max bid, most of the time you will pay less than your max bid. For example, if you bid 90% of the on-demand (OD) price, you will most likely pay much less (for example, 30% of the OD price) on average over a run of a week or a month. You can even consider a max bid that is higher than on-demand (up to 4 times the OD price) and still pay much less than the OD price on average.
It is best to analyze the spot prices for the last 3 months that are provided by the API, and check the behaviour of the market price for the different instance types and in the different regions and availability zones.
Another option you can consider is running 2 Auto Scaling groups (ASGs). One will try to scale (or heal) your spot-based instances, and the other will work with on-demand instances. The latter will be slower to kick in, and will kick in only if the spot-based group is not available due to higher market prices.
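To make the max-bid reasoning above concrete, here is a minimal sketch in Python. It assumes the bidding model described in this answer (you pay the market price while it is at or below your bid, and lose the instance otherwise). The price history and the on-demand price are made-up numbers for illustration, not real AWS data.

```python
from statistics import mean

def simulate_spot_run(hourly_prices, max_bid):
    """Return (average price paid per running hour, fraction of hours running).

    Assumes the instance runs, and is billed at the market price, in every
    hour where the market price is at or below max_bid.
    """
    paid = [price for price in hourly_prices if price <= max_bid]
    if not paid:
        return None, 0.0
    return mean(paid), len(paid) / len(hourly_prices)

# Hypothetical numbers: on-demand price and one price spike in the history.
ON_DEMAND = 1.00
HISTORY = [0.28, 0.30, 0.31, 0.29, 1.20, 0.32, 0.30, 0.27]

if __name__ == "__main__":
    for bid in (0.90 * ON_DEMAND, 4.00 * ON_DEMAND):
        avg_paid, uptime = simulate_spot_run(HISTORY, bid)
        print(f"bid={bid:.2f} avg paid={avg_paid:.3f} uptime={uptime:.0%}")
```

In this toy history, the higher bid rides out the spike and still averages well under the on-demand price, which is the trade-off the answer describes.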

Related

How can I choose the right key-value store for my use case?

I will describe the data and case.
record {
customerId: "id", <---- indexed
binaryData: "data" <---- not indexed
}
Expectations:
customerId is random 10 digit number
Average size of binary record data - 1-2 kilobytes
There may be up to 100 records per one customerId
Overall number of records - 500M
Write pattern #1: insert one record at a time
Write pattern #2: batch, maybe in parallel, at a speed of at least 20M records per hour
Search pattern #1: find all records by customerId
Search pattern #2: find all records for a group of customerIds, in parallel, at a rate of at least 10M customerIds per hour
Data is not too important, we can trade some aspects of reliability for speed
We expect to work in AWS / GCP - ideally the key-value store is administered by the cloud provider
We want to spend no more than 1K USD per month on cloud costs for this solution
What we have tried:
We have this approach implemented in relational database, in AWS RDS MariaDB. Server is 32GB RAM, 2TB GP2 SSD, 8 CPU. I found that IOPS usage was high and insert speed was not satisfactory. After investigation I concluded that due to random nature of customerId there is high rate of different writes to index. After this I did the following:
input data is sorted by customerId ASC
Additional trade was made to reduce index size with little degradation of single record read speed. For this I did some sort of buckets where records 1111111185 and 1111111186 go to same "bucket" 11111111. This way bucket can't contain more than 100 customerIds so read speed will be ok, and write speed improves.
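The bucketing described above can be sketched as integer division of the customerId. This is only an illustration of the idea in Python; the function name and the bucket size parameter are assumptions, not the code actually used.

```python
def bucket_for(customer_id: int, bucket_size: int = 100) -> int:
    """Map a 10-digit customerId to its index bucket.

    With bucket_size=100, at most 100 consecutive customerIds share a bucket.
    """
    return customer_id // bucket_size

if __name__ == "__main__":
    print(bucket_for(1111111185), bucket_for(1111111186))
```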
Even like this, I could not make more than 1-3M record writes per hour. Different write concurrencies were tested, current value is 4 concurrent writers. After all modifications it's not clear what else we can improve:
IOPS is not at the top use (~4K per second),
CPU use is not high,
Network is not fully utilized,
Write and read throughputs are not capped.
Apparently, ACID principles are holding us back. I am looking for a horizontally scalable key-value store and will be glad to hear any ideas and rough estimates.
So if I understand you...
2kb * 500m records ≈ 1 TB of data
20m writes/hr ≈ 5.5k writes/sec
That's quite doable in NoSQL.
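As a quick check of the arithmetic above, a small Python sketch. It assumes decimal units (1 TB = 10^12 bytes) and the worst-case 2 KB record size.

```python
def estimate_load(records, bytes_per_record, writes_per_hour):
    """Return (total data in TB, sustained writes per second)."""
    total_tb = records * bytes_per_record / 1e12
    writes_per_sec = writes_per_hour / 3600
    return total_tb, writes_per_sec

if __name__ == "__main__":
    total_tb, wps = estimate_load(500_000_000, 2_000, 20_000_000)
    print(f"{total_tb:.1f} TB, {wps:,.0f} writes/sec")
```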
The scale is not the issue. It's your cost.
$1k a month for 1 TB of data sounds like a reasonable goal. I just don't think that the public clouds are quite there yet.
Let me give an example with my recommendation: Scylla Cloud and Scylla Open Source. (Disclosure: I work for ScyllaDB.)
I will caution you that your $1k/month cap on costs might force you to consider and make some tradeoffs.
As is typical in high availability deployments, to ensure data redundancy in case of node failure, you could use 3x i3.2xlarge instances on AWS (can store 1.9 TB per instance).
You want the extra capacity to run compactions. We use incremental compaction, which saves on space amplification, but you don't want to go with the i3.xlarge (0.9 TB each), which is under your 1 TB requirement, unless you are really pressed for costs. In that case you'll have to do some sort of data eviction (like a TTL) to keep your data under roughly 600 GB.
Even with annual reserved pricing for Scylla Cloud (see here: https://www.scylladb.com/product/scylla-cloud/#pricing) of $764.60/server, to run the three i3.2xlarge would be $2,293.80/month. More than twice your budget.
Now, if you eschew managed services, and want to run self-service, you could go Scylla Open Source, and just look at the on-demand instance pricing (see here: https://aws.amazon.com/ec2/pricing/on-demand/). For 3x i3.2xlarge, you are running each at $0.624/hour. That's a raw on-demand cost of $449.28 each, which doesn't include incidentals like backups, data transfer, etc. But you could get three instances for $1,347.84. Open Source. Not managed.
Still over your budget, but closer. If you could get reserved pricing, that might just make it.
Edit: Found the reserve pricing:
3x i3.2xlarge is going to cost you
At monthly pricing $312.44 x 3 = $937.32, or
1 year up-front $3,482 annual/12 = $290.17/month/server x 3 = $870.50.
So, again, backups, monitoring, and other costs are above that. But you should be able to bring the raw server cost <$1,000 to meet your needs using Scylla Open Source.
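The price comparison above can be reproduced with a few lines of Python. The per-node prices are the ones quoted in this answer; the 720-hour month is an assumption that happens to reproduce the $449.28 on-demand figure.

```python
HOURS_PER_MONTH = 720  # assumption: 30-day month

# Per-node monthly prices quoted in the answer above (USD).
PER_NODE_MONTHLY = {
    "Scylla Cloud, annual reserved": 764.60,
    "EC2 on-demand": 0.624 * HOURS_PER_MONTH,
    "EC2 reserved, monthly": 312.44,
    "EC2 reserved, 1 year up-front": 3482 / 12,
}

def cluster_cost(per_node_monthly, nodes=3):
    """Raw monthly server cost for a cluster of identical nodes."""
    return round(per_node_monthly * nodes, 2)

def compare(budget=1000.0, nodes=3):
    """Return {option: (monthly cost, fits budget?)} for every pricing option."""
    return {
        name: (cluster_cost(price, nodes), cluster_cost(price, nodes) <= budget)
        for name, price in PER_NODE_MONTHLY.items()
    }

if __name__ == "__main__":
    for name, (cost, ok) in compare().items():
        print(f"{name}: ${cost:,.2f}/month, {'within' if ok else 'over'} budget")
```

Only the two reserved EC2 options land under $1,000, and that is before backups, monitoring and data transfer.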
But the admin burden is on your team (and their time isn't exactly zero cost).
For example, if you want monitoring on your system, you'll need to set up something like Prometheus, Grafana or Datadog. That will be other servers or services, and they aren't free. (The cost of backups and monitoring by our team are covered with Scylla Cloud. Part of the premium for the service.)
Another way to save money is to only do 2x replication, which puts your data in a really risky place in case you lose a server. It is not recommended.
All of this was based on maximal assumptions of your data. That your records are all around 2k (not 1k). That you're not getting much utility out of data compression, which ScyllaDB has built in – see part one (https://www.scylladb.com/2019/10/04/compression-in-scylla-part-one/) and part two (https://www.scylladb.com/2019/10/07/compression-in-scylla-part-two/).
To my mind, you should be able to squeak through with your $1k/month budget if you go with reserved pricing and open source. Though adding on monitoring and backups and other incidental costs (which I haven't calculated here) may put you back over that number again.
Otherwise, $2.3k/month in a fully-managed-cloud enterprise package and you can sleep easy at night.

Multiple node pools vs single pool with many machines vs big machines

We're moving all of our infrastructure to Google Kubernetes Engine (GKE) - we currently have 50+ AWS machines with lots of APIs, Services, Webapps, Database servers and more.
As we have already dockerized everything, it's time to start moving everything to GKE.
I have a question that may sound too basic, but I've been searching the Internet for a week and did not find any reasonable post about this.
Straight to the point, which of the following approaches is better and why:
Having multiple node pools with multiple machine types and always specify in which pool each deployment should be done; or
Having a single pool with lots of machines and let Kubernetes scheduler do the job without worrying about where my deployments will be done; or
Having BIG machines (in multiple zones to improve clusters' availability and resilience) and let Kubernetes deploy everything there.
A list of considerations, to be taken merely as hints; I do not pretend to describe best practice.
Each pod you add brings some overhead with it, but you gain flexibility and availability, so that node failures and maintenance have less impact on production.
Nodes that are too small would waste a lot of resources, since sometimes it will not be possible to schedule a pod even if the total amount of free RAM or CPU across the nodes would be enough. You can think of this issue as similar to memory fragmentation.
I guess that the sizes of your pods and their memory and CPU requests are not similar, but I do not see this as a big issue in principle or as a reason to go for 1). I do not see why a big pod should run only on big machines and a small one should be scheduled on small nodes. I would rather use 1) if you need a different memory (GB) to CPU cores ratio to support different workloads.
I would advise you to run some tests in the initial phase to understand the size of the biggest pod and the average size of the workload, in order to properly choose the machine types. Consider that having one pod that exactly fits one node, and assigning it there, is not the right way to proceed (virtual machines exist for this kind of scenario), since fragmentation of resources could easily make it impossible to schedule a large pod.
Consider that their size will likely increase in the future, and that scaling vertically is not always immediate, since you need to switch off machines and terminate pods. I would oversize a bit to take this into account, since scaling horizontally is far easier.
As for the machine type, you can decide to go for a machine 5x the size of the biggest pod you have (or 3x? or 10x?). Oversize the number of nodes in the cluster a bit as well, to account for overheads and fragmentation and to still have free resources.
Remember that you have a hard limit of 100 pods per node and 5000 nodes.
Remember that in GCP the network egress throughput cap is dependent on the number of vCPUs that a virtual machine instance has. Each vCPU has a 2 Gbps egress cap for peak performance. However each additional vCPU increases the network cap, up to a theoretical maximum of 16 Gbps for each virtual machine.
Regarding the prices of the virtual machines, notice that there is no difference in price between buying two machines of size x and one of size 2x. Avoid customising the size of machines because it is rarely cost-effective; if you feel your workload needs more CPU or memory, go for a HighCpu or HighMem machine type.
P.S. Since you are going to build a pretty big cluster, check the size of the DNS.
I will add any further considerations that come to mind. In the future, consider updating your question with a description of the path you chose and the issues you faced.
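The fragmentation point above can be illustrated with a toy check in Python. This is not how the Kubernetes scheduler is implemented; it only shows that a pod must fit on a single node, so total free memory across nodes is not enough. The node and pod sizes are made-up values.

```python
def can_schedule(pod_request_gb, free_per_node_gb):
    """A pod is schedulable only if one single node has enough free memory."""
    return any(free >= pod_request_gb for free in free_per_node_gb)

# Hypothetical clusters with the same total free memory (6 GB).
SMALL_NODES = [1.5, 1.5, 1.5, 1.5]
LARGE_NODES = [6.0]

if __name__ == "__main__":
    pod = 4.0
    print("small nodes:", can_schedule(pod, SMALL_NODES))
    print("large node:", can_schedule(pod, LARGE_NODES))
```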
1) makes a lot of sense: if you want, you can still let kube deployments treat it as one large pool (by not adding nodeSelector/NodeAffinity), but you can have machines of different sizes, you can think about having a pool of spot instances, etc. And, after all, you can have pools that are tainted and so excluded from normal scheduling and available only to a particular set of workloads. In my opinion it is preferable to build some proficiency with this approach from the very beginning, yet with many provisioners it should be very easy to migrate from 2) to 1) anyway.
2) As explained above, it's effectively a subset of 1), so it is better to build up experience with the 1) approach from day 1. But if you ensure your provisioning solution supports easy extension to the 1) model, then you can get away with starting with this simplified approach.
3) Big is nice, but "big" is relative. It depends on the requirements and amount of your workloads. Remember that while you need to plan for the loss of a whole AZ anyway, it will be much more frequent to lose single nodes (reboots, decommissions of underlying hardware, updates etc.), so if you have more hosts, the impact of losing one will be smaller. Bottom line is that you need to find your own balance, one that makes sense for your particular scale. Maybe 50 nodes is too much, would 15 cut it? Who knows but you :)

Azure Service Fabric reliable collections and memory

Let's say I'm running a Service Fabric cluster on 5 D1 class (1 core, 3.5GB RAM, 50GB SSD) VMs, and that I'm running 2 reliable services on this cluster, one stateless and one stateful. Let's assume that the replica target is 3.
How do I calculate how much my reliable collections can hold?
Let's say I add one or more stateful services. Since I don't really know how the framework distributes services do I need to take most conservative approach and assume that a node may run all of my stateful services on a single node and that their cumulative memory needs to be below the RAM available on a single machine?
TLDR - Estimating the expected capacity of a cluster is part art, part science. You can likely get a good lower bound which you may be able to push higher, but for the most part deploying things, running them, and collecting data under your workload's conditions is the best way to answer this question.
1) In general, the collections on a given machine are bounded by the amount of available memory or the amount of available disk space on a node, whichever is lower. Today we keep all data in the collections in memory and also persist it to disk. So the maximum amount that your collections across the cluster can hold is generally (Amount of available memory in the cluster) / (Target Replica Set Size).
Note that "Available Memory" is whatever is left over from other code running on the machines, including the OS. In your above example though you're not running across all of the nodes - you'll only be able to get 3 of them. So, (unrealistically) assuming 0 overhead from these other factors, you could expect to be able to put about 3.5 GB of data into that stateful service replica before you ran out of memory on the nodes on which it was running. There would still be 2 nodes in the cluster left empty.
Let's take another example. Let's say that it is about the same as your example above, except in this case you set up the stateful service to be partitioned. Let's say you picked a partition count of 5. So now on each node, you have a primary replica and 2 secondary replicas from other partitions. In this case, each partition would only be able to hold a maximum of around 1.17 GB of state, but now overall you can pack 5.83 GB of state into the cluster (since all nodes can now be utilized fully). Incidentally, just to prove out the math works, that's (3.5 GB of memory per node * 5 nodes in the cluster) [17.5] / (target replica set size of 3) = 5.83.
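The two examples above follow the same formula, sketched here in Python. It assumes zero overhead from the OS and other code, as the answer does; the function name is only illustrative.

```python
def cluster_state_capacity_gb(mem_per_node_gb, nodes_used, replica_set_size):
    """Upper bound on state: usable memory divided by the number of copies kept."""
    return mem_per_node_gb * nodes_used / replica_set_size

if __name__ == "__main__":
    # Unpartitioned service: only 3 of the 5 nodes hold a replica.
    print(cluster_state_capacity_gb(3.5, nodes_used=3, replica_set_size=3))
    # 5 partitions: all 5 nodes are used.
    total = cluster_state_capacity_gb(3.5, nodes_used=5, replica_set_size=3)
    print(round(total, 2), "GB total,", round(total / 5, 2), "GB per partition")
```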
In all of these examples, we've also assumed that memory consumption for all partitions and all replicas is the same. A lot of the time that turns out to not be true (at least temporarily) - some partitions can end up with more or less work to do and hence have uneven resource consumption. We also assumed that the secondaries were always the same as the primaries. In the case of the amount of state, it's probably fair to assume that these will track fairly evenly, though for other resource consumption it may not (just something to keep in mind). In the case of uneven consumption, this is really where the rest of Service Fabric's Cluster Resource Management will help, since we can come to know about the consumption of different replicas and pack them efficiently into the cluster to make use of the available space. Automatic reporting of consumption of resources related to state in the collections is on our radar and something we want to do, so in the future, this would be automatic but today you'd have to report this consumption on your own.
2) By default, we will balance the services according to the default metrics (more about metrics is here). So by default, the different replicas of those two different services could end up on the same machine, but in your example, you'll end up with 4 nodes with 1 replica from a service on it and then 1 node with two replicas from the two different services. This means that each service (each with 1 partition as per your example) would only be able to consume 1.75 GB of memory in each service for a total of 3.5 GB in the cluster. This is again less than the total available memory of the cluster since there are some portions of nodes that you're not utilizing.
Note that this is the maximum possible consumption, and presuming no consumption outside the service itself. Taking this as your maximum is not advisable. You'll want to reduce it for several reasons, but the most practical reason is to ensure that in the presence of upgrades and failures there's sufficient available capacity in the cluster. As an example, let's say that you have 5 Upgrade Domains and 5 Fault Domains. Now let's say that a fault domain's worth of nodes fails while you have an upgrade going on in an upgrade domain. This means that (a little less than) 40% of your cluster capacity can be gone at any time, and you probably want enough room left over on the remaining nodes to continue. This means that if your cluster previously could hold 5.83 GB of state (from our prior calculations), in reality you probably don't want to put more than about 3.5 GB of state in it, since with more than that the service may not be able to get back to 100% healthy (note also that we don't build replacement replicas immediately, so the nodes would have to be down for your ReplicaRestartWaitDuration before you ran into this case). There's a bunch more information about metrics, capacity, buffered capacity (which you can use to ensure that room is left on nodes for the failure cases) and fault and upgrade domains in this article.
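The headroom reasoning above can be written as a small Python helper. It takes the worst case in which one fault domain and one upgrade domain are down at the same time and do not overlap; that simplification is an assumption of this sketch.

```python
def safe_state_gb(cluster_capacity_gb, fault_domains, upgrade_domains):
    """State you can hold while still surviving one FD failure during one UD upgrade."""
    lost_fraction = 1 / fault_domains + 1 / upgrade_domains
    return cluster_capacity_gb * (1 - lost_fraction)

if __name__ == "__main__":
    capacity = 3.5 * 5 / 3  # the 5.83 GB from the partitioned example
    print(round(safe_state_gb(capacity, fault_domains=5, upgrade_domains=5), 2))
```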
There are some other things that practically will limit the amount of state you'll be able to store. You'll want to do several things:
Estimate the size of your data. You can make a reasonable estimate up-front of how big your data is by calculating the size of each field your objects hold. Be sure to take into consideration 64-bit references. This will give you a lower-bound starting point.
Storage overhead. Each object you store in a collection will come with some overhead for storing that object. In the reliable collections depending on the collection and the operations currently in flight (copy, enumerations, updates, etc.) this overhead can range from between 100 and around 700 bytes per item (row) stored in the collections. Do know also that we're always looking for ways to reduce the amount of overhead we introduce.
We also strongly recommend running your service over some period of time and measuring actual resource consumption via performance counters. Simulating some sort of real workload and then measuring the actual usage of the metrics you care about will serve you pretty well. The reason we recommend this in particular is that you will be able to see consumption from things like which CLR object heap your objects end up placed in, how often GC is running, if there's leaks, or other things like this which will impact the amount of memory you can actually utilize.
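As a starting point before measuring, the size estimate and the per-item overhead mentioned above can be combined into a rough range. This Python sketch uses the 100 to 700 byte overhead quoted in this answer; the item count and payload size are hypothetical.

```python
def collection_memory_range(items, payload_bytes, overhead_low=100, overhead_high=700):
    """Return (low, high) estimated bytes for one replica of a collection."""
    low = items * (payload_bytes + overhead_low)
    high = items * (payload_bytes + overhead_high)
    return low, high

if __name__ == "__main__":
    low, high = collection_memory_range(items=1_000_000, payload_bytes=200)
    print(f"{low / 1e6:.0f} MB to {high / 1e6:.0f} MB per replica")
```

Real measurements under load should replace this estimate, since GC behaviour and object layout are not captured here.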
I know that this has been a long answer but I hope you find it helpful and complete.

What is the biggest Couchbase cluster nodes number?

Does anybody know what the biggest Couchbase cluster ever deployed is? Since a lot of information is broadcast from each node, I have doubts about the scalability.
thanks
I would like to answer this question differently than with a "simple number of nodes". In your question you are talking about scalability and some "doubts" about it... And as you can guess, being from Couchbase, I have no doubts about the scalability...
When people use Couchbase, like any NoSQL solution, they have a specific use case in mind for their data. And each use case has a specific data "life cycle" (volume, throughput, expiration, ...). So what do you have in mind when you are talking about scalability?
For example, I have been working on a project where we have a 20-node cluster processing 650,000 op/s with 30% of the operations mutating data. For this specific use case, there is no need to go bigger. You can see other use cases like Draw Something, with 80 to 90 nodes, ~50 million total users, 16 million daily users, 2 billion documents...
So instead of talking of "hypothetical" size of the cluster, I would like to understand your use case and type of deployment (available hardware/VMs) to define what will be a good topology.
Check out this article, it covers the growth of the game 'Draw Something'. They went from a 6 node cluster to a 90 node cluster in 8 weeks due to rapid growth. They also had zero downtime in adding nodes to the cluster and at week 6 were processing 3000 drawings a second.
http://www.couchbase.com/customer-stories/couchbase-helps-omgpop-scale-draw-something-50-million-users-50-days
Edit
Check slide 16 on this link, cluster size of 100+ for Viber
http://www.couchbase.com/presentations/couchbase-tlv-2014-couchbase-at-viber

StarCounter and CAP

I have been reading about a database named Starcounter. It makes a claim that it can handle loads that a "NoSql"-database only can handle without dropping consistency. As far as I understand the CAP-theorem, if you keep consistency, you lose availability or partition tolerance. So what trick makes StarCounter work?
I can imagine that StarCounter is fast, but the claim that NoSql needs to drop consistency to keep up seems a little bit strange to me. Can anyone please explain?
Thanks in advance
Roland
The short answer
The CAP theorem (aka Brewer's theorem) cannot be beaten for a single piece of information (like a consistent database). If you have a horizontally scaled database, you won't get both consistency and performance. This conclusion comes from the laws of physics and can be deduced from Brewer's theorem and Einstein's theories of relativity. You need to scale-in/up, not out. Not very "cloudy", but as the enemies of Galileo would probably confess if they were alive today, nature does a poor job at honouring human fashion.
Scaling consistent data
I'm sure there are other approaches, but Starcounter works by hosting the database image in RAM. Instead of moving database data to the application code, parts of the application code are moved to the database. Only data in the final response gets moved from its original place in RAM. This makes most of the data stay put even if there are millions of requests processed every second. The downside is that the database needs to know the programming language of your application logic. The upside, however, is obvious if you have ever tried to serve millions of HTTP requests/sec, each requiring extensive database access.
A more thorough answer
The question is a good one. It is no wonder you find it strange, as it was only a few years back that CAP was proven (turned into a theorem). Many developers are as disappointed as a kid would be when a theoretical physicist tells him to stop looking for the perpetual motion machine because it cannot work. We still want the scale-out consistent database, don't we?
The CAP theorem
The CAP theorem states that a piece of information cannot have consistency (C), availability (A) and partition tolerance (P) all at once. It applies to a unit of information (such as a database). You can of course have independent pieces of information that operate differently. One piece could be AP, another could be CA and a third could be CP. You just can't have the same information being CAP.
The impossibility of the 'P' in a consistent and available database means that a scaled-out database MUST do signalling between the nodes. The conclusion must be that, even a hundred years from now, CAP dictates that a single piece of consistent data will have to live on hardware interconnected using hard wires or light beams.
The problem with the P in CAP
The problem lies in performance if you apply horizontal scaling to an available, consistent database. Since good performance was the very reason to scale horizontally in the first place, this is a very bad thing. As every node needs to communicate with the other nodes whenever there is database access to achieve consistency, and given that signalling is ultimately limited by the speed of light, you are left with the sad but true fact that database scientists (as well as CPU scientists) are not just being stubborn in failing to see scale-out as a magical silver bullet. It will not happen because it cannot happen (however, parts of your database could be placed in an AP set, so remember, we are talking about consistent data here). Add the theories of Einstein to the CAP theorem and, for consistent data, the small box wins over the cloudy data center.
Perpetual machines and CAP
The state of things in the database community is a little bit like the state of perpetual motion machines when horse and carriage was the way to get to work. Without any theoretical evidence against it, the patent offices granted hundreds of patents for impossible perpetual machines. Today, we may laugh at this, but we have a similar situation in the database industry with consistent scale-out databases. When you hear somebody claim that they have a scale-out ACID database, be cautious. It was only after the dot-com crash that mathematicians at MIT proved Brewer right and the CAP theorem was officially born, so the hunt for the impossible has unfortunately not died off just yet. You can compare this, if you want, to the way laggards kept trying to invent the perpetual machine for years after modern theoretical physics should reasonably have put a stop to it. Old habits die hard (my apologies to anyone on Stack Overflow still making drawings of bearings and arms moving ad infinitum of their own accord - I don't mean to be offensive).
CAP and performance
All is not lost, however. Not all pieces of information need to be consistent. Not all pieces need to scale out. You just have to accept Brewer's theorem and make the best of it.
For applications such as Facebook, consistency is dropped. This is okay, as data is entered once and is then manipulated by a single user. Still, we can experience the side effects in everyday Facebook usage, such as things popping in and out of existence for a while.
However, in most business applications, data needs to be correct. The sum of all accounts in your bookkeeping needs to amount to zero. Your stock inventory must equal 8 if you sold 2 out of 10 items, even if there are multiple users buying from the same stock.
The problem with scaling out available data is that you have to make do without partition tolerance. This fancy word simply means that you have to signal between the nodes in your cloud at all times. And as it takes light a few nanoseconds to travel a single meter, this becomes impossible without making your scale-out result in less performance rather than more performance. Of course, this is only true for consistent data. The implications of this have been known by the engineers of Intel, AMD, Oracle et al. for a long time. It is not that their scientists haven't heard of scale-out. It is just that they have come to accept the world as Einstein described it.
Some comfort in the gloom
If you do the math, you find that a single PC has instructions to spare for each human being living on Earth for each second it is running (google 'modern CPU' and 'MIPS'). If you do some more math, like taking the total turnover of Amazon.com (you can find it at www.nasdaq.com) divided by the price of an average book, you will find that the total number of sales transactions can fit in the RAM of a single modern PC. The cool thing is that the number of items, customers, orders, products etc. occupies the same amount of space in 2012 as it did in 1950. Images, video and audio have increased in size, but numeric and textual information does not grow per item. Sure, the number of transactions grows, but not at the same pace as computer power grows. So the logical solution is to scale out read-only and AP data and "scale-in/up" business data.
"Scale-in" instead of "scale-out"
Database engines and business logic running in a VM (like the Java VM or the .NET CLR) typically use fairly efficient machine code. This means that moving memory is the overshadowing bottleneck of total throughput for a consistent database. This is often referred to as the memory wall (Wikipedia has some useful information).
The trick is to transfer code to the database image instead of data from the database image to the code (if using a MVC or a MVVM pattern). This means that the consuming code executes in the same address space as the database image and that data is never moved (and the disk is merely securing transactions and images). Data can stay in the original database image and does not have to be copied into the memory of the application. Instead of treating the database as a RAM database, the database is treated as primary memory. Everything stays put.
Only data that is part of the final user response is moved out of the database image. For a large-scale application with hundreds of millions of simultaneous users this typically amounts to only a few million requests per second, something that a single PC has no problem handling given that the HTTP packaging is done on gateway servers. Fortunately, such servers scale out beautifully as they don't need to share data.
As it turns out, the disk is fast at sequential writes, so a RAID-configured disk can persist terabytes of changes every minute.
Horizontal scaling in Starcounter
Normally you do not scale a Starcounter node. It scales in rather than out. This works well for a few million simultaneous users. To go above that, you need to add more Starcounter nodes. They can be used to partition data (but then you lose consistency, and Starcounter is not designed for partitioning, so it is less elegant than solutions such as VoltDB). So a better alternative is to use the additional Starcounter nodes as gateway servers. These servers simply accumulate all incoming HTTP requests for a millisecond at a time. This might sound like a short amount of time, but it is enough to accumulate thousands of requests if you have decided you need to scale Starcounter. The batch of requests is then sent to the ZLATAN node (Zero LATency Atomicity Node) a thousand times a second. Each such batch can contain thousands of requests. In this way, a few hundred million user sessions can be served by a single ZLATAN node. Although you can have several ZLATAN nodes, there is only one active ZLATAN node at a time. This is how the CAP theorem is honored. To go above that, you need to consider the same tradeoff as Facebook and others.
Another important note is that the ZLATAN node does not serve applications with data. Instead, the application's controller code is run by the ZLATAN node. The cost of serializing/deserializing and sending data to an application is far greater than the cost of processing the controller logic cycles. I.e. the code is sent to the database instead of the other way around (a traditional approach is that the application asks for data or sends data).
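The gateway batching idea above can be sketched in Python as grouping requests into fixed time windows. This is only an illustration of the technique under simplified assumptions (requests arrive as timestamped tuples); it is not Starcounter's implementation, and the function name and data shape are hypothetical.

```python
from collections import defaultdict

def batch_by_window(requests, window_ms=1):
    """Group (arrival_time_ms, payload) pairs into per-window batches.

    Returns the batches in window order; each batch would be forwarded
    to the single active node as one unit.
    """
    batches = defaultdict(list)
    for arrival_ms, payload in requests:
        batches[int(arrival_ms // window_ms)].append(payload)
    return [batches[window] for window in sorted(batches)]

if __name__ == "__main__":
    incoming = [(0.1, "a"), (0.9, "b"), (1.2, "c"), (3.5, "d")]
    print(batch_by_window(incoming))
```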
Making the "shared-everything" node faster by doing less
The use of the database as a "heap" for the programming language instead of a remote system for serialization and deserialization is a trick that Starcounter calls VMDBMS. If the database is in RAM, you should not move data from one place in RAM to another place in RAM which is the case with most RAM databases.
There is no 'trick'. Starcounter is talking about speed, while CAP/NoSQL are talking about scalability. There is a trade-off between features+scalability vs speed.
Sometimes it's OK to ignore scalability if you can prove there are bottlenecks elsewhere. For instance, a new startup shouldn't worry about their website scaling to a million users, they should worry about getting their first hundred users. (Does anyone remember how often Twitter was down in the early days?) Starcounter can be useful if your transaction rate is much greater than your web page hit rate.
On the other hand, I don't trust anyone who lumps all "NoSQL" Databases together. The various NoSQL databases are more different than alike. They have radically different architectures and properties. Some of them scale to thousands of nodes, some of them don't scale beyond one node. Sometimes adding scalability slows you down. Sometimes removing features speeds you up.
http://strata.oreilly.com/2010/12/strata-gems-mysql-handlersocket.html