When and how to update a cache - REST

I have a service A and a service B.
Service A is a REST API that stores some relevant information, which service B needs, in a database.
Service B handles a lot of traffic and is constantly consuming messages from a Kafka topic. Each message needs some information from service A, but this information rarely changes, at most once per day.
So, in order to avoid hitting the REST API constantly for information that rarely changes, I'm going to implement a cache. (Not using a cache would also mean querying the DB all the time.) Service B will hit the cache first, and only if the cache doesn't have the required data will it hit A.
Here comes the question.
If service A updates its information, I would need to update the cache right away.
What is the best way of doing this?
1) I can implement something in the REST API to let B know that it needs to update its cache, but in terms of separation of concerns and encapsulation, isn't it bad that A knows that B keeps an internal cache? (I think it is.)
2) I can implement polling in B (making B check whether the info changed every X amount of time), or refresh the cache every X amount of time. But this way I run the risk of not getting the information updated right away.
3) Maybe a cache in A for this information? At least I avoid querying the DB, but not hitting the API :/
Is there a better way of handling this?
Thanks!

This is a question of consistency guarantees and it is a core issue in distributed systems.
Your scenario contains three services: A, B and the database.
If B must never ever under any circumstances use stale data, then you have two options:
All reads will hit the database (no caching at A or B). Built-in mechanisms at the database, such as the database's internal cache, disk cache and RAID mirroring, might relieve some of the disk I/O bottleneck.
Cache the data at A (or B) and enforce strong consistency between the cache and the database, which means that every write would be done inside a distributed transaction between the database and the cache (or by using some other consensus protocol which provides strong consistency guarantees)
The first option requires no effort and would work fine for a certain workload, but it would become a serious bottleneck if the data ingress at B requires more throughput than the database can sustain.
The second option is quite complex to implement; it would slow down data changes, complicate the system and hurt its overall availability: if A goes down then data cannot be changed at the database; if A goes down in the middle of a transaction then the data won't be available for reading from the database (!)
The good news is that most systems don't need such strong consistency guarantees, and they're OK with using stale data occasionally, under specific circumstances.
If this is the case for your system, then there are several ways of invalidating the cache. Personally I'd go with Jose Martinez's suggestion to use a message queueing system, combined with the Publish/Subscribe pattern: service A would publish a "data changed" message to the pub/sub (the message would include information as to exactly which data item changed), and service B would be a subscriber processing "data changed" messages and invalidating its cache entries as they arrive (see the sketch at the end of this answer).
Additional points:
Caching inside B might seem like it can provide strong consistency at first, but the truth is you might need to scale B, so you'll have multiple instances of B, each with its own cache that needs to be invalidated and synchronized.
You may use a whole other service for holding the cached data (Redis, Memcached etc.), which would allow you to split the responsibilities over the cached data (A could update or invalidate it and B could read from it directly), but it won't change the essence of the consistency dilemma.
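To make the pub/sub suggestion concrete, here is a minimal sketch, assuming Kafka as the broker and the plain Java kafka-clients API; the topic name "entity-changed" and the EntityClient interface are illustrative placeholders rather than anything from the question. B keeps a local map as its cache and runs a background consumer that evicts entries whenever A publishes a change:

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;

    public class ServiceBCache {

        private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();
        private final EntityClient client; // hypothetical REST client for service A

        public ServiceBCache(EntityClient client) {
            this.client = client;
        }

        // Read path: hit the cache first, fall back to service A only on a miss.
        public String get(String id) {
            return cache.computeIfAbsent(id, client::fetchFromServiceA);
        }

        // Invalidation path: a background thread consumes "data changed" messages
        // published by service A and evicts the affected keys.
        public void startInvalidationListener(Properties consumerProps) {
            Thread listener = new Thread(() -> {
                try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
                    consumer.subscribe(List.of("entity-changed")); // topic name is an assumption
                    while (true) {
                        ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                        for (ConsumerRecord<String, String> record : records) {
                            cache.remove(record.key()); // key = id of the entity that changed in A
                        }
                    }
                }
            });
            listener.setDaemon(true);
            listener.start();
        }

        // Hypothetical interface; the real call would be an HTTP GET against service A.
        public interface EntityClient {
            String fetchFromServiceA(String id);
        }
    }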

Adding a third bullet point to @CapnSchwenk's answer...
Have A submit all changes to a message queue, like RabbitMQ. The message queue can handle persistence (in case B is down) and the publish/subscribe implementation. The queue can also contain the new data, so that B doesn't have to query A for it.

Based on this statement: "If service A updates its information, I would need to update the cache right away", your two choices in my experience would be some form of distributed cache:
Have service A provide a listener mechanism that service B can subscribe to, so that B can invalidate its own internal cache when data changes;
Implement a distributed caching layer such as Ehcache or Memcached that both services A and B are aware of; when service A updates a value it writes it into the cache and all subscribers are automatically up to date (a rough sketch of this option follows the list).
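A rough sketch of the second option, assuming Redis as the shared layer and the Jedis client; the "entity:" key scheme and the JSON payload are illustrative. Service A refreshes the shared cache in the same code path that updates its own database, and service B only reads, falling back to A's API on a miss:

    import redis.clients.jedis.Jedis;

    import java.util.function.Function;

    public class SharedEntityCache {

        // Called by service A right after it commits a change to its own database,
        // so that B's next read sees the new value.
        public static void writeThrough(Jedis redis, String entityId, String json) {
            redis.set("entity:" + entityId, json); // key naming scheme is an assumption
        }

        // Called by service B for every message; falls back to A's REST API on a miss.
        public static String read(Jedis redis, String entityId, Function<String, String> fetchFromA) {
            String cached = redis.get("entity:" + entityId);
            if (cached != null) {
                return cached;
            }
            String fresh = fetchFromA.apply(entityId);
            redis.set("entity:" + entityId, fresh);
            return fresh;
        }
    }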
Hope that helps!

Related

Microservice data replication patterns

In a microservice architecture, we usually have two ways for two microservices to communicate. Let's say service A needs to get information from service B. The first option is a remote call, usually synchronous over HTTPS, so service A queries an API hosted by service B.
The second option is adopting an event-driven architecture, where the state of service B can be published and consumed by service A in an asynchronous way. Using this model, service A can update its own database with the information from service B's events, and all queries are made locally in this database. This approach has the advantage of better decoupling of microservices, from development through to operations. But it comes with some disadvantages related to data replication.
The first one is the high consumption of disk space, since the same data can reside in the databases of all the microservices that need it. The second one is worse in my opinion: data can become stale if service A can't process its subscription as fast as needed, or the data may not be available to service A at the same time it's created at service B, given the eventual consistency of the model.
Let’s say we’re using Kafka as an event hub, and its topics are configured to use 7 days of data retention. Service A is kept in sync as service B publishes its state. After two weeks, a new service C is deployed and its database needs to be enriched with all information that service B holds. We can only get partial information from Kafka topics since the oldest events are gone. My question here is what are the patterns we can use to achieve this microservice’s database enrichment (besides asking service B to republish all its current state to the event hub).
There are 2 options:
You can enable log compaction in Kafka for an individual topic. That will keep the most recent value for a given key, discarding older updates. This saves space and also retains more history than the normal time-based retention mode for a given retention period (a small setup sketch follows this list).
Assuming you take a backup of service B's DB on a daily basis, on introduction of a new service C you first need to create the initial state of C from the latest backup of B, and then replay the Kafka topic events from the particular offset that represents the data after the backup.
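For the first option, here is a small sketch of creating a compacted topic from code with the Kafka AdminClient (the topic name and sizing are placeholders; the same cleanup.policy setting can also be applied with the kafka-topics / kafka-configs command-line tools):

    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;
    import org.apache.kafka.common.config.TopicConfig;

    import java.util.List;
    import java.util.Map;
    import java.util.Properties;

    public class CompactedTopicSetup {

        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

            try (Admin admin = Admin.create(props)) {
                // cleanup.policy=compact keeps the latest value per key, so a new consumer
                // can rebuild current state regardless of how old the original events are.
                NewTopic stateTopic = new NewTopic("service-b-state", 6, (short) 3)
                        .configs(Map.of(TopicConfig.CLEANUP_POLICY_CONFIG,
                                        TopicConfig.CLEANUP_POLICY_COMPACT));
                admin.createTopics(List.of(stateTopic)).all().get();
            }
        }
    }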
Your concern is right, but at the same time the microservices approach is give and take. You get loose coupling at the cost of an individual database for each service. There is no single right answer in microservices architecture; it really depends on what you are trying to achieve.
According to the CAP theorem you have to compromise between consistency and availability, and in most cases we go with eventual consistency. If your service A is not consistent with B, then it eventually will be; that's the trade-off we accept in exchange for availability.
Another thing regarding microservices is that you only keep a reference to the data from another service, and maybe a very limited amount of actual data from that service, but definitely not much. And even that only if replicating the data makes your service independent and autonomous; if you can't achieve that even after replicating the data, then there is no point. E.g. your shipping service will have the complete history of order transitions, but your booking service only holds the latest status of the order (e.g. in transit, on board, etc.). The user goes to booking and you show the current status of the order, but if the user clicks details you fetch the whole order transition history from the shipping microservice. Now, if at some point your shipping service goes down and your user comes to check the status, you at least have the current order status even though you can't show the details, because the order status is replicated in the booking service.
Regarding new services joining the system at a later stage, event sourcing is the pattern used for these kinds of scenarios. It's a complex pattern, but it will bring your newly added services to the state you want them to be in. You basically save all your events in an event store and replay them to attain the current state of the system, pre-populating service C's database with those events.
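As a sketch of the replay idea (the topic name "service-b-state" and the in-memory map are placeholders; a real service C would upsert into its own database instead), a consumer can seek to the beginning of the compacted topic and fold every record into local state:

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    import java.time.Duration;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import java.util.stream.Collectors;

    public class ServiceCHydrator {

        public static Map<String, String> hydrate(Properties consumerProps) {
            Map<String, String> state = new HashMap<>();
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
                List<TopicPartition> partitions = consumer.partitionsFor("service-b-state").stream()
                        .map(p -> new TopicPartition(p.topic(), p.partition()))
                        .collect(Collectors.toList());
                consumer.assign(partitions);
                consumer.seekToBeginning(partitions);

                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(2));
                    if (records.isEmpty()) {
                        break; // naive "caught up" check, good enough for a one-off backfill
                    }
                    for (ConsumerRecord<String, String> record : records) {
                        if (record.value() == null) {
                            state.remove(record.key()); // tombstone: the entity was deleted in service B
                        } else {
                            state.put(record.key(), record.value()); // latest value per key wins
                        }
                    }
                }
            }
            return state;
        }
    }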

Synchronising transactions between database and Kafka producer

We have a micro-services architecture, with Kafka used as the communication mechanism between the services. Some of the services have their own databases. Say the user makes a call to Service A, which should result in a record (or set of records) being created in that service’s database. Additionally, this event should be reported to other services, as an item on a Kafka topic. What is the best way of ensuring that the database record(s) are only written if the Kafka topic is successfully updated (essentially creating a distributed transaction around the database update and the Kafka update)?
We are thinking of using spring-kafka (in a Spring Boot WebFlux service), and I can see that it has a KafkaTransactionManager, but from what I understand this is more about Kafka transactions themselves (ensuring consistency across the Kafka producers and consumers), rather than synchronising transactions across two systems (see here: “Kafka doesn't support XA and you have to deal with the possibility that the DB tx might commit while the Kafka tx rolls back.”). Additionally, I think this class relies on Spring’s transaction framework which, at least as far as I currently understand, is thread-bound, and won’t work if using a reactive approach (e.g. WebFlux) where different parts of an operation may execute on different threads. (We are using reactive-pg-client, so are manually handling transactions, rather than using Spring’s framework.)
Some options I can think of:
Don’t write the data to the database: only write it to Kafka. Then use a consumer (in Service A) to update the database. This seems like it might not be the most efficient, and will have problems in that the service which the user called cannot immediately see the database changes it should have just created.
Don’t write directly to Kafka: write to the database only, and use something like Debezium to report the change to Kafka. The problem here is that the changes are based on individual database records, whereas the business significant event to store in Kafka might involve a combination of data from multiple tables.
Write to the database first (if that fails, do nothing and just throw the exception). Then, when writing to Kafka, assume that the write might fail. Use the built-in auto-retry functionality to get it to keep trying for a while. If that eventually completely fails, try to write to a dead letter queue and create some sort of manual mechanism for admins to sort it out. And if writing to the DLQ fails (i.e. Kafka is completely down), just log it some other way (e.g. to the database), and again create some sort of manual mechanism for admins to sort it out.
Anyone got any thoughts or advice on the above, or able to correct any mistakes in my assumptions above?
Thanks in advance!
I'd suggest using a slightly altered variant of approach 2.
Write into your database only, but in addition to the actual table writes, also write "events" into a special table within that same database; these event records would contain the aggregations you need. In the easiest case, you'd simply insert another entity, e.g. mapped by JPA, which contains a JSON property with the aggregate payload. Of course this could be automated by some means of a transaction listener / framework component.
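A minimal sketch of such an event/outbox entity, assuming JPA (javax or jakarta depending on your stack); the column names follow the common outbox layout but are only illustrative:

    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.Table;

    import java.time.Instant;
    import java.util.UUID;

    // Persisted in the same transaction as the actual table writes, then captured by Debezium.
    @Entity
    @Table(name = "outbox_event")
    public class OutboxEvent {

        @Id
        private UUID id = UUID.randomUUID();

        @Column(name = "aggregate_type")
        private String aggregateType;  // e.g. "Order"; typically used to route to a topic

        @Column(name = "aggregate_id")
        private String aggregateId;    // becomes the Kafka message key, preserving per-aggregate order

        @Column(name = "event_type")
        private String eventType;      // e.g. "OrderCreated"

        @Column(name = "payload", columnDefinition = "text")
        private String payload;        // the aggregated business event as JSON

        @Column(name = "created_at")
        private Instant createdAt = Instant.now();

        protected OutboxEvent() {
            // required by JPA
        }

        public OutboxEvent(String aggregateType, String aggregateId, String eventType, String payload) {
            this.aggregateType = aggregateType;
            this.aggregateId = aggregateId;
            this.eventType = eventType;
            this.payload = payload;
        }
    }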
Then use Debezium to capture the changes just from that table and stream them into Kafka. That way you have both: eventually consistent state in Kafka (the events in Kafka may trail behind or you might see a few events a second time after a restart, but eventually they'll reflect the database state) without the need for distributed transactions, and the business level event semantics you're after.
(Disclaimer: I'm the lead of Debezium; funnily enough I'm just in the process of writing a blog post discussing this approach in more detail)
Here are the posts
https://debezium.io/blog/2018/09/20/materializing-aggregate-views-with-hibernate-and-debezium/
https://debezium.io/blog/2019/02/19/reliable-microservices-data-exchange-with-the-outbox-pattern/
First of all, I have to say that I'm neither a Kafka nor a Spring expert, but I think this is more of a conceptual challenge when writing to independent resources, and the solution should be adaptable to your technology stack. Furthermore, I should say that this solution tries to solve the problem without an external component like Debezium, because in my opinion each additional component brings challenges in testing, maintaining and running an application, which is often underestimated when choosing such an option. Also, not every database can be used as a Debezium source.
To make sure that we are talking about the same goals, let's clarify the situation with a simplified airline example, where customers can buy tickets. After a successful order the customer will receive a message (mail, push notification, ...) sent by an external messaging system (the system we have to talk to).
In a traditional JMS world, with an XA transaction between our database (where we store orders) and the JMS provider, it would look like the following: the client sends the order to our app, where we start a transaction. The app stores the order in its database. Then the message is sent to JMS and you can commit the transaction. Both operations participate in the transaction even though they're talking to their own resources. As the XA transaction guarantees ACID, we're fine.
Let's bring Kafka (or any other resource that is not able to participate in the XA transaction) into the game. As there is no coordinator that syncs both transactions anymore, the main idea of the following is to split processing into two parts with a persistent state.
When you store the order in your database, you can also store the message (with aggregated data) that you want to send to Kafka afterwards in the same database (e.g. as JSON in a CLOB column). Same resource, so ACID is guaranteed and everything is fine so far. Now you need a mechanism that polls your "KafkaTasks" table for new tasks that should be sent to a Kafka topic (e.g. with a timer service; in Spring the @Scheduled annotation can be used). After the message has been successfully sent to Kafka, you can delete the task entry. This ensures that the message to Kafka is only sent when the order is also successfully stored in the application database. Did we achieve the same guarantees as with an XA transaction? Unfortunately not, as there is still the chance that writing to Kafka works but the deletion of the task fails. In that case the retry mechanism (you would need one, as mentioned in your question) would reprocess the task and send the message twice. If your business case is happy with this "at-least-once" guarantee, you're done here with an imho semi-complex solution that could easily be implemented as framework functionality, so not everyone has to bother with the details.
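A rough sketch of that polling step, assuming Spring's @Scheduled (with @EnableScheduling configured elsewhere) plus a JdbcTemplate and a KafkaTemplate; the table and column names are made up. Sending synchronously and deleting only after success gives exactly the at-least-once behaviour described above:

    import org.springframework.jdbc.core.JdbcTemplate;
    import org.springframework.kafka.core.KafkaTemplate;
    import org.springframework.scheduling.annotation.Scheduled;
    import org.springframework.stereotype.Component;

    import java.util.List;
    import java.util.Map;

    @Component
    public class KafkaTaskRelay {

        private final JdbcTemplate jdbc;
        private final KafkaTemplate<String, String> kafka;

        public KafkaTaskRelay(JdbcTemplate jdbc, KafkaTemplate<String, String> kafka) {
            this.jdbc = jdbc;
            this.kafka = kafka;
        }

        @Scheduled(fixedDelay = 1000)
        public void relay() {
            List<Map<String, Object>> tasks =
                    jdbc.queryForList("SELECT id, topic, msg_key, payload FROM kafka_task ORDER BY id LIMIT 100");
            for (Map<String, Object> task : tasks) {
                try {
                    // Block until the broker acknowledges the send; on failure the row stays
                    // in the table and is retried on the next run (at-least-once semantics).
                    kafka.send((String) task.get("topic"),
                               (String) task.get("msg_key"),
                               (String) task.get("payload")).get();
                } catch (Exception e) {
                    return; // leave the remaining tasks for the next scheduled run
                }
                jdbc.update("DELETE FROM kafka_task WHERE id = ?", task.get("id"));
            }
        }
    }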
If you need "exactly-once", then you cannot store your state in the application database (in this case "deletion of a task" is the "state"); instead you must store it in Kafka (assuming that you have ACID guarantees between two Kafka topics). An example: let's say you have 100 tasks in the table (IDs 1 to 100) and the task job processes the first 10. You write your Kafka messages to their topic, and another message with the ID 10 to "your topic", all in the same Kafka transaction. In the next cycle you consume your topic (the value is 10) and take this value to get the next 10 tasks (and delete the already-processed tasks).
If there are easier (in-application) solutions with the same guarantees, I'm looking forward to hearing from you!
Sorry for the long answer but I hope it helps.
All the approaches described above are valid, well-defined patterns for approaching the problem. You can explore them in the links provided below.
Pattern: Transactional outbox
Publish an event or message as part of a database transaction by saving it in an OUTBOX in the database.
http://microservices.io/patterns/data/transactional-outbox.html
Pattern: Polling publisher
Publish messages by polling the outbox in the database.
http://microservices.io/patterns/data/polling-publisher.html
Pattern: Transaction log tailing
Publish changes made to the database by tailing the transaction log.
http://microservices.io/patterns/data/transaction-log-tailing.html
Debezium is a valid answer, but (as I've experienced) it can require some extra overhead of running an extra pod and making sure that pod doesn't fall over. This could just be me griping about a few back-to-back instances where pods OOM-errored and didn't come back up, networking rule rollouts dropped some messages, and WAL access to an AWS Aurora DB started behaving oddly... It seems that everything that could have gone wrong did. Not saying Debezium is bad; it's fantastically stable, but often for devs running it becomes a networking skill rather than a coding skill.
As a KISS solution using normal coding practices that will work 99.99% of the time (and inform you of the 0.01%), you could do the following (a code sketch follows the list):
Start transaction.
Sync save to DB.
-> If that fails, bail out.
Async send the message to Kafka.
Block until the topic reports that it has received the message.
-> If it times out or fails, abort the transaction.
-> If it succeeds, commit the transaction.
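A sketch of that flow with plain JDBC and the Kafka producer, where a blocking send with a timeout stands in for "block until the topic reports that it has received the message"; table and topic names are illustrative, and note the caveat in the catch block:

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.util.concurrent.TimeUnit;

    public class SaveThenPublish {

        public static void saveOrder(Connection db, KafkaProducer<String, String> producer,
                                     String orderId, String orderJson) throws Exception {
            db.setAutoCommit(false); // 1. start the transaction
            try {
                // 2. synchronous save to the DB; any failure here bails out before Kafka is touched
                try (PreparedStatement ps = db.prepareStatement("INSERT INTO orders(id, body) VALUES (?, ?)")) {
                    ps.setString(1, orderId);
                    ps.setString(2, orderJson);
                    ps.executeUpdate();
                }
                // 3./4. send to Kafka and block until the broker acknowledges (or we time out)
                producer.send(new ProducerRecord<>("orders", orderId, orderJson))
                        .get(10, TimeUnit.SECONDS);
                db.commit();   // 5. only commit once Kafka has confirmed the message
            } catch (Exception e) {
                db.rollback(); // Kafka failed or timed out, so abort the DB transaction
                // caveat: a timed-out send may still land in Kafka later, so consumers
                // need to tolerate an event without a matching DB row (the 0.01%)
                throw e;
            }
        }
    }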
I'd suggest using a newer approach, the 2-phase message. With this approach much less code is needed, and you don't need Debezium anymore.
https://betterprogramming.pub/an-alternative-to-outbox-pattern-7564562843ae
For this new approach, what you need to do is:
When writing to your database, write an event record to an auxiliary table.
Submit a 2-phase message to DTM
Write a service to query whether an event is saved in the auxiliary table.
With the help of the DTM SDK, you can accomplish the above 3 steps with 8 lines of Go, much less code than other solutions:
    // Build a 2-phase message whose branch will be delivered once the local transaction commits.
    msg := dtmcli.NewMsg(DtmServer, gid).
        Add(busi.Busi+"/TransIn", &TransReq{Amount: 30})
    // Execute the local DB transaction and submit the message to DTM atomically.
    err := msg.DoAndSubmitDB(busi.Busi+"/QueryPrepared", db, func(tx *sql.Tx) error {
        return AdjustBalance(tx, busi.TransOutUID, -req.Amount)
    })
    // Back-check endpoint DTM calls to find out whether the local transaction succeeded.
    app.GET(BusiAPI+"/QueryPrepared", dtmutil.WrapHandler2(func(c *gin.Context) interface{} {
        return MustBarrierFromGin(c).QueryPrepared(db)
    }))
Each of your original options has its disadvantages:
The user cannot immediately see the database changes it has just created.
Debezium will capture the log of the database, which may be much larger than the events you wanted. Also, deployment and maintenance of Debezium is not an easy job.
"Built-in auto-retry functionality" is not cheap; it may require a lot of code or maintenance effort.

Kafka validate messages in stateful processing

I have an application where multiple users can send REST operations to modify the state of shared objects.
When an object is modified, then multiple actions will happen (DB, audit, logging...).
Not all the operations are valid; for example, you cannot Modify an object after it was Deleted.
Using Kafka I was thinking about the following architecture:
REST operations are queued in a Kafka topic.
Operations on the same object go to the same partition, so all of an object's operations will be in sequence and processed by a single consumer.
Consumers listen to a partition and validate each operation using an in-memory database.
If the operation is valid, it is sent to a "valid operations" topic; otherwise it is sent to an "invalid operations" topic.
Other consumers (DB, log, audit) listen to the "valid operations" topic.
I am not very sure about point number 3.
I don't like the idea of keeping the state of all my objects. (I have billions of objects, and even if an object can be 10 MB in size, what I need to store to validate its state is just a few KB...)
However, is this a common pattern? Otherwise, how can you verify the validity of certain operations?
Also, what would you use as an in-memory database? Surely it has to be highly available, fault-tolerant and support transactions (reads and writes).
I believe this is a very valid pattern, and it is essentially a variation on an event-sourced CQRS pattern.
For example, Lagom implements its CQRS persistence in a very similar fashion (although based on a completely different toolset).
A few points:
you are right about the need for sequential operations: since each state mutation needs to be based on the result of the previous mutation, there must be a strong order in their execution. This is very often the case for such things, so we like to be able to scale those operations horizontally as much as possible, so that each of those sequences of operations happens in parallel with many other sequences. In your case we have one such sequence per shared object.
Relying on Kafka partitioning by key is a good way to achieve that (assuming you keep max.in.flight.requests.per.connection at 1, so that producer retries cannot reorder messages). Here again Lagom has a similar approach, in that its persistent entities are distributed and single-threaded. I'm not saying Lagom is better, I'm just comforting you in the fact that this approach is used by others :)
a key aspect of your pattern is the transformation of a Command into an Event: in that jargon a command is seen as a request to impact the state and may be rejected for various reasons, whereas an event is a description of a state update that happened in the past and is irrefutable from the point of view of those who receive it: an event always tells the truth. The process you are describing would be a controller sitting at the boundary between the two: it is responsible for transforming commands into events.
In that sense the "valid operation topic" you mention would be an event-sourced description of the state updates of your process. Since it's all backed by Kafka it would be arbitrarily partitionable and thus scalable, which is awesome :)
Don't worry about the size of the state of all your objects; it must sit somewhere somehow. Since you have this controller that transforms commands into events, it becomes the primary source of truth related to that object, and it is responsible for storing it: this controller handles the primary storage for your events, so you must cater space for it. You can use Kafka Streams's key-value stores: those are local to each of your processing instances, though if you make them persistent they have no problem handling data much bigger than the available RAM. Behind the scenes, data is spilled to disk thanks to RocksDB, and even further behind the scenes it's all event-sourced to a Kafka topic, so your state store is replicated and will be transparently re-created on another machine if necessary (a rough sketch follows these points).
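To make the "controller" idea concrete, here is a rough sketch with the Kafka Streams DSL and a persistent key-value store; the topic and store names, and the trivial "no operations after DELETE" rule, are all placeholders:

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.Produced;
    import org.apache.kafka.streams.kstream.ValueTransformerWithKey;
    import org.apache.kafka.streams.processor.ProcessorContext;
    import org.apache.kafka.streams.state.KeyValueStore;
    import org.apache.kafka.streams.state.Stores;

    import java.util.Properties;

    public class OperationValidator {

        public static void main(String[] args) {
            StreamsBuilder builder = new StreamsBuilder();

            // RocksDB-backed store holding the last accepted operation per object id;
            // it is changelogged to a Kafka topic, so it survives instance failures.
            builder.addStateStore(Stores.keyValueStoreBuilder(
                    Stores.persistentKeyValueStore("object-state"),
                    Serdes.String(), Serdes.String()));

            KStream<String, String> operations =
                    builder.stream("operations", Consumed.with(Serdes.String(), Serdes.String()));

            // Commands keyed by object id arrive in order per partition; the transformer
            // turns each command into either a valid or an invalid outcome.
            KStream<String, String> outcomes = operations.transformValues(
                    () -> new ValueTransformerWithKey<String, String, String>() {
                        private KeyValueStore<String, String> store;

                        @Override
                        @SuppressWarnings("unchecked")
                        public void init(ProcessorContext context) {
                            store = (KeyValueStore<String, String>) context.getStateStore("object-state");
                        }

                        @Override
                        public String transform(String objectId, String operation) {
                            if ("DELETE".equals(store.get(objectId))) {
                                return "INVALID:" + operation; // cannot modify a deleted object
                            }
                            store.put(objectId, operation);    // remember the last accepted operation
                            return "VALID:" + operation;
                        }

                        @Override
                        public void close() { }
                    },
                    "object-state");

            outcomes.filter((id, v) -> v.startsWith("VALID:"))
                    .to("valid-operations", Produced.with(Serdes.String(), Serdes.String()));
            outcomes.filter((id, v) -> v.startsWith("INVALID:"))
                    .to("invalid-operations", Produced.with(Serdes.String(), Serdes.String()));

            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "operation-validator");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            new KafkaStreams(builder.build(), props).start();
        }
    }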
I hope this helps you finalise your design :)

How do I keep the RDBMS and Kafka in sync?

We want to introduce a Kafka event bus into our application, which will contain events like EntityCreated or EntityModified, so other parts of our system can consume from it. The main application uses an RDBMS (i.e. Postgres) under the hood to store the entities and their relationships.
Now the issue is how to make sure that you only send out EntityCreated events on Kafka if you successfully saved to the RDBMS. If you don't make sure that this is the case, you end up with inconsistencies on the consumers.
I see a few possible solutions, none of which is convincing:
Don't care: very dangerous, since something can go wrong when inserting into the RDBMS.
When saving the entity, also save the message which should be sent into its own table. Then have a separate process which consumes from this table, publishes to Kafka and, after a success, deletes from this table. This is quite complex to implement and also looks like an anti-pattern.
Insert into the RDBMS, keep the (SQL) transaction open until you have written successfully to Kafka, and only then commit. The problem is that you potentially keep the RDBMS transaction open for some time. I don't know how big a problem that is.
Do real CQRS, which means that you don't save to the RDBMS at all but construct the RDBMS out of the Kafka queue. That seems like the ideal way but is difficult to retrofit to a service. Also there are problems with inconsistencies due to latencies.
I had difficulties finding good solutions on the internet.
Maybe this question is too broad; feel free to point me somewhere it fits better.
When saving the entity, also save the message which should be sent into its own table. Then have a separate process which consumes from this table, publishes to Kafka and, after a success, deletes from this table. This is quite complex to implement and also looks like an anti-pattern.
This is, in fact, the solution described by Udi Dahan in his talk: Reliable Messaging without Distributed Transactions. It's actually pretty close to a "best practice"; so it may be worth exploring why you think it is an anti-pattern.
Do real CQRS, which means that you don't save to the RDBMS at all but construct the RDBMS out of the Kafka queue.
Noooo! That's where the monster is hiding! (see below).
If you were doing "real CQRS", your primary use case would be that your writers make events durable in your book of record, and the consumers would periodically poll for updates. Think "Atom Feed", with the additional constraint that the entries, and the order of entries, is immutable; you can share events, and pages of events; cache invalidation isn't a concern because, since the state doesn't change, the event representations are valid "forever".
This also has the benefit that your consumers don't need to worry about message ordering; the consumers are reading documents of well ordered events with pointers to the prior and subsequent documents.
Furthermore, you've additionally gotten a solution to a versioning story: rather than broadcasting N different representations of the same event, you send out one representation, and then negotiate the content when the consumer polls you.
Now, polling does have latency issues; you can reduce the latency by broadcasting an announcement of the update, and notifying the consumers that new events are available.
If you want to reduce the rate of false polling (waking up a consumer for an event that they don't care about), then you can start adding more information into the notification, so that the consumer can judge whether to pull an update.
Notice that "wake up and maybe poll" is a process that is triggered by a single event in isolation. "Wake up and poll just this message" is another variation on the same idea. We broadcast a thin version of EmailDeliveryScheduled; and the service responsible for that calls back to ask for the email/an enhanced version of the event with the details needed to construct the email.
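A small sketch of that "wake up and poll just this message" variation; the EmailDeliveryScheduled name comes from the paragraph above, while the topic, endpoint URL, media type and sendEmail helper are hypothetical. The notification carries only an id, and the consumer calls back for an enhanced representation, negotiating the content it needs:

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;

    public class EmailDeliveryWorker {

        public static void run(Properties consumerProps) throws Exception {
            HttpClient http = HttpClient.newHttpClient();
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(consumerProps)) {
                consumer.subscribe(List.of("email-delivery-scheduled")); // thin notifications: just an event id
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                    for (ConsumerRecord<String, String> record : records) {
                        String eventId = record.value();
                        // Call back to the book of record for the enhanced representation,
                        // negotiating the content needed to actually construct the email.
                        HttpRequest request = HttpRequest.newBuilder()
                                .uri(URI.create("https://orders.example.com/events/" + eventId))
                                .header("Accept", "application/vnd.example.email-details+json")
                                .build();
                        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
                        sendEmail(response.body()); // hypothetical: assemble and send the email
                    }
                }
            }
        }

        private static void sendEmail(String detailsJson) {
            // placeholder for the actual email assembly and delivery
        }
    }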
These are specializations of "wake up and consume the notification". If you have a use case where you can't afford the additional latency required to poll, you can use the state in the representation of the isolated event.
But trying to reproduce an ordered sequence of events when that information is already exposed as a sharable, cacheable document... That's a pretty unusual use case right there. I wouldn't worry about it as a general problem to solve -- my guess is that these cases are rare, and not easily generalized.
Note that all of the above is about messaging, not about Kafka. Notice that messaging and event sourcing are documented as different use cases. Jay Kreps wrote (2013)
I use the term "log" here instead of "messaging system" or "pub sub" because it is a lot more specific about semantics and a much closer description of what you need in a practical implementation to support data replication.
You can think of the log as acting as a kind of messaging system with durability guarantees and strong ordering semantics
The book of record should be the sole authority for the order of event messages. Any consumer that cares about order should be reading ordered documents from the book of record, rather than reading unordered documents and reconstructing the order.
In your current design....
Now the issue is how you make sure that you only send out EntityCreated events on Kafka if you successfully saved to the RDMS.
If the RDBMS is the book of record (the source of "truth"), then the Kafka log isn't (yet).
You can get there from here over a number of gentle steps; roughly: you add events into the existing database; you read from the existing database to write into Kafka's log; you use Kafka's log as a (time-delayed) source of truth to build a replica of the existing RDBMS; you migrate your read use cases to the replica; you migrate your write use cases to Kafka; and you decommission the legacy database.
Kafka's log may or may not be the book of record you want. Greg Young has been developing Get Event Store for quite some time, and has enumerated some of the tradeoffs (2016). Horses for courses - I wouldn't expect it to be too difficult to switch the log from one of these to the other with a well written code base, but I can't speak at all to the additional coupling that might occur.
There is no perfect way to do this if your requirement is to treat SQL and Kafka as a single node. So the question should really be: "What bad things (power failure, hardware failure) can I afford if they happen? What changes (programming, architecture) can I make to my applications?"
For the points you mentioned (numbered to match your options):
2. What if the node fails after inserting into Kafka but before deleting from SQL?
3. What if the node fails after inserting into Kafka but before committing the SQL transaction?
4. What if the node fails after inserting into SQL but before committing the Kafka offset?
All of them face the risk of data inconsistency (4 is slightly better if the insert into SQL cannot succeed more than once, e.g. because the rows have a non-database-generated PK).
From the viewpoint of required changes, 3 is the smallest, but it will decrease SQL throughput. 4 is the biggest, because your business logic has to deal with two kinds of database when you code (writing to Kafka via a data encoder, reading from SQL via SQL statements), so it has more coupling than the others.
So the choice depends on what your business is. There is no generic way.

NEventStore 3.0 - Throughput / Performance

I have been experimenting with JOliver's Event Store 3.0 as a potential component in a project and have been trying to measure the throughput of events through the Event Store.
I started using a simple harness which essentially iterated through a for loop, creating a new stream and committing a very simple event, comprising a GUID id and a string property, to an MSSQL2K8 R2 DB. The dispatcher was essentially a no-op.
This approach managed to achieve ~3K operations/second running on an 8-way HP G6 DL380, with the DB on a separate 32-way G7 DL580. The test machines were not resource-bound; blocking looks to be the limit in my case.
Has anyone got any experience of measuring the throughput of the Event Store and what sort of figures have been achieved? I was hoping to get at least 1 order of magnitude more throughput in order to make it a viable option.
I would agree that blocking IO is going to be the biggest bottleneck. One of the issues that I can see with the benchmark is that you're operating against a single stream. How many aggregate roots do you have in your domain with 3K+ events per second? The primary design of the EventStore is for multithreaded operations against multiple aggregates, which reduces contention and locks for real-world applications.
Also, what serialization mechanism are you using? JSON.NET? I don't have a Protocol Buffers implementation (yet), but every benchmark shows that PB is significantly faster in terms of performance. It would be interesting to run a profiler against your application to see where the biggest bottlenecks are.
Another thing I noticed was that you're introducing a network hop into the equation, which increases latency (and blocking time) against any single stream. If you were writing to a local SQL instance which uses solid-state drives, I could see the numbers being much higher as compared to a remote SQL instance running magnetic drives which has the data and log files on the same platter.
Lastly, did your benchmark application use System.Transactions or did it default to no transactions? (The EventStore is safe without use of System.Transactions or any kind of SQL transaction.)
Now, with all of that being said, I have no doubt that there are areas in the EventStore that could be dramatically optimized with a little bit of attention. As a matter of fact, I'm kicking around a few backward-compatible schema revisions for the 3.1 release to reduce the number of writes performed within SQL Server (and RDBMS engines in general) during a single commit operation.
One of the biggest design questions I faced when starting on the 2.x rewrite that serves as the foundation for 3.x is the idea of async, non-blocking IO. We all know that node.js and other non-blocking web servers beat threaded web servers by an order of magnitude. However, the potential for complexity introduced on the caller is increased and is something that must be strongly considered because it is a fundamental shift in the way most programs and libraries operate. If and when we do move to an evented, non-blocking model, it would be more in a 4.x time frame.
Bottom line: publish your benchmarks so that we can see where the bottlenecks are.
Excellent question Matt (+1), and I see Mr Oliver himself replied as the answer (+1)!
I wanted to throw in a slightly different approach that I myself am playing with to help with the 3,000 commits-per-second bottleneck you are seeing.
The CQRS pattern, which most people who use JOliver's EventStore seem to be attempting to follow, allows for a number of "scale out" sub-patterns. The first thing people usually queue off is the Event commits themselves, which is where you are seeing a bottleneck. "Queue off" means offloading them from the actual commits and inserting them into some write-optimized, non-blocking I/O process, or "queue".
My loose interpretation is:
Command broadcast -> Command Handlers -> Event broadcast -> Event Handlers -> Event Store
There are actually two scale-out points here in these patterns: the Command Handlers and Event Handlers. As noted above, most start with scaling out the Event Handler portions, or the Commits in your case to the EventStore library, because this is usually the biggest bottleneck due to the need to persist it somewhere (e.g. Microsoft SQL Server database).
I myself am using a few different providers to test for the best performance to "queue up" these commits. CouchDB and .NET's AppFabric Cache (which has a great GetAndLock() feature). [OT]I really like AppFabric's durable-cache features that lets you create redundant cache servers that backup your regions across multiple machines - therefore, your cache stays alive as long as there is at least 1 server up and running.[/OT]
So, imagine your Event Handlers do not write the commits to the EventStore directly. Instead, you have a handler insert them into a "queue" system, such as Windows Azure Queue, CouchDB, Memcache, AppFabric Cache, etc. The point is to pick a system with little to no blocks to queue up the events, but something that is durable with redundancy built-in (Memcache being my least favorite for redundancy options). You must have that redundancy, in the case that if a server drops, you still have the event queued up.
To finally commit from this "Queued Event", there are several options. I like Windows Azure's Queue pattern for this, because of the many "workers" you can have constantly looking for work in the queue. But it doesn't have to be Windows Azure - I've mimicked Azure's Queue pattern in local code using a "Queue" and "Worker Roles" running in background threads. It scales really nicely.
Say you have 10 workers constantly looking into this "queue" for any User Updated events (I usually write a single worker role per Event type, makes scaling out easier as you get to monitor the stats of each type). Two events get inserted into the queue, the first two workers instantly pick up a message each, and insert them (Commit them) directly into your EventStore at the same time - multithreading, as Jonathan mentioned in his answer. Your bottleneck with that pattern would be whatever database/eventstore backing you select. Say your EventStore is using MSSQL and the bottleneck is still 3,000 RPS. That is fine, because the system is built to 'catch up' when those RPS drops down to, say 50 RPS after a 20,000 burst. This is the natural pattern CQRS allows for: "Eventual Consistency."
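The queue-and-workers pattern itself is stack-agnostic; purely to illustrate the flow (and deliberately not the durable Azure Queue / CouchDB / AppFabric backing recommended above), here is a minimal in-process sketch of N workers draining a queue of pending commits. In anything real you would swap the BlockingQueue for a durable, redundant queue and the EventStoreWriter for the actual EventStore commit:

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.LinkedBlockingQueue;

    public class CommitQueue {

        // In-process stand-in for a durable queue; the event handler enqueues events
        // here instead of committing them to the event store directly.
        private final BlockingQueue<String> pendingCommits = new LinkedBlockingQueue<>();
        private final ExecutorService workers = Executors.newFixedThreadPool(10);

        public void enqueue(String serializedEvent) {
            pendingCommits.add(serializedEvent); // returns immediately; the caller never blocks on the store
        }

        // Start the "worker roles": each loops, taking events and committing them to the
        // event store; a burst simply makes the queue longer and is caught up later.
        public void startWorkers(EventStoreWriter store) {
            for (int i = 0; i < 10; i++) {
                workers.submit(() -> {
                    while (!Thread.currentThread().isInterrupted()) {
                        try {
                            store.commit(pendingCommits.take());
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                });
            }
        }

        // Hypothetical abstraction over the actual event store commit call.
        public interface EventStoreWriter {
            void commit(String serializedEvent);
        }
    }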
I said there were other scale-out patterns native to the CQRS patterns. Another, as I mentioned above, is the Command Handlers (or Command Events). This is one I have done as well, especially if you have a very rich domain, as one of my clients does (dozens of processor-intensive validation checks on every Command). In that case, I'll actually queue off the Commands themselves, to be processed in the background by some worker roles. This gives you a nice scale-out pattern as well, because now your entire backend, including the EventStore commits of the Events, can be threaded.
Obviously, the downside is that you lose some real-time validation checks. I solve that by usually segmenting validation into two categories when structuring my domain. One is Ajax or real-time "lightweight" validations in the domain (kind of like a pre-Command check). The others are hard-failure validation checks that are only done in the domain and are not available for real-time checking. You would then need to code for failure in the Domain model. Meaning, always code a way out if something fails, usually in the form of a notification email back to the user that something went wrong. Because the user is no longer blocked by this queued Command, they need to be notified if the Command fails.
And your validation checks that need to go to the 'backend' is going to your Query or "read-only" database, riiiight? Don't go into the EventStore to check for, say, a unique Email address. You'd be doing your validation against your highly-available read-only datastore for the Queries of your front end. Heck, have a single CouchDB document be dedicated to only a list of all email addresses in the system as your Query portion of CQRS.
CQRS is just suggestions... If you really need realtime checking of a heavy validation method, then you can build a Query (read-only) store around that and speed up the validation, at the PreCommand stage, before it gets inserted into the queue. Lots of flexibility. And I would even argue that validating things like empty Usernames and empty Emails is not even a domain concern, but a UI responsibility (off-loading the need to do real-time validation in the domain). I've architected a few projects where I had very rich UI validation on my MVC/MVVM ViewModels. Of course my Domain had very strict validation, to ensure it is valid before processing. But moving the mediocre input-validation checks, or what I call "lightweight" validation, up into the ViewModel layers gives that near-instant feedback to the end-user, without reaching into my domain. (There are tricks to keep that in sync with your domain as well.)
So in summary, possibly look into queuing off those Events before they are committed. This fits nicely with EventStore's multi-threading features as Jonathan mentions in his answer.
We built a small boilerplate for massive concurrency using Erlang/Elixir, https://github.com/work-capital/elixir-cqrs-eventsourcing, using Eventstore. We still have to optimize db connections, pooling, etc., but the idea of having one process per aggregate with multiple db connections is aligned with your needs.