Using integer ids with CQRS Pattern - cqrs

I've been asked to apply the CQRS pattern to an existing project. If at all possible this should not make for any schema changes to the database, which uses integer ids.
However, every example of CQRS I've seen uses Guids, my question is is this an inherent requirement of using CQRS or is there a way I can adapt it.
(We're doing this as an exercise, so the usefulness / practicality of doing such a thing isn't in question).

There is no reason why CQRS should require GUIDS. If you want multiple write services you typically need GUIDS, but CQRS is just about separating the read and write services (and the models used by the two).

Since one aspect of using CQRS is that you can use it to create a (large) distributed application, not having a UUID, but an integer ID may hurt the system's ability to work in a distributed fashion.
The problem here is that you rely on a central counter that needs locking, transactions, and all these kinds of things you do not want to have in a distributed system.
So, yes, of course you can use an integer ID, but in terms of scalability it's a bad idea. If this isn't important for you, it will (probably) work anyway.

CQRS requires guids as the source of a command has to be independent of any central dispatch of IDs. They are easy to generate. Where a standard library doesn't provide you with a guid, crypto libraries are actually better. Here's what I use for Go:
func pseudo_uuid() (uuid string) {
b := make([]byte, 16)
_, err := rand.Read(b)
if err != nil {
fmt.Println("Error: ", err)
return
}
uuid = fmt.Sprintf("%X-%X-%X-%X-%X", b[0:4], b[4:6], b[6:8], b[8:10], b[10:])
return
}
While looking at CQRS as just a way to separate a system into 2 systems - one that takes care of state transitions and one for everything else - it is important to observe that the motivation for CQRS is scalability and the the abstractions you build by relying on a central dispatch of IDs will cost you a refactoring effort as you scale up.

Related

CQRS Event Sourcing valueobject & entities accepting commands

Experts,
I am in the evaluating moving a nice designed DDD app to an event sourced architecture as a pet project.
As can be inferred, my Aggregate operations are relatively coarse grained. As part of my design process, I now find myself emitting a large number of events from a small subset of operations to satisfy what I know to be requirements of the read model. Is this acceptable practice ?
In addition to this, I have distilled a lot of the domain complexity via use of ValueObjects & entities. Can VO's/ E accept commands and emit events themselves, or should I expose state and add from the command handler lower down the stack ?
In terms of VO's note that I use mutable operations sparingly and it is a trade off between over complicating other areas of my domain.
I now find myself emitting a large number of events from a small subset of operations to satisfy what I know to be requirements of the read model. Is this acceptable practice ?
In most cases, you should have 1 event by command. Remember events describe the user intent so you have to stay close to the user action.
Can VO's/ E accept commands and emit events themselves
No only aggregates emit events and yes you can come up with very messy aggregates if you have a lot of events, but there is solutions for that like delegating the work to the commands and events themselves. I blogged about this technique here: https://dev.to/maximegel/cqrs-scalable-aggregates-731
In terms of VO's note that I use mutable operations sparingly
As long as you're aware of the consequences. That's fine. Trade-off are part of the job, but be sure you team is aware of that since it's written everywhere that value objects are immutable you expose yourselves to confusion and pointer issues.

ZeroMQ Subscriber filtering on multipart messages

With ZeroMQ PUB/SUB model, is it possible for a subscriber to filter based on contents of more than just the first frame?
For example, if we have a multi-frame message that contains three frames1) data type,2) instrument, and then 3) the actual data,is it possible to subscribe to a specific data type, instrument pair?
(All of the examples I've seen only shows filtering based off of the first message of a multipart message).
Different modes of ZeroMQ PUB/SUB filtering
Initial ZeroMQ model used a subscriber-side subscription-based filtering.
That was easier for publisher, as it need not handle subscriber-specific handling and simply fed all data to all SUB-s ( yes, at the cost of the vasted network traffic and causing SUB-side workload to process all incoming BLOBs, all the way up from the PHY-media, even if it did not subscribe to the particular topic. Yes, pretty expensive in low-latency designs )
PUB-side filtering was initially proposed for later implementation
still, this mode does not allow your idea to fly just on using the PUB/SUB Scaleable Formal Communication Pattern.
ZeroMQ protocol design strives to inspire users, how to achieve just-enough designed distributed systems' behaviour.
As understood from your 1-2-3 intention, a custom logic shall be just enough to achieve the desired processing.
So, how to approach the solution?
Do not hesitate to setup several messaging / control relays in parallel in your application domain-specific solution, it works much better for any tailor-made solutions and a way safer than to try to "bend" a library's primitive archetype ( in this case the trivial PUB/SUB topic-based filtering ) to make something a bit different, than for what an original use-case was designed.
The more true, if your domain-specific use is FOREX / Equity trading, where latency is your worst enemy, so a successful approach has to minimise stream-decoding and alternative branching as much as possible.
nanoseconds do matter
If one reads into details about multiframe composition ( a sender-side BLOB assembly ), there is no latency-wise advantage, as the whole BLOB gets onto wire only after it has been completed - se there is no advantage for your SUB-side in case your idea was to handle initial frame content for signalling, as the whole BLOB arrives "together"

Difference between CQRS and CQS

I am learning what is CQRS pattern and came to know there is also CQS pattern.
When I tried to search I found lots of diagrams and info on CQRS but didn't found much about CQS.
Key point in CQRS pattern
In CQRS there is one model to write (command model) and one model to read (query model), which are completely separate.
How is CQS different from CQRS?
CQS (Command Query Separation) and CQRS (Command Query Responsibility Segregation) are very much related. You can think of CQS as being at the class or component level, while CQRS is more at the bounded context level.
I tend to think of CQS as being at the micro level, and CQRS at the macro level.
CQS prescribes separate methods for querying from or writing to a model: the query doesn't mutate state, while the command mutates state but does not have a return value. It was devised by Bertrand Meyer as part of his pioneering work on the Eiffel programming language.
CQRS prescribes a similar approach, except it's more of a path through your system. A query request takes a separate path from a command. The query returns data without altering the underlying system; the command alters the system but does not return data.
Greg Young put together a pretty thorough write-up of what CQRS is some years back, and goes into how CQRS is an evolution of CQS. This document introduced me to CQRS some years ago, and I find it a still very useful reference document.
This is an old question but I am going to take a shot at answering it. I do not often answer questions on StackOverflow, so please forgive me if I do something outside the bounds of community in terms of linking to things, writing a long answer, etc.
There are many differences between CQRS and CQS however CQRS uses CQS inside of its definition! Let's start with defining the two and then we can discuss differences.
CQS defines two types of messages depending on their return value: no return value (void) specifies this is a Command; a return value (non-void) specifies this method is a Query.
Commands change information
Queries return information
Commands change state. Queries do not.
Now for CQRS, which uses the same definition as CQS for Commands and Queries. What CQRS says is that we do not want one object with Command and Query methods. Instead we want two objects: one with all the Commands and one with all the Queries.
The idea overall is very simple; it's everything after doing this where things become interesting. There are numerous talks online, of me discussing some of the attributes associated (sorry way too much to type here!).
CQS is about Command and Queries. It doesn't care about the model. You have somehow separated services for reading data, and others for writing data.
CQRS is about separate models for writes and reads. Of course, usage of write model often requires reading something to fulfill business logic, but you can only do reads on read model. Separate Databases are state of the art. But imagine single DB with separate models for read and writes modeled in ORM. It's very often good enough.
I have found that people often say they practice CQRS when they have CQS.
Read the inventor Greg Young's answer
I think, like "Dependency Injection" the concepts are so simple and taken for granted that the fact that they have fancy names seems to drive people to think they're something more than they are, especially as CQRS is often quoted alongside Event Sourcing.
CQS is the separation of methods that read to those that change state; don't do both in a single method. This is micro level.
CQRS extends this concept into a higher level for machine-machine APIs, separation of the message models and processing paths.
So CQRS is a principle you apply to the code in an API or facade.
I have found CQRS to essentially be a very strong S in SOLID, pushing the separation deeply into the psyche of developers to produce more maintainable code.
I think web applications are a bad fit for CQRS since the mutation of state via representation transfer means the command and query are two sides of the same request-response. The representation is a command and the response is the query.
For example, you send an order and receive a view of all your orders.
Imagine if the code of a website was factored into a command side and query side. The route action handling code would need to fall into one of those sides, but it does both.
Imagining a stronger segregation, if the code was moved into two different compilable codebases, then the website would accept a POST of a form, but the user would have to browse to another website URL to see the impact of the action. This is obviously crazy. One workaround would be to always redirect, though this wouldn't really be RESTful since the ideal REST application is where the next representation contains hypertext to drive the next state transition and so on.
Given that a website is a REST API between human and machine (or machine and machine), this also includes REST APIs, though other types of HTTP message passing API may be a perfect fit for CQRS.
A service or facade within the bounds of the website may obviously work well with CQRS, though the action handlers would sit outside this boundary.
See CQS on Wikipedia
The biggest difference is CQRS uses separate data stores for commands and queries. A query store can use a different technology like a document database or just be a denormalized schema in the same database that makes querying the data easier.
The data between databases is usually copied asynchronously using something like a service bus. Therefore, data in the query store is eventually consistent (is going to be there at some point of time). Applications need to account for that. While it is possible to use the same transaction (same database or a 2-phase commit) to write in both stores, it is usually not recommended for scalability reasons.
CQS architecture reads and writes from the same data store/tables.

Using Scala, does a functional paradigm make sense for analyzing live data?

For example, when analyzing live stockmarket data I expose a method to my clients
def onTrade(trade: Trade) {
}
The clients may choose to do anything from counting the number of trades, calculating averages, storing high lows, price comparisons and so on. The method I expose returns nothing and the clients often use vars and mutable structures for their computation. For example when calculating the total trades they may do something like
var numTrades = 0
def onTrade(trade: Trade) {
numTrades += 1
}
A single onTrade call may have to do six or seven different things. Is there any way to reconcile this type of flexibility with a functional paradigm? In other words a return type, vals and nonmutable data structures
You might want to look into Functional Reactive Programming. Using FRP, you would express your trades as a stream of events, and manipulate this stream as a whole, rather than focusing on a single trade at a time.
You would then use various combinators to construct new streams, for example one that would return the number of trades or highest price seen so far.
The link above contains links to several Haskell implementations, but there are probably several Scala FRP implementations available as well.
One possibility is using monads to encapsulate state within a purely functional program. You might check out the Scalaz library.
Also, according to reports, the Scala team is developing a compiler plug-in for an effect system. Then you might consider providing an interface like this to your clients,
def callbackOnTrade[A, B](f: (A, Trade) => B)
The clients define their input and output types A and B, and define a pure function f that processes the trade. All "state" gets encapsulated in A and B and threaded through f.
Callbacks may not be the best approach, but there are certainly functional designs that can solve such a problem. You might want to consider FRP or a state-monad solution as already suggested, actors are another possibility, as is some form of dataflow concurrency, and you can also take advantage of the copy method that's automatically generated for case classes.
A different approach is to use STM (software transactional memory) and stick with the imperative paradigm whilst still retaining some safety.
The best approach depends on exactly how you're persisting the data and what you're actually doing in these state changes. As always, let a profiler be your guide if performance is critical.

Specification: Use cases for CRUD

I am writing a Product requirements specification. In this document I must describe the ways that the user can interact with the system in a very high level. Several of these operations are "Create-Read-Update-Delete" on some objects.
The question is, when writing use cases for these operations, what is the right way to do so? Can I write only one Use Case called "Manage Object x" and then have these operations as included Use Cases? Or do I have to create one use case per operation, per object? The problem I see with the last approach is that I would be writing quite a few pages that I feel do not really contribute to the understanding of the problem.
What is the best practice?
The original concept for use cases was that they, like actors, and class definitions, and -- frankly everything -- enjoy inheritance, as well as <<uses>> and <<extends>> relationships.
A Use Case superclass ("CRUD") makes sense. A lot of use cases are trivial extensions to "CRUD" with an entity type plugged into the use case.
A few use cases will be interesting extensions to "CRUD" with variant processing scenarios for -- maybe -- a fancy search as part of Retrieve, or a multi-step process for Create or Update, or a complex confirmation for Delete.
Feel free to use inheritance to simplify and normalize your use cases. If you use a UML tool, you'll notice that Use Cases have an "inheritance" arrow available to them.
The answer really depends on how complex the interactions are and how many variations are possible from object to object. There are two real reasons why I suggest that you develop specific use cases for each CRUD
(a) If you really are only doing a high-level summary of the interaction then the overhead is very small
(b) I've found it useful to specify a set of generic Use Cases for modifying 'Resources' and then extending / overriding particular steps for particular objects. Obviously the common behaviour is captured in the generic 'Resource' use cases.
As your understanding of the domain develops (i.e. as business users dump more requirements on you), you are more likely to add to the CRUD rather than remove it.
It makes sense to distinguish between workflow cases and resource/object lifecycles.
They interact but they are not the same; it makes sense to specify them both.
Use case scenarios or more extended workflow specifications typically describe how a case may proceed through the system's workflow. This will typically include interaction with various different resources. These interactions can often be characterized as C,R,U or D.
Resource lifecycles provide the process model of what may happen to a particular (type of) resource (object). They are often trivial "flower" models that say: any of C,R,U,D may happen to this resource in any order, so they are not very interesting by themselves.
The link between the two is that steps from the workflow and from the lifecycles coincide.
I feel representation - as long as it makes sense and is readable - does not matter. Conforming to the UML spec in all details is especially irrelevant.
What does matter, that you spec clearly states the operations and operation types the implementaton requires.
C: What form of insert operations exists. Can you insert rows not fully populated? Can you insert rows without an ID? Can you retrieve the ID last inserted? Can you cancel an insert selectively? What happens on duplicate keys or constraints failure? Is there a REPLACE INTO equivalent?
R: By what fields can you select? Can you do arbitrary grouping, orders? Can you create aggregate fields, aliases? How can you retrieve embedded (has many etc.) data? How do you specify depth of recursion, limits?
U, D: see R + C