When I have used Pub/Sub pattern frameworks like MVVMLight before, I have seen that the subscriber's calls are handled synchronously. From a scalability point of view, does a reactive framework like Rx help scalability where the pub and sub are completely decoupled and scalable? Which pattern helps scalability?
I don't know the specifics of MVVMLight, but in general the Pub/Sub is a pattern, where:
Publishers and subscribers don't know about each other. They only know about a broker, where they publish/consume messages.
As a result, the publication and consumption of messages is done asynchronously and is completely decoupled. This means that the publication/consumption side can be scaled independently and in case of failures of one part, the other part is able to keep working.
Now, reactive programming is a pattern used to model changes and their propagation across multiple actors. As such, it's not so much concerned with implementation details, but more focused on providing an abstract, declarative interface, which makes it easier to work with streams of events and perform processing on top of them. Straight from ReactiveX's documentation:
ReactiveX is not biased toward some particular source of concurrency or asynchronicity. Observables can be implemented using thread-pools, event loops, non-blocking I/O, actors (such as from Akka), or whatever implementation suits your needs, your style, or your expertise. Client code treats all of its interactions with Observables as asynchronous, whether your underlying implementation is blocking or non-blocking and however you choose to implement it.
So, the decoupling/scalability will be mainly dependent on the implementation used underneath; the main benefit of the framework is mainly the abstract, declarative interface provided.
Regarding the observer pattern (which is mentioned in the question's title): it's a rather low-level primitive that can be used to achieve the same goal, but can probably lead to a much more complex codebase. For more details on the pitfalls of observer pattern when compared with more abstract reactive frameworks, you can read the following paper:
Deprecating the Observer pattern with Scala.React
The reactive programming paradigm is often presented in object-oriented languages as an extension of the Observer design pattern. You can also compare the main reactive streams pattern with the familiar Iterator design pattern, as there is a duality to the Iterable-Iterator pair in all of these libraries. One major difference is that, while an Iterator is pull-based, reactive streams are push-based.
Using an iterator is an imperative programming pattern, even though the method of accessing values is solely the responsibility of the Iterable. Indeed, it is up to the developer to choose when to access the next() item in the sequence. In reactive streams, the equivalent of the above pair is Publisher-Subscriber. But it is the Publisher that notifies the Subscriber of newly available values as they come, and this push aspect is the key to being reactive. Also, operations applied to pushed values are expressed declaratively rather than imperatively: The programmer expresses the logic of the computation rather than describing its exact control flow.
Source: https://projectreactor.io/docs/core/release/reference/#intro-reactive
Related
Experts,
I am in the evaluating moving a nice designed DDD app to an event sourced architecture as a pet project.
As can be inferred, my Aggregate operations are relatively coarse grained. As part of my design process, I now find myself emitting a large number of events from a small subset of operations to satisfy what I know to be requirements of the read model. Is this acceptable practice ?
In addition to this, I have distilled a lot of the domain complexity via use of ValueObjects & entities. Can VO's/ E accept commands and emit events themselves, or should I expose state and add from the command handler lower down the stack ?
In terms of VO's note that I use mutable operations sparingly and it is a trade off between over complicating other areas of my domain.
I now find myself emitting a large number of events from a small subset of operations to satisfy what I know to be requirements of the read model. Is this acceptable practice ?
In most cases, you should have 1 event by command. Remember events describe the user intent so you have to stay close to the user action.
Can VO's/ E accept commands and emit events themselves
No only aggregates emit events and yes you can come up with very messy aggregates if you have a lot of events, but there is solutions for that like delegating the work to the commands and events themselves. I blogged about this technique here: https://dev.to/maximegel/cqrs-scalable-aggregates-731
In terms of VO's note that I use mutable operations sparingly
As long as you're aware of the consequences. That's fine. Trade-off are part of the job, but be sure you team is aware of that since it's written everywhere that value objects are immutable you expose yourselves to confusion and pointer issues.
When learning Scala, one of the first things I learned was that every function returns something. There is no "void"-function/method as there is, for instance in Java. Thus many Scala-functions are true functions, in a mathematic way, and objects can remain largely stateless.
Now I learned that the actor model is a very popular model among functional languages like Scala. However, actors promote a fire-and-forget style of programming, and callers usually don't expect callees to directly reply to messages (except when using the "ask"/"?"-method). Therefore, actors need to remember some sort of state.
Am I right assuming that the actor model is more like a trade-off between scalability and maintainability (due to its statefulness), and could sometimes even be considered an anti-pattern?
Yes you're essentially right (I'm not quite sure what you have in mind when you say scalability vs maintainability).
Actors are popular in Scala because of Akka (which presumably is in turn popular because of the support it gets from Lightbend). It is, not however, the case that actors are overwhelmingly popular in general in the functional programming world (although implementations exist for all the languages I'm thinking of). Below are my vastly simplified impressions (so take them with the requisite amount of salt) of two other FP language communities, both of which use actors (far?) less frequently than Scala does.
The Haskell community tends to use either STM/channels (often in an STM context). Straight up MVars also get used surprisingly often.
The Clojure community sometimes touts its own built-in version of STM, but its flagship concurrency model is really core.async, which is at its heart again channels.
As an aside STM, channels, and actors can all be layered upon one another; its sort of weird to compare them as if they were mutually exclusive approaches. In practice though it's rare to see them all used in tandem.
Actors do indeed involve state (and in the case of Akka skirt type safety) and as a result are very expressive and can pretty much do anything concurrency-wise. In this way they're similar to side-effectful functions, which are more expressive than pure functions. Indeed actors in a way are the pure essence of OO, with all its pros and cons.
As such there is a sizable chunk of the Scala community that would say yes, if most of the time when you face concurrency issues, you're using actors, that's probably an anti-pattern.
If you can, try to get away with just using Futures or scalaz.concurrent.Tasks. In return for less expressiveness you get more composability.
If your problem naturally lends itself to a single, global state (e.g. in the form of global invariants that you want to enforce), think about STM. In the Scala community, although an STM library exists, my impression is that STM is usually emulated by using actors.
If your concurrency problems mainly revolves around streaming multiple sources of data, think about using one of Scala's streaming libraries.
Actors are specifically a tool in the toolbox for handling and distributing state. So yes, they should have state - if they don't then you just could use Futures.
Please note however that Actors (at least Akka Actors) handle distribution (running location-transparently on multiple nodes) which neither functions of Futures are able to do. The concurrency aspects of Actors are a result of them handling the more complex case - networking. In that sense, Actors unify the remote case with the local case, by making the remote case be first-class. And as it turns out, on networks messaging is exactly what you can both count and build on if you want reliable, resilient and also fast systems.
Hope this answers the "big picture" part of your question.
So now with swift, the ReactiveCocoa people have rewritten it in version 3.0 for swift
Also, there's been another project spun up called RxSwift.
I wonder if people could add information about what the differences in design/api/philosophy of the two frameworks are (please, in the spirit of SO, stick to things which are true, rather than opinions about which is "best")
[Note for StackOverflow mods: This question DOES have definitive answers, the answer is the differences between the two frameworks. I think it is also highly on topic for SO]
To get started, my initial impression from reading their ReadMe's is:
As someone who is familiar with the "real" C# Rx from microsoft, RxSwift looks a lot more recognisable.
ReactiveCococa seems to have gone off into its own space now, introducing new abstractions such as Signals vs SignalProducers and Lifting. On the one hand this seems to clarify some situations (what's a Hot vs Cold signal) but on the other hand this seems to increase the complexity of the framework a LOT
This is a very good question. Comparing the two worlds is very hard. Rx is a port of what Reactive Extensions are in other languages like C#, Java or JS.
Reactive Cocoa was inspired by Functional Reactive Programming, but in the last months, has been also pointed as inspired by Reactive Extensions as well. The outcome is a framework that shares some things with Rx, but has names with origins in FRP.
The first thing to say is that neither RAC nor RxSwift are Functional Reactive Programming implementations, according to Conal's definition of the concept. From this point everything can be reduced to how each framework handles side effects and a few other components.
Let's talk about the community and meta-tech stuff:
RAC is a 3 years old project, born in Objective-C later ported to Swift (with bridges) for the 3.0 release, after completely dropping the ongoing work on Objective-C.
RxSwift is a few months old project and seems to have a momentum in the community right now. One thing that is important for RxSwift is that is under the ReactiveX organization and that all other implementations are working in the same way, learning how to deal with RxSwift will make working with Rx.Net, RxJava or RxJS a simple task and just a matter of language syntax. I could say that is based on the philosophy learn once, apply everywhere.
Now it's time for the tech stuff.
Producing/Observing Entities
RAC 3.0 has 2 main entities, Signal and SignalProducer, the first one publishes events regardless a subscriber is attached or not, the second one requires a start to actually having signals/events produced. This design has been created to separate the tedious concept of hot and cold observables, that has been source of confusion for a lot of developers. This is why the differences can be reduced to how they manage side effects.
In RxSwift, Signal and SignalProducer translates to Observable, it could sound confusing, but these 2 entities are actually the same thing in the Rx world. A design with Observables in RxSwift has to be created considering if they are hot or cold, it could sound as unnecessary complexity, but once you understood how they work (and again hot/cold/warm is just about the side effects while subscribing/observing) they can be tamed.
In both worlds, the concept of subscription is basically the same, there's one little difference that RAC introduced and is the interruption event when a Signal is disposed before the completion event has been sent.
To recap both have the following kind of events:
Next, to compute the new received value
Error, to compute an error and complete the stream, unsubscribing all the observers
Complete, to mark the stream as completed unsubscribing all observers
RAC in addition has interrupted that is sent when a Signal is disposed before completing either correctly or with an error.
Manually Writing
In RAC, Signal/SignalProducer are read-only entities, they can't be managed from outside, same thing is for Observable in RxSwift. To turn a Signal/SignalProducer into a write-able entity, you have to use the pipe() function to return a manually controlled item. On the Rx space, this is a different type called Subject.
If the read/write concept sounds unfamiliar, a nice analogy with Future/Promise can be made. A Future is a read-only placeholder, like Signal/SignalProducer and Observable, on the other hand, a Promise can be fulfilled manually, like for pipe() and Subject.
Schedulers
This entity is pretty much similar in both worlds, same concepts, but RAC is serial-only, instead RxSwift features also concurrent schedulers.
Composition
Composition is the key feature of Reactive Programming. Composing streams is the essence of both frameworks, in RxSwift they are also called sequences.
All the observable entities in RxSwift are of type ObservableType, so we compose instances of Subject and Observable with the same operators, without any extra concern.
On RAC space, Signal and SignalProducer are 2 different entities and we have to lift on SignalProducer to be able to compose what is produced with instances of Signal. The two entities have their own operators, so when you need to mix things, you have to make sure a certain operator is available, on the other side you forget about the hot/cold observables.
About this part, Colin Eberhardt summed it nicely:
Looking at the current API the signal operations are mainly focussed on the ‘next’ event, allowing you to transform values, skip, delay, combine and observe on different threads. Whereas the signal producer API is mostly concerned with the signal lifecycle events (completed, error), with operations including then, flatMap, takeUntil and catch.
Extra
RAC has also the concept of Action and Property, the former is a type to compute side effects, mainly relating to user interaction, the latter is interesting when observing a value to perform a task when the value has changed. In RxSwift the Action translates again into an Observable, this is nicely shown in RxCocoa, an integration of Rx primitives for both iOS and Mac. The RAC's Property can be translated into Variable (or BehaviourSubject) in RxSwift.
It's important to understand that Property/Variable is the way we have to bridge the imperative world to the declarative nature of Reactive Programming, so sometimes is a fundamental component when dealing with third party libraries or core functionalities of the iOS/Mac space.
Conclusion
RAC and RxSwift are 2 complete different beasts, the former has a long history in the Cocoa space and a lot of contributors, the latter is fairly young, but relies on concepts that have been proven to be effective in other languages like Java, JS or .NET. The decision on which is better is on preference. RAC states that the separation of hot/cold observable was necessary and that is the core feature of the framework, RxSwift says that the unification of them is better than the separation, again it's just about how side effects are managed/performed.
RAC 3.0 seems to have introduced some unexpected complexity on top of the major goal of separating hot/cold observables, like the concept of interruption, splitting operators between 2 entities and introducing some imperative behaviour like start to begin producing signals. For some people these things can be a nice thing to have or even a killer feature, for some others they can be just unnecessary or even dangerous. Another thing to remember is that RAC is trying to keep up with Cocoa conventions as much as possible, so if you are an experienced Cocoa Dev, you should feel more comfortable to work with it rather than RxSwift.
RxSwift on the other hand lives with all the downsides like hot/cold observables, but also the good things, of Reactive Extensions. Moving from RxJS, RxJava or Rx.Net to RxSwift is a simple thing, all the concepts are the same, so this makes finding material pretty interesting, maybe the same problem you are facing now, has been solved by someone in RxJava and the solution can be reapplied taking in consideration the platform.
Which one has to be picked is definitely a matter of preference, from an objective perspective is impossible to tell which one is better. The only way is to fire Xcode and try both of them and pick the one that feels more comfortable to work with. They are 2 implementations of similar concepts, trying to achieve the same goal: simplifying software development.
I read the paper cowritten by Odersky, "Deprecating the Observer Pattern
with Scala.React"
The github looks abandoned:
https://github.com/ingoem/scala-react
Also, the recent Reactive Programming Coursera class, used the JavaRx Observable library (with Scala support of course).
Is there a story behind this? I can presume scala.react just didn't make it very far. Is the JavaRx library based on Observable advisable? Or can we expect something similar or better from Typesafe?
Citing Li Haoyi,
who has used Scala.React, his observations are:
"it is extremely difficult to set up and get started."
"It requires a fair amount of global configuration"
"It took several days to get a basic dataflow graph (..,) working."
He had a lot of questions but did not manage to contact the author of the publication...
Li also implemented a Scala.RX addressing these and other issues.
The code is good shape but I cannot observe any action of pushing it into the Standard Scala library. Also, Li is the driver behind the ongoing Scala & Javascript effort thus he is mostly occupied with that project.
Answering your questions:
Is the JavaRx library based on Observable advisable?
JavaRx is based on the Observer pattern Martin Odersky tried to deprecate...
https://github.com/Netflix/RxJava/blob/master/rxjava-core/src/main/java/rx/Observer.java
https://github.com/Netflix/RxJava/blob/master/rxjava-core/src/main/java/rx/Observable.java
While every issue Martin pointed out in the paper is true and valid,
Netflix had exploited a major property of Observables:
Futures and Observables share an isomorphism, thus are composable.
In JavaRx, an Observable returns a stream of events. However, a Future
on the other hand, can be seen as a specialized Observable that returns
only a singleton. In this case, Futures and Observables can be asynchronously composed
whenever it makes sense.
Is there a story behind this?
No idea but maybe Netflix did some sponsoring. You may have noticed the Netflix logo appearing in the RX diamonds examples....
Or can we expect something similar or better from Typesafe?
I honestly doubt that. Why should they? Typesafe is busy with pushing their
stack into industry and advancing Akka further. Scala.React is a neat idea but
does not produce any cash whereas Akka brings them paying customers....
Instead I would ask the question what exactly Scala.React, after all, tries to solve?
IMHO,JavaRx already does a good job, is in production and those improvements Scala.React could possible add are most likely not enough for a major change.
RxJava: Reactive Extensions has very little in common with scala.react. RxJava deals with observers and concurrency but helps very little regarding correctness of evaluation order. Basically it is just streams of events, and if events that are split into several effects those will never be coherent again. Basically it's a mess and can only be used for GUI where precision in computation is not so critical. You never know when you get an extra update or extra refresh.
scala.react is a single threaded computation model and deals with order of computation with a strict evaluation order that is defined by the functional dependencies between computations.
Akka, or actors, again, is a third model and completely different thing. It is just threads with some fancy syntax and scheduliing, really.
No wonder everyone is confused. Sadly scala.react has not moved anywhere, which is bad as it's the only innovative model of these three.
I was reading an interesting blog post about Erlang/OTP and the actor model. I also hear that Scala supports the actor model. From the little I gathered so far, the actor model breaks down processing into components that communicate with each other by passing messages. Typically, those processes are immutable.
Are those features language-specific though or more at the architecture level? more specifically, can't you just implement the same actor model in almost any language, and just use some form of message-queue to pass messages between worker processes? (for example, use something like celery). Or is it that those languages like Erlang and Scala simply do this transparently and much faster?
Certainly you can define an "Actor Library" in virtually any language, but in Erlang the model is baked-in to the language, and is really the only concurrency model available.
While Scala's actors system is well implemented, at the end of the day, it still vulnerable to some hazards that Erlang is immune from. I'll draw your attention to this paper.
This would be the case for any Actor library implemented in any imperative language that supports shared mutable state.
An interesting exception to this is Nodes.js. Some work is being done with actors between Nodes that probably exhibit the same isolation properties as Erlang, simply because there is no shared mutable state.
Actor model is not limited to any specific platform or programming language, it's just a model after all.
Erlang and Scala have really good and useful implementations of this model, which fits nicely in typical technology stack of these platforms and helps to effectively solve certain kinds of tasks.
To add to the points mentioned above, the fact that in Erlang actor model is the only way you can program, makes your code scalable from the get-go. Erlang processes are lightweight, and you can spawn 10-100K on one machine (I don't think you can do it with python), this changes the way you approach problems. For example, in our product we parse web server logs with Erlang and spawn an Erlang process to handle each line. That way, if one log line is corrupted, or the process that handles it crashes, nothing happens to the other ones.
Another difference is when you start using OTP you get processes supervisors and you can make processes connected so if one terminates all the others do.
Other than that, Erlang has some other nice feature (which can be found in other languages through libraries, but again here it's baked in) like pattern matching and hot deploy.
No, there is nothing language-specific about the Actor Model. In fact, you already mention Scala in your question, where actors are not part of the language but are instead implemented as a library. (Three competing libraries, actually.)
However, just like Functional Programming or Object-Oriented Programming, having direct support for Actor Programming, or at least support for some abstractions that make it easier to implement, in the language will lead to a very different programming experience. Anyone who has ever done Functional Programming or Object-Oriented Programming in C will probably understand this.