All of our backend services are written in Scala. We mostly write pure functional Scala using Cats.
I am trying to figure out whether there is a design pattern in Cats, or in Scala in general, that I can use to design an EventLogger.
This EventLogger should collect "events" (simple key-value pairs) as the request flows through the logic. At the end of the request, I want to write the collected events to a data store. We already have a "context" implicit parameter that gets passed to all the methods. I could add this EventLogger to my Context class, and then I would have access to the event logger from most parts of my code. Now I am trying to figure out how to design the EventLogger itself without using a mutable collection.
In the past I have used Akka actors to manage mutable state. I would prefer not to introduce Akka into our classpath just for this.
As @AndreyTyukin suggests, Writer would work well here.
You could do something like:
import cats.data.Writer

object EventLogger {
  type Event = (String, String)
  def log(event: Event): Writer[Vector[Event], Unit] =
    Writer.tell(Vector(event))
}
And then you could use it like this:
// Example usage
for {
  something <- Writer.value(myFunc(arg))
  _ <- EventLogger.log("function_finished" -> "myFunc")
  somethingElse <- Writer.value(myFunc2(arg2))
  _ <- EventLogger.log("function_finished" -> "myFunc2")
} yield combine(something, somethingElse)
At the end of this, you'll have some kind of Writer[Vector[Event], ?] value where the ? might be a value you're interested in and the Vector[Event] is all of your Event log data ready for you to do something with.
Writer usually won't be the only effect you need, so you will probably want to look into monad transformers (for example WriterT) to stack effects, or into something like Eff.
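To make the log accumulation described above concrete without depending on Cats, here is a minimal sketch of the same idea using only the standard library. `Logged` is a hand-rolled stand-in for `Writer[Vector[Event], A]`, not the Cats type, and `parse` and `double` are hypothetical business functions invented for the illustration.

```scala
object WriterSketch {
  type Event = (String, String)

  // Hand-rolled stand-in for cats.data.Writer, fixed to a Vector[Event] log.
  final case class Logged[A](events: Vector[Event], value: A) {
    def map[B](f: A => B): Logged[B] = Logged(events, f(value))
    def flatMap[B](f: A => Logged[B]): Logged[B] = {
      val next = f(value)
      Logged(events ++ next.events, next.value) // logs are appended in order
    }
  }

  def pure[A](a: A): Logged[A] = Logged(Vector.empty, a)
  def log(event: Event): Logged[Unit] = Logged(Vector(event), ())

  // Hypothetical business functions, for illustration only.
  def parse(s: String): Int = s.trim.toInt
  def double(n: Int): Int = n * 2

  val program: Logged[Int] =
    for {
      n <- pure(parse(" 21 "))
      _ <- log("function_finished" -> "parse")
      d <- pure(double(n))
      _ <- log("function_finished" -> "double")
    } yield d
}
```

Nothing is mutated here: each step returns a new value that carries the events collected so far, and `WriterSketch.program.events` holds the full log once the request is done.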
I read the definition of a State Monad as follows, which includes a definition of flatMap. The literal definition of flatMap is clear to me, but what would be a typical use case of it?
trait State[S,A]{
  def run (initial:S):(S,A)
  def flatMap[B] (f:A=>State[S,B]):State[S,B] =
    State{ s =>
      val (s1, a) = run(s)
      f(a).run(s1)
    }
}
object State{
  def apply[S,A](f:S => (S,A)):State[S,A] =
    new State[S,A] {
      def run(initial:S): (S,A) = f(initial)
    }
}
According to the Cats State monad documentation:
The flatMap method on State[S, A] lets you use the result of one State
in a subsequent State
It means we can have state transitions nicely lined up within a for-comprehension like so
val createRobot: State[Seed, Robot] =
  for {
    id <- nextLong
    sentient <- nextBoolean
    isCatherine <- nextBoolean
    name = if (isCatherine) "Catherine" else "Carlos"
    isReplicant <- nextBoolean
    model = if (isReplicant) "replicant" else "borg"
  } yield Robot(id, sentient, name, model)
In general, the purpose of flatMap is to chain monadic computations, so whatever monad we have, we can use it within a for-comprehension.
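As a small self-contained illustration of that chaining, the sketch below reuses the shape of the State from the question and adds a `map` method, which a for-comprehension needs for its final `yield`. The `nextId` counter is a hypothetical example, not something from the Cats documentation.

```scala
object StateSketch {
  // Same shape as the State in the question, plus map for use in for-comprehensions.
  final case class State[S, A](run: S => (S, A)) {
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State[S, B] { s =>
        val (s1, a) = run(s) // run this step, obtaining the next state
        f(a).run(s1)         // feed the result and the next state into the following step
      }
    def map[B](f: A => B): State[S, B] =
      flatMap[B](a => State[S, B](s => (s, f(a))))
  }

  // Hypothetical example: the state is a counter that hands out fresh ids.
  val nextId: State[Int, Int] = State[Int, Int](n => (n + 1, n))

  val threeIds: State[Int, List[Int]] =
    for {
      a <- nextId
      b <- nextId
      c <- nextId
    } yield List(a, b, c)

  val (finalCounter, ids) = threeIds.run(10)
}
```

Each `<-` is a flatMap call, so the counter produced by one step is passed to the next without any variable being reassigned.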
The following quote is taken from the Wikibooks text on the State monad in Haskell.
If you have programmed in any other language before, you likely wrote some functions that "kept state". For those new to the concept, a state is one or more variables that are required to perform some computation but are not among the arguments of the relevant function. Object-oriented languages like C++ make extensive use of state variables (in the form of member variables inside classes and objects). Procedural languages like C on the other hand typically use global variables declared outside the current scope to keep track of state.
In Haskell, however, such techniques are not as straightforward to apply. Doing so will require mutable variables which would mean that functions will have hidden dependencies, which is at odds with Haskell's functional purity. Fortunately, often it is possible to keep track of state in a functionally pure way. We do so by passing the state information from one function to the next, thus making the hidden dependencies explicit.
Basically, its purpose is to let you write purely functional programs that manipulate state, with the API computing the next state rather than actually mutating anything.
The most common examples for the state monad are:
Generating random numbers
Building games
Parsers
Data structures
Any finite state machine program
You can also check the Cats documentation page for the State monad.
Note: There is also a more complex variant called IndexedState (IndexedStateT in Cats), which lets the type of the state change from one step to the next.
I am writing a class that takes a Flow (representing a kind of socket) as a constructor argument and that lets callers send messages and wait for the corresponding answers asynchronously by returning a Future. Example:
class SocketAdapter(underlyingSocket: Flow[String, String, _]) {
  def sendMessage(msg: MessageType): Future[ResponseType]
}
This is not necessarily trivial because there may be other messages in the socket stream that are irrelevant, so some filtering is required.
In order to test the class I need to provide something like a "TestFlow" analogous to TestSink and TestSource. In fact I can create a flow by combining both. However, the problem is that I only obtain the actual probes upon materialization and materialization happens inside the class under test.
The problem is similar to the one I described in this question. My problem would be solved if I could materialize the flow first and then pass it to a client to connect to it. Again, I'm thinking about using MergeHub and BroadcastHub and again I see the problem that the resulting stream would behave differently because it is not linear anymore.
Maybe I misunderstood how a Flow is supposed to be used. In order to feed messages into the flow when sendMessage() is called, I need a certain kind of Source anyway. Maybe a Source.actorRef(...) or Source.queue(...), so I could pass in the ActorRef or SourceQueue directly. However, I'd prefer if this choice was up to the SocketAdapter class. Of course, this applies to the Sink as well.
It feels like this is a rather common case when working with streams and sockets. If it is not possible to create a "TestFlow" like the one I need, I'm also happy with some advice on how to improve my design and make it more testable.
Update: I browsed through the documentation and found SourceRef and SinkRef. It looks like these could solve my problem but I'm not sure yet. Is it reasonable to use them in my case or are there any drawbacks, e.g. different behaviour in the test compared to production where there are no such refs?
Indirect Answer
The nature of your question suggests a design flaw which you are bumping into at testing time. The answer below does not address the issue in your question, but it demonstrates how to avoid the situation altogether.
Don't Mix Business Logic with Akka Code
Presumably you need to test your Flow because you have mixed a substantial amount of logic into the materialization. Let's assume you are using raw sockets for your IO. Your question suggests that your flow looks like:
val socketFlow : Flow[String, String, _] = {
  val socket = new Socket(...)
  //business logic for IO
}
You need a complicated test framework for your Flow because your Flow itself is also complicated.
Instead, you should separate out the logic into an independent function that has no Akka dependencies:
type MessageProcessor = MessageType => ResponseType
object BusinessLogic {
  val createMessageProcessor : (Socket) => MessageProcessor = {
    //business logic for IO
  }
}
Now your flow can be very simple:
val socket : Socket = new Socket(...)
val socketFlow = Flow[MessageType].map(BusinessLogic.createMessageProcessor(socket))
As a result, your unit tests can work exclusively with createMessageProcessor. There's no need to test the Akka Flow, because it is a thin veneer around the complicated logic, which is tested independently.
Don't Use Streams For Concurrency Around 1 Element
The other big problem with your design is that SocketAdapter is using a stream to process just 1 message at a time. This is incredibly wasteful and unnecessary (you're trying to kill a mosquito with a tank).
Given the separated business logic, your adapter becomes much simpler and independent of Akka:
class SocketAdapter(messageProcessor : MessageProcessor) {
  def sendMessage(msg: MessageType): Future[ResponseType] = Future {
    messageProcessor(msg)
  }
}
Note how easy it is to use Future in some instances and Flow in other scenarios depending on the need. This comes from the fact that the business logic is independent of any concurrency framework.
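To show what this separation buys at test time, here is a rough sketch in which the adapter is exercised with a stub processor and no Akka at all. `MessageType` and `ResponseType` are placeholder case classes (the original post does not define them), `echo` is a hypothetical stub, and the global execution context is an assumption made for brevity.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object AdapterSketch {
  // Placeholder message types; the original post does not define them.
  final case class MessageType(body: String)
  final case class ResponseType(body: String)
  type MessageProcessor = MessageType => ResponseType

  class SocketAdapter(messageProcessor: MessageProcessor)(implicit ec: ExecutionContext) {
    def sendMessage(msg: MessageType): Future[ResponseType] =
      Future(messageProcessor(msg))
  }

  // A stub processor stands in for the socket-backed one in tests.
  val echo: MessageProcessor = msg => ResponseType("echo: " + msg.body)

  def demo(): ResponseType = {
    import ExecutionContext.Implicits.global
    val adapter = new SocketAdapter(echo)
    // Blocking is acceptable in a test; production code would compose the Future instead.
    Await.result(adapter.sendMessage(MessageType("ping")), 3.seconds)
  }
}
```

Because the adapter only depends on a plain function, the test needs neither probes nor a materializer.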
This is what I came up with using SinkRef and SourceRef:
object TestFlow {
  def withProbes[In, Out](implicit actorSystem: ActorSystem,
                          actorMaterializer: ActorMaterializer)
  : (Flow[In, Out, _], TestSubscriber.Probe[In], TestPublisher.Probe[Out]) = {
    val f = Flow.fromSinkAndSourceMat(TestSink.probe[In], TestSource.probe[Out])(Keep.both)
    val ((sinkRefFuture, (inProbe, outProbe)), sourceRefFuture) =
      StreamRefs.sinkRef[In]()
        .viaMat(f)(Keep.both)
        .toMat(StreamRefs.sourceRef[Out]())(Keep.both)
        .run()
    val sinkRef = Await.result(sinkRefFuture, 3.seconds)
    val sourceRef = Await.result(sourceRefFuture, 3.seconds)
    (Flow.fromSinkAndSource(sinkRef, sourceRef), inProbe, outProbe)
  }
}
This gives me a flow I can completely control with the two probes but I can pass it to a client that connects source and sink later, so it seems to solve my problem.
The resulting Flow should only be used once, so it differs from a regular Flow that is rather a flow blueprint and can be materialized several times. However, this restriction applies to the web socket flow I am mocking anyway, as described here.
The only issue I still have is that some warnings are logged when the ActorSystem terminates after the test. This seems to be due to the indirection introduced by the SinkRef and SourceRef.
Update: I found a better solution without SinkRef and SourceRef by using mapMaterializedValue():
def withProbesFuture[In, Out](implicit actorSystem: ActorSystem,
                              ec: ExecutionContext)
: (Flow[In, Out, _],
   Future[(TestSubscriber.Probe[In], TestPublisher.Probe[Out])]) = {
  val (sinkPromise, sourcePromise) =
    (Promise[TestSubscriber.Probe[In]], Promise[TestPublisher.Probe[Out]])
  val flow =
    Flow
      .fromSinkAndSourceMat(TestSink.probe[In], TestSource.probe[Out])(Keep.both)
      .mapMaterializedValue { case (inProbe, outProbe) =>
        sinkPromise.success(inProbe)
        sourcePromise.success(outProbe)
        ()
      }
  val probeTupleFuture = sinkPromise.future
    .flatMap(sink => sourcePromise.future.map(source => (sink, source)))
  (flow, probeTupleFuture)
}
When the class under test materializes the flow, the Future is completed and I receive the test probes.
I am writing a program that has to interact with a library that was implemented using Akka. Specifically, this library exposes an Actor as its endpoint.
As far as I know, and as explained in the book Applied Akka Patterns, the best way to interact with an actor system from the outside is the Ask pattern.
The library I have to use exposes an actor Main that accepts a Create message. In response to this message, it can respond with two different messages to the caller, CreateAck and CreateNack(error).
The code I am using is more or less the following.
implicit val timeout = Timeout(5 seconds)
def create() = (mainActor ? Create).mapTo[???]
The problem is that I do not know which type to pass to mapTo in place of ???.
Am I using the right approach? Is there any other useful pattern for accessing an actor system from an outside program that does not use actors?
In general it's best to let actors talk to other actors; then you'd simply receive the response as a message.
If you indeed have to integrate them with the "outside", the ask pattern is fine indeed. Please note though that if you're doing this inside an Actor, this perhaps isn't the best way to go about it.
If there's a number of unrelated response types I'd suggest:
(1) Introduce a common type; this can be as simple as:
sealed trait CreationResponse
final case object CreatedThing extends CreationResponse
final case class FailedCreationOfThing(t: Throwable) extends CreationResponse
final case class SomethingElse...(...) extends CreationResponse
which makes the protocol understandable, and trackable. I recommend this as it's explicit and helps in understanding what's going on.
(2) For completely unrelated types, simply collecting over the future works as well, without the mapTo:
val res: Future[...] = (bob ? CreateThing) collect {
  case t: ThatWorked => t // or transform it
  case nope: Nope => nope // or transform it to a different value
}
This works fine type-wise if the results t and nope have a common supertype; that type is then the ... in the resulting Future. If a message comes back that does not match any case, the resulting future fails (with scala.concurrent.Future, collect fails with a NoSuchElementException rather than a MatchError). You could add a case _ => whatever to handle that, or treat such a failure as pointing to a programming error.
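The two suggestions can be combined. The sketch below is runnable without Akka under a few assumptions: `CreateResult` is a hypothetical common supertype for the replies from the question (a library's real messages may not share one), and `fakeAsk` stands in for `mainActor ? Create`, which yields a `Future[Any]`.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object AskSketch {
  // Hypothetical protocol mirroring CreateAck / CreateNack(error) from the question.
  sealed trait CreateResult
  case object CreateAck extends CreateResult
  final case class CreateNack(error: String) extends CreateResult

  // Stand-in for `mainActor ? Create`; the real ask returns a Future[Any].
  def fakeAsk(reply: Any): Future[Any] = Future.successful(reply)

  // collect narrows Future[Any] to the replies we expect; anything else fails the future.
  def create(reply: Any): Future[CreateResult] =
    fakeAsk(reply).collect {
      case CreateAck        => CreateAck
      case nack: CreateNack => nack
    }

  def demo(reply: Any): CreateResult = Await.result(create(reply), 3.seconds)
}
```

Callers then pattern match on `CreateResult`, and the compiler can check that both outcomes are handled because the trait is sealed.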
Check whether CreateAck and CreateNack(error) inherit from a common class or trait. If that's the case, you can use the parent type in .mapTo[CreateResultType].
Another solution is to use .mapTo[Any] and use a match case to find the resulting type.
How do I create a properly functional configurable object in Scala? I have watched Tony Morris' video on the Reader monad and I'm still unable to connect the dots.
I have a hard-coded list of Client objects:
case class Client(name : String, age : Int){ /* etc */}
object Client{
  //Horrible!
  val clients = List(Client("Bob", 20), Client("Cindy", 30))
}
I want Client.clients to be determined at runtime, with the flexibility of either reading it from a properties file or from a database. In the Java world I'd define an interface, implement the two types of source, and use DI to assign a class variable:
trait ConfigSource {
  def clients : List[Client]
}
object ConfigFileSource extends ConfigSource {
  override def clients = buildClientsFromProperties(Properties("clients.properties"))
  //...etc, read properties files
}
object DatabaseSource extends ConfigSource { /* etc */ }
object Client {
  @Resource("configuration_source")
  private var config : ConfigSource = _ //Inject it at runtime
  val clients = config.clients
}
This seems like a pretty clean solution to me (not a lot of code, clear intent), but that var does jump out (OTOH, it doesn't seem to me really troublesome, since I know it will be injected once-and-only-once).
What would the Reader monad look like in this situation and, explain it to me like I'm 5, what are its advantages?
Let's start with a simple, superficial difference between your approach and the Reader approach, which is that you no longer need to hang onto config anywhere at all. Let's say you define the following vaguely clever type synonym:
type Configured[A] = ConfigSource => A
Now, if I ever need a ConfigSource for some function, say a function that gets the n'th client in the list, I can declare that function as "configured":
def nthClient(n: Int): Configured[Client] = {
  config => config.clients(n)
}
So we're essentially pulling a config out of thin air, any time we need one! Smells like dependency injection, right? Now let's say we want the ages of the first, second and third clients in the list (assuming they exist):
def ages: Configured[(Int, Int, Int)] =
  for {
    a0 <- nthClient(0)
    a1 <- nthClient(1)
    a2 <- nthClient(2)
  } yield (a0.age, a1.age, a2.age)
For this, of course, you need appropriate definitions of map and flatMap. I won't get into that here, but will simply say that Scalaz (or Rúnar's awesome NEScala talk, or Tony's, which you've seen already) gives you all you need.
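For readers who want to see roughly what such definitions might look like, here is a minimal sketch that does not use Scalaz. It deviates from the type synonym above by wrapping the function in a case class so that map and flatMap have somewhere to live; `InMemorySource` is a hypothetical source invented for the example.

```scala
object ReaderSketch {
  final case class Client(name: String, age: Int)

  trait ConfigSource { def clients: List[Client] }

  // Hypothetical in-memory source, standing in for the file- or database-backed ones.
  object InMemorySource extends ConfigSource {
    val clients = List(Client("Bob", 20), Client("Cindy", 30), Client("Dave", 40))
  }

  // A thin wrapper around ConfigSource => A that supplies map and flatMap.
  final case class Configured[A](run: ConfigSource => A) {
    def map[B](f: A => B): Configured[B] =
      Configured[B](config => f(run(config)))
    def flatMap[B](f: A => Configured[B]): Configured[B] =
      Configured[B](config => f(run(config)).run(config)) // the same config reaches both steps
  }

  def nthClient(n: Int): Configured[Client] =
    Configured[Client](config => config.clients(n))

  val ages: Configured[(Int, Int, Int)] =
    for {
      a0 <- nthClient(0)
      a1 <- nthClient(1)
      a2 <- nthClient(2)
    } yield (a0.age, a1.age, a2.age)
}
```

With this wrapper, supplying the environment is a call to `run`, for example `ReaderSketch.ages.run(ReaderSketch.InMemorySource)`.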
The important point here is that the ConfigSource dependency and its so-called injection are mostly hidden. The only "hint" that we can see here is that ages is of type Configured[(Int, Int, Int)] rather than simply (Int, Int, Int). We didn't need to explicitly reference config anywhere.
As an aside, this is the way I almost always like to think about monads: they hide their effect so it's not polluting the flow of your code, while explicitly declaring the effect in the type signature. In other words, you needn't repeat yourself too much: you say "hey, this function deals with effect X" in the function's return type, and don't mess with it any further.
In this example, of course, the effect is reading from some fixed environment. Other monadic effects you might be familiar with include error handling: we can say that Option hides error-handling logic while making the possibility of errors explicit in your method's type. Or, sort of the opposite of reading, the Writer monad hides the thing we're writing to while making its presence explicit in the type system.
Now finally, just as we normally need to bootstrap a DI framework (somewhere outside our usual flow of control, such as in an XML file), we also need to bootstrap this curious monad. Surely we'll have some logical entry point to our code, such as:
def run: Configured[Unit] = // ...
It ends up being pretty simple: since Configured[A] is just a type synonym for the function ConfigSource => A, we can just apply the function to its "environment":
run(ConfigFileSource)
// or
run(DatabaseSource)
Ta-da! So, contrasting with the traditional Java-style DI approach, we don't have any "magic" occurring here. The only magic, as it were, is encapsulated in the definition of our Configured type and the way it behaves as a monad. Most importantly, the type system keeps us honest about which "realm" dependency injection is occurring in: anything with type Configured[...] is in the DI world, and anything without it is not. We simply don't get this in old-school DI, where everything is potentially managed by the magic, so you don't really know which portions of your code are safe to reuse outside of a DI framework (for example, within your unit tests, or in some other project entirely).
update: I wrote up a blog post which explains Reader in greater detail.
At the moment, I'm trying to understand functional programming in Scala, and I came across a problem I cannot figure out myself.
Imagine the following situation:
You have two classes: Controller and Bot. A Bot is an independent Actor which is initiated by a Controller, does some expensive operation and returns the result to the Controller. The purpose of the Controller is therefore easy to describe: Instantiate multiple objects of Bot, start them and receive the result.
So far, so good; I can implement all this without using any mutable objects.
But what do I do if I have to store the result that a Bot returns, to use it later as input for another Bot (and "later" means that I don't know when at compile time)?
Doing this with a mutable list or collection is fairly easy, but it adds a lot of problems to my code (as we are dealing with concurrency here).
Is it possible, following the FP paradigm, to solve this safely by using immutable objects (lists...)?
BTW, I'm new to FP, so this question might sound stupid, but I cannot figure out how to solve this :)
Actors usually have internal state, being, themselves, mutable beasts. Note that actors are not a FP thing.
The setup you describe seems to rely on a mutable controller, and it is difficult to get around it in a language that is not non-strict by default. Depending on what you are doing, though, you could rely on futures. For example:
case Msg(info) =>
  val v1 = new Bot !! Fn1(info)
  val v2 = new Bot !! Fn2(info)
  val v3 = new Bot !! Fn3(info)
  val v4 = new Bot !! Fn4(v1(), v2(), v3())
  reply(v4())
In this case -- because !! returns a Future -- v1, v2 and v3 will be computed in parallel. The message Fn4 receives the applied futures as parameters, meaning it will wait until all three values have been computed before it starts computing.
Likewise, the reply will only be sent after v4 has been computed, because that future is applied as well.
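The `!!` operator above comes from the old scala.actors library. As a rough modern equivalent, the same shape can be sketched with `scala.concurrent.Future` from the standard library; `fn1` to `fn4` are hypothetical placeholders for the bots' computations.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object ParallelBotsSketch {
  // Hypothetical "expensive" bot computations.
  def fn1(info: Int): Int = info + 1
  def fn2(info: Int): Int = info * 2
  def fn3(info: Int): Int = info - 3
  def fn4(a: Int, b: Int, c: Int): Int = a + b + c

  def handle(info: Int): Future[Int] = {
    // Creating the futures before the for-comprehension starts all three right away.
    val v1 = Future(fn1(info))
    val v2 = Future(fn2(info))
    val v3 = Future(fn3(info))
    for {
      a <- v1
      b <- v2
      c <- v3
    } yield fn4(a, b, c) // evaluated only once all three results are available
  }

  def demo(): Int = Await.result(handle(10), 3.seconds)
}
```

No mutable collection is involved: the intermediate results live inside the futures, and the final step is expressed as a dependency on them.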
A really functional way of doing these things is functional reactive programming, or FRP for short. It is a different model from actors.
The beauty of Scala, though, is that you can combine such paradigms to whatever extent best fits your problem.
This is how an Erlang-like actor could look in Scala:
case class Actor[State](val s: State)(body: State => Option[State]) { // immutable
  @tailrec
  final def loop(s1: State): Unit = { // final so that @tailrec can be applied
    body(s1) match {
      case Some(s2) => loop(s2)
      case None => ()
    }
  }
  def act = loop(s)
}
def Bot(controller: Actor) = Actor(controller) {
  s =>
    val res = // do the calculations
    controller ! (this, res)
    None // finish work
}
val Controller = Actor(Map[Bot, ResultType]()) {s =>
  // start bots, perhaps using results already stored in s
  if (
    // time to stop, e.g. all bots already finished
  )
    None
  else
    receive {
      case (bot, res) => Some(s + (bot -> res)) // a bot has reported result
    }
}
Controller.act
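The code above is pseudocode. As a runnable sketch of the core idea only (state passed as an argument to a tail-recursive loop instead of being stored in a mutable field), the example below replaces the actor mailbox with a plain `List`; `BotResult` and the list-based inbox are assumptions made for the illustration.

```scala
import scala.annotation.tailrec

object ImmutableLoopSketch {
  // Hypothetical message: a bot's name paired with its result.
  final case class BotResult(bot: String, result: Int)

  // Each step receives the current immutable state and the remaining inbox,
  // and recurses with a new state instead of mutating the old one.
  @tailrec
  def controller(results: Map[String, Int], inbox: List[BotResult]): Map[String, Int] =
    inbox match {
      case BotResult(bot, res) :: rest => controller(results + (bot -> res), rest)
      case Nil                         => results // no more messages: stop
    }

  val collected: Map[String, Int] =
    controller(Map.empty, List(BotResult("a", 1), BotResult("b", 2)))
}
```

In a real actor, the recursion step would wait for the next message rather than read from a list, but the stored results would be threaded through in the same way.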