Suppose I need to implement a custom message-oriented protocol in Scala, along with the client/server code.
I would define case classes for the protocol messages as follows:
trait Message
case class Request1(...) extends Message
case class Response1(...) extends Message
case class Request2(...) extends Message
case class Response2(...) extends Message
... // other requests/responses
Now I need functions to read/write the messages from/to input/output streams and handle the messages.
def read(in: InputStream): Message = {...}
def write(msg: Message, out: OutputStream): Unit = {...}
def handle(msg: Message): Message = msg match {
  case req: Request1 => ... // handle Request1
  case resp: Response1 => ... // handle Response1
  ... // cases for all other message types
}
I guess it works, but I wonder whether the solution can be improved. How would you correct or improve it?
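To make the read/write pair concrete, here is a minimal sketch of one way the elided bodies might look, assuming each message is written as a one-byte type tag followed by its fields. The fields text and code, the tag values, and the Codec and roundTrip names are hypothetical, since the real message contents are not shown above.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream, InputStream, OutputStream}

// Hypothetical payloads; the real fields are elided in the question.
sealed trait Message
final case class Request1(text: String) extends Message
final case class Response1(code: Int) extends Message

object Codec {
  // Assumed wire format: a one-byte tag, then the fields of the message.
  private val Request1Tag = 1
  private val Response1Tag = 2

  def write(msg: Message, out: OutputStream): Unit = {
    val data = new DataOutputStream(out)
    msg match {
      case Request1(text) =>
        data.writeByte(Request1Tag)
        data.writeUTF(text)
      case Response1(code) =>
        data.writeByte(Response1Tag)
        data.writeInt(code)
    }
    data.flush()
  }

  def read(in: InputStream): Message = {
    val data = new DataInputStream(in)
    data.readByte().toInt match {
      case Request1Tag  => Request1(data.readUTF())
      case Response1Tag => Response1(data.readInt())
      case other        => throw new IllegalArgumentException(s"Unknown tag: $other")
    }
  }

  // Writes a message to an in-memory buffer and reads it back.
  def roundTrip(msg: Message): Message = {
    val buffer = new ByteArrayOutputStream()
    write(msg, buffer)
    read(new ByteArrayInputStream(buffer.toByteArray))
  }
}
```

Making the trait sealed lets the compiler warn when write or handle misses a message type.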
Have you had a look at Akka?
Akka makes it much simpler to develop distributed applications, with no need to handle input and output streams manually. Just have a look at the "Remoting" example on the homepage.
The benefits of this approach would be that you can focus on the protocol itself, i.e., in your case the development of one (or more) actors on the client side, and one (or more) actors on the server side.
Akka should provide you with all the lower-level functionality you need, taking care of the actual sending and receiving of messages, multi-threading, and so on, so you don't have to reinvent the wheel. This should also make your code easier for others to maintain in the future, as Akka is a well-known toolkit.
To get a basic idea of how actors work, have a look at this book chapter, but note that it describes the Scala actors, which have been replaced by Akka actors in the meantime. If you want to dig deeper, I'd recommend Akka Concurrency, which is more up to date.
I am writing a class that takes a Flow (representing a kind of socket) as a constructor argument and that allows sending messages and waiting for the respective answers asynchronously by returning a Future. Example:
class SocketAdapter(underlyingSocket: Flow[String, String, _]) {
def sendMessage(msg: MessageType): Future[ResponseType]
}
This is not necessarily trivial because there may be other messages in the socket stream that are irrelevant, so some filtering is required.
In order to test the class I need to provide something like a "TestFlow" analogous to TestSink and TestSource. In fact I can create a flow by combining both. However, the problem is that I only obtain the actual probes upon materialization and materialization happens inside the class under test.
The problem is similar to the one I described in this question. My problem would be solved if I could materialize the flow first and then pass it to a client to connect to it. Again, I'm thinking about using MergeHub and BroadcastHub and again I see the problem that the resulting stream would behave differently because it is not linear anymore.
Maybe I misunderstood how a Flow is supposed to be used. In order to feed messages into the flow when sendMessage() is called, I need a certain kind of Source anyway. Maybe a Source.actorRef(...) or Source.queue(...), so I could pass in the ActorRef or SourceQueue directly. However, I'd prefer if this choice was up to the SocketAdapter class. Of course, this applies to the Sink as well.
It feels like this is a rather common case when working with streams and sockets. If it is not possible to create the kind of "TestFlow" I need, I'm also happy with advice on how to improve my design and make it more testable.
Update: I browsed through the documentation and found SourceRef and SinkRef. It looks like these could solve my problem but I'm not sure yet. Is it reasonable to use them in my case or are there any drawbacks, e.g. different behaviour in the test compared to production where there are no such refs?
Indirect Answer
The nature of your question suggests a design flaw which you are bumping into at testing time. The answer below does not address the issue in your question, but it demonstrates how to avoid the situation altogether.
Don't Mix Business Logic with Akka Code
Presumably you need to test your Flow because you have mixed a substantial amount of logic into the materialization. Let's assume you are using raw sockets for your IO. Your question suggests that your flow looks like:
val socketFlow : Flow[String, String, _] = {
val socket = new Socket(...)
//business logic for IO
}
You need a complicated test framework for your Flow because your Flow itself is also complicated.
Instead, you should separate out the logic into an independent function that has no akka dependencies:
type MessageProcessor = MessageType => ResponseType
object BusinessLogic {
  val createMessageProcessor : (Socket) => MessageProcessor = {
    //business logic for IO
  }
}
Now your flow can be very simple:
val socket : Socket = new Socket(...)
val socketFlow = Flow[MessageType].map(BusinessLogic.createMessageProcessor(socket))
As a result, your unit tests can work exclusively with createMessageProcessor. There is no need to test the Akka Flow, because it is a thin veneer around the complicated logic, which is tested independently.
Don't Use Streams For Concurrency Around 1 Element
The other big problem with your design is that SocketAdapter is using a stream to process just 1 message at a time. This is incredibly wasteful and unnecessary (you're trying to kill a mosquito with a tank).
Given the separated business logic your adapter becomes much simpler and independent of akka:
class SocketAdapter(messageProcessor : MessageProcessor) {
  def sendMessage(msg: MessageType): Future[ResponseType] = Future {
    messageProcessor(msg)
  }
}
Note how easy it is to use Future in some instances and Flow in other scenarios depending on the need. This comes from the fact that the business logic is independent of any concurrency framework.
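As a rough, self-contained illustration of that separation, the sketch below wires a pure function into the adapter and exercises it with nothing but scala.concurrent. The String aliases for MessageType and ResponseType, the echoProcessor function, and the AdapterSketch wrapper are placeholders, assumed here because the real types and socket logic are not shown.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object AdapterSketch {
  // Placeholder types; the original does not define MessageType or ResponseType.
  type MessageType = String
  type ResponseType = String
  type MessageProcessor = MessageType => ResponseType

  // Stand-in for BusinessLogic.createMessageProcessor(socket): a pure function, no socket.
  val echoProcessor: MessageProcessor = msg => s"echo: $msg"

  class SocketAdapter(messageProcessor: MessageProcessor)(implicit ec: ExecutionContext) {
    // Runs the processor on the supplied execution context.
    def sendMessage(msg: MessageType): Future[ResponseType] = Future(messageProcessor(msg))
  }

  // Sends one message and blocks for the answer, as a test might do.
  def demo(): ResponseType = {
    import ExecutionContext.Implicits.global
    val adapter = new SocketAdapter(echoProcessor)
    Await.result(adapter.sendMessage("ping"), 2.seconds)
  }
}
```

The processor can be unit tested as a plain function, and the adapter test needs no stream probes at all.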
This is what I came up with using SinkRef and SourceRef:
object TestFlow {
  def withProbes[In, Out](implicit actorSystem: ActorSystem,
                          actorMaterializer: ActorMaterializer)
      : (Flow[In, Out, _], TestSubscriber.Probe[In], TestPublisher.Probe[Out]) = {
    val f = Flow.fromSinkAndSourceMat(TestSink.probe[In], TestSource.probe[Out])(Keep.both)
    val ((sinkRefFuture, (inProbe, outProbe)), sourceRefFuture) =
      StreamRefs.sinkRef[In]()
        .viaMat(f)(Keep.both)
        .toMat(StreamRefs.sourceRef[Out]())(Keep.both)
        .run()
    val sinkRef = Await.result(sinkRefFuture, 3.seconds)
    val sourceRef = Await.result(sourceRefFuture, 3.seconds)
    (Flow.fromSinkAndSource(sinkRef, sourceRef), inProbe, outProbe)
  }
}
This gives me a flow I can completely control with the two probes but I can pass it to a client that connects source and sink later, so it seems to solve my problem.
The resulting Flow should only be used once, so it differs from a regular Flow that is rather a flow blueprint and can be materialized several times. However, this restriction applies to the web socket flow I am mocking anyway, as described here.
The only issue I still have is that some warnings are logged when the ActorSystem terminates after the test. This seems to be due to the indirection introduced by the SinkRef and SourceRef.
Update: I found a better solution without SinkRef and SourceRef by using mapMaterializedValue():
def withProbesFuture[In, Out](implicit actorSystem: ActorSystem,
                              ec: ExecutionContext)
    : (Flow[In, Out, _],
       Future[(TestSubscriber.Probe[In], TestPublisher.Probe[Out])]) = {
  val (sinkPromise, sourcePromise) =
    (Promise[TestSubscriber.Probe[In]], Promise[TestPublisher.Probe[Out]])
  val flow =
    Flow
      .fromSinkAndSourceMat(TestSink.probe[In], TestSource.probe[Out])(Keep.both)
      .mapMaterializedValue { case (inProbe, outProbe) =>
        sinkPromise.success(inProbe)
        sourcePromise.success(outProbe)
        ()
      }
  val probeTupleFuture = sinkPromise.future
    .flatMap(sink => sourcePromise.future.map(source => (sink, source)))
  (flow, probeTupleFuture)
}
When the class under test materializes the flow, the Future is completed and I receive the test probes.
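The promise handling can be tried in isolation from Akka. The sketch below is only an illustration of that part: it uses plain String and Int values in place of the probes, completes the promises by hand where the materialization callback would, and combines the two futures with Future.zip, which should be equivalent to the flatMap/map chain above. PromiseZipSketch, both, and demo are made-up names.

```scala
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._

object PromiseZipSketch {
  // Pairs the futures of two promises; the result completes once both are fulfilled.
  def both[A, B](a: Promise[A], b: Promise[B]): Future[(A, B)] =
    a.future.zip(b.future)

  def demo(): (String, Int) = {
    val sinkPromise = Promise[String]()
    val sourcePromise = Promise[Int]()
    val combined = both(sinkPromise, sourcePromise)
    // Stands in for the mapMaterializedValue callback that fulfils the promises.
    sinkPromise.success("inProbe")
    sourcePromise.success(1)
    Await.result(combined, 2.seconds)
  }
}
```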
Lately I've found myself wrapping actors in classes so that I get back a little of the typesafety I lose when dealing with ActorRefs.
The problem, in the end, is that not only do I need to send a specific message, I also need to cast the response to the expected type.
So I thought that I could send actors messages that contain a Promise, so that they could report the result eventually.
Is that a bad idea? It looks pretty neat to me. It is type-safe and works just as well. Why hasn't anyone come up with the idea? Is there anything wrong with it that I haven't noticed?
ask pattern based solution
case class GetUser(id: Long)
(actorRef ? GetUser(1L)).mapTo[User]
class UserRepoActor extends Actor {
  def receive = {
    case GetUser(id) =>
      sender() ! getUser(id)
  }
  ...
}
Promise based solution
case class GetUser(id: Long, resp: Promise[User])
val req = GetUser(1L, Promise())
actorRef ! req
req.resp.future // No casting!!
class UserRepoActor extends Actor {
  def receive = {
    case GetUser(id, resp) =>
      resp.success(getUser(id))
  }
  ...
}
There is nothing wrong with it. A very similar approach is used in Akka Typed, the only difference being that a single-use ActorRef[T] is sent instead of a Promise[T].
Promises won't work in a distributed actor system.
At least, not without additional effort.
The ask pattern is definitely better.
1) Actors are supposed to share no state and to interact with the outside world via messages. Fulfilling the promise actually mutates a shared variable.
2) Passing stateful objects (e.g. a promise) into an actor's creator breaks the actor's lifecycle in case of restarts.
So the promise-based approach works in simple cases. But if you use it just like that, you probably don't need something as complicated as Akka at all.
I am writing a program that has to interact with a library that was implemented using Akka. In detail, this library exposes an Actor as endpoint.
As far as I know, and as explained in the book Applied Akka Patterns, the best way to interact with an Actor system from the outside is the Ask Pattern.
The library I have to use exposes an actor Main that accepts a Create message. In response to this message, it can respond with two different messages to the caller, CreateAck and CreateNack(error).
The code I am using is more or less the following.
implicit val timeout = Timeout(5 seconds)
def create() = (mainActor ? Create).mapTo[???]
The problem is clearly that I do not know which type to use in the mapTo function in place of ???.
Am I using the right approach? Is there any other useful pattern for accessing an Actor System from an outside program that does not use Actors?
In general it's best to let Actors talk to Actors; then you simply receive the response as a message.
If you indeed have to integrate them with the "outside", the ask pattern is fine. Please note though that if you're doing this inside an Actor, this perhaps isn't the best way to go about it.
If there are a number of unrelated response types, I'd suggest:
(1) Introduce a common type; this can be as simple as:
sealed trait CreationResponse
final case object CreatedThing extends CreationResponse
final case class FailedCreationOfThing(t: Throwable) extends CreationResponse
final case class SomethingElse...(...) extends CreationResponse
which makes the protocol understandable, and trackable. I recommend this as it's explicit and helps in understanding what's going on.
(2) For completely unrelated types simply collecting over the future would work by the way, without doing the mapTo:
val res: Future[...] = (bob ? CreateThing) collect {
case t: ThatWorked => t // or transform it
case nope: Nope => nope // or transform it to a different value
}
This would work fine type-wise if the results t and nope have a common supertype; that type would then be the ... in the resulting Future. If a message comes back that does not match any case, the future fails (Future.collect reports this as a NoSuchElementException rather than a MatchError); you could add a case _ => whatever to handle that, or treat it as pointing to a programming error.
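Applied to the Create example from the question, the collect approach might look like the sketch below. It runs without Akka by treating a plain Future[Any] as the result of mainActor ? Create. CreateAck and CreateNack are redefined locally as stand-ins (their real definitions live in the library and are not shown), and CreateResult, Created, CreationFailed, toResult and demo are hypothetical names.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object CreateProtocolSketch {
  // Hypothetical result type owned by the caller, not by the library.
  sealed trait CreateResult
  case object Created extends CreateResult
  final case class CreationFailed(error: String) extends CreateResult

  // Stand-ins for the library's replies; the real definitions are not shown in the question.
  case object CreateAck
  final case class CreateNack(error: String)

  // `reply` plays the role of the untyped Future[Any] returned by `mainActor ? Create`.
  def toResult(reply: Future[Any]): Future[CreateResult] =
    reply.collect {
      case CreateAck         => Created
      case CreateNack(error) => CreationFailed(error)
    }

  // Feeds an already-available reply through toResult and waits for the outcome.
  def demo(reply: Any): CreateResult =
    Await.result(toResult(Future.successful(reply)), 2.seconds)
}
```

The caller then works with Future[CreateResult] and never sees Any.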
See whether CreateAck and CreateNack(error) inherit from a common class or trait. If that's the case, you can use the parent type in .mapTo[CreateResultType].
Another solution is to use .mapTo[Any] and use a match case to find the resulting type.
Depending on a reply from a Scala Actor seems incredibly error-prone to me. Is this truly the idiomatic Scala way to have conversations between actors? Is there an alternative, or a safer use of reply that I'm missing?
(About me: I'm familiar with synchronization in Java, but I've never designed an actor-based system before and don't yet have a full understanding of the paradigm.)
Example mistakes
For a trivial demonstration, let's look at this silly integer-parsing Actor:
import actors._, Actor._
val a = actor {
  loop {
    react {
      case s: String => reply(s.toInt)
    }
  }
}
We could intend to use this as
scala> a !? "42"
res0: Any = 42
But if the actor fails to reply (in this case because a careless programmer did not think to catch NumberFormatException in the actor), we'll be waiting forever:
scala> a !? "f"
We could also make a mistake at the call site. This next example also blocks indefinitely, because the actor does not reply to Int messages:
scala> a !? 42
Timeout
You could use !? (msec: Long, msg: Any) if the expected reply has some known reasonable time bound, but that is not the case in most circumstances I can think of.
Guaranteeing reply
One thought would be to design that actor such that it necessarily replies to every message:
import actors._, Actor._
val a = actor {
  loop {
    react {
      // Always send exactly one reply: the parsed value, None, or the exception.
      case m => reply {
        try {
          m match {
            case s: String => s.toInt
            case _ => None
          }
        } catch {
          case e: Exception => e
        }
      }
    }
  }
}
This feels better, although there is still a little fear of accidentally invoking !? on an actor that is no longer acting.
I can see your concerns, but I would actually argue that this is not any worse than the synchronization you are used to. Who guarantees that the locks will ever be released again?
Using !? is at your own risk, so no, there are no 'safer' uses that I am aware of. Threads can block or die and there is absolutely nothing we can do about it, except provide safety valves that soften the blow.
Event-based actors actually give you alternatives to receiving replies synchronously. The timeout is one of them; another is Futures, via the !! method, which are designed to handle situations like this. The method immediately returns a future that can be handled later.
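As a rough illustration of the "safety valve" idea, the sketch below uses scala.concurrent.Future (the Scala 2.10 API, not the old actors' !! method) to wait for a reply for a bounded time and to turn a timeout into an ordinary value. TimeoutSketch, awaitReply and neverReplies are hypothetical names, and a never-completed Promise stands in for an actor that fails to reply.

```scala
import java.util.concurrent.TimeoutException
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

object TimeoutSketch {
  // Waits a bounded time for a reply and reports a timeout as a value instead of hanging.
  def awaitReply[A](reply: Future[A], limit: FiniteDuration): Either[String, A] =
    Try(Await.result(reply, limit)) match {
      case Success(value)               => Right(value)
      case Failure(_: TimeoutException) => Left("timed out")
      case Failure(other)               => Left(s"failed: ${other.getMessage}")
    }

  // A future that is never completed models an actor that never replies.
  def neverReplies: Future[Int] = Promise[Int]().future

  def demoTimeout(): Either[String, Int] = awaitReply(neverReplies, 100.millis)
  def demoSuccess(): Either[String, Int] = awaitReply(Future.successful(42), 1.second)
}
```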
For inspiration and more in-depth design decisions see:
Actors:
http://docs.scala-lang.org/overviews/core/actors.html
Futures (in scala 2.10):
http://docs.scala-lang.org/sips/pending/futures-promises.html
Don't bother with the old local actors; learn Akka. It's also good that you know about synchronized, but personally I almost never use it, even in Java code. Imagine synchronized is deprecated, and learn the Java memory model and CAS.
I am not familiar with the Actor system in the Scala standard library myself, but I highly recommend checking out the Akka toolkit (http://akka.io/) which has "replaced" the Scala Actors and comes with the Scala distribution as of Scala 2.10.
In terms of Actor system design in general, some of the key ideas are asynchronous (non-blocking) execution, isolated mutability, and communication via message passing. Each Actor encapsulates its own state; nobody else is allowed to touch it. You can send an Actor a message that "asks" it to change state, but the Actor implementation is free to ignore it. Messages are sent asynchronously (you CAN make sending blocking, but it is not recommended). If you want some sort of "response" (so that you can associate a message with a previously sent message), the Future API in Scala 2.10 and Akka's ask can help.
Regarding your NumberFormatException and the problem in general, consider looking at ask in Akka 2.1 and the Future API in Scala 2.10. They handle exceptions and are non-blocking.
Scala 2.10 also has a new Try type that is intended as an alternative to old-fashioned try-catch clauses. Try has an apply method that you use like a try block (minus the catch and finally), and it has two subclasses, Success and Failure: a Try[T] is either a Success[T] holding the result or a Failure[T] holding the Throwable that was thrown. It is easier to explain by example:
scala> val x: Try[Int] = Try { "5".toInt } // Success(5)
scala> val y: Try[Int] = Try { "foo".toInt } // Failure(java.lang.NumberFormatException: For input string: "foo")
Since Try does not throw the actual exception and the subclasses are conveniently case-classes, you could easily use the result as a message to an Actor.
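For instance, the integer-parsing example from the question could return a Try, and whoever receives it can pattern match on the outcome. This is only a sketch using the standard library; ParseSketch, parse and describe are made-up names, and no actor is involved.

```scala
import scala.util.{Failure, Success, Try}

object ParseSketch {
  // Wraps the parse so that bad input yields a Failure value instead of a thrown exception.
  def parse(s: String): Try[Int] = Try(s.toInt)

  // What a receiver of the result could do with it.
  def describe(result: Try[Int]): String = result match {
    case Success(n) => s"parsed $n"
    case Failure(e) => s"could not parse: ${e.getClass.getSimpleName}"
  }
}
```

An actor that replies with parse(s) always sends exactly one reply, whether or not the input was a valid number.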
I would like to know if it's possible (and how) to get an akka actor to receive messages from stdin. Essentially, the idea would be for every line of input to be sent as a message to the actor, e.g.
> myprogram
DO X
DO Y
...
and then to have the actor receive messages "DO X", "DO Y", etc.
Is there a standard solution to do this?
I guess one way would be to do this:
spawn {
while(in.available) {
actor ! in.readLine
}
}
But then I'd have two actors (or one actor-based task and one actor) and I'd be using blocking IO (is that safe with actors, by the way?)... Also, it makes it harder to control the spawn block (e.g. to kill the task).
Added further follow ups from OP
I have a couple follow ups, if you will allow me...
Is there a performance hit using this solution (i.e. does CamelServiceManager start a lot of things? HTTP server, etc.)?
Got a good tutorial for beginners? I started reading Camel from the official Akka documentation, but it seems to assume more knowledge of Camel than I currently possess. For instance, I couldn't figure out how to use a custom java.io.InputStream as endpointUri.
You could use akka-camel together with the camel-stream component to let actors receive messages from stdin. Here's a working example:
import akka.actor.Actor
import akka.camel.{Message, CamelServiceManager, Consumer}
object Example extends App {
  CamelServiceManager.startCamelService
  Actor.actorOf[ExampleConsumer].start
}
class ExampleConsumer extends Actor with Consumer {
  def endpointUri = "stream:in"
  def receive = {
    case msg: Message => println("received %s" format msg.bodyAs[String])
  }
}
Update: Answers to the follow-up questions
The CamelServiceManager.startCamelService method starts a CamelContext and two Akka actors that register newly started Consumer actor endpoints at the CamelContext. No HTTP server is started.
Good introductions to Apache Camel are the article "Apache Camel: Integration Nirvana" and chapter 1 of the book Camel in Action. Appendix E of Camel in Action is an introduction to akka-camel.
Setting a custom InputStream at the endpoint URI is currently not possible with the camel-stream component.
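If a custom InputStream is needed, one option outside Camel might be a dedicated reader thread that forwards each line to the actor. The sketch below uses only the JDK and is not part of akka-camel; LineForwarder, forward, start and demo are hypothetical names, and the deliver callback is where something like line => actor ! line would go.

```scala
import java.io.{BufferedReader, ByteArrayInputStream, InputStream, InputStreamReader}
import scala.collection.mutable.ListBuffer

object LineForwarder {
  // Reads lines until end of input and hands each one to `deliver`
  // (for an actor this could be `line => actor ! line`).
  def forward(in: InputStream, deliver: String => Unit): Unit = {
    val reader = new BufferedReader(new InputStreamReader(in, "UTF-8"))
    Iterator.continually(reader.readLine()).takeWhile(_ != null).foreach(deliver)
  }

  // Runs the blocking read loop on its own daemon thread, away from any actor dispatcher.
  def start(in: InputStream, deliver: String => Unit): Thread = {
    val thread = new Thread(new Runnable { def run(): Unit = forward(in, deliver) })
    thread.setDaemon(true)
    thread.start()
    thread
  }

  // Feeds two lines from an in-memory stream and collects what was delivered.
  def demo(): List[String] = {
    val received = ListBuffer.empty[String]
    val input = new ByteArrayInputStream("DO X\nDO Y\n".getBytes("UTF-8"))
    start(input, line => received.synchronized { received += line }).join()
    received.toList
  }
}
```

Passing System.in as the stream would give the stdin behaviour from the question; the thread ends when the stream reaches end of input.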