How to run Akka Future using given SecurityManager? - scala

For an open-source multiplayer programming game written in Scala that loads players' bot code via a plug-in system from .jar files, I'd like to prevent the code of the bots from doing harm on the server system by running them under a restrictive SecurityManager implementation.
The current implementation uses a URLClassLoader to extract a control function factory for each bot from its related plug-in .jar file. The factory is then used to instantiate a bot control function instance for each new round of the game. Then, once per simulation cycle, all bot control functions are invoked concurrently to get the bot's responses to their environment. The concurrent invocation is done using Akka's Future.traverse() with an implicitly provided ActorSystem shared by other concurrently operating components (compile service, web server, rendering):
val future = Future.traverse(bots)(bot => Future { bot.respondTo(state) })
val result = Await.result(future, Duration.Inf)
To restrict potentially malicious code contained in the bot plug-ins from running, it appears that following the paths taken in this StackOverflow question and this one I need to have the bot control functions execute in threads running under an appropriately restrictive SecurityManager implementation.
Now the question: how can I get Akka to process the work currently done in Future.traverse() with actors running in threads that have the desired SecurityManager, while the other Actors in the system, e.g. those running the background compile service, continue to run as they do now, i.e. unrestricted?

You can construct an instance of ExecutionContext (e.g. via ExecutionContext.fromExecutorService) that runs all work under the restrictive security manager, and bring it into implicit scope for Future.apply and Future.traverse.
If the invoked functions do not need to interact with the environment, I don't think you need a separate ActorSystem.
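As a rough sketch of that suggestion, the snippet below builds a dedicated ExecutionContext from a fixed thread pool and passes it implicitly to Future.traverse. It uses only the Scala and Java standard libraries. The names SandboxedBots, botThreadFactory, botContext and runAll, the pool size, and the "bot-sandbox-" thread-name prefix are all hypothetical. No SecurityManager is installed here: since a JVM has a single installed SecurityManager, a restrictive implementation would presumably have to recognise the bot threads (for example by name or thread group), and that part is an assumption left out of the sketch.

```scala
import java.util.concurrent.{Executors, ThreadFactory}
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

object SandboxedBots {
  private val counter = new AtomicInteger(0)

  // Hypothetical thread factory: gives every bot thread a recognisable name
  // so that a restrictive SecurityManager could, in principle, identify it.
  private val botThreadFactory: ThreadFactory = new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "bot-sandbox-" + counter.incrementAndGet())
      t.setDaemon(true) // daemon threads do not keep the JVM alive
      t
    }
  }

  // Dedicated context for bot code only; other components keep their own.
  val botContext: ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(4, botThreadFactory))

  // Stand-in for bot.respondTo(state): reports which thread ran each task.
  def runAll(inputs: List[Int]): List[String] = {
    implicit val ec: ExecutionContext = botContext
    val future = Future.traverse(inputs)(_ => Future { Thread.currentThread().getName })
    Await.result(future, 10.seconds)
  }
}
```

Because the context is only implicit inside runAll, futures created elsewhere (compile service, web server) continue to use whatever ExecutionContext they already had.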

Related

Usage of akka ActorContext in integration tests

We're migrating our project to typed akka Actors. We have services that can create actors at will, something like:
class Service(ac: ActorContext[_]) {
  def action() = {
    ac.spawn(MyBehavior().receive)
  }
}
The problem arises when we're trying to test this service. It seems that we should only use testKit.spawn or testKit.createTestProbe methods. It works fine for unit-testing our classes.
But there is no way to get ActorContext[_] to pass it to services that are being tested.
So it seems that we are misusing the ActorContext in our service. But how can a service create a new actor?
I have an idea of passing a special ActorRef[CreateActor] that can create those actors. This way we can remove the dependency on ActorContext[_] in the Service, but I wanted to see if there is a better option.
First of all, you should not be passing ActorContext around, as many of the methods exposed in context are not thread safe.
You can create a single ActorSystem[SpawnProtocol.Command] per JVM and pass that around to your services.
Then you can send a Spawn command, which takes a typed actor behavior, to the injected actorSystem.
val echoActor: Future[ActorRef[Echo]] = actorSystem ? { replyTo: ActorRef[ActorRef[Echo]] => Spawn(echoBehavior, "echo-actor", Props.empty, replyTo) }
Do you really need to create actors from the service though?
Maybe you can have top level wiring where you create all the necessary actors and then inject ActorRef to your services.
Of course, if your actors are short-lived and get created and destroyed on demand, then what I suggested above makes sense.

Not calling Cluster.close() with the Cassandra Java driver causes application to be "zombie" and not exit

When my connection is open, the application won't exit. This causes some nasty problems for me (the code is highly concurrent and nested, uses a shared session, and I don't know when each part is finished). Is there a way to make sure that the cluster doesn't "hang" the application?
For example here:
object ZombieTest extends App {
  val session = Cluster.builder().addContactPoint("localhost").build().connect()
  // app doesn't exit unless doing:
  session.getCluster.close() // won't exit unless this is called
}
In a slightly biased answer, you could look at https://github.com/outworkers/phantom instead of using the standard java driver.
You get scala.concurrent.Future, monix.eval.Task or even com.twitter.util.Future from a query automatically. You can choose between all three.
DB connection pools are better isolated inside ContactPoint and Database abstraction layers, which have shutdown methods you can easily wire in to your app lifecycle.
It's far faster than the Java driver, as the serialization and de-serialization of types is wired in at compile time via more advanced macro mechanisms.
The short answer is that you need a lifecycle hook that calls session.close or session.closeAsync when you shut down everything else; that is how the driver is designed to work.
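One way to wire such a close call into the application lifecycle is a small loan-pattern helper that always closes the resource when the body finishes. This is a minimal sketch using only the standard library. FakeCluster is a hypothetical stand-in so the snippet runs without the Cassandra driver, and withResource is an illustrative name rather than part of any driver API.

```scala
object LifecycleSketch {
  // Hypothetical stand-in for a driver object that must be closed explicitly.
  final class FakeCluster extends AutoCloseable {
    @volatile var closed: Boolean = false
    def close(): Unit = { closed = true }
  }

  // Runs the body and always closes the resource, even if the body throws.
  def withResource[R <: AutoCloseable, A](resource: R)(body: R => A): A =
    try body(resource)
    finally resource.close()
}
```

When the shared session is used from many concurrent parts, the body passed to such a helper would have to wait for all of them to finish before returning. That coordination is not shown here.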

How does spray.routing.HttpService dispatch requests?

Disclaimer: I have no scala experience for now, so my question is connected with very basics.
Consider the following example (it may be incomplete):
import akka.actor.{ActorSystem, Props}
import akka.io.IO
import spray.can.Http
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.duration._
import akka.actor.Actor
import spray.routing._
import spray.http._
object Boot extends App {
  implicit val system = ActorSystem("my-actor-system")
  val service = system.actorOf(Props[MyActor], "my")
  implicit val timeout = Timeout(5.seconds)
  IO(Http) ? Http.Bind(service, interface = "localhost", port = 8080)
}
class MyActor extends Actor with MyService {
  def actorRefFactory = context
  def receive = runRoute(myRoute)
}
trait MyService extends HttpService {
  val myRoute =
    path("my") {
      post {
        complete {
          "PONG"
        }
      }
    }
}
My question is: what actually happens when control reaches complete block? The question seems to be too general, so let me split it.
1. I see creation of a single actor in the example. Does it mean that the application is single-threaded and uses only one CPU core?
2. What happens if I make a blocking call inside complete?
3. If p. 1 is true and p. 2 will block, how do I dispatch requests to utilize all CPUs? I see two ways: actor per request and actor per connection. The second one seems reasonable, but I cannot find a way to do it using the spray library.
4. If the previous question is irrelevant, will the detach directive do? And what about passing a function returning a Future to the complete directive? What is the difference between detach and passing a function returning a Future?
5. What is the proper way to configure the number of worker threads and balance requests/connections?
It would be great if you could point me to explanations in the official documentation. It is very extensive and I believe I am missing something.
Thank you.
It's answered here by Mathias - one of the Spray authors. Copying his reply for the reference:
In the end the only thing that really completes the request is a call
to requestContext.complete. Thereby it doesn't matter which thread
or Actor context this call is made from. All that matters is that it
does happen within the configured "request-timeout" period. You can of
course issue this call yourself in some way or another, but spray
gives you a number of pre-defined constructs that maybe fit your
architecture better than passing the actual RequestContext around.
Mainly these are:
The complete directive, which simply provides some sugar on top of the "raw" ctx => ctx.complete(…) function literal.
The Future Marshaller, which calls ctx.complete from a future.onComplete handler.
The produce directive, which extracts a function T => Unit that can later be used to complete the request with an instance of a custom type.
Architecturally, in most cases, it's a good idea to not have the API
layer "leak into" the core of your application. I.e. the application
should not know anything about the API layer or HTTP. It should only
deal with objects of its own domain model. Therefore passing the
RequestContext directly to the application core is mostly not the best
solution.
Resorting to the "ask" and relying on the Future Marshaller is an
obvious, well understood and rather easy alternative. It comes with
the (small) drawback that an ask comes with a mandatory timeout check
itself which logically isn't required (since the spray-can layer
already takes care of request timeouts). The timeout on the ask is
required for technical reasons (so the underlying PromiseActorRef can
be cleaned up if the expected reply never comes).
Another alternative to passing the RequestContext around is the
produce directive (e.g. produce(instanceOf[Foo]) { completer =>
…). It extracts a function that you can pass on to the application
core. When your core logic calls complete(foo) the completion logic
is run and the request completed. Thereby the application core remains
decoupled from the API layer and the overhead is minimal. The
drawbacks of this approach are twofold: first the completer function
is not serializable, so you cannot use this approach across JVM
boundaries. And secondly the completion logic is now running directly
in an actor context of the application core, which might change
runtime behavior in unwanted ways if the Marshaller[Foo] has to do
non-trivial tasks.
A third alternative is to spawn a per-request actor in the API layer
and have it handle the response coming back from the application core.
Then you do not have to use an ask. Still, you end up with the same
problem that the PromiseActorRef underlying an ask has: how to clean
up if no response ever comes back from the application core? With a
per-request actor you have full freedom to implement a solution for
this question. However, if you decide to rely on a timeout (e.g. via
context.setReceiveTimeout) the benefits over an "ask" might be
non-existent.
Which of the described solutions best fits your architecture you need
to decide yourself. However, as I hopefully was able to show, you do
have a couple of alternatives to choose from.
To answer some of your specific questions: there is only a single actor/handler servicing the route, so if you make it block, Spray will block. You therefore want to either complete the route immediately or dispatch the work using one of the three options above.
There are many examples on the web for these 3 options. The easiest is to wrap your code in a Future. Check also "actor per request" option/example. In the end your architecture will define the most appropriate way to go.
Finally, Spray runs on top of Akka, so all Akka configuration still applies. See HOCON reference.conf and application.conf for Actor threading settings.
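To illustrate the "wrap your code in a Future" option without depending on spray, here is a minimal standard-library sketch. The names DetachSketch, blockingPool, slowLookup and handle, and the pool size of 8, are hypothetical. slowLookup stands in for whatever blocking call the route would otherwise make. In a spray route the resulting Future would then be handed to complete, which, per the quoted reply, goes through the Future Marshaller.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, ExecutionContextExecutorService, Future}

object DetachSketch {
  // Dedicated pool for blocking work, so the thread running the single
  // route-handling actor is not tied up while the call is in progress.
  private val blockingPool: ExecutionContextExecutorService =
    ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(8))

  // Hypothetical blocking call, e.g. a database or remote lookup.
  private def slowLookup(): String = { Thread.sleep(50); "PONG" }

  // Returns immediately; the future later holds the response body together
  // with the name of the pool thread that produced it.
  def handle(): Future[(String, String)] =
    Future { (slowLookup(), Thread.currentThread().getName) }(blockingPool)

  // Pool threads are non-daemon, so shut the pool down on application exit.
  def shutdown(): Unit = blockingPool.shutdown()
}
```

The pool size bounds how many blocking calls can run at once. Whether a separate pool or an Akka dispatcher configured in application.conf is the better fit depends on the rest of the application.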

Synchronous Scala Future without separate thread

I'm building a library that, as part of its functionality, makes HTTP requests. To get it to work in the multiple environments it'll be deployed in I'd like it to be able to work with or without Futures.
One option is to have the library parametrise the type of its response so you can create an instance of the library with type Future, or an instance with type Id, depending on whether you are using an asynchronous HTTP implementation. (Id might be an Identity monad - enough to expose a consistent interface to users)
I've started with that approach but it has got complicated. What I'd really like to do instead is use the Future type everywhere, boxing synchronous responses in a Future where necessary. However, I understand that using Futures will always entail some kind of threadpool. This won't fly in e.g. AppEngine (a required environment).
Is there a way to create a Future from a value that will be executed on the current thread and thus not cause problems in environments where it isn't possible to spawn threads?
(p.s. as an additional requirement, I need to be able to cross build the library back to Scala v2.9.1 which might limit the features available in scala.concurrent)
From what I understand, you wish to execute something and then wrap the result in a Future. In that case, you can always use Promise:
val p = Promise[Int]()
p success 42
val f = p.future
You now have a Future wrapping the final value 42.
Promise is very well explained here.
Take a look at the Scalaz version of Future. It is built on a trampoline mechanism and executes on the current thread unless fork or apply is called; it also removes the need for any ExecutionContext imports.
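With scala.concurrent (Scala 2.10 and later), Future.successful gives the same kind of already-completed future without going through a Promise explicitly. Combinators such as map still require an ExecutionContext, so the sketch below also defines one that runs each task on the calling thread. The names SameThreadSketch, sameThread, boxed and threadName are hypothetical, and this is an illustration of the idea rather than a recommendation for every environment.

```scala
import scala.concurrent.{ExecutionContext, Future}

object SameThreadSketch {
  // Runs every task immediately on the thread that submits it:
  // no thread pool is created and no new thread is started.
  implicit val sameThread: ExecutionContext = new ExecutionContext {
    def execute(runnable: Runnable): Unit = runnable.run()
    def reportFailure(cause: Throwable): Unit = cause.printStackTrace()
  }

  // Already-completed future; creating it involves no ExecutionContext.
  val boxed: Future[Int] = Future.successful(42)

  // The callback is submitted to sameThread, so it runs on the caller's thread.
  val threadName: Future[String] = boxed.map(_ => Thread.currentThread().getName)
}
```

One caveat of such a context is that long chains of callbacks run on the caller's stack, so deeply nested compositions can grow the stack.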

Is the Scala compiler reentrant?

For a multi-player programming game, I'm working on a background compilation server for Scala that supports compilation of multiple, independent source trees submitted by the players. I succeeded in running fast, sequential compilations without reloading the compiler by instantiating the Global compiler object via
val compilerGlobal = new Global(settings, reporter)
and then running individual compile jobs via
val run = new compilerGlobal.Run
run.compile(sourceFilePathList)
I would now ideally like to parallelize the server (i.e. make multiple compilation runs concurrently), but still without reloading the compiler (primarily to avoid re-parsing the lib) from scratch each time. Is this possible, i.e. is the second part shown above (safely :-) re-entrant, or does it hold global state? If not, is there something else I can try? I am currently focused on supporting Scala 2.9.1.
Yes, compiler Runs share state, so you should not share them between threads. It's one of the issues that comes up in the Eclipse plugin. As @EJP noted, the symbol table is shared.
This is not so important in your case, but comes up in an IDE: the compiler uses laziness in types, meaning additional computation (and mutation) may happen when calling methods on Symbol. Because of visibility issues, it's important that these methods are called on the same thread as the one that created them.