I'm trying to write a simple matrix multiplication program with concurrent processing using Scala and Akka actors. I've not even written 10% of the code and I'm running into trouble. I created two actors - master and worker. I'm trying to communicate between them but its runs into an infinite loop. Any suggestions are really appreciated. As you can see, the code below does nothing, it prints 2 10X10 matrices in the master, after that the worker is called. But the worker's workDone message never comes back to the master. I also suspect this has to do something with a warning I'm getting:
patterns after a variable pattern cannot match (inside receive of master for case "masterSend")
import akka.actor.{ActorRef, Actor, ActorSystem, Props}
import scala.Array._
import scala.util.Random
case object masterSend
case object workSend
case object workDone
object MatrixMultiply {
val usage = """
Usage: MainStart <matrix-dimension> <high-value>
"""
def main(args: Array[String]) {
if (args.length != 2) {
println(usage)
System.exit(1)
}
val Dim = args(0).toInt
val Max = args(1).toInt
val system = ActorSystem("ComputeSystem")
val worker = system.actorOf(Props[Worker], name = "worker")
val master = system.actorOf(Props(new Master(Dim, Max, worker)), name = "master")
master ! masterSend
}
class Master(Dim: Int, Max: Int, worker : ActorRef) extends Actor {
def receive = {
case masterSend =>
val r = new Random(34636)
val matrixA = ofDim[Int](Dim,Dim)
val matrixB = ofDim[Int](Dim,Dim)
println("Matrix A: ")
for (i <- 0 to Dim - 1) {
for (j <- 0 to Dim - 1) {
matrixA(i)(j) = r.nextInt(Max)
print(matrixA(i)(j) + " ")
}
println()
}
r.setSeed(23535)
println("Matrix B: ")
for (i <- 0 to Dim - 1) {
for (j <- 0 to Dim - 1) {
matrixB(i)(j) = r.nextInt(Max)
print(matrixB(i)(j) + " ")
}
println()
}
worker ! workSend
case workDone =>
println("Work was done!!")
context.system.shutdown()
}
}
class Worker extends Actor {
def receive = {
case workSend =>
println("Work Done")
sender ! workDone
}
}
}
The problem is with pattern matching on objects you've created. It's matching inproperly. Do not bother yourself with objects. Use strings for example:
object A {
val masterSend = "masterSend"
val workSend = "workSend"
val workDone = "workDone"
}
object MatrixMultiply {
val usage = """
Usage: MainStart <matrix-dimension> <high-value>
"""
def main(args: Array[String]) {
val Dim = 3
val Max = 2
val system = ActorSystem("ComputeSystem")
val worker = system.actorOf(Props[Worker], name = "worker")
val master = system.actorOf(Props(new Master(Dim, Max, worker)), name = "master")
master ! A.masterSend
}
class Master(Dim: Int, Max: Int, worker : ActorRef) extends Actor {
def receive = {
case A.masterSend =>
println("Master sent")
worker ! A.workSend
case A.workDone =>
println("Work was done!!")
context.system.shutdown()
}
}
class Worker extends Actor {
def receive = {
case A.workSend =>
println("Work Done")
sender ! A.workDone
}
}
}
You've named your object from lower case letter.
object messageSend
But pattern matching consider it not as an object but as a some new variable instead.
case messageSend => messageSend - is a variable
You'd be able to write anything here case magicBall => will also compile.
Related
My stream works for smaller file of 1000 lines but stops when I test it on a large file ~12MB and ~250,000 lines? I tried applying backpressure with a buffer and throttling it and still same thing...
Here is my data streamer:
class UserDataStreaming(usersFile: File) {
implicit val system = ActorSystemContainer.getInstance().getSystem
implicit val materializer = ActorSystemContainer.getInstance().getMaterializer
def startStreaming() = {
val graph = RunnableGraph.fromGraph(GraphDSL.create() {
implicit builder =>
val usersSource = builder.add(Source.fromIterator(() => usersDataLines)).out
val stringToUserFlowShape: FlowShape[String, User] = builder.add(csvToUser)
val averageAgeFlowShape: FlowShape[User, (String, Int, Int)] = builder.add(averageUserAgeFlow)
val averageAgeSink = builder.add(Sink.foreach(averageUserAgeSink)).in
usersSource ~> stringToUserFlowShape ~> averageAgeFlowShape ~> averageAgeSink
ClosedShape
})
graph.run()
}
val usersDataLines = scala.io.Source.fromFile(usersFile, "ISO-8859-1").getLines().drop(1)
val csvToUser = Flow[String].map(_.split(";").map(_.trim)).map(csvLinesArrayToUser)
def csvLinesArrayToUser(line: Array[String]) = User(line(0), line(1), line(2))
def averageUserAgeSink[usersSource](source: usersSource) {
source match {
case (age: String, count: Int, totalAge: Int) => println(s"age = $age; Average reader age is: ${Try(totalAge/count).getOrElse(0)} count = $count and total age = $totalAge")
case bad => println(s"Bad case: $bad")
}
}
def averageUserAgeFlow = Flow[User].fold(("", 0, 0)) {
(nums: (String, Int, Int), user: User) =>
var counter: Option[Int] = None
var totalAge: Option[Int] = None
val ageInt = Try(user.age.substring(1, user.age.length-1).toInt)
if (ageInt.isSuccess) {
counter = Some(nums._2 + 1)
totalAge = Some(nums._3 + ageInt.get)
}
else {
counter = Some(nums._2 + 0)
totalAge = Some(nums._3 + 0)
}
//println(counter.get)
(user.age, counter.get, totalAge.get)
}
}
Here is my Main:
object Main {
def main(args: Array[String]): Unit = {
implicit val system = ActorSystemContainer.getInstance().getSystem
implicit val materializer = ActorSystemContainer.getInstance().getMaterializer
val usersFile = new File("data/BX-Users.csv")
println(usersFile.length())
val userDataStreamer = new UserDataStreaming(usersFile)
userDataStreamer.startStreaming()
}
It´s possible that there may be any error related to one row of your csv file. In that case, the stream materializes and stops. Try to define your flows like that:
FlowFlowShape[String, User].map {
case (user) => try {
csvToUser(user)
}
}.withAttributes(ActorAttributes.supervisionStrategy {
case ex: Throwable =>
log.error("Error parsing row event: {}", ex)
Supervision.Resume
}
In this case the possible exception is captured and the stream ignores the error and continues.
If you use Supervision.Stop, the stream stops.
I have a master Actor responsible for initializing some worker actors (there are two types of worker actors, namely, ParamServer actor and DataShard actor). For example, if I initiated 20 datashard actors via ClusterSharding(system).start(_,_,_,_,_) and after that I want to send some message to all datashard actors (say case object ReadyToProcess). I read that I can send messages to entities in Akka Cluster Shard via local ShardRegion(system).shardRegion(_). Is local shardRegion(_) will send to all datashards or just one. How can I send msgs to all datashard actors?
The master class given be:
class Master(ports: Seq[String],
dataSet: Seq[Example],
dataPerReplica: Int,
layerDimensions: Seq[Int],
activation: ActivationFunction,
activationFunctionDer: ActivationFunction,
learningRate: Double) extends Actor with ActorLogging {
val dataShards = dataSet.grouped(dataPerReplica).toSeq
val numLayers = layerDimensions.size
var numShardsFinished = 0
ports foreach { port =>
val config = ConfigFactory.parseString("akka.remote.netty.tcp.port=" + port).withFallback(ConfigFactory.load())
val clusterSystem = ActorSystem("ClusterSystem", config)
val paramServerRegions: Array[ActorRef] = new Array[ActorRef](numLayers - 1)
for (i <- 0 to numLayers - 2) {
paramServerRegions(i) = ClusterSharding(clusterSystem).start(
typeName = ParamServer.shardName,
entityProps = ParamServer.props(i, dataShards.size, learningRate, NeuralNetworkOps.randomMatrix(layerDimensions(i + 1), layerDimensions(i) + 1)),
settings = ClusterShardingSettings(clusterSystem),
extractEntityId = ParamServer.extractEntityId,
extractShardId = ParamServer.extractShardId
)
}
//create actors for each data shard/replica. Each replica needs to know about all parameter shards because they will
//be reading from them and updating them
val dataShardRegions: Array[ActorRef] = new Array[ActorRef](dataShards.size)
for (i <- 0 to dataShards.size) {
dataShardRegions(i) = ClusterSharding(clusterSystem).start(
typeName = DataShard.shardName,
entityProps = DataShard.props(i, clusterSystem, dataShards(i), activation, activationFunctionDer, paramServerRegions),
settings = ClusterShardingSettings(clusterSystem),
extractEntityId = ParamServer.extractEntityId,
extractShardId = ParamServer.extractShardId
)
}
}
def receive: Receive = {
case Start => {
val shardRegionSender = ClusterSharding(context.system).shardRegion(DataShard.shardName)
println("Tomosha boshlandi")
shardRegionSender ! ReadyToProcess
}
case ShardDone(id) => {
numShardsFinished+=1
log.info("")
if (numShardsFinished == dataShards.size) {
context.parent ! JobDone
context.stop(self)
}
}
}
}
I try write some simple akka-http and akka-streams based application, that handle http requests, always with one precompiled stream, because I plan to use long time processing with back-pressure in my requestProcessor stream
My application code:
import akka.actor.{ActorSystem, Props}
import akka.http.scaladsl._
import akka.http.scaladsl.server.Directives._
import akka.http.scaladsl.server._
import akka.stream.ActorFlowMaterializer
import akka.stream.actor.ActorPublisher
import akka.stream.scaladsl.{Sink, Source}
import scala.annotation.tailrec
import scala.concurrent.Future
object UserRegisterSource {
def props: Props = Props[UserRegisterSource]
final case class RegisterUser(username: String)
}
class UserRegisterSource extends ActorPublisher[UserRegisterSource.RegisterUser] {
import UserRegisterSource._
import akka.stream.actor.ActorPublisherMessage._
val MaxBufferSize = 100
var buf = Vector.empty[RegisterUser]
override def receive: Receive = {
case request: RegisterUser =>
if (buf.isEmpty && totalDemand > 0)
onNext(request)
else {
buf :+= request
deliverBuf()
}
case Request(_) =>
deliverBuf()
case Cancel =>
context.stop(self)
}
#tailrec final def deliverBuf(): Unit =
if (totalDemand > 0) {
if (totalDemand <= Int.MaxValue) {
val (use, keep) = buf.splitAt(totalDemand.toInt)
buf = keep
use foreach onNext
} else {
val (use, keep) = buf.splitAt(Int.MaxValue)
buf = keep
use foreach onNext
deliverBuf()
}
}
}
object Main extends App {
val host = "127.0.0.1"
val port = 8094
implicit val system = ActorSystem("my-testing-system")
implicit val fm = ActorFlowMaterializer()
implicit val executionContext = system.dispatcher
val serverSource: Source[Http.IncomingConnection, Future[Http.ServerBinding]] = Http(system).bind(interface = host, port = port)
val mySource = Source.actorPublisher[UserRegisterSource.RegisterUser](UserRegisterSource.props)
val requestProcessor = mySource
.mapAsync(1)(fakeSaveUserAndReturnCreatedUserId)
.to(Sink.head[Int])
.run()
val route: Route =
get {
path("test") {
parameter('test) { case t: String =>
requestProcessor ! UserRegisterSource.RegisterUser(t)
???
}
}
}
def fakeSaveUserAndReturnCreatedUserId(param: UserRegisterSource.RegisterUser): Future[Int] =
Future.successful {
1
}
serverSource.to(Sink.foreach {
connection =>
connection handleWith Route.handlerFlow(route)
}).run()
}
I found solution about how create Source that can dynamically accept new items to process, but I can found any solution about how than obtain result of stream execution in my route
The direct answer to your question is to materialize a new Stream for each HttpRequest and use Sink.head to get the value you're looking for. Modifying your code:
val requestStream =
mySource.map(fakeSaveUserAndReturnCreatedUserId)
.to(Sink.head[Int])
//.run() - don't materialize here
val route: Route =
get {
path("test") {
parameter('test) { case t: String =>
//materialize a new Stream here
val userIdFut : Future[Int] = requestStream.run()
requestProcessor ! UserRegisterSource.RegisterUser(t)
//get the result of the Stream
userIdFut onSuccess { case userId : Int => ...}
}
}
}
However, I think your question is ill posed. In your code example the only thing you're using an akka Stream for is to create a new UserId. Futures readily solve this problem without the need for a materialized Stream (and all the accompanying overhead):
val route: Route =
get {
path("test") {
parameter('test) { case t: String =>
val user = RegisterUser(t)
fakeSaveUserAndReturnCreatedUserId(user) onSuccess { case userId : Int =>
...
}
}
}
}
If you want to limit the number of concurrent calls to fakeSaveUserAndReturnCreateUserId then you can create an ExecutionContext with a defined ThreadPool size, as explained in the answer to this question, and use that ExecutionContext to create the Futures:
val ThreadCount = 10 //concurrent queries
val limitedExecutionContext =
ExecutionContext.fromExecutor(Executors.newFixedThreadPool(ThreadCount))
def fakeSaveUserAndReturnCreatedUserId(param: UserRegisterSource.RegisterUser): Future[Int] =
Future { 1 }(limitedExecutionContext)
I am working on an artificial life simulation with Scala and Akka and so far I've been super happy with both. I am having some issues with timing however that I can't quite explain.
At the moment, each animal in my simulation is a pair of actors (animal + brain). Typically, these two actors take turns (animal sends sensor input to brain, waits for result, acts on it and starts over). Every now and then however, animals need to interact with each other to eat each other or reproduce.
The one thing that is odd to me is the timing. It turns out that sending a message from one animal to another is a LOT slower (about 100x) than sending from animal to brain. This puts my poor predators and sexually active animals at a disadvantage as opposed to the vegetarians and asexual creatures (disclaimer: I am vegetarian myself but I think there are better reasons for being a vegetarian than getting stuck for a bit while trying to hunt...).
I extracted a minimal code snippet that demonstrates the problem:
package edu.blindworld.test
import java.util.concurrent.TimeUnit
import akka.actor.{ActorRef, ActorSystem, Props, Actor}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.Await
import scala.concurrent.duration.Duration
import scala.util.Random
class Animal extends Actor {
val brain = context.actorOf(Props(classOf[Brain]))
var animals: Option[List[ActorRef]] = None
var brainCount = 0
var brainRequestStartTime = 0L
var brainNanos = 0L
var peerCount = 0
var peerRequestStartTime = 0L
var peerNanos = 0L
override def receive = {
case Go(all) =>
animals = Some(all)
performLoop()
case BrainResponse =>
brainNanos += (System.nanoTime() - brainRequestStartTime)
brainCount += 1
// Animal interactions are rare
if (Random.nextDouble() < 0.01) {
// Send a ping to a random other one (or ourselves). Defer our own loop
val randomOther = animals.get(Random.nextInt(animals.get.length))
peerRequestStartTime = System.nanoTime()
randomOther ! PeerRequest
} else {
performLoop()
}
case PeerResponse =>
peerNanos += (System.nanoTime() - peerRequestStartTime)
peerCount += 1
performLoop()
case PeerRequest =>
sender() ! PeerResponse
case Stop =>
sender() ! StopResult(brainCount, brainNanos, peerCount, peerNanos)
context.stop(brain)
context.stop(self)
}
def performLoop() = {
brain ! BrainRequest
brainRequestStartTime = System.nanoTime()
}
}
class Brain extends Actor {
override def receive = {
case BrainRequest =>
sender() ! BrainResponse
}
}
case class Go(animals: List[ActorRef])
case object Stop
case class StopResult(brainCount: Int, brainNanos: Long, peerCount: Int, peerNanos: Long)
case object BrainRequest
case object BrainResponse
case object PeerRequest
case object PeerResponse
object ActorTest extends App {
println("Sampling...")
val system = ActorSystem("Test")
val animals = (0 until 50).map(i => system.actorOf(Props(classOf[Animal]))).toList
animals.foreach(_ ! Go(animals))
Thread.sleep(5000)
implicit val timeout = Timeout(5, TimeUnit.SECONDS)
val futureStats = animals.map(_.ask(Stop).mapTo[StopResult])
val stats = futureStats.map(Await.result(_, Duration(5, TimeUnit.SECONDS)))
val brainCount = stats.foldLeft(0)(_ + _.brainCount)
val brainNanos = stats.foldLeft(0L)(_ + _.brainNanos)
val peerCount = stats.foldLeft(0)(_ + _.peerCount)
val peerNanos = stats.foldLeft(0L)(_ + _.peerNanos)
println("Average time for brain request: " + (brainNanos / brainCount) / 1000000.0 + "ms (sampled from " + brainCount + " requests)")
println("Average time for peer pings: " + (peerNanos / peerCount) / 1000000.0 + "ms (sampled from " + peerCount + " requests)")
system.shutdown()
}
This is what happens here:
I am creating 50 pairs of animal/brain actors
They are all launched and run for 5 seconds
Each animal does an infinite loop, taking turns with its brain
In 1% of all runs, an animal sends a ping to a random other animal and waits for its reply. Then, it continues its loop with its brain
Each request to the brain and to peer is measured, so that we can get an average
After 5 seconds, everything is stopped and the timings for brain-requests and pings to peers are compared
On my dual core i7 I am seeing these numbers:
Average time for brain request: 0.004708ms (sampled from 21073859 requests)
Average time for peer pings: 0.66866ms (sampled from 211167 requests)
So pings to peers are 165x slower than requests to brains. I've been trying lots of things to fix this (e.g. priority mailboxes and warming up the JIT), but haven't been able to figure out what's going on. Does anyone have an idea?
I think you should use the ask pattern to handle the message. In your code, the BrainRequest was sent to the brain actor and then it sent back the BrainResponse. The problem was here. The BrainResponse was not that BrainRequest's response. Maybe it was previous BrainRequest's response.
The following code uses the ask pattern and the perf result is almost same.
package edu.blindworld.test
import java.util.concurrent.TimeUnit
import akka.actor.{ActorRef, ActorSystem, Props, Actor}
import akka.pattern.ask
import akka.util.Timeout
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.util.Random
class Animal extends Actor {
val brain = context.actorOf(Props(classOf[Brain]))
var animals: Option[List[ActorRef]] = None
var brainCount = 0
var brainRequestStartTime = 0L
var brainNanos = 0L
var peerCount = 0
var peerRequestStartTime = 0L
var peerNanos = 0L
override def receive = {
case Go(all) =>
animals = Some(all)
performLoop()
case PeerRequest =>
sender() ! PeerResponse
case Stop =>
sender() ! StopResult(brainCount, brainNanos, peerCount, peerNanos)
context.stop(brain)
context.stop(self)
}
def performLoop(): Unit = {
brainRequestStartTime = System.nanoTime()
brain.ask(BrainRequest)(10.millis) onSuccess {
case _ =>
brainNanos += (System.nanoTime() - brainRequestStartTime)
brainCount += 1
// Animal interactions are rare
if (Random.nextDouble() < 0.01) {
// Send a ping to a random other one (or ourselves). Defer our own loop
val randomOther = animals.get(Random.nextInt(animals.get.length))
peerRequestStartTime = System.nanoTime()
randomOther.ask(PeerRequest)(10.millis) onSuccess {
case _ =>
peerNanos += (System.nanoTime() - peerRequestStartTime)
peerCount += 1
performLoop()
}
} else {
performLoop()
}
}
}
}
class Brain extends Actor {
override def receive = {
case BrainRequest =>
sender() ! BrainResponse
}
}
case class Go(animals: List[ActorRef])
case object Stop
case class StopResult(brainCount: Int, brainNanos: Long, peerCount: Int, peerNanos: Long)
case object BrainRequest
case object BrainResponse
case object PeerRequest
case object PeerResponse
object ActorTest extends App {
println("Sampling...")
val system = ActorSystem("Test")
val animals = (0 until 50).map(i => system.actorOf(Props(classOf[Animal]))).toList
animals.foreach(_ ! Go(animals))
Thread.sleep(5000)
implicit val timeout = Timeout(5, TimeUnit.SECONDS)
val futureStats = animals.map(_.ask(Stop).mapTo[StopResult])
val stats = futureStats.map(Await.result(_, Duration(5, TimeUnit.SECONDS)))
val brainCount = stats.foldLeft(0)(_ + _.brainCount)
val brainNanos = stats.foldLeft(0L)(_ + _.brainNanos)
val peerCount = stats.foldLeft(0)(_ + _.peerCount)
val peerNanos = stats.foldLeft(0L)(_ + _.peerNanos)
println("Average time for brain request: " + (brainNanos / brainCount) / 1000000.0 + "ms (sampled from " + brainCount + " requests)")
println("Average time for peer pings: " + (peerNanos / peerCount) / 1000000.0 + "ms (sampled from " + peerCount + " requests)")
system.shutdown()
}
I moved from Casbah to Reactive Mongo and from that moment I couldn't make work the test of my actor.
I have a dao for the persistence layer and tests for that tier. All the tests passed. So, the only thing that comes to my mind its a problem of synchronization.
" UserActor " should {
val socketActorProbe = new TestProbe(system)
val peyiProbe = new TestProbe(system)
val identifyId = 1
val emailCsr = "csr#gmail.com"
val emailPeyi = "peyi#gmail.com"
val point = new Point[LatLng](new LatLng(-31.4314041, -64.1670626))
" test preStart() " in new WithApplication {
db.createDB(id1, id2, id3)
val userActorRefCsr = TestActorRef[UserActor](Props(classOf[UserActor], emailCsr, socketActorProbe.ref))
val csr = userActorRefCsr.underlyingActor
val userActorRef = TestActorRef[UserActor](Props(classOf[UserActor], emailPeyi, socketActorProbe.ref))
val peyi = userActorRef.underlyingActor
peyi.receive(ActorIdentity(identifyId, Option(userActorRefCsr)))
db.clearDB()
}
Actor class.
class UserActor(email: String, upstream: ActorRef) extends Actor {
import UserActor._
val identifyId = 1
val usersFromDB = ReactiveMongoFactory.db.collection[BSONCollection]("users")
val userDao = new UserDao(usersFromDB)
val meFuture = userDao.findMeByEmail(email)
var friends: Map[String, ActorRef] = Map()
override def preStart() = {
meFuture onComplete { result =>
val emailsFriends: List[String] = userDao.getMyFriendsEmail(result.get.get)
println(emailsFriends)
for (email <- emailsFriends) {
println("sending msg to " + email)
context.actorSelection("/user/" + email) ! Identify(identifyId)
}
}
}
private def giveMyFriend(email: String): Option[ActorRef] = {
for(friend <- friends){
if (friend._1 == email) new Some(friend._2)
}
None
}
def active(another: ActorRef): Actor.Receive = {
case Terminated(`another`) => context.stop(self)
}
def receive = {
case ActorIdentity(`identifyId`, Some(actorRef)) =>
meFuture onComplete { result =>
println(" ... subscribing ... " + result.get.get.basicProfile.email)
actorRef ! Subscribe(result.get.get.basicProfile.email.get)
context.watch(actorRef)
context.become(active(actorRef))
}
case Subscribe(email) =>
friends += (email -> sender)
context watch sender
case Terminated(user) => {
for(friend <- friends){
if (friend._2 == user ) friends -= friend._1 //removing by key
}
}
case UserMoved(email, point) =>
upstream ! UserPosition(email, System.currentTimeMillis(), point.coordinates)
}
}
Im receiving the following output.
The exception is thrown in the following lines of code.
def findMeByEmail(email: String): Future[Option[User]] = {
val query = BSONDocument("email" -> email)
println( " .... finding user ..... email: " + email )
val cursor = users.find(query).cursor[BSONDocument]
val userFuture = cursor.headOption.map(
doc => Some(userReader.read(doc.get))
)
userFuture
}
If I run the test for that method, it's all ok.
describe("get my friends emails") {
it("returns a list of emails") {
val futureUser = userDao.findMeByEmail("csr#gmail.com")
ScalaFutures.whenReady(futureUser) { result =>
val friends = userDao.getMyFriendsEmail(result.get)
assert(friends.length == 2)
}
}
}
Basically, Im trying to look my friends (Other actor) and then register them in a map to have a reference. I couldn't find any good example which shows tests using Reactive Mongo with Actors.
I hope somebody can help me to understand whats going on here. Thanks in advance.