Difference between RoundRobinRouter and RoundRobinRoutinglogic - scala

So I was reading tutorial about akka and came across this http://manuel.bernhardt.io/2014/04/23/a-handful-akka-techniques/ and I think he explained it pretty well, I just picked up scala recently and having difficulties with the tutorial above,
I wonder what is the difference between RoundRobinRouter and the current RoundRobinRouterLogic? Obviously the implementation is quite different.
Previously the implementation of RoundRobinRouter is
val workers = context.actorOf(Props[ItemProcessingWorker].withRouter(RoundRobinRouter(100)))
with processBatch
def processBatch(batch: List[BatchItem]) = {
if (batch.isEmpty) {
log.info(s"Done migrating all items for data set $dataSetId. $totalItems processed items, we had ${allProcessingErrors.size} errors in total")
} else {
// reset processing state for the current batch
currentBatchSize = batch.size
allProcessedItemsCount = currentProcessedItemsCount + allProcessedItemsCount
currentProcessedItemsCount = 0
allProcessingErrors = currentProcessingErrors ::: allProcessingErrors
currentProcessingErrors = List.empty
// distribute the work
batch foreach { item =>
workers ! item
}
}
}
Here's my implementation of RoundRobinRouterLogic
var mappings : Option[ActorRef] = None
var router = {
val routees = Vector.fill(100) {
mappings = Some(context.actorOf(Props[Application3]))
context watch mappings.get
ActorRefRoutee(mappings.get)
}
Router(RoundRobinRoutingLogic(), routees)
}
and treated the processBatch as such
def processBatch(batch: List[BatchItem]) = {
if (batch.isEmpty) {
println(s"Done migrating all items for data set $dataSetId. $totalItems processed items, we had ${allProcessingErrors.size} errors in total")
} else {
// reset processing state for the current batch
currentBatchSize = batch.size
allProcessedItemsCount = currentProcessedItemsCount + allProcessedItemsCount
currentProcessedItemsCount = 0
allProcessingErrors = currentProcessingErrors ::: allProcessingErrors
currentProcessingErrors = List.empty
// distribute the work
batch foreach { item =>
// println(item.id)
mappings.get ! item
}
}
}
I somehow cannot run this tutorial, and it's stuck at the point where it's iterating the batch list. I wonder what I did wrong.
Thanks

In the first place, you have to distinguish diff between them.
RoundRobinRouter is a Router that uses round-robin to select a connection.
While
RoundRobinRoutingLogic uses round-robin to select a routee
You can provide own RoutingLogic (it has helped me to understand how Akka works under the hood)
class RedundancyRoutingLogic(nbrCopies: Int) extends RoutingLogic {
val roundRobin = RoundRobinRoutingLogic()
def select(message: Any, routees: immutable.IndexedSeq[Routee]): Routee = {
val targets = (1 to nbrCopies).map(_ => roundRobin.select(message, routees))
SeveralRoutees(targets)
}
}
link on doc http://doc.akka.io/docs/akka/2.3.3/scala/routing.html
p.s. this doc is very clear and it has helped me the most

Actually I misunderstood the method, and found out the solution was to use RoundRobinPool as stated in http://doc.akka.io/docs/akka/2.3-M2/project/migration-guide-2.2.x-2.3.x.html
For example RoundRobinRouter has been renamed to RoundRobinPool or
RoundRobinGroup depending on which type you are actually using.
from
val workers = context.actorOf(Props[ItemProcessingWorker].withRouter(RoundRobinRouter(100)))
to
val workers = context.actorOf(RoundRobinPool(100).props(Props[ItemProcessingWorker]), "router2")

Related

Breadth first search inefficiencies

I'm writing a simple breadth first search algorithm is Scala and I feel like it should be pretty efficient. However when I run this one some relatively small problems I'm managing to run out of memory.
def search(start: State): Option[State] = {
val queue: mutable.Queue[State] = mutable.Queue[State]()
queue.enqueue( start )
while( queue.nonEmpty ){
val node = queue.dequeue()
if( self.isGoal(node) )
return Some(node)
self.successors(node).foreach( queue.enqueue )
}
None
}
I believe the enqueue and dequeue methods on a mutable queue were constant time and for each is implemented efficiently. The methods isGoal and successors I know are as efficient as they as they can be. I don't understand how I can be running out of memory so quickly. Are there any inefficiencies in this code that I'm missing?
I think c0der's comment nailed it: you may be getting caught in an infinite loop re-checking nodes that you've already visited. Consider the following changes:
def search(start: State): Option[State] = {
var visited: Set[State] = Set() // change #1
val queue: mutable.Queue[State] = mutable.Queue[State]()
queue.enqueue( start )
while( queue.nonEmpty ){
val node = queue.dequeue()
if (!visited.contains(node)) { // change #2
visited += node // change #3
if( self.isGoal(node) )
return Some(node)
self.successors(node).foreach( queue.enqueue )
}
}
None
}
Initialize a new Set, visited, to keep track of which Nodes you've been to.
Immediately after dequeueing a Node, check if you've visited it before. If not, continue checking this Node. Otherwise, ignore it.
Make sure to add this Node to the visited Set so it's not checked again in the future.
Hope that helps :D
You have some Java, not Scala code there. For Scala vars and while is something that you should not use at all. Here is my suggestion how you could solve this.
class State(val neighbours: List[State]) // I am not sure how your State class looks like, but it could look something like this
val goal = new State(List())
def breathFirst(start: State): Option[State] = {
#scala.annotation.tailrec
def recursiveFunction(visited: List[State], toVisit: List[State]): Option[State] = { // So we will create recursive function with visited nodes and nodes that we should visit
if (toVisit.isEmpty) return None // If toVisit is empty that means that there is no path from start to goal, return none
else {
val visiting = toVisit.head // Else we should take first node from toVisit
val visitingNeighbours = visiting.neighbours // Take all neighbours from node that we are visiting
val visitingNeighboursNotYetVisited = visitingNeighbours.filter(x => !visited.contains(x)) //Filter all neighbours that are not visited
if (visitingNeighboursNotYetVisited.contains(goal)) { //if we found goal, return it
return Some(goal)
} else {
return recursiveFunction(visited :+ visiting, toVisit.tail ++ visitingNeighboursNotYetVisited) // Otherwise add node that we visited in this iteration to list of visited nodes that does not have visited node - it was head so we take toVisit.tail
// and also we will take all neighbours that are not visited and add them to toVisit list for next iteration
}
}
}
if (start == goal) { // If goal is start, return start
Some(start)
} else { // else call our recursive function with empty visited list and with toVisit list that has start node
recursiveFunction(List(), List(start))
}
}
NOTE: You could change:
val visitingNeighboursNotYetVisited = visitingNeighbours.filter(x => !visited.contains(x)) //Filter all neighbours that are not visited
with
val visitingNeighboursNotYetVisited = visitingNeighbours
and check if you will go out of memory, and, as probably you wont it will show you why you should use tailrec.

Spark-Streaming broadcast variable to custom receiver

I created an application which is using Spark-Streaming with the custom receiver Google Pub/Sub.
I hit my performance limit and interested to drop messages without processing it. I had an idea to store() sub of reading messages
I used apache/bahir receiver
val pullResponse = client.projects().subscriptions().pull(subscriptionFullName, pullRequest).execute()
val receivedMessages = pullResponse.getReceivedMessages.asScala.toList
Utils.LOG.info(s"receivedMessages from PUB/SUB ${receivedMessages.size}")
rateLimiter.acquire(receivedMessages.size)
var factor: Int = 0
if (dropFactorBroad != null) {
factor = dropFactorBroad.value
} else {
Utils.LOG.info("dropFactorBroad is null")
}
val endIndex = if (factor > receivedMessages.length) receivedMessages.length else factor
val messagesToStore = receivedMessages.slice(0, receivedMessages.length - endIndex)
store(messagesToStore.map(x => {
val sm = new SparkPubsubMessage
sm.message = x.getMessage
sm
})
.iterator)
val ackRequest = new AcknowledgeRequest()
ackRequest.setAckIds(receivedMessages.map(x => x.getAckId).asJava)
client.projects().subscriptions().acknowledge(subscriptionFullName, ackRequest).execute()
dropFactorBroad - is Broadcast variable which is updated on every onBatchCompleted(unpersisted and created again)
It is not working, I'm getting
java.lang.NullPointerException
at com.mag.ingester.ReceiverDropFactorBroadcaster.value(ReceiverDropFactorBroadcaster.scala:20)
at com.mag.pubSubReceiver.PubsubReceiver.receive(PubsubInputDStream.scala:260)
at com.mag.pubSubReceiver.PubsubReceiver$$anon$1.run(PubsubInputDStream.scala:244)
ReceiverDropFactorBroadcaster is dropFactorBroad
How can I control the receive store?
Should I kill receivers change the variable and start it again? (How can it be done?)
Thanks

Read password in Scala in a console-agnostic way

I have an easy task to accomplish: read a password from a command line prompt without exposing it. I know that there is java.io.Console.readPassword, however, there are times when you cannot access console as if you are running your app from an IDE (such as IntelliJ).
I stumbled upon this Password Masking in the Java Programming Language tutorial, which looks nice, but I fail to implement it in Scala. So far my solution is:
class EraserThread() extends Runnable {
private var stop = false
override def run(): Unit = {
stop = true
while ( stop ) {
System.out.print("\010*")
try
Thread.sleep(1)
catch {
case ie: InterruptedException =>
ie.printStackTrace()
}
}
}
def stopMasking(): Unit = {
this.stop = false
}
}
val et = new EraserThread()
val mask = new Thread(et)
mask.start()
val password = StdIn.readLine("Password: ")
et.stopMasking()
When I start this snippet I get a continuos printing of asterisks on new lines. E.g.:
*
*
*
*
Is there any specific in Scala why this is not working? Or is there any better way to do this in Scala in general?

neo4j 3.0 embedded - no nodes

There's sometime I must be missing about neo4j 3.0 embedded. After creating a node, setting some properties, and marking the transaction as success. I then re-open the DB, but there are no nodes in it! What am I missing here? The neo4j documentation is pretty poor.
val graph1 = {
val graphDb = new GraphDatabaseFactory()
.newEmbeddedDatabase(new File("/opt/neo4j/deviceGraphTest" ))
val tx = graphDb.beginTx()
val node = graphDb.createNode()
node.setProperty("name", "kitchen island")
node.setProperty("bulbType", "incandescent")
tx.success()
graphDb.shutdown()
}
val graph2 = {
val graphDb2 = new GraphDatabaseFactory()
.newEmbeddedDatabase(new File("/opt/neo4j/deviceGraphTest" ))
val tx2 = graphDb2.beginTx()
val allNodes = graphDb2.getAllNodes.iterator().toList
allNodes.foreach(node => {
printNode(node)
})
}
The transaction what you have opened has to be closed with the command tx.close() after setting the transaction to state success. I do not know the exact scala syntax but it would be good to put the full block into a try/catch and to finally close the transaction in the finally block.
Here is the documentation for Java: https://neo4j.com/docs/java-reference/current/javadocs/org/neo4j/graphdb/Transaction.html

Specify Variable Initialization Order in Scala

I have a special class Model that needs to have its methods called in a very specific order.
I tried doing something like this:
val model = new Model
new MyWrappingClass {
val first = model.firstMethod()
val second = model.secondMethod()
val third = model.thirdMethod()
}
The methods should be called in the order listed, however I am seeing an apparently random order.
Is there any way to get the variable initialization methods to be called in a particular order?
I doubt your methods are called in the wrong order. But to be sure, you can try something like this:
val (first, second, third) = (
model.firstMethod(),
model.secondMethod(),
model.thirdMethod()
)
You likely have some other problem with your code.
I can run 100 million loops where it never gets the order wrong, as follows:
class Model {
var done = Array(false,false,false);
def firstMethod():Boolean = { done(0) = true; done(1) || done(2) };
def secondMethod():Boolean = { done(1) = true; !done(0) || done(2) };
def thirdMethod():Boolean = { done(2) = true; !done(0) || !done(1) };
};
Notice that these methods return a True if done out of order and false when called in order.
Here's your class:
class MyWrappingClass {
val model = new Model;
val first = model.firstMethod()
val second = model.secondMethod()
val third = model.thirdMethod()
};
Our function to check for bad behavior on each trial:
def isNaughty(w: MyWrappingClass):Boolean = { w.first || w.second || w.third };
A short program to test:
var i = 0
var b = false;
while( (i<100000000) && !b ){
b = isNaughty(new MyWrappingClass);
i += 1;
}
if (b){
println("out-of-order behavior occurred");
println(i);
} else {
println("looks good");
}
Scala 2.11.7 on OpenJDK8 / Ubuntu 15.04
Of course this doesn't prove it impossible to have wrong order, only that correct behavior seems highly repeatable in a fairly simple case.