Advice needed implementing direct and inverse Dijkstra algorithm in Scala/Spark - scala

I'm trying to implement both direct Dijkstra and its inverse version (that is, finding the longest paths instead of the shortest ones), but I'm running into trouble: I get infinite distances for nodes that are not disconnected in the weighted undirected graph (and zero distances in the inverse version).
So far I have relied on and modified this implementation I found on the web: [http://note.yuhc.me/2015/03/graphx-pregel-shortest-path/]
My implementations for both functions are as follows:
Direct Dijkstra:
// Implementation of Dijkstra algorithm using Pregel API
def computeMinDistance(u: VertexId, k1: VertexId): Double = {
val g: Graph[(Double, VertexId), Double] = this.uncertainGraph.mapVertices((id, _) =>
if (id == u) (0.0, id) else (Double.PositiveInfinity, id)
)
println("Computing Dijkstra distance info for id: " + u.toString)
val sssp: Graph[(Double, VertexId), Double] = g.pregel[(Double, VertexId)]((Double.PositiveInfinity, Long.MaxValue), Int.MaxValue, EdgeDirection.Either)(
(id, dist, newDist) => {
if(dist._1 < newDist._1) {
(dist._1, id)
} else {
(newDist._1, id)
}
},
triplet => { // Send Message
if (triplet.srcAttr._1 + triplet.attr < triplet.dstAttr._1) {
println("triplet.srcAttr._1 = " + triplet.srcAttr._1 .toString)
println("triplet.dstAttr._1 = " + triplet.dstAttr._1 .toString)
Iterator((triplet.dstId, (triplet.srcAttr._1 + triplet.attr, triplet.srcId)))
} else {
Iterator.empty
}
},
(a, b) => (math.min(a._1, b._1), a._2) // Merge Message
)
sssp.vertices.take(20).foreach(println(_))
sssp.vertices.filter(element => element._1 == k1).map(element => element._2._1).collect()(0)
}
Inverse Dijkstra:
def computeMaxDistance(node: VertexId, center: VertexId): Double = {
val g: Graph[(Double, VertexId), Double] = this.uncertainGraph.mapVertices((id, _) =>
if (id != node) (0.0, id) else (Double.PositiveInfinity, id)
)
val sslp: Graph[(Double, VertexId), Double] = g.pregel[(Double, VertexId)]((Double.PositiveInfinity, Long.MaxValue), Int.MaxValue, EdgeDirection.Either)(
(id, dist, newDist) => {
if(dist._1 > newDist._1) {
(dist._1, id)
} else {
(newDist._1, id)
}
},
triplet => { // Send Message
if (triplet.srcAttr._1 + triplet.attr > triplet.dstAttr._1) {
println("triplet.srcAttr._1 = " + triplet.srcAttr._1 .toString)
println("triplet.dstAttr._1 = " + triplet.dstAttr._1 .toString)
Iterator((triplet.dstId, (triplet.srcAttr._1 + triplet.attr, triplet.srcId)))
} else {
Iterator.empty
}
},
(a, b) => (math.max(a._1, b._1), a._2) // Merge Message
)
sslp.vertices.take(20).foreach(println(_))
sslp.vertices.filter(element => element._1 == center).map(element => element._2._1).collect()(0)
}
Any help is deeply appreciated. I'm not really that experienced with Scala and Spark. Thanks in advance.
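For cross-checking the distances on a small graph, a plain-Scala reference (no Spark) can help. The sketch below is only that: `DijkstraReference`, `shortestDistances` and the `(u, v, weight)` edge-list format are made-up names for illustration, not part of GraphX. It treats every edge as undirected by adding it in both directions. That is one thing that may be worth checking in the Pregel code above, where the send function only ever emits a message to `triplet.dstId`. Also note that simply swapping min for max has no fixed point on an undirected graph with positive weights, because going back and forth over an edge keeps increasing the distance.

```scala
import scala.collection.mutable

object DijkstraReference {
  // Hypothetical helper: edges are (u, v, weight) triples, treated as undirected.
  def shortestDistances(edges: Seq[(Long, Long, Double)], source: Long): Map[Long, Double] = {
    // Build an adjacency list with each edge inserted in both directions.
    val adj = mutable.Map.empty[Long, List[(Long, Double)]].withDefaultValue(Nil)
    for ((u, v, w) <- edges) {
      adj(u) = (v, w) :: adj(u)
      adj(v) = (u, w) :: adj(v)
    }
    val dist = mutable.Map.empty[Long, Double].withDefaultValue(Double.PositiveInfinity)
    dist(source) = 0.0
    // Min-heap on the tentative distance.
    val ord = Ordering.by[(Double, Long), Double](_._1).reverse
    val queue = mutable.PriorityQueue((0.0, source))(ord)
    while (queue.nonEmpty) {
      val (d, u) = queue.dequeue()
      if (d <= dist(u)) { // skip stale queue entries
        for ((v, w) <- adj(u) if d + w < dist(v)) {
          dist(v) = d + w
          queue.enqueue((d + w, v))
        }
      }
    }
    dist.toMap // only vertices reachable from the source appear here
  }
}
```

Calling `DijkstraReference.shortestDistances(Seq((1L, 2L, 1.0), (2L, 3L, 2.0), (1L, 3L, 5.0)), 1L)` on a toy triangle gives a baseline to compare against what `computeMinDistance` returns for the same edges.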

Related

Parallel FP Growth in Spark

I am trying to understand the "add" and "extract" methods of the FPTree class:
(https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/fpm/FPGrowth.scala).
What is the purpose of the 'summaries' variable?
Where is the Group list?
I assume it is the following; am I correct?
val numParts = if (numPartitions > 0) numPartitions else data.partitions.length
val partitioner = new HashPartitioner(numParts)
What will 'summaries' contain for the 3 transactions {a,b,c}, {a,b}, {b,c}, where all items are frequent?
def add(t: Iterable[T], count: Long = 1L): FPTree[T] = {
require(count > 0)
var curr = root
curr.count += count
t.foreach { item =>
val summary = summaries.getOrElseUpdate(item, new Summary)
summary.count += count
val child = curr.children.getOrElseUpdate(item, {
val newNode = new Node(curr)
newNode.item = item
summary.nodes += newNode
newNode
})
child.count += count
curr = child
}
this
}
def extract(
minCount: Long,
validateSuffix: T => Boolean = _ => true): Iterator[(List[T], Long)] = {
summaries.iterator.flatMap { case (item, summary) =>
if (validateSuffix(item) && summary.count >= minCount) {
Iterator.single((item :: Nil, summary.count)) ++
project(item).extract(minCount).map { case (t, c) =>
(item :: t, c)
}
} else {
Iterator.empty
}
}
}
After a few experiments, it turns out to be pretty straightforward:
1+2) The partition is indeed the Group representative.
It is also how the conditional transactions are calculated:
private def genCondTransactions[Item: ClassTag](
transaction: Array[Item],
itemToRank: Map[Item, Int],
partitioner: Partitioner): mutable.Map[Int, Array[Int]] = {
val output = mutable.Map.empty[Int, Array[Int]]
// Filter the basket by frequent items pattern and sort their ranks.
val filtered = transaction.flatMap(itemToRank.get)
ju.Arrays.sort(filtered)
val n = filtered.length
var i = n - 1
while (i >= 0) {
val item = filtered(i)
val part = partitioner.getPartition(item)
if (!output.contains(part)) {
output(part) = filtered.slice(0, i + 1)
}
i -= 1
}
output
}
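To see what this does on a tiny input, here is a hedged plain-Scala sketch of the same loop. Spark's `Partitioner` is replaced by a hypothetical `getPartition: Int => Int` function, and the object name, the ranks and the modulo partitioning are assumptions made up for the example.

```scala
import scala.collection.mutable

object CondTransactionsSketch {
  // Same idea as genCondTransactions, with the partitioner passed in as a plain function.
  def genCondTransactions(
      transaction: Array[String],
      itemToRank: Map[String, Int],
      getPartition: Int => Int): mutable.Map[Int, Array[Int]] = {
    val output = mutable.Map.empty[Int, Array[Int]]
    // Keep only frequent items, translate them to ranks, and sort the ranks.
    val filtered = transaction.flatMap(item => itemToRank.get(item).toList).sorted
    var i = filtered.length - 1
    while (i >= 0) {
      val part = getPartition(filtered(i))
      // The first time a partition (group) is seen, it receives the prefix up to this item.
      if (!output.contains(part)) output(part) = filtered.slice(0, i + 1)
      i -= 1
    }
    output
  }
}
```

With ranks b=0, a=1, c=2 and `rank % 2` as the partition function, the transaction {a,b,c} is sent to partition 0 as the ranks [0, 1, 2] and to partition 1 as the prefix [0, 1].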
'summaries' is just a helper that stores, per item, the total count of that item in the tree together with the tree nodes that carry it.
extract/project generate the frequent itemsets by recursing up and down over dependent (projected) FP-trees, checking 'summaries' to decide whether a path needs to be traversed at all.
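If the three transactions are inserted in the order written, the nodes below 'a' are 'b' (count 2) and, under it, 'c' (count 1); 'summaries' itself is keyed by item and holds a:2, b:3, c:2.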
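To make the role of 'summaries' concrete, here is a stripped-down re-creation of the add logic in plain Scala. `MiniFPTree` is a hypothetical name, not Spark's class; it only mirrors the add method quoted in the question. It also assumes the items are inserted in the order written, whereas Spark feeds rank-sorted transactions, so the shape of the real tree may differ.

```scala
import scala.collection.mutable

// Hypothetical minimal FP-tree that only supports add.
class MiniFPTree[T] {
  final class Node(val item: Option[T]) {
    var count: Long = 0L
    val children = mutable.Map.empty[T, Node]
  }
  final class Summary {
    var count: Long = 0L                        // total count of the item in the tree
    val nodes = mutable.ListBuffer.empty[Node]  // every tree node carrying the item
  }

  val root = new Node(None)
  val summaries = mutable.Map.empty[T, Summary]

  def add(t: Iterable[T], count: Long = 1L): this.type = {
    require(count > 0)
    var curr = root
    curr.count += count
    t.foreach { item =>
      val summary = summaries.getOrElseUpdate(item, new Summary)
      summary.count += count
      val child = curr.children.getOrElseUpdate(item, {
        val newNode = new Node(Some(item))
        summary.nodes += newNode // a new node is registered in the item's summary
        newNode
      })
      child.count += count
      curr = child
    }
    this
  }
}
```

After adding {a,b,c}, {a,b} and {b,c} in that order, the summaries hold a with count 2 on one node, b with count 3 on two nodes (a->b and root->b), and c with count 2 on two nodes.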

Warnsdorff’s algorithm for custom move in Scala

I have written a program to find the list of moves required to cover every square of the chessboard using Warnsdorff's algorithm. It works perfectly for a 7x7 board but not for boards such as 8x8, 10x10 or 16x16: it just keeps running for a long time. The code is below. Please point out where I am going wrong.
object PawnTourMain {
def main(args: Array[ String ]): Unit = {
val kt = PawnTour(7)
kt.findTour(0, 1, 0)
kt.printSolution
}
class PawnTour(size: Int, board: Array[ Array[ Int ] ], possibleMoves: Array[ Array[ Array[ Point ] ] ]) {
val UNUSED = -1
def findTour(x: Int, y: Int, current: Int): Boolean = {
if (board(x)(y) != UNUSED) return false
//Mark current position as 'current'
board(x)(y) = current
if (current == size * size - 1) { //done :)
return true
}
for (d <- possibleMoves(x)(y)) {
if (findTour(d.x, d.y, current + 1)) return true
}
//if we are here, all our options ran out :(
//reset the current cell and return false
board(x)(y) = UNUSED
false
}
def printSolution: Unit = {
board foreach {
row =>
row foreach (number => print(number + " ")); println
}
}
}
case class Point(x: Int, y: Int)
object PawnTour {
val DIRECTIONS = Array(Point(3, 0), Point(-3, 0), Point(2, -2), Point(2, 2), Point(0, 3), Point(0, -3), Point(-2, -2), Point(2, 2))
def apply(n: Int): PawnTour = {
val board = Array.fill[ Int ](n, n)(-1)
val possibleMoves = Array.tabulate(n, n) { (x, y) =>
DIRECTIONS.flatMap { d =>
val nx = x + d.x
val ny = y + d.y
if ((nx >= 0 && nx < n) && (ny >= 0 && ny < n)) Option(Point(nx, ny)) else None
}
}
var x = 0
while (x < n) {
var y = 0
while (y < n) {
val moves: Array[ Point ] = possibleMoves(x)(y)
moves.sortBy((o1: Point) => possibleMoves(o1.x)(o1.y).size)
y += 1
println(moves.toList)
}
x += 1
}
new PawnTour(n, board, possibleMoves)
}
def printSolution(array: Array[ Array[ Int ] ]): Unit = {
array foreach {
row =>
row foreach (number => print(number + " ")); println
}
}
}
}
From Wikipedia:
The knight is moved so that it always proceeds to the square from which the knight will have the fewest onward moves.
Your implementation does not take already visited squares into account. It creates and sorts the possible moves on an empty board but forgets to update them when the algorithm makes a move.
When a square is visited, you should remove it from the possible moves and then reorder the possible moves again.
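A minimal sketch of that idea, under a few assumptions: the object and method names are made up, the move set is taken to be the eight symmetric moves (3 squares orthogonally, 2 diagonally), and backtracking is kept as a fallback because the heuristic alone is not guaranteed to find a tour. The onward-move count is recomputed against the current board at every step. Note also that `sortBy` returns a new array rather than sorting in place, so the `moves.sortBy(...)` call in the question has no effect on its own.

```scala
object WarnsdorffSketch {
  // Assumed move set: 3 squares orthogonally, 2 squares diagonally.
  val moves = Seq((3, 0), (-3, 0), (0, 3), (0, -3), (2, 2), (2, -2), (-2, 2), (-2, -2))

  def findTour(n: Int, startX: Int, startY: Int): Option[Array[Array[Int]]] = {
    val board = Array.fill(n, n)(-1) // -1 marks an unvisited square

    // Squares reachable from (x, y) that are on the board and still unvisited.
    def free(x: Int, y: Int): Seq[(Int, Int)] =
      moves.map { case (dx, dy) => (x + dx, y + dy) }
        .filter { case (nx, ny) => nx >= 0 && nx < n && ny >= 0 && ny < n && board(nx)(ny) == -1 }

    def visit(x: Int, y: Int, step: Int): Boolean = {
      board(x)(y) = step
      if (step == n * n - 1) true
      else {
        // Warnsdorff ordering, computed on the board as it is right now.
        val ordered = free(x, y).sortBy { case (nx, ny) => free(nx, ny).size }
        if (ordered.exists { case (nx, ny) => visit(nx, ny, step + 1) }) true
        else { board(x)(y) = -1; false } // dead end: undo and backtrack
      }
    }

    if (visit(startX, startY, 0)) Some(board) else None
  }
}
```

`WarnsdorffSketch.findTour(10, 0, 0)` returns `Some(board)` with the visiting order if a tour is found and `None` otherwise. The recursion depth is n * n, which is fine for boards of the sizes mentioned in the question.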

Scala - Recursive method returns different values

I have implemented a calculation to obtain the score of each node.
The formula to obtain the value is:
The children list cannot be empty, or a flag must be true.
The iterative way works pretty well:
class TreeManager {
def scoreNo(nodes:List[Node]): List[(String, Double)] = {
nodes.headOption.map(node => {
val ranking = node.key.toString -> scoreNode(Some(node)) :: scoreNo(nodes.tail)
ranking ::: scoreNo(node.children)
}).getOrElse(Nil)
}
def scoreNode(node:Option[Node], score:Double = 0, depth:Int = 0):Double = {
node.map(n => {
var nodeScore = score
for(child <- n.children){
if(!child.children.isEmpty || child.hasInvitedSomeone == Some(true)){
nodeScore = scoreNode(Some(child), (nodeScore + scala.math.pow(0.5, depth)), depth+1)
}
}
nodeScore
}).getOrElse(score)
}
}
But after I refactored this piece of code to use recursion, the results are totally wrong:
class TreeManager {
def scoreRecursive(nodes:List[Node]): List[(Int, Double)] = {
def scoreRec(nodes:List[Node], score:Double = 0, depth:Int = 0): Double = nodes match {
case Nil => score
case n =>
if(!n.head.children.isEmpty || n.head.hasInvitedSomeone == Some(true)){
score + scoreRec(n.tail, score + scala.math.pow(0.5, depth), depth + 1)
} else {
score
}
}
nodes.headOption.map(node => {
val ranking = node.key -> scoreRec(node.children) :: scoreRecursive(nodes.tail)
ranking ::: scoreRecursive(node.children)
}).getOrElse(Nil).sortWith(_._2 > _._2)
}
}
A Node is an element of the tree, and it is represented by the following class:
case class Node(key:Int,
children:List[Node] = Nil,
hasInvitedSomeone:Option[Boolean] = Some(false))
And here is the part that I'm running to check the results:
object Main {
def main(bla:Array[String]) = {
val xx = new TreeManager
val values = List(
Node(10, List(Node(11, List(Node(13))),
Node(12,
List(
Node(14, List(
Node(15, List(Node(18))), Node(17, hasInvitedSomeone = Some(true)),
Node(16, List(Node(19, List(Node(20)))),
hasInvitedSomeone = Some(true))),
hasInvitedSomeone = Some(true))),
hasInvitedSomeone = Some(true))),
hasInvitedSomeone = Some(true)))
val resIterative = xx.scoreNo(values)
//val resRecursive = xx.scoreRec(values)
println("a")
}
}
The iterative way works, because I've checked it, but I don't get why the recursive one returns wrong values.
Any idea?
Thanks in advance.
The recursive version never recurses on the children of the nodes, only on the tail, whereas the iterative version correctly both recurses on the children and iterates on the tail.
By the way, you'll notice your "iterative" version is also recursive.
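As a sketch of what a var-free version that still recurses on the children could look like: the fold below walks the children, which plays the role of "iterating on the tail", and recurses into each child that qualifies. `ScoreSketch` and `score` are made-up names, and the Node class is copied from the question; this is meant to mirror scoreNode, not to be the only way to write it.

```scala
object ScoreSketch {
  case class Node(key: Int,
                  children: List[Node] = Nil,
                  hasInvitedSomeone: Option[Boolean] = Some(false))

  // Same accumulation as scoreNode: the running score is threaded through the siblings,
  // and each qualifying child adds 0.5^depth before its own children are visited.
  def score(node: Node, acc: Double = 0.0, depth: Int = 0): Double =
    node.children.foldLeft(acc) { (s, child) =>
      if (child.children.nonEmpty || child.hasInvitedSomeone.contains(true))
        score(child, s + math.pow(0.5, depth), depth + 1)
      else s
    }
}
```

For example, a root with one child that has a child of its own and one childless child flagged `Some(true)` scores 2.0, since both children qualify at depth 0.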

Build dynamic query with Slick 2.1.0

The goal is to filter Items by optional keywords and/or shopId.
If neither of them is defined, all Items should be returned.
My attempt is:
case class ItemSearchParameters(keywords: Option[String], shopId: Option[Long])
def search(params: ItemSearchParameters): Either[Failure, List[Item]] = {
try {
db withDynSession {
val q = Items.query
if (params.keywords.isDefined) {
q.filter { i =>
((i.title like "%" + params.keywords + "%")
|| (i.description like "%" + params.keywords + "%"))
}
}
if (params.shopId.isDefined) {
q.filter { i =>
i.shopId === params.shopId
}
}
Right(q.run.toList)
}
} catch {
case e: SQLException =>
Left(databaseError(e))
}
}
Even with params.keywords or params.shopId defined, this function returns all Items. Can someone please explain what is wrong?
Update: second attempt
def search(params: ItemSearchParameters): Either[Failure, List[Item]] = {
try {
db withDynSession {
var q = Items.query
q = params.keywords.map{ k => q.filter(_.title like "%" + k + "%")} getOrElse q
q = params.keywords.map{ k => q.filter(_.description like "%" + k + "%")} getOrElse q
q = params.shopId.map{ sid => q.filter(_.shopId === sid)} getOrElse q
Right(q.run.toList)
}
} catch {
case e: SQLException =>
Left(databaseError(e))
}
}
For this second attempt, how do I do (title OR description) if keywords is defined?
Update: third attempt with MaybeFilter, not working
case class MaybeFilter[X, Y](val query: scala.slick.lifted.Query[X, Y, Seq]) {
def filteredBy(op: Option[_])(f:(X) => Column[Option[Boolean]]) = {
op map { o => MaybeFilter(query.filter(f)) } getOrElse { this }
}
}
class ItemDAO extends Configuration {
implicit def maybeFilterConversor[X,Y](q:Query[X,Y,Seq]) = new MaybeFilter(q)
def search(params: ItemSearchParameters): Either[Failure, List[Item]] = {
try {
db withDynSession {
val q = Items
.filteredBy(params.keywords){i => ((i.title like "%" + params.keywords + "%")
|| (i.description like "%" + params.keywords + "%"))}
.filteredBy(params.shopId){_.shopId === params.shopId}
.query
Right(q.list)
}
} catch {
case e: SQLException =>
Left(databaseError(e))
}
}
}
The third attempt returns an empty list if keywords is given.
def search(params: ItemSearchParameters): Either[Failure, List[Item]] = {
try {
db withDynSession {
var q = Items.query
q = params.keywords.map{ k => q.filter(
i => (i.title like "%" + k + "%")
|| (i.description like "%" + k + "%")
)} getOrElse q
q = params.shopId.map{ sid => q.filter(
_.shopId === sid
)} getOrElse q
Right(q.run.toList)
}
} catch {
case e: SQLException =>
Left(databaseError(e))
}
}
I am not sure it is the best answer because of var q
If I understood you correctly, you want to filter by optional fields.
Your second attempt is quite close to what you need; the first has incorrect matching, because you compare Option fields to non-Option ones. You answered your own question while I was writing this response :)
I'd like to recommend you this MaybeFilter https://gist.github.com/cvogt/9193220
Or here is modified version: https://github.com/neowinx/hello-slick-2.1-dynamic-filter/blob/master/src/main/scala/HelloSlick.scala#L3-L7
Maybe this can help you to solve your problem in a more generic way.
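The same "apply the filter only if the Option is defined" pattern can be tried out without a database. The sketch below is an illustration only: it uses an in-memory List instead of a Slick Query, and `OptionalFilters`, `filterOpt`, `Item` and `wrongPattern` are made-up names. It also shows why concatenating the Option itself into a like pattern matches nothing useful: the pattern ends up containing the text "Some(...)".

```scala
object OptionalFilters {
  // Hypothetical in-memory stand-in for the Items table.
  case class Item(title: String, description: String, shopId: Long)

  // Apply the predicate only when the optional parameter is defined; otherwise pass through.
  def filterOpt[A, B](xs: List[A], opt: Option[B])(p: (A, B) => Boolean): List[A] =
    opt.fold(xs)(b => xs.filter(a => p(a, b)))

  // Filters are chained by passing each result on, so no var is needed.
  def search(items: List[Item], keywords: Option[String], shopId: Option[Long]): List[Item] = {
    val byKeywords = filterOpt(items, keywords) { (i, k) =>
      i.title.contains(k) || i.description.contains(k)
    }
    filterOpt(byKeywords, shopId)((i, sid) => i.shopId == sid)
  }

  // Concatenating the Option (instead of its content) embeds "Some(...)" in the pattern.
  val wrongPattern: String = "%" + Option("red") + "%"
}
```

Here `wrongPattern` is the string "%Some(red)%", which is what the first and third attempts effectively send to the database when they concatenate `params.keywords` directly.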

Value * is not a member of AnyVal

This is a fold that I wrote and I get this error:
Error:(26, 42) value * is not a member of AnyVal
(candE.intersect(candR), massE * massR)
^
allAssignmentsTable is a List[Map[Set[Candidate[A]],Double]]
val allAssignmentsTable = hypothesis.map(h => {
allAssignments.map(copySet => {
if(h.getAssignment.keySet.contains(copySet))
(copySet -> h.getAssignment(copySet))
else
(copySet -> 0.0)
}).toMap
})
val aggregated = allAssignmentsTable.foldLeft(initialFold) { (res,element) =>
val allIntersects = element.map {
case (candE, massE) =>
res.map {
case (candR, massR) => (candE.intersect(candR), massE * massR)
}.toList
}.toList.flatten
val normalizer = allIntersects.groupBy(_._1).filter(_._1.size == 0).map {
case(key, value) => value.foldLeft(0.0)((e,i) => i._2 + e)
}.head
allIntersects.groupBy(_._1).map {
case(key, value) => key -> value.foldLeft(0.0)((e,i) => i._2 + e)
}
}
If I do this: case (candE, massE: Double), then I don't get the error, but I get an exception in the match.
The problem that you get here:
val aggregated = allAssignmentsTable.foldLeft(initialFold) { (res,element) =>
val allIntersects = element.map {
case (candE, massE) =>
res.map {
case (candR, massR) => (candE.intersect(candR), massE * massR)
}.toList
}.toList.flatten
is most probably arising from the previous code block:
val allAssignmentsTable = hypothesis.map(h => {
allAssignments.map(copySet => {
if(h.getAssignment.keySet.contains(copySet))
(copySet -> h.getAssignment(copySet))
else
(copySet -> 0.0)
}).toMap
})
My hypothesis is that h.getAssignment(copySet) returns something other than Double (which seems to be confirmed by the error message quoted in the OP: (26, 42) etc.; neither of these two values looks like a Double). Therefore allAssignmentsTable is, under the covers, probably not List[Map[Set[Candidate[A]],Double]] but something else, e.g. it has Any instead of Double, so the operator * cannot be applied.
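One way such an AnyVal can sneak in, sketched under the assumption that getAssignment yields a numeric type other than Double (it isn't shown in the question, so `AnyValDemo`, `assignment` and its Int values are hypothetical): when the two branches of the if produce pairs with different value types, the compiler infers their common supertype.

```scala
object AnyValDemo {
  // Hypothetical stand-in for h.getAssignment: the values are Int, not Double.
  val assignment: Map[String, Int] = Map("x" -> 2)

  // The branches yield (String, Int) and (String, Double), so the value type widens to AnyVal.
  val widened = List("x", "y").map { k =>
    if (assignment.contains(k)) k -> assignment(k) else k -> 0.0
  }.toMap
  // widened("x") * 2.0 would not compile: value * is not a member of AnyVal

  // Converting explicitly and annotating the result keeps the values as Double.
  val fixed: Map[String, Double] = List("x", "y").map { k =>
    k -> assignment.get(k).map(_.toDouble).getOrElse(0.0)
  }.toMap
}
```

Annotating the expected type, as on `fixed`, also makes the compiler report the mismatch at the place where the map is built rather than later inside the fold.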