Functional match-maker algorithm in scala - scala

Suppose, in scala, I have a collection of Person objects where each person has an identifier and quantity value:
case class Person(identifier: String, quantity : Int)
A positive quantity represents supply and a negative quantity represents demand. Similarly, a transfer can be represented as:
case class Transfer(quantity : Int, supplier : String, consumer : String)
What is a "functional" algorithm that can maximize the transfers from suppliers to consumers by matching as much supply as possible with demand?
The signature would look something like
def matchMaker(people : Iterable[Person]) : Iterable[Transfer] = ???
Note: the collection type Iterable is not strictly necessary for either the input or output. A Set, List, etc. will suffice.
Example:
If our population is:
val people = Iterable(Person("Alice", 10),
Person("Charlie", -5),
Person("Bob", 4))
The matchmaker algorithm would create an Iterable that could be:
Iterable(Transfer(5, "Alice", "Charlie"))
Or, another possible solution could be
Iterable(Transfer(4, "Bob", "Charlie"),
Transfer(1, "Alice", "Charlie"))
A bad solution would be where only some of the possible transfer was identified:
Iterable(Transfer(4, "Bob", "Charlie")) //Charlie still has demand left
Thank you in advance for your review.

It's not pretty, and has much room for improvement in the details. But it should at least transport the idea of the algorithm and showcase some awesome scala language features like implicits, Ordering trait, tailrecursion, ...
implicit val peopleOrdering: Ordering[Person] = Ordering.by(_.quantity)
def matchMaker(people : Iterable[Person]) : Iterable[Transfer] = {
#tailrec
def matchMaker(sortedPeople : Vector[Person], transfers: List[Transfer]) : Iterable[Transfer] = {
// just to make a point, because I am expecting the incoming vector to be sorted
// If you are confident about your code you probably don't need the require
// However, imo, it is always a good idea to double check
// and require is 'elidable' so won't clutter your program compiled for production
require(in.sorted == in, "Passed person Vector MUST be sorted.")
if (sortedPeople.forall(_.quantity >= 0) || sortedPeople.forall(_.quantity <= 0)) {
// nothing more that can be done
transfers
} else {
val sender = sortedPeople.last
val receiver = sortedPeople.head
val transferQuantity = if (receiver.quantity + sender.quantity >= 0) {
-receiver.quantity
} else {
sender.quantity
}
val transfer = Transfer(transferQuantity, sender.identifier, receiver.identifier)
val nextPeople = sortedPeople.map {
case `sender` => sender.copy(quantity = sender.quantity - transfer.quantity)
case `receiver` => receiver.copy(quantity = receiver.quantity + transfer.quantity)
case other => other
}
matchMaker(nextPeople.sorted, transfer :: transfers)
}
}
matchMaker(people.toVector.sorted, Nil)
}
The algorithm will recursively transfer from the richest person to the person with the highest debt until no one is left with debt. Unless all persons are in debt or penniless from the beginning.

Split into those with supply and those wanting:
val (suppliers, demanders) = people.partition(_.quantity > 0)
Define a function to consume one supplier's quantity, returning an updated list of those wanting, and an updated list of transfers done
def consume(supplier: Person, demanders: Iterable[Person], transfers: List[Transfer]) = {
val (q, ds, ts) = demanders.foldLeft(
(supplier.quantity, Iterable[Person](), transfers))
{
case ((quantity, ds, ts), d) =>
val amount = Math.min(quantity, -d.quantity)
if (amount != 0) (quantity - amount,
ds ++ Iterable(d.copy(quantity = d.quantity + amount)),
Transfer(amount, supplier.identifier, d.identifier) :: ts)
else
(quantity, ds ++ Iterable(d), ts)
}
(ds, ts)
}
Go over the suppliers, consuming each, and passing along the updated demanders and current list of transfers
val (_, transfers) = suppliers.foldLeft((demanders, List[Transfer]()))
{ case ((ds, ts), s) => consume(s, ds, ts) }
transfers
// List(Transfer(5,Alice,Charlie))
An optimisation: drop a demander in consume if their demand is now satisfied. That way it won't be considered for later suppliers.
def consume(supplier: Person, demanders: Iterable[Person], transfers: List[Transfer]) = {
val (q, ds, ts) = demanders.foldLeft((supplier.quantity, Iterable[Person](), transfers))
{
case ((quantity, ds, ts), d) =>
val amount = Math.min(quantity, -d.quantity)
val remaining = d.quantity + amount
if (amount != 0) (quantity - amount,
if (remaining != 0) ds ++ Iterable(d.copy(quantity = remaining))
else ds,
Transfer(amount, supplier.identifier, d.identifier) :: ts)
else (quantity, ds ++ Iterable(d), ts)
}
(ds, ts)
}

Related

How to create two sequence out of one comparing one custom object with another in that sequence?

case class Submission(name: String, plannedDate: Option[LocalDate], revisedDate: Option[LocalDate])
val submission_1 = Submission("Åwesh Care", Some(2020-05-11), Some(2020-06-11))
val submission_2 = Submission("robin Dore", Some(2020-05-11), Some(2020-05-30))
val submission_3 = Submission("AIMS Hospital", Some(2020-01-24), Some(2020-07-30))
val submissions = Seq(submission_1, submission_2, submission_3)
Split the submissions so that the submission with the same plannedDate and/or revisedDate
goes to sameDateGroup and others go to remainder.
val (sameDateGroup, remainder) = someFunction(submissions)
Example result as below:
sameDateGroup should have
Seq(Submission("Åwesh Care", Some(2020-05-11), Some(2020-06-11)),
Submission("robin Dore", Some(2020-05-11), Some(2020-05-30)))
and remainder should have:
Seq(Submission("AIMS Hospital", Some(2020-01-24), Some(2020-07-30)))
So, if I understand the logic here, submission A shares a date with submission B (and both would go in the sameDateGrooup) IFF:
subA.plannedDate == subB.plannedDate
OR subA.plannedDate == subB.revisedDate
OR subA.revisedDate == subB.plannedDate
OR subA.revisedDate == subB.revisedDate
Likewise, and conversely, submission C belongs in the remainder category IFF:
subC.plannedDate // is unique among all planned dates
AND subC.plannedDate // does not exist among all revised dates
AND subC.revisedDate // is unique among all revised dates
AND subC.revisedDate // does not exist among all planned dates
Given all that, I think this does what you're describing.
import java.time.LocalDate
case class Submission(name : String
,plannedDate : Option[LocalDate]
,revisedDate : Option[LocalDate])
val submission_1 = Submission("Åwesh Care"
,Some(LocalDate.parse("2020-05-11"))
,Some(LocalDate.parse("2020-06-11")))
val submission_2 = Submission("robin Dore"
,Some(LocalDate.parse("2020-05-11"))
,Some(LocalDate.parse("2020-05-30")))
val submission_3 = Submission("AIMS Hospital"
,Some(LocalDate.parse("2020-01-24"))
,Some(LocalDate.parse("2020-07-30")))
val submissions = Seq(submission_1, submission_2, submission_3)
val pDates = submissions.groupBy(_.plannedDate)
val rDates = submissions.groupBy(_.revisedDate)
val (sameDateGroup, remainder) = submissions.partition(sub =>
pDates(sub.plannedDate).lengthIs > 1 ||
rDates(sub.revisedDate).lengthIs > 1 ||
pDates.keySet(sub.revisedDate) ||
rDates.keySet(sub.plannedDate))
A simple way to do this is to count the number of matching submissions for each submission in the list, and use that to partition the list:
def matching(s1: Submission, s2: Submission) =
s1.plannedDate == s2.plannedDate || s1.revisedDate == s2.revisedDate
val (sameDateGroup, remainder) =
submissions.partition { s1 =>
submissions.count(s2 => matching(s1, s2)) > 1
}
The matching function can contain whatever specific test is required.
This is O(n^2) so a more sophisticated algorithm would be needed for very long lists.
I think this will do the trick.
I'm sorry, some variablenames are not very meaningful, because I used different case class hen trying this. For some reason I only thought about using .groupBy later. So I'm not really recommend using this, as it is a bit uncomprehensible and can be solved easier with groupby
case class Submission(name: String, plannedDate: Option[String], revisedDate: Option[String])
val l =
List(
Submission("Åwesh Care", Some("2020-05-11"), Some("2020-06-11")),
Submission("robin Dore", Some("2020-05-11"), Some("2020-05-30")),
Submission("AIMS Hospital", Some("2020-01-24"), Some("2020-07-30")))
val t = l
.map((_, 1))
.foldLeft(Map.empty[Option[String], (List[Submission], Int)])((acc, idnTuple) => idnTuple match {
case (idn, count) => {
acc
.get(idn.plannedDate)
.map {
case (mapIdn, mapCount) => acc + (idn.plannedDate -> (idn :: mapIdn, mapCount + count))
}.getOrElse(acc + (idn.plannedDate -> (List(idn), count)))
}})
.values
.partition(_._2 > 1)
val r = (t._1.map(_._1).flatten, t._2.map(_._1).flatten)
println(r)
It basically follows the map-reduce wordcount schema.
If someone sees this, and knows how to do the tuple deconstruction easier, please let me know in the comments.

Scala String Equality Question from Programming Interview

Since I liked programming in Scala, for my Google interview, I asked them to give me a Scala / functional programming style question. The Scala functional style question that I got was as follows:
You have two strings consisting of alphabetic characters as well as a special character representing the backspace symbol. Let's call this backspace character '/'. When you get to the keyboard, you type this sequence of characters, including the backspace/delete character. The solution you are to implement must check if the two sequences of characters produce the same output. For example, "abc", "aa/bc". "abb/c", "abcc/", "/abc", and "//abc" all produce the same output, "abc". Because this is a Scala / functional programming question, you must implement your solution in idiomatic Scala style.
I wrote the following code (it might not be exactly what I wrote, I'm just going off memory). Basically I just go linearly through the string, prepending characters to a list, and then I compare the lists.
def processString(string: String): List[Char] = {
string.foldLeft(List[Char]()){ case(accumulator: List[Char], char: Char) =>
accumulator match {
case head :: tail => if(char != '/') { char :: head :: tail } else { tail }
case emptyList => if(char != '/') { char :: emptyList } else { emptyList }
}
}
}
def solution(string1: String, string2: String): Boolean = {
processString(string1) == processString(string2)
}
So far so good? He then asked for the time complexity and I responded linear time (because you have to process each character once) and linear space (because you have to copy each element into a list). Then he asked me to do it in linear time, but with constant space. I couldn't think of a way to do it that was purely functional. He said to try using a function in the Scala collections library like "zip" or "map" (I explicitly remember him saying the word "zip").
Here's the thing. I think that it's physically impossible to do it in constant space without having any mutable state or side effects. Like I think that he messed up the question. What do you think?
Can you solve it in linear time, but with constant space?
This code takes O(N) time and needs only three integers of extra space:
def solution(a: String, b: String): Boolean = {
def findNext(str: String, pos: Int): Int = {
#annotation.tailrec
def rec(pos: Int, backspaces: Int): Int = {
if (pos == 0) -1
else {
val c = str(pos - 1)
if (c == '/') rec(pos - 1, backspaces + 1)
else if (backspaces > 0) rec(pos - 1, backspaces - 1)
else pos - 1
}
}
rec(pos, 0)
}
#annotation.tailrec
def rec(aPos: Int, bPos: Int): Boolean = {
val ap = findNext(a, aPos)
val bp = findNext(b, bPos)
(ap < 0 && bp < 0) ||
(ap >= 0 && bp >= 0 && (a(ap) == b(bp)) && rec(ap, bp))
}
rec(a.size, b.size)
}
The problem can be solved in linear time with constant extra space: if you scan from right to left, then you can be sure that the /-symbols to the left of the current position cannot influence the already processed symbols (to the right of the current position) in any way, so there is no need to store them.
At every point, you need to know only two things:
Where are you in the string?
How many symbols do you have to throw away because of the backspaces
That makes two integers for storing the positions, and one additional integer for temporary storing the number of accumulated backspaces during the findNext invocation. That's a total of three integers of space overhead.
Intuition
Here is my attempt to formulate why the right-to-left scan gives you a O(1) algorithm:
The future cannot influence the past, therefore there is no need to remember the future.
The "natural time" in this problem flows from left to right. Therefore, if you scan from right to left, you are moving "from the future into the past", and therefore you don't need to remember the characters to the right of your current position.
Tests
Here is a randomized test, which makes me pretty sure that the solution is actually correct:
val rng = new util.Random(0)
def insertBackspaces(s: String): String = {
val n = s.size
val insPos = rng.nextInt(n)
val (pref, suff) = s.splitAt(insPos)
val c = ('a' + rng.nextInt(26)).toChar
pref + c + "/" + suff
}
def prependBackspaces(s: String): String = {
"/" * rng.nextInt(4) + s
}
def addBackspaces(s: String): String = {
var res = s
for (i <- 0 until 8)
res = insertBackspaces(res)
prependBackspaces(res)
}
for (i <- 1 until 1000) {
val s = "hello, world"
val t = "another string"
val s1 = addBackspaces(s)
val s2 = addBackspaces(s)
val t1 = addBackspaces(t)
val t2 = addBackspaces(t)
assert(solution(s1, s2))
assert(solution(t1, t2))
assert(!solution(s1, t1))
assert(!solution(s1, t2))
assert(!solution(s2, t1))
assert(!solution(s2, t2))
if (i % 100 == 0) {
println(s"Examples:\n$s1\n$s2\n$t1\n$t2")
}
}
A few examples that the test generates:
Examples:
/helly/t/oj/m/, wd/oi/g/x/rld
///e/helx/lc/rg//f/o, wosq//rld
/anotl/p/hhm//ere/t/ strih/nc/g
anotx/hb/er sw/p/tw/l/rip/j/ng
Examples:
//o/a/hellom/, i/wh/oe/q/b/rld
///hpj//est//ldb//y/lok/, world
///q/gd/h//anothi/k/eq/rk/ string
///ac/notherli// stri/ig//ina/n/g
Examples:
//hnn//ello, t/wl/oxnh///o/rld
//helfo//u/le/o, wna//ova//rld
//anolq/l//twl//her n/strinhx//g
/anol/tj/hq/er swi//trrq//d/ing
Examples:
//hy/epe//lx/lo, wr/v/t/orlc/d
f/hk/elv/jj//lz/o,wr// world
/anoto/ho/mfh///eg/r strinbm//g
///ap/b/notk/l/her sm/tq/w/rio/ng
Examples:
///hsm/y//eu/llof/n/, worlq/j/d
///gx//helf/i/lo, wt/g/orn/lq/d
///az/e/notm/hkh//er sm/tb/rio/ng
//b/aen//nother v/sthg/m//riv/ng
Seems to work just fine. So, I'd say that the Google-guy did not mess up, looks like a perfectly valid question.
You don't have to create the output to find the answer. You can iterate the two sequences at the same time and stop on the first difference. If you find no difference and both sequences terminate at the same time, they're equal, otherwise they're different.
But now consider sequences such as this one: aaaa/// to compare with a. You need to consume 6 elements from the left sequence and one element from the right sequence before you can assert that they're equal. That means that you would need to keep at least 5 elements in memory until you can verify that they're all deleted. But what if you iterated elements from the end? You would then just need to count the number of backspaces and then just ignoring as many elements as necessary in the left sequence without requiring to keep them in memory since you know they won't be present in the final output. You can achieve O(1) memory using these two tips.
I tried it and it seems to work:
def areEqual(s1: String, s2: String) = {
def charAt(s: String, index: Int) = if (index < 0) '#' else s(index)
#tailrec
def recSol(i1: Int, backspaces1: Int, i2: Int, backspaces2: Int): Boolean = (charAt(s1, i1), charAt(s2, i2)) match {
case ('/', _) => recSol(i1 - 1, backspaces1 + 1, i2, backspaces2)
case (_, '/') => recSol(i1, backspaces1, i2 - 1, backspaces2 + 1)
case ('#' , '#') => true
case (ch1, ch2) =>
if (backspaces1 > 0) recSol(i1 - 1, backspaces1 - 1, i2 , backspaces2 )
else if (backspaces2 > 0) recSol(i1 , backspaces1 , i2 - 1, backspaces2 - 1)
else ch1 == ch2 && recSol(i1 - 1, backspaces1 , i2 - 1, backspaces2 )
}
recSol(s1.length - 1, 0, s2.length - 1, 0)
}
Some tests (all pass, let me know if you have more edge cases in mind):
// examples from the question
val inputs = Array("abc", "aa/bc", "abb/c", "abcc/", "/abc", "//abc")
for (i <- 0 until inputs.length; j <- 0 until inputs.length) {
assert(areEqual(inputs(i), inputs(j)))
}
// more deletions than required
assert(areEqual("a///////b/c/d/e/b/b", "b"))
assert(areEqual("aa/a/a//a//a///b", "b"))
assert(areEqual("a/aa///a/b", "b"))
// not enough deletions
assert(!areEqual("aa/a/a//a//ab", "b"))
// too many deletions
assert(!areEqual("a", "a/"))
PS: just a few notes on the code itself:
Scala type inference is good enough so that you can drop types in the partial function inside your foldLeft
Nil is the idiomatic way to refer to the empty list case
Bonus:
I had something like Tim's soltion in mind before implementing my idea, but I started early with pattern matching on characters only and it didn't fit well because some cases require the number of backspaces. In the end, I think a neater way to write it is a mix of pattern matching and if conditions. Below is my longer original solution, the one I gave above was refactored laater:
def areEqual(s1: String, s2: String) = {
#tailrec
def recSol(c1: Cursor, c2: Cursor): Boolean = (c1.char, c2.char) match {
case ('/', '/') => recSol(c1.next, c2.next)
case ('/' , _) => recSol(c1.next, c2 )
case (_ , '/') => recSol(c1 , c2.next)
case ('#' , '#') => true
case (a , b) if (a == b) => recSol(c1.next, c2.next)
case _ => false
}
recSol(Cursor(s1, s1.length - 1), Cursor(s2, s2.length - 1))
}
private case class Cursor(s: String, index: Int) {
val char = if (index < 0) '#' else s(index)
def next = {
#tailrec
def recSol(index: Int, backspaces: Int): Cursor = {
if (index < 0 ) Cursor(s, index)
else if (s(index) == '/') recSol(index - 1, backspaces + 1)
else if (backspaces > 1) recSol(index - 1, backspaces - 1)
else Cursor(s, index - 1)
}
recSol(index, 0)
}
}
If the goal is minimal memory footprint, it's hard to argue against iterators.
def areSame(a :String, b :String) :Boolean = {
def getNext(ci :Iterator[Char], ignore :Int = 0) : Option[Char] =
if (ci.hasNext) {
val c = ci.next()
if (c == '/') getNext(ci, ignore+1)
else if (ignore > 0) getNext(ci, ignore-1)
else Some(c)
} else None
val ari = a.reverseIterator
val bri = b.reverseIterator
1 to a.length.max(b.length) forall(_ => getNext(ari) == getNext(bri))
}
On the other hand, when arguing FP principals it's hard to defend iterators, since they're all about maintaining state.
Here is a version with a single recursive function and no additional classes or libraries. This is linear time and constant memory.
def compare(a: String, b: String): Boolean = {
#tailrec
def loop(aIndex: Int, aDeletes: Int, bIndex: Int, bDeletes: Int): Boolean = {
val aVal = if (aIndex < 0) None else Some(a(aIndex))
val bVal = if (bIndex < 0) None else Some(b(bIndex))
if (aVal.contains('/')) {
loop(aIndex - 1, aDeletes + 1, bIndex, bDeletes)
} else if (aDeletes > 0) {
loop(aIndex - 1, aDeletes - 1, bIndex, bDeletes)
} else if (bVal.contains('/')) {
loop(aIndex, 0, bIndex - 1, bDeletes + 1)
} else if (bDeletes > 0) {
loop(aIndex, 0, bIndex - 1, bDeletes - 1)
} else {
aVal == bVal && (aVal.isEmpty || loop(aIndex - 1, 0, bIndex - 1, 0))
}
}
loop(a.length - 1, 0, b.length - 1, 0)
}

Understanding performance of a tailrec annotated recursive method in scala

Consider the following method - which has been verified to conform to the proper tail recursion :
#tailrec
def getBoundaries(grps: Seq[(BigDecimal, Int)], groupSize: Int, curSum: Int = 0, curOffs: Seq[BigDecimal] = Seq.empty[BigDecimal]): Seq[BigDecimal] = {
if (grps.isEmpty) curOffs
else {
val (id, cnt) = grps.head
val newSum = curSum + cnt.toInt
if (newSum%50==0) { println(s"id=$id newsum=$newSum") }
if (newSum >= groupSize) {
getBoundaries(grps.tail, groupSize, 0, curOffs :+ id) // r1
} else {
getBoundaries(grps.tail, groupSize, newSum, curOffs) // r2
}
}
}
This is running very slowly - about 75 loops per second. When I hit the stacktrace (a nice feature of Intellij) almost every time the line that is currently being invoked is the second tail-recursive call r2. That fact makes me suspicious of the purported "scala unwraps the recursive calls into a while loop". If the unwrapping were occurring then why are we seeing so much time in the invocations themselves?
Beyond having a properly structured tail recursive method are there other considerations to get a recursive routine have performance approaching a direct iteration?
The performance will depend on the underlying type of the Seq.
If it is List then the problem is appending (:+) to the List because this gets very slow with long lists because it has to scan the whole list to find the end.
One solution is to prepend to the list (+:) each time and then reverse at the end. This can give very significant performance improvements, because adding to the start of a list is very quick.
Other Seq types will have different performance characteristics, but you can convert to a List before the recursive call so that you know how it is going to perform.
Here is sample code
def getBoundaries(grps: Seq[(BigDecimal, Int)], groupSize: Int): Seq[BigDecimal] = {
#tailrec
def loop(grps: List[(BigDecimal, Int)], curSum: Int, curOffs: List[BigDecimal]): List[BigDecimal] =
if (grps.isEmpty) curOffs
else {
val (id, cnt) = grps.head
val newSum = curSum + cnt.toInt
if (newSum >= groupSize) {
loop(grps.tail, 0, id +: curOffs) // r1
} else {
loop(grps.tail, newSum, curOffs) // r2
}
}
loop(grps.toList, 0, Nil).reverse
}
This version gives 10x performance improvement over the original code using the test data provided by the questioner in his own answer to the question.
The issue is not in the recursion but instead in the array manipulation . With the following testcase it runs at about 200K recursions per second
type Fgroups = Seq[(BigDecimal, Int)]
test("testGetBoundaries") {
val N = 200000
val grps: Fgroups = (N to 1 by -1).flatMap { x => Array.tabulate(x % 20){ x2 => (BigDecimal(x2 * 1e9), 1) }}
val sgrps = grps.sortWith { case (a, b) =>
a._1.longValue.compare(b._1.longValue) < 0
}
val bb = getBoundaries(sgrps, 100 )
println(bb.take(math.min(50,bb.length)).mkString(","))
assert(bb.length==1900)
}
My production data sample has a similar number of entries (Array with 233K rows ) but runs at 3 orders of magnitude more slowly. I am looking into the tail operation and other culprits now.
Update The following reference from Alvin Alexander indicates that the tail operation should be v fast for immutable collections - but deadly slow for long mutable ones - including Array's !
https://alvinalexander.com/scala/understanding-performance-scala-collections-classes-methods-cookbook
Wow! I had no idea about the performance implications of using mutable collections in scala!
Update By adding code to convert the Array to an (immutable) Seq I see the 3 orders of magnitude performance improvement on the production data sample:
val grps = if (grpsIn.isInstanceOf[mutable.WrappedArray[_]] || grpsIn.isInstanceOf[Array[_]]) {
Seq(grpsIn: _*)
} else grpsIn
The (now fast ~200K/sec) final code is:
type Fgroups = Seq[(BigDecimal, Int)]
val cntr = new java.util.concurrent.atomic.AtomicInteger
#tailrec
def getBoundaries(grpsIn: Fgroups, groupSize: Int, curSum: Int = 0, curOffs: Seq[BigDecimal] = Seq.empty[BigDecimal]): Seq[BigDecimal] = {
val grps = if (grpsIn.isInstanceOf[mutable.WrappedArray[_]] || grpsIn.isInstanceOf[Array[_]]) {
Seq(grpsIn: _*)
} else grpsIn
if (grps.isEmpty) curOffs
else {
val (id, cnt) = grps.head
val newSum = curSum + cnt.toInt
if (cntr.getAndIncrement % 500==0) { println(s"[${cntr.get}] id=$id newsum=$newSum") }
if (newSum >= groupSize) {
getBoundaries(grps.tail, groupSize, 0, curOffs :+ id)
} else {
getBoundaries(grps.tail, groupSize, newSum, curOffs)
}
}
}

Scala Graph Cycle Detection Algo 'return' needed?

I have implemented a small cycle detection algorithm for a DAG in Scala.
The 'return' bothers me - I'd like to have a version without the return...possible?
def isCyclic() : Boolean = {
lock.readLock().lock()
try {
nodes.foreach(node => node.marker = 1)
nodes.foreach(node => {if (1 == node.marker && visit(node)) return true})
} finally {
lock.readLock().unlock()
}
false
}
private def visit(node: MyNode): Boolean = {
node.marker = 3
val nodeId = node.id
val children = vertexMap.getChildren(nodeId).toList.map(nodeId => id2nodeMap(nodeId))
children.foreach(child => {
if (3 == child.marker || (1 == child.marker && visit(child))) return true
})
node.marker = 2
false
}
Yes, by using '.find' instead of 'foreach' + 'return':
http://www.scala-lang.org/api/current/index.html#scala.collection.immutable.Seq
def isCyclic() : Boolean = {
def visit(node: MyNode): Boolean = {
node.marker = 3
val nodeId = node.id
val children = vertexMap.getChildren(nodeId).toList.map(nodeId => id2nodeMap(nodeId))
val found = children.exists(child => (3 == child.marker || (1 == child.marker && visit(child))))
node.marker = 2
found
}
lock.readLock().lock()
try {
nodes.foreach(node => node.marker = 1)
nodes.exists(node => node.marker && visit(node))
} finally {
lock.readLock().unlock()
}
}
Summary:
I have originated two solutions as generic FP functions which detect cycles within a directed graph. And per your implied preference, the use of an early return to escape the recursive function has been eliminated. The first, isCyclic, simply returns a Boolean as soon as the DFS (Depth First Search) repeats a node visit. The second, filterToJustCycles, returns a copy of the input Map filtered down to just the nodes involved in any/all cycles, and returns an empty Map when no cycles are found.
Details:
For the following, please Consider a directed graph encoded as such:
val directedGraphWithCyclesA: Map[String, Set[String]] =
Map(
"A" -> Set("B", "E", "J")
, "B" -> Set("E", "F")
, "C" -> Set("I", "G")
, "D" -> Set("G", "L")
, "E" -> Set("H")
, "F" -> Set("G")
, "G" -> Set("L")
, "H" -> Set("J", "K")
, "I" -> Set("K", "L")
, "J" -> Set("B")
, "K" -> Set("B")
)
In both functions below, the type parameter N refers to whatever "Node" type you care to provide. It is important the provided "Node" type be both immutable and have stable equals and hashCode implementations (all of which occur automatically with use of immutable case classes).
The first function, isCyclic, is a similar in nature to the version of the solution provided by #the-archetypal-paul. It assumes the directed graph has been transformed into a Map[N, Set[N]] where N is the identity of a node in the graph.
If you need to see how to generically transform your custom directed graph implementation into a Map[N, Set[N]], I have outlined a generic solution towards the end of this answer.
Calling the isCyclic function as such:
val isCyclicResult = isCyclic(directedGraphWithCyclesA)
will return:
`true`
No further information is provided. And the DFS (Depth First Search) is aborted at detection of the first repeated visit to a node.
def isCyclic[N](nsByN: Map[N, Set[N]]) : Boolean = {
def hasCycle(nAndNs: (N, Set[N]), visited: Set[N] = Set[N]()): Boolean =
if (visited.contains(nAndNs._1))
true
else
nAndNs._2.exists(
n =>
nsByN.get(n) match {
case Some(ns) =>
hasCycle((n, ns), visited + nAndNs._1)
case None =>
false
}
)
nsByN.exists(hasCycle(_))
}
The second function, filterToJustCycles, uses the set reduction technique to recursively filter away unreferenced root nodes in the Map. If there are no cycles in the supplied graph of nodes, then .isEmpty will be true on the returned Map. If however, there are any cycles, all of the nodes required to participate in any of the cycles are returned with all of the other non-cycle participating nodes filtered away.
Again, if you need to see how to generically transform your custom directed graph implementation into a Map[N, Set[N]], I have outlined a generic solution towards the end of this answer.
Calling the filterToJustCycles function as such:
val cycles = filterToJustCycles(directedGraphWithCyclesA)
will return:
Map(E -> Set(H), J -> Set(B), B -> Set(E), H -> Set(J, K), K -> Set(B))
It's trivial to then create a traversal across this Map to produce any or all of the various cycle pathways through the remaining nodes.
def filterToJustCycles[N](nsByN: Map[N, Set[N]]): Map[N, Set[N]] = {
def recursive(nsByNRemaining: Map[N, Set[N]], referencedRootNs: Set[N] = Set[N]()): Map[N, Set[N]] = {
val (referencedRootNsNew, nsByNRemainingNew) = {
val referencedRootNsNewTemp =
nsByNRemaining.values.flatten.toSet.intersect(nsByNRemaining.keySet)
(
referencedRootNsNewTemp
, nsByNRemaining.collect {
case (t, ts) if referencedRootNsNewTemp.contains(t) && referencedRootNsNewTemp.intersect(ts.toSet).nonEmpty =>
(t, referencedRootNsNewTemp.intersect(ts.toSet))
}
)
}
if (referencedRootNsNew == referencedRootNs)
nsByNRemainingNew
else
recursive(nsByNRemainingNew, referencedRootNsNew)
}
recursive(nsByN)
}
So, how does one generically transform a custom directed graph implementation into a Map[N, Set[N]]?
In essence, "Go Scala case classes!"
First, let's define an example case of a real node in a pre-existing directed graph:
class CustomNode (
val equipmentIdAndType: String //"A387.Structure" - identity is embedded in a string and must be parsed out
, val childrenNodes: List[CustomNode] //even through Set is implied, for whatever reason this implementation used List
, val otherImplementationNoise: Option[Any] = None
)
Again, this is just an example. Yours could involve subclassing, delegation, etc. The purpose is to have access to a something that will be able to fetch the two essential things to make this work:
the identity of a node; i.e. something to distinguish it and makes it unique from all other nodes in the same directed graph
a collection of the identities of the immediate children of a specific node - if the specific node doesn't have any children, this collection will be empty
Next, we define a helper object, DirectedGraph, which will contain the infrastructure for the conversion:
Node: an adapter trait which will wrap CustomNode
toMap: a function which will take a List[CustomNode] and convert it to a Map[Node, Set[Node]] (which is type equivalent to our target type of Map[N, Set[N]])
Here's the code:
object DirectedGraph {
trait Node[S, I] {
def source: S
def identity: I
def children: Set[I]
}
def toMap[S, I, N <: Node[S, I]](ss: List[S], transformSToN: S => N): Map[N, Set[N]] = {
val (ns, nByI) = {
val iAndNs =
ss.map(
s => {
val n =
transformSToN(s)
(n.identity, n)
}
)
(iAndNs.map(_._2), iAndNs.toMap)
}
ns.map(n => (n, n.children.map(nByI(_)))).toMap
}
}
Now, we must generate the actual adapter, CustomNodeAdapter, which will wrap each CustomNode instance. This adapter uses a case class in a very specific way; i.e. specifying two constructor parameters lists. It ensures the case class conforms to a Set's requirement that a Set member have correct equals and hashCode implementations. For more details on why and how to use a case class this way, please see this StackOverflow question and answer:
object CustomNodeAdapter extends (CustomNode => CustomNodeAdapter) {
def apply(customNode: CustomNode): CustomNodeAdapter =
new CustomNodeAdapter(fetchIdentity(customNode))(customNode) {}
def fetchIdentity(customNode: CustomNode): String =
fetchIdentity(customNode.equipmentIdAndType)
def fetchIdentity(eiat: String): String =
eiat.takeWhile(char => char.isLetter || char.isDigit)
}
abstract case class CustomNodeAdapter(identity: String)(customNode: CustomNode) extends DirectedGraph.Node[CustomNode, String] {
val children =
customNode.childrenNodes.map(CustomNodeAdapter.fetchIdentity).toSet
val source =
customNode
}
We now have the infrastructure in place. Let's define a "real world" directed graph consisting of CustomNode:
val customNodeDirectedGraphWithCyclesA =
List(
new CustomNode("A.x", List("B.a", "E.a", "J.a"))
, new CustomNode("B.x", List("E.b", "F.b"))
, new CustomNode("C.x", List("I.c", "G.c"))
, new CustomNode("D.x", List("G.d", "L.d"))
, new CustomNode("E.x", List("H.e"))
, new CustomNode("F.x", List("G.f"))
, new CustomNode("G.x", List("L.g"))
, new CustomNode("H.x", List("J.h", "K.h"))
, new CustomNode("I.x", List("K.i", "L.i"))
, new CustomNode("J.x", List("B.j"))
, new CustomNode("K.x", List("B.k"))
, new CustomNode("L.x", Nil)
)
Finally, we can now do the conversion which looks like this:
val transformCustomNodeDirectedGraphWithCyclesA =
DirectedGraph.toMap[CustomNode, String, CustomNodeAdapter](customNodes1, customNode => CustomNodeAdapter(customNode))
And we can take transformCustomNodeDirectedGraphWithCyclesA, which is of type Map[CustomNodeAdapter,Set[CustomNodeAdapter]], and submit it to the two original functions.
Calling the isCyclic function as such:
val isCyclicResult = isCyclic(transformCustomNodeDirectedGraphWithCyclesA)
will return:
`true`
Calling the filterToJustCycles function as such:
val cycles = filterToJustCycles(transformCustomNodeDirectedGraphWithCyclesA)
will return:
Map(
CustomNodeAdapter(B) -> Set(CustomNodeAdapter(E))
, CustomNodeAdapter(E) -> Set(CustomNodeAdapter(H))
, CustomNodeAdapter(H) -> Set(CustomNodeAdapter(J), CustomNodeAdapter(K))
, CustomNodeAdapter(J) -> Set(CustomNodeAdapter(B))
, CustomNodeAdapter(K) -> Set(CustomNodeAdapter(B))
)
And if needed, this Map can then be converted back to Map[CustomNode, List[CustomNode]]:
cycles.map {
case (customNodeAdapter, customNodeAdapterChildren) =>
(customNodeAdapter.source, customNodeAdapterChildren.toList.map(_.source))
}
If you have any questions, issues or concerns, please let me know and I will address them ASAP.
I think the problem can be solved without changing the state of the node with the marker field. The following is a rough code of what i think the isCyclic should look like. I am currently storing the node objects which are visited instead you can store the node ids if the node doesnt have equality based on node id.
def isCyclic() : Boolean = nodes.exists(hasCycle(_, HashSet()))
def hasCycle(node:Node, visited:Seq[Node]) = visited.contains(node) || children(node).exists(hasCycle(_, node +: visited))
def children(node:Node) = vertexMap.getChildren(node.id).toList.map(nodeId => id2nodeMap(nodeId))
Answer added just to show that the mutable-visited isn't too unreadable either (untested, though!)
def isCyclic() : Boolean =
{
var visited = HashSet()
def hasCycle(node:Node) = {
if (visited.contains(node)) {
true
} else {
visited :+= node
children(node).exists(hasCycle(_))
}
}
nodes.exists(hasCycle(_))
}
def children(node:Node) = vertexMap.getChildren(node.id).toList.map(nodeId => id2nodeMap(nodeId))
If p = node => node.marker==1 && visit(node) and assuming nodes is a List you can pick any of the following:
nodes.filter(p).length>0
nodes.count(p)>0
nodes.exists(p) (I think the most relevant)
I am not sure of the relative complexity of each method and would appreciate a comment from fellow members of the community

Scala: Detecting a Straight in a 5-card Poker hand using pattern matching

For those who don't know what a 5-card Poker Straight is: http://en.wikipedia.org/wiki/List_of_poker_hands#Straight
I'm writing a small Poker simulator in Scala to help me learn the language, and I've created a Hand class with 5 ordered Cards in it. Each Card has a Rank and Suit, both defined as Enumerations. The Hand class has methods to evaluate the hand rank, and one of them checks whether the hand contains a Straight (we can ignore Straight Flushes for the moment). I know there are a few nice algorithms for determining a Straight, but I wanted to see whether I could design something with Scala's pattern matching, so I came up with the following:
def isStraight() = {
def matchesStraight(ranks: List[Rank.Value]): Boolean = ranks match {
case head :: Nil => true
case head :: tail if (Rank(head.id + 1) == tail.head) => matchesStraight(tail)
case _ => false
}
matchesStraight(cards.map(_.rank).toList)
}
That works fine and is fairly readable, but I was wondering if there is any way to get rid of that if. I'd imagine something like the following, though I can't get it to work:
private def isStraight() = {
def matchesStraight(ranks: List[Rank.Value]): Boolean = ranks match {
case head :: Nil => true
case head :: next(head.id + 1) :: tail => matchesStraight(next :: tail)
case _ => false
}
matchesStraight(cards.map(_.rank).toList)
}
Any ideas? Also, as a side question, what is the general opinion on the inner matchesStraight definition? Should this rather be private or perhaps done in a different way?
You can't pass information to an extractor, and you can't use information from one value returned in another, except on the if statement -- which is there to cover all these cases.
What you can do is create your own extractors to test these things, but it won't gain you much if there isn't any reuse.
For example:
class SeqExtractor[A, B](f: A => B) {
def unapplySeq(s: Seq[A]): Option[Seq[A]] =
if (s map f sliding 2 forall { case Seq(a, b) => a == b } ) Some(s)
else None
}
val Straight = new SeqExtractor((_: Card).rank)
Then you can use it like this:
listOfCards match {
case Straight(cards) => true
case _ => false
}
But, of course, all that you really want is that if statement in SeqExtractor. So, don't get too much in love with a solution, as you may miss simpler ways of doing stuff.
You could do something like:
val ids = ranks.map(_.id)
ids.max - ids.min == 4 && ids.distinct.length == 5
Handling aces correctly requires a bit of work, though.
Update: Here's a much better solution:
(ids zip ids.tail).forall{case (p,q) => q%13==(p+1)%13}
The % 13 in the comparison handles aces being both rank 1 and rank 14.
How about something like:
def isStraight(cards:List[Card]) = (cards zip cards.tail) forall { case (c1,c2) => c1.rank+1 == c2.rank}
val cards = List(Card(1),Card(2),Card(3),Card(4))
scala> isStraight(cards)
res2: Boolean = true
This is a completely different approache, but it does use pattern matching. It produces warnings in the match clause which seem to indicate that it shouldn't work. But it actually produces the correct results:
Straight !!! 34567
Straight !!! 34567
Sorry no straight this time
I ignored the Suites for now and I also ignored the possibility of an ace under a 2.
abstract class Rank {
def value : Int
}
case class Next[A <: Rank](a : A) extends Rank {
def value = a.value + 1
}
case class Two() extends Rank {
def value = 2
}
class Hand(a : Rank, b : Rank, c : Rank, d : Rank, e : Rank) {
val cards = List(a, b, c, d, e).sortWith(_.value < _.value)
}
object Hand{
def unapply(h : Hand) : Option[(Rank, Rank, Rank, Rank, Rank)] = Some((h.cards(0), h.cards(1), h.cards(2), h.cards(3), h.cards(4)))
}
object Poker {
val two = Two()
val three = Next(two)
val four = Next(three)
val five = Next(four)
val six = Next(five)
val seven = Next(six)
val eight = Next(seven)
val nine = Next(eight)
val ten = Next(nine)
val jack = Next(ten)
val queen = Next(jack)
val king = Next(queen)
val ace = Next(king)
def main(args : Array[String]) {
val simpleStraight = new Hand(three, four, five, six, seven)
val unsortedStraight = new Hand(four, seven, three, six, five)
val notStraight = new Hand (two, two, five, five, ace)
printIfStraight(simpleStraight)
printIfStraight(unsortedStraight)
printIfStraight(notStraight)
}
def printIfStraight[A](h : Hand) {
h match {
case Hand(a: A , b : Next[A], c : Next[Next[A]], d : Next[Next[Next[A]]], e : Next[Next[Next[Next[A]]]]) => println("Straight !!! " + a.value + b.value + c.value + d.value + e.value)
case Hand(a,b,c,d,e) => println("Sorry no straight this time")
}
}
}
If you are interested in more stuff like this google 'church numerals scala type system'
How about something like this?
def isStraight = {
cards.map(_.rank).toList match {
case first :: second :: third :: fourth :: fifth :: Nil if
first.id == second.id - 1 &&
second.id == third.id - 1 &&
third.id == fourth.id - 1 &&
fourth.id == fifth.id - 1 => true
case _ => false
}
}
You're still stuck with the if (which is in fact larger) but there's no recursion or custom extractors (which I believe you're using incorrectly with next and so is why your second attempt doesn't work).
If you're writing a poker program, you are already check for n-of-a-kind. A hand is a straight when it has no n-of-a-kinds (n > 1) and the different between the minimum denomination and the maximum is exactly four.
I was doing something like this a few days ago, for Project Euler problem 54. Like you, I had Rank and Suit as enumerations.
My Card class looks like this:
case class Card(rank: Rank.Value, suit: Suit.Value) extends Ordered[Card] {
def compare(that: Card) = that.rank compare this.rank
}
Note I gave it the Ordered trait so that we can easily compare cards later. Also, when parsing the hands, I sorted them from high to low using sorted, which makes assessing values much easier.
Here is my straight test which returns an Option value depending on whether it's a straight or not. The actual return value (a list of Ints) is used to determine the strength of the hand, the first representing the hand type from 0 (no pair) to 9 (straight flush), and the others being the ranks of any other cards in the hand that count towards its value. For straights, we're only worried about the highest ranking card.
Also, note that you can make a straight with Ace as low, the "wheel", or A2345.
case class Hand(cards: Array[Card]) {
...
def straight: Option[List[Int]] = {
if( cards.sliding(2).forall { case Array(x, y) => (y compare x) == 1 } )
Some(5 :: cards(0).rank.id :: 0 :: 0 :: 0 :: 0 :: Nil)
else if ( cards.map(_.rank.id).toList == List(12, 3, 2, 1, 0) )
Some(5 :: cards(1).rank.id :: 0 :: 0 :: 0 :: 0 :: Nil)
else None
}
}
Here is a complete idiomatic Scala hand classifier for all hands (handles 5-high straights):
case class Card(rank: Int, suit: Int) { override def toString = s"${"23456789TJQKA" rank}${"♣♠♦♥" suit}" }
object HandType extends Enumeration {
val HighCard, OnePair, TwoPair, ThreeOfAKind, Straight, Flush, FullHouse, FourOfAKind, StraightFlush = Value
}
case class Hand(hand: Set[Card]) {
val (handType, sorted) = {
def rankMatches(card: Card) = hand count (_.rank == card.rank)
val groups = hand groupBy rankMatches mapValues {_.toList.sorted}
val isFlush = (hand groupBy {_.suit}).size == 1
val isWheel = "A2345" forall {r => hand exists (_.rank == Card.ranks.indexOf(r))} // A,2,3,4,5 straight
val isStraight = groups.size == 1 && (hand.max.rank - hand.min.rank) == 4 || isWheel
val (isThreeOfAKind, isOnePair) = (groups contains 3, groups contains 2)
val handType = if (isStraight && isFlush) HandType.StraightFlush
else if (groups contains 4) HandType.FourOfAKind
else if (isThreeOfAKind && isOnePair) HandType.FullHouse
else if (isFlush) HandType.Flush
else if (isStraight) HandType.Straight
else if (isThreeOfAKind) HandType.ThreeOfAKind
else if (isOnePair && groups(2).size == 4) HandType.TwoPair
else if (isOnePair) HandType.OnePair
else HandType.HighCard
val kickers = ((1 until 5) flatMap groups.get).flatten.reverse
require(hand.size == 5 && kickers.size == 5)
(handType, if (isWheel) (kickers takeRight 4) :+ kickers.head else kickers)
}
}
object Hand {
import scala.math.Ordering.Implicits._
implicit val rankOrdering = Ordering by {hand: Hand => (hand.handType, hand.sorted)}
}