Different behavior when the declared type is different (Set vs TreeSet) - scala

import scala.collection.immutable.TreeSet
var set = TreeSet(5,4,3,2,1)
println(set)
val diffSet: TreeSet[Int] = set
// if I change the line above to: val diffSet: Set[Int] = set
// the loops below print an unsorted set.
for (i <- diffSet; x = i) {
println(i)
}
println("-" * 20)
// the loop above translates to the code below, which prints the same result
val temp = diffSet.map(i => (i, i))
for ((i, x) <- temp) {
println(i)
}
My question: suppose I define a method like this:
def genSet:Set[Int] = {
TreeSet(5, 4, 3, 2, 1)
}
and then use it in a for loop:
for (i <- genSet; x = i + 1) {
println(x)
}
The result is unsorted. How can I fix this without changing genSet's return type? If I write the loop as below it works fine, but I would like to keep the code style above.
for (i <- genSet) {
val x = i + 1
println(x)
}

Why the map version winds up unsorted
The map method (called with a function that we'll call func) takes an implicit CanBuildFrom parameter. That parameter takes into account the type of the collection that map is being called on, in addition to the type that func returns, to choose an appropriate return type. This is what makes Map.map[Int] or BitSet.map[String] do the right thing (return a general-purpose collection), while Map.map[(String,Int)] or BitSet.map[Int] also do the right thing (return a Map and a BitSet, respectively).
The CanBuildFrom is chosen at compile time, so it must be chosen based on the static type of the set that you call map on (the type the compiler knows about at compile time). The static type of set is TreeSet, but the static type of diffSet, once it is declared as Set[Int], is Set. The dynamic type of both (at runtime) is TreeSet.
When you call map on set (a TreeSet), the compiler chooses immutable.this.SortedSet.canBuildFrom[Int](math.this.Ordering.Int) as the CanBuildFrom.
When you call map on diffSet (a Set), the compiler chooses immutable.this.Set.canBuildFrom[Int] as the CanBuildFrom.
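To see the effect of the static type in isolation, here is a minimal sketch (the object and value names are made up for illustration). It maps over the same TreeSet twice, once through a TreeSet-typed reference and once through a Set-typed one. Only the first result is statically known to be sorted; the second is just a Set[Int], and its iteration order should not be relied on.

```scala
import scala.collection.immutable.{SortedSet, TreeSet}

object StaticTypeDemo {
  val sorted: TreeSet[Int] = TreeSet(5, 4, 3, 2, 1)
  val widened: Set[Int] = sorted // same object, wider static type

  // The static type is a sorted set, so the result is built as a sorted set.
  val mappedSorted: SortedSet[Int] = sorted.map(_ + 1)

  // The static type is Set[Int], so the result is only known to be a Set[Int];
  // its iteration order is whatever the general Set implementation gives.
  val mappedWidened: Set[Int] = widened.map(_ + 1)
}
```

Both results contain the same elements; only mappedSorted is guaranteed to iterate as 2, 3, 4, 5, 6.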
Why the for version winds up unsorted
The loop
for (i <- genSet; x = i + 1) {
println(x)
}
desugars into
genSet.map(((i) => {
val x = i.$plus(1);
scala.Tuple2(i, x)
})).foreach(((x$1) => x$1: @scala.unchecked match {
case scala.Tuple2((i @ _), (x @ _)) => println(x)
}))
The desugared version includes a call to map, which uses the Set CanBuildFrom described above and therefore builds an unsorted set.
On the other hand, the loop
for (i <- genSet) {
val x = i + 1
println(x)
}
desugars into
genSet.foreach(((i) => {
val x = i.$plus(1);
println(x)
}))
Which doesn't use a CanBuildFrom at all, since no new collection is being returned.

Set does not guarantee ordering. Even if the underlying class is a TreeSet, if the expected result is a Set you'll lose the ordering in the first transformation you do.
If you want ordering, do not use Set. I suggest, say, SortedSet.

Change the signature of genSet to return a SortedSet:
def genSet:SortedSet[Int] = {
TreeSet(5, 4, 3, 2, 1)
}
This is probably some sort of bug. I would have expected your code to work too.
I think map is the culprit. This results in the same behavior:
for (i <- genSet.map(_ + 1)) { println(i) }
And for (i <- genSet; x = i + 1) behaves much like for (x <- genSet.map({i => i + 1})): both go through Set's map.

You can do:
scala> for (i <- genSet.view; x = i + 1) println(x)
2
3
4
5
6
Although, it's the type of trick that when you look at it after a few months, you may wonder why you added .view ...
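Another way to keep the original loop shape without touching genSet's return type is to iterate over something whose map preserves encounter order. The following is only a sketch, assuming genSet is defined as in the question; the object and value names are hypothetical.

```scala
import scala.collection.immutable.TreeSet

object KeepOrderDemo {
  def genSet: Set[Int] = TreeSet(5, 4, 3, 2, 1)

  // toSeq walks the TreeSet in its sorted order, and mapping a Seq keeps
  // that order, so the intermediate (i, x) pairs stay sorted.
  val viaSeq: Seq[Int] = for (i <- genSet.toSeq; x = i + 1) yield x

  // An Iterator is also mapped element by element in traversal order.
  val viaIterator: List[Int] =
    (for (i <- genSet.iterator; x = i + 1) yield x).toList
}
```

Compared with .view, calling .toSeq or .iterator arguably makes the intent (iterate in the set's own order) a little easier to recognise later.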

Related

Use an array as a Scala foldLeft accumulator

I am trying to use a foldLeft on an array. Eg:
var x = some array
x.foldLeft(new Array[Int](10))((a, c) => a(c) = a(c)+1)
This refuses to compile with the error found Int(0) required Array[Int].
To use foldLeft for this while keeping your style, return the accumulator array from the function (here a is your input array):
val ret = a.foldLeft(new Array[Int](10)) {
(acc, c) => acc(c) += 1; acc
}
Alternatively, since your numbers are from 0 to 9, you can also do this to achieve the same result:
val ret = (0 to 9).map(x => a.count(_ == x))
Assignment in Scala evaluates to Unit, not to the assigned value, so the function that is supposed to return the Array[Int] for the next step returns Unit instead, which does not type-check.
You would have to use a block and return the array in the end like this:
x.foldLeft(new Array[Int](10)) { (a, c) =>
a(c) = a(c)+1
a
}
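Putting the two answers together, here is a small self-contained sketch (the sample digits and names are made up) that counts occurrences of each digit 0 to 9 and lets you check that the mutable-accumulator fold and the count-based version agree:

```scala
object ArrayFoldDemo {
  val digits: Array[Int] = Array(1, 3, 3, 7, 9, 3)

  // The lambda mutates the accumulator, then returns it so the next
  // step receives an Array[Int] rather than Unit.
  val viaFold: Array[Int] = digits.foldLeft(new Array[Int](10)) { (acc, d) =>
    acc(d) += 1
    acc
  }

  // Non-mutating alternative: one pass over the input per digit.
  val viaCount: IndexedSeq[Int] = (0 to 9).map(d => digits.count(_ == d))
}
```

The fold makes a single pass over the input, while the count-based version makes ten, which only matters for large inputs.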

the filtering in Scala's for loop

I'm new to Scala and am currently learning the for statement. I read this tutorial: http://joelabrahamsson.com/learning-scala-part-six-if-statements-and-loops/
The tutorial has this example:
for (person: Person <- people
if !person.female;
name = person.name;
if name.contains("Ewing"))
println(name)
Compared with a for loop in Java, is it like this:
for (Person person : people) {
if (!person.female) {
String name = person.name;
if (name.contains("Ewing"))
println(name)
}
}
or like this:
for (Person person : people) {
String name = person.name;
if (!person.female && name.contains("Ewing")) {
println(name)
}
}
Are the operations (in this example, name = person.name;) executed if the first filter condition "if !person.female;" is not satisfied?
Thanks!
To see what the scala compiler generates, compile as scalac -Xprint:typer. It gives:
people.withFilter(((check$ifrefutable$1: typer.Person) => check$ifrefutable$1: @scala.unchecked match {
case (person @ (_: Person)) => true
case _ => false
}))
//filter is acting as your if-clause
.withFilter(((person: Person) => person.<female: error>.unary_!)).map(((person: Person) => {
val name = person.name;
scala.Tuple2(person, name)
}))
//Your next if-clause
.withFilter(((x$1) => x$1: @scala.unchecked match {
case scala.Tuple2((person @ (_: Person)), (name @ _)) => name.contains("Ewing")
}))
//Print each of them
.foreach(((x$2) => x$2: @scala.unchecked match {
case scala.Tuple2((person @ (_: Person)), (name @ _)) => println(name)
}))
}
}
So in short, it is acting as your first case. As a concept, though, it is recommended to think of for-comprehensions as a combination of map, flatMap, withFilter and foreach.
This is because in many cases, when dealing with yield, you will need to manage types, and thinking in terms of foreach and filter (which in the Java sense is a for-each loop and an if) will not cover all cases. For example, consider the following:
scala> for(x <- Option(1);
| u <- scala.util.Left(2)
| ) yield (x,u)
<console>:9: error: value map is not a member of scala.util.Left[Int,Nothing]
u <- scala.util.Left(2)
The for comprehension above uses flatMap and map. Thinking in terms of Java for loops (foreach, basically) will not help in finding the reason.
A Scala for comprehension unfolds into a combination of map, flatMap and filter. In your case, each if is a filter on the values produced before it in this "iteration" of the "loop". If the condition is not satisfied, the rest of that iteration is skipped, so your Java loops print the same thing as the Scala example.
For example try this in REPL:
scala> val l = List(1, 2, 3, 4, 5, 6)
l: List[Int] = List(1, 2, 3, 4, 5, 6)
scala> for (i <- l
| if(i%2 == 0))
| println(i)
2
4
6
scala>
This is equivalent to:
l.filter(_%2 == 0).foreach(println)
Try it out!
for{
x <- 1 to 10
if x % 3 == 0
y = println(f"x=$x")
if x % 2 == 0
} {
println(x)
}
prints:
x=3
x=6
x=9
6
So this means that the y= line is happening before the second if filter.
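The same experiment can be run on the shape of the original question. This is a sketch only: the Person class, the sample people and the evaluated buffer are assumptions made up for illustration. The buffer records every time the name binding is evaluated, so it shows that the binding is skipped for people rejected by the first filter but still runs before the second one.

```scala
import scala.collection.mutable.ListBuffer

object ForFilterDemo {
  // Hypothetical stand-in for the tutorial's Person class.
  final case class Person(name: String, female: Boolean)

  val people: List[Person] = List(
    Person("J.R. Ewing", female = false),
    Person("Sue Ellen Ewing", female = true),
    Person("Cliff Barnes", female = false)
  )

  // Records each evaluation of the `name = ...` binding.
  val evaluated: ListBuffer[String] = ListBuffer.empty[String]

  val matches: List[String] = for {
    person <- people
    if !person.female
    name = { evaluated += person.name; person.name }
    if name.contains("Ewing")
  } yield name
}
```

Here the binding runs for "J.R. Ewing" and "Cliff Barnes" but never for "Sue Ellen Ewing", which matches the first Java translation in the question.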
I think it would not evaluate the expressions that follow, since that is the common implementation for AND conditions. You can see here:
http://www.scala-lang.org/api/current/index.html#scala.Boolean
that Scala also short-circuits when using &&.

Abort early in a fold

What's the best way to terminate a fold early? As a simplified example, imagine I want to sum up the numbers in an Iterable, but if I encounter something I'm not expecting (say an odd number) I might want to terminate. This is a first approximation
def sumEvenNumbers(nums: Iterable[Int]): Option[Int] = {
nums.foldLeft (Some(0): Option[Int]) {
case (Some(s), n) if n % 2 == 0 => Some(s + n)
case _ => None
}
}
However, this solution is pretty ugly (as in, if I did a .foreach and a return -- it'd be much cleaner and clearer) and worst of all, it traverses the entire iterable even if it encounters a non-even number.
So what would be the best way to write a fold like this, that terminates early? Should I just go and write this recursively, or is there a more accepted way?
My first choice would usually be to use recursion. It is only moderately less compact, is potentially faster (certainly no slower), and in early termination can make the logic more clear. In this case you need nested defs which is a little awkward:
def sumEvenNumbers(nums: Iterable[Int]) = {
def sumEven(it: Iterator[Int], n: Int): Option[Int] = {
if (it.hasNext) {
val x = it.next
if ((x % 2) == 0) sumEven(it, n+x) else None
}
else Some(n)
}
sumEven(nums.iterator, 0)
}
My second choice would be to use return, as it keeps everything else intact and you only need to wrap the fold in a def so you have something to return from--in this case, you already have a method, so:
def sumEvenNumbers(nums: Iterable[Int]): Option[Int] = {
Some(nums.foldLeft(0){ (n,x) =>
if ((x % 2) != 0) return None // test the element x, not the running sum n
n+x
})
}
which in this particular case is a lot more compact than recursion (though we got especially unlucky with recursion since we had to do an iterable/iterator transformation). The jumpy control flow is something to avoid when all else is equal, but here it's not. No harm in using it in cases where it's valuable.
If I was doing this often and wanted it within the middle of a method somewhere (so I couldn't just use return), I would probably use exception-handling to generate non-local control flow. That is, after all, what it is good at, and error handling is not the only time it's useful. The only trick is to avoid generating a stack trace (which is really slow), and that's easy because the trait NoStackTrace and its child trait ControlThrowable already do that for you. Scala already uses this internally (in fact, that's how it implements the return from inside the fold!). Let's make our own (can't be nested, though one could fix that):
import scala.util.control.ControlThrowable
case class Returned[A](value: A) extends ControlThrowable {}
def shortcut[A](a: => A) = try { a } catch { case Returned(v) => v }
def sumEvenNumbers(nums: Iterable[Int]) = shortcut{
Option(nums.foldLeft(0){ (n,x) =>
if ((x % 2) != 0) throw Returned(None)
n+x
})
}
Here of course using return is better, but note that you could put shortcut anywhere, not just wrapping an entire method.
Next in line for me would be to re-implement fold (either myself or to find a library that does it) so that it could signal early termination. The two natural ways of doing this are to not propagate the value but an Option containing the value, where None signifies termination; or to use a second indicator function that signals completion. The Scalaz lazy fold shown by Kim Stebel already covers the first case, so I'll show the second (with a mutable implementation):
def foldOrFail[A,B](it: Iterable[A])(zero: B)(fail: A => Boolean)(f: (B,A) => B): Option[B] = {
val ii = it.iterator
var b = zero
while (ii.hasNext) {
val x = ii.next
if (fail(x)) return None
b = f(b,x)
}
Some(b)
}
def sumEvenNumbers(nums: Iterable[Int]) = foldOrFail(nums)(0)(_ % 2 != 0)(_ + _)
(Whether you implement the termination by recursion, return, laziness, etc. is up to you.)
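To check that a fold of this shape really stops at the first failing element, one can record which elements the fail predicate sees. The sketch below repeats foldOrFail so that it is self-contained; the visited buffer, the isOdd helper and the sample lists are assumptions added for illustration.

```scala
import scala.collection.mutable.ListBuffer

object FoldOrFailDemo {
  // Same shape as the foldOrFail above, repeated so the sketch stands alone.
  def foldOrFail[A, B](it: Iterable[A])(zero: B)(fail: A => Boolean)(f: (B, A) => B): Option[B] = {
    val ii = it.iterator
    var b = zero
    while (ii.hasNext) {
      val x = ii.next()
      if (fail(x)) return None // stop at the first failing element
      b = f(b, x)
    }
    Some(b)
  }

  // Records every element the predicate is asked about.
  val visited: ListBuffer[Int] = ListBuffer.empty[Int]
  def isOdd(n: Int): Boolean = { visited += n; n % 2 != 0 }

  // Stops at 5: the elements 6 and 8 are never inspected.
  val stopped: Option[Int] = foldOrFail(List(2, 4, 5, 6, 8))(0)(isOdd)(_ + _)
  val visitedWhenStopped: List[Int] = visited.toList

  // No odd element, so the whole list is summed.
  val completed: Option[Int] = foldOrFail(List(2, 4, 6))(0)(_ % 2 != 0)(_ + _)
}
```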
I think that covers the main reasonable variants; there are some other options also, but I'm not sure why one would use them in this case. (Iterator itself would work well if it had a findOrPrevious, but it doesn't, and the extra work it takes to do that by hand makes it a silly option to use here.)
The scenario you describe (exit upon some unwanted condition) seems like a good use case for the takeWhile method. It is essentially filter, except that it stops at the first element that doesn't meet the condition.
For example:
val list = List(2,4,6,8,6,4,2,5,3,2)
list.takeWhile(_ % 2 == 0) //result is List(2,4,6,8,6,4,2)
This will work just fine for Iterators/Iterables too. The solution I suggest for your "sum of even numbers, but break on odd" is:
list.iterator.takeWhile(_ % 2 == 0).foldLeft(...)
And just to prove that it's not wasting your time once it hits an odd number...
scala> val list = List(2,4,5,6,8)
list: List[Int] = List(2, 4, 5, 6, 8)
scala> def condition(i: Int) = {
| println("processing " + i)
| i % 2 == 0
| }
condition: (i: Int)Boolean
scala> list.iterator.takeWhile(condition _).sum
processing 2
processing 4
processing 5
res4: Int = 6
You can do what you want in a functional style using the lazy version of foldRight in scalaz. For a more in depth explanation, see this blog post. While this solution uses a Stream, you can convert an Iterable into a Stream efficiently with iterable.toStream.
import scalaz._
import Scalaz._
val str = Stream(2,1,2,2,2,2,2,2,2)
var i = 0 //only here for testing
val r = str.foldr(Some(0):Option[Int])((n,s) => {
println(i)
i+=1
if (n % 2 == 0) s.map(n+) else None
})
This only prints
0
1
which clearly shows that the anonymous function is only called twice (i.e. until it encounters the odd number). That is due to the definition of foldr, whose signature (in case of Stream) is def foldr[B](b: B)(f: (Int, => B) => B)(implicit r: scalaz.Foldable[Stream]): B. Note that the anonymous function takes a by-name parameter as its second argument, so it need not be evaluated.
Btw, you can still write this with the OP's pattern matching solution, but I find if/else and map more elegant.
Well, Scala does allow non local returns. There are differing opinions on whether or not this is a good style.
scala> def sumEvenNumbers(nums: Iterable[Int]): Option[Int] = {
| nums.foldLeft (Some(0): Option[Int]) {
| case (None, _) => return None
| case (Some(s), n) if n % 2 == 0 => Some(s + n)
| case (Some(_), _) => None
| }
| }
sumEvenNumbers: (nums: Iterable[Int])Option[Int]
scala> sumEvenNumbers(2 to 10)
res8: Option[Int] = None
scala> sumEvenNumbers(2 to 10 by 2)
res9: Option[Int] = Some(30)
EDIT:
In this particular case, as @Arjan suggested, you can also do:
def sumEvenNumbers(nums: Iterable[Int]): Option[Int] = {
nums.foldLeft (Some(0): Option[Int]) {
case (Some(s), n) if n % 2 == 0 => Some(s + n)
case _ => return None
}
}
You can use foldM from the cats library (as suggested by @Didac), but I suggest using Either instead of Option if you want to get the actual sum out.
bifoldMap is used to extract the result from the Either.
import cats.implicits._
def sumEven(nums: Stream[Int]): Either[Int, Int] = {
nums.foldM(0) {
case (acc, n) if n % 2 == 0 => Either.right(acc + n)
case (acc, n) => {
println(s"Stopping on number: $n")
Either.left(acc)
}
}
}
examples:
println("Result: " + sumEven(Stream(2, 2, 3, 11)).bifoldMap(identity, identity))
> Stopping on number: 3
> Result: 4
println("Result: " + sumEven(Stream(2, 7, 2, 3)).bifoldMap(identity, identity))
> Stopping on number: 7
> Result: 2
Cats has a method called foldM which does short-circuiting (for Vector, List, Stream, ...).
It works as follows:
def sumEvenNumbers(nums: Stream[Int]): Option[Long] = {
import cats.implicits._
nums.foldM(0L) {
case (acc, c) if c % 2 == 0 => Some(acc + c)
case _ => None
}
}
If it finds an odd element it returns None without computing the rest; otherwise it returns the sum of the even entries.
If you want to keep the sum accumulated up to the point where an odd entry is found, you should use an Either[Long, Long].
@Rex Kerr your answer helped me, but I needed to tweak it to use Either:
def foldOrFail[A,B,C,D](map: B => Either[D, C])(merge: (A, C) => A)(initial: A)(it: Iterable[B]): Either[D, A] = {
val ii= it.iterator
var b= initial
while (ii.hasNext) {
val x= ii.next
map(x) match {
case Left(error) => return Left(error)
case Right(d) => b= merge(b, d)
}
}
Right(b)
}
You could try using a temporary var and using takeWhile. Here is a version.
var continue = true
// sample stream of 2's and then a stream of 3's.
val evenSum = (Stream.fill(10)(2) ++ Stream.fill(10)(3)).takeWhile(_ => continue)
.foldLeft(Option[Int](0)){
case (result,i) if i%2 != 0 =>
continue = false;
// return whatever is appropriate either the accumulated sum or None.
result
case (optionSum,i) => optionSum.map( _ + i)
}
The evenSum should be Some(20) in this case.
You can throw a well-chosen exception upon encountering your termination criterion, handling it in the calling code.
A more beautiful solution would be using span:
val (l, r) = numbers.span(_ % 2 == 0)
if(r.isEmpty) Some(l.sum)
else None
... but it traverses the list twice if all the numbers are even.
Just for "academic" reasons (:
var headers = Source.fromFile(file).getLines().next().split(",")
var closeHeaderIdx = headers.takeWhile { s => !"Close".equals(s) }.foldLeft(0)((i, S) => i+1)
It does twice the work it should, but it is a nice one-liner.
If "Close" is not found, it returns
headers.size
Another (better) way is this one:
var headers = Source.fromFile(file).getLines().next().split(",").toList
var closeHeaderIdx = headers.indexOf("Close")

Polish notation evaluate function

I am new to Scala and I am having a hard time defining, or more likely translating from Ruby, my code to evaluate calculations written in Polish notation,
e.g. (+ 3 2) or (- 4 (+ 3 2))
I successfully parse the string into the form ArrayBuffer(+, 3, 2) or ArrayBuffer(-, 4, ArrayBuffer(+, 3, 2)).
The problem actually starts when I try to define a recursive eval function, which simply takes an ArrayBuffer as its argument and "returns" an Int (the result of the evaluated application).
IN THE BASE CASE:
I want to simply check whether the 2nd element is an instance of Int and the 3rd element is an instance of Int, then evaluate them together (depending on the sign operator, the 1st element) and return an Int.
However, if any of the elements is another ArrayBuffer, I simply want to reassign that element to the value returned by a recursive call to eval, like:
Storage(2) = eval(Storage(2)). (** that's why I am using a mutable ArrayBuffer **)
The error which I get is:
scala.collection.mutable.ArrayBuffer cannot be cast to java.lang.Integer
I am of course not looking for any copy-and-paste answers, but for some advice and observations.
Constructive Criticism fully welcomed.
****** This is the testing code I am using only for the addition ******
def eval(Input: ArrayBuffer[Any]):Int = {
if(ArrayBuffer(2).isInstaceOf[ArrayBuffer[Any]]) {
ArrayBuffer(2) = eval(ArrayBuffer(2))
}
if(ArrayBuffer(3).isInstaceOf[ArrayBuffer[Any]]) {
ArrayBuffer(3) = eval(ArrayBuffer(3))
}
if(ArrayBuffer(2).isInstaceOf[Int] && ArrayBuffer(3).isInstanceOf[Int]) {
ArrayBuffer(2).asInstanceOf[Int] + ArrayBuffer(3).asInstanceOf[Int]
}
}
A few problems with your code:
ArrayBuffer(2) means "construct an ArrayBuffer with one element: 2". Nowhere in your code are you referencing your parameter Input. You would need to replace instances of ArrayBuffer(2) with Input(2) for this to work.
ArrayBuffer (and all collections in Scala) are 0-indexed, so if you want to access the second thing in the collection, you would do input(1).
If you leave the final if there, then the compiler will complain since your function won't always return an Int; if the input contained something unexpected, then that last if would evaluate to false, and you have no else to fall to.
Here's a direct rewrite of your code, fixing the issues:
def eval(input: ArrayBuffer[Any]):Int = {
if(input(1).isInstanceOf[ArrayBuffer[Any]])
input(1) = eval(input(1).asInstanceOf[ArrayBuffer[Any]])
if(input(2).isInstanceOf[ArrayBuffer[Any]])
input(2) = eval(input(2).asInstanceOf[ArrayBuffer[Any]])
input(1).asInstanceOf[Int] + input(2).asInstanceOf[Int]
}
(note also that variable names, like input, should be lowercased.)
That said, the procedure of replacing entries in your input with their evaluations is probably not the best route because it destroys the input in the process of evaluating. You should instead write a function that takes the ArrayBuffer and simply recurses through it without modifying the original.
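As a rough sketch of that non-destructive approach (only + and - are handled, and the object and value names are hypothetical), the function below recurses through the nested ArrayBuffer and returns the result without ever assigning into the input:

```scala
import scala.collection.mutable.ArrayBuffer

object PureEvalDemo {
  // Evaluates a parsed expression without mutating it.
  def eval(expr: Any): Int = expr match {
    case n: Int                 => n
    case ArrayBuffer("+", a, b) => eval(a) + eval(b)
    case ArrayBuffer("-", a, b) => eval(a) - eval(b)
    case other =>
      throw new IllegalArgumentException(s"cannot evaluate: $other")
  }

  // (- 4 (+ 3 2))
  val expr: ArrayBuffer[Any] = ArrayBuffer[Any]("-", 4, ArrayBuffer[Any]("+", 3, 2))
  val result: Int = eval(expr)

  // The nested buffer is still in place after evaluation.
  val inputUntouched: Boolean = expr(2).isInstanceOf[ArrayBuffer[_]]
}
```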
You'll want your eval function to check for specific cases. Here's a simple implementation as a demonstration:
def eval(e: Seq[Any]): Int =
e match {
case Seq("+", a: Int, b: Int) => a + b
case Seq("+", a: Int, b: Seq[Any]) => a + eval(b)
case Seq("+", a: Seq[Any], b: Int) => eval(a) + b
case Seq("+", a: Seq[Any], b: Seq[Any]) => eval(a) + eval(b)
}
So you can see that for the simple case of (+ arg1 arg2), there are 4 cases. In each case, if the argument is an Int, we use it directly in the addition. If the argument itself is a sequence (like ArrayBuffer), then we recursively evaluate it before adding. Notice also that Scala's case syntax lets you do pattern matches with types, so you can skip the isInstanceOf and asInstanceOf stuff.
Now there are definitely style improvements you'd want to make down the line (like using Either instead of Any and not hard-coding the "+"), but this should get you on the right track.
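One possible direction for those improvements, sketched here with a small sealed hierarchy rather than the Either mentioned above (this is a swapped-in alternative, and every name in it is hypothetical): the parser would produce Expr values instead of ArrayBuffer[Any], and the operator is looked up in one place instead of being hard-coded into each case.

```scala
object ExprAdtDemo {
  // A typed representation of a parsed expression.
  sealed trait Expr
  final case class Num(value: Int) extends Expr
  final case class BinOp(op: String, left: Expr, right: Expr) extends Expr

  // Single place where operator symbols are interpreted.
  def applyOp(op: String, a: Int, b: Int): Int = op match {
    case "+" => a + b
    case "-" => a - b
    case "*" => a * b
    case "/" => a / b
    case other => throw new IllegalArgumentException(s"unknown operator: $other")
  }

  def eval(e: Expr): Int = e match {
    case Num(n)          => n
    case BinOp(op, l, r) => applyOp(op, eval(l), eval(r))
  }

  // (- 4 (+ 3 2))
  val sample: Expr = BinOp("-", Num(4), BinOp("+", Num(3), Num(2)))
}
```

Because Expr is sealed, the compiler can warn when eval misses a case, which the Any-based version cannot do.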
And here's how you would use it:
scala> eval(Seq("+", 3, 2))
res0: Int = 5
scala> eval(Seq("+", 4, Seq("+", 3, 2)))
res1: Int = 9
Now, if you want to really take advantage of Scala features, you could use an Eval extractor:
object Eval {
def unapply(e: Any): Option[Int] = {
e match {
case i: Int => Some(i)
case Seq("+", Eval(a), Eval(b)) => Some(a + b)
case _ => None // anything else is not an evaluable expression
}
}
}
And you'd use it like this:
scala> val Eval(result) = 2
result: Int = 2
scala> val Eval(result) = ArrayBuffer("+", 2, 3)
result: Int = 5
scala> val Eval(result) = ArrayBuffer("+", 2, ArrayBuffer("+", 2, 3))
result: Int = 7
Or you could wrap it in an eval function:
def eval(e: Any): Int = {
val Eval(result) = e
result
}
Here is my take on right to left stack-based evaluation:
def eval(expr: String): Either[Throwable, Int] = {
import java.lang.NumberFormatException
import scala.util.control.Exception._
def int(s: String) = catching(classOf[NumberFormatException]).opt(s.toInt)
val symbols = expr.replaceAll("""[^\d\+\-\*/ ]""", "").split(" ").toSeq
allCatch.either {
val results = symbols.foldRight(List.empty[Int]) {
(symbol, operands) => int(symbol) match {
case Some(op) => op :: operands
case None => val x :: y :: ops = operands
val result = symbol match {
case "+" => x + y
case "-" => x - y
case "*" => x * y
case "/" => x / y
}
result :: ops
}
}
results.head
}
}

Tune Nested Loop in Scala

I was wondering if I can tune the following Scala code :
def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] = {
var listNoDuplicates: List[(Class1, Class2)] = Nil
for (outerIndex <- 0 until listOfTuple.size) {
if (outerIndex != listOfTuple.size - 1)
for (innerIndex <- outerIndex + 1 until listOfTuple.size) {
if (listOfTuple(outerIndex)._1.flag.equals(listOfTuple(innerIndex)._1.flag))
listNoDuplicates = listOfTuple(outerIndex) :: listNoDuplicates
}
}
listNoDuplicates
}
Usually, if you have something that looks like:
var accumulator: A = new A
for( b <- collection ) {
accumulator = update(accumulator, b)
}
val result = accumulator
it can be converted into something like:
val result = collection.foldLeft( new A ){ (acc,b) => update( acc, b ) }
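As a concrete instance of this rewrite (the word list and function names are made up for illustration), the sketch below computes the same total once with a var and a loop and once with foldLeft:

```scala
object FoldEquivalenceDemo {
  val words: List[String] = List("fold", "left", "demo")

  // Imperative version: a var updated once per element.
  def totalLengthLoop(ws: List[String]): Int = {
    var total = 0
    for (w <- ws) total = total + w.length
    total
  }

  // Same computation as a fold: the var becomes the accumulator argument.
  def totalLengthFold(ws: List[String]): Int =
    ws.foldLeft(0)((total, w) => total + w.length)
}
```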
So here we can first use a map to force the uniqueness of flags. Supposing the flag has type F:
val result = listOfTuples.foldLeft( Map[F,(ClassA,ClassB)]() ){
( map, tuple ) => map + ( tuple._1.flag -> tuple )
}
Then the remaining tuples can be extracted from the map and converted to a list:
val uniqList = result.values.toList
It will keep the last tuple encountered. If you want to keep the first one, replace foldLeft with foldRight and swap the arguments of the lambda.
Example:
case class ClassA( flag: Int )
case class ClassB( value: Int )
val listOfTuples =
List( (ClassA(1),ClassB(2)), (ClassA(3),ClassB(4)), (ClassA(1),ClassB(-1)) )
val result = listOfTuples.foldRight( Map[Int,(ClassA,ClassB)]() ) {
( tuple, map ) => map + ( tuple._1.flag -> tuple )
}
val uniqList = result.values.toList
//uniqList: List((ClassA(1),ClassB(2)), (ClassA(3),ClassB(4)))
Edit: If you need to retain the order of the initial list, use instead:
val uniqList = listOfTuples.filter( result.values.toSet )
This compiles, but as I can't test it it's hard to say if it does "The Right Thing" (tm):
def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] =
(for {outerIndex <- 0 until listOfTuple.size
if outerIndex != listOfTuple.size - 1
innerIndex <- outerIndex + 1 until listOfTuple.size
if listOfTuple(outerIndex)._1.flag == listOfTuple(innerIndex)._1.flag
} yield listOfTuple(outerIndex)).reverse.toList
Note that you can use == instead of equals (use eq if you need reference equality).
BTW: https://codereview.stackexchange.com/ is better suited for this type of question.
Do not use indexing with lists (like listOfTuple(i)). Indexing into a List takes time proportional to the index, so performance is very poor. So, some ways...
The easiest:
def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] =
SortedSet(listOfTuple: _*)(Ordering by (_._1.flag)).toList
This will preserve the last element of the list. If you want it to preserve the first element, pass listOfTuple.reverse instead. Because of the sorting, performance is, at best, O(n log n). So, here's a faster way, using a mutable HashSet:
def removeDuplicates(listOfTuple: List[(Class1,Class2)]): List[(Class1,Class2)] = {
// Track the flags seen so far in a hash set to find the duplicates
import scala.collection.mutable.HashSet
val seen = HashSet[Flag]()
// now fold
listOfTuple.foldLeft(Nil: List[(Class1,Class2)]) {
case (acc, el) =>
val result = if (seen(el._1.flag)) acc else el :: acc
seen += el._1.flag
result
}.reverse
}
One can avoid using a mutable HashSet in two ways:
Make seen a var, so that it can be updated.
Pass the set along with the list being created in the fold. The case then becomes:
case ((seen, acc), el) =>
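A minimal sketch of that second option, threading an immutable Set through the fold alongside the result list. The Class1 and Class2 definitions and the Int flag type are assumptions made up so the example is self-contained; this keeps the first tuple seen for each flag.

```scala
object DedupDemo {
  // Hypothetical stand-ins for the question's classes.
  final case class Class1(flag: Int)
  final case class Class2(value: Int)

  def removeDuplicates(list: List[(Class1, Class2)]): List[(Class1, Class2)] = {
    // The accumulator is a pair: flags seen so far, and kept tuples (reversed).
    val (_, kept) = list.foldLeft((Set.empty[Int], List.empty[(Class1, Class2)])) {
      case ((seen, acc), el) =>
        val flag = el._1.flag
        if (seen(flag)) (seen, acc) else (seen + flag, el :: acc)
    }
    kept.reverse
  }

  val input: List[(Class1, Class2)] =
    List((Class1(1), Class2(2)), (Class1(3), Class2(4)), (Class1(1), Class2(-1)))
  val output: List[(Class1, Class2)] = removeDuplicates(input)
}
```

On Scala 2.13 or later, list.distinctBy(_._1.flag) should give the same first-wins result with less code.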