Scala: conditional yield with matching and side-effects? - scala

What is the correct syntax for something like the following?
var foo = 0
someIterator.foreach{ x => x.property match {
case "a" => foo += 1
case "b" => yield "current foo" + foo
}}
I.e., I'm trying to iterate over someIterator. When it matches one thing, it should update a local variable via closure; when it encounters another, it should yield some derivation of the current state to the resulting iterator preserving the state for successive iterations.

You can use Option[String], unlift and collect.
unlift takes a function that returns an option and turns it into a partial function.
collect is like map, except it considers only the elements for which the partial function is defined.
The following example demonstrates the approach:
import Function.unlift
var foo = 0
someIterator.map(_.property).collect(unlift {
case "a" => foo += 1; None
case "b" => Some("current foo" + foo)
})
Live Demo

If you just want to have the final value of foo then you can use tail recursion and you won't have to use a mutable variable too.
#annotation.tailrec
def f(l: List[Char], foo: Int = 0):Int= {
if (l == Nil) foo
else {
l.head match {
case 'a' => f(l.tail, foo + 1)
case 'b' => f(l.tail, foo)
}
}
}
val l1 = List('a','b')
f(l1)
l1: List[Char] = List(a, b)
res0: Int = 1

You can use scanLeft's acumulator instead of variable:
someIterator.scanLeft((0, None: Option[String])){
case ((foo, _), "a") => (foo + 1, None)
case ((foo, _), "b") => (foo, Some(s"current foo $foo"))
}.map(_._2).flatten
Example:
val someIterator = List("a","a","a","a","b", "a", "a", "b").toIterator
scala> someIterator.scanLeft(...){...}.map(_._2).flatten
res16: Iterator[String] = non-empty iterator
scala> res16.toList
res17: List[String] = List(current foo 4, current foo 6)
So you still have an iterator as a result

Related

Type mismatch when using iterators

I'm getting this error when i run the below code -
type mismatch, found : scala.collection.immutable.IndexedSeq[Int] required: Range
Where I'm going wrong ?
Functions -
def calcRange(i: Int, r: List[Range]):List[Range] = r match {
case List() => List(new Range(i,i+1,1))
case r1::rs =>
if (r1.start-1==i) {new Range(i,r1.end,1):: rs; }
else if(r1.end==i){new Range(r1.start,r1.end+1,1)::rs}
else {r1::calcRange(i,rs)}
}
def recurseForRanges(l: Iterator[Int]):List[Range] = {
var ans=List[Range]()
while(l.hasNext){
val cur=l.next;
ans=calcRange(cur,ans)
}
ans
}
def rangify(l: Iterator[Int]):Iterator[Range] = recurseForRanges(l).toIterator
Driver code
def main(args: Array[String]) {
val x=rangify( List(1,2,3,6,7,8).toIterator ).reduce( (x,y) => x ++ y)
/** This line gives the error -type mismatch,
found : scala.collection.immutable.IndexedSeq[Int] required: Range */
}
You can check docs:
++[B](that: GenTraversableOnce[B]): IndexedSeq[B]
++ returns IndexedSeq, not another Range, Range cannot have "holes" in them.
One way to fix it is to change Ranges to IndexedSeqs before reducing. This upcasts the Range so that reduce could take function
(IndexedSeq[Int], IndexedSeq[Int]) => IndexedSeq[Int]
because now it takes
(Range, Range) => Range
But ++ actually returns IndexedSeq[Int] instead of Range hence the type error.
val x = rangify(List(1, 2, 3, 6, 7, 8).iterator).map(_.toIndexedSeq).reduce(_ ++ _)
You can as well do this kind of cast by annotating type:
val it: Iterator[IndexedSeq[Int]] = rangify(List(1,2,3,6,7,8).iterator)
val x = it.reduce(_ ++ _)
Note that your code can be simplified, without vars
def calcRange(r: List[Range], i: Int): List[Range] = r match {
case Nil =>
Range(i, i + 1) :: Nil
case r1 :: rs =>
if (r1.start - 1 == i)
Range(i, r1.end) :: rs
else if (r1.end == i)
Range(r1.start, r1.end + 1) :: rs
else
r1 :: calcRange(rs, i)
}
def recurseForRanges(l: Iterator[Int]): List[Range] = {
l.foldLeft(List.empty[Range])(calcRange)
}
def rangify(l: Iterator[Int]): Iterator[Range] = recurseForRanges(l).iterator
val x = rangify(List(1,2,3,6,7,8).iterator).map(_.toIndexedSeq).reduce(_ ++ _)
To explain what I've done with it:
Range has a factory method, you don't need new keyword, you don't need to specify by value because 1 is default.
You need no semicolons as well.
What you are doing in recurseForRanges is basically what foldLeft does, I just swapped arguments in calcRange it could be passed directly to foldLeft.

Conditionally apply a function in Scala - How to write this function?

I need to conditionally apply a function f1 to the elements in a collection depending on the result of a function f2 that takes each element as an argument and returns a boolean. If f2(e) is true, f1(e) will be applied otherwise 'e' will be returned "as is".
My intent is to write a general-purpose function able to work on any kind of collection.
c: C[E] // My collection
f1 = ( E => E ) // transformation function
f2 = ( E => Boolean ) // conditional function
I cannot come to a solution. Here's my idea, but I'm afraid I'm in high-waters
/* Notice this code doesn't compile ~ partially pseudo-code */
conditionallyApply[E,C[_](c: C[E], f2: E => Boolean, f1: E => E): C[E] = {
#scala.annotation.tailrec
def loop(a: C[E], c: C[E]): C[E] = {
c match {
case Nil => a // Here head / tail just express the idea, but I want to use a generic collection
case head :: tail => go(a ++ (if f2(head) f1(head) else head ), tail)
}
}
loop(??, c) // how to get an empty collection of the same type as the one from the input?
}
Could any of you enlighten me?
This looks like a simple map of a Functor. Using scalaz:
def condMap[F[_],A](fa: F[A])(f: A => A, p: A => Boolean)(implicit F:Functor[F]) =
F.map(fa)(x => if (p(x)) f(x) else x)
Not sure why you would need scalaz for something so pedestrian.
// example collection and functions
val xs = 1 :: 2 :: 3 :: 4 :: Nil
def f1(v: Int) = v + 1
def f2(v: Int) = v % 2 == 0
// just conditionally transform inside a map
val transformed = xs.map(x => if (f2(x)) f1(x) else x)
Without using scalaz, you can use the CanBuildFrom pattern. This is exactly what is used in the standard collections library. Of course, in your specific case, this is probably over-engineered as a simple call to map is enough.
import scala.collection.generic._
def cmap[A, C[A] <: Traversable[A]](col: C[A])(f: A ⇒ A, p: A ⇒ Boolean)(implicit bf: CanBuildFrom[C[A], A, C[A]]): C[A] = {
val b = bf(col)
b.sizeHint(col)
for (x <- col) if(p(x)) b += f(x) else b += x
b.result
}
And now the usage:
scala> def f(i: Int) = 0
f: (i: Int)Int
scala> def p(i: Int) = i % 2 == 0
p: (i: Int)Boolean
scala> cmap(Seq(1, 2, 3, 4))(f, p)
res0: Seq[Int] = List(1, 0, 3, 0)
scala> cmap(List(1, 2, 3, 4))(f, p)
res1: List[Int] = List(1, 0, 3, 0)
scala> cmap(Set(1, 2, 3, 4))(f, p)
res2: scala.collection.immutable.Set[Int] = Set(1, 0, 3)
Observe how the return type is always the same as the one provided.
The function could be nicely encapsulated in an implicit class, using the "pimp my library" pattern.
For something like this you can use an implicit class. They were added just for this reason, to enhance libraries you can't change.
It would work like this:
object ImplicitStuff {
implicit class SeqEnhancer[A](s:Seq[A]) {
def transformIf( cond : A => Boolean)( f : A => A ):Seq[A] =
s.map{ x => if(cond(x)) f(x) else x }
}
def main(a:Array[String]) = {
val s = Seq(1,2,3,4,5,6,7)
println(s.transformIf(_ % 2 ==0){ _ * 2})
// result is (1, 4, 3, 8, 5, 12, 7)
}
}
Basically if you call a method that does not exists in the object you're calling it in (in this case, Seq), it will check if there's an implicit class that implements it, but it looks like a built in method.

How to know when to use PartialFunction vs return Option

As an example, we define a function that should convert 1, 3, 42 respectively to "foo", "bar", "qix" and all other integer to "X".
I've come up with 2 implementations :
The method f need to be separate because it can be reuse in other context.
def f(i: Int): Option[String] = i match {
case 1 => Some("foo")
case 3 => Some("bar")
case 42 => Some("qix")
case _ => None
}
def g(i: Int) : String = f(i).getOrElse("X")
And :
def f_ : PartialFunction[Int, String] = {
case 1 => "foo"
case 3 => "bar"
case 42 => "qix"
}
def g_(i: Int) : String = f_.orElse { case _ => "X" }(i)
I tend to prefer the second because it avoid many repetitive Some(…)
WDYT ?
I'm not sure why you want to use option at all when you can just as easily do this and get the exact same result:
def f(i: Int): String = i match {
case 1 => "foo"
case 3 => "bar"
case 42 => "qix"
case _ => "X"
}
It even saves you a pesky getOrElse
You can even go one better and use neither a PartialFunction or a match and just do this:
def f: Int => String = {
case 1 => "foo"
case 3 => "bar"
case 42 => "qix"
case _ => "X"
}
Which saves you writing a disposable i
fScala's Map is already a partial function. So you can use it instead of defining your own function which does exactly what Map does - "A map from keys of type A to values of type B".
So all you have to do is:
val f = Map(1 -> "foo", 3 -> "bar", 42 -> "qix")
def g(i: Int) = f.getOrElse(i, "X")
f(1) //foo
f(4) // throw NoSuchElementException: Key not found: 4
f.get(1) // Some(foo)
f.get(4) // None
g(1) //foo
g(4) //X
Now you can use the function 'g' or reuse 'f' to other needs.
Edited my example according to your comment:
def f(i: Int): Option[String] = {
val map = Map(1 -> "foo", 3 -> "bar", 42 -> "qix")
i match {
case x if (map.contains(x)) => Some(map(x))
case _ => None
}
}
def g(i: Int) : String = f(i).getOrElse("X")
I think the function should react to integers outside the given range in some meaningful way. That's why I would prefer an Option.
Option is a good way to handle null value. Partial function is just a partial matching, they are not same, even though Option and PartialFunction both have similar orElse method
Due to partial function is a function, so it can be chained but option can not, Option is the way to handle null value.
For the partial function you can do like this, it is more like chain of responsibility
def f_1 : PartialFunction[Int, String] = {
case 1 => "1"
}
def f_2 : PartialFunction[Int, String] = {
case 2 => "2"
}
def f_3 : PartialFunction[Int, String] = {
case 3 => "3"
}
f_1.orElse(f_2).orElse(f_3)(3)
You may want to try this. Here HashMap gives you fast lookup:
object SomeMain {
import scala.collection.immutable.HashMap
def f(i: Int): Option[String] = {
HashMap(1 -> "foo", 3 -> "bar", 42 -> "qix").get(i).orElse(None)
}
def g(i: Int): String = f(i).getOrElse("X")
def main(args: Array[String]) {
List(1, 3, 42, 10) foreach { x => println(x + ": " + g(x)) }
}
}
Output:
1: foo
3: bar
42: qix
10: X

Remove duplicates in List specifying equality function

I have a List[A], how is a idiomatic way of removing duplicates given an equality function (a:A, b:A) => Boolean? I cannot generally override equalsfor A
The way I can think now is creating a wrapping class AExt with overridden equals, then
list.map(new AExt(_)).distinct
But I wonder if there's a cleaner way.
There is a simple (simpler) way to do this:
list.groupBy(_.key).mapValues(_.head)
If you want you can use the resulting map instantly by replacing _.head by a function block like:
sameElements => { val observedItem = sameElements.head
new A (var1 = observedItem.firstAttr,
var2 = "SomethingElse") }
to return a new A for each distinct element.
There is only one minor problem. The above code (list.groupBy(_.key).mapValues(_.head)) didnt explains very well the intention to remove duplicates. For that reason it would be great to have a function like distinctIn[A](attr: A => B) or distinctBy[A](eq: (A, A) -> Boolean).
Using the Foo and customEquals from misingFaktor's answer:
case class Foo(a: Int, b: Int)
val (a, b, c, d) = (Foo(3, 4), Foo(3, 1), Foo(2, 5), Foo(2, 5))
def customEquals(x: Foo, y: Foo) = x.a == y.a
(Seq(a, b, c, d).foldLeft(Seq[Foo]()) {
(unique, curr) => {
if (!unique.exists(customEquals(curr, _)))
curr +: unique
else
unique
}
}).reverse
If result ordering is important but the duplicate to be removed is not, then foldRight is preferable
Seq(a, b, c, d).foldRight(Seq[Foo]()) {
(curr, unique) => {
if (!unique.exists(customEquals(curr, _)))
curr +: unique
else
unique
}
}
I must say I think I'd go via an intermediate collection which was a Set if you expected that your Lists might be quite long as testing for presence (via exists or find) on a Seq is O(n) of course:
Rather than write a custom equals; decide what property the elements are equal by. So instead of:
def myCustomEqual(a1: A, a2: A) = a1.foo == a2.foo && a1.bar == a2.bar
Make a Key. Like so:
type Key = (Foo, Bar)
def key(a: A) = (a.foo, a.bar)
Then you can add the keys to a Set to see whether you have come across them before.
var keys = Set.empty[Key]
((List.empty[A] /: as) { (l, a) =>
val k = key(a)
if (keys(k)) l else { keys += k; a +: l }
}).reverse
Of course, this solution has worse space complexity and potentially worse performance (as you are creating extra objects - the keys) in the case of very short lists. If you do not like the var in the fold, you might like to look at how you could achieve this using State and Traverse from scalaz 7
scala> case class Foo(a: Int, b: Int)
defined class Foo
scala> val (a, b, c, d) = (Foo(3, 4), Foo(3, 1), Foo(2, 5), Foo(2, 5))
a: Foo = Foo(3,4)
b: Foo = Foo(3,1)
c: Foo = Foo(2,5)
d: Foo = Foo(2,5)
scala> def customEquals(x: Foo, y: Foo) = x.a == y.a
customEquals: (x: Foo, y: Foo)Boolean
scala> Seq(a, b, c, d) filter {
| var seq = Seq.empty[Foo]
| x => {
| if(seq.exists(customEquals(x, _))) {
| false
| } else {
| seq :+= x
| true
| }
| }
res13: Seq[Foo] = List(Foo(3,4), Foo(2,5))
Starting Scala 2.13, we can use the new distinctBy method which returns elements of a sequence ignoring the duplicates as determined by == after applying a transforming function f:
def distinctBy[B](f: (A) => B): List[A]
For instance:
// case class A(a: Int, b: String, c: Double)
// val list = List(A(1, "hello", 3.14), A(2, "world", 3.14), A(1, "hello", 12.3))
list.distinctBy(x => (x.a, x.b)) // List(A(1, "hello", 3.14), A(2, "world", 3.14))
list.distinctBy(_.c) // List(A(1, "hello", 3.14), A(1, "hello", 12.3))
case class Foo (a: Int, b: Int)
val x = List(Foo(3,4), Foo(3,1), Foo(2,5), Foo(2,5))
def customEquals(x : Foo, y: Foo) = (x.a == y.a && x.b == y.b)
x.foldLeft(Nil : List[Foo]) {(list, item) =>
val exists = list.find(x => customEquals(item, x))
if (exists.isEmpty) item :: list
else list
}.reverse
res0: List[Foo] = List(Foo(3,4), Foo(3,1), Foo(2,5))

Step-by-step explanation of Scala syntax used in Wikipedia quicksort example

I am trying to understand the Scala quicksort example from Wikipedia. How could the sample be disassembled step by step and what does all the syntactic sugar involved mean?
def qsort: List[Int] => List[Int] = {
case Nil => Nil
case pivot :: tail =>
val (smaller, rest) = tail.partition(_ < pivot)
qsort(smaller) ::: pivot :: qsort(rest)
}
As much as I can gather at this stage qsort is a function that takes no parameters and returns a new Function1[List[Int],List[Int]] that implements quicksort through usage of pattern matching, list manipulation and recursive calls. But I can't quite figure out where the pivot comes from, and how exactly the pattern matching syntax works in this case.
UPDATE:
Thanks everyone for the great explanations!
I just wanted to share another example of quicksort implementation which I have discovered in the Scala by Example by Martin Odersky. Although based around arrays instead of lists and less of a show-off in terms of varios Scala features I personally find it much less convoluted than its Wikipedia counterpart, and just so much more clear and to the point expression of the underlying algorithm:
def sort(xs: Array[Int]): Array[Int] = {
if (xs.length <= 1) xs
else {
val pivot = xs(xs.length / 2)
Array.concat(
sort(xs filter (pivot >)),
xs filter (pivot ==),
sort(xs filter (pivot <)))
}
}
def qsort: List[Int] => List[Int] = {
case Nil => Nil
case pivot :: tail =>
val (smaller, rest) = tail.partition(_ < pivot)
qsort(smaller) ::: pivot :: qsort(rest)
}
let's pick apart a few bits.
Naming
Operators (such as * or +) are valid candidates for method and class names in Scala (hence you can have a class called :: (or a method called :: for that matter - and indeed both exist). Scala appears to have operator-overloading but in fact it does not: it's merely that you can declare a method with the same name.
Pattern Matching
target match {
case p1 =>
case p2 =>
}
Where p1 and p2 are patterns. There are many valid patterns (you can match against Strings, types, particular instances etc). You can also match against something called an extractor. An extractor basically extracts arguments for you in the case of a match, so:
target match {
case MyExtractor(arg1, arg2, arg3) => //I can now use arg1, arg2 etc
}
In scala, if an extractor (of which a case class is an example) exists called X, then the pattern X(a, b) is equivalent to a X b. The case class :: has a constructor taking 2 arguments and putting this together we get that:
case x :: xs =>
case ::(x, xs) =>
Are equivalent. This match says "if my List is an instance of :: extract the value head into x and tail into xs". pattern-matching is also used in variable declaration. For example, if p is a pattern, this is valid:
val p = expression
This why we can declare variables like:
val x :: xs = List(1, 2, 3)
val (a, b) = xs.partition(_ % 2 == 0 ) //returns a Tuple2 which is a pattern (t1, t2)
Anonymous Functions
Secondly we have a function "literal". tail is an instance of List which has a method called partition which takes a predicate and returns two lists; one of those entries satisfying the predicate and one of those entries which did not.
val pred = (el: Int) => e < 2
Declares a function predicate which takes an Int and returns true iff the int value is less than 2. There is a shorthand for writing functions inline
tail.partition(_ < pivot) // _ is a placeholder for the parameter
tail.partition( (e: Int) => e < pivot )
These two expressions mean the same thing.
Lists
A List is a sealed abstract class with only two implementations, Nil (the empty list) and :: (also called cons), which is a non-empty list consisting of a head and a tail (which is also a list). You can now see that the pattern match is a match on whether the list is empty or not. a List can be created by cons-ing it to other lists:
val l = 1 :: 2 :: Nil
val m = List(1, 2, 3) ::: List(4, 5, 6)
The above lines are simply method calls (:: is a valid method name in scala). The only difference between these and normal method calls is that, if a method end in a colon : and is called with spaces, the order of target and parameter is reversed:
a :: b === b.::(a)
Function Types
val f: A => B
the previous line types the reference f as a function which takes an A and returns a B, so I could then do:
val a = new A
val b: B = f(a)
Hence you can see that def qsort: List[Int] => List[Int] declares a method called qsort which returns a function taking a List[Int] and returning a List[Int]. So I could obviously do:
val l = List(2, 4, 1)
val m = qsort.apply(l) //apply is to Function what run is to Runnable
val n = qsort(l) //syntactic sugar - you don't have to define apply explicitly!
Recursion
When a method call is tail recursive, Scala will optimize this into the iterator pattern. There was a msitake in my original answer because the qsort above is not tail-recursive (the tail-call is the cons operator)
def qsort: List[Int] => List[Int] = {
case Nil => Nil
case pivot :: tail =>
val (smaller, rest) = tail.partition(_ < pivot)
qsort(smaller) ::: pivot :: qsort(rest)
}
Let's rewrite that. First, replace the function literal with an instance of Function1:
def qsort: List[Int] => List[Int] = new Function1[List[Int], List[Int]] {
def apply(input: List[Int]): List[Int] = input match {
case Nil => Nil
case pivot :: tail =>
val (smaller, rest) = tail.partition(_ < pivot)
qsort(smaller) ::: pivot :: qsort(rest)
}
}
Next, I'm going to replace the pattern match with equivalent if/else statements. Note that they are equivalent, not the same. The bytecode for pattern matches are more optimized. For instance, the second if and the exception throwing below do not exist, because the compile knows the second match will always happen if the first fails.
def qsort: List[Int] => List[Int] = new Function1[List[Int], List[Int]] {
def apply(input: List[Int]): List[Int] = if (input == Nil) {
Nil
} else if (input.isInstanceOf[::[_]] &&
scala.collection.immutable.::.unapply(input.asInstanceOf[::[Int]]) != None) {
val unapplyResult = scala.collection.immutable.::.unapply(input.asInstanceOf[::[Int]]).get
val pivot = unapplyResult._1
val tail = unapplyResult._2
val (smaller, rest) = tail.partition(_ < pivot)
qsort(smaller) ::: pivot :: qsort(rest)
} else {
throw new scala.MatchError(input)
}
}
Actually, val (smaller, rest) is pattern match as well, so Let's decompose it as well:
def qsort: List[Int] => List[Int] = new Function1[List[Int], List[Int]] {
def apply(input: List[Int]): List[Int] = if (input == Nil) {
Nil
} else if (input.isInstanceOf[::[_]] &&
scala.collection.immutable.::.unapply(input.asInstanceOf[::[Int]]) != None) {
val unapplyResult0 = scala.collection.immutable.::.unapply(input.asInstanceOf[::[Int]]).get
val pivot = unapplyResult0._1
val tail = unapplyResult0._2
val tmp0 = tail.partition(_ < pivot)
if (Tuple2.unapply(tmp0) == None)
throw new scala.MatchError(tmp0)
val unapplyResult1 = Tuple2.unapply(tmp0).get
val smaller = unapplyResult1._1
val rest = unapplyResult1._2
qsort(smaller) ::: pivot :: qsort(rest)
} else {
throw new scala.MatchError(input)
}
}
Obviously, this is highly unoptmized. Even worse, there are some function calls being done more than once, which doesn't happen in the original. Unfortunately, to fix that would require some structural changes to the code.
There's still some syntactic sugar here. There is an anonymous function being passed to partition, and there is the syntactic sugar for calling functions. Rewriting those yields the following:
def qsort: List[Int] => List[Int] = new Function1[List[Int], List[Int]] {
def apply(input: List[Int]): List[Int] = if (input == Nil) {
Nil
} else if (input.isInstanceOf[::[_]] &&
scala.collection.immutable.::.unapply(input.asInstanceOf[::[Int]]) != None) {
val unapplyResult0 = scala.collection.immutable.::.unapply(input.asInstanceOf[::[Int]]).get
val pivot = unapplyResult0._1
val tail = unapplyResult0._2
val func0 = new Function1[Int, Boolean] {
def apply(input: Int): Boolean = input < pivot
}
val tmp0 = tail.partition(func0)
if (Tuple2.unapply(tmp0) == None)
throw new scala.MatchError(tmp0)
val unapplyResult1 = Tuple2.unapply(tmp0).get
val smaller = unapplyResult1._1
val rest = unapplyResult1._2
qsort.apply(smaller) ::: pivot :: qsort.apply(rest)
} else {
throw new scala.MatchError(input)
}
}
For once, the extensive explanations about each syntactic sugar and how it works are being done by others. :-) I hope this complements their answers. Just as a final note, the following two lines are equivalent:
qsort(smaller) ::: pivot :: qsort(rest)
qsort(rest).::(pivot).:::(qsort(smaller))
The pivot in this pattern matching example is the first element of the list:
scala> List(1,2,3) match {
| case x :: xs => println(x)
| case _ => println("empty")
| }
1
The pattern matching is based on extractors and the cons is not part of the language. It uses the infix syntax. You can also write
scala> List(1,2,3) match {
| case ::(x,xs) => println(x)
| case _ => println("empty")
| }
1
as well. So there is a type :: that looks like the cons operator. This type defines how it is extracted:
final case class ::[B](private var hd: B, private[scala] var tl: List[B]){ ... }
It's a case class so the extractor will be generated by the Scala compiler. Like in this example class A.
case class A(x : Int, y : Int)
A(1,2) match { case x A y => printf("%s %s", x, y)}
-> 1 2
Based on this machinary patterns matching is supported for Lists, Regexp and XML.