Scala - can a lambda parameter match a tuple?

Say I have a list like
val l = List((1, "blue"), (5, "red"), (2, "green"))
and I want to filter one of the tuples out. I can do something like
val m = l.filter(item => {
  val (n, s) = item // "unpack" the tuple here
  n != 2
})
Is there any way I can "unpack" the tuple directly as the lambda's parameter, instead of going through the intermediate item variable?
Something like the following would be ideal, but Eclipse tells me wrong number of parameters; expected=1:
val m = l.filter( (n, s) => n != 2 )
Any help would be appreciated - I'm using Scala 2.9.0.1.

This is about the closest you can get:
val m = l.filter { case (n, s) => n != 2 }
It's basically pattern-matching syntax inside an anonymous PartialFunction. There are also the tupled methods on the Function object and the function traits, but they are just wrappers around this same pattern-matching expression.
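For reference, here's a minimal sketch using the standard library's Function.tupled (it exists for arities 2 through 5); the parameter types need annotations because they can't be inferred through the conversion:
val l = List((1, "blue"), (5, "red"), (2, "green"))
// Function.tupled turns an (Int, String) => Boolean into a ((Int, String)) => Boolean
val m = l.filter(Function.tupled((n: Int, s: String) => n != 2))
// m: List[(Int, String)] = List((1,blue), (5,red))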

Kipton has a good answer, but you can actually make this even shorter:
val l = List((1, "blue"), (5, "red"), (2, "green"))
val m = l.filter(_._1 != 2)

There are a bunch of options:
for (x <- l; (n,s) = x if (n != 2)) yield x
l.collect{ case x @ (n,s) if (n != 2) => x }
l.filter{ case (n,s) => n != 2 }
l.unzip.zipped.map((n,s) => n != 2).zip // Complains that zip is deprecated

val m = l.filter( (n, s) => n != 2 )
... is a type mismatch, because that lambda defines a
Function2[Int,String,Boolean] with two parameters, instead of a
Function1[(Int,String),Boolean] whose single parameter is a Tuple2[Int,String].
You can convert between them like this:
val m = l.filter( ((n: Int, s: String) => n != 2).tupled )

I've pondered the same question, and came across yours today.
I'm not very fond of the partial-function approaches (anything with case in it), since they imply that there could be more entry points for the logic flow. At least to me, they tend to blur the intention of the code. On the other hand, like you, I really do want to go straight to the tuple fields.
Here's a solution I drafted today. It seems to work, but I haven't tried it in production yet.
object unTuple {
  def apply[A, B, X](f: (A, B) => X): (Tuple2[A, B] => X) = {
    (t: Tuple2[A, B]) => f(t._1, t._2)
  }
  def apply[A, B, C, X](f: (A, B, C) => X): (Tuple3[A, B, C] => X) = {
    (t: Tuple3[A, B, C]) => f(t._1, t._2, t._3)
  }
  //...
}
val list = List( ("a",1), ("b",2) )
val list2 = List( ("a",1,true), ("b",2,false) )
list foreach unTuple( (k: String, v: Int) =>
  println(k, v)
)
list2 foreach unTuple( (k: String, v: Int, b: Boolean) =>
  println(k, v, b)
)
Output:
(a,1)
(b,2)
(a,1,true)
(b,2,false)
Maybe this turns out to be useful. The unTuple object would naturally be tucked away in some utility namespace.
Addendum:
Applied to your case:
val m = l.filter( unTuple( (n: Int, color: String) =>
  n != 2
))

Related

Scala: Remove duplicated integers from Vector( tuples(Int,Int) , ...)

I have a large Vector (about 2000 elements) consisting of many tuples of type (Int, Int), i.e.
val myVectorEG = Vector((65,61), (29,49), (4,57), (12,49), (24,98), (21,52), (81,86), (91,23), (73,34), (97,41),...))
I wish to remove every tuple whose first element (index 0) is duplicated, i.e. if (65, xx) and some other (65, yy) both appear in the vector, both should be removed.
I am able to access and print the elements with this:
val (id1,id2) = ( allSource.foreach(i=>println(i._1)), allSource.foreach(i=>i._2))
How can I remove the duplicates? Or should I use a method other than foreach to access the element at index 0?
To remove all duplicates, first group by the first tuple element (_._1) and collect only the groups that contain a single tuple for that key. Then flatten the result.
myVectorEG.groupBy(_._1).collect {
  case (k, v) if v.size == 1 => v
}.flatten
This returns a List, which you can call .toVector on if you need a Vector.
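For a quick sanity check, with some sample data of my own (note that groupBy returns an unordered Map, so the relative order of the survivors is not guaranteed):
val v = Vector((65, 61), (29, 49), (65, 23), (12, 49))
v.groupBy(_._1).collect { case (_, vs) if vs.size == 1 => vs }.flatten.toVector
// e.g. Vector((29,49), (12,49)) -- element order may vary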
This does the job and preserves order (unlike other solutions) but is O(n^2) so potentially slow for 2000 elements:
myVectorEG.filter(x => myVectorEG.count(_._1 == x._1) == 1)
This is more efficient for larger vectors but still preserves order:
val keep = myVectorEG.groupBy(_._1).collect {
  case (k, v) if v.size == 1 => k
}.toSet
myVectorEG.filter(x => keep.contains(x._1))
You can use distinctBy (Scala 2.13+) to remove duplicates; note that it keeps the first occurrence of each key.
In the case of Vector[(Int, Int)] it will look like this
myVectorEG.distinctBy(_._1)
Updated, if you need to remove all the duplicates:
You can use groupBy but this will rearrange your order.
myVectorEG.groupBy(_._1).filter(_._2.size == 1).flatMap(_._2).toVector
Another option takes advantage of the fact that you want the list sorted at the end.
def sortAndRemoveDuplicatesByFirst[A : Ordering, B](input: List[(A, B)]): List[(A, B)] = {
  import Ordering.Implicits._
  val sorted = input.sortBy(_._1)
  @annotation.tailrec
  def loop(remaining: List[(A, B)], previous: (A, B), repeated: Boolean, acc: List[(A, B)]): List[(A, B)] =
    remaining match {
      case x :: xs =>
        if (x._1 == previous._1)
          loop(remaining = xs, previous, repeated = true, acc)
        else if (!repeated)
          loop(remaining = xs, previous = x, repeated = false, previous :: acc)
        else
          loop(remaining = xs, previous = x, repeated = false, acc)
      case Nil =>
        // keep the final element only if it was not part of a repeated run
        if (repeated) acc.reverse else (previous :: acc).reverse
    }
  sorted match {
    case x :: xs =>
      loop(remaining = xs, previous = x, repeated = false, acc = List.empty)
    case Nil =>
      List.empty
  }
}
Which you can test like this:
val data = List(
  1 -> "A",
  3 -> "B",
  1 -> "C",
  4 -> "D",
  3 -> "E",
  5 -> "F",
  1 -> "G",
  0 -> "H"
)
sortAndRemoveDuplicatesByFirst(data)
// res: List[(Int, String)] = List((0,H), (4,D), (5,F))
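And a quick extra check of my own that a key duplicated at the end of the sorted list is dropped as well:
sortAndRemoveDuplicatesByFirst(List(1 -> "A", 2 -> "B", 2 -> "C"))
// res: List[(Int, String)] = List((1,A))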
(I used List instead of Vector to make the tail-recursive algorithm easy and performant to write.)

Parallel Aggregates in Scala using associative operators

I want to aggregate a list of values in Scala. Here are a few considerations:
the aggregate function [1] is associative as well as commutative: examples are plus and multiply
the function is applied to the list in parallel, so as to utilize all cores of the CPU
Here is an implementation:
package com.example.reactive
import scala.concurrent.Future
import scala.concurrent.Await
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
object AggregateParallel {

  private def pm[T](l: List[Future[T]])(zero: T)(fn: (T, T) => T): Future[T] = {
    val l1 = l.grouped(2)
    val l2 = l1.map { sl =>
      sl match {
        case x :: Nil => x
        case x :: y :: Nil => for (a <- x; b <- y) yield fn(a, b)
        case _ => Future(zero)
      }
    }.toList
    l2 match {
      case x :: Nil => x
      case x :: xs => pm(l2)(zero)(fn)
      case Nil => Future(zero)
    }
  }

  def parallelAggregate[T](l: List[T])(zero: T)(fn: (T, T) => T): T = {
    val n = pm(l.map(Future(_)))(zero)(fn)
    Await.result(n, 1000.millis)
  }

  def main(args: Array[String]) {
    // multiply an empty list: zero value is 1
    println(parallelAggregate(List[Int]())(1)((x, y) => x * y))
    // multiply a list: zero value is 1
    println(parallelAggregate(List(1, 2, 3, 4, 5))(1)((x, y) => x * y))
    // sum a list: zero value is 0
    println(parallelAggregate(List(1, 2, 3, 4, 5))(0)((x, y) => x + y))
    // sum a small list of BigInt: zero value is 0
    val bigList1 = List(1, 2, 3, 4, 5).map(BigInt(_))
    println(parallelAggregate(bigList1)(0)((x, y) => x + y))
    // sum a list of BigInt: zero value is 0
    val bigList2 = (1 to 100).map(BigInt(_)).toList
    println(parallelAggregate(bigList2)(0)((x, y) => x + y))
    // multiply a list of BigInt: zero value is 1
    val bigList3 = (1 to 100).map(BigInt(_)).toList
    println(parallelAggregate(bigList3)(1)((x, y) => x * y))
  }
}
OUTPUT:
1
120
15
15
5050
93326215443944152681699238856266700490715968264381621468592963895217599993229915608941463976156518286253697920827223758251185210916864000000000000000000000000
How else can I achieve the same objective or improve this code in Scala?
EDIT1:
I have implemented a bottom-up aggregation. I think I am quite close to Scala's aggregate method (below). The difference is that I am only splitting into sublists of two elements:
Scala implementation:
def aggregate[S](z: S)(seqop: (S, T) => S, combop: (S, S) => S): S = {
  executeAndWaitResult(new Aggregate(z, seqop, combop, splitter))
}
With this implementation I assume that the aggregation happens in parallel like so:
List(1,2,3,4,5,6)
-> split parallel -> List(List(1,2), List(3,4), List(5,6) )
-> execute in parallel -> List( 3, 7, 11 )
-> split parallel -> List(List(3,7), List(11) )
-> execute in parallel -> List( 10, 11)
-> Result is 21
Is it correct to assume that Scala's aggregate also does bottom-up aggregation in parallel?
[1] http://www.mathsisfun.com/associative-commutative-distributive.html
Scala's parallel collections already have an aggregate method that does just what you're asking for:
http://markusjais.com/scalas-parallel-collections-and-the-aggregate-method/
It works like foldLeft but takes an extra parameter:
def foldLeft[B](z: B)(f: (B, A) ⇒ B): B
def aggregate[B](z: ⇒ B)(seqop: (B, A) ⇒ B, combop: (B, B) ⇒ B): B
When called on a parallel collection, aggregate splits the collection into N parts, uses foldLeft on each part in parallel, and uses combop to combine all the results.
But when called on a non-parallel collection, aggregate just works like foldLeft and ignores the combop.
To get consistent results, the operator needs to be associative, because you don't control how the list will be split; since the splits preserve element order, commutativity is not strictly required.
A short example:
List(1, 2, 3, 4, 5).par.aggregate(1)(_ * _, _ * _)
res0: Int = 120
Answer to Edit1 (improved according to the comments):
I don't think it's the right approach: for an N-item list you'll create N Futures, which means a big scheduling overhead. Unless seqop is really expensive, I would avoid creating a Future for every single element.
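To make that concrete, here is a rough sketch of my own (not from the original answer) that creates one Future per fixed-size chunk instead of one per element; chunkSize is an arbitrary tuning parameter:
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

// fold each chunk inside a single Future, then combine the per-chunk results
def chunkedAggregate[T](l: List[T], chunkSize: Int)(zero: T)(fn: (T, T) => T): T = {
  val chunkResults: List[Future[T]] =
    l.grouped(chunkSize).toList.map(chunk => Future(chunk.fold(zero)(fn)))
  val combined: Future[T] =
    Future.sequence(chunkResults).map(_.fold(zero)(fn))
  Await.result(combined, 10.seconds)
}

chunkedAggregate((1 to 100).toList, 25)(0)(_ + _) // 5050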

How to elegantly extract range of list based on specific criteria?

I want to extract a range of elements from a list, meeting the following requirements:
The first element of the range has to be the element immediately before the first element matching a specific condition
The last element of the range has to be the element immediately after the last element matching the condition
Example: for the list (1,1,1,10,2,10,1,1,1) and the condition x >= 10, I want to get (1,10,2,10,1)
This is very simple to program imperatively, but I am wondering if there is some smart functional Scala way to achieve it. Is there?
Keeping it in the Scala standard lib, I would solve this using recursion:
def f(_xs: List[Int])(cond: Int => Boolean): List[Int] = {
  def inner(xs: List[Int], res: List[Int]): List[Int] = xs match {
    case Nil => Nil
    case x :: y :: tail if cond(y) && res.isEmpty => inner(tail, res ++ (x :: y :: Nil))
    case x :: y :: tail if cond(x) && res.nonEmpty => res ++ (x :: y :: Nil)
    case x :: tail if res.nonEmpty => inner(tail, res :+ x)
    case x :: tail => inner(tail, res)
  }
  inner(_xs, Nil)
}
scala> f(List(1,1,1,10,2,10,1,1,1))(_ >= 10)
res3: List[Int] = List(1, 10, 2, 10, 1)
scala> f(List(2,10,2,10))(_ >= 10)
res4: List[Int] = List()
scala> f(List(2,10,2,10,1))(_ >= 10)
res5: List[Int] = List(2, 10, 2, 10, 1)
Maybe there is something I did not think of in this solution, or I misunderstood something, but I think you will get the basic idea.
Good functional algorithm design practice is all about breaking complex problems into simpler ones.
The principle is called Divide and Conquer.
It's easy to extract two simpler subproblems from the subject problem:
Get a list of all elements after the first matching one, preceded by that matching element and the element before it.
Get a list of all elements up to the last matching one, followed by that matching element and the element after it.
These subproblems are simple enough to implement directly, so no further subdivision is required.
Here's the implementation of the first function:
def afterWithPredecessor
  [ A ]
  ( elements : List[ A ] )
  ( test : A => Boolean )
  : List[ A ]
  = elements match {
      case Nil => Nil
      case a :: tail if test( a ) => Nil // since there is no predecessor
      case a :: b :: tail if test( b ) => a :: b :: tail
      case a :: tail => afterWithPredecessor( tail )( test )
    }
Since the second problem can be seen as a direct inverse of the first one, it can be easily implemented by reversing the input and output:
def beforeWithSuccessor
  [ A ]
  ( elements : List[ A ] )
  ( test : A => Boolean )
  : List[ A ]
  = afterWithPredecessor( elements.reverse )( test ).reverse
But here's an optimized version of this:
def beforeWithSuccessor
  [ A ]
  ( elements : List[ A ] )
  ( test : A => Boolean )
  : List[ A ]
  = elements match {
      case Nil => Nil
      case a :: b :: tail if test( a ) =>
        a :: b :: beforeWithSuccessor( tail )( test )
      case a :: tail =>
        beforeWithSuccessor( tail )( test ) match {
          case Nil => Nil
          case r => a :: r
        }
    }
Finally, composing the above functions together to produce the function solving your problem becomes quite trivial:
def range[ A ]( elements : List[ A ] )( test : A => Boolean ) : List[ A ] =
  beforeWithSuccessor( afterWithPredecessor( elements )( test ) )( test )
Tests:
scala> range( List(1,1,1,10,2,10,1,1,1) )( _ >= 10 )
res0: List[Int] = List(1, 10, 2, 10, 1)
scala> range( List(1,1,1,10,2,10,1,1,1) )( _ >= 1 )
res1: List[Int] = List()
scala> range( List(1,1,1,10,2,10,1,1,1) )( _ == 2 )
res2: List[Int] = List(10, 2, 10)
The second test returns an empty list since the outermost elements satisfying the predicate have no predecessors (or successors).
def range[T](elements: List[T], condition: T => Boolean): List[T] = {
  val first = elements.indexWhere(condition)
  val last = elements.lastIndexWhere(condition)
  // guard against nothing matching at all, where indexWhere returns -1
  if (first < 0) Nil else elements.slice(first - 1, last + 2)
}
scala> range[Int](List(1,1,1,10,2,10,1,1,1), _ >= 10)
res0: List[Int] = List(1, 10, 2, 10, 1)
scala> range[Int](List(2,10,2,10), _ >= 10)
res1: List[Int] = List(2, 10, 2, 10)
scala> range[Int](List(), _ >= 10)
res2: List[Int] = List()
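One more case, my own addition, exercising the guard against indexWhere returning -1 when nothing matches:
scala> range[Int](List(1,2,3), _ >= 10)
res3: List[Int] = List()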
Zip and map to the rescue
val l = List(1, 1, 1, 10, 2, 1, 1, 1)
def test (i: Int) = i >= 10
((l.head :: l) zip (l.tail :+ l.last)) zip l filter {
  case ((a, b), c) => test(a) || test(b) || test(c)
} map { case ((a, b), c) => c }
That should work. I only have my smartphone with me and am miles from anywhere I could test this, so apologies for any typos or minor syntax errors.
Edit: works now. I hope it's obvious that my solution shifts the list to the right and to the left to create two new lists. When these are zipped together and then zipped again with the original list, the result is a list of tuples, each containing an original element together with a tuple of its neighbours. This is then trivial to filter and map back to a simple list.
Making this into a more general function (and using collect rather than filter -> map)...
def filterWithNeighbours[E](l: List[E])(p: E => Boolean) = l match {
  case Nil => Nil
  case li if li.size < 3 => if (l exists p) l else Nil
  case _ => ((l.head :: l) zip (l.tail :+ l.last)) zip l collect {
    case ((a, b), c) if p(a) || p(b) || p(c) => c
  }
}
This is less efficient than the recursive solution, but it makes the test much simpler and clearer. In a recursive solution it can be difficult to match the right sequence of patterns, as the patterns often express the shape of the chosen implementation rather than the original data. With the simple functional solution, each element is clearly and simply compared to its neighbours.

Calculating differences of subsequent elements of a sequence in scala

I would like to do almost exactly this in Scala. Is there an elegant way?
Specifically, I just want the difference of adjacent elements in a sequence. For example
input = 1,2,6,9
output = 1,4,3
How about this?
scala> List(1, 2, 6, 9).sliding(2).map { case Seq(x, y, _*) => y - x }.toList
res0: List[Int] = List(1, 4, 3)
Here is one that uses recursion and works best on Lists:
def differences(l: List[Int]): List[Int] = l match {
  case a :: (rest @ b :: _) => (b - a) :: differences(rest)
  case _ => Nil
}
And here is one that should be pretty fast on Vector or Array:
def differences(a: IndexedSeq[Int]): IndexedSeq[Int] =
  a.indices.tail.map(i => a(i) - a(i - 1))
Of course there is always this:
def differences(a: Seq[Int]): Seq[Int] =
  a.tail.zip(a).map { case (x, y) => x - y }
Note that only the recursive version handles empty lists without an exception.
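If empty input matters to you, a small variant of the zip version (a sketch of mine, not from the answers above) avoids the exception, because drop(1), unlike tail, is safe on an empty sequence:
def differences(a: Seq[Int]): Seq[Int] =
  a.drop(1).zip(a).map { case (x, y) => x - y }

differences(Seq[Int]())      // Seq() rather than an exception
differences(Seq(1, 2, 6, 9)) // Seq(1, 4, 3)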

Expand a Set[Set[String]] into Cartesian Product in Scala

I have the following set of sets; I don't know ahead of time how long it will be.
val sets = Set(Set("a","b","c"), Set("1","2"), Set("S","T"))
I would like to expand it into a Cartesian product:
Set("a&1&S", "a&1&T", "a&2&S", ..., "c&2&T")
How would you do that?
I think I figured out how to do that:
def combine(acc: Set[String], set: Set[String]) = for (a <- acc; s <- set) yield {
  a + "&" + s
}
val expanded = sets.reduceLeft(combine)
expanded: scala.collection.immutable.Set[java.lang.String] = Set(b&2&T, a&1&S,
a&1&T, b&1&S, b&1&T, c&1&T, a&2&T, c&1&S, c&2&T, a&2&S, c&2&S, b&2&S)
Nice question. Here's one way:
scala> val seqs = Seq(Seq("a","b","c"), Seq("1","2"), Seq("S","T"))
seqs: Seq[Seq[java.lang.String]] = List(List(a, b, c), List(1, 2), List(S, T))
scala> val seqs2 = seqs.map(_.map(Seq(_)))
seqs2: Seq[Seq[Seq[java.lang.String]]] = List(List(List(a), List(b), List(c)), List(List(1), List(2)), List(List(S), List(T)))
scala> val combined = seqs2.reduceLeft((xs, ys) => for {x <- xs; y <- ys} yield x ++ y)
combined: Seq[Seq[java.lang.String]] = List(List(a, 1, S), List(a, 1, T), List(a, 2, S), List(a, 2, T), List(b, 1, S), List(b, 1, T), List(b, 2, S), List(b, 2, T), List(c, 1, S), List(c, 1, T), List(c, 2, S), List(c, 2, T))
scala> combined.map(_.mkString("&"))
res11: Seq[String] = List(a&1&S, a&1&T, a&2&S, a&2&T, b&1&S, b&1&T, b&2&S, b&2&T, c&1&S, c&1&T, c&2&S, c&2&T)
Coming after the battle ;) but here's another one:
sets.reduceLeft((s0,s1)=>s0.flatMap(a=>s1.map(a+"&"+_)))
Expanding on dsg's answer, you can write it more clearly (I think) this way, if you don't mind the curried function:
def combine[A](f: A => A => A)(xs: Iterable[Iterable[A]]) =
  xs reduceLeft { (x, y) => x.view flatMap { a => y map f(a) } }
Another alternative (slightly longer, but much more readable):
def combine[A](f: (A, A) => A)(xs: Iterable[Iterable[A]]) =
  xs reduceLeft { (x, y) => for (a <- x.view; b <- y) yield f(a, b) }
Usage:
combine[String](a => b => a + "&" + b)(sets) // curried version
combine[String](_ + "&" + _)(sets) // uncurried version
Expanding on @Patrick's answer.
Now it's more general and lazier:
def combine[A](f: (A, A) => A)(xs: Iterable[Iterable[A]]) =
  xs.reduceLeft { (x, y) => x.view.flatMap { a => y.map(f(a, _)) } }
Having it be lazy allows you to save space, since you don't store the exponentially many items in the expanded set; instead, you generate them on the fly. But, if you actually want the full set, you can still get it like so:
val expanded = combine{(x:String, y:String) => x + "&" + y}(sets).toSet
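As a small follow-up of my own, assuming the view-based combine above: because the intermediate results are views, you should be able to consume just a prefix of the product without forcing the whole expansion, e.g.
combine[String](_ + "&" + _)(sets).take(3).toList
Only enough of the product to yield those three strings gets evaluated.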