Scala conditional fold in Tuple(int,int) - scala

I have a list of tuples (int,int) such as
(100,3), (130,3), (160,1), (180,2), (200,2)
I want to foldRight or something efficient where the neighbors are compared. For ((A1,A2),(B1,B2)), we do a merge only when A2 is less than or equal to B2. Otherwise, we do not fold the list at that instance. If we merge, we retain (A1,A2) and add a count field.
The sample output is
(100,3,2) and (160,1,3)
here 2 and 3 are the weight of the observations folded to this one observation.
(100,3), (130,3)
will lead to (100,3,2)
while
(160,1), (180,2), (200,2)
will lead to (160,1,3)
Any idea how to write it in scala functional style?

scala> def conditionalFold(in: List[(Int, Int)]): List[(Int, Int, Int)] =
| in.foldLeft(Nil: List[(Int, Int, Int)]) { (acc, i) =>
| acc match {
| case Nil =>
| (i._1, i._2, 1) :: Nil
| case head :: tail =>
| if (i._2 >= head._2)
| (head._1, head._2, head._3 + 1) :: tail
| else
| (i._1, i._2, 1) :: head :: tail
| }
| }.reverse
conditionalFold: (in: List[(Int, Int)])List[(Int, Int, Int)]
scala> println(conditionalFold(List((100, 3), (130, 3), (160, 1), (180, 2), (200, 2))))
List((100,3,2), (160,1,3))
scala> println(conditionalFold(List((100,3), (130,3))))
List((100,3,2))
scala> println(conditionalFold(List((160,1), (180,2), (200,2))))
List((160,1,3))
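As a sketch only, the same fold can be written with tuple patterns instead of ._1/._2 accessors. The name conditionalFoldPatterns is made up for this illustration. Like the answer above, it compares each element with the head of the current group rather than with its immediate neighbor.

```scala
// Hypothetical variant of conditionalFold that destructures with patterns.
def conditionalFoldPatterns(in: List[(Int, Int)]): List[(Int, Int, Int)] =
  in.foldLeft(List.empty[(Int, Int, Int)]) {
    // Current group head (a1, a2) absorbs the element when a2 <= b2.
    case ((a1, a2, count) :: tail, (_, b2)) if a2 <= b2 =>
      (a1, a2, count + 1) :: tail
    // Otherwise (or on an empty accumulator) start a new group of weight 1.
    case (acc, (b1, b2)) =>
      (b1, b2, 1) :: acc
  }.reverse // groups were prepended, so restore input order
```

Because groups are prepended to the accumulator, each step is constant time and a single reverse at the end restores the original order.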

Related

Scala map with dependent variables

In Scala I have a list of functions that return a value. The order in which the functions are executed is important, since the argument of function n is the output of function n-1.
This hints to use foldLeft, something like:
val base: A
val funcs: Seq[Function[A, A]]
funcs.foldLeft(base)((x, f) => f(x))
(detail: type A is actually a Spark DataFrame).
However, the results of the functions are mutually exclusive, and in the end I want the union of the results of all the functions.
This hints to use a map, something like:
funcs.map(f => f(base)).reduce(_.union(_))
But here each function is applied to base which is not what I want.
Short: A list of variable length of ordered functions needs to return a list of equal length of return values, where each value n-1 was the input for function n (starting from base where n=0). Such that the result values can be concatenated.
How can I achieve this?
EDIT
example:
case class X(id:Int, value:Int)
val base = spark.createDataset(Seq(X(1, 1), X(2, 2), X(3, 3), X(4, 4), X(5, 5))).toDF
def toA = (x: DataFrame) => x.filter('value.mod(2) === 1).withColumn("value", lit("a"))
def toB = (x: DataFrame) => x.withColumn("value", lit("b"))
val a = toA(base)
val remainder = base.join(a, Seq("id"), "leftanti")
val b = toB(remainder)
a.union(b)
+---+-----+
| id|value|
+---+-----+
| 1| a|
| 3| a|
| 5| a|
| 2| b|
| 4| b|
+---+-----+
This should work for an arbitrary number of functions (e.g. toA, toB ... toN), where each time the remainder of the previous result is calculated and passed into the next function. In the end a union is applied to all results.
Seq already has a method scanLeft that does this out-of-the-box:
funcs.scanLeft(base)((acc, f) => f(acc)).tail
Make sure to drop the first element of the result of scanLeft if you don't want base to be included.
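A small illustration with plain Ints, assuming the same base and funcs values used in the REPL example further down in this answer thread (they stand in for the DataFrame case):

```scala
val base = 7
val funcs: List[Int => Int] = List(_ * 2, _ + 3)

// scanLeft keeps every intermediate accumulator, starting with base itself.
val withBase = funcs.scanLeft(base)((acc, f) => f(acc)) // List(7, 14, 17)

// Dropping the head leaves exactly one result per function.
val results = withBase.tail // List(14, 17)

// The per-function results can then be combined, here with + instead of union.
val total = results.reduce(_ + _) // 31
```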
Using only foldLeft it is possible too:
funcs.foldLeft((base, List.empty[A])){ case ((x, list), f) =>
val res = f(x)
(res, res :: list)
}._2.reverse.reduce(_.union(_))
Or:
funcs.foldLeft((base, Vector.empty[A])){ case ((x, list), f) =>
val res = f(x)
(res, list :+ res)
}._2.reduce(_.union(_))
The trick is to accumulate into a Seq inside the fold.
Example:
scala> val base = 7
base: Int = 7
scala> val funcs: List[Int => Int] = List(_ * 2, _ + 3)
funcs: List[Int => Int] = List($$Lambda$1772/1298658703@7d46af18, $$Lambda$1773/107346281@5470fb9b)
scala> funcs.foldLeft((base, Vector.empty[Int])){ case ((x, list), f) =>
| val res = f(x)
| (res, list :+ res)
| }._2
res8: scala.collection.immutable.Vector[Int] = Vector(14, 17)
scala> .reduce(_ + _)
res9: Int = 31
I've got a simplified solution using normal collections but the same principle applies.
val list: List[Int] = List(1, 2, 3, 4, 5)
val funcs: Seq[Function[List[Int], List[Int]]] = Seq(times2, by2)
funcs.foldLeft(list) { case(collection, func) => func(collection) } foreach println // prints 1 2 3 4 5
def times2(l: List[Int]): List[Int] = l.map(_ * 2)
def by2(l: List[Int]): List[Int] = l.map(_ / 2)
This approach does not apply if you want a single reduced value (e.g. a single Int) as your final output: every step has the shape F[B] -> F[B], so the chain is
F[B] -> F[B] -> F[B] and not F[B] -> F[B] -> B. I guess the former is what you need, though.
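None of the snippets above computes the "remainder" step from the question's edit. A rough sketch of that idea with plain collections follows; Row, applyToRemainders, rows, toA and toB are illustrative stand-ins for the DataFrame code (the left-anti join is approximated by filtering on id), not a Spark implementation.

```scala
// Hypothetical stand-in for a DataFrame row.
final case class Row(id: Int, value: String)

// Apply each function to whatever the previous functions did not claim,
// and collect every function's output.
def applyToRemainders(base: List[Row], funcs: List[List[Row] => List[Row]]): List[Row] = {
  val (_, results) = funcs.foldLeft((base, Vector.empty[List[Row]])) {
    case ((remaining, acc), f) =>
      val out = f(remaining)
      val claimed = out.map(_.id).toSet
      // Plain-collection analogue of a left-anti join on id.
      (remaining.filterNot(r => claimed(r.id)), acc :+ out)
  }
  results.flatten.toList // analogue of the final union
}

val rows = (1 to 5).map(i => Row(i, i.toString)).toList
val toA: List[Row] => List[Row] = _.filter(_.id % 2 == 1).map(_.copy(value = "a"))
val toB: List[Row] => List[Row] = _.map(_.copy(value = "b"))
val combined = applyToRemainders(rows, List(toA, toB))
// List(Row(1,a), Row(3,a), Row(5,a), Row(2,b), Row(4,b))
```

The fold state is a pair: the rows still unclaimed and the outputs gathered so far, which is the same "accumulate inside the fold" trick with one extra component.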

Scala recursion function that removes number from list needs work

I have a recursive method that should remove all the zeroes from a given list.
def removeZ(list:List[Int], n:Int):List[Int] = list match {
case Nil => Nil
case h::t=>
if (h == n)
t
else
h :: removeZ(t,n)
}
This removes one zero from the list but if the list has multiple zeros it won't. I tried adding another if else statement that didn't work such as:
if else(t==n)
removeZ(t,n)
How can I have all zeroes be removed?
That's because after the first 0 you return the tail unchanged. You have to keep iterating over it instead:
scala> def removeZ(list: List[Int], n: Int): List[Int] = list match {
| case Nil => Nil
| case h :: t =>
| if (h == n)
| removeZ(t, n) // 0 found, skip it and iterate the tail
| else
| h :: removeZ(t, n)
| }
removeZ: (list: List[Int], n: Int)List[Int]
scala> removeZ(List(1,0,2,0,3), 0)
res0: List[Int] = List(1, 2, 3)
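The corrected removeZ is not tail-recursive, so a very long list could overflow the stack. A possible accumulator-based sketch is shown here; removeAll and loop are names invented for this example. For everyday code, list.filter(_ != n) gives the same result.

```scala
import scala.annotation.tailrec

// Hypothetical tail-recursive variant of removeZ.
def removeAll(list: List[Int], n: Int): List[Int] = {
  @tailrec
  def loop(rest: List[Int], acc: List[Int]): List[Int] = rest match {
    case Nil              => acc.reverse      // acc was built back to front
    case h :: t if h == n => loop(t, acc)     // skip the match, keep going
    case h :: t           => loop(t, h :: acc)
  }
  loop(list, Nil)
}
```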

Does Scala have a statement equivalent to ML's "as" construct?

In ML, one can assign names for each element of a matched pattern:
fun findPair n nil = NONE
| findPair n (head as (n1, _))::rest =
if n = n1 then (SOME head) else (findPair n rest)
In this code, I defined an alias for the first pair of the list and matched the contents of the pair. Is there an equivalent construct in Scala?
You can do variable binding with the @ symbol, e.g.:
scala> val wholeList @ List(x, _*) = List(1,2,3)
wholeList: List[Int] = List(1, 2, 3)
x: Int = 1
I'm sure you'll get a more complete answer later as I'm not sure how to write it recursively like your example, but maybe this variation would work for you:
scala> val pairs = List((1, "a"), (2, "b"), (3, "c"))
pairs: List[(Int, String)] = List((1,a), (2,b), (3,c))
scala> val n = 2
n: Int = 2
scala> pairs find {e => e._1 == n}
res0: Option[(Int, String)] = Some((2,b))
OK, next attempt at direct translation. How about this?
scala> def findPair[A, B](n: A, p: List[Tuple2[A, B]]): Option[Tuple2[A, B]] = p match {
| case Nil => None
| case head::rest if head._1 == n => Some(head)
| case _::rest => findPair(n, rest)
| }
findPair: [A, B](n: A, p: List[(A, B)])Option[(A, B)]
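For comparison, here is a sketch of a closer translation of the ML code that uses an @ binding inside the list pattern, so the pair is both named and destructured. The name findPairAs is invented for this example.

```scala
// head names the whole pair while (n1, _) destructures it, like ML's "as".
def findPairAs[A, B](n: A, pairs: List[(A, B)]): Option[(A, B)] = pairs match {
  case Nil => None
  case (head @ (n1, _)) :: rest =>
    if (n == n1) Some(head) else findPairAs(n, rest)
}
```

For example, findPairAs(2, List((1, "a"), (2, "b"), (3, "c"))) evaluates to Some((2, "b")).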

How to generate the power set of a set in Scala

I have a Set of items of some type and want to generate its power set.
I searched the web and couldn't find any Scala code that addresses this specific task.
This is what I came up with. It allows you to restrict the cardinality of the sets produced by the length parameter.
def power[T](set: Set[T], length: Int) = {
var res = Set[Set[T]]()
res ++= set.map(Set(_))
for (i <- 1 until length)
res = res.map(x => set.map(x + _)).flatten
res
}
This will not include the empty set. To accomplish this you would have to change the last line of the method simply to res + Set()
Any suggestions how this can be accomplished in a more functional style?
Looks like no-one knew about it back in July, but there's a built-in method: subsets.
scala> Set(1,2,3).subsets foreach println
Set()
Set(1)
Set(2)
Set(3)
Set(1, 2)
Set(1, 3)
Set(2, 3)
Set(1, 2, 3)
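The standard library also has an overload, subsets(len), that yields only the subsets of a given size. A sketch of how it could cover the cardinality limit from the question follows; powerUpTo is a made-up helper name, and unlike the question's version it includes the empty set.

```scala
// All subsets with at most `length` elements, built from the library's subsets(len).
def powerUpTo[T](set: Set[T], length: Int): Set[Set[T]] =
  (0 to length).flatMap(n => set.subsets(n)).toSet

val pairsOnly = Set(1, 2, 3).subsets(2).toSet
// Set(Set(1, 2), Set(1, 3), Set(2, 3))
```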
Notice that if you have a set S and another set T where T = S ∪ {x} (i.e. T is S with one element added) then the powerset of T - P(T) - can be expressed in terms of P(S) and x as follows:
P(T) = P(S) ∪ { p ∪ {x} | p ∈ P(S) }
That is, you can define the powerset recursively (notice how this gives you the size of the powerset for free - i.e. adding 1-element doubles the size of the powerset). So, you can do this tail-recursively in scala as follows:
scala> def power[A](t: Set[A]): Set[Set[A]] = {
| @annotation.tailrec
| def pwr(t: Set[A], ps: Set[Set[A]]): Set[Set[A]] =
| if (t.isEmpty) ps
| else pwr(t.tail, ps ++ (ps map (_ + t.head)))
|
| pwr(t, Set(Set.empty[A])) //Powerset of ∅ is {∅}
| }
power: [A](t: Set[A])Set[Set[A]]
Then:
scala> power(Set(1, 2, 3))
res2: Set[Set[Int]] = Set(Set(1, 2, 3), Set(2, 3), Set(), Set(3), Set(2), Set(1), Set(1, 3), Set(1, 2))
It actually looks much nicer doing the same with a List (i.e. a recursive ADT):
scala> def power[A](s: List[A]): List[List[A]] = {
| @annotation.tailrec
| def pwr(s: List[A], acc: List[List[A]]): List[List[A]] = s match {
| case Nil => acc
| case a :: as => pwr(as, acc ::: (acc map (a :: _)))
| }
| pwr(s, Nil :: Nil)
| }
power: [A](s: List[A])List[List[A]]
Here's one of the more interesting ways to write it:
import scalaz._, Scalaz._
def powerSet[A](xs: List[A]) = xs filterM (_ => true :: false :: Nil)
Which works as expected:
scala> powerSet(List(1, 2, 3)) foreach println
List(1, 2, 3)
List(1, 2)
List(1, 3)
List(1)
List(2, 3)
List(2)
List(3)
List()
See for example this discussion thread for an explanation of how it works.
(And as debilski notes in the comments, ListW also pimps powerset onto List, but that's no fun.)
Use the built-in combinations function:
val xs = Seq(1,2,3)
(0 to xs.size) flatMap xs.combinations
// Vector(List(), List(1), List(2), List(3), List(1, 2), List(1, 3), List(2, 3),
// List(1, 2, 3))
Note, I cheated and used a Seq, because for reasons unknown, combinations is defined on SeqLike. So with a set, you need to convert to/from a Seq:
val xs = Set(1,2,3)
(0 to xs.size).flatMap(xs.toSeq.combinations).map(_.toSet).toSet
//Set(Set(1, 2, 3), Set(2, 3), Set(), Set(3), Set(2), Set(1), Set(1, 3),
//Set(1, 2))
Can be as simple as:
def powerSet[A](xs: Seq[A]): Seq[Seq[A]] =
xs.foldLeft(Seq(Seq[A]())) {(sets, set) => sets ++ sets.map(_ :+ set)}
Recursive implementation:
def powerSet[A](xs: Seq[A]): Seq[Seq[A]] = {
def go(xsRemaining: Seq[A], sets: Seq[Seq[A]]): Seq[Seq[A]] = xsRemaining match {
case Nil => sets
case y +: ys => go(ys, sets ++ sets.map(_ :+ y)) // +: matches any Seq; :: would match only a List
}
go(xs, Seq[Seq[A]](Seq[A]()))
}
All the other answers seemed a bit complicated, here is a simple function:
def powerSet (l:List[_]) : List[List[Any]] =
l match {
case Nil => List(List())
case x::xs =>
val a = powerSet(xs)
a.map(n => n:::List(x)):::a
}
so
powerSet(List('a','b','c'))
will produce the following result
res0: List[List[Any]] = List(List(c, b, a), List(b, a), List(c, a), List(a), List(c, b), List(b), List(c), List())
Here's another (lazy) version... since we're collecting ways of computing the power set, I thought I'd add it:
def powerset[A](s: Seq[A]) =
Iterator.range(0, 1 << s.length).map(i =>
Iterator.range(0, s.length).withFilter(j =>
(i >> j) % 2 == 1
).map(s)
)
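To make the bitmask idea concrete: each integer from 0 until 2^n encodes one subset, and bit j of the integer says whether element j is included. Below is a sketch of the same technique that returns Lists so the output is easy to inspect; lazyPowerSet is a name chosen for this example. Because the mask is an Int, it only works for sequences of fewer than 31 elements.

```scala
// Lazily enumerate subsets; nothing is computed until the iterator is consumed.
def lazyPowerSet[A](s: IndexedSeq[A]): Iterator[List[A]] =
  Iterator.range(0, 1 << s.length).map { mask =>
    // Keep element j when bit j of mask is set.
    s.indices.collect { case j if ((mask >> j) & 1) == 1 => s(j) }.toList
  }

val firstFour = lazyPowerSet(Vector(1, 2, 3)).take(4).toList
// List(List(), List(1), List(2), List(1, 2))
```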
Here's a simple, recursive solution using a helper function:
def concatElemToList[A](a: A, list: List[A]): List[Any] = (a,list) match {
case (x, Nil) => List(List(x))
case (x, ((h:List[_]) :: t)) => (x :: h) :: concatElemToList(x, t)
case (x, (h::t)) => List(x, h) :: concatElemToList(x, t)
}
def powerSetRec[A] (a: List[A]): List[Any] = a match {
case Nil => List()
case (h::t) => powerSetRec(t) ++ concatElemToList(h, powerSetRec (t))
}
so the call of
powerSetRec(List("a", "b", "c"))
will give the result
List(List(c), List(b, c), List(b), List(a, c), List(a, b, c), List(a, b), List(a))

Scala Get First and Last elements of List using Pattern Matching

I am pattern matching on a list. Is there any way I can access the first and last elements of the list to compare them?
I want to do something like:
case List(x, _*, y) if(x == y) => true
or
case x :: _* :: y =>
or something similar...
where x and y are the first and last elements of the list.
How can I do that? Any ideas?
Use the standard :+ and +: extractors from the scala.collection package
ORIGINAL ANSWER
Define a custom extractor object.
object :+ {
def unapply[A](l: List[A]): Option[(List[A], A)] = {
if(l.isEmpty)
None
else
Some((l.init, l.last))
}
}
Can be used as:
val first :: (l :+ last) = List(3, 89, 11, 29, 90)
println(first + " " + l + " " + last) // prints 3 List(89, 11, 29) 90
(For your case: case x :: (_ :+ y) if(x == y) => true)
In case you missed the obvious:
case list @ (head :: tail) if head == list.last => true
The head::tail part is there so you don’t match on the empty list.
simply:
case head +: _ :+ last =>
for example:
scala> val items = Seq("ham", "spam", "eggs")
items: Seq[String] = List(ham, spam, eggs)
scala> items match {
| case head +: _ :+ last => Some((head, last))
| case List(head) => Some((head, head))
| case _ => None
| }
res0: Option[(String, String)] = Some((ham,eggs))
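Applied to the original question (comparing the first and last elements), the same patterns could be used roughly as follows. sameEnds is a name invented for this sketch, and treating a one-element sequence as a match is an assumption about the desired behavior.

```scala
def sameEnds[A](xs: Seq[A]): Boolean = xs match {
  case first +: _ :+ last => first == last // needs at least two elements
  case Seq(_)             => true          // single element: first and last coincide
  case _                  => false         // empty sequence
}
```

The explicit single-element case is needed because head +: _ :+ last first splits off the last element and then requires the remaining prefix to be non-empty.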
Let's understand the concept behind this question: there is a difference between '::', '+:' and ':+'.
1st Operator:
'::' - It is a right-associative operator that works specifically on lists.
scala> val a :: b :: c = List(1,2,3,4)
a: Int = 1
b: Int = 2
c: List[Int] = List(3, 4)
2nd Operator:
'+:' - It is also a right-associative operator, but it works on Seq, which is more general than just List.
scala> val a +: b +: c = List(1,2,3,4)
a: Int = 1
b: Int = 2
c: List[Int] = List(3, 4)
3rd Operator:
':+' - It is a left-associative operator, and like '+:' it works on Seq, which is more general than just List.
scala> val a :+ b :+ c = List(1,2,3,4)
a: List[Int] = List(1, 2)
b: Int = 3
c: Int = 4
The associativity of an operator is determined by the operator’s last character. Operators ending in a colon ‘:’ are right-associative. All other operators are left-associative.
A left-associative binary operation e1 op e2 is interpreted as e1.op(e2).
If op is right-associative, the same operation is interpreted as { val x=e1; e2.op(x) }, where x is a fresh name.
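A short illustration of these rules on the expression side (the value names are arbitrary): the infix form and the explicit method call produce the same result.

```scala
// Ends in ':' so it is right-associative: the method is called on the right operand.
val prepended = 1 +: List(2, 3)          // List(1, 2, 3)
val prependedExplicit = List(2, 3).+:(1)

// Does not end in ':' so it is left-associative: the method is called on the left operand.
val appended = List(1, 2) :+ 3           // List(1, 2, 3)
val appendedExplicit = List(1, 2).:+(3)

// A chain of :: groups from the right: 1 :: (2 :: Nil).
val consed = 1 :: 2 :: Nil
val consedExplicit = Nil.::(2).::(1)
```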
Now for the answer to your question:
If you need to get the first and last elements of the list, use the following code:
scala> val firstElement +: b :+ lastElement = List(1,2,3,4)
firstElement: Int = 1
b: List[Int] = List(2, 3)
lastElement: Int = 4