Group, map & reduce with two different reducer-operators - scala

I have these tuples:
("T1",2,"x1"),
("T1",2,"x2"),
// … etc
And I want to reduce them to ("T1", 4, List("x1", "x2")). How can I do this?
I tried something like .groupBy(_._1).map{case (key,list) => key-> list.map(_._2).reduce(_+_)}
But this does not work: it just sums the numbers without collecting the strings into a list.

With groupMapReduce:
val xs = List(
("T1",40,"x1"),
("T1",2,"x2"),
("T2",58,"x3")
)
println(xs.groupMapReduce(_._1)
(e => (e._2, List(e._3)))
({ case ((x, y), (z, w)) => (x + z, y ++ w)})
)
With groupBy:
val xs = List(
("T1",40,"x1"),
("T1",2,"x2"),
("T2",58,"x3")
)
println(xs.groupBy(_._1)
.view
.mapValues(ys => (ys.view.map(_._2).sum, ys.map(_._3)))
.toMap
)
If you want to do it in one pass per list and avoid ++, you could try something like this:
xs.groupBy(_._1)
.view
.mapValues(ys =>
ys.foldRight((0, List.empty[String])){
case ((_, n, x), (sum, acc)) => (n + sum, x :: acc)
}
)
.toMap
All three variants give
Map(T2 -> (58,List(x3)), T1 -> (42,List(x1, x2)))
Note that combining many lists with ++ might become very inefficient if the number of lists becomes large. It depends on your use-case whether this is acceptable or not.
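To illustrate the point about ++, here is a minimal sketch that never concatenates lists: it prepends each string (O(1)) and reverses every list once at the end. The object and method names are made up for this example, and it assumes the same tuple shape as in the question.

```scala
object GroupSumCollect {
  // Hypothetical helper: sums the Int field and collects the String field per key.
  def groupSumCollect(xs: List[(String, Int, String)]): Map[String, (Int, List[String])] =
    xs.foldLeft(Map.empty[String, (Int, List[String])]) {
      case (acc, (key, n, label)) =>
        val (sum, labels) = acc.getOrElse(key, (0, List.empty[String]))
        // Prepending is O(1); the order is fixed by the single reverse below.
        acc.updated(key, (sum + n, label :: labels))
    }.map { case (key, (sum, labels)) => (key, (sum, labels.reverse)) }
}
```

Whether this is actually faster than ++ for your data is something to measure; for a handful of short lists the difference is unlikely to matter.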

Using foldLeft
val tuples = List(
("T1",2,"x1"),
("T1",2,"x2"),
("T2",2,"x1"),
("T2",2,"x2"),
("T3",2,"x1")
)
tuples.foldLeft(Map.empty[String, (Int, List[String])]){ (acc, curr) =>
acc.get(curr._1).fold(acc + (curr._1 -> (curr._2, List(curr._3)))) { case (int, ls) =>
// append to keep input order; reversing on every step would scramble lists with 3+ elements
acc + (curr._1 -> (int + curr._2, ls :+ curr._3))
}
}

If you have cats in scope, all you need to do is this:
import cats.data.Chain
import cats.syntax.all._
def combineTripletes(data: List[(String, Int, String)]): Map[String, (Int, List[String])] =
data.foldMap {
case (key, i, str) =>
Map(key -> (i, Chain.one(str)))
} fmap {
case (sum, chain) =>
sum -> chain.toList
}

Related

Merging list of tuples in scala based on key

I have a list of tuples that looks like this:
Seq("ptxt"->"how","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
Based on the keys, I want to merge each ptxt value with all the list values that come after it.
E.g.
create a new Seq that looks like this:
Seq("how you doing", "whats up", "this is cool")
You could fold your Seq with foldLeft:
val s = Seq("ptxt"->"how ","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
val r: Seq[String] = s.foldLeft(List[String]()) {
case (xs, ("ptxt", s)) => s :: xs
case (x :: xs, ("list", s)) => (x + s) :: xs
}.reverse
If you don't care about the order, you can omit reverse.
foldLeft takes two arguments: the first is the initial value, and the second is a function of two arguments (the result so far and the next element of the sequence). The result of each call is then fed into the next call as its first argument.
For example, for numbers, this foldLeft would just sum all elements, starting from the left.
List(5, 4, 8, 6, 2).foldLeft(0) { (result, i) =>
result + i
} // 25
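If it helps to see the intermediate results that get fed into each next call, scanLeft keeps all of them instead of only the last one. A small sketch (the object name is just for illustration):

```scala
object FoldTrace {
  val numbers = List(5, 4, 8, 6, 2)

  // scanLeft returns every intermediate accumulator, starting with the initial value.
  val steps: List[Int] = numbers.scanLeft(0)((result, i) => result + i)
  // steps == List(0, 5, 9, 17, 23, 25)

  // foldLeft returns only the final accumulator.
  val total: Int = numbers.foldLeft(0)((result, i) => result + i)
  // total == 25
}
```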
In our case, we start with an empty list. Then we provide a function that handles two cases using pattern matching.
The case when the key is "ptxt". Here we just prepend the value to the list.
case (xs, ("ptxt", s)) => s :: xs
The case when the key is "list". Here we take the first string from the list (using pattern matching), concatenate the value to it, and then put it back in front of the rest of the list.
case (x :: xs, ("list", s)) => (x + s) :: xs
At the end, since we were prepending elements, we need to reverse our list. Why prepend rather than append? Because appending to an immutable List is O(n) while prepending is O(1), so it's more efficient.
Here is another solution:
val data = Seq("ptxt"->"how","list"->"you doing","ptxt"->"whats", "list" -> "up","ptxt"-> "this ", "list"->"is cool")
First group Keys and Values:
val grouped = data.groupBy(_._1)
.map{case (k, l) => k -> l.map{case (_, v) => v.trim}}
// > Map(list -> List(you doing, up, is cool), ptxt -> List(how, whats, this))
Then zip and concatenate the two values:
grouped("ptxt").zip(grouped("list"))
.map{case (a, b) => s"$a $b"}
// > List(how you doing, whats up, this is cool)
Disclaimer: This only works if the list always alternates key, value, key, value, ... so I had to adjust the input data.
If you change Seq for List, you can solve that with a simple tail-recursive function.
(The code uses Scala 2.13, but can be rewritten to use older Scala versions if needed)
def mergeByKey[K](list: List[(K, String)]): List[String] = {
@annotation.tailrec
def loop(remaining: List[(K, String)], acc: Map[K, StringBuilder]): List[String] =
remaining match {
case Nil =>
acc.valuesIterator.map(_.result()).toList
case (key, value) :: tail =>
loop(
remaining = tail,
acc.updatedWith(key) {
case None => Some(new StringBuilder(value))
case Some(oldValue) => Some(oldValue.append(value))
}
)
}
loop(remaining = list, acc = Map.empty)
}
val data = List("ptxt"->"how","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
mergeByKey(data)
// res: List[String] = List("howwhats upthis ", "you doingis cool")
Or a one liner using groupMap.
(inspired by pme's answer)
data.groupMap(_._1)(_._2).view.mapValues(_.mkString).valuesIterator.toList
Adding another answer since I don't have enough reputation points to add a comment. This is just an improvement on Krzysztof Atłasik's answer. To handle the case where the Seq starts with a "list" entry, you might want to add another case:
case (xs,("list", s)) if xs.isEmpty=>xs
So the final code could be something like:
val s = Seq("list"->"how ","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
val r: Seq[String] = s.foldLeft(List[String]()) {
case (xs,("list", s)) if xs.isEmpty=>xs
case (xs, ("ptxt", s)) => s :: xs
case (x :: xs, ("list", s)) => (x + s) :: xs
}.reverse
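One gap that remains in the fold above: a key other than "ptxt" or "list" still throws a MatchError. Here is a sketch with a catch-all case that skips such entries, along with a leading "list". Skipping them is my assumption about the desired behaviour, and the object and method names are only for this example.

```scala
object MergeAfterPtxt {
  def merge(pairs: Seq[(String, String)]): List[String] =
    pairs.foldLeft(List.empty[String]) {
      case (xs, ("ptxt", s))      => s :: xs        // start a new merged string
      case (x :: xs, ("list", s)) => (x + s) :: xs  // extend the most recent string
      case (xs, _)                => xs             // leading "list" or unknown key: skip
    }.reverse
}
```

With the input from this answer, MergeAfterPtxt.merge(s) gives List("whats up", "this is cool"), since the two leading "list" entries have no "ptxt" to attach to.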

Scala - state while looping through a list

Newbie question.
I am looping through a list and need to keep state between the items.
For instance
val l = List("a", "1", "2", "3", "b", "4")
var state: String = ""
l.foreach(o => {
if (toInt(o).isEmpty) state = o else println(state + o.toString)
})
What's the alternative to using var here?
You should keep in mind that it's sometimes (read: when it makes the code more readable and maintainable by others) okay to use mutability when performing some operation that's easily expressed with mutable state as long as that mutable state is confined to as little of your program as possible. Using (e.g.) foldLeft to maintain an accumulator here without using a var doesn't gain you much.
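As a rough sketch of what "confined" can look like for the code in the question: the var lives inside one method and the caller only sees an immutable List. Note that toInt is not shown in the question, so the definition here is an assumption (based on toIntOption from Scala 2.13), and the object and method names are made up.

```scala
object ConfinedState {
  // Assumed helper: the question calls toInt without showing its definition.
  def toInt(s: String): Option[Int] = s.toIntOption

  def labelled(items: List[String]): List[String] = {
    var state = ""                          // mutable, but never escapes this method
    val out = List.newBuilder[String]
    items.foreach { o =>
      if (toInt(o).isEmpty) state = o       // non-number: becomes the current prefix
      else out += state + o                 // number: emit prefix + number
    }
    out.result()
  }
}
```

It returns the strings instead of printing them, which also makes it easier to test.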
That said, here's one way to go about doing this:
val listOfThings: Seq[Either[Char, Int]] = Seq(Left('a'), Right(11), Right(212), Left('b'), Right(89))
val result = listOfThings.foldLeft(Seq[(Char, Seq[Int])]()) {
case (accumulator, Left(nextChar)) => accumulator :+ (nextChar, Seq.empty)
case (accumulator, Right(nextInt)) =>
val (currentChar, currentSequence) = accumulator.last
accumulator.dropRight(1) :+ (currentChar, currentSequence :+ nextInt)
}
result foreach {
case (char, numbers) => println(numbers.map(num => s"$char-$num").mkString(" "))
}
Use foldLeft:
l.foldLeft(""){ (state, o) =>
if(toInt(o).isEmpty) o
else {
println(state + o.toString)
state
}
}
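If you would rather not println inside the fold, one option is to carry the output in the accumulator next to the state. A sketch under the same assumption as the question (toInt is not shown there, so it is defined here via toIntOption; the names are hypothetical):

```scala
object PureFold {
  // Assumed helper: the question calls toInt without showing its definition.
  def toInt(s: String): Option[Int] = s.toIntOption

  def labelled(items: List[String]): List[String] =
    items.foldLeft(("", List.empty[String])) {
      // non-number: replace the state, output unchanged
      case ((_, out), o) if toInt(o).isEmpty => (o, out)
      // number: keep the state, prepend the combined string
      case ((state, out), o)                 => (state, (state + o) :: out)
    }._2.reverse
}
```

PureFold.labelled(List("a", "1", "2", "3", "b", "4")) gives List("a1", "a2", "a3", "b4"), and you can print it afterwards with foreach(println).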
Pass an arg:
scala> def collapse(header: String, vs: List[String]): Unit = vs match {
| case Nil =>
| case h :: t if h.forall(Character.isDigit) => println(s"$header$h") ; collapse(header, t)
| case h :: t => collapse(h, t)
| }
collapse: (header: String, vs: List[String])Unit
scala> collapse("", vs)
a1
a2
a3
b4
As simple as:
val list: List[Int] = List.range(1, 10) // Create list
def updateState(i : Int) : Int = i + 1 // Generate new state, just add one to each position. That will be the state
list.foldRight[List[(Int,Int)]](List())((a, b) => (a, updateState(a)) :: b)
Note that the result is a list of Tuple2: (Element, State), and each state depends on the element of the list.
Hope this helps
There are two major options to pass a state in functional programming when processing collections (I assume you want to get your result as a variable):
Recursion (classic)
import scala.collection.mutable.ListBuffer
val xs = List("a", "11", "212", "b", "89")
@annotation.tailrec
def fold(seq: ListBuffer[(String, ListBuffer[String])],
xs: Seq[String]): ListBuffer[(String, ListBuffer[String])] = {
(seq, xs) match {
case (_, Nil) =>
seq
case (_, c :: tail) if toInt(c).isEmpty =>
fold(seq :+ ((c, ListBuffer[String]())), tail)
case (init :+ ((c, seq)), i :: tail) =>
fold(init :+ ((c, seq :+ i)), tail)
}
}
val result =
fold(ListBuffer[(String, ListBuffer[String])](), xs)
// Get rid of mutable ListBuffer
.toSeq
.map {
case (c, seq) =>
(c, seq.toSeq)
}
//> List((a,List(11, 212)), (b,List(89)))
foldLeft et al.
val xs = List("a", "11", "212", "b", "89")
val result =
xs.foldLeft(
ListBuffer[(String, ListBuffer[String])]()
) {
case (seq, c) if toInt(c).isEmpty =>
seq :+ ((c, ListBuffer[String]()))
case (init :+ ((c, seq)), i) =>
init :+ ((c, seq :+ i))
}
// Get rid of mutable ListBuffer
.toSeq
.map {
case (c, seq) =>
(c, seq.toSeq)
}
//> List((a,List(11, 212)), (b,List(89)))
Which one is better? Unless you want to abort processing in the middle of your collection (as find does, for example), foldLeft is generally preferred and has slightly less boilerplate, but otherwise they are very similar.
I'm using ListBuffer here to avoid reversing lists.

What is the idiomatic way to both filter items out of a list and count them in Scala

I find that I often end up with a list of Options (or Eithers or Trys) and I want to count the number of Nones before I flatten the list. Is there a nice idiomatic way to do this that doesn't require I process the list multiple times?
Something like this but better:
val sprockets: List[Option[Sprockets]] = getSprockets()
println("this many sprockets failed to be parsed: " + sprockets.count(_.isEmpty))
println(sprockets.flatten)
I would have used a fold as Daenyth suggested, for example something like this:
val list = List(Some(1),None,Some(0),Some(3),None)
val (flatList,count) = list.foldLeft((List[Int](),0)){
case ((data,count), Some(x)) => (data :+ x, count)
case ((data,count), None) => (data, count +1)
}
//output
//flatList: List[Int] = List(1, 0, 3)
//count: Int = 2
Recursion maybe?
import scala.annotation.tailrec
import scala.collection.immutable.Queue
@tailrec
def flattenAndCountNones[A](in: Seq[Option[A]], out: Seq[A] = Queue.empty[A], n: Int = 0): (Seq[A], Int) = in match {
case Nil => (out, n)
case Some(x) :: tail => flattenAndCountNones(tail, out :+ x, n)
case None :: tail => flattenAndCountNones(tail, out, n + 1)
}
Is this what you're looking for?
val foo = List(Some(3), Some(4), None:Option[Int], Some(5), Some(6))
val (concatenatedList, emptyCount) =
foo.map(entry =>
(entry.toList, if (entry.isEmpty) 1 else 0)
).fold((List[Int](), 0))((a, b) =>
(a._1 ++ b._1, a._2 + b._2)
)
It is one pass, but I'm not sure if it's really any more efficient than doing it in two - the extra object creation (the Tuple2s) in this case is going to offset the extra loop in the two-pass case.
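For what it's worth, on Scala 2.13+ partitionMap can do the split in one traversal of the input without a hand-written fold. A sketch (the object and method names are mine, not from the question):

```scala
object FlattenAndCount {
  // Returns the defined values (in order) together with the number of Nones.
  def flattenAndCount[A](in: List[Option[A]]): (List[A], Int) = {
    // None becomes Left(()), Some(x) becomes Right(x); partitionMap separates the two sides.
    val (nones, values) = in.partitionMap(_.toRight(()))
    (values, nones.size)
  }
}
```

For a list of Eithers the same idea should work with partitionMap(identity), which gives you the Lefts themselves rather than just a count.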

Converting List to Map using keys in Scala

I am struggling with finding an elegant FP approach to solving the following problem in Scala:
Say I have a set of candidate keys
val validKeys = Set("key1", "key2", "key3")
And a list that:
- starts with a key
- has some number of non-keys (> 0) between each key
- does not end with a key
For example:
val myList = List("key3", "foo", "bar", "key1", "baz")
I'd like to transform this list into a map, using the valid keys as keys and aggregating the non-keys that follow each one as its value. So, in the example above:
("key3" -> "foo\nbar", "key1" -> "baz")
Thanks in advance.
Short and simple:
def create(a: List[String]): Map[String, String] = a match {
case Nil => Map()
case head :: tail =>
val (vals, rest) = tail.span(!validKeys(_))
create(rest) + (head -> vals.mkString("\n"))
}
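The version above recurses once per key before it starts building the map, so it is not tail-recursive. If the list can contain a very large number of keys, a variant with an accumulator might be safer. This is only a sketch: the wrapper object and the acc parameter are my additions, and it assumes each key appears once (with duplicates, the later occurrence wins here).

```scala
object KeyedChunks {
  val validKeys = Set("key1", "key2", "key3")

  @annotation.tailrec
  def create(a: List[String], acc: Map[String, String] = Map.empty): Map[String, String] =
    a match {
      case Nil => acc
      case head :: tail =>
        // everything up to the next valid key belongs to head
        val (vals, rest) = tail.span(!validKeys(_))
        create(rest, acc + (head -> vals.mkString("\n")))
    }
}
```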
Traversing a list from left to right, accumulating a result should suggest foldLeft
myList.foldLeft((Map[String, String](), "")) {
case ((m, lk), s) =>
if (validKeys contains s)
(m updated (s, ""), s)
else (m updated (lk, if (m(lk) == "") s else m(lk) + "\n" + s), lk)
}._1
// Map(key3 -> foo\nbar, key1 -> baz)
As a first approximation solution:
def group(list:List[String]):List[(String, List[String])] = {
@tailrec
def grp(list:List[String], key:String, acc:List[String]):List[(String, List[String])] =
list match {
case Nil => List((key, acc.reverse))
case x :: xs if validKeys(x) => (key, acc.reverse)::group(x::xs)
case x :: xs => grp(xs, key, x::acc)
}
list match {
case Nil => Nil
case x::xs => grp(xs, x, List())
}
}
val map = group(myList).toMap
Another option:
list.foldLeft((Map[String, String](), "")) {
case ((map, key), item) if validKeys(item) => (map, item)
case ((map, key), item) =>
(map.updated(key, map.get(key).map(v => v + "\n" + item).getOrElse(item)), key)
}._1

n-way `span` on sequences

Given a sequence of elements and a predicate p, I would like to produce a sequence of sequences such that, in each subsequence, either all elements satisfy p or the sequence has length 1. Additionally, calling .flatten on the result should give me back my original sequence (so no re-ordering of elements).
For instance, given:
val l = List(2, 4, -6, 3, 1, 8, 7, 10, 0)
val p = (i : Int) => i % 2 == 0
I would like magic(l,p) to produce:
List(List(2, 4, -6), List(3), List(1), List(8), List(7), List(10, 0))
I know of .span, but that method stops the first time it encounters a value that doesn't satisfy p and just returns a pair.
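To make that limitation concrete, this is what span gives for the example input (a small sketch; the wrapper object is only there to keep it self-contained):

```scala
object SpanDemo {
  val l = List(2, 4, -6, 3, 1, 8, 7, 10, 0)
  val p = (i: Int) => i % 2 == 0

  // span splits exactly once, at the first element that fails p.
  val (prefix, rest) = l.span(p)
  // prefix == List(2, 4, -6)
  // rest   == List(3, 1, 8, 7, 10, 0)
}
```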
Below is a candidate implementation. It does what I want, but, well, makes me want to cry. I would love for someone to come up with something slightly more idiomatic.
def magic[T](elems : Seq[T], p : T=>Boolean) : Seq[Seq[T]] = {
val loop = elems.foldLeft[(Boolean,Seq[Seq[T]])]((false,Seq.empty)) { (pr,e) =>
val (lastOK,s) = pr
if(lastOK && p(e)) {
(true, s.init :+ (s.last :+ e))
} else {
(p(e), s :+ Seq(e))
}
}
loop._2
}
(Note that I do not particularly care about preserving the actual type of the Seq.)
I would not use foldLeft. It's just a simple recursion of span with a special rule if the head doesn't match the predicate:
def magic[T](elems: Seq[T], p: T => Boolean): Seq[Seq[T]] =
elems match {
case Seq() => Seq()
case Seq(head, tail @ _*) if !p(head) => Seq(head) +: magic(tail, p)
case xs =>
val (prefix, rest) = xs span p
prefix +: magic(rest, p)
}
You could also do it tail-recursive, but you need to remember to reverse the output if you're prepending (as is sensible):
def magic[T](elems: Seq[T], p: T => Boolean): Seq[Seq[T]] = {
def iter(elems: Seq[T], out: Seq[Seq[T]]) : Seq[Seq[T]] =
elems match {
case Seq() => out.reverse
case Seq(head, tail @ _*) if !p(head) => iter(tail, Seq(head) +: out)
case xs =>
val (prefix, rest) = xs span p
iter(rest, prefix +: out)
}
iter(elems, Seq())
}
For this task you can use takeWhile and drop combined with a little pattern matching and recursion:
def magic[T](elems : Seq[T], p : T=>Boolean) : Seq[Seq[T]] = {
def magic(elems: Seq[T], result: Seq[Seq[T]]): Seq[Seq[T]] = elems.takeWhile(p) match {
// if elems is Nil, we have a result
case Nil if elems.isEmpty => result
// if it's not, but we don't get any values from takeWhile, we take a single elem
case Nil => magic(elems.tail, result :+ Seq(elems.head))
// takeWhile gave us something, so we add it to the result
// and drop as many elements from elems, as takeWhile gave us
case xs => magic(elems.drop(xs.size), result :+ xs)
}
magic(elems, Seq())
}
Another solution using a fold:
def magicFilter[T](seq: Seq[T], p: T => Boolean): Seq[Seq[T]] = {
val (filtered, current) = (seq foldLeft (Seq[Seq[T]](), Seq[T]())) {
case ((filtered, current), element) if p(element) => (filtered, current :+ element)
case ((filtered, current), element) if !current.isEmpty => (filtered :+ current :+ Seq(element), Seq())
case ((filtered, current), element) => (filtered :+ Seq(element), Seq())
}
if (!current.isEmpty) filtered :+ current else filtered
}
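On Scala 2.13+ the same "span, or take a single element" rule can also be written with unfold, which handles the looping for you. This is only a sketch of that idea and not taken from any of the answers above; the object name is hypothetical.

```scala
object MagicUnfold {
  def magic[T](elems: Seq[T], p: T => Boolean): Seq[Seq[T]] =
    Seq.unfold[Seq[T], Seq[T]](elems) { rest =>
      if (rest.isEmpty) None                                      // nothing left: stop
      else if (!p(rest.head)) Some((Seq(rest.head), rest.tail))   // failing element: own group
      else Some(rest.span(p))                                     // run of passing elements
    }
}
```

With l and p from the question, MagicUnfold.magic(l, p) gives List(List(2, 4, -6), List(3), List(1), List(8), List(7), List(10, 0)), and flattening the result gives back l.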