Scala - state while looping through a list - scala

Newbie question.
I am looping through a list and need keep state in between the items.
For instance
val l = List("a", "1", "2", "3", "b", "4")
var state: String = ""
l.foreach(o => {
if (toInt(o).isEmpty) state = o else println(state + o.toString)
})
what's the alternative for the usage of var here?

You should keep in mind that it's sometimes (read: when it makes the code more readable and maintainable by others) okay to use mutability when performing some operation that's easily expressed with mutable state as long as that mutable state is confined to as little of your program as possible. Using (e.g.) foldLeft to maintain an accumulator here without using a var doesn't gain you much.
That said, here's one way to go about doing this:
val listOfThings: Seq[Either[Char, Int]] = Seq(Left('a'), Right(11), Right(212), Left('b'), Right(89))
val result = listOfThings.foldLeft(Seq[(Char, Seq[Int])]()) {
case (accumulator, Left(nextChar)) => accumulator :+ (nextChar, Seq.empty)
case (accumulator, Right(nextInt)) =>
val (currentChar, currentSequence) = accumulator.last
accumulator.dropRight(1) :+ (currentChar, currentSequence :+ nextInt)
}
result foreach {
case (char, numbers) => println(numbers.map(num => s"$char-$num").mkString(" "))
}

Use foldLeft:
l.foldLeft(""){ (state, o) =>
if(toInt(o).isEmpty) o
else {
println(state + o.toString)
state
}
}

Pass an arg:
scala> def collapse(header: String, vs: List[String]): Unit = vs match {
| case Nil =>
| case h :: t if h.forall(Character.isDigit) => println(s"$header$h") ; collapse(header, t)
| case h :: t => collapse(h, t)
| }
collapse: (header: String, vs: List[String])Unit
scala> collapse("", vs)
a1
a2
a3
b4

As simple as:
val list: List[Int] = List.range(1, 10) // Create list
def updateState(i : Int) : Int = i + 1 // Generate new state, just add one to each position. That will be the state
list.foldRight[List[(Int,Int)]](List())((a, b) => (a, updateState(a)) :: b)
Note that the result is a list of Tuple2: (Element, State), and each state depends on the element of the list.
Hope this helps

There are two major options to pass a state in functional programming when processing collections (I assume you want to get your result as a variable):
Recursion (classic)
val xs = List("a", "11", "212", "b", "89")
#annotation.tailrec
def fold(seq: ListBuffer[(String, ListBuffer[String])],
xs: Seq[String]): ListBuffer[(String, ListBuffer[String])] = {
(seq, xs) match {
case (_, Nil) =>
seq
case (_, c :: tail) if toInt(c).isEmpty =>
fold(seq :+ ((c, ListBuffer[String]())), tail)
case (init :+ ((c, seq)), i :: tail) =>
fold(init :+ ((c, seq :+ i)), tail)
}
}
val result =
fold(ListBuffer[(String, ListBuffer[String])](), xs)
// Get rid of mutable ListBuffer
.toSeq
.map {
case (c, seq) =>
(c, seq.toSeq)
}
//> List((a,List(11, 212)), (b,List(89)))
foldLeft et al.
val xs = List("a", "11", "212", "b", "89")
val result =
xs.foldLeft(
ListBuffer[(String, ListBuffer[String])]()
) {
case (seq, c) if toInt(c).isEmpty =>
seq :+ ((c, ListBuffer[String]()))
case (init :+ ((c, seq)), i) =>
init :+ ((c, seq :+ i))
}
// Get rid of mutable ListBuffer
.toSeq
.map {
case (c, seq) =>
(c, seq.toSeq)
}
//> List((a,List(11, 212)), (b,List(89)))
Which one is better? Unless you want to abort your processing in the middle of your collection (like e.g. in find) foldLeft is considered a better way and it has slightly less boilerplate, but otherwise they are very similar.
I'm using ListBuffer here to avoid reversing lists.

Related

Scala - Use predicate function to summarize list of strings

I need to write a function to analyze some text files.
For that, there should be a function that splits the file via a predicate into sublists. It should only get the values after the first time the predicate evaluates to True and afterwards start a new sublist after the predicate was True again.
For Example:
List('ignore','these','words','x','this','is','first','x','this','is','second')
with predicate
x=>x.equals('x')
should produce
List(List('this','is','first'),List('this','is','second'))
I've already done the reading of the file into a List[String] and tried to use foldLeft with a case statement to iterate over the List.
words.foldLeft(List[List[String]]()) {
case (Nil, s) => List(List(s))
case (result, "x") => result :+ List()
case (result, s) => result.dropRight(1) :+ (result.last :+ s)
}
There are 2 problems with this though and I can't figure them out:
This does not ignore the words before the first time the predicate
evaluates to True
I can't use an arbitrary predicate function
If anyone could tell me what I have to do to fix my problems it would be highly appreciated.
I modified your example a little bit:
def foldWithPredicate[A](predicate: A => Boolean)(l: List[A]) =
l.foldLeft[List[List[A]]](Nil){
case (acc, e) if predicate(e) => acc :+ Nil //if predicate passed add new list at the end
case (Nil, _) => Nil //empty list means we need to ignore elements
case (xs :+ x, e) => xs :+ (x :+ e) //append an element to the last list
}
val l = List("ignore","these","words","x","this","is","first","x","this","is","second")
val predicate: String => Boolean = _.equals("x")
foldWithPredicate(predicate)(l) // List(List(this, is, first), List(this, is, second))
There's one problem performance related to your approach: appending is very slow on immutable lists.
It might be faster to prepend elements on the list, but then, of course, all lists will have elements in reversed order (but they could be reversed at the end).
def foldWithPredicate2[A](predicate: A => Boolean)(l: List[A]) =
l.foldLeft[List[List[A]]](Nil){
case (acc, e) if predicate(e) => Nil :: acc
case (Nil, _) => Nil
case (x :: xs, e) => (e :: x) :: xs
}.map(_.reverse).reverse
An alternative approach is to use span to split the items into the next sublist and the rest in a single call. The following code assumes Scala 2.13 for List.unfold:
def splitIntoBlocks[T](items: List[T])(startsNewBlock: T => Boolean): List[List[T]] = {
def splitBlock(items: List[T]): (List[T], List[T]) = items.span(!startsNewBlock(_))
List.unfold(splitBlock(items)._2) {
case blockIndicator :: rest => Some(splitBlock(rest))
case _ => None
}
}
And the usage:
scala> splitIntoBlocks(List(
"ignore", "these", "words",
"x", "this", "is", "first",
"x", "this", "is", "second")
)(_ == "x")
res0: List[List[String]] = List(List(this, is, first), List(this, is, second))

How to convert a list to list of lists by group the elements when an element repeats in Scala idiomatic way

I want to convert a list of elements into a list of lists by breaking whenever an element repeats like below
Input:
List(1, 2, 3, 1, 2, 1, 3, 1, 2, 3)
Ouput:
List[List[Integer]] = List(List(1, 2, 3), List(1, 2), List(1, 3), List(1, 2, 3))
Here's what I've tried:
val tokens = List(1,2,3,1,2,1,3,1,2,3)
val group = new ListBuffer[List[Integer]]()
val nextGroup = new ListBuffer[Integer]()
val nextTokens = new ListBuffer[Integer]()
for (t <- tokens) {
if (nextTokens.contains(t)) {
group += nextGroup.toList
nextGroup.clear()
nextTokens.clear()
}
nextGroup += t
nextTokens += t
}
group += nextGroup.toList
group.toList
I'm looking for a better way to achieve this using map/foldLeft... functions without using ListBuffer.
Thanks in advance.
Here is a version using foldLeft
tokens
.drop(1)
.foldLeft(List(tokens.take(1))) { case (res, token) =>
if (res.head.contains(token)) {
List(token) +: res
} else {
(res.head :+ token) +: res.tail
}
}
.reverse
Using drop and take ensures that this works on an empty list. Building the result in reverse means that the latest List is always at the head, and will also be efficient for long lists.
And for completeness, here is a recursive function that does the same thing:
def groupDistinct[T](tokens: List[T]): List[List[T]] = {
#tailrec
def loop(token: List[T], cur: List[T], res: List[List[T]]): List[List[T]] =
token match {
case Nil =>
res :+ cur
case head :: tail =>
if (cur.contains(head)) {
loop(tail, List(head), res :+ cur)
} else {
loop(tail, cur :+ head, res)
}
}
loop(tokens, Nil, Nil)
}
Using the solution provided for below question, I have come up with the below solution
https://stackoverflow.com/a/52976957/1696418
mySeq.foldLeft(List.empty[List[Int]]) {
case (acc, i) if acc.isEmpty => List(List(i))
case (acc, i) if acc.last.contains(i) => acc :+ List(i)
case (acc, i) => acc.init :+ (acc.last :+ i)
}
Here is a very direct solution which uses foldLeft while keeping track of two accumulator arrays: One for the final result, and one for the currently considered token sublist. Finally, we combine the result array with the last sublist.
val (acc, last) = tokens.foldLeft ((List[List[Int]](), List[Int]())) ((a,b) =>
if (a._2.contains(b)) (a._1 :+ a._2, List(b))
else (a._1, a._2 :+ b))
acc :+ last
Notice that this is not the computationally most efficient solution, since for each iteration we are checking through the entire sublist under consideration using contains. If you wish for efficiency, you might (depending on your data) consider an approach which uses hash maps or similar data structures instead for faster repetition checks.

Compress a Given Text of String in Scala

I have been trying to compress a String. Given a String like this:
AAABBCAADEEFF, I would need to compress it like 3A2B1C2A1D2E2F
I was able to come up with a tail recursive implementation:
#scala.annotation.tailrec
def compress(str: List[Char], current: Seq[Char], acc: Map[Int, String]): String = str match {
case Nil =>
if (current.nonEmpty)
s"${acc.values.mkString("")}${current.length}${current.head}"
else
s"${acc.values.mkString("")}"
case List(x) if current.contains(x) =>
val newMap = acc ++ Map(acc.keys.toList.last + 1 -> s"${current.length + 1}${current.head}")
compress(List.empty[Char], Seq.empty[Char], newMap)
case x :: xs if current.isEmpty =>
compress(xs, Seq(x), acc)
case x :: xs if !current.contains(x) =>
if (acc.nonEmpty) {
val newMap = acc ++ Map(acc.keys.toList.last + 1 -> s"${current.length}${current.head}")
compress(xs, Seq(x), newMap)
} else {
compress(xs, Seq(x), acc ++ Map(1 -> s"${current.length}${current.head}"))
}
case x :: xs =>
compress(xs, current :+ x, acc)
}
// Produces 2F3A2B1C2A instead of 3A2B1C2A1D2E2F
compress("AAABBCAADEEFF".toList, Seq.empty[Char], Map.empty[Int, String])
It fails however for the given case! Not sure what edge scenario I'm missing! Any help?
So what I'm actually doing is, going over the sequence of characters, collecting identical ones into a new Sequence and as long as the new character in the original String input (the first param in the compress method) is found in the current (the second parameter in the compress method), I keep collecting it.
As soon as it is not the case, I empty the current sequence, count and push the collected elements into the Map! It fails for some edge cases that I'm not able to make out!
I came up with this solution:
def compress(word: List[Char]): List[(Char, Int)] =
word.map((_, 1)).foldRight(Nil: List[(Char, Int)])((e, acc) =>
acc match {
case Nil => List(e)
case ((c, i)::rest) => if (c == e._1) (c, i + 1)::rest else e::acc
})
Basically, it's a map followed by a right fold.
Took inspiration from the #nicodp code
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((lastChar, lastCharCount) :: xs) if lastChar == e => (lastChar, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
First our intermediate result will be List[(Char, Int)]. List of tuples of chars each char will be accompanied by its count.
Now lets start going through the list one char at once using the Great! foldLeft
We will accumulate the result in the acc variable and e represents the current element.
acc is of type List[(Char, Int)] and e is of type Char
Now when we start, we are at first char of the list. Right now the acc is empty list. So, we attach first tuple to the front of the list acc
with count one.
when acc is Nil do (e, 1) :: Nil or (e, 1) :: acc note: acc is Nil
Now front of the list is the node we are interested in.
Lets go to the second element. Now acc has one element which is the first element with count one.
Now, we compare the current element with the front element of the list
if it matches, increment the count and put the (element, incrementedCount) in the front of the list in place of old tuple.
if current element does not match the last element, that means we have
new element. So, we attach new element with count 1 to the front of the list and so on.
then to convert the List[(Char, Int)] to required string representation.
Note: We are using front element of the list which is accessible in O(1) (constant time complexity) has buffer and increasing the count in case same element is found.
Scala REPL
scala> :paste
// Entering paste mode (ctrl-D to finish)
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((lastChar, lastCharCount) :: xs) if lastChar == e => (lastChar, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
// Exiting paste mode, now interpreting.
encode: (word: String)String
scala> encode("AAABBCAADEEFF")
res0: String = 3A2B1C2A1D2E2F
Bit more concise with back ticks e instead of guard in pattern matching
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((`e`, lastCharCount) :: xs) => (e, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
Here's another more simplified approach based upon this answer:
class StringCompressinator {
def compress(raw: String): String = {
val split: Array[String] = raw.split("(?<=(.))(?!\\1)", 0) // creates array of the repeated chars as strings
val converted = split.map(group => {
val char = group.charAt(0) // take first char of group string
s"${group.length}${char}" // use the length as counter and prefix the return string "AAA" becomes "3A"
})
converted.mkString("") // converted is again array, join turn it into a string
}
}
import org.scalatest.FunSuite
class StringCompressinatorTest extends FunSuite {
test("testCompress") {
val compress = (new StringCompressinator).compress(_)
val input = "AAABBCAADEEFF"
assert(compress(input) == "3A2B1C2A1D2E2F")
}
}
Similar idea with slight difference :
Case class for pattern matching the head so we don't need to use if and it also helps on printing end result by overriding toString
Using capital letter for variable name when pattern matching (either that or back ticks, I don't know which I like less :P)
case class Count(c : Char, cnt : Int){
override def toString = s"$cnt$c"
}
def compressor( counts : List[Count], C : Char ) = counts match {
case Count(C, cnt) :: tail => Count(C, cnt + 1) :: tail
case _ => Count(C, 1) :: counts
}
"AAABBCAADEEFF".foldLeft(List[Count]())(compressor).reverse.mkString
//"3A2B1C2A1D2E2F"

Scala: Partitioning by case (not by filter)

I have a list of mixed values:
val list = List("A", 2, 'c', 4)
I know how to collect the chars, or strings, or ints, in a single operation:
val strings = list collect { case s:String => s }
==> List(A)
val chars = list collect { case c:Char => c }
==> List(c)
val ints = list collect { case i:Int => i }
==> List(2,4)
Can I do it all in one shot somehow? I'm looking for:
val (strings, chars, ints) = list ??? {
case s:String => s
case c:Char => c
case i:Int => i
}
EDIT
Confession -- An example closer to my actual use case:
I have a list of things, that I want to partition according to some conditions:
val list2 = List("Word", " ", "", "OtherWord")
val (empties, whitespacesonly, words) = list2 ??? {
case s:String if s.isEmpty => s
case s:String if s.trim.isEmpty => s
case s:String => s
}
N.B. partition would be great for this if I only had 2 cases (one where the condition was met and one where it wasn't) but here I have multiple conditions to split on.
Based on your second example: you can use groupBy and a key-ing function. I prefer to use those techniques in conjunction with a discriminated union to make the intention of the code more obvious:
val list2 = List("Word", " ", "", "OtherWord")
sealed trait Description
object Empty extends Description
object Whitespaces extends Description
object Words extends Description
def strToDesc(str : String) : Description = str match {
case _ if str.isEmpty() => Empty
case _ if str.trim.isEmpty() => Whitespaces
case _ => Words
}
val descMap = (list2 groupBy strToDesc) withDefaultValue List.empty[String]
val (empties, whitespaceonly, words) =
(descMap(Empty),descMap(Whitespaces),descMap(Words))
This extends well if you want to add another Description later, e.g. AllCaps...
Hope this help:
list.foldLeft((List[String](), List[String](), List[String]())) {
case ((e,s,w),str:String) if str.isEmpty => (str::e,s,w)
case ((e,s,w),str:String) if str.trim.isEmpty => (e,str::s,w)
case ((e,s,w),str:String) => (e,s,str::w)
case (acc, _) => acc
}
You could use partition twice :
def partitionWords(list: List[String]) = {
val (emptyOrSpaces, words) = list.partition(_.trim.isEmpty)
val (empty, spaces) = emptyOrSpaces.partition(_.isEmpty)
(empty, spaces, words)
}
Which gives for your example :
partitionWords(list2)
// (List(""),List(" "),List(Word, OtherWord))
In general you can use foldLeft with a tuple as accumulator.
def partitionWords2(list: List[String]) = {
val nilString = List.empty[String]
val (empty, spaces, words) = list.foldLeft((nilString, nilString, nilString)) {
case ((empty, spaces, words), elem) =>
elem match {
case s if s.isEmpty => (s :: empty, spaces, words)
case s if s.trim.isEmpty => (empty, s :: spaces, words)
case s => (empty, spaces, s :: words)
}
}
(empty.reverse, spaces.reverse, words.reverse)
}
Which will give you the same result.
A tail recursive method,
def partition(list: List[Any]): (List[Any], List[Any], List[Any]) = {
#annotation.tailrec
def inner(map: Map[String, List[Any]], innerList: List[Any]): Map[String, List[Any]] = innerList match {
case x :: xs => x match {
case s: String => inner(insertValue(map, "str", s), xs)
case c: Char => inner(insertValue(map, "char", c), xs)
case i: Int => inner(insertValue(map, "int", i), xs)
}
case Nil => map
}
def insertValue(map: Map[String, List[Any]], key: String, value: Any) = {
map + (key -> (value :: map.getOrElse(key, Nil)))
}
val partitioned = inner(Map.empty[String, List[Any]], list)
(partitioned.get("str").getOrElse(Nil), partitioned.get("char").getOrElse(Nil), partitioned.get("int").getOrElse(Nil))
}
val list1 = List("A", 2, 'c', 4)
val (strs, chars, ints) = partition(list1)
I wound up with this, based on #Nyavro's answer:
val list2 = List("Word", " ", "", "OtherWord")
val(empties, spaces, words) =
list2.foldRight((List[String](), List[String](), List[String]())) {
case (str, (e, s, w)) if str.isEmpty => (str :: e, s, w)
case (str, (e, s, w)) if str.trim.isEmpty => (e, str :: s, w)
case (str, (e, s, w)) => (e, s, str :: w)
}
==> empties: List[String] = List("")
==> spaces: List[String] = List(" ")
==> words: List[String] = List(Word, OtherWord)
I understand the risks of using foldRight: mainly that in order to start on the right, the runtime needs to recurse and that this may blow the stack on large inputs. However, my inputs are small and this risk is acceptable.
Having said that, if there's a quick way to _.reverse three lists of a tuple that I haven't thought of, I'm all ears.
Thanks all!

n-way `span` on sequences

Given a sequence of elements and a predicate p, I would like to produce a sequence of sequences such that, in each subsequence, either all elements satisfy p or the sequence has length 1. Additionally, calling .flatten on the result should give me back my original sequence (so no re-ordering of elements).
For instance, given:
val l = List(2, 4, -6, 3, 1, 8, 7, 10, 0)
val p = (i : Int) => i % 2 == 0
I would like magic(l,p) to produce:
List(List(2, 4, -6), List(3), List(1), List(8), List(7), List(10, 0))
I know of .span, but that method stops the first time it encounters a value that doesn't satisfy p and just returns a pair.
Below is a candidate implementation. It does what I want, but, well, makes we want to cry. I would love for someone to come up with something slightly more idiomatic.
def magic[T](elems : Seq[T], p : T=>Boolean) : Seq[Seq[T]] = {
val loop = elems.foldLeft[(Boolean,Seq[Seq[T]])]((false,Seq.empty)) { (pr,e) =>
val (lastOK,s) = pr
if(lastOK && p(e)) {
(true, s.init :+ (s.last :+ e))
} else {
(p(e), s :+ Seq(e))
}
}
loop._2
}
(Note that I do not particularly care about preserving the actual type of the Seq.)
I would not use foldLeft. It's just a simple recursion of span with a special rule if the head doesn't match the predicate:
def magic[T](elems: Seq[T], p: T => Boolean): Seq[Seq[T]] =
elems match {
case Seq() => Seq()
case Seq(head, tail # _*) if !p(head) => Seq(head) +: magic(tail, p)
case xs =>
val (prefix, rest) = xs span p
prefix +: magic(rest, p)
}
You could also do it tail-recursive, but you need to remember to reverse the output if you're prepending (as is sensible):
def magic[T](elems: Seq[T], p: T => Boolean): Seq[Seq[T]] = {
def iter(elems: Seq[T], out: Seq[Seq[T]]) : Seq[Seq[T]] =
elems match {
case Seq() => out.reverse
case Seq(head, tail # _*) if !p(head) => iter(tail, Seq(head) +: out)
case xs =>
val (prefix, rest) = xs span p
iter(rest, prefix +: out)
}
iter(elems, Seq())
}
For this task you can use takeWhile and drop combined with a little pattern matching an recursion:
def magic[T](elems : Seq[T], p : T=>Boolean) : Seq[Seq[T]] = {
def magic(elems: Seq[T], result: Seq[Seq[T]]): Seq[Seq[T]] = elems.takeWhile(p) match {
// if elems is Nil, we have a result
case Nil if elems.isEmpty => result
// if it's not, but we don't get any values from takeWhile, we take a single elem
case Nil => magic(elems.tail, result :+ Seq(elems.head))
// takeWhile gave us something, so we add it to the result
// and drop as many elements from elems, as takeWhile gave us
case xs => magic(elems.drop(xs.size), result :+ xs)
}
magic(elems, Seq())
}
Another solution using a fold:
def magicFilter[T](seq: Seq[T], p: T => Boolean): Seq[Seq[T]] = {
val (filtered, current) = (seq foldLeft (Seq[Seq[T]](), Seq[T]())) {
case ((filtered, current), element) if p(element) => (filtered, current :+ element)
case ((filtered, current), element) if !current.isEmpty => (filtered :+ current :+ Seq(element), Seq())
case ((filtered, current), element) => (filtered :+ Seq(element), Seq())
}
if (!current.isEmpty) filtered :+ current else filtered
}