I have a list of tuples look like this:
Seq("ptxt"->"how","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
On the keys, merge ptxt with all the list that will come after it.
e.g.
create a new seq look like this :
Seq("how you doing", "whats up", "this is cool")
You could fold your Seq with foldLeft:
val s = Seq("ptxt"->"how ","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
val r: Seq[String] = s.foldLeft(List[String]()) {
case (xs, ("ptxt", s)) => s :: xs
case (x :: xs, ("list", s)) => (x + s) :: xs
}.reverse
If you don't care about an order you can omit reverse.
Function foldLeft takes two arguments first is the initial value and the second one is a function taking two arguments: the previous result and element of the sequence. Result of this method is then fed the next function call as the first argument.
For example for numbers foldLeft, would just create a sum of all elements starting from left.
List(5, 4, 8, 6, 2).foldLeft(0) { (result, i) =>
result + i
} // 25
For our case, we start with an empty list. Then we provide function, which handles two cases using pattern matching.
Case when the key is "ptxt". In this case, we just prepend the value to list.
case (xs, ("ptxt", s)) => s :: xs
Case when the key is "list". Here we take the first string from the list (using pattern matching) and then concatenate value to it, after that we put it back with the rest of the list.
case (x :: xs, ("list", s)) => (x + s) :: xs
At the end since we were prepending element, we need to revert our list. Why we were prepending, not appending? Because append on the immutable list is O(n) and prepend is O(1), so it's more efficient.
Here another solution:
val data = Seq("ptxt"->"how","list"->"you doing","ptxt"->"whats", "list" -> "up","ptxt"-> "this ", "list"->"is cool")
First group Keys and Values:
val grouped = s.groupBy(_._1)
.map{case (k, l) => k -> l.map{case (_, v) => v.trim}}
// > Map(list -> List(you doing, up, is cool), ptxt -> List(how, whats, this))
Then zip and concatenate the two values:
grouped("ptxt").zip(grouped("list"))
.map{case (a, b) => s"$a $b"}
// > List(how you doing, whats up, this is cool)
Disclaimer: This only works if the there is always key, value, key, value,.. in the list - I had to adjust the input data.
If you change Seq for List, you can solve that with a simple tail-recursive function.
(The code uses Scala 2.13, but can be rewritten to use older Scala versions if needed)
def mergeByKey[K](list: List[(K, String)]): List[String] = {
#annotation.tailrec
def loop(remaining: List[(K, String)], acc: Map[K, StringBuilder]): List[String] =
remaining match {
case Nil =>
acc.valuesIterator.map(_.result()).toList
case (key, value) :: tail =>
loop(
remaining = tail,
acc.updatedWith(key) {
case None => Some(new StringBuilder(value))
case Some(oldValue) => Some(oldValue.append(value))
}
)
}
loop(remaining = list, acc = Map.empty)
}
val data = List("ptxt"->"how","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
mergeByKey(data)
// res: List[String] = List("howwhats upthis ", "you doingis cool")
Or a one liner using groupMap.
(inspired on pme's answer)
data.groupMap(_._1)(_._2).view.mapValues(_.mkString).valuesIterator.toList
Adding another answer since I don't have enough reputation points for adding a comment. just an improvment on Krzysztof Atłasik's answer. to compensate for the case where the Seq starts with a "list" you might want to add another case as:
case (xs,("list", s)) if xs.isEmpty=>xs
So the final code could be something like:
val s = Seq("list"->"how ","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
val r: Seq[String] = s.foldLeft(List[String]()) {
case (xs,("list", s)) if xs.isEmpty=>xs
case (xs, ("ptxt", s)) => s :: xs
case (x :: xs, ("list", s)) => (x + s) :: xs
}.reverse
Related
I need to write a function to analyze some text files.
For that, there should be a function that splits the file via a predicate into sublists. It should only get the values after the first time the predicate evaluates to True and afterwards start a new sublist after the predicate was True again.
For Example:
List('ignore','these','words','x','this','is','first','x','this','is','second')
with predicate
x=>x.equals('x')
should produce
List(List('this','is','first'),List('this','is','second'))
I've already done the reading of the file into a List[String] and tried to use foldLeft with a case statement to iterate over the List.
words.foldLeft(List[List[String]]()) {
case (Nil, s) => List(List(s))
case (result, "x") => result :+ List()
case (result, s) => result.dropRight(1) :+ (result.last :+ s)
}
There are 2 problems with this though and I can't figure them out:
This does not ignore the words before the first time the predicate
evaluates to True
I can't use an arbitrary predicate function
If anyone could tell me what I have to do to fix my problems it would be highly appreciated.
I modified your example a little bit:
def foldWithPredicate[A](predicate: A => Boolean)(l: List[A]) =
l.foldLeft[List[List[A]]](Nil){
case (acc, e) if predicate(e) => acc :+ Nil //if predicate passed add new list at the end
case (Nil, _) => Nil //empty list means we need to ignore elements
case (xs :+ x, e) => xs :+ (x :+ e) //append an element to the last list
}
val l = List("ignore","these","words","x","this","is","first","x","this","is","second")
val predicate: String => Boolean = _.equals("x")
foldWithPredicate(predicate)(l) // List(List(this, is, first), List(this, is, second))
There's one problem performance related to your approach: appending is very slow on immutable lists.
It might be faster to prepend elements on the list, but then, of course, all lists will have elements in reversed order (but they could be reversed at the end).
def foldWithPredicate2[A](predicate: A => Boolean)(l: List[A]) =
l.foldLeft[List[List[A]]](Nil){
case (acc, e) if predicate(e) => Nil :: acc
case (Nil, _) => Nil
case (x :: xs, e) => (e :: x) :: xs
}.map(_.reverse).reverse
An alternative approach is to use span to split the items into the next sublist and the rest in a single call. The following code assumes Scala 2.13 for List.unfold:
def splitIntoBlocks[T](items: List[T])(startsNewBlock: T => Boolean): List[List[T]] = {
def splitBlock(items: List[T]): (List[T], List[T]) = items.span(!startsNewBlock(_))
List.unfold(splitBlock(items)._2) {
case blockIndicator :: rest => Some(splitBlock(rest))
case _ => None
}
}
And the usage:
scala> splitIntoBlocks(List(
"ignore", "these", "words",
"x", "this", "is", "first",
"x", "this", "is", "second")
)(_ == "x")
res0: List[List[String]] = List(List(this, is, first), List(this, is, second))
I have a List with n as 3 ls as List(a,b,c,d,e) My query is to code for the (3,List(a,b,c,d,e)) I want to split them in to two parts such as List(a,b,c),List(d,e). For this the scala program is like below.
I don't understand val(pre,post). why it is used and what do we get from it? can someone please elaborate?
def splitRecursive[A](n: Int, ls: List[A]): (List[A], List[A]) = (n, ls) match {
case (_, Nil) => (Nil, Nil)
case (0, list) => (Nil, list)
case (n, h :: tail) => {
val (pre, post) = splitRecursive(n - 1, tail)
(h :: pre, post)
}
}
Your splitRecursive function returns a pair of lists. To get the two lists out of the pair, you can either fetch them like this:
val result = splitRecursive(n - 1, tail)
val pre = result._1
val post = result._2
Or you can use destructuring to get them without first having to bind the pair to result. That is what the syntax in splitRecursive is doing.
val (pre, post) = splitRecursive(n - 1, tail)
It is simply a convenient way to get the elements out of a pair (or some other structure that can be destructured).
I have been trying to compress a String. Given a String like this:
AAABBCAADEEFF, I would need to compress it like 3A2B1C2A1D2E2F
I was able to come up with a tail recursive implementation:
#scala.annotation.tailrec
def compress(str: List[Char], current: Seq[Char], acc: Map[Int, String]): String = str match {
case Nil =>
if (current.nonEmpty)
s"${acc.values.mkString("")}${current.length}${current.head}"
else
s"${acc.values.mkString("")}"
case List(x) if current.contains(x) =>
val newMap = acc ++ Map(acc.keys.toList.last + 1 -> s"${current.length + 1}${current.head}")
compress(List.empty[Char], Seq.empty[Char], newMap)
case x :: xs if current.isEmpty =>
compress(xs, Seq(x), acc)
case x :: xs if !current.contains(x) =>
if (acc.nonEmpty) {
val newMap = acc ++ Map(acc.keys.toList.last + 1 -> s"${current.length}${current.head}")
compress(xs, Seq(x), newMap)
} else {
compress(xs, Seq(x), acc ++ Map(1 -> s"${current.length}${current.head}"))
}
case x :: xs =>
compress(xs, current :+ x, acc)
}
// Produces 2F3A2B1C2A instead of 3A2B1C2A1D2E2F
compress("AAABBCAADEEFF".toList, Seq.empty[Char], Map.empty[Int, String])
It fails however for the given case! Not sure what edge scenario I'm missing! Any help?
So what I'm actually doing is, going over the sequence of characters, collecting identical ones into a new Sequence and as long as the new character in the original String input (the first param in the compress method) is found in the current (the second parameter in the compress method), I keep collecting it.
As soon as it is not the case, I empty the current sequence, count and push the collected elements into the Map! It fails for some edge cases that I'm not able to make out!
I came up with this solution:
def compress(word: List[Char]): List[(Char, Int)] =
word.map((_, 1)).foldRight(Nil: List[(Char, Int)])((e, acc) =>
acc match {
case Nil => List(e)
case ((c, i)::rest) => if (c == e._1) (c, i + 1)::rest else e::acc
})
Basically, it's a map followed by a right fold.
Took inspiration from the #nicodp code
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((lastChar, lastCharCount) :: xs) if lastChar == e => (lastChar, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
First our intermediate result will be List[(Char, Int)]. List of tuples of chars each char will be accompanied by its count.
Now lets start going through the list one char at once using the Great! foldLeft
We will accumulate the result in the acc variable and e represents the current element.
acc is of type List[(Char, Int)] and e is of type Char
Now when we start, we are at first char of the list. Right now the acc is empty list. So, we attach first tuple to the front of the list acc
with count one.
when acc is Nil do (e, 1) :: Nil or (e, 1) :: acc note: acc is Nil
Now front of the list is the node we are interested in.
Lets go to the second element. Now acc has one element which is the first element with count one.
Now, we compare the current element with the front element of the list
if it matches, increment the count and put the (element, incrementedCount) in the front of the list in place of old tuple.
if current element does not match the last element, that means we have
new element. So, we attach new element with count 1 to the front of the list and so on.
then to convert the List[(Char, Int)] to required string representation.
Note: We are using front element of the list which is accessible in O(1) (constant time complexity) has buffer and increasing the count in case same element is found.
Scala REPL
scala> :paste
// Entering paste mode (ctrl-D to finish)
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((lastChar, lastCharCount) :: xs) if lastChar == e => (lastChar, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
// Exiting paste mode, now interpreting.
encode: (word: String)String
scala> encode("AAABBCAADEEFF")
res0: String = 3A2B1C2A1D2E2F
Bit more concise with back ticks e instead of guard in pattern matching
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((`e`, lastCharCount) :: xs) => (e, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
Here's another more simplified approach based upon this answer:
class StringCompressinator {
def compress(raw: String): String = {
val split: Array[String] = raw.split("(?<=(.))(?!\\1)", 0) // creates array of the repeated chars as strings
val converted = split.map(group => {
val char = group.charAt(0) // take first char of group string
s"${group.length}${char}" // use the length as counter and prefix the return string "AAA" becomes "3A"
})
converted.mkString("") // converted is again array, join turn it into a string
}
}
import org.scalatest.FunSuite
class StringCompressinatorTest extends FunSuite {
test("testCompress") {
val compress = (new StringCompressinator).compress(_)
val input = "AAABBCAADEEFF"
assert(compress(input) == "3A2B1C2A1D2E2F")
}
}
Similar idea with slight difference :
Case class for pattern matching the head so we don't need to use if and it also helps on printing end result by overriding toString
Using capital letter for variable name when pattern matching (either that or back ticks, I don't know which I like less :P)
case class Count(c : Char, cnt : Int){
override def toString = s"$cnt$c"
}
def compressor( counts : List[Count], C : Char ) = counts match {
case Count(C, cnt) :: tail => Count(C, cnt + 1) :: tail
case _ => Count(C, 1) :: counts
}
"AAABBCAADEEFF".foldLeft(List[Count]())(compressor).reverse.mkString
//"3A2B1C2A1D2E2F"
Newbie question.
I am looping through a list and need keep state in between the items.
For instance
val l = List("a", "1", "2", "3", "b", "4")
var state: String = ""
l.foreach(o => {
if (toInt(o).isEmpty) state = o else println(state + o.toString)
})
what's the alternative for the usage of var here?
You should keep in mind that it's sometimes (read: when it makes the code more readable and maintainable by others) okay to use mutability when performing some operation that's easily expressed with mutable state as long as that mutable state is confined to as little of your program as possible. Using (e.g.) foldLeft to maintain an accumulator here without using a var doesn't gain you much.
That said, here's one way to go about doing this:
val listOfThings: Seq[Either[Char, Int]] = Seq(Left('a'), Right(11), Right(212), Left('b'), Right(89))
val result = listOfThings.foldLeft(Seq[(Char, Seq[Int])]()) {
case (accumulator, Left(nextChar)) => accumulator :+ (nextChar, Seq.empty)
case (accumulator, Right(nextInt)) =>
val (currentChar, currentSequence) = accumulator.last
accumulator.dropRight(1) :+ (currentChar, currentSequence :+ nextInt)
}
result foreach {
case (char, numbers) => println(numbers.map(num => s"$char-$num").mkString(" "))
}
Use foldLeft:
l.foldLeft(""){ (state, o) =>
if(toInt(o).isEmpty) o
else {
println(state + o.toString)
state
}
}
Pass an arg:
scala> def collapse(header: String, vs: List[String]): Unit = vs match {
| case Nil =>
| case h :: t if h.forall(Character.isDigit) => println(s"$header$h") ; collapse(header, t)
| case h :: t => collapse(h, t)
| }
collapse: (header: String, vs: List[String])Unit
scala> collapse("", vs)
a1
a2
a3
b4
As simple as:
val list: List[Int] = List.range(1, 10) // Create list
def updateState(i : Int) : Int = i + 1 // Generate new state, just add one to each position. That will be the state
list.foldRight[List[(Int,Int)]](List())((a, b) => (a, updateState(a)) :: b)
Note that the result is a list of Tuple2: (Element, State), and each state depends on the element of the list.
Hope this helps
There are two major options to pass a state in functional programming when processing collections (I assume you want to get your result as a variable):
Recursion (classic)
val xs = List("a", "11", "212", "b", "89")
#annotation.tailrec
def fold(seq: ListBuffer[(String, ListBuffer[String])],
xs: Seq[String]): ListBuffer[(String, ListBuffer[String])] = {
(seq, xs) match {
case (_, Nil) =>
seq
case (_, c :: tail) if toInt(c).isEmpty =>
fold(seq :+ ((c, ListBuffer[String]())), tail)
case (init :+ ((c, seq)), i :: tail) =>
fold(init :+ ((c, seq :+ i)), tail)
}
}
val result =
fold(ListBuffer[(String, ListBuffer[String])](), xs)
// Get rid of mutable ListBuffer
.toSeq
.map {
case (c, seq) =>
(c, seq.toSeq)
}
//> List((a,List(11, 212)), (b,List(89)))
foldLeft et al.
val xs = List("a", "11", "212", "b", "89")
val result =
xs.foldLeft(
ListBuffer[(String, ListBuffer[String])]()
) {
case (seq, c) if toInt(c).isEmpty =>
seq :+ ((c, ListBuffer[String]()))
case (init :+ ((c, seq)), i) =>
init :+ ((c, seq :+ i))
}
// Get rid of mutable ListBuffer
.toSeq
.map {
case (c, seq) =>
(c, seq.toSeq)
}
//> List((a,List(11, 212)), (b,List(89)))
Which one is better? Unless you want to abort your processing in the middle of your collection (like e.g. in find) foldLeft is considered a better way and it has slightly less boilerplate, but otherwise they are very similar.
I'm using ListBuffer here to avoid reversing lists.
I have the following data structure
val list = List(1,2,
List(3,4),
List(5,6,7)
)
I want to get this as a result
List(
List(1,2,3,5), List(1,2,3,6), List(1,2,3,7),
List(1,2,4,5), List(1,2,4,6), List(1,2,4,7)
)
Number of sub-lists in the input and number of elements in them can vary
P.S.
I'm trying to use this as a first step
list.map{
case x => List(x)
case list:List => list
}
and some for comprehension, but it won't work because I don't know how many elements each sublist of the result will have
Types like List[Any] are most often avoided in Scala – so much of the power of the language comes from its smart type system, and this kind of type impedes this. So your instinct to turn the list into a normalized List[List[Int]] is spot on:
val normalizedList = list.map {
case x: Int => List(x)
case list: List[Int #unchecked] => list
}
Note that this will eventually throw a runtime exception if list includes a List of some type other than Int, such as List[String], due to type erasure. This is exactly the kind of problem that arises when failing to use strong types! You can read more about strategies for dealing with type erasure here.
Once you have a normalized List[List[Int]], then you can use foldLeft to build the combinations. You are also correct in seeing that a for comprehension can work well here:
normalizedList.foldLeft(List(List.empty[Int])) { (acc, next) =>
for {
combo <- acc
num <- next
} yield (combo :+ num)
}
In each iteration of the foldLeft, we consider one more sublist (next) from the normalizedList. We look at each combination thus far constructed (each combo in acc), and then for each number num in next, we make a new combination by appending it to combo.
As you might now, for comprehensions are really syntactic sugar for map, flatMap, and filter operations. So we can also express this with those more primitive methods:
normalizedList.foldLeft(List(List.empty[Int])) { (acc, next) =>
acc.flatMap { combo =>
next.map { num => combo :+ num }
}
}
You can even use the (somewhat silly) :/ alias for foldLeft, switch the order of the maps, and use underscore syntax for ultimate brevity:
(List(List[Int]()) /: normalizedList) { (acc, next) => next.flatMap { num => acc.map(_ :+ num) } }
val list = List(1,2,
List(3,4),
List(5,6,7)
)
def getAllCombinations(list: List[Any]) : List[List[Int]] ={
//normalize head so it is always a List
val headList: List[Int] = list.head match {
case i:Int => List(i)
case l:List[Int] => l
}
if(list.tail.nonEmpty){
// recursion for tail combinations
val tailCombinations : List[List[Int]] = getAllCombinations(list.tail)
//combine head combinations with tail combinations
headList.flatMap(
{i:Int => tailCombinations.map(
{l=>List(i).++(l)}
)
}
)
}
else{
headList.map(List(_))
}
}
print(getAllCombinations(list))
This can be achieved with the use of a foldLeft, as well. In the code below, each item of the outer List is folded into the List of List's by combining each current list with each new item.
val list = List(1,2, List(3,4), List(5,6,7) )
val lxl0 = List( List[Int]() ) //start value for foldLeft
val lxl = list.foldLeft( lxl0 )( (lxl, i) => {
i match {
case i:Int => for( l <- lxl ) yield l :+ i
case newl:List[Int] => for( l <- lxl;
i <- newl ) yield l :+ i
}
})
lxl.map( _.mkString(",") ).foreach( println(_))
While I didn't use the map that you desired, I do believe that the code may be changed to do the map and make all elements List[Int]. Then, that may simplify the foldLeft to simply do the for-comprehension. I was not able to get that to work immediately, though ;)