Splitting the list into two parts in Scala - scala

I have an Int n = 3 and a list ls = List(a,b,c,d,e). Given the input (3, List(a,b,c,d,e)), I want to split the list into two parts: List(a,b,c) and List(d,e). The Scala program for this is shown below.
I don't understand val (pre, post). Why is it used, and what do we get from it? Can someone please elaborate?
def splitRecursive[A](n: Int, ls: List[A]): (List[A], List[A]) = (n, ls) match {
case (_, Nil) => (Nil, Nil)
case (0, list) => (Nil, list)
case (n, h :: tail) => {
val (pre, post) = splitRecursive(n - 1, tail)
(h :: pre, post)
}
}

Your splitRecursive function returns a pair of lists. To get the two lists out of the pair, you can either fetch them like this:
val result = splitRecursive(n - 1, tail)
val pre = result._1
val post = result._2
Or you can use destructuring to get them without first having to bind the pair to result. That is what the syntax in splitRecursive is doing.
val (pre, post) = splitRecursive(n - 1, tail)
It is simply a convenient way to get the elements out of a pair (or some other structure that can be destructured).
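As a small self-contained sketch of the same idea (the names letters, first, second, pre2 and post2 are only illustrative), the two forms below bind the same lists, and the standard library's splitAt can be destructured in exactly the same way:

```scala
// The function from the question, unchanged apart from layout.
def splitRecursive[A](n: Int, ls: List[A]): (List[A], List[A]) = (n, ls) match {
  case (_, Nil)       => (Nil, Nil)
  case (0, list)      => (Nil, list)
  case (n, h :: tail) =>
    val (pre, post) = splitRecursive(n - 1, tail)
    (h :: pre, post)
}

val letters = List('a', 'b', 'c', 'd', 'e')

// Without destructuring: bind the pair, then pick out its components.
val result = splitRecursive(3, letters)
val first  = result._1
val second = result._2

// With destructuring: one pattern binds both components at once.
val (pre, post) = splitRecursive(3, letters)

// splitAt from the standard library returns the same kind of pair.
val (pre2, post2) = letters.splitAt(3)
```

Here pre is List(a, b, c) and post is List(d, e) in all three variants.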

Related

Scala - Use predicate function to summarize list of strings

I need to write a function to analyze some text files.
For that, there should be a function that splits the file via a predicate into sublists. It should only get the values after the first time the predicate evaluates to True and afterwards start a new sublist after the predicate was True again.
For Example:
List("ignore","these","words","x","this","is","first","x","this","is","second")
with predicate
x => x.equals("x")
should produce
List(List("this","is","first"),List("this","is","second"))
I've already done the reading of the file into a List[String] and tried to use foldLeft with a case statement to iterate over the List.
words.foldLeft(List[List[String]]()) {
case (Nil, s) => List(List(s))
case (result, "x") => result :+ List()
case (result, s) => result.dropRight(1) :+ (result.last :+ s)
}
There are 2 problems with this though and I can't figure them out:
This does not ignore the words before the first time the predicate evaluates to True
I can't use an arbitrary predicate function
If anyone could tell me what I have to do to fix my problems it would be highly appreciated.
I modified your example a little bit:
def foldWithPredicate[A](predicate: A => Boolean)(l: List[A]) =
l.foldLeft[List[List[A]]](Nil){
case (acc, e) if predicate(e) => acc :+ Nil //if predicate passed add new list at the end
case (Nil, _) => Nil //empty list means we need to ignore elements
case (xs :+ x, e) => xs :+ (x :+ e) //append an element to the last list
}
val l = List("ignore","these","words","x","this","is","first","x","this","is","second")
val predicate: String => Boolean = _.equals("x")
foldWithPredicate(predicate)(l) // List(List(this, is, first), List(this, is, second))
There's one performance problem with your approach: appending is very slow on immutable lists, because every append has to copy the whole list.
It might be faster to prepend elements to the list. Then, of course, all lists will have their elements in reversed order, but they can be reversed at the end.
def foldWithPredicate2[A](predicate: A => Boolean)(l: List[A]) =
l.foldLeft[List[List[A]]](Nil){
case (acc, e) if predicate(e) => Nil :: acc
case (Nil, _) => Nil
case (x :: xs, e) => (e :: x) :: xs
}.map(_.reverse).reverse
An alternative approach is to use span to split the items into the next sublist and the rest in a single call. The following code assumes Scala 2.13 for List.unfold:
def splitIntoBlocks[T](items: List[T])(startsNewBlock: T => Boolean): List[List[T]] = {
def splitBlock(items: List[T]): (List[T], List[T]) = items.span(!startsNewBlock(_))
List.unfold(splitBlock(items)._2) {
case blockIndicator :: rest => Some(splitBlock(rest))
case _ => None
}
}
And the usage:
scala> splitIntoBlocks(List(
"ignore", "these", "words",
"x", "this", "is", "first",
"x", "this", "is", "second")
)(_ == "x")
res0: List[List[String]] = List(List(this, is, first), List(this, is, second))
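To see what a single span call contributes, here is a minimal sketch on a shortened input (the names words, before and fromMarker are only illustrative). span returns the longest prefix whose elements satisfy the predicate, together with the remainder, which starts at the first element that fails it:

```scala
val words = List("ignore", "these", "x", "this", "is", "first", "x", "second")

// Everything before the first "x", and the rest starting at that "x".
val (before, fromMarker) = words.span(_ != "x")
// before     == List("ignore", "these")
// fromMarker == List("x", "this", "is", "first", "x", "second")
```

splitIntoBlocks above drops the marker at the head of the remainder and applies span again, once per block.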

Merging list of tuples in scala based on key

I have a list of tuples that looks like this:
Seq("ptxt"->"how","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
Based on the keys, I want to merge each ptxt value with all the list values that come after it,
e.g.
create a new Seq that looks like this:
Seq("how you doing", "whats up", "this is cool")
You could fold your Seq with foldLeft:
val s = Seq("ptxt"->"how ","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
val r: Seq[String] = s.foldLeft(List[String]()) {
case (xs, ("ptxt", s)) => s :: xs
case (x :: xs, ("list", s)) => (x + s) :: xs
}.reverse
If you don't care about the order, you can omit reverse.
The foldLeft function takes two arguments: the first is the initial value, and the second is a function taking two arguments, the previous result and the current element of the sequence. The result of each call is then fed into the next call as its first argument.
For example, for numbers, the following foldLeft just computes the sum of all elements, starting from the left.
List(5, 4, 8, 6, 2).foldLeft(0) { (result, i) =>
result + i
} // 25
In our case, we start with an empty list. Then we provide a function that handles two cases using pattern matching.
The case when the key is "ptxt": here we just prepend the value to the list.
case (xs, ("ptxt", s)) => s :: xs
The case when the key is "list": here we take the first string from the list (using pattern matching), concatenate the value to it, and then put it back in front of the rest of the list.
case (x :: xs, ("list", s)) => (x + s) :: xs
At the end, since we were prepending elements, we need to reverse our list. Why prepend rather than append? Because appending to an immutable list is O(n) while prepending is O(1), so it's more efficient.
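To make the intermediate states visible, the same function can be passed to scanLeft, which returns every accumulator value instead of only the last one. This is just an illustrative sketch on a shortened input; the names input and steps and the extra fallback case (which ignores a leading "list" entry or an unknown key) are my additions:

```scala
val input = Seq("ptxt" -> "how ", "list" -> "you doing", "ptxt" -> "this ", "list" -> "is ", "list" -> "cool")

val steps = input.scanLeft(List.empty[String]) {
  case (xs, ("ptxt", v))      => v :: xs        // start a new sentence at the front
  case (x :: xs, ("list", v)) => (x + v) :: xs  // extend the sentence at the front
  case (xs, _)                => xs             // assumption: ignore anything else
}

steps.foreach(println)
// List()
// List(how )
// List(how you doing)
// List(this , how you doing)
// List(this is , how you doing)
// List(this is cool, how you doing)
```

The last step holds the sentences in reverse order, which is why the final reverse is needed.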
Here is another solution:
val data = Seq("ptxt"->"how","list"->"you doing","ptxt"->"whats", "list" -> "up","ptxt"-> "this ", "list"->"is cool")
First group Keys and Values:
val grouped = data.groupBy(_._1)
.map{case (k, l) => k -> l.map{case (_, v) => v.trim}}
// > Map(list -> List(you doing, up, is cool), ptxt -> List(how, whats, this))
Then zip and concatenate the two values:
grouped("ptxt").zip(grouped("list"))
.map{case (a, b) => s"$a $b"}
// > List(how you doing, whats up, this is cool)
Disclaimer: This only works if the list always alternates key, value, key, value, and so on. I had to adjust the input data.
If you change Seq for List, you can solve that with a simple tail-recursive function.
(The code uses Scala 2.13, but can be rewritten to use older Scala versions if needed)
def mergeByKey[K](list: List[(K, String)]): List[String] = {
@annotation.tailrec
def loop(remaining: List[(K, String)], acc: Map[K, StringBuilder]): List[String] =
remaining match {
case Nil =>
acc.valuesIterator.map(_.result()).toList
case (key, value) :: tail =>
loop(
remaining = tail,
acc.updatedWith(key) {
case None => Some(new StringBuilder(value))
case Some(oldValue) => Some(oldValue.append(value))
}
)
}
loop(remaining = list, acc = Map.empty)
}
val data = List("ptxt"->"how","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
mergeByKey(data)
// res: List[String] = List("howwhats upthis ", "you doingis cool")
Or a one liner using groupMap.
(inspired by pme's answer)
data.groupMap(_._1)(_._2).view.mapValues(_.mkString).valuesIterator.toList
Adding another answer since I don't have enough reputation points to add a comment. This is just an improvement on Krzysztof Atłasik's answer. To handle the case where the Seq starts with a "list" entry, you might want to add another case:
case (xs,("list", s)) if xs.isEmpty=>xs
So the final code could be something like:
val s = Seq("list"->"how ","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
val r: Seq[String] = s.foldLeft(List[String]()) {
case (xs,("list", s)) if xs.isEmpty=>xs
case (xs, ("ptxt", s)) => s :: xs
case (x :: xs, ("list", s)) => (x + s) :: xs
}.reverse

Reduce/Fold only some elements

I have a file parser that produces a collection of elements all belonging to the same trait. It is similar to the following.
trait Data {
val identifier: String
}
case class Meta(identifier: String, props: Properties) extends Data
case class Complete(identifier: String, contents: Map[String, Any]) extends Data
case class Partial(identifier: String, name: String, value: Any) extends Data
...
def parse(file: File): Iterator[Data] = ... // this isn't relevant
What I am attempting to do is traverse the collection in a functional manner since I am processing a lot of data and want to be as memory conscious as possible. The collection when it is returned from the parse method is a mix of Complete, Meta, and Partial elements. The logic is that I need to pass the Complete and Meta elements through unchanged, while collecting the Partial elements and grouping on the identifier to create Complete elements.
With just a collection of Partial elements (Iterator[Partial]), I can do the following:
partialsOnly.groupBy(_.identifier)
.map{
case (ident, parts) =>
Complete(ident, parts.map(p => p.name -> p.value).toMap)
}
Is there a functional way, somewhat similar to scan that will accumulate elements, but only some elements, while letting the rest through unchanged?
You can use the partition function to split a collection in two based on a predicate.
// file is the File to parse; toList materializes the Iterator so both halves are Lists
val (partial, completeAndMeta) = parse(file).toList.partition {
case _: Partial => true
case _ => false
}
From there, you want to make sure you can process partial as a List[Partial], ideally without tripping compiler warnings about type erasure or doing messy casts. You can do this with a call to collect, using a partial function that only accepts Partial elements.
val partials: List[Partial] = partial.collect { case p: Partial => p }
Unfortunately, when used on an Iterator, partition may need to buffer arbitrary amounts of data, so isn't necessarily the most memory efficient technique. If memory management is a huge concern, you may need to sacrifice functional purity. Alternately, if you add some way of knowing when a Partial is completed, you can accumulate them in a Map via a foldLeft and emit the final value as they finish.
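The foldLeft idea could look roughly like the sketch below. It is only an illustration under some assumptions: mergePartials is a hypothetical name, the Data hierarchy is copied from the question, and because nothing signals when a Partial group is finished, the merged elements are only emitted after the whole input has been consumed. The input is traversed once, but everything is held in memory until the end.

```scala
import java.util.Properties

trait Data { val identifier: String }
case class Meta(identifier: String, props: Properties) extends Data
case class Complete(identifier: String, contents: Map[String, Any]) extends Data
case class Partial(identifier: String, name: String, value: Any) extends Data

// Returns the untouched Meta/Complete elements (in input order)
// and the Complete elements built from the grouped Partials.
def mergePartials(items: Iterator[Data]): (List[Data], List[Complete]) = {
  val start = (List.empty[Data], Map.empty[String, Map[String, Any]])
  val (passed, grouped) = items.foldLeft(start) {
    case ((out, acc), p: Partial) =>
      // Collect this Partial's field under its identifier.
      val fields = acc.getOrElse(p.identifier, Map.empty[String, Any]) + (p.name -> p.value)
      (out, acc.updated(p.identifier, fields))
    case ((out, acc), other) =>
      // Pass everything else through unchanged (prepended, reversed below).
      (other :: out, acc)
  }
  (passed.reverse, grouped.map { case (id, fields) => Complete(id, fields) }.toList)
}

val sample = Iterator[Data](
  Partial("a", "x", 1),
  Complete("b", Map("k" -> 2)),
  Partial("a", "y", 2)
)
val (passedThrough, built) = mergePartials(sample)
```

With this sample, passedThrough contains only the Complete("b", ...) element and built contains a single Complete for identifier "a" with the fields x and y.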
Recursion might be a functional way to solve your problem:
def parse(list: List[Data]): (List[Data], List[Data]) = {
list match {
case (x:Partial) :: xs =>
val (partials, rest) = parse(xs)
(x :: partials, rest) //instead of creating list, you can join partials here
case x :: xs =>
val (partials, rest) = parse(xs)
(partials, x :: rest)
case _ => (Nil, Nil)
}
}
val (partials, rest) = parse(list)
Unfortunately, this function is not tail recursive, so it might blow up the stack for longer lists.
You can solve it by using Eval from cats:
import cats.Eval // requires the cats-core library
def parse2(list: List[Data]): Eval[(List[Data], List[Data])] =
Eval.now(list).flatMap {
case (x:Partial) :: xs =>
parse2(xs).map {
case (partials, rest) => (x :: partials, rest) //instead of creating list, you can join partials here
}
case x :: xs =>
parse2(xs).map {
case (partials, rest) => (partials, x :: rest)
}
case _ => Eval.now((Nil, Nil))
}
val (partialsResult, restResult) = parse2(longList).value
This solution is stack-safe because Eval trampolines the computation: the chain of pending calls is kept on the heap rather than on the call stack.
And here's a version that also groups the partials:
def parse3(list: List[Data]): Eval[(Map[String, List[Partial]], List[Data])] =
Eval.now(list).flatMap {
case (x:Partial) :: xs =>
parse3(xs).map {
case (partials, rest) =>
val newPartials = x :: partials.getOrElse(x.identifier, Nil)
(partials + (x.identifier -> newPartials), rest)
}
case x :: xs =>
parse3(xs).map {
case (partials, rest) => (partials, x :: rest)
}
case _ => Eval.now((Map.empty[String, List[Partial]], Nil))
}
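If adding cats as a dependency is not an option, the standard library's scala.util.control.TailCalls offers a similar trampoline. The following is a rough sketch of that swapped-in technique, not the Eval code above. It assumes a simplified Data hierarchy (no Meta or Properties), and splitPartials and longList are hypothetical names:

```scala
import scala.util.control.TailCalls.{TailRec, done, tailcall}

sealed trait Data { def identifier: String }
final case class Complete(identifier: String, contents: Map[String, Any]) extends Data
final case class Partial(identifier: String, name: String, value: Any) extends Data

// Same shape as parse/parse2, but each recursive step is suspended in a TailRec
// and later run in a loop by .result instead of growing the call stack.
def splitPartials(list: List[Data]): TailRec[(List[Partial], List[Data])] = list match {
  case (x: Partial) :: xs =>
    tailcall(splitPartials(xs)).map { case (partials, rest) => (x :: partials, rest) }
  case x :: xs =>
    tailcall(splitPartials(xs)).map { case (partials, rest) => (partials, x :: rest) }
  case Nil =>
    done((Nil, Nil))
}

// A list long enough that plain, non-tail recursion would be at risk of overflowing.
val longList: List[Data] = List.tabulate(100000) { i =>
  if (i % 2 == 0) Partial(s"id$i", "n", i) else Complete(s"id$i", Map.empty)
}

val (partials, rest) = splitPartials(longList).result
```

As with Eval, the price for stack safety is some extra allocation per step.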

Compress a Given Text of String in Scala

I have been trying to compress a String. Given a String like this:
AAABBCAADEEFF, I would need to compress it like 3A2B1C2A1D2E2F
I was able to come up with a tail recursive implementation:
@scala.annotation.tailrec
def compress(str: List[Char], current: Seq[Char], acc: Map[Int, String]): String = str match {
case Nil =>
if (current.nonEmpty)
s"${acc.values.mkString("")}${current.length}${current.head}"
else
s"${acc.values.mkString("")}"
case List(x) if current.contains(x) =>
val newMap = acc ++ Map(acc.keys.toList.last + 1 -> s"${current.length + 1}${current.head}")
compress(List.empty[Char], Seq.empty[Char], newMap)
case x :: xs if current.isEmpty =>
compress(xs, Seq(x), acc)
case x :: xs if !current.contains(x) =>
if (acc.nonEmpty) {
val newMap = acc ++ Map(acc.keys.toList.last + 1 -> s"${current.length}${current.head}")
compress(xs, Seq(x), newMap)
} else {
compress(xs, Seq(x), acc ++ Map(1 -> s"${current.length}${current.head}"))
}
case x :: xs =>
compress(xs, current :+ x, acc)
}
// Produces 2F3A2B1C2A instead of 3A2B1C2A1D2E2F
compress("AAABBCAADEEFF".toList, Seq.empty[Char], Map.empty[Int, String])
It fails however for the given case! Not sure what edge scenario I'm missing! Any help?
So what I'm actually doing is, going over the sequence of characters, collecting identical ones into a new Sequence and as long as the new character in the original String input (the first param in the compress method) is found in the current (the second parameter in the compress method), I keep collecting it.
As soon as it is not the case, I empty the current sequence, count and push the collected elements into the Map! It fails for some edge cases that I'm not able to make out!
I came up with this solution:
def compress(word: List[Char]): List[(Char, Int)] =
word.map((_, 1)).foldRight(Nil: List[(Char, Int)])((e, acc) =>
acc match {
case Nil => List(e)
case ((c, i)::rest) => if (c == e._1) (c, i + 1)::rest else e::acc
})
Basically, it's a map followed by a right fold.
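Since this version returns the runs as a List[(Char, Int)] rather than a String, a small rendering step is still needed to get the format asked for in the question. A possible sketch, where render is a hypothetical helper name:

```scala
// The fold from the answer above, repeated so the example is self-contained.
def compress(word: List[Char]): List[(Char, Int)] =
  word.map((_, 1)).foldRight(Nil: List[(Char, Int)])((e, acc) =>
    acc match {
      case Nil              => List(e)
      case (c, i) :: rest   => if (c == e._1) (c, i + 1) :: rest else e :: acc
    })

// Turn each (char, count) pair into "<count><char>" and concatenate.
def render(runs: List[(Char, Int)]): String =
  runs.map { case (c, n) => s"$n$c" }.mkString

val runs = compress("AAABBCAADEEFF".toList)
val text = render(runs) // "3A2B1C2A1D2E2F"
```

Because foldRight walks the input from the right and prepends, the runs come out in the original order and no reverse is needed.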
I took inspiration from @nicodp's code:
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((lastChar, lastCharCount) :: xs) if lastChar == e => (lastChar, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
Our intermediate result is a List[(Char, Int)]: a list of tuples in which each char is paired with its count.
Now let's go through the string one char at a time using the great foldLeft.
We accumulate the result in acc, and e is the current element.
acc is of type List[(Char, Int)] and e is of type Char.
When we start, we are at the first char of the string and acc is the empty list, so we attach the first tuple, with count one, to the front of acc.
When acc is Nil, do (e, 1) :: Nil (equivalently (e, 1) :: acc, since acc is Nil).
The front of the list is now the node we are interested in.
Let's go to the second element. Now acc has one element: the first char with count one.
We compare the current element with the front element of the list. If they match, we increment the count and put (element, incrementedCount) at the front of the list in place of the old tuple.
If the current element does not match the front element, we have a new element, so we attach it with count 1 to the front of the list, and so on.
Finally, we reverse the list and convert the List[(Char, Int)] to the required string representation.
Note: We use the front element of the list, which is accessible in O(1) (constant time), as a buffer, and increase its count whenever the same element is found again.
Scala REPL
scala> :paste
// Entering paste mode (ctrl-D to finish)
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((lastChar, lastCharCount) :: xs) if lastChar == e => (lastChar, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
// Exiting paste mode, now interpreting.
encode: (word: String)String
scala> encode("AAABBCAADEEFF")
res0: String = 3A2B1C2A1D2E2F
A bit more concise, using backticks around e instead of a guard in the pattern match:
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((`e`, lastCharCount) :: xs) => (e, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
Here's another more simplified approach based upon this answer:
class StringCompressinator {
def compress(raw: String): String = {
val split: Array[String] = raw.split("(?<=(.))(?!\\1)", 0) // creates array of the repeated chars as strings
val converted = split.map(group => {
val char = group.charAt(0) // take first char of group string
s"${group.length}${char}" // use the length as counter and prefix the return string "AAA" becomes "3A"
})
converted.mkString("") // converted is again array, join turn it into a string
}
}
import org.scalatest.FunSuite
class StringCompressinatorTest extends FunSuite {
test("testCompress") {
val compress = (new StringCompressinator).compress(_)
val input = "AAABBCAADEEFF"
assert(compress(input) == "3A2B1C2A1D2E2F")
}
}
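To see what the regular expression in compress does on its own, here is a short sketch (the names groups and encoded are only illustrative). The pattern matches the empty position right after any character that is not followed by the same character, so split cuts the string exactly at the run boundaries:

```scala
// (?<=(.)) captures the previous character, (?!\1) requires that it does not repeat.
val groups: Array[String] = "AAABBCAADEEFF".split("(?<=(.))(?!\\1)", 0)
// groups: AAA, BB, C, AA, D, EE, FF

// Each group is one run, so its length is the count and its first char is the symbol.
val encoded = groups.map(g => s"${g.length}${g.head}").mkString
```

Note that an empty input string would yield a single empty group, so g.head (like charAt(0) in the class above) would fail on it.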
Similar idea with slight difference :
Case class for pattern matching the head so we don't need to use if and it also helps on printing end result by overriding toString
Using capital letter for variable name when pattern matching (either that or back ticks, I don't know which I like less :P)
case class Count(c : Char, cnt : Int){
override def toString = s"$cnt$c"
}
def compressor( counts : List[Count], C : Char ) = counts match {
case Count(C, cnt) :: tail => Count(C, cnt + 1) :: tail
case _ => Count(C, 1) :: counts
}
"AAABBCAADEEFF".foldLeft(List[Count]())(compressor).reverse.mkString
//"3A2B1C2A1D2E2F"

How can I remove duplicates from a list in Scala with pattern matching?

As homework I have to write a function that removes duplicates from a list. It should be recursive and use pattern matching. I am not allowed to use list functions like head, tail, contains, etc.
For sorted lists I came up with this solution:
def remove(u:List[Int]):List[Int] = {
u match { case Nil => u
case hd::hd2::tl => if(hd == hd2) remove(hd2::tl) else hd :: remove(hd2::tl)
case hd::tl => hd :: remove(tl)
}
}
How can I do it for unsorted lists?
I won't do your homework for you, but hope, this will help.
You want to make your function tail-recursive. That means that the recursive call appears in the very last position of the function, so that the Scala compiler can turn it into a loop that reuses the current stack frame (it executes very much like a loop, without requiring additional space on the stack). In your original solution, it is expressions like hd :: remove(tl) that make it not tail-recursive: you have to invoke the recursive call and then prepend hd to its result. The "then" part breaks the idea of tail recursion, because the place to return to after the recursive call finishes has to be remembered on the stack.
This is typically avoided by carrying the final result of the function through the recursion as an argument:
def remove(u: List[Int], result: List[Int] = Nil): List[Int] = u match {
case Nil => result
case a :: b :: tail if a == b => remove(b :: tail, result)
case head :: tail => remove(tail, head :: result)
}
(Note, that both recursive invocations here are in the tail position - there is nothing left to do after the call returns, so the previous entry can be cleared from the stack prior to invoking the recursion).
You need another recursive function - contains - that tells whether a given element is contained in a list. Once you have that, just replace the second case clause above with something like
case head :: tail if contains(head, result) => remove(tail, result)
and your job is done!
If you want to preserve the original order of elements of the list, you will need to reverse it afterwards (replace case Nil => result with case Nil => result.reverse) ... If you are not allowed to use .reverse here too, that's going to be another nice exercise for you. How do you reverse a list (tail-)recursively?
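For readers who want to compare their own attempt, one possible way to put these pieces together is sketched below. It is only one of several reasonable solutions, uses nothing but pattern matching and recursion, and the helper names contains and reverse are simply the ones suggested above, written by hand rather than taken from List:

```scala
import scala.annotation.tailrec

// Is x an element of list? Walks the list with pattern matching only.
@tailrec
def contains(x: Int, list: List[Int]): Boolean = list match {
  case Nil                    => false
  case head :: _ if head == x => true
  case _ :: tail              => contains(x, tail)
}

// Reverses a list by moving elements one by one onto an accumulator.
@tailrec
def reverse(list: List[Int], acc: List[Int] = Nil): List[Int] = list match {
  case Nil          => acc
  case head :: tail => reverse(tail, head :: acc)
}

// Keeps the first occurrence of every element, preserving the original order.
@tailrec
def remove(u: List[Int], result: List[Int] = Nil): List[Int] = u match {
  case Nil                                   => reverse(result)
  case head :: tail if contains(head, result) => remove(tail, result)
  case head :: tail                          => remove(tail, head :: result)
}

val deduplicated = remove(List(1, 2, 3, 1, 3, 3)) // List(1, 2, 3)
```

Because contains scans the accumulated result for every element, this version is O(n²) in the worst case, which is usually acceptable for a homework-sized list.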
I would probably start by sorting the list so that we can apply the O(n) pattern-matching method. To keep the original order, you need to index the list first so that the order can be recovered later; this can be done with the zipWithIndex method. Since the data type changes from List[Int] to List[(Int, Int)], you also need to define another recursive remove function inside the original one:
def remove(u: List[Int]): List[Int] = {
val sortedU = u.zipWithIndex.sortBy{ case (x, y) => x}
def removeRec(su: List[(Int, Int)]): List[(Int, Int)] = {
su match {
case Nil => su
case hd :: Nil => su
case hd :: hd2 :: tail => {
if (hd._1 == hd2._1) removeRec(hd2 :: tail)
else hd :: removeRec(hd2 :: tail)
}
}
}
removeRec(sortedU).sortBy{case (x, y) => y}.map{ case (x, y) => x}
}
val lst = List(1,2,3,1,3,3)
remove(lst)
// res51: List[Int] = List(2, 1, 3)
Note: This follows the OP's pattern and is not tail-recursive. If you want a tail-recursive version, you can follow @Dima's nice explanation.