Scala - standard recursive pattern match on list

Given:
def readLines(x: String): List[String] = Source.fromFile(x).getLines.toList
def toKVMap(fName : String): Map[String,String] =
readLines(fName).map(x => x.split(',')).map { case Array(x, y) => (x, y) }.toMap
I want to be able to take a string and a list of files of replacements and replace bracketed items. So if I have:
replLines("H[ello]",["cat"]) and cat contains ello,i!, I want to get back Hi!
I tried:
def replLines(inpQ : String, y : List[String]): String = y match {
case Nil => inpQ
case x::xs => replLines(toKVMap(x).fold(inpQ) {
case ((str: String), ((k: String), (v: String))) =>
str.replace("[" + k + "]", v).toString
}, xs)
}
I think the syntax is close, but not quite there. What have I done wrong?

What you're looking for is most likely this (note the foldLeft[String] instead of fold):
def replLines(inpQ: String, y: List[String]): String = y match {
case Nil => inpQ
case x :: xs => replLines(toKVMap(x).foldLeft[String](inpQ) {
case ((str: String), ((k: String), (v: String))) =>
str.replace("[" + k + "]", v)
}, xs)
}
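fold requires its initial value and its result to be a supertype of the collection's element type, which here is (String, String). The only common supertype the compiler finds for String and (String, String) is Serializable, so the accumulator is typed as Serializable, not String. foldLeft (and foldRight, if you prefer to start your replacements from the end) lets the accumulator type differ from the element type, and allows you to specify it explicitly.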
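To make the typing difference concrete, here is a minimal sketch that skips the file reading. The object name FoldTyping, the replacements map and the applyAll helper are made up for illustration; the map stands in for what toKVMap would return for one file.

```scala
object FoldTyping {
  // Hypothetical in-memory table standing in for toKVMap(fileName).
  val replacements: Map[String, String] = Map("ello" -> "i!")

  // foldLeft[B](z: B)(op: (B, A) => B): the accumulator type B (String here)
  // is independent of the element type A, which is (String, String).
  def applyAll(input: String, repl: Map[String, String]): String =
    repl.foldLeft(input) { case (str, (k, v)) => str.replace(s"[$k]", v) }

  // By contrast, fold[A1 >: A](z: A1)(op: (A1, A1) => A1) forces the
  // accumulator to be a supertype of (String, String), so it cannot be String.
}
```

Calling FoldTyping.applyAll("H[ello]", FoldTyping.replacements) replaces the bracketed key with its value.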
EDIT: In fact, you don't even need a recursive pattern matching at all, as you can map your replacements directly to the list:
def replLines2(inpQ: String, y: List[String]): String =
y.flatMap(toKVMap).foldLeft[String](inpQ) {
case (str, (k, v)) => str.replace(s"[$k]", v)
}

Related

Compress a Given Text of String in Scala

I have been trying to compress a String. Given a String like this:
AAABBCAADEEFF, I would need to compress it like 3A2B1C2A1D2E2F
I was able to come up with a tail recursive implementation:
@scala.annotation.tailrec
def compress(str: List[Char], current: Seq[Char], acc: Map[Int, String]): String = str match {
case Nil =>
if (current.nonEmpty)
s"${acc.values.mkString("")}${current.length}${current.head}"
else
s"${acc.values.mkString("")}"
case List(x) if current.contains(x) =>
val newMap = acc ++ Map(acc.keys.toList.last + 1 -> s"${current.length + 1}${current.head}")
compress(List.empty[Char], Seq.empty[Char], newMap)
case x :: xs if current.isEmpty =>
compress(xs, Seq(x), acc)
case x :: xs if !current.contains(x) =>
if (acc.nonEmpty) {
val newMap = acc ++ Map(acc.keys.toList.last + 1 -> s"${current.length}${current.head}")
compress(xs, Seq(x), newMap)
} else {
compress(xs, Seq(x), acc ++ Map(1 -> s"${current.length}${current.head}"))
}
case x :: xs =>
compress(xs, current :+ x, acc)
}
// Produces 2F3A2B1C2A instead of 3A2B1C2A1D2E2F
compress("AAABBCAADEEFF".toList, Seq.empty[Char], Map.empty[Int, String])
It fails however for the given case! Not sure what edge scenario I'm missing! Any help?
What I'm actually doing is going over the sequence of characters and collecting identical ones into a new sequence. As long as the next character in the original String input (the first parameter of the compress method) is found in current (the second parameter of the compress method), I keep collecting it.
As soon as that is no longer the case, I empty the current sequence, count the collected elements and push them into the Map. It fails for some edge cases that I'm not able to make out!
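For comparison, the collecting approach described above can be sketched with an ordered accumulator. This rests on an assumption about the failure, not a confirmed diagnosis: an immutable Map stops preserving insertion order once it holds more than four entries, so acc.values and acc.keys.toList.last may no longer reflect the order in which runs were added. The object name RunLength and the use of a Vector[String] are choices made for this sketch.

```scala
import scala.annotation.tailrec

object RunLength {
  // current holds the run being collected; acc holds finished runs in order.
  @tailrec
  def compress(rest: List[Char],
               current: List[Char] = Nil,
               acc: Vector[String] = Vector.empty): String =
    rest match {
      case Nil =>
        // Flush the last run, if there is one.
        val done = if (current.isEmpty) acc else acc :+ s"${current.length}${current.head}"
        done.mkString
      case x :: xs if current.isEmpty || current.head == x =>
        // Same character as the current run (or no run yet): keep collecting.
        compress(xs, x :: current, acc)
      case x :: xs =>
        // Different character: store the finished run and start a new one.
        compress(xs, List(x), acc :+ s"${current.length}${current.head}")
    }
}
```

Because a Vector keeps its elements in insertion order, no key bookkeeping is needed.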
I came up with this solution:
def compress(word: List[Char]): List[(Char, Int)] =
word.map((_, 1)).foldRight(Nil: List[(Char, Int)])((e, acc) =>
acc match {
case Nil => List(e)
case ((c, i)::rest) => if (c == e._1) (c, i + 1)::rest else e::acc
})
Basically, it's a map followed by a right fold.
Took inspiration from @nicodp's code
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((lastChar, lastCharCount) :: xs) if lastChar == e => (lastChar, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
The intermediate result is a List[(Char, Int)]: a list of tuples in which each char is paired with its count.
We go through the string one char at a time using foldLeft.
The result accumulates in acc, and e is the current element.
acc is of type List[(Char, Int)] and e is of type Char.
When we start, we are at the first char and acc is the empty list, so we attach the first tuple, with count one, to the front of acc:
when acc is Nil, the result is (e, 1) :: Nil (equivalently (e, 1) :: acc, since acc is Nil).
From now on, the front of the list is the node we are interested in.
At the second element, acc has one element: the first char with count one.
We compare the current element with the char at the front of the list.
If it matches, we increment the count and put (element, incrementedCount) at the front of the list in place of the old tuple.
If it does not match, we have a new element, so we attach it with count 1 to the front of the list. And so on for the rest of the string.
Because we prepend, the list ends up in reverse order, so we reverse it and then convert the List[(Char, Int)] to the required string representation.
Note: the front element of a list is accessible in O(1) (constant time), so we use it as a buffer and increase its count whenever the same element is found again.
Scala REPL
scala> :paste
// Entering paste mode (ctrl-D to finish)
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((lastChar, lastCharCount) :: xs) if lastChar == e => (lastChar, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
// Exiting paste mode, now interpreting.
encode: (word: String)String
scala> encode("AAABBCAADEEFF")
res0: String = 3A2B1C2A1D2E2F
A bit more concise, with a backticked `e` in the pattern instead of a guard:
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((`e`, lastCharCount) :: xs) => (e, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
Here's another more simplified approach based upon this answer:
class StringCompressinator {
def compress(raw: String): String = {
val split: Array[String] = raw.split("(?<=(.))(?!\\1)", 0) // creates array of the repeated chars as strings
val converted = split.map(group => {
val char = group.charAt(0) // take first char of group string
s"${group.length}${char}" // use the length as counter and prefix the return string "AAA" becomes "3A"
})
converted.mkString("") // converted is again array, join turn it into a string
}
}
import org.scalatest.FunSuite
class StringCompressinatorTest extends FunSuite {
test("testCompress") {
val compress = (new StringCompressinator).compress(_)
val input = "AAABBCAADEEFF"
assert(compress(input) == "3A2B1C2A1D2E2F")
}
}
Similar idea with slight difference :
Case class for pattern matching the head so we don't need to use if and it also helps on printing end result by overriding toString
Using capital letter for variable name when pattern matching (either that or back ticks, I don't know which I like less :P)
case class Count(c : Char, cnt : Int){
override def toString = s"$cnt$c"
}
def compressor( counts : List[Count], C : Char ) = counts match {
case Count(C, cnt) :: tail => Count(C, cnt + 1) :: tail
case _ => Count(C, 1) :: counts
}
"AAABBCAADEEFF".foldLeft(List[Count]())(compressor).reverse.mkString
//"3A2B1C2A1D2E2F"

Assign value to variable in Scala Parser

I have a question about the Parser in Scala. I'll just post the important part here so that there is not too much code. The eval function:
def eval(t: Term): Double = t match {
case Addition(l, r) => eval(l) + eval(r)
case Multiplication(l, r) => eval(l) * eval(r)
case Numeric(i) => i
case Variable("X") => 3
}
And the calculate function:
def calculate(arg: String): Double = {
return eval(parseAll(term, arg).get)
}
Now I should overload the function "calculate" so that it takes an extra parameter tup : (String, Double) and assigns the value to this String. For example, with ("Y",2), Y = 2 in the parser, and then the expression is calculated. But I don't know how to assign the value to this String. I had a stupid idea and tried this, but it didn't work.
def calculate(arg: String, tup : (String, Double)) : Double = {
tup match {
case (a,b) => {
def eval(t : Term): Double = t match {
case Variable(a) => b
}
return eval(parseAll(term, arg).get)
}
}
Can you please help me out? Thank you!
You're almost there, you just need to tell the compiler that the a in your Variable pattern is actually the a from your (a, b) pattern. By default, what you do is called shadowing of the variable name a (in the scope of this pattern match, a is the value extracted in Variable, and the other a is forgotten).
What you want is something like
...
case Variable(`a`) => b
...
or, if your expression gets a little more complicated, you should rather use a guard:
...
case Variable(v) if v == a => b
...
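The difference between the three patterns can be seen in a small standalone sketch. The Variable case class below is a stand-in defined only for this example, and the object and method names are made up.

```scala
object PatternShadowing {
  // Stand-in for the parser's Variable node (assumed shape).
  final case class Variable(name: String)

  // The a in the pattern is a fresh binding that shadows the parameter,
  // so this matches every Variable regardless of its name.
  def shadowed(a: String, v: Variable): Boolean = v match {
    case Variable(a) => true
  }

  // Backticks turn a into a stable identifier pattern:
  // the extracted name must be equal to the parameter a.
  def backticked(a: String, v: Variable): Boolean = v match {
    case Variable(`a`) => true
    case _             => false
  }

  // The guard expresses the same comparison explicitly.
  def guarded(a: String, v: Variable): Boolean = v match {
    case Variable(name) if name == a => true
    case _                           => false
  }
}
```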
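EDIT: However, your eval function is then no longer well defined, because the nested version only handles the Variable case. You need to put all the cases in one function and pass the assignment along in the recursive calls:
def eval(t: Term, varAssignment: (String, Double)): Double = t match {
case Addition(l, r) => eval(l, varAssignment) + eval(r, varAssignment)
case Multiplication(l, r) => eval(l, varAssignment) * eval(r, varAssignment)
case Numeric(i) => i
case Variable(a) if a == varAssignment._1 => varAssignment._2
}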
Or, if you want to have multiple variables:
def eval(t: Term, assignments: Map[String, Double]): Double = t match {
case Addition(l, r) => eval(l, assignments) + eval(r, assignments)
case Multiplication(l, r) => eval(l, assignments) * eval(r, assignments)
case Numeric(i) => i
case Variable(a) if assignments.contains(a) => assignments(a)
}
Beware that you'll still get MatchErrors whenever an unassigned variable is used.
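If the MatchError is a concern, one option is to return an Either instead of throwing. This is only a sketch: the Term hierarchy below is assumed, since the original definitions are not shown in the question, and the error message format is invented for the example.

```scala
object Evaluator {
  // Assumed Term hierarchy; the real one from the parser may differ.
  sealed trait Term
  final case class Addition(l: Term, r: Term) extends Term
  final case class Multiplication(l: Term, r: Term) extends Term
  final case class Numeric(value: Double) extends Term
  final case class Variable(name: String) extends Term

  // Left carries a message naming the unassigned variable; Right carries the value.
  def eval(t: Term, env: Map[String, Double]): Either[String, Double] = t match {
    case Addition(l, r) =>
      for { a <- eval(l, env); b <- eval(r, env) } yield a + b
    case Multiplication(l, r) =>
      for { a <- eval(l, env); b <- eval(r, env) } yield a * b
    case Numeric(v) =>
      Right(v)
    case Variable(name) =>
      env.get(name).toRight(s"unassigned variable: $name")
  }
}
```

Since the trait is sealed and every case is covered, the match is exhaustive and an unknown variable shows up as a Left value.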

Combining / uniting two strings in Scala

What would be the best way to combine "__L_" and "_E__" to "_EL_" in Scala?
I don't want to use if and for commands.
How about this:
def combine(xs: String, ys: String): String =
(xs zip ys).map {
case ('_', y) => y
case (x, _) => x
}.mkString("")
The only thing that is not really nice about this is how to get back from a collection (IndexedSeq[Char]) to a string. An alternative is to use the String constructor that takes an Array[Char]. That would probably be more efficient.
Note that zip will work for strings of different length, but the result will be the size of the shorter string. This may or may not be what you want.
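Both remarks can be illustrated with a short sketch. The name combineViaArray is made up here; it applies the same rule as combine but builds the result with the String constructor that takes an Array[Char].

```scala
object Combine {
  // Same matching rule as combine, but the characters are collected into an
  // Array[Char] and handed to the String(Array[Char]) constructor.
  def combineViaArray(xs: String, ys: String): String = {
    val chars: Array[Char] = xs.zip(ys).map {
      case ('_', y) => y
      case (x, _)   => x
    }.toArray
    new String(chars)
  }
}
```

With inputs of different lengths, for example "__L_" and "_E", zip stops at the end of the shorter string, so only the first two positions are combined.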
def zipStrings(first: String,
second: String,
comb: (Char, Char) => String = (f, _) => f.toString,
placeholder: Char = '_') =
first.zipAll(second, placeholder, placeholder).map {
case (c, `placeholder`) => c
case (`placeholder`, c) => c
case (f, s) => comb(f, s)
}.mkString
By default, this prioritises characters from first over second:
zipStrings("__A", "X_CD") // yields "X_AD"
zipStrings("A__YY", "BXXXX", (f, s) => s"($f|$s)") // yields "(A|B)XX(Y|X)(Y|X)"
For your original strings:
zipStrings("__L_", "_E__") // yields "_EL_"
zipStrings("--L-", "-E--", placeholder = '-') // yields "-EL-"

Scala: Partitioning by case (not by filter)

I have a list of mixed values:
val list = List("A", 2, 'c', 4)
I know how to collect the chars, or strings, or ints, in a single operation:
val strings = list collect { case s:String => s }
==> List(A)
val chars = list collect { case c:Char => c }
==> List(c)
val ints = list collect { case i:Int => i }
==> List(2,4)
Can I do it all in one shot somehow? I'm looking for:
val (strings, chars, ints) = list ??? {
case s:String => s
case c:Char => c
case i:Int => i
}
EDIT
Confession -- An example closer to my actual use case:
I have a list of things, that I want to partition according to some conditions:
val list2 = List("Word", " ", "", "OtherWord")
val (empties, whitespacesonly, words) = list2 ??? {
case s:String if s.isEmpty => s
case s:String if s.trim.isEmpty => s
case s:String => s
}
N.B. partition would be great for this if I only had 2 cases (one where the condition was met and one where it wasn't) but here I have multiple conditions to split on.
Based on your second example: you can use groupBy and a key-ing function. I prefer to use those techniques in conjunction with a discriminated union to make the intention of the code more obvious:
val list2 = List("Word", " ", "", "OtherWord")
sealed trait Description
object Empty extends Description
object Whitespaces extends Description
object Words extends Description
def strToDesc(str : String) : Description = str match {
case _ if str.isEmpty() => Empty
case _ if str.trim.isEmpty() => Whitespaces
case _ => Words
}
val descMap = (list2 groupBy strToDesc) withDefaultValue List.empty[String]
val (empties, whitespaceonly, words) =
(descMap(Empty),descMap(Whitespaces),descMap(Words))
This extends well if you want to add another Description later, e.g. AllCaps...
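As a sketch of that extension, here is the same idea with an AllCaps category added. The rule used for AllCaps (at least one letter, and equal to its upper-cased form) is an assumption made for the example, as are the object name Describe and the classify helper; case objects are used so the keys print nicely.

```scala
object Describe {
  sealed trait Description
  case object Empty extends Description
  case object Whitespaces extends Description
  case object AllCaps extends Description // hypothetical extra category
  case object Words extends Description

  def strToDesc(str: String): Description = str match {
    case _ if str.isEmpty                                      => Empty
    case _ if str.trim.isEmpty                                 => Whitespaces
    case _ if str.exists(_.isLetter) && str == str.toUpperCase => AllCaps
    case _                                                     => Words
  }

  // Missing categories fall back to an empty list instead of throwing.
  def classify(xs: List[String]): Map[Description, List[String]] =
    xs.groupBy(strToDesc).withDefaultValue(Nil)
}
```

Only the trait and the key-ing function change; the groupBy call itself stays the same.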
Hope this helps:
list.foldLeft((List[String](), List[String](), List[String]())) {
case ((e,s,w),str:String) if str.isEmpty => (str::e,s,w)
case ((e,s,w),str:String) if str.trim.isEmpty => (e,str::s,w)
case ((e,s,w),str:String) => (e,s,str::w)
case (acc, _) => acc
}
You could use partition twice:
def partitionWords(list: List[String]) = {
val (emptyOrSpaces, words) = list.partition(_.trim.isEmpty)
val (empty, spaces) = emptyOrSpaces.partition(_.isEmpty)
(empty, spaces, words)
}
Which gives, for your example:
partitionWords(list2)
// (List(""),List(" "),List(Word, OtherWord))
In general you can use foldLeft with a tuple as accumulator.
def partitionWords2(list: List[String]) = {
val nilString = List.empty[String]
val (empty, spaces, words) = list.foldLeft((nilString, nilString, nilString)) {
case ((empty, spaces, words), elem) =>
elem match {
case s if s.isEmpty => (s :: empty, spaces, words)
case s if s.trim.isEmpty => (empty, s :: spaces, words)
case s => (empty, spaces, s :: words)
}
}
(empty.reverse, spaces.reverse, words.reverse)
}
Which will give you the same result.
A tail recursive method,
def partition(list: List[Any]): (List[Any], List[Any], List[Any]) = {
@annotation.tailrec
def inner(map: Map[String, List[Any]], innerList: List[Any]): Map[String, List[Any]] = innerList match {
case x :: xs => x match {
case s: String => inner(insertValue(map, "str", s), xs)
case c: Char => inner(insertValue(map, "char", c), xs)
case i: Int => inner(insertValue(map, "int", i), xs)
}
case Nil => map
}
def insertValue(map: Map[String, List[Any]], key: String, value: Any) = {
map + (key -> (value :: map.getOrElse(key, Nil)))
}
val partitioned = inner(Map.empty[String, List[Any]], list)
// values were prepended, so reverse each list to restore the input order
(partitioned.getOrElse("str", Nil).reverse, partitioned.getOrElse("char", Nil).reverse, partitioned.getOrElse("int", Nil).reverse)
}
val list1 = List("A", 2, 'c', 4)
val (strs, chars, ints) = partition(list1)
I wound up with this, based on @Nyavro's answer:
val list2 = List("Word", " ", "", "OtherWord")
val(empties, spaces, words) =
list2.foldRight((List[String](), List[String](), List[String]())) {
case (str, (e, s, w)) if str.isEmpty => (str :: e, s, w)
case (str, (e, s, w)) if str.trim.isEmpty => (e, str :: s, w)
case (str, (e, s, w)) => (e, s, str :: w)
}
==> empties: List[String] = List("")
==> spaces: List[String] = List(" ")
==> words: List[String] = List(Word, OtherWord)
I understand the risks of using foldRight: mainly that in order to start on the right, the runtime needs to recurse and that this may blow the stack on large inputs. However, my inputs are small and this risk is acceptable.
Having said that, if there's a quick way to _.reverse three lists of a tuple that I haven't thought of, I'm all ears.
Thanks all!
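On the reversal question, one possible way to sidestep it is to fold from the left into Vectors, which support appending at the end, so the input order is preserved without any reverse. This is a sketch rather than a drop-in replacement: the result is a tuple of Vector[String] instead of List[String], and the object name is invented.

```scala
object PartitionInOrder {
  type Buckets = (Vector[String], Vector[String], Vector[String])

  // foldLeft walks the list iteratively; :+ appends at the end of a Vector,
  // so each bucket keeps the order of the input.
  def partitionWords(xs: List[String]): Buckets =
    xs.foldLeft((Vector.empty[String], Vector.empty[String], Vector.empty[String])) {
      case ((e, s, w), str) if str.isEmpty      => (e :+ str, s, w)
      case ((e, s, w), str) if str.trim.isEmpty => (e, s :+ str, w)
      case ((e, s, w), str)                     => (e, s, w :+ str)
    }
}
```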

n-way `span` on sequences

Given a sequence of elements and a predicate p, I would like to produce a sequence of sequences such that, in each subsequence, either all elements satisfy p or the sequence has length 1. Additionally, calling .flatten on the result should give me back my original sequence (so no re-ordering of elements).
For instance, given:
val l = List(2, 4, -6, 3, 1, 8, 7, 10, 0)
val p = (i : Int) => i % 2 == 0
I would like magic(l,p) to produce:
List(List(2, 4, -6), List(3), List(1), List(8), List(7), List(10, 0))
I know of .span, but that method stops the first time it encounters a value that doesn't satisfy p and just returns a pair.
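To show what that limitation looks like on the example input, here is a minimal sketch (the object name SpanDemo is made up):

```scala
object SpanDemo {
  val l = List(2, 4, -6, 3, 1, 8, 7, 10, 0)
  val p = (i: Int) => i % 2 == 0

  // span splits once: the longest prefix satisfying p, and everything after it.
  val (prefix, rest) = l.span(p)
  // prefix is List(2, 4, -6); rest is List(3, 1, 8, 7, 10, 0),
  // even though rest still contains elements that satisfy p.
}
```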
Below is a candidate implementation. It does what I want, but, well, makes me want to cry. I would love for someone to come up with something slightly more idiomatic.
def magic[T](elems : Seq[T], p : T=>Boolean) : Seq[Seq[T]] = {
val loop = elems.foldLeft[(Boolean,Seq[Seq[T]])]((false,Seq.empty)) { (pr,e) =>
val (lastOK,s) = pr
if(lastOK && p(e)) {
(true, s.init :+ (s.last :+ e))
} else {
(p(e), s :+ Seq(e))
}
}
loop._2
}
(Note that I do not particularly care about preserving the actual type of the Seq.)
I would not use foldLeft. It's just a simple recursion of span with a special rule if the head doesn't match the predicate:
def magic[T](elems: Seq[T], p: T => Boolean): Seq[Seq[T]] =
elems match {
case Seq() => Seq()
case Seq(head, tail @ _*) if !p(head) => Seq(head) +: magic(tail, p)
case xs =>
val (prefix, rest) = xs span p
prefix +: magic(rest, p)
}
You could also do it tail-recursive, but you need to remember to reverse the output if you're prepending (as is sensible):
def magic[T](elems: Seq[T], p: T => Boolean): Seq[Seq[T]] = {
def iter(elems: Seq[T], out: Seq[Seq[T]]) : Seq[Seq[T]] =
elems match {
case Seq() => out.reverse
case Seq(head, tail @ _*) if !p(head) => iter(tail, Seq(head) +: out)
case xs =>
val (prefix, rest) = xs span p
iter(rest, prefix +: out)
}
iter(elems, Seq())
}
For this task you can use takeWhile and drop combined with a little pattern matching and recursion:
def magic[T](elems : Seq[T], p : T=>Boolean) : Seq[Seq[T]] = {
def magic(elems: Seq[T], result: Seq[Seq[T]]): Seq[Seq[T]] = elems.takeWhile(p) match {
// if elems is Nil, we have a result
case Nil if elems.isEmpty => result
// if it's not, but we don't get any values from takeWhile, we take a single elem
case Nil => magic(elems.tail, result :+ Seq(elems.head))
// takeWhile gave us something, so we add it to the result
// and drop as many elements from elems, as takeWhile gave us
case xs => magic(elems.drop(xs.size), result :+ xs)
}
magic(elems, Seq())
}
Another solution using a fold:
def magicFilter[T](seq: Seq[T], p: T => Boolean): Seq[Seq[T]] = {
val (filtered, current) = (seq foldLeft (Seq[Seq[T]](), Seq[T]())) {
case ((filtered, current), element) if p(element) => (filtered, current :+ element)
case ((filtered, current), element) if !current.isEmpty => (filtered :+ current :+ Seq(element), Seq())
case ((filtered, current), element) => (filtered :+ Seq(element), Seq())
}
if (!current.isEmpty) filtered :+ current else filtered
}