Related
I have a list of tuples look like this:
Seq("ptxt"->"how","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
On the keys, merge ptxt with all the list that will come after it.
e.g.
create a new seq look like this :
Seq("how you doing", "whats up", "this is cool")
You could fold your Seq with foldLeft:
val s = Seq("ptxt"->"how ","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
val r: Seq[String] = s.foldLeft(List[String]()) {
case (xs, ("ptxt", s)) => s :: xs
case (x :: xs, ("list", s)) => (x + s) :: xs
}.reverse
If you don't care about an order you can omit reverse.
Function foldLeft takes two arguments first is the initial value and the second one is a function taking two arguments: the previous result and element of the sequence. Result of this method is then fed the next function call as the first argument.
For example for numbers foldLeft, would just create a sum of all elements starting from left.
List(5, 4, 8, 6, 2).foldLeft(0) { (result, i) =>
result + i
} // 25
For our case, we start with an empty list. Then we provide function, which handles two cases using pattern matching.
Case when the key is "ptxt". In this case, we just prepend the value to list.
case (xs, ("ptxt", s)) => s :: xs
Case when the key is "list". Here we take the first string from the list (using pattern matching) and then concatenate value to it, after that we put it back with the rest of the list.
case (x :: xs, ("list", s)) => (x + s) :: xs
At the end since we were prepending element, we need to revert our list. Why we were prepending, not appending? Because append on the immutable list is O(n) and prepend is O(1), so it's more efficient.
Here another solution:
val data = Seq("ptxt"->"how","list"->"you doing","ptxt"->"whats", "list" -> "up","ptxt"-> "this ", "list"->"is cool")
First group Keys and Values:
val grouped = s.groupBy(_._1)
.map{case (k, l) => k -> l.map{case (_, v) => v.trim}}
// > Map(list -> List(you doing, up, is cool), ptxt -> List(how, whats, this))
Then zip and concatenate the two values:
grouped("ptxt").zip(grouped("list"))
.map{case (a, b) => s"$a $b"}
// > List(how you doing, whats up, this is cool)
Disclaimer: This only works if the there is always key, value, key, value,.. in the list - I had to adjust the input data.
If you change Seq for List, you can solve that with a simple tail-recursive function.
(The code uses Scala 2.13, but can be rewritten to use older Scala versions if needed)
def mergeByKey[K](list: List[(K, String)]): List[String] = {
#annotation.tailrec
def loop(remaining: List[(K, String)], acc: Map[K, StringBuilder]): List[String] =
remaining match {
case Nil =>
acc.valuesIterator.map(_.result()).toList
case (key, value) :: tail =>
loop(
remaining = tail,
acc.updatedWith(key) {
case None => Some(new StringBuilder(value))
case Some(oldValue) => Some(oldValue.append(value))
}
)
}
loop(remaining = list, acc = Map.empty)
}
val data = List("ptxt"->"how","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
mergeByKey(data)
// res: List[String] = List("howwhats upthis ", "you doingis cool")
Or a one liner using groupMap.
(inspired on pme's answer)
data.groupMap(_._1)(_._2).view.mapValues(_.mkString).valuesIterator.toList
Adding another answer since I don't have enough reputation points for adding a comment. just an improvment on Krzysztof Atłasik's answer. to compensate for the case where the Seq starts with a "list" you might want to add another case as:
case (xs,("list", s)) if xs.isEmpty=>xs
So the final code could be something like:
val s = Seq("list"->"how ","list"->"you doing","ptxt"->"whats up","ptxt"-> "this ","list"->"is ","list"->"cool")
val r: Seq[String] = s.foldLeft(List[String]()) {
case (xs,("list", s)) if xs.isEmpty=>xs
case (xs, ("ptxt", s)) => s :: xs
case (x :: xs, ("list", s)) => (x + s) :: xs
}.reverse
Given:
def readLines(x: String): List[String] = Source.fromFile(x).getLines.toList
def toKVMap(fName : String): Map[String,String] =
readLines(fName).map(x => x.split(',')).map { case Array(x, y) => (x, y) }.toMap
I want to be able to take a string and a list of files of replacements and replace bracketed items. So if I have:
replLines("Hello",["cat"]) and cat contains ello,i!, I want to get back Hi!
I tried:
def replLines(inpQ : String, y : List[String]): String = y match {
case Nil => inpQ
case x::xs => replLines(toKVMap(x).fold(inpQ) {
case ((str: String), ((k: String), (v: String))) =>
str.replace("[" + k + "]", v).toString
}, xs)
}
I think the syntax is close, but not quite there. What have I done wrong?
What you're looking for is most likely this (note the foldLeft[String] instead of fold:
def replLines(inpQ: String, y: List[String]): String = y match {
case Nil => inpQ
case x :: xs => replLines(toKVMap(x).foldLeft[String](inpQ) {
case ((str: String), ((k: String), (v: String))) =>
str.replace("[" + k + "]", v)
}, xs)
}
fold generalizes the fold initial argument too much, and considers it a Serializable, not a String. foldLeft (and foldRight, if you prefer to start your replacements from the end) allows you to explicitly specify the type you fold on
EDIT: In fact, you don't even need a recursive pattern matching at all, as you can map your replacements directly to the list:
def replLines2(inpQ: String, y: List[String]): String =
y.flatMap(toKVMap).foldLeft[String](inpQ) {
case (str, (k, v)) => str.replace(s"[$k]", v)
}
I have been trying to compress a String. Given a String like this:
AAABBCAADEEFF, I would need to compress it like 3A2B1C2A1D2E2F
I was able to come up with a tail recursive implementation:
#scala.annotation.tailrec
def compress(str: List[Char], current: Seq[Char], acc: Map[Int, String]): String = str match {
case Nil =>
if (current.nonEmpty)
s"${acc.values.mkString("")}${current.length}${current.head}"
else
s"${acc.values.mkString("")}"
case List(x) if current.contains(x) =>
val newMap = acc ++ Map(acc.keys.toList.last + 1 -> s"${current.length + 1}${current.head}")
compress(List.empty[Char], Seq.empty[Char], newMap)
case x :: xs if current.isEmpty =>
compress(xs, Seq(x), acc)
case x :: xs if !current.contains(x) =>
if (acc.nonEmpty) {
val newMap = acc ++ Map(acc.keys.toList.last + 1 -> s"${current.length}${current.head}")
compress(xs, Seq(x), newMap)
} else {
compress(xs, Seq(x), acc ++ Map(1 -> s"${current.length}${current.head}"))
}
case x :: xs =>
compress(xs, current :+ x, acc)
}
// Produces 2F3A2B1C2A instead of 3A2B1C2A1D2E2F
compress("AAABBCAADEEFF".toList, Seq.empty[Char], Map.empty[Int, String])
It fails however for the given case! Not sure what edge scenario I'm missing! Any help?
So what I'm actually doing is, going over the sequence of characters, collecting identical ones into a new Sequence and as long as the new character in the original String input (the first param in the compress method) is found in the current (the second parameter in the compress method), I keep collecting it.
As soon as it is not the case, I empty the current sequence, count and push the collected elements into the Map! It fails for some edge cases that I'm not able to make out!
I came up with this solution:
def compress(word: List[Char]): List[(Char, Int)] =
word.map((_, 1)).foldRight(Nil: List[(Char, Int)])((e, acc) =>
acc match {
case Nil => List(e)
case ((c, i)::rest) => if (c == e._1) (c, i + 1)::rest else e::acc
})
Basically, it's a map followed by a right fold.
Took inspiration from the #nicodp code
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((lastChar, lastCharCount) :: xs) if lastChar == e => (lastChar, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
First our intermediate result will be List[(Char, Int)]. List of tuples of chars each char will be accompanied by its count.
Now lets start going through the list one char at once using the Great! foldLeft
We will accumulate the result in the acc variable and e represents the current element.
acc is of type List[(Char, Int)] and e is of type Char
Now when we start, we are at first char of the list. Right now the acc is empty list. So, we attach first tuple to the front of the list acc
with count one.
when acc is Nil do (e, 1) :: Nil or (e, 1) :: acc note: acc is Nil
Now front of the list is the node we are interested in.
Lets go to the second element. Now acc has one element which is the first element with count one.
Now, we compare the current element with the front element of the list
if it matches, increment the count and put the (element, incrementedCount) in the front of the list in place of old tuple.
if current element does not match the last element, that means we have
new element. So, we attach new element with count 1 to the front of the list and so on.
then to convert the List[(Char, Int)] to required string representation.
Note: We are using front element of the list which is accessible in O(1) (constant time complexity) has buffer and increasing the count in case same element is found.
Scala REPL
scala> :paste
// Entering paste mode (ctrl-D to finish)
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((lastChar, lastCharCount) :: xs) if lastChar == e => (lastChar, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
// Exiting paste mode, now interpreting.
encode: (word: String)String
scala> encode("AAABBCAADEEFF")
res0: String = 3A2B1C2A1D2E2F
Bit more concise with back ticks e instead of guard in pattern matching
def encode(word: String): String =
word.foldLeft(List.empty[(Char, Int)]) { (acc, e) =>
acc match {
case Nil => (e, 1) :: Nil
case ((`e`, lastCharCount) :: xs) => (e, lastCharCount + 1) :: xs
case xs => (e, 1) :: xs
}
}.reverse.map { case (a, num) => s"$num$a" }.foldLeft("")(_ ++ _)
Here's another more simplified approach based upon this answer:
class StringCompressinator {
def compress(raw: String): String = {
val split: Array[String] = raw.split("(?<=(.))(?!\\1)", 0) // creates array of the repeated chars as strings
val converted = split.map(group => {
val char = group.charAt(0) // take first char of group string
s"${group.length}${char}" // use the length as counter and prefix the return string "AAA" becomes "3A"
})
converted.mkString("") // converted is again array, join turn it into a string
}
}
import org.scalatest.FunSuite
class StringCompressinatorTest extends FunSuite {
test("testCompress") {
val compress = (new StringCompressinator).compress(_)
val input = "AAABBCAADEEFF"
assert(compress(input) == "3A2B1C2A1D2E2F")
}
}
Similar idea with slight difference :
Case class for pattern matching the head so we don't need to use if and it also helps on printing end result by overriding toString
Using capital letter for variable name when pattern matching (either that or back ticks, I don't know which I like less :P)
case class Count(c : Char, cnt : Int){
override def toString = s"$cnt$c"
}
def compressor( counts : List[Count], C : Char ) = counts match {
case Count(C, cnt) :: tail => Count(C, cnt + 1) :: tail
case _ => Count(C, 1) :: counts
}
"AAABBCAADEEFF".foldLeft(List[Count]())(compressor).reverse.mkString
//"3A2B1C2A1D2E2F"
I have a list of mixed values:
val list = List("A", 2, 'c', 4)
I know how to collect the chars, or strings, or ints, in a single operation:
val strings = list collect { case s:String => s }
==> List(A)
val chars = list collect { case c:Char => c }
==> List(c)
val ints = list collect { case i:Int => i }
==> List(2,4)
Can I do it all in one shot somehow? I'm looking for:
val (strings, chars, ints) = list ??? {
case s:String => s
case c:Char => c
case i:Int => i
}
EDIT
Confession -- An example closer to my actual use case:
I have a list of things, that I want to partition according to some conditions:
val list2 = List("Word", " ", "", "OtherWord")
val (empties, whitespacesonly, words) = list2 ??? {
case s:String if s.isEmpty => s
case s:String if s.trim.isEmpty => s
case s:String => s
}
N.B. partition would be great for this if I only had 2 cases (one where the condition was met and one where it wasn't) but here I have multiple conditions to split on.
Based on your second example: you can use groupBy and a key-ing function. I prefer to use those techniques in conjunction with a discriminated union to make the intention of the code more obvious:
val list2 = List("Word", " ", "", "OtherWord")
sealed trait Description
object Empty extends Description
object Whitespaces extends Description
object Words extends Description
def strToDesc(str : String) : Description = str match {
case _ if str.isEmpty() => Empty
case _ if str.trim.isEmpty() => Whitespaces
case _ => Words
}
val descMap = (list2 groupBy strToDesc) withDefaultValue List.empty[String]
val (empties, whitespaceonly, words) =
(descMap(Empty),descMap(Whitespaces),descMap(Words))
This extends well if you want to add another Description later, e.g. AllCaps...
Hope this help:
list.foldLeft((List[String](), List[String](), List[String]())) {
case ((e,s,w),str:String) if str.isEmpty => (str::e,s,w)
case ((e,s,w),str:String) if str.trim.isEmpty => (e,str::s,w)
case ((e,s,w),str:String) => (e,s,str::w)
case (acc, _) => acc
}
You could use partition twice :
def partitionWords(list: List[String]) = {
val (emptyOrSpaces, words) = list.partition(_.trim.isEmpty)
val (empty, spaces) = emptyOrSpaces.partition(_.isEmpty)
(empty, spaces, words)
}
Which gives for your example :
partitionWords(list2)
// (List(""),List(" "),List(Word, OtherWord))
In general you can use foldLeft with a tuple as accumulator.
def partitionWords2(list: List[String]) = {
val nilString = List.empty[String]
val (empty, spaces, words) = list.foldLeft((nilString, nilString, nilString)) {
case ((empty, spaces, words), elem) =>
elem match {
case s if s.isEmpty => (s :: empty, spaces, words)
case s if s.trim.isEmpty => (empty, s :: spaces, words)
case s => (empty, spaces, s :: words)
}
}
(empty.reverse, spaces.reverse, words.reverse)
}
Which will give you the same result.
A tail recursive method,
def partition(list: List[Any]): (List[Any], List[Any], List[Any]) = {
#annotation.tailrec
def inner(map: Map[String, List[Any]], innerList: List[Any]): Map[String, List[Any]] = innerList match {
case x :: xs => x match {
case s: String => inner(insertValue(map, "str", s), xs)
case c: Char => inner(insertValue(map, "char", c), xs)
case i: Int => inner(insertValue(map, "int", i), xs)
}
case Nil => map
}
def insertValue(map: Map[String, List[Any]], key: String, value: Any) = {
map + (key -> (value :: map.getOrElse(key, Nil)))
}
val partitioned = inner(Map.empty[String, List[Any]], list)
(partitioned.get("str").getOrElse(Nil), partitioned.get("char").getOrElse(Nil), partitioned.get("int").getOrElse(Nil))
}
val list1 = List("A", 2, 'c', 4)
val (strs, chars, ints) = partition(list1)
I wound up with this, based on #Nyavro's answer:
val list2 = List("Word", " ", "", "OtherWord")
val(empties, spaces, words) =
list2.foldRight((List[String](), List[String](), List[String]())) {
case (str, (e, s, w)) if str.isEmpty => (str :: e, s, w)
case (str, (e, s, w)) if str.trim.isEmpty => (e, str :: s, w)
case (str, (e, s, w)) => (e, s, str :: w)
}
==> empties: List[String] = List("")
==> spaces: List[String] = List(" ")
==> words: List[String] = List(Word, OtherWord)
I understand the risks of using foldRight: mainly that in order to start on the right, the runtime needs to recurse and that this may blow the stack on large inputs. However, my inputs are small and this risk is acceptable.
Having said that, if there's a quick way to _.reverse three lists of a tuple that I haven't thought of, I'm all ears.
Thanks all!
Hello currently I have the following string:
"STRING$INTEGER$STRING$STRING"
How can I pattern match that in scala?
Currently I know that I can use .split, but that produces a Array[String] my regex is flawed I could match everything against (.*) but that will handle the second as a String but it's an int, is there a way to have
data match {}
You can indead use a regex, but everything you match will be a String.
val format = """(\w+)\$(\d+)\$(\w+)\$(\w+)""".r
"hello$5$foo$bar" match {
case format(s1, i, s2, s3) => // i is a String
val n = i.toInt
}
You could also create an extractor which could use the regex above or split.
object Format {
def unapply(string: String) = string.split("""\$""") match {
case Array(s1, i, s2, s3) =>
Try(i.toInt).toOption.map(i => (s1, i, s2, s3))
}
}
"hello$5$foo$bar" match {
case Format(s1, i, s2, s3) => i + 5 // i is an Int
}
// Int = 10
Your question is plenty unclear, but I guess you might need something like this?
val string = "STRING$INTEGER$STRING$STRING"
val regex = """(\w+)\$(\w+)\$(\w+)\$(\w+)""".r
string match {
case regex(s1, i, s2, s3) =>
s"$s1, $i, $s2, $s3"
case _ =>
"error"
}
|> res0: String = STRING, INTEGER, STRING, STRING
To separate a string val s = "abc$123$foo$bar" by $ consider
val xs = s.split("\\$")
xs: Array[String] = Array(abc, 123, foo, bar)
To collect all the integer values in the resulting array consider for instance
xs.flatMap( s => scala.util.Try(s.toInt).toOption )
res: Array[Int] = Array(123)
were we try to convert each string into an integer value; in case of failure it delivers a None which is flattened out.