Suppose there are 3 strings:
protein, starch, drink
Concatenating those, we could say what is for dinner:
Example:
val protein = "fish"
val starch = "chips"
val drink = "wine"
val dinner = protein + ", " + starch + ", " + drink
But what if something is missing, for example the protein, because my wife couldn't catch anything? Then we would have ", chips, wine" for dinner.
There is a slick way to concatenate the strings to optionally add the commas - I just don't know what it is ;-). Does anyone have a nice idea?
I'm looking for something like:
val dinner = protein +[add a comma if protein is not of length zero] + starch .....
It's just a fun exercise I'm doing, so no sweat if it can't be done in some cool way. The reason I'm trying to do the conditional concatenation in a single assignment is that I use this type of thing a lot in XML, and a nice solution will make things..... nicer.
When you say "it may be absent", this entity's type should be Option[T]. Then,
def dinner(components: List[Option[String]]) = components.flatten mkString ", "
You would invoke it like this:
scala> dinner(None :: Some("chips") :: Some("wine") :: Nil)
res0: String = chips, wine
If you absolutely want to check for empty strings instead,
def dinner(strings: List[String]) = strings filter {_.nonEmpty} mkString ", "
scala> dinner("" :: "chips" :: "wine" :: Nil)
res1: String = chips, wine
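If the inputs arrive as plain strings (text pulled out of XML, say) but you would still rather work with Option, one way to bridge the two approaches is to turn empty strings into None first. This is only a minimal sketch; `DinnerSketch` and `nonEmptyOpt` are names made up for illustration:

```scala
object DinnerSketch {
  // Treat null or empty strings as "absent".
  def nonEmptyOpt(s: String): Option[String] = Option(s).filter(_.nonEmpty)

  // Join only the components that are present.
  def dinner(components: String*): String =
    components.flatMap(s => nonEmptyOpt(s)).mkString(", ")
}
```

With this, `DinnerSketch.dinner("", "chips", "wine")` should give the same "chips, wine" as the versions above.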
You're looking for mkString on collections, maybe.
val protein = "fish"
val starch = "chips"
val drink = "wine"
val complete = List (protein, starch, drink)
val partly = List (protein, starch)
complete.mkString (", ")
partly.mkString (", ")
results in:
res47: String = fish, chips, wine
res48: String = fish, chips
You may even specify a start and end:
scala> partly.mkString ("<<", ", ", ">>")
res49: String = <<fish, chips>>
scala> def concat(ss: String*) = ss filter (_.nonEmpty) mkString ", "
concat: (ss: String*)String
scala> concat("fish", "chips", "wine")
res0: String = fish, chips, wine
scala> concat("", "chips", "wine")
res1: String = chips, wine
This takes care of the case of empty strings and also shows how you could put other logic for filtering and formatting. This will work fine for a List[String] and generalizes to List[Any].
val input = List("fish", "", "chips", 137, 32, 32.0, None, "wine")
val output = input.flatMap{ _ match {
case None => None
case x:String if x.isEmpty => None
case x:String => Some(x)
case _ => None
}}
.mkString(",")
res1: String = fish,chips,wine
The idea is that flatMap takes a List[Any] and uses matching to assign None for any elements that you do not want to keep in the output. The Nones get flattened away and the Somes stay.
If you needed to be able to handle different types (Int, Double, etc) then you could add more cases.
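As a rough illustration of adding more cases, here is a sketch that also keeps Int and Double values. It uses `collect` with a partial function instead of `flatMap` plus `Option`, which is a swapped-in technique rather than the code above; `MixedJoin` is a made-up name:

```scala
object MixedJoin {
  // Keep non-empty strings and numbers; anything else (None, "", ...) is dropped
  // because the partial function is not defined for it.
  def join(input: List[Any]): String =
    input.collect {
      case s: String if s.nonEmpty => s
      case i: Int                  => i.toString
      case d: Double               => d.toString
    }.mkString(",")
}
```

On the JVM, `MixedJoin.join(List("fish", "", "chips", 137, 32, 32.0, None, "wine"))` should yield fish,chips,137,32,32.0,wine.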
println(s"$protein,$starch,$drink")
I have the string below and I want to extract only the List((foo, bar), (foo1, bar1)) part from it. The line has many other characters before and after the part I want to extract.
some garbage string PARAMS=List((foo, bar), (foo1, bar1)) some garbage string
I have tried these regexes:
(?:PARAMS)=(List\((.*?)\))
(?:PARAMS)=(List\(([^)]+)\))
but they give me the output below in group(1):
List((foo, bar)
The regex .*List\((.*)\).* works.
Use a Scala regex together with pattern matching, then split on any of (, ) or the comma, and group the pieces into pairs.
A Scala Regex provides an extractor:
val r = """.*List\((.*)\).*""".r
Pattern match using the regex's extractor:
val result = str match {
case r(value) => value
case _ => ""
}
Then split on any of (, ) or the comma, and group the pieces into pairs:
// Inside a character class "|" is a literal pipe, so it is left out here.
result.split("""[(),]""").filterNot(_.trim.isEmpty)
  .grouped(2)
  .toList
  .map(pair => (pair(0), pair(1)))
Scala REPL
scala> val str = """some garbage string PARAMS=List((foo, bar), (foo1, bar1)) some garbage string"""
str: String = "some garbage string PARAMS=List((foo, bar), (foo1, bar1)) some garbage string"
scala> val r = """.*List\((.*)\).*""".r
r: util.matching.Regex = .*List\((.*)\).*
scala> val result = str match {
case r(value) => value
case _ => ""
}
result: String = "(foo, bar), (foo1, bar1)"
scala> result.split("""[(),]""").filterNot(_.trim.isEmpty).grouped(2).toList.map(pair => (pair(0), pair(1)))
res46: List[(String, String)] = List(("foo", " bar"), ("foo1", " bar1"))
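Note the leading space in " bar" and " bar1" above. If that is unwanted, another option is to match each pair directly instead of splitting. This is just a sketch under the assumption that every element is a word-like token and that the surrounding garbage contains no parenthesised pairs of its own; `PairExtract` is a hypothetical name:

```scala
object PairExtract {
  // One "(left, right)" pair; whitespace after the comma is skipped, not captured.
  private val pair = """\((\w+),\s*(\w+)\)""".r

  // Scans the whole line and returns every pair found, in order.
  def pairs(line: String): List[(String, String)] =
    pair.findAllMatchIn(line).map(m => (m.group(1), m.group(2))).toList
}
```

For the sample line this should return the two pairs (foo, bar) and (foo1, bar1) without the stray spaces.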
In this Scala code I'm trying to analyze a string that contains a sum (such as 12+3+5) and return the result (20). I'm using regex to extract the first digit and parse the trail to be added recursively. My issue is that since the regex returns a String, I cannot add up the numbers. Any ideas?
object TestRecursive extends App {
val plus = """(\w*)\+(\w*)""".r
println(parse("12+3+5"))
def parse(str: String) : String = str match {
// sum
case plus(head, trail) => parse(head) + parse(trail)
case _ => str
}
}
You might want to use the parser combinators for an application like this.
"""(\w*)\+(\w*)""".r also matches "+" or "23+" or "4 +5" // but captures it only in the first group.
What you could do instead is:
scala> val numbers = "[+-]?\\d+"
numbers: String = [+-]?\d+
scala> numbers.r.findAllIn("1+2-3+42").map(_.toInt).reduce(_ + _)
res4: Int = 42
scala> numbers.r.findAllIn("12+3+5").map(_.toInt).reduce(_ + _)
res5: Int = 20
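If you would rather keep the recursive shape from the question, the type problem goes away once parse returns Int and the leaf case converts with toInt. A minimal sketch, assuming the input contains only non-negative integers joined by + with no spaces (`SumSketch` is a made-up name):

```scala
object SumSketch {
  // Leading digits, a plus sign, then the rest of the expression.
  private val plus = """(\d+)\+(.+)""".r

  def parse(str: String): Int = str match {
    case plus(head, rest) => head.toInt + parse(rest)
    case _                => str.toInt // throws NumberFormatException on malformed input
  }
}
```

Here `SumSketch.parse("12+3+5")` evaluates 12 + parse("3+5"), and so on down to the last number.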
I have been playing with Scala's combinators and parsers, and have a question that may be too elementary (apologies if it is). I have written it out in this code to make the question easy to understand and my question is at the end.
import scala.util.parsing.combinator._
// First, I create a case class
case class Human(fname: String, lname: String, age: Int)
// Now, I create a parser object
object SimpleParser extends RegexParsers {
def fname: Parser[String] = """[A-Za-z]+""".r ^^ {_.toString}
def lname: Parser[String] = """[A-Za-z]+""".r ^^ {_.toString}
def age: Parser[Int] = """[1-9][0-9]{0,2}""".r ^^ {_.toInt}
def person: Parser[Human] = fname ~ lname ~ age ^^ {case f ~ l ~ a => Human(f, l, a)}
// Now, I need to read a list of these, not just one.
// How do I do this? Here is what I tried, but can't figure out what goes inside
// the curly braces to the right of ^^
// def group: Parser[List[Human]] = rep(person) ^^ {}
// def group: Parser[List[Human]] = (person)+ ^^ {}
}
// Here is what I did to test things:
val s1 = "Bilbo Baggins 123"
val r = SimpleParser.parseAll(SimpleParser.person, s1)
println("First Name: " + r.get.fname)
println("Last Name: " + r.get.lname)
println("Age: " + r.get.age)
// So that worked well; I could read these things into an object and examine the object,
// and can do things with the object now.
// But how do I read either of these into, say, a List[Human] or some collection?
val s2 = "Bilbo Baggins 123 Frodo Baggins 40 John Doe 22"
val s3 = "Bilbo Baggins 123; Frodo Baggins 40; John Doe 22"
If there is something very obvious I missed please let me know. Thanks!
You were very close. For the space-separated version, rep is all you need:
lazy val people: Parser[List[Human]] = rep(person)
And for the version with semicolons, you can use repsep:
lazy val peopleWithSemicolons: Parser[List[Human]] = repsep(person, ";")
Note that in both cases rep* returns the result you want—there's no need to map over the result with ^^. This is also the case for fname and lname, where the regular expression will be implicitly converted into a Parser[String], which means that mapping _.toString doesn't actually change anything.
When we need to concatenate an array of strings, we can use the mkString method:
val concatenatedString = listOfString.mkString
However, when the list of strings is very long, building the concatenated string may not be a good choice. In that case it would be more appropriate to write to an output stream directly, which is simple:
listOfString.foreach(outstream.write _)
However, I don't know a neat way to append separators. One thing I tried is looping with an index:
var i = 0
for(str <- listOfString) {
if(i != 0) outstream.write(", ")
outstream.write(str)
i += 1
}
This works, but it is too wordy. Although I can write a function that encapsulates the code above, I want to know whether the Scala API already has a function that does the same thing.
Thank you.
Here is a function that does what you want in a slightly more elegant way:
def commaSeparated(list: List[String]): Unit = list match {
case List() =>
case List(a) => print(a)
case h::t => print(h + ", ")
commaSeparated(t)
}
The recursion avoids mutable variables.
To make it even more functional style, you can pass in the function that you want to use on each item, that is:
def commaSeparated(list: List[String], func: String=>Unit): Unit = list match {
case List() =>
case List(a) => func(a)
case h::t => func(h + ", ")
commaSeparated(t, func)
}
And then call it by:
commaSeparated(mylist, outstream.write _)
I believe what you want is the overloaded definitions of mkString.
Definitions of mkString:
scala> val strList = List("hello", "world", "this", "is", "bob")
strList: List[String] = List(hello, world, this, is, bob)
def mkString: String
scala> strList.mkString
res0: String = helloworldthisisbob
def mkString(sep: String): String
scala> strList.mkString(", ")
res1: String = hello, world, this, is, bob
def mkString(start: String, sep: String, end: String): String
scala> strList.mkString("START", ", ", "END")
res2: String = STARThello, world, this, is, bobEND
EDIT
How about this?
scala> strList.view.map(_ + ", ").foreach(print) // or .iterator.map
hello, world, this, is, bob,
Not good for parallelized code, but otherwise:
val it = listOfString.iterator
it.foreach{x => print(x); if (it.hasNext) print(' ')}
Here's another approach which avoids the var
listOfString.zipWithIndex.foreach{ case (s, i) =>
if (i != 0) outstream write ","
outstream write s }
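For comparison, here is a sketch that needs neither an index nor zipWithIndex: write the first element, then write the separator before each remaining one. The question never says what type outstream is, so this assumes a java.io.Writer; `SeparatedWrite` and `writeAll` are names invented for the example:

```scala
import java.io.Writer

object SeparatedWrite {
  // Writes each item to `out`, with `sep` between items but not after the last.
  def writeAll(items: Iterable[String], out: Writer, sep: String = ", "): Unit = {
    val it = items.iterator
    if (it.hasNext) out.write(it.next())
    while (it.hasNext) {
      out.write(sep)
      out.write(it.next())
    }
  }
}
```

Nothing is concatenated up front, so this should behave the same for very long lists; it could be tried with a java.io.StringWriter before pointing it at a real stream.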
Self Answer:
I wrote a function that encapsulates the code in the original question:
implicit def withSeparator[S >: String](seq: Seq[S]) = new {
def withSeparator(write: S => Any, sep: String = ",") = {
var i = 0
for (str <- seq) {
if (i != 0) write(sep)
write(str)
i += 1
}
seq
}
}
You can use it like this:
listOfString.withSeparator(print _)
The separator can also be assigned:
listOfString.withSeparator(print _, ",\n")
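The `new { ... }` above produces a structural type, which Scala calls through reflection. On Scala 2.10 or later, roughly the same extension can be sketched as an implicit class. This is an alternative under that version assumption, not the code above; `SeparatorSyntax` and `SeparatorOps` are hypothetical names:

```scala
object SeparatorSyntax {
  // Adds `withSeparator` to any Seq[String] once SeparatorSyntax._ is imported.
  implicit class SeparatorOps(seq: Seq[String]) {
    def withSeparator(write: String => Any, sep: String = ","): Seq[String] = {
      val it = seq.iterator
      while (it.hasNext) {
        write(it.next())
        if (it.hasNext) write(sep) // separator only between elements
      }
      seq
    }
  }
}
```

After `import SeparatorSyntax._`, the call sites stay the same, for example `listOfString.withSeparator(print _, ",\n")`.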
Thank you to everyone who answered. What I wanted was a concise and not too slow representation. The implicit function withSeparator looks like what I wanted, so I am accepting my own answer to this question. Thank you again.
I'm parsing XML, and keep finding myself writing code like:
val xml = <outertag>
<dog>val1</dog>
<cat>val2</cat>
</outertag>
var cat = ""
var dog = ""
for (inner <- xml \ "_") {
inner match {
case <dog>{ dg @ _* }</dog> => dog = dg(0).toString()
case <cat>{ ct @ _* }</cat> => cat = ct(0).toString()
}
}
/* do something with dog and cat */
It annoys me because I should be able to declare cat and dog as val (immutable), since I only need to set them once, but I have to make them mutable. And besides that it just seems like there must be a better way to do this in scala. Any ideas?
Here are two (now make it three) possible solutions. The first one is pretty quick and dirty. You can run the whole bit in the Scala interpreter.
val xmlData = <outertag>
<dog>val1</dog>
<cat>val2</cat>
</outertag>
// A very simple way to do this mapping.
def simpleGetNodeValue(x:scala.xml.NodeSeq, tag:String) = (x \\ tag).text
val cat = simpleGetNodeValue(xmlData, "cat")
val dog = simpleGetNodeValue(xmlData, "dog")
cat will be "val2", and dog will be "val1".
Note that if either node is not found, an empty string will be returned. You can work around this, or you could write it in a slightly more idiomatic way:
// A more idiomatic Scala way, even though Scala wouldn't give us nulls.
// This returns an Option[String].
def getNodeValue(x:scala.xml.NodeSeq, tag:String) = {
(x \\ tag).text match {
case "" => None
case x:String => Some(x)
}
}
val cat1 = getNodeValue(xmlData, "cat") getOrElse "No cat found."
val dog1 = getNodeValue(xmlData, "dog") getOrElse "No dog found."
val goat = getNodeValue(xmlData, "goat") getOrElse "No goat found."
cat1 will be "val2", dog1 will be "val1", and goat will be "No goat found."
UPDATE: Here's one more convenience method to take a list of tag names and return their matches as a Map[String, String].
// Searches for all tags in the List and returns a Map[String, String].
def getNodeValues(x:scala.xml.NodeSeq, tags:List[String]) = {
// Build a new immutable Map at each step; an immutable Map cannot be updated with a(b) = ...
tags.foldLeft(Map[String, String]()) { (a, b) => a + (b -> simpleGetNodeValue(x, b)) }
}
val tagsToMatch = List("dog", "cat")
val matchedValues = getNodeValues(xmlData, tagsToMatch)
If you run that, matchedValues will be Map(dog -> val1, cat -> val2).
Hope that helps!
UPDATE 2: Per Daniel's suggestion, I'm using the double-backslash operator, which will descend into child elements, which may be better as your XML dataset evolves.
scala> val xml = <outertag><dog>val1</dog><cat>val2</cat></outertag>
xml: scala.xml.Elem = <outertag><dog>val1</dog><cat>val2</cat></outertag>
scala> val cat = xml \\ "cat" text
cat: String = val2
scala> val dog = xml \\ "dog" text
dog: String = val1
Consider wrapping up the XML inspection and pattern matching in a function that returns the multiple values you need as a tuple (Tuple2[String, String]). But stop and consider: it looks like it's possible to not match any dog and cat elements, which would leave you returning null for one or both of the tuple components. Perhaps you could return a tuple of Option[String], or throw if either of the element patterns fails to bind.
In any case, you can generally solve these initialization problems by wrapping up the constituent statements into a function to yield an expression. Once you have an expression in hand, you can initialize a constant with the result of its evaluation.
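To make that concrete without depending on an XML library, here is a sketch in which a list of (tag, text) pairs stands in for the child elements. That stand-in, and the names `InitSketch` and `catAndDog`, are assumptions made for illustration only:

```scala
object InitSketch {
  // Hypothetical stand-in for the parsed children of <outertag>.
  val children: List[(String, String)] = List("dog" -> "val1", "cat" -> "val2")

  // One expression that yields both values; each is None if its tag is absent.
  def catAndDog(nodes: List[(String, String)]): (Option[String], Option[String]) = {
    val byTag = nodes.toMap
    (byTag.get("cat"), byTag.get("dog"))
  }

  // Destructuring the tuple lets both names be immutable vals.
  val (cat, dog) = catAndDog(children)
}
```

The same shape should carry over to real XML by swapping the lookup for whatever inspection you already do on the nodes.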