I would like to pretty-print a Product, such as a case class, so I create the following trait:
trait X extends Product {
def fmtStrs =
productIterator map {
case _ : Double => "%8.2f"
case _ => "%4s"
} map (_ + separator) toSeq
override def toString = {
new StringContext("" +: fmtStrs : _*) f (productIterator.toSeq : _*)
}
}
This uses string interpolation as described in the ScalaDoc for StringContext.
But this won't compile, with this cryptic error:
Error:(69, 70) too many arguments for interpolated string
new StringContext("" +: fmtStrs : _*) f (productIterator.toSeq : _*)
Is this a bug, or limitation of a macro? Note that doing the following works fine, so I suspect this may be related to the variable argument list:
scala> val str2 = StringContext("","%4s,","%8.2f").f(1,23.4)
str2: String = " 1, 23.40"
The reason f is a macro is so that it can give you an error when types of format specifiers and arguments don't match, and this isn't possible to check by looking at ("" +: fmtStrs : _*) and (productIterator.toSeq : _*), so it isn't particularly surprising this doesn't work. The error message could be clearer, so let's see what exactly happens.
If you look at the implementation of f (it took me some time to actually find it, I finally did by searching for the error message), you'll see
c.macroApplication match {
//case q"$_(..$parts).f(..$args)" =>
case Applied(Select(Apply(_, parts), _), _, argss) =>
val args = argss.flatten
def badlyInvoked = (parts.length != args.length + 1) && truly {
def because(s: String) = s"too $s arguments for interpolated string"
val (p, msg) =
if (parts.length == 0) (c.prefix.tree.pos, "there are no parts")
else if (args.length + 1 < parts.length)
(if (args.isEmpty) c.enclosingPosition else args.last.pos, because("few"))
else (args(parts.length-1).pos, because("many"))
c.abort(p, msg)
}
if (badlyInvoked) c.macroApplication else interpolated(parts, args)
With your call you have a single tree in both parts and argss, and parts.length != args.length + 1 is true, so badlyInvoked is true.
s doesn't care what its arguments look like, so it's just a method and your scenario works.
Related
I want to get first argument for main method that is optional, something like this:
val all = args(0) == "all"
However, this would fail with exception if no argument is provided.
Is there any one-liner simple method to set all to false when args[0] is missing; and not doing the common if-no-args-set-false-else... thingy?
In general case you can use lifting:
args.lift(0).map(_ == "all").getOrElse(false)
Or even (thanks to #enzyme):
args.lift(0).contains("all")
You can use headOption and fold (on Option):
val all = args.headOption.fold(false)(_ == "all")
Of course, as #mohit pointed out, map followed by getOrElse will work as well.
If you really need indexed access, you could pimp a get method on any Seq:
implicit class RichIndexedSeq[V, T <% Seq[V]](seq: T) {
def get(i: Int): Option[V] =
if (i < 0 || i >= seq.length) None
else Some(seq(i))
}
However, if this is really about arguments, you'll be probably better off, handling arguments in a fold:
case class MyArgs(n: Int = 1, debug: Boolean = false,
file: Option[String] = None)
val myArgs = args.foldLeft(MyArgs()) {
case (args, "-debug") =>
args.copy(debug = true)
case (args, str) if str.startsWith("-n") =>
args.copy(n = ???) // parse string
case (args, str) if str.startsWith("-f") =>
args.copy(file = Some(???) // parse string
case _ =>
sys.error("Unknown arg")
}
if (myArgs.file.isEmpty)
sys.error("Need file")
You can use foldLeft with initial false value:
val all = (false /: args)(_ | _ == "all")
But be careful, One Liners can be difficult to read.
Something like this will work assuming args(0) returns Some or None:
val all = args(0).map(_ == "all").getOrElse(false)
I'm trying to 'group' a string into segments, I guess this example would explain it more succintly
scala> val str: String = "aaaabbcddeeeeeeffg"
... (do something)
res0: List("aaaa","bb","c","dd","eeeee","ff","g")
I can thnk of a few ways to do this in an imperative style (with vars and stepping through the string to find groups) but I was wondering if any better functional solution could
be attained? I've been looking through the Scala API but there doesn't seem to be something that fits my needs.
Any help would be appreciated
You can split the string recursively with span:
def s(x : String) : List[String] = if(x.size == 0) Nil else {
val (l,r) = x.span(_ == x(0))
l :: s(r)
}
Tail recursive:
#annotation.tailrec def s(x : String, y : List[String] = Nil) : List[String] = {
if(x.size == 0) y.reverse
else {
val (l,r) = x.span(_ == x(0))
s(r, l :: y)
}
}
Seems that all other answers are very concentrated on collection operations. But pure string + regex solution is much simpler:
str split """(?<=(\w))(?!\1)""" toList
In this regex I use positive lookbehind and negative lookahead for the captured char
def group(s: String): List[String] = s match {
case "" => Nil
case s => s.takeWhile(_==s.head) :: group(s.dropWhile(_==s.head))
}
Edit: Tail recursive version:
def group(s: String, result: List[String] = Nil): List[String] = s match {
case "" => result reverse
case s => group(s.dropWhile(_==s.head), s.takeWhile(_==s.head) :: result)
}
can be used just like the other because the second parameter has a default value and thus doesnt have to be supplied.
Make it one-liner:
scala> val str = "aaaabbcddddeeeeefff"
str: java.lang.String = aaaabbcddddeeeeefff
scala> str.groupBy(identity).map(_._2)
res: scala.collection.immutable.Iterable[String] = List(eeeee, fff, aaaa, bb, c, dddd)
UPDATE:
As #Paul mentioned about the order here is updated version:
scala> str.groupBy(identity).toList.sortBy(_._1).map(_._2)
res: List[String] = List(aaaa, bb, c, dddd, eeeee, fff)
You could use some helper functions like this:
val str = "aaaabbcddddeeeeefff"
def zame(chars:List[Char]) = chars.partition(_==chars.head)
def q(chars:List[Char]):List[List[Char]] = chars match {
case Nil => Nil
case rest =>
val (thesame,others) = zame(rest)
thesame :: q(others)
}
q(str.toList) map (_.mkString)
This should do the trick, right? No doubt it can be cleaned up into one-liners even further
A functional* solution using fold:
def group(s : String) : Seq[String] = {
s.tail.foldLeft(Seq(s.head.toString)) { case (carry, elem) =>
if ( carry.last(0) == elem ) {
carry.init :+ (carry.last + elem)
}
else {
carry :+ elem.toString
}
}
}
There is a lot of cost hidden in all those sequence operations performed on strings (via implicit conversion). I guess the real complexity heavily depends on the kind of Seq strings are converted to.
(*) Afaik all/most operations in the collection library depend in iterators, an imho inherently unfunctional concept. But the code looks functional, at least.
Starting Scala 2.13, List is now provided with the unfold builder which can be combined with String::span:
List.unfold("aaaabbaaacdeeffg") {
case "" => None
case rest => Some(rest.span(_ == rest.head))
}
// List[String] = List("aaaa", "bb", "aaa", "c", "d", "ee", "ff", "g")
or alternatively, coupled with Scala 2.13's Option#unless builder:
List.unfold("aaaabbaaacdeeffg") {
rest => Option.unless(rest.isEmpty)(rest.span(_ == rest.head))
}
// List[String] = List("aaaa", "bb", "aaa", "c", "d", "ee", "ff", "g")
Details:
Unfold (def unfold[A, S](init: S)(f: (S) => Option[(A, S)]): List[A]) is based on an internal state (init) which is initialized in our case with "aaaabbaaacdeeffg".
For each iteration, we span (def span(p: (Char) => Boolean): (String, String)) this internal state in order to find the prefix containing the same symbol and produce a (String, String) tuple which contains the prefix and the rest of the string. span is very fortunate in this context as it produces exactly what unfold expects: a tuple containing the next element of the list and the new internal state.
The unfolding stops when the internal state is "" in which case we produce None as expected by unfold to exit.
Edit: Have to read more carefully. Below is no functional code.
Sometimes, a little mutable state helps:
def group(s : String) = {
var tmp = ""
val b = Seq.newBuilder[String]
s.foreach { c =>
if ( tmp != "" && tmp.head != c ) {
b += tmp
tmp = ""
}
tmp += c
}
b += tmp
b.result
}
Runtime O(n) (if segments have at most constant length) and tmp.+= probably creates the most overhead. Use a string builder instead for strict runtime in O(n).
group("aaaabbcddeeeeeeffg")
> Seq[String] = List(aaaa, bb, c, dd, eeeeee, ff, g)
If you want to use scala API you can use the built in function for that:
str.groupBy(c => c).values
Or if you mind it being sorted and in a list:
str.groupBy(c => c).values.toList.sorted
I'm reading lines from a file
for (line <- Source.fromFile("test.txt").getLines) {
....
}
I basically want to get a list of paragraphs in the end. If a line is empty, that starts as a new paragraph, and I might want to parse some keyword - value pairs in the future.
The text file contains a list of entries like this (or something similar, like an Ini file)
User=Hans
Project=Blow up the moon
The slugs are going to eat the mustard. // multiline possible!
They are sneaky bastards, those slugs.
User=....
And I basically want to have a List[Project] where Project looks something like
class Project (val User: String, val Name:String, val Desc: String) {}
And the Description is that big chunk of text that doesn't start with a <keyword>=, but can stretch over any number of lines.
I know how to do this in an iterative style. Just do a list of checks for the keywords, and populate an instance of a class, and add it to a list to return later.
But I think it should be possible to do this in proper functional style, possibly with match case, yield and recursion, resulting in a list of objects that have the fields User, Project and so on. The class used is known, as are all the keywords, and the file format is not set in stone either. I'm mostly trying to learn better functional style.
You're obviously parsing something, so it might be the time to use... a parser!
Since your language seems to treat line breaks as significant, you will need to refer to this question to tell the parser so.
Apart from that, a rather simple implementation would be
import scala.util.parsing.combinator.RegexParsers
case class Project(user: String, name: String, description: String)
object ProjectParser extends RegexParsers {
override val whiteSpace = """[ \t]+""".r
def eol : Parser[String] = """\r?\n""".r
def user: Parser[String] = "User=" ~> """[^\n]*""".r <~ eol
def name: Parser[String] = "Project=" ~> """[^\n]*""".r <~ eol
def description: Parser[String] = repsep("""[^\n]+""".r, eol) ^^ { case l => l.mkString("\n") }
def project: Parser[Project] = user ~ name ~ description ^^ { case a ~ b ~ c => Project(a, b, c) }
def projects: Parser[List[Project]] = repsep(project,eol ~ eol)
}
And how to use it:
val sample = """User=foo1
Project=bar1
desc1
desc2
desc3
User=foo
Project=bar
desc4 desc5 desc6
desc7 desc8 desc9"""
import scala.util.parsing.input._
val reader = new CharSequenceReader(sample)
val res = ProjectParser.parseAll(ProjectParser.projects, reader)
if(res.successful) {
print("Found projects: " + res.get)
} else {
print(res)
}
Another possible implementation (since this parser is rather simple), using recursion:
import scala.io.Source
case class Project(user: String, name: String, desc: String)
#scala.annotation.tailrec
def parse(source: Iterator[String], list: List[Project] = Nil): List[Project] = {
val emptyProject = Project("", "", "")
#scala.annotation.tailrec
def parseProject(project: Option[Project] = None): Option[Project] = {
if(source.hasNext) {
val line = source.next
if(!line.isEmpty) {
val splitted = line.span(_ != '=')
parseProject(splitted match {
case (h, t) if h == "User" => project.orElse(Some(emptyProject)).map(_.copy(user = t.drop(1)))
case (h, t) if h == "Project" => project.orElse(Some(emptyProject)).map(_.copy(name = t.drop(1)))
case _ => project.orElse(Some(emptyProject)).map(project => project.copy(desc = (if(project.desc.isEmpty) "" else project.desc ++ "\n") ++ line))
})
} else project
} else project
}
if(source.hasNext) {
parse(source, parseProject().map(_ :: list).getOrElse(list))
} else list.reverse
}
And the test:
object Test {
def source = Source.fromString("""User=Hans
Project=Blow up the moon
The slugs are going to eat the mustard. // multiline possible!
They are sneaky bastards, those slugs.
User=Plop
Project=SO
Some desc""")
def test = println(parse(source.getLines))
}
Which gives:
List(Project(Hans,Blow up the moon,The slugs are going to eat the mustard. // multiline possible!
They are sneaky bastards, those slugs.), Project(Plop,SO,Some desc))
To answer your question without also tackling keyword parsing, fold over the lines and aggregate lines unless it's an empty one, in which case you start a new empty paragraph.
lines.foldLeft(List("")) { (l, x) =>
if (x.isEmpty) "" :: l else (l.head + "\n" + x) :: l.tail
} reverse
You'll notice this has some wrinkles in how it handles zero lines, and multiple and trailing empty lines. Adapt to your needs. Also if you are anal about string concatenations you can collect them in a nested list and flatten in the end (using .map(_.mkString)), this is just to showcase the basic technique of folding a sequence not to a scalar but to a new sequence.
This builds a list in reverse order because list prepend (::) is more efficient than appending to l in each step.
You're obviously building something, so you might want to try... a builder!
Like Jürgen, my first thought was to fold, where you're accumulating a result.
A mutable.Builder does the accumulation mutably, with a collection.generic.CanBuildFrom to indicate the builder to use to make a target collection from a source collection. You keep the mutable thing around just long enough to get a result. So that's my plug for localized mutability. Lest one assume that the path from List[String] to List[Project] is immutable.
To the other fine answers (the ones with non-negative appreciation ratings), I would add that functional style means functional decomposition, and usually small functions.
If you're not using regex parsers, don't neglect regexes in your pattern matches.
And try to spare the dots. In fact, I believe that tomorrow is a Spare the Dots Day, and people with sensitivity to dots are advised to remain indoors.
case class Project(user: String, name: String, description: String)
trait Sample {
val sample = """
|User=Hans
|Project=Blow up the moon
|The slugs are going to eat the mustard. // multiline possible!
|They are sneaky bastards, those slugs.
|
|User=Bob
|I haven't thought up a project name yet.
|
|User=Greta
|Project=Burn the witch
|It's necessary to escape from the witch before
|we blow up the moon. I hope Hans sees it my way.
|Once we burn the bitch, I mean witch, we can
|wreak whatever havoc pleases us.
|""".stripMargin
}
object Test extends App with Sample {
val kv = "(.*?)=(.*)".r
def nonnully(s: String) = if (s == null) "" else s + " "
val empty = Project(null, null, null)
val (res, dummy) = ((List.empty[Project], empty) /: sample.lines) { (acc, line) =>
val (sofar, cur) = acc
line match {
case kv("User", u) => (sofar, cur copy (user = u))
case kv("Project", n) => (sofar, cur copy (name = n))
case kv(k, _) => sys error s"Bad keyword $k"
case x if x.nonEmpty => (sofar, cur copy (description = s"${nonnully(cur.description)}$x"))
case _ if cur != empty => (cur :: sofar, empty)
case _ => (sofar, empty)
}
}
val ps = if (dummy == empty) res.reverse else (dummy :: res).reverse
Console println ps
}
The match can be mashed this way, too:
val (res, dummy) = ((List.empty[Project], empty) /: sample.lines) {
case ((sofar, cur), kv("User", u)) => (sofar, cur copy (user = u))
case ((sofar, cur), kv("Project", n)) => (sofar, cur copy (name = n))
case ((sofar, cur), kv(k, _)) => sys error s"Bad keyword $k"
case ((sofar, cur), x) if x.nonEmpty => (sofar, cur copy (description = s"${nonnully(cur.description)}$x"))
case ((sofar, cur), _) if cur != empty => (cur :: sofar, empty)
case ((sofar, cur), _) => (sofar, empty)
}
Before the fold, it seemed simpler to do paragraphs first. Is that imperative thinking?
object Test0 extends App with Sample {
def grafs(ss: Iterator[String]): List[List[String]] = {
val (g, rest) = ss dropWhile (_.isEmpty) span (_.nonEmpty)
val others = if (rest.nonEmpty) grafs(rest) else Nil
g.toList :: others
}
def toProject(ss: List[String]): Project = {
var p = Project("", "", "")
for (line <- ss; parts = line split '=') parts match {
case Array("User", u) => p = p.copy(user = u)
case Array("Project", n) => p = p.copy(name = n)
case Array(k, _) => sys error s"Bad keyword $k"
case Array(text) => p = p.copy(description = s"${p.description} $text")
}
p
}
val ps = grafs(sample.lines) map toProject
Console println ps
}
class Project (val User: String, val Name:String, val Desc: String) {}
object Project {
def apply(str: String): Project = {
val user = somehowFetchUserName(str)
val name = somehowFetchProjectName(str)
val desc = somehowFetchDescription(str)
new Project(user, name, desc)
}
}
val contents: Array[String] = Source.fromFile("test.txt").mkString.split("\\n\\n")
val list = contents map(Project(_))
will end up with the list of projects.
Let's say I want to parse a string with various opening and closing brackets (I used parentheses in the title because I believe it is more common -- the question is the same nevertheless) so that I get all the higher levels separated in a list.
Given:
[hello:=[notting],[hill]][3.4(4.56676|5.67787)][the[hill[is[high]]not]]
I want:
List("[hello:=[notting],[hill]]", "[3.4(4.56676|5.67787)]", "[the[hill[is[high]]not]]")
The way I am doing this is by counting the opening and closing brackets and adding to the list whenever I get my counter to 0. However, I have an ugly imperative code. You may assume that the original string is well formed.
My question is: what would be a nice functional approach to this problem?
Notes: I have thought of using the for...yield construct but given the use of the counters I cannot get a simple conditional (I must have conditionals just for updating the counters as well) and I do not know how I could use this construct in this case.
Quick solution using Scala parser combinator library:
import util.parsing.combinator.RegexParsers
object Parser extends RegexParsers {
lazy val t = "[^\\[\\]\\(\\)]+".r
def paren: Parser[String] =
("(" ~ rep1(t | paren) ~ ")" |
"[" ~ rep1(t | paren) ~ "]") ^^ {
case o ~ l ~ c => (o :: l ::: c :: Nil) mkString ""
}
def all = rep(paren)
def apply(s: String) = parseAll(all, s)
}
Checking it in REPL:
scala> Parser("[hello:=[notting],[hill]][3.4(4.56676|5.67787)][the[hill[is[high]]not]]")
res0: Parser.ParseResult[List[String]] = [1.72] parsed: List([hello:=[notting],[hill]], [3.4(4.56676|5.67787)], [the[hill[is[high]]not]])
What about:
def split(input: String): List[String] = {
def loop(pos: Int, ends: List[Int], xs: List[String]): List[String] =
if (pos >= 0)
if ((input charAt pos) == ']') loop(pos-1, pos+1 :: ends, xs)
else if ((input charAt pos) == '[')
if (ends.size == 1) loop(pos-1, Nil, input.substring(pos, ends.head) :: xs)
else loop(pos-1, ends.tail, xs)
else loop(pos-1, ends, xs)
else xs
loop(input.length-1, Nil, Nil)
}
scala> val s1 = "[hello:=[notting],[hill]][3.4(4.56676|5.67787)][the[hill[is[high]]not]]"
s1: String = [hello:=[notting],[hill]][3.4(4.56676|5.67787)][the[hill[is[high]]not]]
scala> val s2 = "[f[sad][add]dir][er][p]"
s2: String = [f[sad][add]dir][er][p]
scala> split(s1) foreach println
[hello:=[notting],[hill]]
[3.4(4.56676|5.67787)]
[the[hill[is[high]]not]]
scala> split(s2) foreach println
[f[sad][add]dir]
[er]
[p]
Given your requirements counting the parenthesis seems perfectly fine. How would you do that in a functional way? You can make the state explicitly passed around.
So first we define our state which accumulates results in blocks or concatenates the next block and keeps track of the depth:
case class Parsed(blocks: Vector[String], block: String, depth: Int)
Then we write a pure function that processed that returns the next state. Hopefully, we can just carefully look at this one function and ensure it's correct.
def nextChar(parsed: Parsed, c: Char): Parsed = {
import parsed._
c match {
case '[' | '(' => parsed.copy(block = block + c,
depth = depth + 1)
case ']' | ')' if depth == 1
=> parsed.copy(blocks = blocks :+ (block + c),
block = "",
depth = depth - 1)
case ']' | ')' => parsed.copy(block = block + c,
depth = depth - 1)
case _ => parsed.copy(block = block + c)
}
}
Then we just used a foldLeft to process the data with an initial state:
val data = "[hello:=[notting],[hill]][3.4(4.56676|5.67787)][the[hill[is[high]]not]]"
val parsed = data.foldLeft(Parsed(Vector(), "", 0))(nextChar)
parsed.blocks foreach println
Which returns:
[hello:=[notting],[hill]]
[3.4(4.56676|5.67787)]
[the[hill[is[high]]not]]
You have an ugly imperative solution, so why not make a good-looking one? :)
This is an imperative translation of huynhjl's solution, but just posting to show that sometimes imperative is concise and perhaps easier to follow.
def parse(s: String) = {
var res = Vector[String]()
var depth = 0
var block = ""
for (c <- s) {
block += c
c match {
case '[' => depth += 1
case ']' => depth -= 1
if (depth == 0) {
res :+= block
block = ""
}
case _ =>
}
}
res
}
Try this:
val s = "[hello:=[notting],[hill]][3.4(4.56676|5.67787)][the[hill[is[high]]not]]"
s.split("]\\[").toList
returns:
List[String](
[hello:=[notting],[hill],
3.4(4.56676|5.67787),
the[hill[is[high]]not]]
)
Is it there an equivalent of Pythons repr function in scala?
Ie a function which you can give any scala object an it will produce a string representation of the object which is valid scala code.
eg:
val l = List(Map(1 -> "a"))
print(repr(l))
Would produce
List(Map(1 -> "a"))
There is mostly only the toString method on every object. (Inherited from Java.) This may or may not result in a parseable representation. In most generic cases it probably won’t; there is no real convention for this as there is in Python but some of the collection classes at least try to. (As long as they are not infinite.)
The point where it breaks down is of course already reached when Strings are involved
"some string".toString == "some string"
however, for a proper representation, one would need
repr("some string") == "\"some string\""
As far as I know there is no such thing in Scala. Some of the serialisation libraries might be of some help for this, though.
Based on the logic at Java equivalent of Python repr()?, I wrote this little function:
object Util {
def repr(s: String): String = {
if (s == null) "null"
else s.toList.map {
case '\0' => "\\0"
case '\t' => "\\t"
case '\n' => "\\n"
case '\r' => "\\r"
case '\"' => "\\\""
case '\\' => "\\\\"
case ch if (' ' <= ch && ch <= '\u007e') => ch.toString
case ch => {
val hex = Integer.toHexString(ch.toInt)
"\\u%s%s".format("0" * (4 - hex.length), hex)
}
}.mkString("\"", "", "\"")
}
}
I've tried it with a few values and it seems to work, though I'm pretty sure sticking in a Unicode character above U+FFFF would cause problems.
If you deal with case classes, you can mix in the following trait StringMaker, so that calling toString on such case classes will work even if their arguments are strings:
trait StringMaker {
override def toString = {
this.getClass.getName + "(" +
this.getClass.getDeclaredFields.map{
field =>
field.setAccessible(true)
val name = field.getName
val value = field.get(this)
value match {
case s: String => "\"" + value + "\"" //Or Util.repr(value) see the other answer
case _ => value.toString
}
}
.reduceLeft{_+", "+_} +
")"
}
}
trait Expression
case class EString(value: String, i: Int) extends Expression with StringMaker
case class EStringBad(value: String, i: Int) extends Expression //w/o StringMaker
val c_good = EString("641", 151)
val c_bad = EStringBad("641", 151)
will result in:
c_good: EString = EString("641", 151)
c_bad: EStringBad = EStringBad(641,151)
So you can parse back the firsst expression, but not the first one.
No, there is no such feature in Scala.