Scala combinator parser to keep original input

I would like to compose a parser from another parser so that the consumed input is available as an argument to the AST construction.
Say I have
def ingredient = amount ~ nameOfIngredient ^^ {
case amount ~ name => Ingredient(name, amount)
}
What I'm looking for is a way to have another parser construct an instance of:
case class RecipeRow(orginalText: String, ingredient: Ingredient)
So I'm looking for a way to retrieve the original consumed input to the parser in a composition. Maybe something like:
def recipeRow = ingredient withConsumedInput ^^ {
case (amount ~ name, consumed) => RecipeRow(consumed, Ingredient(name, amount))
}
I guess the signature in this case would be:
def withConsumedInput [U](p: => Parser[U]): Parser[(U, String)]
Is there a simpler way to get what I want, or do I need to write that myself? It feels like there is probably a better way.

Not easy, actually.
Let's start with Parser: what can it give us? Well, a Parser extends Input => ParseResult, so we have to extract the information from either one.
The type Input is an alias, on RegexParsers anyway, for scala.util.parsing.input.Reader[Char]. There's very little there to help us, unless it happens to be a Reader of CharSequence, in which case we can use source and offset. Let's use that then.
Now, ParseResult has many subclasses, but we are only interested in Success, which has a next: Input field. Using that, we may try this:
def withConsumedInput [U](p: => Parser[U]): Parser[(U, String)] = new Parser[(U, String)] {
def apply(in: Input) = p(in) match {
case Success(result, next) =>
val parsedString = in.source.subSequence(in.offset, next.offset).toString
Success(result -> parsedString, next)
case other: NoSuccess => other
}
}
It will also capture any leading whitespace that the parser skipped, though. You can adapt it to exclude that automatically:
def withConsumedInput [U](p: => Parser[U]): Parser[(U, String)] = new Parser[(U, String)] {
def apply(in: Input) = p(in) match {
case Success(result, next) =>
val parsedString = in.source.subSequence(handleWhiteSpace(in.source, in.offset), next.offset).toString
Success(result -> parsedString, next)
case other: NoSuccess => other
}
}
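For reference, here is a minimal end-to-end sketch of how the second version could be wired into the grammar from the question. The amount and nameOfIngredient parsers and the field types of Ingredient are made up for illustration (the question does not show them), and it assumes the scala-parser-combinators module is on the classpath.

```scala
import scala.util.parsing.combinator.RegexParsers

// Assumed shapes; the question only shows how these are constructed.
case class Ingredient(name: String, amount: Int)
case class RecipeRow(originalText: String, ingredient: Ingredient)

object RecipeParser extends RegexParsers {
  // Hypothetical leaf parsers, just enough to exercise the combinator.
  def amount: Parser[Int] = """\d+""".r ^^ (_.toInt)
  def nameOfIngredient: Parser[String] = """[a-zA-Z]+""".r

  def ingredient: Parser[Ingredient] = amount ~ nameOfIngredient ^^ {
    case amount ~ name => Ingredient(name, amount)
  }

  // Runs p and pairs its result with the text it consumed,
  // excluding any leading whitespace that RegexParsers skipped.
  def withConsumedInput[U](p: => Parser[U]): Parser[(U, String)] = new Parser[(U, String)] {
    def apply(in: Input): ParseResult[(U, String)] = p(in) match {
      case Success(result, next) =>
        val start = handleWhiteSpace(in.source, in.offset)
        val parsedString = in.source.subSequence(start, next.offset).toString
        Success(result -> parsedString, next)
      case other: NoSuccess => other
    }
  }

  def recipeRow: Parser[RecipeRow] = withConsumedInput(ingredient) ^^ {
    case (ing, consumed) => RecipeRow(consumed, ing)
  }
}

val parsed = RecipeParser.parseAll(RecipeParser.recipeRow, "  2 eggs")
// parsed.get == RecipeRow("2 eggs", Ingredient("eggs", 2))
```

Note that this only works when the underlying Reader exposes a meaningful source and offset, as the CharSequence-backed reader used by parseAll on a String does.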

Related

Scala Function Chaining and handle failure

I have many functions in my code defined with return type as Either[Throwable, String] and all of them have one argument of type String. Three representative functions of my code are defined as:
val f1 = (input: String) => {
/* Processing from the input using a library in my actual code returns a Either[Throwable, String] */
if (input == "a") Left(new Exception(input))
else Right("Success")
}
val f2 = (input: String) => {
if (input == "b") Left(new Exception(input))
else Right("Success")
}
val f3 = (input: String) => {
if (input == "c") Left(new Exception(input))
else Right("Success")
}
To chain the function outputs, I'm writing code like:
def x(input: String) = f1(input) match {
case Left(value) => Left(value)
case Right(_) => f2(input) match {
case Left(value) => Left(value)
case Right(_) => f3(input)
}
}
With only three functions this looks short, but my real code has many such matches, so it becomes very long. I would like to avoid this kind of nesting.
I know that Scala has a way to chain functions like f1.andThen(f2).andThen(f3), however the problem is that in each andThen we need to pass the same argument, in this case being input. However I want to chain these functions so that if there is a Left output, it should not go to the next andThen.
I believe this can be simplified using Functional Programming, but I don't know where to start. Is there a way we can achieve this using Scala functional programming?
If you have cats in scope, then all you need to do is this:
import cats.syntax.all._
val functions: List[String => Either[Throwable, Unit]] = List(
// put your functions here.
)
val result: Either[Throwable, Unit] =
functions.traverse_(f => f(input))
Otherwise, you may emulate it using this:
val init: Either[Throwable, Unit] = Right(())
functions.foldLeft(init) {
case (acc, f) =>
acc.flatMap(_ => f(input))
}
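If the number of functions is small and fixed, a plain for-comprehension over Either gives the same short-circuiting without any library. This is only a sketch, assuming Scala 2.12 or later (where Either is right-biased); f1 to f3 repeat the question's definitions with explicit types, and Step, chain and chainAll are names invented here.

```scala
object ChainDemo {
  type Step = String => Either[Throwable, String]

  val f1: Step = input => if (input == "a") Left(new Exception(input)) else Right("Success")
  val f2: Step = input => if (input == "b") Left(new Exception(input)) else Right("Success")
  val f3: Step = input => if (input == "c") Left(new Exception(input)) else Right("Success")

  // Each step runs only if the previous one returned a Right;
  // the first Left is returned as-is and later steps are skipped.
  def chain(input: String): Either[Throwable, String] =
    for {
      _      <- f1(input)
      _      <- f2(input)
      result <- f3(input)
    } yield result

  // The same idea for an arbitrary list of steps, without cats.
  // With an empty list this simply returns Right(input).
  def chainAll(steps: List[Step])(input: String): Either[Throwable, String] =
    steps.foldLeft[Either[Throwable, String]](Right(input)) { (acc, step) =>
      acc.flatMap(_ => step(input))
    }
}
```

For example, ChainDemo.chain("z") yields Right("Success"), while ChainDemo.chain("b") stops at f2 and yields a Left wrapping the exception, without ever calling f3.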

Optional parse from stream with State monad

I'm new to cats. I'm creating State instances to handle deserialisation of types from a byte stream. e.g.
val int: State[Seq[Byte], Int] = State[Seq[Byte], Int] {
case bs if bs.length >= 4 =>
bs.drop(4) -> ByteBuffer.wrap(bs.take(4).toArray).getInt
case _ => throw new EOFException()
}
I have implemented a parser of Option[Int] in terms of the above, like so:
val unit: State[Seq[Byte], Unit] = State[Seq[Byte], Unit](_ -> ())
val optInt: State[Seq[Byte], Option[Int]] = int.flatMap(i =>
if (i == 1) int.map(Some(_)) else unit.map(_ => None)
)
I feel that I've missed a trick here, as the implementation seems too verbose. Can I write this more succinctly? Can I do away with needing to define unit?
I wouldn't say that's too verbose, but I'd make two changes:
1. Replace the conditional with a pattern-matching function.
2. Use State.pure instead of manually creating and transforming State values such as your unit.
val optInt: State[Seq[Byte], Option[Int]] = int.flatMap {
case 1 => int.map(Some(_))
case _ => State.pure(None)
}

Map with different types to String

I am trying to learn some functional programming in Scala.
I have this Map:
val params: Map[String, QueryMap] = Map(
"a" -> SimpleQueryVal("1"),
"b" -> ComplexQueryVal("2", "3")
)
Where QueryMap is (might not be the best approach):
sealed trait QueryMap
case class SimpleQueryVal(value: String) extends QueryMap
case class ComplexQueryVal(values: String*) extends QueryMap
The result I want is a query-parameter string like: ?a=1&b=2&b=3
I tried the following, but my method returns an Iterator[String] even though I use mkString. It also looks ugly, and I am sure there is a much simpler way of doing it.
def paramString(queryMap: Map[String, QueryMap]) = queryMap.keys.map { key =>
val params = queryMap(key) match {
case SimpleQueryVal(x) => "%s=%s".format(key, x)
case complexQuery: ComplexQueryVal => complexQuery.values.map { value =>
"%s=%s".format(key, value)
}
}
val result: String = params match {
case s: String => s + "&"
case s: ArrayBuffer[_] => s.mkString("&")
}
result.mkString
}
I would appreciate any idea that would make me learn something for today. :)
I think the result String can be built in a simpler, more straightforward manner.
def paramString(queryMap: Map[String, QueryMap]): String = queryMap.map{
case (k, sq: SimpleQueryVal) => s"$k=${sq.value}"
case (k, cq: ComplexQueryVal)=> cq.values.map(k + "=" + _).mkString("&")
}.mkString("&")
A little cleaner:
def paramString(queryMap: Map[String, QueryMap]) = queryMap.flatMap {
case (key, SimpleQueryVal(x)) => Seq(s"$key=$x")
case (key, ComplexQueryVal(values @ _*)) => values.map {v =>
s"$key=$v"
}
}.mkString("&")
No need for ArrayBuffer or to repeat the .mkString("&").
Keep in mind that this is only good for learning. If you're actually handling HTTP query string parameters, you need to URL-encode the keys and the values, and there are probably better libraries for that.
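To illustrate the encoding point, here is a rough sketch that runs keys and values through java.net.URLEncoder before joining them. The enc helper is a name invented here, and URLEncoder implements HTML form encoding (a space becomes +), which may or may not be what your server expects, so treat this as a starting point rather than a complete solution.

```scala
import java.net.URLEncoder

object QueryDemo {
  sealed trait QueryMap
  case class SimpleQueryVal(value: String) extends QueryMap
  case class ComplexQueryVal(values: String*) extends QueryMap

  // Hypothetical helper: form-encodes a single key or value as UTF-8.
  private def enc(s: String): String = URLEncoder.encode(s, "UTF-8")

  def paramString(queryMap: Map[String, QueryMap]): String =
    queryMap.toSeq.flatMap {
      case (key, SimpleQueryVal(x)) =>
        Seq(s"${enc(key)}=${enc(x)}")
      case (key, ComplexQueryVal(values @ _*)) =>
        values.map(v => s"${enc(key)}=${enc(v)}")
    }.mkString("&")
}
```

With this, a value such as "x&y z" under the key "q" comes out as q=x%26y+z instead of corrupting the query string.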
Try this:
def paramString(queryMap: Map[String, QueryMap]) = {
val qParams = queryMap.keys.map { key =>
queryMap(key) match {
case SimpleQueryVal(x) => "%s=%s".format(key, x)
case complexQuery: ComplexQueryVal => complexQuery.values.map { value =>
"%s=%s".format(key, value)
}.mkString("&")
}
}
qParams.mkString("&")
}
println(paramString(params))
Here, first you get a Set[String] like a=1 or b=2&b=3. Then you simply do another .mkString("&") to concatenate them all.

Scala check a Sequence of Eithers

I want to update a sequence in Scala. I have this code:
def update(userId: Long): Either[String, Int] = {
Logins.findByUserId(userId) map {
logins: Login => update(login.id,
Seq(NamedParameter("random_date", "prefix-" + logins.randomDate)))
} match {
case sequence : Seq(Nil, Int) => sequence.foldLeft(Right(_) + Right(_))
case _ => Left("error.logins.update")
}
}
Here findByUserId returns a Seq[Logins] and update returns Either[String, Int], where Int is the number of updated rows and String is the description of the error.
What I want to achieve is to return a String if an error happens while updating the list, or an Int with the total number of updated rows.
The code is not working. I think I should do something different in the match, but I don't know how to check whether every element in the Seq of Eithers is a Right value.
If you are open to using Scalaz or Cats, you can use traverse. An example using Scalaz:
import scalaz.std.either._
import scalaz.std.list._
import scalaz.syntax.traverse._
val logins = Seq(1, 2, 3)
val updateRight: Int => Either[String, Int] = Right(_)
val updateLeft: Int => Either[String, Int] = _ => Left("kaboom")
logins.toList.traverseU(updateLeft).map(_.sum) // Left(kaboom)
logins.toList.traverseU(updateRight).map(_.sum) // Right(6)
Traversing over the logins gives us an Either[String, List[Int]]; summing the List then gives the wanted Either[String, Int].
We use toList because there is no Traverse instance for Seq.
traverse is a combination of map and sequence.
We use traverseU instead of traverse because it infers some of the types for us (otherwise we would have had to introduce a type alias or a type lambda).
Because we imported scalaz.std.either._, we can use map directly without using a right projection (.right.map).
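If adding a library is not an option, the behaviour of traverse for this one case can be approximated by hand. The following is only a sketch specialised to List and Either, not Scalaz's actual implementation; traverseEither is a name invented here, and the final .map assumes Scala 2.12 or later (on older versions use .right.map).

```scala
object TraverseDemo {
  // Applies f to each element in order, stopping at the first Left.
  // If every call returns a Right, the results are collected in order.
  def traverseEither[A, E, B](xs: List[A])(f: A => Either[E, B]): Either[E, List[B]] = {
    @annotation.tailrec
    def loop(rest: List[A], acc: List[B]): Either[E, List[B]] = rest match {
      case Nil => Right(acc.reverse)
      case head :: tail =>
        f(head) match {
          case Right(b) => loop(tail, b :: acc)
          case Left(e)  => Left(e) // later elements are never processed
        }
    }
    loop(xs, Nil)
  }

  val logins = List(1, 2, 3)
  val updateRight: Int => Either[String, Int] = Right(_)
  val updateLeft: Int => Either[String, Int] = _ => Left("kaboom")

  val allGood = traverseEither(logins)(updateRight).map(_.sum) // Right(6)
  val failed  = traverseEither(logins)(updateLeft).map(_.sum)  // Left(kaboom)
}
```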
You shouldn't really use a fold if you want to exit early. A better solution would be to recursively iterate over the list, updating and counting successes, then return the error when you encounter one.
Here's a little example function that shows the technique. You would probably want to modify this to do the update on each login instead of just counting.
val noErrors = List[Either[String,Int]](Right(10), Right(12))
val hasError = List[Either[String,Int]](Right(10), Left("oops"), Right(12))
def checkList(l: List[Either[String,Int]], goodCount: Int): Either[String, Int] = {
l match {
case Left(err) :: xs =>
Left(err)
case Right(_) :: xs =>
checkList(xs, (goodCount + 1))
case Nil =>
Right(goodCount)
}
}
val r1 = checkList(noErrors, 0)
val r2 = checkList(hasError, 0)
// r1: Either[String,Int] = Right(2)
// r2: Either[String,Int] = Left(oops)
You want to stop as soon as an update fails, don't you?
That means you want to do your matching inside the map, not outside. Try is actually a more suitable construct for this purpose than Either. Something like this, perhaps:
def update(userId: Long): Either[String, Int] = Try {
Logins.findByUserId(userId) map { login =>
update(login.id, whatever) match {
case Right(x) => x
case Left(s) => throw new Exception(s)
}
}.sum
}
.map[Either[String, Int]] { n => Right(n) }
.recover { case ex => Left(ex.getMessage) }
.get // unwrap the Try; recover has already turned any failure into a Left
BTW, a not-too-widely-known fact about Scala is that a return statement inside a lambda actually returns from the enclosing method. So another, somewhat shorter way to write this would be:
def update(userId: Long): Either[String, Int] =
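Logins.findByUserId(userId).foldLeft[Either[String, Int]](Right(0)) { (sum, login) =>
update(login.id, whatever) match {
case Right(x) => sum.right.map(_ + x) // add this login's row count to the running total
case error @ Left(_) => return error // non-local return: exits update immediately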
}
}
Also, why in the world does findByUserId return a sequence???

Scala type mismatch while trying to pass a function

I need some help figuring out how to reuse a pattern match that I would rather not repeat (if possible). I have searched here and on Google, and experimented with implicits and variance, but with no result so far.
Below are two methods, doSomething and doSomethingElse, that contain the same pattern match on Ids. I would like to reuse the pattern by passing in a function.
This is the initial setup. (The actual implementations of toPath and take2 are not really relevant.)
import java.nio.file.{Paths, Path}
import java.util.UUID
def take2(x: Long): String = {
(x % 100).toString.padTo(2, '0')
}
def take2(u: UUID): String = {
u.toString.take(2)
}
def toPath(x: Long): Path = {
Paths.get(s"$x")
}
def toPath(u: UUID): Path = {
Paths.get(u.toString)
}
case class Ids(id1: Option[Long], id2: Option[UUID])
def doSomething(ids: Ids): String = ids match {
case Ids(_, Some(uuid)) => take2(uuid)
case Ids(Some(long), _) => take2(long)
}
def doSomethingElse(ids: Ids) = ids match {
case Ids(_, Some(uuid)) => toPath(uuid)
case Ids(Some(long), _) => toPath(long)
}
doSomething(Ids(Some(12345L), None))
doSomethingElse(Ids(Some(12345L), None))
What I would like is for something like this to work:
def execute[A](ids: Ids)(f: Any => A): A = ids match {
case Ids(_, Some(uuid)) => f(uuid)
case Ids(Some(long), _) => f(long)
}
def _doSomething(ids: Ids) = execute[String](ids)(take2)
//def _doSomething2(ids: Ids) = execute[Path](ids)(toPath)
The error I get is:
Error: ... type mismatch;
found : (u: java.util.UUID)String <and> (x: Long)String
required: Any => String
def _doSomething(ids: Ids) = execute[String](ids)(take2)
^ ^
How can I make these function types work, please?
My Scala version is 2.11.2.
Worksheet I have been using:
https://github.com/lhohan/scala-pg/blob/0f1416a6c1d3e26d248c0ef2de404bab76ac4e57/src/main/scala/misc/MethodPassing.sc
Any help or pointers are kindly appreciated.
The problem is that you have two different methods that just happen to share the same name, e.g. "take2". When you try to use take2 you certainly aren't providing a function that can handle any argument type (as Any => A demands); you can't even handle the two types you want since they are two different methods!
In your original match statement you don't notice that the two methods are two methods that share the same name because the compiler fills in the correct method based on the argument type. There isn't a feature that says, "plug in the name I supply and then stick in different methods". (Well, you could do it with macros, but that's awfully complicated to avoid a little bit of repetition.)
Now the compiler is smart enough to make a function out of the method you want. So if you wrote
def execute[A](ids: Ids)(f1: UUID => A, f2: Long => A): A = ids match {
case Ids(_, Some(uuid)) => f1(uuid)
case Ids(Some(long), _) => f2(long)
}
you could then write
def doSomething(ids: Ids) = execute[String](ids)(take2, take2)
which would reduce the repetition a bit.
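Put together with the setup from the question, the two-function version might look like the sketch below. The wrapping IdsDemo object and the extra Ids(None, None) case are additions made here (the original match would throw a MatchError for that input); everything else is taken from the question.

```scala
import java.nio.file.{Path, Paths}
import java.util.UUID

object IdsDemo {
  def take2(x: Long): String = (x % 100).toString.padTo(2, '0')
  def take2(u: UUID): String = u.toString.take(2)
  def toPath(x: Long): Path = Paths.get(s"$x")
  def toPath(u: UUID): Path = Paths.get(u.toString)

  case class Ids(id1: Option[Long], id2: Option[UUID])

  // The pattern match lives here once; callers supply one function per id type.
  def execute[A](ids: Ids)(f1: UUID => A, f2: Long => A): A = ids match {
    case Ids(_, Some(uuid)) => f1(uuid)
    case Ids(Some(long), _) => f2(long)
    // Added for this sketch: make the "no id at all" case explicit.
    case Ids(None, None) =>
      throw new IllegalArgumentException("Ids contains neither a UUID nor a Long")
  }

  // The expected function types let the compiler pick the right overload each time.
  def doSomething(ids: Ids): String = execute[String](ids)(take2, take2)
  def doSomethingElse(ids: Ids): Path = execute[Path](ids)(toPath, toPath)
}
```

Here IdsDemo.doSomething(IdsDemo.Ids(Some(12345L), None)) goes through the Long overload and returns "45".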
Alternatively, you could write
import scala.util._
def take2(x: Long): String = (x % 100).toString.padTo(2, '0')
def take2(u: UUID): String = u.toString.take(2)
def take2(ul: Either[UUID, Long]): String = ul match {
case Left(u) => take2(u)
case Right(l) => take2(l)
}
(be sure to use :paste if you try this out in the REPL so all three get defined together).
Then you write
def execute[A](ids: Ids)(f: Either[UUID, Long] => A): A = ids match {
case Ids(_, Some(uuid)) => f(Left(uuid))
case Ids(Some(long), _) => f(Right(long))
}
and the correct one of the three take2s will be used. (There is a runtime penalty associated with the extra boxing of the arguments, but I doubt that this is a performance-critical code path.)
There are other options as well--Shapeless, for instance, provides union types. Or you can do runtime pattern matching and throw an exception if you pass something that is neither UUID nor Long...but that is liable to be a recipe for trouble later.