Scala parameters pattern (Spray routing example) - scala

Sorry about the vague title...wasn't sure how to characterize this.
I've seen/used a certain code construction in Scala for some time but I don't know how it works. It looks like this (example from Spray routing):
path( "foo" / Segment / Segment ) { (a,b) => { // <-- What's this style with a,b?
...
}}
In this example, the Segments in the path are bound to a and b respectively inside the associated block. I know how to use this pattern, but how does it work? Why didn't it bind something to "foo"?
I'm not so interested in how spray works for my purpose here, but what facility of Scala is this, and how would I write my own?

This code is from a class that extends Directives. So all methods of Directives are in scope.
PathMatcher
There is no method / in String, so an implicit conversion is used to convert String to PathMatcher0 (PathMatcher[HNil]) with method /.
Method / takes a PathMatcher and returns a PathMatcher.
Segment is a PathMatcher1[String] (PathMatcher[String :: HNil]).
Method / of PathMatcher[HNil] with PathMatcher[String :: HNil] parameter returns a PathMatcher[String :: HNil].
Method / of PathMatcher[String :: HNil] with PathMatcher[String :: HNil] parameter returns a PathMatcher[String :: String :: HNil]. It's black magic from shapeless. See heterogeneous list concatenation; it is worth reading.
Directive
So you are calling method path with PathMatcher[String :: String :: HNil] as a parameter. It returns a Directive[String :: String :: HNil].
Then you are calling method apply on Directive with Function2[?, ?, ?] ((a, b) => ..) as a parameter. There is an appropriate implicit conversion (see black magic) for every Directive[A :: B :: C ...] that creates an object with method apply((a: A, b: B, c: C ...) => Route).
Parsing
PathMatcher contains rules for path parsing. It returns its result as an HList.
The "foo" matcher matches a String and ignores it (returns HNil).
The A / B matcher combines 2 matchers (A and B) separated by a "/" string. It concatenates the results of A and B using HList concatenation.
The Segment matcher matches a path segment and returns it as a String :: HNil.
So "foo" / Segment / Segment matches a path of 3 segments, ignores the first one and returns the remaining segments as String :: String :: HNil.
Then black magic allows you to use Function2[String, String, Route] ((String, String) => Route) to process String :: String :: HNil. Without such magic you would have to use the method like this: {case a :: b :: HNil => ...}.
Black magic
As @AlexIv noted:
There is an implicit conversion pimpApply for every Directive[A :: B :: C ...] that creates an object with method apply((a: A, b: B, c: C ...) => Route).
It accepts ApplyConverter implicitly. Type member In of ApplyConverter represents an appropriate function (A, B, C ...) => Route for every Directive[A :: B :: C ...].
There is no way to create such implicit values without macros or boilerplate code. So sbt-boilerplate is used for ApplyConverter generation. See ApplyConverterInstances.scala.

Senia's answer is helpful in understanding the Spray-routing directives and how they use HLists to do their work. But I get the impression you were really just interested in the Scala constructs used in
path( "foo" / Segment / Segment ) { (a,b) => ... }
It sounds as though you are interpreting this as special Scala syntax that in some way connects those two Segment instances to a and b. That is not the case at all.
path( "foo" / Segment / Segment )
is just an ordinary call to path with a single argument, an expression involving two calls to a / method. Nothing fancy, just an ordinary method invocation.
The result of that call is a function which wants another function -- the thing you want to happen when a matching request comes in -- as an argument. That's what this part is:
{ (a,b) => ... }
It's just a function with two arguments. The first part (the invocation of path) and the second part (what you want done when a matching message is received) are not syntactically connected in any way. They are completely separate to Scala. However, Spray's semantics connects them: the first part creates a function that will call the second part when a matching message is received.
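To make the "two ordinary calls" point concrete, here is a minimal toy sketch. It is not Spray's API: ToyRouting, TwoSegments and this simplified path are hypothetical names made up for illustration, and the "route" is just a function from a path string to an optional response.

```scala
// Toy model only: none of these names come from Spray.
object ToyRouting {
  // The object returned by path(...). It has an apply method that takes a function.
  final class TwoSegments(prefix: String) {
    // Writing directive { (a, b) => ... } is just directive.apply(function).
    def apply(handler: (String, String) => String): String => Option[String] = { p =>
      p.split('/').toList match {
        case `prefix` :: a :: b :: Nil => Some(handler(a, b)) // prefix matched but not passed on
        case _                         => None
      }
    }
  }

  // An ordinary method call: takes the prefix, returns something with apply.
  def path(prefix: String): TwoSegments = new TwoSegments(prefix)

  // First call: path("foo"). Second call: apply with a two-argument function.
  val route: String => Option[String] = path("foo") { (a, b) => s"a=$a, b=$b" }
}
```

Here the types of a and b come from the parameter type of apply, not from any special syntax, which mirrors the explanation above.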

Some additional notes on senia's answer, which is really good.
When you write something like this:
path("foo" / Segment / Segment) { (a,b) => {...} }
you are calling the apply method on Directive, as senia wrote. But there is no apply method in Directive, so Spray uses an implicit conversion to reach the happly method. As you can see, pimpApply is implemented with the typeclass pattern via ApplyConverter, which is defined only for Directive0 by default. As you can see, its companion object extends ApplyConverterInstances, which is generated with the sbt-boilerplate plugin.

Why don't you look into source?
As for me, it could be implemented as follows
method path takes arbitrary type parameter, some pattern-object of that type and a function from that type:
def path[T](pattern:Pattern[T])(function:Function[T, `some other type like unit or any`])
the pattern is constructed with two tricks.
The String is either "pimped" to have a method / or has an implicit conversion to Pattern[Nothing]
the Pattern[T] has method / that constructs another pattern with some new type. The method takes single argument (some ancestor of Segment). I guess — Pattern[T2]:
trait Pattern[T] {
///
def `/`[T2](otherPattern:Pattern[T2]):Pattern[(T,T2)]
}
So the first argument of path determines the constructed type of the pattern as a pair. Thus we get the proper type for the second argument.
The actual matching work is done inside path. I thought it was out of the question's scope.
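The guess above can be turned into a small runnable sketch. This is only an illustration of the idea using nested tuples, not how Spray does it (Spray uses shapeless HLists): Pattern, literal, Segment and path are hypothetical names, and a literal here extracts Unit rather than nothing at all.

```scala
// Hypothetical sketch: illustrative names, not Spray's real API.
object PatternSketch {
  // A pattern consumes leading path segments and yields a value plus the rest.
  trait Pattern[T] { self =>
    def parse(segments: List[String]): Option[(T, List[String])]

    // Combine two patterns; the result type pairs up both extracted values.
    def /[T2](other: Pattern[T2]): Pattern[(T, T2)] = new Pattern[(T, T2)] {
      def parse(segments: List[String]): Option[((T, T2), List[String])] =
        self.parse(segments).flatMap { case (a, rest1) =>
          other.parse(rest1).map { case (b, rest2) => ((a, b), rest2) }
        }
    }
  }

  // Matches one fixed segment and extracts only Unit.
  def literal(s: String): Pattern[Unit] = new Pattern[Unit] {
    def parse(segments: List[String]): Option[(Unit, List[String])] =
      segments match {
        case `s` :: rest => Some(((), rest))
        case _           => None
      }
  }

  // Matches any single segment and extracts it as a String.
  val Segment: Pattern[String] = new Pattern[String] {
    def parse(segments: List[String]): Option[(String, List[String])] =
      segments match {
        case head :: rest => Some((head, rest))
        case Nil          => None
      }
  }

  // The handler's parameter type T is fixed by the pattern in the first list.
  def path[T, R](pattern: Pattern[T])(handler: T => R): String => Option[R] =
    p => pattern.parse(p.split('/').toList).collect {
      case (value, Nil) => handler(value) // succeed only if the whole path was consumed
    }

  val route: String => Option[String] =
    path(literal("foo") / Segment / Segment) { case ((_, a), b) => s"$a-$b" }
}
```

With plain tuples the handler has to destructure ((Unit, String), String). Getting the flat (a, b) => ... form is what the HList concatenation and the generated ApplyConverter instances buy you in the real library.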


Scala pattern matching not working with Option[Seq[String]] [duplicate]

This question already has answers here:
How do I get around type erasure on Scala? Or, why can't I get the type parameter of my collections?
I am new to Scala (2.13.8) and am working on code that uses pattern matching to handle a value in different ways. The code is very simple, like below:
def getOption(o: Option[Any]): Unit = {
o match {
case l: Some[List[String]] => handleListData(l)
case _ => handleData(_)
}
}
getOption(Some(3))
getOption(Some(Seq("5555")))
The result is that handleListData() is invoked for both inputs. Can someone help me understand what's wrong in my code?
As sarveshseri mentioned in the comments, the problem here is caused by type erasure. When you compile this code, scalac issues a warning:
[warn] /Users/tmoore/IdeaProjects/scala-scratch/src/main/scala/PatternMatch.scala:6:15: non-variable type argument List[String] in type pattern Some[List[String]] is unchecked since it is eliminated by erasure
[warn] case l: Some[List[String]] => handleListData(l)
[warn] ^
This is because the values of type parameters are not available at runtime due to erasure, so this case is equivalent to:
case l: Some[_] => handleListData(l.asInstanceOf[Some[List[String]]])
This may fail at runtime due to an automatically-inserted cast in handleListData, depending on how it actually uses its argument.
One thing you can do is take advantage of destructuring in the case pattern in order to do a runtime type check on the content of the Option:
case Some(l: List[_]) => handleListData(l)
This will work with a handleListData with a signature like this:
def handleListData(l: List[_]): Unit
Note that it unwraps the Option, which is most likely more useful than passing it along.
However, it does not check that the List contains strings. To do so would require inspecting each item in the list. The alternative is an unsafe cast, made with the assumption that the list contains strings. This opens up the possibility of runtime exceptions later if the list elements are cast to strings, and are in fact some other type.
This change also reveals a problem with the second case:
case _ => handleData(_)
This does not do what you probably think it does, and issues its own compiler warning:
[warn] /Users/tmoore/IdeaProjects/scala-scratch/src/main/scala/PatternMatch.scala:7:28: a pure expression does nothing in statement position
[warn] case _ => handleData(_)
[warn] ^
What does this mean? It's telling us that this operation has no effect. It does not invoke the handleData method with o as you might think. This is because the _ character has special meaning in Scala, and that meaning depends on the context where it's used.
In the pattern match case _, it is a wildcard that means "match anything without binding the match to a variable". In the expression handleData(_) it is essentially shorthand for x => handleData(x). In other words, when this case is reached, it evaluates to a Function value that would invoke handleData when applied, and then discards that value without invoking it. The result is that any value of o that doesn't match the first case will have no effect, and handleData is never called.
This can be solved by using o in the call:
case _ => handleData(o)
or by assigning a name to the match:
case x => handleData(x)
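The difference between building a function with handleData(_) and actually calling handleData can be observed with a small experiment. This is only an illustrative sketch: the counter and the UnderscoreDemo object are assumptions added for the demonstration, not part of the original code.

```scala
object UnderscoreDemo {
  var calls = 0                          // counts real invocations of handleData
  def handleData(x: Any): Unit = calls += 1

  // handleData(_) is shorthand for x => handleData(x): it only builds a function.
  val f: Any => Unit = handleData(_)
  val callsAfterBuilding: Int = calls    // nothing has been invoked yet

  f(42)                                  // applying the function finally calls handleData
  val callsAfterApplying: Int = calls
}
```

The counter stays at 0 after the function value is created and only becomes 1 once the function is applied, which is why the original second case never called handleData.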
Returning to the original problem: how can you call handleListData only when the argument contains a List[String]? Since the type parameter is erased at runtime, this requires some other kind of runtime type information to differentiate it. A common approach is to define a custom algebraic data type instead of using Option:
object PatternMatch {
sealed trait Data
case class StringListData(l: List[String]) extends Data
case class OtherData(o: Any) extends Data
def handle(o: Data): Unit = {
o match {
case StringListData(l) => handleListData(l)
case x => handleData(x)
}
}
def handleListData(l: List[String]): Unit = println(s"Handling string list data: $l")
def handleData(value: Any): Unit = println(s"Handling data: $value")
def main(args: Array[String]): Unit = {
PatternMatch.handle(OtherData(3))
PatternMatch.handle(StringListData(List("5555", "6666")))
PatternMatch.handle(OtherData(List(7777, 8888)))
PatternMatch.handle(OtherData(List("uh oh!")))
/*
* Output:
* Handling data: OtherData(3)
* Handling string list data: List(5555, 6666)
* Handling data: OtherData(List(7777, 8888))
* Handling data: OtherData(List(uh oh!))
*/
}
}
Note that it's still possible here to create an instance of OtherData that actually contains a List[String], in which case handleData is called instead of handleListData. You would need to be careful not to do this when creating the Data passed to handle. This is the best you can do if you really need to handle Any in the default case. You can also extend this pattern with other special cases by creating new subtypes of Data, including a case object to handle the "empty" case, if needed (similar to None for Option):
case object NoData extends Data
// ...
PatternMatch.handle(NoData) // prints: 'Handling data: NoData'
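If you would rather keep Option[Any], the element-by-element check mentioned earlier could look roughly like this. It is a sketch under assumptions: ErasureCheck and asStringList are hypothetical names, and the check costs a full pass over the list.

```scala
object ErasureCheck {
  // Returns the value typed as List[String] only if it is a List
  // whose elements are all Strings.
  def asStringList(value: Any): Option[List[String]] = value match {
    case l: List[_] if l.forall(_.isInstanceOf[String]) =>
      Some(l.map(_.asInstanceOf[String])) // safe: every element was just checked
    case _ =>
      None
  }
}
```

Note that an empty list passes the check regardless of its static element type, because there is nothing left at runtime to tell an empty List[Int] from an empty List[String].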

Demystifying a function definition

I am new to Scala, and I hope this question is not too basic. I couldn't find the answer to this question on the web (which might be because I don't know the relevant keywords).
I am trying to understand the following definition:
def functionName[T <: AnyRef](name: Symbol)(range: String*)(f: T => String)(implicit tag: ClassTag[T]): DiscreteAttribute[T] = {
val r = ....
new anotherFunctionName[T](name.toString, f, Some(r))
}
First, why is it defined as def functionName[...](...)(...)(...)(...)? Can't we define it as def functionName[...](..., ..., ..., ...)?
Second, how does range: String* differ from range: String?
Third, would it be a problem if implicit tag: ClassTag[T] did not exist?
First, why is it defined as def functionName[...](...)(...)(...)(...)? Can't we define it as def functionName[...](..., ..., ..., ...)?
One good reason to use currying is to support type inference. Consider these two functions:
def pred1[A](x: A, f: A => Boolean): Boolean = f(x)
def pred2[A](x: A)(f: A => Boolean): Boolean = f(x)
Since type information flows from left to right, one parameter list at a time, if you try to call pred1 like this:
pred1(1, x => x > 0)
the type of x => x > 0 cannot be determined yet and you'll get an error:
<console>:22: error: missing parameter type
pred1(1, x => x > 0)
^
To make it work you have to specify argument type of the anonymous function:
pred1(1, (x: Int) => x > 0)
pred2, on the other hand, can be used without specifying the argument type:
pred2(1)(x => x > 0)
or simply:
pred2(1)(_ > 0)
Second, how does range: String* differ from range: String?
It is the syntax for defining repeated parameters, a.k.a. varargs. Ignoring other differences, it can be used only in the last position of a parameter list and is available inside the method as a scala.Seq (here scala.Seq[String]). A typical usage is the apply method of collection types, which allows syntax like SomeDummyCollection(1, 2, 3). For more, see:
What does `:_*` (colon underscore star) do in Scala?
Scala variadic functions and Seq
Is there a difference in Scala between Seq[T] and T*?
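A short sketch of repeated parameters in use. VarargsDemo and joinAll are hypothetical names chosen for the example; the `: _*` ascription is the Scala 2 way to pass an existing Seq where varargs are expected.

```scala
object VarargsDemo {
  // parts is a repeated parameter; inside the method it is a Seq[String].
  def joinAll(sep: String)(parts: String*): String = parts.mkString(sep)

  val fromLiterals: String = joinAll(", ")("a", "b", "c")   // any number of arguments
  val fromSeq: String      = joinAll("-")(Seq("x", "y"): _*) // expand an existing Seq
  val fromNothing: String  = joinAll("-")()                  // zero arguments is allowed
}
```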
Third, would it be a problem if implicit tag: ClassTag[T] did not exist?
As already stated by Aivean, it shouldn't be a problem here. ClassTags are automatically generated by the compiler and should be accessible as long as the class exists. In the general case, if an implicit argument cannot be found, you'll get a compile error:
scala> import scala.concurrent._
import scala.concurrent._
scala> val answer: Future[Int] = Future(42)
<console>:13: error: Cannot find an implicit ExecutionContext. You might pass
an (implicit ec: ExecutionContext) parameter to your method
or import scala.concurrent.ExecutionContext.Implicits.global.
val answer: Future[Int] = Future(42)
Multiple argument lists: this is called "currying", and enables you to call a function with only some of the arguments, yielding a function that takes the rest of the arguments and produces the result type (partial function application). Here is a link to Scala documentation that gives an example of using this. Further, any implicit arguments to a function must be specified together in one argument list, coming after any other argument lists. While defining functions this way is not necessary (apart from any implicit arguments), this style of function definition can sometimes make it clearer how the function is expected to be used, and/or make the syntax for partial application look more natural (f(x) rather than f(x, _)).
Arguments with an asterisk: "varargs". This syntax denotes that rather than a single argument being expected, a variable number of arguments can be passed in, which will be handled as (in this case) a Seq[String]. It is the equivalent of specifying (String... range) in Java.
The implicit ClassTag: this is often needed to ensure proper typing of the function result, where the type (T here) cannot be determined at compile time. Since Scala runs on the JVM, which erases generic type arguments and does not retain them at runtime, this is a work-around used in Scala to ensure information about the type(s) involved is still available at runtime.
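One common place where a ClassTag is genuinely required is creating an Array[T] for a generic T. The following is a sketch with hypothetical names (ClassTagDemo, fill, runtimeName), not the code from the question:

```scala
import scala.reflect.ClassTag

object ClassTagDemo {
  // new Array[T](n) needs the runtime class of T, which the ClassTag supplies.
  def fill[T](n: Int, value: T)(implicit tag: ClassTag[T]): Array[T] = {
    val arr = new Array[T](n)
    for (i <- 0 until n) arr(i) = value
    arr
  }

  // The tag also exposes the erased runtime class directly.
  def runtimeName[T](implicit tag: ClassTag[T]): String = tag.runtimeClass.getName
}
```

At the call site nothing has to be passed explicitly: for a concrete type such as String the compiler supplies the ClassTag itself.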
Check currying: methods may define multiple parameter lists. When a method is called with fewer parameter lists than it declares, this yields a function taking the missing parameter lists as its arguments.
range:String* is the syntax for varargs
An implicit ClassTag (or TypeTag) parameter in Scala is the alternative to a Class<T> clazz parameter in Java. It will always be available if your class is defined in scope. Read more about type tags.
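Supplying only the first parameter list can be sketched as follows. CurryingDemo and scale are hypothetical names; the sketch relies on an expected function type so that the method is turned into a function automatically.

```scala
object CurryingDemo {
  // Two parameter lists: the factor first, then the value to scale.
  def scale(factor: Int)(x: Int): Int = factor * x

  // Only the first list is supplied; the expected type Int => Int
  // makes the compiler build a function for the remaining list.
  val double: Int => Int = scale(2)

  // The same thing happens when a function is expected as an argument.
  val scaled: List[Int] = List(1, 2, 3).map(scale(10))
}
```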

Scala: How can I create a function that allows me to use dot notation when calling it?

I have been confused about this for a while, even despite reading the Scala Style Guide - Method Invocation several times.
I want to be able to call this method
def foldRVL[A,B](l: List[A], z: B)(f: (A, B) => B) = //"Right-Via-Left"
l.reverse.foldLeft(z)((a, b) => f(b, a))
using dot notation like this List(1,2,3).foldRVL(0)(_ + _).
And not like this: foldRVL(List(1,2,3), 0)(_ + _).
Also, sometimes I've seen code that shows methods that actually either takes zero parameters in their signature, or one fewer parameters than I would expect them to take, and still properly take a parameter using dot-notation. How does this work? I ask this because those methods work with dot-notation, so maybe if I wrote something like that I could solve my problem.
For the first part of your question, you probably need to look at implicit classes:
implicit class RichRVLList[A](l:List[A]) {
def foldRVL[B](z: B)(f: (A, B) => B) = //"Right-Via-Left"
l.reverse.foldLeft(z)((a, b) => f(b, a))
}
List(1,2,3).foldRVL(1)(_ + _) // output: res0: Int = 7
You can "enrich" an existing class using an implicit wrapper to "add" new methods.
As for the second part, probably you want implicit parameters. Implicit parameters are deduced from the current scope by type. There are some predefined implicit values, such as Numerics, that were used in the example below:
def product[T](els:TraversableOnce[T])(implicit num:Numeric[T]) = {
els.fold(num.one)((x1, x2) => num.times(x1, x2))
}
product(List(1, 2, 3)) // res1: Int = 6
product(List(1, 2.5, 3)) //res2: Double = 7.5
soong pointed out that I'm actually seeking the 'Pimp my library' pattern, which I then looked up, and implemented like this to handle the method in question:
implicit class BlingList[+A](l: List[A]) {
def foldRVL[B](z: B)(f: (A, B) => B): B = //"Right-Via-Left"
l.reverse.foldLeft(z)((a, b) => f(b, a))
}
As far as I understand, the key element to allowing dot-notation is having a class construct that takes parameters of the type that I want to have 'pimped'. In this case, I have a list, and I want to call foldRVL on the list after I write the list down, like this:
List(something).foldRVL(z)(f: (A, B) => B).
Therefore, I need a class that takes a List[A] parameter for me to be able to write a method like that in the first code snippet.
The implicit keyword is used so that I can add methods to the existing class List without having to create a separate Library of methods. Anytime a List is found prefixed before foldRVL, it will be implicitly converted into a BlingList because the compiler will see a List attached to a method that doesn't exist in class List. It will therefore look for any implicit methods defined in scope that have a foldRVL method and take a List as an argument, and it finds that the implicit class BlingList has the method foldRVL defined and takes a List[A]. Therefore, I can now write:
List(1,2,3).foldRVL(0)(_ + _) // in some IDE's, foldRVL will be underlined to show that
res0: Int = 6 // an implicit conversion is being made
"A Scala 2.10 implicit class example" goes into more depth about this. My favorite pointer from that post is to put all the implicit classes that you expect to be using in your current package and any subpackages inside of a package object, that way you don't have to clutter any of your classes or objects with implicit class definitions, nor do you have to import them. The fact that they are using the same package will automatically import them thanks to the package object.
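Because addition on Int is commutative and associative, the (_ + _) examples cannot tell a right fold from a left fold. A quick way to check a foldRVL is to compare it with foldRight on an operation where order matters. The sketch below uses hypothetical names (FoldSyntax, FoldRVLOps) and keeps the implicit class inside an object so that it has to be imported explicitly.

```scala
object FoldSyntax {
  // Value class wrapper: adds foldRVL to List without allocating a wrapper at runtime.
  implicit class FoldRVLOps[A](private val l: List[A]) extends AnyVal {
    // "Right-Via-Left": reverse first, then fold from the left with flipped arguments.
    def foldRVL[B](z: B)(f: (A, B) => B): B =
      l.reverse.foldLeft(z)((acc, a) => f(a, acc))
  }
}

object FoldSyntaxDemo {
  import FoldSyntax._

  // String concatenation and subtraction are order-sensitive.
  val viaRVL: String       = List("a", "b", "c").foldRVL("")(_ + _)
  val viaFoldRight: String = List("a", "b", "c").foldRight("")(_ + _)
  val subtraction: Int     = List(1, 2, 3).foldRVL(0)(_ - _)
}
```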

Scala: Why does function need type variable in front?

From working on the first problem of the 99 Scala Puzzles I defined my own version of last like so:
def last[A](xs: List[A]): A = xs match {
case x :: Nil => x
case x :: xs => last(xs)
}
My question is: Why is it necessary for last to be directly followed by the type variable, as in last[A]? Why couldn't the compiler just do the right thing if I wrote the function like so:
def last(xs: List[A]): A
.....
(leaving the [A] off the end of last[A]?)
And if the compiler is capable of figuring it out, then what is the rationale for designing the language this way?
A appears 3 times:
last[A]
List[A]
: A (after the argument list)
The 2nd one is needed to specify that the List contains objects of type A. The 3rd one is needed to specify that the function returns an object of type A.
The 1st one is where you actually declare A, so it could be used in the other two places.
You need to write last[A] because A does not exist. Since it does not exist, by declaring it after the name of the function you actually get a chance to define some expectations or constraints for this type.
For example: last[A <: Int] to enforce the fact that A has to be a subtype of Int
Once it's declared, you can use it to define the type of your parameters and your return type.
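As an illustration of declaring a type parameter and then constraining it, here is a small sketch. The names TypeParamDemo, lastOption and longest are hypothetical, and CharSequence is used as the bound only because it has a length method to call.

```scala
object TypeParamDemo {
  // [A] declares the type variable; the parameter and result types then use it.
  def lastOption[A](xs: List[A]): Option[A] = xs match {
    case Nil       => None
    case x :: Nil  => Some(x)
    case _ :: tail => lastOption(tail)
  }

  // The declaration site is also where a bound goes: A must be a CharSequence,
  // so the body is allowed to call length on its elements.
  def longest[A <: CharSequence](xs: List[A]): Option[A] =
    xs.reduceOption[A]((a, b) => if (b.length > a.length) b else a)
}
```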
I got an insight from @Lee's comment:
How would the compiler know that the A in List[A] doesn't refer to an
actual type called A?
To demonstrate to myself that this made sense, I tried substituting the type variable A, with the name of an actual type String, and then passed the function a List[Int], seeing that when last is declared like def last[String](xs: List[String]): String, I was able to pass last a List[Int]:
scala> def last[String](xs: List[String]): String = xs match {
| case x :: Nil => x
| case x :: xs => last(xs)
| }
last: [String](xs: List[String])String
scala> last(List(1,2,3,4))
res7: Int = 4
Therefore proving the identifier String does behave like a type variable, and does not reference the concrete type String.
It would also make debugging more difficult if the compiler just assumed that any identifier not in scope was a type variable. It therefore makes sense to have to declare it at the beginning of the function definition.

Is there any fundamental limitations that stops Scala from implementing pattern matching over functions?

In languages like SML, Erlang and a bunch of others, we may define functions like this:
fun reverse [] = []
| reverse (x :: xs) = reverse xs @ [x];
I know we can write analog in Scala like this (and I know, there are many flaws in the code below):
def reverse[T](lst: List[T]): List[T] = lst match {
case Nil => Nil
case x :: xs => reverse(xs) ++ List(x)
}
But I wonder, if we could write former code in Scala, perhaps with desugaring to the latter.
Is there any fundamental limitations for such syntax being implemented in the future (I mean, really fundamental -- e.g. the way type inference works in scala, or something else, except parser obviously)?
UPD
Here is a snippet of how it could look like:
type T
def reverse(Nil: List[T]) = Nil
def reverse(x :: xs: List[T]): List[T] = reverse(xs) ++ List(x)
It really depends on what you mean by fundamental.
If you are really asking "is there a technical showstopper that would prevent implementing this feature", then I would say the answer is no. You are talking about desugaring, and you are on the right track here. All there is to do is basically stitch several separate cases into one single function, and this can be done as a mere preprocessing step (this only requires syntactic knowledge, no need for semantic knowledge). But for this to even make sense, I would define a few rules:
The function signature is mandatory (in Haskell, for example, this would be optional, but it is always optional whether you are defining the function at once or in several parts). We could try to live without the signature and attempt to extract it from the different parts, but the lack of type information would quickly come back to bite us. A simpler argument is that if we are going to try to infer an implicit signature, we might as well do it for all methods. But the truth is that there are very good reasons to have explicit signatures in Scala, and I can't imagine changing that.
All the parts must be defined within the same scope. To start with, they must be declared in the same file because each source file is compiled separately, and thus a simple preprocessor would not be enough to implement the feature. Second, we still end up with a single method in the end, so it's only natural to have all the parts in the same scope.
Overloading is not possible for such methods (otherwise we would need to repeat the signature for each part just so the preprocessor knows which part belongs to which overload)
Parts are added (stitched) to the generated match in the order they are declared
So here is how it could look like:
def reverse[T](lst: List[T]): List[T] // Exactly like an abstract def (provides the signature)
// .... some unrelated code here...
def reverse(Nil) = Nil
// .... another bit of unrelated code here...
def reverse(x :: xs ) = reverse(xs) ++ List(x)
Which could be trivially transformed into:
def reverse[T](lst: List[T]): List[T] = lst match {
case Nil => Nil
case x :: xs => reverse(xs) ++ List(x)
}
// .... some unrelated code here...
// .... another bit of unrelated code here...
It is easy to see that the above transformation is very mechanical and can be done by just manipulating a source AST (the AST produced by the slightly modified grammar that accepts these new constructs), and transforming it into the target AST (the AST produced by the standard Scala grammar).
Then we can compile the result as usual.
So there you go, with a few simple rules we are able to implement a preprocessor that does all the work to implement this new feature.
If by fundamental you are asking "is there anything that would make this feature out of place" then it can be argued that this does not feel very scala. But more to the point, it does not bring that much to the table. Scala author(s) actually tend toward making the language simpler (as in less built-in features, trying to move some built-in features into libraries) and adding a new syntax that is not really more readable goes against the goal of simplification.
In SML, your code snippet is literally just syntactic sugar (a "derived form" in the terminology of the language spec) for
val rec reverse = fn x =>
case x of [] => []
| x::xs => reverse xs @ [x]
which is very close to the Scala code you show. So, no there is no "fundamental" reason that Scala couldn't provide the same kind of syntax. The main problem is Scala's need for more type annotations, which makes this shorthand syntax far less attractive in general, and probably not worth the while.
Note also that the specific syntax you suggest would not fly well, because there is no way to distinguish one case-by-case function definition from two overloaded functions syntactically. You probably would need some alternative syntax, similar to SML using "|".
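For comparison, Scala already has a form that is close to the SML desugaring above: a block of case clauses used where a function is expected (a pattern-matching anonymous function). The sketch below uses hypothetical names (CaseLambdaDemo, reverseInts) and still needs the explicit type annotations mentioned above.

```scala
object CaseLambdaDemo {
  // A block of case clauses is itself a function literal of the expected type.
  val reverseInts: List[Int] => List[Int] = {
    case Nil     => Nil
    case x :: xs => reverseInts(xs) ++ List(x)
  }

  // Generic variant: a def is needed to introduce the type parameter T.
  def reverse[T]: List[T] => List[T] = {
    case Nil     => Nil
    case x :: xs => reverse[T](xs) ++ List(x)
  }
}
```

This keeps the clause-per-case look without any new syntax, at the price of writing the function type once up front.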
I don't know SML or Erlang, but I know Haskell. It is a language without method overloading. Method overloading combined with such pattern matching could lead to ambiguities. Imagine following code:
def f(x: String) = "String "+x
def f(x: List[_]) = "List "+x
What should it mean? It can mean method overloading, i.e. the method is determined at compile time. It can also mean pattern matching. There would be just an f(x: AnyRef) method that would do the matching.
Scala also has named parameters, which would be probably also broken.
I don't think that Scala is able to offer more simple syntax than you have shown in general. A simpler syntax may IMHO work in some special cases only.
There are at least two problems:
[ and ] are reserved characters because they are used for type arguments. The compiler allows spaces around them, so that would not be an option.
The other problem is that = returns Unit. So the expression after the | would not return any result
The closest I could come up with is this (note that is very specialized towards your example):
// Define a class to hold the values left and right of the | sign
class |[T, S](val left: T, val right: PartialFunction[T, T])
// Create a class that contains the | operator
class OrAssoc[T](left: T) {
def |(right: PartialFunction[T, T]): T | T = new |(left, right)
}
// Add the | to any potential target
implicit def anyToOrAssoc[S](left: S): OrAssoc[S] = new OrAssoc(left)
object fun {
// Use the magic of the update method
def update[T, S](choice: T | S): T => T = { arg =>
if (choice.right.isDefinedAt(arg)) choice.right(arg)
else choice.left
}
}
// Use the above construction to define a new method
val reverse: List[Int] => List[Int] =
fun() = List.empty[Int] | {
case x :: xs => reverse(xs) ++ List(x)
}
// Call the method
reverse(List(3, 2, 1))