Can extractors be customized with parameters in the body of a case statement (or anywhere else that an extractor would be used)? - scala

Basically, I would like to be able to build a custom extractor without having to store it in a variable prior to using it.
This isn't a real example of how I would use it, it would more likely be used in the case of a regular expression or some other string pattern like construct, but hopefully it explains what I'm looking for:
def someExtractorBuilder(arg:Boolean) = new {
def unapply(s:String):Option[String] = if(arg) Some(s) else None
}
//I would like to be able to use something like this
val {someExtractorBuilder(true)}(result) = "test"
"test" match {case {someExtractorBuilder(true)}(result) => result }
//instead I would have to do this:
val customExtractor = someExtractorBuilder(true)
val customExtractor(result) = "test"
"test" match {case customExtractor(result) => result}
When just doing a single custom extractor it doesn't make much difference, but if you were building a large list of extractors for a case statement, it could make things more difficult to read by separating all of the extractors from their usage.
I expect that the answer is no you can't do this, but I thought I'd ask around first :D

Parameterising extractors would be cool, but we don't have the resources to implement them right now.

Nope.
8.1.7 Extractor Patterns
An extractor pattern x (p 1 , . . . ,
p n ) where n ≥ 0 is of the same
syntactic form as a constructor
pattern. However, instead of a case
class, the stable identifier x denotes
an object which has a member method
named unapply or unapplySeq that
matches the pattern.

One can customize extractors to certain extent using implicit parameters, like this:
object SomeExtractorBuilder {
def unapply(s: String)(implicit arg: Boolean): Option[String] = if (arg) Some(s) else None
}
implicit val arg: Boolean = true
"x" match {
case SomeExtractorBuilder(result) =>
result
}
Unfortunately this cannot be used when you want to use different variants in one match, as all case statements are in the same scope. Still, it can be useful sometimes.

Late but there is a scalac plugin in one of my lib providing syntax ~(extractorWith(param), bindings):
x match {
case ~(parametrizedExtractor(param)) =>
"no binding"
case ~(parametrizedExtractor(param), (a, b)) =>
s"extracted bindings: $a, $b"
}
https://github.com/cchantep/acolyte/blob/master/scalac-plugin/readme.md

Though what you are asking isn't directly possible,
it is possible to create an extractor returning a contaner that gets evaluated value in the if-part of the case evaluation. In the if part it is possible to provide parameters.
object DateExtractor {
def unapply(in: String): Option[DateExtractor] = Some(new DateExtractor(in));
}
class DateExtractor(input:String){
var value:LocalDate=null;
def apply():LocalDate = value;
def apply(format: String):Boolean={
val formater=DateTimeFormatter.ofPattern(format);
try{
val parsed=formater.parse(input, TemporalQueries.localDate());
value=parsed
true;
} catch {
case e:Throwable=>{
false
}
}
}
}
Usage:
object DateExtractorUsage{
def main(args: Array[String]): Unit = {
"2009-12-31" match {
case DateExtractor(ext) if(ext("dd-MM-yyyy"))=>{
println("Found dd-MM-yyyy date:"+ext())
}
case DateExtractor(ext) if(ext("yyyy-MM-dd"))=>{
println("Found yyyy-MM-dd date:"+ext())
}
case _=>{
println("Unable to parse date")
}
}
}
}
This pattern preserves the PartialFunction nature of the piece of code. I find this useful since I am quite a fan of the collect/collectFirst methods, which take a partial function as a parameter and typically does not leave room for precreating a set of extractors.

Related

Combine multiple extractor objects to use in one match statement

Is it possible to run multiple extractors in one match statement?
object CoolStuff {
def unapply(thing: Thing): Option[SomeInfo] = ...
}
object NeatStuff {
def unapply(thing: Thing): Option[OtherInfo] = ...
}
// is there some syntax similar to this?
thing match {
case t # CoolStuff(someInfo) # NeatStuff(otherInfo) => process(someInfo, otherInfo)
case _ => // neither Cool nor Neat
}
The intent here being that there are two extractors, and I don't have to do something like this:
object CoolNeatStuff {
def unapply(thing: Thing): Option[(SomeInfo, OtherInfo)] = thing match {
case CoolStuff(someInfo) => thing match {
case NeatStuff(otherInfo) => Some(someInfo -> otherInfo)
case _ => None // Cool, but not Neat
case _ => None// neither Cool nor Neat
}
}
Can try
object ~ {
def unapply[T](that: T): Option[(T,T)] = Some(that -> that)
}
def too(t: Thing) = t match {
case CoolStuff(a) ~ NeatStuff(b) => ???
}
I've come up with a very similar solution, but I was a bit too slow, so I didn't post it as an answer. However, since #userunknown asks to explain how it works, I'll dump my similar code here anyway, and add a few comments. Maybe someone finds it a valuable addition to cchantep's minimalistic solution (it looks... calligraphic? for some reason, in a good sense).
So, here is my similar, aesthetically less pleasing proposal:
object && {
def unapply[A](a: A) = Some((a, a))
}
// added some definitions to make your question-code work
type Thing = String
type SomeInfo = String
type OtherInfo = String
object CoolStuff {
def unapply(thing: Thing): Option[SomeInfo] = Some(thing.toLowerCase)
}
object NeatStuff {
def unapply(thing: Thing): Option[OtherInfo] = Some(thing.toUpperCase)
}
def process(a: SomeInfo, b: OtherInfo) = s"[$a, $b]"
val res = "helloworld" match {
case CoolStuff(someInfo) && NeatStuff(otherInfo) =>
process(someInfo, otherInfo)
case _ =>
}
println(res)
This prints
[helloworld, HELLOWORLD]
The idea is that identifiers (in particular, && and ~ in cchantep's code) can be used as infix operators in patterns. Therefore, the match-case
case CoolStuff(someInfo) && NeatStuff(otherInfo) =>
will be desugared into
case &&(CoolStuff(someInfo), NeatStuff(otherInfo)) =>
and then the unapply method method of && will be invoked which simply duplicates its input.
In my code, the duplication is achieved by a straightforward Some((a, a)). In cchantep's code, it is done with fewer parentheses: Some(t -> t). The arrow -> comes from ArrowAssoc, which in turn is provided as an implicit conversion in Predef. This is just a quick way to create pairs, usually used in maps:
Map("hello" -> 42, "world" -> 58)
Another remark: notice that && can be used multiple times:
case Foo(a) && Bar(b) && Baz(c) => ...
So... I don't know whether it's an answer or an extended comment to cchantep's answer, but maybe someone finds it useful.
For those who might miss the details on how this magic actually works, just want to expand the answer by #cchantep anf #Andrey Tyukin (comment section does not allow me to do that).
Running scalac with -Xprint:parser option will give something along those lines (scalac 2.11.12)
def too(t: String) = t match {
case $tilde(CoolStuff((a # _)), NeatStuff((b # _))) => $qmark$qmark$qmark
}
This basically shows you the initial steps compiler does while parsing source into AST.
Important Note here is that the rules why compiler makes this transformation are described in Infix Operation Patterns and Extractor Patterns. In particular, this allows you to use any object as long as it has unapply method, like for example CoolStuff(a) AndAlso NeatStuff(b). In previous answers && and ~ were picked up as also possible but not the only available valid identifiers.
If running scalac with option -Xprint:patmat which is a special phase for translating pattern matching one can see something similar to this
def too(t: String): Nothing = {
case <synthetic> val x1: String = t;
case9(){
<synthetic> val o13: Option[(String, String)] = main.this.~.unapply[String](x1);
if (o13.isEmpty.unary_!)
{
<synthetic> val p3: String = o13.get._1;
<synthetic> val p4: String = o13.get._2;
{
<synthetic> val o12: Option[String] = main.this.CoolStuff.unapply(p3);
if (o12.isEmpty.unary_!)
{
<synthetic> val o11: Option[String] = main.this.NeatStuff.unapply(p4);
if (o11.isEmpty.unary_!)
matchEnd8(scala.this.Predef.???)
Here ~.unapply will be called on input parameter t which will produce Some((t,t)). The tuple values will be extracted into variables p3 and p4. Then, CoolStuff.unapply(p3) will be called and if the result is not None NeatStuff.unapply(p4) will be called and also checked if it is not empty. If both are not empty then according to Variable Patterns a and b will be bound to returned results inside corresponding Some.

Scala: Shorthand for pattern match on type?

Is there a more elegant way of doing the following?
data match {
case e: SomeType => doSomethingWith(e)
case _ =>
}
Looking for something like:
data.ifInstanceOf[SomeType](doSomethingWith)
Do you want an expression or just perform a side-effect? If just a side-effect via "PimpMyLibrary" approach + refection:
import scala.reflect.ClassTag
implicit class AnyOps(data : Any) {
def ifInstanceOf[A : ClassTag](f: A => Unit) : Unit = {
val clzz = implicitly[ClassTag[A]].runtimeClass
if (clzz.isInstance(data)) f(data.asInstanceOf[A])
}
}
You can then try
"abc".ifInstanceOf[String](println)
"def".ifInstanceOf[Integer](println)
1.ifInstanceOf[Integer](println)
2.ifInstanceOf[String](println)
If you want an expression I think the best way to go is to add an additional type Parameter B and return Either[B, Anyref].
I think a collect might work for this. The method takes a partial function, and filters out elements that did not match any case statements:
Option(data) collect {
case element: SomeType => mappingFunction(element)
}
Of course, mappingFunction here could have side effects and return Unit, thereby mimicking a foreach.
If you want to make a new named method, you could make a new method on Any:
implicit class AnyOps(data: Any) {
def forMatch[A](pf: PartialFunction[Any, A]) = pf.lift(data)
}
For some reason, this didn't come up:
import PartialFunction._
condOpt("abc": Any) { case s: String => s.length } // = Some(3)
condOpt((): Any) { case s: String => s.length } // = None
I usually rename it or its sibling cond to when.
You can use asInstanceOf to get your goal:
Option(data)
.filter(_.isInstanceOf[SomeType])
.map(_.asInstanceOf[SomeType])
.map(doSomethingWith)
but I guess it's too verbose.
I think you'll find it difficult to find anything shorter or more precise than "good old"...
if (data.isInstanceOf[SomeType]) doSomething(data)
... unless your case is more complicated than you're showing

Mapping many Eithers to one Either with many

Say I have a monadic function in called processOne defined like this:
def processOne(input: Input): Either[ErrorType, Output] = ...
Given a list of "Inputs", I would like to return a corresponding list of "Outputs" wrapped in an Either:
def processMany(inputs: Seq[Input]): Either[ErrorType, Seq[Output]] = ...
processMany will call processOne for each input it has, however, I would like it to terminate the first time (if any) that processOne returns a Left, and return that Left, otherwise return a Right with a list of the outputs.
My question: what is the best way to implement processMany? Is it possible to accomplish this behavior using a for expression, or is it going to be necessary for me to iterate the list myself recursively?
With Scalaz 7:
def processMany(inputs: Seq[Input]): Either[ErrorType, Seq[Output]] =
inputs.toStream traverseU processOne
Converting inputs to a Stream[Input] takes advantage of the non-strict traverse implementation for Stream, i.e. gives you the short-circuiting behaviour you want.
By the way, you tagged this "monads", but traversal requires only an applicative functor (which, as it happens, is probably defined in terms of the monad for Either). For further reference, see the paper The Essence of the Iterator Pattern, or, for a Scala-based interpretation, Eric Torreborre's blog post on the subject.
The easiest with standard Scala, which doesn't evaluate more than is necessary, would probably be
def processMany(inputs: Seq[Input]): Either[ErrorType, Seq[Output]] = {
Right(inputs.map{ x =>
processOne(x) match {
case Right(r) => r
case Left(l) => return Left(l)
}
})
}
A fold would be more compact, but wouldn't short-circuit when it hit a left (it'd just keep carrying it along while you iterated through the entire input).
For now, I've decided to just solve this using recursion, as I am reluctant to add a dependency to a library (Scalaz).
(Types and names in my application have been changed here in order to appear more generic)
def processMany(inputs: Seq[Input]): Either[ErrorType, Seq[Output]] = {
import scala.annotation.tailrec
#tailrec
def traverse(acc: Vector[Output], inputs: List[Input]): Either[ErrorType, Seq[Output]] = {
inputs match {
case Nil => Right(acc)
case input :: more =>
processOne(input) match {
case Right(output) => traverse(acc :+ output, more)
case Left(e) => Left(e)
}
}
}
traverse(Vector[Output](), inputs.toList)
}

Implementing ifTrue, ifFalse, ifSome, ifNone, etc. in Scala to avoid if(...) and simple pattern matching

In Scala, I have progressively lost my Java/C habit of thinking in a control-flow oriented way, and got used to go ahead and get the object I'm interested in first, and then usually apply something like a match or a map() or foreach() for collections. I like it a lot, since it now feels like a more natural and more to-the-point way of structuring my code.
Little by little, I've wished I could program the same way for conditions; i.e., obtain a Boolean value first, and then match it to do various things. A full-blown match, however, does seem a bit overkill for this task.
Compare:
obj.isSomethingValid match {
case true => doX
case false => doY
}
vs. what I would write with style closer to Java:
if (obj.isSomethingValid)
doX
else
doY
Then I remembered Smalltalk's ifTrue: and ifFalse: messages (and variants thereof). Would it be possible to write something like this in Scala?
obj.isSomethingValid ifTrue doX else doY
with variants:
val v = obj.isSomethingValid ifTrue someVal else someOtherVal
// with side effects
obj.isSomethingValid ifFalse {
numInvalid += 1
println("not valid")
}
Furthermore, could this style be made available to simple, two-state types like Option? I know the more idiomatic way to use Option is to treat it as a collection and call filter(), map(), exists() on it, but often, at the end, I find that I want to perform some doX if it is defined, and some doY if it isn't. Something like:
val ok = resultOpt ifSome { result =>
println("Obtained: " + result)
updateUIWith(result) // returns Boolean
} else {
numInvalid += 1
println("missing end result")
false
}
To me, this (still?) looks better than a full-blown match.
I am providing a base implementation I came up with; general comments on this style/technique and/or better implementations are welcome!
First: we probably cannot reuse else, as it is a keyword, and using the backticks to force it to be seen as an identifier is rather ugly, so I'll use otherwise instead.
Here's an implementation attempt. First, use the pimp-my-library pattern to add ifTrue and ifFalse to Boolean. They are parametrized on the return type R and accept a single by-name parameter, which should be evaluated if the specified condition is realized. But in doing so, we must allow for an otherwise call. So we return a new object called Otherwise0 (why 0 is explained later), which stores a possible intermediate result as a Option[R]. It is defined if the current condition (ifTrue or ifFalse) is realized, and is empty otherwise.
class BooleanWrapper(b: Boolean) {
def ifTrue[R](f: => R) = new Otherwise0[R](if (b) Some(f) else None)
def ifFalse[R](f: => R) = new Otherwise0[R](if (b) None else Some(f))
}
implicit def extendBoolean(b: Boolean): BooleanWrapper = new BooleanWrapper(b)
For now, this works and lets me write
someTest ifTrue {
println("OK")
}
But, without the following otherwise clause, it cannot return a value of type R, of course. So here's the definition of Otherwise0:
class Otherwise0[R](intermediateResult: Option[R]) {
def otherwise[S >: R](f: => S) = intermediateResult.getOrElse(f)
def apply[S >: R](f: => S) = otherwise(f)
}
It evaluates its passed named argument if and only if the intermediate result it got from the preceding ifTrue or ifFalse is undefined, which is exactly what is wanted. The type parametrization [S >: R] has the effect that S is inferred to be the most specific common supertype of the actual type of the named parameters, such that for instance, r in this snippet has an inferred type Fruit:
class Fruit
class Apple extends Fruit
class Orange extends Fruit
val r = someTest ifTrue {
new Apple
} otherwise {
new Orange
}
The apply() alias even allows you to skip the otherwise method name altogether for short chunks of code:
someTest.ifTrue(10).otherwise(3)
// equivalently:
someTest.ifTrue(10)(3)
Finally, here's the corresponding pimp for Option:
class OptionExt[A](option: Option[A]) {
def ifNone[R](f: => R) = new Otherwise1(option match {
case None => Some(f)
case Some(_) => None
}, option.get)
def ifSome[R](f: A => R) = new Otherwise0(option match {
case Some(value) => Some(f(value))
case None => None
})
}
implicit def extendOption[A](opt: Option[A]): OptionExt[A] = new OptionExt[A](opt)
class Otherwise1[R, A1](intermediateResult: Option[R], arg1: => A1) {
def otherwise[S >: R](f: A1 => S) = intermediateResult.getOrElse(f(arg1))
def apply[S >: R](f: A1 => S) = otherwise(f)
}
Note that we now also need Otherwise1 so that we can conveniently passed the unwrapped value not only to the ifSome function argument, but also to the function argument of an otherwise following an ifNone.
You may be looking at the problem too specifically. You would probably be better off with the pipe operator:
class Piping[A](a: A) { def |>[B](f: A => B) = f(a) }
implicit def pipe_everything[A](a: A) = new Piping(a)
Now you can
("fish".length > 5) |> (if (_) println("Hi") else println("Ho"))
which, admittedly, is not quite as elegant as what you're trying to achieve, but it has the great advantage of being amazingly versatile--any time you want to put an argument first (not just with booleans), you can use it.
Also, you already can use options the way you want:
Option("fish").filter(_.length > 5).
map (_ => println("Hi")).
getOrElse(println("Ho"))
Just because these things could take a return value doesn't mean you have to avoid them. It does take a little getting used to the syntax; this may be a valid reason to create your own implicits. But the core functionality is there. (If you do create your own, consider fold[B](f: A => B)(g: => B) instead; once you're used to it the lack of the intervening keyword is actually rather nice.)
Edit: Although the |> notation for pipe is somewhat standard, I actually prefer use as the method name, because then def reuse[B,C](f: A => B)(g: (A,B) => C) = g(a,f(a)) seems more natural.
Why don't just use it like this:
val idiomaticVariable = if (condition) {
firstExpression
} else {
secondExpression
}
?
IMO, its very idiomatic! :)

Scala: short form of pattern matching that returns Boolean

I found myself writing something like this quite often:
a match {
case `b` => // do stuff
case _ => // do nothing
}
Is there a shorter way to check if some value matches a pattern? I mean, in this case I could just write if (a == b) // do stuff, but what if the pattern is more complex? Like when matching against a list or any pattern of arbitrary complexity. I'd like to be able to write something like this:
if (a matches b) // do stuff
I'm relatively new to Scala, so please pardon, if I'm missing something big :)
This is exactly why I wrote these functions, which are apparently impressively obscure since nobody has mentioned them.
scala> import PartialFunction._
import PartialFunction._
scala> cond("abc") { case "def" => true }
res0: Boolean = false
scala> condOpt("abc") { case x if x.length == 3 => x + x }
res1: Option[java.lang.String] = Some(abcabc)
scala> condOpt("abc") { case x if x.length == 4 => x + x }
res2: Option[java.lang.String] = None
The match operator in Scala is most powerful when used in functional style. This means, rather than "doing something" in the case statements, you would return a useful value. Here is an example for an imperative style:
var value:Int = 23
val command:String = ... // we get this from somewhere
command match {
case "duplicate" => value = value * 2
case "negate" => value = -value
case "increment" => value = value + 1
// etc.
case _ => // do nothing
}
println("Result: " + value)
It is very understandable that the "do nothing" above hurts a little, because it seems superflous. However, this is due to the fact that the above is written in imperative style. While constructs like these may sometimes be necessary, in many cases you can refactor your code to functional style:
val value:Int = 23
val command:String = ... // we get this from somewhere
val result:Int = command match {
case "duplicate" => value * 2
case "negate" => -value
case "increment" => value + 1
// etc.
case _ => value
}
println("Result: " + result)
In this case, you use the whole match statement as a value that you can, for example, assign to a variable. And it is also much more obvious that the match statement must return a value in any case; if the last case would be missing, the compiler could not just make something up.
It is a question of taste, but some developers consider this style to be more transparent and easier to handle in more real-world examples. I would bet that the inventors of the Scala programming language had a more functional use in mind for match, and indeed the if statement makes more sense if you only need to decide whether or not a certain action needs to be taken. (On the other hand, you can also use if in the functional way, because it also has a return value...)
This might help:
class Matches(m: Any) {
def matches[R](f: PartialFunction[Any, R]) { if (f.isDefinedAt(m)) f(m) }
}
implicit def any2matches(m: Any) = new Matches(m)
scala> 'c' matches { case x: Int => println("Int") }
scala> 2 matches { case x: Int => println("Int") }
Int
Now, some explanation on the general nature of the problem.
Where may a match happen?
There are three places where pattern matching might happen: val, case and for. The rules for them are:
// throws an exception if it fails
val pattern = value
// filters for pattern, but pattern cannot be "identifier: Type",
// though that can be replaced by "id1 # (id2: Type)" for the same effect
for (pattern <- object providing map/flatMap/filter/withFilter/foreach) ...
// throws an exception if none of the cases match
value match { case ... => ... }
There is, however, another situation where case might appear, which is function and partial function literals. For example:
val f: Any => Unit = { case i: Int => println(i) }
val pf: PartialFunction[Any, Unit] = { case i: Int => println(i) }
Both functions and partial functions will throw an exception if called with an argument that doesn't match any of the case statements. However, partial functions also provide a method called isDefinedAt which can test whether a match can be made or not, as well as a method called lift, which will turn a PartialFunction[T, R] into a Function[T, Option[R]], which means non-matching values will result in None instead of throwing an exception.
What is a match?
A match is a combination of many different tests:
// assign anything to x
case x
// only accepts values of type X
case x: X
// only accepts values matches by pattern
case x # pattern
// only accepts a value equal to the value X (upper case here makes a difference)
case X
// only accepts a value equal to the value of x
case `x`
// only accept a tuple of the same arity
case (x, y, ..., z)
// only accepts if extractor(value) returns true of Some(Seq()) (some empty sequence)
case extractor()
// only accepts if extractor(value) returns Some something
case extractor(x)
// only accepts if extractor(value) returns Some Seq or Tuple of the same arity
case extractor(x, y, ..., z)
// only accepts if extractor(value) returns Some Tuple2 or Some Seq with arity 2
case x extractor y
// accepts if any of the patterns is accepted (patterns may not contain assignable identifiers)
case x | y | ... | z
Now, extractors are the methods unapply or unapplySeq, the first returning Boolean or Option[T], and the second returning Option[Seq[T]], where None means no match is made, and Some(result) will try to match result as described above.
So there are all kinds of syntactic alternatives here, which just aren't possible without the use of one of the three constructions where pattern matches may happen. You may able to emulate some of the features, like value equality and extractors, but not all of them.
Patterns can also be used in for expressions. Your code sample
a match {
case b => // do stuff
case _ => // do nothing
}
can then be expressed as
for(b <- Some(a)) //do stuff
The trick is to wrap a to make it a valid enumerator. E.g. List(a) would also work, but I think Some(a) is closest to your intended meaning.
The best I can come up with is this:
def matches[A](a:A)(f:PartialFunction[A, Unit]) = f.isDefinedAt(a)
if (matches(a){case ... =>}) {
//do stuff
}
This won't win you any style points though.
Kim's answer can be “improved” to better match your requirement:
class AnyWrapper[A](wrapped: A) {
def matches(f: PartialFunction[A, Unit]) = f.isDefinedAt(wrapped)
}
implicit def any2wrapper[A](wrapped: A) = new AnyWrapper(wrapped)
then:
val a = "a" :: Nil
if (a matches { case "a" :: Nil => }) {
println("match")
}
I wouldn't do it, however. The => }) { sequence is really ugly here, and the whole code looks much less clear than a normal match. Plus, you get the compile-time overhead of looking up the implicit conversion, and the run-time overhead of wrapping the match in a PartialFunction (not counting the conflicts you could get with other, already defined matches methods, like the one in String).
To look a little bit better (and be less verbose), you could add this def to AnyWrapper:
def ifMatch(f: PartialFunction[A, Unit]): Unit = if (f.isDefinedAt(wrapped)) f(wrapped)
and use it like this:
a ifMatch { case "a" :: Nil => println("match") }
which saves you your case _ => line, but requires double braces if you want a block instead of a single statement... Not so nice.
Note that this construct is not really in the spirit of functional programming, as it can only be used to execute something that has side effects. We can't easily use it to return a value (therefore the Unit return value), as the function is partial — we'd need a default value, or we could return an Option instance. But here again, we would probably unwrap it with a match, so we'd gain nothing.
Frankly, you're better off getting used to seeing and using those match frequently, and moving away from this kind of imperative-style constructs (following Madoc's nice explanation).