Why doesn't this pattern matching work as expected in Scala? - scala

I'm trying to replicate the powerful pattern matching example that Joshua Suereth presented in his Devoxx 2013 talk titled "How to wield Scala in the trenches". Unfortunately I cannot achieve what he described and I cannot understand what is wrong. Can someone give me a hint at what I'm missing? (My Scala version is 2.10.3)
Please see the self-contained code below:
case class Person(name: String, residence: Seq[Residence])
case class Residence(city: String, country: String)
object LivesIn {
def unapply(p: Person): Option[Seq[String]] =
Some(
for(r <- p.residence)
yield r.city
)
}
class StringSeqContains(value: String) {
def unapply(in: Seq[String]): Boolean =
in contains value
}
object PatternPower extends App {
val people =
Seq(Person("Emre", Seq(Residence("Antwerp", "BE"))),
Person("Ergin", Seq(Residence("Istanbul", "TR"))))
val Istanbul = new StringSeqContains("Istanbul")
// #1 does not work as expected, WHY?
println(
people collect {
case person @ LivesIn(Istanbul) => person
}
)
// #2 works as expected
println(
people collect {
case person @ LivesIn(cities) if cities.contains("Istanbul") => person
}
)
// #3 works as expected
println(
people collect {
case person @ Person(_, res) if res.contains(Residence("Istanbul", "TR")) => person
}
)
}
When I compile and run it I get:
List()
List(Person(Ergin,List(Residence(Istanbul,TR))))
List(Person(Ergin,List(Residence(Istanbul,TR))))
As denoted in the source code, I fail to grasp why the first pattern does not produce the same result as the remaining two pattern matches. Any ideas why?

Your LivesIn extractor requires a Seq for an argument.
The following variation does what you expect:
println(
people collect {
case person @ LivesIn(List("Istanbul")) => person
}
)

After some thinking and Googling, I realized that one should add () to the inner extractor (thanks to The Neophyte's Guide to Scala Part 1: Extractors).
In other words, the following works as expected:
people collect {
case person @ LivesIn(Istanbul()) => person
}
whereas the following code silently, without any complaints, returns List():
people collect {
case person @ LivesIn(Istanbul) => person
}
Unless I'm mistaken in another way (e.g. there is a way to make it work without parentheses), I think technical presenters should be more careful with their code snippets / pseudo-code snippets (so that some of the curious audience will not lose hours of sleep ;-)
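For anyone wondering why the version without parentheses compiles at all: as far as I understand the pattern rules, a capitalized name such as Istanbul inside a pattern is read as a stable identifier pattern, so the extracted Seq[String] is simply compared to the Istanbul object with ==, and unapply is never called. Only Istanbul() is an extractor pattern. Below is a minimal sketch of the two readings side by side, reusing the definitions from the question; the wrapper object and the names viaStableId and viaExtractor are mine, and I am assuming Scala 2.x as in the question.

```scala
object StableIdVsExtractor {
  final case class Residence(city: String, country: String)
  final case class Person(name: String, residence: Seq[Residence])

  object LivesIn {
    def unapply(p: Person): Option[Seq[String]] = Some(p.residence.map(_.city))
  }

  // Boolean extractor: matches when the sequence contains the given value.
  class StringSeqContains(value: String) {
    def unapply(in: Seq[String]): Boolean = in.contains(value)
  }

  val Istanbul = new StringSeqContains("Istanbul")

  val people = Seq(
    Person("Emre", Seq(Residence("Antwerp", "BE"))),
    Person("Ergin", Seq(Residence("Istanbul", "TR")))
  )

  // Stable identifier pattern: the extracted Seq[String] is compared with the
  // Istanbul object using ==, which is never true here, so nothing is collected.
  val viaStableId: Seq[Person] =
    people.collect { case p @ LivesIn(Istanbul) => p }

  // Extractor pattern: the parentheses make the compiler call Istanbul.unapply.
  val viaExtractor: Seq[Person] =
    people.collect { case p @ LivesIn(Istanbul()) => p }
}
```

With these definitions, viaStableId is empty while viaExtractor contains only the Person named Ergin, which matches the output reported in the question.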

Related

Pattern matching json lines using Circe and filtering based upon decoded case class value

I have a very large file of json lines, which I intend to read into a list of case classes. Due to the size of the file, rather than reading the entire file into a variable first and then filtering, I would like to filter within the json decoding pattern matching. Currently the code looks like this:
import io.circe.Decoder
import io.circe.generic.semiauto.deriveDecoder
import io.circe.parser.decode
case class Person(name: String, age: Int, country: String)
val personList: List[Person] =
Source.fromResource("Persons.json").getLines.toList.map { line =>
implicit val jsonDecoder: Decoder[Person] = deriveDecoder[Person]
val decoded = decode[Person](line)
decoded match {
case Right(decodedJson) =>
Person(
decodedJson.name,
decodedJson.age,
decodedJson.country
)
case Left(ex) => throw new RuntimeException(ex)
}
}
However, if I wanted to include only Person instances with a country of "us", what would be the best way to accomplish this? Should I use nested pattern matching that specifically looks for Person(_, _, "us") (I'm not sure how I would accomplish this), or is there some way I can implement Option handling?
You could do something like this:
import io.circe.Decoder
import io.circe.generic.semiauto.deriveDecoder
import io.circe.parser.decode
import scala.io.Source
case class Person(name: String, age: Int, country: String)
implicit val jsonDecoder: Decoder[Person] = deriveDecoder[Person]
val personList: List[Person] =
Source
.fromResource("Persons.json")
.getLines
.flatMap { line =>
val decoded = decode[Person](line)
decoded match {
case Right(person @ Person(_, _, "us")) => Some(person)
case Right(_) => None
case Left(ex) =>
println(s"couldn't decode: $line, will skip (error: ${ex.getMessage})")
None
}
}
.toList
println(s"US people: $personList")
A few things to note:
I moved the .toList to the end. In your implementation, you called it right after .getLines, which loses the laziness of the whole thing. Assuming there are only a few US people out of a huge number of people in the JSON file, this can be beneficial for performance and efficiency.
Wrapping each iteration's result in an Option, along with flatMap over the original Iterator we're running on, is very helpful for this kind of collection filtering.
I didn't throw an exception upon an error, but rather logged it and moved on with a None. You could also accumulate errors and do whatever you want with them after all iterations are done, if that's helpful to you.
The @ in person @ Person(_, _, "us") can be used for something like "match & bind" on the whole object in question.
As the comment to the original question noted - no need to re-instantiate the implicit Decoder upon each iteration. You can just pull it one layer up, as I did in my example.
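To illustrate the "accumulate errors" idea from the notes above, here is a minimal sketch that uses only the standard library. The parse function is a hypothetical stand-in for circe's decode[Person] (it reads "name,age,country" lines instead of JSON), and the names AccumulateErrors and usPeople are made up for this example; the fold itself would look the same with a real decoder.

```scala
object AccumulateErrors {
  final case class Person(name: String, age: Int, country: String)

  // Hypothetical stand-in for decode[Person]: parses "name,age,country".
  def parse(line: String): Either[String, Person] =
    line.split(",").map(_.trim) match {
      case Array(name, age, country) if age.nonEmpty && age.forall(_.isDigit) =>
        Right(Person(name, age.toInt, country))
      case _ =>
        Left(s"couldn't decode: $line")
    }

  // Walks the iterator once, keeping US people and collecting error messages.
  def usPeople(lines: Iterator[String]): (List[String], List[Person]) = {
    val (errors, people) =
      lines.foldLeft((List.empty[String], List.empty[Person])) {
        case ((errs, ok), line) =>
          parse(line) match {
            case Right(p @ Person(_, _, "us")) => (errs, p :: ok)
            case Right(_)                      => (errs, ok)
            case Left(err)                     => (err :: errs, ok)
          }
      }
    // Both lists were built by prepending, so restore the input order.
    (errors.reverse, people.reverse)
  }
}
```

Calling AccumulateErrors.usPeople(Iterator("Ann,30,us", "garbage", "Bob,41,be")) returns the error list and the matching people as a pair, so the caller can decide what to do with the failures after the whole file has been read.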

Combine multiple extractor objects to use in one match statement

Is it possible to run multiple extractors in one match statement?
object CoolStuff {
def unapply(thing: Thing): Option[SomeInfo] = ...
}
object NeatStuff {
def unapply(thing: Thing): Option[OtherInfo] = ...
}
// is there some syntax similar to this?
thing match {
case t @ CoolStuff(someInfo) @ NeatStuff(otherInfo) => process(someInfo, otherInfo)
case _ => // neither Cool nor Neat
}
The intent here being that there are two extractors, and I don't have to do something like this:
object CoolNeatStuff {
def unapply(thing: Thing): Option[(SomeInfo, OtherInfo)] = thing match {
case CoolStuff(someInfo) => thing match {
case NeatStuff(otherInfo) => Some(someInfo -> otherInfo)
case _ => None // Cool, but not Neat
}
case _ => None // neither Cool nor Neat
}
}
You can try:
object ~ {
def unapply[T](that: T): Option[(T,T)] = Some(that -> that)
}
def too(t: Thing) = t match {
case CoolStuff(a) ~ NeatStuff(b) => ???
}
I've come up with a very similar solution, but I was a bit too slow, so I didn't post it as an answer. However, since @userunknown asks to explain how it works, I'll dump my similar code here anyway, and add a few comments. Maybe someone finds it a valuable addition to cchantep's minimalistic solution (it looks... calligraphic? for some reason, in a good sense).
So, here is my similar, aesthetically less pleasing proposal:
object && {
def unapply[A](a: A) = Some((a, a))
}
// added some definitions to make your question-code work
type Thing = String
type SomeInfo = String
type OtherInfo = String
object CoolStuff {
def unapply(thing: Thing): Option[SomeInfo] = Some(thing.toLowerCase)
}
object NeatStuff {
def unapply(thing: Thing): Option[OtherInfo] = Some(thing.toUpperCase)
}
def process(a: SomeInfo, b: OtherInfo) = s"[$a, $b]"
val res = "helloworld" match {
case CoolStuff(someInfo) && NeatStuff(otherInfo) =>
process(someInfo, otherInfo)
case _ =>
}
println(res)
This prints
[helloworld, HELLOWORLD]
The idea is that identifiers (in particular, && and ~ in cchantep's code) can be used as infix operators in patterns. Therefore, the match-case
case CoolStuff(someInfo) && NeatStuff(otherInfo) =>
will be desugared into
case &&(CoolStuff(someInfo), NeatStuff(otherInfo)) =>
and then the unapply method of && will be invoked, which simply duplicates its input.
In my code, the duplication is achieved by a straightforward Some((a, a)). In cchantep's code, it is done with fewer parentheses: Some(that -> that). The arrow -> comes from ArrowAssoc, which in turn is provided as an implicit conversion in Predef. This is just a quick way to create pairs, usually used in maps:
Map("hello" -> 42, "world" -> 58)
Another remark: notice that && can be used multiple times:
case Foo(a) && Bar(b) && Baz(c) => ...
So... I don't know whether it's an answer or an extended comment to cchantep's answer, but maybe someone finds it useful.
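To make the remark about chaining concrete, here is a small sketch with three extractors combined in one case. The extractors Even, Positive and Small and the function describe are hypothetical and exist only for this illustration. Since infix patterns with the same operator group to the left, the pattern below should be read as &&(&&(Even(half), Positive(p)), Small(room)).

```scala
object ChainedExtractors {
  // Duplicates its input so that both sides of the pattern see the same value.
  object && {
    def unapply[A](a: A): Option[(A, A)] = Some((a, a))
  }

  // Hypothetical extractors, each with its own condition and extracted value.
  object Even {
    def unapply(n: Int): Option[Int] = if (n % 2 == 0) Some(n / 2) else None
  }
  object Positive {
    def unapply(n: Int): Option[Int] = if (n > 0) Some(n) else None
  }
  object Small {
    def unapply(n: Int): Option[Int] = if (n < 100) Some(100 - n) else None
  }

  def describe(n: Int): String = n match {
    case Even(half) && Positive(p) && Small(room) =>
      s"half=$half, value=$p, room=$room"
    case _ =>
      "no match"
  }
}
```

For example, ChainedExtractors.describe(10) yields "half=5, value=10, room=90", while describe(7) and describe(-4) fall through to "no match" because one of the three extractors returns None.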
For those who might miss the details of how this magic actually works, I just want to expand on the answers by @cchantep and @Andrey Tyukin (the comment section does not allow me to do that).
Running scalac with the -Xprint:parser option gives something along these lines (scalac 2.11.12):
def too(t: String) = t match {
case $tilde(CoolStuff((a @ _)), NeatStuff((b @ _))) => $qmark$qmark$qmark
}
This basically shows the initial steps the compiler performs while parsing the source into an AST.
An important note: the rules by which the compiler makes this transformation are described in Infix Operation Patterns and Extractor Patterns. In particular, this allows you to use any object as long as it has an unapply method, for example CoolStuff(a) AndAlso NeatStuff(b). In the previous answers, && and ~ were picked as possible, but not the only, valid identifiers.
Running scalac with the option -Xprint:patmat, which prints the phase that translates pattern matching, shows something similar to this:
def too(t: String): Nothing = {
case <synthetic> val x1: String = t;
case9(){
<synthetic> val o13: Option[(String, String)] = main.this.~.unapply[String](x1);
if (o13.isEmpty.unary_!)
{
<synthetic> val p3: String = o13.get._1;
<synthetic> val p4: String = o13.get._2;
{
<synthetic> val o12: Option[String] = main.this.CoolStuff.unapply(p3);
if (o12.isEmpty.unary_!)
{
<synthetic> val o11: Option[String] = main.this.NeatStuff.unapply(p4);
if (o11.isEmpty.unary_!)
matchEnd8(scala.this.Predef.???)
Here ~.unapply is called on the input parameter t and produces Some((t, t)). The tuple values are extracted into the variables p3 and p4. Then CoolStuff.unapply(p3) is called and, if the result is not None, NeatStuff.unapply(p4) is called and checked for emptiness as well. If both are non-empty then, according to Variable Patterns, a and b are bound to the values inside the corresponding Some.
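The sequence of calls described above can also be written out by hand, which may help to see that nothing magical is going on. The sketch below uses the alphanumeric name AndAlso mentioned earlier instead of ~, and compares the infix pattern with a for-comprehension that performs the same three unapply calls in the same order. The names viaPattern and byHand are mine, the code targets Scala 2, and this is only an illustration of the idea, not the code the compiler actually generates.

```scala
object DesugaredMatch {
  // Any identifier with an unapply method can be used as an infix pattern.
  object AndAlso {
    def unapply[T](that: T): Option[(T, T)] = Some(that -> that)
  }
  object CoolStuff {
    def unapply(s: String): Option[String] = Some(s.toLowerCase)
  }
  object NeatStuff {
    def unapply(s: String): Option[String] = Some(s.toUpperCase)
  }

  // The infix pattern form.
  def viaPattern(t: String): Option[(String, String)] = t match {
    case CoolStuff(a) AndAlso NeatStuff(b) => Some((a, b))
    case _                                 => None
  }

  // The same steps spelled out: duplicate the input, then run each extractor
  // on its half of the pair, stopping at the first None.
  def byHand(t: String): Option[(String, String)] =
    for {
      pair <- AndAlso.unapply(t)
      a    <- CoolStuff.unapply(pair._1)
      b    <- NeatStuff.unapply(pair._2)
    } yield (a, b)
}
```

Both functions return Some(("hello", "HELLO")) for the input "Hello".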

How to modify this nested case classes with "Seq" fields?

I have some nested case classes, where the field addresses is a Seq[Address]:
// ... means other fields
case class Street(name: String, ...)
case class Address(street: Street, ...)
case class Company(addresses: Seq[Address], ...)
case class Employee(company: Company, ...)
I have an employee:
val employee = Employee(Company(Seq(
Address(Street("aaa street")),
Address(Street("bbb street")),
Address(Street("bpp street")))))
It has 3 addresses.
I want to capitalize only the street names that start with "b". My code is a mess, like the following:
val modified = employee.copy(company = employee.company.copy(addresses =
employee.company.addresses.map { address =>
address.copy(street = address.street.copy(name = {
if (address.street.name.startsWith("b")) {
address.street.name.capitalize
} else {
address.street.name
}
}))
}))
The modified employee is then:
Employee(Company(List(
Address(Street(aaa street)),
Address(Street(Bbb street)),
Address(Street(Bpp street)))))
I'm looking for a way to improve it, and can't find one. Even tried Monocle, but can't apply it to this problem.
Is there any way to make it better?
PS: there are two key requirements:
use only immutable data
don't lose other existing fields
As Peter Neyens points out, Shapeless's SYB works really nicely here, but it will modify all Street values in the tree, which may not always be what you want. If you need more control over the path, Monocle can help:
import monocle.Traversal
import monocle.function.all._, monocle.macros._, monocle.std.list._
val employeeStreetNameLens: Traversal[Employee, String] =
GenLens[Employee](_.company).composeTraversal(
GenLens[Company](_.addresses)
.composeTraversal(each)
.composeLens(GenLens[Address](_.street))
.composeLens(GenLens[Street](_.name))
)
val capitalizer = employeeStreetNameLens.modify {
case s if s.startsWith("b") => s.capitalize
case s => s
}
As Julien Truffaut points out in an edit, you can make this even more concise (but less general) by creating a lens all the way to the first character of the street name:
import monocle.std.string._
val employeeStreetNameFirstLens: Traversal[Employee, Char] =
GenLens[Employee](_.company.addresses)
.composeTraversal(each)
.composeLens(GenLens[Address](_.street.name))
.composeOptional(headOption)
val capitalizer = employeeStreetNameFirstLens.modify {
case 'b' => 'B'
case s => s
}
There are symbolic operators that would make the definitions above a little more concise, but I prefer the non-symbolic versions.
And then (with the result reformatted for clarity):
scala> capitalizer(employee)
res3: Employee = Employee(
Company(
List(
Address(Street(aaa street)),
Address(Street(Bbb street)),
Address(Street(Bpp street))
)
)
)
Note that as in the Shapeless answer, you'll need to change your Employee definition to use List instead of Seq, or if you don't want to change your model, you could build that transformation into the Lens with an Iso[Seq[A], List[A]].
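For intuition only, what the composed traversal does can be sketched without any library as a chain of small "modify" helpers, one per level of nesting. This is not how Monocle is implemented, and all helper names below (streetName, addressStreet, companyAddresses, employeeCompany, modifyStreetNames) are made up for this sketch. It keeps Seq in the model, since a plain map works on any Seq.

```scala
object ManualTraversal {
  final case class Street(name: String)
  final case class Address(street: Street)
  final case class Company(addresses: Seq[Address])
  final case class Employee(company: Company)

  // Each helper lifts a function on a field to a function on its owner,
  // using copy so that all other fields are preserved.
  def streetName(f: String => String): Street => Street =
    s => s.copy(name = f(s.name))
  def addressStreet(f: Street => Street): Address => Address =
    a => a.copy(street = f(a.street))
  def companyAddresses(f: Address => Address): Company => Company =
    c => c.copy(addresses = c.addresses.map(f))
  def employeeCompany(f: Company => Company): Employee => Employee =
    e => e.copy(company = f(e.company))

  // Composition from the innermost field outwards.
  def modifyStreetNames(f: String => String): Employee => Employee =
    employeeCompany(companyAddresses(addressStreet(streetName(f))))

  val capitalizer: Employee => Employee = modifyStreetNames {
    case s if s.startsWith("b") => s.capitalize
    case s                      => s
  }
}
```

Applying ManualTraversal.capitalizer to an employee leaves "aaa street" untouched and turns "bbb street" into "Bbb street". The lens libraries essentially save you from writing these helpers by hand.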
If you are open to replacing the addresses in Company from Seq to List, you can use "Scrap Your Boilerplate" from shapeless (example).
import shapeless._, poly._
case class Street(name: String)
case class Address(street: Street)
case class Company(addresses: List[Address])
case class Employee(company: Company)
val employee = Employee(Company(List(
Address(Street("aaa street")),
Address(Street("bbb street")),
Address(Street("bpp street")))))
You can create a polymorphic function which capitalizes the name of a Street if the name starts with a "b".
object capitalizeStreet extends ->(
(s: Street) => {
val name = if (s.name.startsWith("b")) s.name.capitalize else s.name
Street(name)
}
)
Which you can use as:
val afterCapitalize = everywhere(capitalizeStreet)(employee)
// Employee(Company(List(
// Address(Street(aaa street)),
// Address(Street(Bbb street)),
// Address(Street(Bpp street)))))
Take a look at quicklens
You could do it like this
import com.softwaremill.quicklens._
case class Street(name: String)
case class Address(street: Street)
case class Company(address: Seq[Address])
case class Employee(company: Company)
object Foo {
def foo(e: Employee) = {
modify(e)(_.company.address.each.street.name).using {
case name if name.startsWith("b") => name.capitalize
case name => name
}
}
}

Concise way to assert a value matches a given pattern in ScalaTest

Is there a nice way to check that a pattern match succeeds in ScalaTest? An option is given in scalatest-users mailing list:
<value> match {
case <pattern> =>
case obj => fail("Did not match: " + obj)
}
However, it doesn't compose (e.g. if I want to assert that exactly 2 elements of a list match the pattern using Inspectors API). I could write a matcher taking a partial function literal and succeeding if it's defined (it would have to be a macro if I wanted to get the pattern in the message as well). Is there a better alternative?
I am not 100% sure I understand the question you're asking, but one possible answer is to use inside from the Inside trait. Given:
case class Address(street: String, city: String, state: String, zip: String)
case class Name(first: String, middle: String, last: String)
case class Record(name: Name, address: Address, age: Int)
You can write:
inside (rec) { case Record(name, address, age) =>
inside (name) { case Name(first, middle, last) =>
first should be ("Sally")
middle should be ("Ann")
last should be ("Jones")
}
inside (address) { case Address(street, city, state, zip) =>
street should startWith ("25")
city should endWith ("Angeles")
state should equal ("CA")
zip should be ("12345")
}
age should be < 99
}
That works for both assertions or matchers. Details here:
http://www.scalatest.org/user_guide/other_goodies#inside
The other option, if you are using matchers and just want to assert that a value matches a particular pattern, is to use the matchPattern syntax:
val name = Name("Jane", "Q", "Programmer")
name should matchPattern { case Name("Jane", _, _) => }
http://www.scalatest.org/user_guide/using_matchers#matchingAPattern
The scalatest-users post you pointed to was from 2011. We have added the above syntax for this use case since then.
Bill
This might not be exactly what you want, but you could write your test assertion using an idiom like this.
import scala.util.{ Either, Left, Right }
// Test class should extend org.scalatest.AppendedClues
val result = value match {
case ExpectedPattern => Right("test passed")
case _ => Left("failure explained here")
}
result shouldBe 'Right withClue(result.left.get)
This approach leverages the fact that a Scala match expression results in a value.
Here's a more concise version that does not require trait AppendedClues or assigning the result of the match expression to a val.
(value match {
case ExpectedPattern => Right("ok")
case _ => Left("failure reason")
}) shouldBe Right("ok")
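The idea from the question, a check that takes a partial function literal and succeeds if it is defined, can be sketched with the standard library alone via PartialFunction.isDefinedAt. The helper names matches and countMatching are hypothetical and not part of ScalaTest; the second one covers the "exactly 2 elements of a list match" case. Note that this sketch cannot include the pattern text in a failure message, which is the part the question says would need a macro.

```scala
object PatternChecks {
  final case class Name(first: String, middle: String, last: String)

  // Hypothetical helper: true when the pattern (given as a partial function
  // literal with an empty body) is defined at the value.
  def matches[A](value: A)(pattern: PartialFunction[A, Unit]): Boolean =
    pattern.isDefinedAt(value)

  // Hypothetical helper: how many elements the pattern is defined at.
  def countMatching[A](values: Seq[A])(pattern: PartialFunction[A, Unit]): Int =
    values.count(pattern.isDefinedAt)
}
```

Usage would look like assert(PatternChecks.countMatching(names) { case PatternChecks.Name("Jane", _, _) => } == 2), where names is some Seq of Name values.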

Can extractors be customized with parameters in the body of a case statement (or anywhere else that an extractor would be used)?

Basically, I would like to be able to build a custom extractor without having to store it in a variable prior to using it.
This isn't a real example of how I would use it, it would more likely be used in the case of a regular expression or some other string pattern like construct, but hopefully it explains what I'm looking for:
def someExtractorBuilder(arg:Boolean) = new {
def unapply(s:String):Option[String] = if(arg) Some(s) else None
}
//I would like to be able to use something like this
val {someExtractorBuilder(true)}(result) = "test"
"test" match {case {someExtractorBuilder(true)}(result) => result }
//instead I would have to do this:
val customExtractor = someExtractorBuilder(true)
val customExtractor(result) = "test"
"test" match {case customExtractor(result) => result}
When just doing a single custom extractor it doesn't make much difference, but if you were building a large list of extractors for a case statement, it could make things more difficult to read by separating all of the extractors from their usage.
I expect that the answer is no you can't do this, but I thought I'd ask around first :D
Parameterising extractors would be cool, but we don't have the resources to implement them right now.
Nope.
8.1.7 Extractor Patterns
An extractor pattern x(p1, ..., pn) where n ≥ 0 is of the same syntactic form as a constructor pattern. However, instead of a case class, the stable identifier x denotes an object which has a member method named unapply or unapplySeq that matches the pattern.
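In practice the "stable identifier" requirement means that a parameterised extractor has to be given a name (typically a val) before it can appear in a pattern. A minimal sketch of that workaround follows, using a regular expression (which is itself an extractor built from a parameter) and a hypothetical Prefixed class; the names IsoDate, Http, Ftp and classify are made up for this example.

```scala
object StableIdExtractors {
  // A Regex is an extractor parameterised by its pattern string.
  val IsoDate = raw"(\d{4})-(\d{2})-(\d{2})".r

  // Hypothetical parameterised extractor: strips a given prefix.
  class Prefixed(prefix: String) {
    def unapply(s: String): Option[String] =
      if (s.startsWith(prefix)) Some(s.drop(prefix.length)) else None
  }

  // Each configured instance is bound to a stable identifier first.
  val Http = new Prefixed("http://")
  val Ftp  = new Prefixed("ftp://")

  def classify(s: String): String = s match {
    case IsoDate(y, m, d) => s"date $y/$m/$d"
    case Http(rest)       => s"web: $rest"
    case Ftp(rest)        => s"ftp: $rest"
    case _                => "unknown"
  }
}
```

This keeps several differently configured extractors usable in one match, at the cost of declaring them away from the case clauses, which is exactly the separation the question hoped to avoid.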
One can customize extractors to a certain extent using implicit parameters, like this:
object SomeExtractorBuilder {
def unapply(s: String)(implicit arg: Boolean): Option[String] = if (arg) Some(s) else None
}
implicit val arg: Boolean = true
"x" match {
case SomeExtractorBuilder(result) =>
result
}
Unfortunately this cannot be used when you want to use different variants in one match, as all case statements are in the same scope. Still, it can be useful sometimes.
Late, but there is a scalac plugin in one of my libraries that provides the syntax ~(extractorWith(param), bindings):
x match {
case ~(parametrizedExtractor(param)) =>
"no binding"
case ~(parametrizedExtractor(param), (a, b)) =>
s"extracted bindings: $a, $b"
}
https://github.com/cchantep/acolyte/blob/master/scalac-plugin/readme.md
Though what you are asking isn't directly possible,
it is possible to create an extractor that returns a container whose value gets evaluated in the if part of the case. In the if part it is possible to provide parameters.
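import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.time.temporal.TemporalQueries
object DateExtractor {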
def unapply(in: String): Option[DateExtractor] = Some(new DateExtractor(in));
}
class DateExtractor(input:String){
var value:LocalDate=null;
def apply():LocalDate = value;
def apply(format: String):Boolean={
val formater=DateTimeFormatter.ofPattern(format);
try{
val parsed=formater.parse(input, TemporalQueries.localDate());
value=parsed
true;
} catch {
case e:Throwable=>{
false
}
}
}
}
Usage:
object DateExtractorUsage{
def main(args: Array[String]): Unit = {
"2009-12-31" match {
case DateExtractor(ext) if(ext("dd-MM-yyyy"))=>{
println("Found dd-MM-yyyy date:"+ext())
}
case DateExtractor(ext) if(ext("yyyy-MM-dd"))=>{
println("Found yyyy-MM-dd date:"+ext())
}
case _=>{
println("Unable to parse date")
}
}
}
}
This pattern preserves the PartialFunction nature of the piece of code. I find this useful since I am quite a fan of the collect/collectFirst methods, which take a partial function as a parameter and typically do not leave room for pre-creating a set of extractors.