Pattern matching a domain name - scala

I don't use pattern matching as often as I should.
I am matching a domain name for the following:
1. If it starts with www., then remove that portion and return.
www.stackoverflow.com => "stackoverflow.com"
2. If it has either example.com or example.org, strip that out and return.
blog.example.com => "blog"
3. return request.domain
hello.world.com => "hello.world.com"
def filterDomain(request: RequestHeader): String = {
request.domain match {
case //?? case #1 => ?
case //?? case #2 => ?
case _ => request.domain
}
}
How do I reference the value (request.domain) inside the expression and see if it starts with "www." like:
if request.domain.startsWith("www.") request.domain.substring(4)

You can give the value you are pattern matching on a name and Scala will infer its type. You can also put an if guard in your case expression, as follows:
def filterDomain(request: RequestHeader): String = {
request.domain match {
case domain if domain.startsWith("www.") => domain.drop(4)
case domain if domain.contains("example.org") || domain.contains("example.com") => domain.takeWhile(_ != '.')
case _ => request.domain
}
}
Note that the order of the case expressions matters.
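To see why the order matters, here is a minimal sketch that applies the same rules to a plain String, so it can be tried without a Play RequestHeader. The names DomainFilter and filterDomainString are made up for this illustration.

```scala
object DomainFilter {
  // Hypothetical helper: same rules as above, but on a plain String.
  def filterDomainString(domain: String): String = domain match {
    // Checked first, so a "www." prefix always wins over the example.* rule.
    case d if d.startsWith("www.") => d.drop(4)
    case d if d.contains("example.org") || d.contains("example.com") => d.takeWhile(_ != '.')
    case d => d
  }
}
```

With this order, DomainFilter.filterDomainString("www.example.com") returns "example.com". If the two guarded cases were swapped, the same input would hit the example.* rule first and return "www".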

When writing case clauses you can do something like:
case someVar if someVar.length < 2 => someVar.toLowerCase
This should make pretty clear how grabbing matched values works.
So in this case, you would need to write something like:
case d if d.startsWith("www.") => d.substring(4)

If you're dead set on using a regex rather than String methods such as startsWith and contains, you can do the following:
val wwwMatch = "(?:www\\.)(.*)".r
val exampleMatch = "(.*)(?:\\.example\\.(?:(?:com)|(?:org)))(.*)".r
def filterDomain(request: RequestHeader): String = {
request.domain match {
case wwwMatch(d) => d
case exampleMatch(d1, d2) => d1 + d2
case _ => request.domain
}
}
Now, for maintainability's sake, I wouldn't go this way, because a month later, I will look at this and not remember what it's doing, but that's your call.

You don't need pattern matching for that:
request.domain
.stripPrefix("www.")
.stripSuffix(".example.org")
.stripSuffix(".example.com")
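One thing to be aware of: the chained version is not exactly equivalent to the match, because every step is applied rather than only the first rule that fits. A small sketch on a plain String (StripChain and strip are hypothetical names for this illustration):

```scala
object StripChain {
  // Hypothetical helper: applies all three strip steps in sequence.
  def strip(domain: String): String =
    domain
      .stripPrefix("www.")          // no-op when the prefix is absent
      .stripSuffix(".example.org")  // no-op when the suffix is absent
      .stripSuffix(".example.com")
}
```

For the three inputs in the question the results are the same as with the match. For "www.blog.example.com", however, this returns "blog", while the guarded match stops after removing "www." and returns "blog.example.com".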

Related

pattern matching in String Scala

I wrote the code below to determine the type of a String based on some rules.
def dataType (input:String) : String = input match {
case input if input.startsWith("Q") => "StringType"
case input if input.startsWith("8") && !input.contains("F") => "IntegerType"
case input if input.startsWith("8") && input.contains("F") => "FloatType"
case _ => "UnknowType"
}
This code works well, but I want to improve it by avoiding the if guards. I want it to be based on pattern matching only, without any if statements.
I tried to modify it this way, but it gives me bad results:
def dataType (input:String) : String = input match {
case "startsWith('Q')" => "StringType"
case "startsWith('8') && !(contains('F')))" => "IntegerType"
case "startsWith('8') && (contains('F')))" => "FloatType"
case _ => "UnknowType";
}
It always gives me the UnknowType result.
Any help with this please
Best Regards
Since you are checking the first character and whether the input contains F, you can build a Tuple2[Char, Boolean] from those two checks and match on it as follows:
def dataType (input:String) : String = (input.charAt(0), input.contains('F')) match {
case ('8', true) => "FloatType"
case ('Q', _) => "StringType"
case ('8', false) => "IntegerType"
case _ => "UnknowType"
}
And you should be fine
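One caveat: charAt(0) throws on an empty string. A possible variation, sketched here under the assumption that an empty input should fall through to the default, uses headOption instead. The object name DataTypes is made up, and the "UnknowType" spelling is kept from the question.

```scala
object DataTypes {
  // headOption yields None for "", so the empty string reaches the default case
  // instead of throwing StringIndexOutOfBoundsException.
  def dataType(input: String): String =
    (input.headOption, input.contains("F")) match {
      case (Some('Q'), _)     => "StringType"
      case (Some('8'), true)  => "FloatType"
      case (Some('8'), false) => "IntegerType"
      case _                  => "UnknowType"
    }
}
```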

How to use a Result[String] in Scala match expression

In the following code the first expression returns a Result[String] which contains one of the strings "medical", "dental" or "pharmacy" inside of a Result. I can add .toOption.get to the end of the val statement to get the String, but is there a better way to use the Result? Without the .toOption.get, the code will not compile.
val service = element("h2").containingAnywhere("claim details").fullText()
service match {
case "medical" => extractMedicalClaim
case "dental" => extractDentalClaim
case "pharmacy" => extractPharmacyClaim
}
Hard to say without knowing what Result is. If it's a case class, with the target String as part of its constructor, then you could pattern match directly.
Something like this.
service match {
case Result("medical") => extractMedicalClaim
case Result("dental") => extractDentalClaim
case Result("pharmacy") => extractPharmacyClaim
case _ => // default result
}
If the Result class doesn't have an extractor (the unapply() method), you might be able to add one just for this purpose.
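As a rough sketch of what adding an extractor could look like: the Result class below is a hypothetical stand-in (the real library type is not shown in the question), assumed only to expose a toOption method. ResultValue and Claims are names invented for this example.

```scala
// Hypothetical stand-in for the library's Result type; the real one may differ.
final class Result[T](value: Option[T]) {
  def toOption: Option[T] = value
}

// A separate extractor object, useful when you cannot change Result itself.
object ResultValue {
  def unapply[T](r: Result[T]): Option[T] = r.toOption
}

object Claims {
  def route(service: Result[String]): String = service match {
    case ResultValue("medical")  => "medical claim"
    case ResultValue("dental")   => "dental claim"
    case ResultValue("pharmacy") => "pharmacy claim"
    case _                       => "unknown or empty" // covers empty results too
  }
}
```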
I'm assuming this Result[T] class has a toOption method which returns an Option[T] - if that's the case, you can call toOption and match on that option:
val service = element("h2").containingAnywhere("claim details").fullText().toOption
service match {
case Some("medical") => extractMedicalClaim
case Some("dental") => extractDentalClaim
case Some("pharmacy") => extractPharmacyClaim
case None => // handle the case where the result was empty
}

Scala methods with generic parameter type

I have been working with Scala for close to a year, but every now and then I come across a piece of code that I don't really understand. This time it is this one. I tried looking into documents on "scala methods with generic parameter type", but I am still confused.
def defaultCall[T](featureName : String) (block : => Option[T])(implicit name: String, list:Seq[String]) : Option[T] =
{
val value = block match {
case Some(n) => n match {
case i : Integer => /*-------Call another method----*/
case s : String => /*--------Call another method----*/
}
case _ => None
}
The method is called using the code shown below :
var exValue = Some(10)
val intialization = defaultCall[Integer]("StringName"){exValue}
What I don't understand in the above described code is the "case" statement in the defaultCall method.
I see that when exValue has a value and is not empty, the code works as expected. But if I change exValue to None, my code goes into the "case _ => None" branch. I don't understand why this happens, since the match done here is against the "variable", which would be either an Integer or a String.
What happens here is that when you pass a None it will match on the second case, which "catches" everything that is not an instance of a Some[T]:
block match {
case Some(n) => // Will match when you pass an instance of Some[T]
case _ => // Will match on any other case
}
Note that None is an object and Some is a class; both extend Option.
Also, the inner match on the value only runs if the outer match on Some succeeds. To do the type check in the first match, you could write:
block match {
case Some(n: Int) => // do stuff
case Some(n: String) => // do stuff
case _ => // Will match on any other case
}
Hope that helps
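A self-contained sketch of the same idea, using Option[Any] so that both type patterns can be exercised (Describe and describe are names made up for this illustration):

```scala
object Describe {
  def describe(value: Option[Any]): String = value match {
    case Some(i: Int)    => s"int $i"                 // Some holding an Int
    case Some(s: String) => s"string $s"              // Some holding a String
    case Some(other)     => s"something else: $other" // Some holding any other type
    case None            => "nothing"                 // reached only for None
  }
}
```

Here None never reaches the type checks at all, because those patterns are nested inside Some.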

Cleanest way in Scala to avoid nested ifs when transforming collections and checking for error conditions in each step

I have some code for validating ip addresses that looks like the following:
sealed abstract class Result
case object Valid extends Result
case class Malformatted(val invalid: Iterable[IpConfig]) extends Result
case class Duplicates(val dups: Iterable[Inet4Address]) extends Result
case class Unavailable(val taken: Iterable[Inet4Address]) extends Result
def result(ipConfigs: Iterable[IpConfig]): Result = {
val invalidIpConfigs: Iterable[IpConfig] =
ipConfigs.filterNot(ipConfig => {
(isValidIpv4(ipConfig.address)
&& isValidIpv4(ipConfig.gateway))
})
if (!invalidIpConfigs.isEmpty) {
Malformatted(invalidIpConfigs)
} else {
val ipv4it: Iterable[Inet4Address] = ipConfigs.map { ipConfig =>
InetAddress.getByName(ipConfig.address).asInstanceOf[Inet4Address]
}
val dups = ipv4it.groupBy(identity).filter(_._2.size != 1).keys
if (!dups.isEmpty) {
Duplicates(dups)
} else {
val ipAvailability: Map[Inet4Address, Boolean] =
ipv4it.map(ip => (ip, isIpAvailable(ip))).toMap
val taken: Iterable[Inet4Address] = ipAvailability.filter(!_._2).keys
if (!taken.isEmpty) {
Unavailable(taken)
} else {
Valid
}
}
}
}
I don't like the nested ifs because they make the code less readable. Is there a nice way to linearize this code? In Java, I might use return statements, but this is discouraged in Scala.
I personally advocate using a match everywhere you can, as in my opinion it usually makes code very readable:
def result(ipConfigs: Iterable[IpConfig]): Result =
ipConfigs.filterNot(ipc => isValidIpv4(ipc.address) && isValidIpv4(ipc.gateway)).toList match {
case Nil =>
val ipv4it = ipConfigs.map { ipc =>
InetAddress.getByName(ipc.address).asInstanceOf[Inet4Address]
}
// toList is needed because `case Nil` only matches an empty Seq, which keys is not guaranteed to be
ipv4it.groupBy(identity).filter(_._2.size != 1).keys.toList match {
case Nil =>
val taken = ipv4it.filterNot(ip => isIpAvailable(ip))
if (taken.nonEmpty) Unavailable(taken) else Valid
case dups => Duplicates(dups)
}
case invalid => Malformatted(invalid)
}
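Note that I've chosen to match on the empty case first, since you generally go from specific to generic in matches. Nil is the empty List, and case Nil succeeds through equality, so it matches any empty Seq but not an empty Set or other non-Seq collection; convert such a collection with toList before matching. With Nil as the first case there is no need for an i if i.nonEmpty guard in the other case, since anything that did not match Nil is non-empty.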
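One caveat about matching on Nil is worth checking in a REPL: the pattern is an equality test against the empty List, so it depends on what kind of collection is being matched. A minimal sketch (NilMatch and matchesNil are hypothetical names):

```scala
object NilMatch {
  // Returns true only when the `case Nil` branch is taken.
  def matchesNil(xs: Iterable[Int]): Boolean = xs match {
    case Nil => true  // compiled as an equality check: Nil == xs
    case _   => false
  }
}
```

An empty List or Vector takes the Nil branch, but an empty Set does not, because a Seq is never equal to a Set. When the collection might not be a Seq, calling toList before the match (or using an isEmpty guard) avoids the surprise.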
Also note that your vals don't need explicit type annotations. It declutters the code significantly if you write something like
val ipAvailability: Map[Inet4Address, Boolean] =
ipv4it.map(ip => (ip, isIpAvailable(ip))).toMap
as simply
val ipAvailability = ipv4it.map(ip => (ip, isIpAvailable(ip))).toMap
I've also taken the liberty of removing many one-off variables I didn't find remotely necessary, as all they did was add more lines to the code
A thing to note about using match over nested ifs is that it's usually easier to add a new case than to add a new else if, which makes the code more modular.
Alternatively, as suggested by Nathaniel Ford, you can break it up into several smaller methods, in which case the above code would look like so:
def result(ipConfigs: Iterable[IpConfig]): Result =
ipConfigs.filterNot(ipc => isValidIpv4(ipc.address) && isValidIpv4(ipc.gateway)).toList match {
case Nil => wellFormatted(ipConfigs)
case i => Malformatted(i)
}
def wellFormatted(ipConfigs: Iterable[IpConfig]): Result = {
val ipv4it = ipConfigs.map(ipc => InetAddress.getByName(ipc.address).asInstanceOf[Inet4Address])
// toList is needed because `case Nil` only matches an empty Seq, which keys is not guaranteed to be
ipv4it.groupBy(identity).filter(_._2.size != 1).keys.toList match {
case Nil => noDuplicates(ipv4it)
case dups => Duplicates(dups)
}
}
def noDuplicates(ipv4it: Iterable[Inet4Address]): Result =
ipv4it.filterNot(ip => isIpAvailable(ip)).toList match {
case Nil => Valid
case taken => Unavailable(taken)
}
This has the benefit of splitting it up into smaller more manageable chunks, while keeping to the FP ideal of having functions that only do one thing, but do that one thing well, rather than having god-methods that do everything.
Which style you prefer, of course is up to you.
This question is a while old now, but I will add my 2 cents. The proper way to handle this is with Either. You can create a method like:
def checkErrors[T](errorList: Iterable[T], onError: Result): Either[Result, Unit] = if (errorList.isEmpty) Right(()) else Left(onError)
so you can use for comprehension syntax (Either is right-biased from Scala 2.12 on, which is what makes this work):
val invalidIpConfigs = getFormatErrors(ipConfigs)
val result = for {
_ <- checkErrors(invalidIpConfigs, Malformatted(invalidIpConfigs))
dups = getDuplicates(ipConfigs)
_ <- checkErrors(dups, Duplicates(dups))
taken = getAvailability(ipConfigs)
_ <- checkErrors(taken, Unavailable(taken))
} yield Valid
If you don't want to return an Either, use
result.fold(l => l, r => r)
If the check methods use Futures (which could be the case for getAvailability, for example), you can use the cats library to keep the code clean: https://typelevel.org/cats/datatypes/eithert.html
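For anyone who wants to try the Either approach in isolation, here is a simplified sketch. It assumes Scala 2.12 or later and works on plain strings instead of IpConfig values; IpCheck, malformed, duplicates, and validate are hypothetical names, and the regex is only a crude shape check, not real IPv4 validation.

```scala
object IpCheck {
  sealed trait Result
  case object Valid extends Result
  final case class Malformatted(invalid: List[String]) extends Result
  final case class Duplicates(dups: List[String]) extends Result

  // Left(onError) short-circuits the for comprehension; Right(()) lets it continue.
  def checkErrors[T](errors: Iterable[T], onError: => Result): Either[Result, Unit] =
    if (errors.isEmpty) Right(()) else Left(onError)

  // Crude, hypothetical checks on plain strings.
  def malformed(ips: List[String]): List[String] =
    ips.filterNot(_.matches("""\d{1,3}(\.\d{1,3}){3}"""))

  def duplicates(ips: List[String]): List[String] =
    ips.groupBy(identity).collect { case (ip, group) if group.size > 1 => ip }.toList

  def validate(ips: List[String]): Result = {
    val bad = malformed(ips)
    lazy val dups = duplicates(ips) // only evaluated if the first check passes
    val outcome: Either[Result, Result] = for {
      _ <- checkErrors(bad, Malformatted(bad))
      _ <- checkErrors(dups, Duplicates(dups))
    } yield Valid
    outcome.merge // collapses Either[Result, Result] into a Result
  }
}
```

Each check sits on its own line, and the first failing check determines the result, which is the linear shape the question asks for.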
I think it's pretty readable and I wouldn't try to improve it from there, except that !isEmpty is equivalent to nonEmpty.

Is there a way to see what a wildcard pattern is receiving during a match in Scala?

When doing pattern matching in an Akka or Scala Actor, is there a way to see what the match was NOT, i.e. what is being evaluated by the wildcard _? Is there a simple way to see which message from the mailbox is being processed when no match is found for it?
def receive = {
case A =>
case B =>
case C =>
...
case _ =>
println("what IS the message evaluated?")
}
Thanks,
Bruce
You can just define variable like this:
def receive = {
case A =>
case B =>
case C =>
...
case msg =>
println("unsupported message: " + msg)
}
You can even bind a name to the message you are matching with @:
def receive = {
case msg @ A => // do something with `msg`
...
}
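Both forms of binding can be tried without Akka. The sketch below uses a plain PartialFunction as a stand-in for an actor's receive; Messages, Msg, A, and B are hypothetical names for this illustration.

```scala
object Messages {
  sealed trait Msg
  case object A extends Msg
  final case class B(n: Int) extends Msg

  // Stand-in for an actor's receive method; returns a description instead of acting.
  val receive: PartialFunction[Any, String] = {
    case A          => "got A"
    case msg @ B(n) => s"got $msg with n = $n"        // binds the whole message and its field
    case other      => s"unsupported message: $other" // catch-all with a name instead of _
  }
}
```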
The "correct" way to do this in Akka is to override the unhandled method, do what you want there, and either delegate to the default behavior or replace it.
http://akka.io/api/akka/2.0-M4/#akka.actor.Actor
As for pattern matching in general, just match on anything, and bind it to a name, so you can refer to it:
x match {
case "foo" => whatever
case otherwise => //matches anything and binds it to the name "otherwise", use that inside the body of the match
}