Refactoring scala match case for list of strings - scala

I have the following:
def myFunc(str: String): Something => {
str match {
case "a" | "a1" | "abc" | "qwe" | "23rs" => Something
case _ => None
}
}
The list of strings could be very long, so I'd like to extract it into a function. I don't really know what to search for. I tried:
def isSomething(str: String): Boolean => {
List("a","a1","abc","qwe","23rs").contains(str)
}
However, case isSomething => Something doesn't work.

Your str is a String, so it can't be matched against isSomething, which returns a Boolean (in fact, a lowercase name in a case pattern is a fresh variable that matches anything, so the function is never called). Another issue with your sample code is that None is of type Option, so it would make more sense for all match cases to return the same type. Here's one approach using a guard for the contains condition:
val list = List("a", "a1", "abc", "qwe", "23rs")
val s = "abc"
s match {
case s if list contains s => Some(s)
case _ => None
}
// res1: Option[String] = Some(abc)
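For a long list, the same guard idea can be packaged into a function. The following is only a minimal sketch, assuming a Set is acceptable in place of the List; the names GuardSketch, allowed and lookup are illustrative and not from the question:

```scala
object GuardSketch {
  // Assumption: a Set replaces the List so that membership checks stay fast
  // even when the collection of strings grows long.
  val allowed: Set[String] = Set("a", "a1", "abc", "qwe", "23rs")

  // A Set[String] is itself a String => Boolean, so it can be passed to filter.
  // Option(str) also turns a null input into None.
  def lookup(str: String): Option[String] = Option(str).filter(allowed)
}
```

GuardSketch.lookup("abc") yields Some("abc"), and any string outside the set yields None.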

Most of the other answers seem to cover fixing up the use of option, or moving away from pattern matching (a simple use of guards isn't really pattern matching, IMO)
I think you may be asking about extractors. If so, this might be closer to what you want:
case class Something(str: String)
// define an extractor to match our list of Strings
object MatchList {
def unapply(str: String) = {
str match {
case "a" | "a1" | "abc" | "qwe" | "23rs" => Some(str)
case _ => None
}
}
}
def myFunc(str: String): Option[Something] = {
// use our new extractor (and fix up the use of Option while we're at it)
str match {
case MatchList(str) => Some(Something(str))
case _ => None
}
}
// Couple of test cases...
myFunc("a") // Some(Something(a))
myFunc("b") // None
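If the list of strings is long, the extractor does not have to spell out every alternative. Below is a rough sketch of an extractor backed by a Set; OneOf, Known and classify are hypothetical names introduced here for illustration:

```scala
object ExtractorSketch {
  // Hypothetical helper: an extractor whose accepted values live in a Set.
  class OneOf(allowed: Set[String]) {
    def unapply(s: String): Option[String] =
      if (allowed(s)) Some(s) else None
  }

  // A capitalized val is a stable identifier, so it can be used in a pattern.
  val Known = new OneOf(Set("a", "a1", "abc", "qwe", "23rs"))

  def classify(s: String): String = s match {
    case Known(hit) => s"known: $hit"
    case _          => "unknown"
  }
}
```

The set can be built from any source (a file, a config value), while the match expression itself stays short.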

First, you used => instead of = when defining the function.
scala> def isSomething(str: String): Boolean = {
| List("a","a1","abc","qwe","23rs").contains(str)
| }
isSomething: (str: String)Boolean
scala> def myFunc(str: String): String = {
|
| str match {
| case str if(isSomething(str)) => "Something"
| case _ => "None"
| }
| }
myFunc: (str: String)String
scala> myFunc("a")
res9: String = Something
I don't know what Something is, so I have treated it as a String. You can modify it to fit your use case.
Hope it helps.

You can do something like the following:
val list = List("a", "a1", "abc", "qwe", "23rs")
def myFunc(str: String): Option[String] = {
list.contains(str) match {
case true => ??? //calling something function should return Some
case false => None
}
}
Option[String] can be changed to match your return type, but None suggests that the true case should return an Option. So String can be changed to your desired data type.

Related

Check case class with Option fields to make sure all of them are None using Shapeless HList

I have a case class (simplified):
case class UserData(name: Option[String], age: Option[String]) {
lazy val nonEmpty = name.isDefined || age.isDefined // TODO
}
Can I replace the current implementation of nonEmpty check using, for instance, Shapeless' HList in order to enumerate all the fields to check that all of them are set to None or at least one has a value?
case class UserData(name: Option[String], age: Option[String]) {
lazy val isEmpty = this.productIterator.forall(_ == None)
}
UserData(None,None).isEmpty
UserData(None,Some("s")).isEmpty
I suppose you want different behavior inside the case class; if you don't, then @pamu's answer is what you are looking for.
If you really want to use Shapeless you can, but there is no need.
I think you can also check this in plain Scala using productIterator.
scala> val data = UserData(None, None)
data: UserData = UserData(None,None)
scala> data.productIterator.forall {
| case x: Option[_] => x.isDefined
| case _ => false
| }
res2: Boolean = false
scala> val data = UserData(Some("foo"), Some("bar"))
data: UserData = UserData(Some(foo),Some(bar))
scala> data.productIterator.forall {
| case x: Option[_] => x.isDefined
| case _ => false // you may throw exception if you are not expecting this case
| }
res3: Boolean = true
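Note that forall with isDefined asks whether every field is set. To express the original nonEmpty (at least one field is set), exists is the matching combinator. A small sketch, with NonEmptySketch as an illustrative wrapper name:

```scala
object NonEmptySketch {
  case class UserData(name: Option[String], age: Option[String]) {
    // True when at least one Option field holds a value.
    lazy val nonEmpty: Boolean = productIterator.exists {
      case opt: Option[_] => opt.isDefined
      case _              => false // non-Option fields are ignored in this sketch
    }
  }
}
```

New Option fields added to the case class are picked up automatically, at the cost of losing compile-time checking of the field types.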

Pattern Matching for CSV processing

I have a CSV file with three columns and I want to get the third column into an Iterator. I want to filter out the header by using the trytoDouble method in combination with pattern matching.
def trytoDouble(s: String): Option[Double] = {
try {
Some(s.toDouble)
} catch {
case e: Exception => None
}
}
val asdf = Source.fromFile("my.csv").getLines().map(_.split(",").map(_.trim).map(utils.trytoDouble(_))).map{
_ match {
case Array(a, b, Some(c: Double)) => c
}
}
results in
An exception or error caused a run to abort: [Lscala.Option;@2b4a2ec7 (of class [Lscala.Option;)
scala.MatchError: [Lscala.Option;@2b4a2ec7 (of class [Lscala.Option;)
what did I do wrong?
Try using an extractor like StringDouble below. If the unapply returns Some then it matches, if it returns None then it doesn't match.
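import scala.io.Source
import scala.util.Try
object StringDouble {
def unapply(str: String): Option[Double] = Try(str.toDouble).toOption
}
val doubles =
Source.fromFile("my.csv").getLines().map { line =>
line.split(",").map(_.trim)
}.collect {
// collect skips rows that do not match (e.g. the header); map would throw a MatchError
case Array(_, _, StringDouble(d)) => d
}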
Your code will always throw a scala.MatchError.
scala> val url = "/home/knoldus/data/moviedataset.csv"
url: String = /home/knoldus/data/moviedataset.csv
scala> val asdf1 = Source.fromFile(url).getLines().map(_.split(",").map(_.trim).map(trytoDouble(_))).toList
asdf1: List[Array[Option[Double]]] = List(Array(Some(1.0), None, Some(1993.0), Some(3.9), Some(4568.0)), Array(Some(2.0), None, Some(1932.0), Some(3.5), Some(4388.0)), Array(Some(3.0), None, Some(1921.0), Some(3.2), Some(9062.0)), Array(Some(4.0), None, Some(1991.0), Some(2.8), Some(6150.0)), Array(Some(5.0), None, Some(1963.0), Some(2.8), Some(5126.0)), ....
getLines() returns an iterator, so I converted it with toList to see the result.
Notice that the result type is List[Array[Option[Double]]]. Your pattern Array(a, b, Some(c: Double)) only matches an array of exactly three elements whose third element is a Some, so any other row (here, every row has five elements) falls through.
Therefore, it throws a MatchError.
Try this:
val asdf = Array("1,2,3","1,b,c").map(_.split(",").map(_.trim).map(trytoDouble(_))).map{
_(2) match {
case x:Option[Double] => x
}
}
A match with only one case is fragile: any value that the case does not cover throws a MatchError.
See the REPL example:
scala> def trytoDouble(s: String): Option[Double] = {
| try {
| Some(s.toDouble)
| } catch {
| case e: Exception => None
| }
| }
when there's no Double in your CSV,
scala> List("a,b,c").map(_.split(",").map(_.trim).map(trytoDouble(_)))
res1: List[Array[Option[Double]]] = List(Array(None, None, None))
Matching the result above (Array(None, None, None)) against Array(a, b, Some(c: Double)) is always going to fail:
scala> List("a,b,c").map(_.split(",").map(_.trim).map(trytoDouble(_))).map { data =>
| data match {
| case Array(a, b, Some(c: Double)) => c
| }
| }
scala.MatchError: [Lscala.Option;@22604c7e (of class [Lscala.Option;)
at .$anonfun$res4$4(<console>:15)
at .$anonfun$res4$4$adapted(<console>:13)
at scala.collection.immutable.List.map(List.scala:283)
... 33 elided
And when there is a Double:
scala> List("a,b,100").map(_.split(",").map(_.trim).map(trytoDouble(_))).map { data => data match { case Array(a, b, Some(c: Double)) => c }}
res5: List[Double] = List(100.0)
You basically need to handle both Some(c) and None.
But if I understand correctly, you want to extract the third field as a Double, which can be done this way:
scala> List("a,b,100", "100, 200, p").map(_.split(",")).map { case Array(a, b, c) => trytoDouble(c)}.filter(_.nonEmpty)
res14: List[Option[Double]] = List(Some(100.0))
The other answers are great, but I just noticed that an alternative of minimal code change is to simply write
val asdf = Source.fromFile(fileName).getLines().map(_.split(",").map(_.trim).map( utils.trytoDouble(_))).flatMap{
_ match {
case Array(a, b, c) => c
}
}
or slightly more efficient (since we are only interested in the last column anyway):
val asdf = Source.fromFile(fileName).getLines().map(_.split(",")).flatMap{
_ match {
case Array(a, b, c) => trytoDouble(c.trim)
}
}
Note that flatMap drops the None values here and unwraps each Some.
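The extractor and the "skip what does not parse" idea can be combined with collect, which keeps only the rows a pattern matches. The following sketch works on any iterator of lines so it can be tried without a file; CsvSketch, AsDouble and thirdColumn are names made up for this example:

```scala
import scala.util.Try

object CsvSketch {
  // Extractor: matches only strings that parse as a Double.
  object AsDouble {
    def unapply(s: String): Option[Double] = Try(s.trim.toDouble).toOption
  }

  // Keeps the third column of every three-column row whose third field is numeric.
  // Rows that do not match (header, malformed lines) are silently skipped.
  def thirdColumn(lines: Iterator[String]): Iterator[Double] =
    lines.map(_.split(",")).collect {
      case Array(_, _, AsDouble(d)) => d
    }
}
```

With a real file, the argument would be Source.fromFile("my.csv").getLines(). Whether silently skipping malformed rows is acceptable depends on the data; logging them may be preferable.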

pattern matching with case match in scala

I have a match statement like this:
val x = y match {
case array: Array[Float] => call z
case array: Array[Double] => call z
case array: Array[BigDecimal] => call z
case array: Array[_] => show error
}
How do I simplify this to use only two case statements instead of four, since the first three do the same thing?
Type erasure does not really give you the opportunity to find out how a generic collection was typed (arrays are an exception on the JVM, since they keep their element type at runtime). What you can do instead is extract the head (first element) and check its type. For example, the following code works for me:
List(1,2,3) match {
case (a:Int) :: tail => println("yep")
}
This works, although it is not very nice:
def x(y: Array[_]) = y match {
case a if a.isInstanceOf[Array[Double]] ||
a.isInstanceOf[Array[Float]] ||
a.isInstanceOf[Array[BigDecimal]] => "call z"
case _ => "show error"
}
I would have thought that pattern matching with "|" as below would do the trick. However, this gives "pattern type is incompatible with expected type" on Array[Float] and Array[BigDecimal]. It might be that matching on generics in this single case where it could work has not been given much attention:
def x(y: Array[_ <: Any]) = y match {
case a @ (_:Array[Double] | _:Array[Float] | _:Array[BigDecimal]) => "call z"
case a: Array[_] => "show error"
}
Maybe this helps a bit:
import reflect.runtime.universe._
object Tester {
def test[T: TypeTag](y: Array[T]) = y match {
case c: Array[_] if typeOf[T] <:< typeOf[AnyVal] => "hi"
case c: Array[_] => "oh"
}
}
scala> Tester.test(Array(1,2,3))
res0: String = hi
scala> Tester.test(Array(1.0,2.0,3.0))
res1: String = hi
scala> Tester.test(Array("a", "b", "c"))
res2: String = oh
You can obtain the class of array elements as follows (it will be null for non-array types): c.getClass.getComponentType. So you can write:
if (Set(classOf[Float], classOf[Double], classOf[BigDecimal]).contains(c.getClass.getComponentType)) {
// call z
} else {
// show error
}
Not particularly Scala'ish, though; I think @thoredge's answer is the best for that.
You could also check whether the Array is empty first and then if not, just pattern match on Array.head...something like:
def x(y: Array[_]) = {
y.isEmpty match {
case true => "error"
case false => y.head match {
case _: Double | _: BigInt => "do whatever"
case _ => "error"
}
}
}
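Because JVM arrays keep their element type at runtime, the alternatives can also be written as typed patterns. This is a sketch only, assuming the scrutinee is statically typed as Any (which avoids the incompatible-pattern error mentioned above); ArraySketch and describe are illustrative names:

```scala
object ArraySketch {
  def describe(y: Any): String = y match {
    // One case covers all three element types; nothing is bound, so "|" is allowed.
    case _: Array[Float] | _: Array[Double] | _: Array[BigDecimal] => "call z"
    case _: Array[_] => "show error"
    case _           => "not an array"
  }
}
```

Unlike the head-based approach, this also gives an answer for empty arrays.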

CLOSED!! How can I detect the type from a string in Scala?

I'm trying to parse CSV files and I need to determine the type of each field from its string value.
For example:
val row: Array[String] = Array("1/1/06 0:00","3108 OCCIDENTAL DR","3","3C","1115")
This is what I would like to get:
row(0) --> Date
row(1) --> String
row(2) --> Int
Etc.
How can I do this?
------------------------------------ SOLUTION ------------------------------------
This is the solution I've found to recognize String, Date, Int, Double and Boolean fields.
I hope it can be useful to someone in the future.
def typeDetection(x: String): String = {
x match {
// Matches: [12], [-22], [0] Non-Matches: [2.2], [3F]
case int if int.matches("^-?[0-9]+$") => "Int"
// Matches: [2,2], [-2.3], [0.2232323232332] Non-Matches: [.2], [,2], [2.2.2]
case double if double.matches("^-?[0-9]+(,|\\.)[0-9]+$") => "Double"
// Matches: [29/02/2004 20:15:27], [29/2/04 8:9:5], [31/3/2004 9:20:17] Non-Matches: [29/02/2003 20:15:15], [2/29/04 20:15:15], [31/3/4 9:20:17]
case d1 if d1.matches("^((((31\\/(0?[13578]|1[02]))|((29|30)\\/(0?[1,3-9]|1[0-2])))\\/(1[6-9]|[2-9]\\d)?\\d{2})|(29\\/0?2\\/(((1[6-9]|[2-9]\\d)?(0[48]|[2468][048]|[13579][26])|((16|[2468][048]|[3579][26])00))))|(0?[1-9]|1\\d|2[0-8])\\/((0?[1-9])|(1[0-2]))\\/((1[6-9]|[2-9]\\d)?\\d{2})) *(?:(?:([01]?\\d|2[0-3])(\\-|:|\\.))?([0-5]?\\d)(\\-|:|\\.))?([0-5]?\\d)")
=> "Date"
// Matches: [01.1.02], [11-30-2001], [2/29/2000] Non-Matches: [02/29/01], [13/01/2002], [11/00/02]
case d2 if d2.matches("^(?:(?:(?:0?[13578]|1[02])(\\/|-|\\.)31)\\1|(?:(?:0?[1,3-9]|1[0-2])(\\/|-|\\.)(?:29|30)\\2))(?:(?:1[6-9]|[2-9]\\d)?\\d{2})$|^(?:0?2(\\/|-|\\.)29\\3(?:(?:(?:1[6-9]|[2-9]\\d)?(?:0[48]|[2468][048]|[13579][26])|(?:(?:16|[2468][048]|[3579][26])00))))$|^(?:(?:0?[1-9])|(?:1[0-2]))(\\/|-|\\.)(?:0?[1-9]|1\\d|2[0-8])\\4(?:(?:1[6-9]|[2-9]\\d)?\\d{2})$")
=> "Date"
// Matches: [12/01/2002], [12/01/2002 12:32:10] Non-Matches: [32/12/2002], [12/13/2001], [12/02/06]
case d3 if d3.matches("^(([0-2]\\d|[3][0-1])(\\/|-|\\.)([0]\\d|[1][0-2])(\\/|-|\\.)[2][0]\\d{2})$|^(([0-2]\\d|[3][0-1])(\\/|-|\\.)([0]\\d|[1][0-2])(\\/|-|\\.)[2][0]\\d{2}\\s([0-1]\\d|[2][0-3])\\:[0-5]\\d\\:[0-5]\\d)$")
=> "Date"
case boolean if boolean.equalsIgnoreCase("true") || boolean.equalsIgnoreCase("false") => "Boolean"
case _ => "String"
}
}
val row: Array[String] = Array("1/1/06 0:00","3108 OCCIDENTAL DR","3","3C","1115")
val types: Array[String] = row.map(x => x match {
case string if string.contains("/") => "Date probably"
case string if string.matches("[0-9]+") => "Int probably"
case _ => "String probably"
})
types.foreach( x => println(x))
Outputs:
Date probably
String probably
Int probably
String probably
Int probably
But in all honesty I wouldn't use this approach. It is error prone and many things could go wrong; the simplest example is a plain string that contains a /, which this small piece of code would classify as a Date.
I don't know your use case, but in my experience it's a bad idea to build something that tries to guess types from untrusted data. If you have control over the data, you could introduce an identifier, for example "1/1/06 0:00 %d%", where %d% indicates a date, and then remove it from the string. Even then you can never be 100% sure that this won't fail.
For each string: try parsing it into the type you want. You'll have to write a function for each type. Keep trying in order until one of them works, order is important. You can use your favorite Date/Time library.
import java.util.Date
def stringdetect (s : String) = {
dateFromString(s) orElse intFromString(s) getOrElse s
}
def arrayDetect(row : Array[String]) = row map stringdetect
def arrayTypes(row : Array[String]) = {
arrayDetect(row) map { _ match {
case x:Int => "Int"
case x:Date => "Date"
case x:String => "String"
case _ => "?"
} }
}
def intFromString(s : String): Option[Int] = {
try {
Some(s.toInt)
} catch {
case _ : Throwable => None
}
}
def dateFromString(s : String): Option[Date] = {
try {
val formatter = new java.text.SimpleDateFormat("d/M/yy h:mm")
Some(formatter.parse(s))
} catch {
case _ : Throwable => None
}
}
From the REPL / worksheet:
val row: Array[String] = Array("1/1/06 0:00","3108 OCCIDENTAL DR","3","3C","1115")
//> row : Array[String] = Array(1/1/06 0:00, 3108 OCCIDENTAL DR, 3, 3C, 1115)
arrayDetect(row)
//> res0: Array[Any] = Array(Sun Jan 01 00:00:00 CST 2006, 3108 OCCIDENTAL DR, 3 , 3C, 1115)
arrayTypes(row)
//> res1: Array[String] = Array(Date, String, Int, String, Int)
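The "try each parser in order" idea can also return a small sealed type instead of Any, so later matches are checked by the compiler. This is a sketch under stated assumptions: the Detected hierarchy and detect are invented for illustration, and dates are left out to keep the example independent of locale and time zone:

```scala
import scala.util.Try

object DetectSketch {
  sealed trait Detected
  case class IntValue(v: Int) extends Detected
  case class DoubleValue(v: Double) extends Detected
  case class BoolValue(v: Boolean) extends Detected
  case class Text(v: String) extends Detected

  // Order matters: "3" parses as both Int and Double, so Int is tried first.
  def detect(s: String): Detected =
    Try[Detected](IntValue(s.toInt))
      .orElse(Try(DoubleValue(s.toDouble)))
      .orElse(Try(BoolValue(s.toBoolean)))
      .getOrElse(Text(s)) // nothing parsed: keep the raw string
}
```

A date parser could be slotted in as another orElse step before the Text fallback.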

Regex.MatchData returning null: why not Option[String]?

Is there any particular reason why Regex.MatchData.group(i: Int): java.lang.String returns null rather than Option[String]?
Is there a "Scala Way" to handle nulls in Scala?
It returns null because it is a shallow interface over the Java library. I think it sucks too, and I have been bitten by it.
If you get a value that may be null, you can write Option(value) on Scala 2.8 and it will become either None or Some(value). That doesn't work with pattern matching, but you can write your own extractor for that:
object Optional {
def unapply[T](a: T) = if (null == a) Some(None) else Some(Some(a))
}
Examples:
scala> val a:String = null
a: String = null
scala> a match {
| case Optional(None) => println("Got none")
| case Optional(Some(value)) => println("Got "+value)
| }
Got none
scala> val a = "string"
a: java.lang.String = string
scala> a match {
| case Optional(None) => println("Got none")
| case Optional(Some(value)) => println("Got "+value)
| }
Got string
scala> val a = "5"
a: java.lang.String = 5
scala> a match {
| case Optional(None) => println("Got none")
| case Optional(Some(value)) => println("Got "+value.toInt)
| }
Got 5
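When no pattern match is needed, wrapping the call in Option at the point of use is usually enough. A minimal sketch, where the regex and the names GroupSketch, Rx and suffix are made up for the example:

```scala
object GroupSketch {
  // Group 3 is inside an optional group, so it is null when that part is absent.
  private val Rx = """(\d+)(-(\w+))?""".r

  def suffix(s: String): Option[String] =
    Rx.findFirstMatchIn(s).flatMap(m => Option(m.group(3)))
}
```

Option(...) turns the null from group into None, so callers never see it.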