I'm a Scala beginner and this piece of code is making me struggle.
Is there a way to do pattern matching to make sure everything I pass to Data is of the correct type? As you can see, I have quite strange data types...
class Data (
val recipient: String,
val templateText: String,
val templateHtml: String,
val blockMaps: Map[String,List[Map[String,String]]],
templateMap: Map[String,String]
)
...
val dataParsed = JSON.parseFull(message)
dataParsed match {
case dataParsed: Map[String, Any] => {
def e(s: String) = dataParsed get s
val templateText = e("template-text")
val templateHtml = e("template-html")
val recipient = e("email")
val templateMap = e("data")
val blockMaps = e("blkdata")
val dependencies = new Data(recipient, templateText, templateHtml, blockMaps, templateMap)
Core.inject ! dependencies
}
...
I guess your problem is that you want to be able to pattern match the map that you get from parseFull(), but Map doesn't have an unapply.
So you could pattern match every single value, providing a default if it is not of the correct type:
val templateText: Option[String] = e("template-text") match {
case s: String => Some(s)
case _ => None
}
Or temporarily put all the data into some structure that can be pattern matched:
val data = (e("template-text"), e("template-html"), e("email"), e("data"),
e("blkdata"))
val dependencies: Option[Data] = data match {
  case (templateText: String,
        templateHtml: String,
        recipient: String,
        templateMap: Map[String, String],
        blockMaps: Map[String, List[Map[String, String]]]) =>
    Some(new Data(recipient, templateText, templateHtml, blockMaps, templateMap))
  case _ => None
}
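Keep in mind that the type parameters of the two Map patterns are erased at runtime, so those cases only check that the value is a Map at all (the compiler warns about this). As an alternative sketch (not part of the original answer, and reusing the e helper from the question), you can narrow each field separately and combine them in a for comprehension:
// Sketch only: @unchecked silences the erasure warnings, it does not make the
// element-type checks real.
def asString(v: Option[Any]): Option[String] = v.collect { case s: String => s }

val dependenciesOpt: Option[Data] =
  for {
    recipient    <- asString(e("email"))
    templateText <- asString(e("template-text"))
    templateHtml <- asString(e("template-html"))
    blockMaps    <- e("blkdata").collect { case m: Map[String, List[Map[String, String]]] @unchecked => m }
    templateMap  <- e("data").collect { case m: Map[String, String] @unchecked => m }
  } yield new Data(recipient, templateText, templateHtml, blockMaps, templateMap)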
I have following HOCON config:
a {
b.c.d = "val1"
d.f.g = "val2"
}
HOCON represents paths "b.c.d" and "d.f.g" as objects. So, I would like to have a reader, which reads these configs as Map[String, String], ex:
Map("b.c.d" -> "val1", "d.f.g" -> "val2")
I've created a reader and am trying to do it recursively:
import scala.collection.mutable.{Map => MutableMap}
private implicit val mapReader: ConfigReader[Map[String, String]] = ConfigReader.fromCursor(cur => {
def concat(prefix: String, key: String): String = if (prefix.nonEmpty) s"$prefix.$key" else key
def toMap(): Map[String, String] = {
val acc = MutableMap[String, String]()
def go(
cur: ConfigCursor,
prefix: String = EMPTY,
acc: MutableMap[String, String]
): Result[Map[String, Object]] = {
cur.fluent.mapObject { obj =>
obj.value.valueType() match {
case ConfigValueType.OBJECT => go(obj, concat(prefix, obj.pathElems.head), acc)
case ConfigValueType.STRING =>
acc += (concat(prefix, obj.pathElems.head) -> obj.asString.right.getOrElse(EMPTY))
}
obj.asRight
}
}
go(cur, acc = acc)
acc.toMap
}
toMap().asRight
})
It gives me the correct result, but is there a way to avoid the MutableMap here?
P.S. Also, I would like to keep the implementation as a pureconfig reader.
The solution given by Ivan Stanislavciuc isn't ideal. If the parsed config object contains values other than strings or objects, you don't get an error message (as you would expect) but instead some very strange output. For instance, if you parse a typesafe config document like this
"a":[1]
The resulting value will look like this:
Map(a -> [
# String: 1
1
])
And even if the input only contains objects and strings, it doesn't work correctly, because it erroneously adds double quotes around all the string values.
So I gave this a shot myself and came up with a recursive solution that reports an error for things like lists or null and doesn't add quotes that shouldn't be there.
implicit val reader: ConfigReader[Map[String, String]] = {
implicit val r: ConfigReader[String => Map[String, String]] =
ConfigReader[String]
.map(v => (prefix: String) => Map(prefix -> v))
.orElse { reader.map { v =>
(prefix: String) => v.map { case (k, v2) => s"$prefix.$k" -> v2 }
}}
ConfigReader[Map[String, String => Map[String, String]]].map {
_.flatMap { case (prefix, v) => v(prefix) }
}
}
Note that my solution doesn't mention ConfigValue or ConfigReader.Result at all. It only takes existing ConfigReader objects and combines them with combinators like map and orElse. This is, generally speaking, the best way to write ConfigReaders: don't start from scratch with methods like ConfigReader.fromFunction, use existing readers and combine them.
It seems a bit surprising at first that the above code works at all, because I'm using reader within its own definition. But it works because the orElse method takes its argument by name and not by value.
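For reference, here is a minimal sketch of how the reader could be exercised, assuming pureconfig's ConfigSource API and the config from the question:
import pureconfig._

// With the implicit reader above in scope:
val result = ConfigSource.string(
  """a {
    |  b.c.d = "val1"
    |  d.f.g = "val2"
    |}""".stripMargin
).at("a").load[Map[String, String]]
// result should be Right(Map("b.c.d" -> "val1", "d.f.g" -> "val2"))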
You can do the same without using recursion, by using the entrySet method as follows:
import scala.jdk.CollectionConverters._
val hocon =
"""
|a {
| b.c.d = "val1"
| d.f.g = val2
|}""".stripMargin
val config = ConfigFactory.load(ConfigFactory.parseString(hocon))
val innerConfig = config.getConfig("a")
val map = innerConfig
.entrySet()
.asScala
.map { entry =>
entry.getKey -> entry.getValue.render()
}
.toMap
println(map)
Produces
Map(b.c.d -> "val1", d.f.g -> "val2")
With that knowledge, it's possible to define a pureconfig.ConfigReader that reads Map[String, String] as follows:
implicit val reader: ConfigReader[Map[String, String]] = ConfigReader.fromFunction {
case co: ConfigObject =>
Right(
co.toConfig
.entrySet()
.asScala
.map { entry =>
entry.getKey -> entry.getValue.render()
}
.toMap
)
case value =>
//Handle error case
Left(
ConfigReaderFailures(
ThrowableFailure(
new RuntimeException("cannot be mapped to map of string -> string"),
Option(value.origin())
)
)
)
}
I did not want to write custom readers to get a mapping of key-value pairs. Instead, I changed my internal data type from a map to a list of pairs (I am using Kotlin), and then I can easily convert that to a map at some later internal stage if I need to. My HOCON was then able to look like this:
additionalProperties = [
{first = "sasl.mechanism", second = "PLAIN"},
{first = "security.protocol", second = "SASL_SSL"},
]
additionalProducerProperties = [
{first = "acks", second = "all"},
]
Not the best for humans... but I prefer it to having to build custom parsing components.
I have a few vals that pattern match on incoming values.
Here is an example:
val job_ = Try(jobId.toInt) match {
case Success(value) => jobs.findById(value).map(_.id)
.getOrElse( Left(WrongValue("jobId", s"$value is not a valid job id")))
case Failure(_) => jobs.findByName(jobId.toString).map(_.id)
.getOrElse( Left(WrongValue("jobId", s"'$jobId' is not a known job title.")))
}
// Here the value arrives as a string, i.e. "yes", "no", "true" or "false", and is then converted to a boolean
val bool_ = bool.toLowerCase() match {
case "yes" => true
case "no" => false
case "true" => true
case "false" => false
case other => Left(Invalid("bool", s"wrong value received"))
}
Note: the invalid case is case class Invalid(x: String, xx: String)
Above, I'm looking up a given job value and checking whether it exists in the db or not.
Now I have a few of these and want to add them to a list and flatten it:
val errors = List(..all my vals errors...).flatten // <--- my_list_val (how do I include val bool_ and val job_)
if (errors.isEmpty) { do stuff }
My result should contain the errors from val bool_ and val job_.
Thanks!
You need to fix the types first. The type of bool_ is Any, which does not give you something you can work with.
If you want to use Either, you need to use it everywhere.
Then, the easiest approach would be to use a for comprehension (I am assuming you're dealing with Either[F, T] here, where WrongValue and Invalid are both subclasses of F, and you're not really interested in the errors).
for {
foundJob <- job_
_ <- bool_
} yield {
// do stuff
}
Note, that in Scala >= 2.13 you can use toIntOption when converting the String to Int:
val job_: Either[F, T] = jobId.toIntOption match {
case Some(value) => ...
case _ => ...
}
Also, in case expressions, you can use alternatives when you have the same statement for several cases:
val bool_: Either[F, Boolean] = bool.toLowerCase() match {
case "yes" | "true" => Right(true)
case "no" | "false" => Right(false)
case other => Left(Invalid("bool", "wrong value received"))
}
So, according to your question, and your comments, these are the types you're dealing with.
type ID = Long //whatever id is
def WrongValue(x: String, xx: String): String = "?-?-?"
case class Invalid(x: String, xx: String)
Now let's create a couple of error values.
val job_ :Either[String,ID] = Left(WrongValue("x","xx"))
val bool_ :Either[Invalid,Boolean] = Left(Invalid("x","xx"))
To combine and report them you might do something like this.
val errors :List[String] =
List(job_, bool_).flatMap(_.swap.toOption.map(_.toString))
println(errors.mkString(" & "))
//?-?-? & Invalid(x,xx)
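In Scala 2.13 you could also split failures and successes in one pass with partitionMap; a sketch with the same two vals (left-mapping both to String so they share an error type):
// Left values end up in errorList, Right values in successes.
val (errorList, successes) =
  List(job_.left.map(_.toString), bool_.left.map(_.toString)).partitionMap(identity)

println(errorList.mkString(" & "))
//?-?-? & Invalid(x,xx)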
After checking the types as @cbley explained, you can just do a filter operation with pattern matching on your list:
val errors = List(/* your vals */).filter {
  case Left(_) => true
  case _       => false
}
There are many nice libraries for writing/reading Scala case classes to/from CSV files. I'm looking for something that goes beyond that, which can handle nested cases classes. For example, here a Match has two Players:
case class Player(name: String, ranking: Int)
case class Match(place: String, winner: Player, loser: Player)
val matches = List(
Match("London", Player("Jane",7), Player("Fred",23)),
Match("Rome", Player("Marco",19), Player("Giulia",3)),
Match("Paris", Player("Isabelle",2), Player("Julien",5))
)
I'd like to effortlessly (no boilerplate!) write/read matches to/from this CSV:
place,winner.name,winner.ranking,loser.name,loser.ranking
London,Jane,7,Fred,23
Rome,Marco,19,Giulia,3
Paris,Isabelle,2,Julien,5
Note the automated header line using the dot "." to form the column name for a nested field, e.g. winner.ranking. I'd be delighted if someone could demonstrate a simple way to do this (say, using reflection or Shapeless).
[Motivation. During data analysis it's convenient to have a flat CSV to play around with, for sorting, filtering, etc., even when case classes are nested. And it would be nice if you could load nested case classes back from such files.]
Since a case-class is a Product, getting the values of the various fields is relatively easy. Getting the names of the fields/columns does require using Java reflection.
The following function takes a list of case-class instances and returns a list of rows, each is a list of strings. It is using a recursion to get the values and headers of child case-class instances.
def toCsv[T <: Product](p: List[T])(implicit tag: scala.reflect.ClassTag[T]): List[List[String]] = {
def header(c: Class[_], prefix: String = ""): List[String] = {
c.getDeclaredFields.toList.flatMap { field =>
val name = prefix + field.getName
if (classOf[Product].isAssignableFrom(field.getType)) header(field.getType, name + ".")
else List(name)
}
}
def flatten(p: Product): List[String] =
p.productIterator.flatMap {
case p: Product => flatten(p)
case v: Any => List(v.toString)
}.toList
header(tag.runtimeClass) :: p.map(flatten)
}
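As a rough illustration, the rows can be joined into CSV text like this (a sketch using the matches list from the question; note there is no quoting or escaping of separators):
// Naive rendering: fine for the example data, but real CSV output should
// escape commas and quotes inside values.
val csvText = toCsv(matches).map(_.mkString(",")).mkString("\n")
println(csvText)
// place,winner.name,winner.ranking,loser.name,loser.ranking
// London,Jane,7,Fred,23
// ...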
However, constructing case-classes from CSV is far more involved, requiring to use reflection for getting the types of the various fields, for creating the values from the CSV strings and for constructing the case-class instances.
For simplicity (not saying the code is simple, just so it won't be further complicated), I assume that the order of columns in the CSV is the same as if the file was produced by the toCsv(...) function above.
The following function starts by creating a list of "instructions how to process a single CSV row" (the instructions are also used to verify that the column headers in the CSV match the case-class properties). The instructions are then used to recursively process one CSV row at a time.
def fromCsv[T <: Product](csv: List[List[String]])(implicit tag: ClassTag[T]): List[T] = {
trait Instruction {
val name: String
val header = true
}
case class BeginCaseClassField(name: String, clazz: Class[_]) extends Instruction {
override val header = false
}
case class EndCaseClassField(name: String) extends Instruction {
override val header = false
}
case class IntField(name: String) extends Instruction
case class StringField(name: String) extends Instruction
case class DoubleField(name: String) extends Instruction
def scan(c: Class[_], prefix: String = ""): List[Instruction] = {
c.getDeclaredFields.toList.flatMap { field =>
val name = prefix + field.getName
val fType = field.getType
if (fType == classOf[Int]) List(IntField(name))
else if (fType == classOf[Double]) List(DoubleField(name))
else if (fType == classOf[String]) List(StringField(name))
else if (classOf[Product].isAssignableFrom(fType)) BeginCaseClassField(name, fType) :: scan(fType, name + ".")
else throw new IllegalArgumentException(s"Unsupported field type: $fType")
} :+ EndCaseClassField(prefix)
}
def produce(instructions: List[Instruction], row: List[String], argAccumulator: List[Any]): (List[Instruction], List[String], List[Any]) = instructions match {
case IntField(_) :: tail => produce(tail, row.drop(1), argAccumulator :+ row.head.toString.toInt)
case StringField(_) :: tail => produce(tail, row.drop(1), argAccumulator :+ row.head.toString)
case DoubleField(_) :: tail => produce(tail, row.drop(1), argAccumulator :+ row.head.toString.toDouble)
case BeginCaseClassField(_, clazz) :: tail =>
val (instructionRemaining, rowRemaining, constructorArgs) = produce(tail, row, List.empty)
val newCaseClass = clazz.getConstructors.head.newInstance(constructorArgs.map(_.asInstanceOf[AnyRef]): _*)
produce(instructionRemaining, rowRemaining, argAccumulator :+ newCaseClass)
case EndCaseClassField(_) :: tail => (tail, row, argAccumulator)
case Nil if row.isEmpty => (Nil, Nil, argAccumulator)
case Nil => throw new IllegalArgumentException("Not all values from CSV row were used")
}
val instructions = BeginCaseClassField(".", tag.runtimeClass) :: scan(tag.runtimeClass)
assert(csv.head == instructions.filter(_.header).map(_.name), "CSV header doesn't match target case-class fields")
csv.drop(1).map(row => produce(instructions, row, List.empty)._3.head.asInstanceOf[T])
}
I've tested this using:
case class Player(name: String, ranking: Int, price: Double)
case class Match(place: String, winner: Player, loser: Player)
val matches = List(
Match("London", Player("Jane", 7, 12.5), Player("Fred", 23, 11.1)),
Match("Rome", Player("Marco", 19, 13.54), Player("Giulia", 3, 41.8)),
Match("Paris", Player("Isabelle", 2, 31.7), Player("Julien", 5, 16.8))
)
val csv = toCsv(matches)
val matchesFromCsv = fromCsv[Match](csv)
assert(matches == matchesFromCsv)
Obviously this should be optimized and hardened if you ever want to use this for production...
I have a very simple (n00b) question but I'm somehow stuck. I'm trying to read a set of files in Spark with wholeTextFiles and want to return an RDD[LogEntry], where LogEntry is just a case class. I want to end up with an RDD of valid entries and I need to use a regular expression to extract the parameters for my case class. When an entry is not valid I do not want the extractor logic to fail but simply write an entry in a log. For that I use LazyLogging.
object LogProcessors extends LazyLogging {
def extractLogs(sc: SparkContext, path: String, numPartitions: Int = 5): RDD[Option[CleaningLogEntry]] = {
val pattern = "<some pattern>".r
val logs = sc.wholeTextFiles(path, numPartitions)
val entries = logs.map(fileContent => {
val file = fileContent._1
val content = fileContent._2
content.split("\\r?\\n").map(line => line match {
case pattern(dt, ev, seq) => Some(LogEntry(<...>))
case _ => logger.error(s"Cannot parse $file: $line"); None
})
})
That gives me an RDD[Array[Option[LogEntry]]]. Is there a neat way to end up with an RDD of the LogEntrys? I'm somehow missing it.
I was thinking about using Try instead, but I'm not sure if that's any better.
Thoughts greatly appreciated.
To get rid of the Array - simply replace the map command with flatMap - flatMap will treat a result of type Traversable[T] for each record as separate records of type T.
To get rid of the Option - collect only the successful ones: entries.collect { case Some(entry) => entry }.
Note that this collect(p: PartialFunction) overload (which performs something equivalent to a map and a filter combined) is very different from collect() (which sends all data to the driver).
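To see the difference on a plain Scala collection (just a sketch of the partial-function overload, nothing Spark-specific):
val xs: List[Option[Int]] = List(Some(1), None, Some(3))

// collect with a PartialFunction filters and maps in one step:
val ys: List[Int] = xs.collect { case Some(n) => n } // List(1, 3)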
Altogether, this would be something like:
def extractLogs(sc: SparkContext, path: String, numPartitions: Int = 5): RDD[CleaningLogEntry] = {
val pattern = "<some pattern>".r
val logs = sc.wholeTextFiles(path, numPartitions)
val entries = logs.flatMap(fileContent => {
val file = fileContent._1
val content = fileContent._2
content.split("\\r?\\n").map(line => line match {
case pattern(dt, ev, seq) => Some(LogEntry(<...>))
case _ => logger.error(s"Cannot parse $file: $line"); None
})
})
entries.collect { case Some(entry) => entry }
}
I have a form
case class UserUpdateForm(
id:Long, name: String,
remark: Option[String], location: Option[String])
I define the fields as
"id" -> of[Long],
"remarks" -> optional(text)
the remark field comes back as None, not the Some("") I am expecting.
So, how can I bind an empty string to an optional text field?
case class OptionalText(wrapped: Mapping[String], val constraints: Seq[Constraint[Option[String]]] = Nil) extends Mapping[Option[String]] {
override val format: Option[(String, Seq[Any])] = wrapped.format
val key = wrapped.key
def verifying(addConstraints: Constraint[Option[String]]*): Mapping[Option[String]] = {
this.copy(constraints = constraints ++ addConstraints.toSeq)
}
def bind(data: Map[String, String]): Either[Seq[FormError], Option[String]] = {
data.keys.filter(p => p == key || p.startsWith(key + ".") || p.startsWith(key + "[")).map(k => data.get(k)).collect { case Some(v) => v }.headOption.map { _ =>
wrapped.bind(data).right.map(Some(_))
}.getOrElse {
Right(None)
}.right.flatMap(applyConstraints)
}
def unbind(value: Option[String]): (Map[String, String], Seq[FormError]) = {
val errors = collectErrors(value)
value.map(wrapped.unbind(_)).map(r => r._1 -> (r._2 ++ errors)).getOrElse(Map.empty -> errors)
}
def withPrefix(prefix: String): Mapping[Option[String]] = {
copy(wrapped = wrapped.withPrefix(prefix))
}
val mappings: Seq[Mapping[_]] = wrapped.mappings
}
val textOpt = new OptionalText(text)
Finally, I copied the OptionalMapping class and excluded the empty-filter part.
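A minimal sketch of how the custom mapping could be plugged into the form from the question (the form value name and the imports are assumptions):
import play.api.data._
import play.api.data.Forms._
import play.api.data.format.Formats._

val userUpdateForm = Form(
  mapping(
    "id"       -> of[Long],
    "name"     -> text,
    "remarks"  -> textOpt,          // the OptionalText(text) mapping defined above
    "location" -> optional(text)
  )(UserUpdateForm.apply)(UserUpdateForm.unapply)
)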
I stumbled upon the same thing some months ago. I didn't find any easy way around it, so I decided to play along.
Basically, "remarks" -> optional(text)
will always return None when text is an empty string. So instead of treating an empty string as a sign of no updates, fill the remarks field in the form with the original value and then, after you get it back:
remarks match {
  case None => // set remarks to ""
  case `originalRemark` => // unchanged, do nothing (backticks match against the existing originalRemark value)
  case _ => // set remarks to the new value
}
Hope it helps. It's my first entry here, on Stack Overflow :)
Use
"remarks" -> default(text, "")