Scala: Dynamic selection criteria as a method argument

My intention is to create a function in Scala which accepts some kind of a dynamic query (similar to case expressions in pattern matching) and returns matching instances inside a List. Suppose that the list consists of instances of the case class Person, which has a few properties with different types. The function should be able to accept a dynamic combination of values for some of the fields, and return the matching Persons. I am specifically looking for a clean solution. One possible way to use such a function would be to pass an object with an anonymous type as the query (the "pattern"):
def find(?): List[Person] = { ? }
val matches = find(new { val name = "Name"; val gender = Gender.MALE })
Ideally, I would like to develop a clean way of passing conditions, instead of concrete values, but it's not essential. Since I am learning Scala, I am not aware of all the techniques to implement such a thing. In C#, I used an anonymous type (similar to the second line of code above) and dynamic parameters to achieve something similar. What is a clean and elegant solution in Scala?

I'm not sure if this is what you are looking for but let's try it this way:
First, we define Person as case class Person(name: String, gender: Gender.Value) where Gender is an already defined enum.
Then we create a Query case class which has the same fields, but as options which default to None, and a method for comparing the query to a person:
case class Query(name: Option[String] = None,
                 gender: Option[Gender.Value] = None) {
  def ===(person: Person) = check(person.name, name) &&
    check(person.gender, gender)
  private def check[T](field: T, q: Option[T]) = field == q.getOrElse(field)
}
Unfortunately, in this solution === has to call check separately for each field. Let's leave it like that for now. Maybe it is sufficient (because, for example, the list of fields will not change).
Note that check returns true if the query's option is None, so you don't have to pass all fields of the query:
val q = Query(name = Some("Ann")) // the gender is not important
q === Person("Ann", Gender.FEMALE) // returns true
And finally the find method:
def find(people: List[Person], query: Query) = people.filter(query === _)
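A quick usage sketch (the sample data is mine, not from the question):
val people = List(Person("Ann", Gender.FEMALE), Person("Bob", Gender.MALE))
find(people, Query(gender = Some(Gender.MALE))) // List(Person("Bob", Gender.MALE))
find(people, Query())                           // an empty query matches everyone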


How to deep compare each element with other elements in a list in scala (optional fields)

I have a case class that represents a report, and a report has expenses.
case class FakeExpense(amount: Option[Double], country: Option[String], currency: Option[String])
case class FakeReport(id: Int, expenses: List[FakeExpense])
and I want to return true/false depending on whether the report is valid or not. It is not valid if there are 2 expenses with the exact same fields... what would be the right way to do something like this in Scala?
a valid report:
val report = FakeReport(1, List(FakeExpense(Some(150), Some("US"), Some("USD")),FakeExpense(Some(85), Some("DE"), Some("EUR"))))
an invalid report:
val report = FakeReport(2, List(FakeExpense(Some(150), Some("US"), Some("USD")),FakeExpense(Some(150), Some("US"), Some("USD"))))
Thanks!
Consider a List.distinct approach, like so:
def isValidReport(report: FakeReport): Boolean =
  report.expenses.distinct.length == report.expenses.length
The above solution makes three passes through the list. Applying Luis' comment, we can reduce it to two passes, like so:
def isValidReport(report: FakeReport): Boolean = {
  report.expenses.sizeIs == report.expenses.iterator.distinct.length
}
We shave off one pass because distinct.length collapses into a single traversal via iterator.distinct.length. sizeIs is a potential extra saving, since we might not have to go through the whole list.
Here is a single-pass, tail-recursive solution based on HashSet, which has effectively constant-time (O(eC)) lookup and insertion:
import scala.annotation.tailrec
import scala.collection.mutable

def isValidReport(report: FakeReport): Boolean = {
  val set = mutable.HashSet[FakeExpense]()
  @tailrec def loop(l: List[FakeExpense]): Boolean = l match {
    case Nil => true
    case h :: t => if (set.add(h)) loop(t) else false
  }
  loop(report.expenses)
}
Also note how we return early on the first duplicate.
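Applied to the two reports from the question, a quick sketch of the expected results:
val valid = FakeReport(1, List(FakeExpense(Some(150), Some("US"), Some("USD")),
                               FakeExpense(Some(85), Some("DE"), Some("EUR"))))
val invalid = FakeReport(2, List(FakeExpense(Some(150), Some("US"), Some("USD")),
                                 FakeExpense(Some(150), Some("US"), Some("USD"))))
isValidReport(valid)   // true
isValidReport(invalid) // false, the two expenses are equal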
If exact equality of expenses is enough (i.e. you don't need epsilon-based Double comparison, |a - b| < epsilon for a small enough epsilon), you can just turn the list into a set, since that removes duplicate entries, and then check whether the size changed.
fakeReport.expenses.toSet.size == fakeReport.expenses.length
This works as long as everything inside the classes you compare with == correctly implements equals (primitive types, most standard library classes, case classes and case objects do). What does not implement it correctly are arrays, possibly certain Java classes, and such classes wrapped into AnyVal classes.
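For example, a case class wrapping an Array will not deduplicate as you might expect, because Array uses reference equality (a small sketch; Wrapper is a made-up name):
case class Wrapper(xs: Array[Int])
List(Wrapper(Array(1)), Wrapper(Array(1))).distinct.length // 2, not 1
List(Wrapper(Array(1)), Wrapper(Array(1))).toSet.size      // 2 as well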

Grouping by generic parameter in Slick

I am trying to implement generic grouping using Slick 3.2.3. By generic grouping I mean grouping the same query by different parameters or sets thereof.
Supposing I have a table:
class MyTable(tag: Tag) extends Table[MyEntry](tag, "my_table") {
  def text1 = column[String]("text1")
  def text2 = column[Option[String]]("text2")
  def list = column[List[String]]("list") // I am using postgres+slick_pg
  ...
}
Then I have a complex query with several joins, and I would like to be able to group it by text1, (text1, text2), list etc. One way to do it would be to define a generic function which performs the grouping using an extractor parameter:
private def getData[T](extractor: MyTable => T) = {
  // supposing MyTable comes second in the list
  // of joined tables in my complex query
  val groupedQuery = myComplexQuery.groupBy(x => extractor(x._2))
  ...
  // here go aggregation functions, mapping etc.
}
where one of the extractor implementations might be defined as
val extractor: MyTable => (Rep[String], Rep[Option[String]]) = me => me.text1 -> me.text2
However, since extractor is generic, groupBy cannot find a matching Shape for the type T, which means I will have to provide it as well. My question is how exactly to define such Shapes. The documentation for the slick.lifted package lacks examples, and it is not exactly obvious what the generic types K, T, G and P mean in the Query#groupBy definition (or FlatShapeLevel, for that matter). I would appreciate it if somebody provided examples of such extractor functions, at least for a primitive type (String) and a Tuple2 (say, (String, Option[String])). Or perhaps there is a better way to achieve the same result which I have overlooked? Thanks.
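The question is left open here, but for reference, a minimal sketch of the usual direction (my own assumption, not a verified answer): make the method generic in the key type and require the Shape that groupBy needs as an implicit parameter, so each call site brings its own Shape along.
// Sketch only: K is the lifted key type the extractor returns,
// T its unpacked type and G its packed type; Slick infers these at the call site.
private def getData[K, T, G](extractor: MyTable => K)
    (implicit kshape: Shape[_ <: FlatShapeLevel, K, T, G]) = {
  val groupedQuery = myComplexQuery.groupBy(x => extractor(x._2))
  // ... aggregation functions, mapping etc.
}

// Hypothetical call sites:
// getData(_.text1)                     // group by a single column
// getData(me => (me.text1, me.text2))  // group by a tuple of columns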

Understanding Some and Option in Scala

I am reading this example from their docs:
class Email(val username: String, val domainName: String)
object Email {
  def fromString(emailString: String): Option[Email] = {
    emailString.split('@') match {
      case Array(a, b) => Some(new Email(a, b))
      case _ => None
    }
  }
}
println(Email.fromString("scala.center@epfl.ch"))
val scalaCenterEmail = Email.fromString("scala.center@epfl.ch")
scalaCenterEmail match {
  case Some(email) => println(
    s"""Registered an email
       |Username: ${email.username}
       |Domain name: ${email.domainName}
    """)
  case None => println("Error: could not parse email")
}
My questions:
What is Some and Option?
What is a factory method (just some function that creates a new object and returns it?)
What is the point of companion objects? Is it just to contain functions that are available to all instances of class? Are they like class methods in Ruby?
What is Some and Option?
Option is a data structure that represents optionality, as the name suggests. Whenever a computation may not return a value, you can return an Option. Option has two cases (represented as two subclasses): Some or None.
In the example above, the method Email.fromString can fail and not return a value. This is represented with Option. In order to know whether the computation yielded a value or not, you can use match and check whether it was a Some or a None:
Email.fromString("scala.center@epfl.ch") match {
  case Some(email) => // do something if it's a Some
  case None => // do something if it's a None
}
This is much better than returning null because now whoever calls the method can't possibly forget to check the return value.
For example compare this:
def willReturnNull(s: String): String = null
willReturnNull("foo").length() // NullPointerException!
with this
def willReturnNone(s: String): Option[String] = None
willReturnNone("foo").length() // doesn't compile, because 'length' is not a member of `Option`
Also, note that using match is just a way of working with Option. Further discussion would involve using map, flatMap, getOrElse or similar methods defined on Option, but I feel it would be off-topic here.
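A quick sketch of what those look like (reusing willReturnNone and Email from above):
willReturnNone("foo").map(_.length)              // None: the function is only applied if a value is present
willReturnNone("foo").map(_.length).getOrElse(0) // 0: fall back to a default
Email.fromString("scala.center@epfl.ch").map(_.domainName) // Some("epfl.ch")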
What is a factory method (just some function that creates a new object and returns it?)
This is nothing specific to Scala. A "factory method" is usually a static method that constructs a value of some type, possibly hiding the details of the type itself. In this case fromString is a factory method, because it allows you to create an Email without calling the Email constructor with new Email(...).
What is the point of companion objects? Is it just to contain functions that are available to all instances of class? Are they like class methods in Ruby?
As a first approximation, yes. Scala doesn't have static members of a class. Instead, you can have an object associated with that class where you define everything that is static.
E.g. in Java you would have:
public class Email {
  public String username;
  public String domain;

  public static Optional<Email> fromString(String s) {
    // ...
  }
}
Whereas in Scala you would define the same class as roughly:
class Email(val username: String, val domain: String)
object Email {
  def fromString(s: String): Option[Email] = {
    // ...
  }
}
I would like to add some examples/information to the third question.
If you use Akka, the companion object is where you can put every message the actor handles in its receive method. Moreover, you can add vals for actor names or other constant values.
If you work with JSON, you usually create a Format for it (sometimes custom Reads and Writes). That format goes inside the companion object, as do methods that create instances.
If you go deeper into Scala you will find case classes. The reason you can create an object of such a class without new is that there is an apply method in the "default" companion object.
But in general, it's a place where you can put every "static" method etc.
As for Option, it gives you a way to avoid exceptions and still do something when you don't have a value.
Gabriele gave an example with email, so I'll add another one.
Say you have a method that sends an email, but you take the address from a User class. The user may have left this field empty, so if we have something like
val maybeEmail: Option[String] = user.email
you can use, for example, map to send the email:
maybeEmail.map(email => sendEmail(email))
So if you use it, when writing methods like the one above you don't need to think about whether the user specified an email or not :)
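(When the result of sendEmail is not needed, maybeEmail.foreach(sendEmail) is a common alternative that makes the pure side effect explicit.)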

Pattern for generating negative Scalacheck scenarios: Using property based testing to test validation logic in Scala

We are looking for a viable design pattern for building Scalacheck Gen (generators) that can produce both positive and negative test scenarios. This will allow us to run forAll tests to validate functionality (positive cases), and also verify that our case class validation works correctly by failing on all invalid combinations of data.
Making a simple, parameterized Gen that does this on a one-off basis is pretty easy. For example:
def idGen(valid: Boolean = true): Gen[String] = Gen.oneOf(ID.values.toList).map(s => if (valid) s else Gen.oneOf(simpleRandomCode(4), "").sample.get)
With the above, I can get a valid or invalid ID for testing purposes. The valid one, I use to make sure business logic succeeds. The invalid one, I use to make sure our validation logic rejects the case class.
Ok, so -- problem is, on a large scale, this becomes very unwieldy. Let's say I have a data container with, oh, 100 different elements. Generating a "good" one is easy. But now, I want to generate a "bad" one, and furthermore:
I want to generate a bad one for each data element, where a single data element is bad (so at minimum, at least 100 bad instances, testing that each invalid parameter is caught by validation logic).
I want to be able to override specific elements, for instance feeding in a bad ID or a bad "foobar." Whatever that is.
One pattern we can look to for inspiration is apply and copy, which allows us to easily compose new objects while specifying overridden values. For example:
val f = Foo("a", "b") // f: Foo = Foo(a,b)
val t = Foo.unapply(f) // t: Option[(String, String)] = Some((a,b))
Foo(t.get._1, "c") // res0: Foo = Foo(a,c)
Above we see the basic idea of creating a mutating object from the template of another object. This is more easily expressed in Scala as:
val f = someFoo copy(b = "c")
Using this as inspiration we can think about our objectives. A few things to think about:
First, we could define a map or a container of key/values for the data element and generated value. This could be used in place of a tuple to support named value mutation.
Given a container of key/value pairs, we could easily select one (or more) pairs at random and change a value. This supports the objective of generating a data set where one value is altered to create failure.
Given such a container, we can easily create a new object from the invalid collection of values (using either apply() or some other technique).
Alternatively, perhaps we can develop a pattern that uses a tuple and then just apply() it, kind of like the copy method, as long as we can still randomly alter one or more values.
We can probably explore developing a reusable pattern that does something like this:
def thingGen(invalidValueCount: Int): Gen[Thing] = ???
def someTest = forAll(thingGen) { v => val invalidV = v.invalidate(1); validate(invalidV) must beFalse }
In the above code, we have a generator thingGen that returns (valid) Things. Then for all instances returned, we invoke a generic method invalidate(count: Int) which will randomly invalidate count values, returning an invalid object. We can then use that to ascertain whether our validation logic works correctly.
This would require defining an invalidate() function that, given a parameter (either by name or by position), can replace the identified parameter with a value that is known to be bad. This implies having an "anti-generator" for specific values: for instance, if an ID must be 3 characters, it knows to create a string that is anything but 3 characters long.
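A minimal sketch of such an "anti-generator" for the 3-character ID rule (badIdGen is a made-up name):
import org.scalacheck.Gen

// any alphanumeric string whose length is anything but 3
val badIdGen: Gen[String] = Gen.alphaNumStr.suchThat(_.length != 3)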
Of course to invalidate a known, single parameter (to inject bad data into a test condition) we can simply use the copy method:
def thingGen(invalidValueCount: Int): Gen[Thing] = ???
def someTest = forAll(thingGen) { v => val v2 = v.copy(id = "xxx"); validate(v2) must beFalse }
That is the sum of my thinking to date. Am I barking up the wrong tree? Are there good patterns out there that handle this kind of testing? Any commentary or suggestions on how best to approach this problem of testing our validation logic?
We can combine a valid instance and a set of invalid fields (such that every field, if copied in, would cause a validation failure) to get invalid objects, using the shapeless library.
Shapeless allows you to represent your class as a list of key-value pairs that are still strongly typed and support some high-level operations, and to convert back from this representation to your original class.
In the example below I'll provide an invalid instance for each single field provided.
import shapeless._, record._
import shapeless.labelled.FieldType
import shapeless.ops.record.Updater
A detailed intro
Let's pretend we have a data class, and a valid instance of it (we only need one, so it can be hardcoded)
case class User(id: String, name: String, about: String, age: Int) {
  def isValid = id.length == 3 && name.nonEmpty && age >= 0
}
val someValidUser = User("oo7", "Frank", "A good guy", 42)
assert(someValidUser.isValid)
We can then define a class to be used for invalid values:
case class BogusUserFields(name: String, id: String, age: Int)
val bogusData = BogusUserFields("", "1234", -5)
Instances of such classes can be provided using ScalaCheck. It's much easier to write a generator where all fields would cause a failure. The order of the fields doesn't matter, but their names and types do. Here we excluded about from the User fields, so we can do what you asked for (feeding in only the subset of fields you want to test).
We then use LabelledGeneric[T] to convert User and BogusUserFields to their corresponding record value (and later we will convert User back)
val userLG = LabelledGeneric[User]
val bogusLG = LabelledGeneric[BogusUserFields]
val validUserRecord = userLG.to(someValidUser)
val bogusRecord = bogusLG.to(bogusData)
Records are lists of key-value pairs, so we can use head to get a single mapping, and the + operator supports adding or replacing a field in another record. Let's pick every invalid field into our user, one at a time. Also, here's the conversion back in action:
val invalidUser1 = userLG.from(validUserRecord + bogusRecord.head)           // invalid name
val invalidUser2 = userLG.from(validUserRecord + bogusRecord.tail.head)      // invalid ID
val invalidUser3 = userLG.from(validUserRecord + bogusRecord.tail.tail.head) // invalid age
assert(List(invalidUser1, invalidUser2, invalidUser3).forall(!_.isValid))
Since we are basically applying the same function (validUserRecord + _) to every key-value pair in our bogusRecord, we can also use the map operator, except that we use it with an unusual, polymorphic, function. We can then easily convert the result to a List, because every element is of the same type now.
object polymerge extends Poly1 {
  implicit def caseField[K, V](implicit upd: Updater[userLG.Repr, FieldType[K, V]]) =
    at[FieldType[K, V]](upd(validUserRecord, _))
}
val allInvalidUsers = bogusRecord.map(polymerge).toList.map(userLG.from)
assert(allInvalidUsers == List(invalidUser1, invalidUser2, invalidUser3))
Generalizing and removing all the boilerplate
Now the whole point of this was that we can generalize it to work for any two arbitrary classes. The encoding of all relationships and operations is a bit cumbersome and it took me a while to get it right with all the implicit not found errors, so I'll skip the details.
class Picks[A, AR <: HList](defaults: A)(implicit lgA: LabelledGeneric.Aux[A, AR]) {
  private val defaultsRec = lgA.to(defaults)

  object mergeIntoTemplate extends Poly1 {
    implicit def caseField[K, V](implicit upd: Updater[AR, FieldType[K, V]]) =
      at[FieldType[K, V]](upd(defaultsRec, _))
  }

  def from[B, BR <: HList, MR <: HList, F <: Poly](options: B)
    (implicit
     optionsLG: LabelledGeneric.Aux[B, BR],
     mapper: ops.hlist.Mapper.Aux[mergeIntoTemplate.type, BR, MR],
     toList: ops.hlist.ToTraversable.Aux[MR, List, AR]
    ) = {
    optionsLG.to(options).map(mergeIntoTemplate).toList.map(lgA.from)
  }
}
So, here it is in action:
val cp = new Picks(someValidUser)
assert(cp.from(bogusData) == allInvalidUsers)
Unfortunately, you cannot write new Picks(someValidUser).from(bogusData), because the implicit for mapper requires a stable identifier. On the other hand, the cp instance can be reused with other types:
case class BogusName(name: String)
assert(cp.from(BogusName("")).head == someValidUser.copy(name = ""))
And now it works for all types! The bogus data can be any subset of the class fields, so it will even work for the class itself:
case class Address(country: String, city: String, line_1: String, line_2: String) {
  def isValid = Seq(country, city, line_1, line_2).forall(_.nonEmpty)
}
val acp = new Picks(Address("Test country", "Test city", "Test line 1", "Test line 2"))
val invalidAddresses = acp.from(Address("", "", "", ""))
assert(invalidAddresses.forall(!_.isValid))
You can see the code running at ScalaFiddle

Scala "update" immutable object best practices

With a mutable object I can write something like
var user = DAO.getUser(id)
user.name = "John"
user.email = "john@doe.com"
// logic on user
If user is immutable then I need to clone/copy it on every change operation.
I know a few ways to perform this
case class copy
method (like changeName) that creates a new object with the new property
What is the best practice?
And one more question: is there any existing technique to get the "changes" relative to the original object (for example, to generate an update statement)?
Both ways you've mentioned belong to the functional and OO paradigms respectively. If you prefer functional decomposition with abstract data types, which in Scala are represented by case classes, then choose the copy method. Using mutators is not a good practice in my opinion, because it pulls you back to the Java/C#/C++ way of life.
On the other hand, making an ADT case class like
case class Person(name: String, age: String)
is more concise than:
class Person(_name: String, _age: String) {
  var name = _name
  var age = _age

  def changeName(newName: String): Unit = { name = newName }
  // ... and so on
}
(not the best imperative code, can be shorter, but clear).
Of course there is another way, without mutators: just return a new object on each call:
class Person(val name: String,
             val age: String) {
  def changeName(newName: String): Person = new Person(newName, age)
  // ... and so on
}
But still, the case class way is more concise.
And if you go further, to concurrent/parallel programming, you'll see that the functional concept of immutable values is much better than trying to guess what state your object is currently in.
Update
Thanks to senia for the reminder; I forgot to mention two things.
Lenses
At the most basic level, lenses are a sort of getters and setters for immutable data and look like this:
case class Lens[A, B](get: A => B, set: (A, B) => A) {
  def apply(a: A) = get(a)
  // ...
}
That is it. A lens is an object that contains two functions: get and set. get takes an A and returns a B. set takes an A and a B and returns a new A. It's easy to see that the type B is a value contained in A. When we pass an instance to get, we get back that value. When we pass an A and a B to set, we update the value B in A and return a new A reflecting the change. For convenience, get is aliased to apply. There is a good intro to the Scalaz Lens case class.
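A tiny usage sketch with the case class Person(name, age) defined earlier in this answer (nameLens is just an illustrative name):
val nameLens = Lens[Person, String](_.name, (p, n) => p.copy(name = n))
val p = Person("Ann", "42")
nameLens(p)             // "Ann"
nameLens.set(p, "John") // Person("John", "42"); the original p is untouched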
Records
This one, of course, comes from the shapeless library and is called Records: an implementation of extensible records modelled as HLists of associations. Keys are encoded using singleton types and fully determine the types of their corresponding values (example from GitHub):
object author extends Field[String]
object title extends Field[String]
object price extends Field[Double]
object inPrint extends Field[Boolean]
val book =
(author -> "Benjamin Pierce") ::
(title -> "Types and Programming Languages") ::
(price -> 44.11) ::
HNil
// Read price field
val currentPrice = book.get(price) // Inferred type is Double
currentPrice == 44.11
// Update price field, relying on static type of currentPrice
val updated = book + (price -> (currentPrice+2.0))
// Add a new field
val extended = updated + (inPrint -> true)