SQL DSL for Scala - scala

I am struggling to create a SQL DSL for Scala. The DSL is an extension to Querydsl, which is a popular Query abstraction layer for Java.
I am struggling now with really simple expressions like the following
user.firstName == "Bob" || user.firstName == "Ann"
As Querydsl supports already an expression model which can be used here I decided to provide conversions from Proxy objects to Querydsl expressions. In order to use the proxies I create an instance like this
import com.mysema.query.alias.Alias._
var user = alias(classOf[User])
With the following implicit conversions I can convert proxy instances and proxy property call chains into Querydsl expressions
import com.mysema.query.alias.Alias._
import com.mysema.query.types.expr._
import com.mysema.query.types.path._
object Conversions {
def not(b: EBoolean): EBoolean = b.not()
implicit def booleanPath(b: Boolean): PBoolean = $(b);
implicit def stringPath(s: String): PString = $(s);
implicit def datePath(d: java.sql.Date): PDate[java.sql.Date] = $(d);
implicit def dateTimePath(d: java.util.Date): PDateTime[java.util.Date] = $(d);
implicit def timePath(t: java.sql.Time): PTime[java.sql.Time] = $(t);
implicit def comparablePath(c: Comparable[_]): PComparable[_] = $(c);
implicit def simplePath(s: Object): PSimple[_] = $(s);
}
Now I can construct expressions like this
import com.mysema.query.alias.Alias._
import com.mysema.query.scala.Conversions._
var user = alias(classOf[User])
var predicate = (user.firstName like "Bob") or (user.firstName like "Ann")
I am struggling with the following problem.
eq and ne are already available as methods in Scala, so the conversions aren't triggered when they are used
This problem can be generalized as the following. When using method names that are already available in Scala types such as eq, ne, startsWith etc one needs to use some kind of escaping to trigger the implicit conversions.
I am considering the following
Uppercase
var predicate = (user.firstName LIKE "Bob") OR (user.firstName LIKE "Ann")
This is for example the approach in Circumflex ORM, a very powerful ORM framework for Scala with similar DSL aims. But this approach would be inconsistent with the query keywords (select, from, where etc), which are lowercase in Querydsl.
Some prefix
var predicate = (user.firstName :like "Bob") :or (user.firstName :like "Ann")
The context of the predicate usage is something like this
var user = alias(classOf[User])
query().from(user)
.where(
(user.firstName like "Bob") or (user.firstName like "Ann"))
.orderBy(user.firstName asc)
.list(user);
Do you see better options or a different approach for SQL DSL construction for Scala?
So the question basically boils down to two cases
Is it possible to trigger an implicit type conversion when using a method that exists in the super class (e.g. eq)
If it is not possible, what would be the most Scalaesque syntax to use for methods like eq, ne.
EDIT
We got Scala support in Querydsl working by using alias instances and a $-prefix based escape syntax. Here is a blog post on the results : http://blog.mysema.com/2010/09/querying-with-scala.html

There was a very good talk at Scala Days: Type-safe SQL embedded in Scala by Christoph Wulf.
See the video here: Type-safe SQL embedded in Scala by Christoph Wulf

Mr Westkämper - I was pondering this problem, and I wondered if would be possible to use 'tracer' objects, where the basic data types such as Int and String would be extended such that they contained source information, and the results of combining them would likewise hold within themselves their sources and the nature of the combination.
For example, your user.firstName method would return a TracerString, which extends String, but which also indicates that the String corresponds to a column in a relation. The == method would be overwritten such that it returns an EqualityTracerBoolean which extends Boolean. This would preserve the standard Scala semantics. However, the constructor for EqualityTracerBoolean would record the fact that the result of the expression was derived by comparing a column in a relation to a string constant. Your 'where' method could then analyse the EqualityTracerBoolean object returned by the conditional expression evaluated over a dummy argument in order to derive the expression used to create it.
There would have to be override defs for inequality operators, as well as plus and minus, for Ints, and whatever else you wished to represent from sql, and corresponding tracer classes for each of these. It would be a bit of a project!
Anyway, I decided not to bother, and use squeryl instead.

I didn't have the exact same problem with jOOQ, as I'm using a bit more verbose operator names: equal, notEqual, etc instead of eq, ne. On the other hand, there is a val operator in jOOQ for explicitly creating bind values, which I had to overload with value, as val is a keyword in Scala. Is overloading operators an option for you? I documented my attempts of running jOOQ in Scala here:
http://lukaseder.wordpress.com/2011/12/11/the-ultimate-sql-dsl-jooq-in-scala/
Just like you, I had also thought about capitalising all keywords in a major release (including SELECT, FROM, etc). But that will leave an open question about whether "compound" keywords should be split in two method calls, or connected by an underscore: GROUP().BY() or GROUP_BY(). WHEN().MATCHED().THEN().UPDATE() or WHEN_MATCHED_THEN_UPDATE(). Since the result is not really satisfying, I guess it's not worth to break backwards-compatibility for such a fix, even if the two-method-call option would look very very nice in Scala, as . and () can be omitted. So maybe, jOOQ and QueryDSL should both be "wrapped" (as opposed to "extended") by a dedicated Scala-API?

What about decompiling the bytecode at runtime? I started to write such a tool:
http://h2database.com/html/jaqu.html#natural_syntax
I know it's a hack, so please don't vote -1 :-) I just wanted to mentioned it. It's a relatively novel approach. Instead of decompiling at runtime, it might be possible to do it at compile time using an annotation processor, not sure if that's possible using Scala (and not sure if it's really possible with Java, but Project Lombok seems to do something like that).

Related

How to find max date from stream in scala using CompareTo?

I am new to Scala and trying to explore how I can use Java functionalities with Scala.
I am having stream of LocalDate which is a Java class and I am trying to find maximum date out of my list.
var processedResult : Stream[LocalDate] =List(javaList)
.toStream
.map { s => {
//some processing
LocalDate.parse(str, formatter)
}
}
I know we can do easily by using .compare() and .compareTo() in Java but I am not sure how do I use the same thing over here.
Also, I have no idea how Ordering works in Scala when it comes to sorting.
Can anyone suggest how can get this done?
First of all, a lot of minor details that I will point out since it seems you are pretty new to the language and I expect those to help you with your learning path.
First, avoid var at all costs, especially when learning.
While mutability has its place and is not always wrong, forcing you to avoid it while learning will help you. Particularly, avoid it when it doesn't provide any value; like in this case.
Second, this List(javaList) doesn't do what you think it does. It creates a single element Scala List whose unique element is a Java List. What you probably want is to transform that Java List into a Scala one, for that you can use the CollectionConverters.
import scala.jdk.CollectionConverters._ // This works if you are in 2.13
// if you are in 2.12 or lower use: import scala.collection.JavaConverters._
val scalaList = javaList.asScala.toList
Third, not sure why you want to use a Scala Stream, a Stream is for infinite or very large collections where you want all the transformations to be made lazily and only produce elements as they are consumed (also, btw, it was deprecated in 2.13 in favour of LazyList).
Maybe, you are confused because in Java you need a "Stream" to apply functional operations like map? If so, note that in Scala all collections provide the same rich API.
Fourth, Ordering is a Typeclass which is a functional pattern for Polymorphism. On its own, this is a very broad question so I won't answer it here, but I hope the two links provide insight.
The TL;DR; is simple, it is just that an Ordering for a type T knows how to order (sort) elements of type T. Thus operations like max will work for any collection of any type if, and only if, the compiler can prove the existence of an Ordering for that type if it can then it will pass such value implicitly to the method call for you; again the implicits topic is very broad and deserves its own question.
Now for your particular question, you can just call max or maxOption in the List or Stream and that is all.
Note that max will throw if the List is empty, whereas maxOption returns an Option which will be empty (None) for an empty input; idiomatic Scala favour the latter over the former.
If you really want to use compareTo then you can provide your own Ordering.
scalaList.maxOption(Ordering.fromLessThan[LocalDate]((d1, d2) => d1.compareTo(d2) < 0))
Ordering[A] is a type class which defines how to compare 2 elements of type A. So to compare LocalDates you need Ordering[LocalDate] instance.
LocalDate extends Comparable in Java and Scala conveniently provides instances for Comparables so when you invoke:
Ordering[java.time.LocalDate]
in REPL you'll see that Scala is able to provide you the instance without you needing to do anything (you could take a look at the list of methods provided by this typeclass).
Since you have and Ordering in implicit scope which types matches the Stream's type (e.g. Stream[LocalDate] needs Ordering[LocalDate]) you can call .max method... and that's it.
val processedResult : Stream[LocalDate] = ...
val newestDate: LocalDate = processedResult.max

Possible to make use of Scala's Option flatMap method more concise?

I'm admittedly very new to Scala, and I'm having trouble with the syntactical sugar I see in many Scala examples.
It often results in a very concise statement, but honestly so far (for me) a bit unreadable.
So I wish to take a typical use of the Option class, safe-dereferencing, as a good place to start for understanding, for example, the use of the underscore in a particular example I've seen.
I found a really nice article showing examples of the use of Option to avoid the case of null.
https://medium.com/#sinisalouc/demystifying-the-monad-in-scala-cc716bb6f534#.fhrljf7nl
He describes a use as so:
trait User {
val child: Option[User]
}
By the way, you can also write those functions as in-place lambda
functions instead of defining them a priori. Then the code becomes
this:
val result = UserService.loadUser("mike")
.flatMap(user => user.child)
.flatMap(user => user.child)
That looks great! Maybe not as concise as one can do in groovy, but not bad.
So I thought I'd try to apply it to a case I am trying to solve.
I have a type Person where the existence of a Person is optional, but if we have a person, his attributes are guaranteed. For that reason, there are no use of the Option type within the Person type itself.
The Person has an PID which is of type Id. The Id type consists of two String types; the Id-Type and the Id-Value.
I've used the Scala console to test the following:
class Id(val idCode : String, val idVal : String)
class Person(val pid : Id, val name : String)
val anId: Id = new Id("Passport_number", "12345")
val person: Person = new Person(anId, "Sean")
val operson : Option[Person] = Some(person)
OK. That setup my person and it's optional instance.
I learned from the above linked article that I could get the Persons Id-Val by using flatMap; Like this:
val result = operson.flatMap(person => Some(person.pid)).flatMap(pid => Some(pid.idVal)).getOrElse("NoValue")
Great! That works. And if I infact have no person, my result is "NoValue".
I used flatMap (and not Map) because, unless I misunderstand (and my tests with Map were incorrect) if I use Map I have to provide an alternate or default Person instance. That I didn't want to have to do.
OK, so, flatMap is the way to go.
However, that is really not a very concise statement.
If I were writing that in more of a groovy style, I guess i'd be able to do something like this:
val result = person?.pid.idVal
Wow, that's a bit nicer!
Surely Scala has a means to provide something at least nearly as nice as Groovy?
In the above linked example, he was able to make his statement more concise using some of that syntactical sugar I mentioned before. The underscore:
or even more concise:
val result = UserService.loadUser("mike")
.flatMap(_.child)
.flatMap(_.child)
So, it seems in this case the underscore character allows you to skip specifying the type (as the type is inferred) and replace it with underscore.
However, when I try the same thing with my example:
val result = operson.flatMap(Some(_.pid)).flatMap(Some(_.idVal)).getOrElse("NoValue")
Scala complains.
<console>:15: error: missing parameter type for expanded function ((x$2) => x$2.idVal)
val result = operson.flatMap(Some(_.pid)).flatMap(Some(_.idVal)).getOrElse("NoValue")
Can someone help me along here?
How am I misunderstanding this?
Is there a short-hand method of writing my above lengthy statement?
Is flatMap the best way to achieve what I am after? Or is there a better more concise and/or readable way to do it ?
thanks in advance!
Why do you insist on using flatMap? I'd just use map for your example instead:
val result = operson.map(_.pid).map(_.idVal).getOrElse("NoValue")
or even shorter:
val result = operson.map(_.pid.idVal).getOrElse("NoValue")
You should only use flatMap with functions that return Options. Your pid and idVals are not Options, so just map them instead.
You said
I have a type Person where the existence of a Person is optional, but if we have a person, his attributes are guaranteed. For that reason, there are no use of the Option type within the Person type itself.
This is the essential difference between your example and the User example. In the User example, both the existence of a User instance, and its child field are options. This is why, to get a child, you need to flatMap. However, since in your example, only the existence of a Person is not guaranteed, after you've retrieved an Option[Person], you can safely map to any of its fields.
Think of flatMap as a map, followed by a flatten (hence its name). If I mapped on child:
val ouser = Some(new User())
val child: Option[Option[User]] = ouser.map(_.child)
I would end up with an Option[Option[User]]. I need to flatten that to a single Option level, that's why I use flatMap in the first place.
If you looking for the most concise solution, consider this:
val result = operson.fold("NoValue")(_.pid.idVal)
Though one could find it not clear or confusing

Anorm: WHERE condition, conditionally

Consider a repository/DAO method like this, which works great:
def countReports(customerId: Long, createdSince: ZonedDateTime) =
DB.withConnection {
implicit c =>
SQL"""SELECT COUNT(*)
FROM report
WHERE customer_id = $customerId
AND created >= $createdSince
""".as(scalar[Int].single)
}
But what if the method is defined with optional parameters:
def countReports(customerId: Option[Long], createdSince: Option[ZonedDateTime])
Point being, if either optional argument is present, use it in filtering the results (as shown above), and otherwise (in case it is None) simply leave out the corresponding WHERE condition.
What's the simplest way to write this method with optional WHERE conditions? As Anorm newbie I was struggling to find an example of this, but I suppose there must be some sensible way to do it (that is, without duplicating the SQL for each combination of present/missing arguments).
Note that the java.time.ZonedDateTime instance maps perfectly and automatically into Postgres timestamptz when used inside the Anorm SQL call. (Trying to extract the WHERE condition as a string, outside SQL, created with normal string interpolation did not work; toString produces a representation not understood by the database.)
Play 2.4.4
One approach is to set up filter clauses such as
val customerClause =
if (customerId.isEmpty) ""
else " and customer_id={customerId}"
then substitute these into you SQL:
SQL(s"""
select count(*)
from report
where true
$customerClause
$createdClause
""")
.on('customerId -> customerId,
'createdSince -> createdSince)
.as(scalar[Int].singleOpt).getOrElse(0)
Using {variable} as opposed to $variable is I think preferable as it reduces the risk of SQL injection attacks where someone potentially calls your method with a malicious string. Anorm doesn't mind if you have additional symbols that aren't referenced in the SQL (i.e. if a clause string is empty). Lastly, depending on the database(?), a count might return no rows, so I use singleOpt rather than single.
I'm curious as to what other answers you receive.
Edit: Anorm interpolation (i.e. SQL"...", an interpolation implementation beyond Scala's s"...", f"..." and raw"...") was introduced to allow the use $variable as equivalent to {variable} with .on. And from Play 2.4, Scala and Anorm interpolation can be mixed using $ for Anorm (SQL parameter/variable) and #$ for Scala (plain string). And indeed this works well, as long as the Scala interpolated string does not contains references to an SQL parameter. The only way, in 2.4.4, I could find to use a variable in an Scala interpolated string when using Anorm interpolation, was:
val limitClause = if (nameFilter="") "" else s"where name>'$nameFilter'"
SQL"select * from tab #$limitClause order by name"
But this is vulnerable to SQL injection (e.g. a string like it's will cause a runtime syntax exception). So, in the case of variables inside interpolated strings, it seems it is necessary to use the "traditional" .on approach with only Scala interpolation:
val limitClause = if (nameFilter="") "" else "where name>{nameFilter}"
SQL(s"select * from tab $limitClause order by name").on('limitClause -> limitClause)
Perhaps in the future Anorm interpolation could be extended to parse the interpolated string for variables?
Edit2: I'm finding there are some tables where the number of attributes that might or might not be included in the query changes from time to time. For these cases I'm defining a context class, e.g. CustomerContext. In this case class there are lazy vals for the different clauses that affect the sql. Callers of the sql method must supply a CustomerContext, and the sql will then have inclusions such as ${context.createdClause} and so on. This helps give a consistency, as I end up using the context in other places (such as total record count for paging, etc.).
Finally got this simpler approach posted by Joel Arnold to work in my example case, also with ZonedDateTime!
def countReports(customerId: Option[Long], createdSince: Option[ZonedDateTime]) =
DB.withConnection {
implicit c =>
SQL( """
SELECT count(*) FROM report
WHERE ({customerId} is null or customer_id = {customerId})
AND ({created}::timestamptz is null or created >= {created})
""")
.on('customerId -> customerId, 'created -> createdSince)
.as(scalar[Int].singleOpt).getOrElse(0)
}
The tricky part is having to use {created}::timestamptz in the null check. As Joel commented, this is needed to work around a PostgreSQL driver issue.
Apparently the cast is needed only for timestamp types, and the simpler way ({customerId} is null) works with everything else. Also, comment if you know whether other databases require something like this, or if this is a Postgres-only peculiarity.
(While wwkudu's approach also works fine, this definitely is cleaner, as you can see comparing them side to side in a full example.)

Scala Case Class Map Expansion

In groovy one can do:
class Foo {
Integer a,b
}
Map map = [a:1,b:2]
def foo = new Foo(map) // map expanded, object created
I understand that Scala is not in any sense of the word, Groovy, but am wondering if map expansion in this context is supported
Simplistically, I tried and failed with:
case class Foo(a:Int, b:Int)
val map = Map("a"-> 1, "b"-> 2)
Foo(map: _*) // no dice, always applied to first property
A related thread that shows possible solutions to the problem.
Now, from what I've been able to dig up, as of Scala 2.9.1 at least, reflection in regard to case classes is basically a no-op. The net effect then appears to be that one is forced into some form of manual object creation, which, given the power of Scala, is somewhat ironic.
I should mention that the use case involves the servlet request parameters map. Specifically, using Lift, Play, Spray, Scalatra, etc., I would like to take the sanitized params map (filtered via routing layer) and bind it to a target case class instance without needing to manually create the object, nor specify its types. This would require "reliable" reflection and implicits like "str2Date" to handle type conversion errors.
Perhaps in 2.10 with the new reflection library, implementing the above will be cake. Only 2 months into Scala, so just scratching the surface; I do not see any straightforward way to pull this off right now (for seasoned Scala developers, maybe doable)
Well, the good news is that Scala's Product interface, implemented by all case classes, actually doesn't make this very hard to do. I'm the author of a Scala serialization library called Salat that supplies some utilities for using pickled Scala signatures to get typed field information
https://github.com/novus/salat - check out some of the utilities in the salat-util package.
Actually, I think this is something that Salat should do - what a good idea.
Re: D.C. Sobral's point about the impossibility of verifying params at compile time - point taken, but in practice this should work at runtime just like deserializing anything else with no guarantees about structure, like JSON or a Mongo DBObject. Also, Salat has utilities to leverage default args where supplied.
This is not possible, because it is impossible to verify at compile time that all parameters were passed in that map.

workaround for final == and != (equals and not equals) methods in scala DSL

So I'm wrapping bits of the Mechanical Turk API, and you need to specify qualification requirements such as:
Worker_Locale == "US"
Worker_PercentAssignmentsApproved > 95
...
In my code, I'd like to allow the syntax above and have these translated into something like:
QualificationRequirement("00000000000000000071", "LocaleValue.Country", "EqualTo", "US")
QualificationRequirement("000000000000000000L0", "IntegerValue", "GreaterThan", 95)
I can achieve most of what I want by declaring an object like:
object Worker_PercentAssignmentsApproved {
def >(x: Int) = {
QualificationRequirement("000000000000000000L0", "IntegerValue", "GreaterThan", x)
}
}
But I can't do the same thing for the "==" (equals) or "!=" (not equals) methods since they're declared final in AnyRef. Is there a standard workaround for this? Perhaps I should just use "===" and "!==" instead?
(I guess one good answer might be a summary of how a few different scala DSLs have chosen to work around this issue and then I can just do whatever the majority of those do.)
Edit: Note that I'm not trying to actually perform an equality comparison. Instead, I'm trying to observe the comparison operator the user indicated in scala code, save an object based description of that comparison, and give that description to the server. Specifically, the following scala code:
Worker_Locale == "US"
will result in the following parameters being added to my request:
&QualificationRequirement.1.QualificationTypeId=000000000000000000L0
&QualificationRequirement.1.Comparator=EqualTo
&QualificationRequirement.1.LocaleValue.Country=US
So I can't override equals since it returns a Boolean, and I need to return a structure that represents all these parameters.
If you look at the definition of == and != in the scala reference, (§ 12.1), you’ll find that they are defined in terms of eq and equals.
eq is the reference equality and is also final (it is only used to check for null in that case) but you should be able to override equals.
Note that you’ll probably also need to write the hashCode method to ensure
∀ o1, o2 with o1.equals(o2) ⇒ (o1.hashCode.equals(o2.hashCode)).
However, if you need some other return type for your DSL than Boolean or more flexibility in general, you should maybe use ===, as has been done in Squeryl for example.
Here's a little survey of what various DSLs use for this kind of thing.
Liftweb uses === in Javascript expressions:
JsIf(ValById("username") === value.toLowerCase, ...)
Squeryl uses === for SQL expressions:
authors.where(a=> a.lastName === "Pouchkine")
querydsl uses $eq for SQL expressions:
person.firstName $eq "Ben"
Prolog-in-Scala uses === for Prolog expressions:
'Z === 'A
Scalatest uses === to get an Option instead of a Boolean:
assert("hello" === "world")
So I think the consensus is mostly to use ===.
I've been considering a similar problem. I was thinking of creating a DSL for writing domain-specific formulas. The trouble is that users might want to do string manipulation too and you end up with expression like
"some string" + <someDslConstruct>
No matter what you do its going to lex this as a something like
stringToLiteralString("some string" + <someDslConstruct>)
I think the only potential way out of this pit would be to try using macros. In your example perhaps you could have a macro that wraps a scala expression and converts the raw AST into a query? Doing this for arbitrary expressions wouldn't be feasible but if your domain is sufficiently well constrained it might be a workable alternative solution.