Replacement for deprecated PostgresDataType.JSON? - postgresql

I'm using JOOQ with PostgreSQL, and trying to implement a query like this:
INSERT INTO dest_table (id,name,custom_data)
SELECT key as id,
nameproperty as name,
CONCAT('{"propertyA": "',property_a,'", "propertyB": "',property_b,'","propertyC": "',property_c,'"}')::json as custom_data
FROM source_table
The concatenation/JSON bit is what I'm here to ask about. I actually have managed to get it working, but only by using this (Kotlin):
val concatBits = mutableListOf<Field<Any>>()
... build up various bits of the concatenation ...
val concatField = concat(*(concatBits.toTypedArray())).cast(PostgresDataType.JSON)
It concerns me that PostgresDataType is deprecated. The documentation says I should use SQLDataType instead, but it has no JSON value.
What's the recommended way to do this?
EDIT: a bit more information ...
I'm building the query like this:
val innerSelectFields = listOf(
    field("key").`as`(DEST_TABLE.ID),
    field("nameproperty").`as`(DEST_TABLE.NAME),
    concatField.`as`(DEST_TABLE.CUSTOM_DATA)
)
val innerSelect = dslContext
    .select(innerSelectFields)
    .from(table("source_table"))
val insertInto = dslContext
    .insertInto(DEST_TABLE)
    .select(innerSelect)
The initial query I posted is slightly misleading, as the resulting SQL from this code doesn't have the
(id,name,custom_data) part.
Also, in case it matters, "source_table" is a temporary table, created during runtime, so there are no autogenerated classes for it.

jOOQ currently doesn't support the JSON data type out of the box. The main reason is that it is unclear what Java type to bind a JSON data structure to, as the JDK doesn't have such a standard type, and jOOQ will not prefer one third-party library over another.
The currently recommended approach is to create your own custom data type binding for your preferred third party JSON library:
https://www.jooq.org/doc/latest/manual/code-generation/custom-data-type-bindings
In that case, you will no longer need to explicitly cast your bind variable to some JSON type, because your binding will take care of that transparently.
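For reference, here is roughly what such a binding looks like, condensed from the manual's PostgresJSONGsonBinding example. It is sketched in Scala with Gson purely for illustration; the same structure works from Kotlin or with any other JSON library. The sql() method is what renders the ::json cast, so your query code no longer has to:

    import java.sql.{SQLFeatureNotSupportedException, Types}
    import java.util.Objects

    import com.google.gson.{Gson, JsonElement}
    import org.jooq._
    import org.jooq.conf.ParamType
    import org.jooq.impl.DSL

    class PostgresJsonBinding extends Binding[AnyRef, JsonElement] {

      // Converts between the JDBC-level value (a String) and the user type (Gson's JsonElement).
      override def converter(): Converter[AnyRef, JsonElement] = new Converter[AnyRef, JsonElement] {
        override def from(db: AnyRef): JsonElement =
          if (db == null) null else new Gson().fromJson("" + db, classOf[JsonElement])
        override def to(user: JsonElement): AnyRef =
          if (user == null) null else new Gson().toJson(user)
        override def fromType(): Class[AnyRef] = classOf[AnyRef]
        override def toType(): Class[JsonElement] = classOf[JsonElement]
      }

      // Render the bind variable (or inlined value) with an explicit ::json cast.
      override def sql(ctx: BindingSQLContext[JsonElement]): Unit =
        if (ctx.render().paramType() == ParamType.INLINED)
          ctx.render().visit(DSL.inline(ctx.convert(converter()).value())).sql("::json")
        else
          ctx.render().sql(ctx.variable()).sql("::json")

      override def register(ctx: BindingRegisterContext[JsonElement]): Unit =
        ctx.statement().registerOutParameter(ctx.index(), Types.VARCHAR)

      override def set(ctx: BindingSetStatementContext[JsonElement]): Unit =
        ctx.statement().setString(ctx.index(), Objects.toString(ctx.convert(converter()).value(), null))

      override def get(ctx: BindingGetResultSetContext[JsonElement]): Unit =
        ctx.convert(converter()).value(ctx.resultSet().getString(ctx.index()))

      override def get(ctx: BindingGetStatementContext[JsonElement]): Unit =
        ctx.convert(converter()).value(ctx.statement().getString(ctx.index()))

      // Not needed for plain statements.
      override def set(ctx: BindingSetSQLOutputContext[JsonElement]): Unit =
        throw new SQLFeatureNotSupportedException()
      override def get(ctx: BindingGetSQLInputContext[JsonElement]): Unit =
        throw new SQLFeatureNotSupportedException()
    }

The binding is then attached to the JSON columns, typically via the code generator's forcedTypes configuration described on the linked manual page. As a stop-gap, you could also keep the concat approach and render the cast with jOOQ's plain SQL templating (a DSL.field(...) call wrapping the concatenated field in a "{0}::json" template) rather than the deprecated PostgresDataType.JSON constant.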

Related

User defined postgresql types using Npgsql from F#

We use postgresql's features to the maximum to ease our development effort. We make heavy use of custom types (user defined types) in postgresql; most of our functions and stored procedures either take them as input parameters or return them.
We would like to make use of them from F#'s SqlDataProvider. That means we should somehow be able to tell F# how to map an F# user type to a PostgreSQL user type. In other words:
Postgresql has our defined user type post_user_defined
F# has our defined user type fsharp_user_defined
We should instruct Npgsql to somehow perform this mapping. My research so far points me to two approaches, and neither of them is completely clear to me. Any help is appreciated.
Approach 1
The NpgsqlTypes namespace has a pre-defined set of PostgreSQL types mapped to .NET out of the box. A few of them are classes, others structures. Say I would like to use PostgreSQL's built-in type point, which Npgsql maps to .NET via NpgsqlPoint. I can map this to an application-specific data structure like this:
let point (x,y) = NpgsqlTypes.NpgsqlPoint(x,y)
(From PostgreSQLTests.fsx)
In this case, postgresql point and NpgsqlPoint (.NET) are already defined. Now I would like to do the same for my custom type.
Suppose the user defined postgresql composite is
create type product_t as ( name text, product_type text);
And the application data structure (F#) is the record
type product_f = {name :string; ptype :string }
or a tuple
type product_f = string * string
How do I tell Npgsql to make use of my type when it is passed as a parameter to PostgreSQL functions/procedures? It looks like I will need to use NpgsqlTypes.NpgsqlDbType.Composite or Npgsql.PostgresCompositeType, which doesn't have a public constructor.
I am at a dead end here!
Approach 2
Taking a cue from this post, I could create a custom type, register it with MapCompositeGlobally, and use it to pass to PostgreSQL functions. So, here I try my hand at it.
On Postgresql side, the type and functions are respectively
CREATE TYPE product_t AS
(name text,
product_type text)
and
CREATE FUNCTION func_product(p product_t) RETURNS void AS
And from my application in F#
type PgProductType(Name: string, ProductType: string) =
    member this.Name = Name
    member this.ProductType = ProductType
    new() = PgProductType("", "")
Npgsql.NpgsqlConnection.MapCompositeGlobally<PgProductType>("product_t",null)
and then
type Provider = SqlDataProvider
let ctx = Provider.GetDataContext()
let prd = new PgProductType("F#Product","")
ctx.Functions.FuncProduct.Invoke(prd);;
ctx.Functions.FuncIproduct.Invoke(prd);;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
stdin(29,1): error FS0501: The member or object constructor 'Invoke' takes 0 argument(s) but is here given 1. The required signature is 'SqlDataProvider<...>.dataContext.Functions.FuncIproduct.Result.Invoke() : Unit'.
It's strange that the error reports: constructor 'Invoke' takes 0 argument(s) but is here given 1. The F# side of things is completely blind to the argument that the PostgreSQL function takes. It does recognize that the function FuncIproduct exists, but it is blind to the arguments it takes.
Regarding your first approach: as you've understood, NpgsqlTypes contains some types which Npgsql supports out of the box - but these are only PostgreSQL built-in types. You cannot add a new type there without changing Npgsql's source code, which isn't something you want to do.
Also, you should understand the difference between user-defined types (which PostgreSQL calls "composite") and totally independent types such as point. The latter are full types (similar to int4), with their own custom binary representation, while the former aren't.
Your 2nd approach is the right one - Npgsql comes with full support for PostgreSQL composite types. I have no idea how SqlDataProvider functions - I'm assuming this is an F#-specific type provider - but once you've properly mapped your composite via MapCompositeGlobally, Npgsql allows you to write it transparently by setting an NpgsqlParameter's Value to an instance of PgProductType. It may be worth trying to get it working with type providers first.

Zend\db\sql - prepareStatementForSqlObject - still need to bind or worry about sql injection?

I'm using ZF 2.4, and this example uses Zend\Db\Sql. Do I need to worry about SQL injection, or do I still need to quote() or escape anything if I already use prepareStatementForSqlObject()? Does the example below already handle the bind variables?
https://framework.zend.com/manual/2.4/en/modules/zend.db.sql.html
use Zend\Db\Sql\Sql;
$sql = new Sql($adapter);
$select = $sql->select();
$select->from('foo');
$select->where(array('id' => $id));
$statement = $sql->prepareStatementForSqlObject($select);
$results = $statement->execute();
The Select class will cleverly check your predicate(s) and add them to the query in a safe manner to prevent SQL injection. I'd recommend you take a look at the source yourself, so I'll point you to the process and the classes that are responsible for this in the latest ZF version.
Predicate Processing
Take a look at the class PredicateSet. The method \Zend\Db\Sql\Predicate::addPredicates determines the best way to handle each predicate based on its type. In your case you are using an associative array. Every item in that array will be checked and processed based on type:
If an abstraction replacement character (question mark) is found, it will be turned into an Expression.
If the value is NULL, an IS NULL check will be performed on the column found in the key: WHERE key IS NULL.
If the value is an array, an IN check will be performed on the column found in the key: WHERE key IN (arrayVal1, arrayVal2, ...).
Otherwise, the predicate will be a new Operator of the type 'equals': WHERE key = value.
In each case the final predicate added to the Select will implement PredicateInterface.
Preparing the statement
The method \Zend\Db\Sql\Sql::prepareStatementForSqlObject instructs its adapter (e.g. PDO) to create a statement that will be prepared. From here it gets a little bit more complicated.
\Zend\Db\Sql is where the real magic happens: in the method \Zend\Db\Sql::createSqlFromSpecificationAndParameters, the vsprintf function is used to build the query string, as you can see here.
Note: please consider using the new docs.framework.zend.com website from now on. That website is the leading source of documentation for the latest version.

Anorm: WHERE condition, conditionally

Consider a repository/DAO method like this, which works great:
def countReports(customerId: Long, createdSince: ZonedDateTime) =
  DB.withConnection { implicit c =>
    SQL"""SELECT COUNT(*)
          FROM report
          WHERE customer_id = $customerId
            AND created >= $createdSince
       """.as(scalar[Int].single)
  }
But what if the method is defined with optional parameters:
def countReports(customerId: Option[Long], createdSince: Option[ZonedDateTime])
Point being, if either optional argument is present, use it in filtering the results (as shown above), and otherwise (in case it is None) simply leave out the corresponding WHERE condition.
What's the simplest way to write this method with optional WHERE conditions? As an Anorm newbie I was struggling to find an example of this, but I suppose there must be some sensible way to do it (that is, without duplicating the SQL for each combination of present/missing arguments).
Note that the java.time.ZonedDateTime instance maps perfectly and automatically into Postgres timestamptz when used inside the Anorm SQL call. (Trying to extract the WHERE condition as a string, outside SQL, created with normal string interpolation did not work; toString produces a representation not understood by the database.)
Play 2.4.4
One approach is to set up filter clauses such as
val customerClause =
  if (customerId.isEmpty) ""
  else " and customer_id={customerId}"
then substitute these into your SQL:
SQL(s"""
    select count(*)
    from report
    where true
    $customerClause
    $createdClause
    """)
  .on('customerId -> customerId,
      'createdSince -> createdSince)
  .as(scalar[Int].singleOpt).getOrElse(0)
Using {variable} as opposed to $variable is I think preferable as it reduces the risk of SQL injection attacks where someone potentially calls your method with a malicious string. Anorm doesn't mind if you have additional symbols that aren't referenced in the SQL (i.e. if a clause string is empty). Lastly, depending on the database(?), a count might return no rows, so I use singleOpt rather than single.
I'm curious as to what other answers you receive.
Edit: Anorm interpolation (i.e. SQL"...", an interpolation implementation beyond Scala's s"...", f"..." and raw"...") was introduced to allow the use of $variable as equivalent to {variable} with .on. And from Play 2.4, Scala and Anorm interpolation can be mixed, using $ for Anorm (SQL parameter/variable) and #$ for Scala (plain string). And indeed this works well, as long as the Scala interpolated string does not contain references to an SQL parameter. The only way, in 2.4.4, I could find to use a variable in a Scala interpolated string when using Anorm interpolation was:
val limitClause = if (nameFilter == "") "" else s"where name>'$nameFilter'"
SQL"select * from tab #$limitClause order by name"
But this is vulnerable to SQL injection (e.g. a string like it's will cause a runtime syntax exception). So, in the case of variables inside interpolated strings, it seems it is necessary to use the "traditional" .on approach with only Scala interpolation:
val limitClause = if (nameFilter == "") "" else "where name>{nameFilter}"
SQL(s"select * from tab $limitClause order by name").on('nameFilter -> nameFilter)
Perhaps in the future Anorm interpolation could be extended to parse the interpolated string for variables?
Edit2: I'm finding there are some tables where the number of attributes that might or might not be included in the query changes from time to time. For these cases I'm defining a context class, e.g. CustomerContext. In this case class there are lazy vals for the different clauses that affect the sql. Callers of the sql method must supply a CustomerContext, and the sql will then have inclusions such as ${context.createdClause} and so on. This helps give a consistency, as I end up using the context in other places (such as total record count for paging, etc.).
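For illustration, a rough sketch of what such a context class could look like (the names and clauses here are made up; the clause strings still carry {placeholders} that .on resolves, exactly as above):

    import java.sql.Connection
    import java.time.ZonedDateTime
    import anorm._
    import anorm.SqlParser.scalar

    case class CustomerContext(customerId: Option[Long], createdSince: Option[ZonedDateTime]) {
      // Each clause is only generated when the corresponding value is present.
      lazy val customerClause: String =
        if (customerId.isEmpty) "" else "and customer_id = {customerId}"
      lazy val createdClause: String =
        if (createdSince.isEmpty) "" else "and created >= {createdSince}"
    }

    def countReports(ctx: CustomerContext)(implicit c: Connection): Int =
      SQL(s"""
          select count(*)
          from report
          where true
          ${ctx.customerClause}
          ${ctx.createdClause}
          """)
        .on('customerId -> ctx.customerId, 'createdSince -> ctx.createdSince)
        .as(scalar[Int].singleOpt).getOrElse(0)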
Finally got this simpler approach posted by Joel Arnold to work in my example case, also with ZonedDateTime!
def countReports(customerId: Option[Long], createdSince: Option[ZonedDateTime]) =
  DB.withConnection { implicit c =>
    SQL("""
        SELECT count(*) FROM report
        WHERE ({customerId} is null or customer_id = {customerId})
          AND ({created}::timestamptz is null or created >= {created})
        """)
      .on('customerId -> customerId, 'created -> createdSince)
      .as(scalar[Int].singleOpt).getOrElse(0)
  }
The tricky part is having to use {created}::timestamptz in the null check. As Joel commented, this is needed to work around a PostgreSQL driver issue.
Apparently the cast is needed only for timestamp types, and the simpler way ({customerId} is null) works with everything else. Also, comment if you know whether other databases require something like this, or if this is a Postgres-only peculiarity.
(While wwkudu's approach also works fine, this definitely is cleaner, as you can see comparing them side to side in a full example.)

Scala Anorm - how to use it properly

Scala's Play framework claims that Anorm, and writing your own SQL, is better than ORMs. One of the reasons given is that most often you only want to transfer data between the database and the frontend as JSON anyway. However, most tutorials, and even the Play documentation, give examples of parsing the values returned from SQL into case classes, only to convert them into JSON again. So we still have an object-relational mapping anyway, or am I missing the point?
My database has a table with 33 columns. Declaring a case class takes me 33 lines, declaring a parser with the ~ operator takes another 33, and using a case statement to create an object takes another 66! Seriously, what am I doing wrong? Is there any shortcut? In Django the same thing takes only 33 lines.
If you're using Anorm within a Play application, then mapping your case class into a Json object (assuming it has fairly normal apply and unapply functions defined for it, which most do) should be pretty much as simple as defining an implicit that uses the Scala 2.10+ macro-based Json-inception methods...so all you actually need is a definition like this:
implicit val myCaseFormats = Json.format[MyCaseClass]
where 'MyCaseClass' is the name of your case type. You could even bake this into the parser combinator you use for de-serialising row-sets back from the database...that would dramatically clean up your code and cut down the amount of code you have to write.
See here for details on the Json macros:
https://www.playframework.com/documentation/2.1.1/ScalaJsonInception
I use this quite extensively in a pretty large code-base and it does make things quite clean.
In terms of your parsers for Anorm, remember that you don't have to produce a case-class instance as a result of a parse...you can actually return anything you like, which could just be an indexed sequence of your column values (if you're using something like Shapeless to allow for mixed-type lists etc...) or some other structure.
You do have macro support in Anorm as well, so the parsers for your case classes can be one-liners like this:
import anorm.{ Macro, RowParser }
val parser: RowParser[MyCaseClass] = Macro.namedParser[MyCaseClass]
If you want to do something custom (such as parsing directly to JsValue), then you have the flexibility to just hand-craft a more specialised parser.
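Putting the two one-liners together, a small end-to-end sketch (table and class names here are just illustrative):

    import java.sql.Connection
    import anorm._
    import play.api.libs.json._

    case class MyCaseClass(id: Long, name: String)

    object MyCaseClass {
      // One line each for the JSON mapping and the row parser, instead of 33 + 33 hand-written lines.
      implicit val jsonFormat: Format[MyCaseClass] = Json.format[MyCaseClass]
      val parser: RowParser[MyCaseClass] = Macro.namedParser[MyCaseClass]
    }

    // Query straight through to JSON, without a hand-written column-by-column mapping.
    def listAsJson()(implicit c: Connection): JsValue =
      Json.toJson(SQL("select id, name from my_table").as(MyCaseClass.parser.*))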
HTH

SQL DSL for Scala

I am struggling to create a SQL DSL for Scala. The DSL is an extension to Querydsl, which is a popular Query abstraction layer for Java.
I am struggling now with really simple expressions like the following
user.firstName == "Bob" || user.firstName == "Ann"
As Querydsl already supports an expression model which can be used here, I decided to provide conversions from proxy objects to Querydsl expressions. In order to use the proxies I create an instance like this:
import com.mysema.query.alias.Alias._
var user = alias(classOf[User])
With the following implicit conversions I can convert proxy instances and proxy property call chains into Querydsl expressions
import com.mysema.query.alias.Alias._
import com.mysema.query.types.expr._
import com.mysema.query.types.path._
object Conversions {
  def not(b: EBoolean): EBoolean = b.not()
  implicit def booleanPath(b: Boolean): PBoolean = $(b)
  implicit def stringPath(s: String): PString = $(s)
  implicit def datePath(d: java.sql.Date): PDate[java.sql.Date] = $(d)
  implicit def dateTimePath(d: java.util.Date): PDateTime[java.util.Date] = $(d)
  implicit def timePath(t: java.sql.Time): PTime[java.sql.Time] = $(t)
  implicit def comparablePath(c: Comparable[_]): PComparable[_] = $(c)
  implicit def simplePath(s: Object): PSimple[_] = $(s)
}
Now I can construct expressions like this
import com.mysema.query.alias.Alias._
import com.mysema.query.scala.Conversions._
var user = alias(classOf[User])
var predicate = (user.firstName like "Bob") or (user.firstName like "Ann")
I am struggling with the following problem.
eq and ne are already available as methods in Scala, so the conversions aren't triggered when they are used
This problem can be generalized as follows: when using method names that are already available on Scala types, such as eq, ne, startsWith etc., one needs some kind of escaping to trigger the implicit conversions.
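To make the problem concrete (reusing the alias and conversions shown above):

    var user = alias(classOf[User])

    // eq already exists on AnyRef, so Scala resolves it to reference equality and never
    // searches for an implicit conversion to a Querydsl expression:
    val notAPredicate: Boolean = user.firstName.eq("Bob")

    // like does not exist on String, so the stringPath conversion kicks in and a
    // Querydsl predicate is built as intended:
    val predicate = (user.firstName like "Bob") or (user.firstName like "Ann")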
I am considering the following
Uppercase
var predicate = (user.firstName LIKE "Bob") OR (user.firstName LIKE "Ann")
This is for example the approach in Circumflex ORM, a very powerful ORM framework for Scala with similar DSL aims. But this approach would be inconsistent with the query keywords (select, from, where etc), which are lowercase in Querydsl.
Some prefix
var predicate = (user.firstName :like "Bob") :or (user.firstName :like "Ann")
The context of the predicate usage is something like this
var user = alias(classOf[User])
query().from(user)
  .where((user.firstName like "Bob") or (user.firstName like "Ann"))
  .orderBy(user.firstName asc)
  .list(user);
Do you see better options or a different approach for SQL DSL construction for Scala?
So the question basically boils down to two cases
Is it possible to trigger an implicit type conversion when using a method that exists in the superclass (e.g. eq)?
If it is not possible, what would be the most Scalaesque syntax to use for methods like eq and ne?
EDIT
We got Scala support in Querydsl working by using alias instances and a $-prefix based escape syntax. Here is a blog post on the results: http://blog.mysema.com/2010/09/querying-with-scala.html
There was a very good talk at Scala Days: Type-safe SQL embedded in Scala by Christoph Wulf.
See the video here: Type-safe SQL embedded in Scala by Christoph Wulf
Mr Westkämper - I was pondering this problem, and I wondered if it would be possible to use 'tracer' objects, where the basic data types such as Int and String would be extended such that they contained source information, and the results of combining them would likewise hold within themselves their sources and the nature of the combination.
For example, your user.firstName method would return a TracerString, which extends String, but which also indicates that the String corresponds to a column in a relation. The == method would be overridden such that it returns an EqualityTracerBoolean, which extends Boolean. This would preserve the standard Scala semantics. However, the constructor for EqualityTracerBoolean would record the fact that the result of the expression was derived by comparing a column in a relation to a string constant. Your 'where' method could then analyse the EqualityTracerBoolean object returned by the conditional expression, evaluated over a dummy argument, in order to derive the expression used to create it.
There would have to be override defs for inequality operators, as well as plus and minus, for Ints, and whatever else you wished to represent from sql, and corresponding tracer classes for each of these. It would be a bit of a project!
Anyway, I decided not to bother, and use squeryl instead.
I didn't have the exact same problem with jOOQ, as I'm using a bit more verbose operator names: equal, notEqual, etc instead of eq, ne. On the other hand, there is a val operator in jOOQ for explicitly creating bind values, which I had to overload with value, as val is a keyword in Scala. Is overloading operators an option for you? I documented my attempts of running jOOQ in Scala here:
http://lukaseder.wordpress.com/2011/12/11/the-ultimate-sql-dsl-jooq-in-scala/
Just like you, I had also thought about capitalising all keywords in a major release (including SELECT, FROM, etc). But that leaves an open question about whether "compound" keywords should be split into two method calls, or connected by an underscore: GROUP().BY() or GROUP_BY(), WHEN().MATCHED().THEN().UPDATE() or WHEN_MATCHED_THEN_UPDATE(). Since the result is not really satisfying, I guess it's not worth breaking backwards compatibility for such a fix, even if the two-method-call option would look very, very nice in Scala, as . and () can be omitted. So maybe jOOQ and QueryDSL should both be "wrapped" (as opposed to "extended") by a dedicated Scala API?
What about decompiling the bytecode at runtime? I started to write such a tool:
http://h2database.com/html/jaqu.html#natural_syntax
I know it's a hack, so please don't vote -1 :-) I just wanted to mentioned it. It's a relatively novel approach. Instead of decompiling at runtime, it might be possible to do it at compile time using an annotation processor, not sure if that's possible using Scala (and not sure if it's really possible with Java, but Project Lombok seems to do something like that).