ScalaQuery generating invalid SQL

I'm using ScalaQuery to connect to both Oracle and Postgres servers.
This behaviour occurs with both Oracle and Postgres, but the SQL it produces is only valid (though still incorrect) in Postgres.
At some point, I'm running a query in ScalaQuery of the form:
row.foo.bind == parameter.foo || row.foo inSetBind parameter.foo.children
Here parameter is an instance of a trait that is known to have a foo member.
The problem is that, of the ~100 queries run, ScalaQuery only generates the correct SQL once, of the form
...
WHERE row.foo = ? or row.foo in (?, ?, ?, ?, ?)
...
Most of the time it instead generates
...
WHERE row.foo = ? or false
...
Why is this happening inconsistently, is it a bug (I assume it is), and how do I work around it?

It turns out that the query was looking at an empty set, because parameter.foo had no children in most cases.
Given that WHERE row.foo IN () is not valid SQL, it was instead written out as false.
This still leaves the issue of false being generated even though the code targets Oracle, but the root cause has now been cleared up.
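For reference, the fix amounts to guarding the IN clause whenever the set may be empty. A minimal sketch of the same guard expressed in plain JDBC terms (names like row_table, foo, and children are illustrative, not from the original ScalaQuery code):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

static PreparedStatement buildFooQuery(Connection conn, int foo, List<Integer> children) throws SQLException {
    StringBuilder sql = new StringBuilder("SELECT * FROM row_table WHERE foo = ?");
    if (!children.isEmpty()) {
        // Emit one placeholder per element; skip the clause entirely for an
        // empty set, since "IN ()" is not valid SQL.
        sql.append(" OR foo IN (");
        for (int i = 0; i < children.size(); i++) {
            sql.append(i == 0 ? "?" : ", ?");
        }
        sql.append(")");
    }
    PreparedStatement ps = conn.prepareStatement(sql.toString());
    ps.setInt(1, foo);
    for (int i = 0; i < children.size(); i++) {
        ps.setInt(i + 2, children.get(i));
    }
    return ps;
}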

Related

Parameterizing a PostGIS geometry query

I'm trying to query whether any of a set of polygons (passed in at runtime) intersects with a set of polygons stored in the database in the "enclosing_polygons" field, which is a MultiPolygonField.
Here is an example of the query:
select * from my_table where field1 = any (?) and field2 = any (?) and (
ST_Intersects(ST_GeometryFromText('POLYGON((? ?, ? ?, ? ?, ? ?, ? ?))'), enclosing_polygons) or
ST_Intersects(ST_GeometryFromText('POLYGON((? ?, ? ?, ? ?, ? ?, ? ?))'), enclosing_polygons))
and detection_type = 0 order by confidence desc limit 2000
The query works fine with hardcoded values, but when I try to parameterize it, Postgres does not seem to recognize the ? placeholders for the polygon points as parameters when I try to populate them.
When I set the first two parameters (for field1 and field2), these JDBC statements succeed:
statement.setArray(1, array1)
statement.setArray(2, array2)
However, if I try to set any parameters beyond these first two, they fail. This statement:
statement.setDouble(3, point1x)
fails with the following error:
The column index is out of range: 3, number of columns: 2.
Why does Postgres not recognize these ?s in the POLYGON constructor as query parameters?
How can I make this work?
It is up to your driver to implement the ? placeholders; PostgreSQL never sees them. In your driver, as in almost all drivers, question marks occurring inside single quotes are just literal question marks, not placeholders.
You probably need to construct the POLYGON((...)) string yourself, then pass that whole string into the query as a single placeholder, so that part of the query would look like ST_Intersects(ST_GeometryFromText(?), enclosing_polygons).
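A minimal sketch of that approach with plain JDBC, reusing the statement and arrays from the question (p1x, p1y, etc. are hypothetical point variables; note that a WKT ring must close on its first point):

// Build the WKT text client-side, then bind it as one string parameter.
String wkt = String.format(java.util.Locale.US,
    "POLYGON((%f %f, %f %f, %f %f, %f %f, %f %f))",
    p1x, p1y, p2x, p2y, p3x, p3y, p4x, p4y, p1x, p1y);
statement.setArray(1, array1);
statement.setArray(2, array2);
statement.setString(3, wkt); // one placeholder covers the whole geometry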
There are alternatives but they mostly just move the problem around without directly solving it. If you really want to just use plain question marks with each bound to one number, you could replace the ST_GeometryFromText function with something like:
st_makepolygon(st_makeline(array[st_makepoint(?,?),st_makepoint(?,?),st_makepoint(?,?),st_makepoint(?,?),st_makepoint(?,?)]))
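If you go that route, the binding is just a run of setDouble calls after the two arrays (a sketch; points is a hypothetical double[][] of x/y pairs matching the number of st_makepoint calls):

int idx = 3; // parameters 1 and 2 are the arrays
for (double[] pt : points) {
    statement.setDouble(idx++, pt[0]); // x of one st_makepoint(?, ?)
    statement.setDouble(idx++, pt[1]); // y
}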

Replace-function via EclipseLink on DB2 z/OS

I am migrating an application from JPA 2.0 with OpenJPA on WebSphere 8.5 to JPA 2.1 with EclipseLink on WebSphere 9.0, using DB2 12 on z/OS. Generally it works, but one rather complex query is failing. I was able to localize the problem to a custom DB2 function call within a criteria query. The call looks something like this:
criteriaBuilder.function("REPLACE", String.class, fromMyEntity.get("myField"), criteriaBuilder.literal("a"), criteriaBuilder.literal("b"));
This produces the following error (had to translate some error texts, since WebSphere localizes them, and anonymize my field/table names, so labels/names might not be 100% exact):
Error: Exception [EclipseLink-4002] (Eclipse Persistence Services - 2.6.3.WAS-v20160414-bd51c70): org.eclipse.persistence.exceptions.DatabaseException
Internal Exception: com.ibm.websphere.ce.cm.StaleConnectionException: DB2 SQL Error: SQLCODE=-171, SQLSTATE=42815, SQLERRMC=2;REPLACE, DRIVER=3.72.44
Errorcode: -171
Call: SELECT COUNT(ID) FROM MY_TABLE WHERE REPLACE(MYFIELD, ?, ?) LIKE ?
bind => [abc, def, %g%]
Query: ReportQuery(referenceClass=MyEntity sql="SELECT COUNT(ID) FROM MY_TABLE WHERE REPLACE(MYFIELD, ?, ?) LIKE ?").
What really confuses me is that if I take the generated query, replace the placeholders with the given bound parameters, and execute that in a database client myself, it works without error.
The documentation states that the first parameter must not be empty (https://www.ibm.com/support/knowledgecenter/en/SSEPEK_12.0.0/sqlref/src/tpc/db2z_bif_replace.html), and indeed, if I use an empty string either as a literal in the query or in my database client, it produces the above error. But none of the rows in the database contain an empty value. There are checks in place to prevent this in the old environment; they don't appear to work in the new environment, so I disabled them while searching for the problem and made sure myself that no empty values exist. I can even use the primary key as the first parameter, and it will still fail, even though that can't contain an empty/null value.
Using other functions (like TRANSLATE) works. I also tried using "SYSIBM.REPLACE" as the name, and different combinations of parameters, but as soon as I use a real column to replace data in, it fails. Anybody got any ideas what I am doing wrong here?
This is my table definition:
CREATE TABLE "MY_TABLE" (
"ID" INTEGER NOT NULL GENERATED BY DEFAULT AS IDENTITY (NO MINVALUE NO MAXVALUE NO CYCLE CACHE 20 NO ORDER ),
"MYFIELD" VARCHAR(160) FOR MIXED DATA WITH DEFAULT NULL,
[....]
) IN "<Database>"."<Tablespace>" PARTITION BY SIZE EVERY 4 G AUDIT NONE DATA CAPTURE NONE CCSID UNICODE;

Postgres bytea error when binding null to prepared statements

I am working with a Java application which uses JPA and a Postgres database, and I am trying to create a flexible prepared statement which can handle a variable number of input parameters. An example query would best explain this:
SELECT *
FROM my_table
WHERE
(string_col = :param1 OR :param1 IS NULL) AND
(double_col = :param2 OR :param2 IS NULL);
The idea behind this "trick" is that if a user specifies only one parameter, say :param1, we can just bind null to :param2, and the WHERE clause would then behave as if only the first parameter were even being checked. This approach lets us handle, in theory, any number of input parameters using a single prepared statement, instead of needing to maintain many different statements.
I have gotten a simple POC working locally using pure JDBC prepared statements. However, doing so required casting the parameter before comparing it to NULL, e.g.
WHERE (double_col = ? OR ?::numeric IS NULL)
^^ does not work without cast
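For illustration, such a JDBC POC might look roughly like this (a sketch, assuming the table above and an open java.sql.Connection named conn; each logical parameter is bound twice because it appears twice in the SQL):

String sql = "SELECT * FROM my_table WHERE " +
    "(string_col = ? OR ?::text IS NULL) AND " +
    "(double_col = ? OR ?::numeric IS NULL)";
PreparedStatement ps = conn.prepareStatement(sql);
ps.setString(1, "some value");
ps.setString(2, "some value"); // same value again for the IS NULL check
ps.setNull(3, Types.DOUBLE);   // param2 left unspecified by the caller
ps.setNull(4, Types.DOUBLE);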
However, my actual application is using JPA, and I keep getting the following persistent error:
Caused by: org.postgresql.util.PSQLException: ERROR: operator does not exist: double precision = bytea
Hint: No operator matches the given name and argument type(s). You might need to add explicit type casts.
The problem does not occur with string/text columns, but only with columns which are double precision in my Postgres table. I have tried all combinations of casting, and nothing works:
(double_col = :param2 OR CAST(:param2 AS double precision) IS NULL);
(CAST(double_col AS double precision) = :param2 OR :param2 IS NULL);
(CAST(double_col AS double precision) = :param2 OR CAST(:param2 AS double precision) IS NULL);
The error seems to be saying that JDBC is sending Postgres a bytea type for the double columns, and Postgres then falls over because it can't find a way to cast bytea to double precision.
The Java code looks something like:
Query query = entityManager.createNativeQuery(sqlString, MyEntity.class);
query.setParameter("param1", "some value");
// bind other parameters here
List<MyEntity> results = query.getResultList();
For reference, here are the versions of everything I am using:
Hibernate version | 4.3.7.Final
Spring Data JPA version | 1.7.1.RELEASE
Postgres driver version | 42.2.2
Postgres database version | 9.6.10
Java version | 1.8.0_171
Not having received any feedback in the form of answers or even a comment, I was getting ready to give up, when I stumbled onto this excellent blog post:
How to bind custom Hibernate parameter types to JPA queries
The post gives two options for controlling the types which JPA passes through the driver to Postgres (or whatever the underlying database actually is). I went with the approach using TypedParameterValue. Here is what my code looks like continuing with the example given above:
Query query = entityManager.createNativeQuery(sqlString, MyEntity.class);
query.setParameter("param1", new TypedParameterValue(StringType.INSTANCE, null));
query.setParameter("param2", new TypedParameterValue(DoubleType.INSTANCE, null));
List<MyEntity> results = query.getResultList();
Of course, passing null for every parameter in the query is trivial, but I am doing it mainly to show the syntax for the text and double columns. In practice, we would expect at least a few of the parameters to be non-null, but the above syntax handles all values, null or otherwise.
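The same wrapper carries non-null values too, e.g. (42.0 being an arbitrary example value):

query.setParameter("param2", new TypedParameterValue(DoubleType.INSTANCE, 42.0));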
If you want to keep using plain queries with automatic parameter binding, you could try the following.
WHERE (? IS NULL OR CAST(CAST(? AS TEXT) AS DOUBLE PRECISION) = double_col)
This seems to satisfy the PostgreSQL driver's type checks as well as yielding the correct results. I haven't done much testing, but the performance hit seems minimal because the CASTs happen on a constant value rather than rows from the database.
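A sketch of that clause used from JPA, under the same entity setup as above (maybeDouble is a hypothetical Double that may be null):

String sql = "SELECT * FROM my_table WHERE " +
    "(:param2 IS NULL OR CAST(CAST(:param2 AS TEXT) AS DOUBLE PRECISION) = double_col)";
Query query = entityManager.createNativeQuery(sql, MyEntity.class);
query.setParameter("param2", maybeDouble); // Double or null, no TypedParameterValue needed
List<MyEntity> results = query.getResultList();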

PreparedStatement setNull in SELECT query

I am using PostgreSQL together with HikariCP, and my query is something like
SELECT * FROM my_table WHERE int_val = ? ...
Now I would like to set a NULL value for my variables. I have tried
ps.setNull(1, Types.INTEGER); // ps is instance of PreparedStatement
try (ResultSet rs = ps.executeQuery()) {
... // get result from resultset
}
Although I have rows matching the condition (NULL in the 'int_val' column), I have not received any records.
The problem is (I think) in the query produced by the statement, which looks like:
System.out.println(ps.toString());
// --> SELECT * FROM my_table WHERE int_val = NULL ...
But the query should look like:
"SELECT * FROM my_table WHERE int_val IS NULL ..." - this query works
I need to dynamically create PreparedStatements which may contain NULL values, so I cannot easily bypass this.
I have tried creating the connection without HikariCP with the same result, so I think the problem is in the PostgreSQL driver? Or am I doing something wrong?
UPDATE:
Based on the answer from @Vao Tsun I have set transform_null_equals = on in postgresql.conf, which started rewriting val = null --> val is null in 'simple' Statements, but NOT in PreparedStatements.
To summarize:
try (ResultSet rs = st.executeQuery("SELECT * FROM my_table WHERE int_val = NULL")) {
// query is replaced to '.. int_val IS NULL ..' and gets correct result
}
ps.setNull(1, Types.INTEGER);
try (ResultSet rs = ps.executeQuery()) {
// Does not get replaced and does not get any result
}
I am using JVM version 1.8.0_121 and the latest Postgres driver (42.1.4), but I have also tried an older driver (9.4.1212). Database version: PostgreSQL 9.6.2, compiled by Visual C++ build 1800, 64-bit.
It is intended behaviour that the comparison x = null evaluates to null (no matter what x is equal to). Basically, for SQL, NULL is unknown, not an actual value... To bypass it you can set transform_null_equals to on or true. Please check out the docs:
https://www.postgresql.org/docs/current/static/functions-comparison.html
Some applications might expect that expression = NULL returns true if
expression evaluates to the null value. It is highly recommended that
these applications be modified to comply with the SQL standard.
However, if that cannot be done the transform_null_equals
configuration variable is available. If it is enabled, PostgreSQL will
convert x = NULL clauses to x IS NULL.
I have just found a solution which works the same for values and NULLs, by using IS NOT DISTINCT FROM instead of =.
More on the PostgreSQL wiki.
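In code, the rewritten statement needs no other changes (a minimal sketch; conn is an open java.sql.Connection and Types comes from java.sql):

PreparedStatement ps = conn.prepareStatement(
    "SELECT * FROM my_table WHERE int_val IS NOT DISTINCT FROM ?");
ps.setNull(1, Types.INTEGER); // now matches rows where int_val IS NULL
// ps.setInt(1, 42);          // the same statement also matches concrete values
try (ResultSet rs = ps.executeQuery()) {
    // process results
}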
It is important to recognize that null is not a value in SQL. It encodes the logical notion of "unknown". This is why null = var never evaluates to true, even where var itself holds null. So even if you replace the value of your variable (the ? in your case) with null, the result by definition must not be what you expect, as long as the SQL standard is complied with.
Now, there are some databases around that try to outsmart the SQL standard by treating a column value of null as a programming language null (nil, undef, or whatever is used for that purpose).
This creates some convenience for the unwary programmer, but in the long run causes grief as soon as you need a true distinction between an SQL null and a programming language null.
Nevertheless, for ease of porting from such databases to PostgreSQL (or simply for ease of lazy programming) you may resort to setting transform_null_equals.
BUT, you are using prepared statements. A prepared statement is converted to a query plan once, and that plan needs to be valid for all potential values of the variables used in the query. Now, VAR IS NULL is fundamentally different from VAR = ?. So there is no chance for the query parser, query optimizer, or even the query execution engine to dynamically rewrite the (already prepared) query based on the actual parameter values passed in.
From this, you should take the recommendation given in the documentation of transform_null_equals seriously and change your code to use VAR IS NULL when a null value is to be searched for and VAR = ? for other cases, as sketched below.
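A minimal sketch of that recommendation, assuming a single optional Integer filter (method and table names are illustrative):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

static PreparedStatement findByIntVal(Connection conn, Integer intVal) throws SQLException {
    // Choose the predicate up front, so the prepared plan is valid
    // for every value that can actually be bound to it.
    String sql = (intVal == null)
        ? "SELECT * FROM my_table WHERE int_val IS NULL"
        : "SELECT * FROM my_table WHERE int_val = ?";
    PreparedStatement ps = conn.prepareStatement(sql);
    if (intVal != null) {
        ps.setInt(1, intVal);
    }
    return ps;
}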

SQL Server OpenQuery() behaving differently than a direct query from TOAD

The following query works efficiently when run directly against Oracle 11 using TOAD (with native Oracle drivers)
select ... from ... where ...
and srvg_ocd in (
select ocd
from rptofc
where eff_endt = to_date('12/31/9999','mm/dd/yyyy')
and rgn_nm = 'Boston'
) ...
;
The exact same query "never" returns if passed from SQL Server 2008 to the same Oracle database via openquery(). SQL Server has a link to the Oracle database using an Oracle Provider OLE DB driver.
select * from openquery( servername, '
select ... from ... where ...
and srvg_ocd in (
select ocd
from rptofc
where eff_endt = to_date(''12/31/9999'',''mm/dd/yyyy'')
and rgn_nm = ''Boston''
) ...
');
The query doesn't return in a reasonable amount of time, and the user kills the query. I don't know if it would eventually return with the correct result.
This result where the direct TOAD query works efficiently and the openquery() version "never" returns is reproducible.
A small modification to the openquery() gives the correct efficient result: Change eff_endt to trunc(eff_endt).
That is well and good, but it doesn't seem like the change should be necessary.
openquery() is supposed to be pass-through, so how can there be a difference between the TOAD and openquery() behavior?
The reason we care is because we frequently develop complex queries with TOAD directly accessing Oracle. Once we have the query functioning and optimized, we convert it to an openquery() string for use in a SQL Server application. It is extremely aggravating to have a query suddenly fail with openquery() when we know it worked as a direct query. Then we have to search for a work-around through trial and error.
I would like to see the Oracle trace files for the two scenarios, but the Oracle server is within another organization, and we are not getting cooperation from the Oracle DBAs.
Does anyone know of any driver, or TOAD, or ??? issues that could account for the discrepancy? Is there any way to eliminate the problem such that both methods always give the same result?
I know you asked this a while ago but I just came across your question.
I agree, they should be the same. Obviously there is a difference. We need to find out where the difference is.
I am thinking out loud as I type...
What happens if you specify just a few columns instead of select * from openquery?
How many rows are supposed to be returned?
What if, in the Oracle select, you limit the returned rows?
How quickly does the openquery timeout?
Are TOAD and SQL Server on the same machine? Are you RDPing into the SQL Server box and running TOAD from there?
Are they using the same drivers, including the same bitness (32/64) and version?
Are they using the same account on Oracle?
It is interesting that using trunc() makes a difference. I assume [eff_endt] is one of the returned fields?
I am wondering if SQL Server is getting all the rows back but choking on the date conversions. The date type in Oracle may need to be converted to a SQL Server date type before SQL Server shows it to you.
What if you insert the rows from the openquery into a table where the date field is just a (n)varchar? I am thinking SQL Server might just dump the date it is getting back from Oracle into that text field without trying to convert it.
something like:
insert into mytable(f1,f2,f3,datetimeX)
select f1,f2,f3,datetimeX from openquery( servername, '
select f1,f2,f3,datetimeX from ... where ...
and srvg_ocd in (
select ocd
from rptofc
where eff_endt = to_date(''12/31/9999'',''mm/dd/yyyy'')
and rgn_nm = ''Boston''
) ...
');
What if TOAD or SQL Server is modifying the query statement before sending it to Oracle? You could fire up Wireshark and see what TOAD and SQL Server are actually sending.
I would be very curious if you get this resolved. I link SQL Server to Oracle often and have not run into this issue.
Here are basic things you can check for to see what the database is doing after it receives the query. First, check that the execution plans are the same in TOAD as when the query runs using openquery. You could plan the query yourself in TOAD using:
explain plan set statement_id = 'openquery_test' for <your query here>;
select *
from table(dbms_xplan.display(statement_id => 'openquery_test'));
then have someone initiate the query using openquery() and have someone with permissions to view the v$ tables run:
select sql_id from v$session where username = '<user running the query>';
(If there's more than one connection with the same user, you'll have to find an additional attribute to isolate the row representing the session running the query.)
select *
from table(dbms_xplan.display_cursor('<value from query above>'));
If those look the same then I'd move on to checking database waits and see what it's stuck on.
select se.username
, sw.event
, sw.p1text
, sw.p2text
, sw.p3text
, sw.wait_time_micro/1000000 as seconds_in_wait
, sw.state
, sw.time_since_last_wait_micro/1000000 as seconds_since_last_wait
from v$session se
inner join
v$session_wait sw
on se.sid = sw.sid
where se.username = '<user running the query>'
;
(again, if there's more than one session with the same username, you'll need another attribute to whittle it down to the one you're interested in.)
If the plans are different, then you need to find out why, or if they're the same, look into what it's waiting on (e.g. SQL*Net message to client ?) and why.
I noticed a difference using OLEDB through MS Access (2013) connecting to Oracle 10g & 11g tables, in that it did not always recognize indexes or primary keys on the Oracle tables properly. The same query through an MS Access 2000 database (using ODBC) worked fine / had no problem with indexes & keys. The only way I found to fix the OLEDB version was to include all of the key fields in the SELECT -- which was not a satisfying answer, but it's all I could find. This might be an option to try through SSMS / OpenQuery(...) as well.
Besides that... you can try some alternatives to OPENQUERY, such as:
4-part names: SELECT ... FROM Server..Schema.Table
Execute AT: EXEC ('select...') at linked server
But as for why the OLEDB provider works differently than the native Oracle Provider: the providers are not identical, and the native provider is more likely to paper over Oracle quirks than the more generic OLEDB provider.