Need for foo at the end of the query - pyspark

Why is there a need to have the word foo at the end of the query.
queryTwo = """(SELECT max(lst_updt_dt_tm) as maxDt FROM table) foo """
theTable = sqlContext.read.format("jdbc").options(url="the url", driver = "the td driver", dbtable = queryTwo, user="xxxx", password="xxxx").load()
theTable.show();
If I remove the word foo, the code fails.

Reverse engineering what is happening here from the error in your comment:
java.sql.SQLException: [Teradata Database] [TeraJDBC 15.10.00.33] [Error 3707] [SQLState 42000] Syntax error, expected something like a name or a Unicode delimited identifier or an 'UDFCALLNAME' keyword or '(' between the 'FROM' keyword and the 'SELECT' keyword.
It's pretty clear that your dbtable parameter requires a SUBQUERY as input. Whatever is happening in sqlContext.read.format("jdbc").options is rewriting the query as :
SELECT <something> FROM <dbtable> <possibly something more here>;
As such it requires that dbtable be a subquery which is why you must wrap it in parentheses and give it an alias foo.
If you read up on the documentation for SPARK SQL You will see:
dbtable: The JDBC table that should be read. Note that anything that is
valid in a FROM clause of a SQL query can be used. For example,
instead of a full table you could also use a subquery in parentheses.
So you could give it a table, or, optionally, you can give it a subquery. Since it doesn't specify that the subqueries alias needs to be foo you could really write any legal alias name here like mysubquery or "this is a poor choice for a subquery alias" or "f" if you feel so inclined..

Related

Why single quote escape cannot be used in QuestDB, Error: dangling expression

I'm trying to use Query Variables in Grafana, the panel query source is PostgreSQL for QuestDB.
I have added the variable without any issue, but I'm unable to use the variable in Panel query since the variable values contains the spaces (SENSOR01 ON_OFF), also I'm unable to figure-out how to add single quote escape.
Following are the scenarios I tried:
Scenario1: this indicates due to space in the Variable value, on_off considered as separate word
where sensor_name = $sensor
db query error: pq: unexpected token: on_off
.
.
Scenario2: tried to add single quotes explicitly for the variable value, but there is generic error from source DB (QuestDB)
where sensor_name = concat('''', $sensor, '''')
db query error: pq: dangling expression
When tried Scenario2 approach directly in query of Variable, getting the same error
..
Scenario3: Hard-coded the variable value with space and with single quotes, but this giving me error with first part of the variable, looks like the hard-coded single quotes not passed here!
Error (Scenario3):
Is there any way/workaround to tackle this issue?
Could you just add the quotes directly in the query?
where sensor_name = '$sensor'
I have a similar grafana panel querying a questDB database using a variable and it works for me. This is my query:
select device_type, avg(duration_ms) as avg_duration_ms, avg(speed) as avg_speed, avg(measure1) as avg_m1, avg(measure2) as avg_m2 from ilp_test
WHERE
$__timeFilter(timestamp) and device_type = '$deviceType'
A rather hacky workaround would be to do:
where sensor_name = concat(cast(cast('&' as int) + 1 as char), $sensor, cast(cast('&' as int) + 1 as char))
This should work, but I'm pretty sure there is a better solution. Let me find it and get back to you.
Update. We may support Postgres syntax (which is '' escaping for a single quote char) in one of upcoming versions. For now, you'd have to use the above workaround.

Documentation on Postgres Parse tree?

Is there a place that says what that documents the Postgres parse tree? For example from Query Parsing it gives an example of:
SELECT * FROM foo where bar = 42 ORDER BY id DESC LIMIT 23;
(
{SELECT
:distinctClause <>
:intoClause <>
:targetList (
{RESTARGET
:name <>
:indirection <>
:val
...
But I can't find a place that gives more information, for example, what is indirection in the RESTARGET clause? It does link to the Postgres internals docs # https://www.postgresql.org/docs/current/overview.html, but I couldn't find anything about the Postgres parse tree in detail from that. Is anything documented on that?
Note: for the question I'm only concerned with the SELECT statement.
Most of the relevant structs are defined in parsenodes.h, e.g. ResTarget and its attributes are defined here:
https://github.com/postgres/postgres/blob/master/src/include/nodes/parsenodes.h#L461
You may also find pg_query useful, a library I wrote to parse queries using the Postgres parser: https://github.com/pganalyze/pg_query (the output format is either JSON or Protobuf, and maps to the parsenodes.h struct names 1:1)

Slick plain SQL escape PostgreSQL json function

I'm trying to escape the ?| operator in this query:
val data = sql"""
SELECT ......
FROM .......
WHERE table.column ?| array['23', '12']
""".as[Int].head
db.run(data)
however the ?| operator gets translated as $1| in the query (checked in the DB query log) and it obviously generates the error
ERROR: syntax error at or near "$1" at character 735
I've tried with #?| and $?| without success
? is a placeholder for a parameter in JDBC (that's the level after Slick). You can escape ? specifically for PostgreSQL as ??|. There's useful discussion of this in SO 14779896 - Does the JDBC spec prevent '?' from being used as an operator.
An alternative to this convention would be to use non-symbolic alternatives: jsonb_exists_any. E.g.,
WHERE jsonb_exists_any(table.column, array['23', '12'])

Erlang mnesia equivalent of "select * from Tb"

I'm a total erlang noob and I just want to see what's in a particular table I have. I want to just "select *" from a particular table to start with. The examples I'm seeing, such as the official documentation, all have column restrictions which I don't really want. I don't really know how to form the MatchHead or Guard to match anything (aka "*").
A very simple primer on how to just get everything out of a table would be very appreciated!
For example, you can use qlc:
F = fun() ->
Q = qlc:q([R || R <- mnesia:table(foo)]),
qlc:e(Q)
end,
mnesia:transaction(F).
The simplest way to do it is probably mnesia:dirty_match_object:
mnesia:dirty_match_object(foo, #foo{_ = '_'}).
That is, match everything in the table foo that is a foo record, regardless of the values of the fields (every field is '_', i.e. wildcard). Note that since it uses record construction syntax, it will only work in a module where you have included the record definition, or in the shell after evaluating rr(my_module) to make the record definition available.
(I expected mnesia:dirty_match_object(foo, '_') to work, but that fails with a bad_type error.)
To do it with select, call it like this:
mnesia:dirty_select(foo, [{'_', [], ['$_']}]).
Here, MatchHead is _, i.e. match anything. The guards are [], an empty list, i.e. no extra limitations. The result spec is ['$_'], i.e. return the entire record. For more information about match specs, see the match specifications chapter of the ERTS user guide.
If an expression is too deep and gets printed with ... in the shell, you can ask the shell to print the entire thing by evaluating rp(EXPRESSION). EXPRESSION can either be the function call once again, or v(-1) for the value returned by the previous expression, or v(42) for the value returned by the expression preceded by the shell prompt 42>.

PostgreSQL jsonb, `?` and JDBC

I am using PostgreSQL 9.4 and the awesome JSONB field type. I am trying to query against a field in a document. The following works in the psql CLI
SELECT id FROM program WHERE document -> 'dept' ? 'CS'
When I try to run the same query via my Scala app, I'm getting the error below. I'm using Play framework and Anorm, so the query looks like this
SQL(s"SELECT id FROM program WHERE document -> 'dept' ? {dept}")
.on('dept -> "CS")
....
SQLException: : No value specified for parameter 5.
(SimpleParameterList.java:223)
(in my actual queries there are more parameters)
I can get around this by casting my parameter to type jsonb and using the #> operator to check containment.
SQL(s"SELECT id FROM program WHERE document -> 'dept' #> {dept}::jsonb")
.on('dept -> "CS")
....
I'm not too keen on the work around. I don't know if there are performance penalties for the cast, but it's extra typing, and non-obvious.
Is there anything else I can do?
As a workaround to avoid the ? operator, you could create a new operator doing exactly the same.
This is the code of the original operator:
CREATE OPERATOR ?(
PROCEDURE = jsonb_exists,
LEFTARG = jsonb,
RIGHTARG = text,
RESTRICT = contsel,
JOIN = contjoinsel);
SELECT '{"a":1, "b":2}'::jsonb ? 'b'; -- true
Use a different name, without any conflicts, like #-# and create a new one:
CREATE OPERATOR #-#(
PROCEDURE = jsonb_exists,
LEFTARG = jsonb,
RIGHTARG = text,
RESTRICT = contsel,
JOIN = contjoinsel);
SELECT '{"a":1, "b":2}'::jsonb #-# 'b'; -- true
Use this new operator in your code and it should work.
Check pgAdmin -> pg_catalog -> Operators for all the operators that use a ? in the name.
In JDBC (and standard SQL) the question mark is reserved as a parameter placeholder. Other uses are not allowed.
See Does the JDBC spec prevent '?' from being used as an operator (outside of quotes)? and the discussion on jdbc-spec-discuss.
The current PostgreSQL JDBC driver will transform all occurrences (outside text or comments) of a question mark to a PostgreSQL specific parameter placeholder. I am not sure if the PostgreSQL JDBC project has done anything (like introducing an escape as discussed in the links above) to address this yet. A quick look at the code and documentation suggests they didn't, but I didn't dig too deep.
Addendum: As shown in the answer by bobmarksie, current versions of the PostgreSQL JDBC driver now support escaping the question mark by doubling it (ie: use ?? instead of ?).
I had the same issue a couple of days ago and after some investigation I found this.
https://jdbc.postgresql.org/documentation/head/statement.html
In JDBC, the question mark (?) is the placeholder for the positional parameters of a PreparedStatement. There are, however, a number of PostgreSQL operators that contain a question mark. To keep such question marks in a SQL statement from being interpreted as positional parameters, use two question marks (??) as escape sequence. You can also use this escape sequence in a Statement, but that is not required. Specifically only in a Statement a single (?) can be used as an operator.
Using 2 question marks seemed to work well for me - I was using the following driver (illustrated using maven dependency) ...
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>9.4-1201-jdbc41</version>
</dependency>
... and MyBatis for creating the SQL queries and it seemed to work well. Seemed easier / cleaner than creating an PostgreSQL operator.
SQL went from e.g.
select * from user_docs where userTags ?| array['sport','property']
... to ...
select * from user_docs where userTags ??| array['sport','property']
Hopefully this works with your scenario!
As bob said just use ?? instead of ?
SQL(s"SELECT id FROM program WHERE document -> 'dept' ?? {dept}")
.on('dept -> "CS")