test JOOQ postgres jsonb column for key exists - postgresql

I've got a table TABLE that contains a jsonb column named tags. The tags element in each row may or may not contain a field called group. My goal is to group by tags.group for all rows where tags contains a group field. Like the following postgres query:
select tags->>'group' as group, sum(n) as sum
from TABLE
where tags ? 'group'
group by tags->>'group';
I'm trying to turn it into JOOQ and cannot find out how to express the where tags ? 'group' condition.
For example,
val selectGroup = DSL.field("{0}->>'{1}'", String::class.java, TABLE.TAGS, "group")
dsl().select(selectGroup, DSL.sum(TABLE.N))
.from(TABLE)
.where(TABLE.TAGS.contains("group"))
.groupBy(selectGroup)
This is equivalent to testing the contains condition @> in Postgres, but I need the exists condition ?. How can I express that in JOOQ?

There are two things worth mentioning here:
The ? operator in JDBC
Unfortunately, there's no good solution to this, as ? is currently strictly reserved as a bind variable placeholder in the PostgreSQL JDBC driver. So, even if you could find a way to send that character to the server through jOOQ, the JDBC driver would still misinterpret it.
A workaround is documented in this Stack Overflow question.
Plain SQL and string literals
When you're using the plain SQL templating language in jOOQ, beware that there is a parser that will parse certain tokens of your string, including e.g. comments and string literals. This means that your usage of...
DSL.field("{0}->>'{1}'", String::class.java, TABLE.TAGS, "group")
is incorrect, as '{1}' will be parsed as a string literal and sent to the server as is. If you want to use a variable string literal, do this instead:
DSL.field("{0}->>{1}", String::class.java, TABLE.TAGS, DSL.inline("group"))
See also DSL.inline()

Related

How can I prevent SQL injection with arbitrary JSONB query string provided by an external client?

I have a basic REST service backed by a PostgreSQL database with a table with various columns, one of which is a JSONB column that contains arbitrary data. Clients can store data filling in the fixed columns and provide any JSON as opaque data that is stored in the JSONB column.
I want to allow the client to query the database with constraints on both the fixed columns and the JSONB. It is easy to translate some query parameters like ?field=value and convert that into a parameterized SQL query for the fixed columns, but I want to add an arbitrary JSONB query to the SQL as well.
This JSONB query string could contain SQL injection. How can I prevent this? I think that because the structure of the JSONB data is arbitrary, I can't use a parameterized query for this purpose. All the documentation I can find suggests using parameterized queries, and I can't find any useful information on how to actually sanitize the query string itself, which seems like my only option.
For example a similar question is:
How to prevent SQL Injection in PostgreSQL JSON/JSONB field?
But I can't apply the same solution, as I don't know the structure of the JSONB or the query. I can't assume the client wants to query a particular path using a particular operator; the entire JSONB query needs to be freely provided by the client.
I'm using golang, in case there are any existing libraries or code fragments that I can use.
edit: some example queries on the JSONB that the client might do:
(content->>'company') is NULL
(content->>'income')::numeric>80000
content->'company'->>'name'='EA' AND (content->>'income')::numeric>80000
content->'assets' @> '[{"kind":"car"}]'
(content->>'DOB')::TIMESTAMP<'2000-01-30T10:12:18.120Z'::TIMESTAMP
EXISTS (SELECT FROM jsonb_array_elements(content->'assets') asset WHERE (asset->>'value')::numeric > 100000)
Note that these don't cover all possible types of queries. Ideally I want any query that PostgreSQL supports on the JSONB data to be allowed. I just want to check the query to ensure it doesn't contain SQL injection. For example, a simplistic and probably inadequate solution would be to not allow any ";" in the query string.
You could allow the users to specify a path within the JSON document, and then parameterize that path within a call to a function like json_extract_path_text (or jsonb_extract_path_text for a jsonb column). That is, the WHERE clause would look like:
WHERE jsonb_extract_path_text(data, VARIADIC $1::text[]) = $2
The path argument is a list of text elements, easily parameterized, which names the keys and array indexes to traverse down to the given value. The function does not parse a dotted string itself, so a client-facing path such as 'foo.bars[0].name' has to be split into the elements foo, bars, 0, name before it is bound. The right-hand side of the clause would be parameterized along the same rules as you're using for fixed column filtering.
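As a rough sketch of the path approach, the snippet below splits a client-supplied dotted path into separate path elements and builds a parameterized WHERE fragment. The function names split_json_path and build_json_filter, the content column, the dotted-path syntax, and the $1-style placeholders are all assumptions made for illustration. Only the path elements and the comparison value travel as bind parameters, never as SQL text.

```python
import re

# One path step: a plain key, or a non-negative array index in brackets.
_STEP = re.compile(r"([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def split_json_path(path):
    """Split a dotted path into the text elements the path functions expect."""
    elements = []
    for segment in path.split("."):
        pos = 0
        while pos < len(segment):
            match = _STEP.match(segment, pos)
            if match is None:
                raise ValueError(f"invalid path segment: {segment!r}")
            # Exactly one of the two groups is set for a successful match.
            elements.append(match.group(1) or match.group(2))
            pos = match.end()
        if pos == 0:
            raise ValueError("empty path segment")
    return elements


def build_json_filter(path, value):
    """Return (sql_fragment, params); assumes a driver with $1-style placeholders."""
    sql = "jsonb_extract_path_text(content, VARIADIC $1::text[]) = $2"
    return sql, [split_json_path(path), value]
```

Anything that is not a plain key or a bracketed index is rejected before a query is built, so the SQL text stays constant regardless of what the client sends.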

SSRS multi value parameter - can't get it to work

First off this is my first attempt at a multi select. I've done a lot of searching but I can't find the answer that works for me.
I have a postgresql query which has bg.revision_key in (_revision_key) which holds the parameter. A side note, we've named all our parameters in the queries with the underscore and they all work, they are single select in SSRS.
In my SSRS report I have a parameter called Revision Key Segment which is the multi select parameter. I've ticked Allow multi value and in Available Values I have value field pointing to revision_key in the dataset.
In my dataset parameter options I have Parameter Value [@revision_key]
In my shared dataset I also have my parameter set to Allow multi value.
For some reason I can't seem to get the multi select to work, so I must be missing something somewhere, but I've run out of ideas.
Parameter support over an ODBC connection differs from what you get with SQL Server: you cannot use named parameters and instead have to use the positional ? syntax.
To accommodate multiple values, you can concatenate them into a single string and use a LIKE condition to search them. However, this is inefficient. Another approach is to use a function to split the values into an in-line table.
In PostgreSQL you can use an expression like this:
inner join (select CAST(regexp_split_to_table(?, ',') AS int) as filter) as my on my.filter = key_column
Then in the dataset properties, under the parameters tab, use an expression like this to concatenate the values:
=Join(Parameters!Keys.Value, ",")
In other words, the report is concatenating the values into a comma-separated list. The database is splitting them into a table of integers then inner joining on the values.
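The round trip can be simulated outside the report to check what each side sees. The sketch below mirrors the SSRS Join expression and the PostgreSQL split-and-cast step in Python. The function names are hypothetical, and this only illustrates the data flow, not how either product implements it.

```python
def join_keys(selected):
    """Mimic =Join(Parameters!Keys.Value, ","): build one comma-separated string."""
    return ",".join(str(key) for key in selected)


def split_keys(joined):
    """Mimic CAST(regexp_split_to_table(?, ',') AS int): one integer per item."""
    # int() raises ValueError on a non-numeric item, much as the SQL cast would fail.
    return [int(item) for item in joined.split(",")]
```

A value that cannot be converted to an integer fails at the cast, which is one reason the split approach is stricter than matching the concatenated string with LIKE.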

Using arrays with pg-promise

I'm using pg-promise and don't understand how to run this query. The first query works, but I would like to use pg-promise's safe character escaping, and when I try the second query, it doesn't work.
Works:
db.any(`SELECT title FROM books WHERE id = ANY ('{${ids}}') ORDER BY id`)
Doesn't work:
db.any(`SELECT title FROM books WHERE id = ANY ($1) ORDER BY id`, ids)
The example has 2 problems. First, it goes against what the documentation tells you:
IMPORTANT: Never use the reserved ${} syntax inside ES6 template strings, as those have no knowledge of how to format values for PostgreSQL. Inside ES6 template strings you should only use one of the 4 alternatives - $(), $<>, $[] or $//.
Manual query formatting, like in your first example, is a very bad practice, resulting in bad things, ranging from broken queries to SQL injection.
And the second issue is that after switching to the correct SQL formatting, you should use the CSV Filter to properly format the list of values:
db.any(`SELECT title FROM books WHERE id IN ($/ids:csv/) ORDER BY id`, {ids})
or via an index variable:
db.any(`SELECT title FROM books WHERE id IN ($1:csv) ORDER BY id`, [ids])
Note that I also changed from the ANY operator to IN, as we are providing a list of open values here.
And you can use filter :list interchangeably, whichever you like.

need help, bug in query select with isnull

I think this may be a bug in Yii2.
When I use:
$query->select ("isnull(cba.HSLET,0)");
$query->join('INNER JOIN','cba','cba.id = regverification.cba_id');
Yii generates the SQL statement incorrectly, resulting in an error near:
isnull(cba.HSLET, [0)]
From Yii 2 Guide:
Besides column names, you can also select DB expressions. You must use the array format when selecting a DB expression that contains commas to avoid incorrect automatic name quoting.
So it should be:
$query->select(["ISNULL(cba.HSLET, 0)"]);

Parameterized SQL Columns?

I have some code that uses parameterized queries to protect against injection, but I also need to be able to construct the query dynamically, regardless of the structure of the table. What is the proper way to do this?
Here's an example, say I have a table with columns Name, Address, Telephone. I have a web page where I run Show Columns and populate a select drop-down with them as options.
Next, I have a textbox called Search. This textbox is used as the parameter.
Currently my code looks something like this:
result = pquery('SELECT * FROM contacts WHERE `' + escape(column) + '`=?', search);
I get an icky feeling from it though. The reason I'm using parameterized queries is to avoid using escape. Also, escape is likely not designed for escaping column names.
How can I make sure this works the way I intend?
Edit:
The reason I require dynamic queries is that the schema is user-configurable, and I will not be around to fix anything hard-coded.
Instead of passing the column names, just pass an identifier that your code will translate to a column name using a hardcoded table. This means you don't need to worry about malicious data being passed, since all the data is either translated legally or is known to be invalid. Pseudo-ish code:
my @columns = qw/Name Address Telephone/;
my $query;
# $param is an index sent by the client; accept only a known, non-negative index
if ($param =~ /^\d+$/ && $columns[$param]) {
    $query = "select * from contacts where $columns[$param] = ?";
} else {
    die "Invalid column!";
}
run_sql($query, $search);
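The same lookup idea can be sketched with a dictionary, so that the client sends an opaque key and never a column name. The names ALLOWED_COLUMNS and build_search_query are hypothetical, and the contacts table and backtick quoting are taken from the question's example.

```python
# Keys are what the client may send; values are the real column names.
ALLOWED_COLUMNS = {
    "name": "Name",
    "address": "Address",
    "phone": "Telephone",
}


def build_search_query(column_key):
    """Translate a client-supplied key into SQL text; the search value stays a bind parameter."""
    try:
        column = ALLOWED_COLUMNS[column_key]
    except KeyError:
        raise ValueError(f"invalid column: {column_key!r}") from None
    # Only values from the fixed mapping ever reach the SQL text.
    return f"SELECT * FROM contacts WHERE `{column}` = ?"
```

With a user-configurable schema, the mapping could be filled from the result of Show Columns at startup instead of being written by hand; the lookup step stays the same.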
The trick is to be confident in your escaping and validating routines. I use my own SQL escape function that is overloaded for literals of different types. Nowhere do I insert expressions (as opposed to quoted literal values) directly from user input.
Still, it can be done. I recommend a separate, strict function for validating the column name. Allow it to accept only a single identifier, something like
/^\w[\w\d_]*$/
You'll have to rely on assumptions you can make about your own column names.
I use ADO.NET with SqlCommand and SqlParameter objects, which take care of the escaping problem. If you are in a Microsoft tool environment as well, I can say that I use this very successfully to build dynamic SQL while still protecting my parameters.
Best of luck.
Make the column based on the results of another query to a table that enumerates the possible schema values. In that second query you can hardcode the select to the column name that is used to define the schema. If no rows are returned, then the entered column is invalid.
In standard SQL, you enclose delimited identifiers in double quotes. This means that:
SELECT * FROM "SomeTable" WHERE "SomeColumn" = ?
will select from a table called SomeTable with the shown capitalization (not a case-converted version of the name), and will apply a condition to a column called SomeColumn with the shown capitalization.
By itself, that's not very helpful, but if you can apply the escape() technique with double quotes to the names entered via your web form, then you can build up your query reasonably confidently.
Of course, you said you wanted to avoid using escape - and indeed you don't have to use it on the parameters where you provide the ? place-holders. But where you are putting user-provided data into the query, you need to protect yourself from malicious people.
Different DBMS have different ways of providing delimited identifiers. MS SQL Server, for instance, seems to use square brackets [SomeTable] instead of double quotes.
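A minimal sketch of delimited-identifier quoting follows, assuming the usual rule that the closing delimiter is escaped by doubling it inside the name. The function name quote_identifier and the style labels are hypothetical, and a real application should prefer the quoting helper of its own database driver where one exists.

```python
def quote_identifier(name, style="standard"):
    """Wrap a name in identifier delimiters, doubling any embedded closing delimiter."""
    if not name or "\x00" in name:
        raise ValueError("identifier must be non-empty and contain no NUL character")
    if style == "standard":   # standard SQL: "Some""Column"
        return '"' + name.replace('"', '""') + '"'
    if style == "brackets":   # MS SQL Server: [Some]]Column]
        return "[" + name.replace("]", "]]") + "]"
    if style == "backtick":   # MySQL: `Some``Column`
        return "`" + name.replace("`", "``") + "`"
    raise ValueError(f"unknown quoting style: {style!r}")
```

Quoting only keeps the name from breaking out of the identifier; whether the quoted name refers to an existing column is still a separate check.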
Column names in some databases can contain spaces, which means you'd have to quote the column name. But if your database contains no such columns, just run the column name through a regular expression or some sort of check before splicing it into the SQL:
if ( $column !~ /^\w+$/ ) {
die "Bad column name [$column]";
}
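The same check can be sketched in Python, under the assumption that valid column names are plain ASCII identifiers that do not start with a digit. The name is_safe_column_name is hypothetical.

```python
import re

# A letter or underscore first, then letters, digits or underscores only.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_safe_column_name(name):
    """Return True only if the whole string is a single plain identifier."""
    # fullmatch must consume the entire string, so a trailing newline is rejected too.
    return _IDENTIFIER.fullmatch(name) is not None
```

Matching the whole string matters here: a pattern anchored with $ accepts a name followed by a newline in both Perl and Python, which a full match avoids.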