Search jsonb fields in postgresql with Hasura - postgresql

Is it possible to do a greater-than search on a jsonb field using Hasura?
It looks to be possible in PostgreSQL itself; see "How can I do less than, greater than in JSON Postgres fields?".
In Postgres I have a table:
asset
name: string
version: int
metadata: jsonb
The metadata looks like this:
{"length": 5}
I am able to find assets that match exactly using _contains:
{
asset(where:{metadata : {_contains : {length: 5}}}){
name
metadata
}
}
I would like to be able to find assets with a length over 10.
I tried:
{
asset(where:{metadata : {_gt : {length: 10}}}){
name
metadata
}
}

A. Possibility to do on graphql level directly
The Hasura documentation on JSONB operators (_contains, _has_key, etc.) lists only five operators:
The _contains, _contained_in, _has_key, _has_keys_any and _has_keys_all operators are used to filter based on JSONB columns.
So the direct answer to your question is no: it is not possible at the GraphQL level in Hasura.
(At least not yet; more operators may be implemented in future releases.)
B. Using derived views
But there is another way, the one explained in https://hasura.io/blog/postgres-json-and-jsonb-type-support-on-graphql-41f586e47536/#derived-data
This recommendation is repeated in: https://github.com/hasura/graphql-engine/issues/6331
We don't have operators like that for JSONB (might be solved by something like #5211) but you can use a view or computed field to flatten the text field from the JSONB column into a new column and then do a like on that.
Recipe is:
1. Create a view
CREATE VIEW assets -- note the plural here; name the view according to your style guide
AS
SELECT
name,
version,
metadata,
(metadata->>'length')::int as meta_len -- cast to other number type if needed
FROM asset;
2. Register (track) this view in Hasura
3. Use it in GraphQL queries like a regular table
E.g.
query {
  assets(where: {meta_len: {_gt: 10}}) {
    name
    metadata
  }
}
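To avoid hard-coding the threshold, the same view query can be sent with a GraphQL variable. The following is a minimal Python sketch that only builds the request body. It assumes the view is tracked as `assets` with an integer `meta_len` column as above; the operation name `LongAssets`, the variable name `min_len` and the helper `build_request` are made up for illustration, and actually POSTing the body to your Hasura endpoint is left out.

```python
import json

# Query against the tracked view; $min_len replaces the hard-coded 10.
ASSETS_QUERY = """
query LongAssets($min_len: Int!) {
  assets(where: {meta_len: {_gt: $min_len}}) {
    name
    metadata
  }
}
"""


def build_request(min_len):
    """Return the JSON body for a GraphQL POST request (hypothetical helper)."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(min_len, bool) or not isinstance(min_len, int):
        raise TypeError("min_len must be an int")
    return json.dumps({"query": ASSETS_QUERY, "variables": {"min_len": min_len}})
```

Passing the threshold as a variable keeps the query text constant, so only the `variables` object changes between requests.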
C. Using SETOF-functions
1. Create SETOF-function
CREATE FUNCTION get_assets(min_length int DEFAULT 0)
RETURNS SETOF asset
LANGUAGE SQL
STABLE
AS $$
SELECT * FROM asset
WHERE
(metadata->>'length')::int > min_length;
$$;
2. Register in hasura
3. Use in queries
query {
  get_assets(args: {min_length: 10}) {
    name
    metadata
  }
}
I think those are all the available options.
They will not give you full "schemaless freedom", which may be what you are looking for, but I don't know of any other ways.

Related

Redshift Spectrum table doesnt recognize array

I ran a crawler on a JSON file in S3 to update an existing external table.
Once it finished, I checked SVL_S3LOG to see the structure of the external table and saw that it had been updated, with a new column of type Array<int> as expected.
When I tried to execute select * on the external table, I got this error: "Invalid operation: Nested tables do not support '*' in the SELECT clause.;"
So I tried listing all the column names in the select statement:
select name, date, books.... (books is the Array<int> type)
from external_table_a1
and got this error:
"Invalid operation: column "books" does not exist in external_table_a1;"
I also checked the table external_table_a1 in AWS Glue and saw that the column "books" is recognized and has the type Array<int>.
Can someone explain why my simple query is wrong?
What am I missing?
Querying JSON data is a bit of a hassle with Redshift: when parsing is enabled (eg using the appropriate SerDe configuration) the JSON is stored as a SUPER type. In your case that's the Array<int>.
The AWS documentation on querying semistructured data seems pretty straightforward, mentioning that PartiQL uses "dotted notation and array subscript for path navigation when accessing nested data". This doesn't work for me, although I can't find any reason for that in their SUPER limitations documentation.
Solution 1
What I have to do is set two flags, set json_serialization_enable to true; and set json_serialization_parse_nested_strings to true;, which serialize the SUPER type as JSON (i.e. back to JSON). I can then use JSON functions to query the data. Unnesting data gets even crazier, because you can only use the unnest syntax select item from table as t, t.items as item on SUPER types. I genuinely don't think that this is the intended way to query and unnest SUPER objects, but it's the only approach that worked for me.
This was described in an older "Amazon Redshift Developer Guide".
Solution 2
When you write a query, Redshift tries to fit the output into one of the basic column data types. If the result of your query does not match any of those types, Redshift will not process the query. Hence, in order to convert a SUPER to a compatible type you will have to unnest it (using the rather peculiar Redshift unnest syntax).
For me, this works in certain cases, but I'm not always able to properly index arrays, nor can I access the array index (using the my_table.array_column as array_entry at array_index syntax).

How can I prevent SQL injection with arbitrary JSONB query string provided by an external client?

I have a basic REST service backed by a PostgreSQL database with a table with various columns, one of which is a JSONB column that contains arbitrary data. Clients can store data filling in the fixed columns and provide any JSON as opaque data that is stored in the JSONB column.
I want to allow the client to query the database with constraints on both the fixed columns and the JSONB. It is easy to translate some query parameters like ?field=value and convert that into a parameterized SQL query for the fixed columns, but I want to add an arbitrary JSONB query to the SQL as well.
This JSONB query string could contain SQL injection. How can I prevent this? I think that because the structure of the JSONB data is arbitrary, I can't use a parameterized query for this purpose. All the documentation I can find suggests using parameterized queries, and I can't find any useful information on how to actually sanitize the query string itself, which seems like my only option.
For example a similar question is:
How to prevent SQL Injection in PostgreSQL JSON/JSONB field?
But I can't apply the same solution as I don't know the structure of the JSONB or the query, I can't assume the client wants to query a particular path using a particular operator, the entire JSONB query needs to be freely provided by the client.
I'm using golang, in case there are any existing libraries or code fragments that I can use.
edit: some example queries on the JSONB that the client might do:
(content->>'company') is NULL
(content->>'income')::numeric>80000
content->'company'->>'name'='EA' AND (content->>'income')::numeric>80000
content->'assets' @> '[{"kind":"car"}]'
(content->>'DOB')::TIMESTAMP<'2000-01-30T10:12:18.120Z'::TIMESTAMP
EXISTS (SELECT FROM jsonb_array_elements(content->'assets') asset WHERE (asset->>'value')::numeric > 100000)
Note that these don't cover all possible types of queries. Ideally I want any query that PostgreSQL supports on the JSONB data to be allowed. I just want to check the query to ensure it doesn't contain sql injection. For example, a simplistic and probably inadequate solution would be to not allow any ";" in the query string.
You could allow the users to specify a path within the JSON document, and then pass that path as a parameter to the path-extraction operator #>> (or the equivalent function jsonb_extract_path_text). That is, the WHERE clause would look like:
WHERE content #>> $1::text[] = $2
The path argument is a text array, easily parameterized, which lists the keys and array indexes to traverse down to the given value, e.g. '{foo,bars,0,name}'. PostgreSQL does not parse a dotted string such as 'foo.bars[0].name' itself, so if clients send that form, split it into path elements in the application first. The right-hand side of the clause would be parameterized along the same rules as you're using for fixed column filtering.
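A minimal sketch of that idea in Python (the same shape carries over to Go): the client-supplied path is split into keys and indexes and passed as a bound text[] parameter for `content #>> ...`, so no client text is ever concatenated into the SQL. The table name `records`, the helper names, the restriction to identifier-like keys, and the `%s` placeholder style (as used by psycopg) are all assumptions for illustration; the dotted syntax is parsed by the application, not by PostgreSQL.

```python
import re

# One path step: a key such as "bars", optionally followed by indexes "[0][1]".
_STEP = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)((?:\[\d+\])*)$")


def parse_path(dotted):
    """Turn 'foo.bars[0].name' into ['foo', 'bars', '0', 'name']."""
    parts = []
    for step in dotted.split("."):
        match = _STEP.match(step)
        if match is None:
            raise ValueError(f"invalid path step: {step!r}")
        parts.append(match.group(1))
        # Array indexes become plain text elements of the path.
        parts.extend(re.findall(r"\d+", match.group(2)))
    return parts


def build_filter(dotted, value):
    """Return (sql, params); path and value travel only as bound parameters."""
    sql = "SELECT * FROM records WHERE content #>> %s::text[] = %s"
    return sql, [parse_path(dotted), value]
```

This covers only text equality on a single path. The free-form expressions listed in the question (casts, EXISTS subqueries, comparisons) would need a fixed whitelist of operators and casts on top of it.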

Parametric query and hstore in PostgreSQL

I have a query with one parameter and am using jmoiron/sqlx to run it against a Nominatim database that has an hstore field "name". The query looks like this:
SELECT place_id, parent_place_id, name->'name:ru' as name from placesx WHERE admin_level = 3 and parent_place_id IN (?)
The problem is that when I use the sqlx.In, sqlx.Bind and sqlx.Prepare functions, sqlx takes :ru as a query parameter and complains about it.
The question is: how can this be avoided, so that I can retrieve a specific locale value ('name:en', 'name:de', etc.) from the hstore without this collision?
So far I use a regular expression and do not unmarshal the string into an hstore map[string]string, since I couldn't figure out how to retrieve a value from it by key.

ERROR: data type tstzrange[] has no default operator class for access method "gist" in Postgres 10

I am trying to set an index to a tstzrange[] column in PostgreSQL 10. I created the column via the pgAdmin 4 GUI, set its name and data type as tstzrange[] and set it as not null, nothing more.
I then did a CREATE EXTENSION btree_gist; for the database and it worked.
Then I saw in the documentation that I should index the range, so I ran:
CREATE INDEX era_ac_range_idx ON era_ac USING GIST (era_ac_range);
...but then I get:
ERROR: data type tstzrange[] has no default operator class for access method "gist"
Frankly, I don't know what this actually means or how to solve it. What should I do?
PS: that column is currently empty; it has no data yet.
PS2: this table describes chronological eras. There is an id, the era name (e.g. the sixties) and the time range (e.g. 1960-1969).
A date is entered by the user and I want to check which era it belongs to.
Well, you have an array of timestamp ranges as a single column. You can index an array with a GIN index and a range with GiST or SP-GiST. However, I'm not sure how an index on a column that is both would operate; the error means that GiST has no built-in operator class for the tstzrange[] type. I guess you could model it as an N-dimensional r-tree or some such.
I'm assuming you want to check for overlapping ranges. Could you normalise the data and have a linked table with one range in each row?
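To illustrate the normalised layout, here is a small Python sketch of the lookup it enables: one row per range and a containment test per row, which is what a plain tstzrange column with the @> operator (and a GiST index on it) would do inside PostgreSQL. The row layout and the names `ERA_RANGES` and `eras_containing` are hypothetical, and ranges are treated as half-open [lower, upper), the default bounds of the tstzrange constructor.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# One (era_name, lower, upper) row per range instead of an array per era.
ERA_RANGES = [
    ("the sixties", datetime(1960, 1, 1, tzinfo=UTC), datetime(1970, 1, 1, tzinfo=UTC)),
    ("the seventies", datetime(1970, 1, 1, tzinfo=UTC), datetime(1980, 1, 1, tzinfo=UTC)),
]


def eras_containing(moment):
    """Return names of eras whose half-open range [lower, upper) contains moment."""
    return [name for name, lower, upper in ERA_RANGES if lower <= moment < upper]
```

An era with several disjoint ranges simply gets several rows with the same name.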

Postgres with json column

I have a column in a Postgres database that holds JSON like this:
{"predict":[{"method":"A","val":1.2},{"method":"B","val":1.7}]}
I would like to extract both val values as separate columns. Is there a way I could do this from within Postgres?
Postgres introduced the JSON type in 9.2 and the JSON extraction functions and operators in 9.3. If your column is a JSON type you can use them to do your extraction.
If the json always has the structure you indicate (i.e. a key "predict" that holds an array with two JSON objects each having a "method" and a "val" key) then the solution is simply:
SELECT my_json->'predict'->0->'val' AS method_a,
my_json->'predict'->1->'val' AS method_b
FROM my_table;
If the structure can vary then you'd have to tell us more about it to provide you with a solution.
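If the order of the array is not guaranteed, one option is to look the values up by their "method" key instead of by position. A minimal Python sketch of that idea on the sample document (the helper name `vals_by_method` is hypothetical; inside PostgreSQL a similar lookup could be built on json_array_elements):

```python
import json

SAMPLE = '{"predict":[{"method":"A","val":1.2},{"method":"B","val":1.7}]}'


def vals_by_method(document):
    """Map each entry's "method" to its "val", independent of array order."""
    data = json.loads(document)
    return {entry["method"]: entry["val"] for entry in data["predict"]}


vals = vals_by_method(SAMPLE)
method_a, method_b = vals["A"], vals["B"]
```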