I'm trying to select keys from JSONB type with true values. So far I managed to do that using this query but I feel like there is a better way:
SELECT json.key
FROM jsonb_each_text('{"aaa": true, "bbb": false}'::JSONB) json
WHERE json.value = 'true';
What I don't like is the WHERE clause where I'm comparing strings. Is there a way to cast it to boolean?
If yes, would it work for truthy and falsy values too? (explanation of truthy and falsy values in javascript: http://www.codeproject.com/Articles/713894/Truthy-Vs-Falsy-Values-in-JavaScript).
jsonb has an equality operator (=; unlike json), so you could write
SELECT key
FROM jsonb_each('{"aaa": true, "bbb": false}')
WHERE value = jsonb 'true'
(with jsonb_each_text() you rely on some JSON values' text representation).
You can even include some additional values, if you want:
WHERE value IN (to_jsonb(TRUE), jsonb '"true"', to_jsonb('truthy'::text))
IN uses the equality operator under the hood.
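On the truthy/falsy part of the question, here is a minimal sketch that mimics JavaScript truthiness for JSON values. The list of falsy values (false, null, 0 and the empty string) is an assumption about what should count as falsy; arrays and objects are treated as truthy, as in JavaScript. The sample document is made up for illustration.

```sql
-- Keep only keys whose value is not in an assumed list of "falsy" JSON values.
-- jsonb_each() yields a JSON null as jsonb 'null' (not SQL NULL), so NOT IN is safe here.
SELECT key
FROM jsonb_each('{"a": true, "b": false, "c": 0, "d": "", "e": null, "f": "x", "g": 1}'::jsonb)
WHERE value NOT IN (jsonb 'false', jsonb 'null', jsonb '0', jsonb '""');
```

For this input the query should return the keys a, f and g.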
I have a two-part question.
We have a PostgreSQL table with a jsonb column. The values in the column are valid JSON, but for some rows a node comes in as an array, whereas for others it comes in as an object.
For example, the JSON we receive could look like this (node4 is just an object):
"node1": {
"node2": {
"node3": {
"node4": {
"attr1": "7d181b05-9c9b-4759-9368-aa7a38b0dc69",
"attr2": "S1",
"UserID": "WebServices",
"attr3": "S&P 500*",
"attr4": "EI",
"attr5": "0"
}
}
}
}
Or like this (node4 is an array):
"node1": {
"node2": {
"node3": {
"node4": [
{
"attr1": "7d181b05-9c9b-4759-9368-aa7a38b0dc69",
"attr2": "S1",
"UserID": "WebServices",
"attr3": "S&P 500*",
"attr4": "EI",
"attr5": "0"
},
{
"attr1": "7d181b05-9c9b-4759-9368-aa7a38b0dc69",
"attr2": "S1",
"UserID": "WebServices",
"attr3": "S&P 500*",
"attr4": "EI",
"attr5": "0"
}
]
}
}
}
And I have to write a jsonpath query to extract, for example, attr1, for each PostgreSQL row containing this json. I would like to have just one jsonpath query that would always work irrespective of whether the node is object or array. So, I want to use a path like below, assuming, if it is an array, it will return the value for all indices in that array.
jsonb_path_query(payload, '$.node1.node2.node3.node4[*].attr1')#>> '{}' AS "ATTR1"
I would like to avoid checking whether the type is an array or an object and then running a separate query for each and doing a union.
Is it possible?
A sub-question related to above - Since I needed the output as text without the quotes, and somewhere I saw to use #>> '{}' - so I tried that and it is working, but can someone explain, how that works?
The second part of the question: the incoming JSON can have multiple sets of nested arrays, and the number of nodes is huge. So the other thing I would like to do is flatten the JSON into multiple rows. In the examples I found, one has to identify each level and use either a cross join or unnest. What I was hoping for is a way to flatten a node that is an array, including all of the parent information, without knowing which, if any, of its parents are arrays or simple objects. Is this possible as well?
Update
I tried to look at the documentation and tried to understand the #>> '{}' construct, and then I came to realise that '{}' is the right hand operand for the #>> operator which takes a path and in my case the path is the current attribute value hence {}. Looking at examples that had non-empty single attribute path helped me realise that.
Thank you
You can use a "recursive term" in the JSON path expression:
select t.some_column,
p.attr1
from the_table t
cross join jsonb_path_query(payload, 'strict $.**.attr1') as p(attr1)
Note that the strict modifier is required, otherwise, each value will be returned multiple times.
This will return one row for each key attr1 found in any level of the JSON structure.
For the given sample data, this would return:
attr1
--------------------------------------
"7d181b05-9c9b-4759-9368-aa7a38b0dc69"
"7d181b05-9c9b-4759-9368-aa7a38b0dc69"
"7d181b05-9c9b-4759-9368-aa7a38b0dc69"
"I would like to avoid checking whether the type is an array or an object and then running a separate query for each and doing a union. Is it possible?"
Yes, it is. Your jsonpath query works in both cases, whether node4 is a jsonb object or a jsonb array, because the wildcard array accessor [*] also accepts a jsonb object in lax mode, which is the default (but not in strict mode; see the manual). See the test results in dbfiddle.
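A minimal sketch of that behavior, using a shortened, made-up document with only node4 (PostgreSQL 12 or later is assumed for jsonb_path_query):

```sql
-- node4 is a single object: lax mode wraps it in an array before applying [*]
SELECT jsonb_path_query(
         '{"node4": {"attr1": "a"}}'::jsonb,
         '$.node4[*].attr1') #>> '{}' AS attr1;

-- node4 is an array: [*] iterates over its elements
SELECT jsonb_path_query(
         '{"node4": [{"attr1": "a"}, {"attr1": "b"}]}'::jsonb,
         '$.node4[*].attr1') #>> '{}' AS attr1;
```

The first query should return one row and the second two rows, with the same path in both cases.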
"I saw to use #>> '{}' - so I tried that and it is working, but can someone explain, how that works?"
The output of the jsonb_path_query function is of type jsonb, and a jsonb string is displayed with double quotes in the query results. The #>> operator extracts the value at a given path as text, which is displayed without the quotes. The empty text array '{}' is an empty path, so it simply points at the root of the passed jsonb value.
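As a small illustration (the literal "S1" is just a sample value), the difference can be seen by selecting the same jsonb string with and without the operator:

```sql
SELECT j                     AS as_jsonb,    -- jsonb string, shown with double quotes
       j #>> '{}'            AS as_text,     -- text, shown without quotes
       pg_typeof(j #>> '{}') AS result_type  -- text
FROM (VALUES ('"S1"'::jsonb)) AS t(j);
```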
"the incoming json can have multiple sets of nested arrays and the json and the number of nodes is huge. So other part I would like to do is flatten the json into multiple rows"
You can refer to the answer by a_horse_with_no_name, which uses the recursive wildcard member accessor .**
Is it possible to use array operators on a type of bytea[]?
For example:
CREATE TABLE test (
metadata bytea[]
);
SELECT * FROM test WHERE test.metadata && ANY($1);
// could not find array type for data type bytea[]
If it's not possible, is there an alternative approach without changing the type from bytea[]?
postgresql 12.x
Do not use ANY. Compare the arrays directly, using an array constructor and the array overlap operator (&&):
CREATE TABLE test (
metadata bytea[]
);
INSERT INTO public.test (metadata) VALUES('{"x","y"}');
SELECT * FROM test t WHERE metadata && array[E'\x78'::bytea];
When using ANY, the left-hand expression is evaluated and compared to each element of the right-hand array using the given operator, which must yield a Boolean result. So the original SQL was trying to evaluate something like bytea[] && bytea.
This applies not only to bytea[], but to any array type, e.g. text[] or integer[].
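A minimal sketch of the difference, using integer arrays with made-up values for readability:

```sql
-- scalar = ANY (array): the scalar is compared with each array element
SELECT 2 = ANY (ARRAY[1, 2, 3]) AS any_match;

-- array && array: true when the two arrays have at least one element in common
SELECT ARRAY[1, 2] && ARRAY[2, 5] AS arrays_overlap;
```

Both expressions evaluate to true. The first form needs a scalar on the left, the second needs arrays on both sides.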
I get this error when querying with a json column:
(psycopg2.ProgrammingError) operator does not exist: json = text
The column is defined as JSON with SQLAlchemy:
json_data = db.Column(db.JSON, nullable=False)
How do you compare with Postgres?
There is no equality (or inequality) operator for the data type json. If you need to test the value as a whole, you might cast to jsonb:
... WHERE json_data::jsonb = jsonb '{}';
Or cast to text for simple cases:
... WHERE json_data::text = '{}';
But there are many valid text representations for the same json value - which is the reason why Postgres does not implement equality / inequality operators for the type.
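To illustrate with two made-up literals that represent the same JSON object but differ in whitespace and key order:

```sql
SELECT '{"a": 1, "b": 2}'::json::text  = '{"b":2,"a":1}'::json::text  AS text_equal,   -- false: json keeps the input text
       '{"a": 1, "b": 2}'::json::jsonb = '{"b":2,"a":1}'::json::jsonb AS jsonb_equal;  -- true: jsonb compares the values
```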
See:
How to query a json column for empty objects?
I'm working in Postgres 9.6.5. I have the following table:
id | integer
data | jsonb
The data in the data column is nested, in the form:
{ 'identification': { 'registration_number': 'foo' }}
I'd like to index registration_number, so I can query on it. I've tried this (based on this answer):
CREATE INDEX ON mytable((data->>'identification'->>'registration_number'));
But got this:
ERROR: operator does not exist: text ->> unknown
LINE 1: CREATE INDEX ON psc((data->>'identification'->>'registration... ^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
What am I doing wrong?
You want:
CREATE INDEX ON mytable((data -> 'identification' ->> 'registration_number'));
The -> operator returns the value under the key as jsonb, and the ->> operator returns the value under the key as text. The most notable difference between the two operators is that ->> will "unwrap" string values (i.e. remove the double quotes from the text representation).
The error you're seeing is reported because data ->> 'identification' returns text, and the subsequent ->> is not defined for the text type.
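A quick way to see the types involved, sketched against an inline sample row rather than the real table:

```sql
SELECT data -> 'identification'                           AS nested_jsonb,
       pg_typeof(data -> 'identification')                AS arrow_type,         -- jsonb
       pg_typeof(data ->> 'identification')               AS double_arrow_type,  -- text
       data -> 'identification' ->> 'registration_number' AS registration_number
FROM (VALUES ('{"identification": {"registration_number": "foo"}}'::jsonb)) AS t(data);
```

Because data ->> 'identification' is already text, only the last step of the chain should use ->>.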
Since version 9.3 (for json; 9.4 for jsonb), Postgres has the #> and #>> operators. These operators let you specify a path (as an array of text) inside a jsonb column to get the value.
You could use this operator to achieve your goal in a simpler way.
CREATE INDEX ON mytable((data #>> '{identification, registration_number}'));
If I have a jsonb column called value with fields such as:
{"id": "5e367554-bf4e-4057-8089-a3a43c9470c0",
"tags": ["principal", "reversal", "interest"],,, etc}
how would I find all the records containing given tags, e.g:
if given: ["reversal", "interest"]
it should find all records with either "reversal" or "interest" or both.
My experimentation got me to this abomination so far:
select value from account_balance_updated
where value @> '{}' :: jsonb and value->>'tags' LIKE '%"principal"%';
of course this is completely wrong and inefficient
Assuming you are using PG 9.4+, you can use the jsonb_array_elements() function:
SELECT DISTINCT abu.*
FROM account_balance_updated abu,
jsonb_array_elements(abu.value->'tags') t
WHERE t.value <@ '["reversal", "interest"]'::jsonb;
As it turned out you can use cool jsonb operators described here:
https://www.postgresql.org/docs/9.5/static/functions-json.html
so original query doesn't have to change much:
select value from account_balance_updated
where value @> '{}' :: jsonb and value->'tags' ?| array['reversal', 'interest'];
In my case I also needed to escape the ? (writing ??|), because I am using a so-called "prepared statement", where you pass the query string and parameters to JDBC and question marks act as placeholders for the parameters:
https://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html