Say I have a table in Postgres with a column data of type JSONB. This column contains a fairly complex object, for example:
{
...,
gender: ['men', 'women'],
...
}
I have a query like gender=men&gender=women&gender=something_else and want to find all rows in the table where ANY of gender's members is IN ('men', 'women', 'something_else'). For example:
SELECT uuid, data ->> 'gender' FROM "OX_Articles" WHERE data ->> 'gender' INTERSECTS WITH ('men', 'women', 'something_else');
Of course, there is no INTERSECTS WITH keyword.
Either in (...) or = any(array[...]) should work.
They should perform similarly. I favor = any because it handles an empty right-hand side (whereas IN can't handle an empty literal tuple), and I'd expect whatever Postgres bindings I use to convert the host language's arrays or lists to pg arrays, not pg tuples.
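Since gender is a JSON array rather than a single value, one way to apply = any to it is to expand the array first. A minimal sketch, assuming data -> 'gender' is always an array of strings when the key is present (the aliases a, g and val are only illustrative):

```sql
-- Rows whose gender array shares at least one member with the given list.
-- jsonb_array_elements_text() turns the JSON array into a set of text values,
-- so each element can be compared with = ANY.
SELECT a.uuid, a.data -> 'gender' AS gender
FROM "OX_Articles" a
WHERE EXISTS (
    SELECT 1
    FROM jsonb_array_elements_text(a.data -> 'gender') AS g(val)
    WHERE g.val = ANY (ARRAY['men', 'women', 'something_else'])
);

-- Possible alternative on 9.4+: the jsonb ?| operator is true when any of the
-- strings in the text array exists as an element of the JSON array.
SELECT uuid, data -> 'gender' AS gender
FROM "OX_Articles"
WHERE data -> 'gender' ?| ARRAY['men', 'women', 'something_else'];
```

The first form raises an error if gender holds a scalar instead of an array, so it relies on the assumption stated above.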
Related
I have a jsonb column in a PostgreSQL table:
{
.....
"type": "car",
"vehicleIds": [
"980e3761-935a-4e52-be77-9f9461dec4d1","980e3761-935a-4e52-be77-9f9461dec4d2"
]
.....
}
Application runs queries against these fields to fetch records. I need to index this column only for these fields.
How can this be done?
This is the query structure, with properties as the column name:
SELECT *
FROM Vehicle f
WHERE f.properties::text @@ CONCAT('$.vehicleIds[*] >', :vehicleId )= true
AND f.properties::text @@ CONCAT('$.type >', :type ) = true
The query you are using is highly confusing: because @@ is applied to a text value, it boils down to a text search query.
I also don't understand the '$.type > ... condition. With values like car I would expect an equality operator rather than "greater than". Using > with a UUID doesn't seem to make sense either.
If you want to find rows of type car whose vehicleIds contain a given list of IDs, the "contains" operator @> is a better way to do that:
SELECT *
FROM Vehicle f
WHERE f.properties @> '{"type": "car", "vehicleIds": ["980e3761-935a-4e52-be77-9f9461dec4d1"]}'
The above could make use of a GIN index on the properties column:
create index on vehicle using gin (properties);
If the type key is always queried with equality (which I assume), a combined index might be more efficient:
create index on vehicle using gin ( (properties ->> 'type'), (properties -> 'vehicleIds') );
You need to install the btree_gin extension in order to create that index.
That index would be a bit smaller but needs a different query:
SELECT *
FROM Vehicle f
WHERE f.properties ->> 'type' = 'car'
AND f.properties -> 'vehicleIds' @> '["980e3761-935a-4e52-be77-9f9461dec4d1"]'
You will need to check the execution plan to verify that the indexes are used and to see which one is more efficient.
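To compare the two options, something along these lines could be run against real data. This is only a sketch: it assumes the table is named vehicle (as in the FROM clause of the queries), and the index name is made up for illustration.

```sql
-- btree_gin provides the GIN operator class needed for the text
-- expression (properties ->> 'type') in a multicolumn GIN index.
CREATE EXTENSION IF NOT EXISTS btree_gin;

-- Hypothetical index name; table name assumed to be "vehicle".
CREATE INDEX vehicle_type_ids_idx
    ON vehicle USING gin ((properties ->> 'type'), (properties -> 'vehicleIds'));

-- EXPLAIN (ANALYZE, BUFFERS) executes the query and reports the plan that was
-- actually used, with timings and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM vehicle f
WHERE f.properties ->> 'type' = 'car'
  AND f.properties -> 'vehicleIds' @> '["980e3761-935a-4e52-be77-9f9461dec4d1"]';
```

A "Bitmap Index Scan" node naming the index means it was used; a "Seq Scan" means it was not. On a small table the planner may pick a sequential scan even when the index is usable, so test with realistic data volumes.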
I have a column (let's call it jsn) in my database holding a JSON object (actually stored as plain text, for reasons). The JSON object looks like this:
{"a":
{"b":[{"num":123, ...},
{"num":456, ...},
...,
{"num":789, ...}],
...
},
...
}
I'm interested in the biggest "num" inside that list of objects "b" inside the object "a".
If the list is of known length I can do it like this:
SELECT
GREATEST((jsn::json->'a'->'b'->0->>'num')::int,
(jsn::json->'a'->'b'->1->>'num')::int,
... ,
(jsn::json->'a'->'b'->N->>'num')::int)
FROM table
Note that I'm new to PostgreSQL (and database querying in general!) so that may be a rubbish way to do it. In any case it works. What I can't figure out is how to make this work when the list, 'b', is of arbitrary and unknown length.
In case it is relevant, I am using PostgreSQL 10 hosted on AWS RDS and running queries using pgAdmin 4.
You need to unnest the array; then you can apply max() to the result:
select max((n.x ->> 'num')::int)
from the_table t
cross join jsonb_array_elements(t.jsn::jsonb -> 'a' -> 'b') as n(x);
You probably want to add a group by so that you can tell which row each max value came from. Assuming your table has a unique column id:
select id, max((n.x ->> 'num')::int)
from the_table t
cross join jsonb_array_elements(t.jsn::jsonb -> 'a' -> 'b') as n(x)
group by id;
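To try the grouped query without touching a real table, here is a self-contained sketch that uses a CTE with two made-up rows in place of the_table (the sample data is invented for illustration). It extracts num with ->> and casts the text, which also works on PostgreSQL 10, where a jsonb value cannot be cast directly to int.

```sql
WITH the_table(id, jsn) AS (
    -- jsn is plain text here, as in the question
    VALUES
        (1, '{"a": {"b": [{"num": 123}, {"num": 456}, {"num": 789}]}}'),
        (2, '{"a": {"b": [{"num": 5}]}}')
)
SELECT t.id,
       max((n.x ->> 'num')::int) AS max_num
FROM the_table t
-- one output row per element of the array a.b
CROSS JOIN LATERAL jsonb_array_elements(t.jsn::jsonb -> 'a' -> 'b') AS n(x)
GROUP BY t.id
ORDER BY t.id;
-- id 1 -> 789, id 2 -> 5
```

Rows whose array is empty or missing produce no elements and therefore drop out of the result.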
I'm working in Postgres 9.6.5. I have the following table:
id | integer
data | jsonb
The data in the data column is nested, in the form:
{ 'identification': { 'registration_number': 'foo' }}
I'd like to index registration_number, so I can query on it. I've tried this (based on this answer):
CREATE INDEX ON mytable((data->>'identification'->>'registration_number'));
But got this:
ERROR: operator does not exist: text ->> unknown
LINE 1: CREATE INDEX ON psc((data->>'identification'->>'registration... ^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
What am I doing wrong?
You want:
CREATE INDEX ON mytable((data -> 'identification' ->> 'registration_number'));
The -> operator returns the value under the key as jsonb, while the ->> operator returns it as text. The most notable difference between the two is that ->> "unwraps" string values (i.e. removes the double quotes from the text representation).
The error you're seeing is reported because data ->> 'identification' returns text, and the subsequent ->> is not defined for the text type.
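The difference between the two operators can be seen on a literal value. A small sketch (the aliases v and d are only illustrative):

```sql
SELECT d -> 'identification' ->  'registration_number' AS as_jsonb,  -- jsonb value: "foo"
       d -> 'identification' ->> 'registration_number' AS as_text    -- text value:  foo
FROM (VALUES ('{"identification": {"registration_number": "foo"}}'::jsonb)) AS v(d);
```

Every step except the last uses ->, so the intermediate result stays jsonb and the next operator still applies.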
Since version 9.3, Postgres has had the #> and #>> operators. These operators let you specify a path (as an array of text) inside the jsonb column to get the value.
You could use #>> to achieve your goal more simply:
CREATE INDEX ON mytable((data #>> '{identification, registration_number}'));
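An expression index is only considered when the query uses the same expression that was indexed. A sketch of how the index and a matching lookup might fit together (the index name is hypothetical):

```sql
-- Hypothetical index name; same expression as the #>> form above.
CREATE INDEX mytable_registration_number_idx
    ON mytable ((data #>> '{identification, registration_number}'));

-- The WHERE clause repeats the indexed expression, so the index is a candidate.
SELECT id
FROM mytable
WHERE data #>> '{identification, registration_number}' = 'foo';

-- Written with -> / ->> instead, the condition is a different expression
-- to the planner and would not match the index above:
--   WHERE data -> 'identification' ->> 'registration_number' = 'foo'
```

Whichever spelling you index, use that same spelling in your queries.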
If I have a jsonb column called value with fields such as:
{"id": "5e367554-bf4e-4057-8089-a3a43c9470c0",
"tags": ["principal", "reversal", "interest"],,, etc}
how would I find all the records containing given tags, e.g:
if given: ["reversal", "interest"]
it should find all records with either "reversal" or "interest" or both.
My experimentation got me to this abomination so far:
select value from account_balance_updated
where value @> '{}' :: jsonb and value->>'tags' LIKE '%"principal"%';
of course this is completely wrong and inefficient
Assuming you are using PG 9.4+, you can use the jsonb_array_elements() function:
SELECT DISTINCT abu.*
FROM account_balance_updated abu,
jsonb_array_elements(abu.value->'tags') t
WHERE t.value <@ '["reversal", "interest"]'::jsonb;
As it turned out you can use cool jsonb operators described here:
https://www.postgresql.org/docs/9.5/static/functions-json.html
so original query doesn't have to change much:
select value from account_balance_updated
where value @> '{}' :: jsonb and value->'tags' ?| array['reversal', 'interest'];
In my case I also needed to escape the ? (writing ??|) because I am using a JDBC "prepared statement", where you pass the query string and parameters separately and question marks act as placeholders for the parameters:
https://docs.oracle.com/javase/tutorial/jdbc/basics/prepared.html
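If the ?| query above has to run against a large table, a GIN index on the tags expression is one way it could be supported. A sketch, with a made-up index name, assuming value is always a JSON object so the value @> '{}' condition can be dropped:

```sql
-- Hypothetical index name. The default jsonb GIN operator class
-- supports the ?, ?| and ?& existence operators as well as @>.
CREATE INDEX account_balance_updated_tags_idx
    ON account_balance_updated USING gin ((value -> 'tags'));

-- Same expression on the left of ?| as in the index definition.
SELECT value
FROM account_balance_updated
WHERE value -> 'tags' ?| array['reversal', 'interest'];
```

Whether the planner actually chooses the index depends on table size and statistics, so it is worth confirming with EXPLAIN.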
Let's say I have a table in Postgres that looks like this - note the zips field is json.
cities
name (text) | zips (json)
San Francisco | [94100, 94101, ...]
Washington DC | [20000, 20001, ...]
Now I want to do something like select * from cities where zip=94101, in other words, testing membership.
I tried using WHERE zips ? '94101' and got operator does not exist: json ? unknown.
I tried using WHERE zips->'94101' but was not sure what to put there, as Postgres complained argument of WHERE must be type boolean, not type json.
What do I want here? How would I solve this for 9.3 and 9.4?
Edit: Yes, I know I should be using the native array type... the database adapter we are using doesn't support it.
In PostgreSQL 9.4+, you can use the @> operator with the jsonb type:
create table test (city text, zips jsonb);
insert into test values ('A', '[1, 2, 3]'), ('B', '[4, 5, 6]');
select * from test where zips @> '[1]';
An additional advantage of this approach is that 9.4's new GIN indexes can speed up such queries on big tables.
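For the test table above, such an index might be created like this (a sketch; the index name is made up):

```sql
-- Hypothetical index name. A GIN index on a jsonb column
-- supports the @> containment operator.
CREATE INDEX test_zips_idx ON test USING gin (zips);

-- With the two rows inserted above, only city 'A' contains 1.
SELECT city
FROM test
WHERE zips @> '[1]';
```

With only two rows the planner will most likely still scan the table sequentially; the index matters on large tables.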
For PostgreSQL 9.4+, you should use json[b]_array_elements_text():
(the existence operator ? does something similar, but for a JSON array it only finds exact matches against string elements, so it can only match if your array contains strings, not numbers)
create table cities (
city text,
zips jsonb
);
insert into cities (city, zips) values
('Test1', '[123, 234]'),
('Test2', '[234, 345]'),
('Test3', '[345, 456]'),
('Test4', '[456, 123]'),
('Test5', '["123", "note the quotes!"]'),
('Test6', '"123"'), -- this is a string in json(b)
('Test7', '{"123": "this is an object, not an array!"}');
-- select * from cities where zips ? '123';
-- would yield 'Test5', 'Test6' & 'Test7', which is probably not what you want
-- this is a safe solution:
select cities.*
from cities
join jsonb_array_elements_text(
case jsonb_typeof(zips)
when 'array' then zips
else '[]'
end
) zip on zip = '123';
-- but you can use this simplified query, if you are sure,
-- your "zips" column only contains JSON arrays:
select cities.*
from cities
join jsonb_array_elements_text(zips) zip on zip = '123';
For 9.3, you can use json_array_elements() (& convert zips manually to text):
select cities.*
from cities
join json_array_elements(zips) zip on zip::text = '123';
Note: for 9.3, you can't make your query safe (at least not easily); you need to store only JSON arrays in the zips column. Also, the query above won't find any string matches; your array elements need to be numbers.
Note 2: for 9.4+ you can use the safe solution with json too, not just with jsonb, but you must call json_typeof(zips) instead of jsonb_typeof(zips).
Edit: actually, the @> operator is better in PostgreSQL 9.4+, as @Ainar-G mentioned (because it's indexable). A little side note: it only finds rows if your column and your query both use JSON numbers (or both use JSON strings, but not a mix).
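The number-versus-string caveat is easy to check on literals. A small sketch (the column aliases are only illustrative):

```sql
SELECT '[123, 234]'::jsonb @> '[123]'::jsonb   AS number_in_numbers,  -- true
       '[123, 234]'::jsonb @> '["123"]'::jsonb AS string_in_numbers,  -- false
       '["123"]'::jsonb    @> '["123"]'::jsonb AS string_in_strings;  -- true
```

So the value passed to @> has to use the same JSON type as the stored elements.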
For 9.3, you can use json_array_elements(). I can't test with jsonb in version 9.4 right now.
create table test (
city varchar(35) primary key,
zips json not null
);
insert into test values
('San Francisco', '[94101, 94102]');
select *
from (
select *, json_array_elements(zips)::text as zip from test
) x
where zip = '94101';