Find element in array of jsonb documents by value as case insensitive - postgresql

Assume I have a userinfo table with a person column containing the following jsonb object:
{
"skills": [
{
"name": "php"
},
{
"name": "Python"
}
]
}
To get the Python skill I would write the following query:
select * from userinfo
where person -> 'skills' @> '[{"name":"Python"}]'
It works well, but if I specify '[{"name":"python"}]' in lower case, it doesn't return what I want.
How can I write a case-insensitive query here?
The PostgreSQL version is 11.2.

You can do that by unnesting the array inside an EXISTS predicate:
select u.*
from userinfo u
where exists (select *
from jsonb_array_elements(u.person -> 'skills') as s(j)
-- make sure to only do this for rows that actually contain an array
where jsonb_typeof(u.person -> 'skills') = 'array'
and lower(s.j ->> 'name') = 'python');
Online example: https://rextester.com/XKVUA73952
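A minimal self-contained sketch of the same idea that runs without the real table. The inline userinfo rows and the id column are made up for illustration. The CASE guard is a swapped-in variation on the jsonb_typeof check above: it hands jsonb_array_elements an empty array whenever skills is not an array, so the function never sees a scalar.

```sql
-- Hypothetical sample rows standing in for the userinfo table.
with userinfo(id, person) as (
  values
    (1, '{"skills": [{"name": "php"}, {"name": "Python"}]}'::jsonb),
    (2, '{"skills": [{"name": "Java"}]}'::jsonb),
    (3, '{"skills": "none"}'::jsonb)   -- skills is a scalar here, not an array
)
select u.id
from userinfo u
where exists (
  select 1
  from jsonb_array_elements(
         -- only unnest real arrays; anything else becomes an empty array
         case when jsonb_typeof(u.person -> 'skills') = 'array'
              then u.person -> 'skills'
              else '[]'::jsonb
         end
       ) as s(j)
  -- lower-case both sides so the search term may be given in any case
  where lower(s.j ->> 'name') = lower('PYTHON')
);
```

With this sample data only the row with id 1 is returned.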

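AFAIK there is no built-in JSON function for that. So you have to convert the whole JSON string to lower case: cast it to text, lower-case it, and cast it back to jsonb. Note that this also lower-cases every key and every other value in the document, so the key names used in the query must be lower case too:
WHERE lower(person::text)::jsonb -> 'skills' @> '[{"name":"python"}]'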

select * from (select 1 as id, 'fname' as firstname, 'sname' as surname, jsonb_array_elements('{
"skills": [{"name": "php"},{"name": "Python"}]}'::jsonb->'skills') skill) p
where p.skill->>'name' ilike 'python';
To suit the table in the question, it would be something like:
select * from (select *, jsonb_array_elements(person->'skills') skill from userinfo) u
where u.skill->>'name' ilike 'python';
Just a note: this returns multiple rows for the same userinfo row as soon as you search for more than one skill. In that case, group by the fields you want returned, or select distinct id, username, etc.
For example (assuming there is an id column in the userinfo table):
select distinct id from (select *, jsonb_array_elements(person->'skills') skill from userinfo) u
where u.skill->>'name' ilike 'python' or u.skill->>'name' ilike 'php';
It all depends on what you want to do.

Related

Search for string in jsonb values - PostgreSQL

For simplicity, a row of table looks like this:
key: "z06khw1bwi886r18k1m7d66bi67yqlns",
reference_keys: {
"KEY": "1x6t4y",
"CODE": "IT137-521e9204-ABC-TESTE",
"NAME": "A"
},
I have a jsonb object like this one {"KEY": "1x6t4y", "CODE": "IT137-521e9204-ABC-TESTE", "NAME": "A"} and I want to search for a query string in the values of any key. If my query is something like '521e9204', I want it to return every row whose reference_keys has '521e9204' in any value. Basically, the keys don't matter for this scenario.
Note: The reference_keys column, and so the jsonb object, is always one-dimensional (a flat object with no nesting).
I have tried a query like this:
SELECT * FROM table
LEFT JOIN jsonb_each_text(table.reference_keys) AS j(k, value) ON true
WHERE j.value LIKE '%521e9204%'
The problem is that it duplicates rows, one for every key in the JSON, and that messes up the returned items.
I have also thought of doing something like this:
SELECT DISTINCT jsonb_object_keys(reference_keys) from table;
and then use a query like:
SELECT * FROM table
WHERE reference_keys->>'CODE' like '%521e9204%'
It seems like this would work but I really don't want to rely on this solution.
You can rewrite your JOIN to an EXISTS condition to avoid the duplicates:
SELECT t.*
FROM the_table t
WHERE EXISTS (select *
from jsonb_each_text(t.reference_keys) AS j(k, value)
WHERE j.value LIKE '%521e9204%');
If you are using Postgres 12 or later, you can also use a JSON path query:
where jsonb_path_exists(reference_keys, 'strict $.** ? (@ like_regex "521e9204")')
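A small self-contained sketch of the EXISTS variant, to show that a row is returned once no matter how many of its values match. The inline rows and the id column are hypothetical; row r1 deliberately has two values containing the search string.

```sql
-- Hypothetical sample rows standing in for the real table.
with the_table(id, reference_keys) as (
  values
    ('r1', '{"KEY": "1x6t4y", "CODE": "IT137-521e9204-ABC-TESTE", "NAME": "521e9204-A"}'::jsonb),
    ('r2', '{"KEY": "9z9z9z", "CODE": "IT000-00000000-XYZ", "NAME": "B"}'::jsonb)
)
select t.id
from the_table t
where exists (
  -- one row per key/value pair; EXISTS stops at the first matching value
  select 1
  from jsonb_each_text(t.reference_keys) as j(k, v)
  where j.v like '%521e9204%'
);
```

Only r1 comes back, and only once, whereas the LEFT JOIN form from the question would return it twice.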

PostgreSQL: Find and delete duplicated jsonb data, excluding a key/value pair when comparing

I have been searching all over to find a way to do this.
I am trying to clean up a table with a lot of duplicated jsonb fields.
There are some examples out there, but as a little twist, I need to exclude one key/value pair in the jsonb field, to get the result I need.
Example jsonb:
{
"main": {
"orders": {
"order_id": "1",
"customer_id": "1",
"updated_at": "11/23/2017 17:47:13"
}
}
}
Compared to:
{
"main": {
"orders": {
"order_id": "1",
"customer_id": "1",
"updated_at": "11/23/2017 17:49:53"
}
}
}
If I can exclude the "updated_at" key when comparing, the query should treat the two as duplicates. This entry, and possibly other duplicated entries, should then be deleted, keeping only the first, "original" one.
I have found this query to find the duplicates, but it doesn't take my situation into account. Maybe someone can help restructure it to meet the requirements.
SELECT t1.jsonb_field
FROM customers t1
INNER JOIN (SELECT jsonb_field, COUNT(*) AS CountOf
FROM customers
GROUP BY jsonb_field
HAVING COUNT(*)>1
) t2 ON t1.jsonb_field=t2.jsonb_field
WHERE
t1.customer_id = 1
Thanks in advance :-)
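If the updated_at key is always at the same path, you can remove it with the #- operator before comparing. Note that the subquery has to group by the stripped value as well, otherwise it only counts exact duplicates:
SELECT t1.jsonb_field
FROM customers t1
INNER JOIN (SELECT jsonb_field #- '{main,orders,updated_at}' AS jsonb_field, COUNT(*) AS CountOf
FROM customers
GROUP BY jsonb_field #- '{main,orders,updated_at}'
HAVING COUNT(*)>1
) t2 ON
t1.jsonb_field #- '{main,orders,updated_at}'
=
t2.jsonb_field #- '{main,orders,updated_at}'
WHERE
t1.customer_id = 1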
See the additional jsonb operators at https://www.postgresql.org/docs/9.5/static/functions-json.html
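The question also asks to delete the duplicates while keeping the first one. A hedged sketch of one way to do that with row_number(), assuming PostgreSQL 9.5 or later (for #-) and assuming the table has an id column that reflects insertion order. The customers_demo temp table and its id column are hypothetical stand-ins, not the asker's schema.

```sql
-- Hypothetical demo table; "id" is assumed to reflect insertion order.
create temp table customers_demo (id serial primary key, jsonb_field jsonb);

insert into customers_demo (jsonb_field) values
  ('{"main": {"orders": {"order_id": "1", "customer_id": "1", "updated_at": "11/23/2017 17:47:13"}}}'),
  ('{"main": {"orders": {"order_id": "1", "customer_id": "1", "updated_at": "11/23/2017 17:49:53"}}}'),
  ('{"main": {"orders": {"order_id": "2", "customer_id": "1", "updated_at": "11/23/2017 17:50:00"}}}');

-- Number the rows inside each group of "equal except updated_at" documents,
-- then delete everything but the first row of each group.
delete from customers_demo c
using (
  select id,
         row_number() over (
           partition by jsonb_field #- '{main,orders,updated_at}'
           order by id
         ) as rn
  from customers_demo
) ranked
where c.id = ranked.id
  and ranked.rn > 1;

select id from customers_demo order by id;
```

After the delete, the rows with id 1 and 3 remain. Running the inner select on its own first is a cheap way to check which rows would be removed.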
EDIT
If you don't have #-, you can cast to text and do a regex replace:
regexp_replace(t1.jsonb_field::text, '"updated_at": "[^"]*?"','')::jsonb
=
regexp_replace(t2.jsonb_field::text, '"updated_at": "[^"]*?"','')::jsonb
I don't think you even need to cast it back to jsonb, but it is safer to do so.
Mind that the regex matches ANY "updated_at" field (by key) in the JSON. It should not match data, because it would neither match an escaped closing quote \" nor find the colon after it.
Note that the regex should actually be '"updated_at": "[^"]*?",?'
But on SQL Fiddle that fails (this may depend on the Postgres build; check with your version, because as far as regexes go, this is correct).
If the comma is not removed, the cast to jsonb fails.
You can try '"updated_at": "[^"]*?",'
without the ?. That removes the comma, but fails if updated_at is the last key in the object.
Worst case, nest the two:
regexp_replace(
regexp_replace(t1.jsonb_field::text, '"updated_at": "[^"]*?",',''),
'"updated_at": "[^"]*?"','')::jsonb
For PostgreSQL 9.4, which does not have the #- operator, you can rebuild the object in a function instead.
SQL Fiddle only has 9.3 and 9.6, though.
9.3 is missing json_object_agg, but the Postgres documentation says it is available in 9.4, so this should work.
It only works if all records have objects under the important keys:
main->orders
If main->orders is a JSON array or a scalar, this may raise an error.
The same goes for {"main": [1,2]}, which raises an error.
Each json_each call returns a table with one row for each key in the JSON object.
json_object_agg aggregates the rows back into a JSON object.
The case expression picks out the one key on each level that needs to be handled.
At the deepest nesting level, the updated_at row is filtered out.
On SQL Fiddle, set the query separator to '//'.
If you use the psql client, replace the // with ;
create or replace function foo(x json)
returns jsonb
language sql
as $$
select json_object_agg(key,
case key when 'main' then
(select json_object_agg(t2.key,
case t2.key when 'orders' then
(select json_object_agg(t3.key, t3.value)
from json_each(t2.value) as t3
WHERE t3.key <> 'updated_at'
)
else t2.value
end)
from json_each(t1.value) as t2
)
else t1.value
end)::jsonb
from json_each(x) as t1
$$ //
select foo(x)
from
(select '{ "main":{"orders":{"order_id": "1", "customer_id": "1", "updated_at": "11/23/2017 17:49:53" }}}'::json as x) as t1
The argument x may need to be declared as jsonb, if that is your data type.

Using functions with orientdb select query with edge filter

Schema
Customer -> (Edge)Ownes -> Vehicle {vehicle_number}
I tried to query the Customer record that "Ownes" a vehicle by its number, as below, and it worked (both 'in' and 'contains' worked fine):
select from Customer where "KL-01-B-8898" in out("Ownes").vehicle_number
I want to do the same query as a case-insensitive search, as below, but it returned 0 records:
select from Customer where "kl-01-b-8898" in out("Ownes").vehicle_number.toLowerCase()
I changed the query as below and it returned the rows. Is it possible to use functions like toLowerCase() in queries like the one above, without a sub-select?
select from Customer where @rid in (select in("Ownes").@rid from Vehicle where vehicle_number.toLowerCase() ="kl-01-b-8898")
You can use this:
select from Customer
let $a= ( select number.toUpperCase() from (select out("Ownes").vehicle_number as number from $parent.$current unwind number))
where "KL-01-B-8898" in first($a).number
This doesn't work:
select from Customer where "kl-01-b-8898" in out("Ownes").vehicle_number.toLowerCase()
because
out("Ownes").vehicle_number
returns a list of strings rather than a single string.
This works:
select from Customer where @rid in (select in("Ownes").@rid from Vehicle where vehicle_number.toLowerCase() ="kl-01-b-8898")
because vehicle_number is a single string.
See the documentation: http://orientdb.com/docs/last/SQL-Methods.html#bundled-methods

How to use postgresql any with jsonb data

Question
I have a PostgreSQL table that has a column of type jsonb. The JSON data looks like this:
{
"personal":{
"gender":"male",
"contact":{
"home":{
"email":"ceo@home.me",
"phone_number":"5551234"
},
"work":{
"email":"ceo@work.id",
"phone_number":"5551111"
}
},
..
"nationality":"Martian",
..
},
"employment":{
"title":"Chief Executive Officer",
"benefits":[
"Insurance A",
"Company Car"
],
..
}
}
This query works perfectly well
select employees->'personal'->'contact'->'work'->>'email'
from employees
where employees->'personal'->>'nationality' in ('Martian','Terran')
I would like to fetch all employees who have benefits of type Insurance A OR Insurance B. This ugly query works:
select employees->'personal'->'contact'->'work'->>'email'
from employees
where employees->'employment'->'benefits' ? 'Insurance A'
OR employees->'employment'->'benefits' ? 'Insurance B';
I would like to use any instead, like so:
select * from employees
where employees->'employment'->>'benefits' =
any('{Insurance A, Insurance B}'::text[]);
but this returns 0 results. Any ideas?
What I've also tried
I tried the following syntaxes (all failed):
.. = any({'Insurance A','Insurance B'}::text[]);
.. = any('Insurance A'::text,'Insurance B'::text}::array);
.. = any({'Insurance A'::text,'Insurance B'::text}::array);
.. = any(['Insurance A'::text,'Insurance B'::text]::array);
employees->'employment'->'benefits' is a JSON array, so you should unnest it to use its elements in an any comparison.
Use the function jsonb_array_elements_text() in a lateral join:
select *
from
employees,
jsonb_array_elements_text(employees->'employment'->'benefits') benefits(benefit)
where
benefit = any('{Insurance A, Insurance B}'::text[]);
The syntax
from
employees,
jsonb_array_elements_text(employees->'employment'->'benefits')
is equivalent to
from
employees,
lateral jsonb_array_elements_text(employees->'employment'->'benefits')
The word lateral may be omitted. From the documentation:
LATERAL can also precede a function-call FROM item, but in this case
it is a noise word, because the function expression can refer to
earlier FROM items in any case.
See also: What is the difference between LATERAL and a subquery in PostgreSQL?
The syntax
from jsonb_array_elements_text(employees->'employment'->'benefits') benefits(benefit)
is a form of aliasing. Per the documentation:
Another form of table aliasing gives temporary names to the columns of
the table, as well as the table itself:
FROM table_reference [AS] alias ( column1 [, column2 [, ...]] )
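One thing to keep in mind with the lateral join: it produces one output row per matching array element, so an employee who has both benefits shows up twice. A sketch of an EXISTS variant that returns each employee at most once. The inline rows, the id column and the column name data are assumptions for illustration (the question's jsonb column is named employees).

```sql
-- Hypothetical sample rows; "data" stands in for the question's jsonb column.
with employees(id, data) as (
  values
    (1, '{"employment": {"benefits": ["Insurance A", "Insurance B"]}}'::jsonb),
    (2, '{"employment": {"benefits": ["Company Car"]}}'::jsonb)
)
select e.id
from employees e
where exists (
  -- unnest the benefits array to text and compare each element with any()
  select 1
  from jsonb_array_elements_text(e.data -> 'employment' -> 'benefits') as b(benefit)
  where b.benefit = any('{Insurance A, Insurance B}'::text[])
);
```

With this data the query returns id 1 exactly once, even though both of that employee's benefits match.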
You can use the ?| operator to check whether the array contains any of the values you want.
select * from employees
where employees->'employment'->'benefits' ?| array['Insurance A', 'Insurance B']
If you have a case where you want all of the values to be in the array, there is the ?& operator to check for that.
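A quick way to see the difference between the two operators, using a literal array instead of the table (the sample values are taken from the question's benefits list):

```sql
select
  -- true: at least one of the listed strings is an element of the array
  '["Insurance A", "Company Car"]'::jsonb ?| array['Insurance A', 'Insurance B'] as has_any,
  -- false: "Insurance B" is missing, so not all listed strings are present
  '["Insurance A", "Company Car"]'::jsonb ?& array['Insurance A', 'Insurance B'] as has_all;
```

Both operators only look at top-level array elements (or top-level keys when the left side is an object), and the comparison is case sensitive.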

PostgreSQL - jsonb_each

I have just started to play around with jsonb on Postgres and am finding examples hard to find online, as it is a relatively new concept. I am trying to use jsonb_each_text to print out a table of keys and values, but I get comma-separated values in a single column.
I have the JSON below saved as jsonb and am using it to test my queries.
{
"lookup_id": "730fca0c-2984-4d5c-8fab-2a9aa2144534",
"service_type": "XXX",
"metadata": "sampledata2",
"matrix": [
{
"payment_selection": "type",
"offer_currencies": [
{
"currency_code": "EUR",
"value": 1220.42
}
]
}
]
}
I can gain access to offer_currencies array with
SELECT element -> 'offer_currencies' -> 0
FROM test t, jsonb_array_elements(t.json -> 'matrix') AS element
WHERE element ->> 'payment_selection' = 'type'
which gives a result of {"value": 1220.42, "currency_code": "EUR"}. If I run the query below (I have to change " to '), I get:
select * from jsonb_each_text('{"value": 1220.42, "currency_code": "EUR"}')
Key | Value
---------------|----------
"value" | "1220.42"
"currency_code"| "EUR"
So using the above theory I created this query
SELECT jsonb_each_text(data)
FROM (SELECT element -> 'offer_currencies' -> 0 AS data
FROM test t, jsonb_array_elements(t.json -> 'matrix') AS element
WHERE element ->> 'payment_selection' = 'type') AS dummy;
But this prints comma-separated values in one column:
record
---------------------
"(value,1220.42)"
"(currency_code,EUR)"
The primary problem here is that you select the whole row as a single column (PostgreSQL allows that). You can fix that with SELECT (jsonb_each_text(data)).* ....
But don't put set-returning functions in the SELECT list; that can often lead to errors or unexpected results. Instead, use, for example, LATERAL joins or sub-queries:
select first_currency.*
from test t
, jsonb_array_elements(t.json -> 'matrix') element
, jsonb_each_text(element -> 'offer_currencies' -> 0) first_currency
where element ->> 'payment_selection' = 'type'
Note: function calls in the FROM clause are implicit LATERAL joins (here: CROSS JOINs).
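The same query as a self-contained sketch with the implicit joins spelled out as CROSS JOIN LATERAL. The inline row is the sample document from the question, trimmed down; the column name doc and the alias elem are hypothetical stand-ins for t.json and element.

```sql
-- Hypothetical inline row standing in for the test table.
with test(doc) as (
  values ('{"matrix": [{"payment_selection": "type",
                        "offer_currencies": [{"currency_code": "EUR", "value": 1220.42}]}]}'::jsonb)
)
select c.key, c.value
from test t
-- one row per element of the matrix array
cross join lateral jsonb_array_elements(t.doc -> 'matrix') as elem
-- one row per key/value pair of the first offer_currencies entry
cross join lateral jsonb_each_text(elem -> 'offer_currencies' -> 0) as c
where elem ->> 'payment_selection' = 'type';
```

This yields two rows with separate key and value columns, one for currency_code and one for value, instead of a single record column.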
WITH testa AS(
select jsonb_array_elements
(t.json -> 'matrix') -> 'offer_currencies' -> 0 as jsonbcolumn from test t)
SELECT d.key, d.value FROM testa
join jsonb_each_text(testa.jsonbcolumn) d ON true
ORDER BY 1, 2;
The testa CTE holds the intermediate jsonb data. A lateral join with jsonb_each_text then transforms the jsonb data into table format.