Querying jsonb values in PostgreSQL

I have a table with ProductID (int) and ProductGroups (jsonb).
ProductGroups holds a plain JSON array of values rather than key/value pairs with tag names. I want to query the following data to get the ProductID of every row where ProductGroups contains 69.
ProductID ProductGroups
125481 [134, 83]
128166 [134, 83]
128175 [134, 83]
128172 [134, 83]
131492 [69, 134]
131489 [69, 134]
131860 [128, 131, 133, 100, 71]
128142 [134, 83]
I have queried what I think of as normal jsonb, with tag names, in other tables using the query below, where I call out the name and value:
SELECT *
FROM trans."TxnHeader" mpt, jsonb_array_elements(mpt."ExtensionProperty") as ext
where 1=1
and jsonb_typeof(mpt."ExtensionProperty") = 'array'
and ext->>'Name' = 'posTranId' and ext->>'Value' = '8539'

You can do it using the containment operator @> as follows:
select *
from products
where ProductGroups @> '69'
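To make the semantics of that filter concrete, here is a minimal Python sketch of what the jsonb containment test `@> '69'` checks for a top-level array: whether the scalar appears among the array's elements. The rows mirror the sample data from the question, and `contains_scalar` is a hypothetical helper written for this illustration, not a PostgreSQL or driver API.

```python
import json

# Sample rows copied from the question: (ProductID, ProductGroups as JSON text).
rows = [
    (125481, "[134, 83]"),
    (128166, "[134, 83]"),
    (128175, "[134, 83]"),
    (128172, "[134, 83]"),
    (131492, "[69, 134]"),
    (131489, "[69, 134]"),
    (131860, "[128, 131, 133, 100, 71]"),
    (128142, "[134, 83]"),
]


def contains_scalar(json_text, value):
    """Hypothetical helper: True if a top-level JSON array holds `value`."""
    parsed = json.loads(json_text)
    return isinstance(parsed, list) and value in parsed


# Keep only the product ids whose group array contains 69.
matching_ids = [pid for pid, groups in rows if contains_scalar(groups, 69)]
print(matching_ids)
```

In the database itself the operator does this work, so a sketch like this is only useful for reasoning about which rows should come back.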

Related

Writing a query in SQLAlchemy to count occurrences and store IDs

I'm working with a postgres db using SQLAlchemy.
I have a table like this
class Author(Base):
    __tablename__ = "Author"
    id = Column(BIGINT, primary_key=True)
    name = Column(Unicode)
and I want to identify all homonymous authors and save their ids in a list.
For example, if the database contains 2 authors named "John" and 3 named "Jack", with IDs 11, 22, 33, 44 and 55 respectively, I want my query to return
[("John", [11,22]), ("Jack", [33,44,55])]
For now I've been able to write
[x for x in db_session.query(
func.count(Author.name),
Author.name
).group_by(Author.name) if x[0]>1]
but this just gives me back occurrences
[(2,"John"),(3,"Jack")]
Thank you very much for the help!
The way to do this in SQL would be to use PostgreSQL's array_agg function to group the ids into an array:
SELECT
name,
array_agg(id) AS ids
FROM
my_table
GROUP BY
name
HAVING
count(name) > 1;
The array_agg function collects the ids for each name, and the HAVING clause excludes those with only a single row. The output of the query would look like this:
name │ ids
═══════╪════════════════════
Alice │ {2,4,9,10,16}
Bob │ {1,6,11,12,13}
Carol │ {3,5,7,8,14,15,17}
Translated into SQLAlchemy, the query would look like this:
import sqlalchemy as sa
...
q = (
db_session.query(Author.name, sa.func.array_agg(Author.id).label('ids'))
.group_by(Author.name)
.having(sa.func.count(Author.name) > 1)
)
Calling q.all() will return a list of (name, [ids]) tuples like this:
[
('Alice', [2, 4, 9, 10, 16]),
('Bob', [1, 6, 11, 12, 13]),
('Carol', [3, 5, 7, 8, 14, 15, 17]),
]
In SQLAlchemy 1.4/2.0-style syntax, the equivalent would be:
with Session() as s:
    q = (
        sa.select(Author.name, sa.func.array_agg(Author.id).label('ids'))
        .group_by(Author.name)
        .having(sa.func.count(Author.name) > 1)
    )
    res = s.execute(q)
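For readers who want to check the expected result without a database, the following is a plain-Python sketch of what the GROUP BY / array_agg / HAVING combination computes. The `authors` list is made-up sample data based on the names and ids in the question, with one extra unique name ("Ann", id 66) added as an assumption to show the filtering.

```python
from collections import defaultdict

# Hypothetical (id, name) rows based on the example in the question,
# plus one unique name that the HAVING-style filter should drop.
authors = [
    (11, "John"),
    (22, "John"),
    (33, "Jack"),
    (44, "Jack"),
    (55, "Jack"),
    (66, "Ann"),
]

# Collect ids per name; this plays the role of GROUP BY name + array_agg(id).
ids_by_name = defaultdict(list)
for author_id, name in authors:
    ids_by_name[name].append(author_id)

# Keep only names that occur more than once, like HAVING count(name) > 1.
homonyms = [(name, ids) for name, ids in ids_by_name.items() if len(ids) > 1]
print(homonyms)
```

Doing the grouping in the database is usually preferable, since only the duplicated names and their ids are transferred to the client.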

How to use wildcard in the path to search jsonb values for postgres?

Using postgres version 10.13
This is my datatable jsongraphs
id jsongraph
1 { "data": {"scopes_by_id": { "121": { "id": 121, "pk": 121, "name": "Prework" } }, "commonsites_by_id": {"123": {"id": 123, "pk": 123, "name": "Somewhere over the rainbow"}}}}
2 { "data": {"scopes_by_id": { "156": { "id": 156, "pk": 156, "name": "ABC" } }, "commonsites_by_id": {"123": {"id": 123, "pk": 123, "name": "Somewhere over the rainbow"}}}}
I want the distinct values of scope id and site id which should be (121, 123), (156,123)
So I tried
SELECT DISTINCT
jsongraph->'data'->'scopes_by_id'->>'pk' ,
jsongraph->'data'->'commonsites_by_id'->>'pk' from jsongraphs;
This won't work because the path should be like data->scopes_by_id->121->>pk, but I cannot know the value 121 beforehand.
Is there a way to get the values I need by putting some kind of wildcard in the path?
E.g. data->scopes_by_id->{*}->>pk, something like that?
And because this is legacy data, it's also hard to change the data itself.
As the nesting level seems to be fixed, you could do something like this:
select j.id, scopes.*, commonsites.*
from jsongraphs j
cross join lateral (
select jsonb_agg(j.jsongraph #> array['data','scopes_by_id', t1.scope_id, 'pk']) as scope_ids
from jsonb_each_text(j.jsongraph #> '{data,scopes_by_id}') as t1(scope_id)
) scopes
cross join lateral (
select jsonb_agg(j.jsongraph #> array['data','commonsites_by_id', t2.site_id, 'pk']) as common_ids
from jsonb_each_text(j.jsongraph #> '{data,commonsites_by_id}') as t2(site_id)
) commonsites
order by id;
The sub-queries extract all keys below the respective part (e.g. scopes_by_id) and then use the #> operator to access the path for each id inside the original JSON value. Finally, all PK values are aggregated back into a single array.
This returns the PK values from each part separately as an array, in order to handle the situation where you have a different number of "scope ids" and "commonsite ids".
If you just want "the first" id from each section, you can remove the aggregation and use a LIMIT clause:
select j.id, scopes.*, commonsites.*
from jsongraphs j
cross join lateral (
select j.jsongraph #> array['data','scopes_by_id', t1.scope_id, 'pk'] as scope_id
from jsonb_each_text(j.jsongraph #> '{data,scopes_by_id}') as t1(scope_id)
limit 1
) scopes
cross join lateral (
select j.jsongraph #> array['data','commonsites_by_id', t2.site_id, 'pk'] as common_id
from jsonb_each_text(j.jsongraph #> '{data,commonsites_by_id}') as t2(site_id)
limit 1
) commonsites
order by id;
Not sure on which level you want to apply the "distinct" part for this.
In Postgres 12 or later, you could achieve the same with:
select j.id,
jsonb_path_query_array(j.jsongraph, 'strict $.data.scopes_by_id.**.pk') as scopes,
jsonb_path_query_array(j.jsongraph, 'strict $.data.commonsites_by_id.**.pk') as common
from jsongraphs j
order by j.id;
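As a cross-check on what result the asker expects, here is a small Python sketch that applies the same "wildcard" idea to the two sample documents: iterate over whatever keys sit under scopes_by_id and commonsites_by_id and read pk from each entry. The helper name `pks_under` is hypothetical, and pairing every scope pk with every site pk is an assumption about what "distinct values of scope id and site id" means.

```python
import json

# The two sample documents from the question, keyed by row id.
rows = [
    (1, '{"data": {"scopes_by_id": {"121": {"id": 121, "pk": 121, "name": "Prework"}}, '
        '"commonsites_by_id": {"123": {"id": 123, "pk": 123, "name": "Somewhere over the rainbow"}}}}'),
    (2, '{"data": {"scopes_by_id": {"156": {"id": 156, "pk": 156, "name": "ABC"}}, '
        '"commonsites_by_id": {"123": {"id": 123, "pk": 123, "name": "Somewhere over the rainbow"}}}}'),
]


def pks_under(document, section):
    """Hypothetical helper: pk of every entry under data -> section, whatever its key."""
    entries = document.get("data", {}).get(section, {})
    return [entry["pk"] for entry in entries.values() if "pk" in entry]


# Pair each scope pk with each site pk of the same row, then de-duplicate.
pairs = set()
for _row_id, text in rows:
    doc = json.loads(text)
    for scope_pk in pks_under(doc, "scopes_by_id"):
        for site_pk in pks_under(doc, "commonsites_by_id"):
            pairs.add((scope_pk, site_pk))

print(sorted(pairs))
```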

Different path formats for PostgreSQL JSONB functions

I'm confused by how path uses different formats depending on the function in the PostgreSQL JSONB documentation.
If I had a PostgreSQL table foo that looks like
pk json_obj
0 {"values": [{"id": "a_b", "value": 5}, {"id": "c_d", "value": 6}]}
1 {"values": [{"id": "c_d", "value": 7}, {"id": "e_f", "value": 8}]}
Why does this query give me these results?
SELECT json_obj, -- {"values": [{"id": "a_b", "value": 5}, {"id": "c_d", "value": 6}]}
json_obj @? '$.values[*].id', -- true
json_obj #> '$.values[*].id', -- ERROR: malformed array literal
json_obj #> '{values, 0, id}', -- "a_b"
JSONB_SET(json_obj, '$.annotations[*].id', '"hi"') -- ERROR: malformed array literal
FROM foo;
Specifically, why does @? support $.values[*].id (described on that page in another section) but JSONB_SET uses some other path format {bar,3,baz}?
Ultimately, what I would like to do and don't know how, is to remove non-alphanumeric characters (e.g. underscores in this example) in all id values represented by the path $.values[*].id.
The reason is that the operators have different data types on the right hand side.
SELECT oprname, oprright::regtype
FROM pg_operator
WHERE oprleft = 'jsonb'::regtype
AND oprname IN ('@?', '#>');
oprname | oprright
---------+----------
#> | text[]
@? | jsonpath
(2 rows)
(2 rows)
Similarly, the second argument of jsonb_set is a text[].
Now '$.values[*].id' is a valid jsonpath, but not a valid text[] literal.
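To illustrate why a text[] path cannot express a wildcard, here is a rough Python sketch of how a path made of plain strings is followed: each element is taken literally as either an object key or an array index. The function `extract_path` is a hypothetical stand-in written for this example and only approximates the behaviour of #> (PostgreSQL itself rejects the jsonpath string earlier, when it fails to parse as an array literal).

```python
import json


def extract_path(document, path):
    """Hypothetical sketch: follow a list of literal keys/indexes, else return None."""
    current = document
    for step in path:
        if isinstance(current, dict):
            if step not in current:
                return None
            current = current[step]
        elif isinstance(current, list):
            try:
                index = int(step)  # array steps must be integers
            except ValueError:
                return None
            if not -len(current) <= index < len(current):
                return None
            current = current[index]
        else:
            return None  # cannot descend into a scalar
    return current


doc = json.loads('{"values": [{"id": "a_b", "value": 5}, {"id": "c_d", "value": 6}]}')

first_id = extract_path(doc, ["values", "0", "id"])   # like '{values, 0, id}'
wildcard = extract_path(doc, ["values", "*", "id"])   # "*" is just a literal string here

print(first_id, wildcard)
```

A jsonpath value, by contrast, is a small query language in its own right, which is why [*] has a meaning there.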
Thanks for the answers and comments about why the data types were different.
I wanted to post how I solved my problem:
Ultimately, what I would like to do and don't know how, is to remove
non-alphanumeric characters (e.g. underscores in this example) in all
id values represented by the path $.values[*].id.
WITH unnested AS (
SELECT f.pk, JSONB_ARRAY_ELEMENTS(f.json_obj -> 'values') AS value
FROM foo f
),
updated_values AS (
SELECT un.pk, JSONB_SET(un.value, '{id}', TO_JSONB(LOWER(REGEXP_REPLACE(un.value ->> 'id', '[^a-zA-Z0-9]', '', 'g'))), FALSE) AS new_value
FROM unnested un
WHERE value -> 'id' IS NOT NULL -- Had some values that didn't have 'id' keys
)
UPDATE foo f2
SET json_obj = JSONB_SET(f2.json_obj, '{values}', (SELECT JSONB_AGG(uv.new_value) FROM updated_values uv WHERE uv.pk = f2.pk), FALSE)
WHERE JSONB_PATH_EXISTS(f2.json_obj, '$.values[*].id') -- Had some values that didn't have 'id' keys
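The same clean-up can be sketched in Python to check the expected result on a single document: strip every non-alphanumeric character from each id and lower-case it, mirroring the REGEXP_REPLACE and LOWER calls. The function name `clean_ids` is hypothetical. One deliberate difference, stated as an assumption of this sketch: elements without an id key are passed through unchanged, whereas the updated_values CTE above filters them out before aggregating.

```python
import json
import re


def clean_ids(document):
    """Hypothetical helper: return a copy with each string id stripped to [a-z0-9]."""
    cleaned = []
    for element in document.get("values", []):
        new_element = dict(element)
        if isinstance(element.get("id"), str):
            # Same pattern as the SQL: drop anything that is not a letter or digit.
            new_element["id"] = re.sub(r"[^a-zA-Z0-9]", "", element["id"]).lower()
        cleaned.append(new_element)
    result = dict(document)
    result["values"] = cleaned
    return result


doc = json.loads('{"values": [{"id": "a_b", "value": 5}, {"id": "c_d", "value": 6}]}')
cleaned_doc = clean_ids(doc)
print(json.dumps(cleaned_doc))
```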

Postgresql update column with integer values

I have a column of type jsonb that contains a list of elements in either string or integer format.
What I want now is to make them all the same type, e.g. either all int or all string.
Tried: this way I get a single element, but I need to update all of the elements inside the list.
SELECT parent_path -> 1 AS path
FROM abc
LIMIT 10
OR
Update abc SET parent_path = ARRAY[parent_path]::TEXT[] AS parent_path
FROM abc
OR
UPDATE abc SET parent_path = replace(parent_path::text, '"', '') where id=123
Current Output
path
[6123697, 178, 6023099]
[625953521394212864, 117, 6023181]
["153", "6288361", "553248635949090971"]
[553248635358954983, 178320, 174, 6022967]
[6050684, 6050648, 120, 6022967]
[653, 178238, 6239135, 38, 6023117]
["153", "6288496", "553248635977039112"]
[553248635998143523, 6023185]
[553248635976194501, 6022967]
[553248635976195634, 6022967]
Expected Output
path
[6123697, 178, 6023099]
[625953521394212864, 117, 6023181]
[153, 6288361, 553248635949090971] <----
[553248635358954983, 178320, 174, 6022967]
[6050684, 6050648, 120, 6022967]
[653, 178238, 6239135, 38, 6023117]
[153, 6288496, 553248635977039112] <----
[553248635998143523, 6023185]
[553248635976194501, 6022967]
[553248635976195634, 6022967]
Note: Missing double quotes on the list. I've tried several methods from here but no luck
You will have to unnest them, clean up each element, then aggregate them back into an array.
The following converts all elements to integers:
select (select jsonb_agg(x.i::bigint order by idx)
from jsonb_array_elements_text(a.path) with ordinality as x(i, idx)
) as clean_path
from abc a;
You can use a scalar subquery to select, unnest, and aggregate the elements:
WITH mytable AS (
SELECT row_number() over () as id, col::JSONB
FROM (VALUES ('[6123697, 178, 6023099]'),
('["6123697", "178", "6023099"]')) as bla(col)
)
SELECT id, (SELECT JSONB_AGG(el::int) FROM jsonb_array_elements_text(col) as el)
FROM mytable
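The unnest, convert, and re-aggregate steps can be sketched in Python to verify the expected output for one of the sample rows. `normalize_path` is a hypothetical helper name. Python integers have arbitrary precision, so the large values from the question are not truncated here; in SQL, a 64-bit type such as bigint is needed for them.

```python
import json


def normalize_path(json_text):
    """Hypothetical helper: parse a JSON array and return it with every element as an int."""
    elements = json.loads(json_text)          # "unnest"
    as_ints = [int(item) for item in elements]  # convert each element, order preserved
    return json.dumps(as_ints)                # "aggregate" back into a JSON array


mixed = '["153", "6288361", "553248635949090971"]'
already_ints = "[6123697, 178, 6023099]"

print(normalize_path(mixed))
print(normalize_path(already_ints))
```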

One2many field issue Odoo 10.0

I have a very weird issue with a One2many field.
First, let me explain the scenario.
I have a One2many field in sale.order.line; the code below explains the structure better:
class testModule(models.Model):
    _name = 'test.module'
    name = fields.Char()

class testModule2(models.Model):
    _name = 'test.module2'
    location_id = fields.Many2one('test.module')
    field1 = fields.Char()
    field2 = fields.Many2one('sale.order.line')

class testModule3(models.Model):
    _inherit = 'sale.order.line'
    test_location = fields.One2many('test.module2', 'field2')
CASE 1:
I create a new sales order, select the partner_id, then add a sale.order.line, and inside this line I add a test_location line (the One2many field), and then I save.
CASE 2:
Create a new sales order, select the partner_id, then add a sale.order.line and inside the sale.order.line add a test_location line [close the sales order line window]. Now, after the entry and before hitting save, I change a field, say partner_id, and then click save.
CASE 3:
This case is the same as case 2, except that I change the partner_id field again [2 changes in total: the first from case 2, and then this one], and then I click save.
RESULTS
CASE 1 works fine.
CASE 2 fails with the following error:
odoo.sql_db: bad query: INSERT INTO "test_module2" ("id", "field2", "field1", "location_id", "create_uid", "write_uid", "create_date", "write_date") VALUES(nextval('test_module2_id_seq'), 27, 'asd', ARRAY[1, '1'], 1, 1, (now() at time zone 'UTC'), (now() at time zone 'UTC')) RETURNING id
ProgrammingError: column "location_id" is of type integer but expression is of type integer[]
LINE 1: ...VALUES(nextval('test_module2_id_seq'), 27, 'asd', ARRAY[1, '...
For this case I put a debugger on the create/write method of sale.order.line to see what values are being passed:
values = {u'product_uom': 1, u'sequence': 0, u'price_unit': 885, u'product_uom_qty': 1, u'qty_invoiced': 0, u'procurement_ids': [[5]], u'qty_delivered': 0, u'qty_to_invoice': 0, u'qty_delivered_updateable': False, u'customer_lead': 0, u'analytic_tag_ids': [[5]], u'state': u'draft', u'tax_id': [[5]], u'test_location': [[5], [0, 0, {u'field1': u'asd', u'location_id': [1, u'1']}]], 'order_id': 20, u'price_subtotal': 885, u'discount': 0, u'layout_category_id': False, u'product_id': 29, u'price_total': 885, u'invoice_status': u'no', u'name': u'[CARD] Graphics Card', u'invoice_lines': [[5]]}
In the above values, location_id is passed as u'location_id': [1, u'1'], which is not correct. So for this case I correct the value in code, update the values, and pass them on.
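The question does not show how the value was corrected, so the following is only one possible sketch of such a workaround, written in plain Python without importing Odoo. It walks the One2many command list and reduces a Many2one value that arrives as an [id, name] pair to the bare id. The helper names `many2one_id` and `fix_location_ids` are hypothetical; the command codes 0 (create) and 1 (update) follow Odoo's usual x2many command convention.

```python
def many2one_id(value):
    """Hypothetical helper: reduce a Many2one value to an integer id or False."""
    if isinstance(value, (list, tuple)):
        return value[0] if value else False
    return value or False


def fix_location_ids(commands):
    """Hypothetical helper: normalize location_id inside create/update commands."""
    fixed = []
    for command in commands:
        is_create_or_update = (
            len(command) == 3
            and command[0] in (0, 1)
            and isinstance(command[2], dict)
        )
        if is_create_or_update:
            line_vals = dict(command[2])  # copy so the input is not mutated
            if "location_id" in line_vals:
                line_vals["location_id"] = many2one_id(line_vals["location_id"])
            command = [command[0], command[1], line_vals]
        fixed.append(command)
    return fixed


# The test_location value observed in CASE 2 of the question.
values = {"test_location": [[5], [0, 0, {"field1": "asd", "location_id": [1, "1"]}]]}
values["test_location"] = fix_location_ids(values["test_location"])
print(values)
```

A patch like this only hides the symptom; it does not explain why the client sends the [id, name] pair in the first place.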
CASE 3
If the user changes the field 2 or more times, then the values are
values = {u'invoice_lines': [[5]], u'procurement_ids': [[5]], u'tax_id': [[5]], u'test_location': [[5], [1, 7, {u'field1': u'asd', u'location_id': False}]], u'analytic_tag_ids': [[5]]}
here
u'location_id': False
MULTIPLE CASE
If the user does case 1 and then, on the same record, does case 2 or case 3, then sometimes the line is saved with field2 = Null or False in the database; other fields like location_id and field1 have data, but not field2.
NOTE: THIS HAPPENS WITH ANY FIELD NOT ONLY PARTNER_ID FIELD ON HEADER LEVEL OF SALE ORDER
I tried debugging this myself but couldn't find the reason why it is happening.