Postgresql json select from values in second layer of containment of arrays - postgresql

I have a jsonb column 'data' that contains tree-like JSON, for example:
{
"libraries":[
{
"books":[
{
"name":"mybook",
"type":"fiction"
},
{
"name":"yourbook",
"type":"comedy"
},
{
"name":"hisbook",
"type":"fiction"
}
]
}
]
}
I want an index-backed query that selects a value from the nested "books" objects according to their type, for example all book names whose type is fiction.
I was able to do this by joining against jsonb_array_elements, but as I understand it, that form cannot take advantage of a GIN index.
My query is:
select books->'name'
from data,
jsonb_array_elements(data->'libraries') libraries,
jsonb_array_elements(libraries->'books') books
where books->>'type'='fiction';
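To make the intent of the query concrete, here is a minimal Python sketch of the same two-level unnest followed by the type filter. It runs on the sample document from the question (with the comma between the second and third book restored); the function name book_names_by_type is hypothetical and only models the logic, it says nothing about how PostgreSQL executes or indexes the query.

```python
# Sample document from the question, as a parsed Python structure.
doc = {
    "libraries": [
        {
            "books": [
                {"name": "mybook", "type": "fiction"},
                {"name": "yourbook", "type": "comedy"},
                {"name": "hisbook", "type": "fiction"},
            ]
        }
    ]
}


def book_names_by_type(document, book_type):
    """Mirror the two jsonb_array_elements calls followed by the type filter."""
    return [
        book["name"]
        for library in document.get("libraries", [])  # first unnest
        for book in library.get("books", [])          # second unnest
        if book.get("type") == book_type              # filter on books->>'type'
    ]


print(book_names_by_type(doc, "fiction"))  # ['mybook', 'hisbook']
```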

If the example data you show is typical of what you store in the JSON, I would suggest that your setup may be wrong.
Why not create a library table and a book table and drop JSON altogether? JSON does not seem to be the right choice here.
CREATE TABLE library
(
id serial,
name text
);
CREATE TABLE book
(
isbn BIGINT,
name text,
book_type text
);
CREATE TABLE library_books
(
library_id integer,
isbn BIGINT
);
-- all books held by library 1
select book.*
from library_books
join book on book.isbn = library_books.isbn
where library_books.library_id = 1;
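As a rough illustration of the relational layout suggested above, the following sketch builds the three tables in an in-memory SQLite database through Python's sqlite3 module and answers the original question (fiction book names for one library) with a plain join. SQLite stands in for PostgreSQL only so the example is self-contained; the ISBN values, the library name and the helper book_names are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE library (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE book (isbn INTEGER PRIMARY KEY, name TEXT, book_type TEXT);
    CREATE TABLE library_books (
        library_id INTEGER REFERENCES library(id),
        isbn INTEGER REFERENCES book(isbn)
    );
    -- Hypothetical sample rows mirroring the books in the question.
    INSERT INTO library VALUES (1, 'main');
    INSERT INTO book VALUES (1001, 'mybook', 'fiction'),
                            (1002, 'yourbook', 'comedy'),
                            (1003, 'hisbook', 'fiction');
    INSERT INTO library_books VALUES (1, 1001), (1, 1002), (1, 1003);
""")


def book_names(connection, library_id, book_type):
    """Names of the books of one type held by one library, sorted by name."""
    rows = connection.execute(
        """
        SELECT book.name
        FROM library_books
        JOIN book ON book.isbn = library_books.isbn
        WHERE library_books.library_id = ? AND book.book_type = ?
        ORDER BY book.name
        """,
        (library_id, book_type),
    )
    return [name for (name,) in rows]


fiction = book_names(conn, 1, "fiction")
print(fiction)  # ['hisbook', 'mybook']
```

With this layout the filter on book_type is an ordinary column comparison, so a regular B-tree index on that column is enough.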

Related

Using index for jsonb sub keys postgresql

I have a table like:
create table items
(
id int constraint items_pk primary key,
acl jsonb
);
With items such as:
id,acl
1,{
"users": {
"2": { <-- The key "2" is the user_id
"role1": {...},
"role2": {...}
},
"3": {
"role1": {...}
}
},
"groups": {...}
}
...
I want to count the number of items where user "2" has the role "role2". This is what I do:
SELECT COUNT(*) FROM items WHERE ( acl->'users'->'2' ? 'role2')
The problem is that I want this query to use an index, but I can't get it to use one. Here are the indexes I set up:
CREATE INDEX _index1 ON items using gin (acl jsonb_ops);
CREATE INDEX _index2 ON items using gin ((acl->'users') jsonb_ops);
Then I tried the following query, which does use the index but is about 40x slower than the first one, so it is of no use. It also goes beyond what I need, since I only want to verify the presence of the "role2" key in acl->'users'->'2'.
SELECT COUNT(*) FROM items WHERE ( acl @> '{"users": {"2": {"role2": {...}}}}');
My question is: how can I make this query use an index while keeping my current JSON data structure?
I know I could use string arrays and many other approaches to make this use case work, but they all imply changing the data structure. That is not an option here, because this structure is already used at scale, and I want to know whether something is possible with it as is.
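For reference, jsonb containment (the @> operator, which a jsonb_ops GIN index supports) compares structure rather than exact values, so an empty object in the pattern matches any object on the other side. The Python sketch below models those rules in a simplified way: it ignores PostgreSQL's special case of an array containing a bare scalar and compares scalars with plain Python equality. The role values, elided as {...} in the question, are replaced by empty objects purely as placeholders, and has_role is a hypothetical helper. It shows that a containment pattern can express "user 2 has key role2" when role values are objects; it makes no claim about which plan PostgreSQL picks or how fast it runs.

```python
def jsonb_contains(left, right):
    """Approximate `left @> right` for parsed JSON values (simplified model)."""
    if isinstance(right, dict):
        # Every key of the pattern must exist on the left with a contained value.
        return isinstance(left, dict) and all(
            key in left and jsonb_contains(left[key], value)
            for key, value in right.items()
        )
    if isinstance(right, list):
        # Every pattern element must be contained in some left element.
        return isinstance(left, list) and all(
            any(jsonb_contains(item, wanted) for item in left)
            for wanted in right
        )
    # Scalars must be equal and of the same type.
    return type(left) is type(right) and left == right


# Placeholder document: the role payloads are stand-ins for the elided values.
acl = {
    "users": {
        "2": {"role1": {}, "role2": {}},
        "3": {"role1": {}},
    },
    "groups": {},
}


def has_role(acl_doc, user_id, role):
    """Containment pattern equivalent to {"users": {user_id: {role: {}}}}."""
    return jsonb_contains(acl_doc, {"users": {user_id: {role: {}}}})


print(has_role(acl, "2", "role2"))  # True
print(has_role(acl, "3", "role2"))  # False
```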

Using Postgres FK from jsonb with Hasura?

We have foreign keys within a json blob in postgres. We join with these like so:
SELECT f.id, b.id FROM foo AS f
LEFT JOIN bar AS b ON f.data -> 'baz' ->> 'barId' = text(b.id)
I'm now trying out Hasura to run some GraphQL queries, and I need these as object relationships. In the UI I can only add manual relationships on normal columns, not on nested JSON data.
Is it at all possible to get a graphql relationship this way?
I got the answer in the comments, thanks @iamnat. I'll elaborate here with my own example for clarity, since I still struggled a bit.
A very simple schema and data:
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TABLE foo
(
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
name text,
data jsonb NOT NULL
);
CREATE TABLE bar
(
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
name text
);
WITH bars AS (
INSERT INTO bar (name) VALUES ('bar') RETURNING id
)
INSERT INTO foo (name, data) VALUES ('foo', jsonb_build_object('barId', (SELECT id FROM bars)));
I then can create a function for the relationship:
CREATE FUNCTION foo_bar(foo_row foo)
RETURNS SETOF bar AS $$
SELECT *
FROM bar
WHERE text(id) = foo_row.data ->> 'barId'
$$ LANGUAGE sql STABLE;
I can then use this in Hasura as a computed field under "Data" -> foo -> Modify -> Computed fields -> "Add a new computed field". Just give it a name and select the function from the dropdown.
I can then query:
query MyQuery {
foo {
name
foo_bar {
name
}
}
}
with expected result:
{
"data": {
"foo": [
{
"name": "foo",
"foo_bar": [
{
"name": "bar"
}
]
}
]
}
}

Querying a many:many relationship on PK of the related table (ie. filtering by related table column)

I have a many:many relationship between 2 tables: note and tag, and want to be able to search all notes by their tagId. Because of the many:many I have a junction table note_tag.
My goal is to expose a computed field on my Postgraphile-generated Graphql schema that I can query against, along with the other properties of the note table.
I'm playing around with postgraphile-plugin-connection-filter. This plugin makes it possible to filter by things like authorId (which would be 1:many), but I'm unable to figure out how to filter by a many:many. I have a computed column on my note table called tags, which is JSON. Is there a way to "look into" this json and pick out where id = 1?
Here is my computed column tags:
create or replace function note_tags(note note, tagid text)
returns jsonb as $$
select
json_strip_nulls(
json_agg(
json_build_object(
'title', tag.title,
'id', tag.id
)
)
)::jsonb
from note
inner join note_tag on note_tag.tag_id = tagid and note_tag.note_id = note.id
left join note_tag nt on note.id = nt.note_id
left join tag on nt.tag_id = tag.id
where note.account_id = '1'
group by note.id, note.title;
$$ language sql stable;
As I understand the function above, it returns jsonb based on the tagid passed to the function (inner join note_tag on note_tag.tag_id = tagid). So why is the JSON not filtered by id when the column gets computed?
I am trying to make a query like this:
query notesByTagId {
notes {
edges {
node {
title
id
tags(tagid: "1")
}
}
}
}
Right now, when I execute this query, I get back stringified JSON in the tags field, and it includes all tags, whether or not the note actually belongs to them.
For instance, the note with id = 1 should only have the tags with id = 1 and id = 2, but it returns every tag in the database:
{
"data": {
"notes": {
"edges": [
{
"node": {
"id": "1",
"tags": "[{\"id\":\"1\",\"title\":\"Psychology\"},{\"id\":\"2\",\"title\":\"Logic\"},{\"id\":\"3\",\"title\":\"Charisma\"}]",
...
The key requirement for this computed column is that the JSON must include all tags the note belongs to, even though we are searching for notes by a single tagid.
Here are my simplified tables.
note:
create table note(
id text primary key,
title text
);
tag:
create table tag(
id text primary key,
title text
);
note_tag:
create table note_tag(
note_id text references note(id),
tag_id text references tag(id)
);
Update
I am changing up the approach a bit, and am toying with the following function:
create or replace function note_tags(n note)
returns setof tag as $$
select tag.*
from tag
inner join note_tag on (note_tag.tag_id = tag.id)
where note_tag.note_id = n.id;
$$ language sql stable;
I am able to retrieve all notes with the tags field populated, but now I need to be able to filter out the notes that don't belong to a particular tag, while still retaining all of the tags that belong to a given note.
So the question remains the same as above: how do we filter a table based on a related table's PK?
After a while of digging, I think I've come across a good approach. Based on this response, I have made a function that returns all notes by a given tagid.
Here it is:
create or replace function all_notes_with_tag_id(tagid text)
returns setof note as $$
select distinct note.*
from tag
inner join note_tag on (note_tag.tag_id = tag.id)
inner join note on (note_tag.note_id = note.id)
where tag.id = tagid;
$$ language sql stable;
The error in my approach was to expect the computed column to do all of the work, whereas its only job should be to fetch all of the data. The function all_notes_with_tag_id can now be called directly in GraphQL like so:
query MyQuery($tagid: String!) {
allNotesWithTagId(tagid: $tagid) {
edges {
node {
id
title
tags {
edges {
node {
id
title
}
}
}
}
}
}
}
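The split described above, one function that filters notes by tag and a separate one that returns every tag of a note, can be sketched outside of Postgraphile as follows. This is only a model of the two SQL functions using Python's sqlite3 module with an in-memory database. The tag titles come from the sample output in the question, while the note titles and the assignment of tag 3 to a second note are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE note (id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE tag (id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE note_tag (
        note_id TEXT REFERENCES note(id),
        tag_id TEXT REFERENCES tag(id)
    );
    -- Hypothetical rows: note 1 has tags 1 and 2, note 2 has tag 3.
    INSERT INTO note VALUES ('1', 'first note'), ('2', 'second note');
    INSERT INTO tag VALUES ('1', 'Psychology'), ('2', 'Logic'), ('3', 'Charisma');
    INSERT INTO note_tag VALUES ('1', '1'), ('1', '2'), ('2', '3');
""")


def all_notes_with_tag_id(connection, tag_id):
    """Filter step: which notes carry the given tag?"""
    rows = connection.execute(
        """
        SELECT DISTINCT note.id, note.title
        FROM note_tag
        JOIN note ON note.id = note_tag.note_id
        WHERE note_tag.tag_id = ?
        ORDER BY note.id
        """,
        (tag_id,),
    )
    return rows.fetchall()


def note_tags(connection, note_id):
    """Data step: every tag of one note, independent of the filter."""
    rows = connection.execute(
        """
        SELECT tag.id, tag.title
        FROM tag
        JOIN note_tag ON note_tag.tag_id = tag.id
        WHERE note_tag.note_id = ?
        ORDER BY tag.id
        """,
        (note_id,),
    )
    return rows.fetchall()


# Notes tagged with '1', each with its complete tag list.
result = {
    title: note_tags(conn, note_id)
    for note_id, title in all_notes_with_tag_id(conn, "1")
}
print(result)  # {'first note': [('1', 'Psychology'), ('2', 'Logic')]}
```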

Using Hasura session in PostgreSQL function for computed field

I created a "favorite" functionality, which is similar to the common "Like" functionality in many websites.
There are 3 tables:
"User" with primary key UUID
"Photo" with pk UUID
"Favorite" with pk user.UUID and photo.UUID
The corresponding SQL is:
CREATE TABLE public."user" (
id uuid DEFAULT public.gen_random_uuid() NOT NULL
);
CREATE TABLE public."photo" (
id uuid DEFAULT public.gen_random_uuid() NOT NULL
);
CREATE TABLE public."favorite" (
"userId" uuid NOT NULL,
"photoId" uuid NOT NULL
);
Now, I would like to query photos with a computed field isFavorite as boolean where the value is set to true when the current user has favorited the photo.
So, I created this custom SQL function:
CREATE OR REPLACE FUNCTION public.isfavorite(photo photo, hasura_session json)
RETURNS boolean
LANGUAGE sql
STABLE
AS $function$
SELECT EXISTS (
SELECT *
FROM public.favorite
WHERE "userId" = (hasura_session ->> 'x-hasura-user-id')::uuid AND "photoId" = photo.id
)
$function$
I can create this function with SQL in Hasura, but when I set it as a computed field on the photo table, Hasura displays this error:
in table "photo": in computed field "isFavorite": function "isfavorite" is overloaded. Overloaded functions are not supported
Where did I make a mistake? Can we build a custom function that returns a boolean? How would you build a favorite (or like) functionality?
Solved: there were two isFavorite functions in the database, which caused the overloading.
So now there is an isFavorite field in the photo schema, but I need to provide $args with hasura_session as an argument.
How can I provide hasura_session without having to fill in arguments?
You need to add the computed field with the session argument configured, so that Hasura passes the session variables itself:
https://hasura.io/docs/1.0/graphql/manual/api-reference/schema-metadata-api/computed-field.html
{
"type":"add_computed_field",
"args":{
"table":{
"name":"photo",
"schema":"public"
},
"name":"isfavorite",
"definition":{
"function":{
"name":"isfavorite",
"schema":"public"
},
"table_argument":"photo_row",
"session_argument": "hasura_session"
}
}
}
This was also added recently, so make sure you are on v1.3 or later. I would also change the function to accept photo_row as the argument name instead of photo photo, since a parameter named like its table might cause issues with PostgreSQL.
CREATE OR REPLACE FUNCTION public.isfavorite(photo_row photo, hasura_session json)
RETURNS boolean
LANGUAGE sql
STABLE
AS $function$
SELECT EXISTS (
SELECT *
FROM public.favorite
-- the user's id is in x-hasura-user-id; x-hasura-role only holds the role name
WHERE "userId" = (hasura_session ->> 'x-hasura-user-id')::uuid AND "photoId" = photo_row.id
)
$function$;

How to break out jsonb array into rows for a postgresql query

My objective is to break out the results of a query on a table with a jsonb column containing an array into individual rows, but I'm not sure how to write the query. I'm using the following query:
SELECT
jobs.id,
templates.Id,
templates.Version,
templates.StepGroupId,
templates.PublicVersion,
templates.PlannedDataSheetIds,
templates.SnapshottedDataSheetValues
FROM jobs,
jsonb_to_recordset(jobs.source_templates) AS templates(Id, Version, StepGroupId, PublicVersion,
PlannedDataSheetIds, SnapshottedDataSheetValues)
On the following table:
create table jobs
(
id uuid default uuid_generate_v4() not null
constraint jobs_pkey
primary key,
source_templates jsonb
);
with the jsonb column containing data in this format:
[
{
"Id":"94729e08-7d5c-459d-9244-f66e17059fc4",
"Version":1,
"StepGroupId":"0274590b-c08d-4963-b37e-8fc8f25151d2",
"PublicVersion":1,
"PlannedDataSheetIds":null,
"SnapshottedDataSheetValues":null
},
{
"Id":"66791bfd-8cdb-43f7-92e6-bfb45b0f780f",
"Version":4,
"StepGroupId":"126404c5-ed1e-4796-80b1-ca68ad486682",
"PublicVersion":1,
"PlannedDataSheetIds":null,
"SnapshottedDataSheetValues":null
},
{
"Id":"e3b31b98-8052-40dd-9405-c316b9c62942",
"Version":4,
"StepGroupId":"bc6a9dd3-d527-449e-bb36-39f03eaf87b9",
"PublicVersion":1,
"PlannedDataSheetIds":null,
"SnapshottedDataSheetValues":null
}
]
I get an error:
[42601] ERROR: a column definition list is required for functions returning "record"
What is the right way to do this without generating the error?
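You need to define data types in the column definition list. The column names are double-quoted here so that they match the mixed-case JSON keys exactly; unquoted identifiers are folded to lowercase, would not match, and would come back as NULL:
SELECT
jobs.id,
templates."Id",
templates."Version",
templates."StepGroupId",
templates."PublicVersion",
templates."PlannedDataSheetIds",
templates."SnapshottedDataSheetValues"
FROM jobs,
jsonb_to_recordset(jobs.source_templates)
AS templates("Id" UUID, "Version" INT, "StepGroupId" UUID, "PublicVersion" INT,
"PlannedDataSheetIds" INT, "SnapshottedDataSheetValues" INT);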
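What the typed column list does can be pictured with a small Python sketch: each object in the array becomes one row, each declared column is looked up by its exact, case-sensitive key and cast to the declared type, and JSON null stays NULL (None here). The Template type and the to_recordset and _cast helpers are hypothetical names for this illustration, and the sample uses the first two objects from the question. In PostgreSQL itself, column names have to be double-quoted to match mixed-case keys such as "Id", because unquoted identifiers are folded to lowercase.

```python
import json
from typing import NamedTuple, Optional
from uuid import UUID


class Template(NamedTuple):
    """One output row; field names match the JSON keys exactly."""
    Id: Optional[UUID]
    Version: Optional[int]
    StepGroupId: Optional[UUID]
    PublicVersion: Optional[int]
    PlannedDataSheetIds: Optional[int]
    SnapshottedDataSheetValues: Optional[int]


def _cast(value, caster):
    """Apply the declared type, keeping missing or null values as None."""
    return None if value is None else caster(value)


def to_recordset(raw):
    """Turn a JSON array of objects into a list of typed rows."""
    return [
        Template(
            Id=_cast(item.get("Id"), UUID),
            Version=_cast(item.get("Version"), int),
            StepGroupId=_cast(item.get("StepGroupId"), UUID),
            PublicVersion=_cast(item.get("PublicVersion"), int),
            PlannedDataSheetIds=_cast(item.get("PlannedDataSheetIds"), int),
            SnapshottedDataSheetValues=_cast(
                item.get("SnapshottedDataSheetValues"), int
            ),
        )
        for item in json.loads(raw)
    ]


source_templates = """
[
  {"Id": "94729e08-7d5c-459d-9244-f66e17059fc4", "Version": 1,
   "StepGroupId": "0274590b-c08d-4963-b37e-8fc8f25151d2", "PublicVersion": 1,
   "PlannedDataSheetIds": null, "SnapshottedDataSheetValues": null},
  {"Id": "66791bfd-8cdb-43f7-92e6-bfb45b0f780f", "Version": 4,
   "StepGroupId": "126404c5-ed1e-4796-80b1-ca68ad486682", "PublicVersion": 1,
   "PlannedDataSheetIds": null, "SnapshottedDataSheetValues": null}
]
"""

rows = to_recordset(source_templates)
for row in rows:
    print(row.Id, row.Version, row.PlannedDataSheetIds)
```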