PostgreSQL query nested object in array by WHERE

Using PostgreSQL 13.4, I have a query like this, which is used for a GraphQL endpoint:
export const getPlans = async (filter: {
  offset: number;
  limit: number;
  search: string;
}): Promise<SearchResult<Plan>> => {
  const query = sql`
    SELECT
      COUNT(p.*) OVER() AS total_count,
      p.json,
      TO_CHAR(MAX(pp.published_at) AT TIME ZONE 'JST', 'YYYY-MM-DD HH24:MI') AS last_published_at
    FROM
      plans_json p
    LEFT JOIN
      published_plans pp ON p.plan_id = pp.plan_id
    WHERE
      1 = 1
  `;
  if (filter.search)
    query.append(sql`
      AND
      (
        p.plan_id::text ILIKE ${`${filter.search}%`}
        OR
        p.json->>'name' ILIKE ${`%${filter.search}%`}
        OR
        p.json->'activities'->'venue'->>'name' ILIKE ${`%${filter.search}%`}
      )
    `);
  // Neither the last OR line above nor this alternative worked:
  // p @> '{"activities":[{"venue":{"name":${`%${filter.search}`}}}]}'
  // ...
}
The data I'm accessing looks like this:
{
    "data": {
        "plans": {
            "records": [
                {
                    "id": "345sdfsdf",
                    "name": "test1",
                    "activities": [{...},{...}]
                },
                {
                    "id": "abc123",
                    "name": "test2",
                    "activities": [
                        {
                            "name": "test2",
                            "id": "123abc",
                            "venue": {
                                "name": "I WANT THIS VALUE" <------------------------
                            }
                        }
                    ]
                }
            ]
        }
    }
}
Since the search parameter provided to this query varies, I can only make changes inside the WHERE block, to avoid affecting the two searches that already work.
I tried two approaches (see the query above), but neither worked.
Using TypeORM would be an alternative.
EDIT: For example, could I make the statement below work somehow? I want to compare VALUE with the search string provided as an argument:
p.json ->> '{"activities":[{"venue":{"name": VALUE}}]}' ILIKE ${`%${filter.search}`}

First, you should use the jsonb type instead of the json type in Postgres, for many reasons; see the manual:
... In general, most applications should prefer to store JSON data as
jsonb, unless there are quite specialized needs, such as legacy
assumptions about ordering of object keys...
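If the column is currently json, a one-time migration converts it in place. A minimal sketch, assuming the table and column names from the question:
ALTER TABLE plans_json
    ALTER COLUMN json TYPE jsonb
    USING json::jsonb;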
Then you can use the following query to get the whole JSON document based on the search_parameter provided via the user interface, as long as search_parameter is a regular expression (see the manual):
SELECT query
FROM plans_json p
CROSS JOIN LATERAL jsonb_path_query(
    p.json::jsonb,
    FORMAT('$ ? (@.data.plans.records[*].activities[*].venue.name like_regex "%s")', search_parameter)::jsonpath
) AS query
If you only need to retrieve part of the JSON data, move the corresponding part of the jsonpath out of the filter section '? (@...)' and into the root path section '$'. For instance, if you just want to retrieve the jsonb object {"name": "test2", ...}, then the query is:
SELECT query
FROM plans_json p
CROSS JOIN LATERAL jsonb_path_query(
    p.json::jsonb,
    FORMAT('$.data.plans.records[*].activities[*] ? (@.venue.name like_regex "%s")', search_parameter)::jsonpath
) AS query
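If the filter has to stay inside the WHERE block, as the question requires, the same jsonpath idea also works as a boolean predicate via jsonb_path_exists (available since PostgreSQL 12). A minimal sketch of the fragment to put in query.append(sql`...`), assuming p.json holds a single plan object with a top-level "activities" array (as the question's ->> attempt implies); note that the search string is spliced into a regular expression, so regex metacharacters in it will be interpreted:
AND jsonb_path_exists(
    p.json::jsonb,
    FORMAT('$.activities[*] ? (@.venue.name like_regex "%s" flag "i")',
           ${filter.search})::jsonpath
)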

Related

Indexing of string values in JSONB nested arrays in Postgres

I have a complex object stored as JSONB (PostgreSQL 12) with arrays nested inside arrays several levels deep. I want to search for all invoices that contain specific criteria.
create table invoice (
    invoice_number text primary key,
    parts jsonb,
    ...
);
Object:
"parts": [
{
"groups": [
{
"categories": [
{
"items": [
{
...
"articleName": "article1",
"articleSize": "M",
},
{
...
"articleName": "article2"
"articleSize": "XXL",
}
]
}
]
}
]
},
{
"groups": [
...
]
},
]
I've built a native query to search for items with a specific articleName:
select * from invoice i,
    jsonb_array_elements(i.parts) parts,
    jsonb_array_elements(parts -> 'groups') groups,
    jsonb_array_elements(groups -> 'categories') categories,
    jsonb_array_elements(categories -> 'items') items
where items ->> 'articleName' like '%name%' and items ->> 'articleSize' = 'XXL';
I assume I could improve search speed with indexing. I've read about trigram indexes. Would that be the best type of index for my case? If yes, how would I build it for such a complex object?
Thanks in advance for any advice.
The only option that might speed this up is to create a GIN index on the parts column and use a JSON path operator:
select *
from invoice
where parts @? '$.parts[*].groups[*].categories[*].items[*] ? (@.articleName like_regex "name" && @.articleSize == "XXL")'
But I doubt this is going to be fast enough, even if that uses the index.
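For completeness, a sketch of the index creation this refers to, assuming the column is named parts as in the DDL above; the jsonb_path_ops operator class yields a smaller index and supports the @>, @? and @@ operators used here:
CREATE INDEX invoice_parts_gin_idx ON invoice USING gin (parts jsonb_path_ops);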

ADF: use the output from a lookup activity on another activity in Data Factory

I have a lookup activity (Get_ID) that returns:
{
    "count": 2,
    "value": [
        {
            "TRGT_VAL": "10000"
        },
        {
            "TRGT_VAL": "52000"
        }
    ],
    (...)
I want to use these two TRGT_VAL values in the WHERE clause of a query in another activity. I'm using:
@concat('SELECT * FROM table WHERE column in ',activity('Get_ID').output.value[0].TRGT_VAL)
But only the first value, 10000, is taken into account. How do I get the whole list?
I solved it by using a lot of replaces:
@concat('(',replace(replace(replace(replace(replace(replace(replace(string(activity('Get_ID').output.value),'{',''),' ',''),'"',''),'TRGT_VAL:',''),'[',''),'}',''),']',''),')')
Output
{
    "name": "AptitudeCF",
    "value": "(10000,52000)"
}
Instead of using a big expression with lots of replace functions, you can use string interpolation syntax to frame your query. Below is a query you can consider:
SELECT * FROM table WHERE column in (@{activity('Get_ID').output.value[0].TRGT_VAL}, @{activity('Get_ID').output.value[1].TRGT_VAL})

Aggregate results based on array of strings in JSON?

I have a table with a field called 'keywords'. It is a JSONB field with an array of keyword metadata, including the keyword's name.
What I would like is to query the counts of all these keywords by name, i.e. aggregate on keyword name and count(id). All the examples of GROUP BY queries I can find result in grouping on the full list (i.e. only giving me counts where two records have the same set of keywords).
So is it possible to somehow expand the list of keywords in a way that lets me get these counts?
If not, I am still at the planning stage and could refactor my schema to better handle this.
"keywords": [
{
"addedAt": "2017-04-07T21:11:00+0000",
"addedBy": {
"email": "foo#bar.com"
},
"keyword": {
"name": "Animal"
}
},
{
"addedAt": "2017-04-07T20:54:00+0000",
"addedBy": {
"email": "foo#bar.comm"
},
"keyword": {
"name": "Mammal"
}
}
]
step-by-step demo: db<>fiddle
SELECT
    elems -> 'keyword' ->> 'name' AS keyword,           -- 2
    COUNT(*) AS count
FROM
    mytable t,
    jsonb_array_elements(myjson -> 'keywords') AS elems -- 1
GROUP BY 1                                              -- 3
1. Expand the array records into one row per element.
2. Get the keywords' names.
3. Group these text values.
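For the sample document above (a single row containing two keywords), this would return something like:
 keyword | count
---------+-------
 Animal  |     1
 Mammal  |     1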

PostgreSQL query of JSONB array by nested object

I have the following JSON data structure, an array of objects:
{
    "arrayOfObjects": [
        {
            "fieldA": "valueA1",
            "fieldB": { "fieldC": "valueC", "fieldD": "valueD" }
        },
        {
            "fieldA": "valueA",
            "fieldB": { "fieldC": "valueC", "fieldD": "valueD" }
        }
    ]
}
I would like to select all records where fieldD matches my criteria (and fieldC is unknown). I've seen similar answers, such as Query for array elements inside JSON type, but there the field being queried is a simple string (akin to searching on fieldA in my example), whereas my problem is that I would like to query based on an object within an object within the array.
I've tried something like select * from myTable where jsonData -> 'arrayOfObjects' @> '[ { "fieldB": { "fieldD": "valueD" } } ]' but that doesn't seem to work.
How can I achieve what I want?
You can execute a "contains" query on the JSONB field directly and pass the minimum you're looking for:
SELECT *
FROM mytable
WHERE json_data @> '{"arrayOfObjects": [{"fieldB": {"fieldD": "valueD"}}]}'::JSONB;
This of course assumes that fieldD is always nested under fieldB, but that's a fairly low bar to clear in terms of schema consistency.
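If this containment check runs often, a GIN index on the column lets @> use the index. A minimal sketch, assuming the table and column names above:
CREATE INDEX mytable_json_data_gin_idx ON mytable USING gin (json_data jsonb_path_ops);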

Building query in Postgres 9.4.2 for JSONB datatype using builtin function

I have a table schema as follows:
DummyTable
-------------
someData JSONB
All my values will be JSON objects. For example, when you do a select * from DummyTable, it would look like:
someData(JSONB)
------------------
{"values":["P1","P2","P3"],"key":"ProductOne"}
{"values":["P3"],"key":"ProductTwo"}
I want a query that will give me a result set as follows:
[
    {
        "values": ["P1","P2","P3"],
        "key": "ProductOne"
    },
    {
        "values": ["P3"],
        "key": "ProductTwo"
    }
]
I'm using Postgres version 9.4.2. I looked at its documentation page but could not find a query that would give the above result.
In my API I can build the JSON by iterating over the rows, but I would prefer a query that does the same. I tried json_build_array and row_to_json on the result of select * from table_name, but no luck.
Any help would be appreciated.
Here is the link I looked at to write a query for JSONB.
You can use json_agg or jsonb_agg:
create table dummytable(somedata jsonb not null);
insert into dummytable(somedata) values
('{"values":["P1","P2","P3"],"key":"ProductOne"}'),
('{"values":["P3"],"key":"ProductTwo"}');
select jsonb_pretty(jsonb_agg(somedata)) from dummytable;
Result:
[
    {
        "key": "ProductOne",
        "values": [
            "P1",
            "P2",
            "P3"
        ]
    },
    {
        "key": "ProductTwo",
        "values": [
            "P3"
        ]
    }
]
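One usage note: jsonb_agg adds elements in whatever order the rows happen to arrive. If a deterministic order matters, an ORDER BY inside the aggregate call pins it down; a minimal sketch, sorting by the "key" field from the sample data:
select jsonb_pretty(jsonb_agg(somedata order by somedata->>'key')) from dummytable;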
Bear in mind, though, that retrieving the data row by row and building the array on the client side can be more efficient: the server can start sending data as soon as it retrieves the first matching row from storage. If it has to build the JSON array first, it must retrieve and merge all the rows before it can start sending anything.