PostgreSQL: jsonb traversal

I currently have a table which contains a column with a JSON object representing Twitter cashtags.
For example, this is my original query:
SELECT
DATA->'id' as tweet_id,
DATA->'text' as tweet_text,
DATA->'entities'->'symbols' as cashtags
FROM documents
LIMIT 10
The cashtags column will return something like
[{"text":"HEMP","indices":[0,5]},{"text":"MSEZ","indices":[63,68]}]
How can I traverse this datatype, which is listed as jsonb, so that I only return results where the text is equal to HEMP or MSEZ?

The value data->'entities'->'symbols' is a json array. You can unnest the array using the function jsonb_array_elements(), e.g.:
SELECT
data->'id' as tweet_id,
data->'text' as tweet_text,
value as cashtag
FROM documents,
jsonb_array_elements(data->'entities'->'symbols')
where value->>'text' in ('HEMP', 'MSEZ');
tweet_id | tweet_text | cashtag
----------+------------+---------------------------------------
1 | "my_tweet" | {"text": "HEMP", "indices": [0, 5]}
1 | "my_tweet" | {"text": "MSEZ", "indices": [63, 68]}
(2 rows)
or:
SELECT DISTINCT
data->'id' as tweet_id,
data->'text' as tweet_text,
data->'entities'->'symbols' as cashtags
FROM documents,
jsonb_array_elements(data->'entities'->'symbols')
WHERE value->>'text' in ('HEMP', 'MSEZ');
tweet_id | tweet_text | cashtags
----------+------------+------------------------------------------------------------------------------
1 | "my_tweet" | [{"text": "HEMP", "indices": [0, 5]}, {"text": "MSEZ", "indices": [63, 68]}]
(1 row)
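If only whole tweets are needed, the unnesting and the DISTINCT can be avoided with the jsonb containment operator @>. The following is a minimal sketch, not part of the original answer; the sample rows (in particular the second tweet with AAPL) are made up for illustration.

```sql
-- Hypothetical sample data mirroring the shape shown in the question.
CREATE TEMP TABLE documents (data jsonb);
INSERT INTO documents VALUES
  ('{"id": 1, "text": "my_tweet", "entities": {"symbols": [{"text": "HEMP", "indices": [0, 5]}, {"text": "MSEZ", "indices": [63, 68]}]}}'),
  ('{"id": 2, "text": "other_tweet", "entities": {"symbols": [{"text": "AAPL", "indices": [0, 5]}]}}');

-- An array contains '[{"text": "HEMP"}]' when at least one of its elements
-- is an object that has "text": "HEMP" (extra keys such as "indices" are allowed).
SELECT
  data->'id' AS tweet_id,
  data->'text' AS tweet_text,
  data->'entities'->'symbols' AS cashtags
FROM documents
WHERE data->'entities'->'symbols' @> '[{"text": "HEMP"}]'
   OR data->'entities'->'symbols' @> '[{"text": "MSEZ"}]';
```

Each matching tweet comes back once, however many of its cashtags match, so only the tweet with id 1 is returned here.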

Related

Postgres Unique JSON Array Aggregate Values

I have a table that stores values like this:
| id | thing | values |
|----|-------|--------|
| 1 | a |[1, 2] |
| 2 | b |[2, 3] |
| 3 | a |[2, 3] |
And would like to use an aggregate function to group by thing but store only the unique values of the array such that the result would be:
| thing | values |
|-------|---------|
| a |[1, 2, 3]|
| b |[2, 3] |
Is there a simple and performant way of doing this in Postgres?
First, take the JSON array apart with jsonb_array_elements(). This is a set-returning function; with a JOIN LATERAL you get one row with id, thing and a single array element for each element of the array.
Then select the DISTINCT records for thing and value.
Finally, aggregate the records back together with jsonb_agg(), ordering by value inside the aggregate so that the order is guaranteed.
In SQL that looks like:
SELECT thing, jsonb_agg(value ORDER BY value) AS values
FROM (
SELECT DISTINCT thing, value
FROM t
JOIN LATERAL jsonb_array_elements(t.values) AS v(value) ON true) x
GROUP BY thing;
The jsonb functions are used here because DISTINCT needs an equality operator, which the json type does not have; if the column is of type json, cast it with t.values::jsonb. In general you would want to use the jsonb type anyway, as it is more efficient than json.
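For reference, here is a self-contained sketch of the same idea with the sample data from the question, assuming the column is stored as jsonb. It folds the DISTINCT step into the aggregate itself, which is a variation on the answer rather than the answer's exact query. Note that values is a reserved word, so it has to be quoted when the table is created.

```sql
-- Sample table from the question; "values" must be quoted in the definition.
CREATE TEMP TABLE t (id int, thing text, "values" jsonb);
INSERT INTO t VALUES
  (1, 'a', '[1, 2]'),
  (2, 'b', '[2, 3]'),
  (3, 'a', '[2, 3]');

-- Unnest every array, then rebuild one array per thing,
-- keeping each element once and sorting the elements.
SELECT thing, jsonb_agg(DISTINCT value ORDER BY value) AS "values"
FROM t
CROSS JOIN LATERAL jsonb_array_elements(t."values") AS v(value)
GROUP BY thing
ORDER BY thing;
```

With this data the result is [1, 2, 3] for a and [2, 3] for b.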

Extract the values from multiple jsonb columns in postgres

I have multiple jsonb columns and I want to extract data from each column.
I am using PostgreSQL.
Input:-
ID Jsoncol1 Jsoncol2 Jsoncol3 Date
0 {"#class": "team": {"id":"Captain","dob": [1990, 9, 11]}} {"#class": "group": {"id":"Colour","dob": [1990, 9, 11]}} {"#class": "person": {"id":"Red","dob": [1990, 9, 11]}]} 13/05/2019
Output:-
ID Team Group Person Date
0 Captain Colour Red 13/05/2019
Without #class: (which makes the JSON value invalid), the data could look like this:
id | jsoncol1 | jsoncol2 | jsoncol3 | mydate
-: | :---------------------------------------------- | :---------------------------------------------- | :-------------------------------------------- | :---------
0 | {"team": {"id":"Captain","dob": [1990, 9, 11]}} | {"group": {"id":"Colour","dob": [1990, 9, 11]}} | {"person": {"id":"Red","dob": [1990, 9, 11]}} | 2019-05-13
The problem seems to be, extracting the id value from the JSON object:
SELECT
id,
jsoncol1 -> 'team' ->> 'id' AS team,
jsoncol2 -> 'group' ->> 'id' AS "group",
jsoncol3 -> 'person' ->> 'id' AS person,
mydate
FROM mytable
See the PostgreSQL documentation on JSON functions and operators for details.
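The difference between -> and ->> matters here: -> returns jsonb (so a string keeps its JSON quotes), while ->> returns text. The sketch below, which assumes the cleaned-up sample row shown above, also shows the path operator #>> as an equivalent shorthand for chaining -> and ->>; the team_as_jsonb column is added purely for comparison.

```sql
-- Sample row from the answer, with valid JSON.
CREATE TEMP TABLE mytable (
  id int, jsoncol1 jsonb, jsoncol2 jsonb, jsoncol3 jsonb, mydate date
);
INSERT INTO mytable VALUES (
  0,
  '{"team": {"id": "Captain", "dob": [1990, 9, 11]}}',
  '{"group": {"id": "Colour", "dob": [1990, 9, 11]}}',
  '{"person": {"id": "Red", "dob": [1990, 9, 11]}}',
  '2019-05-13'
);

SELECT
  id,
  jsoncol1 #>> '{team,id}'   AS team,           -- text: Captain
  jsoncol2 #>> '{group,id}'  AS "group",        -- text: Colour
  jsoncol3 #>> '{person,id}' AS person,         -- text: Red
  jsoncol1 -> 'team' -> 'id' AS team_as_jsonb,  -- jsonb: "Captain" (with quotes)
  mydate
FROM mytable;
```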

In postgresql how can I select rows where a jsonb array contains objects?

My database table is something like this (data is a JSONB column):
id | data
----+--------------------------------------
1 | {"tags": [{"name": "tag1"}, {"name": "tag2"}]}
2 | {"tags": [{"name": "tag2"}]}
3 | {"tags": [{"name": "tag3"}]}
4 | {"tags": [{"name": "tag4"}]}
I'd like to write a query that will return the rows where data contains tags tag2 or tag3. So rows 1, 2, and 3 should be returned.
I've been looking at the postgresql JSONB documentation and it's not clear to me how to query a nested structure like this. How would I write the where clause?
Unnesting with jsonb_array_elements can hurt performance on large tables. If that is a concern, you can use the containment operator @> instead:
SELECT * FROM mytable WHERE
data @> '{"tags": [{"name": "tag2"}]}'::jsonb
OR
data @> '{"tags": [{"name": "tag3"}]}'::jsonb;
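A containment filter can also be backed by a GIN index, which is where the performance benefit over unnesting mostly comes from. The sketch below is an illustration only: the table name tag_demo and the index name are hypothetical, and whether the planner actually uses the index depends on table size and statistics. The operator involved is @> (containment); #> is path extraction, which is a different operation.

```sql
-- Hypothetical table holding the sample rows from the question.
CREATE TEMP TABLE tag_demo (id int, data jsonb);
INSERT INTO tag_demo VALUES
  (1, '{"tags": [{"name": "tag1"}, {"name": "tag2"}]}'),
  (2, '{"tags": [{"name": "tag2"}]}'),
  (3, '{"tags": [{"name": "tag3"}]}'),
  (4, '{"tags": [{"name": "tag4"}]}');

-- jsonb_path_ops is a GIN operator class that supports @>.
CREATE INDEX tag_demo_data_idx ON tag_demo USING gin (data jsonb_path_ops);

-- Rows whose "tags" array contains an object with name tag2 or tag3.
SELECT id
FROM tag_demo
WHERE data @> '{"tags": [{"name": "tag2"}]}'
   OR data @> '{"tags": [{"name": "tag3"}]}'
ORDER BY id;
```

On the sample data this returns ids 1, 2 and 3.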
Using WHERE EXISTS with a filter on the unnested JSON array will return the rows with id 1, 2 and 3:
SELECT *
FROM mytable
WHERE EXISTS (
SELECT TRUE
FROM jsonb_array_elements(data->'tags') x
WHERE x->>'name' IN ('tag2', 'tag3')
)
let's say:
db=# create table so33(id int, data jsonb);
CREATE TABLE
db=# copy so33 from stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1 | {"tags": [{"name": "tag1"}, {"name": "tag2"}]}
>> 2 | {"tags": [{"name": "tag2"}]}
>> 3 | {"tags": [{"name": "tag3"}]}
>> 4 | {"tags": [{"name": "tag4"}]}
>> \.
COPY 4
then:
db=# with c as (select *,jsonb_array_elements(data->'tags')->>'name' e from so33)
select * from c where e in ('tag2','tag3');
id | data | e
----+------------------------------------------------+------
1 | {"tags": [{"name": "tag1"}, {"name": "tag2"}]} | tag2
2 | {"tags": [{"name": "tag2"}]} | tag2
3 | {"tags": [{"name": "tag3"}]} | tag3
(3 rows)
A simple jsonb_array_elements should do.
A row is repeated once per matching tag, so SELECT DISTINCT id, data FROM c WHERE e IN ('tag2','tag3') should give your expected result.

Match JSONB row where at least one of object's values is not null

In our database we have a data set sorta like the following:
+----+-------------------------------+
| id | stuff |
+----+-------------------------------+
| 1 | {} |
+----+-------------------------------+
| 2 | {"a": "something", "b": null} |
+----+-------------------------------+
| 3 | {"c": null, "d": null} |
+----+-------------------------------+
I would like to match only, in this case, the one with id = 2, reason being that at least one of the values in the object is not null.
How can this be done with PostgreSQL?
I know one can do something like WHERE stuff != '{}', but that of course only checks for an empty object.
Or something like WHERE (stuff->>'a') IS NOT NULL, but the list of keys in the objects is not hardcoded; it could be anything.
Use the function jsonb_each_text() or json_each_text(), example:
with my_table(id, jdata) as (
values
(1, '{}'::jsonb),
(2, '{"a": "something", "b": null}'),
(3, '{"c": null, "d": null}')
)
select distinct t.*
from my_table t
cross join jsonb_each_text(jdata)
where value is not null;
id | jdata
----+-------------------------------
2 | {"a": "something", "b": null}
(1 row)
This query (proposed by Abelisto, see the comments) should be more performant on a larger dataset:
select t.*
from my_table t
where exists (
select 1
from jsonb_each_text(jdata)
where value is not null);
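Another option, not from the original answer, is jsonb_strip_nulls(), which removes every object field whose value is JSON null; a row qualifies if anything is left afterwards. This is a sketch under the assumption that the column always holds a JSON object at the top level, using the same sample data as above.

```sql
WITH my_table(id, jdata) AS (
  VALUES
    (1, '{}'::jsonb),
    (2, '{"a": "something", "b": null}'),
    (3, '{"c": null, "d": null}')
)
SELECT t.*
FROM my_table t
-- After stripping null-valued keys, rows 1 and 3 collapse to {}.
WHERE jsonb_strip_nulls(jdata) <> '{}'::jsonb;
```

Only the row with id 2 remains. The function works recursively, so for nested objects it is worth checking that the result matches what you expect before relying on it.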

OrientDB perform a group by on embeddedlist column

I have the following query:
SELECT Sub_Type, count(Sub_Type)
FROM SOME_TABLE
GROUP BY Sub_Type
The Sub_Type field is an embedded list of strings.
The result I'm getting is:
Blotter_Sub_Type | count
["A"] | 2
["B"] | 3
["C"] | 3
["A","B"] | 1
["B","C"] | 1
But what I'm really after is the number of occurrences of each value. My expected result is:
Blotter_Sub_Type | count
"A" | 3
"B" | 5
"C" | 4
That is, it should count the occurrences of each value individually.
You have to use UNWIND and a subquery:
SELECT Sub_Type, count(Sub_Type) FROM (
SELECT Sub_Type FROM SOME_TABLE UNWIND Sub_Type
) GROUP BY Sub_Type