OrientDB perform a group by on embeddedlist column - orientdb

I have the following query:
SELECT Sub_Type, count(Sub_Type)
FROM SOME_TABLE
GROUP BY Sub_Type
Sub_Type field type is an embedded list of string
The result I'm getting is:
Blotter_Sub_Type | count
["A"] | 2
["B"] | 3
["C"] | 3
["A","B"] | 1
["B","C"] | 1
But when I'm really after is to get how many occurrences of each value, my expected result is:
Blotter_Sub_Type | count
"A" | 3
"B" | 5
"C" | 4
Meaning that it will count the occurrences of each value individually

You have to use UNWIND and a subquery:
SELECT Sub_Type, count(Sub_Type) FROM (
SELECT Sub_Type FROM SOME_TABLE UNWIND Sub_Type
) GROUP BY Sub_Type

Related

Postgresql - Filter object array and extract required values in a json object

I have a PostgreSQL table like below:
| data |
| -------------- |
| {"name":"a","tag":[{"type":"country","value":"US"}]} |
| {"name":"b","tag":[{"type":"country","value":"US"}]}, {"type":"country","value":"UK"}]} |
| {"name":"c","tag":[{"type":"gender","value":"male"}]} |
The goal is to extract all the value in "tag" array with "type" = "country" and aggregate them into a text array. The expected result is as follows:
| result |
| -------------- |
| ["US"] |
| ["US", "UK"] |
| [] |
I've tried to expand the "tag" array and aggregate the desired result back; however, it requires a unique id to group up the results. Hence, I add a column with row number to serve as unique id. Here is what I've done:
SELECT ROW_NUMBER() OVER () AS id, * INTO data_table_with_id FROM data_table;
SELECT ARRAY_AGG(tag_value) AS result
FROM (
SELECT
id,
json_array_elements("data"::json->'tag')->>'type' as tag_type,
json_array_elements("data"::json->'tag')->>'value' as tag_value
FROM data_table_with_id
) tags
WHERE tag_type = 'country'
GROUP BY id;
Is it possible to use a single select to filter the object array and get the required results?
You can do this easily with a JSON path function:
select jsonb_path_query_array(data, '$.tag[*] ?(#.type == "country").value')
from data_table;

T-SQL. HOW to create a table with a sequence of values

I have a table with a list of names and indices. For example like this:
ID | Name | Index
1 | Value 1 | 3
2 | Value 2 | 4
...
N | Value N | NN
I need to create a new table, where every value from field "Name" will be repeat repeated as many times as the "Index" field is specified. For example like this:
ID | Name_2 | ID_2
1 | Value 1 | 1
2 | Value 1 | 2
3 | Value 1 | 3
4 | Value 2 | 1
5 | Value 2 | 2
6 | Value 2 | 3
7 | Value 2 | 4
...
N | Value N | 1
N+1| Value N | 2
...
I have no idea how to write a cycle to get such result. Please, give me an advice.
Here is solution to repeat the rows based on a column value
declare #order table ( Id int, name varchar(20), indx int)
Insert into #order
(Id, name, indx)
VALUES
(1,'Value1',3),
(2,'Value2',4),
(3,'Value3',2)
;WITH cte AS
(
SELECT * FROM #order
UNION ALL
SELECT cte.[ID], cte.name, (cte.indx - 1) indx
FROM cte INNER JOIN #order t
ON cte.[ID] = t.[ID]
WHERE cte.indx > 1
)
SELECT ROW_NUMBER() OVER(ORDER BY name ASC) AS Id, name as [name_2], 1 as [Id_2]
FROM cte
ORDER BY 1

Grouping by unique values inside a JSONB array

Consider the following table structure:
CREATE TABLE residences (id int, price int, categories jsonb);
INSERT INTO residences VALUES
(1, 3, '["monkeys", "hamsters", "foxes"]'),
(2, 5, '["monkeys", "hamsters", "foxes", "foxes"]'),
(3, 7, '[]'),
(4, 11, '["turtles"]');
SELECT * FROM residences;
id | price | categories
----+-------+-------------------------------------------
1 | 3 | ["monkeys", "hamsters", "foxes"]
2 | 5 | ["monkeys", "hamsters", "foxes", "foxes"]
3 | 7 | []
4 | 11 | ["turtles"]
Now I would like to know how many residences there are for each category, as well as their sum of prices. The only way I found was to do this was using a sub-query:
SELECT category, SUM(price), COUNT(*) AS residences_no
FROM
residences a,
(
SELECT DISTINCT(jsonb_array_elements(categories)) AS category
FROM residences
) b
WHERE a.categories #> category
GROUP BY category
ORDER BY category;
category | sum | residences_no
------------+-----+---------------
"foxes" | 8 | 2
"hamsters" | 8 | 2
"monkeys" | 8 | 2
"turtles" | 11 | 1
Using jsonb_array_elements without subquery would return three residences for foxes because of the duplicate entry in the second row. Also the price of the residence would be inflated by 5.
Is there any way to do this without using the sub-query, or any better way to accomplish this result?
EDIT
Initially I did not mention the price column.
select category, count(distinct (id, category))
from residences, jsonb_array_elements(categories) category
group by category
order by category;
category | count
------------+-------
"foxes" | 2
"hamsters" | 2
"monkeys" | 2
"turtles" | 1
(4 rows)
You have to use a derived table to aggregate another column (all prices at 10):
select category, count(*), sum(price) total
from (
select distinct id, category, price
from residences, jsonb_array_elements(categories) category
) s
group by category
order by category;
category | count | total
------------+-------+-------
"foxes" | 2 | 20
"hamsters" | 2 | 20
"monkeys" | 2 | 20
"turtles" | 1 | 10
(4 rows)

Evaluate Values From Multiple Rows As Part of Aggregate or Window Function

I need to find a way to tell if a column has two specific values within a grouped/partitioned section. Easiest to describe by example. I have table "foo" with the following data:
ID | Indicator
1 | A
1 | B
1 | B
2 | C
2 | B
3 | A
3 | B
3 | B
3 | C
4 | A
4 | C
For my output I want a result of "A" if one of the rows in the group has Indicator "A". If not, then "C" if one of the rows Indicator is "C". But in the case where the group has an Indicator of "A" and an Indicator of "C" I want a result of "X" for the group. Given the data I want the following result:
ID | Result
1 | A
2 | C
3 | X
4 | X
The result of A or C (ID 1 and 2 in the example) can be done using a partition and windows function this way:
SELECT DISTINCT ID,
priority_indicator
FROM (SELECT ID,
first_value(Indicator) OVER
(PARTITION BY ID
ORDER BY
CASE
WHEN Indicator = 'A' THEN
1
WHEN Indicator = 'C' THEN
2
ELSE
3
END
) priority_indicator
FROM foo) a
How would you look at the values in multiple rows at once to return an "X" when there's both an "A" and a "C" in the Indicator?
--test data
WITH foo(id,indicator) AS ( VALUES
(1,'A'),
(1,'B'),
(1,'B'),
(2,'C'),
(2,'B'),
(3,'A'),
(3,'B'),
(3,'B'),
(3,'C'),
(4,'A'),
(4,'C')
),
-- get all entries for each Id in indicator_set
agg AS (
SELECT id,array_agg(DISTINCT(indicator)) AS indicator_set FROM foo
GROUP BY id
)
-- actual query
SELECT id,
CASE
WHEN indicator_set #> '{A,C}' THEN 'X'
WHEN indicator_set #> '{A}' THEN 'A'
WHEN indicator_set #> '{C}' THEN 'C'
END result
FROM agg;
Output:
id | result
----+--------
1 | A
2 | C
3 | X
4 | X
(4 rows)

PostgreSQL JSONB query types of each keys

I have certain table:
CREATE TABLE x(
id BIGSERIAL PRIMARY KEY,
data JSONB
);
INSERT INTO x(data)
VALUES( '{"a":"test", "b":123, "c":null, "d":true}' ),
( '{"a":"test", "b":123, "c":null, "d":"yay", "e":"foo", "f":[1,2,3]}' );
How to query types of each key in that table, so it would give an output something like this:
a | string:2
b | number:2
c | null:2
d | boolean:1 string:1
e | string:1
f | jsonb:1 -- or anything
I only know the way to get the keys and count, but don't know how to get the type of each key:
SELECT jsonb_object_keys(data), COUNT(id) FROM x GROUP BY 1 ORDER BY 1
that would give something like:
a | 2
b | 2
c | 2
d | 2
e | 1
f | 1
EDIT:
As pozs points out, there are two typeof functions: one for JSON and one for SQL. This query is the one you're looking for:
SELECT
json_data.key,
jsonb_typeof(json_data.value),
count(*)
FROM x, jsonb_each(x.data) AS json_data
group by key, jsonb_typeof
order by key, jsonb_typeof;
Old Answer: (Hey, it works...)
This query will return the type of the keys:
SELECT
json_data.key,
pg_typeof(json_data.value),
json_data.value
FROM x, jsonb_each(x.data) AS json_data;
... unfortunately, you'll notice that Postgres doesn't differentiate between the different JSON types. it regards it all as jsonb, so the results are:
key1 | value1 | value
------+--------+-----------
a | jsonb | "test"
b | jsonb | 123
c | jsonb | null
d | jsonb | true
a | jsonb | "test"
b | jsonb | 123
c | jsonb | null
d | jsonb | "yay"
e | jsonb | "foo"
f | jsonb | [1, 2, 3]
(10 rows)
However, there aren't that many JSON primitive types, and the output seems to be unambiguous. So this query will do what you're wanting:
with jsontypes as (
SELECT
json_data.key AS key1,
CASE WHEN left(json_data.value::text,1) = '"' THEN 'String'
WHEN json_data.value::text ~ '^-?\d' THEN
CASE WHEN json_data.value::text ~ '\.' THEN 'Number'
ELSE 'Integer'
END
WHEN left(json_data.value::text,1) = '[' THEN 'Array'
WHEN left(json_data.value::text,1) = '{' THEN 'Object'
WHEN json_data.value::text in ('true', 'false') THEN 'Boolean'
WHEN json_data.value::text = 'null' THEN 'Null'
ELSE 'Beats Me'
END as jsontype
FROM x, jsonb_each(x.data) AS json_data -- Note that it won't work if we use jsonb_each_text here because the strings won't have quotes around them, etc.
)
select *, count(*) from jsontypes
group by key1, jsontype
order by key1, jsontype;
Output:
key1 | jsontype | count
------+----------+-------
a | String | 2
b | Integer | 2
c | Null | 2
d | Boolean | 1
d | String | 1
e | String | 1
f | Array | 1
(7 rows)
You can improve your last query with jsonb_typeof
with jsontypes as (
SELECT
json_data.key AS key1,
jsonb_typeof(json_data.value) as jsontype
FROM x, jsonb_each(x.data) AS json_data
)
select *, count(*)
from jsontypes
group by 1, 2
order by 1, 2;