I am using a PostgreSQL database. I have a column of the jsonb data type.
For example, I have JSON data like this:
{
"test_question_number": ["1000000000", "5000000000"],
"question1": 0.04975124378109453,
"question2": 5.077114427860696,
"question3": 75621.89054726369,
"question4": 3482.587064676617,
"question6": 1,
"question8": 0.000176068
}
As you can see, it is key-value JSON data. The data can differ, so the key names are not the same across the saved JSON documents.
Now I would like to convert it into columns and rows, like below:
------------------------------------------------------------------------------------------
|   | test_question_number | question1           | question2         | question3         |
------------------------------------------------------------------------------------------
| 1 | "1000000000"         | 0.04975124378109453 | 5.077114427860696 | 75621.89054726369 |
------------------------------------------------------------------------------------------
| 2 | "5000000000"         |                     |                   |                   |
------------------------------------------------------------------------------------------
I have tried jsonb_build_object, jsonb_populate_recordset and some other functions, but I could not solve it.
A static pivoting solution might be the following, which distinguishes the elements by their JSON type (array or non-array):
-- Inside the CTE body, t still refers to the base table (jsdata is its jsonb column).
WITH t AS
(
  SELECT JSONB_TYPEOF(value::JSONB) AS type, js.*
    FROM t
   CROSS JOIN JSONB_EACH(jsdata) AS js
)
SELECT arr.*, question1, question2, question3, question4, question6, question8
  FROM
  (
    -- one row per array element, numbered by its position in the array
    SELECT row_id, test_question_number
      FROM t
     CROSS JOIN JSONB_ARRAY_ELEMENTS(value::JSONB)
           WITH ORDINALITY arr(test_question_number, row_id)
     WHERE type = 'array'
  ) AS arr
  LEFT JOIN
  (
    -- pivot the scalar keys into columns; cnt is 1 for every key, so they land on row 1
    SELECT cnt,
           MAX(value::text) FILTER (WHERE key = 'question1') AS question1,
           MAX(value::text) FILTER (WHERE key = 'question2') AS question2,
           MAX(value::text) FILTER (WHERE key = 'question3') AS question3,
           MAX(value::text) FILTER (WHERE key = 'question4') AS question4,
           MAX(value::text) FILTER (WHERE key = 'question6') AS question6,
           MAX(value::text) FILTER (WHERE key = 'question8') AS question8
      FROM (SELECT t.*, COUNT(*) OVER (PARTITION BY key) AS cnt
              FROM t
             WHERE type != 'array'
           ) AS q
     GROUP BY cnt
  ) AS obj
    ON arr.row_id = obj.cnt
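To see what the CTE in this query produces, it may help to run jsonb_each and jsonb_typeof on their own. The following is only an illustrative sketch: the literal is a shortened copy of the JSON from the question, not a real table.

```sql
-- One row per top-level key, together with the JSON type of its value.
SELECT js.key,
       jsonb_typeof(js.value) AS type
FROM jsonb_each(
       '{"test_question_number": ["1000000000", "5000000000"],
         "question1": 0.04975124378109453,
         "question6": 1}'::jsonb
     ) AS js;
```

Here test_question_number is reported with type array and the two question keys with type number, which is the distinction the pivot query filters on.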
Related
I have a database table with data similar to this.
create table DataTable (
  name text,
  value numeric
);
insert into DataTable values
('A', 1),('A', 2),('B', 3),('Other', 5),('C', 1);
And I have another table:
create table "group" (
  name text,
  "default" boolean
);
insert into "group" values
('A', false),('B', false),('Other', true);
I want to group the data in the first table based on the defined groups in the second table.
Expected output
Name | sum
A | 3
B | 3
Other | 6
Right now I'm using this query:
select coalesce(g.name, (select name from "group" where "default" = true)) as name,
       sum(dt.value)
from DataTable dt
left join "group" g on dt.name = g.name
group by 1;
This works but can cause performance problems in some situations. Is there a better way to do this?
I have 3 PostgreSQL tables: documents, keywords and a join table.
I have a query that fetches documents.id and documents.document_date if certain keywords are related to that document. That works fine, like so:
SELECT
documents.id, documents.document_date
FROM
documents
INNER JOIN
documents_keywords ON documents_keywords.document_id = documents.id
INNER JOIN
keywords ON keywords.id = documents_keywords.keyword_id
WHERE
keywords.keyword IN ('bread' , 'cake')
GROUP BY documents.id
This returns:
id | document_date
----+-----------
4 | 1200
12 | 1280
(2 rows)
I also want to exclude keywords. I thought I could do NOT IN like so:
SELECT
documents.id, documents.document_date
FROM
documents
INNER JOIN
documents_keywords ON documents_keywords.document_id = documents.id
INNER JOIN
keywords ON keywords.id = documents_keywords.keyword_id
WHERE
keywords.keyword NOT IN ('cranberries')
GROUP BY documents.id
But that always returns an empty result, whatever keyword I use:
id | document_date
----+-----------
(0 rows)
This is incorrect. I expected:
id | document_date
----+-----------
4 | 1200
(1 row)
You might want to use an array expression, like this:
WHERE keyword = any(array['bread', 'cake'])
when you want to include a row.
If you want to exclude something, you have to do the NOT IN over a subselect of the inverse condition, e.g.
SELECT ... WHERE document_id NOT IN
(SELECT document_id FROM ...joins... WHERE keyword = ANY(array['cranberry']))
Here is an example I put together:
WITH documents(d_id, date) AS (
    VALUES (1,'1000'),(2,'2000'),(3,'3000'),(4,'4000')
),
keywords(k_id, keyword) AS (
    VALUES (1,'cake'),(2,'bread'),(3,'cranberry')
),
documents_keywords(d_id, k_id) AS (
    VALUES (1,1),(1,2),(2,2),(2,3),(3,3)
)
SELECT *
FROM documents
WHERE d_id NOT IN (
    SELECT d_id
    FROM documents
    JOIN documents_keywords USING (d_id)
    JOIN keywords USING (k_id)
    WHERE keyword = ANY(array['cranberry'])
);
Also, I am not sure why you are using GROUP BY; I don't think you need it.
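If the include list and the exclude list need to be combined in one query, one possible variant uses EXISTS and NOT EXISTS instead of NOT IN. This is a sketch that reuses the sample CTEs from the example; the keyword lists are taken from the question, with 'cranberry' spelled as in the sample data. NOT EXISTS also avoids the case where NOT IN returns no rows because the subselect yields a NULL.

```sql
WITH documents(d_id, date) AS (
    VALUES (1,'1000'),(2,'2000'),(3,'3000'),(4,'4000')
),
keywords(k_id, keyword) AS (
    VALUES (1,'cake'),(2,'bread'),(3,'cranberry')
),
documents_keywords(d_id, k_id) AS (
    VALUES (1,1),(1,2),(2,2),(2,3),(3,3)
)
SELECT d.*
FROM documents d
WHERE EXISTS (          -- at least one wanted keyword is attached
        SELECT 1
        FROM documents_keywords dk
        JOIN keywords k USING (k_id)
        WHERE dk.d_id = d.d_id
          AND k.keyword IN ('bread', 'cake')
      )
  AND NOT EXISTS (      -- no unwanted keyword is attached
        SELECT 1
        FROM documents_keywords dk
        JOIN keywords k USING (k_id)
        WHERE dk.d_id = d.d_id
          AND k.keyword = 'cranberry'
      );
```

With the sample data, only document 1 qualifies: document 2 has 'bread' but also 'cranberry', document 3 has only 'cranberry', and document 4 has no keywords at all.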
I have a GROUP BY query which fetches some records. What if I want to fetch other column details for those records?
Suppose I have the following query to fetch the most recent entry per id:
Select id, max(date) from records group by id;
I want to fetch another column for those records.
I want to do something like this (this incorrect query is just an example):
Select type from (Select id, max(date) from records group by id)
but here type doesn't exist in the inner query.
I am not able to define the question in a simpler manner. I apologise for that.
Any help is appreciated.
EDIT :
Column | Type | Modifiers
--------+-----------------------+-----------
id | integer |
rdate | date |
type | character varying(20) |
Sample Data :
id | rdate | type
----+------------+------
1 | 2013-11-03 | E1
1 | 2013-12-12 | E1
2 | 2013-12-12 | A3
3 | 2014-01-11 | B2
1 | 2014-01-15 | A1
4 | 2013-12-23 | C1
5 | 2014-01-05 | C
7 | 2013-12-20 | D
8 | 2013-12-20 | D
9 | 2013-12-23 | A1
Meanwhile, I was trying something like this (I'm no good at SQL):
select type from records as r1 inner join (Select id,max(rdate) from records group by id) r2 on r1.rdate = r2.rdate ;
or
select type from records as r1 ,(Select id,max(rdate) from records group by id) r2 inner join r1 on r1.rdate = r2.rdate ;
You can easily do this with a window function:
SELECT id, rdate, type
FROM (
  SELECT id, rdate, type,
         rank() OVER (PARTITION BY id ORDER BY rdate DESC) AS rnk
  FROM records
) foo
WHERE rnk = 1  -- the filter must sit outside the sub-select, where rnk is visible
ORDER BY id;
The window definition OVER (PARTITION BY id ORDER BY rdate DESC) takes all records with the same id value, then sorts them from most recent to least recent rdate and assigns a rank to each row. A rank of 1 is the most recent row, so it is equivalent to max(rdate).
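As a PostgreSQL-specific alternative (a different technique from the window function above, shown only as a sketch), DISTINCT ON can pick one row per id directly. It assumes the records table from the question's EDIT.

```sql
-- Keep the first row of each id group, where "first" means the latest rdate.
SELECT DISTINCT ON (id)
       id, rdate, type
FROM records
ORDER BY id, rdate DESC;
```

One difference to keep in mind: if two rows share the same id and the same latest rdate, rank() returns both, while DISTINCT ON returns exactly one of them, chosen arbitrarily unless further ORDER BY columns break the tie.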
If I've understood the question right, then this should work (or at least get you something you can work with):
SELECT
  b.id, b.maxdate, a.type
FROM
  records a   -- this is the records table, where you'll get the type
INNER JOIN    -- now join it to the group by query
  (select id, max(rdate) as maxdate FROM records GROUP BY id) b
ON            -- join on both rdate and id, otherwise you'll get lots of duplicates
  b.id = a.id
  AND b.maxdate = a.rdate
Note that if you have records with different types for the same id and rdate combination you'll get duplicates.
I have a table in Postgres which stores a tree structure. Each node has a jsonb field: params_diff:
CREATE TABLE tree (id INT, parent_id INT, params_diff JSONB);
INSERT INTO tree VALUES
(1, NULL, '{ "some_key": "some value" }'::jsonb)
, (2, 1, '{ "some_key": "other value", "other_key": "smth" }'::jsonb)
, (3, 2, '{ "other_key": "smth else" }'::jsonb);
What I need is to select a node by id with an additional generated params field, which contains the result of merging all params_diff values from the whole chain of parents:
SELECT tree.*, /* some magic here */ AS params FROM tree WHERE id = 3;
id | parent_id | params_diff | params
----+-----------+----------------------------+-------------------------------------------------------
3 | 2 | {"other_key": "smth else"} | {"some_key": "other value", "other_key": "smth else"}
Generally, a recursive CTE can do the job. Example:
Use table alias in another query to traverse a tree
We just need a bit more magic to decompose, process and re-assemble the JSON result. I am assuming from your example that you want each key only once, with the first value in the search path (bottom-up):
WITH RECURSIVE cte AS (
   SELECT id, parent_id, params_diff, 1 AS lvl
   FROM   tree
   WHERE  id = 3
   UNION ALL
   SELECT t.id, t.parent_id, t.params_diff, c.lvl + 1
   FROM   cte c
   JOIN   tree t ON t.id = c.parent_id
   )
SELECT id, parent_id, params_diff
     , (SELECT json_object(array_agg(key   ORDER BY lvl)
                         , array_agg(value ORDER BY lvl))::jsonb
        FROM  (
           SELECT key, value
           FROM  (
              SELECT DISTINCT ON (key)
                     p.key, p.value, c.lvl
              FROM   cte c, jsonb_each_text(c.params_diff) p
              ORDER  BY p.key, c.lvl
              ) sub1
           ORDER  BY lvl
           ) sub2
       ) AS params
FROM   cte
WHERE  id = 3;
How?
1. Walk the tree with a classic recursive CTE.
2. Create a derived table with all keys and values with jsonb_each_text() in a LATERAL JOIN, remembering the level in the search path (lvl).
3. Use DISTINCT ON to get the "first" (lowest lvl) value for each key. Details: Select first row in each GROUP BY group?
4. Sort and aggregate the resulting keys and values and feed the arrays to json_object() to build the final params value.
SQL Fiddle (only as far as pg 9.3 can go with json instead of jsonb).
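On PostgreSQL 9.5 or later, where jsonb_object_agg() is available, a shorter variant seems possible. This is a sketch under that version assumption and not the method described above: it relies on jsonb keeping the last value when the same key is aggregated more than once, so feeding the rows root-first lets values closer to the selected node win.

```sql
WITH RECURSIVE cte AS (
   SELECT id, parent_id, params_diff, 1 AS lvl
   FROM   tree
   WHERE  id = 3
   UNION ALL
   SELECT t.id, t.parent_id, t.params_diff, c.lvl + 1
   FROM   cte c
   JOIN   tree t ON t.id = c.parent_id
   )
SELECT t.*
       -- highest lvl (the root) is aggregated first, lvl 1 (the node itself) last
     , (SELECT jsonb_object_agg(p.key, p.value ORDER BY c.lvl DESC)
        FROM   cte c, jsonb_each(c.params_diff) p
       ) AS params
FROM   tree t
WHERE  t.id = 3;
```

Because this variant uses jsonb_each() rather than jsonb_each_text(), non-string values such as numbers or nested objects keep their JSON type in params.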
Assume the table below:
column name | type
id | int
date | varchar
When I use
SELECT ROWNUMBER() OVER( ORDER BY TYPE_DATE ) as ROWID,
TO_DATE( date, 'mm\dd\yyyy' ) as TYPE_DATE,
*
FROM TABLE
I always get below error:
SQL0104N An unexpected token "*" was found following .... <select_sublist>
Here are three questions:
1. Why can't * be used here?
2. Why can't this new column be used in OVER()?
3. How can I get the second set of 10 records, ordered by a formatted column?
To answer your first question: it is because you have specified additional columns, and DB2 is unable to expand this * to a column list. You can fix this by adding a table identifier (FROM TABLE T) and using the exposed identifier to expand the column list (SELECT ..., T.*).
As you can see on this chart from the Information Center, you can only have EITHER * OR expressions and exposed-name.*
>--+-*-----------------------------------------------+---------><
   | .-,-------------------------------------------. |
   | V                                             | |
   '---+-expression--+-------------------------+-+-+-'
       |             | .-AS-.                  | |
       |             '-+----+--new-column-name-' |
       '-exposed-name.*--------------------------'
For two and three, a column can't refer to the value of a function in the same SELECT clause by its alias. You can push the expression down into a sub-select, and then use its alias in the OVER() clause. Because an OLAP function such as ROWNUMBER() cannot appear in a WHERE clause, wrap the numbered result in one more sub-select and filter it there with BETWEEN; the second set of 10 rows is rows 11 through 20:
SELECT T2.*
FROM (
  SELECT ROWNUMBER() OVER( ORDER BY TYPE_DATE ) as ROWID, T1.*
  FROM (
    SELECT TO_DATE( date, 'mm\dd\yyyy' ) as TYPE_DATE, T.*
    FROM TABLE T
  ) T1
) T2
WHERE ROWID BETWEEN 11 AND 20
ORDER BY TYPE_DATE