Unpack dictionary in column in SQL table [duplicate] - postgresql

In a Postgres 9.3 database I have a table in which one column contains JSON, as in the test table shown in the example below.
test=# create table things (id serial PRIMARY KEY, details json, other_field text);
CREATE TABLE
test=# \d things
Table "public.things"
Column | Type | Modifiers
-------------+---------+-----------------------------------------------------
id | integer | not null default nextval('things_id_seq'::regclass)
details | json |
other_field | text |
Indexes:
"things_pkey" PRIMARY KEY, btree (id)
test=# insert into things (details, other_field)
values ('[{"json1": 123, "json2": 456},{"json1": 124, "json2": 457}]', 'nonsense');
INSERT 0 1
test=# insert into things (details, other_field)
values ('[{"json1": 234, "json2": 567}]', 'piffle');
INSERT 0 1
test=# select * from things;
id | details | other_field
----+-------------------------------------------------------------+-------------
1 | [{"json1": 123, "json2": 456},{"json1": 124, "json2": 457}] | nonsense
2 | [{"json1": 234, "json2": 567}] | piffle
(2 rows)
The JSON is always an array containing a variable number of hashes. Each hash always has the same set of keys. I am trying to write a query which returns a row for each entry in the JSON array, with columns for each hash key and the id from the things table. I'm hoping for output like the following:
thing_id | json1 | json2
----------+-------+-------
1 | 123 | 456
1 | 124 | 457
2 | 234 | 567
i.e. two rows for entries with two items in the JSON array. Is it possible to get Postgres to do this?
json_populate_recordset feels like an essential part of the answer, but I can't get it to work with more than one row at once.

select id,
(details ->> 'json1')::int as json1,
(details ->> 'json2')::int as json2
from (
select id, json_array_elements(details) as details
from things
) s
;
id | json1 | json2
----+-------+-------
1 | 123 | 456
1 | 124 | 457
2 | 234 | 567
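Since the question mentions json_populate_recordset: it can also do this in 9.3 if you give it a row type describing the shape of the hashes (thing_details is an illustrative type name here, not from the original post), letting the implicit LATERAL join unnest each row's array:

create type thing_details as (json1 int, json2 int);

select t.id as thing_id, r.json1, r.json2
from things t,
     json_populate_recordset(null::thing_details, t.details) as r;

This should give the same three rows as the query above, one per array element, with columns typed by the composite type instead of explicit casts.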


How to query across multiple rows in postgres

I'm saving dynamic objects (objects of which I do not know the type upfront) using the following 2 tables in Postgres:
CREATE TABLE IF NOT EXISTS objects(
id UUID NOT NULL DEFAULT gen_random_uuid(),
user_id UUID NOT NULL,
name TEXT NOT NULL,
PRIMARY KEY(id)
);
CREATE TABLE IF NOT EXISTS object_values(
id UUID NOT NULL DEFAULT gen_random_uuid(),
event_id UUID NOT NULL,
param TEXT NOT NULL,
value TEXT NOT NULL
);
So for instance, if I have the following objects:
dog = [
{ breed: "poodle", age: 15, ...},
{ breed: "husky", age: 9, ...},
]
monitors = [
{ manufacturer: "dell", ...},
]
It will live in the DB as follows:
-- objects
| id | user_id | name |
|----|---------|---------|
| 1 | 1 | dog |
| 2 | 2 | dog |
| 3 | 1 | monitor |
-- object_values
| id | event_id | param | value |
|----|----------|--------------|--------|
| 1 | 1 | breed | poodle |
| 2 | 1 | age | 15 |
| 3 | 2 | breed | husky |
| 4 | 2 | age | 9 |
| 5 | 3 | manufacturer | dell |
Note: these tables are big (hundreds of millions of rows) and generally optimised for writing.
What would be a good way of querying/filtering objects based on multiple object params? For instance: Select the number of all husky dogs above the age of 10 per unique user.
I also wonder whether it would have been better to denormalise the tables and collapse the params onto a JSON column (and use gin indexes).
Are there any standards I can use here?
"Select the number of all husky dogs above the age of 10 per unique user" - The following query would do it.
SELECT user_id, COUNT(DISTINCT event_id) AS num_husky_dogs_older_than_10
FROM objects o
INNER JOIN object_values ov
ON o.id_ = ov.event_id
AND o.name_ = 'dog'
GROUP BY o.user_id
HAVING MAX(CASE WHEN ov.param = 'age'
AND ov.value_::integer >= 10 THEN 1 END) = 1
AND MAX(CASE WHEN ov.param = 'breed'
AND ov.value_ = 'husky' THEN 1 END) = 1;
Since your queries will most likely always involve the same JOIN between these two tables on the same fields, it would be good to have indices on:
the fields you join on ("objects.id", "object_values.event_id")
the fields you filter on ("objects.name", "object_values.param", "object_values.value")
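As for the JSON-column idea: a jsonb column with a GIN index is the usual alternative. A minimal sketch, assuming jsonb is available (Postgres 9.4+); the table and index names are illustrative:

CREATE TABLE objects_jsonb (
    id      UUID NOT NULL DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL,
    name    TEXT NOT NULL,
    params  JSONB NOT NULL,
    PRIMARY KEY (id)
);
CREATE INDEX objects_jsonb_params_idx ON objects_jsonb USING gin (params);

-- "husky dogs above the age of 10 per user" becomes a containment test plus a cast;
-- the @> predicate can use the GIN index, the age check is applied afterwards:
SELECT user_id, COUNT(*) AS num_husky_dogs_older_than_10
FROM objects_jsonb
WHERE name = 'dog'
  AND params @> '{"breed": "husky"}'
  AND (params->>'age')::integer > 10
GROUP BY user_id;

This trades write amplification in the EAV table for a single row per object, at the cost of weaker typing on the params.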

Locks on updating rows with foreign key constraint

I tried executing the same UPDATE query twice, as shown below.
After the first query the transaction holds no row lock on t1, but a row lock appears there after the second query.
Schema:
test=# \d t1
Table "public.t1"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
i | integer | | not null |
j | integer | | |
Indexes:
"t1_pkey" PRIMARY KEY, btree (i)
Referenced by:
TABLE "t2" CONSTRAINT "t2_j_fkey" FOREIGN KEY (j) REFERENCES t1(i)
test=# \d t2
Table "public.t2"
Column | Type | Collation | Nullable | Default
--------+---------+-----------+----------+---------
i | integer | | not null |
j | integer | | |
k | integer | | |
Indexes:
"t2_pkey" PRIMARY KEY, btree (i)
Foreign-key constraints:
"t2_j_fkey" FOREIGN KEY (j) REFERENCES t1(i)
Existing data:
test=# SELECT * FROM t1 ORDER BY i;
i | j
---+---
1 | 1
2 | 2
(2 rows)
test=# SELECT * FROM t2 ORDER BY i;
i | j | k
---+---+---
3 | 1 |
4 | 2 |
(2 rows)
UPDATE queries and row lock status:
test=# BEGIN;
BEGIN
test=# UPDATE t2 SET k = 123 WHERE i = 3;
UPDATE 1
test=# SELECT * FROM t1 AS t, pgrowlocks('t1') AS p WHERE p.locked_row = t.ctid;
i | j | locked_row | locker | multi | xids | modes | pids
---+---+------------+--------+-------+------+-------+------
(0 rows)
test=# UPDATE t2 SET k = 123 WHERE i = 3;
UPDATE 1
test=# SELECT * FROM t1 AS t, pgrowlocks('t1') AS p WHERE p.locked_row = t.ctid;
i | j | locked_row | locker | multi | xids | modes | pids
---+---+------------+--------+-------+----------+-------------------+------
1 | 1 | (0,1) | 107239 | f | {107239} | {"For Key Share"} | {76}
(1 row)
test=#
Why does Postgres try to get a row lock only the second time?
By the way, queries that update column t2.j take a new lock (ForKeyShare) on the t1 row at once. That behavior makes sense, because t2.j has a foreign key constraint referencing t1.i. But the queries above don't touch t2.j at all.
Can anyone explain this lock?
PostgreSQL version: 9.6.3
Okay, I got it.
http://blog.nordeus.com/dev-ops/postgresql-locking-revealed.htm
This is an optimization that exists in Postgres. If the locking manager can figure out from the first query that the foreign key is not changed (it is not mentioned in the update query, or is set to the same value), it will not lock the parent table. But the second query behaves as described in the documentation: it locks the parent table in ROW SHARE mode and the referenced row in FOR KEY SHARE mode (the "For Key Share" entry visible in the pgrowlocks output above).
It seems MySQL is wiser about foreign key locks, because the same UPDATE query doesn't take such locks on MySQL.
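For contrast, a hypothetical continuation of the session (not from the original post): updating the foreign key column itself to a new value should lock the referenced t1 row on the very first statement, since the optimization cannot apply.

BEGIN;
UPDATE t2 SET j = 2 WHERE i = 3;  -- changes the FK value this time
SELECT * FROM t1 AS t, pgrowlocks('t1') AS p WHERE p.locked_row = t.ctid;
-- expected: the t1 row with i = 2 (the newly referenced value) shows a
-- "For Key Share" lock right away, taken by the RI trigger's check
ROLLBACK;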

How to order rows with linked parts in PostgreSQL

I have a table A with columns: id, title, condition.
And I have another table B with information about the position of some rows from table A. Table B has columns: id, next_id, prev_id.
How can I sort the rows from A based on the information in table B?
For example,
Table A
id| title
---+-----
1 | title1
2 | title2
3 | title3
4 | title4
5 | title5
Table B
id | next_id | prev_id
---+---------+---------
2 | 1 | null
5 | 4 | 3
I want to get this result:
id| title
---+-----
2 | title2
1 | title1
3 | title3
5 | title5
4 | title4
And after applying this sort, I also want to sort by the condition column.
I've already spent a lot of time looking for a solution, and I hope for your help.
You have to add weights to your data so you can order accordingly. This example uses next_id; I'm not sure whether you also need prev_id, as you don't explain its use.
Anyway, here's a code example:
-- Temporary data for the test:
CREATE TEMP TABLE table_a(id integer, title text);
CREATE TEMP TABLE table_b(id integer, next_id integer, prev_id integer);
INSERT INTO table_a VALUES
(1,'title1'),
(2,'title2'),
(3,'title3'),
(4,'title4'),
(5,'title5');
INSERT INTO table_b VALUES
(2,1,null),
(5,4,3);
-- QUERY:
SELECT
id, title,
CASE -- adding weight: a row listed in table_b sorts at its next_id,
     -- any other row sorts just after its own id
WHEN next_id IS NULL THEN (id + 0.1)
ELSE next_id
END AS orden
FROM -- joining tables
(SELECT ta.*, tb.next_id
FROM table_a ta
LEFT JOIN table_b tb
ON ta.id = tb.id) AS join_a_b
ORDER BY orden;
And here's the result:
id | title  | orden
--------------------------
2 | title2 | 1
1 | title1 | 1.1
3 | title3 | 3.1
5 | title5 | 4
4 | title4 | 4.1

cassandra 2.0.7 cql SELECT Specific Value from map

ALTER TABLE users ADD todo map<text, text>;
UPDATE users SET todo = { '1':'1111', '2':'2222', '3':'3' ,.... } WHERE user_id = 'frodo';
Now I want to run the following CQL, but it fails. Is there any other method?
SELECT user_id, todo['1'] FROM users WHERE user_id = 'frodo';
PS:
The length of my map can change. For example: { '1':'1111', '2':'2222', '3':'3' } or { '1':'1111', '2':'2222', '3':'3', '4':'4444' } or { '1':'1111', '2':'2222', '3':'3', '4':'4444', ... }
If you want to use a map collection, you'll have the limitation that you can only select the collection as a whole (docs).
I think you could use the suggestion from the referenced question, even if the length of your map changes. If you store those key/value pairs for each user_id in separate fields, and make your primary key based on user_id and todo_k, you'll have access to them in the select query.
For example:
CREATE TABLE users (
user_id text,
todo_k text,
todo_v text,
PRIMARY KEY (user_id, todo_k)
);
-----------------------------
| user_id | todo_k | todo_v |
-----------------------------
| frodo | 1 | 1111 |
| frodo | 2 | 2222 |
| sam | 1 | 11 |
| sam | 2 | 22 |
| sam | 3 | 33 |
-----------------------------
Then you can do queries like:
select user_id,todo_k,todo_v from users where user_id = 'frodo';
select user_id,todo_k,todo_v from users where user_id = 'sam' and todo_k = '2';
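For completeness, rows in this layout are written one key/value pair at a time; illustrative inserts matching the table above:

INSERT INTO users (user_id, todo_k, todo_v) VALUES ('frodo', '1', '1111');
INSERT INTO users (user_id, todo_k, todo_v) VALUES ('frodo', '2', '2222');
INSERT INTO users (user_id, todo_k, todo_v) VALUES ('sam', '1', '11');

Because todo_k is a clustering column, each user's pairs stay together on disk and can be read as a slice, which is what the first select above does.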

Incrementing a sequence in PostgreSQL based on a foreign key

I would like to use sequences to create friendly IDs for some objects in my database. My problem is that I don't want the sequence to be global to the table, instead I want to increment the value based on a foreign key.
For example, my table is defined as:
CREATE TABLE foo (id numeric PRIMARY KEY, friendly_id SERIAL, bar_id numeric NOT NULL)
And I would like friendly_id to increment separately for each bar_id such that the following statements:
INSERT INTO foo VALUES (123, DEFAULT, 345);
INSERT INTO foo VALUES (124, DEFAULT, 345);
INSERT INTO foo VALUES (125, DEFAULT, 346);
INSERT INTO foo VALUES (126, DEFAULT, 345);
Would result in (desired behavior):
id | friendly_id | bar_id
-----------+------------------+-----------------
123 | 1 | 345
124 | 2 | 345
125 | 1 | 346
126 | 3 | 345
Instead of (current behavior):
id | friendly_id | bar_id
-----------+------------------+-----------------
123 | 1 | 345
124 | 2 | 345
125 | 3 | 346
126 | 4 | 345
Is this possible using sequences or is there a better way to achieve this?
create table foo (
id serial primary key,
friendly_id integer not null,
bar_id integer not null,
unique(friendly_id, bar_id)
);
In the application, wrap the insertion in an exception-catching loop that retries if a duplicate-key error is raised:
insert into foo (friendly_id, bar_id)
select
coalesce(max(friendly_id), 0) + 1,
346
from foo
where bar_id = 346;
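A minimal sketch of that retry loop done server-side in PL/pgSQL, assuming the schema above (insert_foo is an illustrative name, not part of the original answer):

CREATE OR REPLACE FUNCTION insert_foo(p_bar_id integer) RETURNS integer AS $$
DECLARE
    new_friendly_id integer;
BEGIN
    LOOP
        BEGIN
            -- claim the next friendly_id for this bar_id
            INSERT INTO foo (friendly_id, bar_id)
            SELECT coalesce(max(friendly_id), 0) + 1, p_bar_id
            FROM foo
            WHERE bar_id = p_bar_id
            RETURNING friendly_id INTO new_friendly_id;
            RETURN new_friendly_id;
        EXCEPTION WHEN unique_violation THEN
            NULL;  -- a concurrent insert took the same friendly_id; loop and retry
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;

Called as select insert_foo(345); it returns the friendly_id it managed to claim, relying on the unique(friendly_id, bar_id) constraint to detect races.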