Postgres set jsonb field with columns from the same table - postgresql

I am trying to move data from a couple of columns to a single JSONB column.
For example, I have x, y, z columns like this:
id | x | y | z | data
---------------------
1 | 1 | 2 | 3 | NULL
2 | 4 | 5 | 6 | NULL
3 | 7 | 8 | 9 | NULL
And I want to make it look like this:
id | x | y | z | data
---------------------
1 | 1 | 2 | 3 | {"x":1, "y":2, "z": 3}
2 | 4 | 5 | 6 | {"x":4, "y":5, "z": 6}
3 | 7 | 8 | 9 | {"x":7, "y":8, "z": 9}
I tried jsonb_set and jsonb_insert without success. I am using PostgreSQL 14.

You can convert the entire row to a jsonb value, then remove the keys you don't want (id, and data itself, since it is also a column of the row):
select t.*,
to_jsonb(t) - 'id' - 'data' as data
from the_table t;

OK, thanks to @Edouard I've made the query (the_table stands for the real table name, since table is a reserved word):
UPDATE the_table SET data = jsonb_build_object('x', x, 'y', y, 'z', z);

An UPDATE version of the solution by @a_horse_with_no_name:
update the_table set data = to_jsonb(the_table.*) - 'id' - 'data';
Unlike the jsonb_build_object approach, it works without changes for any number of columns other than id and data.
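
To make the semantics of to_jsonb(row) - 'id' - 'data' concrete, here is a minimal sketch that models it in plain Python rather than running it against PostgreSQL. The rows list and the row_to_data helper are hypothetical names introduced only for this illustration; the point is that every column except the dropped keys ends up in the JSON value, so data itself has to be dropped too.

```python
import json

# Hypothetical in-memory copy of the rows from the question.
rows = [
    {"id": 1, "x": 1, "y": 2, "z": 3, "data": None},
    {"id": 2, "x": 4, "y": 5, "z": 6, "data": None},
    {"id": 3, "x": 7, "y": 8, "z": 9, "data": None},
]


def row_to_data(row, drop=("id", "data")):
    """Model of to_jsonb(row) - 'id' - 'data': keep all columns except the dropped keys."""
    return {key: value for key, value in row.items() if key not in drop}


for row in rows:
    # Equivalent in spirit to: UPDATE ... SET data = to_jsonb(row) - 'id' - 'data'
    row["data"] = json.dumps(row_to_data(row))
    print(row["id"], row["data"])
```

If only 'id' is dropped, the result also carries a "data": null entry, which is why the UPDATE above subtracts both keys.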


Postgresql: Making timeseries table smaller

I am using PostgreSQL 12.5, compiled by Visual C++ build 1914, 64-bit and I have the following table:
CREATE TABLE public.historian_new_data_id_v2 (
"timestamp" timestamptz NOT NULL,
value float8 NOT NULL,
quality float4 NOT NULL,
tagname_id int2 NOT NULL
);
I would expect this table to take 8 + 8 + 4 + 2 = 22 bytes per row, as I think there are no alignment issues. Even in the worst case, with all fields 8-bytes-aligned I would expect it to take 8 * 4 = 32 bytes per row.
However, these are the size stats I get for the table:
with row_count as (select COUNT(*) as c from historian_new_data_id_v2)
select
c as "number of rows",
pg_size_pretty(pg_total_relation_size('historian_new_data_id_v2')) as "total table size",
pg_total_relation_size('historian_new_data_id_v2')::numeric / c as " bytes/row"
from row_count
number of rows: 409858537
table size: 20 GB
bytes / row: 52.1783453494345538
That is a lot of overhead! 52 bytes per row instead of the expected 22 or, in the worst case, 32. How is this difference explained?
Also, any advice on making this table smaller (the number of rows is going to skyrocket soon)?
Each row in PostgreSQL has system columns:
select attname, attnum, attlen
from pg_attribute
where attrelid = 'public.historian_new_data_id_v2'::regclass
order by attnum;
+------------+----------+----------+
| attname | attnum | attlen |
|------------+----------+----------|
| tableoid | -6 | 4 |
| cmax | -5 | 4 |
| xmax | -4 | 4 |
| cmin | -3 | 4 |
| xmin | -2 | 4 |
| ctid | -1 | 6 |
| timestamp | 1 | 8 |
| value | 2 | 8 |
| quality | 3 | 4 |
| tagname_id | 4 | 2 |
+------------+----------+----------+
These columns are available to select, if you want:
select tableoid, cmax, xmax, cmin, xmin, ctid, timestamp, value, quality, tagname_id
from public.historian_new_data_id_v2;
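
The 52 bytes can be reproduced with a little arithmetic. The sketch below is written in Python and assumes a default PostgreSQL build: 8 kB pages, a 24-byte page header, a 4-byte line pointer per row, a 23-byte tuple header, and 8-byte alignment. Those constants are assumptions about this installation, not values read from it.

```python
# Heap-storage constants for a default PostgreSQL build (assumed, not measured).
PAGE_SIZE = 8192      # bytes per heap page
PAGE_HEADER = 24      # fixed header at the start of every page
LINE_POINTER = 4      # per-row item pointer stored in the page
TUPLE_HEADER = 23     # per-row header (xmin, xmax, ctid, flags, ...)
MAXALIGN = 8          # alignment boundary on 64-bit platforms


def align(size, boundary=MAXALIGN):
    """Round size up to the next multiple of boundary."""
    return -(-size // boundary) * boundary


data_bytes = 8 + 8 + 4 + 2                                     # timestamptz, float8, float4, int2
tuple_bytes = align(TUPLE_HEADER) + align(data_bytes)          # 24 + 24
per_row = tuple_bytes + LINE_POINTER                           # 52
rows_per_page = (PAGE_SIZE - PAGE_HEADER) // per_row           # whole rows that fit in a page
bytes_per_row = PAGE_SIZE / rows_per_page                      # page size spread over its rows

print(per_row, rows_per_page, round(bytes_per_row, 3))
```

Under these assumptions a row costs 52 bytes and 157 rows fit in a page, which gives about 52.178 bytes per row, very close to the measured figure. Most of that is fixed per-row overhead, so reordering the columns would not help here; the 22 data bytes already pad to only 24.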

unify multiple tables with the same structure into one table in postgres

I have multiple tables with the same structure in a postgres database. I want to unify these tables into one table.
Table 1
| a | b |
----|----
| 1 | 2 |
Table 2
| a | b |
----|----
| 3 | 4 |
| 5 | 6 |
Table 3
| a | b |
----|----
| 7 | 8 |
Expected Output Table:
| a | b |
----|----
| 1 | 2 |
| 3 | 4 |
| 5 | 6 |
| 7 | 8 |
Can anyone help me with this issue?
This should be pretty simple: check out the SQL UNION operator.
Basically, you want to append all the data into one table.
Assuming the number of tables is small and you don't need any automation around it, the easiest approach is to build a SELECT ... UNION query like the one below. Note that UNION removes duplicate rows; use UNION ALL if every row should be kept.
select a,b from table1
union
select a,b from table2
union
select a,b from table3
You can use INSERT INTO ... SELECT to load the result into an existing table, or CREATE TABLE ... AS SELECT to create a new table from it.
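
As a rough illustration of the difference between UNION and UNION ALL when combining the tables, here is a small sketch using Python's built-in sqlite3 module as a stand-in for Postgres. The target table name unified and the extra (1, 2) row in table3 are assumptions added only to make the duplicate handling visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a INTEGER, b INTEGER);
    CREATE TABLE table2 (a INTEGER, b INTEGER);
    CREATE TABLE table3 (a INTEGER, b INTEGER);
    INSERT INTO table1 VALUES (1, 2);
    INSERT INTO table2 VALUES (3, 4), (5, 6);
    -- The second row duplicates table1 on purpose (illustration only).
    INSERT INTO table3 VALUES (7, 8), (1, 2);

    -- Hypothetical target table; UNION ALL appends every row as-is.
    CREATE TABLE unified (a INTEGER, b INTEGER);
    INSERT INTO unified
        SELECT a, b FROM table1
        UNION ALL
        SELECT a, b FROM table2
        UNION ALL
        SELECT a, b FROM table3;
""")

all_rows = conn.execute("SELECT a, b FROM unified ORDER BY a, b").fetchall()

# Plain UNION removes the duplicate (1, 2) row.
distinct_rows = conn.execute("""
    SELECT a, b FROM table1
    UNION
    SELECT a, b FROM table2
    UNION
    SELECT a, b FROM table3
    ORDER BY a, b
""").fetchall()

print(all_rows)
print(distinct_rows)
```

With the data from the question there are no duplicates, so both forms return the same four rows; the choice only matters once the source tables overlap.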

Setting multiple rows in postgres based on the set values of previous postgres rows

I'm running Postgres 9.4.
I'm essentially updating an existing unorganized structure to a folder-based organization. I'm auto-assigning an order number to each item so users can reorder them, and I'm doing the initial setting of all of these values with a one-time UPDATE statement. However, it seems like SET evaluates my subquery's FROM clause once and does not re-evaluate it for each successive row that it sets.
Here's my query example:
UPDATE folder_items
SET order_number =
(SELECT COALESCE(MAX(folder_items_2.order_number), 0) + 1
FROM folder_items AS folder_items_2
WHERE folder_items.parent_folder_id = folder_items_2.parent_folder_id
AND folder_items.folder_set_id = folder_items_2.folder_set_id
AND folder_items.id != folder_items_2.id);
With my initial table:
| folder_id | folder_set_id | order_number
row 1 | 1 | 1 | null
row 2 | 2 | 1 | null
row 3 | 3 | 2 | null
row 4 | 4 | 2 | null
row 5 | 5 | 2 | null
row 6 | 6 | 3 | null
when I run my query I get something like
| folder_id | folder_set_id | order_number
row 1 | 1 | 1 | 1
row 2 | 2 | 1 | 1
row 3 | 3 | 2 | 1
row 4 | 4 | 2 | 1
row 5 | 5 | 2 | 1
row 6 | 6 | 3 | 1
However, I want results that look like this:
| folder_id | folder_set_id | order_number
row 1 | 1 | 1 | 1
row 2 | 2 | 1 | 2
row 3 | 3 | 2 | 1
row 4 | 4 | 2 | 2
row 5 | 5 | 2 | 3
row 6 | 6 | 3 | 1
Is there a way to get these desired results? Is the best way to do some sort of window function that counts how many in the same folder_set_id are underneath each row?
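Every subquery in a single UPDATE sees the table as it was when the statement started, so each row finds only NULL order numbers and gets 1. Use ROW_NUMBER to calculate order_number in a CTE, then update the table from it: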
with new_order as (
SELECT "folder_id",
row_number() over ( partition by "folder_set_id"
order by "folder_id") as rn
FROM Table1
)
UPDATE Table1 AS t
SET "order_number" = n.rn
FROM new_order AS n
WHERE t."folder_id" = n."folder_id";
OUTPUT
| row_id | folder_id | folder_set_id | order_number |
|--------|-----------|---------------|--------------|
| row 1 | 1 | 1 | 1 |
| row 2 | 2 | 1 | 2 |
| row 3 | 3 | 2 | 1 |
| row 4 | 4 | 2 | 2 |
| row 5 | 5 | 2 | 3 |
| row 6 | 6 | 3 | 1 |
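
For anyone who wants to try the numbering logic without a Postgres instance, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. It assumes the three columns shown in the sample table and uses a correlated subquery instead of UPDATE ... FROM so that it also runs on older SQLite versions; the window function itself is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE folder_items ("
    "folder_id INTEGER PRIMARY KEY, folder_set_id INTEGER, order_number INTEGER)"
)
conn.executemany(
    "INSERT INTO folder_items (folder_id, folder_set_id) VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, 2), (5, 2), (6, 3)],
)

# Number the rows inside each folder_set_id, then copy the number back by folder_id.
conn.execute("""
    UPDATE folder_items
    SET order_number = (
        SELECT n.rn
        FROM (
            SELECT folder_id,
                   row_number() OVER (PARTITION BY folder_set_id
                                      ORDER BY folder_id) AS rn
            FROM folder_items
        ) AS n
        WHERE n.folder_id = folder_items.folder_id
    )
""")

result = conn.execute(
    "SELECT folder_id, folder_set_id, order_number FROM folder_items ORDER BY folder_id"
).fetchall()
for row in result:
    print(row)
```

The numbering does not depend on the current order_number values, so it does not matter that the statement cannot see its own updates.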

Multi-table recursive sql statement

I have been struggling to optimize a recursive call done purely in Ruby. I have moved the data into a PostgreSQL database, and I would like to make use of the WITH RECURSIVE queries that PostgreSQL offers.
The examples that I could find all seem to use a single table, such as a menu or a categories table.
My situation is slightly different. I have a questions and an answers table.
+----------------------+ +------------------+
| questions | | answers |
+----------------------+ +------------------+
| id | | source_id | <- from question ID
| start_node (boolean) | | target_id | <- to question ID
| end_node (boolean) | +------------------+
+----------------------+
I would like to fetch all questions that are connected together by the related answers.
I would also like to be able to go the other way in the tree, e.g from any given node to the root node in the tree.
To give another example of a question-answer tree in a graphical way:
Q1
|-- A1
| '-- Q2
| |-- A2
| | '-- Q3
| '-- A3
| '-- Q4
'-- A4
'-- Q5
As you can see, a question can have multiple outgoing answers, and it can also have multiple incoming answers -- many-to-many.
I hope that someone has a good idea, or can point me to some examples, articles or guides.
Thanks in advance, everybody.
Regards,
Emil
This is far from ideal, but I would experiment with a recursive query over joins, like this:
WITH RECURSIVE questions_with_answers AS (
SELECT
q.*, a.*
FROM
questions q
LEFT OUTER JOIN
answers a ON (q.id = a.source_id)
UNION ALL
SELECT
q.*, a.*
FROM
questions_with_answers qa
JOIN
questions q ON (qa.target_id = q.id)
LEFT OUTER JOIN
answers a ON (q.id = a.source_id)
)
SELECT * FROM questions_with_answers WHERE source_id IS NOT NULL AND target_id IS NOT NULL;
This gives me the following result:
id | name | start_node | end_node | source_id | target_id
----+------+------------+----------+-----------+-----------
1 | Q1 | | | 1 | 2
2 | A1 | | | 2 | 3
3 | Q2 | | | 3 | 4
3 | Q2 | | | 3 | 6
4 | A2 | | | 4 | 5
6 | A3 | | | 6 | 7
1 | Q1 | | | 1 | 8
8 | A4 | | | 8 | 9
2 | A1 | | | 2 | 3
3 | Q2 | | | 3 | 6
3 | Q2 | | | 3 | 4
4 | A2 | | | 4 | 5
6 | A3 | | | 6 | 7
8 | A4 | | | 8 | 9
3 | Q2 | | | 3 | 6
3 | Q2 | | | 3 | 4
6 | A3 | | | 6 | 7
4 | A2 | | | 4 | 5
6 | A3 | | | 6 | 7
4 | A2 | | | 4 | 5
(20 rows)
In fact you do not need two tables.
I would like to encourage you to analyse this example.
Maintaining one table instead of two will save you a lot of trouble, especially when it comes to recursive queries.
This minimal structure contains all the necessary information:
create table the_table (id int primary key, parent_id int);
insert into the_table values
(1, 0), -- root question
(2, 1),
(3, 1),
(4, 2),
(5, 2),
(6, 1),
(7, 3),
(8, 0), -- root question
(9, 8);
Whether the node is a question or an answer depends on its position in the tree. Of course, you can add a column with information about the type of node to the table.
Use this query to answer both of your requests (uncomment the appropriate WHERE condition):
with recursive cte(id, parent_id, depth, type, root) as (
select id, parent_id, 1, 'Q', id
from the_table
where parent_id = 0
-- and id = 1 <-- looking for list of a&q for root question #1
union all
select
t.id, t.parent_id, depth+ 1,
case when (depth & 1)::boolean then 'A' else 'Q' end, c.root
from cte c
join the_table t on t.parent_id = c.id
)
select *
from cte
-- where id = 9 <-- looking for root question for answer #9
order by id;
id | parent_id | depth | type | root
----+-----------+-------+------+------
1 | 0 | 1 | Q | 1
2 | 1 | 2 | A | 1
3 | 1 | 2 | A | 1
4 | 2 | 3 | Q | 1
5 | 2 | 3 | Q | 1
6 | 1 | 2 | A | 1
7 | 3 | 3 | Q | 1
8 | 0 | 1 | Q | 8
9 | 8 | 2 | A | 8
(9 rows)
The relationship child - parent is unambiguous and applies to both sides. There is no need to store this information twice. In other words, if we store information about parents, the information about children is redundant (and vice versa). It is one of the fundamental properties of the data structure called tree. See the examples:
-- find parent of node #6
select parent_id
from the_table
where id = 6;
-- find children of node #6
select id
from the_table
where parent_id = 6;
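
Walking the other way, from any node up to its root, can be sketched with a second recursive CTE over the same single-table structure. The example below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL; the function name path_to_root and the steps column are hypothetical names chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE the_table (id INTEGER PRIMARY KEY, parent_id INTEGER);
    INSERT INTO the_table VALUES
        (1, 0), (2, 1), (3, 1), (4, 2), (5, 2), (6, 1), (7, 3), (8, 0), (9, 8);
""")


def path_to_root(conn, node_id):
    """Return the ids from node_id up to its root, nearest first."""
    rows = conn.execute(
        """
        WITH RECURSIVE up(id, parent_id, steps) AS (
            -- Start at the requested node.
            SELECT id, parent_id, 0 FROM the_table WHERE id = ?
            UNION ALL
            -- Follow parent_id one level up per iteration.
            SELECT t.id, t.parent_id, up.steps + 1
            FROM the_table t
            JOIN up ON t.id = up.parent_id
        )
        SELECT id FROM up ORDER BY steps
        """,
        (node_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(path_to_root(conn, 7))
print(path_to_root(conn, 9))
```

The recursion stops on its own at a root because no row has id 0, which is the parent_id used for root questions in the sample data.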

How to set sequence number of sub-elements in TSQL using same element as parent?

I need to set a sequence number in T-SQL where the first column holds a repeating sequence marker and another column is used for ordering.
It is hard to explain, so I'll try with an example.
This is what I need:
|------------|-------------|----------------|
| Group Col | Order Col | Desired Result |
|------------|-------------|----------------|
| D | 1 | NULL |
| A | 2 | 1 |
| C | 3 | 1 |
| E | 4 | 1 |
| A | 5 | 2 |
| B | 6 | 2 |
| C | 7 | 2 |
| A | 8 | 3 |
| F | 9 | 3 |
| T | 10 | 3 |
| A | 11 | 4 |
| Y | 12 | 4 |
|------------|-------------|----------------|
So my marker is A: each time I meet an A, I must start a new group in my result. All rows before the first A must be set to NULL.
I know that I can achieve this with a loop, but that would be slow, and I need to update a lot of rows (sometimes several thousand).
Is there a way to achieve this without a loop?
You can use the window version of COUNT to get the desired result:
SELECT [Group Col], [Order Col],
COUNT(CASE WHEN [Group Col] = 'A' THEN 1 END)
OVER
(ORDER BY [Order Col]) AS [Desired Result]
FROM mytable
If you need all rows before the first A set to NULL, use SUM instead of COUNT: COUNT returns 0 for those rows, while SUM over only NULL inputs returns NULL.
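
To see the COUNT and SUM variants side by side, here is a small sketch that runs both window expressions over the sample data. It uses Python's built-in sqlite3 module rather than SQL Server, and the column names group_col and order_col are assumptions standing in for [Group Col] and [Order Col].

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (group_col TEXT, order_col INTEGER)")

# Sample data from the question: markers in order, numbered from 1.
markers = "DACEABCAFTAY"
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(marker, position) for position, marker in enumerate(markers, start=1)],
)

rows = conn.execute("""
    SELECT group_col,
           COUNT(CASE WHEN group_col = 'A' THEN 1 END)
               OVER (ORDER BY order_col) AS with_count,
           SUM(CASE WHEN group_col = 'A' THEN 1 END)
               OVER (ORDER BY order_col) AS with_sum
    FROM mytable
    ORDER BY order_col
""").fetchall()

for row in rows:
    print(row)
```

The two columns differ only for rows before the first A, where with_count is 0 and with_sum is NULL (None in Python).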