PostgreSQL JSON: keys as values

I just discovered the JSON capabilities of PostgreSQL but have trouble understanding how to generate JSON with queries. I hope the question makes sense; please excuse me if I am missing something obvious.
My problem: how do I generate JSON in which the values of one column become the keys for the values of another?
Here is an example:
drop table if exists my_table;
create table my_table(id int, sale_year int, sale_qty int);
insert into my_table values (10, 2007, 2);
insert into my_table values (10, 2008, 1);
insert into my_table values (10, 2009, 0);
insert into my_table values (20, 2009, 2);
insert into my_table values (30, 2011, 1);
insert into my_table values (30, 2012, 3);
The following statement
SELECT id, json_agg(to_json(my_table)) FROM public.my_table group by id;
gives me one JSON array per id (e.g. for id = 20):
20, [{"id":20, "sale_year": 2009, "sale_qty": 2}]
My question is: is it possible to return JSON with the following structure instead?
{"2009": 2}

I think you want something like this:
select id, json_agg(json_build_object(sale_year, sale_qty))
from my_table
group by id
order by id;
This returns:
id | json_agg
---+-------------------------------------------
10 | [{"2007" : 2}, {"2008" : 1}, {"2009" : 0}]
20 | [{"2009" : 2}]
30 | [{"2011" : 1}, {"2012" : 3}]

I hope this helps someone else.
In some cases you want a single JSON object per id rather than an array of one-key objects.
Inspired by this post, here is an example of how to do it:
with tx1
as
(
select
*
from
(values
(10, 2007, 2),
(10, 2008, 1),
(10, 2009, 0),
(20, 2009, 2),
(30, 2011, 1),
(30, 2012, 3))
as t (id, sale_year, sale_qty)),
tx2
as
(select id,
jsonb_agg(json_build_object(sale_year, sale_qty)) as x_data
from tx1
group by id
order by id)
SELECT
id,
x_data,
jo.obj
FROM tx2
CROSS JOIN
LATERAL
(
SELECT JSON_OBJECT_AGG(jt.key, jt.value) obj
FROM JSONB_ARRAY_ELEMENTS(x_data) je
CROSS JOIN
LATERAL JSONB_EACH(je.value) jt
) jo
This gives
{ "2007" : 2, "2008" : 1, "2009" : 0 }
{ "2011" : 1, "2012" : 3 }
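If the goal is just one object per id, the intermediate array may not be needed. A minimal sketch, assuming the my_table definition from the question and PostgreSQL 9.4 or later (where json_object_agg is available), aggregates the key/value pairs directly:

```sql
-- Build one JSON object per id, using sale_year as the key and sale_qty as the value.
-- Keys are coerced to text, so the integer years become "2007", "2008", ...
-- The ORDER BY inside the aggregate only fixes the order in which pairs are added.
SELECT id,
       json_object_agg(sale_year, sale_qty ORDER BY sale_year) AS obj
FROM my_table
GROUP BY id
ORDER BY id;
```

For id = 20 this should produce an object with the single pair "2009": 2, which is the structure asked for in the question. If a jsonb result is preferred, jsonb_object_agg (PostgreSQL 9.5 or later) takes the same arguments.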

Related

Grouping user id columns together with string_agg on PostgreSQL 13

This is my emails table
create table emails (
id bigint not null primary key generated by default as identity,
name text not null
);
And contacts table:
create table contacts (
id bigint not null primary key generated by default as identity,
email_id bigint not null,
user_id bigint not null,
full_name text not null,
ordering int not null
);
As you can see, I have a user_id field here. The same user ID can appear multiple times in my result, so I want to join the IDs with a comma (,).
Insert some data into the tables:
insert into emails (name)
values
('dennis1'),
('dennis2');
insert into contacts (id, email_id, user_id, full_name, ordering)
values
(5, 1, 1, 'dennis1', 9),
(6, 2, 1, 'dennis1', 5),
(7, 2, 1, 'dennis1', 1),
(8, 1, 3, 'john', 2),
(9, 2, 4, 'dennis7', 1),
(10, 2, 4, 'dennis7', 1);
My query is:
select em.name,
c.user_ids
from emails em
join (
select email_id, string_agg(user_id::text, ',' order by ordering desc) as user_ids
from contacts
group by email_id
) c on c.email_id = em.id
order by em.name;
Actual Result
name user_ids
dennis1 1,3
dennis2 1,1,4,4
Expected Result
name user_ids
dennis1 1,3
dennis2 1,4
On my real-world data, the same user ID shows up around 50 times when it should appear only once. In the example above, users 1 and 4 each appear twice for dennis2.
How can I make them unique?
Demo: https://dbfiddle.uk/?rdbms=postgres_13&fiddle=2e957b52eb46742f3ddea27ec36effb1
P.S.: I tried adding user_id to the GROUP BY, but then I get duplicate rows.
demo:db<>fiddle
SELECT
name,
string_agg(user_id::text, ',' order by ordering desc)
FROM (
SELECT DISTINCT ON (em.id, c.user_id)
*
FROM emails em
JOIN contacts c ON c.email_id = em.id
) s
GROUP BY name
1. Join the tables.
2. Apply DISTINCT ON to the email and the user_id, so that each email keeps only one row per user.
3. Aggregate.
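One caveat: DISTINCT ON without an ORDER BY keeps an arbitrary row per (email, user) pair, so the ordering value that survives is not deterministic. A possible variant, sketched under the assumption that each user should be placed by their highest ordering value (max_ordering is a name introduced here for illustration), collapses the duplicates with GROUP BY first:

```sql
-- Step 1 (subquery): one row per (email_id, user_id), keeping the highest ordering.
-- Step 2 (outer query): concatenate the now-unique user ids per email.
SELECT em.name,
       string_agg(c.user_id::text, ',' ORDER BY c.max_ordering DESC, c.user_id) AS user_ids
FROM emails em
JOIN (
    SELECT email_id, user_id, max(ordering) AS max_ordering
    FROM contacts
    GROUP BY email_id, user_id
) c ON c.email_id = em.id
GROUP BY em.id, em.name
ORDER BY em.name;
```

With the sample data from the question this gives 1,3 for dennis1 and 1,4 for dennis2. If the order of the ids does not matter at all, string_agg(DISTINCT user_id::text, ',') in the original subquery is a shorter option.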

Performance Issue with finding recent date of each group and joining to all records

I have following tables:
CREATE TABLE person (
id INTEGER NOT NULL,
name TEXT,
CONSTRAINT person_pkey PRIMARY KEY(id)
);
INSERT INTO person ("id", "name")
VALUES
(1, E'Person1'),
(2, E'Person2'),
(3, E'Person3'),
(4, E'Person4'),
(5, E'Person5'),
(6, E'Person6');
CREATE TABLE person_book (
id INTEGER NOT NULL,
person_id INTEGER,
book_id INTEGER,
receive_date DATE,
expire_date DATE,
CONSTRAINT person_book_pkey PRIMARY KEY(id)
);
/* Data for the 'person_book' table (Records 1 - 9) */
INSERT INTO person_book ("id", "person_id", "book_id", "receive_date", "expire_date")
VALUES
(1, 1, 1, E'2016-01-18', NULL),
(2, 1, 2, E'2016-02-18', E'2016-10-18'),
(3, 1, 4, E'2016-03-18', E'2016-12-18'),
(4, 2, 3, E'2017-02-18', NULL),
(5, 3, 5, E'2015-02-18', E'2016-02-23'),
(6, 4, 34, E'2016-12-18', E'2018-02-18'),
(7, 5, 56, E'2016-12-28', NULL),
(8, 5, 34, E'2018-01-19', E'2018-10-09'),
(9, 5, 57, E'2018-06-09', E'2018-10-09');
CREATE TABLE book (
id INTEGER NOT NULL,
type TEXT,
CONSTRAINT book_pkey PRIMARY KEY(id)
) ;
/* Data for the 'book' table (Records 1 - 8) */
INSERT INTO book ("id", "type")
VALUES
( 1, E'Btype1'),
( 2, E'Btype2'),
( 3, E'Btype3'),
( 4, E'Btype4'),
( 5, E'Btype5'),
(34, E'Btype34'),
(56, E'Btype56'),
(67, E'Btype67');
My query should list the names of all persons. For each person who has received books of certain types (book_id IN (2, 4, 34, 56, 67)), it should display the book type and expire date of the most recently received one; if a person has not received any such book, book type and expire date should be blank.
My query looks like this:
SELECT p.name,
pb.expire_date,
b.type
FROM
(SELECT p.id AS person_id, MAX(pb.receive_date) recent_date
FROM
Person p
JOIN person_book pb ON pb.person_id = p.id
WHERE pb.book_id IN (2, 4, 34, 56, 67)
GROUP BY p.id
)tmp
JOIN person_book pb ON pb.person_id = tmp.person_id
AND tmp.recent_date = pb.receive_date AND pb.book_id IN
(2, 4, 34, 56, 67)
JOIN book b ON b.id = pb.book_id
RIGHT JOIN Person p ON p.id = pb.person_id
The (correct) result:
name | expire_date | type
---------+-------------+---------
Person1 | 2016-12-18 | Btype4
Person2 | |
Person3 | |
Person4 | 2018-02-18 | Btype34
Person5 | 2018-10-09 | Btype34
Person6 | |
The query works fine, but since I'm right joining a small table with a huge one, it's slow. Is there a more efficient way to write this query?
My local PostgreSQL version is 9.3.18, but the query should work on version 8.4 as well, since that's our production version.
Problems with your setup
My local PostgreSQL version is 9.3.18; but the query should work on version 8.4 as well since that's our production version.
That makes two major problems before even looking at the query:
1. Postgres 8.4 is just too old, especially for "production". It reached EOL in July 2014: no more security updates, hopelessly outdated. Urgently consider upgrading to a current version.
2. It's a loaded footgun to use very different versions for development and production. It leads to confusion and errors that go undetected. We have seen more than one desperate request here on SO stemming from this folly.
Better query
This equivalent should be substantially simpler and faster (works in pg 8.4, too):
SELECT p.name, pb.expire_date, b.type
FROM (
SELECT DISTINCT ON (person_id)
person_id, book_id, expire_date
FROM person_book
WHERE book_id IN (2, 4, 34, 56, 67)
ORDER BY person_id, receive_date DESC NULLS LAST
) pb
JOIN book b ON b.id = pb.book_id
RIGHT JOIN person p ON p.id = pb.person_id;
To optimize read performance, this partial multicolumn index with matching sort order would be perfect:
CREATE INDEX ON person_book (person_id, receive_date DESC NULLS LAST)
WHERE book_id IN (2, 4, 34, 56, 67);
In modern Postgres versions (9.2 or later) you might append book_id, expire_date to the index columns to get index-only scans. See:
How does PostgreSQL perform ORDER BY if a b-tree index is built on that field?
About DISTINCT ON:
Select first row in each GROUP BY group?
About DESC NULLS LAST:
PostgreSQL sort by datetime asc, null first?
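To make the index-only-scan suggestion concrete, here is a sketch of what the extended index could look like on Postgres 9.2 or later. The index name person_book_recent_idx is hypothetical, and whether the planner actually chooses an index-only scan depends on table statistics and the visibility map, so treat it as a starting point to test with EXPLAIN:

```sql
-- Same partial index as above, with book_id and expire_date appended so that
-- every column the DISTINCT ON subquery reads is available from the index itself.
CREATE INDEX person_book_recent_idx
    ON person_book (person_id, receive_date DESC NULLS LAST, book_id, expire_date)
 WHERE book_id IN (2, 4, 34, 56, 67);
```

The leading columns keep the sort order that the DISTINCT ON query relies on; the appended columns only serve as payload.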

Create a GROUP BY query to show the latest row

So my tables are:
user_msgs: http://sqlfiddle.com/#!9/7d6a9
token_msgs: http://sqlfiddle.com/#!9/3ac0f
There are only the 4 users listed. When a user sends a message to another user, the application checks whether a conversation between those 2 users already exists by looking up from_id and to_id in the token_msgs table; if no token exists, it creates one and uses it in the user_msgs table. So the token identifies a conversation in both tables.
Now I want to list the users with whom user 1 has a conversation. So if from_id or to_id is 1, that conversation should be listed.
There are multiple rows per conversation in the user_msgs table for the same pair of users.
I think I need to use group_concat, but I'm not sure. I am trying to build a query that does this and shows the latest message of each conversation on top, hence ORDER BY time DESC:
SELECT * FROM (SELECT * FROM user_msgs ORDER BY time DESC) as temp_messages GROUP BY token
Please help in building the query.
Thanks.
CREATE TABLE `token_msgs` (
`id` int(11) NOT NULL,
`from_id` int(100) NOT NULL,
`to_id` int(100) NOT NULL,
`token` varchar(50) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
--
-- Dumping data for table `token_msgs`
--
INSERT INTO `token_msgs` (`id`, `from_id`, `to_id`, `token`) VALUES
(1, 1, 2, '1omcda84om2'),
(2, 1, 3, '1omd0666om3'),
(3, 4, 1, '4om6713bom1'),
(4, 3, 4, '3om0e1abom4');
---
CREATE TABLE `user_msgs` (
`id` int(11) NOT NULL,
`token` varchar(50) NOT NULL,
`from_id` int(50) NOT NULL,
`to_id` int(50) NOT NULL,
`message` text NOT NULL,
`time` datetime NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
--
-- Dumping data for table `user_msgs`
--
INSERT INTO `user_msgs` (`id`, `token`, `from_id`, `to_id`, `message`, `time`) VALUES
(1, '1omcda84om2', 1, 2, '1 => 2\r\nCan I have your picture so I can show Santa what I want for Christmas?', '2016-08-14 22:50:34'),
(2, '1omcda84om2', 2, 1, 'Makeup tip: You\'re not in the circus.\r\n2=>1', '2016-08-14 22:51:26'),
(3, '1omd0666om3', 1, 3, 'Behind every fat woman there is a beautiful woman. No seriously, your in the way. 1=>3', '2016-08-14 22:52:08'),
(4, '1omd0666om3', 3, 1, 'Me: Siri, why am I alone? Siri: *opens front facing camera*', '2016-08-14 22:53:24'),
(5, '1omcda84om2', 1, 2, 'I know milk does a body good, but damn girl, how much have you been drinking? 1 => 2', '2016-08-14 22:54:36'),
(6, '4om6713bom1', 4, 1, 'Hi, Im interested in your profile. Please send your contact number and I will call you.', '2016-08-15 00:18:11'),
(7, '3om0e1abom4', 3, 4, 'Girl you\'re like a car accident, cause I just can\'t look away. 3=>4', '2016-08-15 00:42:57'),
(8, '3om0e1abom4', 3, 4, 'Hola!! \r\n3=>4', '2016-08-15 00:43:34'),
(9, '1omd0666om3', 3, 1, 'Sometext from 3=>1', '2016-08-15 13:53:54'),
(10, '3om0e1abom4', 3, 4, 'More from 3->4', '2016-08-15 13:54:46');
Let's try this (on fiddle):
SELECT *
FROM (SELECT * FROM user_msgs
WHERE from_id = 1 OR to_id = 1
ORDER BY id DESC
) main
GROUP BY from_id + to_id
ORDER BY id DESC
Note the GROUP BY from_id + to_id: because the WHERE clause fixes one participant as user 1, the sum is unique for each conversation partner, so messages from 1 to 3 and from 3 to 1 fall into the same group. There is no need for the extra token table, which only makes things harder to maintain.
UPDATE:
Because GROUP BY sometimes behaves unpredictably in MySQL when non-aggregated columns are selected, here is another approach to this problem:
SELECT
a.*
FROM user_msgs a
LEFT JOIN user_msgs b
ON ((b.`from_id` = a.`from_id` AND b.`to_id` = a.`to_id`)
OR (b.`from_id` = a.`to_id` AND b.`to_id` = a.`from_id`))
AND a.`id` < b.`id`
WHERE (a.from_id = 1 OR a.to_id = 1)
AND b.`id` IS NULL
ORDER BY a.id DESC
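Another way to treat "1 to 3" and "3 to 1" as the same conversation, without relying on the sum trick, is to normalise the pair with LEAST and GREATEST and pick the highest id per pair. This is only a sketch; it assumes, like the queries above, that a higher id means a later message, and user_a, user_b, and last_id are names introduced here for illustration:

```sql
-- Inner query: one row per conversation involving user 1, with the id of its last message.
-- Outer query: fetch the full message rows for those ids, newest first.
SELECT m.*
FROM user_msgs m
JOIN (
    SELECT LEAST(from_id, to_id)    AS user_a,
           GREATEST(from_id, to_id) AS user_b,
           MAX(id)                  AS last_id
    FROM user_msgs
    WHERE from_id = 1 OR to_id = 1
    GROUP BY LEAST(from_id, to_id), GREATEST(from_id, to_id)
) latest ON latest.last_id = m.id
ORDER BY m.id DESC;
```

With the sample data this returns the messages with ids 9, 6, and 5, one per conversation partner of user 1. Every selected column is either grouped or aggregated, so the result does not depend on how MySQL resolves non-aggregated columns.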

PostgreSQL Get holes in index column

I suppose it is not easy to query a table for data that doesn't exist, but maybe there is some trick to find the holes in an integer column (rowindex).
Here is a small table illustrating the situation:
DROP TABLE IF EXISTS examtable1;
CREATE TABLE examtable1
(rowindex integer primary key, mydate timestamp, num1 integer);
INSERT INTO examtable1 (rowindex, mydate, num1)
VALUES (1, '2015-03-09 07:12:45', 1),
(3, '2015-03-09 07:17:12', 4),
(5, '2015-03-09 07:22:43', 1),
(6, '2015-03-09 07:25:15', 3),
(7, '2015-03-09 07:41:46', 2),
(10, '2015-03-09 07:42:05', 1),
(11, '2015-03-09 07:45:16', 4),
(14, '2015-03-09 07:48:38', 5),
(15, '2015-03-09 08:15:44', 2);
SELECT rowindex FROM examtable1;
The query shown lists all the indexes in use.
But I would like to get (say) the first five missing indexes, so that I can use them to insert new data at the desired rowindex.
In this example the result would be 2, 4, 8, 9, 12, which are indexes not in use.
Is there any trick to build a query that returns the first n missing indexes?
In reality such a table may contain many rows, and the "holes" can be anywhere.
You can do this by generating a list of all numbers using generate_series() and then checking which numbers don't exist in your table.
This can either be done using an outer join:
select nr.i as missing_index
from (
select i
from generate_series(1, (select max(rowindex) from examtable1)) i
) nr
left join examtable1 t1 on nr.i = t1.rowindex
where t1.rowindex is null;
or a NOT EXISTS query:
select i
from generate_series(1, (select max(rowindex) from examtable1)) i
where not exists (select 1
from examtable1 t1
where t1.rowindex = i.i);
I have used a hardcoded lower bound for generate_series() so that you would also detect a missing rowindex that is smaller than the lowest number.
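Both queries return every missing number up to the current maximum. To get only the first n holes, as the question asks, an ORDER BY and LIMIT can be added. A sketch with n = 5, using the NOT EXISTS form and the examtable1 sample data:

```sql
-- List candidate numbers 1..max(rowindex), keep those not present in the table,
-- and return only the five smallest.
SELECT nr.i AS missing_index
FROM generate_series(1, (SELECT max(rowindex) FROM examtable1)) AS nr(i)
WHERE NOT EXISTS (SELECT 1
                  FROM examtable1 t1
                  WHERE t1.rowindex = nr.i)
ORDER BY nr.i
LIMIT 5;
```

On the sample data this returns 2, 4, 8, 9, 12. Note that if the table has fewer than n holes below its maximum, fewer than n rows come back; numbers above max(rowindex) are not considered here.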

Select all but sort by count in postgresql

I have a very large table myTable with a lot of columns, one of which is a geometry point; we'll call it mySortColumn. I need to sort my result by the number of rows that share the same mySortColumn value.
An example could be this:
myTable
id, mySortColumn
----------------
1, ASD12321F
2, ASD12321G
3, ASD12321F
4, ASD12321G
5, ASD12321H
6, ASD12321F
I have a query that does what I want; the problem is the time. It currently takes about 30 seconds and looks like this:
SELECT
id,
mySortColumn
FROM
myTable
JOIN (
SELECT
mySortColumn,
ST_Y(mySortColumn) AS lat,
ST_X(mySortColumn) AS lng,
COUNT(*)
FROM myTable
GROUP BY mySortColumn
HAVING COUNT(*) > 1
) AS myPosition ON (
ST_X(myTable.mySortColumn) = myPosition.lng
AND ST_Y(myTable.mySortColumn) = myPosition.lat
)
WHERE
<some filters>
ORDER BY COUNT DESC
The result must be this:
id, mySortColumn
----------------
1, ASD12321F
3, ASD12321F
6, ASD12321F
2, ASD12321G
4, ASD12321G
5, ASD12321H
I hope you can help me.
Here you are:
select * from myTable order by count(1) over (partition by mySortColumn) desc;
For more information about aggregates with the OVER () construct, have a look at:
http://www.postgresql.org/docs/9.4/static/tutorial-window.html
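The one-liner orders only by the count, so groups with equal counts and rows within a group can come back in any order. If the exact order from the question is needed, a sketch with explicit tie-breakers could look like this; same_count is a name introduced here, and ordering by mySortColumn assumes its type supports sorting:

```sql
-- Count rows per mySortColumn value with a window function, then sort by that
-- count (largest groups first), keep equal values together, and order by id inside a group.
SELECT id, mySortColumn
FROM (
    SELECT id,
           mySortColumn,
           count(*) OVER (PARTITION BY mySortColumn) AS same_count
    FROM myTable
) counted
ORDER BY same_count DESC, mySortColumn, id;
```

Any additional filters from the original query would go into the inner SELECT, so that the counts reflect only the filtered rows.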