Aggregate function for corresponding row - PostgreSQL

I have the following table with a combined primary key of id and ts to implement historization:
create table "author" (
"id" bigint not null,
"ts" timestamp not null default now(),
"login" text unique not null,
primary key ("id", "ts")
);
Now I am interested only in the latest login value. Therefore I group by id:
select "id", max("ts"), "login" from "author" group by "id";
But this throws an error: column "author.login" must appear in the GROUP BY clause or be used in an aggregate function.
id and max("ts") uniquely identify a row because the tuple (id, ts) is the primary key. I need the login that matches the row identified by id and max("ts").
I can write a sub-select to find the login:
select ao."id", max(ao."ts"),
(select ai.login from "author" ai
where ao."id" = ai."id" and max(ao."ts") = ai.ts)
from "author" ao
group by "id";
This works but it is quite noisy and not very clever, because it searches the whole table although searching the group would be sufficient.
Does an aggregate function exist which avoids the sub-select and gives me the remaining login, the one that belongs to id and max("ts")?

You have to identify the correct key to get the value you want from the table.
The correct key is:
select "id", max("ts") from "author" group by "id";
And using this to get the login you want:
select a1."id", a1."ts", a1."login"
from "author" a1
inner join (select "id", max("ts") as maxts from "author" group by "id") a2
on a1."id" = a2."id" and a1."ts" = a2.maxts;
Alternatively, using a window function (note there is no GROUP BY here; the window computes the per-id maximum without collapsing rows):
SELECT "id", "ts", "login"
FROM (
select "id", "ts", "login",
CASE WHEN "ts" = max("ts") OVER (PARTITION BY "id") THEN 1 ELSE 0 END as is_max
from "author"
) dt
WHERE is_max = 1;
There are a few other ways to skin this cat, but that's basically the gist.
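Since this is PostgreSQL specifically, DISTINCT ON is one more option that avoids both the sub-select and the window function; a minimal sketch against the "author" table above:

```sql
-- DISTINCT ON ("id") keeps exactly one row per id: the first one
-- in the ORDER BY, i.e. the row with the latest "ts".
select distinct on ("id") "id", "ts", "login"
from "author"
order by "id", "ts" desc;
```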

Related

How to select distinct rows from a table with pairs of opposite id values and get one row per pair?

I want to eliminate the opposite id pairs from a table with two id columns (aId, bId).
For example, for aId = 123abc and bId = 345def, the opposite side is aId = 345def and bId = 123abc. A row whose opposite side does not exist in the table should still be listed.
How do I get only one row from each such pair (which one does not matter)?
CREATE TABLE distinct_pair_of_id (
"aId" VARCHAR,
"bId" VARCHAR
);
INSERT INTO distinct_pair_of_id ("aId", "bId")
VALUES
('123abc', '345def'),
('345def', '123abc'),
('123abc', '678def'),
('678def', '123abc'),
('345def', '986def'),
('345def', '765def')
;
You could use a LEAST/GREATEST trick here:
SELECT DISTINCT LEAST("aId", "bId") AS "aId",
GREATEST("aId", "bId") AS "bId"
FROM distinct_pair_of_id;
This would also return records having only one value but no pair. If you instead want to return only records which do have a pair, we can aggregate:
SELECT LEAST("aId", "bId") AS "aId",
GREATEST("aId", "bId") AS "bId"
FROM distinct_pair_of_id
GROUP BY 1, 2
HAVING COUNT(*) > 1;

How to fetch from a table that has multiple foreign keys to another table?

I have a table called USER which has the columns "id" and "username",
and another table called BUGS which has the columns "createdBy" and "updatedBy"; these columns are foreign keys to the "id" column in USER.
So how can I join USER twice?
Something like:
SELECT
u1.user_name AS created_by_name,
u2.user_name AS updated_by_name,
b.id AS bug_id
FROM
bugs b
JOIN users u1 ON b.created_by = u1.id
JOIN users u2 ON b.updated_by = u2.id;
If the updated_by column can be NULL, then you want a LEFT JOIN for that.
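A sketch of that variant, using the same assumed table and column names as the query above:

```sql
-- LEFT JOIN keeps bugs whose updated_by is NULL;
-- updated_by_name comes back as NULL for those rows.
SELECT
u1.user_name AS created_by_name,
u2.user_name AS updated_by_name,
b.id AS bug_id
FROM bugs b
JOIN users u1 ON b.created_by = u1.id
LEFT JOIN users u2 ON b.updated_by = u2.id;
```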

Merging an array of JSON objects to one JSON column in Postgres

I have two tables, products and products_ext that can be reduced essentially
to this basic form:
CREATE TABLE "products" (
"product_id" TEXT PRIMARY KEY
);
CREATE TABLE "products_ext" (
"product_id" TEXT NOT NULL,
"key" TEXT NOT NULL,
"value" JSON,
PRIMARY KEY ("product_id", "key"),
FOREIGN KEY ("product_id") REFERENCES "products"("product_id")
ON DELETE CASCADE
ON UPDATE CASCADE
);
Let us assume this mock data:
INSERT INTO "products" ("product_id") VALUES
('test1'),
('test2'),
('test3');
INSERT INTO "products_ext" (product_id, "key", "value") VALUES
('test1', 'foo', '"Foo"'),
('test1', 'bar', '"Bar"'),
('test2', 'foo', '"Foo"');
I can use a query
SELECT
"P"."product_id",
ARRAY(
SELECT
json_build_object(
"E"."key",
"E"."value"
)
FROM "products_ext" AS "E"
WHERE "E"."product_id" = "P"."product_id"
)
FROM
"products" AS "P";
which yields
product_id | array
------------+-----------------------------------------------
test1 | {"{\"foo\" : \"Foo\"}","{\"bar\" : \"Bar\"}"}
test2 | {"{\"foo\" : \"Foo\"}"}
but I cannot make it yield a merged JSON object. Is there an easy way in Postgres 10
to merge an array of multiple JSON objects into one JSON object, so that the query yields the following?
product_id | json
------------+----------------------------------------
test1 | {\"foo\" : \"Foo\", \"bar\" : \"Bar\"}
test2 | {\"foo\" : \"Foo\"}
test3 | {}
The primary key pair "product_id" and "key" already makes sure that there are no
key collisions. There may be rows in products that do not have any data in products_ext, and in those cases an empty JSON object should be provided.
demo:db<>fiddle
Use json_object_agg():
SELECT
p.product_id AS product_id,
json_object_agg(e.key, e.value)
FROM
products AS p
JOIN
products_ext AS e ON p.product_id = e.product_id
GROUP BY p.product_id;
Edit, handling products with no products_ext rows:
demo:db<>fiddle
SELECT
p.product_id AS product_id,
COALESCE(
json_object_agg(e.key, e.value) FILTER (WHERE e.key IS NOT NULL),
'{}'
)
FROM
products AS p
LEFT JOIN
products_ext AS e ON p.product_id = e.product_id
GROUP BY p.product_id;

PostgreSQL - Optimize query with multiple subqueries

I have 2 tables, users and sessions. The tables look like this:
users - id (int), name (varchar)
sessions - id (int), user_id (int), ip (inet), cookie_identifier (varchar)
All columns have an index.
Now, I am trying to query all users that have a session with the same ip or cookie_identifier as a specific user.
Here is my query:
SELECT *
FROM "users"
WHERE "id" IN
(SELECT "user_id"
FROM "sessions"
WHERE "user_id" <> 1234
AND ("ip" IN
(SELECT "ip"
FROM "sessions"
WHERE "user_id" = 1234
GROUP BY "ip")
OR "cookie_identifier" IN
(SELECT "cookie_identifier"
FROM "sessions"
WHERE "user_id" = 1234
GROUP BY "cookie_identifier"))
GROUP BY "user_id")
The users table has ~200,000 rows, the sessions table has ~1.5 million rows. The query takes around 3-5 seconds.
Is it possible to optimize those results?
I would suggest, as a trial, removing all grouping:
SELECT
*
FROM users
WHERE id IN (
SELECT
user_id
FROM sessions
WHERE user_id <> 1234
AND (ip IN (
SELECT
ip
FROM sessions
WHERE user_id = 1234
)
OR cookie_identifier IN (
SELECT
cookie_identifier
FROM sessions
WHERE user_id = 1234
)
)
)
;
If that isn't helpful, try altering the above to use EXISTS instead of IN:
SELECT
*
FROM users u
WHERE EXISTS (
SELECT
NULL
FROM sessions s
WHERE s.user_id <> 1234
AND u.id = s.user_id
AND EXISTS (
SELECT
NULL
FROM sessions s2
WHERE s2.user_id = 1234
AND (s.ip = s2.ip
OR s.cookie_identifier = s2.cookie_identifier
)
)
)
;
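If neither form is fast enough, composite indexes matching these lookups may help more than the single-column indexes the question mentions; a sketch, assuming the table layout from the question (index names are made up):

```sql
-- One index per lookup pattern: the correlated probe filters on
-- user_id and then compares ip or cookie_identifier.
CREATE INDEX sessions_user_ip_idx ON sessions (user_id, ip);
CREATE INDEX sessions_user_cookie_idx ON sessions (user_id, cookie_identifier);
```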

Return Specific ID in Postgres Column

I am learning Postgres and have a basic question.
Let's say I have the following:
SELECT "id", count(*) AS "count"
FROM "events" GROUP BY "id" ORDER BY "id"
How would I retrieve a specific id from this? Like, ID 12345?
Since the filter is not on an aggregated value, wouldn't adding it in a WHERE clause be better (performance-wise)?
SELECT "id", count(*) AS "count"
FROM "events" WHERE id = 12345 GROUP BY "id"
The ORDER BY probably doesn't serve any purpose either.
Add a HAVING clause, like:
SELECT "id", count(*) AS "count"
FROM "events" GROUP BY "id" having id = 12345 ORDER BY "id"
See a sample here: http://sqlfiddle.com/#!15/09fc9/2