Fetch records with distinct value of one column while replacing another col's value when multiple records - db2

I have two tables that I need to join so that each rid appears once, replacing the code value when a rid has different codes across multiple rows. The example set below explains it better.
CREATE TABLE usr (rid INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(12) NOT NULL,
email VARCHAR(20) NOT NULL);
CREATE TABLE usr_loc
(rid INT NOT NULL,
code CHAR NOT NULL,
loc_id INT NOT NULL,
PRIMARY KEY (rid, code, loc_id));
INSERT INTO usr VALUES
(1,'John','john#product'),
(2,'Linda','linda#product'),
(3,'Greg','greg#product'),
(4,'Kate','kate#product'),
(5,'Johny','johny#product'),
(6,'Mary','mary#test');
INSERT INTO usr_loc VALUES
(1,'A',4532),
(1,'I',4538),
(1,'I',4545),
(2,'I',3123),
(3,'A',4512),
(3,'A',4527),
(4,'I',4567),
(4,'A',4565),
(5,'I',4512),
(6,'I',4567),
(6,'I',4569);
Required Result Set
+-----+-------+------+-----------------+
| rid | name  | Code | email           |
+-----+-------+------+-----------------+
| 1   | John  | B    | 'john#product'  |
| 2   | Linda | I    | 'linda#product' |
| 3   | Greg  | A    | 'greg#product'  |
| 4   | Kate  | B    | 'kate#product'  |
| 5   | Johny | I    | 'johny#product' |
| 6   | Mary  | I    | 'mary#test'     |
+-----+-------+------+-----------------+
I have tried some join queries and some count queries, but I can't find one that satisfies the whole scenario.
The query I came up with is
SELECT distinct(a.rid) as rid, a.name, a.email, 'B' as code
FROM usr a
JOIN usr_loc b ON a.rid=b.rid
WHERE a.rid IN (SELECT rid FROM usr_loc GROUP BY rid HAVING COUNT(*) > 1);

You need to group by the users and count how many occurrences you have in usr_loc. If more than a single one, then replace the code by B. See below:
select
rid,
name,
case when cnt > 1 then 'B' else min_code end as code,
email
from (
select u.rid, u.name, u.email, min(l.code) as min_code, count(*) as cnt
from usr u
join usr_loc l on l.rid = u.rid
group by u.rid, u.name, u.email
) x;
Seems to me that you are using MySQL, rather than IBM DB2. Is that so?
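The required result set in the question keeps A for rid 3 (two A rows) and I for rid 6 (two I rows), which suggests that B is wanted only when a user has more than one distinct code, not merely more than one row. The following is a minimal sketch of that variant. It runs against an in-memory SQLite database through Python's sqlite3 module purely so the result can be checked; the SQLite harness and the COUNT(DISTINCT ...) condition are assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE usr (rid INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT NOT NULL);
CREATE TABLE usr_loc (rid INTEGER NOT NULL, code TEXT NOT NULL, loc_id INTEGER NOT NULL,
                      PRIMARY KEY (rid, code, loc_id));
INSERT INTO usr VALUES
  (1,'John','john#product'), (2,'Linda','linda#product'), (3,'Greg','greg#product'),
  (4,'Kate','kate#product'), (5,'Johny','johny#product'), (6,'Mary','mary#test');
INSERT INTO usr_loc VALUES
  (1,'A',4532), (1,'I',4538), (1,'I',4545), (2,'I',3123), (3,'A',4512), (3,'A',4527),
  (4,'I',4567), (4,'A',4565), (5,'I',4512), (6,'I',4567), (6,'I',4569);
""")

# 'B' only when a user has more than one distinct code (assumed reading of the
# required result set); otherwise keep the single code the user has.
rows = conn.execute("""
SELECT u.rid, u.name,
       CASE WHEN COUNT(DISTINCT l.code) > 1 THEN 'B' ELSE MIN(l.code) END AS code,
       u.email
FROM usr u
JOIN usr_loc l ON l.rid = u.rid
GROUP BY u.rid, u.name, u.email
ORDER BY u.rid
""").fetchall()

for row in rows:
    print(row)
```

Because this is an inner join, users with no rows in usr_loc are left out of the result.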

Related

How can I ensure that a join table is referencing two tables with a composite FK, one of the two column being in common on both tables?

I have 3 tables: employee and event, which are in an N-N relationship, so the 3rd table is employee_event.
The trick is, they can only be linked within the same group.
employee
+----+-------+
| id | group |
+----+-------+
| 1  | A     |
| 2  | B     |
+----+-------+
event
+----+-------+
| id | group |
+----+-------+
| 43 | A     |
| 44 | B     |
+----+-------+
employee_event
+-------------+----------+
| employee_id | event_id |
+-------------+----------+
| 1           | 43       |
| 2           | 44       |
+-------------+----------+
So the combination employee_id=1 event_id=44 should not be possible, because employee from group A can not attend an event from group B. How can I secure my DB with this?
My first idea is to add the column employee_event.group so that I can make my two FK (composite) with employee_id + group and event_id + group respectively to the table employee and event. But is there a way to avoid adding a column in the join table for the only purpose of FKs?
Thx!
You may create a function and use it in a check constraint on table employee_event. Note that group is a reserved word, so the column name has to be double-quoted.
create or replace function groups_match (employee_id integer, event_id integer)
returns boolean language sql as
$$
select
(select "group" from employee where id = employee_id) =
(select "group" from event where id = event_id);
$$;
and then add a check constraint on table employee_event.
ALTER TABLE employee_event
ADD CONSTRAINT groups_match_check
CHECK (groups_match(employee_id, event_id));
Still bear in mind that rows in employee_event that used to be valid may become invalid but still remain intact if certain changes in tables employee and event occur.
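For comparison, the composite foreign key idea described in the question can be sketched as follows. Unlike a check constraint, foreign keys are also enforced when the referenced rows change later. This is only an illustration under stated assumptions: it runs on an in-memory SQLite database through Python's sqlite3 module, the column is named grp (a hypothetical name chosen to avoid quoting the reserved word group), and the helper try_link is made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, grp TEXT NOT NULL, UNIQUE (id, grp));
CREATE TABLE event    (id INTEGER PRIMARY KEY, grp TEXT NOT NULL, UNIQUE (id, grp));
CREATE TABLE employee_event (
    employee_id INTEGER NOT NULL,
    event_id    INTEGER NOT NULL,
    grp         TEXT    NOT NULL,  -- the extra column, shared by both foreign keys
    PRIMARY KEY (employee_id, event_id),
    FOREIGN KEY (employee_id, grp) REFERENCES employee (id, grp),
    FOREIGN KEY (event_id, grp)    REFERENCES event (id, grp)
);
INSERT INTO employee VALUES (1, 'A'), (2, 'B');
INSERT INTO event    VALUES (43, 'A'), (44, 'B');
""")

def try_link(employee_id, event_id, grp):
    """Hypothetical helper: return True if the link row is accepted."""
    try:
        conn.execute("INSERT INTO employee_event VALUES (?, ?, ?)",
                     (employee_id, event_id, grp))
        return True
    except sqlite3.IntegrityError:
        return False

same_group = try_link(1, 43, "A")  # employee and event are both in group A
cross_a = try_link(1, 44, "A")     # event 44 is not in group A
cross_b = try_link(1, 44, "B")     # employee 1 is not in group B
print(same_group, cross_a, cross_b)
```

Since both foreign keys share the single grp column, a row can only be inserted when the employee and the event agree on the group.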

PostgreSQL Group By not working as expected - wants too many inclusions

I have a simple PostgreSQL table that I'm trying to query. Imagine a table like this...
| ID | Account_ID | Iteration |
|----|------------|-----------|
| 1 | 100 | 1 |
| 2 | 101 | 1 |
| 3 | 100 | 2 |
I need to get the ID column for each Account_ID where Iteration is at its maximum value. So, you'd think something like this would work
SELECT "ID", "Account_ID", MAX("Iteration")
FROM "Table_Name"
GROUP BY "Account_ID"
And I expect to get:
| ID | Account_ID | MAX(Iteration) |
|----|------------|----------------|
| 2 | 101 | 1 |
| 3 | 100 | 2 |
But when I do this, Postgres complains:
ERROR: column "ID" must appear in the GROUP BY clause or be used in an aggregate function
When I do that, it just destroys the grouping altogether and gives me the whole table!
Is the best way to approach this using the following?
SELECT DISTINCT ON ("Account_ID") "ID", "Account_ID", "Iteration"
FROM "Marketing_Sparks"
ORDER BY "Account_ID" ASC, "Iteration" DESC;
The GROUP BY statement aggregates rows with the same values in the columns included in the group by into a single row. Because this row isn't the same as the original row, you can't have a column that is not in the group by or in an aggregate function. To get what you want, you will probably have to select without the ID column, then join the result to the original table. I don't know PostgreSQL syntax, but I assume it would be something like the following.
SELECT "Table_Name"."ID", aggregate."Account_ID", aggregate.MIteration
FROM
(SELECT "Account_ID", MAX("Iteration") AS MIteration
FROM "Table_Name"
GROUP BY "Account_ID") aggregate
LEFT JOIN "Table_Name" ON aggregate."Account_ID" = "Table_Name"."Account_ID" AND
aggregate.MIteration = "Table_Name"."Iteration"
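The aggregate-then-join-back approach described in this answer can be checked on the three sample rows from the question. This is a minimal sketch, assuming an in-memory SQLite database driven from Python's sqlite3 module instead of PostgreSQL; the aliases agg, t and max_iteration are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "Table_Name" ("ID" INTEGER PRIMARY KEY, "Account_ID" INTEGER, "Iteration" INTEGER);
INSERT INTO "Table_Name" VALUES (1, 100, 1), (2, 101, 1), (3, 100, 2);
""")

# Step 1 (subquery): the maximum Iteration per account.
# Step 2 (join): fetch the full row(s) carrying that maximum.
rows = conn.execute("""
SELECT t."ID", agg."Account_ID", agg.max_iteration
FROM (SELECT "Account_ID", MAX("Iteration") AS max_iteration
      FROM "Table_Name"
      GROUP BY "Account_ID") AS agg
JOIN "Table_Name" AS t
  ON t."Account_ID" = agg."Account_ID"
 AND t."Iteration" = agg.max_iteration
ORDER BY t."ID"
""").fetchall()

for row in rows:
    print(row)
```

If two rows of the same account share the maximum Iteration, this form returns both of them, whereas DISTINCT ON keeps exactly one row per account.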

Postgres - updates with join gives wrong results

I'm having a hard time understanding what I'm doing wrong.
The query below writes the same value to every row instead of the correct value for each row.
My DATA
I'm trying to update a table of stats over a set of businesses
business_stats ( id SERIAL,
pk integer not null,
b_total integer,
PRIMARY KEY(pk)
);
the details of each business are stored here
business_details (id SERIAL,
category CHARACTER VARYING,
feature_a CHARACTER VARYING,
feature_b CHARACTER VARYING,
feature_c CHARACTER VARYING
);
and here is a table that associates the pk with the category
datasets (id SERIAL,
pk integer not null,
category CHARACTER VARYING,
PRIMARY KEY(pk)
);
WHAT I DID (wrong)
UPDATE business_stats
SET b_total = agg.total
FROM business_stats b,
( SELECT d.pk, count(bd.id) total
FROM business_details AS bd
INNER JOIN datasets AS d
ON bd.category = d.category
GROUP BY d.pk
) agg
WHERE b.pk = agg.pk;
The result of this query is
| id | pk | b_total |
+----+----+-----------+
| 1 | 14 | 273611 |
| 2 | 15 | 273611 |
| 3 | 16 | 273611 |
| 4 | 17 | 273611 |
but if I run just the SELECT the results of each pk are completely different
| pk | agg.total |
+----+-------------+
| 14 | 273611 |
| 15 | 407802 |
| 16 | 179996 |
| 17 | 815580 |
THE QUESTION
why is this happening?
why is the WHERE clause not working?
Before writing this question I've used as reference these posts: a, b, c
Do the following (I always recommend against joins in updates):
UPDATE business_stats bs
SET b_total =
( SELECT count(bd.id) total
FROM business_details AS bd
INNER JOIN datasets AS d
ON bd.category = d.category
where d.pk=bs.pk
)
/*optional*/
where exists (SELECT *
FROM business_details AS bd
INNER JOIN datasets AS d
ON bd.category = d.category
where d.pk=bs.pk)
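To see the correlated-subquery form produce a different total per row, here is a small sketch on made-up data. It assumes an in-memory SQLite database driven from Python's sqlite3 module instead of PostgreSQL; the categories 'cafe' and 'bar' and the row counts are invented for the illustration, and the outer table is referenced by its full name rather than an alias.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE business_stats (id INTEGER, pk INTEGER NOT NULL PRIMARY KEY, b_total INTEGER);
CREATE TABLE business_details (id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE datasets (id INTEGER, pk INTEGER NOT NULL PRIMARY KEY, category TEXT);
INSERT INTO datasets VALUES (1, 14, 'cafe'), (2, 15, 'bar');
INSERT INTO business_details VALUES (1, 'cafe'), (2, 'cafe'), (3, 'bar'), (4, 'bar'), (5, 'bar');
INSERT INTO business_stats VALUES (1, 14, NULL), (2, 15, NULL);
""")

# The subquery is re-evaluated for each row being updated, because it
# refers to that row's pk through business_stats.pk.
conn.execute("""
UPDATE business_stats
SET b_total = (SELECT count(bd.id)
               FROM business_details AS bd
               INNER JOIN datasets AS d ON bd.category = d.category
               WHERE d.pk = business_stats.pk)
""")

rows = conn.execute("SELECT pk, b_total FROM business_stats ORDER BY pk").fetchall()
print(rows)
```

Without the optional WHERE EXISTS filter, a pk with no matching details gets a count of 0 rather than being left untouched.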
The issue is your FROM clause. The repeated reference to business_stats means you aren't restricting the join like you expect to. You're joining agg against the second unrelated mention of business_stats rather than the row you want to update.
Something like this is what you are after (warning not tested):
UPDATE business_stats AS b
SET b_total = agg.total
FROM
(...) agg
WHERE b.pk = agg.pk;

PostgreSQL Advanced Join

I am new to PostgreSQL.
I have two tables like this
Table A
id |value | type1 | type2 | type3
bigint |text | bigint | bigint | bigint
Table B
Id | description
bigint | text
Table A's type1, type2, and type3 hold ids from Table B, but there is no foreign key constraint.
I have to retrieve data like this
Select a.id,
a.value,
b.description1(as of a.type1),
b.description2(as of a.type2),
b.description3(as of a.type3)
If you have too many columns, you should consider changing your DB design
TableA
id | value
1 | <something>
2 | <something>
TableAType
id | TableA.id | type_id | typeValue
1 | 1 | type1 | bigint
2 | 1 | type2 | bigint
3 | 1 | type3 | bigint
.....
4 | 1 | typeN | bigint
TableB (type_description)
Id | description
bigint | text
Then your query becomes simpler and isn't affected when you add/remove types.
SELECT TB.Description, TT.TypeValue
FROM TableAType TT
JOIN TableB TB
ON TT.Type_id = TB.id
Then you can use a PIVOT to get the tabular data. Again, the advantage is that you can add or remove types and your query doesn't change; you only need to update the types table.
You should (LEFT) JOIN tableB 3 times, and use a different alias for each one.
select id, value,
type1, t1.description descri1,
type2, t2.description descri2,
type3, t3.description descri3
from tableA ta
left join tableB t1
on ta.type1 = t1.id
left join tableB t2
on ta.type2 = t2.id
left join tableB t3
on ta.type3 = t3.id;
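The triple LEFT JOIN can be tried on a couple of made-up rows. This sketch assumes an in-memory SQLite database driven from Python's sqlite3 module; the descriptions ('red', 'green', 'blue') and the unmatched id 99 are invented to show what happens when a type is NULL or has no match in tableB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableB (id INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE tableA (id INTEGER PRIMARY KEY, value TEXT,
                     type1 INTEGER, type2 INTEGER, type3 INTEGER);
INSERT INTO tableB VALUES (10, 'red'), (20, 'green'), (30, 'blue');
INSERT INTO tableA VALUES (1, 'first', 10, 20, 30), (2, 'second', 30, NULL, 99);
""")

# Each alias (t1, t2, t3) is an independent lookup into tableB,
# one per type column of tableA.
rows = conn.execute("""
SELECT ta.id, ta.value,
       t1.description AS descri1,
       t2.description AS descri2,
       t3.description AS descri3
FROM tableA ta
LEFT JOIN tableB t1 ON ta.type1 = t1.id
LEFT JOIN tableB t2 ON ta.type2 = t2.id
LEFT JOIN tableB t3 ON ta.type3 = t3.id
ORDER BY ta.id
""").fetchall()

for row in rows:
    print(row)
```

Using LEFT JOIN rather than an inner join keeps the second row even though two of its types have no description.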

Sort SELECT result by pairs of columns

In the following PostgreSQL 8.4.13 table
(where author users give grades to id users):
# \d pref_rep;
Table "public.pref_rep"
Column | Type | Modifiers
-----------+-----------------------------+-----------------------------------------------------------
id | character varying(32) | not null
author | character varying(32) | not null
good | boolean |
fair | boolean |
nice | boolean |
about | character varying(256) |
stamp | timestamp without time zone | default now()
author_ip | inet |
rep_id | integer | not null default nextval('pref_rep_rep_id_seq'::regclass)
Indexes:
"pref_rep_pkey" PRIMARY KEY, btree (id, author)
Check constraints:
"pref_rep_check" CHECK (id::text <> author::text)
Foreign-key constraints:
"pref_rep_author_fkey" FOREIGN KEY (author) REFERENCES pref_users(id) ON DELETE CASCADE
"pref_rep_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON DELETE CASCADE
How can I find faked entries, which have the same id and the same author_ip?
I.e. some users register several accounts and then submit bad notes (the good, fair, nice columns above) for other users. But I can still identify them by their author_ip addresses.
I'm trying to find them by fetching:
# select id, author_ip from pref_rep group by id, author_ip;
id | author_ip
-------------------------+-----------------
OK490496816466 | 94.230.231.106
OK360565502458 | 78.106.102.16
DE25213 | 178.216.72.185
OK331482634936 | 95.158.209.5
VK25785834 | 77.109.20.182
OK206383671767 | 80.179.90.103
OK505822972559 | 46.158.46.126
OK237791033602 | 178.76.216.77
VK90402803 | 109.68.173.37
MR16281819401420759860 | 109.252.139.198
MR5586967138985630915 | 2.93.14.248
OK341086615664 | 93.77.75.142
OK446200841566 | 95.59.127.194
But I need to sort the above result.
How can I sort it by the number of pairs (id, author_ip) desc please?
select id, pr.author_ip
from
pref_rep pr
inner join
(
select author_ip
from pref_rep
group by author_ip
having count(*) > 1
) s using(author_ip)
order by 2, 1
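The query above lists rows whose author_ip occurs more than once, ordered by address. To sort by the number of (id, author_ip) pairs as the question asks, one option is to count per pair and order by that count. This is a minimal sketch on made-up rows, assuming an in-memory SQLite database driven from Python's sqlite3 module; SQLite has no inet type, so author_ip is stored as text here, and the alias pairs is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE pref_rep (id TEXT NOT NULL, author TEXT NOT NULL, author_ip TEXT,
                       PRIMARY KEY (id, author));
INSERT INTO pref_rep VALUES
  ('U1', 'A1', '10.0.0.1'),
  ('U1', 'A2', '10.0.0.1'),
  ('U1', 'A3', '10.0.0.1'),
  ('U2', 'A1', '10.0.0.1'),
  ('U2', 'A4', '10.0.0.2'),
  ('U2', 'A5', '10.0.0.2');
""")

# One output row per (id, author_ip) pair, most frequent pairs first.
rows = conn.execute("""
SELECT id, author_ip, COUNT(*) AS pairs
FROM pref_rep
GROUP BY id, author_ip
ORDER BY pairs DESC, id, author_ip
""").fetchall()

for row in rows:
    print(row)
```

Adding HAVING COUNT(*) > 1 after the GROUP BY would keep only the pairs that occur more than once.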