Join two tables with count from first table - postgresql

I know there is probably an obvious answer to this question, but I'm a bit of a noob trying to remember how to write queries. I have the following table structure in PostgreSQL:
CREATE TABLE public.table1 (
accountid BIGINT NOT NULL,
rpt_start DATE NOT NULL,
rpt_end DATE NOT NULL,
CONSTRAINT table1_pkey PRIMARY KEY(accountid, rpt_start, rpt_end)
)
WITH (oids = false);
CREATE TABLE public.table2 (
customer_id BIGINT NOT NULL,
read VARCHAR(255),
CONSTRAINT table2_pkey PRIMARY KEY(customer_id)
)
WITH (oids = false);
The query should return each accountid, the number of rows in table1 for that accountid, and the read value from table2. The join is on table1.accountid = table2.customer_id.
The result set should appear as follows:
accountid | count | read
-----------+-------+------
1234 | 2 | 100
1235 | 9 | 110
1236 | 1 | 91
The count column reflects the number of rows in table1 for each accountid. The read column is the value from table2 associated with the same accountid.

select accountid, "count", read
from
(
select accountid, count(*) "count"
from table1
group by accountid
) t1
inner join
table2 t2 on t1.accountid = t2.customer_id
order by accountid

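-- Count a table1 column rather than *, so that customers with no rows in table1 get 0 instead of 1
SELECT table2.customer_id, COUNT(table1.accountid), table2.read
FROM table2
LEFT JOIN table1 ON (table2.customer_id = table1.accountid)
GROUP BY table2.customer_id, table2.read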
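
With a LEFT JOIN it matters what is counted: COUNT(*) also counts the NULL-extended row produced for a customer that has no rows in table1, while COUNT on a table1 column skips it. Below is a minimal sketch of that difference. It assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL (only so the example is self-contained), and the sample rows are invented for illustration.

```python
import sqlite3


def count_per_customer(count_expr):
    """Run the LEFT JOIN query with the given COUNT expression."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (accountid INTEGER NOT NULL,
                             rpt_start TEXT NOT NULL,
                             rpt_end TEXT NOT NULL,
                             PRIMARY KEY (accountid, rpt_start, rpt_end));
        CREATE TABLE table2 (customer_id INTEGER PRIMARY KEY, read TEXT);
        -- Hypothetical sample data: customer 1236 has no rows in table1.
        INSERT INTO table1 VALUES (1234, '2020-01-01', '2020-01-31'),
                                  (1234, '2020-02-01', '2020-02-29');
        INSERT INTO table2 VALUES (1234, '100'), (1236, '91');
    """)
    sql = f"""
        SELECT t2.customer_id, {count_expr} AS cnt, t2.read
        FROM table2 t2
        LEFT JOIN table1 t1 ON t2.customer_id = t1.accountid
        GROUP BY t2.customer_id, t2.read
        ORDER BY t2.customer_id
    """
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


# COUNT(*) reports 1 for customer 1236; COUNT(t1.accountid) reports 0.
star_counts = count_per_customer("COUNT(*)")
column_counts = count_per_customer("COUNT(t1.accountid)")
```

If customers without any table1 rows should not appear at all, an inner join avoids the question entirely.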

SELECT t2.customer_id, t2.read, COUNT(*) AS the_count
FROM table2 t2
JOIN table1 t1 ON t1.accountid = t2.customer_id
GROUP BY t2.customer_id, t2.read
;

Related

Find the mismatch columns from two tables with same structure

I have two tables with the same structure. I have to find the columns that do not match between the two tables, based on the id and year combination. Below is the table structure:
id and year together form the primary key in both tables.
============================================================
Create table and insert script for table1:
create table table1 (id int, year int, name varchar(50), stat varchar(50), PRIMARY KEY (id,year));
insert into table1 values (1,2021,'Aman','L');
insert into table1 values (2,2021,'Ankit','H');
insert into table1 values (3,2021,'Rahul','G');
insert into table1 values (4,2021,'Gagan','L');
============================================================
Create table and insert script for table2:
create table table2 (id int, year int, name varchar(50), stat varchar(50), PRIMARY KEY (id,year));
insert into table2 values (1,2020,'Aman','H');
insert into table2 values (2,2020,'Anuj','M');
insert into table2 values (3,2020,'Rahul','G');
insert into table2 values (4,2020,'Abhi','L');
============================================================
Expected Output:
For example, id = 1 and year = 2021 from table1, when compared with id = 1 and year = 2020 (table1 year - 1) from table2, should report that stat is different.
id = 2 and year = 2021 from table1, when compared with id = 2 and year = 2020 from table2, should report that name and stat are different.
In other words, each table1 row has to be compared with the table2 row for the same id whose year is the table1 year - 1.
Can anyone help me with a SQL or DB2 query or procedure to do that?
Sounds like you need a simple join:
select *
from table1 t1
join table2 t2
on t1.id = t2.id
and t1.year = t2.year + 1
where t1.stat <> t2.stat
or t1.name <> t2.name; -- keep this line only if name differences should be reported too
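
To see which columns differ for each matched pair, the comparisons can also be selected as flags. The following is a rough sketch run against the sample data from the question. It assumes Python's built-in sqlite3 module in place of DB2 (purely so it is self-contained), and the function name and the name_differs/stat_differs aliases are made up for illustration.

```python
import sqlite3


def find_mismatches():
    """Return {id: [names of columns that differ]} for table1 vs. table2 (year - 1)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (id INT, year INT, name VARCHAR(50), stat VARCHAR(50),
                             PRIMARY KEY (id, year));
        CREATE TABLE table2 (id INT, year INT, name VARCHAR(50), stat VARCHAR(50),
                             PRIMARY KEY (id, year));
        INSERT INTO table1 VALUES (1,2021,'Aman','L'), (2,2021,'Ankit','H'),
                                  (3,2021,'Rahul','G'), (4,2021,'Gagan','L');
        INSERT INTO table2 VALUES (1,2020,'Aman','H'), (2,2020,'Anuj','M'),
                                  (3,2020,'Rahul','G'), (4,2020,'Abhi','L');
    """)
    rows = conn.execute("""
        SELECT t1.id,
               CASE WHEN t1.name <> t2.name THEN 1 ELSE 0 END AS name_differs,
               CASE WHEN t1.stat <> t2.stat THEN 1 ELSE 0 END AS stat_differs
        FROM table1 t1
        JOIN table2 t2
          ON t1.id = t2.id
         AND t1.year = t2.year + 1
        WHERE t1.name <> t2.name
           OR t1.stat <> t2.stat
        ORDER BY t1.id
    """).fetchall()
    conn.close()

    mismatches = {}
    for row_id, name_differs, stat_differs in rows:
        columns = []
        if name_differs:
            columns.append("name")
        if stat_differs:
            columns.append("stat")
        mismatches[row_id] = columns
    return mismatches
```

Note that `<>` never evaluates to true when either side is NULL, so rows with NULL in name or stat would need separate handling.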

PostgreSQL count other values of ID that have the same value of other column

Let's say we have the following table that stores id of an observation and its address_id. You can create the table with the following code:
drop table if exists schema.pl_address_cnt;
create table schema.pl_address_cnt (
id serial,
address_id int);
insert into schema.pl_address_cnt(address_id) values
(100), (101), (100), (101), (100), (125), (128), (200), (200), (100);
My task is to count, for each id, how many other ids (hence the -1) have the same address_id. I've come up with a solution that turns out to be quite expensive (according to EXPLAIN) on the original dataset. I wonder whether my solution can be optimised somehow.
with tmp_table as (select address_id
, count(distinct id) as id_count
from schema.pl_address_cnt
group by address_id
)
select id
, id_count - 1
from schema.pl_address_cnt as pac
left join tmp_table as tt on tt.address_id=pac.address_id;
You can try dropping the CTE and instead doing a LEFT self-join on the same address_id but a different id, then aggregating:
SELECT pac1.id,
count(pac2.id)
FROM pl_address_cnt pac1
LEFT JOIN pl_address_cnt pac2
ON pac1.address_id = pac2.address_id
AND pac1.id <> pac2.id
GROUP BY pac1.id
ORDER BY pac1.id;
For performance you can try indexes on (address_id, id) and (id).
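
Another option worth comparing with EXPLAIN is a window function, which avoids the self-join altogether. Here is a minimal sketch, assuming id is unique (as the serial column suggests) and using Python's built-in sqlite3 module (SQLite 3.25 or later, for window function support) as a stand-in for PostgreSQL. The schema prefix is dropped and the other_ids alias is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pl_address_cnt (
        id INTEGER PRIMARY KEY AUTOINCREMENT,  -- stands in for serial
        address_id INT);
    INSERT INTO pl_address_cnt (address_id) VALUES
        (100), (101), (100), (101), (100), (125), (128), (200), (200), (100);
""")

# Window function: count the rows sharing the address, minus the row itself.
window_counts = conn.execute("""
    SELECT id,
           COUNT(*) OVER (PARTITION BY address_id) - 1 AS other_ids
    FROM pl_address_cnt
    ORDER BY id
""").fetchall()

# The self-join from the answer above, for comparison.
self_join_counts = conn.execute("""
    SELECT pac1.id, COUNT(pac2.id)
    FROM pl_address_cnt pac1
    LEFT JOIN pl_address_cnt pac2
           ON pac1.address_id = pac2.address_id
          AND pac1.id <> pac2.id
    GROUP BY pac1.id
    ORDER BY pac1.id
""").fetchall()
conn.close()
```

Both queries return the same counts on this sample; which one is cheaper on the real dataset depends on its size and indexes, so it is worth checking the plans for both.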

LEFT JOIN trouble with multiple tables

I have the following query
SELECT a.account_id, sum(p.amount) AS amount
FROM accounts a
LEFT JOIN users_accounts ua
JOIN users u
JOIN payments p on p.meta_id = u.user_id
ON u.user_id = ua.user_id
ON ua.account_id = a.account_id
WHERE p.date_prcsd BETWEEN '2017-08-01 00:00:00' AND '2017-08-31 23:59:59'
GROUP BY a.account_id
ORDER BY account_id ASC;
What I want is all the rows from accounts a, with zeroes where there is no amount data. Whatever join type or join structure I try, I get the same result set: only the accounts that have some payments in p.
Where am I going wrong?
Simplified:
SELECT a.account_id
,sum(coalesce(p2.amount, 0)) AS amount
FROM accounts a
LEFT JOIN users_accounts ua ON (a.account_id = ua.account_id)
LEFT JOIN users u ON (ua.user_id = u.user_id)
LEFT JOIN (
SELECT p.meta_id
,p.amount
FROM payments p
WHERE p.date BETWEEN '2017-08-01' AND '2017-08-10'
) AS p2 ON (u.user_id = p2.meta_id)
GROUP BY a.account_id
ORDER BY account_id ASC;
Result:
account_id | amount
------------+--------
1 | 4
2 | 0
3 | 0
(3 rows)
Explanation: you need to handle the NULL values that the outer joins return; coalesce() does that for you. The WHERE clause is the real problem in your query: it filters on a column of p, so every row where p is NULL (an account without payments) is removed from the end result. Putting the date filter inside the joined subquery keeps those rows. On top of that, you left out the LEFT JOIN for the other tables. I created a simplified test db:
$ cat tables.sql
drop table users_accounts;
drop table payments;
drop table users;
drop table accounts;
create table accounts (account_id serial primary key, name varchar not null);
create table users (user_id serial primary key, name varchar not null);
create table users_accounts(user_id int references users(user_id), account_id int references accounts(account_id));
create table payments(meta_id int references users(user_id), amount int not null, date date);
insert into accounts (account_id, name) values (1, 'Account A'), (2, 'Account B'), (3, 'Account C');
insert into users (user_id, name) values (1, 'Marc'), (2, 'Ruben'), (3, 'Isaak');
insert into users_accounts (user_id, account_id) values (1,1),(2,1);
insert into payments(meta_id, amount, date) values (1,1, '2017-08-01'), (1,2, '2017-08-11'),(1,3, '2017-08-03'),(2,1, null),(2,2, null),(2,3, null);
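
The effect of where the date filter sits can be reproduced on this test data. The sketch below assumes Python's built-in sqlite3 module as a stand-in for PostgreSQL (so serial becomes integer primary key). It runs the same left joins twice: once with the filter in WHERE and once with the filter in the ON clause, which is equivalent to filtering inside the joined subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (account_id INTEGER PRIMARY KEY, name VARCHAR NOT NULL);
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name VARCHAR NOT NULL);
    CREATE TABLE users_accounts (user_id INT REFERENCES users(user_id),
                                 account_id INT REFERENCES accounts(account_id));
    CREATE TABLE payments (meta_id INT REFERENCES users(user_id),
                           amount INT NOT NULL, date DATE);
    INSERT INTO accounts VALUES (1, 'Account A'), (2, 'Account B'), (3, 'Account C');
    INSERT INTO users VALUES (1, 'Marc'), (2, 'Ruben'), (3, 'Isaak');
    INSERT INTO users_accounts VALUES (1, 1), (2, 1);
    INSERT INTO payments VALUES (1, 1, '2017-08-01'), (1, 2, '2017-08-11'),
                                (1, 3, '2017-08-03'), (2, 1, NULL),
                                (2, 2, NULL), (2, 3, NULL);
""")

QUERY = """
    SELECT a.account_id, SUM(COALESCE(p.amount, 0)) AS amount
    FROM accounts a
    LEFT JOIN users_accounts ua ON a.account_id = ua.account_id
    LEFT JOIN users u ON ua.user_id = u.user_id
    LEFT JOIN payments p ON u.user_id = p.meta_id {on_filter}
    {where_filter}
    GROUP BY a.account_id
    ORDER BY a.account_id
"""
DATE_RANGE = "p.date BETWEEN '2017-08-01' AND '2017-08-10'"

# Filter in WHERE: rows with a NULL p.date are discarded after the join,
# so accounts without matching payments disappear from the result.
filter_in_where = conn.execute(
    QUERY.format(on_filter="", where_filter="WHERE " + DATE_RANGE)
).fetchall()

# Filter in ON: non-matching payments are simply not joined, and every
# account is kept with a NULL payment row that COALESCE turns into 0.
filter_in_on = conn.execute(
    QUERY.format(on_filter="AND " + DATE_RANGE, where_filter="")
).fetchall()
conn.close()
```

With the filter in WHERE only account 1 comes back; with the filter in ON all three accounts are returned, matching the result shown above.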

Selecting one specific data row (required), and 3 others (specific data row must be included)

I need to select a specific row and 2 other rows that are not that specific row (3 in total). The specific row must always be included in the 3 results. How should I go about it? I think it can be done with a UNION ALL, but do I have another choice? Thanks all! :)
Here are my scripts to create the sample tables:
create table users (
user_id serial primary key,
user_name varchar(20) not null
);
create table result_table1 (
result_id serial primary key,
user_id int4 references users(user_id),
result_1 int4 not null
);
create table result_table2 (
result_id serial primary key,
user_id int4 references users(user_id),
result_2 int4 not null
);
insert into users (user_name) values ('Kevin'),('John'),('Batman'),('Someguy');
insert into result_table1 (user_id, result_1) values (1, 20),(2, 40),(3, 70),(4, 42);
insert into result_table2 (user_id, result_2) values (1, 4),(2, 3),(3, 7),(4, 5);
Here is my UNION query:
SELECT result_table1.user_id,
result_1,
result_2
FROM result_table1
INNER JOIN (
SELECT user_id
FROM users
) users
ON users.user_id = result_table1.user_id
INNER JOIN (
SELECT result_table2.user_id,
result_2
FROM result_table2
) result_table2
ON result_table2.user_id = result_table1.user_id
WHERE users.user_id = 1
UNION ALL
SELECT result_table1.user_id,
result_1,
result_2
FROM result_table1
INNER JOIN (
SELECT user_id
FROM users
) users
ON users.user_id = result_table1.user_id
INNER JOIN (
SELECT result_table2.user_id,
result_2
FROM result_table2
) result_table2
ON result_table2.user_id = result_table1.user_id
WHERE users.user_id != 1
LIMIT 3;
Are there any options other than a UNION? The query works and does what I want for now, but will it always include user_id = 1 if the tables held many more rows (assume that user_id = 1 will always be there)? :(
Thank you all! :)
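
One possible alternative to the UNION, sketched under some assumptions: SQL gives no guarantee about row order without an ORDER BY, so a LIMIT on its own cannot be relied on to keep a particular row. Sorting the required user first and then applying LIMIT 3 makes the inclusion explicit. The example uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, omits the join to users because result_table1 already carries user_id, and the pick_three helper is hypothetical.

```python
import sqlite3


def pick_three(required_user_id):
    """Return 3 rows, always starting with the row for required_user_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (user_id INTEGER PRIMARY KEY AUTOINCREMENT,
                            user_name VARCHAR(20) NOT NULL);
        CREATE TABLE result_table1 (result_id INTEGER PRIMARY KEY AUTOINCREMENT,
                                    user_id INT REFERENCES users(user_id),
                                    result_1 INT NOT NULL);
        CREATE TABLE result_table2 (result_id INTEGER PRIMARY KEY AUTOINCREMENT,
                                    user_id INT REFERENCES users(user_id),
                                    result_2 INT NOT NULL);
        INSERT INTO users (user_name) VALUES ('Kevin'), ('John'), ('Batman'), ('Someguy');
        INSERT INTO result_table1 (user_id, result_1) VALUES (1, 20), (2, 40), (3, 70), (4, 42);
        INSERT INTO result_table2 (user_id, result_2) VALUES (1, 4), (2, 3), (3, 7), (4, 5);
    """)
    rows = conn.execute("""
        SELECT r1.user_id, r1.result_1, r2.result_2
        FROM result_table1 r1
        JOIN result_table2 r2 ON r2.user_id = r1.user_id
        -- The comparison is true only for the required user; DESC sorts it first.
        ORDER BY (r1.user_id = ?) DESC, r1.user_id
        LIMIT 3
    """, (required_user_id,)).fetchall()
    conn.close()
    return rows
```

The second sort key (r1.user_id) is only there to make the other two rows deterministic; any other ordering, including a random one, could be used for the filler rows.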

postgresql calling column with same name

I have two tables that both have a column named id (I cannot change the way the tables are designed), and I'm trying to query table2's id. How would I do this when the tables are joined?
create table table1(
id integer, -- PG: serial
description MediumString not null,
primary key (id)
);
create table table2 (
id integer, -- PG: serial
tid integer references table1(id),
primary key (id)
);
So basically, when they're joined, two columns will have the same name "id" if I run the following query:
select * from table1
join table2 on table1.id = table2.tid;
Alias the columns if you want both "id"s:
SELECT table1.id AS id1, table2.id AS id2
FROM table1...
If you want to select * from both tables but still be able to reference a specific id, you can do that too. You will end up with duplicate id columns that you probably won't use, but if you really need all the data, it's worth it. Note that in PostgreSQL an alias containing a dot has to be written in double quotes; single quotes denote a string literal.
select table1.*, table2.*, table1.id as "table1.id", table2.id as "table2.id"
from ...
You cannot pick out a single id when using select *.
Try listing the columns explicitly instead:
select table1.id, table1.description, table2.id, table2.tid
from table1
inner join table2
on table1.id = table2.tid
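
From client code, aliases are usually what make the two ids distinguishable, since result columns are typically looked up by name. Below is a small sketch of that, assuming Python's built-in sqlite3 module as a stand-in for PostgreSQL. The sample rows, the id1/id2 aliases, and the plain TEXT type used in place of MediumString are all assumptions for the example.

```python
import sqlite3


def joined_ids():
    """Return the result column names and the first row of the aliased join."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (id INTEGER PRIMARY KEY, description TEXT NOT NULL);
        CREATE TABLE table2 (id INTEGER PRIMARY KEY,
                             tid INTEGER REFERENCES table1(id));
        -- Hypothetical sample rows.
        INSERT INTO table1 VALUES (1, 'first');
        INSERT INTO table2 VALUES (10, 1);
    """)
    cur = conn.execute("""
        SELECT table1.id AS id1, table2.id AS id2
        FROM table1
        JOIN table2 ON table1.id = table2.tid
    """)
    # cursor.description holds one entry per result column; item 0 is its name.
    names = [column[0] for column in cur.description]
    row = cur.fetchone()
    conn.close()
    return names, row
```

Calling `joined_ids()` gives the names `id1` and `id2`, so table2's id can be read unambiguously as `id2`.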