Update a table from a union select statement - postgresql

I have two tables as below:
tablea
k | 1 | 2
--------------------
a | mango | xx
b | orange| xx
c | xx | apple
d | xx | banana
a | xx | mango
tableb
k | 1 | 2
--------------------
a | |
b | |
c | |
d | |
How can I update tableb from tablea so I get the results below?
tableb
k | 1 | 2
--------------------
a | mango | mango
b | orange| xx
c | xx | apple
d | xx | banana
if in case I try to use a update statement like below
update tableb
set 1 = x.1,
2 = x.2
from
(
select * from tablea
) x
where tablea.k = x.k
Can I make the update statement to ignore xx if k is duplicate?
Thanks.

Here is the SELECT, hope you can make the update.
Try to search a match for every one on the left side with name <> 'xx'
Then union with the rest of rows I havent use it yet.
SQL Fiddle Demo
SELECT t1."k", t1."1", COALESCE(t2."2", 'xx') "2"
FROM tablea t1
LEFT JOIN tablea t2
ON t1."1" = t2."2"
WHERE t1."1" <> 'xx'
UNION ALL
SELECT t1."k", t1."1", t1."2"
FROM tablea t1
WHERE t1."1" = 'xx'
AND t1."2" NOT IN (SELECT t2."1" FROM tablea t2 WHERE t2."1" <> 'xx')

Related

How to compute frequency/count of concurrent events by combination in postgresql?

I am looking for a way to identify event names names that co-occur: i.e., correlate event names with the same start (startts) and end (endts) times: the events are exactly concurrent (partial overlap is not a feature of this data base, which makes this conditional criterion a bit simpler to satisfy).
toy dataframe
+------------------+
|name startts endts|
| A 02:20 02:23 |
| A 02:23 02:25 |
| A 02:27 02:28 |
| B 02:20 02:23 |
| B 02:23 02:25 |
| B 02:25 02:27 |
| C 02:27 02:28 |
| D 02:27 02:28 |
| D 02:28 02:31 |
| E 02:27 02:28 |
| E 02:29 02:31 |
+------------------+
Ideal output:
+---------------------------+
|combination| count |
+---------------------------+
| AB | 2 |
| AC | 1 |
| AE | 1 |
| AD | 1 |
| BC | 0 |
| BD | 0 |
| BE | 0 |
| CE | 0 |
+-----------+---------------+
Naturally, I would have tried a loop but I recognize PostgreSQL is not optimal for this.
What I've tried is generating a temporary table by selecting for distinct name and startts and endts combinations and then doing a left join on the table itself (selecting name).
User #GMB provided the following (modified) solution; however, the performance is not satisfactory given the size of the database (even running the query on a time window of 10 minutes never completes). For context, there are about 300-400 unique names; so about 80200 combinations (if my math checks out). Order is not important for the permutations.
#GMB's attempt:
I understand this as a self-join, aggregation, and a conditional count of matching intervals:
select t1.name name1, t2.name name2,
sum(case when t1.startts = t2.startts and t1.endts = t2.endts then 1 else 0 end) cnt
from mytable t1
inner join mytable t2 on t2.name > t1.name
group by t1.name, t2.name
order by t1.name, t2.name
Demo on DB Fiddle:
name1 | name2 | cnt
:---- | :---- | --:
A | B | 2
A | C | 1
A | D | 1
A | E | 1
B | C | 0
B | D | 0
B | E | 0
C | D | 1
C | E | 1
D | E | 1
#GMB notes that, if you are looking for a count of overlapping intervals, all you have to do is change the sum() to:
sum(t1.startts <= t2.endts and t1.endts >= t2.startts) cnt
Version = PostgreSQL 8.0.2 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.4.2 20041017 (Red Hat 3.4.2-6.fc3), Redshift 1.0.19097
Thank you.
Consider the following in MySQL (where your DBFiddle points to):
SELECT name, COUNT(*)
FROM (
SELECT group_concat(name ORDER BY name) name
FROM mytable
GROUP BY startts, endts
ORDER BY name
) as names
GROUP BY name
ORDER BY name
Equivalent in PostgreSQL:
SELECT name, COUNT(*)
FROM (
SELECT string_agg(name ORDER BY name) name
FROM mytable
GROUP BY startts, endts
ORDER BY name
) as names
GROUP BY name
ORDER BY name
First, you create a list of concurrent events (in the subquery), and then you count them.

PostgreSQL query with variable result column

Hoping i'm missing something simple here, and saying it clearly..
I have a table that i need to join to 1 of 2 possible other tables, depending on user input of the id found in the primary table.
t1:
primary_id | t2_id | t3_id
--------------------------
1 | x | null
2 | null | y
t2:
id | value
----------
x | a
t3:
id | value
----------
y | b
I want to do the following in a single query:
select primary_id, t2_value from t1, t2
where t1.t2_id=t2.id
or
select primary_id, t3_value from t1, t3
where t1.t3_id=t3.id
here's the full join:
with t1 as (select * from t1),
t2 as (select * from t2),
t3 as (select * from t3),
select * from
t1
left join
t2
on t1.t2_id=t2.id
left join
t3
on t1.t3_id=t3.id
where t1.t2_id is not null
when i run this, i'd like to get back just t1 and t2 columns, not t3..
please and thank you!
You can use COALESCE():
select t1.primary_id,
coalesce(t1.t2_id, t1.t3_id) id,
coalesce(t2.value, t3.value) "value"
from t1
left join t2 on t1.t2_id=t2.id
left join t3 on t1.t3_id=t3.id
where t1.t2_id is not null
See the demo.
Results:
| primary_id | id | value |
| ---------- | --- | ----- |
| 1 | x | a |
If you change the condition to:
where t1.t3_id is not null
you will get:
| primary_id | id | value |
| ---------- | --- | ----- |
| 2 | y | b |

How to use join with aggregate function in postgresql?

I have 4 tables
Table1
id | name
1 | A
2 | B
Table2
id | name1
1 | C
2 | D
Table3
id | name2
1 | E
2 | F
Table4
id | name1_id | name2_id | name3_id
1 | 1 | 2 | 1
2 | 2 | 2 | 2
3 | 1 | 2 | 1
4 | 2 | 1 | 1
5 | 1 | 1 | 2
6 | 2 | 2 | 1
7 | 1 | 1 | 2
8 | 2 | 1 | 1
9 | 1 | 2 | 1
10 | 2 | 2 | 1
Now I want to join all tables with 4 and get this type of output
name | count
{A,B} | {5, 5}
{C,D} | {5, 6}
{E,F} | {7, 3}
I tried this
select array_agg(distinct(t1.name)), array_agg(distinct(temp.test))
from
(select t4.name1_id, (count(t4.name1_id)) "test"
from table4 t4 group by t4.name1_id
) temp
join table1 t1
on temp.name1_id = t1.id
I am trying to achieve this. Anybody can help me.
Calculate the counts for every table separately and union the results:
select
array_agg(name order by name) as name,
array_agg(count order by name) as count
from (
select 1 as t, name, count(*)
from table4
join table1 t1 on t1.id = name1_id
group by name
union all
select 2 as t, name, count(*)
from table4
join table2 t2 on t2.id = name2_id
group by name
union all
select 3 as t, name, count(*)
from table4
join table3 t3 on t3.id = name3_id
group by name
) s
group by t;
name | count
-------+-------
{A,B} | {5,5}
{C,D} | {4,6}
{E,F} | {7,3}
(3 rows)

Incrementally count row numbers for distrinct rows in a join select

I have a select that joins two tables, a and b, via a join table, ab.
select a.*, b.*
from a
left join ab on a.id = ab.aid
left join b on b.id = ab.bid;
And this produces
id | athing | id | bthing
----+----------+----+-----------
7 | athing x | 1 | bthing a
7 | athing x | 2 | bthing b
7 | athing x | 3 | bthing c
3 | athing y | 1 | bthing a
(4 rows)
I want a column that incrementally counts the number of rows in a. That is:
count | id | athing | id | bthing
-------+----+----------+----+-----------
1 | 7 | athing x | 1 | bthing a
1 | 7 | athing x | 2 | bthing b
1 | 7 | athing x | 3 | bthing c
2 | 3 | athing y | 1 | bthing a
(4 rows)
I have looked at using the window function row_number(), but that seems to count all the rows.
I want to incrementally count the distinct a rows, regardless of how many rows the joined table creates.
Is this possible in Postgresql? Thank you.
Use row_number() when selecting from the table a (note, the order of the rows in a is defined in over clause):
select a.*, b.*
from (
select row_number() over (order by id desc) as count, *
from a
) a
left join ab on a.id = ab.aid
left join b on b.id = ab.bid;
count | id | athing | id | bthing
-------+----+----------+----+----------
1 | 7 | athing x | 1 | bthing a
1 | 7 | athing x | 2 | bthing b
1 | 7 | athing x | 3 | bthing c
2 | 3 | athing y | 1 | bthing a
(4 rows)
or dense_rank() on the result dataset.
select
dense_rank() over (order by a.id desc) as count,
a.*, b.*
from a
left join ab on a.id = ab.aid
left join b on b.id = ab.bid;
Read about window functions.

SQL - group by - limit clause - postgresql

I have a table which has two columns C1 and C2.
C1 has an integer data type and C2 has text.
Table looks like this.
---C1--- ---C2---
1 | a |
1 | b |
1 | c |
1 | d |
1 | e |
1 | f |
1 | g |
2 | h |
2 | i |
2 | j |
2 | k |
2 | l |
2 | m |
2 | n |
------------------
My question: i want a sql query which does group by on column C1 but with size of 3.
looks like this.
------------------
1 | a,b,c |
1 | d,e,f |
1 | g |
2 | h,i,j |
2 | k,l,m |
2 | n |
------------------
is it possible by executing SQL???
Note: I do not want to write stored procedure or function...
You can use a common table expression to partition the results into rows, and then use STRING_AGG to join them into comma separated lists;
WITH cte AS (
SELECT *, (ROW_NUMBER() OVER (PARTITION BY C1 ORDER BY C2)-1)/3 rn
FROM mytable
)
SELECT C1, STRING_AGG(C2, ',') ALL_C2
FROM cte
GROUP BY C1,rn
ORDER BY C1
An SQLfiddle to test with.
A short explanation of the common table expression;
ROW_NUMBER() OVER (...) will number the results from 1 to n for each value of C1. We then subtract 1 and divide by 3 to get the sequence 0,0,0,1,1,1,2,2,2... and group by that value in the outer query to get 3 results per row.
Apart from Joachim Isaksson's answer,you try this method also
SELECT C1, string_agg(C2, ',') as c2
FROM (
SELECT *, (ROW_NUMBER() OVER (PARTITION BY C1 ORDER BY C2)-1)/3 as row_num
FROM atable) t
GROUP BY C1,row_num
ORDER BY c2