I have a stored procedure to which I cannot add a GROUP BY. I need to get the number of records from another table. The back end passes filter parameters for one table, and for the rows that match I want the number of related records.
SELECT cei_mot_count.cnt FROM someTable AS st
LEFT JOIN CEI_Motivations AS cei_mot ON cei_mot.Id= st.ID
--LEFT OUTER JOIN (SELECT Id, StayingID, COUNT(*) as cnt FROM CEI_Motivations GROUP BY Id,StayingID) as cei_mot_count ON cei_mot_count.StayingID = cei_mot.StayingID
--LEFT OUTER JOIN (SELECT Id, COUNT(cei_mot.Id) as cnt FROM cei_mot GROUP BY Id) as cei_mot_count ON cei_mot_count.Id IS NOT NULL
OUTER APPLY (SELECT COUNT(mot.Id) as cnt
FROM cei_mot as mot
WHERE mot.Id IS NOT NULL
GROUP BY mot.Id) as cei_mot_count
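I get this exception: Invalid object name 'cei_mot'
You can't use a table alias from a join as a table source in the FROM clause of an APPLY subquery. An alias only lets you reference that table's columns; to read the table again you have to name the table itself. I think what you want is: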
SELECT cei_mot_count.cnt FROM someTable AS st
LEFT JOIN CEI_Motivations AS cei_mot ON cei_mot.Id= st.ID
--LEFT OUTER JOIN (SELECT Id, StayingID, COUNT(*) as cnt FROM CEI_Motivations GROUP BY Id,StayingID) as cei_mot_count ON cei_mot_count.StayingID = cei_mot.StayingID
--LEFT OUTER JOIN (SELECT Id, COUNT(cei_mot.Id) as cnt FROM cei_mot GROUP BY Id) as cei_mot_count ON cei_mot_count.Id IS NOT NULL
OUTER APPLY (SELECT COUNT(mot.Id) as cnt
FROM CEI_Motivations as mot
WHERE mot.Id IS NOT NULL AND mot.Id= st.ID
GROUP BY mot.Id) as cei_mot_count
Note how this version uses CEI_Motivations twice independently.
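As a rough illustration of the per-row count, here is a minimal sketch run through Python's sqlite3 module. SQLite has no OUTER APPLY, so the sketch swaps in a correlated scalar subquery, which reads CEI_Motivations once per outer row in the same spirit. The sample rows are made up for the example. One difference worth knowing: a COUNT without GROUP BY yields 0 for rows that have no match, whereas the OUTER APPLY above (which keeps GROUP BY) would yield NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; only the table and column names come from the question.
conn.executescript("""
    CREATE TABLE someTable (ID INTEGER PRIMARY KEY);
    CREATE TABLE CEI_Motivations (Id INTEGER, StayingID INTEGER);
    INSERT INTO someTable VALUES (1), (2), (3);
    INSERT INTO CEI_Motivations VALUES (1, 10), (1, 11), (2, 12);
""")

# Correlated scalar subquery: evaluated per outer row, so the outer
# query needs no GROUP BY of its own.
rows = conn.execute("""
    SELECT st.ID,
           (SELECT COUNT(*)
              FROM CEI_Motivations AS mot
             WHERE mot.Id = st.ID) AS cnt
      FROM someTable AS st
     ORDER BY st.ID
""").fetchall()

print(rows)  # [(1, 2), (2, 1), (3, 0)]
```

The row with ID 3 has no motivations and still appears, with a count of 0.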
Related
Say select id from some_expensive_query is the CTE I want to share. Currently I run two SQL statements in a transaction:
with t as (select id from some_expensive_query) select * from t1 join t on t.id =t1.id;
with t as (select id from some_expensive_query) select * from t2 join t on t.id =t2.id;
As you can see, the cte is executed twice but I want something like:
t = select id from some_expensive_query;
select * from t1 join t on t.id =t1.id;
select * from t2 join t on t.id =t2.id;
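For portability, I don't want to use pgsql or functions. Is there any way to solve this?
Why don't you use UNION ALL? Note that this only works if both SELECTs return the same number of columns with compatible types.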
with t as (select id from some_expensive_query)
select * from t1 join t on t.id =t1.id
union all
select * from t2 join t on t.id =t2.id;
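A minimal sketch of the UNION ALL shape, run through Python's sqlite3 module. The table contents and the extra src column (which tags the branch each row came from) are assumptions added for the example. Whether the engine actually evaluates the CTE only once depends on the database and its version, so treat this as a demonstration of the query shape rather than of performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; t1 and t2 share the same column layout,
# which UNION ALL requires.
conn.executescript("""
    CREATE TABLE some_expensive_query (id INTEGER);
    CREATE TABLE t1 (id INTEGER, label TEXT);
    CREATE TABLE t2 (id INTEGER, label TEXT);
    INSERT INTO some_expensive_query VALUES (1), (2);
    INSERT INTO t1 VALUES (1, 'a'), (3, 'b');
    INSERT INTO t2 VALUES (2, 'c'), (4, 'd');
""")

# One statement, one CTE definition, two consumers.
rows = conn.execute("""
    WITH t AS (SELECT id FROM some_expensive_query)
    SELECT 't1' AS src, t1.id, t1.label FROM t1 JOIN t ON t.id = t1.id
    UNION ALL
    SELECT 't2' AS src, t2.id, t2.label FROM t2 JOIN t ON t.id = t2.id
    ORDER BY src
""").fetchall()

print(rows)  # [('t1', 1, 'a'), ('t2', 2, 'c')]
```

The src column lets the caller split the combined result back into the two original result sets.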
I'm reviewing some of our Redshift queries and found cases with multiple levels of nested select like the one below:
LEFT JOIN
(
SELECT *
FROM (
SELECT
id,
created_at,
min(created_at) OVER (PARTITION BY id, slug) AS transition_date
FROM table
WHERE status = 'cancelled'
GROUP BY id, Y, Z, created_at
)
WHERE created_at = transition_date
) t1 ON b.id = t1.id
If this were MySQL, I would have done something like this to remove one level of nested SELECT:
LEFT JOIN
(
SELECT
id,
created_at,
@tdate := min(created_at) OVER (PARTITION BY id, slug) AS transition_date
FROM table
WHERE status = 'cancelled' and @tdate = bul.created_at
GROUP BY id, Y, Z, created_at
) t1 ON b.id = t1.id
Is it possible to do something similar in Redshift?
Update:
I forgot to include the GROUP BY in the nested SELECT, which may affect the answer.
You can move the condition for the transition_date into the JOIN condition:
LEFT JOIN
(
SELECT
id,
created_at,
min(created_at) OVER (PARTITION BY id, slug) AS transition_date
FROM table
WHERE status = 'cancelled'
) t1 ON b.id = t1.id AND t1.created_at = t1.transition_date
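To check that moving the filter into the ON clause behaves like the extra nesting level, here is a small sketch using Python's sqlite3 module (window functions need SQLite 3.25 or later). The table names base and events and all sample rows are hypothetical stand-ins for the tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-ins: "base" plays the role of b, "events" of table.
conn.executescript("""
    CREATE TABLE base (id INTEGER);
    CREATE TABLE events (id INTEGER, slug TEXT, status TEXT, created_at TEXT);
    INSERT INTO base VALUES (1), (2);
    INSERT INTO events VALUES
        (1, 'x', 'cancelled', '2020-01-01'),
        (1, 'x', 'cancelled', '2020-02-01'),
        (1, 'x', 'active',    '2019-12-01');
""")

# The window result is compared in the ON clause, so only the earliest
# cancelled row per (id, slug) can match; unmatched base rows survive
# because the join is a LEFT JOIN.
rows = conn.execute("""
    SELECT b.id, t1.created_at
      FROM base AS b
      LEFT JOIN (
            SELECT id,
                   created_at,
                   MIN(created_at) OVER (PARTITION BY id, slug) AS transition_date
              FROM events
             WHERE status = 'cancelled'
           ) AS t1
        ON b.id = t1.id
       AND t1.created_at = t1.transition_date
     ORDER BY b.id
""").fetchall()

print(rows)  # [(1, '2020-01-01'), (2, None)]
```

Putting the condition in WHERE instead of ON would drop the row for id 2, turning the LEFT JOIN into an inner join in effect.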
SELECT film_id, film_actor.actor_id,first_name,last_name,COUNT(*)
FROM film_actor
INNER JOIN actor ON film_actor.actor_id=actor.actor_id
GROUP BY film_actor.actor_id ;
SELECT
film_id, -- which table does this come from? (you can leave this column out)
film_actor.actor_id,
first_name,
last_name,
COUNT(*) AS Total
FROM film_actor
INNER JOIN actor
ON film_actor.actor_id =actor.actor_id
GROUP BY film_actor.actor_id ;
A better way to write this statement is to aggregate in a derived table first and then join the detail columns:
SELECT
c.film_id,
a.actor_id,
b.first_name,
b.last_name,
a.Count
FROM
(
SELECT
film_actor.actor_id,
COUNT(*) AS Count
FROM film_actor
INNER JOIN actor
ON film_actor.actor_id =actor.actor_id
GROUP BY film_actor.actor_id
) AS a
INNER JOIN actor b ON b.actor_id = a.actor_id
INNER JOIN film_actor c ON c.actor_id = a.actor_id
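The aggregate-then-join pattern can be tried out with a short sketch in Python's sqlite3 module. The sample actors and films are invented for the example, and the count alias is renamed to total here purely to avoid clashing with the COUNT keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data in the shape of the film_actor / actor tables.
conn.executescript("""
    CREATE TABLE actor (actor_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE film_actor (actor_id INTEGER, film_id INTEGER);
    INSERT INTO actor VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
    INSERT INTO film_actor VALUES (1, 100), (1, 101), (2, 100);
""")

# Step 1 (derived table a): one row per actor with the film count.
# Step 2: join back to actor and film_actor for the detail columns,
# so every selected column is either grouped or comes from a join.
rows = conn.execute("""
    SELECT c.film_id, a.actor_id, b.first_name, b.last_name, a.total
      FROM (
            SELECT actor_id, COUNT(*) AS total
              FROM film_actor
             GROUP BY actor_id
           ) AS a
      JOIN actor      AS b ON b.actor_id = a.actor_id
      JOIN film_actor AS c ON c.actor_id = a.actor_id
     ORDER BY a.actor_id, c.film_id
""").fetchall()

for row in rows:
    print(row)
```

Each film row repeats the actor's total, which is what selecting film_id next to a per-actor count implies.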
I would like to check across multiple tables that the same keys / same number of keys are present in each of the tables.
Currently I have created a solution that checks the count of keys per individual table, checks the count of keys when all tables are merged together, then compares.
This solution works but I wonder if there is a more optimal solution...
Example solution as it stands:
SELECT COUNT(DISTINCT variable) AS num_ids FROM table_a;
SELECT COUNT(DISTINCT variable) AS num_ids FROM table_b;
SELECT COUNT(DISTINCT variable) AS num_ids FROM table_c;
SELECT COUNT(DISTINCT a.variable) AS num_ids
FROM (SELECT DISTINCT VARIABLE FROM table_a) a
INNER JOIN (SELECT DISTINCT VARIABLE FROM table_b) b ON a.variable = b.variable
INNER JOIN (SELECT DISTINCT VARIABLE FROM table_c) c ON a.variable = c.variable;
UPDATE:
The difficulty I'm facing in putting this together in one query is that any of the tables might not be unique on the VARIABLE that I want to check, so I've had to use DISTINCT before merging to avoid expanding the join.
Since we are only counting, I think there is no need to join the tables on the variable column. A UNION should be enough.
We still have to use DISTINCT to suppress duplicates, which often means an extra sort.
An index on variable should help for getting counts for separate tables, but it will not help for getting the count of the combined table.
Here is an example for comparing two tables:
WITH
CTE_A
AS
(
SELECT COUNT(DISTINCT variable) AS CountA
FROM TableA
)
,CTE_B
AS
(
SELECT COUNT(DISTINCT variable) AS CountB
FROM TableB
)
,CTE_AB
AS
(
SELECT COUNT(DISTINCT variable) AS CountAB
FROM
(
SELECT variable
FROM TableA
UNION ALL
-- sic! use ALL here to avoid sort when merging two tables
-- there should be only one distinct sort for the outer `COUNT`
SELECT variable
FROM TableB
) AS AB
)
SELECT
CASE WHEN CountA = CountAB AND CountB = CountAB
THEN 'same' ELSE 'different' END AS ResultAB
FROM
CTE_A
CROSS JOIN CTE_B
CROSS JOIN CTE_AB
;
Three tables:
WITH
CTE_A
AS
(
SELECT COUNT(DISTINCT variable) AS CountA
FROM TableA
)
,CTE_B
AS
(
SELECT COUNT(DISTINCT variable) AS CountB
FROM TableB
)
,CTE_C
AS
(
SELECT COUNT(DISTINCT variable) AS CountC
FROM TableC
)
,CTE_ABC
AS
(
SELECT COUNT(DISTINCT variable) AS CountABC
FROM
(
SELECT variable
FROM TableA
UNION ALL
-- sic! use ALL here to avoid sort when merging two tables
-- there should be only one distinct sort for the outer `COUNT`
SELECT variable
FROM TableB
UNION ALL
-- sic! use ALL here to avoid sort when merging two tables
-- there should be only one distinct sort for the outer `COUNT`
SELECT variable
FROM TableC
) AS ABC
)
SELECT
CASE WHEN CountA = CountABC AND CountB = CountABC AND CountC = CountABC
THEN 'same' ELSE 'different' END AS ResultABC
FROM
CTE_A
CROSS JOIN CTE_B
CROSS JOIN CTE_C
CROSS JOIN CTE_ABC
;
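Why comparing against the combined count is enough: every key of TableA is also in the union, so if the distinct count of TableA equals the distinct count of the union, the union contains nothing beyond TableA's keys. If the same holds for TableB, both tables hold exactly the same key set. The sketch below exercises the two-table query through Python's sqlite3 module; the compare helper and the sample values are hypothetical and exist only for this illustration.

```python
import sqlite3

QUERY = """
WITH cte_a  AS (SELECT COUNT(DISTINCT variable) AS count_a FROM table_a),
     cte_b  AS (SELECT COUNT(DISTINCT variable) AS count_b FROM table_b),
     cte_ab AS (
        SELECT COUNT(DISTINCT variable) AS count_ab
          FROM (SELECT variable FROM table_a
                UNION ALL
                SELECT variable FROM table_b) AS ab
     )
SELECT CASE WHEN count_a = count_ab AND count_b = count_ab
            THEN 'same' ELSE 'different' END
  FROM cte_a CROSS JOIN cte_b CROSS JOIN cte_ab
"""


def compare(a_values, b_values):
    """Load two throwaway tables and report whether their key sets match."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_a (variable INTEGER)")
    conn.execute("CREATE TABLE table_b (variable INTEGER)")
    conn.executemany("INSERT INTO table_a VALUES (?)", [(v,) for v in a_values])
    conn.executemany("INSERT INTO table_b VALUES (?)", [(v,) for v in b_values])
    result = conn.execute(QUERY).fetchone()[0]
    conn.close()
    return result


print(compare([1, 1, 2], [1, 2, 2]))  # duplicates differ, keys match
print(compare([1, 2], [1, 3]))        # equal counts, different keys
```

The second call is the case that a comparison of the per-table counts alone would miss: both tables have two distinct keys, but the union has three.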
I deliberately chose CTEs because, as far as I know, Postgres materializes them (always before version 12; from version 12 on, a CTE that is referenced only once may be inlined instead), and in our case each CTE will have only one row.
Using array_agg with ORDER BY is an even better variant, if it is available on Redshift. You still need DISTINCT, but you don't have to merge all the tables together.
WITH
CTE_A
AS
(
SELECT array_agg(DISTINCT variable ORDER BY variable) AS A
FROM TableA
)
,CTE_B
AS
(
SELECT array_agg(DISTINCT variable ORDER BY variable) AS B
FROM TableB
)
,CTE_C
AS
(
SELECT array_agg(DISTINCT variable ORDER BY variable) AS C
FROM TableC
)
SELECT
CASE WHEN A = B AND B = C
THEN 'same' ELSE 'different' END AS ResultABC
FROM
CTE_A
CROSS JOIN CTE_B
CROSS JOIN CTE_C
;
Well, here is probably the nastiest piece of SQL I could build for you :) I will forever deny that I wrote this and claim that my Stack Overflow account was hacked ;)
SELECT
'All OK'
WHERE
( SELECT COUNT(DISTINCT id) FROM table_a ) = ( SELECT COUNT(DISTINCT id) FROM table_b )
AND ( SELECT COUNT(DISTINCT id) FROM table_b ) = ( SELECT COUNT(DISTINCT id) FROM table_c )
By the way, this won't optimise the query - it's still doing three queries (but I guess it's better than 4?).
UPDATE: In light of your use-case below: NEW sql fiddle http://sqlfiddle.com/#!15/a0403/1
SELECT DISTINCT
tbl_a.a_count,
tbl_b.b_count,
tbl_c.c_count
FROM
( SELECT COUNT(id) a_count, array_agg(id order by id) ids FROM table_a) tbl_a,
( SELECT COUNT(id) b_count, array_agg(id order by id) ids FROM table_b) tbl_b,
( SELECT COUNT(id) c_count, array_agg(id order by id) ids FROM table_c) tbl_c
WHERE
tbl_a.ids = tbl_b.ids
AND tbl_b.ids = tbl_c.ids
The above query returns a row only if the sorted ID arrays of all three tables are identical, which also means the tables have the same number of rows.
I wrote the query below, but its syntax is wrong. Please help me fix it: I want 'T' to be an alias for the result of the two inner joins.
select T.id
from table1
inner join table2 on table1.x = table2.y
inner join table3 on table3.z = table1.w as T;
You cannot use an alias to name the "entire" join. You can, however, put aliases on the individual tables of the join:
select t1.id
from table1 t1
inner join table2 t2 on t1.x = t2.y
inner join table3 t3 on t3.z = t1.w
In the projection, you have to use the alias of the table that defines the id column you want to select.
You can't directly name the result of a join. One option is to use a subquery:
select T.id
from (
select *
from table1
inner join table2 on table1.x = table2.y
inner join table3 on table3.z = table1.w
) T
Another option is subquery factoring (a common table expression):
with T as (
select *
from table1
inner join table2 on table1.x = table2.y
inner join table3 on table3.z = table1.w
)
select T.id
from T
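A minimal sketch of the derived-table option, run through Python's sqlite3 module with invented sample rows. It lists the needed column explicitly instead of using select *, because select * over several joined tables can produce duplicate column names inside T, which some engines reject for a derived table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data; only the table and column names come from the question.
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, x INTEGER, w INTEGER);
    CREATE TABLE table2 (y INTEGER);
    CREATE TABLE table3 (z INTEGER);
    INSERT INTO table1 VALUES (1, 10, 20), (2, 11, 21), (3, 12, 22);
    INSERT INTO table2 VALUES (10), (11);
    INSERT INTO table3 VALUES (20), (22);
""")

# The whole join is wrapped in a subquery, and the subquery gets the alias T.
rows = conn.execute("""
    SELECT T.id
      FROM (
            SELECT table1.id
              FROM table1
              JOIN table2 ON table1.x = table2.y
              JOIN table3 ON table3.z = table1.w
           ) AS T
     ORDER BY T.id
""").fetchall()

print(rows)  # [(1,)]
```

Only the table1 row with id 1 has a partner in both table2 and table3, so it is the only row that T exposes.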