Redshift - Sum output from two different queries into a single query

I am trying to sum the output of two different SQL queries in Redshift.
SQL1:
select count(*) from table1; -- Output: 10
SQL2:
select count(*) from table2; -- Output: 14
I am trying to build a single query that shows the total of both queries. Expected output: 24

I figured out the solution:
select ((select count(*) from table1) +
(select count(*) from table2)) as count
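The query above can be tried out locally with a minimal sketch like the following. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Redshift, and fills table1 and table2 with made-up rows so that the counts match the 10 and 14 from the question.

```python
import sqlite3

def total_count(conn):
    # Add the results of two scalar subqueries in a single SELECT.
    row = conn.execute(
        "select ((select count(*) from table1) + "
        "(select count(*) from table2)) as total"
    ).fetchone()
    return row[0]

# Hypothetical sample data: 10 rows in table1, 14 rows in table2.
conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (id integer)")
conn.execute("create table table2 (id integer)")
conn.executemany("insert into table1 values (?)", [(i,) for i in range(10)])
conn.executemany("insert into table2 values (?)", [(i,) for i in range(14)])

total = total_count(conn)
print(total)  # 24
```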

Related

I want to select 2 data from database which durations less than 150

I have a problem with my SQL command. I want to select 2 movies whose combined duration is less than 150. I wrote this SQL command:
Select
movie_title,Sum(movie_time) as sum_movie
From
movie_movie
Group By
movie_title
Having
Sum(movie_time)<100
Order By
sum_movie DESC

You can get the two movies with the smallest movie_time values by using ORDER BY movie_time ASC LIMIT 2 in a CTE, and then use that CTE in the condition.
with two_min_movie as (
select *
from movie_movie
order by movie_time ASC limit 2
)
select *
from two_min_movie
where (select sum(movie_time) from two_min_movie) < 150
Demo in DBfiddle
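As a rough check of the CTE approach, here is a small sketch that runs the same query against SQLite through Python's sqlite3 module. The movie titles and durations are invented for illustration, and SQLite is only an assumption, since the question does not name the database.

```python
import sqlite3

QUERY = """
with two_min_movie as (
    select * from movie_movie order by movie_time asc limit 2
)
select movie_title, movie_time
from two_min_movie
where (select sum(movie_time) from two_min_movie) < 150
order by movie_time
"""

def shortest_pair(rows):
    # Load the hypothetical rows into a fresh in-memory table and run the query.
    conn = sqlite3.connect(":memory:")
    conn.execute("create table movie_movie (movie_title text, movie_time integer)")
    conn.executemany("insert into movie_movie values (?, ?)", rows)
    return conn.execute(QUERY).fetchall()

# 60 + 80 = 140 is below 150, so both movies are returned.
fits = shortest_pair([("A", 60), ("B", 80), ("C", 120)])
# 90 + 95 = 185 is not below 150, so nothing is returned.
too_long = shortest_pair([("A", 90), ("B", 95), ("C", 120)])
print(fits, too_long)
```

Note that the query only ever considers the two shortest movies. If their combined duration is 150 or more, no other pair can qualify either, so an empty result is the correct answer.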

Pivot table using crosstab and count

I have to display a table like this:
Year | Month | Delivered | Not delivered | Not Received
-----+-------+-----------+---------------+--------------
2021 | Jan   |        10 |            86 |           75
2021 | Feb   |        13 |            36 |           96
2021 | March |        49 |             7 |           61
2021 | Apr   |         3 |            21 |           72
Using raw data generated by this query:
SELECT
year,
TO_CHAR( creation_date, 'Month') AS month,
marking,
COUNT(*) AS count
FROM invoices
GROUP BY 1,2,3
I have tried using crosstab() but I got an error:
SELECT * FROM crosstab('
SELECT
year,
TO_CHAR( creation_date, ''Month'') AS month,
marking,
COUNT(*) AS count
FROM invoices
GROUP BY 1,2,3
') AS ct(year text, month text, marking text)
I would prefer to not manually type all marking values because they are a lot.
ERROR: invalid source data SQL statement
DETAIL: The provided SQL must return 3 columns: rowid, category, and values.

1. Static solution with a limited list of marking values :
SELECT year
, TO_CHAR( creation_date, 'Month') AS month
, COUNT(*) FILTER (WHERE marking = 'Delivered') AS Delivered
, COUNT(*) FILTER (WHERE marking = 'Not delivered') AS "Not delivered"
, COUNT(*) FILTER (WHERE marking = 'Not Received') AS "Not Received"
FROM invoices
GROUP BY 1,2
2. Full dynamic solution with a large list of marking values :
This proposal is an alternative solution to the crosstab solution as proposed in A and B.
The proposed solution here just requires a dedicated composite type which can be dynamically created and then it relies on the jsonb type and standard functions :
Starting from your query which counts the number of rows per year, month and marking value:
- Using the jsonb_object_agg function, the resulting rows are first aggregated by year and month into jsonb objects whose keys correspond to the marking values and whose values correspond to the counts.
- The resulting jsonb objects are then converted into records using the jsonb_populate_record function and the dedicated composite type.
First we dynamically create a composite type which corresponds to the ordered list of marking values :
CREATE OR REPLACE PROCEDURE create_composite_type() LANGUAGE plpgsql AS $$
DECLARE
column_list text ;
BEGIN
SELECT string_agg(DISTINCT quote_ident(marking) || ' bigint', ',' ORDER BY quote_ident(marking) || ' bigint' ASC)
INTO column_list
FROM invoices ;
EXECUTE 'DROP TYPE IF EXISTS composite_type' ;
EXECUTE 'CREATE TYPE composite_type AS (' || column_list || ')' ;
END ;
$$ ;
CALL create_composite_type() ;
Then the expected result is provided by the following query :
SELECT a.year
, TO_CHAR(a.year_month, 'Month') AS month
, (jsonb_populate_record( null :: composite_type
, jsonb_object_agg(a.marking, a.count)
)
).*
FROM
( SELECT year
, date_trunc('month', creation_date) AS year_month
, marking
, count(*) AS count
FROM invoices AS v
GROUP BY 1,2,3
) AS a
GROUP BY 1,2
ORDER BY month
Obviously, if the list of marking values may change over time, you have to call the create_composite_type() procedure again just before executing the query. If you don't update composite_type, the query will still work (no error!), but some columns may correspond to obsolete marking values that are no longer used, and new marking values will be missing from the query result (not displayed as columns).
See the full demo in dbfiddle.
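The static solution in item 1 is a case of conditional aggregation: one count per marking value, grouped by year and month. The sketch below illustrates the idea on SQLite through Python's sqlite3 module. It is an approximation under stated assumptions, not the Postgres query itself: SUM(CASE ...) stands in for COUNT(*) FILTER (...), strftime stands in for TO_CHAR, the year is derived from creation_date, and the invoice rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table invoices (creation_date text, marking text)")
# Hypothetical sample rows.
conn.executemany("insert into invoices values (?, ?)", [
    ("2021-01-05", "Delivered"),
    ("2021-01-09", "Delivered"),
    ("2021-01-20", "Not delivered"),
    ("2021-02-03", "Not Received"),
    ("2021-02-11", "Delivered"),
])

# One conditional sum per marking value turns rows into columns.
PIVOT = """
select strftime('%Y', creation_date) as yr,
       strftime('%m', creation_date) as mon,
       sum(case when marking = 'Delivered' then 1 else 0 end) as delivered,
       sum(case when marking = 'Not delivered' then 1 else 0 end) as not_delivered,
       sum(case when marking = 'Not Received' then 1 else 0 end) as not_received
from invoices
group by yr, mon
order by yr, mon
"""

pivot_rows = conn.execute(PIVOT).fetchall()
for row in pivot_rows:
    print(row)
```

Each marking value still has to be typed by hand here, which is exactly the limitation the dynamic solutions are meant to remove.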

You need to generate the crosstab() call dynamically.
But since SQL does not allow dynamic return types, you need a two-step workflow:
1. Generate query
2. Execute query
If you are unfamiliar with crosstab(), read this first:
PostgreSQL Crosstab Query
It's odd to generate the month from creation_date, but not the year. To simplify, I use a combined column year_month instead.
Query to generate the crosstab() query:
SELECT format(
$f$SELECT * FROM crosstab(
$q$
SELECT to_char(date_trunc('month', creation_date), 'YYYY_Month') AS year_month
, marking
, COUNT(*) AS ct
FROM invoices
GROUP BY date_trunc('month', creation_date), marking
ORDER BY date_trunc('month', creation_date) -- optional
$q$
, $c$VALUES (%s)$c$
) AS ct(year_month text, %s);
$f$, string_agg(quote_literal(sub.marking), '), (')
, string_agg(quote_ident (sub.marking), ' int, ') || ' int'
)
FROM (SELECT DISTINCT marking FROM invoices ORDER BY 1) sub;
If the table invoices is big with only a few distinct values for marking (which seems likely), there are faster ways to get distinct values. See:
Optimize GROUP BY query to retrieve latest row per user
Generates a query of the form:
SELECT * FROM crosstab(
$q$
SELECT to_char(date_trunc('month', creation_date), 'YYYY_Month') AS year_month
, marking
, COUNT(*) AS ct
FROM invoices
GROUP BY date_trunc('month', creation_date), marking
ORDER BY date_trunc('month', creation_date) -- optional
$q$
, $c$VALUES ('Delivered'), ('Not Delivered'), ('Not Received')$c$
) AS ct(year_month text, "Delivered" int, "Not Delivered" int, "Not Received" int);
The simplified query does not need "extra" columns. See:
Pivot on Multiple Columns using Tablefunc
Note the use of date_trunc('month', creation_date) in GROUP BY and ORDER BY. This produces a valid sort order, and it is faster, too. See:
Cumulative sum of values by month, filling in for missing months
How to get rows by max(date) group by Year-Month in Postgres?
Also note the use of dollar-quotes to avoid quoting hell. See:
Insert text with single quotes in PostgreSQL
Months without entries don't show up in the result, and markings without entries in an existing month show as NULL. You can adapt either if need be. See:
Join a count query on generate_series() and retrieve Null values as '0'
Then execute the generated query.
db<>fiddle here (reusing Edouard's fiddle, kudos!)
See:
Execute a dynamic crosstab query
In psql
In psql you can use \gexec to immediately execute the generated query. See:
Simulate CREATE DATABASE IF NOT EXISTS for PostgreSQL?
In Postgres 9.6 or later, you can also use the meta-command \crosstabview instead of crosstab():
test=> SELECT to_char(date_trunc('month', creation_date), 'YYYY_Month') AS year_month
test-> , marking
test-> , COUNT(*) AS count
test-> FROM invoices
test-> GROUP BY date_trunc('month', creation_date), 2
test-> ORDER BY date_trunc('month', creation_date)\crosstabview
year_month | Not Received | Delivered | Not Delivered
----------------+--------------+-----------+---------------
2020_January | 1 | 1 | 1
2020_March | | 2 | 2
2021_January | 1 | 1 | 2
2021_February | 1 | |
2021_March | | 1 |
2021_August | 2 | 1 | 1
2022_August | | 2 |
2022_November | 1 | 2 | 3
2022_December | 2 | |
(9 rows)
Note that \crosstabview - unlike crosstab() - does not support "extra" columns. If you insist on separate year and month columns, you need crosstab().
See:
How do I generate a pivoted CROSS JOIN where the resulting table definition is unknown?

Postgres - Using window function in grouped rows

The Postgres documentation at https://www.postgresql.org/docs/9.4/queries-table-expressions.html#QUERIES-WINDOW states:
If the query contains any window functions (...), these functions are evaluated after any grouping, aggregation, and HAVING filtering is performed. That is, if the query uses any aggregates, GROUP BY, or HAVING, then the rows seen by the window functions are the group rows instead of the original table rows from FROM/WHERE.
I don't understand the part "then the rows seen by the window functions are the group rows instead of the original table rows from FROM/WHERE". Let me use an example to explain my doubt.
Take this ready-to-run example:
with cte as (
select 1 as primary_id, 1 as foreign_id, 10 as begins
union
select 2 as primary_id, 1 as foreign_id, 20 as begins
union
select 3 as primary_id, 1 as foreign_id, 30 as begins
union
select 4 as primary_id, 2 as foreign_id, 40 as begins
)
select foreign_id, count(*) over () as window_rows_count, count(*) as grouped_rows_count
from cte
group by foreign_id
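You may notice that the result is
> foreign_id | window_rows_count | grouped_rows_count
> ---------: | ----------------: | -----------------:
> 1 | 2 | 3
> 2 | 2 | 1
So if "the rows seen by the window functions are the group rows", then why is window_rows_count returning a different value from grouped_rows_count?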

If you remove the window function from the query:
select foreign_id, count(*) as grouped_rows_count
from cte
group by foreign_id
the result, as expected is this:
> foreign_id | grouped_rows_count
> ---------: | -----------------:
> 1 | 3
> 2 | 1
If you then apply the window function count(*) over () to this result, which has 2 rows, it returns 2: since the OVER clause is empty and defines no partition, it counts all the rows of the result set.
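The two stages described above can be made explicit in a short sketch. It assumes SQLite 3.25 or later (needed for window functions) through Python's sqlite3 module as a stand-in for Postgres, and it stores the rows of the question's cte in a real table of the same name. The window function is applied to a subquery that already holds the grouped rows, which mirrors the evaluation order the documentation describes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table cte (primary_id integer, foreign_id integer, begins integer)")
conn.executemany("insert into cte values (?, ?, ?)",
                 [(1, 1, 10), (2, 1, 20), (3, 1, 30), (4, 2, 40)])

# Stage 1: grouping collapses the four table rows into two group rows.
grouped = conn.execute(
    "select foreign_id, count(*) as grouped_rows_count "
    "from cte group by foreign_id order by foreign_id"
).fetchall()

# Stage 2: the window function only sees those two group rows.
windowed = conn.execute(
    "select foreign_id, count(*) over () as window_rows_count, grouped_rows_count "
    "from (select foreign_id, count(*) as grouped_rows_count "
    "      from cte group by foreign_id) "
    "order by foreign_id"
).fetchall()

print(grouped)
print(windowed)
```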

You should follow the last comment on your post.
For further analysis, you can run the following query:
with cte as (
select 1 as primary_id, 1 as foreign_id, 10 as begins
union
select 2 as primary_id, 1 as foreign_id, 20 as begins
union
select 3 as primary_id, 1 as foreign_id, 30 as begins
union
select 4 as primary_id, 2 as foreign_id, 40 as begins
)
select foreign_id, count(*) over (PARTITION BY foreign_id) as window_rows_count, count(*) as grouped_rows_count
from cte
group by foreign_id ;
This time window_rows_count is 1 for each foreign_id, because each partition contains exactly one group row.
See the Postgres documentation at this URL:
https://www.postgresql.org/docs/13/tutorial-window.html
The window function is applied to the whole row set produced by the grouped query.

Firebird get the list with all available id

In a table I have records with IDs 2, 4, 5, 8. How can I get a list with the values 1, 3, 6, 7? I have tried it this way:
SELECT t1.id + 1
FROM table t1
WHERE NOT EXISTS (
SELECT *
FROM table t2
WHERE t2.id = t1.id + 1
)
but it's not working correctly. It doesn't return all the available positions.
Is it possible without another table?

You can get all the missing ID's from a recursive CTE, like this:
with recursive numbers as (
select 1 number
from rdb$database
union all
select number+1
from rdb$database
join numbers on numbers.number < 1024
)
select n.number
from numbers n
where not exists (select 1
from table t
where t.id = n.number)
The number < 1024 condition in my example limits the query to the maximum recursion depth of 1024. Beyond that, the query ends with an error. If you need more than 1024 consecutive IDs, you either have to run the query multiple times, adjusting the interval of numbers generated, or use a different query that produces consecutive numbers without reaching that level of recursion, which is not too difficult to write.
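The same idea can be sketched outside Firebird. The example below assumes SQLite through Python's sqlite3 module, so rdb$database and the 1024 recursion limit do not apply. The table name t and the choice to stop the number series at max(id) instead of a fixed bound are assumptions made for illustration.

```python
import sqlite3

def missing_ids(ids):
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (id integer primary key)")
    conn.executemany("insert into t values (?)", [(i,) for i in ids])
    rows = conn.execute("""
        with recursive numbers(n) as (
            select 1
            union all
            -- Generate consecutive numbers up to the largest existing id.
            select n + 1 from numbers
            where n < (select max(id) from t)
        )
        select n from numbers
        -- Keep only the numbers that are not used as an id.
        where not exists (select 1 from t where t.id = numbers.n)
        order by n
    """).fetchall()
    return [r[0] for r in rows]

gaps = missing_ids([2, 4, 5, 8])
print(gaps)  # [1, 3, 6, 7]
```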

Conditional Union in T-SQL

Currently I've a query as follows:
-- Query 1
SELECT
acc_code, acc_name, alias, LAmt, coalesce(LAmt,0) AS amt
FROM
(SELECT
acc_code, acc_name, alias,
(SELECT
(SUM(cr_amt)-SUM(dr_amt))
FROM
ledger_mcg l
WHERE
(l.acc_code LIKE a.acc_code + '.%' OR l.acc_code=a.acc_code)
AND
fy_id=1
AND
posted_date BETWEEN '2010-01-01' AND '2011-06-02') AS LAmt
FROM
acc_head_mcg AS a
WHERE
(acc_type='4')) AS T1
WHERE
coalesce(LAmt,0)<>0
Query 2 is the same as Query 1 except that acc_type = '5'. Query 2 always returns a result set with a single row. Now I need the union of the two queries, i.e.
Query 1
UNION
Query 2
only when the amt returned by Query 2 is less than 0. Otherwise, I don't need a union, only the result set from Query 1.
The best way I can think of is to create a parameterised scalar function. How best can I do this?

You could store the result of the first query into a temporary table, then, if the table wasn't empty, execute the other query.
IF OBJECT_ID('tempdb..#MultipleQueriesResults') IS NOT NULL
DROP TABLE #MultipleQueriesResults;
SELECT
acc_code, acc_name, alias, LAmt, coalesce(LAmt,0) AS amt
INTO #MultipleQueriesResults
FROM
(SELECT
acc_code, acc_name, alias,
(SELECT
(SUM(cr_amt)-SUM(dr_amt))
FROM
ledger_mcg l
WHERE
(l.acc_code LIKE a.acc_code + '.%' OR l.acc_code=a.acc_code)
AND
fy_id=1
AND
posted_date BETWEEN '2010-01-01' AND '2011-06-02') AS LAmt
FROM
acc_head_mcg AS a
WHERE
(acc_type='4')) AS T1
WHERE
coalesce(LAmt,0)<>0;
IF NOT EXISTS (SELECT * FROM #MultipleQueriesResults)
… /* run Query 2 */
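A different way to express the condition from the question, which is not the temporary-table approach above, is to keep the UNION and filter the second branch. Because Query 2 always returns a single row, adding amt < 0 to it makes that branch contribute either its one row or nothing. The sketch below only illustrates the pattern: it assumes SQLite through Python's sqlite3 module instead of SQL Server, and the tables q1 and q2 with their values are hypothetical stand-ins for the result sets of Query 1 and Query 2.

```python
import sqlite3

def combined(query2_amt):
    conn = sqlite3.connect(":memory:")
    # q1 and q2 stand in for the result sets of Query 1 and Query 2.
    conn.execute("create table q1 (acc_code text, amt real)")
    conn.execute("create table q2 (acc_code text, amt real)")
    conn.executemany("insert into q1 values (?, ?)",
                     [("4.1", 100.0), ("4.2", 250.0)])
    conn.execute("insert into q2 values (?, ?)", ("5.1", query2_amt))
    return conn.execute("""
        select acc_code, amt from q1
        union
        -- The single row of Query 2 survives only when its amt is negative.
        select acc_code, amt from q2 where amt < 0
        order by acc_code
    """).fetchall()

with_query2 = combined(-40.0)
without_query2 = combined(40.0)
print(with_query2)
print(without_query2)
```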
