Big Query Order By Grouped Field

I have a query that groups by date which works fine.
SELECT EXTRACT(date FROM DATETIME(timestamp, 'US/Eastern')) date, SUM(users) total_users FROM `mydataset.mytable`
GROUP BY EXTRACT(date FROM DATETIME(timestamp, 'US/Eastern'))
But when I try to order by date:
SELECT EXTRACT(date FROM DATETIME(timestamp, 'US/Eastern')) date, SUM(users) total_users FROM `mydataset.mytable`
GROUP BY EXTRACT(date FROM DATETIME(timestamp, 'US/Eastern'))
ORDER BY EXTRACT(date FROM DATETIME(timestamp, 'US/Eastern'));
I get the following error:
SELECT list expression references column timestamp which is neither grouped nor aggregated at [1:35]
The timestamp column is clearly part of the GROUP BY, and stranger still, the query works without the ORDER BY clause. What's going on here?

#standardSQL
SELECT
EXTRACT(DATE FROM DATETIME(timestamp, 'US/Eastern')) date,
SUM(users) total_users
FROM `mydataset.mytable`
GROUP BY 1
ORDER BY 1

You can try a subselect:
#standardSQL
SELECT
date,
total_users
FROM (
SELECT
EXTRACT(date FROM DATETIME(timestamp,'US/Eastern')) date,
SUM(users) total_users
FROM
`mydataset.mytable`
GROUP BY EXTRACT(date FROM DATETIME(timestamp, 'US/Eastern'))
)
ORDER BY
date

Related

Is there a SQL code for cumulative count of SaaS customer over months?

I have a table with:
ID (client id), date_start (start of the SaaS subscription), date_end (either a date value or NULL).
I need a cumulative count of active clients month by month.
Any idea how to write that in Postgres to achieve this result?
I started from this, but I don't know how to proceed:
select
date_trunc('month', c.date_start)::date,
count(*)
from customer

Please check the following solution:
select
subscribed_date,
subscribed_customers,
unsubscribed_customers,
coalesce(subscribed_customers, 0) - coalesce(unsubscribed_customers, 0) cumulative
from (
select distinct
date_trunc('month', c.date_start)::date subscribed_date,
sum(1) over (order by date_trunc('month', c.date_start)) subscribed_customers
from customer c
order by subscribed_date
) subscribed
left join (
select distinct
date_trunc('month', c.date_end)::date unsubscribed_date,
sum(1) over (order by date_trunc('month', c.date_end)) unsubscribed_customers
from customer c
where date_end is not null
order by unsubscribed_date
) unsubscribed on subscribed.subscribed_date = unsubscribed.unsubscribed_date;

You have a table of customers with a start date and sometimes an end date. As you want to group by date, but there are two dates in the table, you need to split these first.
Then, there may be months in which customers only came and others in which customers only left, so you'll want a full outer join of the two sets.
For a cumulative sum (also called a running total), use SUM OVER.
with came as
(
select date_trunc('month', date_start) as month, count(*) as cnt
from customer
group by date_trunc('month', date_start)
)
, went as
(
select date_trunc('month', date_end) as month, count(*) as cnt
from customer
where date_end is not null
group by date_trunc('month', date_end)
)
select
month,
came.cnt as cust_new,
went.cnt as cust_gone,
sum(coalesce(came.cnt, 0) - coalesce(went.cnt, 0)) over (order by month) as cust_active
from came full outer join went using (month)
order by month;
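
The came/went split and the running total can also be traced by hand. The sketch below mirrors that logic in plain Python rather than Postgres; the sample rows and the `month_start` helper are made up for illustration and are not part of the original question.

```python
from collections import Counter
from datetime import date
from itertools import accumulate

# Hypothetical sample rows: (id, date_start, date_end); date_end may be None.
customers = [
    (1, date(2021, 1, 10), None),
    (2, date(2021, 1, 20), date(2021, 3, 5)),
    (3, date(2021, 2, 2), date(2021, 2, 25)),
    (4, date(2021, 4, 1), None),
]


def month_start(d):
    """Truncate a date to the first day of its month (like date_trunc('month', ...))."""
    return d.replace(day=1)


# One counter per date column, matching the "came" and "went" sets.
came = Counter(month_start(start) for _, start, _ in customers)
went = Counter(month_start(end) for _, _, end in customers if end is not None)

# Union of months plays the role of the full outer join.
months = sorted(set(came) | set(went))

# A Counter returns 0 for a missing month, which is what COALESCE(cnt, 0) does in SQL.
net = [came[m] - went[m] for m in months]

# Running total of the net change gives the active customers per month.
active = dict(zip(months, accumulate(net)))

for month, count in active.items():
    print(month, count)
```

A month that appears in only one of the two sets contributes zero from the other, which is why the missing side has to be treated as 0 rather than as NULL.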

How to include three or more aggregators in a sql query?

I have a table called retail which stores items and their price along with date of purchase. I want to find out total monthly count of unique items sold.
This is the sql query I tried
select date_trunc('month', date) as month, sum(count(distinct(items))) as net_result from retail group by month order by date;
But I get the following error
ERROR: aggregate function calls cannot be nested
I searched for similar Stack Overflow posts, one of which is "postgres aggregate function calls may not be nested", but I am unable to adapt it to create the correct SQL query.
What am I doing wrong?

From your description, it doesn't seem like you need to nest the aggregate functions. The count(distinct items) construction gives you the count of distinct items sold, like so:
select date_trunc('month', date) as month
, count(distinct items) as unique_items_sold
, count(items) as total_items_sold
from retail
group by "month"
order by "month" ;
If you had a column called item_count (say, if there was one row in the table for each item sold, but a sale might include three widgets), you could sum it instead:
select date_trunc('month', date) as month
, count(distinct items) as unique_items_sold
, sum(item_count) as total_items_sold
from retail
group by "month"
order by "month" ;

Use a subquery:
select month, sum(citems) as net_result
from
(select
date_trunc('month', date) as month,
count(distinct items) as citems
from
retail
group by date_trunc('month', date)
) as monthly
group by month
order by month;

I suspect your GROUP BY will throw an error because month is an alias defined in the same SELECT list, so use the full expression there instead.
select
month,
sum(disct_item) as net_results
from
(select
date_trunc('month', date) as month,
count(distinct items) as disct_item
from
retail
group by
date_trunc('month', date)) as tbl
group by
month
order by
month;
You cannot nest aggregate functions, so wrap the inner count in a subquery and apply the sum in the outer query.
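
To check the subquery pattern without a Postgres server, here is a minimal sketch using Python's built-in sqlite3 module. It rests on some assumptions: SQLite has no date_trunc, so strftime('%Y-%m', ...) stands in for the month bucket, and the table contents and the `per_month` alias are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE retail (items TEXT, price REAL, date TEXT)")

# Hypothetical sample data: two distinct items in January, one in February.
conn.executemany(
    "INSERT INTO retail VALUES (?, ?, ?)",
    [
        ("apple", 1.0, "2021-01-03"),
        ("apple", 1.0, "2021-01-15"),
        ("pear", 2.0, "2021-01-20"),
        ("apple", 1.0, "2021-02-01"),
    ],
)

# Inner query: one aggregate (COUNT DISTINCT) per month.
# Outer query: a second aggregate (SUM) over the inner result.
rows = conn.execute(
    """
    SELECT month, SUM(distinct_items) AS net_result
    FROM (
        SELECT strftime('%Y-%m', date) AS month,
               COUNT(DISTINCT items) AS distinct_items
        FROM retail
        GROUP BY strftime('%Y-%m', date)
    ) AS per_month
    GROUP BY month
    ORDER BY month
    """
).fetchall()

for month, net_result in rows:
    print(month, net_result)
```

Because the inner query already returns one row per month, the outer SUM here simply passes the count through; the two-level shape only matters when the outer query groups by something coarser than the inner one.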

Google BigQuery: Select and group by "yyyy-mm" from date field ("yyyy-mm-dd") or timestamp

I would like to group by "yyyy-mm" from a date field ("yyyy-mm-dd") or timestamp field so that I can pull and group transactional data over multiple years without having to pull separate queries grouping by month for each year.

SELECT
CONCAT(STRING(YEAR(timestamp)),'-',RIGHT(STRING(100 + MONTH(timestamp)), 2)) AS yyyymm,
<any aggregations here>
FROM YourTable
GROUP BY 1
Another option:
SELECT
STRFTIME_UTC_USEC(timestamp, "%Y-%m") AS yyyymm,
<any aggregations here>
FROM YourTable
GROUP BY 1
Both versions should work with a timestamp or a date.

You can use the following for Standard SQL:
SELECT concat(cast(format_date("%E4Y", cast(current_date() as date)) as string),'-',cast(format_date("%m", cast(current_date() as date)) as string)) as yyyymm, <other aggregations>
FROM <YourTable>
GROUP BY 1;
Just replace current_date() with the name of your column containing the timestamp.
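
For comparison, the same yyyy-mm bucketing can be sketched outside BigQuery. The Python below is only an illustration of grouping by a formatted year-month key; the `transactions` list and its values are made up for the example.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical transactions spanning two years: (timestamp, amount).
transactions = [
    (datetime(2019, 1, 5, 10, 30), 10),
    (datetime(2019, 1, 20, 8, 0), 5),
    (datetime(2019, 2, 1, 12, 0), 7),
    (datetime(2020, 1, 3, 9, 15), 4),
]

# Group by the "yyyy-mm" key so that January 2019 and January 2020 stay separate.
totals = defaultdict(int)
for ts, amount in transactions:
    totals[ts.strftime("%Y-%m")] += amount

for yyyymm in sorted(totals):
    print(yyyymm, totals[yyyymm])
```

Keeping the year in the key is what lets one query cover several years without merging the same month across them.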

PostgreSQL SELECT date before max(DATE)

I need to select the rows for which the difference between max(date) and the date just before max(date) is smaller than 366 days. I know about SELECT MAX(date) FROM table to get the most recent date, but how could I get the date before it?
I would need a query of this kind:
SELECT code, MAX(date) - before_date FROM troncon WHERE MAX(date) - before_date < 366 ;
NB: before_date does not refer to anything; it is a placeholder to be replaced by something that works.
Edit: Example of the table I'm testing it on:
CREATE TABLE troncon (code TEXT, ope_date DATE) ;
INSERT INTO troncon (code, ope_date) VALUES
('C086000-T10001', '2014-11-11'),
('C086000-T10001', '2014-11-11'),
('C086000-T10002', '2014-12-03'),
('C086000-T10002', '2014-01-03'),
('C086000-T10003', '2014-08-11'),
('C086000-T10003', '2014-03-03'),
('C086000-T10003', '2012-02-27'),
('C086000-T10004', '2014-08-11'),
('C086000-T10004', '2013-12-30'),
('C086000-T10004', '2013-06-01'),
('C086000-T10004', '2012-07-31'),
('C086000-T10005', '2013-10-01'),
('C086000-T10005', '2012-11-01'),
('C086000-T10006', '2014-04-01'),
('C086000-T10006', '2014-05-15'),
('C086000-T10001', '2014-07-05'),
('C086000-T10003', '2014-03-03');
Many thanks!

The subquery pairs every row with the maximum date for its code; the outer query then keeps only the rows whose difference from that maximum is smaller than 366 days:
select * from
(
SELECT id, date, max(date) over(partition by code) max_date FROM your_table
) A
-- subtracting two DATE values yields an integer number of days
where max_date - date < 366
PS: As @a_horse_with_no_name said, you can partition by code to get the maximum date for each code.
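
The question asks specifically for the gap between the latest date and the one just before it, per code. A minimal Python sketch of that logic follows, using a subset of the rows from the question. It is an illustration of the idea and not the answer's SQL; the variable names are invented, and duplicate dates for a code are collapsed, which is an assumption about the intended behavior.

```python
from collections import defaultdict
from datetime import date

# A subset of the sample rows from the question: (code, ope_date).
rows = [
    ("C086000-T10002", date(2014, 12, 3)),
    ("C086000-T10002", date(2014, 1, 3)),
    ("C086000-T10003", date(2014, 8, 11)),
    ("C086000-T10003", date(2014, 3, 3)),
    ("C086000-T10003", date(2012, 2, 27)),
    ("C086000-T10005", date(2013, 10, 1)),
    ("C086000-T10005", date(2012, 11, 1)),
]

# Collect the distinct dates per code (assumption: duplicates count once).
dates_by_code = defaultdict(set)
for code, ope_date in rows:
    dates_by_code[code].add(ope_date)

# For each code, compare the latest date with the one just before it.
gaps = {}
for code, dates in dates_by_code.items():
    if len(dates) < 2:
        continue  # no "date before" exists for this code
    latest, previous = sorted(dates, reverse=True)[:2]
    gap = (latest - previous).days
    if gap < 366:
        gaps[code] = gap

for code, gap in sorted(gaps.items()):
    print(code, gap)
```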

Correct use of COALESCE in postgres / COALESCE not working correctly

I am trying to return an hourly report on the number of searches performed. My results do not include the hours when there are zero searches, although I thought I had the syntax correct for using COALESCE. Can anyone see what I am doing wrong? Thanks
SELECT CAST(startdatetime as Date),extract(hour from startdatetime) as hr, COALESCE(count(distinct id),0) as average_per_hour
FROM search WHERE CAST(startdatetime As Date) = '2014/07/05'
GROUP BY CAST(startdatetime as Date),extract(hour from startdatetime)
ORDER BY CAST(startdatetime as Date),extract(hour from startdatetime)

Some refinement, but basically the same as @a_horse_with_no_name's answer:
SELECT DATE '2014-07-05', hr, COUNT(DISTINCT id) AS average_per_hour
FROM generate_series(0, 23) hr
LEFT JOIN search ON EXTRACT(HOUR FROM startdatetime) = hr AND CAST(startdatetime AS DATE) = '2014-07-05'
GROUP BY hr
ORDER BY hr
Using CAST(startdatetime AS DATE) in ORDER BY & GROUP BY is irrelevant, because you search only one day. If that is not the case in general, you will need to tweak generate_series() too.
Edit:
This works across multiple days:
SELECT CAST(hr AS DATE), EXTRACT(HOUR FROM hr), COUNT(DISTINCT id) AS average_per_hour
FROM generate_series('2014-07-05 00:00:00', '2014-07-06 23:00:00', INTERVAL '1' HOUR) hr
LEFT JOIN search ON date_trunc('hour', startdatetime) = hr
GROUP BY hr
ORDER BY hr

What you need is a "lookup" table with all the hours in a day. Then you do a left join against your search table. Due to the left join on all possible rows, you can include the "missing" hours as well.
Something like the following (not tested, there might be syntax errors!)
with hours as (
select hr
from generate_series(0,23) hr
)
SELECT date '2014-07-05' as search_date,
hours.hr,
count(distinct search.id) as average_per_hour
FROM hours
left join search on extract(hour from search.startdatetime) = hours.hr
and cast(search.startdatetime as Date) = date '2014-07-05'
GROUP BY hours.hr
ORDER BY hours.hr;
As shown this will only work if you select exactly one day.
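
The lookup-table idea can be tried locally with Python's built-in sqlite3 module. This is a sketch under assumptions: SQLite has no generate_series by default, so a recursive CTE builds the 24 hours, strftime and date stand in for extract and cast, and the sample rows are invented. The date filter sits in the join condition, so hours without a matching search still produce a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE search (id INTEGER, startdatetime TEXT)")

# Hypothetical sample data: three searches on the target day, one on the next day.
conn.executemany(
    "INSERT INTO search VALUES (?, ?)",
    [
        (1, "2014-07-05 09:15:00"),
        (2, "2014-07-05 09:45:00"),
        (3, "2014-07-05 17:05:00"),
        (4, "2014-07-06 09:00:00"),
    ],
)

rows = conn.execute(
    """
    WITH RECURSIVE hours(hr) AS (
        SELECT 0
        UNION ALL
        SELECT hr + 1 FROM hours WHERE hr < 23
    )
    SELECT hours.hr, COUNT(DISTINCT search.id) AS searches
    FROM hours
    LEFT JOIN search
           ON CAST(strftime('%H', search.startdatetime) AS INTEGER) = hours.hr
          AND date(search.startdatetime) = '2014-07-05'
    GROUP BY hours.hr
    ORDER BY hours.hr
    """
).fetchall()

counts = dict(rows)
for hr, searches in rows:
    print(hr, searches)
```

COUNT ignores the NULL ids that the left join produces for empty hours, so those hours report 0 without any COALESCE.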
