T-SQL Count of items based on date


To keep the example simple, let's say that I have a table with three columns: ID, Name, and Date. I need to find the count of all IDs belonging to a specific name where the ID does not belong to this month.
Using that example, I would want this output:
In other words, I want to count how many IDs a name has that aren't in this month/year.
I'm more into PowerShell and still fairly new to SQL. I tried doing a case statement, but because it's not a foreach it seems to be returning "If the Name has ANY date in this month, return NULL", which is not what I want. I want it to count how many IDs per name do not appear in this month.
SELECT NAME,
CASE
WHEN ( Month(date) NOT LIKE Month(Getdate())
AND Year(date) NOT LIKE Year(Getdate()) ) THEN Count(id)
END AS TotalCount
FROM dbo.table
GROUP BY NAME,
date
I really hope this makes sense, but if it doesn't please let me know and I can try to clarify more. I tried researching cursors, but I'm having a hard time grasping them to get them into my statement. Any help would be greatly appreciated!

You only want to group by the non-aggregated columns that are in the result set (in this case, Name). You don't need a cursor for this; it's a fairly straightforward query.
select
Name,
Count(*) count
from
tbl
where
tbl.date > eomonth(getdate()) or
tbl.date <= eomonth(dateadd(mm, -1, getdate()))
group by
Name
I did a little bit of trickery on the exclusion of rows that are in the current month. Generally, you want to avoid running functions on the columns you're comparing to if you can so that SQL Server can use an index to speed up its search. I assumed that the ID column is unique, if it's not, change count(*) to count(distinct ID).
Here is an alternative where clause if you're using an older version of SQL Server (EOMONTH requires SQL Server 2012 or later). If the table is small enough, you can just do it directly. This is similar to what you tried originally, except that it goes in the query's where clause instead of being embedded in a case. Note the OR: a row is outside the current month if either its month or its year differs.
where
Month(date) <> Month(Getdate())
OR Year(date) <> Year(Getdate())
If you have a large table and keeping the predicate sargable (able to use the index) is important, there is some fun stuff you can do to build eomonth yourself with dateadd and the date part functions, but it's a pain.
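As a rough illustration of the index-friendly idea above, the sketch below computes the month boundaries outside the query and compares the bare column against them using a half-open range. It runs against an in-memory SQLite database from Python rather than SQL Server, and the items table, the item_date column, and the sample rows are made up for the example.

```python
import sqlite3
from datetime import date


def month_bounds(today: date) -> tuple[str, str]:
    """Return ISO dates for the first day of today's month and of the next month."""
    start = today.replace(day=1)
    # Roll over to January of the next year when the current month is December.
    if start.month == 12:
        nxt = start.replace(year=start.year + 1, month=1)
    else:
        nxt = start.replace(month=start.month + 1)
    return start.isoformat(), nxt.isoformat()


def count_outside_month(conn: sqlite3.Connection, today: date) -> dict[str, int]:
    """Count rows per name whose date falls outside today's calendar month."""
    start, nxt = month_bounds(today)
    rows = conn.execute(
        """
        SELECT name, COUNT(*) AS total
        FROM items
        WHERE item_date < ? OR item_date >= ?  -- bare column, no function applied
        GROUP BY name
        """,
        (start, nxt),
    )
    return dict(rows.fetchall())


# Hypothetical sample data; "today" is pinned to 2024-05-15 below.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, item_date TEXT)")
conn.executemany(
    "INSERT INTO items (name, item_date) VALUES (?, ?)",
    [
        ("Ann", "2024-05-03"),  # current month: excluded
        ("Ann", "2024-04-30"),  # previous month: counted
        ("Ann", "2023-05-10"),  # same month, previous year: counted
        ("Bob", "2024-05-31"),  # current month: excluded
        ("Bob", "2024-06-01"),  # next month: counted
    ],
)
print(count_outside_month(conn, date(2024, 5, 15)))
```

A name whose rows all fall in the current month does not appear in the result at all, rather than appearing with a count of zero.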

SELECT Name, COUNT(ID) AS TotalCount
FROM dbo.[table]
WHERE DATEPART(MONTH, [Date]) != DATEPART(MONTH, GETDATE()) OR DATEPART(YEAR, [Date]) != DATEPART(YEAR, GETDATE())
GROUP BY Name;

In T-SQL:
SELECT
NAME,
COUNT(id)
FROM dbo.[table]
WHERE MONTH(Date_M) <> MONTH(GETDATE())
OR YEAR(Date_M) <> YEAR(GETDATE())
GROUP BY NAME

Related

How to extend dynamic schema with views in Hasura and Postgres?

I have been struggling for a few days to extend the schema with a custom group by, using something like this
I have a table with a few fields like id, country, ip, created_at.
I am trying to get them as groups. For example, group the data by date, by hour of a given date, by country, or by country with DISTINCT ip.
Honestly, I have almost no SQL experience, but I played around to get what I want. Here's an example.
SELECT Hour(created_at) AS date,
COUNT(*) AS count
FROM session where CAST(created_at AS date) = '2021-04-05'
GROUP BY Hour(created_at)
ORDER BY date;
SELECT country,
count(*) AS count from (SELECT * FROM session where CAST(created_at AS date) <= '2021-05-12' GROUP BY created_at) AS T1
GROUP BY country;
SELECT country, COUNT(*) as count
FROM (SELECT DISTINCT ip, country FROM session) AS T1
GROUP BY country;
SELECT DATE(created_at) AS date,
COUNT(*) AS count
FROM session
GROUP BY DATE(created_at)
ORDER BY date;
Now I am struggling with two things.
How do I make the date as variables? I mean, if I want to group them for a particular date range/ or today's data hourly, or per quarter gap (more of configurable), how do I add the variables in Hasura's Raw SQL?
Also for this approach I have to add schema for each one of them? Like this
CREATE
OR REPLACE VIEW "public"."unique_session_counts_date" AS
SELECT
date(session.created_at) AS date,
count(*) AS count
FROM
session
GROUP BY
(date(session.created_at))
ORDER BY
(date(session.created_at));
Is there a way to make it more generalized? What I mean is, if it was in Node.js I could have done something like
return rawQuery(
`
select ${field} x, count(*) y
from ${table}
where website_id=$1
and created_at between $2 and $3
${domainFilter}
${urlFilter}
group by 1
order by 2 desc
`,
params,
);
In this case, based on whatever field and where clause I send, one query would do the trick for me. Can I do something similar in Hasura?
Thank you so much in advance.

How do I make the date as variables? I mean, if I want to group them for a particular date range/ or today's data hourly, or per quarter gap (more of configurable), how do I add the variables in Hasura's Raw SQL?
My first thought is this. If you're thinking about passing in variables via GraphQL, for example, the query would look something like:
query MyQuery {
unique_session_counts_date(where: {created_at: {_gte: "<start date here>", _lte: "<end date here>"}}) {
<...any fields, rollups, etc here...>
}
}
The underlying view/query would follow the group by and order by that you've detailed. Then you'd be able to submit the GraphQL query and just pass in the pertinent parameters, like the $1, $2, and $3 in the rawQuery call.
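To make the "pass in the pertinent parameters" part concrete, here is a minimal sketch of a request body that uses GraphQL variables instead of hard-coded dates. It is written in Python only to build and print the JSON; nothing is sent anywhere. It assumes the unique_session_counts_date view from the question is tracked in Hasura and exposes its date and count columns, and that date columns use Hasura's date scalar. The operation name SessionCounts and the variable names start and end are made up.

```python
import json

# Hypothetical query against the view from the question. The view exposes
# `date` and `count`, so the range filter is applied to `date`.
QUERY = """
query SessionCounts($start: date!, $end: date!) {
  unique_session_counts_date(where: {date: {_gte: $start, _lte: $end}}) {
    date
    count
  }
}
"""


def build_payload(start: str, end: str) -> str:
    """Serialize the query and its variables as a GraphQL-over-HTTP request body."""
    return json.dumps({"query": QUERY, "variables": {"start": start, "end": end}})


print(build_payload("2021-04-01", "2021-04-30"))
```

The query text stays fixed and only the variables change per request, which plays the same role as the positional parameters in the Node.js snippet.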
Also for this approach I have to add schema for each one of them?
The schema? The view? I don't think a view specifically would be required, if a multilevel select or similar query can handle it and perform then a view wouldn't particularly be needed.
That's my first stab at the problem. I'm going to try to work through this problem in a few hours via a Twitch stream # HasuraHQ if you can join, happy to walk through it live.

HQL: Max date of previous month

Good morning,
I have a problem I've been trying to solve but am getting nowhere.
I need to find the max date of the previous month. Normally I would just use the following to find the last day of the previous month: last_day(add_months(current_date, -1))
However, this particular data set doesn't always have data for the last day. E.g. the last day in the data for May was May 30th. Obviously, if I try using the syntax above it would return no data, because it would be looking for 5/31.
So is there a way to find the "max" day available in the data of the previous month? Or the month prior etc.?

For example like this (two scans of table: one in subquery to find max date and one in main query):
select *
from mytable
where as_of_date in (select max(as_of_date) from mytable where as_of_date between first_day(add_months(current_date, -1)) and last_day(add_months(current_date, -1)))
Or (single scan + analytic function) like this
select col1 ... colN
from
(
select t.*, rank() over (partition by month (t.as_of_date) order by t.as_of_date desc) rnk
from mytable t
where --If you have partition on date, this WHERE may improve performance
t.as_of_date between first_day(add_months(current_date, -1)) and last_day(add_months(current_date, -1))
)s
where rnk=1
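The subquery variant can be tried out on a small data set. The sketch below uses an in-memory SQLite database from Python instead of Hive, computes the previous month's bounds in Python rather than with add_months and last_day, and uses a made-up value column and sample rows.

```python
import sqlite3
from datetime import date, timedelta


def previous_month_bounds(today: date) -> tuple[str, str]:
    """Return ISO dates for the first and last calendar day of the previous month."""
    last = today.replace(day=1) - timedelta(days=1)
    return last.replace(day=1).isoformat(), last.isoformat()


def latest_rows_previous_month(conn: sqlite3.Connection, today: date) -> list:
    """Return all rows dated on the latest day that has data in the previous month."""
    first, last = previous_month_bounds(today)
    return conn.execute(
        """
        SELECT as_of_date, value
        FROM mytable
        WHERE as_of_date = (
            SELECT MAX(as_of_date) FROM mytable WHERE as_of_date BETWEEN ? AND ?
        )
        ORDER BY value
        """,
        (first, last),
    ).fetchall()


# Hypothetical sample data: May has rows up to the 30th only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (as_of_date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [
        ("2024-04-30", 1),
        ("2024-05-29", 2),
        ("2024-05-30", 3),  # last day with data in May; nothing exists for 05-31
        ("2024-05-30", 4),
        ("2024-06-03", 5),
    ],
)
print(latest_rows_previous_month(conn, date(2024, 6, 10)))
```

If the previous month has no rows at all, MAX returns NULL and the outer query returns nothing.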

Combine count and max in postgresql sql

I have trouble formulating an SQL query in PostgreSQL and am hoping to get some help here.
I have a table called visitor that contains a column called fk_employee_id. fk_employee_id contains different numbers between 1 and 10, for example:
1,3,4,6,4,6,7,3,2,1,6,7,6
Now I want to find out which value is the most frequent in this column (in this case 6). I have written a query that seems to solve my problem:
SELECT fk_employee_id
FROM visitor
GROUP BY fk_employee_id
ORDER BY COUNT(fk_employee_id) DESC
LIMIT 1
but this query doesn't give the right result if two values are tied for most frequent. So instead I tried to write a query that uses the max function, but I can't figure out how. Does anyone know how to do this?

We can use RANK here to slightly modify your current query:
WITH cte AS (
SELECT
fk_employee_id,
RANK() OVER (ORDER BY COUNT(*) DESC) rank
FROM visitor
GROUP BY fk_employee_id
)
SELECT fk_employee_id
FROM cte
WHERE rank = 1;
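The tie handling can be checked with a small runnable sketch. It uses an in-memory SQLite database from Python instead of PostgreSQL (window functions need SQLite 3.25 or later), and it computes the counts in a separate CTE before ranking them, which is a slight variation on the query above. The second input list is made up to show a tie.

```python
import sqlite3


def most_frequent(values: list) -> list:
    """Return every fk_employee_id that shares the highest occurrence count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visitor (fk_employee_id INTEGER)")
    conn.executemany("INSERT INTO visitor VALUES (?)", [(v,) for v in values])
    rows = conn.execute(
        """
        WITH counts AS (
            SELECT fk_employee_id, COUNT(*) AS cnt
            FROM visitor
            GROUP BY fk_employee_id
        ),
        ranked AS (
            -- RANK gives tied counts the same rank, so all leaders get rnk = 1
            SELECT fk_employee_id, RANK() OVER (ORDER BY cnt DESC) AS rnk
            FROM counts
        )
        SELECT fk_employee_id FROM ranked WHERE rnk = 1 ORDER BY fk_employee_id
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(most_frequent([1, 3, 4, 6, 4, 6, 7, 3, 2, 1, 6, 7, 6]))  # data from the question
print(most_frequent([1, 1, 2, 2, 3]))  # hypothetical data with a tie
```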

Find rows which have different attribute value in ONE day for same product. (Postgresql)

Can someone help me write a query?
I have for example columns:
Date
product_key
category_code
Within one day I expect a product to have the same category_code, but I want to check this with SQL.
Thank you.

If you want to find the day, the product_key, and the category_code that appear more than once, you should use a query like this:
SELECT
date,
product_key,
category_code,
count(1)
FROM your_table
GROUP BY date, product_key, category_code
HAVING count(1) > 1;

You can group your results by date and product, and use count and distinct to find if there is more than one category code for a product. You can then filter rows having more than 1 distinct category in the group.
SELECT
Date, product_key, count(distinct category_code) AS categories
FROM
my_table
GROUP BY
Date, product_key
HAVING
count(distinct category_code) > 1
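A quick way to see what the COUNT(DISTINCT ...) query flags is to run it on a few rows. The sketch below uses an in-memory SQLite database from Python instead of PostgreSQL, with made-up sample rows. An exact duplicate row (same day, product, and category) is not reported, because it still has only one distinct category.

```python
import sqlite3


def conflicting_products(conn: sqlite3.Connection) -> list:
    """Return (date, product_key, categories) where a product has >1 category in a day."""
    return conn.execute(
        """
        SELECT date, product_key, COUNT(DISTINCT category_code) AS categories
        FROM my_table
        GROUP BY date, product_key
        HAVING COUNT(DISTINCT category_code) > 1
        ORDER BY date, product_key
        """
    ).fetchall()


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (date TEXT, product_key TEXT, category_code TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("2024-01-01", "P1", "A"),
        ("2024-01-01", "P1", "A"),  # duplicate row, same category: consistent
        ("2024-01-01", "P2", "A"),
        ("2024-01-01", "P2", "B"),  # same day, different category: conflict
        ("2024-01-02", "P2", "B"),
        ("2024-01-02", "P3", "C"),
        ("2024-01-03", "P3", "D"),  # category changes across days: allowed
    ],
)
print(conflicting_products(conn))
```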

PostgreSQL - get records with null values

I'm trying to write a query that shows distributors that haven't sold anything in 90 days, but the problem I get is with NULL values. It seems PostgreSQL ignores null values, even when I query for them (or maybe I did it the wrong way).
Let's say there are 1000 distributors, but with this query I only get 1 distributor. There should be more distributors that didn't sell anything, because if I write an SQL query to show distributors that sold any amount in the last 90 days, it shows about 500. So I wonder where the other 499 are. If I understand correctly, those other 499 didn't have any sales, so all their records are null and are not shown in the query.
Does anyone know how to make it show null values of one table where the related table is not null (for example, the partners table (res_partner) is not null, but the sale_order table (sales) or object is null)? I also tried to filter with so.id IS NULL, but that way I get an empty result.
Code of my query:
(
SELECT
min(f1.id) as id,
f1.partner as partner,
f1.sum1
FROM
(
SELECT
min(f2.id) as id,
f2.partner as partner,
sum(f2.null_sum) as sum1
FROM
(
SELECT
min(rp.id) as id,
rp.search_name as partner,
CASE
WHEN
sol.price_subtotal IS NULL
THEN
0
ELSE
sol.price_subtotal
END as null_sum
FROM
sale_order as so,
sale_order_line as sol,
res_partner as rp
WHERE
sol.order_id=so.id and
so.partner_id=rp.id
and
rp.distributor=TRUE
and
so.date_order <= now()::timestamp::date
and
so.date_order >= date_trunc('day', now() - '90 day'::interval)::timestamp::date
and
rp.contract_date <= date_trunc('day', now() - '90 day'::interval)::timestamp::date
GROUP BY
partner,
null_sum
)as f2
GROUP BY
partner
) as f1
WHERE
sum1=0
GROUP BY
partner,
sum1
)as fld
EDIT: 2012-09-18 11 AM.
I think I understand why PostgreSQL behaves like this. It is because of the time interval. It checks whether there is any non-null value in that interval. So it only found one record, because that record had a sale order with zero (it was not converted from null to zero), and the part that checked for null values was just skipped. If I delete the time interval, then I see all distributors that didn't sell anything at all. But with the time interval, for some reason it stops checking null values and only looks at non-null values.
So does anyone know how to make it check for null values too in the given interval (the last 90 days, to be exact)?

Aggregates like sum() and min() do ignore NULL values. This is required by the SQL standard and every DBMS I know behaves like that.
If you want to treat a NULL value as e.g. a zero, then use something like this:
sum(coalesce(f2.null_sum, 0)) as sum1
But as far as I understand your question and your invalid query, you actually want an outer join between res_partner and the sales tables.
Something like this:
SELECT min(rp.id) as id,
rp.search_name as partner,
sum(coalesce(sol.price_subtotal,0)) as price_subtotal
FROM res_partner as rp
-- the date filter on sale_order goes in the ON clause; in WHERE it would
-- discard partners without matching orders and undo the outer join
LEFT JOIN sale_order as so ON so.partner_id=rp.id
and so.date_order <= CURRENT_DATE
and so.date_order >= date_trunc('day', now() - '90 day'::interval)::timestamp::date
LEFT JOIN sale_order_line as sol ON sol.order_id=so.id
WHERE rp.distributor=TRUE
and rp.contract_date <= date_trunc('day', now() - '90 day'::interval)::timestamp::date
GROUP BY rp.search_name
I'm not 100% sure I understood your problem correctly, but it might give you a headstart.
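To illustrate how an outer join can keep distributors that have no orders in the window, here is a minimal sketch under a few assumptions. It runs on an in-memory SQLite database from Python rather than PostgreSQL, keeps only the columns needed for the join (contract_date is left out), takes the window bounds as parameters, and adds a HAVING clause to return only partners whose total is zero. The sample partners and orders are made up.

```python
import sqlite3


def distributors_without_sales(conn: sqlite3.Connection, since: str, until: str) -> list:
    """Return names of distributors with no sales between since and until (inclusive)."""
    rows = conn.execute(
        """
        SELECT rp.search_name
        FROM res_partner AS rp
        LEFT JOIN sale_order AS so
               ON so.partner_id = rp.id
              AND so.date_order BETWEEN ? AND ?  -- filter in ON keeps unmatched partners
        LEFT JOIN sale_order_line AS sol ON sol.order_id = so.id
        WHERE rp.distributor = 1
        GROUP BY rp.id, rp.search_name
        HAVING COALESCE(SUM(sol.price_subtotal), 0) = 0
        ORDER BY rp.search_name
        """,
        (since, until),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE res_partner (id INTEGER PRIMARY KEY, search_name TEXT, distributor INTEGER);
    CREATE TABLE sale_order (id INTEGER PRIMARY KEY, partner_id INTEGER, date_order TEXT);
    CREATE TABLE sale_order_line (order_id INTEGER, price_subtotal REAL);
    INSERT INTO res_partner VALUES
        (1, 'Acme', 1), (2, 'Beta', 1), (3, 'Gamma', 1), (4, 'Delta', 0);
    INSERT INTO sale_order VALUES
        (10, 1, '2012-08-01'),  -- Acme sold inside the window
        (11, 2, '2012-01-15');  -- Beta sold only before the window
    INSERT INTO sale_order_line VALUES (10, 100.0), (11, 50.0);
    """
)
print(distributors_without_sales(conn, "2012-06-20", "2012-09-18"))
```

With the same date condition placed in WHERE instead of ON, Beta and Gamma would be filtered out, because their so.date_order is NULL (Gamma) or outside the range (Beta).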

Try to name subqueries and reference their columns as q1.col, q2.col, etc., so it is clear which query or subquery each column comes from. Maybe it's something simple, e.g. it merges several rows containing only NULLs into one row? Also, at least for debugging purposes, it's smart to add , count(*) at the end of each query/subquery to see the number of rows behind each result. It's hard to guess what exactly happened.