I am trying to get a count of records between dates.
My data has records from 01/01/2020 to 04/01/2020.
I have set up two parameters, Start-date & End-date
I only want to count the records that are between my start (01/01/2020) and end date (01/31/2020).
Sample Data
Sheet_ID Supervisor_ID Category_ID Date
OB-111 1111 1 01/01/2020
OB-112 1111 4 03/01/2020
OB-113 1111 2 01/01/2020
OB-114 2222 2 01/01/2020
OB-115 2222 2 01/21/2020
I am trying to show the following:
Supervisor_ID Category_ID Count
1111 1 1
1111 2 1
2222 2 2
Thank you in advance!
Create a calculated field as follows:
IF [Date]>=[StartDate] AND [Date]<=[EndDate] THEN 1 END
Sum this field to get the count.
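For reference, the same logic in plain SQL, as a minimal sketch assuming the data lives in a table named records (the table name and the hard-coded dates stand in for the two parameters, and are not from the original workbook):
-- Count records whose Date falls between the start and end parameters,
-- grouped as in the desired output.
SELECT Supervisor_ID, Category_ID, COUNT(*) AS "Count"
FROM records
WHERE "Date" >= DATE '2020-01-01' AND "Date" <= DATE '2020-01-31'
GROUP BY Supervisor_ID, Category_ID;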
I have a small problem with counting cells with a particular value in one row in SQL Server Management Studio (SSMS).
The table looks like this (columns ID, Month, and one column per day of the month, 1 through 31):

ID    Month  1     2     3  4  5     6  7  8     9     10  11  12  13  14  15    16    17  18  19  20  21  22    ...  31
5000  1      null  null  1  1  null  1  1  null  null  2   2   2   2   2   null  null  3   3   3   3   3   null  ...  1
I need to count how many cells in one row have a given value, for example 1. In this case it would be 5.
The data represents worker shifts in a month. Be aware that there is a column named Month (an FK with values 1-12); I don't want that counted in the result.
Column ID is ALWAYS a 4-digit number.
One possibility is to use COUNT(CASE WHEN ...), but in the examples I've found there are only two or three columns, not 31, so the statement would be very long. Is there any other option to count it?
Thanks for any advice.
I'm going to strongly suggest that you abandon your current table design and instead store one day per record, not per column. That is, use this design:
ID | Date | Value
5000 | 2021-01-01 | NULL
5000 | 2021-01-02 | NULL
5000 | 2021-01-03 | 1
5000 | 2021-01-04 | 1
5000 | 2021-01-05 | NULL
...
5000 | 2021-01-31 | 5
Then use this query:
SELECT
    ID,
    CONVERT(varchar(7), Date, 120) AS yr_month,
    COUNT(CASE WHEN Value = 1 THEN 1 END) AS one_cnt
FROM yourTable
GROUP BY
    ID,
    CONVERT(varchar(7), Date, 120);
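If the table design can't be changed, there is still a shorter option than 31 CASE expressions: unpivot the day columns with CROSS APPLY (VALUES ...). This is only a sketch; it assumes the day columns are literally named [1] through [31]:
-- Unpivot the 31 day columns into rows, then count the matching values.
SELECT t.ID, t.Month,
       COUNT(CASE WHEN d.Value = 1 THEN 1 END) AS one_cnt
FROM yourTable t
CROSS APPLY (VALUES
    (t.[1]),(t.[2]),(t.[3]),(t.[4]),(t.[5]),(t.[6]),(t.[7]),(t.[8]),
    (t.[9]),(t.[10]),(t.[11]),(t.[12]),(t.[13]),(t.[14]),(t.[15]),(t.[16]),
    (t.[17]),(t.[18]),(t.[19]),(t.[20]),(t.[21]),(t.[22]),(t.[23]),(t.[24]),
    (t.[25]),(t.[26]),(t.[27]),(t.[28]),(t.[29]),(t.[30]),(t.[31])
) AS d(Value)
GROUP BY t.ID, t.Month;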
BACKGROUND
I have three large tables (employee_info, driver_info, school_info) that I have joined together on common attributes using a series of LEFT OUTER JOIN operations. After each join, the resulting number of records increased slightly, indicating that there are duplicate IDs in the data. To try and find all of the duplicates in the IDs, I dumped the ID columns into a temp table like so:
Original Dump of ID Columns
first_name   last_name   employee_id   driver_id   school_id
Mickey       Mouse       1234          abcd        wxyz
Donald       Duck        2423          heca        qwer
Mary         Poppins     1111          acbe        aaaa
Wiley        Cayote      1234          strf        aaaa
Daffy        Duck        1256          acbe        pqrs
Bugs         Bunny       9999          strf        yxwv
Pink         Panther     2222          zzzz        zzaa
Michael      Archangel   0000          rstu        aaaa
In this overly simplified example, you will see that IDs 1234 (employee_id), strf (driver_id), and aaaa (school_id) are each duplicated at least once. I would like to add a count column for each of the ID columns, and populate them with the count for each ID used, like so:
ID Columns with Counts
first_name   last_name   employee_id   employee_id_count   driver_id   driver_id_count   school_id   school_id_count
Mickey       Mouse       1234          2                   abcd        1                 wxyz        1
Donald       Duck        2423          1                   heca        1                 qwer        1
Mary         Poppins     1111          1                   acbe        1                 aaaa        3
Wiley        Cayote      1234          2                   strf        2                 aaaa        3
Daffy        Duck        1256          1                   acbe        1                 pqrs        1
Bugs         Bunny       9999          1                   strf        2                 yxwv        1
Pink         Panther     2222          1                   zzzz        1                 zzaa        1
Michael      Archangel   0000          1                   rstu        1                 aaaa        3
You can see that IDs 1234 and strf each have 2 in the count, and aaaa has 3. After generating this table, my goal is to pull out all records where any of the counts are greater than 1, like so:
All Records with One or More Duplicate IDs
first_name   last_name   employee_id   employee_id_count   driver_id   driver_id_count   school_id   school_id_count
Mickey       Mouse       1234          2                   abcd        1                 wxyz        1
Mary         Poppins     1111          1                   acbe        1                 aaaa        3
Wiley        Cayote      1234          2                   strf        2                 aaaa        3
Bugs         Bunny       9999          1                   strf        2                 yxwv        1
Michael      Archangel   0000          1                   rstu        1                 aaaa        3
Real World Perspective
In my real-world work, the joined table contains 100 columns, 15 different ID fields, and over 30,000 records, and it came out 28 records larger than the original. This may seem like a small number, but each of the 28 represents a broken link that we must fix.
Is there a simple way to get the counts populated like in the second table above? I have been wrestling with this for hours already, and have not been able to make this work. I tried some aggregate functions, but they cannot be used in table UPDATE operations.
The COUNT function, when used as an analytic function, can do what you want here, e.g.
WITH cte AS (
    SELECT *,
        COUNT(employee_id) OVER (PARTITION BY employee_id) AS employee_id_count,
        COUNT(driver_id) OVER (PARTITION BY driver_id) AS driver_id_count,
        COUNT(school_id) OVER (PARTITION BY school_id) AS school_id_count
    FROM yourTable
)
SELECT *
FROM cte
WHERE employee_id_count > 1
   OR driver_id_count > 1
   OR school_id_count > 1;
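If you also need to persist those counts into real count columns on the temp table (rather than just selecting them), and you are on SQL Server, an updatable CTE gets around the rule that aggregates can't be used directly in an UPDATE. This is a sketch; the temp table name #id_dump and the pre-added employee_id_count column are assumptions:
-- Window functions can't appear in UPDATE ... SET directly,
-- but you can update through a CTE that computes them.
WITH cte AS (
    SELECT employee_id_count,
           COUNT(employee_id) OVER (PARTITION BY employee_id) AS emp_cnt
    FROM #id_dump
)
UPDATE cte SET employee_id_count = emp_cnt;
Repeat the same pattern for the other ID count columns.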
I am having a problem with Postgres insisting that the primary key be part of the GROUP BY clause, the frustration being that the syntax I have works perfectly with SQLite.
The error message is: column "attendees.id" must appear in the GROUP BY clause or be used in an aggregate function.
My question is: can anyone provide the fix such that Postgres will produce the output that SQLite does? (SQL statement, schema, data, and desired output provided below.)
My table schema:
CREATE TABLE Attendees
(
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    activitynumber int,
    activitydate DATE,
    club int,
    usernum int,
    username TEXT,
    activityname TEXT,
    cost int,
    received int,
    update_time DATETIME,
    CONSTRAINT FK_activitynumber
        FOREIGN KEY (activitynumber)
        REFERENCES Activity(activitynumber),
    CONSTRAINT FK_club
        FOREIGN KEY (club)
        REFERENCES "club"(clubnum),
    CONSTRAINT FK_usernum
        FOREIGN KEY (usernum)
        REFERENCES "users"(usernum)
);
CREATE INDEX ix_attendees_group ON Attendees
(club,activitydate,activitynumber);
(Please note, I added the index to try and solve the problem but it didn’t).
The SQLAlchemy query is:
activities = Attendees.query.filter_by(club=current_user.club).order_by(Attendees.activitydate.desc()).group_by(Attendees.activitydate, Attendees.activitynumber)
The data I have is:
id activitynumber activitydate club usernum username activityname cost received update_time
1 3 15-10-19 1002 1002000001 susan Monday Swim 200 0 58:43.8
2 3 15-10-19 1002 1002000002 triblokerich Monday Swim 200 0 58:49.9
3 4 17-10-19 1002 1002000001 susan Thursday Swim 200 0 59:04.5
4 4 17-10-19 1002 1002000015 craig Thursday Swim 200 0 59:09.9
5 6 16-10-19 1002 1002000001 susan Dunton 200 0 00:06.5
6 6 16-10-19 1002 1002000002 triblokerich Dunton 200 0 00:17.6
7 3 16-10-19 1002 1002000001 susan Monday Swim 300 0 58:28.1
8 3 16-10-19 1002 1002000002 triblokerich Monday Swim 300 0 58:33.7
9 3 16-10-19 1002 1002000015 craig Monday Swim 300 0 58:37.7
10 3 16-10-19 1002 1002000016 craig2 Monday Swim 300 0 01:41.8
11 3 19-10-19 1002 1002000001 susan Monday Swim 300 0 07:56.4
12 3 19-10-19 1002 1002000002 triblokerich Monday Swim 300 0 08:04.8
and the output I am trying to get (which works with SQLite) is:
Date Activity
2019-10-19 Monday Swim
2019-10-17 Thursday Swim
2019-10-16 Monday Swim
2019-10-16 Dunton
2019-10-15 Monday Swim
If I understand SQLAlchemy's syntax correctly, you're looking for something like this; hopefully I'm not far off (broken into lines for readability):
activities = Attendees.query\
.with_entities(Attendees.activitydate, Attendees.activityname)\
.filter_by(club=current_user.club)\
.order_by(Attendees.activitydate.desc())\
.group_by(Attendees.activitydate,Attendees.activityname)
You always need to have the non-grouped columns in the result set as aggregates (min, max, sum, etc.) because there's no way to show them otherwise. Which id, for example, would be taken from a group of five rows? SQLite is one of the lenient databases that just returns an arbitrary row, but PostgreSQL is strict about the query and will never do this.
This query defines that the result should contain only the columns that are in the group_by, since those are what you need. It also groups by activityname rather than activitynumber, since the selected and grouped columns need to match. This way there are no un-aggregated, non-grouped columns in the result, only the grouped ones, and the query should work fine.
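For comparison, the raw SQL this query is intended to generate looks roughly like the following sketch (:club is a bind parameter standing in for current_user.club):
SELECT activitydate, activityname
FROM Attendees
WHERE club = :club
GROUP BY activitydate, activityname
ORDER BY activitydate DESC;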
In Tableau I have a table with this form:
rows: Score.
columns: MY(month), sum(good), sum(bad).
This is what it shows when I use month 201811:
201611 201612 ... 201801 ... 201811 TOTAL
Score Good Bad Good Bad Good Bad ... Good Bad
1 3 0 7 3 6 3 2 1
2 5 1 1 1 1 1 4 4
3 10 3 2 1 0 3 3 3
I want to use a filter on the Month column: when I filter month=201811, show 201611 through 201711 (the previous 12 months) in the Total column (totals of the Good and Bad columns) by Score.
Filter: 201811
Formula: sum(Good) and sum(Bad) from 201611 to 201711
I tried "IF DATEDIFF('month', [Good], today()) <= 12" but it doesn't work.
Thanks for your help.
Try this:
IF DATEDIFF('month', TODAY(), [Your Date Field], 'Sunday') <= -12
THEN [Your Date Field] ELSE NULL END
Then use that as your date column. The 'Sunday' should be whatever you consider the starting day of the week. I wasn't sure what your date field is named, so I called it [Your Date Field].
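Since the question filters on a chosen month rather than on today's date, another option is to compare against a parameter instead of TODAY(). This is a sketch assuming you create a date-typed parameter named [Selected Month] (an assumption, not a field from the question) and use the calculation as a filter set to True:
// Keep 201611 through 201711 when [Selected Month] is 201811,
// i.e. dates 12 to 24 months before the selected month.
DATEDIFF('month', [Your Date Field], [Selected Month]) >= 12
AND DATEDIFF('month', [Your Date Field], [Selected Month]) <= 24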
I have two tables with 1000 records each, shown below.
My first table is USER table.
ID Name DateOfBirth
1 John 1980-11-20 00:00:00.000
2 Denial 1940-04-10 00:00:00.000
3 Binney 1995-12-25 00:00:00.000
4 Sara 1960-11-20 00:00:00.000
5 Poma 1980-11-20 00:00:00.000
6 Cameroon 1980-11-20 00:00:00.000
.....
.....
And my second table is CHANNEL_WATCH_DURATION_BY_USER
userid duration channelname
1 100 SAB
2 200 zee Tv
1 400 axn
2 0 star 1
3 800 star 2
3 700 star 3
4 200 star 4
.....
.....
I need to write a Postgres SQL query to display, for each channel, the total duration per age group:
under 18   20-30 age   30-40 age   channel
10 40 100 star 1
20 0 200 star 2
30 79 0 zee
40 80 30 axn
.....
.....
-- IF() and DATEDIFF() are MySQL functions; in Postgres use CASE plus age()
-- to get the age in years. USER is a reserved word in Postgres, so the table
-- name must be quoted. This sums the watch duration per the desired output.
SELECT
    SUM(CASE WHEN date_part('year', age(a.DateOfBirth)) < 18 THEN b.duration ELSE 0 END) AS under_18,
    SUM(CASE WHEN date_part('year', age(a.DateOfBirth)) BETWEEN 20 AND 30 THEN b.duration ELSE 0 END) AS age_20_to_30,
    SUM(CASE WHEN date_part('year', age(a.DateOfBirth)) BETWEEN 30 AND 40 THEN b.duration ELSE 0 END) AS age_30_to_40,
    b.channelname AS channel
FROM "USER" a
JOIN CHANNEL_WATCH_DURATION_BY_USER b ON a.ID = b.userid
GROUP BY b.channelname;