Postgres insists on Primary Key in Group By statement whereas SQLite does not - postgresql

I am having a problem with Postgres insisting that the primary key be part of the GROUP BY clause; the frustration is that the syntax I have works perfectly with SQLite.
The error message is: column "attendees.id" must appear in the GROUP BY clause or be used in an aggregate function.
My question is: can anyone provide a fix such that Postgres will produce the output that SQLite does? (SQL statement, schema, data and desired output provided below.)
My table schema:
CREATE TABLE Attendees
(
id INTEGER PRIMARY KEY AUTOINCREMENT,
activitynumber int,
activitydate DATE,
club int,
usernum int,
username,
activityname,
cost int,
received int,
update_time DATETIME,
CONSTRAINT FK_activitynumber
FOREIGN KEY (activitynumber)
REFERENCES Activity(activitynumber),
CONSTRAINT FK_club
FOREIGN KEY (club)
REFERENCES "club"(clubnum),
CONSTRAINT FK_usernum
FOREIGN KEY (usernum)
REFERENCES "users"(usernum)
);
CREATE INDEX ix_attendees_group ON Attendees
(club,activitydate,activitynumber);
(Please note, I added the index to try to solve the problem, but it didn't.)
activities = Attendees.query.filter_by(club=current_user.club).order_by(Attendees.activitydate.desc()).group_by(Attendees.activitydate, Attendees.activitynumber)
The data I have is:
id activitynumber activitydate club usernum username activityname cost received update_time
1 3 15-10-19 1002 1002000001 susan Monday Swim 200 0 58:43.8
2 3 15-10-19 1002 1002000002 triblokerich Monday Swim 200 0 58:49.9
3 4 17-10-19 1002 1002000001 susan Thursday Swim 200 0 59:04.5
4 4 17-10-19 1002 1002000015 craig Thursday Swim 200 0 59:09.9
5 6 16-10-19 1002 1002000001 susan Dunton 200 0 00:06.5
6 6 16-10-19 1002 1002000002 triblokerich Dunton 200 0 00:17.6
7 3 16-10-19 1002 1002000001 susan Monday Swim 300 0 58:28.1
8 3 16-10-19 1002 1002000002 triblokerich Monday Swim 300 0 58:33.7
9 3 16-10-19 1002 1002000015 craig Monday Swim 300 0 58:37.7
10 3 16-10-19 1002 1002000016 craig2 Monday Swim 300 0 01:41.8
11 3 19-10-19 1002 1002000001 susan Monday Swim 300 0 07:56.4
12 3 19-10-19 1002 1002000002 triblokerich Monday Swim 300 0 08:04.8
and the output I am trying to get (which works with SQLite) is:
Date Activity
2019-10-19 Monday Swim
2019-10-17 Thursday Swim
2019-10-16 Monday Swim
2019-10-16 Dunton
2019-10-15 Monday Swim

If I understand SQLAlchemy's syntax correctly, you're looking for something like this; hopefully I'm not far off (broken into lines for readability):
activities = Attendees.query\
    .with_entities(Attendees.activitydate, Attendees.activityname)\
    .filter_by(club=current_user.club)\
    .order_by(Attendees.activitydate.desc())\
    .group_by(Attendees.activitydate, Attendees.activityname)
Any column that is not grouped by must appear in the result set inside an aggregate (min, max, sum, etc.), because there is no other way to present it: which id, for example, should be taken from a group of five rows? SQLite is one of the lenient databases that silently returns a value from an arbitrary row of the group, but PostgreSQL is strict about the query and will never do this.
This query defines the result set to contain only the columns that are in the group_by, since those are all you need. It also groups by activityname rather than activitynumber, since the selected and grouped columns have to match. This way there are no un-aggregated, non-grouped columns in the result, only the grouped ones, and the query should work fine.
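To see this concretely, here is a minimal sketch run through Python's sqlite3 module (with an abbreviated version of the question's table and data). The strict form of the query, where every selected column is also grouped, is valid in PostgreSQL as well as SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Attendees (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        activitynumber int, activitydate DATE, club int,
        usernum int, username TEXT, activityname TEXT,
        cost int, received int
    )
""")
rows = [
    (3, "2019-10-15", 1002, 1002000001, "susan", "Monday Swim", 200, 0),
    (3, "2019-10-15", 1002, 1002000002, "triblokerich", "Monday Swim", 200, 0),
    (4, "2019-10-17", 1002, 1002000001, "susan", "Thursday Swim", 200, 0),
    (6, "2019-10-16", 1002, 1002000001, "susan", "Dunton", 200, 0),
    (3, "2019-10-16", 1002, 1002000001, "susan", "Monday Swim", 300, 0),
    (3, "2019-10-19", 1002, 1002000001, "susan", "Monday Swim", 300, 0),
]
conn.executemany(
    "INSERT INTO Attendees (activitynumber, activitydate, club, usernum,"
    " username, activityname, cost, received) VALUES (?,?,?,?,?,?,?,?)", rows)

# Strict GROUP BY: every selected column is also grouped, so no
# un-aggregated, non-grouped column remains in the result.
result = conn.execute("""
    SELECT activitydate, activityname
    FROM Attendees
    WHERE club = 1002
    GROUP BY activitydate, activityname
    ORDER BY activitydate DESC
""").fetchall()
for date, name in result:
    print(date, name)
```

This yields one row per date/activity combination, newest first, matching the desired output in the question.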

Related

T_SQL counting particular values in one row with multiple columns

I have a little problem with counting cells with a particular value in one row in SSMS.
The table looks like:
ID   Month 1    2    3 4 5    6 7 8    9    10 11 12 13 14 15   16   17 18 19 20 21 22   ... 31
5000 1     null null 1 1 null 1 1 null null 2  2  2  2  2  null null 3  3  3  3  3  null ... 1
I need to count how many cells in one row have a particular value, for example 1. In this case it would be 5.
The data represents worker shifts in a month. Be aware that there is a column named Month (an FK with values 1-12); I don't want to count that in the result.
The ID column is ALWAYS a 4-digit number.
One possibility is to use count(case when), but in the examples I've found there are only two or three columns, not 31, so the statement would be very long. Is there any other option to count it?
Thanks for any advice.
I'm going to strongly suggest that you abandon your current table design, and instead store one day per record, not per column. That is, use this design:
ID | Date | Value
5000 | 2021-01-01 | NULL
5000 | 2021-01-02 | NULL
5000 | 2021-01-03 | 1
5000 | 2021-01-04 | 1
5000 | 2021-01-05 | NULL
...
5000 | 2021-01-31 | 5
Then use this query:
SELECT
ID,
CONVERT(varchar(7), Date, 120),
COUNT(CASE WHEN Value = 1 THEN 1 END) AS one_cnt
FROM yourTable
GROUP BY
ID,
CONVERT(varchar(7), Date, 120);
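A quick sanity check of that query, run through Python's sqlite3 module with a subset of the normalized sample data. SQLite has no CONVERT, so strftime('%Y-%m', ...) stands in for CONVERT(varchar(7), Date, 120); both reduce a date to its year-month so the GROUP BY buckets per month:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shifts (ID int, Date text, Value int)")
# One row per worker per day (subset of the suggested normalized table).
conn.executemany("INSERT INTO shifts VALUES (?,?,?)", [
    (5000, "2021-01-01", None),
    (5000, "2021-01-02", None),
    (5000, "2021-01-03", 1),
    (5000, "2021-01-04", 1),
    (5000, "2021-01-05", None),
    (5000, "2021-01-06", 1),
    (5000, "2021-01-31", 5),
])

# COUNT(CASE WHEN Value = 1 THEN 1 END) counts only the rows where the
# CASE produces a non-NULL value, i.e. the days worked on shift 1.
result = conn.execute("""
    SELECT ID,
           strftime('%Y-%m', Date) AS ym,
           COUNT(CASE WHEN Value = 1 THEN 1 END) AS one_cnt
    FROM shifts
    GROUP BY ID, ym
""").fetchall()
print(result)  # → [(5000, '2021-01', 3)]
```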

Is there a way to update a table using random values in each row in a table with multiple primary keys?

We have a database in PostgreSQL ver 11.7 (via pgAdmin4) which has a table with a 3-column composite primary key. We want to update each row with a random value between 1 and 10.
We have found the desirable elements, but we need a way to update only 1 row at a time.
The conditions are as follows:
1) The subjects must be in a specific semester, which can be found in the table called "Course"
2) then we must find the amkas of students who are registered (register_status = 'approved')
and we need to update the table called "Register" with random exam_grades under those conditions, assuming the exam_grade is null.
UPDATE "Register"
SET exam_grade = (SELECT floor(random() * 10 + 1)::int)
WHERE course_code IN (
    SELECT DISTINCT course_code
    FROM "Course"
    WHERE typical_year = 2 AND typical_season = 'spring'
)
AND register_status = 'approved'
AND exam_grade IS NULL;
Just that. Somehow we need the update to be applied to only one row at a time, and then just use a for loop to take it one by one. If there is any other information I should include, please tell me.
So the tables are as follows
Register:
amka(PK) serial_number(PK) course_code(PK) exam_grade final_grade lab_grade register_status
19 5 ΕΝΕ 501 null null null proposed
13 15 ΤΗΛ 417 2 2 null fail
13 15 ΤΗΛ 411 10 8.4 null pass
47 22 ΜΠΔ 433 6 null 9 approved
:
Course:
course_code(PK) typical_year typical_season units ects weight obligatory
ΑΓΓ 201 2 winter 2 4 1 true
ΜΑΘ 201 1 winter 4 6 1.5 true
ΜΑΘ 208 1 winter 3 5 1.5 false
The results that I want are:
amka(PK) serial_number(PK) course_code(PK) exam_grade final_grade lab_grade register_status
19 5 ΕΝΕ 501 random null null approved
13 15 ΤΗΛ 417 random 2 null approved
13 15 ΤΗΛ 411 random 8.4 null approved
47 22 ΜΠΔ 433 random null 9 approved
a new random value in each row,
but I only get 1 number, which fills all of them.
I hope with this edit things became clearer.
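For what it's worth, the usual cause of every row getting the same number in PostgreSQL is the scalar subquery: an uncorrelated (SELECT random() ...) can be evaluated once up front and its single result reused for every row, whereas a bare floor(random() * 10 + 1)::int in the SET clause is evaluated per row. A minimal sketch of the per-row behaviour, using Python's sqlite3 and SQLite's RANDOM() as a stand-in (the table here is a simplified mock of "Register"):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Register (amka int, course_code text, exam_grade int)")
conn.executemany("INSERT INTO Register VALUES (?,?,NULL)",
                 [(i, "ENE 501") for i in range(50)])

# The expression in SET is evaluated once per affected row, so each row
# gets its own random grade between 1 and 10.  The PostgreSQL equivalent
# of the fix is simply:
#   UPDATE "Register" SET exam_grade = floor(random() * 10 + 1)::int WHERE ...
# i.e. without wrapping random() in a scalar subquery.
conn.execute("UPDATE Register SET exam_grade = ABS(RANDOM()) % 10 + 1"
             " WHERE exam_grade IS NULL")

grades = [g for (g,) in conn.execute("SELECT exam_grade FROM Register")]
print(min(grades), max(grades))
```

With 50 rows the grades will almost surely contain more than one distinct value, which is exactly what the scalar-subquery form fails to produce.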

how do you group query by hours?

I'm trying to query surveys completed every hour in a given day.
the survey table is something like this:
id(SERIAL) - userid(INTEGER) - description - timeTaken(timestamp with time zone)
3 ; 1; "some random description"; "2015-01-17 04:30:24.983576-05"
5 ; 2; "sample about x"; "2015-01-17 04:30:24.983576-05"
7 ; 3; "survey about ducks"; "2015-01-17 05:30:24.983576-05"
Basically, for a given day, let's say March 1st, I want to get all the survey rows grouped by the hour they were taken, i.e. 7 rows at 1pm, 3 at 2pm, etc. But I'm not sure if it's possible to group like this in pg or if I should do it client side.
EDIT: for the data above, ids 3 and 5 would be grouped under hour 4 and id 7 under hour 5. Basically I want to display the data separated by the hours they were completed in.
Thanks
You can use date_part to extract just the hour, which you can have in your group by clause. See http://www.postgresql.org/docs/9.4/static/functions-datetime.html.
By using the extract function in PostgreSQL.
For the following sample data:
id userid descp timetaken
-- ------ ----------------------- ---------------------------
1 1 some random description 2015-01-17 15:00:24.9835760
2 2 sample about x 2015-01-17 15:00:24.9835760
3 3 survey about ducks 2015-01-17 16:00:24.9835760
4 3 survey about ducks 2015-01-01 19:00:24.9835760
5 3 survey about ducks 2015-01-01 16:00:24.9835760
6 3 survey about ducks 2015-01-01 19:00:24.9835760
I need to get the survey_count per hour on date 2015-01-01:
select extract(hour from timetaken) survey_hour
,count(*) survey_count
from sur
where timetaken::date ='2015-01-01'
group by survey_hour
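The same grouping can be checked end-to-end with Python's sqlite3 module; SQLite has no extract, so strftime('%H', ...) stands in for extract(hour from ...):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sur (id int, userid int, descp text, timetaken text)")
conn.executemany("INSERT INTO sur VALUES (?,?,?,?)", [
    (1, 1, "some random description", "2015-01-17 15:00:24"),
    (2, 2, "sample about x",          "2015-01-17 15:00:24"),
    (3, 3, "survey about ducks",      "2015-01-17 16:00:24"),
    (4, 3, "survey about ducks",      "2015-01-01 19:00:24"),
    (5, 3, "survey about ducks",      "2015-01-01 16:00:24"),
    (6, 3, "survey about ducks",      "2015-01-01 19:00:24"),
])

# strftime('%H', ...) extracts the hour; date(...) truncates the
# timestamp to its day, mirroring timetaken::date in the answer.
result = conn.execute("""
    SELECT CAST(strftime('%H', timetaken) AS INTEGER) AS survey_hour,
           COUNT(*) AS survey_count
    FROM sur
    WHERE date(timetaken) = '2015-01-01'
    GROUP BY survey_hour
    ORDER BY survey_hour
""").fetchall()
print(result)  # → [(16, 1), (19, 2)]
```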

SELECT record based upon dates

Assuming data such as the following:
ID EffDate Rate
1 12/12/2011 100
1 01/01/2012 110
1 02/01/2012 120
2 01/01/2012 40
2 02/01/2012 50
3 01/01/2012 25
3 03/01/2012 30
3 05/01/2012 35
How would I find the rate for ID 2 as of 1/15/2012?
Or, the rate for ID 1 for 1/15/2012?
In other words, how do I do a query that finds the correct rate when the date falls between the EffDate for two records? (Rate should be for the date prior to the selected date).
Thanks,
John
How about this:
SELECT Rate
FROM Table1
WHERE ID = 1 AND EffDate = (
SELECT MAX(EffDate)
FROM Table1
WHERE ID = 1 AND EffDate <= '2012-01-15');
Here's an SQL Fiddle to play with. I assume here that the 'ID/EffDate' pair is unique across the whole table (the opposite wouldn't make sense).
SELECT TOP 1 Rate FROM the_table
WHERE ID=whatever AND EffDate <='whatever'
ORDER BY EffDate DESC
if I read you right.
(edited to suit my idea of ms-sql which I have no idea about).
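Both answers express the same idea: take the latest EffDate on or before the requested day. A quick check via Python's sqlite3 (SQLite has no TOP, so LIMIT 1 plays that role), using the question's data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rates (ID int, EffDate text, Rate int)")
conn.executemany("INSERT INTO rates VALUES (?,?,?)", [
    (1, "2011-12-12", 100), (1, "2012-01-01", 110), (1, "2012-02-01", 120),
    (2, "2012-01-01", 40),  (2, "2012-02-01", 50),
    (3, "2012-01-01", 25),  (3, "2012-03-01", 30), (3, "2012-05-01", 35),
])

def rate_as_of(conn, id_, day):
    # Latest effective date on or before the requested day wins.
    row = conn.execute("""
        SELECT Rate FROM rates
        WHERE ID = ? AND EffDate <= ?
        ORDER BY EffDate DESC
        LIMIT 1
    """, (id_, day)).fetchone()
    return row[0] if row else None

print(rate_as_of(conn, 2, "2012-01-15"))  # → 40
print(rate_as_of(conn, 1, "2012-01-15"))  # → 110
```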

T-SQL Determine Status Changes in History Table

I have an application which logs changes to records in the "production" table to a "history" table. The history table is basically a field for field copy of the production table, with a few extra columns like last modified date, last modified by user, etc.
This works well because we get a snapshot of the record anytime the record changes. However, it makes it hard to determine unique status changes to a record. An example is below.
BoxID StatusID SubStatusID ModifiedTime
1 4 27 2011-08-11 15:31
1 4 11 2011-08-11 15:28
1 4 11 2011-08-10 09:07
1 5 14 2011-08-09 08:53
1 5 14 2011-08-09 08:19
1 4 11 2011-08-08 14:15
1 4 9 2011-07-27 15:52
1 4 9 2011-07-27 15:49
1 2 8 2011-07-26 12:00
As you can see in the above table (data comes from the real system with other fields removed for brevity and security) BoxID 1 has had 9 changes to the production record. Some of those updates resulted in statuses being changed and some did not, which means other fields (those not shown) have changed.
I need to be able, in TSQL, to extract from this data the unique status changes. The output I am looking for, given the above input table, is below.
BoxID StatusID SubStatusID ModifiedTime
1 4 27 2011-08-11 15:31
1 4 11 2011-08-10 09:07
1 5 14 2011-08-09 08:19
1 4 11 2011-08-08 14:15
1 4 9 2011-07-27 15:49
1 2 8 2011-07-26 12:00
This is not as easy as grouping by StatusID and SubStatusID and taking the min(ModifiedTime) then joining back into the history table since statuses can go backwards as well (see StatusID 4, SubStatusID 11 gets set twice).
Any help would be greatly appreciated!
Does this work for you?
;WITH Boxes_CTE AS
(
    SELECT Boxid, StatusID, SubStatusID, ModifiedTime,
           ROW_NUMBER() OVER (PARTITION BY Boxid ORDER BY ModifiedTime) AS Sequence
    FROM Boxes
)
SELECT b1.Boxid, b1.StatusID, b1.SubStatusID, b1.ModifiedTime
FROM Boxes_CTE b1
LEFT OUTER JOIN Boxes_CTE b2
    ON b1.Boxid = b2.Boxid
   AND b1.Sequence = b2.Sequence + 1
WHERE b1.StatusID <> b2.StatusID
   OR b1.SubStatusID <> b2.SubStatusID
   OR b2.StatusID IS NULL
ORDER BY b1.ModifiedTime DESC;
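The CTE approach can be sanity-checked against the sample data via Python's sqlite3 module (SQLite 3.25+ supports ROW_NUMBER); the expected six status-change rows come straight out:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Boxes"
             " (BoxID int, StatusID int, SubStatusID int, ModifiedTime text)")
conn.executemany("INSERT INTO Boxes VALUES (?,?,?,?)", [
    (1, 4, 27, "2011-08-11 15:31"),
    (1, 4, 11, "2011-08-11 15:28"),
    (1, 4, 11, "2011-08-10 09:07"),
    (1, 5, 14, "2011-08-09 08:53"),
    (1, 5, 14, "2011-08-09 08:19"),
    (1, 4, 11, "2011-08-08 14:15"),
    (1, 4, 9,  "2011-07-27 15:52"),
    (1, 4, 9,  "2011-07-27 15:49"),
    (1, 2, 8,  "2011-07-26 12:00"),
])

# Number each box's rows in time order, then keep each row whose status
# differs from the immediately preceding row (or that has no predecessor).
result = conn.execute("""
    WITH Boxes_CTE AS (
        SELECT BoxID, StatusID, SubStatusID, ModifiedTime,
               ROW_NUMBER() OVER (PARTITION BY BoxID
                                  ORDER BY ModifiedTime) AS Seq
        FROM Boxes
    )
    SELECT b1.BoxID, b1.StatusID, b1.SubStatusID, b1.ModifiedTime
    FROM Boxes_CTE b1
    LEFT JOIN Boxes_CTE b2
      ON b1.BoxID = b2.BoxID AND b1.Seq = b2.Seq + 1
    WHERE b1.StatusID <> b2.StatusID
       OR b1.SubStatusID <> b2.SubStatusID
       OR b2.StatusID IS NULL
    ORDER BY b1.ModifiedTime DESC
""").fetchall()
for row in result:
    print(row)
```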
SELECT CurrentStaty.BoxID, CurrentStaty.StatusID, CurrentStaty.SubStatusID
FROM Staty CurrentStaty
INNER JOIN Staty PriorStaty
    ON CurrentStaty.BoxID = PriorStaty.BoxID
   AND PriorStaty.ModifiedTime =
       (SELECT MAX(p.ModifiedTime)
        FROM Staty p
        WHERE p.BoxID = CurrentStaty.BoxID
          AND p.ModifiedTime < CurrentStaty.ModifiedTime)
WHERE NOT (
    CurrentStaty.StatusID = PriorStaty.StatusID
    AND CurrentStaty.SubStatusID = PriorStaty.SubStatusID
)