I have this situation: a table named balance:
id -> Integer Auto Increment not null
balance -> numeric(19) not null
date -> datetime default now() not null
id | balance | date
1  | 100     | 2019-09-12 16:15:29.091720
2  | 99      | 2019-09-12 16:15:33.404119
3  | 98      | 2019-09-12 16:15:33.412087
4  | 97      | 2019-09-12 16:15:33.425252
5  | 96      | 2019-09-12 16:15:33.442137
6  | 95      | 2019-09-12 16:15:33.513825  -> time A
7  | 94      | 2019-09-12 16:15:33.444407  -> time B
Then I insert into the table with just one column, balance. Example insert, run from a threaded process:
INSERT INTO balance(balance) VALUES(100)
Time B is earlier than time A, even though the id goes from 6 to 7. Aren't the rows processed in id order?
For example, id 6 is inserted first and then id 7, so shouldn't 6 have an earlier time than 7?
Any clue why this happens?
If the transaction that inserted the row with id = 7 started earlier than the one that inserted id = 6, then this is possible. Note that the time when the transaction started is what matters, not the time when the insert was executed as part of that transaction.
As documented in the manual, now() returns the time at the start of the transaction (it's the same as transaction_timestamp()), not the "current" time.
If you need the actual current time, you should change your default value to clock_timestamp().
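For example, assuming the table and column names from the question:

ALTER TABLE balance
    ALTER COLUMN "date" SET DEFAULT clock_timestamp();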
Related
I have a table which contains id (unique), a system number (system_id), a timestamp time_change and a status. When status is 1 the system is unavailable; when it is 0, it is available:
CREATE temp TABLE temp_data_test (
id int8 NULL,
system_id int8 NULL,
time_change timestamptz NULL,
status int4 NULL
);
INSERT INTO temp_data_test (id, system_id, time_change, status) VALUES
(53,1,'2022-04-02 13:57:07.000',1),
(54,1,'2022-04-02 14:10:26.000',0),
(55,1,'2022-04-02 14:28:45.000',1),
(56,1,'2022-04-02 14:32:19.000',0),
(57,1,'2022-04-05 03:20:18.000',1),
(58,3,'2022-04-05 03:21:18.000',1),
(59,2,'2022-04-05 03:21:22.000',1),
(60,2,'2022-04-06 02:27:15.000',0),
(61,3,'2022-04-06 02:27:15.000',0),
(62,1,'2022-04-06 02:28:17.000',0);
And a table date_dict with dates (just one column, date_of_day date).
As you can see, the status doesn't change every day. But I need statistics for each calendar day for each system.
So for days that are not in the table I need to add 2 rows for each system. The first has timestamp 'date 00:00:00' and the status opposite to the first nearest status with a higher date (it may not be on that day, but on the next one).
And the second has timestamp 'date 23:59:59' with the status opposite to the nearest lower date (today, yesterday, etc.).
For this table I need something like
id  system_id  time_change             status
63  1          '2022-04-02 00:00:00'   0
64  1          '2022-04-02 23:59:59'   1
65  1          '2022-04-03 00:00:00'   0
66  1          '2022-04-03 23:59:59'   1   -- because the system was available from April 2
67  1          '2022-04-04 00:00:00'   0
68  1          '2022-04-04 23:59:59'   1
69  1          '2022-04-05 00:00:00'   0
70  1          '2022-04-05 23:59:59'   0   -- because it became unavailable at 2022-04-05 03:20:18
And so on for the other systems.
I suppose it can be divided into 2 parts: the first row and the second (00:00:00 and 23:59:59). My attempts led to NULLs in the dates, and I tried to group by date, which didn't work as far as I can see.
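A rough starting point, assuming date_dict.date_of_day covers the whole reporting period, would be to build the full day-per-system grid first and then derive the two boundary rows from it:

SELECT d.date_of_day, s.system_id
FROM date_dict d
CROSS JOIN (SELECT DISTINCT system_id FROM temp_data_test) s
ORDER BY s.system_id, d.date_of_day;
-- each (date_of_day, system_id) pair would then get its 00:00:00 and 23:59:59 row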
I am working on a query that returns the next 7 days' worth of data every time an event happens, indicated by "where event = 1". The goal is then to group all the data by user id and perform aggregate functions on it after the event happens; the event is encoded as binary [0, 1].
So far I have been attempting to use nested select statements to structure the data how I would like it, but the window functions are starting to restrict me. I am now thinking a self join could be more appropriate, but I need help constructing such a query.
The query currently first creates daily aggregate values grouped by user and date (3rd-level nested select). Then the 2nd level sums "value_x" to obtain an aggregate value grouped by the user. The 1st-level nested select then uses the lead function, over and partitioned by each user, to grab the next row's value, which acts as selecting the next day's value when event = 1. Lastly, the outer select uses an aggregate function to calculate the average "sum_next_day_value_after_event" grouped by user and where event = 1. Put together, where event = 1, the query returns the avg(value_x) of the next row's total value_x.
However, this doesn't follow my time rule: "where event = 1", return the next 7 days' worth of data after the event happens. If there isn't 7 days' worth of data, then return whatever data is <= 7 days out. Yes, I currently only have one lead with an offset of 1, but you could just add 6 more of these functions to grab the next 6 rows. The lead function, however, just grabs the next row without regard to date, so theoretically the next row's "value_x" could actually be 15 days after "event = 1". Also, as can be seen below in the data table, a user may have more than one row per day.
Here is the following query I have so far:
select
f.user_id
avg(f.sum_next_day_value_after_event) as sum_next_day_values
from (
select
bld.user_id,
lead(bld.value_x, 1) over(partition by bld.user_id order by bld.daily) as sum_next_day_value_after_event
from (
select
l.user_id,
l.daily,
sum(l.value_x) as sum_daily_value_x
from (
select
user_id, value_x, date_part('day', day_ts) as daily
from table_1
group by date_part('day', day_ts), user_id, value_x) l
group by l.user_id, l.day_ts
order by l.user_id) bld) f
group by f.user_id
Below is a snippet of the data from table_1:
user_id  day_ts         value_x  event
50       4/2/21 07:37   25       0
50       4/2/21 07:42   45       0
50       4/2/21 09:14   67       1
50       4/5/21 10:09   8        0
50       4/5/21 10:24   75       0
50       4/8/21 11:08   34       0
50       4/15/21 13:09  32       1
50       4/16/21 14:23  12       0
50       4/29/21 14:34  90       0
55       4/4/21 15:31   12       0
55       4/5/21 15:23   34       0
55       4/17/21 18:58  32       1
55       4/17/21 19:00  66       1
55       4/18/21 19:57  54       0
55       4/23/21 20:02  34       0
55       4/29/21 20:39  57       0
55       4/30/21 21:46  43       0
Technical details:
PostgreSQL, supported by EDB, version = 14.1
pgAdmin4, version 5.7
Thanks for the help!
"The query currently first creates daily aggregate values"
I don't see any aggregate function in your first query, so the GROUP BY clause is useless.
select
user_id, value_x, date_part('day', day_ts) as daily
from table_1
group by date_part('day', day_ts), user_id, value_x
could be simplified as
select
user_id, value_x, date_part('day', day_ts) as daily
from table_1
which in turn provides no real added value, so this first query could be removed and the second query would become:
select user_id
, date_part('day', day_ts) as daily
, sum(value_x) as sum_daily_value_x
from table_1
group by user_id, date_part('day', day_ts)
The order by user_id clause can also be removed at this step.
Now, if you want to calculate the average value of sum_daily_value_x over the period of 7 days after the event (I'm referring to the avg() function in your top query), you can use avg() as a window function restricted to that 7-day period, casting day_ts to a date so that the range frame can take a '7 days' offset:
select f.user_id
     , f.daily
     , avg(f.sum_daily_value_x) over (partition by f.user_id
                                      order by f.daily
                                      range between current row and '7 days' following) as sum_next_day_values
  from (
        select user_id
             , day_ts::date as daily
             , sum(value_x) as sum_daily_value_x
          from table_1
         group by user_id, day_ts::date
       ) AS f
The partition by f.user_id clause in the window function keeps each user's 7-day frame separate, so rows from different users are not mixed together when their dates interleave.
You can replace the avg() window function with any other one, for instance sum(), which would better fit the alias sum_next_day_values.
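If the result should then be limited to the days on which an event actually happened ("where event = 1"), one possible sketch is to carry an event flag through the daily aggregation and filter on it outside the windowed query (the had_event flag and the extra outer query are my own additions, not something taken from the original query):

select user_id, daily, next_7_day_values
  from (
        select f.user_id
             , f.daily
             , f.had_event
             , avg(f.sum_daily_value_x) over (partition by f.user_id
                                              order by f.daily
                                              range between current row and '7 days' following) as next_7_day_values
          from (
                select user_id
                     , day_ts::date as daily
                     , sum(value_x) as sum_daily_value_x
                     , max(event)   as had_event   -- 1 if any event happened that day
                  from table_1
                 group by user_id, day_ts::date
               ) as f
       ) as g
 where had_event = 1;

The filter sits outside the windowed query so that non-event days still contribute to each 7-day frame.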
I have a little problem counting cells with a particular value in one row in MSSMS.
The table looks like:
ID   | Month | 1    | 2    | 3 | 4 | 5    | 6 | 7 | 8    | 9    | 10 | 11 | 12 | 13 | 14 | 15   | 16   | 17 | 18 | 19 | 20 | 21 | 22   | ... | 31
5000 | 1     | null | null | 1 | 1 | null | 1 | 1 | null | null | 2  | 2  | 2  | 2  | 2  | null | null | 3  | 3  | 3  | 3  | 3  | null | ... | 1
I need to count how many cells in one row have a particular value, for example 1. In this case it would be 5.
The data represents worker shifts in a month. Be aware that there is a column named Month (an FK with values 1-12); I don't want to count that in the result.
The ID column is ALWAYS a 4-digit number.
One possibility is to use count(case when ...), but in the examples I have found there are only two or three columns, not 31, so the statement would be very long. Is there any other option to count it?
Thanks for any advice.
I'm going to strongly suggest that you abandon your current table design, and instead store one day of the month per record, not per column. That is, use this design:
ID | Date | Value
5000 | 2021-01-01 | NULL
5000 | 2021-01-02 | NULL
5000 | 2021-01-03 | 1
5000 | 2021-01-04 | 1
5000 | 2021-01-05 | NULL
...
5000 | 2021-01-31 | 5
Then use this query:
SELECT
ID,
CONVERT(varchar(7), Date, 120),
COUNT(CASE WHEN Value = 1 THEN 1 END) AS one_cnt
FROM yourTable
GROUP BY
ID,
CONVERT(varchar(7), Date, 120);
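If changing the design is not possible right away, a rough sketch against the current wide layout (assuming the day columns are literally named [1] through [31], which the question doesn't show) is to unpivot them with CROSS APPLY and then count:

-- Unpivot the 31 day columns into rows, then count how many equal 1 per (ID, Month).
SELECT t.ID,
       t.Month,
       COUNT(CASE WHEN d.Value = 1 THEN 1 END) AS one_cnt
FROM yourTable AS t
CROSS APPLY (VALUES
    (t.[1]),  (t.[2]),  (t.[3]),  (t.[4]),  (t.[5]),  (t.[6]),  (t.[7]),  (t.[8]),
    (t.[9]),  (t.[10]), (t.[11]), (t.[12]), (t.[13]), (t.[14]), (t.[15]), (t.[16]),
    (t.[17]), (t.[18]), (t.[19]), (t.[20]), (t.[21]), (t.[22]), (t.[23]), (t.[24]),
    (t.[25]), (t.[26]), (t.[27]), (t.[28]), (t.[29]), (t.[30]), (t.[31])
) AS d(Value)
GROUP BY t.ID, t.Month;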
We have a database in PostgreSQL version 11.7, accessed through pgAdmin4, which has a table whose primary key is made up of 3 columns. We want to update each row with a random value between 1 and 10.
We have found the desired rows, but we need a way to update only one row at a time.
The conditions are as follows:
1) The subjects must be in a specific semester, which can be found in the table called "Course".
2) Then we must find the amkas of students who are registered (register_status = 'approved'),
and we need to update the table called "Register" with random exam_grades under those conditions, assuming the exam_grade is null.
Update "Register" SET exam_grade = (SELECT floor(random() * 10 + 1)::int) WHERE course_code IN
(SELECT DISTINCT course_code FROM "Course" WHERE typical_year = 2 AND typical_season = 'spring' ) AND register_status = 'approved' AND exam_grade is null ;
Just that. Somehow we need the update to be applied to only one row at a time, and then use a loop to take the rows one by one. If there is any other information I should include, please tell me.
So the tables are as follows
Register:
amka(PK) serial_number(PK) course_code(PK) exam_grade final_grade lab_grade register_status
19 5 ΕΝΕ 501 null null null proposed
13 15 ΤΗΛ 417 2 2 null fail
13 15 ΤΗΛ 411 10 8.4 null pass
47 22 ΜΠΔ 433 6 null 9 approved
:
Course:
course_code(PK) typical_year typical_season units ects weight obligatory
ΑΓΓ 201 2 winter 2 4 1 true
ΜΑΘ 201 1 winter 4 6 1.5 true
ΜΑΘ 208 1 winter 3 5 1.5 false
The results that I want are:
amka(PK) serial_number(PK) course_code(PK) exam_grade final_grade lab_grade register_status
19 5 ΕΝΕ 501 random null null approved
13 15 ΤΗΛ 417 random 2 null approved
13 15 ΤΗΛ 411 random 8.4 null approved
47 22 ΜΠΔ 433 random null 9 approved
I want a new random value in each row, but I only get one number, which fills all of them.
I hope that with this edit things became clearer.
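One likely cause, worth checking: the scalar subquery (SELECT floor(random() * 10 + 1)::int) does not reference the row being updated, so PostgreSQL can evaluate it just once and reuse that single value for every row. Calling random() directly in the SET clause makes it evaluate for each updated row; a sketch of that variant:

UPDATE "Register"
SET exam_grade = floor(random() * 10 + 1)::int
WHERE course_code IN (SELECT course_code
                      FROM "Course"
                      WHERE typical_year = 2 AND typical_season = 'spring')
  AND register_status = 'approved'
  AND exam_grade IS NULL;

With that change, a per-row loop should not be needed for the randomisation itself.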
I am having a problem with Postgres insisting that the primary key be part of the GROUP BY clause, the frustration being that the syntax I have works perfectly with SQLite.
The error message is ‘column "attendees.id" must appear in the GROUP BY clause or be used in an aggregate function’.
My question is: can anyone provide the fix such that Postgres will produce the output that SQLite does? (SQL statement, schema, data and desired output provided below.)
My table schema:
CREATE TABLE Attendees
(
id INTEGER PRIMARY KEY AUTOINCREMENT,
activitynumber int,
activitydate DATE,
club int,
usernum int,
username ,
activityname ,
cost int,
received int,
update_time DATETIME,
CONSTRAINT FK_activitynumber
FOREIGN KEY (activitynumber)
REFERENCES Activity(activitynumber),
CONSTRAINT FK_club
FOREIGN KEY (club)
REFERENCES "club"(clubnum)
CONSTRAINT FK_usernum
FOREIGN KEY (usernum)
REFERENCES "users"(usernum)
);
CREATE INDEX ix_attendees_group ON Attendees
(club,activitydate,activitynumber);
(Please note, I added the index to try and solve the problem but it didn’t).
activities = Attendees.query.filter_by(club=current_user.club).order_by(Attendees.activitydate.desc()).group_by(Attendees.activitydate,Attendees.activitynumber)
The data I have is:-
id activitynumber activitydate club usernum username activityname cost received update_time
1 3 15-10-19 1002 1002000001 susan Monday Swim 200 0 58:43.8
2 3 15-10-19 1002 1002000002 triblokerich Monday Swim 200 0 58:49.9
3 4 17-10-19 1002 1002000001 susan Thursday Swim 200 0 59:04.5
4 4 17-10-19 1002 1002000015 craig Thursday Swim 200 0 59:09.9
5 6 16-10-19 1002 1002000001 susan Dunton 200 0 00:06.5
6 6 16-10-19 1002 1002000002 triblokerich Dunton 200 0 00:17.6
7 3 16-10-19 1002 1002000001 susan Monday Swim 300 0 58:28.1
8 3 16-10-19 1002 1002000002 triblokerich Monday Swim 300 0 58:33.7
9 3 16-10-19 1002 1002000015 craig Monday Swim 300 0 58:37.7
10 3 16-10-19 1002 1002000016 craig2 Monday Swim 300 0 01:41.8
11 3 19-10-19 1002 1002000001 susan Monday Swim 300 0 07:56.4
12 3 19-10-19 1002 1002000002 triblokerich Monday Swim 300 0 08:04.8
and the output I am trying to get (which works with SQLite) is:
Date Activity
2019-10-19 Monday Swim
2019-10-17 Thursday Swim
2019-10-16 Monday Swim
2019-10-16 Dunton
2019-10-15 Monday Swim
If I understand SQLAlchemy's syntax correctly, you're looking for something like this; hopefully I'm not far off (broken into lines to show it better):
activities = Attendees.query\
.with_entities(Attendees.activitydate, Attendees.activityname)\
.filter_by(club=current_user.club)\
.order_by(Attendees.activitydate.desc())\
.group_by(Attendees.activitydate,Attendees.activityname)
You always need to have the not-grouped-by columns in the result set as aggregates (min, max, sum, etc.) because there's no other way to show them. Which ID, for example, would be taken from a group of five rows? SQLite may be one of the lenient databases that just hands back an arbitrary row, but PostgreSQL is very strict about the query and will never do this.
In this query the result is defined to contain only the columns that are in the group_by, since those are what you need. It also uses activityname as a grouped column rather than activitynumber, since they need to match. This way there are no un-aggregated, non-grouped columns in the result, only the grouped ones, and the query should work fine.
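For reference, the plain SQL this roughly maps to (a sketch, using the club value 1002 from the sample data as a stand-in for current_user.club) is:

SELECT activitydate, activityname
FROM attendees
WHERE club = 1002
GROUP BY activitydate, activityname
ORDER BY activitydate DESC;

Every selected column appears in the GROUP BY, which is exactly what PostgreSQL's rule requires.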