create a new postgres table but with averages - postgresql

I'm a SQL newbie.
I have a postgres table that has datetime, and values. with subminute entries. I want to create a new table that takes the average of each minute and saves that instead of having subminute entries. so something like this:
1-12-07 12:29:56:00 2
1-12-07 12:29:56:16 3
1-12-07 12:29:56:34 3
1-12-07 12:29:56:58 4
1-12-07 12:30:00:00 7
to
1-12-07 12:29:00 3
1-12-07 12:30:00 #
Is there a way to do it in postgres?
The only solution I can think of is using a python script to do the trick. But that will take forever as I have a significant amount of data.

In order to do this you must group the results by your time truncated at the minutes:
insert into "NewTable"
select minute, avg(value) from (select date_trunc('minutes', date) as minute, value from "Time") as Average group by minute;

So this is going to be awful and inefficient and an incomplete answer, recommend writing a script (just my .02)
It should be something like this, since there is not a sqlFiddle provided in the question:
select average(my_count)
from
(
select my_date, my_time,my_count
from my_table
group by my_date, extract(minute from my_time)
)

Related

How to get a list of dates in Pervasive SQL

Our time & attendance database is a Pervasive/Actian Zen database. What I'm trying to do is create a query that just lists the next 14 days from today. I'll then cross apply this list of dates with employee records so that in effect I have a list of people/dates for the next 14 days.
I've done it with a recursive CTE on SQL server quite easily. I could also do it with a loop in SQL Server too but I can't figure it out with Pervasive SQL. Loops can only exist within Stored Procedures and triggers.
Looking around I thought that this code that I found and adapted might work, but it doesn't (and further research suggests that there isn't a recursive option within Pervasive at all.
WITH RECURSIVE cte_numbers(n, xDate)
AS (
SELECT
0, CURDATE() + 1
UNION ALL
SELECT
n+1,
dateAdd(day,n,xDate)
FROM
cte_numbers
WHERE n < 14
)
SELECT
xDate
FROM
cte_numbers;
I just wondered whether anyone could help me write an SQL query that gives me this list of dates, outside of a stored procedure.
When you create a table like this:
CREATE TABLE dates(d DATE PRIMARY KEY, x INTEGER);
And create a first record like this:
INSERT INTO dates VALUES ('2021-01-01',0);
Then you can use this statement which doubles the number of records in the table dates, every time it is executed. (so you need to run it a couple of times
When you run it 10 times the table dates will have 21 oktober 2023 as last date.
When you run it 12 times the last date will be 19 march 2032.
INSERT INTO dates
SELECT
DATEADD(DAY,m.m+1,d),
x+m.m+1
from dates
cross join (select max(x) m from dates) m
order by d;
Of course the column x can be deleted (optionally) with next statement, but you cannot add more records using the previous statement:
ALTER TABLE dates DROP COLUMN x;
Finally, to return the next 14 day from today:
SELECT d
FROM DATES
WHERE d BETWEEN CURDATE( ) AND DATEADD(DAY,13,CURDATE());

Calculate the sum of time column in PostgreSql

Can anyone suggest me, the easiest way to find summation of time field in POSTGRESQL. i just find a solution for MYSQL but i need the POSTGRESQL version.
MYSQL: https://stackoverflow.com/questions/3054943/calculate-sum-time-with-mysql
SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(timespent))) FROM myTable;
Demo Data
id time
1 1:23:23
2 4:00:23
3 9:23:23
Desired Output
14:47:09
What you want, is not possible. But you probably misunderstood the time type: it represents a precise time-point in a day. It doesn't make much sense, to add two (or more) times. f.ex. '14:00' + '14:00' = '28:00' (but there are no 28th hour in a day).
What you probably want, is interval (which represents time intervals; hours, minutes, or even years). sum() supports interval arguments.
If you use intervals, it's just that simple:
SELECT sum(interval_col) FROM my_table;
Although, if you stick to the time type (but you have no reason to do that), you can cast it to interval to calculate with it:
SELECT sum(time_col::interval) FROM my_table;
But again, the result will be interval, because time values cannot exceed the 24th hour in a day.
Note: PostgreSQL will even do the cast for you, so sum(time_col) should work too, but the result is interval in this case too.
I tried this solution on sql fieddle:
link
Table creation:
CREATE TABLE time_table (
id integer, time time
);
Insert data:
INSERT INTO time_table (id,time) VALUES
(1,'1:23:23'),
(2,'4:00:23'),
(3,'9:23:23')
query the data:
SELECT
sum(s.time)
FROM
time_table s;
If you need to calculate sum of some field, according another field, you can do this:
select
keyfield,
sum(time_col::interval) totaltime
FROM myTable
GROUP by keyfield
Output example:
keyfield; totaltime
"Gabriel"; "10:00:00"
"John"; "36:00:00"
"Joseph"; "180:00:00"
Data type of totaltime is interval.

Count number of days by ignoring times

I am trying to calculate the number of days that an event category has occurred in T-SQL in SSMS 2008. How do I write this expression?
This value is stored as a datetime, but I want to count the day portion only. For example, if my values were:
2013-01-05 19:20:00.000
2013-01-06 17:20:00.000
2013-01-06 18:20:00.000
2013-01-06 19:20:00.000
2013-01-03 16:15:00.000
2013-01-04 12:55:00.000
Then although there are 6 unique records listed above, I would want to count this as only 4, since there are 3 records on 1/6/2013. Make sense?
This is what I'm trying now that doesn't work:
select
count(s.date_value)
From
table_A s
Cast the datetime value as a date. Also if you want only unique values, use DISTINCT:
SELECT COUNT(DISTINCT CAST(date_value AS date)) FROM table_A

PostgreSQL amount for each day summed up in weeks

I've been trying to find a solution to this challenge all day.
I've got a table:
id | amount | type | date | description | club_id
-------+---------+------+----------------------------+---------------------------------------+---------+--------
783 | 10000 | 5 | 2011-08-23 12:52:19.995249 | Sign on fee | 7
The table has a lot more data than this.
What I'm trying to do is get the sum of amount for each week, given a specific club_id.
The last thing I ended up with was this, but it doesn't work:
WITH RECURSIVE t AS (
SELECT EXTRACT(WEEK FROM date) AS week, amount FROM club_expenses WHERE club_id = 20 AND EXTRACT(WEEK FROM date) < 10 ORDER BY week
UNION ALL
SELECT week+1, amount FROM t WHERE week < 3
)
SELECT week, amount FROM t;
I'm not sure why it doesn't work, but it complains about the UNION ALL.
I'll be off to bed in a minute, so I won't be able to see any answers before tomorrow (sorry).
I hope I've described it adequately.
Thanks in advance!
It looks to me like you are trying to use the UNION ALL to retrieve a subset of the first part of the query. That won't work. You have two options. The first is to use user defined functions to add behavior as you need it and the second is to nest your WITH clauses. I tend to prefer the former, but you may be preferring the latter.
To do the functions/table methods approach you create a function which accepts as input a row from a table and does not hit the table directly. This provides a bunch of benefits including the ability to easily index the output. Here the function would look like:
CREATE FUNCTION week(club_expense) RETURNS int LANGUAGE SQL IMMUTABLE AS $$
select EXTRACT(WEEK FROM $1.date)
$$;
Now you have a usable macro which can be used where you would use a column. You can then:
SELECT c.week, sum(amount) FROM club_expense c
GROUP BY c.week;
Note that the c. before week is not optional. The parser converts that into week(c). If you want to limit this to a year, you can do the same with years.
This is a really neat, useful feature of Postgres.

Select unique values sorted by date

I am trying to solve an interesting problem. I have a table that has, among other data, these columns (dates in this sample are shown in European format - dd/mm/yyyy):
n_place_id dt_visit_date
(integer) (date)
========== =============
1 10/02/2012
3 11/03/2012
4 11/05/2012
13 14/06/2012
3 04/10/2012
3 03/11/2012
5 05/09/2012
13 18/08/2012
Basically, each place may be visited multiple times - and the dates may be in the past (completed visits) or in the future (planned visits). For the sake of simplicity, today's visits are part of planned future visits.
Now, I need to run a select on this table, which would pull unique place IDs from this table (without date) sorted in the following order:
Future visits go before past visits
Future visits take precedence in sorting over past visits for the same place
For future visits, the earliest date must take precedence in sorting for the same place
For past visits, the latest date must take precedence in sorting for the same place.
For example, for the sample data shown above, the result I need is:
5 (earliest future visit)
3 (next future visit into the future)
13 (latest past visit)
4 (previous past visit)
1 (earlier visit in the past)
Now, I can achieve the desired sorting using case when in the order by clause like so:
select
n_place_id
from
place_visit
order by
(case when dt_visit_date >= now()::date then 1 else 2 end),
(case when dt_visit_date >= now():: date then 1 else -1 end) * extract(epoch from dt_visit_date)
This sort of does what I need, but it does contain repeated IDs, whereas I need unique place IDs. If I try to add distinct to the select statement, postgres complains that I must have the order by in the select clause - but then the unique won't be sensible any more, as I have dates in there.
Somehow I feel that there should be a way to get the result I need in one select statement, but I can't get my head around how to do it.
If this can't be done, then, of course, I'll have to do the whole thing in the code, but I'd prefer to have this in one SQL statement.
P.S. I am not worried about the performance, because the dataset I will be sorting is not large. After the where clause will be applied, it will rarely contain more than about 10 records.
With DISTINCT ON you can easily show additional columns of the row with the resulting n_place_id:
SELECT n_place_id, dt_visit_date
FROM (
SELECT DISTINCT ON (n_place_id) *
,dt_visit_date < now()::date AS prio -- future first
,#(now()::date - dt_visit_date) AS diff -- closest first
FROM place_visit
ORDER BY n_place_id, prio, diff
) x
ORDER BY prio, diff;
Effectively I pick the row with the earliest future date (including "today") per n_place_id - or latest date in the past, failing that.
Then the resulting unique rows are sorted by the same criteria.
FALSE sorts before TRUE
The "absolute value" # helps to sort "closest first"
More on the Postgres specific DISTINCT ON in this related answer.
Result:
n_place_id | dt_visit_date
------------+--------------
5 | 2012-09-05
3 | 2012-10-04
13 | 2012-08-18
4 | 2012-05-11
1 | 2012-02-10
Try this
select n_place_id
from
(
select *,
extract(epoch from (dt_visit_date - now())) as seconds,
1 - SIGN(extract(epoch from (dt_visit_date - now())) ) as futurepast
from #t
) v
group by n_place_id
order by max(futurepast) desc, min(abs(seconds))