Datatype for time period - postgresql

I would like to insert a time period into a table column. For example: I've got a table with 7 columns, one for each day of the week. Is there a way to create a datatype that represents a period of an employee's work hours, say from 1 AM to 8 AM (or in the 24-hour system)?
If there is not, how should I deal with it?

If you're building a table of something like business hours, you're probably better off with fewer columns.
create table business_hours (
day_of_week char(3) not null unique
check (day_of_week in ('Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun')),
start_time time not null,
end_time time not null,
check (end_time > start_time)
);
insert into business_hours values
('Mon', '09:00', '17:00'),
('Tue', '09:00', '17:00'),
('Wed', '09:00', '17:00'),
('Thu', '09:00', '17:00'),
('Fri', '09:00', '17:00'),
('Sat', '11:00', '15:00');
You can join that table with a calendar table (or create a calendar table on the fly with generate_series()) to produce the business hours for the current week.
select c.cal_date, bh.*
from calendar c
inner join business_hours bh on bh.day_of_week = c.day_of_week
where cal_date between '2013-01-20' and '2013-01-27'
order by cal_date
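If you don't want to maintain a calendar table at all, here is a minimal sketch of the on-the-fly variant with generate_series() (assuming English day abbreviations, which is what to_char() with 'Dy' produces):
select g.d::date as cal_date, bh.*
from generate_series('2013-01-20'::date, '2013-01-27'::date, interval '1 day') as g(d)
inner join business_hours bh on bh.day_of_week = to_char(g.d, 'Dy')
order by g.d;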
Arranging that data into a matrix is a presentation-level issue. Use application code to do that.
The simplest calendar table you can use for this kind of query has just two columns. (Mine uses English. Adjust abbreviations as you like, but they must match the abbreviations in the table "business_hours".)
CREATE TABLE calendar
(
cal_date date NOT NULL,
day_of_week character(3) NOT NULL,
CONSTRAINT cal_pkey PRIMARY KEY (cal_date),
CONSTRAINT cal_dow_values CHECK (day_of_week =
CASE
WHEN date_part('dow', cal_date) = 0 THEN 'Sun'
WHEN date_part('dow', cal_date) = 1 THEN 'Mon'
WHEN date_part('dow', cal_date) = 2 THEN 'Tue'
WHEN date_part('dow', cal_date) = 3 THEN 'Wed'
WHEN date_part('dow', cal_date) = 4 THEN 'Thu'
WHEN date_part('dow', cal_date) = 5 THEN 'Fri'
WHEN date_part('dow', cal_date) = 6 THEN 'Sat'
ELSE NULL
END)
);
CREATE INDEX ON calendar (day_of_week);
There are a lot of different ways to populate a calendar table: a spreadsheet, a PostgreSQL function, a scripting language generating a CSV file, etc.
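For example, a minimal sketch that fills it for one year with generate_series() and to_char() (again assuming English 'Dy' abbreviations, matching the check constraint above):
INSERT INTO calendar (cal_date, day_of_week)
SELECT g.d::date, to_char(g.d, 'Dy')
FROM generate_series('2013-01-01'::date, '2013-12-31'::date, interval '1 day') AS g(d);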

Constraints and dates - postgresql

I'm trying to make a graduation year column that is only allowed to be a year (4 digits) and is constrained to be at least 10 years after the year of dob.
Something like:
dob DATE CHECK (dob < CURRENT_TIMESTAMP),
graduation_year DATE CHECK (graduation_year >= dob + 10)
How would I get this to work?
Because two columns are involved, the constraint becomes a table constraint and can't be attached to a single column. See the documentation: https://www.postgresql.org/docs/current/ddl-constraints.html#DDL-CONSTRAINTS-CHECK-CONSTRAINTS
You must also specify that you are adding years:
CREATE TABLE myTable (
dob DATE CHECK (dob < CURRENT_TIMESTAMP),
graduation_year DATE,
CHECK (graduation_year >= dob + interval '10 years')
);
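With that in place, offending rows are rejected at insert time. A quick illustration with hypothetical values:
INSERT INTO myTable VALUES ('2000-05-01', '2018-06-30'); -- accepted
INSERT INTO myTable VALUES ('2010-05-01', '2015-06-30'); -- rejected: less than 10 years after dob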

Postgresql group transactions by category with a column for each month

Here is the schema I'm working with
-- Table Definition ----------------------------------------------
CREATE TABLE transactions (
id BIGSERIAL PRIMARY KEY,
date date,
amount double precision,
category character varying,
full_category character varying,
transaction_id character varying,
created_at timestamp(6) without time zone NOT NULL,
updated_at timestamp(6) without time zone NOT NULL
);
-- Indices -------------------------------------------------------
CREATE UNIQUE INDEX transactions_pkey ON transactions(id int8_ops);
I would like to group the data with the following columns:
Category, January Total, February Total, March Total, and so on for every month.
This is as far as I've got:
SELECT
category, sum(amount) as january_total
from transactions
where category NOT IN ('Transfer', 'Payment', 'Deposit', 'Income')
AND date >= '2021-01-01' AND date < '2021-02-01'
group by category
Order by january_total asc
How do I add a column for every month to this output?
Here is the solution I came up with:
SELECT
category,
SUM(CASE WHEN date >= '2021-01-01' AND date <'2021-02-01' THEN amount ELSE 0.00 END) AS january,
SUM(CASE WHEN date >= '2021-02-01' AND date <'2021-03-01' THEN amount ELSE 0.00 END) AS february,
SUM(CASE WHEN date >= '2021-03-01' AND date <'2021-04-01' THEN amount ELSE 0.00 END) AS march,
SUM(CASE WHEN date >= '2021-04-01' AND date <'2021-05-01' THEN amount ELSE 0.00 END) AS april
from transactions
where category NOT IN ('Transfer', 'Payment', 'Deposit', 'Income')
GROUP BY category
Order by category
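As an aside, the same conditional aggregation can be written with PostgreSQL's FILTER clause (available since 9.4), which some find more readable. A sketch of the first two months only; note that FILTER yields NULL instead of 0 for months with no matching rows, hence the COALESCE:
SELECT
category,
COALESCE(SUM(amount) FILTER (WHERE date >= '2021-01-01' AND date < '2021-02-01'), 0) AS january,
COALESCE(SUM(amount) FILTER (WHERE date >= '2021-02-01' AND date < '2021-03-01'), 0) AS february
FROM transactions
WHERE category NOT IN ('Transfer', 'Payment', 'Deposit', 'Income')
GROUP BY category
ORDER BY category;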

CROSSTAB PostgreSQL - Alternative for PIVOT in Oracle

I'm migrating an Oracle PIVOT query to PostgreSQL crosstab.
create table x_c (cntry numeric, week numeric, year numeric, days text, day text);
insert into x_c values(1,15,2015,'DAY1','MON');
...
insert into x_c values(1,15,2015,'DAY7','SUN');
insert into x_c values(2,15,2015,'DAY1','MON');
...
insert into x_c values(4,15,2015,'DAY7','SUN');
I have 4 weeks with 28 rows like this in a table. My Oracle query looks like this:
SELECT * FROM(select * from x_c)
PIVOT (MIN(DAY) FOR (DAYS) IN
('DAY1' AS DAY1 ,'DAY2' DAY2,'DAY3' DAY3,'DAY4' DAY4,'DAY5' DAY5,'DAY6' DAY6,'DAY7' DAY7 ));
Result:
cntry|week|year|day1|day2|day3|day4|day5|day6|day7|
---------------------------------------------------
1 | 15 |2015| MON| TUE| WED| THU| FRI| SAT| SUN|
...
4 | 18 |2015| MON| ...
Now I have written a Postgres crosstab query like this:
select *
from crosstab('select cntry,week,year,days,min(day) as day
from x_c
group by cntry,week,year,days'
,'select distinct days from x_c order by 1'
) as (cntry numeric,week numeric,year numeric
,day1 text,day2 text,day3 text,day4 text, day5 text,day6 text,day7 text);
I'm getting only one row as output:
1|17|2015|MON|TUE| ... -- only this row is coming
What am I doing wrong?
ORDER BY was missing in your original query. The manual:
In practice the SQL query should always specify ORDER BY 1,2 to ensure that the input rows are properly ordered, that is, values with the same row_name are brought together and correctly ordered within the row.
More importantly (and more tricky), crosstab() requires exactly one row_name column. Detailed explanation in this closely related answer:
Crosstab splitting results due to presence of unrelated field
The solution you found is to nest multiple columns in an array and later unnest again. That's needlessly expensive, error prone and limited (only works for columns with identical data types or you need to cast and possibly lose proper sort order).
Instead, generate a surrogate row_name column with rank() or dense_rank() (rnk in my example):
SELECT cntry, week, year, day1, day2, day3, day4, day5, day6, day7
FROM crosstab (
'SELECT dense_rank() OVER (ORDER BY cntry, week, year)::int AS rnk
, cntry, week, year, days, day
FROM x_c
ORDER BY rnk, days'
, $$SELECT unnest('{DAY1,DAY2,DAY3,DAY4,DAY5,DAY6,DAY7}'::text[])$$
) AS ct (rnk int, cntry int, week int, year int
, day1 text, day2 text, day3 text, day4 text, day5 text, day6 text, day7 text)
ORDER BY rnk;
I use the data type integer for the output columns cntry, week, and year because that seems to be the appropriate (cheaper) type. You can also use numeric like you had it.
Basics for crosstab queries here:
PostgreSQL Crosstab Query
I got this figured out from http://www.postgresonline.com/journal/categories/24-tablefunc
select year_wk_cntry.t[1],year_wk_cntry.t[2],year_wk_cntry.t[3],day1,day2,day3,day4,day5,day6,day7
from crosstab('select ARRAY[cntry::numeric,week,year] as t,days,min(day) as day
from x_c group by cntry,week,year,days order by 1,2
','select distinct days from x_c order by 1')
as year_wk_cntry (t numeric[],day1 text,day2 text,day3 text,
day4 text, day5 text,day6 text,day7 text);
thanks!!

How to date trunc in HANA

I have a query to get the count of buses that travel less than 100 km per day. This is the query I use in PostgreSQL:
select day,count(*)as bus_count from(
SELECT date_trunc('hour',start_time)::timestamp::date as day,bus_id,sum(distance_two_points) as distance
FROM public.datatable where start_time >= '2015-09-05 00:00:00' and start_time <= '2015-09-05 23:59:59'
group by day,bus_id
) as A where distance<=250000 group by day
The inner query returns a result like this:
day bus_id distance
___ ________ _________
"2015-09-05 00:00:00" 1 523247
"2015-09-05 00:00:00" 2 135114
"2015-09-05 00:00:00" 3 178560
"2015-09-05 00:00:00" 4 400071
"2015-09-05 00:00:00" 5 312832
"2015-09-05 00:00:00" 6 237075
Now I want to use the same query (achieving the same results) in SAP HANA, but there is no date_trunc function. Among other things, I tried:
select day, count(*) as bus_count from(
SELECT EXTRACT(DAY FROM TO_DATE(START_TIME, 'YYYY-MM-DD')) as day,
bus_id, sum(distance_two_points) as distance
FROM public.datatable
where start_time >= '2015-09-05 00:00:00' and start_time <= '2015-09-05 23:59:59'
group by day,bus_id
) as A where distance<=250000 group by day
Any help is appreciated.
SELECT SERIES_ROUND('2013-05-24', 'INTERVAL 1 YEAR', ROUND_DOWN) "result" FROM DUMMY;
SELECT SERIES_ROUND('04:25:01', 'INTERVAL 10 MINUTE') "result" FROM DUMMY;
SERIES_ROUND in SAP HANA provides functionality similar to DATE_TRUNC in other databases.
https://help.sap.com/docs/SAP_HANA_PLATFORM/4fe29514fd584807ac9f2a04f6754767/435ec476ab494ad6b8409f22abec13fe.html?version=2.0.00
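Applied to the question, a sketch might look like this (assuming start_time is a datetime-typed column and that your HANA version accepts a DAY interval here):
SELECT SERIES_ROUND(start_time, 'INTERVAL 1 DAY', ROUND_DOWN) AS day,
bus_id, SUM(distance_two_points) AS distance
FROM datatable
WHERE start_time >= '2015-09-05 00:00:00' AND start_time <= '2015-09-05 23:59:59'
GROUP BY SERIES_ROUND(start_time, 'INTERVAL 1 DAY', ROUND_DOWN), bus_id;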
Converting to a non-datetime data type is usually not a good idea (additional parsing, encoding, semantics...).
Instead use a less granular datetime data type: daydate in this case.
create column table datatab (start_time seconddate, bus_id int, distance_two_points decimal (10, 2));
insert into datatab values (to_seconddate('05.09.2015 13:12:00'), 1, 50.2);
insert into datatab values (to_seconddate('05.09.2015 13:22:00'), 1, 1.2);
insert into datatab values (to_seconddate('05.09.2015 15:32:00'), 1, 24);
insert into datatab values (to_seconddate('05.09.2015 13:12:00'), 1, 50.2);
insert into datatab values (to_seconddate('05.09.2015 14:22:00'), 2, 1.2);
insert into datatab values (to_seconddate('05.09.2015 16:32:00'), 2, 24);
select to_seconddate(day) as day,count(*) as bus_count from(
SELECT to_date(start_time) as day, bus_id, sum(distance_two_points) as distance
FROM datatab
where start_time between '2015-09-05 00:00:00' and '2015-09-05 23:59:59'
group by to_date(start_time),bus_id
) as A
where distance<=250000
group by day;
The inner query gives you:
DAY BUS_ID DISTANCE
2015-09-05 1 75.40
2015-09-05 2 25.20
So, your seconddate "start_time" is now aggregated as daydate and then converted back to 'seconddate'.
What I prefer is using the seconds_between() or nano100_between() function.
select now(),
add_seconds( to_date('1970.01.01', 'YYYY.MM.DD'),
round(
SECONDS_BETWEEN(
to_date('1970.01.01', 'YYYY.MM.DD'),
now()
)/3600
)*3600
)
from dummy;
This looks a bit ugly, but given that to_date() is calculated just once rather than once per row, and that the seconds arithmetic is close to how HANA stores the value internally, it should be the most efficient of the lot.
It is also the most flexible: round by second, minute, hour, day, ... everything below a year is fine.
PS: round() supports all round and truncate options.
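For example, to truncate to whole days instead of rounding to the nearest hour, the same pattern with 86400 seconds and ROUND_DOWN (a sketch adapted from the statement above):
select add_seconds( to_date('1970.01.01', 'YYYY.MM.DD'),
round( SECONDS_BETWEEN( to_date('1970.01.01', 'YYYY.MM.DD'), now() ) / 86400
, 0, ROUND_DOWN ) * 86400
)
from dummy;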
Assuming your start_time is of some date/time type (e.g. SECONDDATE) you could use
...TO_NVARCHAR(START_TIME, 'YYYY-MM-DD') AS DAY...
instead of date_trunc(...) as in PostgreSQL.
Why don't you use the CAST() conversion function?
select
cast( now() as date ) myDate
from dummy;

INSERT aggregated values from a query into a table in SQL

I'm new to SQL and have proceeded largely by trial and error, plus searching books and the internet. I have to repeat a query for the sum over monthly data for five years, and I'd like to insert the results for every month as a column in a table. I tried adding new columns for every month (alter table ... add column, insert, etc.), but I can't get it right. Here's the code I used for jan07 and feb07:
CREATE TABLE "TVD_db"."lebendetiere"
(nuar text,
ak text,
sex text,
jan07 text,
feb07 text,
märz07 text,
april07 text,
mai07 text,
juni07 text,
juli07 text,
aug07 text,
sept07 text,
okt07 text,
nov07 text,
dez07 text,
jan08 text,
....
dez11 text);
INSERT INTO "TVD_db"."lebendetiere" (nuar, ak, sex, jan07)
SELECT
"AUFENTHALTE"."nuar",
CASE WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") < 365 THEN '1' WHEN DATE('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") > 730 THEN 3 ELSE 2 END AS AK,
CASE WHEN "AUFENTHALTE"."isweiblich" = 'T' THEN 'female' ELSE 'male' END AS sex,
COUNT("AUFENTHALTE"."tierid")
FROM "TVD_db"."AUFENTHALTE"
WHERE DATE("AUFENTHALTE"."gueltigvon") <= DATE('2007-01-01')
AND DATE("AUFENTHALTE"."gueltigbis") >= DATE('2007-01-01')
GROUP BY "AUFENTHALTE"."nuar",
CASE WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") < 365 THEN '1' WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") > 730 THEN 3 ELSE 2 END,
CASE WHEN "AUFENTHALTE"."isweiblich" = 'T' THEN 'female' ELSE 'male' END
ORDER BY "AUFENTHALTE"."nuar",
CASE WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") < 365 THEN '1' WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") > 730 THEN 3 ELSE 2 END,
CASE WHEN "AUFENTHALTE"."isweiblich" = 'T' THEN 'female' ELSE 'male' END
;
--until here it works fine
UPDATE "TVD_db"."lebendetiere" SET "feb07"= --this is the part I cant get right...
(SELECT
COUNT("AUFENTHALTE"."tierid")
FROM "TVD_db"."AUFENTHALTE"
WHERE DATE("AUFENTHALTE"."gueltigvon") <= DATE('2007-02-01')
AND DATE("AUFENTHALTE"."gueltigbis") >= DATE('2007-02-01')
GROUP BY "AUFENTHALTE"."nuar",
CASE WHEN DATE ('2007-02-01')- DATE ("AUFENTHALTE"."gebdat") < 365 THEN '1' WHEN DATE ('2007-02-01')- DATE ("AUFENTHALTE"."gebdat") > 730 THEN 3 ELSE 2 END,
CASE WHEN "AUFENTHALTE"."isweiblich" = 'T' THEN 'female' ELSE 'male' END
ORDER BY "AUFENTHALTE"."nuar",
CASE WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") < 365 THEN '1' WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") > 730 THEN 3 ELSE 2 END,
CASE WHEN "AUFENTHALTE"."isweiblich" = 'T' THEN 'female' ELSE 'male' END);
Does anyone have a solution, or do I have to make a table for every month and then join the results?
After reading your post thoroughly, here is a complete redesign that should hold some insight for beginners in the field of SQL / PostgreSQL.
I would advise not to use mixed case identifiers in PostgreSQL. Use lower case exclusively, then you don't have to double-quote them and your code is much easier to read. You also avoid a lot of possible confusion.
Use table aliases to make your code more readable.
Column names in the SELECT statement for the INSERT are irrelevant. That's why I commented them out (this avoids possible naming conflicts).
Use ordinal numbers in GROUP BY and ORDER BY to further simplify.
Don't use a separate column for every new month. Use a column identifying the month and add a row per month.
If you actually need the design with one column per month, then you need a large CASE expression or a pivot query (refer to the tablefunc extension; a small sketch follows after the redesign below). But this is complicated stuff for an SQL newbie. I really think you want a row per month.
I use generate_series() to generate one row per month between Jan 2007 and Dec 2011.
With my changed design, you don't need extra UPDATEs. It's all done in one INSERT.
I simplified quite a couple of other things. Here is what I would propose instead:
CREATE TABLE tvd_db.lebendetiere(
nuar text
,alterskat integer
,sex text
,datum date
,anzahl integer
);
INSERT INTO tvd_db.lebendetiere (nuar, alterskat, sex, datum, anzahl)
SELECT a.nuar
,CASE WHEN a.gebdat >= '2006-01-01'::date THEN 1 -- use >= !
WHEN a.gebdat < '2005-01-01'::date THEN 3
ELSE 2 END -- AS alterskat
,CASE WHEN a.isweiblich = 'T' THEN 'female' ELSE 'male' END -- AS sex
,m.m
,count(*) -- AS anzahl
FROM tvd_db.aufenthalte a
CROSS JOIN (
SELECT generate_series('2007-01-01'::date
,'2011-12-01'::date, interval '1 month')::date
) m(m)
WHERE a.gueltigvon <= m.m
AND a.gueltigbis >= m.m
GROUP BY a.nuar, 2, 3, m.m
ORDER BY a.nuar, 2, 3, m.m;
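If you later need the one-column-per-month matrix for presentation, you can pivot on the way out instead of storing it that way. A sketch with conditional aggregation for just the first two months (extend the pattern, or use tablefunc's crosstab(), as needed):
SELECT nuar, alterskat, sex
,sum(CASE WHEN datum = '2007-01-01' THEN anzahl ELSE 0 END) AS jan07
,sum(CASE WHEN datum = '2007-02-01' THEN anzahl ELSE 0 END) AS feb07
FROM tvd_db.lebendetiere
GROUP BY nuar, alterskat, sex
ORDER BY nuar, alterskat, sex;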