SQLAlchemy (postgres) cast time difference to (integer) days - postgresql

I am running a query (postgres) which creates a temporary column grp dividing the time difference between event_time and start_date (a datetime object) into time periods of given length.
date = func.date_trunc('day', M.event_time)
session.query(..., ((date - start_date) / period_length).label('grp'))
.filter(..., date >= start_date)
.group_by('grp', ...)
This works fine as long as period_length = 1, but if it is 2, for example, the result is a timedelta which has half days datetime.timedelta(days=2, seconds=43200).
How do I cast this to integer before group_by to get my expected result?

Related

TIMESTAMP- creation_date :: date between '2022-05-15' and '2022-06-15'

I just wanted to know the difference between these two codes:
select count (user_id) from tb_users where
creation_date :: date between '2022-05-15' and '2022-06-15'
Result: 41,232
select count (user_id) from tb_users where
creation_date between '2022-05-15' and '2022-06-15'
Result: 40,130
As far as I see, it is related with the timestamp, but I do not understand the difference.
Thank you!
Your column creation_date in the table is most probably in timestamp format, which is '2022-05-15 00:00:00'. By adding ::date <- you are casting your timestamp format to date format: '2022-05-15'.
You can read more about casting data types here:
https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-cast/
When you ask Postgres to implicitly coerce a DATE value to a TIMESTAMP value - the hours, minutes and seconds are set to zero.
In the first query, you explicitly cast the creation date to DATE which is successfully compared to the provided DATE values.
In the second query, the creation date is of type TIMESTAMP and so PostgreSQL converts your DATE values to TIMESTAMP values and the comparison becomes
creation_date >= '2022-05-15 00:00:00' AND creation_date <= '2022-06-15 00:00:00'
Obviously, this produces different resultset than the first query.

How to convert a column in seconds to dateformat in Teradata?

I have a column which is of bigint datatype(in seconds) which should be added to a date, so i need to convert this column into dateformat.
The arithmetic must be done against a timestamp data type in Teradata. The date data type does not have a time element associated with it. The following SQL should help point you in the right direction:
SELECT CAST(CAST(1234 AS BIGINT) AS INTERVAL SECOND(4)) AS Seconds_
, CURRENT_TIMESTAMP(0) AS CurrentTimestamp_
, CURRENT_TIMESTAMP + Seconds_ AS NewTimeStamp
If the number of seconds is less than 864000000 you can simply use interval arithmetic:
CAST(col AS TIMESTAMP) + (bigintcol * INTERVAL '0000 00:00:01' DAY TO SECOND)
Based on your other question your input is a Unixtime, those are two functions for converting them from/to Teradata timestamps:
/**********
Converting Unix/POSIX time to a Timestamp
Unix time: Number of seconds since 1970-01-01 00:00:00 UTC not counting leap seconds (currently 24 in 2011)
Also working for negative numbers.
The maximum range of Timestamps is based on the range of INTEGERs:
1901-12-13 20:45:52 (-2147483648) to 2038-01-19 03:14:07 (2147483647)
Can be changed to use BIGINT instead of INTEGER
20101211 initial version - Dieter Noeth
**********/
REPLACE FUNCTION UnixTime_to_TimeStamp (UnixTime INT)
RETURNS TimeStamp(0)
LANGUAGE SQL
CONTAINS SQL
DETERMINISTIC
SQL SECURITY DEFINER
COLLATION INVOKER
INLINE TYPE 1
RETURN
CAST(DATE '1970-01-01' + (UnixTime / 86400) AS TIMESTAMP(0))
+ ((UnixTime MOD 86400) * INTERVAL '00:00:01' HOUR TO SECOND)
;
SELECT
UnixTime_to_TimeStamp(-2147483648)
,UnixTime_to_TimeStamp(0)
,UnixTime_to_TimeStamp(2147483647)
;
/**********
Converting a Timestamp to Unix/POSIX time
Unix time: Number of seconds since 1970-01-01 00:00:00 UTC not counting leap seconds (currently 24 in 2011)
The maximum range of Timestamps is based on the range of INTEGERs:
1901-12-13 20:45:52 (-2147483648) to 2038-01-19 03:14:07 (2147483647)
Can be changed to use BIGINT instead of INTEGER
20101211 initial version - Dieter Noeth
**********/
REPLACE FUNCTION TimeStamp_to_UnixTime (ts TimeStamp(6))
RETURNS INTEGER
LANGUAGE SQL
CONTAINS SQL
DETERMINISTIC
SQL SECURITY DEFINER
COLLATION INVOKER
INLINE TYPE 1
RETURN
(CAST(ts AS DATE) - DATE '1970-01-01') * 86400
+ (EXTRACT(HOUR FROM ts) * 3600)
+ (EXTRACT(MINUTE FROM ts) * 60)
+ (EXTRACT(SECOND FROM ts))
;
SELECT
TimeStamp_to_UnixTime(TIMESTAMP '1901-12-13 20:45:52')
,TimeStamp_to_UnixTime(CURRENT_TIMESTAMP)
,TimeStamp_to_UnixTime(TIMESTAMP '2038-01-19 03:14:07')
;

Postgresql Query very slow with ::date, ::time, and interval

I have a sql query that is very slow:
select number1 from mytable
where symbol = 25
and timeframe = 1
and date::date = '2008-02-05'
and date::time='10:40:00' + INTERVAL '30 minutes'
The goal is to return one value, and postgresql takes 1.7 seconds to return the desired value(always a single value). I need to execute hundreds of those queries for one task, so this gets extremely slow.
Executing the same query, but pointing to the time directly without using interval and ::date, ::time takes only 17ms:
select number1 from mytable
where symbol = 25
and timeframe = 1
and date = '2008-02-05 11:10:00'
I thought it would be faster if I would not use ::date and ::time, but when I execute a query like:
select number1 from mytable
where symbol = 25
and timeframe = 1
and date = '2008-02-05 10:40:00' + interval '30 minutes'
I get a sql error (22007). I've experimented with different variations but I couldn't get interval to work without using ::date and ::time. Date/Time Functions on postgresql.org didn't help me out.
The table got a multi column index on symbol, timeframe, date.
Is there a fast way to execute the query with adding time, or a working syntax with interval where I do not have to use ::date and ::time? Or do I need to have a special index when using queries like these?
Postgresql version is 9.2.
Edit:
The format of the table is:
date = timestamp with time zone,
symbol, timeframe = numeric.
Edit 2:
Using
select open from ohlc_dukascopy_bid
where symbol = 25
and timeframe = 1
and date = timestamp '2008-02-05 10:40:00' + interval '30' minute
Explain shows:
"Index Scan using mcbidindex on mytable (cost=0.00..116.03 rows=1 width=7)"
" Index Cond: ((symbol = 25) AND (timeframe = 1) AND (date = '2008-02-05 11:10:00'::timestamp without time zone))"
Time is now considerably faster: 86ms on first run.
The first version will not use a (regular) index on the column named date.
You didn't provide much information, but assuming the column named date has the datatype timestamp (and not date), then the following should work:
and date = timestamp '2008-02-05 10:40:00' + interval '30 minutes'
this should use an index on the column named date (but only if it is in fact a timestamp not a date). It is essentially the same as yours, the only difference is the explicit timestamp literal (although Postgres should understand '2008-02-05 10:40:00' as a timestamp literal as well).
You will need to run an explain to find out if it's using an index.
And please: change the name of that column. It's bad practise to use a reserved word as an identifier, and it's a really horrible name, which doesn't say anything about what kind of information is stored in the column. Is it the "start date", the "end date", the "due date", ...?

TSQL update Datetime with Random Value between 2 Dates

What's the easiest way to update a table that contains a DATETIME column on TSQL with RANDOM value between 2 dates?
I see various post related to that but their Random values are really sequential when you ORDER BY DATE after the update.
Assumptions
First assume that you have a database containing a table with a start datetime column and a end datetime column, which together define a datetime range:
CREATE DATABASE StackOverflow11387226;
GO
USE StackOverflow11387226;
GO
CREATE TABLE DateTimeRanges (
StartDateTime DATETIME NOT NULL,
EndDateTime DATETIME NOT NULL
);
GO
ALTER TABLE DateTimeRanges
ADD CONSTRAINT CK_PositiveRange CHECK (EndDateTime > StartDateTime);
And assume that the table contains some data:
INSERT INTO DateTimeRanges (
StartDateTime,
EndDateTime
)
VALUES
('2012-07-09 00:30', '2012-07-09 01:30'),
('2012-01-01 00:00', '2013-01-01 00:00'),
('1988-07-25 22:30', '2012-07-09 00:30');
GO
Method
The following SELECT statement returns the start datetime, the end datetime, and a pseudorandom datetime with minute precision greater than or equal to the start datetime and less than the second datetime:
SELECT
StartDateTime,
EndDateTime,
DATEADD(
MINUTE,
ABS(CHECKSUM(NEWID())) % DATEDIFF(MINUTE, StartDateTime, EndDateTime) + DATEDIFF(MINUTE, 0, StartDateTime),
0
) AS RandomDateTime
FROM DateTimeRanges;
Result
Because the NEWID() function is nondeterministic, this will return a different result set for every execution. Here is the result set I generated just now:
StartDateTime EndDateTime RandomDateTime
----------------------- ----------------------- -----------------------
2012-07-09 00:30:00.000 2012-07-09 01:30:00.000 2012-07-09 00:44:00.000
2012-01-01 00:00:00.000 2013-01-01 00:00:00.000 2012-09-08 20:41:00.000
1988-07-25 22:30:00.000 2012-07-09 00:30:00.000 1996-01-05 23:48:00.000
All the values in the column RandomDateTime lie between the values in columns StartDateTime and EndDateTime.
Explanation
This technique for generating random values is due to Jeff Moden. He wrote a great article on SQL Server Central about data generation. Read it for a more thorough explanation. Registration is required, but it's well worth it.
The idea is to generate a random offset from the start datetime, and add the offset to the start datetime to get a new datetime in between the start datetime and the end datetime.
The expression DATEDIFF(MINUTE, StartDateTime, EndDateTime) represents the total number of minutes between the start datetime and the end datetime. The offset must be less than or equal to this value.
The expression ABS(CHECKSUM(NEWID())) generates an independent random positive integer for every row. The expression can have any value from 0 to 2,147,483,647. This expression mod the first expression gives a valid offset in minutes.
The epxression DATEDIFF(MINUTE, 0, StartDateTime) represents the total number of minutes between the start datetime and a reference datetime of 0, which is shorthand for '1900-01-01 00:00:00.000'. The value of the reference datetime does not matter, but it matters that the same reference date is used in the whole expression. Add this to the offset to get the total number of minutes between the reference datetime.
The ecapsulating DATEADD function converts this to a datetime value by adding the number of minutes produced by the previous expression to the reference datetime.
You can use RAND for this:
select cast(cast(RAND()*100000 as int) as datetime)
from here
Sql-Fiddle looks quite good: http://sqlfiddle.com/#!3/b9e44/2/0

How can I compare two datetime fields but ignore the year?

I get to dust off my VBScript hat and write some classic ASP to query a SQL Server 2000 database.
Here's the scenario:
I have two datetime fields called fieldA and fieldB.
fieldB will never have a year value that's greater than the year of fieldA
It is possible the that two fields will have the same year.
What I want is all records where fieldA >= fieldB, independent of the year. Just pretend that each field is just a month & day.
How can I get this? My knowledge of T-SQL date/time functions is spotty at best.
You may want to use the built in time functions such as DAY and MONTH. e.g.
SELECT * from table where
MONTH(fieldA) > MONTH(fieldB) OR(
MONTH(fieldA) = MONTH(fieldB) AND DAY(fieldA) >= DAY(fieldB))
Selecting all rows where either the fieldA's month is greater or the months are the same and fieldA's day is greater.
select *
from t
where datepart(month,t.fieldA) >= datepart(month,t.fieldB)
or (datepart(month,t.fieldA) = datepart(month,t.fieldB)
and datepart(day,t.fieldA) >= datepart(day,t.fieldB))
If you care about hours, minutes, seconds, you'll need to extend this to cover the cases, although it may be faster to cast to a suitable string, remove the year and compare.
select *
from t
where substring(convert(varchar,t.fieldA,21),5,20)
>= substring(convert(varchar,t.fieldB,21),5,20)
SELECT *
FROM SOME_TABLE
WHERE MONTH(fieldA) > MONTH(fieldB)
OR ( MONTH(fieldA) = MONTH(fieldB) AND DAY(fieldA) >= DAY(fieldB) )
I would approach this from a Julian date perspective, convert each field into the Julian date (number of days after the first of year), then compare those values.
This may or may not produce desired results with respect to leap years.
If you were worried about hours, minutes, seconds, etc., you could adjust the DateDiff functions to calculate the number of hours (or minutes or seconds) since the beginning of the year.
SELECT *
FROM SOME_Table
WHERE DateDiff(d, '1/01/' + Cast(DatePart(yy, fieldA) AS VarChar(5)), fieldA) >=
DateDiff(d, '1/01/' + Cast(DatePart(yy, fieldB) AS VarChar(5)), fieldB)
Temp table for testing
Create table #t (calDate date)
Declare #curDate date = '2010-01-01'
while #curDate < '2021-01-01'
begin
insert into #t values (#curDate)
Set #curDate = dateadd(dd,1,#curDate)
end
Example of any date greater than or equal to today
Declare #testDate date = getdate()
SELECT *
FROM #t
WHERE datediff(dd,dateadd(yy,1900 - year(#testDate),#testDate),dateadd(yy,1900 - year(calDate),calDate)) >= 0
One more example with any day less than today
Declare #testDate date = getdate()
SELECT *
FROM #t
WHERE datediff(dd,dateadd(yy,1900 - year(#testDate),#testDate),dateadd(yy,1900 - year(calDate),calDate)) < 0