PSQL generate_series inserting series of dates? - postgresql

I have the following columns in a table:
account_no (text)
total_equity (float64)
insertion_date (timestamptz)
How can I insert dummy data into the table for a series of dates, along the lines of the script below? This is what I currently have, which is obviously wrong:
select
'123123' as account_no,
random() * 100000 + 1000000000 as total_equity,
insertion_date::timestamptz,
from
generate_series('2022-03-03'::timestamptz, '2022-05-09'::timestamptz, interval '1 day') as insertion_date;
What I'm trying to achieve is to insert one row per day from 2022-03-03 to 2022-05-09, with a static "account_no" and a randomized total_equity.
Thanks in advance!
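A possible approach, sketched under assumptions: the query as posted fails to parse because of the trailing comma after insertion_date::timestamptz, and it only selects rows rather than inserting them. Wrapping a corrected SELECT in INSERT ... SELECT should write one row per generated day. The table name account_equity is hypothetical (the question does not name the table), and double precision stands in for the float64 column.

```sql
-- Temporary stand-in for the asker's table; the name account_equity is an assumption.
CREATE TEMP TABLE account_equity (
    account_no     text,
    total_equity   double precision,
    insertion_date timestamptz
);

-- One row per generated day; random() is evaluated separately for every row.
INSERT INTO account_equity (account_no, total_equity, insertion_date)
SELECT '123123',
       random() * 100000 + 1000000000,
       d
FROM generate_series('2022-03-03'::timestamptz,
                     '2022-05-09'::timestamptz,
                     interval '1 day') AS g(d);
```

Against an existing table, only the INSERT statement would be needed, with the real table name substituted.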

Related

Postgresql timestamp difference greater than 1 hour

Hi, I have entrytime and exittime timestamp columns in my database. How can I query it to display only the rows where the person exited more than an hour later?
Select * from store where EXTRACT(EPOCH FROM (exittime - entrytime))/3600 >60
That's what I have so far, but it won't work. Any help would be appreciated.
Just subtract the values and compare the result with an interval:
Select *
from store
where exittime - entrytime > interval '1 hour';
This assumes that both columns are defined as timestamptz or timestamp.
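To see the interval comparison in isolation, here is a small sketch that runs against inline sample rows instead of the real store table (the two rows are made up for illustration). It also hints at why the original attempt fails: dividing the epoch difference by 3600 already yields hours, so comparing that value with 60 asks for stays longer than 60 hours.

```sql
-- Inline sample rows standing in for the store table (values are hypothetical).
SELECT *
FROM (VALUES
        (timestamp '2024-01-01 09:00', timestamp '2024-01-01 09:30'),  -- 30 minutes
        (timestamp '2024-01-01 09:00', timestamp '2024-01-01 10:30')   -- 90 minutes
     ) AS store(entrytime, exittime)
WHERE exittime - entrytime > interval '1 hour';
```

Only the 90-minute row should satisfy the condition.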

How to select data in 2 month(from now) use postgresql?

I'm new to SQL and database work. I want to select the data from the last 2 months. The key column is xxdate, with values that look like 2019-4-11.
Something like:
select * from table where date > now() - 2 month
but I don't know the correct way to express it. Can someone help?
You can use this query:
SELECT * FROM table WHERE date > (current_date - interval '2 month')::date;
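Note that table is a reserved word in PostgreSQL, so the query above needs a real table name in its place. A minimal sketch with hypothetical names (events for the table, xxdate for the date column mentioned in the question), built on inline sample rows so it can be run as-is:

```sql
-- events is a made-up stand-in for the real table.
WITH events(xxdate) AS (
    VALUES (current_date - 10),    -- inside the last 2 months
           (current_date - 100)    -- older than 2 months
)
SELECT *
FROM events
WHERE xxdate > (current_date - interval '2 months')::date;
```

Only the row dated 10 days ago should come back.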

Postgresql: Query Between time range using jsonb field

I have a table with two fields:
id(serial), data(jsonb)
In data I have records with a Datetime field stored as a UNIX timestamp:
{"Device":132,"Datetime": 1434166552,...}
I'm trying to query between ranges:
SELECT *
FROM trips
WHERE data->>'Datetime' BETWEEN
EXTRACT(EPOCH FROM date '2014-04-01') AND
EXTRACT(EPOCH FROM date '2014-04-15' + interval '1 day')
AND id = 123
Message
ERROR: operator does not exist: text >= double precision
LINE 3: WHERE data->>'Datetime' BETWEEN
I'm doing something wrong. Could somebody please help me? Thanks.
The ->> operator returns a JSON object field as text, so you need to cast it before comparing it with numbers:
SELECT *
FROM trips
WHERE (data->>'Datetime')::int
BETWEEN EXTRACT(EPOCH FROM date '2014-04-01')
AND EXTRACT(EPOCH FROM date '2014-04-15' + interval '1 day')
AND id = 123
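The cast can be tried without the real trips table. The sketch below uses two made-up sample rows and casts to bigint rather than int; that choice is a suggestion, not part of the original answer, and simply avoids overflow for epoch values beyond the 32-bit range.

```sql
-- Inline sample rows standing in for the trips table (values are hypothetical).
WITH trips(id, data) AS (
    VALUES (123, '{"Device":132,"Datetime":1397000000}'::jsonb),  -- about 2014-04-08 (UTC)
           (123, '{"Device":132,"Datetime":1434166552}'::jsonb)   -- June 2015
)
SELECT *
FROM trips
WHERE (data->>'Datetime')::bigint
      BETWEEN EXTRACT(EPOCH FROM date '2014-04-01')
          AND EXTRACT(EPOCH FROM date '2014-04-15' + interval '1 day')
  AND id = 123;
```

Only the first row falls inside the April 2014 range.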

generate_series function in Amazon Redshift

I tried the below:
SELECT * FROM generate_series(2,4);
generate_series
-----------------
2
3
4
(3 rows)
SELECT * FROM generate_series(5,1,-2);
generate_series
-----------------
5
3
1
(3 rows)
But when I try,
select * from generate_series('2011-12-31'::timestamp, '2012-12-31'::timestamp, '1 day');
it generates an error:
ERROR: function generate_series(timestamp without time zone, timestamp without time zone, "unknown") does not exist
HINT: No function matches the given name and argument types. You may need to add explicit type casts.
I use PostgreSQL 8.0.2 on Redshift 1.0.757.
Any idea why it happens?
UPDATE:
generate_series is working with Redshift now.
SELECT CURRENT_DATE::TIMESTAMP - (i * interval '1 day') as date_datetime
FROM generate_series(1,31) i
ORDER BY 1
This generates the dates of the previous 31 days (generate_series(1,31) yields 31 rows, from yesterday back to 31 days ago).
The version of generate_series() that supports dates and timestamps was added in Postgres 8.4.
As Redshift is based on Postgres 8.0, you need to use a different way:
-- the series starts at 0 so that 2011-12-31 itself is included
select timestamp '2011-12-31 00:00:00' + (i * interval '1 day')
from generate_series(0, (date '2012-12-31' - date '2011-12-31')) i;
If you only need dates, this can be abbreviated to:
select date '2011-12-31' + i
from generate_series(0, (date '2012-12-31' - date '2011-12-31')) i;
I found a solution here for my problem of not being able to generate a time dimension table on Redshift using generate_series(). You can generate a temporary sequence by using the following SQL snippet.
with digit as (
select 0 as d union all
select 1 union all select 2 union all select 3 union all
select 4 union all select 5 union all select 6 union all
select 7 union all select 8 union all select 9
),
seq as (
select a.d + (10 * b.d) + (100 * c.d) + (1000 * d.d) as num
from digit a
cross join
digit b
cross join
digit c
cross join
digit d
order by 1
)
select (getdate()::date - seq.num)::date as "Date"
from seq;
The generate_series() function, it seems, is not supported completely on Redshift yet. If I run the SQL mentioned in the answer by DJo, it works, because the SQL runs only on the leader node. If I prepend insert into dim_time to the same SQL it doesn't work.
There is no generate_series() function in Redshift for date ranges, but you can generate the series with the steps below.
Step 1: Create a table genid and insert the constant value 1 once for each element of the series you need. If you need the series for 12 months, insert 12 rows. It is safer to insert more rows than you need, such as 100, so that you do not run short.
create table genid(id int);
-- repeat this insert once per month needed
insert into genid values(1);
Step 2: Create the table for which you need to generate the series.
create table pat(patid varchar(10),stdt timestamp, enddt timestamp);
insert into pat values('Pat01','2018-03-30 00:00:00.0','2018-04-30 00:00:00.0');
insert into pat values('Pat02','2018-02-28 00:00:00.0','2018-04-30 00:00:00.0');
insert into pat values('Pat03','2017-10-28 00:00:00.0','2018-04-30 00:00:00.0');
Step 3: This query will generate the series for you.
with cte as
(
select max(enddt) as maxdt
from pat
) ,
cte2 as(
select dateadd('month', -1 * row_number() over(order by 1), maxdt::date ) as gendt
from genid , cte
) select *
from pat, cte2
where gendt between stdt and enddt

Create date efficiently

On Pavel's page is the following function:
CREATE OR REPLACE FUNCTION makedate(year int, dayofyear int)
RETURNS date AS $$
SELECT (date '0001-01-01' + ($1 - 1) * interval '1 year' + ($2 - 1) * interval '1 day'):: date
$$ LANGUAGE sql;
I have the following code:
makedate(y.year,1)
What is the fastest way in PostgreSQL to create a date for January 1st of a given year?
Pavel's function would lead me to believe it is:
date '0001-01-01' + y.year * interval '1 year' + interval '1 day';
My thought would be more like:
to_date( y.year||'-1-1', 'YYYY-MM-DD');
I am looking for the fastest way using PostgreSQL 8.4. (The query that uses the date function can select between 100,000 and 1 million records, so it needs speed.)
Thank you!
I would just use the following, given that year is a variable holding the year, instead of using a function:
(year || '-01-01')::date
By the way, I can't believe that this conversion is your bottleneck. But maybe you should have a look at generate_series (I don't know your use case).
select current_date + s.a as dates from generate_series(0,14,7) as s(a);
dates
------------
2004-02-05
2004-02-12
2004-02-19
(3 rows)
Using to_date() is even simpler than you expect:
> select to_date('2008','YYYY');
to_date
------------
2008-01-01
(1 row)
> select to_date(2008::text,'YYYY');
to_date
------------
2008-01-01
(1 row)
Note that you still have to pass the year as a string, but no concatenation is needed.
As suggested by Daniel, in the unlikely case that this conversion is a bottleneck, you might prefer to precompute the function's results and store them in a table. E.g.:
select ynum, to_date( ynum ||'-01-01', 'YYYY-MM-DD') ydate
from generate_series(2000,2009) as ynum;
If there are only a few years (and hence no need for indexes), you might even create the table dynamically for the scope of each query, with the new WITH clause.
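The WITH idea could look roughly like the sketch below. The sample CTE is a hypothetical stand-in for whatever rows supply y.year in the asker's query, and the year range 2000 to 2009 is carried over from the example above.

```sql
WITH years AS (
    -- precomputed lookup: one row per year with its January 1st date
    SELECT ynum,
           to_date(ynum || '-01-01', 'YYYY-MM-DD') AS ydate
    FROM generate_series(2000, 2009) AS ynum
),
sample(year) AS (
    -- hypothetical stand-in for the asker's y.year values
    VALUES (2001), (2005)
)
SELECT s.year, y.ydate
FROM sample s
JOIN years y ON y.ynum = s.year;
```

On PostgreSQL 9.4 and later, make_date(year, 1, 1) builds the same date without going through text; it is not available on the 8.4 release mentioned in the question.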