Updating just the date of an existing DateAndTime column for a whole table - PostgreSQL

I'm trying to create a query that updates the timestamp of all existing rows of a specific table.
Below is a sample table to illustrate:
"DateAndTime"; ID; value
"2014-04-01 00:00:03"; 44; 10
"2014-04-01 00:00:03"; 45 ; 120
"2014-04-01 00:00:03"; 46 ; 80
"2014-04-01 00:00:03"; 47 ; 30000
"2014-04-01 00:00:13"; 44 ; 11
"2014-04-01 00:00:13"; 45 ; 122
"2014-04-01 00:00:13"; 46 ; 76
"2014-04-01 00:00:13"; 47 ; 30200
Now I want to replace JUST the date with the current date while keeping the time.
The result would be something like:
"DateAndTime"; ID; value
Current_date + '00:00:03'; 44; 10
Current_date + '00:00:03'; 45; 120
Current_date + '00:00:03'; 46; 80
Current_date + '00:00:03'; 47; 30000
Current_date + '00:00:13'; 44; 11
Current_date + '00:00:13'; 45; 122
Current_date + '00:00:13'; 46; 76
Current_date + '00:00:13'; 47; 30200
I know that I can do the following:
UPDATE "MyTable"
SET "DateAndTime" = (SELECT date_trunc('day', localtimestamp) + '00:00:03');
This works, but I have thousands of rows, so I need to replace the hard-coded '00:00:03' with the time part of each row's own "DateAndTime" value. It is as if I needed an inverted date_trunc, one that keeps everything from the seconds up to the hours and drops the date.
Do you know what I can use to take the time from each row's existing value? Or do you know a better way to write this query?
Thanks in advance

I found the solution. Instead of trying to manipulate the date and the time separately, I just need to find the difference between the current date and the date of the timestamp I want to update, and add that interval.
The code is:
UPDATE "MyTable"
SET "DateAndTime" = "DateAndTime" + age(date_trunc('day', "DateAndTime"));
You have to truncate the timestamp to the day so that the hours, minutes, seconds and smaller units do not enter into the equation.

UPDATE "MyTable" SET "DateAndTime" = now()::date + "DateAndTime"::time;
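To make the idea behind the date + time approach concrete, here is a minimal Python sketch of the same logic outside the database. It is only an illustration: the helper name move_to_date and the sample rows are assumptions, not part of the original question.

```python
from datetime import date, datetime


def move_to_date(ts: datetime, new_date: date) -> datetime:
    """Keep the time-of-day of ts and replace its calendar date with new_date."""
    return datetime.combine(new_date, ts.time())


# Hypothetical sample values modelled on the question's table.
rows = [datetime(2014, 4, 1, 0, 0, 3), datetime(2014, 4, 1, 0, 0, 13)]
today = date.today()
updated = [move_to_date(ts, today) for ts in rows]
```

Each element of updated carries today's date with the original 00:00:03 or 00:00:13 time, which mirrors what adding the row's time to the current date does in SQL.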

Related

Complex logic to create time series in Postgres

I have a sample dataset like below and I would like to create a report in such a format that the Value is updated for all the dates between the Start and End date.
Input Dataset
ID Start End Value
232 "2022-06-08 18:49:00" "2022-11-18 08:06:00" 55
456 "2022-10-17 10:24:00" "2022-12-16 12:52:00" 100
From the above dataset I would like to create another dataset as below.
I need to generate a date series from the Start and End dates of the input dataset and fill in the same value for each of those dates.
Any ideas or suggestions will be helpful.
Expected Output
ID Date Value
232 "2022-06-08" 55
232 "2022-06-09" 55
232 "2022-06-10" 55
232 "2022-06-11" 55
232 "2022-06-12" 55
.
.
232 "2022-11-17" 55
232 "2022-11-18" 55
456 "2022-10-17" 100
456 "2022-10-18" 100
456 "2022-10-19" 100
.
.
456 "2022-12-15" 100
456 "2022-12-16" 100
Database : Postgres 12
You can use generate_series():
select t.id,
       g.dt::date as date,
       t.value
from the_table t
  cross join generate_series(t."Start"::date, t."End"::date, interval '1 day') as g(dt)
order by t.id, g.dt;
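For readers who want to see what the per-row expansion does, the following is a rough Python sketch of the same idea: one output row per calendar day between Start and End, inclusive. The function name expand_days and the tuple layout are assumptions made for the example.

```python
from datetime import datetime, timedelta


def expand_days(row_id, start, end, value):
    """Yield (id, day, value) for every calendar day from start to end, inclusive."""
    day = start.date()
    last = end.date()
    while day <= last:
        yield (row_id, day, value)
        day += timedelta(days=1)


# Input rows taken from the question's sample dataset.
source = [
    (232, datetime(2022, 6, 8, 18, 49), datetime(2022, 11, 18, 8, 6), 55),
    (456, datetime(2022, 10, 17, 10, 24), datetime(2022, 12, 16, 12, 52), 100),
]
report = [out for row in source for out in expand_days(*row)]
```

Note that the timestamps are reduced to dates first, just like the ::date casts in the query, so the time of day does not affect which days are generated.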

Usage of DISTINCT in reversed int pairs duplicates elimination

I have a question about the following table:
create table memorization_word_translation
(
id serial not null,
from_word_id integer not null,
to_word_id integer not null
);
This table stores pairs of integers that often also appear in reverse order, for example:
35 36
35 37
36 35
37 35
37 39
39 37
The question is: if I run a query, for example:
select * from memorization_word_translation
where from_word_id = 35 or to_word_id = 35
I would get
35 36
35 37
36 35 - duplicate of 35 36
37 35 - duplicate of 35 37
How can I use DISTINCT in this example to filter out all duplicates, even if they are reversed?
I want to keep only these:
35 36
35 37
You can do it with the ROW_NUMBER() window function:
select from_word_id, to_word_id
from (
  select *,
    row_number() over (
      partition by least(from_word_id, to_word_id),
                   greatest(from_word_id, to_word_id)
      order by (from_word_id > to_word_id)::int
    ) rn
  from memorization_word_translation
  where 35 in (from_word_id, to_word_id)
) t
where rn = 1
You could try it with a small sorting step (here a single comparison) in combination with DISTINCT ON.
The DISTINCT ON clause works on arbitrary columns or expressions, e.g. on a tuple. This CASE expression puts the two columns into an ordered tuple, so a pair and its reversed twin produce the same tuple and only one of them is kept. The source columns can still be returned in your SELECT statement:
select distinct on (
CASE
WHEN (from_word_id >= to_word_id) THEN (from_word_id, to_word_id)
ELSE (to_word_id, from_word_id)
END
)
*
from memorization_word_translation
where from_word_id = 35 or to_word_id = 35
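Both answers rely on the same trick: build an order-independent key for each pair and keep one row per key. A small Python sketch of that idea follows. The function name dedupe_reversed_pairs is made up, and the tie-break (prefer the row whose first id is the smaller one) is an assumption chosen to mimic the ORDER BY in the ROW_NUMBER() version.

```python
def dedupe_reversed_pairs(pairs):
    """Keep one pair per unordered {a, b}, preferring the row with a <= b."""
    best = {}
    for a, b in pairs:
        key = (min(a, b), max(a, b))  # same key for (35, 36) and (36, 35)
        current = best.get(key)
        # Take the first row seen, but swap in an ascending row if one shows up.
        if current is None or (a <= b and current[0] > current[1]):
            best[key] = (a, b)
    return list(best.values())


# Rows matching "from_word_id = 35 or to_word_id = 35" in the question.
rows = [(35, 36), (35, 37), (36, 35), (37, 35)]
unique_rows = dedupe_reversed_pairs(rows)
```

If only the reversed row exists for some pair, it is still returned, which is also how the ROW_NUMBER() query behaves.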

Adding a column to a table from the previous row in T-SQL

Given a row with a timestamp column and a value column (from a device), both already in a table in an Azure SQL database, I want to add a new column to the row holding the value from the most recent earlier record that meets a certain criterion ("most recent" is defined by the timestamp column). The criterion is whether the value falls into a range (between 5 and 95). I want to do this for every row.
Here is an input table:
ts (Timestamp) value (integer)
------------------------------------
2019-09-22 00:00:00 90
2019-09-21 23:10:05 75
2019-09-21 23:09:00 85
2019-09-21 22:09:00 00
2019-09-21 14:09:00 70
Now I want to add a column to this table:
ts (Timestamp) value prev_value
---------------------------------------
2019-09-22 00:00:00 90 75
2019-09-21 23:10:05 75 85
2019-09-21 23:09:00 85 70
2019-09-21 22:09:00 00 70
2019-09-21 14:09:00 70 NULL
I have been trying different SQL statements but haven't been successful so far.
So basically you want something like lag, but with a condition.
The easy way to do that is to use a correlated subquery.
First, create and populate sample table (Please save us this step in your future questions):
DECLARE @T AS TABLE
(
    ts datetime2,
    [value] int
)
INSERT INTO @T (ts, [value]) VALUES
('2019-09-22T00:00:00', 90),
('2019-09-21T23:10:05', 75),
('2019-09-21T23:09:00', 85),
('2019-09-21T22:09:00', 00),
('2019-09-21T14:09:00', 70);
The query:
SELECT ts,
       value,
       (
           -- most recent earlier row whose value is inside the 5..95 range
           SELECT TOP 1 value
           FROM @T T1
           WHERE T0.ts > T1.ts
           AND T1.value >= 5
           AND T1.value <= 95
           ORDER BY T1.ts DESC
       ) As prev_value
FROM @T T0
ORDER BY ts DESC
Results:
ts value prev_value
2019-09-22 00:00:00 90 75
2019-09-21 23:10:05 75 85
2019-09-21 23:09:00 85 70
2019-09-21 22:09:00 0 70
2019-09-21 14:09:00 70 NULL
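The "lag with a condition" logic can also be pictured as a single pass over the rows in ascending timestamp order while remembering the last value that was inside the range. Below is a minimal Python sketch of that reading. The function name with_prev_in_range is hypothetical, and the sketch assumes timestamps are distinct, as they are in the sample data.

```python
from datetime import datetime


def with_prev_in_range(rows, low=5, high=95):
    """Attach to each (ts, value) the latest earlier value within [low, high]."""
    result = []
    last_valid = None  # plays the role of NULL when no earlier row qualifies
    for ts, value in sorted(rows, key=lambda r: r[0]):
        result.append((ts, value, last_valid))
        if low <= value <= high:
            last_valid = value
    return result[::-1]  # newest first, like ORDER BY ts DESC


sample = [
    (datetime(2019, 9, 22, 0, 0, 0), 90),
    (datetime(2019, 9, 21, 23, 10, 5), 75),
    (datetime(2019, 9, 21, 23, 9, 0), 85),
    (datetime(2019, 9, 21, 22, 9, 0), 0),
    (datetime(2019, 9, 21, 14, 9, 0), 70),
]
labelled = with_prev_in_range(sample)
```

The row with value 0 is skipped as a source for prev_value but still receives one itself, matching the results table above.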

PostgreSQL - filter function for dates

I am trying to use the built-in filter function in PostgreSQL to filter for a date range in order to sum only entries falling within this time-frame.
I cannot understand why the filter isn't being applied.
I am trying to filter for all product transactions that have a created_at date of the previous month (so in this case that were created in June 2017).
SELECT pt.created_at::date, pt.customer_id,
sum(pt.amount/100::double precision) filter (where (date_part('month', pt.created_at) =date_part('month', NOW() - interval '1 month') and
date_part('year', pt.created_at) = date_part('year', NOW()) ))
from
product_transactions pt
LEFT JOIN customers c
ON c.id= pt.customer_id
GROUP BY pt.created_at::date,pt.customer_id
Please find my expected results (sum of the amount for each day in the previous month - for each customer_id if an entry for that day exists) and the actual results I get from the query - below (using date_trunc).
Expected results:
created_at| customer_id | amount
2017-06-30 1 220.5
2017-06-28 15 34.8
2017-06-28 12 157
2017-06-28 48 105.6
2017-06-27 332 425.8
2017-06-25 1 58.0
2017-06-25 23 22.5
2017-06-21 14 88.9
2017-06-17 2 34.8
2017-06-12 87 250
2017-06-05 48 135.2
2017-06-05 12 95.7
2017-06-01 44 120
Results:
created_at| customer_id | amount
2017-06-30 1 220.5
2017-06-28 15 34.8
2017-06-28 12 157
2017-06-28 48 105.6
2017-06-27 332 425.8
2017-06-25 1 58.0
2017-06-25 23 22.5
2017-06-21 14 88.9
2017-06-17 2 34.8
2017-06-12 87 250
2017-06-05 48 135.2
2017-06-05 12 95.7
2017-06-01 44 120
2017-05-30 XX YYY
2017-05-25 XX YYY
2017-05-15 XX YYY
2017-04-30 XX YYY
2017-03-02 XX YYY
2016-11-02 XX YYY
The actual results give me the sum for all dates in the database, so no date time-frame is being applied in the query for a reason I cannot understand. I'm seeing dates that are both not for June 2017 and also from previous years.
Use the date_trunc(..) function, which compares year and month in one step. Every non-aggregated column in the SELECT list also has to appear in the GROUP BY:
SELECT pt.created_at::date, pt.customer_id, c.name,
sum(pt.amount/100::double precision) filter (where date_trunc('month', pt.created_at) = date_trunc('month', NOW() - interval '1 month'))
from
product_transactions pt
LEFT JOIN customers c
ON c.id= pt.customer_id
GROUP BY pt.created_at::date, pt.customer_id, c.name
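A note on how FILTER behaves, which may explain the extra rows in the question: as far as I understand it, FILTER only restricts which rows feed the aggregate, while every (date, customer) group is still emitted, with a NULL sum when nothing passed the filter. Putting the same condition in a WHERE clause removes those rows before grouping. Here is a rough Python sketch of the two behaviours; the function names and the sample rows are made up for illustration.

```python
from datetime import date


def sum_with_filter(rows, keep):
    """Mimic sum(...) FILTER (WHERE keep): all groups appear, some with None."""
    totals = {}
    for day, customer, amount in rows:
        key = (day, customer)
        totals.setdefault(key, None)  # the group exists even if nothing is summed
        if keep(day):
            totals[key] = (totals[key] or 0) + amount
    return totals


def sum_with_where(rows, keep):
    """Mimic WHERE keep ... GROUP BY: rejected rows never form a group."""
    totals = {}
    for day, customer, amount in rows:
        if keep(day):
            key = (day, customer)
            totals[key] = totals.get(key, 0) + amount
    return totals


def in_june_2017(day):
    return (day.year, day.month) == (2017, 6)


rows = [(date(2017, 6, 30), 1, 220.5), (date(2017, 5, 30), 2, 10.0)]
filtered = sum_with_filter(rows, in_june_2017)
restricted = sum_with_where(rows, in_june_2017)
```

In this sketch filtered still contains an entry for 2017-05-30 (with None), while restricted only contains the June row, so a WHERE condition is probably what the expected results call for.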

Postgresql Query for display of records every 45 days

I have a table that has data of user_id and the timestamp they joined.
If I need to display the data month-wise I could just use:
select
count(user_id),
date_trunc('month',(to_timestamp(users.timestamp))::timestamp)::date
from
users
group by 2
date_trunc accepts 'second', 'day', 'week' etc., so I can get data grouped by those periods.
How do I get data grouped by an "n-day" period, say 45 days?
Basically I need to display the number of users per 45-day period.
Any suggestion or guidance appreciated!
Currently I get:
Date Users
2015-03-01 47
2015-04-01 72
2015-05-01 123
2015-06-01 132
2015-07-01 136
2015-08-01 166
2015-09-01 129
2015-10-01 189
I would like the data to come in 45-day intervals. Something like:
Date Users
2015-03-01 85
2015-04-15 157
2015-05-30 192
2015-07-14 229
2015-08-28 210
2015-10-12 294
UPDATE:
I used the following to get the output, but one problem remains. I'm getting values that are offset.
with
new_window as (
select
generate_series as cohort
, lag(generate_series, 1) over () as cohort_lag
from
(
select
*
from
generate_series('2015-03-01'::date, '2016-01-01', '45 day')
)
t
)
select
--cohort
cohort_lag -- This worked. !!!
, count(*)
from
new_window
join users on
user_timestamp <= cohort
and user_timestamp > cohort_lag
group by 1
order by 1
But the output I am getting is:
Date Users
2015-04-15 85
2015-05-30 157
2015-07-14 193
2015-08-28 225
2015-10-12 210
Basically, the users displayed at 2015-03-01 should be the users between 2015-03-01 and 2015-04-15, and so on.
But I seem to be getting the number of users up to a date, i.e. 85 users up to 2015-04-15, which is not the result I want.
Any help here?
Try this query :
SELECT to_char(i::date,'YYYY-MM-DD') as date, 0 as users
FROM generate_series('2015-03-01', '2015-11-30','45 day'::interval) as i;
OUTPUT :
date users
2015-03-01 0
2015-04-15 0
2015-05-30 0
2015-07-14 0
2015-08-28 0
2015-10-12 0
2015-11-26 0
This looks like a hot mess, and it might be better wrapped in a function where you could use some variables, but would something like this work?
with number_of_intervals as (
select
min (timestamp)::date as first_date,
ceiling (extract (days from max (timestamp) - min (timestamp)) / 45)::int as num
from users
),
intervals as (
select
generate_series(0, num - 1, 1) int_start,
generate_series(1, num, 1) int_end
from number_of_intervals
),
date_spans as (
select
n.first_date + 45 * i.int_start as interval_start,
n.first_date + 45 * i.int_end as interval_end
from
number_of_intervals n
cross join intervals i
)
select
d.interval_start, count (*) as user_count
from
users u
join date_spans d on
u.timestamp >= d.interval_start and
u.timestamp < d.interval_end
group by
d.interval_start
order by
d.interval_start
With this sample data:
User Id timestamp derived range count
1 3/1/2015 3/1-4/15
2 3/26/2015 "
3 4/4/2015 "
4 4/6/2015 " (4)
5 5/6/2015 4/16-5/30
6 5/19/2015 " (2)
7 6/16/2015 5/31-7/14
8 6/27/2015 "
9 7/9/2015 " (3)
10 7/15/2015 7/15-8/28
11 8/8/2015 "
12 8/9/2015 "
13 8/22/2015 "
14 8/27/2015 " (5)
Here is the output:
2015-03-01 4
2015-04-15 2
2015-05-30 3
2015-07-14 5
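The same bucketing can be checked outside the database with integer division: the bucket index of a date is the number of days since the first date, divided by 45 and rounded down. The following Python sketch reproduces the counts above from the sample data. The function name count_per_period is an assumption, and the sketch treats each period as half-open (start included, end excluded), like the >= / < join condition in the query.

```python
from collections import Counter
from datetime import date, timedelta


def count_per_period(dates, period_days=45):
    """Count dates per fixed-length period, keyed by each period's start date."""
    first = min(dates)
    counts = Counter((d - first).days // period_days for d in dates)
    return {
        first + timedelta(days=period_days * index): n
        for index, n in sorted(counts.items())
    }


# Join dates from the sample data above.
joined = [
    date(2015, 3, 1), date(2015, 3, 26), date(2015, 4, 4), date(2015, 4, 6),
    date(2015, 5, 6), date(2015, 5, 19),
    date(2015, 6, 16), date(2015, 6, 27), date(2015, 7, 9),
    date(2015, 7, 15), date(2015, 8, 8), date(2015, 8, 9),
    date(2015, 8, 22), date(2015, 8, 27),
]
per_period = count_per_period(joined)
```

Periods with no users simply do not appear in the result, which is also true of the inner join in the SQL version.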