Read a time value out of another table in PostgreSQL - postgresql

I have a function in PostgreSQL, and right now the value is hard-coded.
Here is one piece of my code:
"TimeStamp_data" > now() - interval '100 hours');
How is it possible to read the 100 hours out of another table instead?
Example:
Table: Times
Column: cg_time.
I would like to read the time from the cg_time column, so that changing the value in the table is enough and the code does not need to change.
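One possible approach, sketched here on the assumption that Times.cg_time stores the number of hours as an integer and that Times holds a single configuration row, is to pull the value in with a subquery:
SELECT *
FROM my_table  -- my_table stands in for whatever table the function filters
WHERE "TimeStamp_data" > now() - (SELECT cg_time * interval '1 hour' FROM Times LIMIT 1);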

Related

Move rows older than x days to archive table or partition table in Postgres 11

I would like to speed up the queries on my big table that contains lots of old data.
I have a table named post with a date column created_at. The table has about 31 million rows, roughly 30 million of which are older than 30 days.
Actually, I want this:
move data older than 30 days into the post_archive table or create a partition table.
when the value in column created_at becomes older than 30 days then that row should be moved to the post_archive table or partition table.
Any detailed and concrete solution for PostgreSQL 11.15?
My ideas:
Solution 1. create a cron script in whatever language (e.g. JavaScript) and run it every day to copy data from the post table into post_archive and then delete data from the post table
Solution 2. create a Postgres function that should copy the data from the post table into the partition table, and create a cron job that will call the function every day
Thanks
This is to split your data into a post and post_archive table. It's a common approach, and I've done it (with SQL Server).
Before you do anything else, make sure you have an index on your created_at column on your post table. Important.
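For example (the index name is illustrative):
CREATE INDEX IF NOT EXISTS post_created_at_idx ON post (created_at);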
Next, you need to use a common expression to mean "thirty days ago". This is it.
(CURRENT_DATE - INTERVAL '30 DAY')::DATE
Next, back everything up. You knew that.
Then, here's your process to set up your two tables.
CREATE TABLE post_archive AS TABLE post; to populate your archive table.
Do these two steps to repopulate your post table with the most recent thirty days. It will take forever to DELETE all those rows, so we'll truncate the table and repopulate it. That's also good because it's like starting from scratch with a much smaller table, which is what you want. This takes a modest amount of downtime.
TRUNCATE TABLE post;
INSERT INTO post SELECT * FROM post_archive
WHERE created_at > (CURRENT_DATE - INTERVAL '30 DAY')::DATE;
DELETE FROM post_archive WHERE created_at > (CURRENT_DATE - INTERVAL '30 DAY')::DATE; to remove the most recent thirty days from your archive table.
Now, you have the two tables.
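One caveat: CREATE TABLE post_archive AS TABLE post copies only the data, not indexes or constraints, so if you will be querying post_archive by date you probably want an index there as well (the name is illustrative):
CREATE INDEX IF NOT EXISTS post_archive_created_at_idx ON post_archive (created_at);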
Your next step is the daily row-migration job. PostgreSQL lacks a built-in job scheduler like SQL Server's Agent or MySQL's EVENT, so your best bet is a cronjob.
It's probably wise to do the migration daily if that fits with your business rules. Why? Many-row DELETEs and INSERTs cause big transactions, and that can make your RDBMS server thrash. Smaller numbers of rows are better.
The SQL you need is something like this:
INSERT INTO post_archive SELECT * FROM post
WHERE created_at <= (CURRENT_DATE - INTERVAL '30 DAY')::DATE;
DELETE FROM post
WHERE created_at <= (CURRENT_DATE - INTERVAL '30 DAY')::DATE;
You can package this up as a shell script. On UNIX-derived systems like Linux and FreeBSD the shell script file might look like this.
#!/bin/sh
psql postgres://username:password@hostname:5432/database << SQLSTATEMENTS
INSERT INTO post_archive SELECT * FROM post
WHERE created_at <= (CURRENT_DATE - INTERVAL '30 DAY')::DATE;
DELETE FROM post
WHERE created_at <= (CURRENT_DATE - INTERVAL '30 DAY')::DATE;
SQLSTATEMENTS
Then run the shell script from cron a few minutes after 3am each day.
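A crontab entry along these lines would do it (the script path and log file are illustrative):
22 3 * * * /usr/local/bin/archive_posts.sh >> /var/log/archive_posts.log 2>&1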
Some notes:
3am? Why? In many places the daylight-time switchover messes up the time between 02:00 and 03:00 twice a year. A choice of, say, 03:22 as the time to run the daily migration keeps you well away from that problem.
CURRENT_DATE gets you midnight of today. So, if you run the script more than once in any calendar day, no harm is done.
If you miss a day, the next day's migration will catch up.
You could package up the SQL as a stored procedure and put it into your RDBMS, then invoke it from your shell script. But then your migration procedure lives in two different places. You need the cronjob and shell script in any case in PostgreSQL.
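If you do go that route, a rough sketch of the procedure (its name is illustrative; CREATE PROCEDURE requires PostgreSQL 11 or later) might look like this:
CREATE OR REPLACE PROCEDURE archive_old_posts()
LANGUAGE sql
AS $$
INSERT INTO post_archive SELECT * FROM post
WHERE created_at <= (CURRENT_DATE - INTERVAL '30 DAY')::DATE;
DELETE FROM post
WHERE created_at <= (CURRENT_DATE - INTERVAL '30 DAY')::DATE;
$$;
The shell script would then just run CALL archive_old_posts();.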
Will your application go off the rails if it sees identical rows in both post and post_archive while the migration is in progress? If so, you'll need to wrap your SQL statements in a transaction. That way other users of the database won't see the duplicate rows. Do this.
#!/bin/sh
psql postgres://username:password@hostname:5432/database << SQLSTATEMENTS
START TRANSACTION;
INSERT INTO post_archive SELECT * FROM post
WHERE created_at <= (CURRENT_DATE - INTERVAL '30 DAY')::DATE;
DELETE FROM post
WHERE created_at <= (CURRENT_DATE - INTERVAL '30 DAY')::DATE;
COMMIT;
SQLSTATEMENTS
Cronjobs are quite reliable on Linux and FreeBSD.

Postgres - create a column (alter table) as a calculation of two other columns

I have a table in Postgres that contains task start and task end dates. Is it possible to generate a column in this table as the ratio (current day - start day) / (start day - end day), i.e. the % of time elapsed? I tried it this way, but it does not work.
ALTER TABLE public.gantt_task
ADD COLUMN percentage_progress
GENERATED ALWAYS AS (
(DATEDIFF("day", CURRENT_DATE, public.gantt_Tasks.start_date)) /
DATEDIFF("day", public.gantt_Tasks.end_date, public.gantt_Tasks.start_date))
STORED
The manual says Postgres only supports materialized (i.e. stored) generated columns, which means the value is generated when the row is inserted or updated, so it would use the insert/update date, not the CURRENT_DATE you want.
So you need to create a view instead. A view evaluates CURRENT_DATE at the time of the SELECT, not at INSERT/UPDATE time, when generating the column.
CREATE VIEW foo AS SELECT *,
(CURRENT_DATE - public.gantt_Tasks.start_date)
/ (public.gantt_Tasks.end_date-public.gantt_Tasks.start_date)
AS percentage_progress
FROM public.gantt_task
Note that DATEDIFF is MySQL syntax, not Postgres, and division by zero is not allowed, so if start_date and end_date can be identical you'll have to adjust the expression depending on what you want. Also, your expression will go over 100% once CURRENT_DATE is later than end_date. Perhaps something like:
least( 1.0, (CURRENT_DATE-start_date)/greatest( 1, end_date-start_date)::FLOAT )
I won't write out the full SQL, but you might/should split it into two or three tasks (a rough sketch follows the list):
Add a new column that allows NULL (that should be the default)
Update the table
Add constraints (if required)
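A minimal sketch of those steps against the same gantt_task table (the column and constraint names are illustrative, and the value is a snapshot computed at UPDATE time, not kept current automatically):
ALTER TABLE public.gantt_task ADD COLUMN percentage_progress double precision;  -- nullable by default
UPDATE public.gantt_task
SET percentage_progress = (CURRENT_DATE - start_date)::float / greatest(1, end_date - start_date);
ALTER TABLE public.gantt_task
ADD CONSTRAINT percentage_progress_not_negative CHECK (percentage_progress >= 0);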

Need to fix timestamps in my TimescaleDB database (the number of seconds provided to TO_TIMESTAMP was incorrect by exactly a factor of 1000)

I have a TimescaleDB database in which some of the timestamps across several tables are incorrect: I inadvertently gave the TO_TIMESTAMP() function the number of milliseconds in Unix time instead of seconds, so all of these timestamps are 1000 times further from 1970 than they should be. I can easily isolate the rows that need to be fixed with a check for future dates in the WHERE clause, but I am a little stuck on how to convert and replace these incorrect timestamps. I essentially need to get the Unix time representation, divide it by 1000, and replace that value in the row, but my SQL is too rusty to piece this query together.
I see that I can use extract(epoch from ...) to get the number of seconds, but it is not clear to me how to do this for every row and then update its timestamp.
Edit:
When using the query:
UPDATE table_name
SET time = TO_TIMESTAMP(extract(epoch from time) / 1000.0)
WHERE
time > '2020-01-01 00:00:00';
I get the error:
new row for relation "_hyper_8_295_chunk" violates check constraint
"constraint_295"
I think it would probably be best to create a new hypertable and run an INSERT INTO ... SELECT from the old hypertable into the new one, or potentially do it in batches. This is because Timescale restricts updates of the partitioning keys so that items don't move between partitions. You can do a delete and then an insert to work around that, but it's going to be more efficient to just create a new hypertable, move everything over with the correct timestamps, and then rename, than to try doing updates.
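A minimal sketch of that approach, assuming a new hypertable new_conditions with the same schema has already been created (table and column names are illustrative; the 2020-01-01 cutoff mirrors the WHERE clause above):
INSERT INTO new_conditions
SELECT CASE WHEN time > '2020-01-01 00:00:00'  -- future-dated rows are the broken ones
            THEN TO_TIMESTAMP(extract(epoch FROM time) / 1000.0)
            ELSE time
       END AS time,
       device_id,  -- remaining columns copied as-is (illustrative)
       value
FROM old_conditions;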

PostgreSQL select count(*) takes too long

I have a table in my PostgreSQL database. The table has about 9,100,000 rows. When I execute the query select count(*) from table, the execution time is about 1.5 minutes. Is this normal? And what can I do to decrease this time?
If you want an estimate of the row count rather than an exact value, you can use a count estimate. It is much faster.
https://wiki.postgresql.org/wiki/Count_estimate
Another workaround is to maintain a counter in a statistics field and increment it every time a new row is added.
Also please read https://www.citusdata.com/blog/2016/10/12/count-performance/
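The simplest estimate from that wiki page just reads the planner's statistics; its accuracy depends on how recently the table was vacuumed or analyzed (my_table is a placeholder name):
SELECT reltuples::bigint AS estimate
FROM pg_class
WHERE oid = 'public.my_table'::regclass;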

Getting the difference in seconds with an SQL statement

I have a table in PostgreSQL which stores a timestamp with time zone for every row inserted.
How can I use PostgreSQL's functions to find the difference in seconds between the timestamp in one of the rows already inserted and the current PostgreSQL server timestamp?
Assuming that the column name is ts and the table name is t, you can query like this:
select current_timestamp - max(ts) from t;
If the table contains a large amount of data, this query will be very slow. In that case, you should have an index on the timestamp column.
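Note that the query above returns an interval; to get the difference as a number of seconds, extract the epoch from it (same assumed table and column names as above):
select extract(epoch from current_timestamp - max(ts)) as seconds_diff from t;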