Extracting date from day of the year and put it in a separate column - postgresql

I have a file, MOD17A1.A2002047.h20v09.058.2007117021342.hdf, that is stored in my database. A column named _filename_ holds file names like this one. 2002047 in the file name means day 47 of the year 2002. I need to extract the date in date format, like 2002-02-16, into a separate column for each file name.
I came up with this solution:
with S as (
select (select substring(filename from 10 for 4) || (select hiph)
|| (select substring(filename from 14 for 3))) as D
,mytable.rid as s_rid from mytable)
update mytable
set date_column = (select to_date(d, 'IYYY-IDDD')::date)
from S
where mytable.rid = S.s_rid
The only problem with this solution is that I need a column _hiph_ whose values are all '-'. Is there a way to run this command in PostgreSQL without the _hiph_ column? Or is there an alternative solution?

After untangling this .. monster I was able to simplify it .. a bit ..
UPDATE mytable
SET date_column = to_date(substr(filename, 10, 7), 'IYYYIDDD');
Seriously.
This updates all rows in the table.
hiph - obviously a hyphen (-) is not needed. Just adapt your pattern in to_date().
As commented, this produces the date '2002-02-15' for your example according to ISO format.
If the count of days starts with Jan. 1st use the pattern 'YYYYDDD' instead.
More about that in the manual.
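The one-day difference between the two patterns can be cross-checked outside the database. The following is a minimal Python sketch, not part of the original answer; it assumes Python 3.8+ for `date.fromisocalendar`, and the variable names are only illustrative. It slices the same seven characters as `substr(filename, 10, 7)` and computes both interpretations of day 47.

```python
from datetime import date, datetime, timedelta

filename = "MOD17A1.A2002047.h20v09.058.2007117021342.hdf"

# Same slice as substr(filename, 10, 7) in SQL (1-based there, 0-based here).
stamp = filename[9:16]  # "2002047"
year, day_of_year = int(stamp[:4]), int(stamp[4:])

# Calendar day of year (pattern 'YYYYDDD'): day 1 is January 1st.
calendar_date = datetime.strptime(stamp, "%Y%j").date()

# ISO day of year (pattern 'IYYYIDDD'): day 1 is the Monday of ISO week 1,
# which for 2002 is 2001-12-31.
iso_year_start = date.fromisocalendar(year, 1, 1)
iso_date = iso_year_start + timedelta(days=day_of_year - 1)

print(calendar_date)  # 2002-02-16
print(iso_date)       # 2002-02-15
```

Since the question expects 2002-02-16, the calendar pattern 'YYYYDDD' looks like the one that matches these file names.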

Related

Postgres: Storing output of moving average query to a column

I have a table in Postgres 14.2
Table name is test
There are 3 columns: date, high, and five_day_mavg (date is PK if it matters)
I have a select statement which properly calculates a 5 day moving average based on the data in high.
select date,
avg(high) over (order by date rows between 4 preceding and current row) as mavg_calc
from test
It produces output like this:
I have 2 goals:
First to store the output of the query in five_day_mavg.
Second to store this in such a way that when I add a new row with data
in high, it automatically calculates that value
The closest I got was:
update test set five_day_mavg = a.mav_calc
from (
select date,
avg(high) over (order by date rows between 4 preceding and current row) as mav_calc
from test
) a;
but all that does is set the value of every row in five_day_mavg to the average of the entire high column
Thanks to @a_horse_with_no_name
I played around with the WHERE clause
update test l
set five_day_mavg = b.five_day_mavg
from (
select date,
avg(high) over (order by date rows between 4 preceding and current row) as five_day_mavg
from test
) b
where l.date = b.date;
A couple of things: I gave each table an alias. The original table is aliased as l, and the derived table produced by the window function (the select statement in parentheses) is aliased as b. I joined the two in the WHERE clause on date, which is the index/primary key.
Also, I was using 'a' as the letter for alias, and I think that may have contributed to the issue.
Either way, solved now.
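To make the two moving parts concrete, here is a small Python sketch (an illustration only, with made-up sample data and hypothetical names). It mimics the frame `rows between 4 preceding and current row` and then assigns each result to the row with the same date, which is the role the `where l.date = b.date` condition plays in the UPDATE.

```python
from datetime import date, timedelta


def trailing_average(values, window=5):
    """Average of each value and up to window - 1 values before it."""
    result = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


# Made-up sample rows, ordered by date.
first_day = date(2022, 1, 3)
rows = [
    {"date": first_day + timedelta(days=i), "high": high, "five_day_mavg": None}
    for i, high in enumerate([10, 20, 30, 40, 50, 60])
]

# Derived set "b": one moving average per date.
b = dict(zip((r["date"] for r in rows), trailing_average([r["high"] for r in rows])))

# The join on date: every row receives the value computed for its own date.
for row in rows:
    row["five_day_mavg"] = b[row["date"]]

print([row["five_day_mavg"] for row in rows])
# [10.0, 15.0, 20.0, 25.0, 30.0, 40.0]
```

Without the lookup by date there is nothing tying a computed average to a particular row, which is why the first attempt wrote the same value everywhere.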

How to get a list of dates in Pervasive SQL

Our time & attendance database is a Pervasive/Actian Zen database. What I'm trying to do is create a query that just lists the next 14 days from today. I'll then cross apply this list of dates with employee records so that in effect I have a list of people/dates for the next 14 days.
I've done it with a recursive CTE on SQL server quite easily. I could also do it with a loop in SQL Server too but I can't figure it out with Pervasive SQL. Loops can only exist within Stored Procedures and triggers.
Looking around, I thought that this code that I found and adapted might work, but it doesn't (and further research suggests that there isn't a recursive option within Pervasive at all).
WITH RECURSIVE cte_numbers(n, xDate)
AS (
SELECT
0, CURDATE() + 1
UNION ALL
SELECT
n+1,
dateAdd(day,n,xDate)
FROM
cte_numbers
WHERE n < 14
)
SELECT
xDate
FROM
cte_numbers;
I just wondered whether anyone could help me write an SQL query that gives me this list of dates, outside of a stored procedure.
When you create a table like this:
CREATE TABLE dates(d DATE PRIMARY KEY, x INTEGER);
And create a first record like this:
INSERT INTO dates VALUES ('2021-01-01',0);
Then you can use the statement below, which doubles the number of records in the table dates every time it is executed (so you need to run it a couple of times).
When you run it 10 times, the table dates will have 21 October 2023 as its last date.
When you run it 12 times, the last date will be 19 March 2032.
INSERT INTO dates
SELECT
DATEADD(DAY,m.m+1,d),
x+m.m+1
from dates
cross join (select max(x) m from dates) m
order by d;
Of course the column x can optionally be dropped with the next statement, but after that you can no longer add records using the previous statement:
ALTER TABLE dates DROP COLUMN x;
Finally, to return the next 14 days from today:
SELECT d
FROM DATES
WHERE d BETWEEN CURDATE( ) AND DATEADD(DAY,13,CURDATE());
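The doubling behaviour and the two quoted end dates can be checked with a short simulation. This Python sketch is only a model of the INSERT ... SELECT statement above (the function name is made up); it keeps the table as a list of (d, x) pairs and repeats the insert 12 times.

```python
from datetime import date, timedelta


def run_doubling_insert(rows):
    """Model one execution of the INSERT: each existing row yields one new row."""
    m = max(x for _, x in rows)  # select max(x) m from dates
    new_rows = [(d + timedelta(days=m + 1), x + m + 1) for d, x in rows]
    return rows + new_rows


rows = [(date(2021, 1, 1), 0)]  # the first record
last_dates = {}
for run in range(1, 13):
    rows = run_doubling_insert(rows)
    last_dates[run] = max(d for d, _ in rows)

print(len(rows))       # 4096
print(last_dates[10])  # 2023-10-21
print(last_dates[12])  # 2032-03-19
```

After n runs the table holds 2^n consecutive dates, so a handful of executions is enough to cover many years.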

Selecting certain columns from a table with dates as columns

I have a table whose column names are months, such as "2020-05", "2020-06", "2020-07", and so on, spanning many months. I need to select only the columns for the current month, the next month, and the month after that from this table. (DB: PostgreSQL version 11)
But since the column names are text in the format YYYY-MM, how can I select only the current month and the following 2 months from this table without hard-coding the column names?
Below is the table structure. Name: static_data
The required SELECT statement is like the one below. The table contains 14 months of data, with the dates as columns, as in the screenshot above. From this I want the current month and the next 2 month columns along with their data, something like below.
SELECT "2020-05","2020-06","2020-07" from static
-- SELECT Current month and next 2 months
Required output:
It's nearly impossible to get the actual value of the current month as the column name, but you can do something like this:
select d.item_sku,
d.status,
to_jsonb(d) ->> to_char(current_date, 'yyyy-mm') as current_month,
to_jsonb(d) ->> to_char(current_date + interval '1 month', 'yyyy-mm') as "month + 1",
to_jsonb(d) ->> to_char(current_date + interval '2 month', 'yyyy-mm') as "month + 2"
from bad_design d
;
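The idea behind the query above is to turn the row into a key/value mapping and look up keys that are computed from the current date. A rough Python sketch of the same lookup follows; the row contents and the helper name `month_keys` are made up for illustration.

```python
from datetime import date


def month_keys(today, count=3):
    """Return 'YYYY-MM' labels for the month of `today` and the months after it."""
    keys = []
    year, month = today.year, today.month
    for _ in range(count):
        keys.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:  # roll over into the next year
            year, month = year + 1, 1
    return keys


# A made-up row as a mapping, similar in spirit to what to_jsonb(d) produces.
row = {"item_sku": 1, "status": "active",
       "2020-05": "a", "2020-06": "b", "2020-07": "c", "2020-08": "d"}

selected = {key: row.get(key) for key in month_keys(date(2020, 5, 15))}
print(selected)  # {'2020-05': 'a', '2020-06': 'b', '2020-07': 'c'}
```

A month that has no matching column simply comes back as a missing value, much as `->>` returns NULL for a key that does not exist.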
Technically, you can use the information schema to achieve this. But, like GMB said, please re-design your schema and do not approach this issue like this, in the first place.
The special schema information_schema contains metadata about your DB. Among these are details about existing columns. In other words, you can query it and convert the column names into dates to compare them to what you need.
Here are a few hints.
Query existing column names.
SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'your_schema'
AND table_name = 'your_table'
Compare two dates.
SELECT now() + INTERVAL '3 months' < now() AS compare;
compare
---------
f
(1 row)
You're already pretty close with the conversion yourself.
Have fun and re-design your schema!
Disclaimer: this does not answer your question - but it's too long for a comment.
You need to fix the design of this table. Instead of storing dates in columns, you should have each date on a separate row.
There are numerous drawbacks to your current design:
very simple queries are utterly complicated : filtering on dates, aggregation... All these operations require dynamic SQL, which adds a great deal of complexity
adding or removing new dates requires modifying the structure of the table
storage is wasted for rows where not all columns are filled
Instead, consider this simple design, with one table that stores the master data of each item_sku, and a child table that stores one value per item and date:
create table myskus (
item_sku int primary key,
name text,
cat_level_3_name text
);
create table myvalues (
item_sku int references myskus(item_sku),
date_sku date,
value_sku text,
primary key (item_sku, date_sku)
);
Now your original question is easy to solve:
select v.*, s.name, s.cat_level_3_name
from myskus s
inner join myvalues v on v.item_sku = s.item_sku
where
v.date_sku >= date_trunc('month', now())
and v.date_sku < date_trunc('month', now()) + interval '3 month'

Add dates ranges to a table for individual values using a cursor

I have a calendar table called CalendarInformation that gives me a list of dates from 2015 to 2025. This table has a column called BusinessDay that shows what dates are weekends or holidays. I have another table called OpenProblemtimeDiffTable with a column called number for my problem number and a date for when the problem was opened called ProblemNew and another date for the current column called Now. What I want to do is for each problem number grab its date ranges and find the dates between and then sum them up to give me the number of business days. Then I want to insert these values in another table with the problem number associated with the business day.
Thanks in advance and I hope I was clear.
TRUNCATE TABLE ProblemsMoreThan7BusinessDays
DECLARE @date AS date
DECLARE @businessday AS INT
DECLARE @Startdate as DATE, @EndDate as DATE
DECLARE CONTACT_CURSOR CURSOR FOR
SELECT date, businessday
FROM CalendarInformation
OPEN contact_cursor
FETCH NEXT FROM Contact_cursor INTO @date, @businessday
WHILE (@@FETCH_STATUS=0)
BEGIN
SELECT @enddate= now FROM OpenProblemtimeDiffTable
SELECT @Startdate= problemnew FROM OpenProblemtimeDiffTable
SET @Date=@Startdate
PRINT @enddate
PRINT @startdate
SELECT @businessday= SUM (businessday) FROM CalendarInformation WHERE date > @startdate AND date <= @Enddate
INSERT INTO ProblemsMoreThan7BusinessDays (businessdays, number)
SELECT @businessday, number
FROM OpenProblemtimeDiffTable
FETCH NEXT FROM CONTACT_CURSOR INTO @date, @businessday
END
CLOSE CONTACT_CURSOR
DEALLOCATE CONTACT_CURSOR
I tried this code using a cursor and I'm close, but I cannot get the date ranges to change for each row.
So if I have a problem number with a date range from 02-07-2018 to 05-20-2019, I would want my new table to hold the sum of business days from the calendar along with the problem number. So my output would be number PROB0421 and businessdays (with the correct sum). Then comes the next problem, PROB0422, with a date range of 11-6-18 to 5-20-19. So my output would be PROB0422 with the correct sum of business days.
Rather than doing this with a cursor, you should approach this in a set-based manner. That you already have a calendar table makes this a lot easier. The basic approach is to select from your data table and join to your calendar table to return all the calendar rows that fall within your date range. From there you can aggregate as required.
This would look something like the below, though apply it to your situation and adjust as required:
select p.ProblemNow
,p.Now
,sum(c.BusinessDay) as BusinessDays
from dbo.Problems as p
join dbo.calendar as c
on c.CalendarDate between p.ProblemNow and p.Now
and c.BusinessDay = 1
group by p.ProblemNow
,p.Now
I think you can do this without a cursor. Should only require a single insert..select statement.
I assume your "businessday" column is just a bit or flag-type field that is 1 if the date is a business day and 0 if not? If so, this should work (or something close to it if I'm not understanding your environment properly):
insert into ProblemsMoreThan7BusinessDays
(
businessdays
, number
)
select
sum( ci.businessday ) -- or count(*); listed first to match the column list above
, op.number
from OpenProblemtimeDiffTable op
inner join CalendarInformation ci on ci.[date] >= op.problemnew -- calendar date falls inside the problem's range
and ci.[date] <= op.[now]
and ci.businessday = 1
group by
op.number
I usually try to avoid cursors and working with data in a procedural manner, especially if I can handle the task as above. Don't think of the data as thousands of individual rows; think of it as just two sets of data. How do they relate?
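As a way to reason about the set-based approach, the same join-and-aggregate logic can be sketched in Python. Everything here is an assumption for illustration: the calendar only flags weekends (holidays are not modelled), the problem dates are made up, and both ends of each range are counted.

```python
from datetime import date, timedelta


def build_calendar(start, end):
    """Hypothetical calendar: maps each date to 1 for Monday-Friday, else 0."""
    days = (end - start).days + 1
    return {
        start + timedelta(days=n): int((start + timedelta(days=n)).weekday() < 5)
        for n in range(days)
    }


calendar = build_calendar(date(2018, 1, 1), date(2018, 12, 31))

# Made-up problems: number -> (opened, now).
problems = {
    "PROB0421": (date(2018, 2, 5), date(2018, 2, 11)),    # Monday to Sunday
    "PROB0422": (date(2018, 11, 6), date(2018, 11, 20)),  # Tuesday to Tuesday
}

# "Join" each problem to the calendar rows inside its range, then sum the flags.
business_days = {
    number: sum(flag for day, flag in calendar.items() if opened <= day <= now)
    for number, (opened, now) in problems.items()
}

print(business_days)  # {'PROB0421': 5, 'PROB0422': 11}
```

Each problem gets its own range and its own total in a single pass, with no loop variable to reset between problems.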

psql 8.4.1 select all the person born in a specific month

I am supposed to select all the persons born in July (or 07). This did not work:
select * from people where date_trunc('month',dob)='07';
ERROR: invalid input syntax for type timestamp with time zone: "07"
LINE 1: ...ct * from people where date_trunc('month',dob)='07';
What is the right way?
to_char() is meant to format dates. For a condition like yours, extract() is simpler & faster:
SELECT *
FROM people
WHERE extract(month FROM dob) = 7;
If you want to search for
a specific year and month too (YYYY-MM)
... like mentioned in the comment, use date_trunc() like you had initially. Just compare it to a date or timestamp, not to a string, which wouldn't make any sense (and was the cause of the error message). To find people born July 1970:
SELECT *
FROM people
WHERE date_trunc('month', dob) = '1970-07-01 0:0'::timestamp;
If performance is relevant, rewrite that to:
SELECT *
FROM people
WHERE dob >= '1970-07-01 0:0'::timestamp
AND dob < '1970-08-01 0:0'::timestamp; -- note the < with the upper limit
Because this form can use a plain index on people.dob:
CREATE INDEX people_dob_idx ON people (dob);
... and will therefore massively outperform the previous queries on big tables. It doesn't matter much with small tables.
You could also speed up the first query with a functional index, if needed.
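The reason for `<` rather than `<=` on the upper limit is easiest to see with timestamps right at the month boundary. Below is a minimal Python sketch of the same half-open range test; the function name is made up for illustration.

```python
from datetime import datetime


def born_in_month(dob, year, month):
    """True if dob lies in [first day of month, first day of next month)."""
    start = datetime(year, month, 1)
    if month == 12:
        end = datetime(year + 1, 1, 1)
    else:
        end = datetime(year, month + 1, 1)
    return start <= dob < end


print(born_in_month(datetime(1970, 7, 31, 23, 59, 59), 1970, 7))  # True
print(born_in_month(datetime(1970, 8, 1, 0, 0, 0), 1970, 7))      # False
```

The last second of July is included and midnight on August 1st is excluded, so no timestamp is counted in two months.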
select * from people where to_char(dob, 'MM') = '09';
gives you all people who were born in September, if the date of birth is stored in a timestamp table column called 'dob'.
The second param is the date format pattern. All typical patterns should be supported.
E.g.:
select * from people where to_char(dob, 'MON') = 'SEP';
would do the same.
See the "Data Type Formatting Functions" section of the PostgreSQL manual for the timestamp format patterns.