Can I use to_char() and make_date() in a PostgreSQL table definition?

I'm working on a PoC to migrate an on-prem SQL Server database to Amazon Aurora for PostgreSQL. Amazon's Schema Conversion Tool struggled to translate this computed column from one of the SQL Server table definitions:
[DOB] AS (CONVERT([varchar],datefromparts([DOB_year],[DOB_month],[DOB_day]),(120))) PERSISTED,
as the CONVERT function is unsupported in Postgres.
The best translation I can come up with is:
dob varchar(30) GENERATED ALWAYS AS (to_char((make_date(dob_year, dob_month, dob_day))::timestamp, 'YYYY-MM-DD HH24:MI:SS')) STORED,
but neither the SCT nor pgAdmin 4 is recognising to_char() and make_date() as functions. 'dob_day', 'dob_month' and 'dob_year' are all column names with a data type of integer. I'm new to all this, but another column definition uses other functions, e.g. replace() and right(), successfully, so I'm confused why this isn't working.
When I tried to run the code in pgAdmin I got this error:
ERROR: generation expression is not immutable
SQL state: 42P17
Thanks

to_char() is not marked as immutable, even though in your case it would be. But there are format masks that are not immutable, e.g. if time zones or different locales are involved.
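For example, format masks with the TM ("translation mode") prefix depend on the lc_time setting, so the same call can yield different output in different sessions. A minimal sketch; the locale names are assumptions and vary by platform:
-- assumes both locales are installed on the server
SET lc_time = 'en_US.utf8';
SELECT to_char(date '2020-01-01', 'TMMonth');  -- January

SET lc_time = 'de_DE.utf8';
SELECT to_char(date '2020-01-01', 'TMMonth');  -- Januar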
If you really want to (or are forced to) convert day, month and year into a formatted string (rather than into a proper date, which would be the correct thing to do), then you can only achieve this with a custom function:
create function create_string_date(p_year int, p_month int, p_day int)
returns text
as
$$
select to_char(make_date(p_year, p_month, p_day), 'yyyy-mm-dd hh24:mi:ss');
$$
language sql
immutable;
Marking the function as immutable isn't cheating, because we know that with the given input and format string this is indeed immutable.
dob text generated always as (create_string_date(dob_year, dob_month, dob_day)) stored
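For comparison, if you can store a proper date instead, no helper function is needed at all, because make_date() itself is immutable:
dob date generated always as (make_date(dob_year, dob_month, dob_day)) stored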

Related

Redshift datediff() vs. date_diff()

It appears that Redshift supports two functions for computing the interval between two DATE-like values: DATEDIFF() and date_diff(). The following code snippet provides an example of this behavior:
SELECT datediff(DAYS, '2021-01-01'::DATE, '2021-02-01'::DATE) AS datediff_interval_output
, datediff('day', '2021-01-01'::DATE, '2021-02-01'::DATE) AS datediff_str_literal_output
, date_diff('day', '2021-01-01'::DATE, '2021-02-01'::DATE) AS date_diff_output
;
AWS provides documentation for DATEDIFF(); however, no record of date_diff() appears to exist in either the Redshift or PostgreSQL documentation. A curious difference between the two functions is that DATEDIFF() will accept either a bare date part for its first argument (e.g. DAY, MONTH, SECOND) or a string literal ('day', 'month', 'second'), whereas date_diff() only accepts string-literal interval representations.
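Based on that difference, only the string-literal form is expected to work with date_diff() (the exact Redshift error text isn't reproduced here):
SELECT date_diff(day, '2021-01-01'::DATE, '2021-02-01'::DATE);   -- rejected: bare date part
SELECT date_diff('day', '2021-01-01'::DATE, '2021-02-01'::DATE); -- works, returns 31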
The only reference I can find to date_diff() is an entry within pg_catalog which returns the following auto-generated definition:
CREATE FUNCTION date_diff(text, time with time zone, time with time zone) RETURNS bigint
IMMUTABLE
LANGUAGE internal AS
$$
begin
-- missing source code
end;
$$;
Does anyone have additional insight around the origin of the date_diff() function, and why it's included on Redshift but is undocumented?
Note: Using Redshift Version: 1.0.25109

Changing the format of my parameter specifically

In my Pentaho Data Integration job I enter a parameter DATE, e.g. 2016-03-15 (the exact format doesn't matter to me).
Now I want to use this parameter in a Call DB Procedure step, so I need it in the format PL/SQL expects. The PL/SQL procedure looks like this: start_test(key_date date, name varchar2)
I have tried to solve it with the Select Values step, but it hasn't worked so far...
What do I need to change so my parameter works with Call DB Procedure?
Thanks.
I'm not familiar with Pentaho, so I'm not sure in what context you'll be calling the PL/SQL procedure, but I hope you'll find the following helpful.
date is a native Oracle data type. If you have a string representing a date, you have to convert it to a "real" date with the to_date function:
begin
  start_test(
    key_date => to_date('2016-03-15', 'YYYY-MM-DD'),
    name => 'a clever test name'
  );
end;
The second to_date parameter is a format model that has to match the date string (the first parameter).
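For example, if the parameter arrived in a different layout, the mask would have to change to match (a hypothetical value, run against Oracle's dual):
select to_date('15.03.2016', 'DD.MM.YYYY') from dual;  -- same date, day.month.year input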

Pentaho Data Integration Input / Output Bit Type Error

I am using Pentaho Data Integration for numerous projects at work. We predominantly use Postgres for our databases. One of our older tables has two columns of type bit(1) that store 0 for false and 1 for true.
My task is to synchronize a production table with a copy in our development environment. I am reading the data in using a Table Input step and immediately trying to do an Insert/Update. However, it fails because of the conversion to Boolean by PDI. I updated the query to cast the values to integers to retain the 0 and 1, but when I run it again, my transformation fails because an integer cannot be a bit value.
I have spent several days trying different things, like using the JavaScript step to convert to a bit, but I have not been able to successfully read in a bit type and use the Insert/Update step to store the data. I also do not believe the Insert/Update step is capable of modifying the SQL that defines the data type for the column.
The database connection is set up using:
Connection Type: PostgreSQL
Access: Native (JDBC)
Supports the boolean data type: true
Quote all in database: true
Note: Altering the table to change the data type is not an option at this point in time. Too many applications currently depend on this table, so altering it in this way could cause undesirable effects.
Any help would be appreciated. Thank you.
You can create a cast object (for example, from character varying to bit) in your destination database with the AS ASSIGNMENT option. AS ASSIGNMENT allows the cast to be applied automatically during inserts.
http://www.postgresql.org/docs/9.3/static/sql-createcast.html
Here is some proof-of-concept for you:
CREATE FUNCTION cast_char_to_bit (arg CHARACTER VARYING)
RETURNS BIT(1) AS
$$
SELECT
  CASE WHEN arg = '1' THEN B'1'
       WHEN arg = '0' THEN B'0'
       ELSE NULL
  END
$$
LANGUAGE SQL;
CREATE CAST (CHARACTER VARYING AS BIT(1))
WITH FUNCTION cast_char_to_bit(CHARACTER VARYING)
AS ASSIGNMENT;
Now you should be able to insert/update single-character strings into the bit(1) column. However, you will need to cast your input column to character varying/text, so that it is read as a String in the Table Input step and passed as CHARACTER VARYING in the Insert/Update step.
Probably you could create the cast object using existing cast functions that are already defined in Postgres (see the pg_cast, pg_type and pg_proc catalogs, joined by oid), but I haven't managed to do this, unfortunately.
Edit 1:
Sorry for the previous solution. Adding a cast from boolean to bit looks much more reasonable: you will not even need to cast data in your table input step.
CREATE FUNCTION cast_bool_to_bit (arg boolean)
RETURNS BIT(1) AS
$$
SELECT
  CASE WHEN arg THEN B'1'
       WHEN NOT arg THEN B'0'
       ELSE NULL
  END
$$
LANGUAGE SQL;
CREATE CAST (BOOLEAN AS BIT(1))
WITH FUNCTION cast_bool_to_bit(boolean)
AS ASSIGNMENT;
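For illustration, once this cast exists, boolean values can be assigned straight into a bit(1) column; a minimal sketch with made-up names:
-- hypothetical table standing in for the legacy schema
CREATE TABLE legacy_flags (is_active BIT(1));

-- the AS ASSIGNMENT cast converts the booleans to bit(1) on insert
INSERT INTO legacy_flags (is_active) VALUES (true), (false);

SELECT * FROM legacy_flags;  -- returns 1 and 0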
I solved this by writing out the Postgres insert SQL (with B'1' and B'0' for the bit values) in a previous step and using "Execute row SQL Script" at the end to run each insert as individual SQL statements.
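For reference, the generated per-row statements then take roughly this shape (table and column names are placeholders):
INSERT INTO target_table (id, is_active) VALUES (42, B'1');
INSERT INTO target_table (id, is_active) VALUES (43, B'0');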

Create timestamp index from JSON on PostgreSQL

I have a table in PostgreSQL with a jsonb field named data containing a lot of objects, and I want to create an index to speed up the queries. I'm only using a few rows to test (just 15), but I don't want to have problems with the queries in the future. I'm getting data from the Twitter API, so in a week I get around 10 GB of data. If I create a normal index
CREATE INDEX ON tweet((data->>'created_at'));
I get a text index. If I run:
Create index on tweet((CAST(data->>'created_at' AS timestamp)));
I get
ERROR: functions in index expression must be marked IMMUTABLE
I've tried to make it immutable by setting the time zone with
date_trunc('seconds', CAST(data->>'created_at' AS timestamp) at time zone 'GMT')
but I'm still getting the "immutable" error. So, how can I build a timestamp index from the JSON? I know I could add a plain date column, since the value will probably remain constant over time, but I want to learn how to do it this way.
This expression won't be allowed in the index either:
(CAST(data->>'created_at' AS timestamp) at time zone 'UTC')
It's not immutable, because the first cast depends on your DateStyle setting (among other things). It doesn't help to translate the result to UTC after the cast; the uncertainty has already crept in ...
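A quick way to see why: the same literal parses differently depending on the session's DateStyle.
SET datestyle = 'ISO, DMY';
SELECT '02/03/2021'::timestamp;  -- 2021-03-02 00:00:00

SET datestyle = 'ISO, MDY';
SELECT '02/03/2021'::timestamp;  -- 2021-02-03 00:00:00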
The solution is a function that makes the cast immutable by fixing the time zone (as @a_horse already hinted).
I suggest using to_timestamp() (which is also only STABLE, not IMMUTABLE) instead of the cast, to rule out one source of trouble: DateStyle.
CREATE OR REPLACE FUNCTION f_cast_isots(text)
RETURNS timestamptz AS
$$SELECT to_timestamp($1, 'YYYY-MM-DD HH24:MI')$$ -- adapt to your needs
LANGUAGE sql IMMUTABLE;
Note that this returns timestamptz. Then:
CREATE INDEX foo ON t (f_cast_isots(data->>'created_at'));
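Queries that repeat the same expression can then use the index; the comparison value below is just an example:
SELECT *
FROM   t
WHERE  f_cast_isots(data->>'created_at') >= timestamptz '2021-01-01 00:00+00';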
Detailed explanation for this technique in this related answer:
Does PostgreSQL support "accent insensitive" collations?
Related:
Query on a time range ignoring the date of timestamps

Rails 4, migration to change datatype of column from daterange to tsrange causing PG::DatatypeMismatch: ERROR:

I'm trying to change a column of type daterange to tsrange (I realized I need time as well as date) using a vanilla Rails migration
def self.up
change_column :events, :when, :tsrange
end
After running rake db:migrate the error is
PG::DatatypeMismatch: ERROR: column "when" cannot be cast automatically to type tsrange
HINT: Specify a USING expression to perform the conversion.
: ALTER TABLE "events" ALTER COLUMN "when" TYPE tsrange
I tried following the hint and used the following
def self.up
change_column :events, :when, :tsrange, 'tsrange USING CAST(when AS tsrange)'
end
but then got
no implicit conversion of Symbol into Integer
From what I can tell, USING CAST is mainly meant for use with ints. Assuming I don't want to drop and then recreate the column, what do you have to specify to alter the type from daterange to tsrange?
I'm using
Rails 4.0.1
ruby-2.0.0-p247
psql (9.2.4)
Some background, daterange and tsrange were introduced to Rails 4 in the following PR: https://github.com/rails/rails/pull/7345. Thanks.
The USING clause is used to specify how to convert the old values to the new ones:
The optional USING clause specifies how to compute the new column value from the old; if omitted, the default conversion is the same as an assignment cast from old data type to new. A USING clause must be provided if there is no implicit or assignment cast from old to new type.
So USING shows up any time there is no default cast from the old type to the new type. Also note that the clause is specified as USING expression, so any expression (whose value is of the correct type) can be used with it; the most common form is USING CAST(...), but the expression can be pretty much anything.
Hopefully that should clear up some confusion about USING.
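As an illustration of how flexible the expression can be (table and column names here are made up), a text column of comma-separated numbers could be converted like this:
ALTER TABLE measurements
  ALTER COLUMN readings
  TYPE integer[]
  USING string_to_array(readings, ',')::integer[];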
So what's up with the ActiveRecord error? Well, change_column expects an options Hash as its fourth argument but you're passing in a String. If you look at the change_column source, you'll see things like options[:limit], and since String#[] expects integer arguments, your string argument triggers odd-looking complaints about Symbols.
AFAIK there is no way to get AR to add a USING clause to the ALTER TABLE ... ALTER COLUMN that change_column generates. This leaves connection.execute(some_sql) if you need a USING clause. Of course this is further complicated by the (apparent) lack of a built-in cast from daterange to tsrange but building the necessary expression isn't terribly difficult if you pull the daterange apart with the upper and lower functions:
connection.execute(%q{
alter table events
alter column "when"
type tsrange using tsrange(lower("when"), upper("when"))
})
You can see the table change in action over here: http://sqlfiddle.com/#!15/fb047/2
That assumes that you're using the default half-open intervals ([...)) for your ranges; if you have ranges that aren't closed on the left and open on the right then you'll have to build a more complicated USING expression using the other range functions to see if the left and right ends of the ranges are open or closed.
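A sketch of such an expression, using the standard lower_inc()/upper_inc() functions and the three-argument tsrange() constructor (note that empty ranges would still need separate handling):
alter table events
  alter column "when"
  type tsrange
  using tsrange(
    lower("when"),
    upper("when"),
    case when lower_inc("when") then '[' else '(' end ||
    case when upper_inc("when") then ']' else ')' end
  );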
BTW, when is a PostgreSQL keyword, so it isn't the best choice for an identifier; you'll have to write "when" every time you refer to that column in SQL snippets, and that might get tiring. I'd recommend using a different name for that column so that you don't have to worry about quoting.