PostgreSQL create index on cast from string to date - postgresql

I'm trying to create an index on the cast of a varchar column to date. I'm doing something like this:
CREATE INDEX date_index ON table_name (CAST(varchar_column AS DATE));
I'm getting the error: functions in index expression must be marked IMMUTABLE. But I don't get why; the cast to date doesn't depend on the time zone or anything like that (which is what makes a cast to timestamp with time zone give this error).
Any help?

Your first error was to store a date as a varchar column. You should not do that.
The proper fix for your problem is to convert the column to a real date column.
Now I'm pretty sure the answer to that statement is "I didn't design the database and I cannot change it", so here is a workaround:
The cast from varchar to date and to_date() are not immutable because they can return different values for the same input value depending on the current session's settings (for example DateStyle).
If you know you have a consistent format of all values in the table (which - if you had - would mean you can convert the column to a real date column) then you can create your own function that converts a varchar to a date and is marked as immutable.
create or replace function fix_bad_datatype(the_date varchar)
returns date
language sql
immutable
as
$body$
select to_date(the_date, 'yyyy-mm-dd');
$body$;
With that definition you can create an index on the expression:
CREATE INDEX date_index ON table_name (fix_bad_datatype(varchar_column));
But you have to use exactly that function call in your query so that Postgres uses it:
select *
from table_name
where fix_bad_datatype(varchar_column) < current_date;
Note that this approach will fail badly if you have just one "illegal" value in your varchar column. The only sensible solution is to store dates as dates.
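Before creating the index, it may help to look for values that do not have the expected shape. A minimal sketch, assuming the values are meant to look like yyyy-mm-dd and reusing the placeholder names table_name and varchar_column from the question:

```sql
-- List values that do not look like four digits, two digits, two digits.
-- This only checks the shape of the string; it does not prove that the
-- value is a valid calendar date (for example '2021-02-30' passes).
select varchar_column
from table_name
where varchar_column is not null
  and varchar_column !~ '^[0-9]{4}-[0-9]{2}-[0-9]{2}$';
```

Any rows this returns would need cleaning up before the expression index can be relied on.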

Please provide the database version, table ddl, and some example data.
Would making your own immutable function do what you want, like this? Also look into creating a new cast in the docs and see if that does anything for you.
create table emp2 (emp2_id integer, hire_date VARCHAR(100));
insert into emp2(hire_date)
select now();
select cast(hire_date as DATE)
from emp2;
CREATE FUNCTION my_date_cast(VARCHAR) RETURNS DATE
AS 'select cast($1 as DATE)'
LANGUAGE SQL
IMMUTABLE
RETURNS NULL ON NULL INPUT;
CREATE INDEX idx_emp2_hire_date ON emp2 (my_date_cast(hire_date));

Related

Implicitly cast an ISO8601 string to TIMESTAMPTZ (postgresql) for Debezium

I am using a 3rd party application (Debezium Connector). It has to write date time strings in ISO-8601 format into a TIMESTAMPTZ column. Unfortunately this fails, because there is no implicit cast from varchar to timestamp tz.
I did notice that the following works:
SELECT TIMESTAMPTZ('2021-01-05T05:17:46Z');
SELECT TIMESTAMPTZ('2021-01-05T05:17:46.123Z');
I tried the following:
Create a function and a cast
CREATE OR REPLACE FUNCTION varchar_to_timestamptz(val VARCHAR)
RETURNS timestamptz AS $$
SELECT TIMESTAMPTZ(val) INTO tstz;
$$ LANGUAGE SQL;
CREATE CAST (varchar as timestamptz) WITH FUNCTION varchar_to_timestamptz (varchar) AS IMPLICIT;
Unfortunately, it gives the following errors:
function timestamptz(character varying) does not exist
I also tried the same as above but using plpgsql and got the same error.
I tried writing a manual parse, but had issues with the optional microsecond segment which gave me the following
CREATE OR REPLACE FUNCTION varchar_to_timestamptz (val varchar) RETURNS timestamptz AS $$
SELECT CASE
WHEN $1 LIKE '%.%'
THEN to_timestamp($1, 'YYYY-MM-DD"T"HH24:MI:SS.USZ')::timestamp without time zone at time zone 'Etc/UTC'
ELSE to_timestamp($1, 'YYYY-MM-DD"T"HH24:MI:SSZ')::timestamp without time zone at time zone 'Etc/UTC' END $$ LANGUAGE SQL;
Which worked, but didn't feel correct.
Is there a better way to approach this implicit cast?
If the value should be converted upon insert, define an assignment cast. You need no function; using the type input and output functions will do:
CREATE CAST (varchar AS timestamptz) WITH INOUT AS ASSIGNMENT;
Be warned that messing with the casts on standard data types can lead to problems, because it increases the ambiguity. It would be much better if you could find a way to use an explicit cast.
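For comparison, an explicit cast needs no changes to the system casts. A minimal sketch, assuming a hypothetical table events with a timestamptz column occurred_at (neither name is from the question):

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE events (occurred_at timestamptz);

-- The varchar value is converted explicitly; no custom cast is needed.
INSERT INTO events (occurred_at)
VALUES (CAST('2021-01-05T05:17:46.123Z'::varchar AS timestamptz));
```

Whether the 3rd party application allows such a cast to be injected into its statements is a separate question.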

Creating insert function with TIMESTAMP

I created simple table with a simple function, to insert some logs for the elapsed semester:
CREATE TABLE log_elapsedsemester(
sy char(9) NOT NULL,
sem char(1) NOT NULL,
date_recorded TIMESTAMP NOT NULL,
recordedby varchar(255)
);
CREATE OR REPLACE FUNCTION addelapsedsemester(p_sy char,p_sem char,p_date_recorded
TIMESTAMP,p_recordedby varchar)
returns void
AS
$$
BEGIN
insert into log_elapsedsemester (sy,sem,date_recorded,recordedby) values
(p_sy,p_sem,p_date_recorded,p_recordedby);
END
$$
LANGUAGE plpgsql;
But every time I use
select addelapsedsemester('2019-2020','1',now(),'sample#gmail.com');
I get the error:
No function matches the given name and argument types. You might need to add explicit type casts.
If I use a simple INSERT with no function it inserts successfully:
insert into log_elapsedsemester(sy,sem,date_recorded,recordedby) values ('2020-2021','1',now(),'sample#gmail.com');
I'm using PostgreSQL 9.5 with pgadmin III.
You need to cast to timestamp explicitly. Like:
SELECT addelapsedsemester('2019-2020', '1', now()::timestamp,'sample#gmail.com');
Or use LOCALTIMESTAMP instead of now()::timestamp (equivalent).
The function now() returns type timestamp with time zone (timestamptz), while your function takes timestamp without time zone (timestamp). The now() function produces a typed value (unlike the other untyped literals), where Postgres is more hesitant to coerce it to a different type. Function type resolution does not succeed.
The same type coercion still works for the bare INSERT command because (quoting the manual):
If the expression for any column is not of the correct data type, automatic type conversion will be attempted.
Be aware that the cast from timestamptz to timestamp depends on the current timezone setting of the session. You may want to be more explicit. Like now() AT TIME ZONE 'Europe/London'. Or use timestamptz to begin with. Then your original call without cast just works. See:
Now() without timezone
Also, you most probably do not want to use the type char, which is misleading short syntax for character(1). Use text or varchar instead. See:
Any downsides of using data type "text" for storing strings?
This table definition would make more sense:
CREATE TABLE log_elapsedsemester(
sy varchar(9) NOT NULL
, sem integer NOT NULL
, date_recorded timestamptz NOT NULL
, recordedby text
);
Or even:
sy integer NOT NULL -- 2019 can stand for '2019-2020'
Function parameters would match the column type.
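As a rough sketch of what that could look like, here is the function rewritten against the revised table definition above (varchar, integer, timestamptz, text). The parameter names are kept from the question; writing it as a plain SQL function is a choice made here, not something the question requires:

```sql
CREATE OR REPLACE FUNCTION addelapsedsemester(p_sy varchar, p_sem integer,
                                              p_date_recorded timestamptz,
                                              p_recordedby text)
  RETURNS void
  LANGUAGE sql AS
$$
-- Parameter types match the column types, so no cast is involved.
INSERT INTO log_elapsedsemester (sy, sem, date_recorded, recordedby)
VALUES (p_sy, p_sem, p_date_recorded, p_recordedby);
$$;

-- now() returns timestamptz, so the original style of call resolves.
SELECT addelapsedsemester('2019-2020', 1, now(), 'sample#gmail.com');
```

Because the argument types differ, this creates a new overload rather than replacing the earlier function; the old one would have to be dropped separately if it is no longer wanted.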

Error when creating a generated column in Postgresql

CREATE TABLE Person (
id serial primary key,
accNum text UNIQUE GENERATED ALWAYS AS (
concat(right(cast(extract(year from current_date) as text), 2), cast(id as text))) STORED
);
Error: generation expression is not immutable
The goal is to populate the accNum field with YYid, where YY is the last two digits of the year when the person was added.
I also tried the '||' operator but it was unsuccessful.
As you don't expect the column to be updated when the row is changed, you can define your own function that generates the number:
create function generate_acc_num(id int)
returns text
as
$$
select to_char(current_date, 'YY')||id::text;
$$
language sql
immutable; --<< this is lying to Postgres!
Note that you should never use this function for any other purpose. Especially not as an index expression.
Then you can use that in a generated column:
CREATE TABLE Person
(
id integer generated always as identity primary key,
acc_num text UNIQUE GENERATED ALWAYS AS (generate_acc_num(id)) STORED
);
As @ScottNeville correctly mentioned:
CURRENT_DATE is not immutable, so it cannot be used in a GENERATED ALWAYS AS expression.
However, you can achieve this using a trigger nevertheless:
CREATE FUNCTION accnum_trigger_function()
RETURNS TRIGGER
LANGUAGE PLPGSQL
AS $$
BEGIN
NEW.accNum := right(extract(year from current_date)::text, 2) || NEW.id::text;
RETURN NEW;
END
$$;
CREATE TRIGGER tr_accnum
BEFORE INSERT
ON person
FOR EACH ROW
EXECUTE PROCEDURE accnum_trigger_function();
As @a_horse_with_no_name mentioned correctly in the comments, you can simplify the expression to:
NEW.accNum := to_char(current_date, 'YY') || NEW.id;
I am not exactly sure how to solve this problem (maybe a trigger), but current_date is a stable function, not an immutable one. For generated columns, I believe all function calls must be immutable. You can read more here: https://www.postgresql.org/docs/current/xfunc-volatility.html
I don't think any function that gets the date can be immutable, as Postgres defines this as "An IMMUTABLE function cannot modify the database and is guaranteed to return the same results given the same arguments forever." This will not be true for anything that returns the current date.
I think your best bet would be to do this with a trigger so on insert it sets the value.

Does pg_typeof ever return alternative type names?

For context:
I have a sql function that takes an array of strings. Those strings are sql expressions that are stored in a table and used some time later for dynamically creating some queries.
I want to restrict the data types of those expressions to some limited set. For that I intend to evaluate the expressions and check the data type with pg_typeof like so:
create function fun(expressions text[]) returns void as $$
declare
expression_type text;
begin
for i in 1..array_length(expressions, 1) loop
execute format('select pg_typeof((select %s from some_table where false))', expressions[i]) into expression_type ;
-- check that expression_type has legal value, raise exception otherwise
end loop;
-- store expressions for later use
end;
$$ language plpgsql;
For example suppose that integer and timestamp without time zone are the allowed types.
I'd like to list the allowed types in an enum:
create type supported_types as enum ('integer', 'timestamp without time zone');
For some types the PostgreSQL documentation also mentions alternative names, e.g. int4 instead of integer, timestamp instead of timestamp without time zone, etc.
My question is: do I have to worry about these "alternative names" for types when I enumerate the ones I care about?
I.e., if I include integer, do I also have to include int4? In other words, does pg_typeof ever return int4 instead of integer (or timestamp instead of timestamp without time zone, etc.)?
It does not.
Internally (in the C code) the function returns an OID, but because it is declared in PostgreSQL as returning regtype, the OID is presented as a regtype. Since the OID is the same regardless of which name or alias was used for the type, it will always give the same result.
postgres=# SELECT 'int4'::regtype::oid::regtype;
regtype
---------
integer
(1 row)
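Because pg_typeof returns regtype, the check can also be written against regtype values rather than text, which sidesteps the naming question. A small sketch, not taken from the question's function:

```sql
-- Both spellings resolve to the same type OID, so these compare equal.
SELECT 'int4'::regtype = 'integer'::regtype AS same_integer,
       'timestamp'::regtype = 'timestamp without time zone'::regtype AS same_timestamp;

-- Comparing the result of pg_typeof against an allowed type.
SELECT pg_typeof(1::int4) = 'integer'::regtype AS is_integer;
```

All three columns should come back true.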

Find what row holds a value which cannot be cast to integer

I am doing some heavy rearranging of data tables, which has gone well so far.
In one table with more than 50000 rows I have a text column where the text should be numbers only.
Now I would like to convert it to integer column.
So:
ALTER TABLE mytable ALTER COLUMN mycolumn TYPE integer;
That produces an error 42804: *datatype_mismatch*
Reading the docs, I found a solution:
ALTER TABLE mytable ALTER COLUMN mycolumn TYPE integer USING (TRIM(mycolumn)::integer);
But I am aware that the data may not be correct in terms of numeric order, since this "masks" an error and it is possible that the column was edited by hand. Then again, maybe only a trailing space was added or some other minor edit was made.
I have a backup of the data.
How can I find exactly which cells of the given column contain an error, and which values cannot be cast to int, with a handy query suitable for use from pgadmin?
Please keep the query simple if possible.
Expanding on @dystroy's answer, this function should cough up the precise value of any offending rows:
CREATE OR REPLACE FUNCTION convert_to_integer(v_input text)
RETURNS INTEGER AS $$
BEGIN
BEGIN
RETURN v_input::INTEGER;
EXCEPTION WHEN OTHERS THEN
RAISE EXCEPTION 'Invalid integer value: "%".', v_input;
END;
END;
$$ LANGUAGE plpgsql;
Original answer:
If the following works:
ALTER TABLE mytable
ALTER COLUMN mycolumn TYPE integer USING (TRIM(mycolumn)::integer);
Then you should probably be able to run the following to locate the trash:
select mycolumn from mytable
where mycolumn::text <> (TRIM(mycolumn)::integer)::text;
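Note that the cast in the query above raises an error as soon as it meets a value that is not a valid integer, so it can only report rows that differ by whitespace. A hedged alternative is to filter with a regular expression instead of a cast; this sketch assumes that a valid value is an optional sign followed by decimal digits only:

```sql
-- Rows whose trimmed text is not an optionally signed run of digits.
-- Values outside the integer range are not caught by this pattern.
select mycolumn
from mytable
where trim(mycolumn) !~ '^[+-]?[0-9]+$';
```

The rows returned are the candidates to inspect or fix before running the ALTER TABLE.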