Postgres function to return regexp_matches result - postgresql

I am trying to write some postgresql functions to help me parse a string value like "30Gi" or "25Ti". I tried this, but can't get the syntax right, and can't figure out what the return type should be for the first function.
CREATE FUNCTION get_matches(VARCHAR) RETURNS ARRAY(VARCHAR) AS
SELECT (regexp_matches ($1, '(\d+)([KGTM]i)'));
CREATE FUNCTION get_amount(ARRAY(VARCHAR)) RETURNS VARCHAR AS
SELECT get_matches($1)[1];
CREATE FUNCTION get_units(ARRAY(VARCHAR)) RETURNS VARCHAR AS
SELECT get_matches($1)[2];
Postgres version:
PostgreSQL 13.2 on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit
Testing my functions using pgAdmin4.

You can return both parts in a single query. The function below returns a table with two columns, size and units, and returns no rows for input that is not validly formatted. It first validates the parameter with a regular expression, then uses essentially the same pattern with regexp_replace to extract each component.
create or replace function size_units(p_size_units text)
  returns table ( size  integer
                , units text
                )
  language sql
as $$
    -- the where clause filters out input that does not match the expected format
    select regexp_replace(p_size_units, '^(\d+)([KGTM]i)$', '\1')::integer
         , regexp_replace(p_size_units, '^(\d+)([KGTM]i)$', '\2')
     where p_size_units ~ '^\d+[KGTM]i$';
$$;
You can use the results in a select statement if desired. (see demo).
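As a usage sketch, assuming the size_units function above has been created: the VALUES list and the aliases v, s and raw are made up for illustration. Input that does not match produces no row, so a left join lateral keeps it with NULLs. The last part sketches the array-returning helper the question asked about. It assumes PostgreSQL 10 or later, where regexp_match returns a single text[] (regexp_matches returns setof text[], which is why the return type in the question is awkward to declare).

```sql
-- Single value: one row with size = 30 and units = 'Gi'.
select * from size_units('30Gi');

-- Several values; 'bogus' does not match, so its size and units are NULL.
select v.raw, s.size, s.units
  from (values ('30Gi'), ('25Ti'), ('bogus')) as v(raw)
  left join lateral size_units(v.raw) as s on true;

-- Array-returning variant (assumes PostgreSQL 10+ for regexp_match).
create or replace function get_matches(text)
  returns text[]
  language sql
as $$
    select regexp_match($1, '(\d+)([KGTM]i)');
$$;

-- Subscripting a function result requires the extra parentheses.
select (get_matches('30Gi'))[1] as amount
     , (get_matches('30Gi'))[2] as units;
```

The table-returning form is convenient in a FROM clause, while the array form is closer to what the question attempted.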

Related

What is the postgresql equivalent of DUMP function in Oracle?

The dump function in Oracle displays the internal representation of data:
DUMP returns a VARCHAR2 value containing the data type code, length in bytes, and internal representation of expr
For example:
SELECT DUMP(cast(1 as number ))
2 FROM DUAL;
DUMP(CAST(1ASNUMBER))
--------------------------------------------------------------------------------
Typ=2 Len=2: 193,2
SQL> SELECT DUMP(cast(1.000001 as number ))
2 FROM DUAL;
DUMP(CAST(1.000001ASNUMBER))
--------------------------------------------------------------------------------
Typ=2 Len=5: 193,2,1,1,2
This shows that the first value, 1, takes 2 bytes of storage, while the second takes 5 bytes.
I suppose the closest function in PostgreSQL is pg_typeof, but it returns only the type name, without any information about byte usage:
SELECT pg_typeof(33);
 pg_typeof
-----------
 integer
(1 row)
Does anybody know if there is an equivalent function in PostgreSQL?
I don't speak PostgreSQL.
However, the Oracle functionality page says that there's Orafce, which
implements in Postgres some of the functions from the Oracle database that are missing (or behaving differently)
It also mentions the dump function:
dump (anyexpr [, int]): Returns a text value that includes the datatype code, the length in bytes, and the internal representation of the expression
One of examples looks like this:
postgres=# select pg_catalog.dump('Pavel Stehule',10);
dump
-------------------------------------------------------------------------
Typ=25 Len=17: 68,0,0,0,80,97,118,101,108,32,83,116,101,104,117,108,101
(1 row)
To me, it looks like Oracle's dump:
SQL> select dump('Pavel Stehule') result from dual;
RESULT
--------------------------------------------------------------
Typ=96 Len=13: 80,97,118,101,108,32,83,116,101,104,117,108,101
SQL>
I presume you'll have to visit GitHub and install the package to see whether you can use it or not.
It is not a complete equivalent, but if you want to figure out the byte values used to encode a string in PostgreSQL, you can simply cast the value to bytea, which will give you the bytes in hexadecimal:
SELECT CAST ('schön' AS bytea);
This will work for strings, but not for numbers.
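For byte usage specifically, a rough sketch using built-in functions may help. pg_column_size reports the number of bytes used to store a value and works for numbers as well as strings. The send functions (int4send, numeric_send) return the binary wire representation, which is not necessarily identical to the on-disk format, so treat their output as an approximation of what Oracle's DUMP shows. The column aliases are made up for illustration.

```sql
-- Storage size in bytes; an integer takes 4 bytes.
select pg_column_size(33)                 as int_bytes
     , pg_column_size(1::numeric)         as numeric_small_bytes
     , pg_column_size(1.000001::numeric)  as numeric_larger_bytes;

-- Bytes of a string in an explicit encoding
-- (assumes the client sends the literal correctly encoded).
select convert_to('schön', 'UTF8') as utf8_bytes;

-- Binary wire representation of numeric values.
select int4send(33)            as int_wire
     , numeric_send(1.000001)  as numeric_wire;
```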

Create view using postgres function

I'm trying to build a parametrized view using a postgres function:
CREATE FUNCTION schemaB.testFunc(p INT)
RETURNS TABLE
AS
RETURN (SELECT * FROM schemaZ.mainTable WHERE id=p)
The problem is always the same:
SQL Error [42601]: ERROR: syntax error at or near "AS"
Any idea on what could I be doing wrong?
You need to specify the columns of the "return table". One way is to list them explicitly:
returns table(col_1 integer, col_2 text, ...)
In your case you are returning only rows of one table, so it's easier to use
returns setof maintable
As documented in the manual, the function body needs to be enclosed in single quotes or written using dollar quoting.
As stored functions can be written in many different languages in Postgres, you also need to specify a language - in this case language sql is suitable.
So putting all that together, you need:
CREATE FUNCTION schemaB.testFunc(p_id INT)
RETURNS setof schemaZ.mainTable
AS
$$
SELECT *
FROM schemaZ.mainTable
WHERE id = p_id
$$
language sql;
A return statement is not required for language sql functions.
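A set-returning function like this is queried in the FROM clause, which is what makes it behave like a parametrized view. The sketch below is self-contained for illustration: the table definition (columns id and name) and the sample rows are assumptions standing in for the real schemaZ.mainTable.

```sql
-- Hypothetical stand-in for the real table.
create schema if not exists schemaZ;
create schema if not exists schemaB;
create table if not exists schemaZ.mainTable (id int primary key, name text);
insert into schemaZ.mainTable (id, name)
values (1, 'first'), (2, 'second')
on conflict do nothing;

create or replace function schemaB.testFunc(p_id int)
  returns setof schemaZ.mainTable
as
$$
  select *
  from schemaZ.mainTable
  where id = p_id
$$
language sql;

-- Use the function like a table, passing the parameter.
select * from schemaB.testFunc(1);

-- It can be filtered and joined like any other row source.
select t.id, t.name
  from schemaB.testFunc(2) as t
 where t.name is not null;
```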

Pentaho Data Integration Input / Output Bit Type Error

I am using Pentaho Data Integration for numerous projects at work. We predominantly use Postgres for our databases. One of our older tables has two columns that are set to type bit(1) to store 0 for false and 1 for true.
My task is to synchronize a production table with a copy in our development environment. I am reading the data in using Table Input and immediately trying to do an Insert/Update. However, it fails because of the conversion to Boolean by PDI. I updated the query to cast the values to integers to retain the 0 and 1 but when I run it again, my transformation fails because an integer cannot be a bit value.
I have looked for several days trying different things like using the javascript step to convert to a bit but I have not been able to successfully read in a bit type and use the Insert/Update step to store the data. I also do not believe the Insert/Update step has the capabilities of updating the SQL that is being used to define the data type for the column.
The database connection is set up using:
Connection Type: PostgreSQL
Access: Native (JDBC)
Supports the boolean data type: true
Quote all in database: true
Note: Altering the table to change the datatype is not an option at this point in time. Too many applications currently depend on this table, so altering it in this way could cause undesirable effects.
Any help would be appreciated. Thank you.
You can create a cast object (for example, from character varying to bit) in your destination database with the AS ASSIGNMENT option, which allows the cast to be applied automatically during inserts.
http://www.postgresql.org/docs/9.3/static/sql-createcast.html
Here is some proof-of-concept for you:
CREATE FUNCTION cast_char_to_bit (arg CHARACTER VARYING)
RETURNS BIT(1) AS
$$
SELECT
CASE WHEN arg = '1' THEN B'1'
WHEN arg = '0' THEN B'0'
ELSE NULL
END
$$
LANGUAGE SQL;
CREATE CAST (CHARACTER VARYING AS BIT(1))
WITH FUNCTION cast_char_to_bit(CHARACTER VARYING)
AS ASSIGNMENT;
Now you should be able to insert/update single-character strings into a bit(1) column. However, you will need to cast your input column to character varying/text, so that it is converted to String in the Table Input step and to CHARACTER VARYING in the Insert/Update step.
You could probably create the cast object using existing cast functions that are already defined in Postgres (see the pg_cast, pg_type and pg_proc tables, joined by oid), but unfortunately I haven't managed to do this.
Edit 1:
Sorry for the previous solution. Adding a cast from boolean to bit looks much more reasonable: you will not even need to cast data in your table input step.
CREATE FUNCTION cast_bool_to_bit (arg boolean)
RETURNS BIT(1) AS
$$
SELECT
CASE WHEN arg THEN B'1'
WHEN NOT arg THEN B'0'
ELSE NULL
END
$$
LANGUAGE SQL;
CREATE CAST (BOOLEAN AS BIT(1))
WITH FUNCTION cast_bool_to_bit(boolean)
AS ASSIGNMENT;
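A quick way to check the cast, as a sketch that assumes cast_bool_to_bit and the BOOLEAN to BIT cast above have already been created. The table flag_demo is a made-up temporary table, not the production table from the question.

```sql
-- Hypothetical scratch table with a bit(1) column.
create temporary table flag_demo (id integer, active bit(1));

-- Boolean values should be converted by the assignment cast on insert.
insert into flag_demo (id, active) values (1, true), (2, false);

select id, active from flag_demo order by id;

-- Inspect the catalog entry; castcontext 'a' means the cast applies on assignment.
select castsource::regtype, casttarget::regtype, castcontext
  from pg_cast
 where castsource = 'boolean'::regtype
   and casttarget = 'bit'::regtype;
```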
I solved this by writing out the Postgres insert SQL (with B'1' and B'0' for the bit values) in a previous step and using "Execute row SQL Script" at the end to run each insert as individual SQL statements.

PostgreSQL execute statement conditionally by server version

I'm currently writing an installer script that runs SQL files against different database types depending on the system's configuration (the web application supports multiple database servers, such as MySQL, MSSQL and PostgreSQL).
One of those types is PostgreSQL. I'm not fluent in it, and I would like to know whether it's possible to put a statement into a define/populate SQL file that makes an SQL query conditional on a specific PostgreSQL server version.
How can I make an SQL statement conditional in plain PGSQL so that it is only executed on version 9? The command is:
ALTER DATABASE dbname SET bytea_output='escape';
The version check is to compare the version with 9.
Postgres does have a version() function, but there is no major_version(). Assuming that the output string always includes the version number as number(s).number(s).number(s) (which holds for releases before 10; later releases report two-part numbers such as 13.2), you could write your own wrapper:
CREATE OR REPLACE FUNCTION major_version() RETURNS smallint
AS $BODY$
SELECT substring(version() from $$(\d+)\.\d+\.\d+$$)::smallint;
$BODY$ LANGUAGE SQL;
Example:
=> Select major_version();
major_version
---------------
9
(1 row)
However, the real issue here is that, AFAIK, you can't execute commands conditionally in "pure" SQL, and the best you can do is write a stored function like this:
CREATE OR REPLACE FUNCTION conditionalInvoke() RETURNS void
AS $BODY$
BEGIN
IF major_version() = 9 THEN
ALTER DATABASE postgres SET bytea_output='escape';
END IF;
RETURN;
END;
$BODY$ LANGUAGE plpgsql;
I think that you should rather use some scripting language and generate appropriate SQL with it.
Or you could just use
select setting from pg_settings where name = 'server_version'
Or
select setting from pg_settings where name = 'server_version_num'
If you need the major version only (this variant assumes a single-digit major version):
select Substr(setting, 1, 1) from pg_settings where name = 'server_version_num'
or
select Substr(setting, 1, strpos(setting, '.')-1) from pg_settings where name = 'server_version'
if you want it to be compatible with two digit versions.
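The pieces above can be combined into a sketch that avoids string parsing. server_version_num is an integer such as 90603 (9.6.3) or 130002 (13.2), so integer division by 10000 gives the major version under both numbering schemes. The conditional part uses an anonymous DO block, which assumes PostgreSQL 9.0 or later (and 9.1 or later for format), so on older servers the script would fail at that statement. Using current_database() as the target is an assumption, and the role running it needs permission to alter the database.

```sql
-- Major version as an integer, e.g. 9 or 13.
select current_setting('server_version_num')::integer / 10000 as major_version;

-- Run the ALTER DATABASE only on major version 9.
do $do$
begin
    if current_setting('server_version_num')::integer / 10000 = 9 then
        execute format('alter database %I set bytea_output = %L',
                       current_database(), 'escape');
    end if;
end
$do$;
```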
Maybe you could make things dependent on the output of
select version();
(probably you'll have to trim and substring that a bit)
BTW (some) DDL statements may not be issued from within functions; maybe you'll have to escape to shell-programming and here-documents.

How to find out the length of a varchar parameter in a postgres function

I'm trying to find out the parameter length for a varchar parameter passed into a postgres function.
The query I have at the moment returns no values in the character_maximum_length column, which is where I would have expected to find this value:
SELECT *
FROM information_schema.parameters
WHERE specific_schema='public'
AND specific_name like 'foo'
ORDER BY ordinal_position
I don't think postgresql keeps this information. If I create function foo(varchar(100)) returns boolean ... and then dump the schema with pg_dump, I find:
CREATE FUNCTION foo(character varying) RETURNS boolean
LANGUAGE sql
AS $$select true$$;
The '100' specification is gone. And passing a 150-character string to foo(varchar) is not trapped or anything. By contrast, if I create a domain based on varchar(100) and define the function in terms of that, then passing an overlong string is trapped.
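The behaviour described above can be reproduced with a short sketch. The names varchar100 and foo_checked are made up for illustration, and the final statement is expected to raise an error rather than return a row.

```sql
-- The declared length is accepted but not kept in the signature.
create function foo(varchar(100)) returns boolean
  language sql
as $$ select true $$;

-- Shows the stored argument type without the length.
select pg_get_function_arguments('foo(varchar)'::regprocedure);

-- A 150-character string is accepted.
select foo(repeat('x', 150));

-- A domain carries the length constraint with it.
create domain varchar100 as varchar(100);

create function foo_checked(varchar100) returns boolean
  language sql
as $$ select true $$;

-- Expected to fail, because the value is too long for the domain.
select foo_checked(repeat('x', 150));
```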