Postgres 10 - Targeting a schema's function on a multi-schema database

I'm running PostgreSQL 10, and I have several schemas in my DB, each with multiple functions. I've created a schema-less script containing all the functions (I've removed the schema prefix), so every time I create a new schema, I run the migration and create all the functions as well.
This was necessary/requested for better data separation between customers. All the schemas are identical in structure.
All was fine until I noticed that SchemaA was calling a function from public. Even if I call:
SchemaA.myFunction(p_param1 := 'A', p_param2 := 'B')
if this myFunction calls another function internally, the inner call targets the public schema by default.
The only way I got it to work was to add an input parameter p_user_schema, i.e. myFunction(p_param1, p_param2, p_user_schema), and put the following statement as the first line of myFunction's body:
EXECUTE FORMAT('SET search_path TO %L', p_user_schema);
I have 147 functions and would need to adapt each of them. Does anyone know a better way to target the caller's schema? By caller I mean the schema prefix used in the outer call.

You can set the search_path at the function level, with the current user's schema first:
CREATE OR REPLACE FUNCTION schemaA.myfunction()
RETURNS ..
AS $$
...
$$ LANGUAGE SQL
SET search_path = "$user", public;
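For illustration, a minimal self-contained sketch of that approach (the table name customers is hypothetical, and it assumes each customer connects as a role whose name matches its schema, so that "$user" resolves to that schema):
-- Hypothetical: every per-customer schema has its own customers table.
CREATE OR REPLACE FUNCTION schemaA.customer_count()
RETURNS bigint
AS $$
    SELECT count(*) FROM customers;  -- resolved via the function's search_path
$$ LANGUAGE SQL
SET search_path = "$user", public;
With this, the unqualified inner reference is looked up in the caller's own schema first instead of falling back to public.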

Use a postgres extension as a non-privileged user

The Problem
I have the following trigger function that uses the moddatetime extension:
/* BEFORE UPDATE trigger - Updates "entity_version.updated_on", setting it to the current UTC time. */
CREATE EXTENSION IF NOT EXISTS moddatetime; -- Needs superuser privileges!
DROP TRIGGER IF EXISTS trig_entity_version_before_update
ON entity_version;
CREATE TRIGGER trig_entity_version_before_update
BEFORE UPDATE
ON entity_version
FOR EACH ROW EXECUTE PROCEDURE moddatetime(updated_on); -- updated_on is TIMESTAMP type
The trigger works perfectly fine, but the issue is that the first line (CREATE EXTENSION) requires superuser privileges. Since these databases are going to be created by users (via a script), I don't want the user that creates these databases and triggers to have superuser access.
What I've tried
As a first step, running the script as a super user works fine, as you'd expect.
Naturally, the next step would be to separate the creation of the extension (as a superuser) from the trigger creation script. But if I run the above script without the CREATE EXTENSION line, I get the following error:
function moddatetime() does not exist
Which I suppose makes sense given that the script never declares moddatetime, but I cannot find anywhere in the documentation how to make the extension available without using CREATE EXTENSION. Surely there must be a way to import or use an extension without having to create it? Something akin to:
USING EXTENSION moddatetime; -- Does something like this exist?
... rest of the trigger script ...
You need the extension installed if you want to use it.
PostgreSQL v13 introduces the notion of a trusted extension, which can be installed by non-superusers with the CREATE privilege on the database, but this extension is not among them. So upgrading won't fix the problem.
You could connect to template1 as superuser and create the extension there. Then it would automatically be present in all newly created databases.
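For example (run once as superuser; \c is the psql meta-command to switch databases):
-- Install the extension in the template database; databases created
-- afterwards (CREATE DATABASE copies template1) inherit it.
\c template1
CREATE EXTENSION IF NOT EXISTS moddatetime;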
But frankly, I don't see the point. You can do the same thing with a regular PL/pgSQL trigger – only the performance wouldn't be as good.
As per @laurenz-albe's suggestion, I've stepped away from using the extension for this particular use case. Fortunately, writing a function that does what the extension does is relatively trivial. I'll include it here for reference in case someone is looking specifically to replace moddatetime.
DROP FUNCTION IF EXISTS fn_entity_version_before_update() CASCADE;
CREATE FUNCTION fn_entity_version_before_update() RETURNS TRIGGER LANGUAGE PLPGSQL AS
$fn_body$
BEGIN
NEW.updated_on = NOW() AT TIME ZONE 'utc';
RETURN NEW;
END;
$fn_body$;
DROP TRIGGER IF EXISTS trig_entity_version_before_update
ON entity_version;
CREATE TRIGGER trig_entity_version_before_update
BEFORE UPDATE
ON entity_version
FOR EACH ROW EXECUTE PROCEDURE fn_entity_version_before_update();

How to make the extension relocatable?

I don't know how to make the extension relocatable. Question: how can I move the extension to another schema? The beginning of the script is shown below; after that come the definitions of the operators and functions for variance and mathematical expectation.
create schema IF NOT EXISTS complex;
--Mathematical expectation
--creating a complex data type
create type complex.complex as (re float, im float);
--creation of complex data type with the sum of complex numbers and the amount of numbers--
create type complex.complexSumAndCount as (complexSumVar complex.complex, complexCount integer);
--Creating the function of adding complex numbers
CREATE or REPLACE function complex.complexSumFunction (sum complex.complex, complexTempVar complex.complex)
RETURNS complex.complex as
$$
BEGIN
IF sum is not null and complexTempVar is not null then
sum.re := coalesce(sum.re, 0) + coalesce(complexTempVar.re, 0);
sum.im := coalesce(sum.im, 0) + coalesce(complexTempVar.im, 0);
end IF;
RETURN sum;
end;
$$ LANGUAGE plpgsql;
OK, posting a link to a github repo is probably not the recommended way to improve your question. You also still haven't told me whether you are actually trying to write an extension or not.
I am going to guess that you are not since you don't seem to have a .control file.
If what you are trying to do instead is be able to write
select '(6,44)'::complex;
rather than
select '(6,44)'::complex.complex;
then what you need to know about is the search_path.
This controls what schemas will be searched for objects (types, tables, functions etc.) when you do not explicitly specify a schema.
So - you have two options. You can add your "complex" schema to your search_path when you want to run queries. Or you can put your "complex" type and functions in your current search_path.
You can set your search_path temporarily (for the current transaction or session) with:
SET search_path = public, complex;
You can change the default for your user or for your database with something like:
ALTER USER myuser SET search_path = public, complex;
ALTER DATABASE mydatabase SET search_path = public, complex;
The second option is to load your "complex" code into a schema that is already in your search_path. To do this, you would remove all references to a schema from your script1.sql file.
create type complex as (re float, im float);
create type complexSumAndCount as (complexSumVar complex, complexCount integer);
...
Then, when you load it, your types and functions will be created in the first schema that can be found in your search_path. It is good practice to set this to exactly what you want.
SET search_path = public;
\i script1.sql
or from the command-line
PGOPTIONS="--search_path=public" psql -U myuser -d mydb -f script1.sql
I hope that helps. If it isn't what you are after then I'm afraid you will need to post a new question and explain precisely what you are trying to do. If you want to have a longer conversation about your problems then the postgresql mailing lists are probably a good place to try.

Changing schema name within functions

I have schema1 in database1. I want to move all the functions of schema1 to schema2, which is in database2. I have restored a backup file of database1 into database2 and changed the schema name. The schema name in the function calls got changed automatically, but within the function definitions the schema name is not changed. For example:
CREATE OR REPLACE FUNCTION schema2.execute(..)
BEGIN
select schema1."VALIDATE_SESSION"(....)
end
How can I change "schema1" to "schema2" automatically?
I have tried to store the current schema name in a variable and prepend it to the table name, but calling current_schema() returns "public". How do I get the current schema created by the user? Because every time, I need to change the schema name while generating the script.
The essential detail that is missing in your dummy function is the single quotes (or dollar-quotes, all the same) around the function body. Meaning, function bodies are saved as strings. See:
What are '$$' used for in PL/pgSQL
To contrast, consider a reference to a table (or more verbosely: schema.table(column)) in a FK constraint. Object names are resolved to the internal OID of the table (and a column number) at creation time. "Early binding". When names (including schema names) are changed later, that has no effect on the FK at all. Feels like involved names are changed dynamically. But really, actual names just don't matter after the object has been created. So you can rename schemas all day without side effect for the FK.
Names in a function body are stored as strings and interpreted at call time. "Late binding". Those names are not changed dynamically.
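A tiny demo of that late binding (the table schema1.t(x int) is hypothetical and must exist when the function is created, since the body is validated then):
-- The body is stored as a plain string and parsed again at every call.
CREATE FUNCTION demo() RETURNS int
LANGUAGE sql AS $$ SELECT x FROM schema1.t LIMIT 1 $$;

ALTER SCHEMA schema1 RENAME TO schema2;
SELECT prosrc FROM pg_catalog.pg_proc WHERE proname = 'demo';
-- still contains "schema1.t"; calling demo() now fails until the body is edited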
Meaning, you'll have to actually edit all function bodies that include a hard-coded schema name. A possible alternative is to rely on the search_path instead and not use schema names in function bodies to begin with. There are various ways to do that. See:
How does the search_path influence identifier resolution and the "current schema"
But that's not always acceptable.
You could hack the dump. Or use string manipulation inside Postgres to update affected function bodies. Find affected functions with a meta-query like:
SELECT *
FROM pg_catalog.pg_proc
WHERE prosrc ~ '\mschema1\M'; -- not bullet-proof!
Either way, be wary of false matches if the schema name can be part of other strings or pop up as column name etc. And dynamic SQL can concatenate strings in arbitrary ways. If you have such evil trickery in your functions, you need to deal with it appropriately.
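With those caveats in mind, a rough sketch of the string-manipulation route: pg_get_functiondef() reconstructs a complete CREATE OR REPLACE FUNCTION statement, so you can generate rewritten definitions, inspect each one by hand, and only then execute them.
SELECT replace(pg_get_functiondef(oid), 'schema1.', 'schema2.')
FROM   pg_catalog.pg_proc
WHERE  prosrc ~ '\mschema1\M';  -- same caveat as above: not bullet-proof!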

How to run the same function on different schemas

I have multiple schemas / users with the same structure but different data. There are some stored functions that run on these data and, so far, they are stored in each schema. I'd like to store these functions together in a new schema, which would make updating the code easier, as it would be centralized.
I thought that, since the search_path is defined as "$user", public, it would refer to the user of the current session/connection, and hence queries from different schemas' users would ultimately resolve against the analogous search_path.
Let's say I have a table T1 for the users u1, u2, u3 and a function F1 which uses this table.
Originally, F1 would be copied into the schemas u1, u2, u3, and running select * from F1() would work for each user. However, updating the function becomes increasingly difficult as the number of users grows, so I want to have a new schema functions with only one F1 function inside.
Now, running select * from functions.F1() returns an error that T1 couldn't be found. But the users' search_paths still contain the same information. So why does the search_path change based on the function that is executing, and how can I prevent that from happening?
There was a mail thread about this on the postgres mailing list: http://postgresql.nabble.com/function-doesn-t-see-change-in-search-path-td4971325.html and the final workaround was my original situation. Maybe something has changed in the meantime?
Actually, my thinking was correct. However, when I created the new schema by exporting the old functions, pg_dump added SECURITY DEFINER to the definition of each function.
Changing this to SECURITY INVOKER gives the behavior I expected.
from the documentation:
SECURITY INVOKER indicates that the function is to be executed with the privileges of the user that calls it. That is the default. SECURITY DEFINER specifies that the function is to be executed with the privileges of the user that created it.
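A minimal sketch of the working setup (the count query is just illustrative; it assumes each user schema u1, u2, ... has its own t1):
-- One shared function; SECURITY INVOKER (the default) plus late binding of
-- names means the unqualified t1 resolves via the *caller's* search_path.
CREATE OR REPLACE FUNCTION functions.f1()
RETURNS bigint
LANGUAGE sql
SECURITY INVOKER
AS $$
    SELECT count(*) FROM t1;
$$;
-- Connected as u1 (search_path = "$user", public): reads u1.t1
-- Connected as u2: reads u2.t1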
Add a table parameter to F1. Then add uf1, uf2, and uf3 to u1, u2, and u3. These functions will just call F1 and pass in the correct table.
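A hedged sketch of that wrapper idea, using a regclass parameter and dynamic SQL (the count query is again just a stand-in for the real logic):
-- Central function parameterized by the table to operate on.
CREATE FUNCTION functions.f1(tbl regclass)
RETURNS bigint
LANGUAGE plpgsql AS $$
DECLARE
    result bigint;
BEGIN
    EXECUTE format('SELECT count(*) FROM %s', tbl) INTO result;
    RETURN result;
END;
$$;
-- Thin per-schema wrapper that passes in its own table.
CREATE FUNCTION u1.uf1()
RETURNS bigint
LANGUAGE sql AS $$ SELECT functions.f1('u1.t1') $$;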
Have a look at plproxy. This is what Skype used to run queries over multiple database shards via a proxy database with wrapper functions.
You could also write a wrapper function which finds all of the functions in every schema and calls them.
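A sketch of that last idea as an anonymous DO block (the u% schema-name pattern and the function name f1 are hypothetical):
-- Call the same function once per matching schema via dynamic SQL.
DO $$
DECLARE
    s text;
BEGIN
    FOR s IN SELECT nspname FROM pg_catalog.pg_namespace WHERE nspname LIKE 'u%'
    LOOP
        EXECUTE format('SELECT %I.f1()', s);  -- result is discarded
    END LOOP;
END;
$$;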

why use stored procedure instead of query directly to db?

My company is introducing a new policy because it wants to be certified under some international standards. The policy is that the DBA is not allowed to query the database directly, like:
select * from some_table, update some_table, etc.
We have to use stored procedures for those queries.
Regarding my last question in here : Postgres pl/pgsql ERROR: column "column_name" does not exist
I'm wondering, do we have to create a stored procedure per table, or per condition?
Is there any way to create stored procedures more efficiently?
Thanks for your answer before, and sorry for my bad English. :D
Some reasons to use stored procedures are:
They have presumably undergone some testing to ensure that they do not allow business rules to be broken, as well as some optimization for performance.
They ensure consistency in results. Every time you are asked to perform task X, you run the stored procedure associated with task X. If you write the query yourself, you may not write it the same way every time; maybe one day you forget something silly like forcing text to the same case before a comparison, and something gets missed.
They start off taking somewhat longer to write than just a query, but running that stored procedure takes less time than writing the query again. Run it enough times and it becomes more efficient to have written the stored procedure.
They reduce or eliminate the need to know the relationships of the underlying tables.
You can grant permissions to execute the stored procedures (with SECURITY DEFINER), but deny permissions on the underlying tables.
Programmers (if you separate DBAs and programmers) can be provided an API, and that's all they need to know. So long as you maintain the API while changing the database, you can make any changes necessary to the underlying relations without breaking their software; indeed, you don't even need to know what they have done with your API.
You will likely end up making one stored procedure per query you would otherwise execute.
I'm not sure why you consider this inefficient, or particularly time-consuming as compared to just writing the query. If all you are doing is putting the query inside of a stored procedure, the extra work should be minimal.
CREATE OR REPLACE FUNCTION aSchema.aProcedure (
IN var1 text,
IN var2 text,
OUT col1 text,
OUT col2 text
)
RETURNS setof record
LANGUAGE plpgsql
VOLATILE
CALLED ON NULL INPUT
SECURITY DEFINER
SET search_path = aSchema, pg_temp
AS $body$
BEGIN
RETURN QUERY /*the query you would have written anyway*/;
END;
$body$;
GRANT EXECUTE ON FUNCTION aSchema.aProcedure(text, text) TO public;
As you used in your previous question, the function can be even more dynamic by passing columns/tables as parameters and using EXECUTE (though this increases how much the person executing the function needs to know about how the function works, so I try to avoid it).
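For instance, a hedged sketch of such a dynamic variant (the column names col1/col2 follow the template above; %s with a regclass and %L for the value keep the dynamic SQL safely quoted):
CREATE OR REPLACE FUNCTION aSchema.aDynamicProcedure(tbl regclass, val text)
RETURNS TABLE (col1 text, col2 text)
LANGUAGE plpgsql
SECURITY DEFINER
SET search_path = aSchema, pg_temp
AS $body$
BEGIN
    RETURN QUERY EXECUTE format(
        'SELECT col1, col2 FROM %s WHERE col1 = %L', tbl, val);
END;
$body$;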
If the "less efficient" is coming from additional logic that is included in the function, then the comparison to just using queries isn't fair, as the function is doing additional work.