How to run the same function on different schemas - postgresql

I have multiple schemas (one per user) with the same structure but different data. Several stored functions operate on this data, and so far a copy of each function is stored in every schema. I'd like to keep these functions together in one new schema, so that the code is centralized and easier to update.
I assumed that, because the search_path is set to "$user",public, it would refer to the user of the current session/connection, so that even queries from different schemas would ultimately use the same search_path setting.
Let's say I have a table T1 for each of the users u1, u2, u3, and a function F1 that uses this table.
Originally, F1 was copied into each of the schemas u1, u2, u3, and running select * from F1() worked for each user. However, updating the function becomes increasingly difficult as the number of users grows, so I want a new schema, functions, containing a single F1.
Now, running select * from functions.F1() returns an error saying that T1 couldn't be found, even though the users' search_paths still contain the same information. So why does the search_path change depending on which function is executing, and how can I prevent that?
This was discussed on the Postgres mailing list: http://postgresql.nabble.com/function-doesn-t-see-change-in-search-path-td4971325.html and the final workaround there was my original setup. Maybe something has changed in the meantime?

Actually, my thinking was correct. However, when I created the new schema by exporting the old functions, pg_dump added SECURITY DEFINER to the definition of each function.
Changing this to SECURITY INVOKER gives the behavior I expected: a SECURITY DEFINER function runs as its owner, so "$user" in the search_path resolves to the owner rather than to the calling user.
From the documentation:
SECURITY INVOKER indicates that the function is to be executed with the privileges of the user that calls it. That is the default. SECURITY DEFINER specifies that the function is to be executed with the privileges of the user that created it.
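A minimal sketch of how the switch could be checked and applied, assuming the shared schema is called functions and F1 takes no arguments (both taken from the question; adjust the signature to the real one):

```sql
-- List functions in the shared schema that are still SECURITY DEFINER.
SELECT p.oid::regprocedure AS function_signature
FROM pg_catalog.pg_proc p
WHERE p.pronamespace = 'functions'::regnamespace
  AND p.prosecdef;

-- Switch one function back to the default, so that it runs with the
-- caller's privileges and "$user" resolves to the calling user.
ALTER FUNCTION functions.f1() SECURITY INVOKER;
```

Each calling user still needs USAGE on the functions schema and EXECUTE on the function for select * from functions.F1() to work.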

Add a table parameter to F1. Then add uf1, uf2, and uf3 to u1, u2, and u3. These functions will just call F1 and pass in the correct table.
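A rough sketch of that idea, assuming F1 only needs to read from the table. The row count is a stand-in for the real function body, which the question does not show, and the regclass parameter type is my choice, not something the answer specifies:

```sql
-- Shared implementation: the table to work on is passed in explicitly.
CREATE OR REPLACE FUNCTION functions.f1(tbl regclass)
RETURNS bigint
LANGUAGE plpgsql
AS $$
DECLARE
    n bigint;
BEGIN
    -- A regclass value formatted with %s is already schema-qualified
    -- and quoted where necessary, so it is safe to splice in.
    EXECUTE format('SELECT count(*) FROM %s', tbl) INTO n;
    RETURN n;
END;
$$;

-- Thin per-schema wrapper: it only supplies the right table.
CREATE OR REPLACE FUNCTION u1.uf1()
RETURNS bigint
LANGUAGE sql
AS $$ SELECT functions.f1('u1.t1'::regclass) $$;
```

The wrappers for u2 and u3 would differ only in the schema name, so the logic that changes over time stays in one place.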

Have a look at plproxy. This is what Skype used to run queries over multiple database shards via a proxy database with wrapper functions.
You could also write a wrapper function which finds all of the functions in every schema and calls them.

Related

Access a PostgreSQL table without directly using its name

Problem
I want to prevent a database user from discovering other tables through the use of queries like select * from pg_tables; by adding sanitization in the application that executes the query.
I'm busy investigating how to limit access to a PostgreSQL database for queries coming through an application layer where I can still sanitize a query with code. The user does not connect straight to the database. They can execute queries through an application layer (e.g. query is passed through as text and then executed by my application).
I've already enabled Row Security Policies (row level security) and the user is accessing their data through a view, so "access to data" has been solved (I believe). The problem I'm trying to solve now is to prevent them from "seeing" other tables in the database, especially the built-in PG tables.
The only grant the user has is grant select on my_view to my_user_role;
Assumption / attempted solution
My assumption is that a user can't access a table without explicitly writing it into the query, so if I were to look for certain keywords in the query string, I can reject a query without executing it. E.g. if the clause/characters "pg_tables" are anywhere in the query, then I can reject it. But, this feels naive.
const query = "select * from pg_tables;";
// Naive check: reject the query if the text "pg_tables" appears anywhere in it
const flagged = query.includes("pg_tables");
if (flagged) throw new Error("Not allowed!");
// Continue to run the user's query
^ This would work reliably if the only way to access a table like pg_tables is to type that out explicitly.
Question
Is there a way to access a table like pg_tables without naming it explicitly, in PostgreSQL [given the context above]?
An example of a similar situation in javascript is if you have a function fooBar(), then you can access it indirectly by calling global["foo" + "Bar"]() – hence not using the text "fooBar" exactly.

Postgres 10 - Targeting schema's function on a multiple schema Database

I'm running PostgreSQL 10, and I have several schemas in my DB, each with multiple functions. I've created a schema-less script containing all the functions (I removed the schema prefix), so that every time I create a new schema I run the migration and create all the functions as well.
This was requested for better data separation between customers. All the schemas are identical in structure.
All was fine until I realized that SchemaA was calling a function from public, even when I call:
SchemaA.myFunction(p_param1:= 'A', p_param2:= 'B')
If myFunction calls another function internally, that call targets the public schema by default.
The only way I got it to work was to add an input parameter called p_user_schema, as in myFunction(p_param1, p_param2, p_user_schema), and put the following statement on the first line of the myFunction body:
EXECUTE FORMAT('SET search_path TO %L', p_user_schema);
I have 147 functions and would need to adapt each of them. Does anyone know a better way to target the caller's schema? By caller I mean the schema prefix used in the main call.
You can set the search path at the function level, with the current user's schema first:
CREATE OR REPLACE FUNCTION schemaA.myfunction()
RETURNS ..
AS $$
...
$$ LANGUAGE SQL
SET SEARCH_PATH = "$user", public;
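Note that "$user" resolves to the name of the current user, so the setting above only helps when each schema is named after the role that calls it. If that does not hold, one alternative is to pin each function to the schema it lives in. The query below is a sketch that generates the ALTER FUNCTION statements rather than editing 147 definitions by hand. The schema name schemaa is an assumption (an unquoted SchemaA folds to lower case):

```sql
-- Generate one ALTER FUNCTION statement per function in the schema.
-- Review the output, then run the generated statements.
SELECT format(
           'ALTER FUNCTION %s SET search_path = %I, public;',
           p.oid::regprocedure,   -- full signature, quoted as needed
           n.nspname
       ) AS ddl
FROM pg_catalog.pg_proc p
JOIN pg_catalog.pg_namespace n ON n.oid = p.pronamespace
WHERE n.nspname = 'schemaa';
-- If the schema also contains aggregates, filter them out first:
-- ALTER FUNCTION does not apply to them.
```

Repeating the query with a different schema name covers each customer schema, so it can be added to the migration that creates the functions.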

Changing schema name within functions

I have schema1 in database1. I want to move all the functions of schema1 to schema2, which is in database2. I restored a backup file of database1 into database2 and changed the schema name. The schema name in the function names was changed automatically, but inside the function definitions it was not. For example:
CREATE OR REPLACE FUNCTION schema2.execute(..)
BEGIN
select schema1."VALIDATE_SESSION"(....)
end
How can I change "schema1" to "schema2" automatically?
I have tried to store the current schema name in a variable and add it to the table name, but calling current_schema() returns "public". How do I get the schema created by the user? Otherwise I need to change the schema name every time I generate the script.
The essential detail missing from your dummy function is the single quotes (or dollar-quotes, all the same) around the function body. Meaning, function bodies are saved as strings. See:
What are '$$' used for in PL/pgSQL
To contrast, consider a reference to a table (or more verbosely: schema.table(column)) in a FK constraint. Object names are resolved to the internal OID of the table (and a column number) at creation time. "Early binding". When names (including schema names) are changed later, that has no effect on the FK at all. It feels as if the names involved were changed dynamically. But really, actual names just don't matter after the object has been created. So you can rename schemas all day without side effects for the FK.
Names in a function body are stored as strings and interpreted at call time. "Late binding". Those names are not changed dynamically.
Meaning, you'll have to actually edit all function bodies that include a hard-coded schema name. A possible alternative is to rely on the search_path instead and not use schema names in function bodies to begin with. There are various ways to set it. See:
How does the search_path influence identifier resolution and the "current schema"
But that's not always acceptable.
You could hack the dump. Or use string manipulation inside Postgres to update affected function bodies. Find affected functions with a meta-query like:
SELECT *
FROM pg_catalog.pg_proc
WHERE prosrc ~ '\mschema1\M'; -- not bullet-proof!
Either way, be wary of false matches if the schema name can be part of other strings or pop up as column name etc. And dynamic SQL can concatenate strings in arbitrary ways. If you have such evil trickery in your functions, you need to deal with it appropriately.
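As a sketch of the string-manipulation route, the same word-boundary pattern can be applied to the full definition returned by pg_get_functiondef(). The schema names are the ones from the question. The query only produces candidate CREATE OR REPLACE FUNCTION statements, and the caveat about false matches applies to every one of them:

```sql
-- Produce rewritten definitions for review; nothing is changed yet.
SELECT p.oid::regprocedure AS function_signature,
       regexp_replace(
           pg_get_functiondef(p.oid),
           '\mschema1\M',      -- whole-word match, as in the meta-query
           'schema2',
           'g'                 -- replace every occurrence
       ) AS new_definition
FROM pg_catalog.pg_proc p
WHERE p.prosrc ~ '\mschema1\M'; -- not bullet-proof!
```

Inspect each new_definition before executing it, since the replacement cannot tell a schema reference from the same word used as a column name or inside a string literal.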

How to revoke copy access to a user postgres?

I want role-based access for my database. I've created a new user and want this user to have only SELECT access on a table, but COPY access comes with it by default. I want to revoke this COPY access from the user. Any help would be much appreciated.
Thanks a Lot!!!
PostgreSQL does not directly support what you want - and it's pretty dubious theoretically, too.
Even if you modified PostgreSQL to add a COPY permission, the user would just use the XML table functions, the new JSON support, or a plain old SELECT ... FROM tablename; to extract the data.
If somebody has SELECT rights to a table, they can COPY ... TO stdout that table and they can SELECT ... INTO another table if they have CREATE rights anywhere. This makes sense; after all, they can always SELECT * FROM tablename; and process it client-side.
You can selectively GRANT rights on individual columns, but they can still do a SELECT allowedcol1, allowedcol2, ... FROM tablename and therefore a COPY (SELECT allowedcol1, allowedcol2, ... FROM tablename) TO stdout.
The only way to (mostly) stop a user from doing a COPY or SELECT INTO is to force all access through SECURITY DEFINER stored procedures that define very limited access to the table.
A clever user can still write a COPY (SELECT ...) TO stdout with a query that uses your stored procedure API to loop over keys and request each one. It'll be slower, but so long as you provide any way to list all the keys in the table it will work.
In the end, if you give them SELECT you give them the ability to copy one way or the other.
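For illustration, a minimal sketch of the SECURITY DEFINER approach described above. The schema app, the table app.customer with an integer id column, and the role report_user are all hypothetical names chosen for the example:

```sql
-- The role gets no direct rights on the table.
REVOKE ALL ON TABLE app.customer FROM report_user;

-- Single-row lookup that runs with the function owner's privileges.
CREATE OR REPLACE FUNCTION app.get_customer(p_id integer)
RETURNS SETOF app.customer
LANGUAGE sql
STABLE
SECURITY DEFINER
SET search_path = app, pg_temp   -- pin name resolution for safety
AS $$
    SELECT * FROM app.customer WHERE id = p_id;
$$;

-- Functions are executable by PUBLIC by default, so lock that down.
REVOKE ALL ON FUNCTION app.get_customer(integer) FROM PUBLIC;
GRANT USAGE ON SCHEMA app TO report_user;
GRANT EXECUTE ON FUNCTION app.get_customer(integer) TO report_user;
```

With this in place, COPY app.customer TO stdout fails for report_user, but COPY (SELECT * FROM app.get_customer(1)) TO stdout still works, which is exactly the limitation the answer points out.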

why use stored procedure instead of query directly to db?

My company is introducing a new policy because it wants to be certified against some international standards. Under that policy, DBAs are not allowed to query the database directly, like:
select * from some_table, update some_table, etc.
We have to use stored procedures for those queries.
Regarding my last question here: Postgres pl/pgsql ERROR: column "column_name" does not exist
I'm wondering, do we have to create a stored procedure per table, or per condition?
Is there a more efficient way to create stored procedures?
Thanks for your answer before..
and sorry for my bad english.. :D
Some reasons to use stored procedures are:
They have presumably undergone some testing to ensure that they do not allow business rules to be broken, as well as some optimization for performance.
They ensure consistency in results. Every time you are asked to perform task X, you run the stored procedure associated with task X. If you write the query, you may not write it the same way every time; maybe one day you forget something silly like forcing text to the same case before a comparison and something gets missed.
They start off taking somewhat longer to write than just a query, but running that stored procedure takes less time than writing the query again. Run it enough times and it becomes more efficient to have written the stored procedure.
They reduce or eliminate the need to know the relationships of underlying tables.
You can grant permissions to execute the stored procedures (with security definer), but deny permissions on the underlying tables.
Programmers (if you separate DBAs and programmers) can be provided an API, and that's all they need to know. So long as you maintain the API while changing the database, you can make any changes necessary to the underlying relations without breaking their software; indeed, you don't even need to know what they have done with your API.
You will likely end up making one stored procedure per query you would otherwise execute.
I'm not sure why you consider this inefficient, or particularly time-consuming as compared to just writing the query. If all you are doing is putting the query inside of a stored procedure, the extra work should be minimal.
CREATE OR REPLACE FUNCTION aSchema.aProcedure (
    IN var1 text,
    IN var2 text,
    OUT col1 text,
    OUT col2 text
)
RETURNS setof record
LANGUAGE plpgsql
VOLATILE
CALLED ON NULL INPUT
SECURITY DEFINER
SET search_path = aSchema, pg_temp
AS $body$
BEGIN
    RETURN QUERY /*the query you would have written anyway*/;
END;
$body$;
GRANT EXECUTE ON FUNCTION aSchema.aProcedure(text, text) TO public;
As you used in your previous question, the function can be even more dynamic by passing columns/tables as parameters and using EXECUTE (though this increases how much the person executing the function needs to know about how the function works, so I try to avoid it).
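A sketch of what such a dynamic variant might look like. The function name distinct_values and its parameters are hypothetical, and the identifiers are passed through format() so that they are quoted rather than concatenated raw:

```sql
CREATE OR REPLACE FUNCTION aSchema.distinct_values(
    IN p_table  regclass,   -- must name an existing table
    IN p_column text        -- column name, quoted via %I below
)
RETURNS SETOF text
LANGUAGE plpgsql
AS $body$
BEGIN
    -- %I quotes the column name as an identifier; a regclass value
    -- formatted with %s is already qualified and quoted as needed.
    RETURN QUERY EXECUTE format(
        'SELECT DISTINCT %I::text FROM %s',
        p_column, p_table
    );
END;
$body$;
```

The caller now has to know valid table and column names, which is the extra knowledge the paragraph above warns about.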
If the "less efficient" is coming from additional logic that is included in the function, then the comparison to just using queries isn't fair, as the function is doing additional work.