Find the usage of a Postgres procedure - postgresql

I want to change a Postgres procedure, but before doing this I want to make sure that it does not break anything.
To do this, I need to find its usage.

If you have track_functions set to pl or all, you can use the view pg_stat_user_functions to see how often a function was called.
But that's probably not what you want; I assume you want to know from which other functions a function is called. This is much more difficult, because PostgreSQL does not track dependencies between a function and the objects used in its code. The best you can do is perform a string search:
SELECT oid::regprocedure
FROM pg_catalog.pg_proc
WHERE prosrc ILIKE '%your_function_name%';
That can of course produce false positives, particularly if the function name you search for is a string commonly used in code.
There can potentially be false negatives as well, consider this piece of PL/pgSQL code:
EXECUTE 'SELECT your_func' || 'tion_name()';
But you may be able to exclude such cases if you know your code base.
A completely different matter is the question where in your application code the function is used. But since you tagged the question only as postgresql, I assume that this is out of scope here.

Related

Is there any significant difference between using SELECT ... FROM ... INTO syntax instead of the standard SELECT ... INTO ... FROM?

I was creating a function following an example from a database class which included the creation of a temporary variable (base_salary) and using a SELECT INTO to calculate its value later.
However, I did not realize I used a different order for the syntax (SELECT ... FROM ... INTO base_salary) and the function could be used later without any visible issues (values worked as expected).
Is there any difference in using "SELECT ... FROM ... INTO" syntax order? I tried looking about it in the PostgreSQL documentation but found nothing about it. Google search did not provide any meaningful information neither. Only thing I found related to it was from MySQL documentation, which only mentioned about supporting the different order in an older version.
There is no difference. From the docs of pl/pgsql:
The INTO clause can appear almost anywhere in the SQL command.
Customarily it is written either just before or just after the list of
select_expressions in a SELECT command, or at the end of the command for other command types. It is recommended that you follow
this convention in case the PL/pgSQL parser becomes stricter in future
versions.
Notice that in (non-procedural) SQL, there is also a SELECT INTO command which works like CREATE TABLE AS, in this version the INTO must come right after the SELECT clause.
I always use SELECT ... INTO ... FROM , I believe that is the standard supported notation
https://www.w3schools.com/sql/sql_select_into.asp
I would recommend using this, also if there are any updates or if the other version might become unsupported as you mentioned...

How to find out if a sequence was initialized in this session?

I need to read the current value of a sequence in a function. However, for the first time in each session I try to use currval(), I get following error:
currval of sequence "foo_seq" is not yet defined in this session
Hint for those who might find this question by googling for this error: you need to initialize the sequence for each session, either by nextval() or setval().
I could use something like lastval() or even setval('your_table_id_seq', (SELECT MAX(id) FROM your_table)); instead, but this seems seems either prone to gaps or slower than simple currval(). My aim is to avoid gaps and inconsistencies (I know some of the values will be added manually), so using nextval() before logic handling them is not ideal for my purpose. I would need this to initialize the sequence for the session anyway, but I would prefer to do something like this:
--start of the function here
IF is_not_initialized THEN
SELECT setval('foo_seq', (SELECT MAX(id) FROM bar_table)) INTO _current;
ELSE
SELECT currval('foo_seq') INTO _current;
END IF;
--some magic with the _current variable and nextvalue() on the right position
The point is that I have no idea how might "is_not_initialized" look like and whether is it possible at all. Is there any function or other trick to do it?
EDIT: Actually, my plan is to let each group of customers choose between proper sequence, no sequence at all, and the strange "something like a sequence" I'm asking for now. Even if the customer wanted such a strange sequence, it would be used only for the columns where it is needed - usually because there are some analog data and we need to store their keys (usually almost gapless sequence) into the DB for backward compatibility.
Anyway, you are right that this is hardly proper solution and that no sequence might be better than such a messy workaround in those situations, so I'll think (and discuss with customers) again whether it is really needed.
Craig, a_horse and pozs have provided information which can help you understand principles of using sequences. Apart from the question how are you going to use it, here is a function which returns current value of a sequence if it has been initialized or null otherwise.
If a sequence seq has not been initialized yet, currval(seq) raises exception with sqlstate 55000.
create or replace function current_seq_value(seq regclass)
returns integer language plpgsql
as $$
begin
begin
return (select currval(seq));
exception
when sqlstate '55000' then return null;
end;
end $$;
select current_seq_value('my_table_id_seq')
My aim is to avoid gaps and inconsistencies
You cannot use sequences if you want to avoid gaps. Nor can you reasonably use sequences if you want to assign some values manually.
The approach you are taking is unsound. It will not work. Forget about it, it isn't going to do what you think it's going to do.
I just wrote a sample implementation of a trivial gapless sequence generator for someone a few days ago, and there's a more complete one in this question.
You need to understand that unlike true sequences, gapless sequences are transactional. A consequence is that only one running transaction can have an uncommitted ID. If 100 concurrent transactions try to get IDs, only one of them will actually get the ID. The others will have to wait until that one commits or rolls back. So they're terrible for concurrency, especially if combined with long running transactions. They can also cause deadlocks if you use multiple different gapless sequences and different transactions might access them in different orders.
So think carefully whether you really need this.
Read: PostgreSQL gapless sequences

Changed property value when selecting in EF4

I need to change the value of a property when I query the database using EF4. I have a company code that gets returned and I need to translate it to another company code, if needed. So, there is a stored procedure that is used to do this currently. Here's the old select statement.
SELECT companyName, TranslateCompanyCode(companyCode) as newCompanyCode FROM companyTable where companyCode = 'AA';
TranslateCompanyCode is the stored proc that does the translation. I'd like to do this in my new code when needed. I think I might need to use a Model-Defined Function. Anyone know how I can do this?
For your scenario, I would use a JOIN. Model-defined functions are cool when you need to perform a quick function on a value (particularly without an additional query). From a performance standpoint, a JOIN will be faster and more efficient than trying to put the sub-query in a model-defined function - particularly if you are selecting more than 1 row at a time.
However, if you do still want to use Model defined functions, then this example should point you in the right direction as to how to run a query within the function. This implementation will also be more complex than just using a join but is an alternative.

Postgresql HAVING clause limitation

Why can't one use an output column in the having clause in postgresql? It doesn't change expressivity of the language anyhow, just forces people to rewrite output column definition in having clause. Is a way to avoid that, apart from putting the whole query as a subquery in SELECT * FROM (...) AS t WHERE condition ?
Bacause it's not implemented? And if you're asking why it wasn't implemented, I see 2 possible explanations:
standard doesn't require it
nobody had time to spent on it
if you'd like to have it - mail to -hackers, talk about, and then implement.
Frankly I don't see it as a big problem - it's not like you have 1000 characters to retype.

Execute statements for every record in a table

I have a temporary table (or, say, a function which returns a table of values).
I want to execute some statements for each record in the table.
Can this be done without using cursors?
I'm not opposed to cursors, but would like a more elegant syntax\way of doing it.
Something like this randomly made-up syntax:
for (select A,B from #temp) exec DoSomething A,B
I'm using Sql Server 2005.
I dont think what you want to to is that easy.
What i have found is that you can create a scalar function taking the arguments A and B and then from within the function execute an Extended Stored Procedure. This might achieve what you want to do, but it seems that this might make the code even more complex.
I think for readibility and maintainability, you should stick to the CURSOR implementation.
I would look into changing the stored proc so that it can work against a set of data rather than a single row input.
Would CROSS/OUTER APPLY do what you want if you need RBAR processing.
It's elegant, but depends on what processing you need to do