The forgotten assignment operator "=" and the commonplace ":=" - postgresql

The documentation for PL/pgSQL says that declaration and assignment to variables is done with :=.
But a simple, shorter and more modern (see footnote) = seems to work as expected:
CREATE OR REPLACE FUNCTION foo() RETURNS int AS $$
DECLARE
i int;
BEGIN
i = 0;
WHILE NOT i = 25 LOOP
i = i + 1;
i = i * i;
END LOOP;
RETURN i;
END;
$$ LANGUAGE plpgsql;
> SELECT foo();
25
Please note that PL/pgSQL can clearly distinguish assignment from comparison, as shown in the line
WHILE NOT i = 25 LOOP
So, the questions are:
Did I fail to find a section in the docs that mentions and/or explains this?
Are there any known consequences using = instead of :=?
Edit / Footnote:
Please take the "more modern" part with a wink like in A Brief, Incomplete, and Mostly Wrong History of Programming Languages:
1970 - Niklaus Wirth creates Pascal, a procedural language. Critics
immediately denounce Pascal because it uses "x := x + y" syntax
instead of the more familiar C-like "x = x + y". This criticism
happens in spite of the fact that C has not yet been invented.
1972 - Dennis Ritchie invents a powerful gun that shoots both forward
and backward simultaneously. Not satisfied with the number of deaths
and permanent maimings from that invention he invents C and Unix.

In the PL/pgSQL parser, the assignment operator is defined as
assign_operator : '='
| COLON_EQUALS
;
This is a legacy feature, present in the source code since it was introduced in 1998, as we can see in the PostgreSQL Git repo.
Starting from version 9.4 it is officially documented.
This idiosyncrasy of having two operators for the same thing was raised on the pgsql users list, and some people requested that it be removed, but it is still kept in core because a fair corpus of legacy code relies on it.
See this message from Tom Lane (core Pg developer).
So, to answer your questions straight:
Did I fail to find a section in the docs that mentions and/or explains this?
You did not find it because it was undocumented, which is fixed as of version 9.4.
Are there any known consequences using = instead of :=?
There are no side consequences of using =, but you should use := for assignment to make your code more readable, and (as a side effect) more compatible with PL/SQL.
Update: there may be a side consequence in rare scenarios (see Erwin's answer)
UPDATE: answer updated thanks to input from Daniel, Sandy & others.

Q1
This has finally been added to the official documentation with Postgres 9.4:
An assignment of a value to a PL/pgSQL variable is written as:
variable { := | = } expression;
[...] Equal (=) can be used instead of PL/SQL-compliant :=.
Q2
Are there any known consequences using = instead of :=?
Yes, I had a case with severe consequences: Function call with named parameters - which is related but not exactly the same thing.
Strictly speaking, the distinction in this case is made in SQL code. But that's an academic differentiation to the unsuspecting programmer.1
Consider the function:
CREATE FUNCTION f_oracle(is_true boolean = TRUE) -- correct use of "="
RETURNS text
LANGUAGE sql AS
$func$
SELECT CASE $1
WHEN TRUE THEN 'That''s true.'
WHEN FALSE THEN 'That''s false.'
ELSE 'How should I know?'
END
$func$;
Note the correct use of = in the function definition. That's part of the CREATE FUNCTION syntax - in the style of an SQL assignment.2
Function call with named notation:
SELECT * FROM f_oracle(is_true := TRUE);
Postgres identifies := as parameter assignment and all is well. However:
SELECT * FROM f_oracle(is_true = TRUE);
Since = is the SQL equality operator, Postgres interprets is_true = TRUE as an SQL expression in the context of the calling statement and tries to evaluate it before passing the result as an unnamed positional parameter. It looks for an identifier is_true in the outer scope. If that can't be found:
ERROR: column "is_true" does not exist
That's the lucky case and, luckily, also the common one.
When is_true can be found in the outer scope (and data types are compatible), is_true = TRUE is a valid expression with a boolean result that is accepted by the function. No error occurs. Clearly, this is the intention of the programmer using the SQL equality operator = ...
This db<>fiddle demonstrates the effect.
Very hard to debug if you're unaware of the distinction between = and :=.
Always use the correct operator.
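A minimal sketch of the silent case, assuming a hypothetical temporary table t with a boolean column that happens to be named is_true (the table and its data are illustrative and not taken from the linked fiddle):

```sql
-- Same function as above, repeated so the sketch is self-contained.
CREATE OR REPLACE FUNCTION f_oracle(is_true boolean = TRUE)
  RETURNS text
  LANGUAGE sql AS
$func$
SELECT CASE $1
          WHEN TRUE  THEN 'That''s true.'
          WHEN FALSE THEN 'That''s false.'
          ELSE 'How should I know?'
       END
$func$;

-- Hypothetical table whose column name collides with the parameter name.
CREATE TEMP TABLE t (is_true boolean);
INSERT INTO t VALUES (FALSE);

-- Named notation: the constant TRUE is passed to the parameter.
SELECT f_oracle(is_true := TRUE) FROM t;  -- That's true.

-- Equality operator: t.is_true = TRUE is evaluated first (FALSE here),
-- and that result is passed as a positional argument. No error is raised.
SELECT f_oracle(is_true = TRUE) FROM t;   -- That's false.
```

Both calls look almost identical, yet the second one hands the function the result of a comparison against the column rather than the constant.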
1 When using named notation in function calls, only := is the correct assignment operator. This applies to functions of all languages, not just PL/pgSQL, up to and including pg 9.4. See below.
2 One can use = (or DEFAULT) to define default values for function parameters. That's not related to the problem at hand in any way. It's just remarkably close to the incorrect use case.
Postgres 9.0 - 9.4: Transition from := to =>
The SQL standard for assignment to named function parameters is => (and Oracle's PL/SQL uses it). Postgres could not do the same, since the operator had previously been unreserved, so it's using PL/pgSQL's assignment operator := instead. With the release of Postgres 9.0 the use of => for other purposes has been deprecated. The release notes:
Deprecate use of => as an operator name (Robert Haas)
Future versions of PostgreSQL will probably reject this operator name
entirely, in order to support the SQL-standard notation for named
function parameters. For the moment, it is still allowed, but a
warning is emitted when such an operator is defined.
If you should be using => for something else, cease and desist. It will break in the future.
Postgres 9.5: use => now
Starting with this release, the SQL standard operator => is used. := is still supported for backward compatibility. But use the standard operator in new code that doesn't need to run on very old versions.
Documented in the manual, chapter Using Named Notation.
Here's the commit with explanation in Git.
This applies to named parameter assignment in function calls (SQL scope), not to the assignment operator := in plpgsql code, which remains unchanged.
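As a small illustration of the two notations in a function call, here is a sketch that assumes Postgres 9.5 or later and uses a hypothetical function f_named_demo with a parameter named flag (both names are made up for this example):

```sql
-- Hypothetical one-parameter function, for illustration only.
CREATE OR REPLACE FUNCTION f_named_demo(flag boolean DEFAULT TRUE)
  RETURNS text
  LANGUAGE sql AS
$func$
SELECT CASE WHEN $1 THEN 'on' ELSE 'off' END
$func$;

SELECT f_named_demo(flag => FALSE);  -- SQL-standard notation, Postgres 9.5 or later
SELECT f_named_demo(flag := FALSE);  -- older notation, still accepted
```

Both calls pass FALSE to the parameter flag; only the spelling of the named-argument operator differs.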

A partial answer to my own question:
The PL/pgSQL section Obtaining the Result Status shows two examples using a special syntax:
GET DIAGNOSTICS variable = item [ , ... ];
GET DIAGNOSTICS integer_var = ROW_COUNT;
I tried both := and = and both work.
But GET DIAGNOSTICS is special syntax, so one can argue that this is also not a normal PL/pgSQL assignment operation.
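A minimal sketch of that experiment, assuming a hypothetical temporary table diag_demo and function diag_demo_fn (names chosen only for this example):

```sql
-- Hypothetical scratch table, for illustration only.
CREATE TEMP TABLE diag_demo (id int);

CREATE OR REPLACE FUNCTION diag_demo_fn() RETURNS int AS $$
DECLARE
   n_eq int;  -- filled using "="
   n_ce int;  -- filled using ":="
BEGIN
   INSERT INTO diag_demo SELECT generate_series(1, 3);
   GET DIAGNOSTICS n_eq = ROW_COUNT;    -- 3 rows inserted

   INSERT INTO diag_demo SELECT generate_series(1, 2);
   GET DIAGNOSTICS n_ce := ROW_COUNT;   -- 2 rows inserted

   RETURN n_eq * 10 + n_ce;             -- combine both counts into one value
END;
$$ LANGUAGE plpgsql;

SELECT diag_demo_fn();  -- 32
```

Both spellings are accepted inside GET DIAGNOSTICS, and each variable receives the row count of the preceding INSERT.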

Reading the PostgreSQL 9 documentation:
This page lists "=" as an assignment operator in the table on operator precedence.
But strangely this page (assignment operator documentation) doesn't mention it.

Related

How to wrap record_out() function?

I'd like to create an IMMUTABLE wrapper function as discussed by https://stackoverflow.com/a/11007216/14731 but it's not clear how to proceed. The above Stackoverflow answer provides the following example:
For example, given:
CREATE OR REPLACE FUNCTION public.immutable_unaccent(regdictionary, text)
RETURNS text LANGUAGE c IMMUTABLE PARALLEL SAFE STRICT AS
'$libdir/unaccent', 'unaccent_dict';
CREATE OR REPLACE FUNCTION public.f_unaccent(text)
RETURNS text LANGUAGE sql IMMUTABLE PARALLEL SAFE STRICT AS
$func$
SELECT public.immutable_unaccent(regdictionary 'public.unaccent', $1)
$func$;
I scanned all the libraries in lib/ and as far as I can tell none of them export functions related to record_out(). Any ideas?
The record_out() function is an internal built-in function. You can get its definition like this:
select pg_get_functiondef('record_out'::regproc);
pg_get_functiondef
----------------------------------------------------------
CREATE OR REPLACE FUNCTION pg_catalog.record_out(record)+
RETURNS cstring +
LANGUAGE internal +
STABLE PARALLEL SAFE STRICT +
AS $function$record_out$function$ +
I don't know for what purpose you want the wrapping function. It only remains to warn that such a change may bring unexpected results.

Why do I get this compiler error with the djb2 hash function

I copied and pasted the widely available code for the djb2 hashing function, but it generates the error shown below (I am using the CS50.ide, which may be a factor). Since this error IS fixed by a second set of parentheses, can someone explain why those aren't in the code I find everywhere online?
dictionary.c:67:14: error: using the result of an assignment as a condition without parentheses [-Werror,-Wparentheses]
    while (c = *word++)
           ~~^~~~~~~~~
dictionary.c:67:14: note: place parentheses around the assignment to silence this warning
    while (c = *word++)
             ^
           (          )
dictionary.c:67:14: note: use '==' to turn this assignment into an equality comparison
    while (c = *word++)
             ^
             ==
= is used for setting variables to a value. == is the relational operator used for comparing the equality of values. Perhaps you are finding the C++ version of the function. Perhaps it is the IDE compiler rules/config.
I understand about = vs ==. My question was why I get the compiler error with code that is correct, since it is a well-established hash function.
It turns out this is related to the CS50 makefile being more stringent than clang on its own. Needlessly frustrating.

Is there a way to disable function overloading in Postgres

My users and I do not use function overloading in PL/pgSQL. We always have one function per (schema, name) tuple. As such, we'd like to drop a function by name only, change its signature without having to drop it first, etc. Consider for example, the following function:
CREATE OR REPLACE FUNCTION myfunc(day_number SMALLINT)
RETURNS TABLE(a INT)
AS
$BODY$
BEGIN
RETURN QUERY (SELECT 1 AS a);
END;
$BODY$
LANGUAGE plpgsql;
To save time, we would like to invoke it as follows, without qualifying 1 with ::SMALLINT, because there is only one function named myfunc, and it has exactly one parameter named day_number:
SELECT * FROM myfunc(day_number := 1)
There is no ambiguity, and the value 1 is consistent with SMALLINT type, yet PostgreSQL complains:
SELECT * FROM myfunc(day_number := 1);
ERROR: function myfunc(day_number := integer) does not exist
LINE 12: SELECT * FROM myfunc(day_number := 1);
^
HINT: No function matches the given name and argument types.
You might need to add explicit type casts.
When we invoke such functions from Python, we use a wrapper that looks up functions' signatures and qualifies parameters with types. This approach works, but there seems to be a potential for improvement.
Is there a way to turn off function overloading altogether?
Erwin has already given a correct reply. My reply addresses whether overloading can be disabled.
It is not possible to disable overloading. It is a basic feature of PostgreSQL's function API and cannot be turned off. We know there are some side effects, such as strict function signature rigidity, but this protects against some unpleasant side effects when a function is used in views, table definitions, and so on. So you cannot disable it.
You can easily check whether you have any overloaded functions:
postgres=# select count(*), proname
from pg_proc
where pronamespace <> 11
group by proname
having count(*) > 1;
count | proname
-------+---------
(0 rows)
This is actually not directly a matter of function overloading (which would be impossible to "turn off"). It's a matter of function type resolution. (Of course, that algorithm could be more permissive without overloaded functions.)
All of these would just work:
SELECT * FROM myfunc(day_number := '1');
SELECT * FROM myfunc('1'); -- note the quotes
SELECT * FROM myfunc(1::smallint);
SELECT * FROM myfunc('1'::smallint);
Why?
The last two are rather obvious, you mentioned that in your question already.
The first two are more interesting, the explanation is buried in the Function Type Resolution:
unknown literals are assumed to be convertible to anything for this purpose.
And that should be the simple solution for you: use string literals.
An untyped literal '1' (with quotes) or "string literal" as defined in the SQL standard is different in nature from a typed literal (or constant).
A numeric constant 1 (without quotes) is cast to a numeric type immediately. The manual:
A numeric constant that contains neither a decimal point nor an
exponent is initially presumed to be type integer if its value fits in
type integer (32 bits); otherwise it is presumed to be type bigint if
its value fits in type bigint (64 bits); otherwise it is taken to be
type numeric. Constants that contain decimal points and/or exponents
are always initially presumed to be type numeric.
The initially assigned data type of a numeric constant is just a
starting point for the type resolution algorithms. In most cases the
constant will be automatically coerced to the most appropriate type
depending on context. When necessary, you can force a numeric value to
be interpreted as a specific data type by casting it.
Bold emphasis mine.
The assignment in the function call (day_number := 1) is a special case, the data type of day_number is unknown at this point. Postgres cannot derive a data type from this assignment and defaults to integer.
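The initial typing described above can be inspected with the built-in pg_typeof() function. This is only a sketch for exploring the rule; the column aliases are made up for readability:

```sql
SELECT pg_typeof(1)           AS small_constant,   -- integer
       pg_typeof(3000000000)  AS large_constant,   -- bigint (does not fit in 32 bits)
       pg_typeof(1.5)         AS decimal_constant, -- numeric
       pg_typeof('1')         AS quoted_literal,   -- unknown
       pg_typeof(1::smallint) AS cast_constant;    -- smallint
```

The unquoted 1 starts out as integer, while the quoted '1' stays unknown, which is why the quoted form can still be resolved to a smallint parameter.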
Consequently, Postgres looks for a function taking an integer first. Then for functions taking a type only an implicit cast away from integer, in other words:
SELECT casttarget::regtype
FROM pg_cast
WHERE castsource = 'int'::regtype
AND castcontext = 'i';
All of these would be found - and conflict if there were more than one function. That would be function overloading, and you would get a different error message. With two candidate functions like this:
SELECT * FROM myfunc(1);
ERROR: function myfunc(integer) is not unique
Note the "integer" in the message: the numeric constant has been cast to integer.
However, the cast from integer to smallint is "only" an assignment cast. And that's where the journey ends:
No function matches the given name and argument types.
More detailed explanation in these related answers:
PostgreSQL ERROR: function to_tsvector(character varying, unknown) does not exist
Generate series of dates - using date type as input
Dirty fix
You could fix this by "upgrading" the cast from integer to smallint to an implicit cast:
UPDATE pg_cast
SET castcontext = 'i'
WHERE castsource = 'int'::regtype
AND casttarget = 'int2'::regtype;
But I would strongly discourage tampering with the default casting system. Only consider this if you know exactly what you are doing. You'll find related discussions in the Postgres lists. It can have all kinds of side effects, starting with function type resolution, but not ending there.
Aside
Function type resolution is completely independent from the used language. An SQL function would compete with PL/perl or PL/pgSQL or "internal" functions just the same. The function signature is essential. Built-in functions only come first, because pg_catalog comes first in the default search_path.
There are plenty of built-in functions that are overloaded, so it simply would not work if you turned off function overloading.

What is the correct way to select real solutions?

Suppose one needs to select the real solutions after solving some equation.
Is this the correct and optimal way to do it, or is there a better one?
restart;
mu := 3.986*10^5; T:= 8*60*60:
eq := T = 2*Pi*sqrt(a^3/mu):
sol := solve(eq,a);
select(x->type(x,'realcons'),[sol]);
I could not find real as type. So I used realcons. At first I did this:
select(x->not(type(x,'complex')),[sol]);
which did not work, since Maple considers 5 to be complex! So I ended up with no solutions.
type(5,'complex');
(* true *)
Also I could not find an isreal() type of function. (unless I missed one)
Is there a better way to do this that one should use?
update:
To answer the comment below saying that 5 is not supposed to be complex in Maple:
restart;
type(5,complex);
true
type(5,'complex');
true
interface(version);
Standard Worksheet Interface, Maple 18.00, Windows 7, February
From help
The type(x, complex) function returns true if x is an expression of the form
a + I b, where a (if present) and b (if present) are finite and of type realcons.
Your solutions sol are all of type complex(numeric). You can select only the real ones with type,numeric, i.e.
restart;
mu := 3.986*10^5: T:= 8*60*60:
eq := T = 2*Pi*sqrt(a^3/mu):
sol := solve(eq,a);
20307.39319, -10153.69659 + 17586.71839 I, -10153.69659 - 17586.71839 I
select( type, [sol], numeric );
[20307.39319]
By using the multiple argument calling form of the select command we here can avoid using a custom operator as the first argument. You won't notice it for your small example, but it should be more efficient to do so. Other commands such as map perform similarly, to avoid having to make an additional function call for each individual test.
The types numeric and complex(numeric) cover real and complex integers, rationals, and floats.
The types realcons and complex(realcons) include the previous, but also allow for an application of evalf done during the test. So Int(sin(x),x=1..3) and Pi and sqrt(2) are all of type realcons, since following an application of evalf they become floats of type numeric.
The above is about types. There are also properties to consider. Types are properties, but not necessarily vice versa. There is a real property, but no real type. The is command can test for a property, and while it is often used for mixed numeric-symbolic tests under assumptions (on the symbols) it can also be used in tests like yours.
select( is, [sol], real );
[20307.39319]
It is less efficient to use is for your example. If you know that you have a collection of (possibly non-real) floats then type,numeric should be an efficient test.
And, just to muddy the waters... there is a type nonreal.
remove( type, [sol], nonreal );
[20307.39319]
One possibility is to restrict the domain before the calculation takes place.
Here is an explanation on the Maplesoft website regarding restricting the domain:
4 Basic Computation
UPD: Basically, according to this and that, 5 is NOT considered complex in Maple, so there might be some bug/error/mistake (try checking what may be wrong there).
For instance, try putting complex without quotes.
Your way seems very logical according to this.
UPD2: According to the Maplesoft website, all type checks are done with the type() function, so there is probably no isreal() function.