Understanding COALESCE in postgres - postgresql

Precise question.
Table row:
value1  value2  value3  value4
a       b       null    d
Function parameters
CREATE OR REPLACE FUNCTION "GetValues"(
    "#value1" VARCHAR(50),
    "#value2" VARCHAR(50),
    "#value3" VARCHAR(50),
    "#value4" VARCHAR(50)
)
RETURNS TABLE (value1 VARCHAR(50), value2 VARCHAR(50), value3 VARCHAR(50), value4 VARCHAR(50))
LANGUAGE plpgsql
AS $$
BEGIN
    RETURN QUERY SELECT
        t."value1",
        t."value2",
        t."value3",
        t."value4"
    FROM "table" AS t
    WHERE t."value1" = COALESCE("#value1", t."value1")
      AND t."value2" = COALESCE("#value2", t."value2")
      AND t."value3" = COALESCE("#value3", t."value3")
      AND t."value4" = COALESCE("#value4", t."value4");
END;
$$;
If I use the above function and only provide the following:
('a', null, null, 'd')
It will return [] even though 'a' and 'd' are found. I've noticed this only happens when a parameter I'm searching with is null and the corresponding value in the row is also null.
OLD DESCRIPTION BELOW
I have set up a GET which uses COALESCE successfully to search by multiple parameters or just one. However, if any of the parameters that are not provided (and so default to NULL) are actually NULL in the db, because I haven't updated that field before, then it will always return an empty array, even though one of the provided params successfully matches a row in the table.
I just want to know if I need a new system altogether to complete this search, or if it is just an unfortunate effect of COALESCE?
Below is the relevant snippet.
FROM "table" as t
WHERE t."value1" = COALESCE("#value1", t."value1")
AND t."value2" = COALESCE("#value2", t."value2")
AND t."value3" = COALESCE("#value3", t."value3")
AND t."value4" = COALESCE("#value4", t."value4");
In the above, if I provide value1 and it matches but value4 is NULL in that row, then it will return [].
The return is a table with each of those 4 values.

Should this be a simple row comparison (return all rows whose values equal the input parameters)?
That could simply be achieved with the row comparator (see the documentation):
WHERE row(t.*) IS NOT DISTINCT FROM row("#value1", "#value2", "#value3", "#value4")
demo: db<>fiddle
If a NULL function input parameter should act as a wildcard, then @kurkle's solution below works well.
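The reason the original query drops those rows: with a NULL parameter, COALESCE("#value3", t."value3") falls back to the column itself, so the predicate becomes t."value3" = t."value3", which evaluates to NULL (not TRUE) when the column is NULL, and the row is filtered out. If you want to keep the per-column COALESCE shape but treat NULL parameters as wildcards, one NULL-safe variant is to switch the comparison to IS NOT DISTINCT FROM (a sketch based on the question's column names):
FROM "table" AS t
WHERE t."value1" IS NOT DISTINCT FROM COALESCE("#value1", t."value1")
  AND t."value2" IS NOT DISTINCT FROM COALESCE("#value2", t."value2")
  AND t."value3" IS NOT DISTINCT FROM COALESCE("#value3", t."value3")
  AND t."value4" IS NOT DISTINCT FROM COALESCE("#value4", t."value4");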

You could do it like this:
FROM test as t
WHERE ("#value1" IS NULL OR t."value1" = "#value1")
AND ("#value2" IS NULL OR t."value2" = "#value2")
AND ("#value3" IS NULL OR t."value3" = "#value3")
AND ("#value4" IS NULL OR t."value4" = "#value4");
db<>fiddle
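For reference, the call from the question, supplying only the first and last values, would look like this (a sketch, assuming the GetValues signature shown above):
SELECT * FROM "GetValues"('a', NULL, NULL, 'd');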

Related

"CREATE FUNCTION failed because a column name is not specified for column 1" error for a function with multiple parameters

I wanted to create a function with multiple parameters, but it gives me this error:
CREATE FUNCTION failed because a column name is not specified for column 1.
Code below:
create function dmt.Impacted(
    @nameOfColumn varchar, @nameOfParam varchar)
returns table
as
return
    (select
        case when '[' + @nameOfColumn + ']' is null or len(rtrim('[' + @nameOfColumn + ']')) = 0
             then Convert(nvarchar(2), 0)
             else @nameOfParam
        end
     from employee);
As the error message clearly says, the column in the returned result needs a name. Either give it an alias in the SELECT, like
SELECT CASE
...
END a_column_name
...
or define it in the declaration of the return type as in
...
RETURNS TABLE
(a_column_name nvarchar(max)
...
As you can see, in the second form you have to specify a data type. Since your current code doesn't make much sense as it stands, I cannot figure out what the right type would be there; you'd need to amend it.
Note that len(rtrim('[' + @nameOfColumn + ']')) = 0 is never true, as len(rtrim('[' + @nameOfColumn + ']')) is either NULL (when @nameOfColumn is NULL) or at least 2 because of the added brackets.
If @nameOfColumn is supposed to be a column name you shouldn't use varchar (especially not without a specified length) but sysname, which is a special type for object names.
Either way you should define a length for @nameOfColumn and @nameOfParam, as just varchar without any length means varchar(1), which is probably not what you want. And maybe you want nvarchar instead of varchar.
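A quick way to see that one-character default in action (a sketch):
DECLARE @nameOfColumn varchar = 'employee_id';
SELECT @nameOfColumn; -- returns just 'e', because varchar with no length means varchar(1) here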
You may also want to look into quotename().
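Putting the first suggestion together, one possible shape for the corrected function is sketched below (sysname, the nvarchar(100) length, and the a_column_name alias are assumptions, not taken from the original code):
create function dmt.Impacted(@nameOfColumn sysname, @nameOfParam nvarchar(100))
returns table
as
return
(
    select case
               when @nameOfColumn is null or len(rtrim(@nameOfColumn)) = 0
                   then Convert(nvarchar(2), 0)
               else @nameOfParam
           end as a_column_name -- the alias is what the error message was asking for
    from employee
);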
Define the name of the column in the SELECT statement:
(select case when '[' + @nameOfColumn + ']' is null or
             len(rtrim('[' + @nameOfColumn + ']')) = 0
        then Convert(nvarchar(2), 0)
        else @nameOfParam
        end as name_column -- define column name
 from employee)
Also, your function parameters have no data length; by default @nameOfColumn varchar, @nameOfParam varchar will each accept only 1 character, and the rest will be truncated.

Update with ISNULL and operation

The original query looks like this:
UPDATE reponse_question_finale t1, reponse_question_finale t2
SET t1.nb_question_repondu = (9 - (ISNULL(t1.valeur_question_4) + ISNULL(t1.valeur_question_6) + ISNULL(t1.valeur_question_7) + ISNULL(t1.valeur_question_9)))
WHERE t1.APPLICATION = t2.APPLICATION;
I know you cannot update two tables in a single query, so I tried this:
UPDATE reponse_question_finale t1
SET nb_question_repondu = (9 - (COALESCE(t1.valeur_question_4, '')::int
                              + COALESCE(t1.valeur_question_6, '')::int
                              + COALESCE(t1.valeur_question_7)::int
                              + COALESCE(t1.valeur_question_9, '')::int))
WHERE t1.APPLICATION = t1.APPLICATION;
But this query gave me an error: invalid input syntax for integer: ""
I saw that the PostgreSQL equivalent of MySQL's ISNULL() is COALESCE(), so I think I'm on the right track here.
I also know you cannot add varchar to varchar, so I tried to cast to integer. I'm not sure whether I cast it correctly, with the parentheses in the right places, and judging by the error maybe I cannot cast the result of COALESCE to int like this.
Lastly, I could certainly do a correlated sub-select to update my two tables, but I'm a little lost at this point.
The output must be an integer matching the number of questions answered in a backup survey.
Any thoughts?
Thanks.
coalesce() returns the first non-null value from the list supplied. So, if the column value is null the expression COALESCE(t1.valeur_question_4,'') returns an empty string and that's why you get the error.
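You can reproduce that failure in isolation; it is the cast of the empty string to integer that fails (a quick sketch):
SELECT COALESCE(NULL::text, '')::int; -- fails with: invalid input syntax for integer: ""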
But it seems you want something completely different: you want to check whether the column is null (or empty) and, if so, subtract from a total in order to count the number of answered columns.
To get 1 if a value is null (or empty) and 0 otherwise, you can use:
(nullif(valeur_question_4, '') is null)::int
nullif returns null if the first value equals the second. The IS NULL condition returns a boolean (something that MySQL doesn't have), and that can be cast to an integer (false is cast to 0 and true to 1).
So the whole expression should be:
nb_question_repondu = 9 - (
(nullif(t1.valeur_question_4,'') is null)::int
+ (nullif(t1.valeur_question_6,'') is null)::int
+ (nullif(t1.valeur_question_7,'') is null)::int
+ (nullif(t1.valeur_question_9,'') is null)::int
)
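As a complete statement, that would look roughly like this (a sketch based on the expression above; note there is no WHERE clause, so every row gets updated):
UPDATE reponse_question_finale t1
SET nb_question_repondu = 9 - (
      (nullif(t1.valeur_question_4, '') is null)::int
    + (nullif(t1.valeur_question_6, '') is null)::int
    + (nullif(t1.valeur_question_7, '') is null)::int
    + (nullif(t1.valeur_question_9, '') is null)::int
);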
Another option is to unpivot the columns and count them in a sub-select:
update reponse_question_finale
set nb_question_repondu = (select count(*)
                           from (values
                                     (valeur_question_4),
                                     (valeur_question_6),
                                     (valeur_question_7),
                                     (valeur_question_9)
                                ) as t(q)
                           where nullif(trim(q), '') is not null);
Adding more columns to be considered is quite easy then, as you just need to add a single line to the values() clause.

Most efficient way to join to a two-part key, with a fallback to matching only the first part?

In purely technical terms
Given a table with a two-column unique key, and input values for those two columns, what is the most efficient way to return the first matching row based on a two-step match:
If an exact match exists on both key parts, return that
Otherwise, return the first (if any) matching row based on the first part alone
This operation will be done in many different places, on many rows. The "payload" of the match will be a single string column (nvarchar(400)). I want to optimize for fast reads. Paying for this with slower inserts and updates and more storage is acceptable. So having multiple indexes with the payload included is an option, as long as there is a good way to execute the two-step match described above. There absolutely will be a unique index on (key1, key2) with the payload included, so essentially all reads will be going off of this index alone, unless there is some clever approach that would use additional indexes.
A method that returns the entire matching row is preferred, but if a scalar function that only returns the payload is an order of magnitude faster, then that is worth considering.
I've tried three different methods, two of which I have posted as answers below. The third method was about 20x more expensive in the explain plan cost, and I've included it at the end of this post as an example of what not to do.
I'm curious to see if there are better ways, though, and will happily vote someone else's suggestion as the answer if it is better. In my dev database the query planner estimates similar costs to my two approaches, but my dev database doesn't have anywhere near the volume of multilingual text that will be in production, so it's hard to know if this accurately reflects the comparative read performance on a large data set. As tagged, the platform is SQL Server 2012, so if there are new applicable features available as of that version do make use of them.
Business context
I have a table LabelText that represents translations of user-supplied dynamic content:
create table Label ( LabelID bigint identity(1,1) not null primary key );
create table LabelText (
    LabelTextID bigint identity(1,1) not null primary key
    , LabelID bigint not null
    , LanguageCode char(2) not null
    , LabelText nvarchar(400) not null
    , constraint FK_LabelText_Label
        foreign key ( LabelID ) references Label ( LabelID )
);
There is a unique index on LabelID and LanguageCode, so there can only be one translation of a text item for each ISO 2-character language code. The LabelText field is also included, so reads can be served from the index alone without having to fetch back from the underlying table:
create unique index UQ_LabelText
on LabelText ( LabelID, LanguageCode )
include ( LabelText);
I'm looking for the fastest-performing way to return the best match from the LabelText table in a two-step match, given a LabelID and LanguageCode.
For examples, let's say we have a Component table that looks like this:
create table Component (
ComponentID bigint identity(1,1) not null primary key
, NameLabelID bigint not null
, DescriptionLabelID bigint not null
, constraint FK_Component_NameLabel
foreign key ( NameLabelID ) references Label ( LabelID )
, constraint FK_Component_DescLabel
foreign key ( DescriptionLabelID ) references Label ( LabelID )
);
Users will each have a preferred language, but there is no guarantee that a text item will have a translation in their language. In this business context it makes more sense to show any available translation rather than none, when the user's preferred language is not available. So for example a German user may call a certain widget the 'linkenpfostenklammer'. A British user would prefer to see an English translation if one is available, but until there is one it is better to see the German (or Spanish, or French) version than to see nothing.
What not to do: Cross apply with dynamic sort
Whether encapsulated in a table-valued function or included inline, the following use of cross apply with a dynamic sort was about 20x more expensive (per explain plan estimate) than either the scalar-valued function in my first answer or the union all approach in my second answer:
declare @LanguageCode char(2) = 'de';
select
c.ComponentID
, c.NameLabelID
, n.LanguageCode as NameLanguage
, n.LabelText as NameText
from Component c
outer apply (
select top 1
lt.LanguageCode
, lt.LabelText
from LabelText lt
where lt.LabelID = c.NameLabelID
order by
(case when lt.LanguageCode = @LanguageCode then 0 else 1 end)
) n
I think this is going to be the most performant:
select lt.*, c.*
from ( select LabelText, LabelID from LabelText
where LabelTextID = @LabelTextID and LabelID = @LabelID
union
select LabelText, min(LabelID) from LabelText
where LabelTextID = @LabelTextID
and not exists (select 1 from LabelText
where LabelTextID = @LabelTextID and LabelID = @LabelID)
group by LabelTextID, LabelText
) lt
join component c
on c.NameLabelID = lt.LabelID
OP solution 1: Scalar function
A scalar function would make it easy to encapsulate the lookup for reuse elsewhere, though it does not return the language code of the text actually returned. I'm also unsure of the cost of executing multiple times per row in denormalized views.
create function dbo.GetLabelText(@LabelID bigint, @LanguageCode char(2))
returns nvarchar(400)
as
begin
    declare @text nvarchar(400);

    select @text = LabelText
    from LabelText
    where LabelID = @LabelID and LanguageCode = @LanguageCode;

    if @text is null begin
        select @text = LabelText
        from LabelText
        where LabelID = @LabelID;
    end

    return @text;
end
Usage would look like this:
declare @LanguageCode char(2) = 'de';
select
ComponentID
, NameLabelID
, DescriptionLabelID
, dbo.GetLabelText(NameLabelID, @LanguageCode) AS NameText
, dbo.GetLabelText(DescriptionLabelID, @LanguageCode) AS DescriptionText
from Component
OP solution 2: Inline table-valued function using top 1, union all
A table-valued function is nice because it encapsulates the lookup for reuse just as with a scalar function, but also returns the matching LanguageCode of the row that was actually selected. In my dev database with limited data the explain plan cost of the following use of top 1 and union all is comparable to the scalar function approach in "OP Solution 1":
create function GetLabelText(@LabelID bigint, @LanguageCode char(2))
returns table
as
return (
select top 1
A.LanguageCode
, A.LabelText
from (
select
LanguageCode
, LabelText
from LabelText
where LabelID = @LabelID
and LanguageCode = @LanguageCode
union all
select
LanguageCode
, LabelText
from LabelText
where LabelID = @LabelID
) A
);
Usage:
declare @LanguageCode char(2) = 'de';
select
c.ComponentID
, c.NameLabelID
, n.LanguageCode AS NameLanguage
, n.LabelText AS NameText
, c.DescriptionLabelID
, d.LanguageCode AS DescriptionLanguage
, d.LabelText AS DescriptionText
from Component c
outer apply GetLabelText(c.NameLabelID, @LanguageCode) n
outer apply GetLabelText(c.DescriptionLabelID, @LanguageCode) d

Update a varchar column depending on its current value

I'm using PostgreSQL 9.3.
I have a varchar column in a table that can be null, and I want to update it depending on whether its value is null or not.
I didn't manage to write a function that takes a string as argument and updates the value like this:
If the column is not null, append a comma and the string given as argument to the current value; if the column is null, just set it to the given string (no comma).
So how can I make the UPDATE behave differently depending on the current value of the column?
You can use a CASE expression to update the column conditionally:
update the_table
set the_column = case
                     when the_column is null then 'foobar'
                     else the_column || ', ' || 'foobar'
                 end
Another approach:
UPDATE foo
SET bar = COALESCE(NULLIF(concat_ws(', ', NULLIF(bar, ''), NULLIF('a_string', '')), ''), 'a_string')
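To unpack that one-liner: concat_ws(', ', ...) joins its non-null arguments with a comma, the inner NULLIF calls turn empty strings into NULL so they get skipped, and the outer COALESCE/NULLIF pair falls back to 'a_string' when everything else was empty. A quick way to convince yourself (a sketch, using literal values in place of the column):
SELECT concat_ws(', ', NULLIF(NULL::text, ''), NULLIF('a_string', ''));  -- 'a_string'      (bar was null)
SELECT concat_ws(', ', NULLIF('old', ''), NULLIF('a_string', ''));       -- 'old, a_string' (bar had a value)
SELECT concat_ws(', ', NULLIF('', ''), NULLIF('a_string', ''));          -- 'a_string'      (bar was empty)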

SSRS Parameters. Allowing "All" or "Null"

SSRS parameters are a pain. I want to be able to re-use reports for many different needs by allowing the users access to many different parameters and making them optional.
So, if I start out with code such as:
Select * from mytable myt
where myt.date between '1/1/2010' and '12/31/2010'
and year(myt.date) = '2010'
and myt.partnumber = 'XYZ-123'
I want those parameters to be optional so my first attempts were to make the parameters default to null such as:
and (myt.partnumber = (@PartNumber) or (@PartNumber) is null)
That has problems because if the database fields in question are nullable then you will drop records because null does not equal null.
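A quick illustration of that behavior (a sketch; assumes the default ANSI_NULLS ON setting):
SELECT CASE WHEN NULL = NULL THEN 'equal' ELSE 'not equal' END; -- returns 'not equal'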
I then used code such as this:
DECLARE @BeginDate AS DATETIME
DECLARE @EndDate AS DATETIME
DECLARE @PartNumber AS VARCHAR(25)
DECLARE @Year AS VARCHAR(25) -- declaration assumed; the original snippet sets @Year without declaring it
SET @BeginDate = '1/1/2005'
SET @EndDate = '12/31/2010'
SET @PartNumber = '..All'
SET @Year = '..All'
Select * from mytable myt
where (myt.date between (@BeginDate) and (@EndDate))
and (year(myt.date) = (@Year) or (@Year) = '..All' )
and (myt.partnumber = (@PartNumber) or (@PartNumber) = '..All')
That doesn't work because Year(myt.date) is an integer and @Year is not.
So, here are my questions.
How can I make my dates optional? Is the best way to simply default them to dates outside of a practical range so I return all values?
What is the best way to handle the null or '..All' options to make my queries as readable as possible and allow my users to have optional parameters for most data types? I'd rather not use null for
Go ahead and allow nulls, which indicates the filter should not be applied. Then, you can use the following:
SELECT *
FROM mytable myt
WHERE COALESCE(myt.date, '1/1/1900') between COALESCE(@BeginDate, myt.date, '1/1/1900') and COALESCE(@EndDate, myt.date, '1/1/1900')
AND COALESCE(YEAR(myt.date), -1) = COALESCE(@Year, YEAR(myt.date), -1)
AND COALESCE(myt.partnumber, '') = COALESCE(@PartNumber, myt.partnumber, '')
In summary, if any variable value is NULL, then compare the column value to itself, which effectively ignores the condition. More specifically, when testing myt.date, if @BeginDate is NULL then set the lower range value equal to the myt.date value. Do the same substitution with the @EndDate value. Even if both @BeginDate and @EndDate are NULL, the condition will be true.
A similar approach is used for YEAR(myt.date) and myt.partnumber. If the variable value is NULL, then compare the column value to itself, which is always true.
UPDATE:
Added a default value to each COALESCE to handle the situation where the column value is NULL.
I like your third code block. It seems like your WHERE clause could be corrected to work with a non-int value. The AND clause for the year would look like this (not my best T-SQL, but it should get you pointed in the right direction):
and 1 = CASE @Year WHEN '..All' THEN 1 ELSE CASE WHEN year ( myt.date ) = CONVERT ( int, @Year ) THEN 1 ELSE 0 END END
This will allow you to have a string value of '..All' or an int value. Either will match correctly. You can do the same with partnumber.
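For the part number, the analogous clause would look roughly like this (a sketch; no CONVERT is needed because partnumber is already character data):
and 1 = CASE @PartNumber WHEN '..All' THEN 1
        ELSE CASE WHEN myt.partnumber = @PartNumber THEN 1 ELSE 0 END END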
Try it like this. The key is to map your null parameter values to surrogate nulls; also, since SQL Server supports short-circuit evaluation, putting the null check first should generally perform better.
Select * from mytable myt
where (myt.date between (@BeginDate) and (@EndDate))
and (@Year IS NULL OR COALESCE(YEAR(myt.date), 1900) = @Year)
and (@PartNumber IS NULL OR ISNULL(myt.partnumber, '<NULL>') = (@PartNumber))