Npgsql and NOT IN or !=ANY Queries - postgresql

It seems that there is no standard support for the IN clause in Npgsql. I see posts that recommend using = ANY instead of IN. This works great as a replacement for a standard IN clause. However, Postgres (pgsql) does not seem to have anything that allows you to do a NOT ANY or !=ANY query. It does, however, support NOT IN, but it seems that Npgsql does not. Can someone help me understand how I might write an Npgsql-compatible query like this one:
select * from my_table where id NOT IN (1,2,3,4)

First, this has nothing to do with Npgsql - it's a PostgreSQL question.
Second, PostgreSQL does have full standard support for IN clauses. It's important to understand the difference between IN and ANY: IN operates on rows, whereas ANY operates on arrays - the two definitely aren't the same, even though you can convert one into the other (e.g. see unnest). Read the docs carefully.
Finally, to answer your question... Saying WHERE x != ANY(some_array) means "where there's some element of some_array that isn't equal to x". This indeed isn't the same as what you want, which is "where none of some_array's elements are equal to x". You can achieve the latter with WHERE x != ALL(some_array): this checks x against each and every element, returning true only if all of them are unequal.
You can also use ANY with simple logical negation: WHERE NOT (x = ANY(some_array)).
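The difference between these predicates can be modeled outside the database. What follows is a minimal Python sketch, not PostgreSQL or Npgsql code: the helper names ne_any, ne_all and not_eq_any are made up for illustration, and SQL NULL handling is deliberately ignored.

```python
def ne_any(x, arr):
    """Models x != ANY(arr): true if at least one element differs from x."""
    return any(x != e for e in arr)


def ne_all(x, arr):
    """Models x != ALL(arr): true only if every element differs from x."""
    return all(x != e for e in arr)


def not_eq_any(x, arr):
    """Models NOT (x = ANY(arr))."""
    return not any(x == e for e in arr)


excluded = [1, 2, 3, 4]
ids = [1, 2, 5, 6]

# != ANY keeps every id here, because each id differs from at least one element.
kept_by_ne_any = [i for i in ids if ne_any(i, excluded)]

# != ALL and NOT (= ANY) both behave like NOT IN: only absent ids survive.
kept_by_ne_all = [i for i in ids if ne_all(i, excluded)]
kept_by_not_eq_any = [i for i in ids if not_eq_any(i, excluded)]
```

With an array of more than one distinct value, the != ANY form filters nothing out, which is why it cannot stand in for NOT IN.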

Dealing with division errors in PostgreSQL without procedural code - is it possible?

I have to produce (in PostgreSQL, if that matters) a table containing a column with the quotient of two sums, basically like this (quite simplified):
select name, sum(a)/sum(b), sum(c)/sum(d)
from a_complex_nested_select_query_with_many_zeros
group by name
order by name;
The table has tens of thousands of rows (not too big), but in a few cases, summing over b or d does produce 0, which causes the whole query to fail with Divide by 0.
In researching how to deal with the exception, I was only able to find information on PL/pgSQL Control Structures, which appears to require the creation of a function (but I'm not sure).
My question is of course how to make this query work. Perhaps the answer has something to do with
Can an exception be caught in non-procedural SQL (PostgreSQL, perhaps?)
Is this a case where procedural code is necessary?
Can a CASE..WHEN..ELSE..END structure avoid the problem? (I'm stuck on this because it looks like the SUM() calls are repeated!) It is appealing because I do not know enough about Postgres to know whether exception catching has a performance penalty.
Is there a way to, again without a function, ensure SUM() is evaluated once in a CASE expression?
If a function is required, what would it look like?
EDIT By "repeating sum calls" I mean that I know I could write:
select name,
case when sum(b)=0 then null else sum(a)/sum(b) end,
case when sum(d)=0 then null else sum(c)/sum(d) end
and so on, but am not sure if that is a good thing. (I guess someone will answer with a why-don't-you-profile-it but I think there may be better approaches out there, somewhere.)
nullif returns null if its two arguments are equal, and a division by null evaluates to null:
select
name,
sum(a) / nullif(sum(b), 0),
sum(c) / nullif(sum(d), 0)
from a_complex_nested_select_query_with_many_zeros
group by name
order by name;
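To see why this avoids the error, here is a small Python model of the two rules the query relies on. It is only a sketch: nullif and sql_div are hypothetical helpers that imitate the SQL behaviour with None standing in for NULL, the rows are invented, and Python's / is floating-point division, so the numbers need not match PostgreSQL's integer division.

```python
from collections import defaultdict


def nullif(value, other):
    """Models SQL NULLIF: None (NULL) when the two arguments are equal."""
    return None if value == other else value


def sql_div(numerator, denominator):
    """Models SQL division, where a NULL operand yields NULL."""
    if numerator is None or denominator is None:
        return None
    return numerator / denominator


# Invented sample rows: (name, a, b)
rows = [("x", 6, 2), ("x", 4, 3), ("y", 5, 0), ("y", 1, 0)]

# Equivalent in spirit to "group by name" with sum(a) and sum(b).
sums = defaultdict(lambda: [0, 0])
for name, a, b in rows:
    sums[name][0] += a
    sums[name][1] += b

# Equivalent in spirit to: sum(a) / nullif(sum(b), 0) ... order by name
result = [
    (name, sql_div(sum_a, nullif(sum_b, 0)))
    for name, (sum_a, sum_b) in sorted(sums.items())
]
```

The group whose denominator sums to 0 ends up with a NULL quotient instead of aborting the whole query, and each sum is written only once.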

Is there any logical reason to use CFQUERYPARAM in Query of Queries?

I primarily use CFQUERYPARAM to prevent SQL injection. Since Query-of-Queries (QoQ) does not touch the database, is there any logical reason to use CFQUERYPARAM in them? I know that values that do not match the cfsqltype and maxlength will throw an exception, but, these values should already be validated before that and display friendly messages (from a UX viewpoint).
"Since Query-of-Queries (QoQ) does not touch the database, is there any logical reason to use CFQUERYPARAM in them?" Actually, it does touch a database: the one that you currently have stored in memory. The data in that database could still theoretically be tampered with via some sort of injection from the user. Does that affect your physical database? No. Does that affect the use of the data within your application? Yes.
You did not give any specific details but I would err on the side of caution. If ANY of the data you are using to build your query comes from the client then use cfqueryparam in them. If you can guarantee that none of the elements in your query comes from the client then I think it would be okay to not use the cfqueryparam.
As an aside, using cfqueryparam also helps optimize the query for the database although I'm not sure if that is true for query of queries. It also escapes characters for you like apostrophes.
Here is a situation where it's simpler, in my opinion.
<cfquery name="NoVisit" dbtype="query">
select chart_no, patient_name, treatment_date, pr, BillingCompareField
from BillingData
where BillingCompareField not in
(<cfqueryparam cfsqltype="cf_sql_varchar"
value="#ValueList(FinalData.FinalCompareField)#" list="yes">)
</cfquery>
The alternative would be to use QuotedValueList. However, if anything in that value list contains an apostrophe, cfqueryparam will escape it. Otherwise I would have to.
Edit starts here
Here is another example where not using query parameters causes an error.
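<cfscript>
// Assumed setup: the original snippet omits how x is created.
x = QueryNew("dt", "date");
QueryAddRow(x,2);
QuerySetCell(x,"dt",CreateDate(2001,1,1),1);
QuerySetCell(x,"dt",CreateDate(2001,1,11),2);
</cfscript>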
<cfquery name="y" dbtype="query">
select * from x
<!---
where dt in (<cfqueryparam cfsqltype="cf_sql_date" value="#ValueList(x.dt)#" list="yes">)
--->
where dt in (#ValueList(x.dt)#)
</cfquery>
The code as written throws this error:
Query Of Queries runtime error.
Comparison exception while executing IN.
Unsupported Type Comparison Exception:
The IN operator does not support comparison between the following types:
Left hand side expression type = "DATE".
Right hand side expression type = "LONG".
With the query parameter, commented out above, the code executes successfully.
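The same idea, binding each list item as a parameter instead of pasting the list into the SQL text, can be sketched outside ColdFusion. The example below uses Python's sqlite3 module purely as an analogy; the table, column names and values are invented, and it is not a statement about how Query of Queries is implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE billing (chart_no INTEGER, compare_field TEXT)")
conn.executemany(
    "INSERT INTO billing VALUES (?, ?)",
    [
        (1, "O'Brien|2001-01-01"),
        (2, "Smith|2001-01-11"),
        (3, "Jones|2001-02-01"),
    ],
)

# Values to exclude; one of them contains an apostrophe.
final_values = ["O'Brien|2001-01-01", "Smith|2001-01-11"]

# One placeholder per list item, so no manual quoting or escaping is needed.
placeholders = ", ".join("?" for _ in final_values)
sql = f"SELECT chart_no FROM billing WHERE compare_field NOT IN ({placeholders})"
no_visit = [row[0] for row in conn.execute(sql, final_values)]
```

Because the values travel as bound parameters, the apostrophe never becomes part of the SQL text and each value keeps its declared type.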

Implement custom comparison in postgresql

I have some data in a postgres table with one column called version (of type varchar). I would like to use my own comparison function to order/sort on that column, but I am not sure what the most appropriate approach is:
I have a JS implementation of the style comp(left, right) -> -1/0/1, but I don't know how I can use it in a SQL ORDER BY clause (through plv8)
I could write a C extension, but I am not particularly excited about this (mostly for maintenance reasons, as writing the comparison in C would not be too difficult in itself)
Others?
The type of comparisons I am interested in are similar to the version string ordering used in package managers.
You want:
ORDER BY mycolumn USING operator
See the docs for SELECT. It looks like you may need to define an operator for the function, and a b-tree operator class containing the operator to use it; you can't just write USING myfunc().
(No time to test this and write a demo right now).
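For reference, the three-way comparison contract described in the question looks like this in Python. This is only an illustration of the ordering itself, assuming purely numeric, dot-separated versions; compare_versions is a hypothetical name, and nothing here is plv8 or a PostgreSQL operator definition.

```python
from functools import cmp_to_key


def compare_versions(left, right):
    """Return -1, 0, or 1, comparing dot-separated numeric components."""
    l_parts = [int(p) for p in left.split(".")]
    r_parts = [int(p) for p in right.split(".")]
    # Python compares lists element by element, then by length.
    return (l_parts > r_parts) - (l_parts < r_parts)


versions = ["1.10.0", "1.2.0", "1.2", "0.9.1"]

# Plain string ordering puts "1.10.0" before "1.2".
as_text = sorted(versions)

# Ordering through the comparison function treats components as numbers.
ordered = sorted(versions, key=cmp_to_key(compare_versions))
```

A function with this shape is what a custom ordering has to be built on; how to wire it into ORDER BY ... USING is the part that still needs an operator and operator class, as noted above.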

T-SQL speed comparison between LEFT() vs. LIKE operator

I'm creating result paging based on the first letter of a certain nvarchar column, rather than the usual paging by number of results.
And I'm now faced with the question of whether to filter results using the LIKE operator or the equality (=) operator.
select *
from table
where name like @firstletter + '%'
vs.
select *
from table
where left(name, 1) = @firstletter
I've tried searching the net for speed comparison between the two, but it's hard to find any results, since most search results are related to LEFT JOINs and not LEFT function.
"Left" vs "Like" -- one should always use "Like" when possible where indexes are implemented, because "Like" with a fixed prefix is not a function call on the column and can therefore make use of any indexes you may have on the data.
"Left", on the other hand, is a function applied to the column, and therefore generally cannot make use of indexes. This web page describes the usage differences with some examples. What this means is that SQL Server has to evaluate the function for every row it examines.
"Substring" and other similar functions are also culprits.
Your best bet would be to measure the performance on real production data rather than trying to guess (or ask us). That's because performance can sometimes depend on the data you're processing, although in this case it seems unlikely (but I don't know that, hence why you should check).
If this is a query you will be doing a lot, you should consider another (indexed) column which contains the lowercased first letter of name and have it set by an insert/update trigger.
This will, at the cost of a minimal storage increase, make this query blindingly fast:
select * from table where name_first_char_lower = @firstletter
That's because most databases are read far more often than written, and this will amortise the cost of the calculation (done only for writes) across all reads.
It introduces redundant data but it's okay to do that for performance as long as you understand (and mitigate, as in this suggestion) the consequences and need the extra performance.
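A rough sketch of that trigger-maintained column follows. It uses SQLite through Python's sqlite3 module only so that it is self-contained; SQL Server's trigger syntax is different, and the table name people, the index name and the trigger names are all assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        name_first_char_lower TEXT
    );
    CREATE INDEX idx_people_first_char ON people (name_first_char_lower);

    -- Fill the derived column whenever a row is inserted.
    CREATE TRIGGER people_first_char_ins AFTER INSERT ON people
    BEGIN
        UPDATE people
        SET name_first_char_lower = lower(substr(NEW.name, 1, 1))
        WHERE id = NEW.id;
    END;

    -- Keep the derived column in sync whenever name changes.
    CREATE TRIGGER people_first_char_upd AFTER UPDATE OF name ON people
    BEGIN
        UPDATE people
        SET name_first_char_lower = lower(substr(NEW.name, 1, 1))
        WHERE id = NEW.id;
    END;
    """
)

conn.executemany(
    "INSERT INTO people (name) VALUES (?)", [("alice",), ("Adams",), ("Bob",)]
)
conn.execute("UPDATE people SET name = ? WHERE name = ?", ("arnold", "Bob"))

# The paging query becomes a plain equality test on the derived column.
page_a = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM people WHERE name_first_char_lower = ? ORDER BY name",
        ("a",),
    )
]
```

The application never writes name_first_char_lower itself, which is the mitigation mentioned above: the redundant value cannot drift from name as long as the triggers are in place.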
I had a similar question, and ran tests on both. Here is my code.
where (VOUCHER like 'PCNSF%'
or voucher like 'PCLTF%'
or VOUCHER like 'PCACH%'
or VOUCHER like 'PCWP%'
or voucher like 'PCINT%')
Returned 1434 rows in 1 min 51 seconds.
vs
where (LEFT(VOUCHER,5) = 'PCNSF'
or LEFT(VOUCHER,5)='PCLTF'
or LEFT(VOUCHER,5) = 'PCACH'
or LEFT(VOUCHER,4)='PCWP'
or LEFT (VOUCHER,5) ='PCINT')
Returned 1434 rows in 1 min 27 seconds
My data is faster with the left 5. As an aside my overall query does hit some indexes.
I would always suggest using the LIKE operator when the search column is indexed. I tested the above query in my production environment with select count(column_name) from table_name where left(column_name,3)='AAA' OR left(column_name,3)= 'ABA' OR ... up to 9 OR clauses. The count came to 7301477 records, taking 4 seconds with LEFT and 1 second with LIKE, i.e. where column_name like 'AAA%' OR Column_Name like 'ABA%' or ... up to 9 LIKE clauses.
Calling a function in the WHERE clause is not a best practice. Refer to http://blog.sqlauthority.com/2013/03/12/sql-server-avoid-using-function-in-where-clause-scan-to-seek/
Entity Framework Core users
You can use EF.Functions.Like(columnName, searchString + "%") instead of columnName.StartsWith(...) and you'll get just a LIKE operator in the generated SQL instead of all this 'LEFT' craziness!
Depending upon your needs you will probably need to preprocess searchString.
See also https://github.com/aspnet/EntityFrameworkCore/issues/7429
This function isn't present in Entity Framework (non core) EntityFunctions so I'm not sure how to do it for EF6.

Postgresql HAVING clause limitation

Why can't one use an output column in the HAVING clause in PostgreSQL? It doesn't change the expressivity of the language in any way; it just forces people to rewrite the output column definition in the HAVING clause. Is there a way to avoid that, apart from putting the whole query as a subquery in SELECT * FROM (...) AS t WHERE condition?
Because it's not implemented? And if you're asking why it wasn't implemented, I see 2 possible explanations:
the standard doesn't require it
nobody had time to spend on it
If you'd like to have it, mail the pgsql-hackers list, discuss it, and then implement it.
Frankly I don't see it as a big problem - it's not like you have 1000 characters to retype.
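For comparison, here are the two forms discussed in this thread side by side: repeating the expression in HAVING, and wrapping the query in a subquery so the output column can be filtered by name. The sketch runs against SQLite via Python's sqlite3 module with an invented orders table; SQLite's handling of aliases in HAVING may differ from PostgreSQL's, so it only shows that the two forms return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 10), ("ann", 30), ("bob", 5), ("cy", 50)],
)

# Form 1: the aggregate expression is written twice.
repeated = conn.execute(
    "SELECT customer, sum(amount) AS total FROM orders "
    "GROUP BY customer HAVING sum(amount) > 20 ORDER BY customer"
).fetchall()

# Form 2: the aggregate is written once and filtered by its output name.
wrapped = conn.execute(
    "SELECT * FROM ("
    "  SELECT customer, sum(amount) AS total FROM orders GROUP BY customer"
    ") AS t WHERE total > 20 ORDER BY customer"
).fetchall()
```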