VARCHAR comparison on an indexed column

VARCHAR comparison on an indexed column - postgresql

Postgres is behaving differently from the 'common sense' expected behavior:
Given a table 'my_table' and a VARCHAR(250) column named 'MyVarcharColumn' where an index IDX_MyvarcharColumn is created based on the 'MyVarcharColumn'.
Collation: Default
Postgres version: 11.12
LC_COLLATE: en_US.utf8
Enconding: UTF8
CTYPE: en_US.utf8
The problem is presented below:
Given a query (A)
SELECT * FROM my_table t
WHERE 'mystring' = t.MyVarcharColumn
When running the query above, no records are returned even though there is a value 'mystring' present in 'my_table'.
Workaround:
SELECT * FROM my_table t
WHERE 'mystring' = t.MyVarcharColumn collate "C"
By adding 'collate "C"' the query works fine, obviously no one wants to have to add the "collate" statement at the end of every query.
Second 'Workaround':
By recreating the databases indexes 'REINDEX database myDB' the query also starts to work as expected without the need of adding the statement 'collate'.
The question is: is there a way to avoid using the collate statement and/or the REINDEX to make this work without a workaround?
Re-creating the database with a different collation it also not an option at the moment.
Using lower(column_name) to compare isn't an option since it does not use indexes and it would make the query slow.

Related

Query failed: collation "numerickn" for encoding "UTF8" does not exist

I have a column (vendor_name) in Postgresql (AWS RDS) table which can contain alphanumeric values. I would like to do a natural sort on this column.
The sample data in the table is as follows
delta 20221120
delta 20220109
costco delivery 564
costco delivery 561
united 01672519702943
Uber
I have created a colllate in the db as below.
CREATE COLLATION IF NOT EXISTS numerickn (provider = icu, locale = 'en-u-kn-true')
If anyone sorts on the vendor name column in the UI grid, I am adding the following clause dynamically in my query.
ORDER BY "vendor" COLLATE "numerickn"
However, it gives the following error, though I see collation exists in DB.
Error: Query failed: collation "numerickn" for encoding "UTF8" does
not exist
I am not sure why it does not work if collate exists in the DB. In my vendor name, numeric can appear anywhere within the string, so there is no pattern.

I could not find why it was not working in the stage environment while in my local it was working.
In the end, I moved away from colation logic and implement the natural sort in a different way found in stack overflow only.
PostgreSQL ORDER BY issue - natural sort
I am using Nodejs in my api code. My solution goes as follows
qOrderBy = String.raw` ORDER BY ARRAY(
SELECT ROW(
CAST(COALESCE(NULLIF(match[1], ''), '9223372036854775807') AS BIGINT),
match[2]
)
FROM REGEXP_MATCHES(vendor, '(\d*)|(\D*)', 'g')
AS match ) ${sortOrder}`
}

In Postgres, running ANALYZE changes behaviour of ILIKE clause with COLLATE

We are migrating an application from SQL Server to Postgres and attempting to emulate various aspects of the case insensitivity of SQL Server. We have created a non-deterministic collation to support case-insensitive matching of foreign keys and equality comparisons.
But we are seeing some weird behaviour when using ILIKE which we can't explain, and would appreciate some assistance.
To see the behaviour, run the following on a fresh database:
CREATE COLLATION IF NOT EXISTS public.ci (provider = icu, locale = 'und-u-ks-level2', deterministic = false);
DROP TABLE IF EXISTS sort_test;
CREATE TABLE sort_test (a text COLLATE public.ci);
INSERT INTO sort_test SELECT md5(n::text) FROM generate_series(1, 10000) n;
-- Removing the following line fixes the issue
ANALYZE sort_test;
-- This line throws "nondeterministic collations are not supported for ILIKE"
SELECT * FROM sort_test WHERE a ILIKE 'c4ca4238a0%' COLLATE "und-x-icu";
Why does running the ANALYZE statement break the ILIKE statement?

That behavior is a PostgreSQL bug.
The reason why it works without the ANALYZE is that the error is thrown when applying the operator to the “histogram bounds” in the statistics. Before ANALYZE there are no statistics, so no error is thrown.

How to add a date column which is 7 days later than an existing column in a Postgres table? [duplicate]

Does PostgreSQL support computed / calculated columns, like MS SQL Server? I can't find anything in the docs, but as this feature is included in many other DBMSs I thought I might be missing something.
Eg: http://msdn.microsoft.com/en-us/library/ms191250.aspx

Postgres 12 or newer
STORED generated columns are introduced with Postgres 12 - as defined in the SQL standard and implemented by some RDBMS including DB2, MySQL, and Oracle. Or the similar "computed columns" of SQL Server.
Trivial example:
CREATE TABLE tbl (
int1 int
, int2 int
, product bigint GENERATED ALWAYS AS (int1 * int2) STORED
);
fiddle
VIRTUAL generated columns may come with one of the next iterations. (Not in Postgres 15, yet).
Related:
Attribute notation for function call gives error
Postgres 11 or older
Up to Postgres 11 "generated columns" are not supported.
You can emulate VIRTUAL generated columns with a function using attribute notation (tbl.col) that looks and works much like a virtual generated column. That's a bit of a syntax oddity which exists in Postgres for historic reasons and happens to fit the case. This related answer has code examples:
Store common query as column?
The expression (looking like a column) is not included in a SELECT * FROM tbl, though. You always have to list it explicitly.
Can also be supported with a matching expression index - provided the function is IMMUTABLE. Like:
CREATE FUNCTION col(tbl) ... AS ... -- your computed expression here
CREATE INDEX ON tbl(col(tbl));
Alternatives
Alternatively, you can implement similar functionality with a VIEW, optionally coupled with expression indexes. Then SELECT * can include the generated column.
"Persisted" (STORED) computed columns can be implemented with triggers in a functionally equivalent way.
Materialized views are a related concept, implemented since Postgres 9.3.
In earlier versions one can manage MVs manually.

YES you can!! The solution should be easy, safe, and performant...
I'm new to postgresql, but it seems you can create computed columns by using an expression index, paired with a view (the view is optional, but makes makes life a bit easier).
Suppose my computation is md5(some_string_field), then I create the index as:
CREATE INDEX some_string_field_md5_index ON some_table(MD5(some_string_field));
Now, any queries that act on MD5(some_string_field) will use the index rather than computing it from scratch. For example:
SELECT MAX(some_field) FROM some_table GROUP BY MD5(some_string_field);
You can check this with explain.
However at this point you are relying on users of the table knowing exactly how to construct the column. To make life easier, you can create a VIEW onto an augmented version of the original table, adding in the computed value as a new column:
CREATE VIEW some_table_augmented AS
SELECT *, MD5(some_string_field) as some_string_field_md5 from some_table;
Now any queries using some_table_augmented will be able to use some_string_field_md5 without worrying about how it works..they just get good performance. The view doesn't copy any data from the original table, so it is good memory-wise as well as performance-wise. Note however that you can't update/insert into a view, only into the source table, but if you really want, I believe you can redirect inserts and updates to the source table using rules (I could be wrong on that last point as I've never tried it myself).
Edit: it seems if the query involves competing indices, the planner engine may sometimes not use the expression-index at all. The choice seems to be data dependant.

One way to do this is with a trigger!
CREATE TABLE computed(
one SERIAL,
two INT NOT NULL
);
CREATE OR REPLACE FUNCTION computed_two_trg()
RETURNS trigger
LANGUAGE plpgsql
SECURITY DEFINER
AS $BODY$
BEGIN
NEW.two = NEW.one * 2;
RETURN NEW;
END
$BODY$;
CREATE TRIGGER computed_500
BEFORE INSERT OR UPDATE
ON computed
FOR EACH ROW
EXECUTE PROCEDURE computed_two_trg();
The trigger is fired before the row is updated or inserted. It changes the field that we want to compute of NEW record and then it returns that record.

PostgreSQL 12 supports generated columns:
PostgreSQL 12 Beta 1 Released!
Generated Columns
PostgreSQL 12 allows the creation of generated columns that compute their values with an expression using the contents of other columns. This feature provides stored generated columns, which are computed on inserts and updates and are saved on disk. Virtual generated columns, which are computed only when a column is read as part of a query, are not implemented yet.
Generated Columns
A generated column is a special column that is always computed from other columns. Thus, it is for columns what a view is for tables.
CREATE TABLE people (
...,
height_cm numeric,
height_in numeric GENERATED ALWAYS AS (height_cm * 2.54) STORED
);
db<>fiddle demo

Well, not sure if this is what You mean but Posgres normally support "dummy" ETL syntax.
I created one empty column in table and then needed to fill it by calculated records depending on values in row.
UPDATE table01
SET column03 = column01*column02; /*e.g. for multiplication of 2 values*/
It is so dummy I suspect it is not what You are looking for.
Obviously it is not dynamic, you run it once. But no obstacle to get it into trigger.

Example on creating an empty virtual column
,(SELECT *
From (values (''))
A("virtual_col"))
Example on creating two virtual columns with values
SELECT *
From (values (45,'Completed')
, (1,'In Progress')
, (1,'Waiting')
, (1,'Loading')
) A("Count","Status")
order by "Count" desc

I have a code that works and use the term calculated, I'm not on postgresSQL pure tho we run on PADB
here is how it's used
create table some_table as
select category,
txn_type,
indiv_id,
accum_trip_flag,
max(first_true_origin) as true_origin,
max(first_true_dest ) as true_destination,
max(id) as id,
count(id) as tkts_cnt,
(case when calculated tkts_cnt=1 then 1 else 0 end) as one_way
from some_rando_table
group by 1,2,3,4 ;

A lightweight solution with Check constraint:
CREATE TABLE example (
discriminator INTEGER DEFAULT 0 NOT NULL CHECK (discriminator = 0)
);

PostgreSQL - Query on hstore - column does not exists

I wonder if someone could have an idea what is going wrong with this simple query on a hstore column in PostgreSQL 9.2
The queries are runned in pgAdmin
select attributeValue->"CODE_MUN" from shapefile_feature
returns: « attributevalue » column does not exists
when doing:
select * from shapefile_feature;
all the columns are returned including attributeValue, the hstore column
what is the problem?

PostgreSQL distinguish between "identifiers" and 'literal'. Identifiers are schema, table, column's, .. names, literals are others. A attribute in hstore are not SQL identifiers. So you have to pass their names as literals. Operator "->" is only shortcut for function "fetchval(hstore, text)" with possibility be indexed.
select attributeValue->'CODE_MUN' from shapefile_feature
is internally transformed to (don't do this transformation by self!)
select fetchval(attributeValue, 'CODE_MUN') from shapefile_feature
on buggy example in transformed form, you can better understand to error message:
select fetchval(attributeValue, "CODE_MUN") from shapefile_feature
PostgreSQL tries to find column "CODE_MUN" in shapefile_feature, bacause used double quotes means identifiers (in case sensitive notation).

Postgresql order by - danish characters is expanded

I'm trying to make a "order by" statement in a sql query work. But for some reason danish special characters is expanded in stead of their evaluating their value.
SELECT roadname FROM mytable ORDER BY roadname
The result:
Abildlunden
Æblerosestien
Agern Alle 1
The result in the middle should be the last.
The locale is set to danish, so it should know the value of the danish special characters.

What is the collation of your database? (You might also want to give the PostgreSQL version you are using) Use "\l" from psql to see.
Compare and contrast:
steve#steve#[local] =# select * from (values('Abildlunden'),('Æblerosestien'),('Agern Alle 1')) x(word)
order by word collate "en_GB";
word
---------------
Abildlunden
Æblerosestien
Agern Alle 1
(3 rows)
steve#steve#[local] =# select * from (values('Abildlunden'),('Æblerosestien'),('Agern Alle 1')) x(word)
order by word collate "da_DK";
word
---------------
Abildlunden
Agern Alle 1
Æblerosestien
(3 rows)
The database collation is set when you create the database cluster, from the locale you have set at the time. If you installed PostgreSQL through a package manager (e.g. apt-get) then it is likely taken from the system-default locale.
You can override the collation used in a particular column, or even in a particular expression (as done in the examples above). However if you're not specifying anything (likely) then the database default will be used (which itself is inherited from the template database when the database is created, and the template database collation is fixed when the cluster is created)
If you want to use da_DK as your default collation throughout, and it's not currently your database default, your simplest option might be to dump the database, then drop and re-create the cluster, specifying the collation to initdb (or pg_createcluster or whatever tool you use to create the server)
BTW the question isn't well-phrased. PostgreSQL is very much not ignoring the "special" characters, it is correctly expanding "Æ" into "AE"- which is a correct rule for English. Collating "Æ" at the end is actually more like the unlocalised behaviour.
Collation documentation: http://www.postgresql.org/docs/current/static/collation.html

We Keep Coding

iphone swift flutter scala powershell matlab mongodb postgresql perl eclipse

VARCHAR comparison on an indexed column - postgresql

Related

Query failed: collation "numerickn" for encoding "UTF8" does not exist

In Postgres, running ANALYZE changes behaviour of ILIKE clause with COLLATE

How to add a date column which is 7 days later than an existing column in a Postgres table? [duplicate]

PostgreSQL - Query on hstore - column does not exists

Postgresql order by - danish characters is expanded

Categories

Resources