~~ Operator In Postgres - postgresql

I have a query in Postgres:
SELECT DISTINCT a.profn FROM tprof a, sap_tstc b, tgrc c
WHERE ((c.grcid ~~ a.grcid)
AND ((c.tcode) = (b.tcode)));
What is ~~ mean?

From 9.7.1. LIKE of PostgreSQL documentation:
The operator ~~ is equivalent to LIKE, and ~~* corresponds to ILIKE. There are also !~~ and !~~* operators that represent NOT LIKE and NOT ILIKE, respectively. All of these operators are PostgreSQL-specific.

It isn't listed in the index of the documentation which is frustrating.
So I had a look with psql:
regress=> \do ~~
List of operators
Schema | Name | Left arg type | Right arg type | Result type | Description
------------+------+---------------+----------------+-------------+-------------------------
pg_catalog | ~~ | bytea | bytea | boolean | matches LIKE expression
pg_catalog | ~~ | character | text | boolean | matches LIKE expression
pg_catalog | ~~ | name | text | boolean | matches LIKE expression
pg_catalog | ~~ | text | text | boolean | matches LIKE expression
(4 rows)
It's an operator alias for LIKE, that's all.

Related

What Postgres 13 index types support distance searches?

Original Question
We've had great results using a K-NN search with a GiST index with gist_trgm_ops. Pure magic. I've got other situations, with other datatypes like timestamp where distance functions would be quite useful. If I didn't dream it, this is, or was, available through pg_catalog. Looking around, I can't find a way to search on indexes by such properties. I think what I'm after, in this case, is AMPROP_DISTANCE_ORDERABLE under-the-hood.
Just checked, and pg_am did have a lot more attributes than it does now, prior to 9.6.
Is there another way to figure out what options various indexes have with a catalog search?
Catalogs
jjanes' answer inspired me to look at the system information functions some more, and to spend a day in the pg_catalog tables. The catalogs for indexes and operators are complicated. The system information functions are a big help. This piece proved super useful for getting a handle on things:
https://postgrespro.com/blog/pgsql/4161264
I think the conclusion is "no, you can't readily figure out what data types and indexes support proximity searches." The relevant attribute is a property of a column in a specific index. However, it looks like nearest-neighbor searching requires a GiST index, and that there are readily-available index operator classes to add K-NN searching to a huge range of common types. Happy for corrections on these conclusions, or the details below.
Built-in Distance Support
https://www.postgresql.org/docs/current/gist-builtin-opclasses.html
From various bits of the docs, it sounds like there are distance (proximity, nearest neighbor, K-NN) operators for GiST indexes on a handful of built-in geometric types.
box
circle
point
poly
B-tree Operator Classes
Not listed as such in the docs, but visible with this query:
select am.amname AS index_method
, opc.opcname AS opclass_name
, opc.opcintype::regtype AS indexed_type
, opc.opcdefault AS is_default
from pg_am am
, pg_opclass opc
where opc.opcmethod = am.oid
and am.amname = 'btree'
order by 1,2;
B-tree GiST Distance Support
https://www.postgresql.org/docs/current/btree-gist.html
I guess a B-tree is a special case of a GiST, and there's a B-tree operator class to match. The docs say these native types are supported:
int2
int4
int8
float4
float8
timestamp with time zone
timestamp without time zone
time without time zone
date
interval
oid
money
BRIN Built-in Operator Classes
https://www.postgresql.org/docs/current/brin-builtin-opclasses.html
There are over 70 listed in the internals docs.
GIN Built-in Operator Classes
https://www.postgresql.org/docs/12/gin-builtin-opclasses.html
array_ops
jsonb_ops
jsonb_path_ops
tsvector_ops
Alternative Text Opts
https://www.postgresql.org/docs/current/indexes-opclass.html
There are special operator classes for text comparisons made character-by-character, rather than through a collation. Or so the docs say:
text_pattern_ops
varchar_pattern_ops
bpchar_pattern_ops
pg_trgm
Beyond this, the included pg_trgm module includes operators for GIN and GiST, with the GiST version optimizing <->. I think this shows up as:
text
Note: Postgres 14 modifies pg_trgm to allow you to adjust the "signature length" for the index entry. Longer is possibly more accurate, shorter signatures are smaller on disk. If you've been using pg_trgm, it might be worth experimenting with the signature length in PG 14.
https://www.postgresql.org/docs/current/pgtrgm.html
SP-GiST Built-in Operator Classes
box_ops
kd_point_ops
network_ops
poly_ops
quad_point_ops
range_ops
text_ops
pg_operator search
Here's a search on pg_operator to look for matches starting from the <-> operator itself:
select oprnamespace::regnamespace::text as schema_name,
oprowner::regrole as owner,
oprname as operator,
oprleft::regtype as left,
oprright::regtype as right,
oprresult::regtype as result,
oprcom::regoperator as commutator
from pg_operator
where oprname = '<->'
order by 1
Output from one of our severs:
| schema_name | owner | operator | left | right | result | commutator |
+-------------+----------+----------+-----------------------------+-----------------------------+------------------+--------------------------------------------------------------+
| extensions | postgres | <-> | text | text | real | <->(text,text) |
| extensions | postgres | <-> | money | money | money | <->(money,money) |
| extensions | postgres | <-> | date | date | integer | <->(date,date) |
| extensions | postgres | <-> | real | real | real | <->(real,real) |
| extensions | postgres | <-> | double precision | double precision | double precision | <->(double precision,double precision) |
| extensions | postgres | <-> | smallint | smallint | smallint | <->(smallint,smallint) |
| extensions | postgres | <-> | integer | integer | integer | <->(integer,integer) |
| extensions | postgres | <-> | bigint | bigint | bigint | <->(bigint,bigint) |
| extensions | postgres | <-> | interval | interval | interval | <->(interval,interval) |
| extensions | postgres | <-> | oid | oid | oid | <->(oid,oid) |
| extensions | postgres | <-> | time without time zone | time without time zone | interval | <->(time without time zone,time without time zone) |
| extensions | postgres | <-> | timestamp without time zone | timestamp without time zone | interval | <->(timestamp without time zone,timestamp without time zone) |
| extensions | postgres | <-> | timestamp with time zone | timestamp with time zone | interval | <->(timestamp with time zone,timestamp with time zone) |
| pg_catalog | postgres | <-> | box | box | double precision | <->(box,box) |
| pg_catalog | postgres | <-> | path | path | double precision | <->(path,path) |
| pg_catalog | postgres | <-> | line | line | double precision | <->(line,line) |
| pg_catalog | postgres | <-> | lseg | lseg | double precision | <->(lseg,lseg) |
| pg_catalog | postgres | <-> | polygon | polygon | double precision | <->(polygon,polygon) |
| pg_catalog | postgres | <-> | circle | circle | double precision | <->(circle,circle) |
| pg_catalog | postgres | <-> | point | circle | double precision | <->(circle,point) |
| pg_catalog | postgres | <-> | circle | point | double precision | <->(point,circle) |
| pg_catalog | postgres | <-> | point | polygon | double precision | <->(polygon,point) |
| pg_catalog | postgres | <-> | polygon | point | double precision | <->(point,polygon) |
| pg_catalog | postgres | <-> | circle | polygon | double precision | <->(polygon,circle) |
| pg_catalog | postgres | <-> | polygon | circle | double precision | <->(circle,polygon) |
| pg_catalog | postgres | <-> | point | point | double precision | <->(point,point) |
| pg_catalog | postgres | <-> | box | line | double precision | <->(line,box) |
| pg_catalog | postgres | <-> | tsquery | tsquery | tsquery | 0 |
| pg_catalog | postgres | <-> | line | box | double precision | <->(box,line) |
| pg_catalog | postgres | <-> | point | line | double precision | <->(line,point) |
| pg_catalog | postgres | <-> | line | point | double precision | <->(point,line) |
| pg_catalog | postgres | <-> | point | lseg | double precision | <->(lseg,point) |
| pg_catalog | postgres | <-> | lseg | point | double precision | <->(point,lseg) |
| pg_catalog | postgres | <-> | point | box | double precision | <->(box,point) |
| pg_catalog | postgres | <-> | box | point | double precision | <->(point,box) |
| pg_catalog | postgres | <-> | lseg | line | double precision | <->(line,lseg) |
| pg_catalog | postgres | <-> | line | lseg | double precision | <->(lseg,line) |
| pg_catalog | postgres | <-> | lseg | box | double precision | <->(box,lseg) |
| pg_catalog | postgres | <-> | box | lseg | double precision | <->(lseg,box) |
| pg_catalog | postgres | <-> | point | path | double precision | <->(path,point) |
| pg_catalog | postgres | <-> | path | point | double precision | <->(point,path) |
+-------------+----------+----------+-----------------------------+-----------------------------+------------------+--------------------------------------------------------------+
Did I miss any index opts worth knowing about?
Checking Out Live Indexes
Here's a longer-than-it-should-be-because-I-still-find-the-catalogs-confusing query to pull out the columns from each user index, and figure out their more interesting properties. For a nice, short catalog search of much utility, see https://dba.stackexchange.com/questions/186944/how-to-list-all-the-indexes-along-with-their-type-btree-brin-hash-etc
with
basic_details as (
select relnamespace::regnamespace::text as schema_name,
indrelid::regclass::text as table_name,
indexrelid::regclass::text as index_name,
unnest(indkey) as column_ordinal_position , -- WITH ORDINALITY would be nice here, didn't get it working.
generate_subscripts(indkey, 1) + 1 as column_position_in_index --
from pg_index
join pg_class on pg_class.oid = pg_index.indrelid
),
enriched_details as (
select basic_details.schema_name,
basic_details.table_name,
basic_details.index_name,
basic_details.column_ordinal_position,
basic_details.column_position_in_index,
columns.column_name,
columns.udt_name as column_type_name
from basic_details
join information_schema.columns as columns
on columns.table_schema = basic_details.schema_name
and columns.table_name = basic_details.table_name
and columns.ordinal_position = basic_details.column_ordinal_position
where schema_name not like 'pg_%'
)
select *,
-- https://postgrespro.com/blog/pgsql/4161264
coalesce(pg_index_column_has_property(index_name,column_position_in_index,'distance_orderable'), false) as supports_knn_searches,
coalesce(pg_index_column_has_property(index_name,column_position_in_index,'search_array'), false) as supports_in_searches,
coalesce(pg_index_column_has_property(index_name,column_position_in_index,'returnable'), false) as supports_index_only_scans,
(select indexdef
from pg_indexes
where pg_indexes.schemaname = enriched_details.schema_name
and pg_indexes.indexname = enriched_details.index_name) as index_definition
from enriched_details
order by supports_in_searches desc,
schema_name,
table_name,
index_name
timestamp type supports KNN with GiST indexes using the <-> operator created by the btree_gist extension.
You can check if a specific column of a specific index supports it, like this:
select pg_index_column_has_property('pgbench_history_mtime_idx'::regclass,1,'distance_orderable');
As best as I can tell, here's the state of play as of PG 14:
GiST indexes may support nearest-neighbor (K-NN) proximity <--> search, and always have.
SP-GiST added such support as of PG 12.
RUM indexes (not in core) also support K-NN.
In all cases, support is done in the operator class:
https://www.postgresql.org/docs/current/indexes-opclass.html
That's what determines if distance_orderable works for a specific data type on a specific kind of index. Built-in, some of the geometric and text vector types work out-of-the box. Other than that small set, many more types are supported via specific operator classes, such as:
https://www.postgresql.org/docs/current/btree-gist.html
https://www.postgresql.org/docs/current/pgtrgm.html
In the case of SP-GiST, there are a lot fewer types supported than with GiST, once you've installed btree_gist:
https://www.postgresql.org/docs/14/spgist-builtin-opclasses.html
It looks like text_opts and range_opts do not support proximity searches. However, for tsrange, etc., there are likely enough options with other tools.

How to convert PSQLs ::json #> ::json to a jpa/jpql-predicate

Say i have a db-table looking like this:
CREATE TABLE myTable(
id BIGINT,
date TIMESTAMP,
user_ids JSONB
);
user_ids are a JSONB-ARRAY
Let a record of this table look like this:
{
"id":13,
"date":"2019-01-25 11:03:57",
"user_ids":[25, 661, 88]
};
I need to query all records where user_ids contain 25. In SQL i can achieve it with the following select-statement:
SELECT * FROM myTable where user_ids::jsonb #> '[25]'::jsonb;
Now i need to write a JPA-Predicate that renders "user_ids::jsonb #> '[25]'::jsonb" to a hibernate parseable/executable Criteria, which i then intent to use in a session.createQuery() statement.
In simpler terms i need to know how i can write that PSQL-snippet (user_ids::jsonb #> '[25]'::jsonb) as a HQL-expression.
Fortunately, every comparison operator in PostgreSQL is merely an alias to a function, and you can find the alias through the psql console by typing \doS+ and the operator (although some operators are considered wildcards in this search, so they give more results than desired).
Here is the result:
postgres=# \doS+ #>
List of operators
Schema | Name | Left arg type | Right arg type | Result type | Function | Description
------------+------+---------------+----------------+-------------+---------------------+-------------
pg_catalog | #> | aclitem[] | aclitem | boolean | aclcontains | contains
pg_catalog | #> | anyarray | anyarray | boolean | arraycontains | contains
pg_catalog | #> | anyrange | anyelement | boolean | range_contains_elem | contains
pg_catalog | #> | anyrange | anyrange | boolean | range_contains | contains
pg_catalog | #> | box | box | boolean | box_contain | contains
pg_catalog | #> | box | point | boolean | box_contain_pt | contains
pg_catalog | #> | circle | circle | boolean | circle_contain | contains
pg_catalog | #> | circle | point | boolean | circle_contain_pt | contains
pg_catalog | #> | jsonb | jsonb | boolean | jsonb_contains | contains
pg_catalog | #> | path | point | boolean | path_contain_pt | contains
pg_catalog | #> | polygon | point | boolean | poly_contain_pt | contains
pg_catalog | #> | polygon | polygon | boolean | poly_contain | contains
pg_catalog | #> | tsquery | tsquery | boolean | tsq_mcontains | contains
(13 rows)
What you want is jsonb arguments on both sides, and we see the function that has that is called jsonb_contains. So the equivalent to jsonbcolumn #> jsonbvalue is jsonb_contains(jsonbcolumn, jsonbvalue). Now you can't use the function in either JPQL or CriteriaBuilder, unless you register it through a custom Dialect if you're using Hibernate. If you're using EclipseLink, I don't know the situation there.
From here on, your options are to use native queries, or add your own Hibernate Dialect by extending an existing one.
Replacing "#>" with "jsonb_contains()" is not a good idea. The operator is indexed, not the function. Example: https://dbfiddle.uk/-xMuHYAA

Operator ~<~ in Postgres

(Originally part of this question, but it was bit irrelevant, so I decided to make it a question of its own.)
I cannot find what the operator ~<~ is. The Postgres manual only mentions ~ and similar operators here, but no sign of ~<~.
When fiddling in the psql console, I found out that these commands give the same results:
SELECT * FROM test ORDER BY name USING ~<~;
SELECT * FROM test ORDER BY name COLLATE "C";
And these gives the reverse ordering:
SELECT * FROM test ORDER BY name USING ~>~;
SELECT * FROM test ORDER BY name COLLATE "C" DESC;
Also some info on the tilde operators:
\do ~*~
List of operators
Schema | Name | Left arg type | Right arg type | Result type | Description
------------+------+---------------+----------------+-------------+-------------------------
pg_catalog | ~<=~ | character | character | boolean | less than or equal
pg_catalog | ~<=~ | text | text | boolean | less than or equal
pg_catalog | ~<~ | character | character | boolean | less than
pg_catalog | ~<~ | text | text | boolean | less than
pg_catalog | ~>=~ | character | character | boolean | greater than or equal
pg_catalog | ~>=~ | text | text | boolean | greater than or equal
pg_catalog | ~>~ | character | character | boolean | greater than
pg_catalog | ~>~ | text | text | boolean | greater than
pg_catalog | ~~ | bytea | bytea | boolean | matches LIKE expression
pg_catalog | ~~ | character | text | boolean | matches LIKE expression
pg_catalog | ~~ | name | text | boolean | matches LIKE expression
pg_catalog | ~~ | text | text | boolean | matches LIKE expression
(12 rows)
The 3rd and 4th rows is the operator I'm looking for, but the description is a bit insufficient for me.
~>=~, ~<=~, ~>~ and ~<~ are text pattern (or varchar, basically the same) operators, the counterparts of their respective siblings >=, <=, >and <. They sort character data strictly by their byte values, ignoring rules of any collation setting (as opposed to their siblings). This makes them faster, but also invalid for most languages / countries.
The "C" locale is effectively the same as no locale, meaning no collation rules. That explains why ORDER BY name USING ~<~ and ORDER BY name COLLATE "C" end up doing the same. The latter syntax variant should be preferred: more standard, less error-prone.
Detailed explanation in the last chapter of this related answer on dba.SE:
Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL
Note that ~~ / ~~* are Postgres operators for LIKE / ILIKE and barely related to the above. Similarly, !~~ / !~~* for NOT LIKE / NOT IILKE. (Use standard LIKE notation instead of these "internal" operators.)
Related:
~~ Operator In Postgres
Symfony2 Doctrine - ILIKE clause for PostgreSQL?
Find rows where text array contains value similar to input

PG::UndefinedFunction: ERROR: operator does not exist: geometry && box

Why does PostgreSQL complain that the && operator does not exist? (I have PostGIS installed - see below).
mydb=# SELECT "monuments".* FROM "monuments" WHERE
mydb=# (coord && '-10,-10,10,10'::box)
mydb=# ORDER BY created_at DESC ;
ERROR: operator does not exist: geometry && box
LINE 1: ...LECT "monuments".* FROM "monuments" WHERE (coord && '-10...
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
I have PostGIS installed:
mydb=# select postgis_full_version();
NOTICE: Function postgis_topology_scripts_installed() not found. Is topology support enabled and topology.sql installed?
postgis_full_version
----------------------------------------------------------------------------------------------------------------------------------------------------------------
POSTGIS="2.1.0 r11822" GEOS="3.3.8-CAPI-1.7.8" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.10.0, released 2013/04/24" LIBXML="2.9.1" LIBJSON="UNKNOWN" RASTER
And by the way, my table looks like this:
mydb=# \d monuments
id | integer | not null default nextval('monuments_id_seq'::regclass)
coord | geometry(Point,3785) |
Let me know if you need any more info.
box is a built-in PostgreSQL primitive geometric type, like point.
postgres=> \dT box
List of data types
Schema | Name | Description
------------+------+------------------------------------------
pg_catalog | box | geometric box '(lower left,upper right)'
(1 row)
PostGIS uses its own geometry type, and doesn't generally inter-operate well with the PostgreSQL built-in basic geometric types. These are the supported data type combinations for && with PostGIS 2 on my PostgreSQL 9.3 install:
postgres=# \do &&
List of operators
Schema | Name | Left arg type | Right arg type | Result type | Description
------------+------+---------------+----------------+-------------+-----------------
pg_catalog | && | anyarray | anyarray | boolean | overlaps
pg_catalog | && | anyrange | anyrange | boolean | overlaps
pg_catalog | && | box | box | boolean | overlaps
pg_catalog | && | circle | circle | boolean | overlaps
pg_catalog | && | polygon | polygon | boolean | overlaps
pg_catalog | && | tinterval | tinterval | boolean | overlaps
pg_catalog | && | tsquery | tsquery | tsquery | AND-concatenate
public | && | geography | geography | boolean |
public | && | geometry | geometry | boolean |
public | && | geometry | raster | boolean |
public | && | raster | geometry | boolean |
public | && | raster | raster | boolean |
(12 rows)
You'll see that box is supported for box && box but not box && geometry. Since your coord column is a geometry type, you'll need to convert the box to geometry, so as to end up with geometry && geometry.
Example:
WHERE (coord && geometry(polygon('((-10, -10), (10, 10))'::box)))
The simplest explanation would be that you installed the extension into some schema that is not in your current search_path.
Did you know, that you can even "schema-qualify" operators? Like:
SELECT 3 OPERATOR(pg_catalog.+) 4;
Or:
SELECT * FROM public.monuments
WHERE coord OPERATOR(my_postgis_schema.&&) '-10,-10,10,10'::box);
This way you could make your query independent of the current search_path. Better though, to fix it.

Bit masking in Postgres

I have this query
SELECT * FROM "functions" WHERE (models_mask & 1 > 0)
and the I get the following error:
PGError: ERROR: operator does not exist: character varying & integer
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
The models_mask is an integer in the database. How can I fix this.
Thank you!
Check out the docs on bit operators for Pg.
Essentially & only works on two like types (usually bit or int), so model_mask will have to be CASTed from varchar to something reasonable like bit or int:
models_mask::int & 1 -or- models_mask::int::bit & b'1'
You can find out what types an operator works with using \doS in psql
pg_catalog | & | bigint | bigint | bigint | bitwise and
pg_catalog | & | bit | bit | bit | bitwise and
pg_catalog | & | inet | inet | inet | bitwise and
pg_catalog | & | integer | integer | integer | bitwise and
pg_catalog | & | smallint | smallint | smallint | bitwise and
Here is a quick example for more information
# SELECT 11 & 15 AS int, b'1011' & b'1111' AS bin INTO foo;
SELECT
# \d foo
Table "public.foo"
Column | Type | Modifiers
--------+---------+-----------
int | integer |
bin | "bit" |
# SELECT * FROM foo;
int | bin
-----+------
11 | 1011