Unable to create bloom index - postgresql

I'm new to bloom indexes. I'm following https://habr.com/en/company/postgrespro/blog/452968/ to learn about them.
When I tried to create a bloom index on my own test table, I got the error below:
SQL Error [42704]: ERROR: data type bigint has no default operator class for access method "bloom"
Hint: You must specify an operator class for the index or define a default operator class for the data type.
That makes sense: my table has a bigint column, and I'm including that column in the index.
To avoid the error, I tried to create my own operator class for the bigint data type, like this:
CREATE OPERATOR CLASS bigint_ops
DEFAULT FOR TYPE int USING bloom AS
OPERATOR 1 =(bigint,bigint),
FUNCTION 1 hashbigint;
and I got the error below:
SQL Error [42883]: ERROR: could not find a function named "hashbigint"
Any help to avoid this error will be much appreciated.

The hashing function for bigint is hashint8, not hashbigint. I found this by running the query in the post you linked and filtering to where the type is 'bigint'.
testdb=# with t0 as (select distinct
opc.opcintype::regtype::text,
amop.amopopr::regoperator,
ampr.amproc
from pg_am am, pg_opclass opc, pg_amop amop, pg_amproc ampr
where am.amname = 'hash'
and opc.opcmethod = am.oid
and amop.amopfamily = opc.opcfamily
and amop.amoplefttype = opc.opcintype
and amop.amoprighttype = opc.opcintype
and ampr.amprocfamily = opc.opcfamily
and ampr.amproclefttype = opc.opcintype
order by opc.opcintype::regtype::text) select * from t0 where opcintype='bigint';
opcintype | amopopr | amproc
-----------+------------------+----------
bigint | =(bigint,bigint) | hashint8
(1 row)
There's also an error in your CREATE OPERATOR CLASS statement; it needs to be DEFAULT FOR TYPE bigint, not int.
testdb=# create extension bloom;
CREATE EXTENSION
testdb=# CREATE OPERATOR CLASS bigint_ops
DEFAULT FOR TYPE int USING bloom AS
OPERATOR 1 =(bigint,bigint),
FUNCTION 1 hashint8;
ERROR: could not make operator class "bigint_ops" be default for type pg_catalog.int4
DETAIL: Operator class "int4_ops" already is the default.
testdb=# CREATE OPERATOR CLASS bigint_ops
DEFAULT FOR TYPE bigint USING bloom AS
OPERATOR 1 =(bigint,bigint),
FUNCTION 1 hashint8;
CREATE OPERATOR CLASS
testdb=#
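Once the operator class exists, the original index should build without naming an operator class. Here is a minimal sketch, assuming the bloom extension and the bigint_ops class above are in place; the table bloom_demo and its columns are hypothetical, not taken from the question.

```sql
-- Hypothetical test table; only the bigint columns matter here.
CREATE TABLE bloom_demo (a bigint, b bigint, c int);

-- bigint_ops is now the default bloom operator class for bigint,
-- and the extension already ships int4_ops for int.
CREATE INDEX bloom_demo_idx ON bloom_demo USING bloom (a, b, c);

-- Bloom indexes only support equality conditions.
EXPLAIN SELECT * FROM bloom_demo WHERE a = 42 AND b = 7;
```

Whether the planner actually picks the bloom index depends on table size and statistics, so the EXPLAIN output will vary. Note also that CREATE OPERATOR CLASS currently requires superuser privileges.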

Related

Postgres: create index on attribute of attribute in JSONB column?

I'm working in Postgres 9.6.5. I have the following table:
id | integer
data | jsonb
The data in the data column is nested, in the form:
{ 'identification': { 'registration_number': 'foo' }}
I'd like to index registration_number, so I can query on it. I've tried this (based on this answer):
CREATE INDEX ON mytable((data->>'identification'->>'registration_number'));
But got this:
ERROR: operator does not exist: text ->> unknown
LINE 1: CREATE INDEX ON psc((data->>'identification'->>'registration... ^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
What am I doing wrong?
You want:
CREATE INDEX ON mytable((data -> 'identification' ->> 'registration_number'));
The -> operator returns the value under the key as jsonb, and the ->> operator returns the value under the key as text. The most notable difference between the two operators is that ->> will "unwrap" string values (i.e. remove the double quotes from the text representation).
The error you're seeing is reported because data ->> 'identification' returns text, and the subsequent ->> is not defined for the text type.
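The difference between the two operators can be seen on a literal, without any table. This is a small sketch using the document shape from the question:

```sql
-- -> keeps the value as jsonb, so a string keeps its double quotes.
SELECT '{"identification": {"registration_number": "foo"}}'::jsonb
       -> 'identification' -> 'registration_number';    -- "foo" (jsonb)

-- ->> as the last step returns plain text.
SELECT '{"identification": {"registration_number": "foo"}}'::jsonb
       -> 'identification' ->> 'registration_number';   -- foo (text)
```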
Since version 9.3, Postgres has had the #> and #>> operators. These operators let you specify a path (as an array of text) inside a jsonb column to get the value.
You could use #>> to achieve your goal in a simpler way.
CREATE INDEX ON mytable((data #>> '{identification, registration_number}'));
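An expression index is only considered when the query uses the same expression the index was built on. The sketch below shows one query per index variant from the answers above; the literal 'foo' is just the sample value from the question.

```sql
-- Can use the index on (data -> 'identification' ->> 'registration_number').
SELECT id
FROM mytable
WHERE data -> 'identification' ->> 'registration_number' = 'foo';

-- Can use the index on (data #>> '{identification, registration_number}').
SELECT id
FROM mytable
WHERE data #>> '{identification, registration_number}' = 'foo';
```

Running EXPLAIN on either query shows whether the planner actually chose the index; on a small table it may still prefer a sequential scan.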

How to check if an unbounded range is NULL on the right in PostgreSQL 9.3 - ensuring GIST indexes are used

I'm using range datatypes in PG 9.3 (with btree_gist enabled, though I don't think it matters). I have GIST indexes that include these range columns. Some are int8range and some are tsrange.
I want to query with a WHERE expression essentially saying "range is NULL (unbounded) on the right side". How do I write that?
For tsrange, I can do "tsrange @> timestamp 'infinity'". But there's no equivalent for int8range. And I assume the way to do this properly for int8range should be the way for tsrange as well (not relying on timestamp-specific treatment of 'infinity').
The expression should be usable for GIST indexes (i.e. falls into the default operator class for these range types).
Help?
From the fine manual: http://www.postgresql.org/docs/9.4/static/functions-range.html
The upper_inf function will tell you that.
# select upper_inf(int8range(1, null));
upper_inf
-----------
t
(1 row)
# select upper_inf(int8range(1, 2));
upper_inf
-----------
f
(1 row)
If you need to query on that, I don't think that indexes will help you.
http://www.postgresql.org/docs/9.4/static/rangetypes.html
A GiST or SP-GiST index can accelerate queries involving these range
operators: =, &&, <@, @>, <<, >>, -|-, &<, and &> (see Table 9-47
for more information).
You can create a partial index that will help with that query though. e.g.
# create table foo (id int primary key, bar int8range);
CREATE TABLE
# create index on foo(bar) where upper_inf(bar) = true;
CREATE INDEX
# \d foo
Table "freshop.foo"
Column | Type | Modifiers
--------+-----------+-----------
id | integer | not null
bar | int8range |
Indexes:
"foo_pkey" PRIMARY KEY, btree (id)
"foo_bar_idx" btree (bar) WHERE upper_inf(bar) = true
Then if you put upper_inf(bar) = true into a query, the optimizer should know it can use the foo_bar_idx index.
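As a rough sketch of how that partial index gets used, the query's WHERE clause repeats the index predicate so the planner can prove the index applies. This assumes the foo table and the foo_bar_idx index created above.

```sql
-- The predicate matches the partial index's WHERE clause.
EXPLAIN
SELECT id, bar
FROM foo
WHERE upper_inf(bar) = true;

-- Extra conditions can be combined with the predicate as usual.
SELECT id, bar
FROM foo
WHERE upper_inf(bar) = true
  AND bar = int8range(1, NULL);
```

On a tiny or empty table the planner may still choose a sequential scan, so check the plan against realistic data.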

PostgreSQL index array of int4range using GIN - custom operator class

Here is my table:
CREATE TABLE
mytable
(
id INT NOT NULL PRIMARY KEY,
val int4range[]
);
I want to index the val column:
CREATE INDEX
ix_mytable_val
ON mytable
USING GIN (INT4RANGE(val, '[]')); -- error, as is GIN(val)
I came up with the following:
CREATE OPERATOR CLASS gin_int4range_ops
DEFAULT FOR TYPE int4range[] USING gin AS
OPERATOR 1 <(anyrange,anyrange),
OPERATOR 2 <=(anyrange,anyrange),
OPERATOR 3 =(anyrange,anyrange),
OPERATOR 4 >=(anyrange,anyrange),
OPERATOR 5 >(anyrange,anyrange),
FUNCTION 1 lower(anyrange),
FUNCTION 2 upper(anyrange),
FUNCTION 3 isempty(anyrange),
FUNCTION 4 lower_inc(anyrange),
FUNCTION 5 upper_inc(anyrange);
But when I try to create the index, it fails (error below). However, if I call the create from within a DO $$ block, it executes.
If the create index executed, I get the error on INSERT INTO instead.
"ERROR: cache lookup failed for type 1"
I also tried this:
OPERATOR 1 &&(anyrange,anyrange),
OPERATOR 2 <@(anyrange,anyrange),
OPERATOR 3 @>(anyrange,anyrange),
OPERATOR 4 =(anyrange,anyrange),
In order to try and solve this, I have rebooted PG, the machine, and vacuumed the DB. I believe there is an error in the CREATE OPERATOR code.
If I can index an array of custom type of (int, int4range), that would be even better.
I've spent quite some time (a full day) wading through documentation, forums, etc., but can find nothing that really helps me to understand how to solve this (i.e. create a working custom operator class).
You need to CREATE OPERATOR CLASS based on Range Functions and Operators, for example:
CREATE OPERATOR CLASS gin_int4range_ops
DEFAULT FOR TYPE int4range[] USING gin AS
OPERATOR 1 =(anyrange,anyrange),
FUNCTION 1 lower(anyrange),
FUNCTION 2 upper(anyrange),
FUNCTION 3 isempty(anyrange),
FUNCTION 4 lower_inc(anyrange),
FUNCTION 5 upper_inc(anyrange);
Now you can CREATE INDEX:
CREATE INDEX ix_mytable4_vhstore_low
ON mytable USING gin (val gin_int4range_ops);
Check also:
Operator Classes and Operator Families
CREATE OPERATOR CLASS
The following query shows all defined operator classes:
SELECT am.amname AS index_method,
opc.opcname AS opclass_name
FROM pg_am am, pg_opclass opc
WHERE opc.opcmethod = am.oid
ORDER BY index_method, opclass_name;
This query shows all defined operator families and all the operators included in each family:
SELECT am.amname AS index_method,
opf.opfname AS opfamily_name,
amop.amopopr::regoperator AS opfamily_operator
FROM pg_am am, pg_opfamily opf, pg_amop amop
WHERE opf.opfmethod = am.oid AND
amop.amopfamily = opf.oid
ORDER BY index_method, opfamily_name, opfamily_operator;
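To narrow the catalog queries above to a single access method, a filter on amname is enough. The following is a sketch that lists the GIN operator classes together with the input type each one indexes and whether it is the default for that type:

```sql
SELECT opc.opcname            AS opclass_name,
       opc.opcintype::regtype AS indexed_type,
       opc.opcdefault         AS is_default
FROM pg_am am
JOIN pg_opclass opc ON opc.opcmethod = am.oid
WHERE am.amname = 'gin'
ORDER BY opclass_name;
```

Changing 'gin' to another access method name (for example 'gist' or 'bloom') gives the same listing for that method.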

How to create an operator in PostgreSQL for the hstore type with an int4range value

I have a table with an HSTORE column 'ext', where the value is an int4range. An example:
"p1"=>"[10, 18]", "p2"=>"[24, 32]", "p3"=>"[29, 32]", "p4"=>"[18, 19]"
However, when I try to create an expression index on this, I get an error:
CREATE INDEX ix_test3_p1
ON test3
USING gist
(((ext -> 'p1'::text)::int4range));
ERROR: data type text has no default operator class for access method
"gist" SQL state: 42704 Hint: You must specify an operator class for
the index or define a default operator class for the data type.
How do I create the operator for this?
NOTE
Each record may have its own unique set of keys. Each key represents an attribute, and the values the value range. So not all records will have "p1". Consider this an EAV model in hstore.
I don't get that error - I get "functions in index expression must be marked IMMUTABLE"
CREATE TABLE ht (ext hstore);
INSERT INTO ht VALUES ('p1=>"[10,18]"'), ('p1=>"[99,99]"');
CREATE INDEX ht_test_idx ON ht USING GIST ( ((ext->'p1'::text)::int4range) );
ERROR: functions in index expression must be marked IMMUTABLE
CREATE FUNCTION foo(hstore) RETURNS int4range LANGUAGE SQL AS $$ SELECT ($1->'p1')::int4range; $$ IMMUTABLE;
CREATE INDEX ht_test_idx ON ht USING GIST ( foo(ext) );
SET enable_seq_scan=false;
EXPLAIN SELECT * FROM ht WHERE foo(ext) = '[10,19)';
QUERY PLAN
-----------------------------------------------------------------------
Index Scan using ht_test_idx on ht (cost=0.25..8.52 rows=1 width=32)
Index Cond: (foo(ext) = '[10,19)'::int4range)
I'm guessing the cast isn't immutable because you can change the default format of the range from inclusive...exclusive "[...)" to something else. You presumably won't be doing that though.
Obviously you'll want your real function to deal with things like missing "p1" entries, badly formed range values etc.
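One possible shape for such a function is sketched below. The name ext_range and the choice to return NULL for malformed values are assumptions for illustration, not part of the original answer. A missing key already yields NULL because ext -> 'p1' returns NULL; the exception handler covers badly formed range text.

```sql
-- Hypothetical helper: extract key k from an hstore as an int4range.
-- Declared IMMUTABLE so it can be indexed, with the same caveat as foo() above.
CREATE FUNCTION ext_range(h hstore, k text) RETURNS int4range
LANGUAGE plpgsql IMMUTABLE AS $$
BEGIN
    -- A missing key gives NULL, and NULL casts to a NULL range.
    RETURN (h -> k)::int4range;
EXCEPTION
    -- data_exception is the category for class 22 errors,
    -- which includes malformed range literals.
    WHEN data_exception THEN
        RETURN NULL;
END;
$$;

CREATE INDEX ht_ext_range_idx ON ht USING GIST (ext_range(ext, 'p1'));
```

Silently mapping bad values to NULL hides data problems, so raising an error or logging may be the better choice depending on the application.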

Postgres query error

I have a query in postgres
insert into c_d (select * from cd where ak = '22019763');
And I get the following error
ERROR: column "region" is of type integer but expression is of type character varying
HINT: You will need to rewrite or cast the expression.
An INSERT INTO table1 SELECT * FROM table2 depends entirely on the order of the columns, which is part of the table definition. It lines up each column of table1 with the column of table2 at the same ordinal position, regardless of names.
The problem here is that the column of cd at the same position as "region" in c_d has an incompatible type, and no implicit cast is available to reconcile the two.
INSERT INTO SELECT * statements are stylistically bad form unless the two tables are defined, and will forever be defined, exactly the same way. All it takes is for a single extra column to get added to cd, and you'll start getting errors about extra columns.
If it is at all possible, what I would suggest is explicitly calling out the columns within the SELECT statement. You can call a function to change type within each of the column references (or you could define a new type cast to do this implicitly -- see CREATE CAST), and you can use AS to set the column label to match that of your target column.
If you can't do this for some reason, indicate that in your question.
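The explicit-column approach could look like the sketch below. Only ak and region are known from the question; the column list is otherwise an assumption, and the cast assumes every region value in cd is a valid integer string.

```sql
-- List the target columns, and select the source columns in the same order.
INSERT INTO c_d (ak, region)
SELECT ak,
       region::integer AS region   -- explicit cast from character varying
FROM cd
WHERE ak = '22019763';
```

If any region value is not numeric, the cast raises an error and the whole INSERT fails, so it may be worth checking the data first.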
Check out the PostgreSQL insert documentation. The syntax is:
INSERT INTO table [ ( column [, ...] ) ]
{ DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) | query }
which here would look something like:
INSERT INTO c_d (column1, column2...) select * from cd where ak = '22019763'
This is the syntax you want to use when inserting values from one table to another where the column types and order are not exactly the same.