How to make fulltext search in PostgreSQL useful?

I have a Russian dictionary in PostgreSQL 8.4.
I am trying to use full-text search but am running into trouble.
I don't get any results because I am searching by the first 4-5 characters of a word. For example:
select * from parcels_temp where name_dispatcher @@ to_tsquery('Нику');
Result: 0 rows.
select * from parcels_temp where name_dispatcher @@ to_tsquery('Никуд');
Result: 2 rows, which is correct.
I also want to search for words that are not in the dictionary. What should I do in that case? How can I update the dictionary in PostgreSQL?
Do I have to create a tsvector column, or can I call the to_tsvector function in the queries? Or is that slower?

The PostgreSQL indexer does not process strings shorter than 3 characters. This is done to reduce index size and generation time.
So you get no results. There is a way to update the dictionary; refer to the documentation.
Use a GIN index on the tsvector field.

1: By default, PostgreSQL matches only whole words. You can opt in to prefix matching:
select * from parcels_temp where name_dispatcher @@ to_tsquery('Нику:*');
See: https://www.postgresql.org/docs/9.5/static/textsearch-controls.html
3: You can create a tsvector column and a GIN index on it, or you can create the index directly on the to_tsvector expression without adding a column. I think the two perform about the same. For details, see:
https://www.postgresql.org/docs/9.5/static/textsearch-tables.html

Related

How would I diagnose what error seems to lead to non-functional underscore wildcard queries in Postgresql 15?

I am working through a quick refresher ('SQL Handbook' by Flavio Copes), and any LIKE or ILIKE query I use with the underscore wildcard returns no results.
The table is created as such:
CREATE TABLE people (
names CHAR(20)
);
INSERT INTO people VALUES ('Joe'), ('John'), ('Johanna'), ('Zoe');
Given this table, I use the following query:
SELECT * FROM people WHERE names LIKE '_oe';
I expect it to return
names
-----
Joe
Zoe
Instead, it returns only the column header, with no rows:
names
The install is PostgreSQL 15 (x64), pgAdmin 4, and PostGIS v3.3.1

Using char(20) means all strings are exactly 20 chars long, being padded with spaces out to that length. The spaces make it not match the pattern, as there is nothing in the pattern to accommodate spaces at the end.
If you make the pattern be '_oe%' it would work. Or better yet, don't use char(20).
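The padding effect can be reproduced outside the database. The following is a minimal Python sketch, not PostgreSQL's implementation: `bpchar` and `like` are hypothetical helpers that emulate char(20) storage and basic LIKE matching (no ESCAPE handling).

```python
import re

def bpchar(value, n=20):
    """Pad value with trailing spaces to n characters, as char(n) stores it."""
    return value.ljust(n)

def like(value, pattern):
    """Emulate basic SQL LIKE: '_' is one character, '%' is any run of characters."""
    regex = "".join(
        "." if ch == "_" else ".*" if ch == "%" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None

names = [bpchar(n) for n in ("Joe", "John", "Johanna", "Zoe")]

# '_oe' must match the whole 20-character value, so nothing matches.
exact = [n.rstrip() for n in names if like(n, "_oe")]
# '_oe%' lets the trailing '%' absorb the padding.
with_percent = [n.rstrip() for n in names if like(n, "_oe%")]
# Without padding (as with varchar or text), '_oe' matches as expected.
unpadded = [n.rstrip() for n in names if like(n.rstrip(), "_oe")]

print(exact, with_percent, unpadded)
```

The first list is empty, while the other two contain Joe and Zoe, which mirrors the two fixes suggested above.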

Timescaledb - How to display chunks of a hypertable in a specific schema

I have a table named conditions on a schema named test. I created a hypertable and inserted hundreds of rows.
When I run select show_chunks(), it works and displays chunks but I cannot use the table name as parameter as suggested in the manual. This does not work:
SELECT show_chunks("test"."conditions");
How can I fix this?
PS: I also want to query a chunk itself by its name. How can I do this?

show_chunks expects a regclass argument, which, depending on your current search path, means you need to schema-qualify the table name.
The following should work:
SELECT public.show_chunks('test.conditions');
The double quotes are only necessary if your table name is a delimited identifier. For example, if the table name contains a space, you need double quotes around the identifier. You still need to wrap the whole name in single quotes, though:
SELECT public.show_chunks('test."equipment conditions"');
SELECT public.show_chunks('"test schema"."equipment conditions"');
For more information about identifier quoting:
https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS
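The quoting rule can be sketched as a small Python helper. This is an illustration only: `quote_identifier` and `regclass_literal` are hypothetical names, and the sketch assumes an identifier needs double quotes whenever it is not all-lowercase letters, digits, underscores, or dollar signs starting with a letter or underscore. It does not check for reserved keywords.

```python
import re

# Identifiers matching this pattern need no quoting. Lowercase only,
# because PostgreSQL folds unquoted identifiers to lowercase.
_PLAIN = re.compile(r"[a-z_][a-z0-9_$]*")

def quote_identifier(name):
    """Return name unchanged if it is a plain identifier, else double-quote it."""
    if _PLAIN.fullmatch(name):
        return name
    # Embedded double quotes are escaped by doubling them.
    return '"' + name.replace('"', '""') + '"'

def regclass_literal(schema, table):
    """Build the single-quoted string passed to a regclass parameter."""
    qualified = quote_identifier(schema) + "." + quote_identifier(table)
    # Embedded single quotes are escaped by doubling them.
    return "'" + qualified.replace("'", "''") + "'"

print(regclass_literal("test", "conditions"))
print(regclass_literal("test", "equipment conditions"))
print(regclass_literal("test schema", "equipment conditions"))
```

The three printed values correspond to the arguments used in the show_chunks calls above.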
Edit: Addressing the PS:
I want to query the chunk itself by its name? How can I do this?
feike=# SELECT public.show_chunks('test.conditions');
show_chunks
--------------------------------------------
_timescaledb_internal._hyper_28_1176_chunk
_timescaledb_internal._hyper_28_1177_chunk
[...]
SELECT * FROM _timescaledb_internal._hyper_28_1176_chunk;

Indexing using pg_trgm module not working while using more than one OR operator in the where condition inside the query in postgresql 9.4

Indexing using pg_trgm module not working for me while using more than one OR operator in the where condition. Query is given below.
EXPLAIN ANALYZE
SELECT *
FROM registration
where
data->>'firstName' LIKE '%lali%' OR
data->>'spouseName' LIKE '%lali%' OR
data->>'vhnName' LIKE '%lali%' OR
data ->> 'subCentreName' LIKE '%lali%'
When I run the above, I get a sequential scan, but I expect an index scan.
I am using jsonb.
I tried both GiST and GIN indexes on all the columns mentioned in the where clause of the query above, but neither index was used.
How should I index for the case where the where clause contains multiple OR operators?

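A pattern starting with % is useless for an ordinary (B-tree) index search, since the matching text can start with anything. A pg_trgm GIN or GiST index can support such patterns, but only if it is built on the same expression the query uses, for example (data->>'firstName').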
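For B-tree indexes that is the whole story. A trigram index of the kind pg_trgm builds works differently: it looks up the three-character fragments of the pattern's literal text, then rechecks the candidate rows against the real pattern. The Python sketch below illustrates that idea under simplifying assumptions. The function names and sample rows are hypothetical, only `%literal%` patterns are rechecked, and the pattern's trigrams are taken without any padding.

```python
import re

def word_trigrams(text):
    """Trigrams of stored text: each lowercased word is padded with
    two leading spaces and one trailing space before slicing."""
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams

def pattern_trigrams(pattern):
    """Trigrams that any match must contain: slices of the literal
    runs between wildcards (simplified, no padding)."""
    grams = set()
    for literal in re.findall(r"[^%_]+", pattern):
        for word in re.findall(r"[a-z0-9]+", literal.lower()):
            grams.update(word[i:i + 3] for i in range(len(word) - 2))
    return grams

def candidates(rows, pattern):
    """Index step: keep rows whose trigram set contains every pattern trigram."""
    needed = pattern_trigrams(pattern)
    return [r for r in rows if needed <= word_trigrams(r)]

def recheck(rows, pattern):
    """Recheck step for '%literal%' patterns: a plain substring test."""
    literal = pattern.strip("%")
    return [r for r in rows if literal in r]

rows = ["lalitha", "kamala", "shalini", "alibi lalo"]
found = candidates(rows, "%lali%")
matched = recheck(found, "%lali%")
print(found, matched)
```

Here "alibi lalo" survives the index step because it contains both "lal" and "ali", and the recheck removes it. This is why the index lookup alone is not the final answer.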

Tsquery return exact matched keyword

I have a query like
select * from mytable where posttext @@ to_tsquery('Intelence');
I want to return only results that exactly match the keyword 'Intelence', not 'intel'. How can I do this in PostgreSQL?
Thanks.

This is not possible with full-text search unless you want to tell PostgreSQL not to stem Intelence at all by changing the text search dictionary. Pg doesn't include the word in the index, only the stems:
regress=> SELECT to_tsvector('english', 'Intelence');
to_tsvector
-------------
'intel':1
(1 row)
You can suppress stemming entirely with the simple dictionary:
regress=> SELECT to_tsvector('simple','Intelence');
to_tsvector
---------------
'intelence':1
(1 row)
but that must be done on the index; you can't do it per query, let alone per search term. So the text 'cats are bothering me' would not match a search for 'cat' in the simple dictionary because of the plural, nor a search for 'bother', because the unstemmed words are not the same.
If you want to make individual exceptions you can edit the english dictionary used by tsearch2 and define a custom dictionary with the desired changes, then use that dictionary instead of english in queries where you want the exceptions. Again, you must use the same dictionary for the index creation and the queries, though.
You might end up needing multiple full-text indexes, which is very undesirable: it slows down updates, inserts, and deletes, and it uses memory inefficiently.

PostgreSql XML Text search

I have a text column in a table. We store XML in this column. Now I want to search for tags and values
Example data:
<bank>
<name>Citi Bank</name>
.....
.....
</bank>
I would like to run the following query:
select * from xxxx where to_tsvector('english',xml_column) @@ to_tsquery('<name>Citi Bank</name>')
This works, but it also matches tags like name1, or values with no tag at all.
How do I have to set up my search so that I get an exact match for the tag and value?

You could use the xpath function like this:
select *
from xxxx
where (xpath('/bank/name/text()', xml_column::xml))[1]::text = 'Citi Bank';
However, it won't use the full-text search index. You could use a subquery on the full-text index to find probable matches and avoid full scans, then apply the xpath expression to get exact answers. Alternatively, create an expression (functional) index if the queries are always going to be the same.

You might want to reconsider storing XML in a database. Instead, you could insert the data into related tables, since XML is a poor replacement for a relational store. If you do keep XML in the database, use the XML type, not the TEXT type, and create an index like this (yes, you would basically need one index per xpath expression):
CREATE INDEX my_funcidx ON my_table USING GIN ( CAST(xpath('/bank/name/text()', xmlfield) AS TEXT[]) );
then, query it like this:
SELECT * FROM my_table WHERE CAST(xpath('/bank/name/text()', xmlfield) AS TEXT[]) @> '{Citi Bank}'::TEXT[];
and this will use the index, as EXPLAIN will indicate.
The important part is the CASTing to TEXT[], as XML[], which the xpath function returns, isn't indexable by default.
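To see why a path-based lookup gives exact tag-and-value matches where token-based text search does not, the same extraction can be tried outside the database. This is a minimal Python sketch using the standard library's ElementTree. The `name1` element in the sample document and the helper names are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# Sample document; the <name1> element is hypothetical, added to show
# that a similar tag is not picked up by the path lookup.
doc = """<bank>
  <name>Citi Bank</name>
  <name1>Citi Bank Other</name1>
</bank>"""

def name_values(xml_text):
    """Return the text of every <name> child of a <bank> root element."""
    root = ET.fromstring(xml_text)
    if root.tag != "bank":
        return []
    return [el.text for el in root.findall("name")]

def has_bank_name(xml_text, wanted):
    """Exact match on both tag and value, like the array containment test."""
    return wanted in name_values(xml_text)

print(name_values(doc))
print(has_bank_name(doc, "Citi Bank"))
print(has_bank_name(doc, "Citi Bank Other"))
```

Only the value inside the `name` element matches. The value inside `name1` is ignored, which is the behavior the question asks for.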
