What should be here instead of "???" in a PostgreSQL CREATE INDEX statement?

It's giving me errors because of the question marks, so I was wondering what I should replace them with. I'm quite new to this, so sorry if I phrased it badly.
create unique index product_meta_table_productid_uindex
on product_meta_table using ??? ("productId");
If there is anything else you need, I will be happy to provide it. Thanks in advance.

using ... defines an index type.
It is optional; when left out, PostgreSQL creates a B-tree index, which is a sensible choice for a unique index on a string or number column.
So
create unique index product_meta_table_productid_uindex
on product_meta_table ("productId");
is the same as
create unique index product_meta_table_productid_uindex
on product_meta_table using btree ("productId");

Related

GIN Index implementation

Generally Trigram Indexes are supposed to store the trigrams of the values in the index value.
I have understood the structure of GIN Index and how they store the values.
One thing I am stuck with is, whether they would store the trigrams of the texts given or the texts themselves.
I've read some articles and they all show gin index storing words with tsvector
Now, if this is the case, a GIN index shouldn't work for searches like
SELECT * FROM table WHERE data LIKE '%word%';
But it seems to work for such a case too. I have used a database of a million rows where the column I'm searching on is a random text of size 30. I haven't used tsvector since the column is just a single word of size 30.
Example Column Value: bVeADxRVWpCeEHyNLxxfkfVkSAKkKw
But with a GIN index on this column using gin_trgm_ops,
The fuzzy search seems to be much much faster. It works well.
But if GIN is just storing the words as shown in the above image, it shouldn't work for %word%. Yet it does, which leads me to ask: are GIN indexes made up of the text values themselves or of the trigrams of the text values?
My whole question can be simplified into this:
If I create an index on a column with values like 'bVeADxRVWpCeEHyNLxxfkfVkSAKkKw', would GIN simply index this value, or would it store the trigrams of the value in its index tree (bVe, VeA, eAD, ..., kKw)?
The G in GIN stands for generalized. It just works with a list of tokens per tuple-field to be indexed, but what that token actually represents depends on the operator class to define and extract. The default operator class for tsvector uses stemmed words, the operator class "gin_trgm_ops" (which is for text, but not the default one for text) uses trigrams. An example based on one will have limited applicability to the other. To understand it in a generalized way, you need to consider the tokens to just be labels. One token can point to many rows, and one row can be pointed to by many tokens. Once you get into what the tokens mean, that is the business of the operator class, not of the GIN machinery itself.
When using gin_trgm_ops, '%word%' breaks down to 'wor' and 'ord', both of which must be present in the index (for the same row) in order for '%word%' to possibly match. But 'ordinary worry' also has both of those trigrams in it, so it would pass the bitmap index scan but then be rejected by the recheck.
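The candidate-then-recheck behaviour described above can be sketched in a few lines of Python. This is only an approximation of what pg_trgm does (lowercasing, splitting into alphanumeric words, padding each word with two leading spaces and one trailing space); the function names are made up for illustration and are not part of PostgreSQL.

```python
import re

def trigrams(text):
    """Approximate pg_trgm trigrams of an indexed value (simplified sketch)."""
    result = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "  # two leading spaces, one trailing space
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result

def pattern_trigrams(fragment):
    """Trigrams required by LIKE '%fragment%'; no padding, because the
    wildcards allow the fragment to sit in the middle of a word."""
    fragment = fragment.lower()
    return {fragment[i:i + 3] for i in range(len(fragment) - 2)}

def index_candidate(value, fragment):
    """Index step: every required trigram must be present for the row."""
    return pattern_trigrams(fragment) <= trigrams(value)

def recheck(value, fragment):
    """Recheck step: evaluate the real (case-sensitive) substring condition."""
    return fragment in value

for value in ("ordinary worry", "a password here"):
    print(value, index_candidate(value, "word"), recheck(value, "word"))
```

Both values are candidates according to the trigram sets, but only 'a password here' survives the recheck, which is why the index can be lossy and still return correct results.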

How to efficiently index fields with an identical (and long) prefix in PostgreSQL?

I’m working with identifiers in a rather unusual format: every single ID has the same prefix and the prefix consists of as many as 25 characters. The only thing that is unique is the last part of the ID string and it has a variable length of up to ten characters:
ID
----------------------------------
lorem:ipsum:dolor:sit:amet:12345
lorem:ipsum:dolor:sit:amet:abcd123
lorem:ipsum:dolor:sit:amet:efg1
I’m looking for advice on the best strategy around indexing and matching this kind of ID string in PostgreSQL.
One approach I have considered is basically cutting these long prefixes out and only storing the unique suffix in the table column.
Another option that comes to mind is only indexing the suffix:
CREATE INDEX ON books (substring(book_id FROM 26));
I don’t think this is the best idea though as you would need to remember to always strip out the prefix when querying the table. If you forgot to do it and had a WHERE book_id = '<full ID here>' filter, the index would basically be ignored by the planner.
I usually create an integer ID for my tables, even when they already have a unique string field. To give you a good recommendation, I would need to see all the queries you run against the database. If you already use substring(book_id FROM 26) in your WHERE clauses, then an expression index (function-based index) is the best approach. In general, check which fields your join conditions use and which fields appear in the WHERE clauses of your queries; after that you can plan which indexes to create. If your joins use the unique last part of the ID, then either extract that suffix and store it in an additional column, or create an expression index on the function that extracts it.
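The concern that an expression index is only considered when the query repeats the indexed expression can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in (an assumption made purely so the example runs without a server; SQLite spells the function substr, and PostgreSQL's planner output looks different), and the index name books_suffix_idx is hypothetical.

```python
import sqlite3

PREFIX = "lorem:ipsum:dolor:sit:amet:"
START = len(PREFIX) + 1  # 1-based position of the first suffix character

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (book_id TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?)",
    [(PREFIX + suffix,) for suffix in ("12345", "abcd123", "efg1")],
)
# Expression index on the suffix only (SQLite stand-in for the PostgreSQL index).
conn.execute(f"CREATE INDEX books_suffix_idx ON books (substr(book_id, {START}))")

def plan(sql):
    """Return the query plan detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)

# Query that repeats the indexed expression: the index is usable.
suffix_plan = plan(f"SELECT * FROM books WHERE substr(book_id, {START}) = 'abcd123'")
# Query on the full ID: the expression index does not apply.
full_id_plan = plan(f"SELECT * FROM books WHERE book_id = '{PREFIX}abcd123'")

print(suffix_plan)
print(full_id_plan)
```

Storing the suffix in its own column avoids this pitfall, at the cost of changing the table layout.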

Is there any way to make the unique index ignore old data?

What I need to do is prevent rows from being duplicated on the values number, serie and model.
My initial thought was to do something like this:
CREATE UNIQUE INDEX idx_fiscal_number
ON Fiscal(number, serie, model);
The problem is that this database is really old, and there's a lot of data already duplicated. So my question is:
Is there any way to make this unique index start validating now and accept the old data already there?
You can try to create a partial index if you can express the condition "old" with existing columns. For example, if you have a column "age":
CREATE UNIQUE INDEX idx_fiscal_number_partial
ON fiscal(number, serie, model)
WHERE age < 1000;
See https://www.postgresql.org/docs/current/indexes-partial.html.
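A minimal runnable sketch of the partial unique index idea follows. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (SQLite also supports a WHERE clause on CREATE INDEX), and it assumes a hypothetical id column whose value separates old rows from new ones; the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fiscal (id INTEGER PRIMARY KEY, number INTEGER, serie TEXT, model TEXT)"
)
# Legacy rows that already contain a duplicate (ids 1 and 2).
conn.executemany(
    "INSERT INTO fiscal VALUES (?, ?, ?, ?)",
    [(1, 10, "A", "55"), (2, 10, "A", "55")],
)
# Uniqueness is enforced only for rows past the cutoff; old rows are not indexed.
conn.execute(
    "CREATE UNIQUE INDEX idx_fiscal_number_partial "
    "ON fiscal (number, serie, model) WHERE id > 2"
)

# The first new row is accepted even though it repeats the legacy values,
# because the legacy rows are outside the index.
conn.execute("INSERT INTO fiscal VALUES (3, 10, 'A', '55')")

# A second new row with the same values violates the partial unique index.
try:
    conn.execute("INSERT INTO fiscal VALUES (4, 10, 'A', '55')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM fiscal").fetchone()[0]
print(rejected, row_count)
```

Note the consequence shown by row 3: a partial unique index only compares rows that satisfy its WHERE condition, so a new row can still duplicate an old one.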

Is it possible to index labels in Titan?

When I run a query like this:
g.V().hasLabel("myDefinedLabel").values("myKey").next()
It prints
20:46:30 WARN com.thinkaurelius.titan.graphdb.transaction.StandardTitanTx -
Query requires iterating over all vertices [(~label = myDefinedLabel)].
For better performance, use indexes
So, I guess to solve this issue, I need to index the label. Is there any way to do this?
I tried doing the normal procedure to create an index described here in Titan docs but the label is not a regular property key to index.
Assuming I want to create a composite index for labels, how to do it?
Titan doesn't allow you to index labels, and according to the devs it isn't something they're interested in enabling.

Create index on first 3 characters (area code) of phone field?

I have a Postgres table with a phone field stored as varchar(10), but we search on the area code frequently, e.g.:
select * from bus_t where bus_phone like '555%'
I wanted to create an index to facilitate with these searches, but I got an error when trying:
CREATE INDEX bus_ph_3 ON bus_t USING btree (bus_phone::varchar(3));
ERROR: 42601: syntax error at or near "::"
My first question is, how do I accomplish this, but also I am wondering if it makes sense to index on the first X characters of a field or if indexing on the entire field is just as effective.
Actually, a plain B-tree index is normally useless for pattern matching with LIKE (~~) or regexp (~), even with left-anchored patterns, if your installation runs on any other locale than "C", which is the typical case. Here is an overview of pattern matching and indices in a related answer on dba.SE.
Create an index with the varchar_pattern_ops operator class (matching your varchar column) and be sure to read the chapter on operator classes in the manual.
CREATE INDEX bus_ph_pattern_ops_idx ON bus_t (bus_phone varchar_pattern_ops);
Your original query can use this index:
... WHERE bus_phone LIKE '555%'
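The reason a left-anchored pattern can be served by a B-tree at all is that, under plain character-by-character ordering, every value starting with '555' sits in one contiguous range of the sorted keys. The Python sketch below illustrates that idea with a sorted list and bisect; it is not how PostgreSQL is implemented, and the function name and sample phone numbers are made up.

```python
from bisect import bisect_left

def prefix_range(sorted_values, prefix):
    """Return all values starting with prefix via two binary searches,
    mimicking a range scan from prefix (inclusive) to the next prefix (exclusive)."""
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)  # '555' -> '556'
    lo = bisect_left(sorted_values, prefix)
    hi = bisect_left(sorted_values, upper)
    return sorted_values[lo:hi]

# Python compares strings by code point, similar in spirit to the "C" locale.
phones = sorted(
    ["5551234567", "5559990000", "5541112222", "5560001111", "2125550000"]
)
matches = prefix_range(phones, "555")
print(matches)
```

With a locale-aware sort order this contiguity is no longer guaranteed, which is where the xxx_pattern_ops operator classes come in.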
Performance of a functional index on the first 3 characters as described in the answer by @a_horse is pretty much the same in this case.
-> SQLfiddle demo.
Generally a functional index on relevant leading characters would be a good idea, but your column has only 10 characters. Consider that the overhead per tuple is already 28 bytes. Saving 7 bytes is just not substantial enough to make a big difference. Add to that the cost of the function call and the fact that xxx_pattern_ops are generally a bit faster.
In Postgres 9.2 or later the index on the full column can also serve as covering index in index-only scans.
However, the more characters in the columns, the bigger the benefit from a functional index.
You may even have to resort to a prefix index (or some other kind of hash) if the strings get too long. There is a maximum length for indices.
If you decide to go with the functional index, consider using the xxx_pattern_ops variant for a small additional performance benefit. Be sure to read about the pros and cons in the manual and in Peter Eisentraut's blog entry:
CREATE INDEX bus_ph_3 ON bus_t (left(bus_phone, 3) varchar_pattern_ops);
Explain error message
You'd have to use the standard SQL cast syntax for functional indices. This would work, pretty much like the one with left(), but like @a_horse I'd prefer left().
CREATE INDEX bus_ph_3 ON bus_t USING btree (cast(bus_phone AS varchar(3)));
When using like '555%' an index on the complete column will be used just as well. There is no need to only index the first three characters.
If you do want to index only the first 3 characters (e.g. to save space), then you could use the left() function:
CREATE INDEX bus_ph_3 ON bus_t USING btree (left(bus_phone,3));
But in order for that index to be used, you would need to use that expression in your where clause:
where left(bus_phone,3) = '555';
But again: that is most probably overkill and the index on the complete column will be good enough and can be used for other queries as well e.g. bus_phone = '555-1234' which the index on just the first three characters would not.