How to index a multilanguage entity in PostgreSQL

Here I am creating the table product_feature_text, which has a 1:N relation with the table product. As the application must support several user languages, a lang_code column is added to segment English texts from texts in other languages.
As I want to present the product features alphabetically ordered in every language, I have created four partial indexes, each with its specific collation. It is expected that all product features have a title in all four languages, i.e., there will be 25% of rows with lang_code = 'ES', for example.
This is an oversimplification of the real case but enough to depict the situation.
create table product_feature_text (
    id bigint generated by default as identity primary key,
    -- reference to the parent product
    product_id bigint not null,
    -- language dependent columns
    lang_code char(2),
    title varchar,
    foreign key (product_id) references product (id)
);
create index on product_feature_text (title collate "en-US") where lang_code = 'EN';
create index on product_feature_text (title collate "es-ES") where lang_code = 'ES';
create index on product_feature_text (title collate "fr_FR") where lang_code = 'FR';
create index on product_feature_text (title collate "de_DE") where lang_code = 'DE';
Is this the best index approach for the case?
Addendum from a comment: a typical query would be
select title
from product_feature_text
where product_id = 1024
and lang_code = 'FR'
order by title collate "fr_FR"
where product_id could be anything.

It depends on the intended use of the indexes.
If you want to use them for
SELECT ... FROM product_feature_text
WHERE lang_code = 'EN' AND ...
ORDER BY title COLLATE "en-US";
your indexes might be useful.
Also, if your query looks like
WHERE title > 'bhd' COLLATE ...
it might help.
However, for most cases that I can envision, a single index whose collation doesn't matter would be better.
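For example, if most queries are plain equality lookups on title regardless of language, a single index in the database's default collation would cover them; a minimal sketch, not part of the original answer:
-- One index for all languages; enough for WHERE title = '...' lookups.
CREATE INDEX ON product_feature_text (title);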
For the query in the addendum, the perfect index would be:
CREATE INDEX ON product_feature_text (product_id, title COLLATE "fr_FR")
WHERE lang_code = 'FR';
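To verify the planner actually uses that partial index, a quick hypothetical check could look like this (the plan should show an index scan and no extra sort step):
EXPLAIN
SELECT title
FROM product_feature_text
WHERE product_id = 1024
  AND lang_code = 'FR'
ORDER BY title COLLATE "fr_FR";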

Related

Add data from one table to an array in another table based on matching value in postgres 15

I have two tables, one for reviews and another with URLs for photos. I want to add the URLs to an array in the reviews table, matching them on the review id they both have. Ideally I would be able to add the photos as JSON objects to the array, with one property for the photo id and another for the url. I'm new to Postgres and SQL and am struggling to come up with the query to do this. Below is the SQL for the two tables, and below them is my attempt at a query:
CREATE TABLE IF NOT EXISTS public.reviews
(
    id integer NOT NULL,
    product_id integer NOT NULL,
    rating integer,
    date text COLLATE pg_catalog."default",
    summary text COLLATE pg_catalog."default" NOT NULL,
    body text COLLATE pg_catalog."default" NOT NULL,
    recommend boolean,
    reported boolean,
    reviewer_name text COLLATE pg_catalog."default" NOT NULL,
    reviewer_email text COLLATE pg_catalog."default" NOT NULL,
    response text COLLATE pg_catalog."default",
    helpfulness integer,
    photos text[] COLLATE pg_catalog."default",
    CONSTRAINT reviews_pkey PRIMARY KEY (id)
);
CREATE TABLE IF NOT EXISTS public.photos
(
    id integer,
    review_id integer,
    url text COLLATE pg_catalog."default"
);
update reviews
set photos = array_append(photos, photos.url)
where photos.review_id = reviews.id;
You are missing a FROM clause for the photos table:
update reviews
set photos = array_append(reviews.photos, photos.url)
from photos
where photos.review_id = reviews.id;
Prefixing the photos column on the right-hand side of the assignment isn't strictly necessary, but when a table and a column share a name, I find this to be more readable.
Note that array_append(reviews.photos, photos.url) can also be written as reviews.photos || photos.url.
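As an aside, the question also mentions storing the photos as JSON objects with one property for the id and one for the url. A hedged sketch of that variant, assuming the photos column were changed from text[] to jsonb[] (a hypothetical ALTER TABLE ... ALTER COLUMN, not in the original):
-- Assumes photos is jsonb[]; each appended element carries id and url.
UPDATE reviews
SET photos = array_append(reviews.photos,
                          jsonb_build_object('id', photos.id, 'url', photos.url))
FROM photos
WHERE photos.review_id = reviews.id;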

Postgres - how to bulk insert a table with foreign keys

I am looking to do a bulk insert into my PostgreSQL database.
The database is not yet live, running PostgreSQL 13.
I have a temporary staging table into which I bulk inserted data:
CREATE TABLE public.temp_inverter_location
(
    id integer,
    inverter_num_in_sld integer,
    lift_requirements character varying,
    geo_location_id integer NOT NULL REFERENCES geo_location (id),
    location_name character varying,
    project_info_id integer NOT NULL REFERENCES project_info (id)
);
I am trying to populate the two foreign key columns temp_inverter_location.geo_location_id and temp_inverter_location.project_info_id.
The two referenced tables are referenced by their id columns:
geo_location
CREATE TABLE public.geo_location
(
    id integer,
    country character varying(50) COLLATE pg_catalog."default",
    region character varying(50) COLLATE pg_catalog."default",
    city character varying(100) COLLATE pg_catalog."default",
    location_name character varying COLLATE pg_catalog."default"
);
and
project_info
CREATE TABLE public.project_info
(
    id integer,
    operation_name character varying,
    project_num character varying(10),
    grafana_site_num character varying(10)
);
I want to populate the correct foreign keys into the columns temp_inverter_location.geo_location_id and temp_inverter_location.project_info_id.
I am trying to use INSERT INTO ... SELECT to populate temp_inverter_location.geo_location_id with a JOIN that matches geo_location.location_name and temp_inverter_location.location_name.
I have tried this query; however, temp_inverter_location.geo_location_id remains blank:
INSERT INTO temp_inverter_location (geo_location_id)
SELECT geo_location.id
FROM geo_location
INNER JOIN temp_inverter_location
    ON geo_location.location_name = temp_inverter_location.location_name;
Please let me know if more info is needed, thanks!
I was able to resolve this issue using UPDATE referencing another table; an INSERT adds new rows instead of filling in the existing ones, which is why the column stayed blank.
Basically, I updated the geo_location_id column using
UPDATE temp_inverter_location
SET geo_location_id = geo_location.id
FROM geo_location
WHERE geo_location.location_name = temp_inverter_location.location_name;
and updated the project_info_id using
UPDATE load_table
SET project_info_id = project_info.id
FROM project_info
WHERE project_info.operation_name = load_table.location_name;
It seems to have worked.
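A sanity check worth running after such an update is to look for staging rows whose location_name found no match; a sketch using the question's table and column names:
-- Rows left untouched by the UPDATE because no geo_location matched.
SELECT t.id, t.location_name
FROM temp_inverter_location t
LEFT JOIN geo_location g ON g.location_name = t.location_name
WHERE g.id IS NULL;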

Update fields of one table from another in PostgreSQL

I have two tables
Table 1:
CREATE TABLE public.my_line
(
    id bigint NOT NULL,
    geom geometry,
    name character varying(254) COLLATE pg_catalog."default",
    CONSTRAINT my_line_pkey PRIMARY KEY (id)
);
Table 2:
CREATE TABLE public.ligne
(
    id integer NOT NULL DEFAULT nextval('ligne_id_seq'::regclass),
    name text COLLATE pg_catalog."default",
    geom geometry,
    CONSTRAINT ligne_pkey PRIMARY KEY (id)
);
I update the second from the first, like this:
update ligne
set name = my_line.name
from my_line
where ligne.id = my_line.id;
It works well, but I want to update just the rows that differ between the two tables. If you have an idea, enlighten me.
Regards.
You need to check whether they are different in your WHERE clause. Try it like this:
UPDATE ligne
SET name = my_line.name
FROM my_line
WHERE ligne.id = my_line.id
AND ligne.name <> my_line.name
-- and whatever else you want to check for
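One caveat: <> never matches when either side is NULL, so rows where one name is NULL would be skipped. If name is nullable, IS DISTINCT FROM treats NULL as an ordinary value:
UPDATE ligne
SET name = my_line.name
FROM my_line
WHERE ligne.id = my_line.id
  AND ligne.name IS DISTINCT FROM my_line.name;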

Why does PostgreSQL recognize only numbers in full text search?

I am learning full text search in PostgreSQL and need to make an English dictionary for FTS. I made the dictionary mydict_en: word tokens are handled by my dictionary, and the other token types by the simple dictionary.
CREATE TEXT SEARCH DICTIONARY mydict_en (
TEMPLATE = ispell,
DictFile = english,
AffFile = english,
StopWords = english
);
CREATE TEXT SEARCH CONFIGURATION public.mydict_en (PARSER = default);
ALTER TEXT SEARCH CONFIGURATION mydict_en ADD MAPPING
FOR email, url, url_path, host, file, version,
sfloat, float, int, uint,
numword, hword_numpart, numhword
WITH simple;
ALTER TEXT SEARCH CONFIGURATION mydict_en ADD MAPPING
FOR word, hword_part, hword
WITH mydict_en;
My test table (I add the FTS column afterwards):
CREATE TABLE matches
(
    id Serial NOT NULL,
    opponents Varchar(1024) NOT NULL,
    metaKeywords Varchar(2048),
    metaDescription Varchar(1024),
    score Varchar(100) NOT NULL,
    primary key (id)
);
ALTER TABLE matches ADD COLUMN fts tsvector;
When I insert data into this table, for example:
INSERT INTO matches (opponents, metaKeywords, metaDescription, score)
VALUES ('heat - thunder', 'nba, ball', 'Heat plays at home.', '99 - 85');
I update my fts field based on priority:
UPDATE matches SET fts =
setweight( coalesce( to_tsvector('mydict_en', opponents),''),'A') ||
setweight( coalesce( to_tsvector('mydict_en', metaKeywords),''),'B') ||
setweight( coalesce( to_tsvector('mydict_en', metaDescription),''),'C') ||
setweight( coalesce( to_tsvector('mydict_en', score),''),'D');
And my fts column contains this record:
'85':2 '99':1
Why does it contain only numbers? Where are the words?
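A way to see which dictionary consumes each token is the built-in ts_debug function; a diagnostic sketch using the sample sentence from the question:
-- NULL or empty lexemes for the word tokens would mean mydict_en is
-- discarding them (e.g. stop words, or dictionary files not loaded as expected).
SELECT alias, token, dictionaries, lexemes
FROM ts_debug('mydict_en', 'Heat plays at home. 99 - 85');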

PostgreSQL - retrieving referenced fields in a query

I have a table created like
CREATE TABLE data
(value1 smallint references labels,
value2 smallint references labels,
value3 smallint references labels,
otherdata varchar(32)
);
and a second 'label holding' table created like
CREATE TABLE labels (id serial primary key, name varchar(32));
The rationale behind it is that value1-3 take values from a very limited set of strings (6 options), and it seems inefficient to store them directly in the data table as varchar. On the other hand, these options do occasionally change, which makes enum types unsuitable.
My question is, how can I execute a single query such that instead of the label IDs I get the relevant labels?
I looked at creating a function for it and stumbled at the point where I needed to pass the label-holding table name to the function (there are several such label-holding tables across the schema). Do I need to create a function per label table to avoid that?
create or replace function translate
(ref_id smallint,reference_table regclass) returns varchar(128) as
$$
begin
select name from reference_table where id = ref_id;
return name;
end;
$$
language plpgsql;
And then do
select
translate(value1, labels) as foo,
translate(value2, labels) as bar
from data;
This, however, errors out with
ERROR: relation "reference_table" does not exist
All suggestions welcome - at this point I can still alter just about anything...
CREATE TABLE labels
( id smallserial primary key
, name varchar(32) UNIQUE -- <<-- might want this, too
);
CREATE TABLE data
( value1 smallint NOT NULL REFERENCES labels(id) -- <<-- here
, value2 smallint NOT NULL REFERENCES labels(id)
, value3 smallint NOT NULL REFERENCES labels(id)
, otherdata varchar(32)
, PRIMARY KEY (value1,value2,value3) -- <<-- added primary key here
);
-- No need for a function here.
-- For small sizes of the `labels` table, the query below will always
-- result in hash-joins to perform the lookups.
SELECT l1.name AS name1, l2.name AS name2, l3.name AS name3
, d.otherdata AS the_data
FROM data d
JOIN labels l1 ON l1.id = d.value1
JOIN labels l2 ON l2.id = d.value2
JOIN labels l3 ON l3.id = d.value3
;
Note: labels.id -> labels.name is a functional dependency (id is the primary key), but that doesn't mean that you need a function. The query just acts like a function.
You can pass the label table name as a string, construct the query as a string, and execute it:
sql := 'select name from ' || reference_table_name || ' where id = ' || ref_id;
EXECUTE sql INTO name;
RETURN name;
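Putting that together, a complete hedged version of the function, assuming every label-holding table has id and name columns as in the question:
CREATE OR REPLACE FUNCTION translate(ref_id smallint, reference_table regclass)
RETURNS varchar AS
$$
DECLARE
    result varchar;
BEGIN
    -- A regclass value renders as a properly quoted identifier, so %s is
    -- safe here; ref_id is passed as a query parameter via USING.
    EXECUTE format('SELECT name FROM %s WHERE id = $1', reference_table)
    INTO result
    USING ref_id;
    RETURN result;
END;
$$ LANGUAGE plpgsql STABLE;
-- Usage: the table name is passed as a string, implicitly cast to regclass:
-- SELECT translate(value1, 'labels') AS foo, translate(value2, 'labels') AS bar FROM data;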