Say I have a Postgres table where one of its columns has the jsonb type, used to store a multilanguage value.
Here's an example.
| word                                     |
|------------------------------------------|
| {"en": "Hello", "es": "Hola"}            |
| {"it": "Rosso", "es": "Rojo"}            |
| {"en": "Sea", "it": "Mare", "es": "Mar"} |
As you can see, the values do not always have all language versions. My question is: how can I select only the English version, or any first value from the JSON if the English version does not exist?
Thank you.
You'll have to unpack the JSON:
SELECT kv.value
FROM atable
CROSS JOIN LATERAL jsonb_each(atable.word) AS kv(key,value)
WHERE atable.id = ...
ORDER BY kv.key <> 'en'
LIMIT 1;
This relies on the fact that FALSE < TRUE.
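If you'd rather avoid the LATERAL join, a COALESCE-based sketch expresses the same fallback per row. One caveat to be aware of: jsonb stores keys in a fixed order (sorted by length, then bytewise), so "any first value" below means first in that stored key order, not insertion order:
SELECT COALESCE(
         word ->> 'en',  -- the English version if present
         (SELECT t.value FROM jsonb_each_text(word) AS t LIMIT 1)  -- otherwise any first value
       ) AS value
FROM atable
WHERE atable.id = ...;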
I'm trying to query a jsonb field via the Postgrex adapter, but I receive errors I cannot understand.
Notification schema:
def all_for(user_id, external_id) do
  from(n in __MODULE__,
    where: n.to == ^user_id and fragment("? #> '{\"external_id\": ?}'", n.data, ^external_id)
  )
  |> order_by(desc: :id)
end
It generates the following SQL:
SELECT n0."id", n0."data", n0."to", n0."inserted_at", n0."updated_at" FROM "notifications"
AS n0 WHERE ((n0."to" = $1) AND n0."data" #> '{"external_id": $2}') ORDER BY n0."id" DESC
and then I receive the following error
↳ :erl_eval.do_apply/6, at: erl_eval.erl:680
** (Postgrex.Error) ERROR 22P02 (invalid_text_representation) invalid input syntax for type json. If you are trying to query a JSON field, the parameter may need to be interpolated. Instead of
p.json["field"] != "value"
do
p.json["field"] != ^"value"
query: SELECT n0."id", n0."data", n0."to", n0."inserted_at", n0."updated_at" FROM "notifications" AS n0 WHERE ((n0."to" = $1) AND n0."data" #> '{"external_id": $2}') ORDER BY n0."id" DESC
Token "$" is invalid.
(ecto_sql 3.9.1) lib/ecto/adapters/sql.ex:913: Ecto.Adapters.SQL.raise_sql_call_error/1
(ecto_sql 3.9.1) lib/ecto/adapters/sql.ex:828: Ecto.Adapters.SQL.execute/6
(ecto 3.9.2) lib/ecto/repo/queryable.ex:229: Ecto.Repo.Queryable.execute/4
(ecto 3.9.2) lib/ecto/repo/queryable.ex:19: Ecto.Repo.Queryable.all/3
However, if I just copy-paste the generated SQL into the psql console and run it, it succeeds.
SELECT n0."id", n0."data", n0."to", n0."inserted_at", n0."updated_at" FROM "notifications" AS n0 WHERE ((n0."to" = 233) AND n0."data" #> '{"external_id": 11}') ORDER BY n0."id" DESC
notifications-# ;
id | data | to | inserted_at | updated_at
----+---------------------+-----+---------------------+---------------------
90 | {"external_id": 11} | 233 | 2022-12-15 14:07:44 | 2022-12-15 14:07:44
(1 row)
data is a jsonb column:
Column | Type | Collation | Nullable | Default
-------------+--------------------------------+-----------+----------+-------------------------------------------
data | jsonb | | | '{}'::jsonb
What am I missing in my Elixir notification query code?
Searching for a solution, I came across only raw SQL statements, as I couldn't figure out what was wrong with my query when it gets passed through Postgrex.
So, as a solution, I found the following:
def all_for(user_id, external_id) do
  {:ok, result} =
    Ecto.Adapters.SQL.query(
      Notifications.Repo,
      search_by_external_id_query(user_id, external_id)
    )

  Enum.map(result.rows, &Map.new(Enum.zip(result.columns, &1)))
end

# Note: the values are interpolated straight into the SQL string,
# so user_id and external_id must be trusted (SQL injection risk).
defp search_by_external_id_query(user_id, external_id) do
  """
  SELECT * FROM "notifications" AS n0 WHERE ((n0."to" = #{user_id})
  AND n0.data #> '{\"external_id\": #{external_id}}')
  ORDER BY n0."id" DESC
  """
end
But as a result I'm receiving a list of plain maps instead of Ecto.Schema structs, as I would get when using an Ecto.Query through Postgrex, so be aware.
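The root cause is visible in the generated SQL: $2 sits inside a string literal ('{"external_id": $2}'), so Postgres never treats it as a bind parameter. A minimal sketch of the working shape at the SQL level, binding the whole jsonb value as a single parameter; the switch to the @> containment operator is my assumption (#> is a path-extraction operator, while containment matches the intent of the query):
PREPARE notif_by_external_id (bigint, jsonb) AS
  SELECT *
  FROM "notifications" AS n0
  WHERE n0."to" = $1
    AND n0."data" @> $2  -- the whole jsonb value is bound, not spliced into a literal
  ORDER BY n0."id" DESC;

EXECUTE notif_by_external_id (233, '{"external_id": 11}');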
I created a migration to convert a jsonb column into a one-to-many table.
-- upgrade
insert into device_component (
warranty_request_uuid,
serial_number,
component_type,
description
)
select
warranty_request.uuid,
value->>'serial_number',
value->>'type',
value->>'description'
from
warranty_request,
jsonb_array_elements(warranty_request.device_components)
I need to provide a corresponding downgrade statement. To revert the migration, I am trying something like the following.
-- downgrade
update
warranty_request
set
device_components = jsonb_set(
device_components,
'{}',
jsonb_build_object(
'serial_number', device_component.serial_number,
'type', device_component.component_type,
'description', device_component.description
)
)
from
device_component
where
warranty_request.uuid = device_component.warranty_request_uuid
The problem is that the device_components column contains only null values after downgrading, so nothing is inserted.
How should the downgrade statement look to make this work?
I want to upgrade and downgrade between these 2 formats.
Upgrade:
# warranty_request
uuid
-----
abc
# device_component
uuid | warranty_request_uuid | serial_number | device_type | description
-----|-----------------------|---------------|-------------|------------
efg | abc | 1 | foo | bar
hij | abc | 2 | foo | bar
Downgrade:
# warranty_request
uuid | device_components
------|----------------------------------------------------------------------------------------------------------------------
abc  | [{"serial_number": 1, "type": "foo", "description": "bar"}, {"serial_number": 2, "type": "foo", "description": "bar"}]
You can try the query below. Your jsonb_set approach cannot work here for two reasons: jsonb_set returns NULL when its target is NULL, and an UPDATE ... FROM join pairs each warranty_request with just one device_component row, whereas you need all matching rows aggregated into a single array with jsonb_agg:
UPDATE
warranty_request AS w
SET
device_components = a.device_components
FROM
( SELECT d.warranty_request_uuid
, jsonb_agg (jsonb_build_object
( 'serial_number', d.serial_number
, 'type', d.device_type
, 'description', d.description
)
) AS device_components
FROM device_component AS d
GROUP BY d.warranty_request_uuid
) AS a
WHERE w.uuid = a.warranty_request_uuid
see the demo result in dbfiddle.
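One caveat worth flagging (an assumption about the intended downgrade semantics): warranty_request rows with no matching device_component rows are left untouched by the UPDATE above, so they keep their NULL. If those should end up with an empty array instead, a follow-up statement like this would cover them:
UPDATE warranty_request
SET device_components = '[]'::jsonb
WHERE device_components IS NULL;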
I use PostgreSQL and I have a table 'PERSON' in schema 'public' that looks like this:
+----+-------------+-------+----------------------------+
| id | internal_id | name | created |
+----+-------------+-------+----------------------------+
| 1 | P0001-XX00 | Bob | 2021-05-24 22:10:01.93025 |
+----+-------------+-------+----------------------------+
| 2 | P0001-CX00 | Tom | 2021-06-27 22:10:01.93025 |
+----+-------------+-------+----------------------------+
| 3 | P0002-XX00 | Anna | 2021-05-24 22:10:01.93025 |
+----+-------------+-------+----------------------------+
id -> bigint; internal_id -> character varying; name -> character varying; created -> timestamp without time zone
I need to write a procedure that deletes records older than a fixed timestamp, for example now(). But as soon as such an old record is found, I need to check whether the table contains other records that are not old yet and that share the same first 5 characters of internal_id with the found old record. If such records exist, I should not delete the old record.
So I wrote the following procedure with plpgsql and it seems to work:
BEGIN
DELETE FROM public."PERSON" AS t1
WHERE t1.created < now()
AND NOT EXISTS
(
SELECT
FROM public."PERSON" AS t2
WHERE left(t2.internal_id, 5) = left(t1.internal_id, 5)
AND t2.created >= now()
);
COMMIT;
END;
Questions:
Could it be made more correct, prettier, or cleaner? Perhaps instead of the left() function I should have used LIKE, or approached it differently altogether?
Do you think this procedure has reasonable performance, or can it be improved?
Thank you in advance!
That should work just fine.
For good performance, create an index:
CREATE INDEX ON public."PERSON" (left(internal_id, 5), created);
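For completeness, a minimal sketch of the full procedure definition the body above would live in (the procedure name is hypothetical):
CREATE OR REPLACE PROCEDURE public.purge_old_persons()
LANGUAGE plpgsql
AS $$
BEGIN
  DELETE FROM public."PERSON" AS t1
  WHERE t1.created < now()
    AND NOT EXISTS
    (
      SELECT
      FROM public."PERSON" AS t2
      WHERE left(t2.internal_id, 5) = left(t1.internal_id, 5)
        AND t2.created >= now()
    );
  COMMIT;
END;
$$;

CALL public.purge_old_persons();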
I have the following table and data:
/* script for people table, with field tsvector and gin */
CREATE TABLE public.people (
id INTEGER,
name VARCHAR(30),
lastname VARCHAR(30),
complete TSVECTOR
)
WITH (oids = false);
CREATE INDEX idx_complete ON public.people
USING gin (complete);
/* data for people table */
INSERT INTO public.people ("id", "name", "lastname", "complete")
VALUES
(1, 'MICHAEL', 'BRYANT BRYANT', '''bryant'':2,3 ''michael'':1'),
(2, 'HENRY STEVEN', 'BUSH TIESSEN', '''bush'':3 ''henri'':1 ''steven'':2 ''tiessen'':4'),
(3, 'WILLINGTON STEVEN', 'STEPHENS FLINN', '''flinn'':4 ''stephen'':3 ''steven'':2 ''willington'':1'),
(4, 'BRET', 'MARTINEZ AROCH', '''aroch'':3 ''bret'':1 ''martinez'':2'),
(5, 'TERENCE BERT', 'CAVALIERE ENRON', '''bert'':2 ''cavalier'':3 ''terenc'':1');
I need to retrieve the names and last names according to the tsvector field. Currently I have this query:
SELECT * FROM people WHERE complete @@ to_tsquery('WILLINGTON & FLINN');
And the result is right (the third record). BUT if I try with
SELECT * FROM people WHERE complete @@ to_tsquery('STEVEN & FLINN');
/* the same record! */
I don't get any results. Why? What can I do?
You should search your table using the same language (text search configuration) that was used when the values in your complete field were inserted.
Compare the result of this query for english and german:
select * ,
to_tsvector('english', concat_ws(' ', name, lastname )) as english,
to_tsvector('german', concat_ws(' ', name, lastname )) as german
from public.people
So this should work for you:
SELECT * FROM people WHERE complete @@ to_tsquery('english', 'STEVEN & FLINN');
You are probably using a text search configuration where either STEVEN or FLINN are modified by stemming.
I can reproduce this here:
test=> SHOW default_text_search_config;
default_text_search_config
----------------------------
pg_catalog.german
(1 row)
test=> SELECT complete FROM public.people WHERE id = 3;
complete
-------------------------------------------------
'flinn':4 'stephen':3 'steven':2 'willington':1
(1 row)
test=> SELECT * FROM ts_debug('STEVEN & FLINN');
alias | description | token | dictionaries | dictionary | lexemes
-----------+-----------------+--------+---------------+-------------+---------
asciiword | Word, all ASCII | STEVEN | {german_stem} | german_stem | {stev}
blank | Space symbols | | {} | |
blank | Space symbols | & | {} | |
asciiword | Word, all ASCII | FLINN | {german_stem} | german_stem | {flinn}
(4 rows)
test=> SELECT * FROM public.people
WHERE complete @@ to_tsquery('STEVEN & FLINN');
id | name | lastname | complete
----+------+----------+----------
(0 rows)
So you see, the German Snowball dictionary stems STEVEN to stev.
Since complete contains the unstemmed version steven, no match is found.
You should use the same text search configuration when you populate complete and in the query.
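A minimal sketch of keeping the two sides consistent by naming the configuration explicitly in both places (assuming english is the intended language):
-- populate with an explicit configuration ...
UPDATE public.people
SET complete = to_tsvector('english', concat_ws(' ', name, lastname));

-- ... and query with the same one
SELECT * FROM public.people
WHERE complete @@ to_tsquery('english', 'STEVEN & FLINN');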
How do I check in postgres that a varchar contains 'aaa' or 'bbb'?
I tried myVarchar IN ('aaa', 'bbb') but, obviously, it's true only when myVarchar is exactly equal to 'aaa' or 'bbb'.
For checking multiple patterns at once, a concise option is:
myVarchar SIMILAR TO '%(aaa|bbb|ccc)%'
You can use the ANY and LIKE operators together:
SELECT * FROM "myTable" WHERE "myColumn" LIKE ANY( ARRAY[ '%aaa%', '%bbb%' ] );
Assuming this is your table:
CREATE TABLE t
(
myVarchar varchar
) ;
INSERT INTO t (myVarchar)
VALUES
('something aaa else'),
('also some bbb'),
('maybe ccc') ;
-- (some random data, this query is PostgreSQL specific)
INSERT INTO t (myVarchar)
SELECT
random()::varchar
FROM
generate_series(1, 10000) ;
SQL Standard approach:
You can do (in all SQL standard databases):
SELECT
*
FROM
t
WHERE
myVarchar LIKE '%aaa%' or myVarchar LIKE '%bbb%' ;
and you'll get:
| myvarchar |
| :----------------- |
| something aaa else |
| also some bbb |
PostgreSQL specific approaches
Specifically for PostgreSQL, you can use a (single) regex with multiple values to look for:
SELECT
*
FROM
t
WHERE
myVarchar ~ 'aaa|bbb' ;
| myvarchar |
| :----------------- |
| something aaa else |
| also some bbb |
dbfiddle here
If you need quick finds, you can use trigram indexes, like this:
CREATE EXTENSION pg_trgm; -- Only needed if extension not already installed
CREATE INDEX myVarchar_like_idx
ON t
USING GIST (myVarchar gist_trgm_ops);
... the query using LIKE will be much faster.
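A GIN variant of the same trigram index is also worth considering; it is often the better fit when reads dominate and the data changes rarely (a sketch using the same pg_trgm extension):
CREATE INDEX myVarchar_trgm_gin_idx
ON t
USING GIN (myVarchar gin_trgm_ops);

-- Both the LIKE '%aaa%' queries and the regex (~) query above can use this index.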