Select a column by another table's value in PostgreSQL

I have Table users:
user_id | lang_id
--------+---------
12345 | en
54321 | ru
77777 | uz
and Table texts:
text_id | en | ru | uz
--------+--------+---------+-------
hi | Hello! | Привет! | Salom!
bye | Bye! | Пока! | Xayr!
I have two pieces of information:
user_id = 12345
text_id = 'hi'
and I'm trying this query, to get a text for user's chosen language:
SELECT (SELECT lang_id FROM users WHERE user_id = 12345) FROM texts
WHERE text_id = 'hi'
and getting this:
lang_id
------
en
but I should get the text "Hello!".
I'm kind of a newbie in PostgreSQL, so it would be great if you could help me solve this :)

While I 100% recommend changing your schema to something properly normalized like Jorge Campos suggests up in the comments, you can use some hard coding in a CASE statement to get at your texts.
SELECT
CASE
WHEN users.lang_id = 'en' THEN texts.en
WHEN users.lang_id = 'ru' THEN texts.ru
WHEN users.lang_id = 'uz' THEN texts.uz
END as user_language_text
FROM
users, texts
WHERE
user_id = 12345
AND text_id = 'hi';
There are some major downsides here though:
The CASE expression has to be evaluated for every row returned, which adds CPU overhead.
To determine which column in texts to read from, you have to hard code the possible language values. Every time you add a new language, you not only have to add a new column to your table (a major anti-pattern in itself) but also have to tweak ALL of your SQL to accommodate it.
You must cross join (or subquery without correlation) to derive the relationship between your texts and the user's lang_id. If lang_id were a column in your texts table, you would join on it and just pick up the rows in texts that correspond to the user's lang_id.
Again. I would highly highly encourage you to rethink your schema since this has you headed toward a nightmare that won't scale, will cause you to constantly edit your schema, and hard code values in your SQL.
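For illustration, here is a minimal sketch of what the normalized layout could look like. The column name body and the use of temporary tables are assumptions made so the example can be run on its own; they are not part of the original schema.

```sql
-- Hypothetical normalized layout: one row per (text_id, lang_id) pair.
CREATE TEMP TABLE users (
    user_id integer PRIMARY KEY,
    lang_id text NOT NULL
);

CREATE TEMP TABLE texts (
    text_id text NOT NULL,
    lang_id text NOT NULL,
    body    text NOT NULL,          -- assumed column name for the translated text
    PRIMARY KEY (text_id, lang_id)
);

INSERT INTO users VALUES (12345, 'en'), (54321, 'ru'), (77777, 'uz');
INSERT INTO texts VALUES
    ('hi',  'en', 'Hello!'), ('hi',  'ru', 'Привет!'), ('hi',  'uz', 'Salom!'),
    ('bye', 'en', 'Bye!'),   ('bye', 'ru', 'Пока!'),   ('bye', 'uz', 'Xayr!');

-- The user's language now selects a row rather than a column,
-- so no language code is hard coded in the query.
SELECT t.body
FROM users AS u
JOIN texts AS t ON t.lang_id = u.lang_id
WHERE u.user_id = 12345
  AND t.text_id = 'hi';
```

With this layout, adding a language means inserting rows, and neither the schema nor the query has to change.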

Well, the simplest way is to convert the row to JSON and then access the field dynamically. It is very similar to your original query:
-- ->> returns the value as text; -> would return it as a quoted JSON string
select row_to_json(t.*) ->> (select lang_id
from users
where user_id = 12345)
from texts t
where text_id = 'hi'

In my opinion this is the wrong approach to localization. You could instead give the texts table
the columns text_id, lang_id and text, which would make the design more flexible.
With your current schema, you can do this:
SELECT
case
when u.lang_id = 'en' then t.en
when u.lang_id = 'ru' then t.ru
when u.lang_id = 'uz' then t.uz
else t.en
end as text
FROM texts as t
left join lateral (
SELECT lang_id
FROM users
WHERE user_id = 12345
) as u on true
WHERE text_id = 'hi'

Comparing two text array columns using % and ordering by number of matches

I have a user table that contains a "skills" column which is a text array. Given some input array, I would like to find all the users whose skills % one or more of the entries in the input array, and order by number of matches (according to the % operator from pg_trgm).
For example, I have Array['java', 'ruby', 'postgres'] and I want users who have these skills ordered by the number of matches (max is 3 in this case).
I tried unnest() with an inner join. It looked like I was getting somewhere, but I still have no idea how I can capture the count of the matching array entries. Any ideas on what the structure of the query may look like?
Edit: Details:
Here is what my programmers table looks like:
id | skills
----+-------------------------------
1 | {javascript,rails,css}
2 | {java,"ruby on rails",adobe}
3 | {typescript,nodejs,expressjs}
4 | {auth0,c++,redis}
where skills is a text array.
Here is what I have so far:
SELECT * FROM programmers, unnest(skills) skill_array(x)
INNER JOIN unnest(Array['ruby', 'node']) search(y)
ON skill_array.x % search.y;
which outputs the following:
id | skills | x | y
----+-------------------------------+---------------+---------
2 | {java,"ruby on rails",adobe} | ruby on rails | ruby
3 | {typescript,nodejs,expressjs} | nodejs | node
3 | {typescript,nodejs,expressjs} | expressjs | express
*Assuming pg_trgm is enabled.
For an exact match between the user skills and the searched skills, you can proceed like this :
You put the searched skills in the target_skills text array
You filter the users from the table user_table whose user_skills array has at least one common element with the target_skills array by using the && operator
For each of the selected users, you select the common skills by using unnest and INTERSECT, and you calculate the number of these common skills
You order the result by the number of common skills DESC
In this process, the users with skill "ruby" will be selected for the target skill "ruby", but not the users with skill "ruby on rails".
This process can be implemented as follows:
SELECT u.user_id
, u.user_skills
, inter.skills
FROM user_table AS u
CROSS JOIN LATERAL
( SELECT array( SELECT unnest(u.user_skills)
INTERSECT
SELECT unnest(target_skills)
) AS skills
) AS inter
WHERE u.user_skills && target_skills
ORDER BY array_length(inter.skills, 1) DESC
or with this variant :
SELECT u.user_id
, u.user_skills
, array_agg(t_skill) AS inter_skills
FROM user_table AS u
CROSS JOIN LATERAL unnest(target_skills) AS t_skill
WHERE u.user_skills && array[t_skill]
GROUP BY u.user_id, u.user_skills
-- an output alias such as inter_skills cannot be used inside an ORDER BY expression
ORDER BY count(*) DESC
This query can be accelerated by creating a GIN index on the user_skills column of the user_table.
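As a rough sketch, the index could be created as shown below. The temporary table is only a stand-in so the example can be run on its own, and the index name is made up.

```sql
-- Stand-in for the answer's user_table; column names follow the queries above.
CREATE TEMP TABLE user_table (
    user_id     integer PRIMARY KEY,
    user_skills text[]  NOT NULL
);

-- The default GIN operator class for arrays supports &&, @> and <@.
CREATE INDEX user_table_user_skills_gin   -- hypothetical index name
    ON user_table USING gin (user_skills);

-- A filter of this shape is one the planner can answer from the index.
SELECT user_id
FROM user_table
WHERE user_skills && ARRAY['java', 'ruby', 'postgres'];
```

Whether the planner actually picks the index depends on the table size and statistics, so it is worth checking with EXPLAIN on real data.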
For a partial match between the user skills and the target skills (i.e. the users with skill "ruby on rails" must be selected for the target skill "ruby"), you need to use the pattern matching operator LIKE or a regular expression. These cannot be applied to text arrays, so you first need to transform your user_skills text array into a single text value with the function array_to_string. The query becomes:
SELECT u.user_id
, u.user_skills
, array_agg(t_skill) AS inter_skills
FROM user_table AS u
CROSS JOIN unnest(target_skills) AS t_skill
WHERE array_to_string(u.user_skills, ' ') ~ t_skill
GROUP BY u.user_id, u.user_skills
-- an output alias such as inter_skills cannot be used inside an ORDER BY expression
ORDER BY count(*) DESC ;
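Then you can accelerate the queries with a GIN (or GiST) trigram index. Because array_to_string is not marked IMMUTABLE, it cannot be used in an index expression directly, so wrap it in an immutable function first (skills_to_text is a made-up name):
CREATE FUNCTION skills_to_text(text[]) RETURNS text
LANGUAGE sql IMMUTABLE AS $$ SELECT array_to_string($1, ' ') $$ ;
DROP INDEX IF EXISTS user_skills ;
CREATE INDEX user_skills
ON user_table
USING gist (skills_to_text(user_skills) gist_trgm_ops) ; -- gin_trgm_ops and gist_trgm_ops indexes support the LIKE operator and regular expressions
-- the WHERE clause must then filter on skills_to_text(u.user_skills) for the index to be usable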
In any case, managing the skills as free text will always break down if there are typing errors or if the skills list is not normalized.
I accepted Edouard's answer, but I thought I'd show something else I adapted from it.
CREATE OR REPLACE FUNCTION partial_and_and(list1 TEXT[], list2 TEXT[])
RETURNS BOOLEAN AS $$
SELECT EXISTS(
SELECT * FROM unnest(list1) x, unnest(list2) y
WHERE x % y
);
$$ LANGUAGE SQL IMMUTABLE;
Then create the operator:
CREATE OPERATOR &&% (
LEFTARG = TEXT[],
RIGHTARG = TEXT[],
PROCEDURE = partial_and_and,
COMMUTATOR = &&%
);
And finally, the query:
SELECT p.id, p.skills, array_agg(t_skill) AS inter_skills
FROM programmers AS p
CROSS JOIN LATERAL unnest(Array['ruby', 'java']) AS t_skill
WHERE p.skills &&% array[t_skill]
GROUP BY p.id, p.skills
ORDER BY count(*) DESC;
With ORDER BY array_length(inter_skills, 1) this outputs an error saying column 'inter_skills' does not exist, because PostgreSQL only accepts an output alias in ORDER BY when it stands alone, not inside an expression. Ordering by count(*) gives the same ranking. All credit goes to Edouard.
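For the original question (counting how many search terms match via %), a sketch without a custom operator might look like the following. It assumes pg_trgm can be installed by the current role and uses a temporary copy of the programmers table. The aliases sk, s, skill, term and n_matches are made up.

```sql
CREATE EXTENSION IF NOT EXISTS pg_trgm;   -- provides the % similarity operator

CREATE TEMP TABLE programmers (
    id     integer PRIMARY KEY,
    skills text[]  NOT NULL
);

INSERT INTO programmers VALUES
    (1, '{javascript,rails,css}'),
    (2, '{java,"ruby on rails",adobe}'),
    (3, '{typescript,nodejs,expressjs}'),
    (4, '{auth0,c++,redis}');

-- One joined row per (skill, search term) pair that % considers similar;
-- counting distinct search terms gives the number of matches per programmer.
SELECT p.id,
       p.skills,
       count(DISTINCT s.term) AS n_matches
FROM programmers AS p
CROSS JOIN LATERAL unnest(p.skills) AS sk(skill)
JOIN unnest(ARRAY['ruby', 'node']) AS s(term)
  ON sk.skill % s.term
GROUP BY p.id, p.skills
ORDER BY n_matches DESC;
```

Here n_matches is used on its own in ORDER BY, which PostgreSQL accepts. Which pairs count as similar depends on the pg_trgm.similarity_threshold setting.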

Select by id and generate column with relationships in array

Essentially, what I want to do is fetch a row from "Tracks" by id, but I also want to get the relations it has to other tracks (found in the table "Remixes").
I can write a simple query that gets the track I want by id, e.g.
SELECT * FROM "Tracks" WHERE id IN ('track-id1');
That gives me:
id | dateModified | channels | userId
-----------+---------------------+-----------------+--------
track-id1 | 2019-07-21 12:15:46 | {"some":"json"} | 1
But this is what I want to get:
id | dateModified | channels | userId | remixes
-----------+---------------------+-----------------+--------+---------
track-id1 | 2019-07-21 12:15:46 | {"some":"json"} | 1 | track-id2, track-id3
So I want a SELECT query that generates a column called "remixes" holding an array of ids, based on the data available in the "Remixes" table.
Here is example data and database structure:
http://sqlfiddle.com/#!17/ec2e6/3
Don't hesitate to ask questions in case anything is unclear,
Thanks in advance
Left join the remixes and then GROUP BY the track ID and use array_agg() to get an array of the remix IDs.
SELECT t.*,
CASE
WHEN array_agg(r."remixTrackId") = '{NULL}'::varchar(255)[] THEN
'{}'::varchar(255)[]
ELSE
array_agg(r."remixTrackId")
END "remixes"
FROM "Tracks" t
LEFT JOIN "Remixes" r
ON r."originalTrackId" = t."id"
WHERE t."id" = 'track-id1'
GROUP BY t."id";
Note that if there are no remixes, array_agg() will return {NULL}. But I figured you would rather want an empty array in that case. That's what the CASE is for.
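The empty-array handling can also be written with an aggregate FILTER clause (available since PostgreSQL 9.4) instead of comparing against {NULL}. The sketch below uses minimal stand-in tables, since the fiddle's exact schema is not reproduced here; the column types are assumptions.

```sql
-- Minimal stand-ins for the tables in the question (other columns omitted).
CREATE TEMP TABLE "Tracks" (
    "id" varchar(255) PRIMARY KEY
);
CREATE TEMP TABLE "Remixes" (
    "originalTrackId" varchar(255) NOT NULL,
    "remixTrackId"    varchar(255) NOT NULL
);

INSERT INTO "Tracks" VALUES ('track-id1'), ('track-id2'), ('track-id3');
INSERT INTO "Remixes" VALUES ('track-id1', 'track-id2'), ('track-id1', 'track-id3');

-- FILTER drops the NULL produced by the outer join for tracks without remixes;
-- COALESCE then turns the resulting NULL aggregate into an empty array.
SELECT t.*,
       COALESCE(
           array_agg(r."remixTrackId") FILTER (WHERE r."remixTrackId" IS NOT NULL),
           '{}'
       ) AS "remixes"
FROM "Tracks" t
LEFT JOIN "Remixes" r
       ON r."originalTrackId" = t."id"
GROUP BY t."id";
```

Grouping by t."id" alone is enough here because "id" is declared as the primary key.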
BTW, providing a fiddle is a nice move of yours! But please also include the code in the original question. The fiddle site might be down (even permanently) and that renders the question useless because of the missing information.
That's a simple outer join with a string aggregation to get the comma separated list:
SELECT t.*,
string_agg(r."remixTrackId", ', ') as remixes
FROM "Tracks" t
LEFT JOIN "Remixes" r ON r."originalTrackId" = t.id
WHERE t.id = 'track-id1'
GROUP BY t.id;
The above assumes that Tracks.id is the primary key of the Tracks table.

Postgresql query results to depend on few rows of same table

I'm working on an application, and we're using Postgres as our DB. I don't have a lot of experience with SQL, and now I've encountered a problem that I can't find an answer to.
So here's the problem:
We have privacy settings stored in a separate table, and the accessibility of each row of data depends on a few rows of this privacy table.
Basically, the structure of the privacy table is:
entityId | entityType | privacyId | privacyType | allow | deletedAt
-------------------------------------------------------------------
5 | user | 6 | user | f | //example entry
5 | user | 1 | user_all | t |
In short, these settings mean that user id5 allows everybody except user id6 to access his data.
So I get the available data with a query like:
SELECT <some_relevant_fields> FROM <table>
JOIN <join>
WHERE
(privacy."privacyId"=6 AND privacy."privacyType"='user' AND privacy.allow=true)
OR (
(privacy."privacyType"='user_all' AND privacy."deletedAt" IS NOT NULL)
AND
(privacy."privacyType"='user' AND privacy."privacyId"=6 AND privacy.allow!=false)
);
I know that this query is incorrect in this form, but I want you to get the idea of what I'm trying to achieve.
So it must check for a row with this type/id and allow=true, OR check that user_all is not deleted (deletedAt field is null) and that there is no row restricting access for this user with allow=false.
But it seems like Postgres is chaining all the expressions, so it overrides privacy."privacyType"='user_all' with 'user' at the end of the expression, and returns no results, or returns data even if the user is "blocked", because 'user_all' exists.
Is there a way to write a WHERE clause that returns a result if an AND expression is true for 2 different rows? For example, in the code above: (privacy."privacyType"='user_all' AND privacy."deletedAt" IS NOT NULL) is true for one row AND (privacy."privacyType"='user' AND privacy."privacyId"=6 AND privacy.allow!=false) is true for another. Or maybe a way to check for the absence of a row with these values?
Is this what you want?
-- the camelCase columns must be double-quoted, as in the question
select <some_fields> from <table> where
"privacyType"='user_all' AND "deletedAt" IS NOT NULL
union
select <some_fields> from <table> where
"privacyType"='user' AND "privacyId"=6 AND allow<>'f';
You can left join the table with itself and use the WHERE clause to find the rows that have no match.
SELECT p1.*
FROM privacy p1
LEFT JOIN privacy p2
ON p2."entityId" = p1."entityId"
AND p2."privacyType" = 'user'
AND p2."privacyId" = 6
AND p2.allow = false -- a row that blocks user 6
WHERE
p1."privacyType" = 'user_all'
AND p1."deletedAt" IS NULL -- the user_all grant is still active
AND p2."privacyId" IS NULL -- no blocking row was found
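Checking for the presence or absence of other rows, as the question asks, can also be expressed with EXISTS and NOT EXISTS. The sketch below is only one possible reading of the rules. It assumes the column types shown, treats deletedAt as relevant only for the user_all row, and uses a temporary table so it can be run on its own.

```sql
CREATE TEMP TABLE privacy (
    "entityId"    integer NOT NULL,
    "entityType"  text    NOT NULL,
    "privacyId"   integer NOT NULL,
    "privacyType" text    NOT NULL,
    allow         boolean NOT NULL,
    "deletedAt"   timestamp
);

INSERT INTO privacy VALUES
    (5, 'user', 6, 'user',     false, NULL),
    (5, 'user', 1, 'user_all', true,  NULL);

-- Entities visible to viewer 6 (change the two 6s to test another viewer).
SELECT e."entityId"
FROM (SELECT DISTINCT "entityId" FROM privacy) AS e
WHERE EXISTS (                      -- an explicit grant for this viewer
          SELECT 1 FROM privacy p
          WHERE p."entityId" = e."entityId"
            AND p."privacyType" = 'user'
            AND p."privacyId" = 6
            AND p.allow)
   OR (EXISTS (                     -- an active "everybody" grant ...
          SELECT 1 FROM privacy p
          WHERE p."entityId" = e."entityId"
            AND p."privacyType" = 'user_all'
            AND p."deletedAt" IS NULL)
       AND NOT EXISTS (             -- ... and no row blocking this viewer
          SELECT 1 FROM privacy p
          WHERE p."entityId" = e."entityId"
            AND p."privacyType" = 'user'
            AND p."privacyId" = 6
            AND NOT p.allow));
```

With the two sample rows, viewer 6 gets no rows back because of the blocking entry, while any other viewer id would see entity 5. Each subquery looks at its own row, which is what a single flat WHERE clause cannot do.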

Calculate table column using other table on server side

Suppose, there are two tables in db:
Table registries:
Column | Type |
--------------------+-----------------------------+---------
registry_id | integer | not null
name | character varying | not null
...
uploaded_at | timestamp without time zone | not null
Table rows:
Column | Type | Modifiers
---------------+-----------------------------+-----------
row_id | character varying | not null
registry_id | integer | not null
row | character varying | not null
In the real world, a registry is just a CSV file and rows are the lines of that file. In my scala-slick application, I want to know how many lines each file has.
registries:
1,foo,...
2,bar,...
3,baz,...
rows:
aaa,1,...
bbb,1,...
ccc,2,...
desired result:
1,foo,... - 2
2,bar,... - 1
3,baz,... - 0
My code now is (slick-3.0):
def getRegistryWithLength(rId: Int) = {
val q1 = registries.filter(_.registryId===rId).take(1).result.headOption
val q2 = rows.filter(_.registryId===rId).length.result
val registry = Await.result(db.run(q1), 5.seconds)
val length = Await.result(db.run(q2), 5.seconds)
(registry, length)
}
(Await is a bad idea, I know.)
How can I implement getRegistryWithLength using a single SQL query?
I could add a column row_n to the registries table, but then I'd be forced to update row_n after every delete/insert on the rows table.
How can I have the row_n column of registries calculated automatically on the DB server side?
The basic query could be:
SELECT r.*, COALESCE(n.ct, 0) AS ct
FROM registries r
LEFT JOIN (
SELECT registry_id, count(*) AS ct
FROM rows
GROUP BY registry_id
) n USING (registry_id);
The LEFT [OUTER] JOIN is essential so you do not filter out rows from registries that have no related rows in rows.
COALESCE returns 0 instead of NULL where no related rows are found.
There are many related answers on SO. One here:
SQL: How to save order in sql query?
You could wrap this in a VIEW for convenience:
CREATE VIEW reg_rn AS
SELECT ...
... which you query like a table.
Aside: It's unwise to use reserved SQL key words as identifiers. row is a no-go for a column name (even if allowed in Postgres).
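Spelled out, the view could look like the sketch below. The temporary tables only mirror the columns shown in the question, and the column name line is a made-up replacement for row, following the aside above.

```sql
CREATE TEMP TABLE registries (
    registry_id integer PRIMARY KEY,
    name        varchar NOT NULL
);
CREATE TEMP TABLE rows (
    row_id      varchar PRIMARY KEY,
    registry_id integer NOT NULL REFERENCES registries,
    line        varchar NOT NULL      -- hypothetical name replacing "row"
);

INSERT INTO registries VALUES (1, 'foo'), (2, 'bar'), (3, 'baz');
INSERT INTO rows VALUES ('aaa', 1, '...'), ('bbb', 1, '...'), ('ccc', 2, '...');

-- The count is computed at query time, so it never goes stale
-- and no column has to be maintained after inserts or deletes.
CREATE TEMP VIEW reg_rn AS
SELECT r.*, COALESCE(n.ct, 0) AS ct
FROM registries r
LEFT JOIN (
    SELECT registry_id, count(*) AS ct
    FROM rows
    GROUP BY registry_id
) n USING (registry_id);

SELECT * FROM reg_rn ORDER BY registry_id;
```

For the sample data this gives counts of 2, 1 and 0 for registries 1, 2 and 3, matching the desired result in the question.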
Thanks to Erwin Brandstetter for the awesome answer. Using it, I wrote the code for my scala-slick application.
The Scala code looks much more complicated than plain SQL:
val registryQuery = registries.filter(_.userId === userId)
val rowQuery = rows groupBy(_.registryId) map { case (regId, rowItems) => (regId, rowItems.length)}
val q = registryQuery joinLeft rowQuery on (_.registryId === _._1) map {
case (registry, rowsCnt) => (registry, rowsCnt.map(_._2))
}
but it works!

postgresql self join

Say I have a table like so
id | device | cmd | value |
------+----------------+-------+---------
id = unique row ID
device = device identifier (mac address)
cmd = some arbitrary command
value = value of corresponding command
I would like to somehow self join on this table to grab specific cmds and their corresponding values for a particular device.
I do not want just SELECT cmd,value FROM table WHERE device='00:11:22:33:44:55';
Say the values I want correspond to the getname and getlocation commands. I would like to have output something like
mac | name | location
--------------------+-----------+------------
00:11:22:33:44:55 | some name | somewhere
My sql fu is pretty pants. I've been trying different combinations like SELECT a.value,b.value FROM table AS a INNER JOIN table AS b ON a.device=b.device but I am getting nowhere.
Thanks for any help.
-- "table" is a placeholder for the real table name; the command column is called cmd in the question
SELECT a.value AS thisval, b.value AS thatval
FROM table AS a JOIN table AS b USING (device)
WHERE a.cmd='this' AND b.cmd='that';
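Applied to the getname/getlocation example from the question, the self join might look like this sketch. The table name device_cmds, the column types and the sample rows are assumptions for illustration.

```sql
CREATE TEMP TABLE device_cmds (   -- hypothetical table name
    id     serial PRIMARY KEY,
    device text NOT NULL,
    cmd    text NOT NULL,
    value  text
);

INSERT INTO device_cmds (device, cmd, value) VALUES
    ('00:11:22:33:44:55', 'getname',     'some name'),
    ('00:11:22:33:44:55', 'getlocation', 'somewhere');

-- Alias n supplies the name row, alias l the location row of the same device.
SELECT n.device AS mac,
       n.value  AS name,
       l.value  AS location
FROM device_cmds AS n
JOIN device_cmds AS l USING (device)
WHERE n.cmd = 'getname'
  AND l.cmd = 'getlocation'
  AND n.device = '00:11:22:33:44:55';
```

If a device has several rows for the same command, this returns one output row per combination, so some extra rule (for example, keeping only the latest id) would be needed in that case.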