Essentially, what I want to do is fetch a row from "Tracks" by id, and also get the relations it has to other tracks (stored in the "Remixes" table).
I can write a simple query that gets the track I want by id, e.g.:
SELECT * FROM "Tracks" WHERE id IN ('track-id1');
That gives me:
id | dateModified | channels | userId
-----------+---------------------+-----------------+--------
track-id1 | 2019-07-21 12:15:46 | {"some":"json"} | 1
But this is what I want to get:
id | dateModified | channels | userId | remixes
-----------+---------------------+-----------------+--------+---------
track-id1 | 2019-07-21 12:15:46 | {"some":"json"} | 1 | track-id2, track-id3
So I want the SELECT query to generate a column called "remixes" containing an array of ids, built from the data available in the "Remixes" table.
Here is example data and database structure:
http://sqlfiddle.com/#!17/ec2e6/3
Don't hesitate to ask questions in case anything is unclear,
Thanks in advance
Left join the remixes and then GROUP BY the track ID and use array_agg() to get an array of the remix IDs.
SELECT t.*,
CASE
WHEN array_agg(r."remixTrackId") = '{NULL}'::varchar(255)[] THEN
'{}'::varchar(255)[]
ELSE
array_agg(r."remixTrackId")
END "remixes"
FROM "Tracks" t
LEFT JOIN "Remixes" r
ON r."originalTrackId" = t."id"
WHERE t."id" = 'track-id1'
GROUP BY t."id";
Note that if there are no remixes, array_agg() returns {NULL}, because the LEFT JOIN still produces one row with a NULL remix ID. I figured you would rather have an empty array in that case; that's what the CASE is for.
BTW, providing a fiddle is a nice move of yours! But please also include the code in the original question. The fiddle site might be down (even permanently) and that renders the question useless because of the missing information.
That's a simple outer join with a string aggregation to get the comma separated list:
SELECT t.*,
string_agg(r."remixTrackId", ', ') as remixes
FROM "Tracks" t
LEFT JOIN "Remixes" r ON r."originalTrackId" = t.id
WHERE t.id = 'track-id1'
GROUP BY t.id;
The above assumes that Tracks.id is the primary key of the Tracks table.
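For anyone who wants to try the join-and-aggregate idea without a PostgreSQL server, here is a minimal sketch using Python's built-in sqlite3 module. It is only a stand-in: SQLite has no array type, so group_concat() plays the role of string_agg(), the Python code splits the result into a list, and the table contents are made-up sample rows modelled on the question.

```python
import sqlite3

# Made-up sample data modelled on the question; SQLite stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tracks  (id TEXT PRIMARY KEY, userId INTEGER);
    CREATE TABLE Remixes (originalTrackId TEXT, remixTrackId TEXT);
    INSERT INTO Tracks  VALUES ('track-id1', 1), ('track-id2', 1), ('track-id3', 2);
    INSERT INTO Remixes VALUES ('track-id1', 'track-id2'), ('track-id1', 'track-id3');
""")

def track_with_remixes(track_id):
    """Return (id, userId, [remix ids]) for one track, or None if it does not exist."""
    row = conn.execute(
        """
        SELECT t.id, t.userId, group_concat(r.remixTrackId, ',')
        FROM Tracks t
        LEFT JOIN Remixes r ON r.originalTrackId = t.id
        WHERE t.id = ?
        GROUP BY t.id
        """,
        (track_id,),
    ).fetchone()
    if row is None:
        return None
    track, user_id, joined = row
    # group_concat() yields NULL when the LEFT JOIN found no remixes.
    remixes = sorted(joined.split(",")) if joined else []
    return track, user_id, remixes

print(track_with_remixes("track-id1"))
print(track_with_remixes("track-id2"))
```

The LEFT JOIN is what keeps a track without remixes in the result; with an inner join that track would disappear instead of coming back with an empty list.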
I have the following query:
let p1 = pageViews | where url has "xxx";
p1
| join kind=inner (pageViews
| where url !has "xxx")
on session_Id
| project timestamp1, session_Id1, url1, client_CountryOrRegion1, client_StateOrProvince1, client_City1, user_Id1
It does get users that originated from a certain provider and then looks at which URLs they are going to.
I am now trying to get how many users I got from that provider.
I could just do distinct session_Id and count, but what I would like to do is add two columns: one that holds a number for each session_Id and increments whenever the session changes, and another that increments with each request made.
I tried:
let p1 = pageViews | where url has "project-management";
p1
| join kind=inner (pageViews
| where url !has "project-management")
on session_Id
| project timestamp1, session_Id1, url1, client_CountryOrRegion1, client_StateOrProvince1, client_City1, user_Id1
| extend Rank=row_number(1)
but it gave me
Function 'row_number' cannot be invoked in current context. Details: the row set must be serialized
The records in the output aren't sorted, therefore there's no meaning to row_number().
row_number() only works on serialized records, which you have after using order by, or serialize.
So the solution to your question is to add | serialize before | extend Rank=row_number(1).
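To illustrate why the rows need a defined order before they can be numbered, here is a small Python sketch of the two counters described in the question. It is not Kusto; the sample rows and the column names SessionRank and RequestRank are made up for the example.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical page views, deliberately out of order.
rows = [
    {"session_Id": "b", "timestamp": 3, "url": "/pricing"},
    {"session_Id": "a", "timestamp": 1, "url": "/home"},
    {"session_Id": "a", "timestamp": 2, "url": "/docs"},
]

def number_rows(records):
    """Sort first (the 'serialize' step), then number sessions and requests."""
    ordered = sorted(records, key=itemgetter("session_Id", "timestamp"))
    out = []
    sessions = groupby(ordered, key=itemgetter("session_Id"))
    for session_rank, (_, group) in enumerate(sessions, start=1):
        for request_rank, rec in enumerate(group, start=1):
            out.append({**rec, "SessionRank": session_rank, "RequestRank": request_rank})
    return out

numbered = number_rows(rows)
for rec in numbered:
    print(rec)
```

Without the sort, the same session could show up in several separate runs and the counters would depend on whatever order the rows happened to arrive in.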
I have a user table that contains a "skills" column which is a text array. Given some input array, I would like to find all the users whose skills % one or more of the entries in the input array, and order by number of matches (according to the % operator from pg_trgm).
For example, I have Array['java', 'ruby', 'postgres'] and I want users who have these skills ordered by the number of matches (max is 3 in this case).
I tried unnest() with an inner join. It looked like I was getting somewhere, but I still have no idea how I can capture the count of the matching array entries. Any ideas on what the structure of the query may look like?
Edit: Details:
Here is what my programmers table looks like:
id | skills
----+-------------------------------
1 | {javascript,rails,css}
2 | {java,"ruby on rails",adobe}
3 | {typescript,nodejs,expressjs}
4 | {auth0,c++,redis}
where skills is a text array.
Here is what I have so far:
SELECT * FROM programmers, unnest(skills) skill_array(x)
INNER JOIN unnest(Array['ruby', 'node']) search(y)
ON skill_array.x % search.y;
which outputs the following:
id | skills | x | y
----+-------------------------------+---------------+---------
2 | {java,"ruby on rails",adobe} | ruby on rails | ruby
3 | {typescript,nodejs,expressjs} | nodejs | node
3 | {typescript,nodejs,expressjs} | expressjs | express
*Assuming pg_trgm is enabled.
For an exact match between the user skills and the searched skills, you can proceed like this (target_skills below stands for the text array of searched skills):
Put the searched skills in the target_skills text array.
Filter the users from user_table whose user_skills array has at least one element in common with target_skills, using the && operator.
For each selected user, select the common skills using unnest and INTERSECT, and count them.
Order the result by the number of common skills, descending.
With this process, users with the skill "ruby" are selected for the target skill "ruby", but users with the skill "ruby on rails" are not.
This process can be implemented as follows:
SELECT u.user_id
, u.user_skills
, inter.skills
FROM user_table AS u
CROSS JOIN LATERAL
( SELECT array( SELECT unnest(u.user_skills)
INTERSECT
SELECT unnest(target_skills)
) AS skills
) AS inter
WHERE u.user_skills && target_skills
ORDER BY array_length(inter.skills, 1) DESC
or with this variant :
SELECT u.user_id
, u.user_skills
, array_agg(t_skill) AS inter_skills
FROM user_table AS u
CROSS JOIN LATERAL unnest(target_skills) AS t_skill
WHERE u.user_skills && array[t_skill]
GROUP BY u.user_id, u.user_skills
ORDER BY array_length(array_agg(t_skill), 1) DESC -- an output alias cannot be used inside an ORDER BY expression
This query can be accelerated by creating a GIN index on the user_skills column of the user_table.
For a partial match between the user skills and the target skills (i.e. users with the skill "ruby on rails" must be selected for the target skill "ruby"), you need the pattern matching operator LIKE or a regular expression. Neither can be applied to a text array directly, so you first need to turn the user_skills array into plain text with the function array_to_string. The query becomes:
SELECT u.user_id
, u.user_skills
, array_agg(t_skill) AS inter_skills
FROM user_table AS u
CROSS JOIN unnest(target_skills) AS t_skill
WHERE array_to_string(u.user_skills, ' ') ~ t_skill
GROUP BY u.user_id, u.user_skills
ORDER BY array_length(array_agg(t_skill), 1) DESC ;
Then you can accelerate the queries by creating the following GIN (or GiST) index :
DROP INDEX IF EXISTS user_skills ;
CREATE INDEX user_skills
ON user_table
USING gist (array_to_string(user_skills, ' ') gist_trgm_ops) ; -- gin_trgm_ops and gist_trgm_ops indexes are compliant with the LIKE operator and the regular expressions
In any case, matching skills as free text will remain fragile if there are typos or if the skills list is not normalized.
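The partial-match logic above can be sketched in plain Python to see which users come back and in what order. This is only an approximation under stated assumptions: it uses a plain substring test instead of the regular expression operator ~ (so a skill like "c++" is not treated as a pattern), and the data is the programmers table from the question.

```python
# Data copied from the question's programmers table.
programmers = {
    1: ["javascript", "rails", "css"],
    2: ["java", "ruby on rails", "adobe"],
    3: ["typescript", "nodejs", "expressjs"],
    4: ["auth0", "c++", "redis"],
}

def rank_by_matches(users, targets):
    """Return (user id, matched targets) pairs, most matches first."""
    results = []
    for user_id, skills in users.items():
        joined = " ".join(skills)  # mirrors array_to_string(user_skills, ' ')
        matched = [t for t in targets if t in joined]
        if matched:
            results.append((user_id, matched))
    # Stable sort: users with equal counts keep their original order.
    results.sort(key=lambda pair: len(pair[1]), reverse=True)
    return results

ranking = rank_by_matches(programmers, ["java", "ruby", "postgres"])
print(ranking)
```

Note that "java" also matches "javascript" here, which is exactly the kind of false positive you get from substring matching on unnormalized skills.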
I accepted Edouard's answer, but I thought I'd show something else I adapted from it.
CREATE OR REPLACE FUNCTION partial_and_and(list1 TEXT[], list2 TEXT[])
RETURNS BOOLEAN AS $$
SELECT EXISTS(
SELECT * FROM unnest(list1) x, unnest(list2) y
WHERE x % y
);
$$ LANGUAGE SQL IMMUTABLE;
Then create the operator:
CREATE OPERATOR &&% (
LEFTARG = TEXT[],
RIGHTARG = TEXT[],
PROCEDURE = partial_and_and,
COMMUTATOR = &&%
);
And finally, the query:
SELECT p.id, p.skills, array_agg(t_skill) AS inter_skills
FROM programmers AS p
CROSS JOIN LATERAL unnest(Array['ruby', 'java']) AS t_skill
WHERE p.skills &&% array[t_skill]
GROUP BY p.id, p.skills
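ORDER BY array_length(array_agg(t_skill), 1) DESC;
Note that writing ORDER BY array_length(inter_skills, 1) instead fails with an error saying column 'inter_skills' does not exist: PostgreSQL accepts an output alias in ORDER BY only on its own, not inside an expression, so the aggregate has to be repeated. All credit goes to Edouard.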
I have Table users:
user_id | lang_id
--------+---------
12345 | en
54321 | ru
77777 | uz
and Table texts:
text_id | en | ru | uz
--------+--------+---------+-------
hi | Hello! | Привет! | Salom!
bye | Bye! | Пока! | Xayr!
I have two pieces of information:
user_id = 12345
text_id = 'hi'
and I'm trying this query, to get a text for user's chosen language:
SELECT (SELECT lang_id FROM users WHERE user_id = 12345) FROM texts
WHERE text_id = 'hi'
and getting this:
lang_id
------
en
I should get the text "Hello!"
I'm kind of a newbie in PostgreSQL, so it would be great if you could help me solve this :)
While I 100% recommend changing your schema to something properly normalized like Jorge Campos suggests up in the comments, you can use some hard coding in a CASE statement to get at your texts.
SELECT
CASE
WHEN users.lang_id = 'en' THEN texts.en
WHEN users.lang_id = 'ru' THEN texts.ru
WHEN users.lang_id = 'uz' THEN texts.uz
END as user_language_text
FROM
users, texts
WHERE
user_id = 12345
AND text_id = 'hi';
There are some major downsides here though:
That CASE statement is costly from a CPU perspective
To determine which column in texts to retrieve your data from, you have to hard code the possible language values. That means every time you add a new language, not only do you have to add a new column to your table (a major anti-pattern by itself), but you also have to tweak ALL of your SQL to accommodate it.
You must cross join (or subquery without correlation) to derive the relationship between your texts and the user's lang_id. If lang_id were a column in your text table you would join there and just pick up values in your texts table that correspond to the user's lang_id.
Again. I would highly highly encourage you to rethink your schema since this has you headed toward a nightmare that won't scale, will cause you to constantly edit your schema, and hard code values in your SQL.
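To make the normalization suggestion concrete, here is a minimal sketch with one row per (text_id, lang_id) pair, run through Python's sqlite3 module as a stand-in for PostgreSQL. The column name content and the exact table layout are assumptions made for the example, not something taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, lang_id TEXT);
    -- Hypothetical normalized layout: one row per text and language.
    CREATE TABLE texts (
        text_id TEXT,
        lang_id TEXT,
        content TEXT,
        PRIMARY KEY (text_id, lang_id)
    );
    INSERT INTO users VALUES (12345, 'en'), (54321, 'ru'), (77777, 'uz');
    INSERT INTO texts VALUES
        ('hi',  'en', 'Hello!'), ('hi',  'ru', 'Привет!'), ('hi',  'uz', 'Salom!'),
        ('bye', 'en', 'Bye!'),   ('bye', 'ru', 'Пока!'),   ('bye', 'uz', 'Xayr!');
""")

def text_for(user_id, text_id):
    """Look up a text in the user's language with a plain join, no CASE needed."""
    row = conn.execute(
        """
        SELECT t.content
        FROM users u
        JOIN texts t ON t.lang_id = u.lang_id
        WHERE u.user_id = ? AND t.text_id = ?
        """,
        (user_id, text_id),
    ).fetchone()
    return row[0] if row else None

print(text_for(12345, "hi"))
```

With this layout, adding a language means inserting rows rather than altering the table, and the query itself never changes.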
Well, the simplest way is to cast the row to JSON and then access the field dynamically. It is very similar to your original query (->> returns the value as text, whereas -> would return it as JSON, i.e. with quotes):
select row_to_json(t.*)->>(select lang_id
from users
where user_id = 12345)
from texts t
where text_id = 'hi'
In my opinion, this is the wrong way to design localization. You could instead give the texts table the columns text_id, lang_id and text, which would be more flexible.
With your current schema you can do this:
SELECT
case
when u.lang_id = 'en' then t.en
when u.lang_id = 'ru' then t.ru
when u.lang_id = 'uz' then t.uz
else t.en
end as text
FROM texts as t
left join lateral (
SELECT lang_id
FROM users
WHERE user_id = 12345
) as u on true
WHERE text_id = 'hi'
I've been migrating some of my MySQL queries to PostgreSQL to use Heroku. Most of my queries work fine, but I keep having a similar recurring error when I use group by:
ERROR: column "XYZ" must appear in the GROUP BY clause or be used in
an aggregate function
Could someone tell me what I'm doing wrong?
MySQL which works 100%:
SELECT `availables`.*
FROM `availables`
INNER JOIN `rooms` ON `rooms`.id = `availables`.room_id
WHERE (rooms.hotel_id = 5056 AND availables.bookdate BETWEEN '2009-11-22' AND '2009-11-24')
GROUP BY availables.bookdate
ORDER BY availables.updated_at
PostgreSQL error:
ActiveRecord::StatementInvalid: PGError: ERROR: column
"availables.id" must appear in the GROUP BY clause or be used in an
aggregate function:
SELECT "availables".* FROM "availables" INNER
JOIN "rooms" ON "rooms".id = "availables".room_id WHERE
(rooms.hotel_id = 5056 AND availables.bookdate BETWEEN E'2009-10-21'
AND E'2009-10-23') GROUP BY availables.bookdate ORDER BY
availables.updated_at
Ruby code generating the SQL:
expiration = Available.find(:all,
:joins => [ :room ],
:conditions => [ "rooms.hotel_id = ? AND availables.bookdate BETWEEN ? AND ?", hostel_id, date.to_s, (date+days-1).to_s ],
:group => 'availables.bookdate',
:order => 'availables.updated_at')
Expected Output (from working MySQL query):
+-----+-------+-------+------------+---------+---------------+---------------+
| id | price | spots | bookdate | room_id | created_at | updated_at |
+-----+-------+-------+------------+---------+---------------+---------------+
| 414 | 38.0 | 1 | 2009-11-22 | 1762 | 2009-11-20... | 2009-11-20... |
| 415 | 38.0 | 1 | 2009-11-23 | 1762 | 2009-11-20... | 2009-11-20... |
| 416 | 38.0 | 2 | 2009-11-24 | 1762 | 2009-11-20... | 2009-11-20... |
+-----+-------+-------+------------+---------+---------------+---------------+
3 rows in set
MySQL's totally non standards compliant GROUP BY can be emulated by Postgres' DISTINCT ON. Consider this:
MySQL:
SELECT a,b,c,d,e FROM table GROUP BY a
This delivers 1 row per value of a (which one, you don't really know). Well actually you can guess, because MySQL doesn't know about hash aggregates, so it will probably use a sort... but it will only sort on a, so the order of the rows could be random. Unless it uses a multicolumn index instead of sorting. Well, anyway, it's not specified by the query.
Postgres:
SELECT DISTINCT ON (a) a,b,c,d,e FROM table ORDER BY a,b,c
This delivers 1 row per value of a, this row will be the first one in the sort according to the ORDER BY specified by the query. Simple.
Note that here, it's not an aggregate I'm computing. So GROUP BY actually makes no sense. DISTINCT ON makes a lot more sense.
Rails is married to MySQL, so I'm not surprised that it generates SQL that doesn't work in Postgres.
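To show what "the first row per group according to the ORDER BY" means, here is a plain-Python sketch of the DISTINCT ON behaviour described above. The function name distinct_on and the sample rows are made up for illustration; only the column names come from the question.

```python
from operator import itemgetter

# Hypothetical rows: two candidates for 2009-11-22, one for 2009-11-23.
availables = [
    {"id": 414, "bookdate": "2009-11-22", "updated_at": "2009-11-20 10:00"},
    {"id": 500, "bookdate": "2009-11-22", "updated_at": "2009-11-19 09:00"},
    {"id": 415, "bookdate": "2009-11-23", "updated_at": "2009-11-20 10:00"},
]

def distinct_on(rows, key, order):
    """Keep the first row per key after sorting by `order`."""
    first_per_key = {}
    for row in sorted(rows, key=order):
        # setdefault keeps the row already stored for this key, if any.
        first_per_key.setdefault(key(row), row)
    return list(first_per_key.values())

picked = distinct_on(
    availables,
    key=itemgetter("bookdate"),
    order=itemgetter("bookdate", "updated_at"),
)
print([row["id"] for row in picked])
```

Unlike MySQL's loose GROUP BY, the row that survives for each bookdate is fully determined by the sort order, here the one with the earliest updated_at.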
PostgreSQL is more SQL compliant than MySQL. All fields in the output, except fields computed with an aggregate function, must be present in the GROUP BY clause.
MySQL's GROUP BY can be used without an aggregate function (which is contrary to the SQL standard) and returns some row from the group (I don't know based on what criteria), while PostgreSQL requires every selected column that is not in the GROUP BY clause to be wrapped in an aggregate function (MAX, SUM, etc.).
Correct, the solution to fixing this is to use :select and to select each field that you wish to decorate the resulting object with and group by them.
Nasty - but it is how group by should work as opposed to how MySQL works with it by guessing what you mean if you don't stick fields in your group by.
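If I remember correctly, in PostgreSQL you have to add every column you fetch from the table where the GROUP BY clause applies to the GROUP BY clause. (Since PostgreSQL 9.1, grouping by the table's primary key is enough, because the other columns are functionally dependent on it.)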
Not the prettiest solution, but changing the group parameter to output every column in model works in PostgreSQL:
expiration = Available.find(:all,
:joins => [ :room ],
:conditions => [ "rooms.hotel_id = ? AND availables.bookdate BETWEEN ? AND ?", hostel_id, date.to_s, (date+days-1).to_s ],
:group => Available.column_names.collect{|col| "availables.#{col}"},
:order => 'availables.updated_at')
According to MySQL's "Debunking GROUP BY Myths" (http://dev.mysql.com/tech-resources/articles/debunking-group-by-myths.html), SQL (the 2003 version of the standard) doesn't require columns referenced in the SELECT list of a query to also appear in the GROUP BY clause.
For others looking for a way to order by any field, including a joined field, in PostgreSQL, use a subquery:
SELECT * FROM (
SELECT DISTINCT ON (availables.bookdate) availables.*
FROM availables INNER JOIN rooms ON rooms.id = availables.room_id
WHERE (rooms.hotel_id = 5056
AND availables.bookdate BETWEEN '2009-11-22' AND '2009-11-24')
) AS distinct_selected
ORDER BY distinct_selected.updated_at
or arel:
subquery = SomeRecord.select("distinct on(xx.id) xx.*, jointable.order_field")
.where("...").joins("...")
result = SomeRecord.select("*").from("(#{subquery.to_sql}) AS distinct_selected").order(" xx.order_field ASC, jointable.order_field ASC")
I think that .uniq [1] will solve your problem.
[1] Available.select('...').uniq
Take a look at http://guides.rubyonrails.org/active_record_querying.html#selecting-specific-fields
I'm working on an application, and we're using Postgres as our DB. I don't have a lot of experience with SQL, and now I've encountered a problem that I can't find an answer to.
So here's the problem:
We have privacy settings stored in a separate table, and the accessibility of each row of data depends on a few rows of this privacy table.
Basically, the structure of the privacy table is:
entityId | entityType | privacyId | privacyType | allow | deletedAt
-------------------------------------------------------------------
5 | user | 6 | user | f | //example entry
5 | user | 1 | user_all | t |
In short, these settings mean that user id 5 allows everybody except user id 6 to access his data.
So I get the available data with a query like:
SELECT <some_relevant_fields> FROM <table>
JOIN <join>
WHERE
(privacy."privacyId"=6 AND privacy."privacyType"='user' AND privacy.allow=true)
OR (
(privacy."privacyType"='user_all' AND privacy."deletedAt" IS NULL)
AND
(privacy."privacyType"='user' AND privacy."privacyId"=6 AND privacy.allow!=false)
);
I know that this query is incorrect in this form, but I want you to get the idea of what I'm trying to achieve.
So it must check for a row with the matching type/id and allow=true, OR check that user_all is not deleted (the deletedAt field is null) and that there is no row restricting access for this user with allow=false.
But it seems that Postgres evaluates the whole expression against a single row, so the condition privacy."privacyType"='user_all' conflicts with 'user' at the end of the expression; the query returns no results, or returns data even if the user is "blocked", because the 'user_all' row exists.
Is there a way to write a WHERE clause that returns a result only if the AND expression is true for 2 different rows? In the code above, for example, (privacy."privacyType"='user_all' AND privacy."deletedAt" IS NULL) would be true for one row AND (privacy."privacyType"='user' AND privacy."privacyId"=6 AND privacy.allow!=false) true for another. Or, alternatively, is there a way to check for the absence of a row with these values?
Is this what you want?
select <some_fields> from <table> where
"privacyType"='user_all' AND "deletedAt" IS NULL
union
select <some_fields> from <table> where
"privacyType"='user' AND "privacyId"=6 AND allow<>'f';
You left join the table with itself and use the WHERE clause to find the rows that don't have a match. Here p2 is the row that blocks user 6, so keeping only the rows where p2 is missing leaves the entities that user 6 may see.
SELECT p1.*
FROM privacy p1
LEFT JOIN privacy p2
ON p1."entityId" = p2."entityId"
AND p2."privacyType" = 'user'
AND p2."privacyId" = 6
AND p2.allow = false
WHERE
p1."privacyType" = 'user_all'
AND p1."deletedAt" IS NULL
AND p2."privacyId" IS NULL
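The self-join idea can be checked quickly with Python's sqlite3 module as a stand-in for PostgreSQL. This is a sketch under assumptions: the two rows are the example entries from the question, allow is stored as 0/1 because SQLite has no boolean type, and the helper name visible_entities is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE privacy (
        entityId INTEGER, entityType TEXT,
        privacyId INTEGER, privacyType TEXT,
        allow INTEGER, deletedAt TEXT
    );
    -- User 5 allows everybody (user_all) but blocks user 6.
    INSERT INTO privacy VALUES (5, 'user', 6, 'user', 0, NULL);
    INSERT INTO privacy VALUES (5, 'user', 1, 'user_all', 1, NULL);
""")

def visible_entities(viewer_id):
    """Entities with an active user_all row and no blocking row for the viewer."""
    rows = conn.execute(
        """
        SELECT p1.entityId
        FROM privacy p1
        LEFT JOIN privacy p2
               ON p2.entityId = p1.entityId
              AND p2.privacyType = 'user'
              AND p2.privacyId = ?
              AND p2.allow = 0
        WHERE p1.privacyType = 'user_all'
          AND p1.deletedAt IS NULL
          AND p2.privacyId IS NULL   -- anti-join: keep rows without a blocker
        """,
        (viewer_id,),
    ).fetchall()
    return [row[0] for row in rows]

print(visible_entities(6))
print(visible_entities(7))
```

The conditions on p2 belong in the ON clause and the conditions on p1 in the WHERE clause; a LEFT JOIN never drops p1 rows because of its ON clause, so filters on p1 placed there would have no effect.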