Inner join "many-to-many" table rows as array - postgresql

I'm relatively new to PostgreSQL and trying to figure out how to solve the following scenario. Let's say I have three tables:
stores
| store_id |
|----------|
| 1 |
| 2 |
| 3 |
products
| product_id |
|------------|
| 1 |
| 2 |
| 3 |
store_has_product
| store_id | product_id |
|----------|------------|
| 1 | 3 |
| 1 | 2 |
| 2 | 1 |
| 3 | 3 |
| 1 | 1 |
| 3 | 1 |
| 3 | 2 |
And now I'm trying to build a query to join all products to the stores table and group them in an array, so that I have an output like this:
| store_id | products |
|----------|-----------|
| 1 | {3, 2, 1} |
| 2 | {2} |
| 3 | {3, 1, 2} |
I know that Arrays are possible with PostgreSQL, but I don't get how to write such a query and probably already spent too much time thinking about a solution.
Thanks for your help!

If you are using version 8.4 or later you can use array_agg:
SELECT store_id, array_agg(product_id::text) as products
FROM store_has_product
GROUP BY store_id

Related

Postgres distinct rows whilst also summing

I have a dataset that is similar to this. I need to pick out the most recent metadata (greater execution time = more recent) for a client including the sum of quantities and the latest execution time and meta where the quantity > 0
| Name | Quantity | Metadata | Execution time |
| -------- | ---------|----------|----------------|
| Neil | 1 | [1,3] | 4 |
| James | 1 | [2,18] | 5 |
| Neil | 1 | [4, 1] | 6 |
| Mike | 1 | [5, 42] | 7 |
| James | -1 | Null | 8 |
| Neil | -1 | Null | 9 |
Eg the query needs to return:
| Name | Summed Quantity | Metadata | Execution time |
| -------- | ----------------|----------|----------------|
| James | 0 | [2,18] | 5 |
| Neil | 1 | [4, 1] | 6 |
| Mike | 1 | [5, 42] | 7 |
My query doesn't quite work as it's not returning the sum of the quantities correctly.
SELECT
distinct on (name) name,
(
SELECT
cast(
sum(quantity) as int
)
) as summed_quantity,
meta,
execution_time
FROM
table
where
quantity > 0
group by
name,
meta,
execution_time
order by
name,
execution_time desc;
This query gives a result of
| Name | Summed Quantity | Metadata | Execution time |
| -------- | ----------------|----------|----------------|
| James | 1 | [2,18] | 5 |
| Neil | 1 | [4, 1] | 6 |
| Mike | 1 | [5, 42] | 7 |
ie it's just taking the quantity > 0 from the where and not adding up the quantities in the sub query (i assume because of the distinct clause) I'm unsure how to fix my query to produce the desired output.
This can be achieved using window functions (hence with a single pass of the data)
select
name
, sum_qty
, metadata
, execution_time
from (
select
*
, sum(Quantity) over(partition by name) sum_qty
, row_number() over(partition by name, case when quantity > 0 then 1 else 0 end
order by Execution_time DESC) as rn
from mytable
) d
where rn = 1 and quantity > 0
order by name
result
+-------+---------+----------+----------------+
| name | sum_qty | metadata | execution_time |
+-------+---------+----------+----------------+
| James | 0 | [2,18] | 5 |
| Mike | 1 | [5,42] | 7 |
| Neil | 1 | [4,1] | 6 |
+-------+---------+----------+----------------+
db<>fiddle here

Create a PostgreSQL function that becomes a formula field of a table retrieving related data from other table

The example above can be done on a SQL Server. It is a function that performs the calculation on another table while getting the current table field Id to list data from other table, return a single value.
Question: how to do the exact thing with PostgreSQL
SELECT TOP(5) * FROM Artists;
+------------+------------------+--------------+-------------+
| ArtistId | ArtistName | ActiveFrom | CountryId |
|------------+------------------+--------------+-------------|
| 1 | Iron Maiden | 1975-12-25 | 3 |
| 2 | AC/DC | 1973-01-11 | 2 |
| 3 | Allan Holdsworth | 1969-01-01 | 3 |
| 4 | Buddy Rich | 1919-01-01 | 6 |
| 5 | Devin Townsend | 1993-01-01 | 8 |
+------------+------------------+--------------+-------------+
SELECT TOP(5) * FROM Albums;
+-----------+------------------------+---------------+------------+-----------+
| AlbumId | AlbumName | ReleaseDate | ArtistId | GenreId |
|-----------+------------------------+---------------+------------+-----------|
| 1 | Powerslave | 1984-09-03 | 1 | 1 |
| 2 | Powerage | 1978-05-05 | 2 | 1 |
| 3 | Singing Down the Lane | 1956-01-01 | 6 | 3 |
| 4 | Ziltoid the Omniscient | 2007-05-21 | 5 | 1 |
| 5 | Casualties of Cool | 2014-05-14 | 5 | 1 |
+-----------+------------------------+---------------+------------+-----------+
The function
CREATE FUNCTION [dbo].[ufn_AlbumCount] (#ArtistId int)
RETURNS smallint
AS
BEGIN
DECLARE #AlbumCount int;
SELECT #AlbumCount = COUNT(AlbumId)
FROM Albums
WHERE ArtistId = #ArtistId;
RETURN #AlbumCount;
END;
GO
Now, (at SQL Server), after update the first table fields with ALTER TABLE Artists ADD AlbumCount AS dbo.ufn_AlbumCount(ArtistId); whe can list and get the following result.
+------------+------------------+--------------+-------------+--------------+
| ArtistId | ArtistName | ActiveFrom | CountryId | AlbumCount |
|------------+------------------+--------------+-------------+--------------|
| 1 | Iron Maiden | 1975-12-25 | 3 | 5 |
| 2 | AC/DC | 1973-01-11 | 2 | 3 |
| 3 | Allan Holdsworth | 1969-01-01 | 3 | 2 |
| 4 | Buddy Rich | 1919-01-01 | 6 | 1 |
| 5 | Devin Townsend | 1993-01-01 | 8 | 3 |
| 6 | Jim Reeves | 1948-01-01 | 6 | 1 |
| 7 | Tom Jones | 1963-01-01 | 4 | 3 |
| 8 | Maroon 5 | 1994-01-01 | 6 | 0 |
| 9 | The Script | 2001-01-01 | 5 | 1 |
| 10 | Lit | 1988-06-26 | 6 | 0 |
+------------+------------------+--------------+-------------+--------------+
but how to achieve this on postgresql?
Postgres doesn't support "virtual" computed column (i.e. computed columns that are generated at runtime), so there is no exact equivalent. The most efficient solution is a view that counts this:
create view artists_with_counts
as
select a.*,
coalesce(t.album_count, 0) as album_count
from artists a
left join (
select artist_id, count(*) as album_count
from albums
group by artist_id
) t on a.artist_id = t.artist_id;
Another option is to create a function that can be used as a "virtual column" in a select - but as this is done row-by-row, this will be substantially slower than the view.
create function album_count(p_artist artists)
returns bigint
as
$$
select count(*)
from albums a
where a.artist_id = p_artist.artist_id;
$$
language sql
stable;
Then you can include this as a column:
select a.*, a.album_count
from artists a;
Using the function like that, requires to prefix the function reference with the table alias (alternatively, you can use album_count(a))
Online example

Is there a V-lookup effect in Microsoft Access?

I am a novice self-teaching Microsoft Access.
I have an MS Access database with a table of students (Table1).
Table1
+----+-----------+----------+------------+------------+
| id | firstname | lastname | Year_Group | Form_Group |
+----+-----------+----------+------------+------------+
| 2 | mnb | nbgfv | 7 | 1 |
| 3 | jhg | uhgf | 8 | 2 |
| 4 | poi | ijuy | 9 | 2 |
| 5 | tgf | tgfd | 10 | 2 |
| 6 | wer | qwes | 11 | 2 |
+----+-----------+----------+------------+------------+
Every day students days are recorded sort of like Table2.
Table2
+----------+----+-----------+----------+------------+--------+-----------+----------+
| Date | id | firstname | lastname | Year_Group | Effort | Behaviour | Homework |
+----------+----+-----------+----------+------------+--------+-----------+----------+
| 28/02/19 | 2 | mnb | nbgfv | 7 | Good | Good | Y |
| 28/02/19 | 3 | jhg | uhgf | 8 | OK | OK | Y |
| 28/02/19 | 4 | poi | ijuy | 9 | Bad | Bad | N |
| 01/03/19 | 5 | tgf | tgfd | 10 | Good | OK | Y |
| 01/03/19 | 6 | wer | qwes | 11 | Good | Good | Y |
+----------+----+-----------+----------+------------+--------+-----------+----------+
Is there a way (when using a list box or combo box) to select a student from Table1 so that their information is used for the corresponding columns in Table2?
Or is there a more efficient way to do this?
Firstly, you should normalise your data.
Currently, you are repeating the firstname, lastname, and Year_Group data in two separate tables, which not only bloats your database, but also means that such data must be maintained in two separate places, potentially leading to inconsistencies and then uncertainty as to which is the master.
Instead, I would suggest that your Students table should contain all information pertaining to the characteristics of a student:
Students
+----+-----------+----------+------------+------------+
| id | firstname | lastname | Year_Group | Form_Group |
+----+-----------+----------+------------+------------+
| 2 | mnb | nbgfv | 7 | 1 |
| 3 | jhg | uhgf | 8 | 2 |
| 4 | poi | ijuy | 9 | 2 |
| 5 | tgf | tgfd | 10 | 2 |
| 6 | wer | qwes | 11 | 2 |
+----+-----------+----------+------------+------------+
And the information pertaining to each school day should only reference the student ID in the Students table:
SchoolDays
+----------+----+--------+-----------+----------+
| Date | id | Effort | Behaviour | Homework |
+----------+----+--------+-----------+----------+
| 28/02/19 | 2 | Good | Good | Y |
| 28/02/19 | 3 | OK | OK | Y |
| 28/02/19 | 4 | Bad | Bad | N |
| 01/03/19 | 5 | Good | OK | Y |
| 01/03/19 | 6 | Good | Good | Y |
+----------+----+--------+-----------+----------+
Then, if you want to display the data in its entirety, you would use a query which joins the two tables, e.g.:
select
t2.date,
t1.firstname,
t1.lastname,
t1.year_group,
t2.effort,
t2.behaviour,
t2.homework
from
students t1 inner join schooldays t2 on t1.id = t2.id

postgres sql : getting unified rows

I have one table where I dump all records from different sources (x, y, z) like below
+----+------+--------+
| id | source |
+----+--------+
| 1 | x |
| 2 | y |
| 3 | x |
| 4 | x |
| 5 | y |
| 6 | z |
| 7 | z |
| 8 | x |
| 9 | z |
| 10 | z |
+----+--------+
Then I have one mapping table where I map values between sources based on my usecase like below
+----+-----------+
| id | mapped_id |
+----+-----------+
| 1 | 2 |
| 1 | 9 |
| 3 | 7 |
| 4 | 10 |
| 5 | 1 |
+----+-----------+
I want merged results where I can see only unique results like
+-----+------------+
| id | mapped_ids |
+-----+------------+
| 1 | 2,9,5 |
| 3 | 7 |
| 4 | 10 |
| 6 | null |
| 8 | null |
+-----+------------+
I am trying different options but could not figure this out, is there way I can write joins to do this. I have to use the mapping table where associations are stored and identify unique records along with records which are not mapped anywhere.
My understanding is, you want to see all dump_table IDs that do not appear in the mapping_id column and then aggregate the mapped_ids for those that are left:
select d1.id,
array_agg(m1.mapped_id order by m1.mapped_id) filter (where m1.mapped_id is not null) as mapped_ids
from dump_table d1
left join mapping_table m1 using (id)
where not exists (select *
from mapping_table m2
where m2.mapped_id = d1.id)
group by d1.id;
Online example: https://rextester.com/JQZ17650
Try something like this:
SELECT id, name, ARRAY_AGG(mapped_id) AS mapped_ids
FROM table1 AS t1
LEFT JOIN table2 AS t2 USING (id)
GROUP BY id, name

PostgreSQL 9.3:Updating table(order column) from another table, getting same values in rows

I need help with updating table from another table in Postgres Db.
Long story short we ended up with corrupted data in db, and now I need to update one table with values from another.
I have table with this data table wfc:
| step_id | command_id | commands_order |
|---------|------------|----------------|
| 1 | 1 | 0 |
| 1 | 2 | 1 |
| 1 | 3 | 2 |
| 1 | 4 | 3 |
| 1 | 1 | 0 |
| 2 | 2 | 0 |
| 2 | 3 | 1 |
| 2 | 3 | 1 |
| 2 | 4 | 3 |
and I want to update values in command_order column from another table, so I can have result like this:
| step_id | command_id | commands_order|
|---------|------------|---------------|
| 1 | 1 | 0 |
| 1 | 2 | 1 |
| 1 | 3 | 2 |
| 1 | 4 | 3 |
| 1 | 1 | 4 |
| 2 | 2 | 0 |
| 2 | 3 | 1 |
| 2 | 3 | 2 |
| 2 | 4 | 3 |
It was looking like easy task, but problem is to update rows for same command_id, it is writing same value in commands_order
SQL that I tried is:
UPDATE wfc
SET commands_order = CAST(sq.input_step_id as INTEGER)
FROM (
SELECT wfp.step_id, wfp.input_command_id, wfp.input_step_id
from wfp
order by wfp.step_id, wfp.input_step_id
) AS sq
WHERE (wfc.step_id=sq.step_id AND wfc.command_id=CAST(sq.input_command_id as INTEGER));
SQL Fiddle http://sqlfiddle.com/#!15/4efff4/4
I am pretty stuck with this, please help.
Thanks in advance.
Assuming you are trying to number the rows in the order in which they were created, and as long as you understand that ctid will chnage on update and with VACCUUM FULL, you can do the following:
select step_id, command_id, rank - 1 as command_order
from (select step_id, command_id, ctid as wfc_ctid, rank() over
(partition by step_id order by ctid)
from wfc) as wfc_ordered;
This will give you the wfc table with the ordering that you want. If you do update the original table, the ctids will change, so it's probably safer to create a copy of the table with the above query.