Query table with array_agg of ALL previous positions, excluding current position - postgresql

I have a database table with:
id | date | position | name
--------------------------------------
1 | 2016-06-29 | 9 | Ben Smith
2 | 2016-06-29 | 1 | Ben Smith
3 | 2016-06-29 | 5 | Ben Smith
4 | 2016-06-29 | 6 | Ben Smith
5 | 2016-06-30 | 2 | Ben Smith
6 | 2016-06-30 | 2 | Tom Brown
7 | 2016-06-29 | 4 | Tom Brown
8 | 2016-06-30 | 2 | Tom Brown
9 | 2016-06-30 | 1 | Tom Brown
How can I query the table efficiently so that I can get a new column using array_agg()?
I have already tried the following query; however, it is incredibly slow and also wrong, as it doesn't group the previous_positions by the name column:
SELECT
    j.*,
    (SELECT array_agg(id) FROM jockeys j2 WHERE j2.id < j.id)
FROM jockeys j
I expect the table output to look like this:
id | date | position | name | previous_positions
----------------------------------------------------------
1 | 2016-06-29 | 9 | Ben Smith | {}
2 | 2016-06-29 | 1 | Ben Smith | {9}
3 | 2016-06-29 | 5 | Ben Smith | {9,1}
4 | 2016-06-29 | 6 | Ben Smith | {9,1,5}
5 | 2016-06-30 | 2 | Ben Smith | {9,1,5,6}
6 | 2016-06-30 | 2 | Tom Brown | {}
7 | 2016-06-29 | 4 | Tom Brown | {2}
8 | 2016-06-30 | 2 | Tom Brown | {2,4}
9 | 2016-06-30 | 1 | Tom Brown | {2,4,2}

You can use a named window (the WINDOW clause) with array_agg:
SELECT
    j.*,
    array_agg(position) OVER w AS previous_positions
FROM jockeys j
WINDOW w AS (
    PARTITION BY name
    ORDER BY id
    ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
);
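To see what the frame "unbounded preceding and 1 preceding" does, here is a minimal sketch that runs the same window definition against the sample rows. It makes two assumptions that are not part of the answer: it uses Python's bundled SQLite (3.25 or newer) instead of PostgreSQL, and it uses group_concat as a stand-in for array_agg, which SQLite does not have. For the first row of each name the frame is empty; in PostgreSQL array_agg returns NULL in that case, so coalesce(array_agg(position) over w, '{}') would be one way to get the {} shown in the expected output.

```python
import sqlite3

# Sample rows copied from the question: (id, date, position, name).
rows = [
    (1, "2016-06-29", 9, "Ben Smith"),
    (2, "2016-06-29", 1, "Ben Smith"),
    (3, "2016-06-29", 5, "Ben Smith"),
    (4, "2016-06-29", 6, "Ben Smith"),
    (5, "2016-06-30", 2, "Ben Smith"),
    (6, "2016-06-30", 2, "Tom Brown"),
    (7, "2016-06-29", 4, "Tom Brown"),
    (8, "2016-06-30", 2, "Tom Brown"),
    (9, "2016-06-30", 1, "Tom Brown"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jockeys (id INTEGER PRIMARY KEY, date TEXT, position INTEGER, name TEXT)"
)
conn.executemany("INSERT INTO jockeys VALUES (?, ?, ?, ?)", rows)

# Assumption: group_concat replaces PostgreSQL's array_agg here.
# The frame covers every earlier row of the same name and stops one row
# before the current one, so the current position is excluded.
query = """
SELECT id, name,
       group_concat(position) OVER w AS previous_positions
FROM jockeys
WINDOW w AS (PARTITION BY name ORDER BY id
             ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
ORDER BY id
"""

# Map each id to its comma-separated list of earlier positions.
previous = {row[0]: row[2] for row in conn.execute(query)}
for row_id, positions in previous.items():
    print(row_id, positions)
```

Because the window is partitioned by name, Tom Brown's list starts over at id 6 instead of continuing Ben Smith's.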

Related

Running count of working days in Postgres

How can I get the number of productive days as below in Postgres?
In the sample data below, there are 29 days and 21 working days.
Productive days are the days when owner_name has a call.
+------------+-------------+---------------+-----------------+-------------+
| date | owner_name | working_days | productive_days | Total_calls |
+------------+-------------+---------------+-----------------+-------------+
| 11/29/2019 | James Jones | 21 | 18 | 78 |
| 11/28/2019 | James Jones | 20 | 17 | 725 |
| 11/27/2019 | James Jones | 19 | 16 | 424 |
| 11/26/2019 | James Jones | 18 | 15 | 42 |
| 11/25/2019 | James Jones | 17 | 14 | 98 |
| 11/24/2019 | James Jones | | | 0 |
| 11/23/2019 | James Jones | | | 0 |
| 11/22/2019 | James Jones | 16 | 13 | 55 |
| 11/21/2019 | James Jones | 15 | 12 | 142 |
| 11/20/2019 | James Jones | 14 | 11 | 346 |
| 11/18/2019 | James Jones | 12 | 10 | 47 |
| 11/15/2019 | James Jones | 11 | | 0 |
| 11/14/2019 | James Jones | 10 | | 0 |
| 11/13/2019 | James Jones | 9 | 9 | 754 |
| 11/12/2019 | James Jones | 8 | 8 | 78 |
| 11/11/2019 | James Jones | 7 | 7 | 74 |
| 11/10/2019 | James Jones | | | 0 |
| 11/9/2019 | James Jones | | | 0 |
| 11/8/2019 | James Jones | 6 | 6 | 78 |
| 11/7/2019 | James Jones | 5 | 5 | 75 |
| 11/6/2019 | James Jones | 4 | 4 | 74 |
| 11/5/2019 | James Jones | 3 | 3 | 424 |
| 11/4/2019 | James Jones | 2 | 2 | 424 |
| 11/3/2019 | James Jones | | | 0 |
| 11/2/2019 | James Jones | | | 0 |
| 11/1/2019 | James Jones | 1 | 1 | 24 |
+------------+-------------+---------------+-----------------+-------------+
With count() window function:
select *,
    case when "Total_calls" <> 0
        then count(case when "Total_calls" <> 0 then "Total_calls" end)
             over (partition by owner_name order by date)
    end as productive_days
from tablename
order by owner_name, date desc
Results:
| date | owner_name | working_days | Total_calls | productive_days |
| -----------| ----------- | ------------ | ----------- | --------------- |
| 2019-11-29 | James Jones | 21 | 78 | 18 |
| 2019-11-28 | James Jones | 20 | 725 | 17 |
| 2019-11-27 | James Jones | 19 | 424 | 16 |
| 2019-11-26 | James Jones | 18 | 42 | 15 |
| 2019-11-25 | James Jones | 17 | 98 | 14 |
| 2019-11-24 | James Jones | | 0 | |
| 2019-11-23 | James Jones | | 0 | |
| 2019-11-22 | James Jones | 16 | 55 | 13 |
| 2019-11-21 | James Jones | 15 | 142 | 12 |
| 2019-11-20 | James Jones | 14 | 346 | 11 |
| 2019-11-18 | James Jones | 12 | 47 | 10 |
| 2019-11-15 | James Jones | 11 | 0 | |
| 2019-11-14 | James Jones | 10 | 0 | |
| 2019-11-13 | James Jones | 9 | 754 | 9 |
| 2019-11-12 | James Jones | 8 | 78 | 8 |
| 2019-11-11 | James Jones | 7 | 74 | 7 |
| 2019-11-10 | James Jones | | 0 | |
| 2019-11-09 | James Jones | | 0 | |
| 2019-11-08 | James Jones | 6 | 78 | 6 |
| 2019-11-07 | James Jones | 5 | 75 | 5 |
| 2019-11-06 | James Jones | 4 | 74 | 4 |
| 2019-11-05 | James Jones | 3 | 424 | 3 |
| 2019-11-04 | James Jones | 2 | 424 | 2 |
| 2019-11-03 | James Jones | | 0 | |
| 2019-11-02 | James Jones | | 0 | |
| 2019-11-01 | James Jones | 1 | 24 | 1 |
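The running count can be checked on a handful of rows. The sketch below is only an illustration under a few assumptions: it uses Python's bundled SQLite (3.25 or newer) rather than PostgreSQL, the table name calls and the lower-case column total_calls are hypothetical, and only the first five days of the sample are loaded.

```python
import sqlite3

# First five days of the sample data: (date, owner_name, total_calls).
rows = [
    ("2019-11-01", "James Jones", 24),
    ("2019-11-02", "James Jones", 0),
    ("2019-11-03", "James Jones", 0),
    ("2019-11-04", "James Jones", 424),
    ("2019-11-05", "James Jones", 424),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (date TEXT, owner_name TEXT, total_calls INTEGER)")
conn.executemany("INSERT INTO calls VALUES (?, ?, ?)", rows)

# The inner CASE yields NULL for days without calls, and count() skips NULLs,
# so the window counts only productive days up to the current date.
# The outer CASE blanks the result on days that had no calls.
query = """
SELECT date,
       CASE WHEN total_calls <> 0
            THEN count(CASE WHEN total_calls <> 0 THEN 1 END)
                 OVER (PARTITION BY owner_name ORDER BY date)
       END AS productive_days
FROM calls
ORDER BY date
"""

productive = {row[0]: row[1] for row in conn.execute(query)}
for day, count in productive.items():
    print(day, count)
```

With an ORDER BY in the window and no explicit frame, the default frame runs from the start of the partition to the current row, which is what makes the count cumulative.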

Query table with sum of ALL previous positions, excluding current position

I have a database table with:
id | date | position | name
--------------------------------------
1 | 2016-06-29 | 9 | Ben Smith
2 | 2016-06-29 | 1 | Ben Smith
3 | 2016-06-29 | 5 | Ben Smith
4 | 2016-06-29 | 6 | Ben Smith
5 | 2016-06-30 | 2 | Ben Smith
6 | 2016-06-30 | 2 | Tom Brown
7 | 2016-06-29 | 4 | Tom Brown
8 | 2016-06-30 | 2 | Tom Brown
9 | 2016-06-30 | 1 | Tom Brown
How can I query the table efficiently so that I can get a new column using sum()?
I expect the table output to look like this:
id | date | position | name | races | wins | places
--------------------------------------------------------------
1 | 2016-06-29 | 9 | Ben Smith | 1 | 0 | 0
2 | 2016-06-29 | 1 | Ben Smith | 2 | 1 | 0
3 | 2016-06-29 | 5 | Ben Smith | 3 | 1 | 0
4 | 2016-06-29 | 6 | Ben Smith | 4 | 1 | 0
5 | 2016-06-30 | 2 | Ben Smith | 5 | 1 | 1
6 | 2016-06-30 | 2 | Tom Brown | 1 | 0 | 2
7 | 2016-06-29 | 4 | Tom Brown | 1 | 0 | 2
8 | 2016-06-30 | 2 | Tom Brown | 2 | 0 | 3
9 | 2016-06-30 | 1 | Tom Brown | 4 | 1 | 3
Looks like this can easily be done using window functions:
select id, date, position, name,
    row_number(*) over (partition by name, date order by id) as races,
    count(*) filter (where position = 1) over (partition by name, date) as wins
from the_table;
I don't understand the logic to calculate the places column though.
@FatFreddy @a_horse_with_no_name
Thanks for getting me started; this is what I came up with. Do you think it can be improved?
WITH runners AS (
    -- Flag each run as a win (1st) or a place (2nd or 3rd)
    SELECT
        r.*,
        CASE WHEN position = 1 THEN 1 ELSE 0 END AS win,
        CASE WHEN position IN (2, 3) THEN 1 ELSE 0 END AS place
    FROM runners r
)
SELECT
    date,
    r.id,
    r.position,
    name,
    row_number() OVER foo AS races,
    sum(win) OVER foo AS win,
    sum(place) OVER foo AS place
FROM runners r
LEFT JOIN markets m ON m.id = r.market_id
-- Running totals per name, in id order
WINDOW foo AS (PARTITION BY name ORDER BY r.id)
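As a rough check of the running-total idea, the sketch below applies one window, partitioned by name and ordered by id, to the sample rows from the question. It is an illustration under assumptions, not the poster's exact query: it runs on Python's bundled SQLite (3.25 or newer) instead of PostgreSQL, it reads from a single hypothetical jockeys table with no join to markets, and it treats positions 2 and 3 as a place, as in the query above.

```python
import sqlite3

# Sample rows copied from the question: (id, date, position, name).
rows = [
    (1, "2016-06-29", 9, "Ben Smith"),
    (2, "2016-06-29", 1, "Ben Smith"),
    (3, "2016-06-29", 5, "Ben Smith"),
    (4, "2016-06-29", 6, "Ben Smith"),
    (5, "2016-06-30", 2, "Ben Smith"),
    (6, "2016-06-30", 2, "Tom Brown"),
    (7, "2016-06-29", 4, "Tom Brown"),
    (8, "2016-06-30", 2, "Tom Brown"),
    (9, "2016-06-30", 1, "Tom Brown"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jockeys (id INTEGER PRIMARY KEY, date TEXT, position INTEGER, name TEXT)"
)
conn.executemany("INSERT INTO jockeys VALUES (?, ?, ?, ?)", rows)

# One named window gives all three running totals: rows are numbered and
# summed per name in id order, up to and including the current row.
query = """
SELECT id,
       row_number() OVER w AS races,
       sum(CASE WHEN position = 1 THEN 1 ELSE 0 END) OVER w AS wins,
       sum(CASE WHEN position IN (2, 3) THEN 1 ELSE 0 END) OVER w AS places
FROM jockeys
WINDOW w AS (PARTITION BY name ORDER BY id)
ORDER BY id
"""

# Map each id to (races, wins, places).
stats = {row[0]: row[1:] for row in conn.execute(query)}
for row_id, totals in stats.items():
    print(row_id, totals)
```

The totals here include the current row. To exclude it, as the title asks, the window could end at 1 preceding instead, at the cost of NULLs on each jockey's first row.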

how to migrate relational tables to dynamoDB table

I am new to DynamoDB. In my current project, I am trying to migrate most of our relational tables to DynamoDB, and I am facing a tricky scenario that I don't know how to solve.
In PostgreSQL, I have 2 tables:
Student
id | name | age | address | phone
---+--------+-----+---------+--------
1 | Alex | 18 | aaaaaa | 88888
2 | Tome | 19 | bbbbbb | 99999
3 | Mary | 18 | ccccc | 00000
4 | Peter | 20 | dddddd | 00000
Registration
id | class | student | year
---+--------+---------+---------
1 | A1 | 1 | 2018
2 | A1 | 3 | 2018
3 | A1 | 4 | 2017
4 | B1 | 2 | 2018
My query:
select s.id, s.name, s.age, s.address, s.phone
from Registration r inner join Student s on r.student = s.id
where r.class = 'A1' and r.year = '2018'
Result:
id | name | age | address | phone
---+--------+-----+---------+--------
1 | Alex | 18 | aaaaaa | 88888
3 | Mary | 18 | ccccc | 00000
So, how can I design the DynamoDB table to achieve this result and, by extension, to support CRUD operations?
Any advice is appreciated.
DynamoDB table design is going to depend largely on your access patterns. Without knowing the full requirements and queries needed by your app, it's not going to be possible to write a proper answer. But given your example here's a table design that might work:
studentId (partition key) | itemType (sort key; GSI partition key) | name | age | address | phone | year (GSI sort key)
----------+----------+--------+-----+---------+-------+------
1 | Details | Alex | 18 | aaaaaa | 88888 |
1 | Class_A1 | | | | | 2018
2 | Details | Tome | 19 | bbbbbb | 99999 |
2 | Class_B1 | | | | | 2018
3 | Details | Mary | 18 | ccccc | 00000 |
3 | Class_A1 | | | | | 2018
4 | Details | Peter | 20 | dddddd | 00000 |
4 | Class_A1 | | | | | 2017
Note the global secondary index with the partition key on the item type and the sort key on the year.
With this design we have a few query options:
1) Get student for a given id: GetItem(partitionKey: studentId, sortkey: Details)
2) Get all classes for a given student id: Query(partitionKey: studentId, sortkey: STARTS_WITH("Class"));
3) Get all students in class A1 and year 2018: Query(GSI partitionkey: "Class_A1", sortkey: equals(2018))
For global secondary indexes, the partition and sort keys don't need to be unique, so you can have many Class_A1, 2018 combinations. If you haven't already read the Best Practices for DynamoDB, I highly recommend reading it in full.
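The three access patterns can be walked through with a small in-memory stand-in for the table. This is only a sketch: it is plain Python, not the DynamoDB API, and the helper names (get_item, query_classes, query_gsi, students_in_class) are hypothetical. One thing it makes visible is that the index query returns the registration items, which carry no student attributes in this design, so each student's Details item still needs a second lookup.

```python
# In-memory stand-in for the table above. This is NOT the DynamoDB API;
# the list just holds the same items so the key lookups can be traced.
ITEMS = [
    {"studentId": 1, "itemType": "Details", "name": "Alex", "age": 18,
     "address": "aaaaaa", "phone": "88888"},
    {"studentId": 1, "itemType": "Class_A1", "year": 2018},
    {"studentId": 2, "itemType": "Details", "name": "Tome", "age": 19,
     "address": "bbbbbb", "phone": "99999"},
    {"studentId": 2, "itemType": "Class_B1", "year": 2018},
    {"studentId": 3, "itemType": "Details", "name": "Mary", "age": 18,
     "address": "ccccc", "phone": "00000"},
    {"studentId": 3, "itemType": "Class_A1", "year": 2018},
    {"studentId": 4, "itemType": "Details", "name": "Peter", "age": 20,
     "address": "dddddd", "phone": "00000"},
    {"studentId": 4, "itemType": "Class_A1", "year": 2017},
]


def get_item(student_id, item_type):
    """Pattern 1: look up one item by partition key plus sort key."""
    for item in ITEMS:
        if item["studentId"] == student_id and item["itemType"] == item_type:
            return item
    return None


def query_classes(student_id):
    """Pattern 2: all items of one student whose sort key starts with 'Class'."""
    return [
        item for item in ITEMS
        if item["studentId"] == student_id and item["itemType"].startswith("Class")
    ]


def query_gsi(item_type, year):
    """Pattern 3: index lookup with itemType as partition key and year as sort key."""
    return [
        item for item in ITEMS
        if item["itemType"] == item_type and item.get("year") == year
    ]


def students_in_class(class_name, year):
    """Resolve the registrations found via the index to the students' details."""
    registrations = query_gsi("Class_" + class_name, year)
    return [get_item(reg["studentId"], "Details") for reg in registrations]


names = [student["name"] for student in students_in_class("A1", 2018)]
print(names)
```

For class A1 and year 2018 this yields the same two students as the SQL join in the question.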

how to flatten rows to columns in postgreSQL

Using PostgreSQL 9.3, I have a table that shows individual permits issued across a single year:
permit_typ| zipcode| address| name
-------------+------+------+-----
CONSTRUCTION | 20004 | 124 fake streeet | billy joe
SUPPLEMENTAL | 20005 | 124 fake streeet | james oswald
POST CARD | 20005 | 124 fake streeet | who cares
HOME OCCUPATION | 20007 | 124 fake streeet | who cares
SHOP DRAWING | 20009 | 124 fake streeet | who cares
I am trying to flatten this so it looks like:
CONSTRUCTION | SUPPLEMENTAL | POST CARD| HOME OCCUPATION | SHOP DRAWING | zipcode
-------------+--------------+-----------+----------------+--------------+--------
1 | 2 | 3 | 5 | 6 | 20004
1 | 2 | 3 | 5 | 6 | 20005
1 | 2 | 3 | 5 | 6 | 20006
1 | 2 | 3 | 5 | 6 | 20007
1 | 2 | 3 | 5 | 6 | 20008
I have been trying to use crosstab, but it's a bit beyond my rusty SQL experience. Does anybody have any ideas?
I usually approach this type of query using conditional aggregation. In Postgres, you can do:
select zipcode,
    sum( (permit_typ = 'CONSTRUCTION')::int) as Construction,
    sum( (permit_typ = 'SUPPLEMENTAL')::int) as SUPPLEMENTAL,
    . . .
from t
group by zipcode;
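Here is a minimal sketch of the same conditional aggregation on the five sample rows. It assumes Python's bundled SQLite rather than PostgreSQL, so the boolean cast is written as a portable CASE expression, and the table name permits is hypothetical. Only three of the permit types are pivoted to keep it short.

```python
import sqlite3

# Sample rows copied from the question: (permit_typ, zipcode, address, name).
rows = [
    ("CONSTRUCTION", 20004, "124 fake streeet", "billy joe"),
    ("SUPPLEMENTAL", 20005, "124 fake streeet", "james oswald"),
    ("POST CARD", 20005, "124 fake streeet", "who cares"),
    ("HOME OCCUPATION", 20007, "124 fake streeet", "who cares"),
    ("SHOP DRAWING", 20009, "124 fake streeet", "who cares"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE permits (permit_typ TEXT, zipcode INTEGER, address TEXT, name TEXT)"
)
conn.executemany("INSERT INTO permits VALUES (?, ?, ?, ?)", rows)

# Each output column counts the rows of one permit type within a zipcode:
# the CASE contributes 1 for a matching row and 0 otherwise.
query = """
SELECT zipcode,
       sum(CASE WHEN permit_typ = 'CONSTRUCTION' THEN 1 ELSE 0 END) AS construction,
       sum(CASE WHEN permit_typ = 'SUPPLEMENTAL' THEN 1 ELSE 0 END) AS supplemental,
       sum(CASE WHEN permit_typ = 'POST CARD' THEN 1 ELSE 0 END) AS post_card
FROM permits
GROUP BY zipcode
ORDER BY zipcode
"""

# Map each zipcode to (construction, supplemental, post_card).
counts = {row[0]: row[1:] for row in conn.execute(query)}
for zipcode, per_type in counts.items():
    print(zipcode, per_type)
```

The list of output columns is fixed in the query text, so every permit type to be reported needs its own sum() line.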

SQL Select with root parent

I have a table Members(id, name, parent_id), where parent_id is the parent of the member (the parent is also a member, which can have its own parent). For example:
id | name | parent_id
----------------------
1 | John | NULL
2 | Smith| 1
3 | Andy | 1
4 | Joe | 2
5 | Rick | 2
6 | Craig| 5
7 | Greg | NULL
8 | Bob | 5
9 | Mike | 8
I'd like to run a SELECT statement on members and get:
id | name | parent_id | root_parent_id
--------------------------------------
1 | John | NULL | NULL
2 | Smith| 1 | 1
3 | Andy | 1 | 1
4 | Joe | 2 | 1
5 | Rick | 2 | 1
6 | Craig| 5 | 1
7 | Greg | NULL | NULL
8 | Bob | 7 | 7
9 | Mike | 8 | 7
I want to find the root_parent_id for all members, however deep the hierarchy goes. Please help.
with recursive recursive_members as (
    select *, id root_id, 1 depth
    from members
    union all
    select r.id, r.name, r.parent_id, m.parent_id, r.depth + 1
    from recursive_members r
    join members m on r.root_id = m.id
    where m.parent_id notnull
)
select distinct on (id) *
from recursive_members
order by id, depth desc;
id | name | parent_id | root_id | depth
----+-------+-----------+---------+-------
1 | John | | 1 | 1
2 | Smith | 1 | 1 | 2
3 | Andy | 1 | 1 | 2
4 | Joe | 2 | 1 | 3
5 | Rick | 2 | 1 | 3
6 | Craig | 5 | 1 | 4
7 | Greg | | 7 | 1
8 | Bob | 5 | 1 | 4
9 | Mike | 8 | 1 | 5
(9 rows)
Read about recursive WITH queries.
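For comparison, here is a sketch of a different recursive formulation, not the query above: it starts from the members without a parent and walks down the tree, carrying the root id along, so no DISTINCT ON step is needed. It assumes Python's bundled SQLite rather than PostgreSQL and uses the sample rows from the question, where Bob's parent is 5.

```python
import sqlite3

# Sample rows copied from the question: (id, name, parent_id).
rows = [
    (1, "John", None),
    (2, "Smith", 1),
    (3, "Andy", 1),
    (4, "Joe", 2),
    (5, "Rick", 2),
    (6, "Craig", 5),
    (7, "Greg", None),
    (8, "Bob", 5),
    (9, "Mike", 8),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER)")
conn.executemany("INSERT INTO members VALUES (?, ?, ?)", rows)

# Anchor part: every member without a parent is its own root.
# Recursive part: each child inherits the root_id of its parent,
# so every member is reached exactly once.
query = """
WITH RECURSIVE tree(id, root_id) AS (
    SELECT id, id FROM members WHERE parent_id IS NULL
    UNION ALL
    SELECT m.id, t.root_id
    FROM members m
    JOIN tree t ON m.parent_id = t.id
)
SELECT id, root_id FROM tree ORDER BY id
"""

roots = dict(conn.execute(query))
for member_id, root_id in roots.items():
    print(member_id, root_id)
```

As in the result table above, root members report their own id as root_id; a CASE on parent_id could turn that into NULL if the output from the question is wanted.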