Break on Column, Compute sum in postgres - postgresql

I am trying to convert an Oracle query to Postgres, and I am struggling with the following.
Example :
RESULTS FROM FIRST QUERY (used as input for my next query): select sourcename, type, count(type), sum(size) from table group by sourcename, type;
 sourcename | Type  | count |   sum
------------+-------+-------+----------
 A          | TYPE1 |    21 | 10485378
 B          | TYPE1 |    12 |  5241177
 C          | TYPE1 |    12 |  5242254
 D          | TYPE1 |    12 |  5570560
 A          | TYPE2 |    11 |  5239645
 B          | TYPE2 |    12 |  5570560
 C          | TYPE2 |    12 |  5241862
 D          | TYPE2 |    11 |  5570560
OUTPUT I NEED:
 sourcename | Type  | count |        sum
------------+-------+-------+-------------------
 A          | TYPE1 |    21 | 10485378
            | TYPE2 |    11 |  5239645
 TOTAL      |       |    32 | sum(values above)
 B          | TYPE1 |    12 |  5241177
            | TYPE2 |    12 |  5570560
 TOTAL      |       |    24 | sum(values above)
 C          | TYPE1 |    12 |  5242254
            | TYPE2 |    12 |  5241862
 TOTAL      |       |    24 | sum(values above)
 D          | TYPE1 |    12 |  5570560
            | TYPE2 |    11 |  5570560
 TOTAL      |       |    23 | sum(values above)
NOTE: I have used WITH ... AS to get the totals, but displaying them between each sourcename (A, B, C, D) is something I am not able to do, and I am also not able to suppress the repeated sourcename in the results. In Oracle this would be break on sourcename; is there a Postgres equivalent? Likewise, compute sum ... on column is Oracle SQL*Plus, and I could not find a Postgres equivalent. I finally found a way to get the desired results using both shell and psql, but I would like to know whether there is a better way that avoids the shell step. Any help is much appreciated. I am using psql 9.1.3.
I am new to this forum, so if my table results are not aligned for viewing, let me know and I will try to set it right.
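Since 9.1 predates GROUPING SETS and ROLLUP (they arrived in 9.5), here is a hedged sketch of a pure-SQL approach: compute the detail rows and the per-source totals separately, then UNION ALL them with a sort key so each TOTAL row lands directly under its group. The table name t is a stand-in for the real table; blanking the repeated sourcename (Oracle's break on) is best left to the client, or can be done with a lag() window function over the same ordering.

SELECT CASE WHEN ord = 0 THEN sourcename ELSE 'TOTAL' END AS sourcename,
       type, count, sum
FROM (
    -- detail rows, one per (sourcename, type)
    SELECT sourcename, type, count(type) AS count, sum(size) AS sum, 0 AS ord
    FROM t
    GROUP BY sourcename, type
    UNION ALL
    -- per-source totals; ord = 1 sorts them after the detail rows
    SELECT sourcename, NULL, count(type), sum(size), 1
    FROM t
    GROUP BY sourcename
) s
ORDER BY s.sourcename, s.ord, s.type;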

Related

postgres sum different table columns from many to one joined data

Suppose I have the following two tables:
foo:
| id | goober | value |
|----|--------|-------|
| 1  | a1     | 25    |
| 2  | a1     | 125   |
| 3  | b2     | 500   |
bar:
| id | foo_id | value |
|----|--------|-------|
| 1  | 1      | 4     |
| 2  | 3      | 19    |
| 3  | 3      | 42    |
| 4  | 3      | 22    |
| 5  | 3      | 56    |
Note the n:1 relationship of bar.foo_id : foo.id.
My goal is to sum the value columns of foo and bar, joining on bar.foo_id = foo.id and grouping by foo.goober, and then to perform a calculation on the two sums if possible, though that part is not critical.
Resulting in a final output looking something like:
| goober | foo_value_sum | bar_value_sum | foo_bar_diff |
|--------|---------------|---------------|--------------|
| a1     | 150           | 4             | 146          |
| b2     | 500           | 139           | 361          |
This should be rather simple with the following query, which creates two CTEs and joins them afterwards; aggregating bar in its own CTE avoids inflating the foo sums across the n:1 join:
with bar_agg as
(
    select foo.goober
          ,sum(bar.value) as bar_value_sum
    from foo
    join bar
      on bar.foo_id = foo.id
    group by foo.goober
)
,foo_agg as
(
    select foo.goober
          ,sum(foo.value) as foo_value_sum
    from foo
    group by foo.goober
)
select foo.goober
      ,foo_value_sum
      ,bar_value_sum
      ,foo_value_sum - bar_value_sum as foo_bar_diff
from foo_agg foo
left join bar_agg bar
  on bar.goober = foo.goober
order by foo.goober;
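One hedged refinement to the final select: with the left join, a goober that has no bar rows at all (none in this sample, but possible) comes back with a NULL bar_value_sum, which also turns foo_bar_diff into NULL. Wrapping the sum in coalesce keeps both columns numeric:

select foo.goober
      ,foo_value_sum
      ,coalesce(bar_value_sum, 0) as bar_value_sum
      ,foo_value_sum - coalesce(bar_value_sum, 0) as foo_bar_diff
from foo_agg foo
left join bar_agg bar
  on bar.goober = foo.goober
order by foo.goober;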

postgres sql : getting unified rows

I have one table where I dump all records from different sources (x, y, z) like below
+----+--------+
| id | source |
+----+--------+
| 1  | x      |
| 2  | y      |
| 3  | x      |
| 4  | x      |
| 5  | y      |
| 6  | z      |
| 7  | z      |
| 8  | x      |
| 9  | z      |
| 10 | z      |
+----+--------+
Then I have one mapping table where I map values between sources based on my use case, like below
+----+-----------+
| id | mapped_id |
+----+-----------+
| 1  | 2         |
| 1  | 9         |
| 3  | 7         |
| 4  | 10        |
| 5  | 1         |
+----+-----------+
I want merged results where I see only unique records, like
+----+------------+
| id | mapped_ids |
+----+------------+
| 1  | 2,9,5      |
| 3  | 7          |
| 4  | 10         |
| 6  | null       |
| 8  | null       |
+----+------------+
I am trying different options but could not figure this out. Is there a way I can write joins to do this? I have to use the mapping table, where the associations are stored, and identify unique records along with the records that are not mapped anywhere.
My understanding is that you want to see all dump_table IDs that do not appear in the mapped_id column, and then aggregate the mapped_ids for those that are left:
select d1.id,
       array_agg(m1.mapped_id order by m1.mapped_id)
         filter (where m1.mapped_id is not null) as mapped_ids
from dump_table d1
left join mapping_table m1 using (id)
where not exists (select *
                  from mapping_table m2
                  where m2.mapped_id = d1.id)
group by d1.id;
Online example: https://rextester.com/JQZ17650
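One wrinkle worth flagging: the desired output lists 5 under id 1 (because 5 maps to 1), and the query above cannot produce that, since the join only follows mappings in one direction and the NOT EXISTS would in fact drop id 1 altogether. A hedged sketch that treats each mapping row as bidirectional and keeps the smallest id of each linked group (sufficient for the sample data; chains deeper than one hop would need a recursive CTE, and FILTER requires PostgreSQL 9.4+):

with edges as (
    select id, mapped_id from mapping_table
    union
    select mapped_id, id from mapping_table
)
select d.id,
       array_agg(e.mapped_id) filter (where e.mapped_id is not null) as mapped_ids
from dump_table d
left join edges e on e.id = d.id
where not exists (select 1
                  from edges e2
                  where e2.mapped_id = d.id
                    and e2.id < d.id)
group by d.id
order by d.id;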
Try something like this (note that it does not exclude ids that themselves appear as a mapped_id):
SELECT id, ARRAY_AGG(mapped_id) AS mapped_ids
FROM dump_table AS d
LEFT JOIN mapping_table AS m USING (id)
GROUP BY id;

In PostgreSQL, how do you find an aggregate based on a time range

For example, say I have a database table of transactions done over the counter, and I would like to find out whether there was any time that was extremely busy (more than 10 transactions processed within a span of 10 minutes). How would I go about querying that? Could I aggregate based on time ranges and count the number of transaction ids within each range?
Adding example to clarify my input and desired output:
+----+--------------------+
| Id | register_timestamp |
+----+--------------------+
| 25 | 08:10:50           |
| 26 | 09:07:36           |
| 27 | 09:08:06           |
| 28 | 09:08:35           |
| 29 | 09:12:08           |
| 30 | 09:12:18           |
| 31 | 09:12:44           |
| 32 | 09:15:29           |
| 33 | 09:15:47           |
| 34 | 09:18:13           |
| 35 | 09:18:42           |
| 36 | 09:20:33           |
| 37 | 09:20:36           |
| 38 | 09:21:04           |
| 39 | 09:21:53           |
| 40 | 09:22:23           |
| 41 | 09:22:42           |
| 42 | 09:22:51           |
| 43 | 09:28:14           |
+----+--------------------+
Desired output would be something like:
+-------+----------+
| Count | Min      |
+-------+----------+
| 1     | 08:10:50 |
| 3     | 09:07:36 |
| 7     | 09:12:08 |
| 8     | 09:20:33 |
+-------+----------+
How about this:
SELECT c, time
FROM (
    SELECT count(*) AS c, min(time) AS time
    FROM transactions
    GROUP BY floor(extract(epoch from time)/600)
) AS sub
WHERE c > 10;
This finds every ten-minute interval in which more than ten transactions occurred. It assumes that the table is called transactions and that it has a column called time where the timestamp is stored.
Thanks to redneb, I ended up with the following query:
SELECT count(*) AS c, min(register_timestamp) AS register_timestamp
FROM trak_participants_data
GROUP BY floor(extract(epoch from register_timestamp)/600)
ORDER BY register_timestamp;
It works closely enough for me to tell which time chunks are the busiest at the counter.
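A hedged caveat on the epoch/600 bucketing in both queries above: the buckets are fixed ten-minute windows aligned to the epoch, so a burst that straddles a boundary (say 09:08 to 09:14) gets split and may never exceed the threshold in either bucket. If a true sliding ten-minute window matters, a range-framed window function can count the trailing ten minutes at every row, though RANGE with an interval offset needs PostgreSQL 11 or later:

-- for each transaction, count how many fell in the preceding ten minutes
SELECT id, register_timestamp,
       count(*) OVER (ORDER BY register_timestamp
                      RANGE BETWEEN interval '10 minutes' PRECEDING
                                AND CURRENT ROW) AS c
FROM trak_participants_data;

Any row with c > 10 then marks the end of an extremely busy stretch.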

Subtract fields of a column - Tableau

I would like to subtract promoters and detractors in Tableau by creating a new column. Thanks for all the help!
Customer Type Table (I would like to create the NPS field as shown below):
+---------+------------+----------+-----------+--------------+
| Quarter | Detractors | Passives | Promoters | NPS |
+---------+------------+----------+-----------+--------------+
| Q1 15   | 40.56      | 23.56    | 35.79     | =35.79-40.56 |
| ...     | ...        | ...      | ...       | ...          |
+---------+------------+----------+-----------+--------------+
Simply create a calculated field (called NPS):
[Promoters] - [Detractors]
This will add a new field called NPS to every row of your partition.
Check out the Tableau online help on calculated fields - this is a skill well worth learning.
I understand the OP's question. The data comes in like this:
+---------+---------------+-------+
| Quarter | Customer Type | Score |
+---------+---------------+-------+
| Q1 15   | Detractors    | 25    |
| Q1 15   | Promoters     | 32    |
| Q1 15   | Passives      | 45    |
| Q1 15   | Detractors    | 17    |
| Q1 15   | Detractors    | 28    |
| ...     | ...           | ...   |
+---------+---------------+-------+
And when brought into Tableau, the [Customer Type] field is put on the Columns shelf, which arranges the data like the table below. The OP wants to calculate the [NPS] column (Promoters - Detractors).
+---------+------------+----------+-----------+--------------+
| Quarter | Detractors | Passives | Promoters | NPS |
+---------+------------+----------+-----------+--------------+
| Q1 15   | 40.56      | 23.56    | 35.79     | =35.79-40.56 |
| ...     | ...        | ...      | ...       | ...          |
+---------+------------+----------+-----------+--------------+
I hope this clarifies. I am stuck with a similar situation (I want a column that shows the difference between 2015 and 2016):
+---------+------+------+------------+
| Measure | 2015 | 2016 | Difference |
+---------+------+------+------------+
| # Hires | 100  | 115  | 15         |
| # Terms | 9    | 6    | 3          |
+---------+------+------+------------+
I believe the steps are similar. I hope someone can help.
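For what it's worth, a hedged sketch of a single calculated field that handles the tall data shape above without relying on the Columns shelf (field names taken from the OP's table; an IF with no ELSE returns null, which SUM ignores):

// NPS: promoter score minus detractor score, per partition
SUM(IF [Customer Type] = 'Promoters' THEN [Score] END)
- SUM(IF [Customer Type] = 'Detractors' THEN [Score] END)

The 2015-vs-2016 case follows the same pattern, with the condition testing the year field instead of [Customer Type].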

PostgreSQL simple count query

Trying to scale this down so the answer is simple. I can probably extrapolate the answers here to apply to a bigger data set.
Given the following table:
+------+-----+
| name | age |
+------+-----+
| a    | 5   |
| b    | 7   |
| c    | 8   |
| d    | 8   |
| e    | 10  |
+------+-----+
I want to make a table that shows the count of people whose age is equal to or greater than x. For instance, the table above would produce:
+--------------+-------+
| at least age | count |
+--------------+-------+
| 5            | 5     |
| 6            | 4     |
| 7            | 4     |
| 8            | 3     |
| 9            | 1     |
| 10           | 1     |
+--------------+-------+
Is there a single query that can accomplish this task? Obviously, it is easy to write a simple function for it, but I'm hoping to be able to do this quickly with one query.
Thanks!
Yes, what you're looking for is a window function.
with cte_age_count as (
    select age,
           count(*) as c_star
    from people
    group by age
)
select age as at_least_age,
       -- running total from the oldest age down gives "people at least this age"
       sum(c_star) over (order by age desc
                         range between unbounded preceding
                                   and current row) as count
from cte_age_count
order by at_least_age;
Not syntax checked ... let me know if it works!
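If the output should also include ages that never occur in the table (6 and 9 in the example), a hedged variant with generate_series fills the gaps, assuming the table and columns are named as in the question:

select s.at_least_age,
       count(p.age) as count   -- counts the people matched by the join
from generate_series((select min(age) from people),
                     (select max(age) from people)) as s(at_least_age)
left join people p on p.age >= s.at_least_age
group by s.at_least_age
order by s.at_least_age;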