How to group by counted rows in Postgres? - postgresql

If I have a table:
id | status
----+--------
2 | 200
1 | 0
4 | 100
3 | 200
5 | 200
I want to count the number of occurrences of each status. I have tried to use COUNT(*) with an OVER clause:
SELECT status, COUNT(*) OVER () AS all, COUNT(*) OVER (PARTITION by status) as count FROM my_table;
The results are as expected per the Postgres docs on window functions: "However, window functions do not cause rows to become grouped into a single output row like non-window aggregate calls would. Instead, the rows retain their separate identities."
status | all | count
--------+-------+-------
0 | 5 | 1
100 | 5 | 1
200 | 5 | 3
200 | 5 | 3
200 | 5 | 3
How can I instead get an output that combines the rows, so that I only get one row per unique status (if the partition is even required)?
status | all | count
--------+-------+-------
0 | 5 | 1
100 | 5 | 1
200 | 5 | 3

No window function is necessary in the first stage of the query, i.e. getting the counts per status. Window functions operate on the result of the non-window part of the query, so a window function can refer to both aggregate and non-aggregate columns. To get all_count, it is sufficient to SUM the per-status counts over all rows; the window runs over the grouped result, so it adds up the group counts (1 + 1 + 3 = 5) rather than re-counting base rows.
SELECT
  status
, COUNT(*)              AS status_count
, SUM(COUNT(*)) OVER () AS all_count
FROM my_table
GROUP BY status;

Related

How to get the change number?

How can I increase a value when the source value changes? I have tried rank, dense_rank, and row_number without success =(
id | src | how to get this?
---+-----+-----------------
 1 |   1 | 1
 2 |   1 | 1
 3 |   7 | 2
 4 |   1 | 3
 5 |   3 | 4
 6 |   3 | 4
 7 |   1 | 5
NOTICE: src is guaranteed to be in the order you see.
Is there a simple way to do this?
You can achieve this by nesting two window functions: the first detects whether the src value changed from the previous row, the second sums up the number of changes. Unfortunately Postgres doesn't allow nesting window functions directly, but you can work around that with a subquery. Note that IS DISTINCT FROM also treats the NULL that lag() returns for the first row as a change, so the running sum starts at 1:
SELECT
  id,
  src,
  sum(incr) OVER (ORDER BY id)
FROM (
  SELECT
    *,
    (lag(src) OVER (ORDER BY id) IS DISTINCT FROM src)::int AS incr
  FROM example
) AS _;
(online demo)

PostgreSQL: for each row, generate new rows and merge

I have a table called example that looks as follows:
ID | MIN | MAX |
1 | 1 | 5 |
2 | 34 | 38 |
I need to take each ID and loop from its min to max, incrementing by 2, and thus get the following WITHOUT using INSERT statements, i.e. in a single SELECT:
ID | INDEX | VALUE
1 | 1 | 1
1 | 2 | 3
1 | 3 | 5
2 | 1 | 34
2 | 2 | 36
2 | 3 | 38
Any ideas of how to do this?
The set-returning function generate_series does exactly that:
SELECT
  id,
  generate_series(1, (max - min) / 2 + 1) AS index,
  generate_series(min, max, 2) AS value
FROM example;
(online demo)
The index can alternatively be generated with RANK() (example; see also a_horse_with_no_name's answer) if you don't want to rely on the two set-returning functions advancing in parallel.
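For reference, a minimal sketch of what that RANK()-based variant could look like, with the generated values kept in a subquery (this is an illustration, not the linked example; column names are those of the example table):
SELECT
  id,
  RANK() OVER (PARTITION BY id ORDER BY value) AS index,
  value
FROM (
  SELECT id, generate_series(min, max, 2) AS value
  FROM example
) AS s;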
Use generate_series() to generate the numbers and a window function to calculate the index (the function call in the FROM clause can refer to e.min and e.max because set-returning functions there act as implicit LATERAL joins):
select e.id,
row_number() over (partition by e.id order by g.value) as index,
g.value
from example e
cross join generate_series(e.min, e.max, 2) as g(value);

Take new columns as output table - KDB

I have a query that returns a table of results and runs on a frequent basis. The new table will contain the results of the old table as well, but I only want to take whatever is new in the most recent run and send that as an email. I already have the line for the email and trade data, but I need a way to:
display the results of the new table to be emailed
save the complete results of the new table to be used in the next run of the query
e.g.
Old results: tbl
| idx | name | age |
| 0 | Tom | 30 |
| 1 | Jerry | 25 |
| 2 | Bob | 30 |
| 3 | Ken | 45 |
New results: tbl
| idx | name | age |
| 0 | Tom | 30 |
| 1 | Jerry | 25 |
| 2 | Bob | 30 |
| 3 | Ken | 45 |
| 4 | Sam | 40 |
output required:
| 4 | Sam | 40 |
and then save the New results to be used in the next run
Thanks! :)
If the only change between runs is that records are appended onto the new table, you could just keep a variable denoting the last index seen and then select only those rows where idx is larger than that.
If the indexes are always increasing, this could be achieved using a query like
lastidx:exec last idx from tbl
select from tbl where idx>lastidx
If the idx values don't always increase monotonically, you could instead keep a count of the number of rows and only select the rows past that point:
lasti:count tbl
select from tbl where i>=lasti
This doesn't require saving the whole table in memory for use in the next iteration.
E.g. to start with, the old table had 4 rows, so lasti = 4
q)tbl
idx name age
-------------
0 Tom 30
1 Jerry 25
2 Bob 30
3 Ken 45
q)lasti
4
The new table comes in and running the command selects the new row
q)tbl
idx name age
-------------
0 Tom 30
1 Jerry 25
2 Bob 30
3 Ken 45
4 Sam 40
q)select from tbl where i>=lasti
idx name age
------------
4 Sam 40
lasti can then be updated to reflect the new count
q)lasti:count tbl
q)lasti
5
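For a scheduled job, a minimal sketch of how this could be wrapped up (the function name getNew is hypothetical, and lasti is assumed to have been initialised, e.g. with lasti:count tbl, before the first run): it returns only the rows appended since the previous run and moves lasti forward for the next one.
/ hypothetical helper: grab rows appended since the last run, then advance lasti
getNew:{[] d:select from tbl where i>=lasti; lasti::count tbl; d}
newRows:getNew[]    / rows to e-mail this run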
One way you can get this done, assuming idx is the unique key:
q)old:([] idx:0 1 2 3; name:`T`J`B`K; age: 30 25 30 45)
q)new:old,enlist `idx`name`age!(4; `S;40) //new output from your query
q)out:()
q)if[0<count i:new[`idx] except old[`idx] ; out:new where new[`idx] in i ; old:new]
q)out
idx name age
------------
4 S 40
Another way, if new records are always appended after the old records (the negative take (i-c)#new grabs the last c-i rows, i.e. the newly appended ones):
q)old:([] idx:0 1 2 3; name:`T`J`B`K; age: 30 25 30 45)
q)i:count old
q)new:old,enlist `idx`name`age!(4; `S;40) //new output from your query
q)out:()
q)if[i<c:count new ; out:(i-c)#new ; old:new; i:c]
q)out
idx name age
------------
4 S 40

Select rows that satisfy a certain group condition in psql

Given the following table:
id | value
---+---------
1 | 1
1 | 0
1 | 3
2 | 1
2 | 3
2 | 5
3 | 2
3 | 1
3 | 0
3 | 1
I want the following table:
id | value
---+---------
1 | 1
1 | 0
1 | 3
3 | 2
3 | 1
3 | 0
3 | 1
The result should contain only the ids that have a minimum value of 0.
I have tried using EXISTS and HAVING but without success.
try this :
select * from foo where id in (SELECT id FROM foo GROUP BY id HAVING MIN(value) = 0)
or this (with window functions):
select * from
(select *,min(value) over (PARTITION BY id) min_by_id from foo) a
where min_by_id=0
If I'm understanding correctly, it's a fairly simple having clause:
=# SELECT id, MIN(value), MAX(value) FROM foo GROUP BY id HAVING MIN(value) = 0;
id | min | max
----+-----+-----
1 | 0 | 3
3 | 0 | 2
(2 rows)
Did I miss something that is making it more complicated?
It looks like it is not possible to use a window function in WHERE or HAVING. Below is a solution based on a self-join:
JOIN every row with all rows of the same id.
Filter based on the second set.
Show the result from the first set.
The SQL looks like this:
SELECT a.*
FROM a_table AS a
INNER JOIN a_table AS b ON a.id = b.id
WHERE b.value = 0;
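Since the question mentions trying EXISTS, here is a minimal sketch of that formulation as well (using the table name foo from the first answer); it keeps duplicate rows of a qualifying id, matching the desired output:
SELECT *
FROM foo f
WHERE EXISTS (
  SELECT 1
  FROM foo g
  WHERE g.id = f.id
    AND g.value = 0
);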

Count number of rows with distinct value in column in T-SQL

In T-SQL, how can I query this table to show me record counts based on how many times a distinct value appears in a column?
For example, I have a table defined as:
ControlSystemHierarchy
----------------------
ParentDeviceID int
ChildDeviceID int
Instrument bit
I want to display the number of records that match each distinct ParentDeviceID in the table so that this table
ParentDeviceID | ChildDeviceID | Instrument
1 | 1 | 0
1 | 2 | 0
1 | 2 | 1
2 | 3 | 0
would return
ParentDeviceID | Count
1 | 3
2 | 1
select ParentDeviceID, count(*) as [Count]
from ControlSystemHierarchy
group by ParentDeviceID
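If you instead want that count attached to every row rather than one row per ParentDeviceID, a windowed variant is also possible; a sketch over the same table, with no grouping:
select ParentDeviceID, ChildDeviceID, Instrument,
       count(*) over (partition by ParentDeviceID) as [Count]
from ControlSystemHierarchy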