Recycle ID numbers - postgresql

I have a table as follows
| GroupID | UserID |
|---------|--------|
| 1       | 1      |
| 1       | 2      |
| 1       | 3      |
| 2       | 1      |
| 2       | 2      |
| 3       | 20     |
| 3       | 30     |
| 5       | 200    |
| 5       | 100    |
Basically, this creates a "group" that user IDs get associated with, so when I want the members of a group I can query this table.
Users have the option of leaving a group and creating a new one.
When all users have left a group, that group ID no longer appears in my table.
Let's pretend this is for a chat application where users might close and open chats constantly: the group IDs will add up very quickly, but the number of chats will realistically never reach millions, each with hundreds of users.
I'd like to recycle the group ID numbers, such that when I go to insert a new record, if group 4 is unused (as is the case above), it gets assigned.

There are good reasons not to do this, but it's pretty straightforward in PostgreSQL. The technique of using generate_series() to find gaps in a sequence is useful in other contexts, too.
WITH group_id_range AS (
   SELECT generate_series((SELECT MIN(group_id) FROM groups),
                          (SELECT MAX(group_id) FROM groups)) AS group_id
)
SELECT min(gir.group_id)
FROM   group_id_range gir
LEFT   JOIN groups g ON gir.group_id = g.group_id
WHERE  g.group_id IS NULL;
That query will return NULL if there are no gaps or if there are no rows at all in the table "groups". If you want to use this to return the next group id number regardless of the state of the table "groups", use this instead.
WITH group_id_range AS (
   SELECT generate_series(
             COALESCE((SELECT MIN(group_id) FROM groups), 1),
             COALESCE((SELECT MAX(group_id) FROM groups), 1)
          ) AS group_id
)
SELECT COALESCE(min(gir.group_id), (SELECT MAX(group_id) + 1 FROM groups))
FROM   group_id_range gir
LEFT   JOIN groups g ON gir.group_id = g.group_id
WHERE  g.group_id IS NULL;
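A usage sketch, not from the original answer: one way the second query might drive an actual INSERT. Table and column names (groups, group_id, user_id) are assumed to match the answer, user id 42 is a placeholder, and the explicit lock is only one way to keep two sessions from picking the same gap at the same time.
BEGIN;
LOCK TABLE groups IN SHARE ROW EXCLUSIVE MODE;   -- serialize concurrent id pickers (assumption: this table only)

WITH group_id_range AS (
   SELECT generate_series(
             COALESCE((SELECT MIN(group_id) FROM groups), 1),
             COALESCE((SELECT MAX(group_id) FROM groups), 1)
          ) AS group_id
), next_id AS (
   SELECT COALESCE(min(gir.group_id), (SELECT MAX(group_id) + 1 FROM groups)) AS group_id
   FROM   group_id_range gir
   LEFT   JOIN groups g ON gir.group_id = g.group_id
   WHERE  g.group_id IS NULL
)
INSERT INTO groups (group_id, user_id)
SELECT group_id, 42              -- 42 = id of the user creating the group (placeholder)
FROM   next_id;

COMMIT;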

Related

Reordering the output of rollup where the subtotals are shown at the beginning or end of the respective category

We would like the subtotals from ROLLUP to show at the beginning/end of each respective category. Below is the query we tried against the following tables. However, COALESCE is not replacing the NULLs with "total": no NULLs and no "total" rows show up in the output, which suggests the ROLLUP did not work. Our hypothesis is that something is wrong with the structure of the query. We are only familiar with Postgres SQL. Thank you for any suggestions!
Tables
customers table:
| customer_id | customer_name | segment     |
|-------------|---------------|-------------|
| 1           | Bob           | Consumer    |
| 2           | Mary          | Corporate   |
| 3           | Bill          | Home Office |
| 4           | Kathy         | Consumer    |
products table:
| product_id       | category   | sub_category | product_name             |
|------------------|------------|--------------|--------------------------|
| FUR-ADV-10000002 | furniture  | furnishings  | Advantus Clock Ergonomic |
| FUR-BO-10000002  | furniture  | bookcases    | Bush Classic Bookcase    |
| TEC-BRO-10000348 | technology | copiers      | Brother Copy Machine     |
| TEC-EPS-10000053 | technology | machines     | Epson Printer Red        |
| OFF-AR-10000019  | office sup | art          | BIC Highlighters Blue    |
| OFF-EAT-10000522 | office sup | paper        | Eaton Comp Printout      |
orders table:
| order_id        | product_id       | customer_id | sales  |
|-----------------|------------------|-------------|--------|
| AE-2016-1308551 | FUR-ADV-10000002 | 1           | 82.67  |
| AE-2016-1308552 | FUR-BO-10000002  | 1           | 101.54 |
| AE-2016-1308553 | TEC-BRO-10000348 | 2           | 79.28  |
| AE-2016-1308554 | TEC-EPS-10000053 | 3           | 101.23 |
| AE-2016-1308555 | OFF-AR-10000019  | 4           | 39.78  |
Query:
SELECT COALESCE(products.category, 'total'),
       COALESCE(products.sub_category, 'total'),
       SUM(orders.sales) AS total_sales
FROM products
INNER JOIN orders
   ON products.product_id = orders.product_id,
   (SELECT products.category, products.sub_category, SUM(orders.sales)
    FROM orders
    INNER JOIN products
       ON orders.product_id = products.product_id
    GROUP BY ROLLUP (1, 2)) t
GROUP BY 1, 2
ORDER BY 3 desc;
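For reference, a minimal sketch (not from the original thread; column names taken from the tables above) of applying ROLLUP directly, without the extra derived table. NULLS LAST places each subtotal after its category's detail rows and the grand total at the very end:
SELECT COALESCE(p.category, 'total')     AS category
     , COALESCE(p.sub_category, 'total') AS sub_category
     , SUM(o.sales)                      AS total_sales
FROM   products p
JOIN   orders o ON o.product_id = p.product_id
GROUP  BY ROLLUP (p.category, p.sub_category)
ORDER  BY p.category NULLS LAST          -- grand total row last
        , p.sub_category NULLS LAST;     -- category subtotal after its detail rows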

Aggregating a table based on one column and then joining it with another table

I am working with the following two tables:
Table 1
Key |Clicks |Impressions
-------------+-------+-----------
USA-SIM-CARDS|55667 |544343
DE-SIM-CARDS |4563 |234829
AU-SIM-CARDS |3213 |232242
UK-SIM-CARDS |3213 |1333223
CA-SIM-CARDS |4321 |8883111
MX-SIM-CARDS |3193 |3291023
Table 2
Key |Conversions |Final Conversions|Active Sims
-----------------+------------+-----------------+-----------
USA-SIM-CARDS |456 |43 |4
USA-SIM-CARDS |65 |2 |1
UK-SIM-CARDS |123 |4 |3
UK-SIM-CARDS |145 |34 |5
The goal is to get the following output:
Key |Clicks |Impressions|Conversions|Final Conversions|Active Sims
-------------+-------+-----------+-----------+-----------------+-----------
USA-SIM-CARDS|55667 |544343 |521 |45 |5
DE-SIM-CARDS |4563 |234829 | | |
AU-SIM-CARDS |3213 |232242 | | |
UK-SIM-CARDS |3213 |1333223 |268 |38 |8
CA-SIM-CARDS |4321 |8883111 | | |
MX-SIM-CARDS |3193 |3291023 | | |
The most crucial part of this involves aggregating the second table based on conversions.
I imagine I would then execute this with an inner join.
Thank you.
Take this in two steps then:
1) Aggregate the second table:
SELECT Key, sum(Conversions) as Conversions, sum("Final Conversions") as FinalConversions, Sum("Active Sims") as ActiveSims FROM Table2 GROUP BY key
2) Use that as a subquery/derived table joining to your first table:
SELECT
t1.key,
t1.clicks,
t1.impressions,
t2.conversions,
t2.finalConversions,
t2.ActiveSims
From Table1 t1
LEFT OUTER JOIN (SELECT Key, sum(Conversions) as Conversions, sum("Final Conversions") as FinalConversions, Sum("Active Sims") as ActiveSims FROM Table2 GROUP BY Key) t2
ON t1.key = t2.key;
As an alternative, you could join first and then group, since there isn't any need to aggregate twice:
SELECT
t1.key,
t1.clicks,
t1.impressions,
sum(Conversions) as Conversions,
sum("Final Conversions") as FinalConversions,
Sum("Active Sims") as ActiveSims
From Table1 t1
LEFT OUTER JOIN table2 t2
ON t1.key = t2.key
GROUP BY t1.key, t1.clicks, t1.impressions
The only other important thing here is that we are using a LEFT OUTER JOIN, since we want all records from Table1 and any records from Table2 that match on the key.

How to get back aggregate values across 2 dimensions using Python Cubes?

Situation
Using Python 3, Django 1.9, Cubes 1.1, and Postgres 9.5.
These are my data tables, in text format:
Store table
------------------------------
| id | code | address |
|-----|------|---------------|
| 1 | S1 | Kings Row |
| 2 | S2 | Queens Street |
| 3 | S3 | Jacks Place |
| 4 | S4 | Diamonds Alley|
| 5 | S5 | Hearts Road |
------------------------------
Product table
------------------------------
| id | code | name |
|-----|------|---------------|
| 1 | P1 | Saucer 12 |
| 2 | P2 | Plate 15 |
| 3 | P3 | Saucer 13 |
| 4 | P4 | Saucer 14 |
| 5 | P5 | Plate 16 |
| and many more .... |
|1000 |P1000 | Bowl 25 |
|----------------------------|
Sales table
----------------------------------------
| id | product_id | store_id | amount |
|-----|------------|----------|--------|
| 1 | 1 | 1 |7.05 |
| 2 | 1 | 2 |9.00 |
| 3 | 2 | 3 |1.00 |
| 4 | 2 | 3 |1.00 |
| 5 | 2 | 5 |1.00 |
| and many more .... |
| 1000| 20 | 4 |1.00 |
|--------------------------------------|
The relationships are:
Sales belongs to Store
Sales belongs to Product
Store has many Sales
Product has many Sales
What I want to achieve
I want to use Cubes to display the data, with pagination, in the following manner:
Given the stores S1-S3:
-------------------------
| product | S1 | S2 | S3 |
|---------|----|----|----|
|Saucer 12|7.05|9 | 0 |
|Plate 15 |0 |0 | 2 |
| and many more .... |
|------------------------|
Note the following:
Even though there were no records in sales for Saucer 12 under Store S3, I displayed 0 instead of null or none.
I want to be able to sort by store, say in descending order for S3.
The cells show the SUM of the amounts spent on that particular product in that particular store.
I also want to have pagination.
What I tried
This is the configuration I used:
"cubes": [
    {
        "name": "sales",
        "dimensions": ["product", "store"],
        "joins": [
            {"master": "product_id", "detail": "product.id"},
            {"master": "store_id", "detail": "store.id"}
        ]
    }
],
"dimensions": [
    {"name": "product", "attributes": ["code", "name"]},
    {"name": "store", "attributes": ["code", "address"]}
]
This is the code I used:
result = browser.aggregate(
    drilldown=['Store', 'Product'],
    order=[("Product.name", "asc"), ("Store.name", "desc"), ("total_products_sale", "desc")]
)
I didn't get what I wanted.
Instead I got this:
----------------------------------------------
| product_id | store_id | total_products_sale |
|------------|----------|---------------------|
| 1 | 1 | 7.05 |
| 1 | 2 | 9 |
| 2 | 3 | 2.00 |
| and many more .... |
|---------------------------------------------|
which is the whole table with no pagination, and products not sold in a store do not show up as zero.
My question
How do I get what I want?
Do I need to create another data table that aggregates everything by store and product before I use cubes to run the query?
Update
I have read more and realised that what I want is called dicing, since I need to go across 2 dimensions. See: https://en.wikipedia.org/wiki/OLAP_cube#Operations
Cross-posted at Cubes GitHub issues to get more attention.
This is a pure SQL solution using crosstab() from the additional tablefunc module to pivot the aggregated data. It typically performs better than any client-side alternative. If you are not familiar with crosstab(), read this first:
PostgreSQL Crosstab Query
And this about the "extra" column in the crosstab() output:
Pivot on Multiple Columns using Tablefunc
SELECT product_id, product
     , COALESCE(s1, 0) AS s1  -- 1. ... displayed 0 instead of null
     , COALESCE(s2, 0) AS s2
     , COALESCE(s3, 0) AS s3
     , COALESCE(s4, 0) AS s4
     , COALESCE(s5, 0) AS s5
FROM   crosstab(
   'SELECT s.product_id, p.name, s.store_id, s.sum_amount
    FROM   product p
    JOIN  (
       SELECT product_id, store_id
            , sum(amount) AS sum_amount   -- 3. SUM total of product spent in store
       FROM   sales
       GROUP  BY product_id, store_id
       ) s ON p.id = s.product_id
    ORDER  BY s.product_id, s.store_id'
 , 'VALUES (1),(2),(3),(4),(5)'           -- desired store_id's
   ) AS ct (product_id int, product text  -- "extra" column
          , s1 numeric, s2 numeric, s3 numeric, s4 numeric, s5 numeric)
ORDER  BY s3 DESC;  -- 2. ... descending order for S3
Produces your desired result exactly (plus product_id).
To include products that have never been sold, replace the [INNER] JOIN in the inner query with a LEFT [OUTER] JOIN, as sketched below.
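A hedged sketch of that change, replacing only the quoted source query inside crosstab(). Note the switch to p.id: s.product_id is NULL for unsold products, which then surface as all-zero rows after the outer COALESCE.
SELECT p.id AS product_id, p.name, s.store_id, s.sum_amount
FROM   product p
LEFT   JOIN (
   SELECT product_id, store_id, sum(amount) AS sum_amount
   FROM   sales
   GROUP  BY product_id, store_id
   ) s ON p.id = s.product_id
ORDER  BY p.id, s.store_id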
SQL Fiddle with base query.
The tablefunc module is not installed on sqlfiddle.
Major points
Read the basic explanation in the reference answer for crosstab().
I am including product_id because product.name is hardly unique. This might otherwise lead to sneaky errors conflating two different products.
You don't need the store table in the query if referential integrity is guaranteed.
ORDER BY s3 DESC works, because s3 references the output column where NULL values have been replaced with COALESCE. Else we would need DESC NULLS LAST to sort NULL values last:
PostgreSQL sort by datetime asc, null first?
For building crosstab() queries dynamically consider:
Dynamic alternative to pivot with CASE and GROUP BY
I also want to have pagination.
That last item is fuzzy. Simple pagination can be had with LIMIT and OFFSET:
Displaying data in grid view page by page
I would consider a MATERIALIZED VIEW to materialize results before pagination. If you have a stable page size I would add page numbers to the MV for easy and fast results.
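Not part of the original answer, just a minimal sketch of that idea: materialize the pivoted result once, then page through it with LIMIT/OFFSET. The view name sales_pivot and the page size of 20 are made up.
CREATE MATERIALIZED VIEW sales_pivot AS
SELECT product_id, product
     , COALESCE(s1, 0) AS s1
     , COALESCE(s2, 0) AS s2
     , COALESCE(s3, 0) AS s3
     , COALESCE(s4, 0) AS s4
     , COALESCE(s5, 0) AS s5
FROM   crosstab(
   'SELECT s.product_id, p.name, s.store_id, s.sum_amount
    FROM   product p
    JOIN  (SELECT product_id, store_id, sum(amount) AS sum_amount
           FROM   sales
           GROUP  BY product_id, store_id) s ON p.id = s.product_id
    ORDER  BY s.product_id, s.store_id'
 , 'VALUES (1),(2),(3),(4),(5)')
   AS ct (product_id int, product text
        , s1 numeric, s2 numeric, s3 numeric, s4 numeric, s5 numeric);

-- page 3 with a page size of 20 rows, sorted by store S3
SELECT *
FROM   sales_pivot
ORDER  BY s3 DESC, product_id   -- tiebreaker keeps pages stable
LIMIT  20
OFFSET 40;

-- rerun after the base data changes
REFRESH MATERIALIZED VIEW sales_pivot;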
To optimize performance for big result sets, consider:
SQL syntax term for 'WHERE (col1, col2) < (val1, val2)'
Optimize query with OFFSET on large table

Postgresql select, show fixed count rows

Simple question. I have a table "tablename" with 3 rows. I need my select to show 5 rows even when the row count is < 5.
select * from tablename
+------------------+
|colname1 |colname2|
+---------+--------+
|1 |AAA |
|2 |BBB |
|3 |CCC |
+---------+--------+
This query shows all rows in the table.
But I need to show 5 rows; 2 of them will be empty.
For example (what I need):
+------------------+
|colname1 |colname2|
+---------+--------+
|1 |AAA |
|2 |BBB |
|3 |CCC |
| | |
| | |
+---------+--------+
The last 2 rows are empty.
Is this possible?
Something like this:
with num_rows (rn) as (
  select i
  from generate_series(1, 5) i -- adjust here the desired number of rows
), numbered_table as (
  select colname1,
         colname2,
         row_number() over (order by colname1) as rn
  from tablename
)
select t.colname1, t.colname2
from num_rows r
  left outer join numbered_table t on r.rn = t.rn;
This assigns a number to each row in tablename and joins that to a fixed set of row numbers. If you know that your values in colname1 are always sequential and without gaps (which is highly unlikely), you could join on colname1 directly and drop the row_number() generation in the second CTE.
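A minimal sketch of that simplification (not from the original answer; valid only under the stated assumption that colname1 itself runs 1..N without gaps):
select t.colname1, t.colname2
from generate_series(1, 5) as r(rn)   -- desired number of rows
  left join tablename t on t.colname1 = r.rn;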
If you don't care which rows of tablename are returned, you can leave out the order by in the row_number() call - but then the rows that are matched are arbitrary. Leaving out the order by is a bit more efficient.
The above will always return exactly 5 rows, regardless of how many rows tablename contains. If you want at least 5 rows (all rows of the table, padded up to 5), use a full outer join instead:
....
select t.colname1, t.colname2
from num_rows r
  full outer join numbered_table t on r.rn = t.rn;
SQLFiddle example: http://sqlfiddle.com/#!15/e5770/3

How to eliminate repeated field with GROUP BY clause?

I have 3 tables:
1. app_tenant (pk: id, fk: pasar_id)
----+------+----------+
 id | nama | pasar_id |
----+------+----------+
 1  | joe  | 1        |
 2  | adi  | 2        |
 3  | adam | 3        |
2. app_pasar (pk: id)
----+--------------+
 id | nama         |
----+--------------+
 1  | kosambi      |
 2  | gede bage    |
 3  | pasar minggu |
3. app_kios (pk: id, fk: tenant_id)
----+-------+-----------+
 id | nama  | tenant_id |
----+-------+-----------+
 1  | kios1 | 1         |
 2  | kios2 | 2         |
 3  | kios3 | 3         |
 4  | kios4 | 1         |
 5  | kios5 | 1         |
 6  | kios6 | 2         |
 7  | kios7 | 2         |
 8  | kios8 | 3         |
 9  | kios9 | 3         |
Then, with a LEFT JOIN query grouping by id in every table, I want to display data like this:
----+-------------+--------------+-----------
 id | nama_tenant | nama_pasar   | nama_kios
----+-------------+--------------+-----------
 1  | joe         | kosambi      | kios1
 2  | adi         | gede bage    | kios2
 3  | adam        | pasar minggu | kios3
but after I execute my query, the data is not shown as expected. The problem is
redundancy in the nama_tenant field. How can I eliminate the repeated nama_tenant records?
This is my query:
select a.id, a.nama as nama_tenant,
       b.nama as nama_pasar,
       c.nama as nama_kios
from app_tenant a
  left join app_pasar b on a.id = b.id
  left join app_kios c on a.id = c.tenant_id
group by
  a.id,
  b.id,
  c.id
Table definitions:
CREATE TABLE app_tenant (
  id serial PRIMARY KEY,
  nama character varying,
  pasar_id integer);
CREATE TABLE app_kios (
  id serial PRIMARY KEY,
  nama character varying,
  tenant_id integer REFERENCES app_tenant);
The problem is that tenants can have multiple kiosks. From your sample data it looks like you want to display the first kiosk of every tenant (although "first" is a vague concept for strings; here I use alphabetical sort order). Your query would then be like this:
SELECT t.id, t.nama AS nama_tenant, p.nama AS nama_pasar, k.nama AS nama_kios
FROM   app_tenant t
LEFT   JOIN app_pasar p ON p.id = t.pasar_id
LEFT   JOIN (
   SELECT tenant_id, nama
   FROM  (
      SELECT tenant_id, nama,
             rank() OVER (PARTITION BY tenant_id ORDER BY nama) AS rnk
      FROM   app_kios
      ) ranked
   WHERE  rnk = 1
   ) k ON k.tenant_id = t.id
ORDER  BY t.id
The sub-query on app_kios uses a window function to rank each tenant's kiosks by name and keep only the first one (note the extra nesting: a window function result cannot be filtered in the WHERE clause of the same query level).
I would also suggest using meaningful aliases for table names instead of simply a, b and c.
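As an alternative (not part of the original answer), PostgreSQL's DISTINCT ON can pick the alphabetically first kiosk per tenant with less nesting; a sketch:
SELECT t.id, t.nama AS nama_tenant, p.nama AS nama_pasar, k.nama AS nama_kios
FROM   app_tenant t
LEFT   JOIN app_pasar p ON p.id = t.pasar_id
LEFT   JOIN (
   SELECT DISTINCT ON (tenant_id) tenant_id, nama
   FROM   app_kios
   ORDER  BY tenant_id, nama          -- keeps the first kiosk name per tenant
   ) k ON k.tenant_id = t.id
ORDER  BY t.id;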