Calculate total spread covered by several ranges - postgresql

I have a table where each record has an indicator and a range, and I want to know the total spread covered by the ranges for each indicator -- but not double-counting when ranges overlap for a certain indicator.
I can see that the wording is hard to follow, but the concept is pretty simple. Let me provide an illustrative example.
CREATE TABLE records(id int, spread int4range);
INSERT INTO records VALUES
(1, int4range(1, 4)),
(1, int4range(2, 7)),
(1, int4range(11, 15)),
(2, int4range(3, 5)),
(2, int4range(6, 10));
SELECT * FROM records;
Yielding the output:
id | spread
----+---------
1 | [1,4)
1 | [2,7)
1 | [11,15)
2 | [3,5)
2 | [6,10)
(5 rows)
I would now like a query which gives the following output:
id | total
---+--------
1 | 10
2 | 6
Where did the numbers 10 and 6 come from? For ID 1, we have ranges that include 1, 2, 3, 4, 5, 6, 11, 12, 13, and 14; a total of 10 distinct integers. For ID 2, we have ranges that include 3, 4, 6, 7, 8, and 9; a total of six distinct integers.
If it helps you understand the problem, you might imagine it as something like "if these records represent the day and time range for meetings on my calendar, how many total hours in each day are there where I'm booked at least once?"
Postgres version is 9.4.8, in case that matters.

select id, count(*)
from (
select distinct id, generate_series(lower(spread), upper(spread) - 1)
from records
) s
group by id
;
id | count
----+-------
1 | 10
2 | 6
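The query above expands every range into its individual integers, which is simple but scales with the width of the ranges. As a cross-check of the expected totals, here is a minimal Python sketch of a different technique: sort each id's half-open ranges, merge the ones that overlap or touch, and sum the merged lengths. The function name `total_spread` and the `(id, lower, upper)` tuple layout are assumptions made for illustration; this is not the query from the answer.

```python
from collections import defaultdict

def total_spread(records):
    """Sum the lengths of the merged half-open ranges [lo, hi) for each id."""
    by_id = defaultdict(list)
    for rec_id, lo, hi in records:
        by_id[rec_id].append((lo, hi))

    totals = {}
    for rec_id, ranges in by_id.items():
        ranges.sort()
        covered = 0
        cur_lo, cur_hi = ranges[0]
        for lo, hi in ranges[1:]:
            if lo <= cur_hi:
                # Overlapping or adjacent: extend the current merged range.
                cur_hi = max(cur_hi, hi)
            else:
                # Gap: close the current merged range and start a new one.
                covered += cur_hi - cur_lo
                cur_lo, cur_hi = lo, hi
        covered += cur_hi - cur_lo
        totals[rec_id] = covered
    return totals

# Same data as the records table in the question.
records = [(1, 1, 4), (1, 2, 7), (1, 11, 15), (2, 3, 5), (2, 6, 10)]
print(total_spread(records))  # {1: 10, 2: 6}
```

Merging works on range bounds only, so its cost depends on the number of rows rather than on how wide the ranges are.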

Related

[postgresql - generate months from start_date and end_date base on total_x]

I have a PostgreSQL table with three data columns plus a row number (No):
No | total_car_sales | start_date    | end_date
---+-----------------+---------------+---------------
1  | 5               | Jan-01-2022   | Aug-03-2022
2  | 1               | April-01-2022 | July-03-2022
3  | 3               | March-01-2022 | May-03-2022
4  | 7               | Jan-01-2022   | July-03-2022
5  | 56              | April-01-2022 | April-25-2022
6  | 3               | April-01-2022 | Aug-04-2022
For example, No. 1 runs from 'Jan-01-2022' to 'Aug-03-2022'. I count it only in the month of its end_date, so it adds 5 to August 2022.
Likewise, No. 6 adds 3 to August 2022.
I want to generate total_car_sales per month for the whole table, like this:
Months      | total_car_sales
------------+-----------------
Jan-2022    | 0
Feb-2022    | 0
March-2022  | 0
April-2022  | 56
May-2022    | 3
June-2022   | 0
July-2022   | 8
August-2022 | 8
I have tried to use date_trunc(), but it does not work for this.
Any help or suggestions would be really appreciated.
Thank you
Make a list of months (generate_series) and calculate total sales for each of them.
with the_table (no,total_car_sales,start_date,end_date) as
(
values
(1, 5, 'Jan-01-2022'::date, 'Aug-03-2022'::date),
(2, 1, 'April-01-2022', 'July-03-2022'),
(3, 3, 'March-01-2022', 'May-03-2022'),
(4, 7, 'Jan-01-2022', 'July-03-2022'),
(5, 56, 'April-01-2022', 'April-25-2022'),
(6, 3, 'April-01-2022', 'Aug-04-2022')
)
select
to_char(m, 'mon-yyyy') "month",
coalesce
(
(select sum(total_car_sales) from the_table where m = date_trunc('month', end_date)),
0
) total_car_sales
from generate_series ('2022-01-01', '2022-08-01', interval '1 month') m;
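To check the expected numbers, the same idea (build the list of months, then sum the sales of rows whose end_date falls in each month) can be sketched in Python. The function name `monthly_totals` and the "YYYY-MM" labels are assumptions made for this illustration; the query above formats months as 'mon-yyyy' instead.

```python
from collections import Counter
from datetime import date

def monthly_totals(rows, year, first_month, last_month):
    """Total sales per month, attributing each row to the month of its end_date."""
    totals = Counter()
    for _no, sales, _start, end in rows:
        totals[(end.year, end.month)] += sales
    # Walk every month in the requested span so empty months show up as 0.
    return [
        (f"{year}-{month:02d}", totals[(year, month)])
        for month in range(first_month, last_month + 1)
    ]

# Same data as the table in the question.
rows = [
    (1, 5, date(2022, 1, 1), date(2022, 8, 3)),
    (2, 1, date(2022, 4, 1), date(2022, 7, 3)),
    (3, 3, date(2022, 3, 1), date(2022, 5, 3)),
    (4, 7, date(2022, 1, 1), date(2022, 7, 3)),
    (5, 56, date(2022, 4, 1), date(2022, 4, 25)),
    (6, 3, date(2022, 4, 1), date(2022, 8, 4)),
]

for label, total in monthly_totals(rows, 2022, 1, 8):
    print(label, total)
```

April gets 56, May 3, July 8 (1 + 7) and August 8 (5 + 3); every other month is 0, matching the requested result.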

Get last row from group, limit number of results in PostgreSQL

I have a table whose records represent a log; I omit the rest of the columns in this example.
The id column is auto-incremented; item_id represents an item in the app.
I need to get the latest item_ids, for example the last two or three.
CREATE TABLE "log" (
"id" INT,
"item_id" INT
);
-- TRUNCATE TABLE "log";
INSERT INTO "log" ("id", "item_id") VALUES
(1, 1),
(2, 2),
(3, 1),
(4, 1),
(5, 3),
(6, 3);
Basic query will list all results, latest at the top:
SELECT *
FROM "log"
ORDER BY "id" DESC
id item_id
6 3
5 3
4 1
3 1
2 2
1 1
I would like to have just two (LIMIT 2) last item_ids with their id. Last means - inserted last (ORDER BY id).
id item_id
6 3
4 1
Last three would be
id item_id
6 3
4 1
2 2
Once an item_id is returned, it is not returned again. So LIMIT 4 would return only three rows because there are only three unique item_id.
I am probably missing something. I already tried various combinations of DISTINCT ON, GROUP BY, LIMIT etc.
UPDATE #1:
After I tested the query by S-man (below), I found that it works for the data I provided; however, it does not work in general for other data sets (for example, a sequence of item_id A, B and then A again). Here is another data set:
TRUNCATE TABLE "log";
INSERT INTO "log" ("id", "item_id") VALUES
(1, 1),
(2, 2),
(3, 3),
(4, 3),
(5, 1),
(6, 3);
Data in DB, ordered by id desc:
id item_id
6 3
5 1
4 3
3 3
2 2
1 1
Expected result for last three item_id
6 3
5 1
2 2
Well, after three changes, now we come back to the very first idea:
Just take DISTINCT ON:
SELECT
*
FROM (
SELECT DISTINCT ON (item_id) -- 1
*
FROM log
ORDER BY item_id, id DESC
) s
ORDER BY id DESC -- 2
LIMIT 2
1. DISTINCT ON returns exactly one record from each ordered group. Your group is the item_id and the order is id DESC, so you get the highest id for each item_id.
2. Reorder by id DESC (instead of the previous ordering by item_id) and limit the query output.
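The two steps can be checked against both data sets from the question with a small Python sketch. The function name `last_items` is hypothetical; it mirrors the logic of the query (highest id per item_id, then newest first, then limit) rather than how PostgreSQL executes it.

```python
def last_items(log_rows, limit):
    """Return up to `limit` (id, item_id) pairs: the newest row of each item, newest first."""
    latest = {}
    for row_id, item_id in log_rows:
        # Step 1: keep only the highest id seen for each item_id.
        if item_id not in latest or row_id > latest[item_id]:
            latest[item_id] = row_id
    # Step 2: reorder by id descending and cut off at the limit.
    ranked = sorted(((row_id, item_id) for item_id, row_id in latest.items()), reverse=True)
    return ranked[:limit]

first_set = [(1, 1), (2, 2), (3, 1), (4, 1), (5, 3), (6, 3)]
second_set = [(1, 1), (2, 2), (3, 3), (4, 3), (5, 1), (6, 3)]

print(last_items(first_set, 2))   # [(6, 3), (4, 1)]
print(last_items(second_set, 3))  # [(6, 3), (5, 1), (2, 2)]
```

Asking for four rows still returns three, because each item_id appears only once after the first step.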

How to retrieve top 3 results for each column in postgresql?

I have been given a question. The table looks like this:
STATE | year1 | ... | year 10
AP | 100 | ... | 120
assam | 13 | .. | 42
madhya pradesh | 214 | ... | 421
Now, I need to get the top 3 states for each year.
I have tried everything I could think of, but I am not able to filter the results per column.
You have a design problem. Enumerated columns are almost always a sign of bad design.
For now you could unpivot using unnest and then use window function row_number to get the top 3 states per year:
with unpivoted as (
select state,
unnest(array[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]) as year,
unnest(array[
year_1, year_2, year_3,
year_4, year_5, year_6,
year_7, year_8, year_9,
year_10
]) as value
from your_table
)
select *
from (
select t.*,
row_number() over (
partition by year
order by value desc
) as seqnum
from unpivoted t
) t
where seqnum <= 3;
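To make the unpivot-then-rank idea concrete, here is a minimal Python sketch of the same two steps. The data is hypothetical (the question elides most of the columns, so only two year columns are used, and the state "kerala" is invented to give the ranking something to cut off), and the function name `top_n_per_year` is an assumption for illustration.

```python
def top_n_per_year(table, n=3):
    """table maps state -> list of yearly values; returns {year: [(state, value), ...]}."""
    # Unpivot: one (year, state, value) tuple per cell, like the unnest() CTE.
    unpivoted = [
        (year, state, value)
        for state, values in table.items()
        for year, value in enumerate(values, start=1)
    ]
    # Rank within each year by value descending and keep the first n,
    # like row_number() over (partition by year order by value desc).
    result = {}
    for year, state, value in sorted(unpivoted, key=lambda r: (r[0], -r[2])):
        ranked = result.setdefault(year, [])
        if len(ranked) < n:
            ranked.append((state, value))
    return result

# Hypothetical sample: four states, two year columns.
table = {
    "AP": [100, 120],
    "assam": [13, 42],
    "madhya pradesh": [214, 421],
    "kerala": [57, 9],
}
print(top_n_per_year(table))
```

Like row_number(), this breaks ties arbitrarily (here by input order); rank() in SQL would instead keep all tied rows.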

TSQL : Find combinations of rows within group without a cross joins

I'm trying to develop a T-SQL routine (SQL Server 2014), which will allow me to find all combinations of records within group.
Given the following data:
ID_COMBINATION | ID_POSITION | MULTIPLY_FACTOR
-----------------------------------------------
1 | 1 | 1
1 | 1 | 2
1 | 1 | 3
1 | 2 | 1
1 | 2 | 2
1 | 2 | 3
I would like to calculate a combination of MULTIPLY_FACTOR for full set of ID_POSITIONS for a given ID_COMBINATION
The result should be:
1 | 1 | 1
1 | 2 | 1
1 | 1 | 1
1 | 2 | 2
1 | 1 | 1
1 | 2 | 3
...
1 | 1 | 3
1 | 2 | 3
For the moment I prefer to have a closed routine definition (over using dynamic SQL to generate multi cross joins code at run-time, depending on the number of unique ID_POSITIONS within a group)
Thank you very much for your help!
EDIT:
The following TSQL code calculates combinations of unique ID_POSITION for a given ID_COMBINATION 1:
declare @Samples as Table ( Id_Combination Int, Id_Position Int, Multiply_Factor Int );
INSERT INTO @Samples (Id_Combination, Id_Position, Multiply_Factor)
VALUES (1, 1, 1), (1, 1, 2), (1, 1, 3)
, (1, 2, 1), (1, 2, 2), (1, 2, 3);
SELECT
S1.Id_Combination
,S1.Id_Position AS s1_idpos
,S1.Multiply_Factor AS s1_mufac
,S2.Id_Position AS s2_idpos
,S2.Multiply_Factor AS s2_mufac
FROM @Samples AS S1
INNER JOIN @Samples AS S2
ON s1.Id_Combination = s2.Id_Combination
AND s1.Id_Position < s2.Id_Position
However, if I add a new ID_POSITION key with its respective MULTIPLY_FACTOR values, I will have to modify the join conditions and the select statement to cover the new scenarios, like:
declare @Samples as Table ( Id_Combination Int, Id_Position Int, Multiply_Factor Int );
INSERT INTO @Samples (Id_Combination, Id_Position, Multiply_Factor)
VALUES (1, 1, 1), (1, 1, 2), (1, 1, 3)
,(1, 2, 1), (1, 2, 2), (1, 2, 3)
,(1, 3, 1), (1, 3, 2), (1, 3, 3);
SELECT
S1.Id_Combination
,S1.Id_Position AS s1_idpos
,S1.Multiply_Factor AS s1_mufac
,S2.Id_Position AS s2_idpos
,S2.Multiply_Factor AS s2_mufac
,S3.Id_Position AS s3_idpos
,S3.Multiply_Factor AS s3_mufac
FROM @Samples AS S1
INNER JOIN @Samples AS S2
ON s1.Id_Combination = s2.Id_Combination
AND s1.Id_Position < s2.Id_Position
INNER JOIN @Samples AS S3
ON s2.Id_Combination = s3.Id_Combination
AND s2.Id_Position < s3.Id_Position
Getting back to the general idea of my question: how do I write "generic" TSQL code that covers all possible future values from the ID_POSITION domain and presents the values vertically rather than adding new fields to the SELECT clause?
For sure, some SUB_COMBINATION key will have to be introduced, to make those combinations distinct within each other inside a parent ID_COMBINATION...
Since I can't figure out your comment re: "a size of ID_POSITION", I'll just ask why it isn't this easy:
-- Sample data.
declare @Samples as Table ( Id_Combination Int, Id_Position Int, Multiply_Factor Int );
insert into @Samples ( Id_Combination, Id_Position, Multiply_Factor ) values
( 1, 1, 1 ), ( 1, 1, 2 ), ( 1, 1, 3 ), -- ( 1, 1, 4 ), -- Try me.
( 1, 2, 1 ), ( 1, 2, 2 ), ( 1, 2, 3 );
select * from @Samples;
-- Generate all possible combinations of all values.
select distinct S1.Id_Combination, S2.Id_Position, S3.Multiply_Factor
from @Samples as S1 cross join @Samples as S2 cross join @Samples as S3
order by Id_Combination, Id_Position, Multiply_Factor;
Note that if you uncomment the extra sample data row you will get two more result rows.
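For comparison, the output shape the question describes (one MULTIPLY_FACTOR picked per ID_POSITION, listed vertically, with a key separating the combinations) is a Cartesian product across the position groups. The Python sketch below only illustrates that target shape; it is not T-SQL and not the method from the answer above. The function name `expand_combinations` and the `sub_combination` counter are hypothetical, the latter standing in for the SUB_COMBINATION key mentioned in the question.

```python
from collections import defaultdict
from itertools import product

def expand_combinations(samples):
    """Return rows (id_combination, sub_combination, id_position, multiply_factor)."""
    grouped = defaultdict(lambda: defaultdict(list))
    for id_combination, id_position, factor in samples:
        grouped[id_combination][id_position].append(factor)

    rows = []
    for id_combination in sorted(grouped):
        by_position = grouped[id_combination]
        positions = sorted(by_position)
        # One factor per position, for every possible choice of factors.
        choices = product(*(by_position[p] for p in positions))
        for sub_combination, picks in enumerate(choices, start=1):
            for id_position, factor in zip(positions, picks):
                rows.append((id_combination, sub_combination, id_position, factor))
    return rows

samples = [(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 2, 1), (1, 2, 2), (1, 2, 3)]
rows = expand_combinations(samples)
print(len(rows))  # 18 rows: 9 combinations of 2 positions each
```

Adding a third ID_POSITION with three factors needs no code change; the product simply grows to 27 combinations of three rows each.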

PostgreSQL, SUM and GROUP from numeric column and hstore

Could someone please help me write a query that sums values from a numeric column and from an hstore column? This is obviously too much for my SQL abilities.
A table:
DROP TABLE IF EXISTS mytry;
CREATE TABLE IF NOT EXISTS mytry
(mybill int, price numeric, paym text, combined_paym hstore);
INSERT INTO mytry (mybill, price, paym, combined_paym)
VALUES (10, 10.14, '0', ''),
(11, 23.56, '0', ''),
(12, 12.16, '3', ''),
(13, 12.00, '6', '"0"=>"4","3"=>"4","2"=>"4"'),
(14, 14.15, '6', '"0"=>"2","1"=>"4","3"=>"4","4"=>"4.15"'),
(15, 13.00, '1', ''),
(16, 9.00, '4', ''),
(17, 4.00, '4', ''),
(18, 4.00, '1', '');
Here is a list of bills, with the price and payment method for each bill.
Some bills (here 13 and 14) can have a combined payment. Payment methods are numbered from 0 to 5, each number identifying a specific payment method.
For this I make this query:
SELECT paym, SUM(price) FROM mytry WHERE paym::int<6 GROUP BY paym ORDER BY paym;
This sums prices for payment methods 0-5. 6 is not a payment method but a flag meaning that the payment methods and amounts should be taken from the hstore column 'combined_paym' instead. This is what I don't know how to solve: how to sum the payment methods and amounts from 'combined_paym' together with the ones from 'paym' and 'price'.
This query gives result:
"0";33.70
"1";17.00
"3";12.16
"4";13.00
But this result is incorrect because the data from bills 13 and 14 are not included in the sums.
Real result should be:
"0";39.70
"1";21.00
"2";4.00
"3";20.16
"4";17.15
Could someone please write a proper query that gives this last result from the given data?
Unnest the hstore column:
select key, value::dec
from mytry, each(combined_paym)
where paym::int = 6
key | value
-----+-------
0 | 4
2 | 4
3 | 4
0 | 2
1 | 4
3 | 4
4 | 4.15
(7 rows)
and use it in union:
select paym, sum(price)
from (
select paym, price
from mytry
where paym::int < 6
union all
select key, value::dec
from mytry, each(combined_paym)
where paym::int = 6
) s
group by 1
order by 1;
paym | sum
------+-------
0 | 39.70
1 | 21.00
2 | 4
3 | 20.16
4 | 17.15
(5 rows)
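As a sanity check on those sums, the same logic can be traced in Python: plain bills contribute their price under paym, while bills flagged with 6 contribute each key/value pair of combined_paym. This is a sketch only; the hstore values are modelled as plain dicts, the name `sum_by_payment` is an assumption, and Decimal is used so the totals stay exact like PostgreSQL's numeric.

```python
from collections import defaultdict
from decimal import Decimal

def sum_by_payment(bills):
    """Sum amounts per payment method, splitting combined payments (paym '6')."""
    totals = defaultdict(Decimal)
    for _bill, price, paym, combined in bills:
        if paym == "6":
            # Flag 6: take methods and amounts from the combined payment map.
            for method, amount in combined.items():
                totals[method] += Decimal(amount)
        elif int(paym) < 6:
            totals[paym] += Decimal(price)
    return dict(sorted(totals.items()))

# Same data as the mytry table in the question.
bills = [
    (10, "10.14", "0", {}),
    (11, "23.56", "0", {}),
    (12, "12.16", "3", {}),
    (13, "12.00", "6", {"0": "4", "3": "4", "2": "4"}),
    (14, "14.15", "6", {"0": "2", "1": "4", "3": "4", "4": "4.15"}),
    (15, "13.00", "1", {}),
    (16, "9.00", "4", {}),
    (17, "4.00", "4", {}),
    (18, "4.00", "1", {}),
]

for method, total in sum_by_payment(bills).items():
    print(method, total)
```

The totals come out as 39.70, 21.00, 4, 20.16 and 17.15 for methods 0 to 4, the same as the query output above.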