I have a temporary table (without a primary key) created as the result of a few operations UNIONed together, which can be simulated with the following one:
DROP TABLE IF EXISTS temp1;
CREATE TEMP TABLE temp1(rownum int, sname text, input_qty decimal(8,3),
output_qty decimal(8,3), cumulativesum decimal(8,3));
INSERT INTO temp1 (rownum, sname, input_qty, output_qty, cumulativesum)
VALUES (0, 'name 1', 3.186, 0, 0),
(0, 'name 2', 0, 0.24, 0),
(0, 'name 3', 0, 1, 0),
(0, 'name 4', 0.18, 0.125, 0),
(0, 'name 5', 0, 1.14, 0);
During past processes the columns 'rownum' and 'cumulativesum' were intentionally filled with zeroes.
As the last two steps (or one, if possible) I would like to number 'rownum' starting from 1, and calculate and write the cumulative sum into the column 'cumulativesum', so that the table is ready to be written to an HTML document more or less directly.
The cumulative sum should be calculated as:
lastcumulativesum + input_qty - output_qty.
After all this, the result should be:
1  'name 1'  3.186  0.000  3.186
2  'name 2'  0.000  0.240  2.946
3  'name 3'  0.000  1.000  1.946
4  'name 4'  0.180  0.125  2.001
5  'name 5'  0.000  1.140  0.861
Could someone please write the described query (or queries) for the presented table?
PostgreSQL 9.1, Windows 7
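A minimal sketch of one possible approach, assuming 'sname' is unique and gives the intended row order (the table has no key, so some explicit ordering has to be chosen); it uses window functions, which PostgreSQL 9.1 supports:
UPDATE temp1 t
SET    rownum        = w.rn,
       cumulativesum = w.csum
FROM (
    SELECT sname,
           row_number() OVER (ORDER BY sname) AS rn,
           -- running total of input minus output = the cumulative sum
           sum(input_qty - output_qty) OVER (ORDER BY sname) AS csum
    FROM temp1
) w
WHERE t.sname = w.sname;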
This is a bit of a SQL code golf post...
I have a table that contains a timestamp, an identifier, and a price:
CREATE TABLE IF NOT EXISTS ts_data (
ts TIMESTAMP WITHOUT TIME ZONE, --timestamp
metric INTEGER, --identifier
usd DOUBLE PRECISION -- price
);
Given the sample data:
INSERT INTO ts_data (ts, metric, usd)
VALUES
('2018-08-21 01:00:00', 1, 5.00),
('2018-08-21 01:05:00', 1, 10.00),
('2018-08-21 01:10:00', 1, 15.00),
('2018-08-21 01:15:00', 1, 20.00),
('2018-08-21 02:00:00', 1, 25.00),
('2018-08-21 02:05:00', 1, 30.00),
('2018-08-21 02:10:00', 1, 35.00),
('2018-08-21 02:15:00', 1, 40.00),
('2018-08-21 01:00:00', 2, 1.00),
('2018-08-21 01:05:00', 2, 2.00),
('2018-08-21 01:10:00', 2, 3.00),
('2018-08-21 01:15:00', 2, 4.00),
('2018-08-21 02:00:00', 2, 5.00),
('2018-08-21 02:05:00', 2, 6.00),
('2018-08-21 02:10:00', 2, 7.00),
('2018-08-21 02:15:00', 2, 8.00);
What I'm trying to do
Re-sample the time frequency using DATE_TRUNC and average the price for each truncated date, per metric
Calculate the SUM across all metrics for each truncated date
This query accomplishes what I want
SELECT ts, SUM(usd) as usd FROM (
SELECT metric, date_trunc('HOUR', ts) as ts, AVG(usd) AS usd
FROM ts_data
WHERE ts BETWEEN '2018-08-20 00:00:00' AND '2018-08-22 00:00:00'
AND metric IN (1, 2)
GROUP BY date_trunc('HOUR', ts), metric
) samples
GROUP BY ts;
We get:
+---------------------+-----+
| ts                  | usd |
+---------------------+-----+
| 2018-08-21 01:00:00 |  15 |
| 2018-08-21 02:00:00 |  39 |
+---------------------+-----+
Are there any more efficient or compact ways to accomplish the same thing?
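One roughly equivalent but slightly tidier variant (just a sketch, not a benchmark; whether it is any faster would have to be checked with EXPLAIN ANALYZE) keeps the two aggregation levels but moves the per-metric hourly average into a CTE:
WITH samples AS (
    SELECT date_trunc('HOUR', ts) AS ts,
           AVG(usd) AS usd
    FROM ts_data
    WHERE ts BETWEEN '2018-08-20 00:00:00' AND '2018-08-22 00:00:00'
      AND metric IN (1, 2)
    GROUP BY 1, metric              -- average per metric within each hour
)
SELECT ts, SUM(usd) AS usd          -- then sum the per-metric averages
FROM samples
GROUP BY ts
ORDER BY ts;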
I'm trying to generate rows given a list of values in PostgreSQL. For example, if I have the values 1, 2, 5, 10, 30, 180, I would like to return:
num
----
1
2
5
10
30
180
I've been trying to use the VALUES function, but haven't had any luck. Here's an example of a failed attempt:
SELECT num
FROM
VALUES (1, 2, 5, 10, 30, 180) as num
values (1, 2, 5, 10, 30, 180) returns a single row with 6 columns.
But you want six rows with one column:
SELECT num
FROM (
VALUES (1), (2), (5), (10), (30), (180)
) as t(num)
as t(num) assigns an alias to the derived table (the values part) which is named t and has a single column num.
Using the table alias in the select list returns a record, not a column.
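If the values can just as well be written as an array literal, unnest() is another option (an alternative sketch, not necessarily better):
SELECT unnest(ARRAY[1, 2, 5, 10, 30, 180]) AS num;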
I would kindly ask if someone could write me a query which SUMs up values from a numeric column and from an hstore column. This is obviously too much for my SQL abilities.
A table:
DROP TABLE IF EXISTS mytry;
CREATE TABLE IF NOT EXISTS mytry
(mybill int, price numeric, paym text, combined_paym hstore);
INSERT INTO mytry (mybill, price, paym, combined_paym)
VALUES (10, 10.14, '0', ''),
(11, 23.56, '0', ''),
(12, 12.16, '3', ''),
(13, 12.00, '6', '"0"=>"4","3"=>"4","2"=>"4"'),
(14, 14.15, '6', '"0"=>"2","1"=>"4","3"=>"4","4"=>"4.15"'),
(15, 13.00, '1', ''),
(16, 9.00, '4', ''),
(17, 4.00, '4', ''),
(18, 4.00, '1', '');
Here is a list of bills, with the price and payment method for each bill.
Some bills (here 13 and 14) can have a combined payment. Payment methods are enumerated from 0 to 5, each number describing a specific payment method.
For this I wrote this query:
SELECT paym, SUM(price) FROM mytry WHERE paym::int<6 GROUP BY paym ORDER BY paym;
This sums the prices for payment methods 0-5. 6 is not a payment method but a flag which means that the payment methods and amounts should instead be taken from the hstore column 'combined_paym'. This is what I don't know how to solve: how to sum the payment methods and amounts from 'combined_paym' together with the ones from 'paym' and 'price'.
This query gives this result:
"0";33.70
"1";17.00
"3";12.16
"4";13.00
But this result is incomplete because the data from bills 13 and 14 are not summed in.
The correct result should be:
"0";39.70
"1";21.00
"2";4.00
"3";20.16
"4";17.15
Could someone please write a proper query which gives this last result from the given data?
Unnest the hstore column:
select key, value::dec
from mytry, each(combined_paym)
where paym::int = 6
key | value
-----+-------
0 | 4
2 | 4
3 | 4
0 | 2
1 | 4
3 | 4
4 | 4.15
(7 rows)
and use it in a UNION:
select paym, sum(price)
from (
select paym, price
from mytry
where paym::int < 6
union all
select key, value::dec
from mytry, each(combined_paym)
where paym::int = 6
) s
group by 1
order by 1;
paym | sum
------+-------
0 | 39.70
1 | 21.00
2 | 4
3 | 20.16
4 | 17.15
(5 rows)
I have a table where each record has an indicator and a range, and I want to know the total spread covered by the ranges for each indicator -- but not double-counting when ranges overlap for a certain indicator.
I can see that the wording is hard to follow, but the concept is pretty simple. Let me provide an illustrative example.
CREATE TABLE records(id int, spread int4range);
INSERT INTO records VALUES
(1, int4range(1, 4)),
(1, int4range(2, 7)),
(1, int4range(11, 15)),
(2, int4range(3, 5)),
(2, int4range(6, 10));
SELECT * FROM records;
Yielding the output:
id | spread
----+---------
1 | [1,4)
1 | [2,7)
1 | [11,15)
2 | [3,5)
2 | [6,10)
(5 rows)
I would now like a query which gives the following output:
id | total
---+--------
1 | 10
2 | 6
Where did the numbers 10 and 6 come from? For ID 1, we have ranges that include 1, 2, 3, 4, 5, 6, 11, 12, 13, and 14; a total of 10 distinct integers. For ID 2, we have ranges that include 3, 4, 6, 7, 8, and 9; a total of six distinct integers.
If it helps you understand the problem, you might imagine it as something like "if these records represent the day and time range for meetings on my calendar, how many total hours in each day are there where I'm booked at least once?"
Postgres version is 9.4.8, in case that matters.
select id, count(*)
from (
  -- one row per covered integer; distinct removes double-counting where ranges
  -- overlap, and upper() - 1 is used because int4range upper bounds are exclusive
  select distinct id, generate_series(lower(spread), upper(spread) - 1)
  from records
) s
group by id;
id | count
----+-------
1 | 10
2 | 6
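If the ranges can be very wide, generating one row per covered integer may get expensive. A sketch of an alternative (a gaps-and-islands approach; the CTE names marked/islands are only illustrative) that first merges overlapping ranges per id and then sums the widths of the merged pieces:
WITH marked AS (
    SELECT id,
           lower(spread) AS lo,
           upper(spread) AS hi,
           -- highest upper bound among this id's ranges that start earlier
           max(upper(spread)) OVER (PARTITION BY id
                                    ORDER BY lower(spread), upper(spread)
                                    ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_hi
    FROM records
), islands AS (
    SELECT id, lo, hi,
           -- a running count of "starts a new island" flags labels each merged group
           sum(CASE WHEN prev_hi IS NULL OR lo > prev_hi THEN 1 ELSE 0 END)
               OVER (PARTITION BY id ORDER BY lo, hi) AS island
    FROM marked
)
SELECT id, sum(hi_max - lo_min) AS total
FROM (
    SELECT id, island, min(lo) AS lo_min, max(hi) AS hi_max
    FROM islands
    GROUP BY id, island
) merged
GROUP BY id
ORDER BY id;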
I have data that looks like this.
SoldToRetailer
OrderDate | Customer | Product | price | units
-------------------------------------------------
1-jun-2011 | customer1 | Product1 | $10 | 5
2-jun-2011 | customer1 | Product1 | $5 | 3
3-jun-2011 | customer1 | Product2 | $10 | 4
4-jun-2011 | customer1 | Product1 | $4 | 4
5-jun-2011 | customer2 | Product3 | $10 | 1
SalesByRetailers
Customer | Product | units
-----------------------------
customer1 | Product1 | 5
customer2 | Product3 | 1
Here's what I need.
Sales(average price)
Customer | Product | units | Average Price
--------------------------------------------
customer1 | Product1 | 5 | $3.44
customer2 | Product3 | 1 | $10
Average Price is defined as the average price of the most recent SoldToRetailer Prices that add up to the units.
So in the first case, I grab the orders from June 4th and June 2nd. I don't need (or actually want) the orders from June 1st to be included.
EDIT: Hopefully a better explanation.
I'm attempting to determine the correct (most recent) price at which an item was sold to a retailer. The prices are taken in LIFO order. The price is determined by averaging the price sold over the last n orders, where n = total retail sales for a particular product and customer.
In SQL pseudocode it would look like this:
Select s1.Customer, s1.product, average(s2.price)
from SalesByRetailers s1
join SoldToRetailer s2
on s1.customer=s2.customer
and s1.product=s2.product
and ( select top (count of records where s2.units = s1.units) from s2 order by OrderDate desc)
What I need to return is the number of records from SoldToRetailer where the sum of units is >= SalesByRetailer Units.
It looks like it could be solved by RANK or ROW_NUMBER() OVER (PARTITION BY ...), but I'm at a loss.
The SoldToRetailer table is ginormous so performance is at a premium.
Running on SQL Server 2008 R2.
Thanks for helping
So I used three techniques. First I created a table with an OVER clause to give me a sorted list of products and prices, then I updated the table to add in the running average. An OUTER APPLY sub-select fixed my final problem. Hopefully the code will help someone else with a similar problem.
A shout out to Jeff Moden of SQLServerCentral.com fame for the running average help.
SELECT d.ProductKey,
d.ActualDollars,
d.Units,
ROW_NUMBER() OVER(PARTITION BY ProductKey ORDER BY d.OrderDateKey DESC) AS RowNumber,
NULL AS RunningTotal,
CONVERT(DECIMAL(10, 4), 0) AS RunningDollarsSum,
CONVERT(DECIMAL(10, 4), 0) AS RunningAverage
INTO #CustomerOrdersDetails
FROM dbo.orders d
WHERE customerKey = @CustomerToSelect
--DB EDIT... Google "Quirky update" on SQLServerCentral. Jeff Moden's version of a
--running total. Holy crap it's faster. Tried triangular joins before.
CREATE CLUSTERED INDEX [Index1]
ON #CustomerOrdersDetails ( ProductKey ASC, RowNumber ASC )
DECLARE @RunningTotal INT
DECLARE @PrevProductKey INT
DECLARE @RunningDollarsSum DECIMAL(10, 4)
UPDATE #CustomerOrdersDetails
SET @RunningTotal = RunningTotal = CASE
        WHEN ProductKey = @PrevProductKey THEN c.Units + ISNULL(@RunningTotal, 0)
        ELSE c.Units
    END,
    @RunningDollarsSum = RunningDollarsSum = CASE
        WHEN ProductKey = @PrevProductKey THEN c.ActualDollars + ISNULL(@RunningDollarsSum, 0)
        ELSE c.ActualDollars
    END,
    @PrevProductKey = ProductKey,
    RunningAverage = @RunningDollarsSum / NULLIF(@RunningTotal, 0)
FROM #CustomerOrdersDetails c WITH (TABLOCKX)
OPTION (MAXDOP 1)
-- =============================================
-- Update Cost fields with average price calculation
-- =============================================
UPDATE d
SET DolSoldCostUSD = COALESCE(d.DolSoldCostUSD,
d.UnitsSold * a.RunningAverage)
FROM dbo.inbound d
OUTER APPLY (SELECT TOP 1 *
FROM #CustomerOrdersDetails ap
WHERE ap.ProductKey = d.ProductKey
AND d.UnitsSold + d.UnitsOnHand + d.UnitsOnOrder + d.UnitsReceived + d.UnitsReturned >= RunningTotal
ORDER BY RunningTotal) AS a
declare @table table (customer varchar(15), product varchar(15), qty int, price decimal(6,2))
insert into @table (customer, product, qty, price)
values
('customer1', 'product1', 5, 3),
('customer1', 'product1', 4, 4),
('customer1', 'product1', 3, 2),
('customer1', 'product1', 2, 13),
('customer1', 'product1', 3, 3),
('customer1', 'product2', 5, 1),
('customer1', 'product2', 4, 7),
('customer1', 'product2', 2, 5),
('customer1', 'product2', 6, 23),
('customer1', 'product2', 2, 1),
('customer2', 'product1', 2, 1),
('customer2', 'product1', 4, 4),
('customer2', 'product1', 7, 3),
('customer2', 'product1', 1, 12),
('customer2', 'product1', 2, 3),
('customer2', 'product2', 3, 2),
('customer2', 'product2', 6, 5),
('customer2', 'product2', 8, 4),
('customer2', 'product2', 2, 11),
('customer2', 'product2', 1, 2)
select customer, product, sum(qty) as units, (sum(qty * price))/SUM(qty) as 'Average Price' from @table
group by customer, product
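Note that the query above averages over all of a customer/product's orders, while the question asks for only the most recent orders whose units cover the retailer's units. A sketch of that restriction against the question's tables (SQL Server 2008 R2 has no SUM() OVER (ORDER BY ...), so a correlated subquery computes the units already covered by newer orders; the plain AVG at the end is an assumption about how the last, partially used order should be weighted):
SELECT r.Customer,
       r.Product,
       r.units,
       AVG(s.price) AS AveragePrice
FROM SalesByRetailers r
JOIN SoldToRetailer s
  ON  s.Customer = r.Customer
  AND s.Product  = r.Product
WHERE (SELECT COALESCE(SUM(s2.units), 0)        -- units covered by strictly newer orders
       FROM SoldToRetailer s2
       WHERE s2.Customer  = s.Customer
         AND s2.Product   = s.Product
         AND s2.OrderDate > s.OrderDate) < r.units
GROUP BY r.Customer, r.Product, r.units;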