Rank result set according to condition - tsql

I have a table with three columns: Product, Date and Status.
I want to rank the rows as follows:
for each product, order by Date; if Status = FALSE the rank is 0; if it is TRUE, start ranking at 1 and keep the same value for as long as the previous Status is also TRUE.
When a FALSE appears in this ordered set it gets 0, and the next TRUE for the same product gets x+1 (where x is the previous rank value assigned to a TRUE status).
I hope the picture makes it clearer.
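To pin down the requested numbering independently of any SQL dialect, here is a minimal Python sketch of the rule described above. The function name `rank_true_runs` and the handful of rows are illustrative assumptions, not part of the original table; dates are plain `yyyymmdd` strings so that they sort chronologically.

```python
from itertools import groupby
from operator import itemgetter

# (Product, Date, Status) rows; a few illustrative values only.
ROWS = [
    ("a", "20160525", False), ("a", "20160526", True), ("a", "20160529", True),
    ("a", "20160603", False), ("a", "20160611", True),
    ("b", "20160521", True), ("b", "20160522", False), ("b", "20160525", True),
]

def rank_true_runs(rows):
    """Return (product, date, status, rank) tuples.

    Within each product, ordered by date, a FALSE row gets rank 0 and every
    run of consecutive TRUE rows shares one rank; the rank grows by one each
    time a new TRUE run starts.
    """
    ranked = []
    ordered = sorted(rows, key=itemgetter(0, 1))
    for product, group in groupby(ordered, key=itemgetter(0)):
        run_count = 0            # number of TRUE runs seen so far
        previous_status = False  # treat "before the first row" as FALSE
        for _, date, status in group:
            if status and not previous_status:
                run_count += 1   # a FALSE -> TRUE transition opens a new run
            ranked.append((product, date, status, run_count if status else 0))
            previous_status = status
    return ranked

for row in rank_true_runs(ROWS):
    print(row)
```

The only state carried from row to row is the previous status and the number of TRUE runs seen so far, which is what any SQL solution has to reproduce.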

This code uses SS2008R2 features which do not include LEAD/LAG. A better solution is certainly possible with more modern versions of SQL Server.
-- Sample data.
declare @Samples as Table ( Product VarChar(10), ProductDate Date,
ProductStatus Bit, DesiredRank Int );
insert into @Samples values
( 'a', '20160525', 0, 0 ), ( 'a', '20160526', 1, 1 ), ( 'a', '20160529', 1, 1 ),
( 'a', '20160601', 1, 1 ), ( 'a', '20160603', 0, 0 ), ( 'a', '20160604', 0, 0 ),
( 'a', '20160611', 1, 2 ), ( 'a', '20160612', 0, 0 ), ( 'a', '20160613', 1, 3 ),
( 'b', '20160521', 1, 1 ), ( 'b', '20160522', 0, 0 ), ( 'b', '20160525', 1, 2 );
select * from @Samples;
-- Query to rank data as requested.
with WithRN as (
select Product, ProductDate, ProductStatus, DesiredRank,
Row_Number() over ( partition by Product order by ProductDate ) as RN
from @Samples ),
RCTE as (
select *, Cast( ProductStatus as Int ) as C
from WithRN
where RN = 1
union all
select WRN.*, C + Cast( 1 - R.ProductStatus as Int ) * Cast( WRN.ProductStatus as Int )
from RCTE as R inner join
WithRN as WRN on WRN.Product = R.Product and WRN.RN = R.RN + 1 )
select Product, ProductDate, ProductStatus, DesiredRank,
C * ProductStatus as CalculatedRank
from RCTE
order by Product, ProductDate;
Note that the sample data was extracted from an image using a Mark I Eyeball. Had the OP taken heed of advice here it would have been somewhat easier.
Tip: Using column names that don't happen to match data types and keywords makes life somewhat simpler.

Try this query,
SELECT a.Product ,
a.Date ,
a.Status ,
END [Rank]
FROM ( SELECT Product ,
Date ,
Status ,
FROM TableProduct
) a
ORDER BY Product, a.RNK


(Postgres) Query in a tree table in ascending and descending mode

I'm having some issues with two queries that search a "tree" table.
My table is defined by the following code, and its edges point in one direction only. However, I need to retrieve data in both directions, ascending and descending.
create table graph_examle (input int null, output int );
insert into graph_examle (input, output) values
(null, 1),
(1, 2),
(2, 3 ),
(3, 4 ),
(null, 7 ),
(8, 4 ),
(null, 10 ),
(10, 11 ),
(11, 4),
(3, 15),
(25, 15),
(26, 15),
(15, 4 );
The ascending query has some issues. If I search by id 1, I'm expecting to see the relations:
1, 1->2, 1->2->3, 1->2->3->4, but the results are:
WITH recursive cte (initial_id, level, path, loop, input, output) AS
(
SELECT input, 1, ':' ||input || ':' , 0, input, output
FROM graph_examle WHERE input = 1
UNION ALL
SELECT c.initial_id,
c.level + 1,
c.path ||ur.input|| ':' ,
CASE WHEN c.path LIKE '%:' ||ur.input || ':%' THEN 1 ELSE 0 END,
ur.input,
ur.output
FROM graph_examle ur
INNER JOIN cte c ON c.output = ur.input AND c.loop = 0
)
SELECT *
FROM cte
ORDER BY initial_id, level;
The descending query does not work as expected. If I search by id 4, I'm expecting to see the relations:
4, 4->3, 4->3->2, 4->3->2->1
4->8, 4->8->7
4->15, (...)
But I'm only getting:
WITH RECURSIVE cte (input, output, level, real_parent_id, path) AS
(
SELECT
ur.input, ur.input, 1, output, ( ur.input|| ' -> ' || ur.output)
FROM graph_examle ur
WHERE ur.output = 4
UNION ALL
SELECT
ur_cte.input, ur.input, level + 1, ur.output, (ur_cte.path || '->' || ur.output)
FROM cte ur_cte
INNER JOIN graph_examle ur on ur.input = ur_cte.real_parent_id
)
SELECT *
FROM cte;
Note that my queries also try to guard against circular dependencies.
The ascending query looks fine; perhaps you could concatenate the path and output columns.
For the descending query, you can try this:
WITH RECURSIVE cte (input, output, level, path, loop) AS
(
SELECT
ur.input, ur.output, 1, ( ur.output|| ' -> ' || ur.input), 0
FROM graph_examle ur
WHERE ur.output = 4
UNION ALL
SELECT
ur.input, ur_cte.output, level + 1, (ur_cte.path || '->' || ur.input),
CASE WHEN ur_cte.path LIKE '%->' || ur.input THEN 1 ELSE 0 END
FROM cte ur_cte
INNER JOIN graph_examle ur on ur.output = ur_cte.input
WHERE ur_cte.loop = 0
AND ur.input IS NOT NULL
)
SELECT *
FROM cte;
see dbfiddle
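
As a cross-check on what either recursive query should return, the traversal can be sketched in a few lines of Python. This is only an illustration of the idea (follow edges one hop at a time, carry the path, refuse to revisit a node already on the path); the function name `walk` is an assumption, and the edge list is copied from the `graph_examle` inserts.

```python
# (input, output) pairs from the question; None stands in for NULL.
EDGES = [(None, 1), (1, 2), (2, 3), (3, 4), (None, 7), (8, 4), (None, 10),
         (10, 11), (11, 4), (3, 15), (25, 15), (26, 15), (15, 4)]

def walk(start, edges, reverse=False):
    """Return every path that starts at `start`.

    reverse=False follows input -> output (ascending search);
    reverse=True follows output -> input (descending search).
    """
    neighbours = {}
    for src, dst in edges:
        if src is None:          # root rows have no incoming edge to follow
            continue
        if reverse:
            src, dst = dst, src
        neighbours.setdefault(src, []).append(dst)

    paths = []
    stack = [[start]]
    while stack:
        path = stack.pop()
        for nxt in neighbours.get(path[-1], []):
            if nxt in path:      # cycle guard: never revisit a node on this path
                continue
            extended = path + [nxt]
            paths.append(extended)
            stack.append(extended)
    return paths

for p in sorted(walk(4, EDGES, reverse=True)):
    print(" -> ".join(map(str, p)))
```

Checking membership in the carried path plays the same role as the `loop` flag in the SQL: a branch stops as soon as it would step onto a node it has already visited.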

postgres aggregate subset from group by rows

I'm trying to compute users' loyalty bonus balances, where bonuses are burned after half a year of inactivity. For user 1, I want the sum to consist of ord 4, 5 and 6 only.
create table transactions (
"user" int,
ord int, -- transaction date replacement
amount int,
lag interval -- after previous transaction
);
insert into transactions values
(1, 1, 10, '1h'::interval),
(1, 2, 10, '.5y'::interval),
(1, 3, 10, '1h'::interval),
(1, 4, 10, '.5y'::interval),
(1, 5, 10, '.1h'::interval),
(1, 6, 10, '.1h'::interval),
(2, 1, 10, '1h'::interval),
(2, 2, 10, '.5y'::interval),
(2, 3, 10, '.1h'::interval),
(2, 4, 10, '.1h'::interval),
(3, 1, 10, '1h'::interval);
select "user", sum(
amount -- but starting from last '.5y'::interval if any otherwise everything counts
) from transactions group by "user"
user | sum(amount)
1 | 30 -- (4+5+6), not 50, not 60
2 | 30 -- (2+3+4), not 40
3 | 10
Try this:
with cte as (
select *,
case when (lead(lag) over (partition by user_ order by ord)) >= interval '.5 year'
then 1 else 0 end "flag" from test
),
cte1 as (
select *,
case when flag=(lag(flag,1) over (partition by user_ order by ord)) then 0 else 1 end "flag1" from cte
)
select distinct on (user_) user_, sum(amount) over (partition by user_,grp order by ord) from (
select *, sum(flag1) over (partition by user_ order by ord) "grp" from cte1) t1
order by user_ , ord desc
It is rather complicated and slow, but it solves your problem.
Is this what you're looking for?
with last_5y as(
select "user", max(ord) as ord
from transactions
where lag = '.5y'::interval group by "user"
) select t.user, sum(amount)
from transactions t, last_5y t2
where t.user = t2.user and t.ord >= t2.ord
group by t.user
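
For comparison with the SQL attempts, the intended rule (everything before the last half-year gap is burned, and a user with no such gap keeps everything) can be written as a single pass in Python. This is a sketch under assumptions: the name `balance` is made up for illustration, and half a year is approximated as 182 days, whereas PostgreSQL would store `'.5y'` as an interval of months.

```python
from datetime import timedelta

HALF_YEAR = timedelta(days=182)   # assumption: stands in for '.5y'::interval
HOUR = timedelta(hours=1)

# (user, ord, amount, gap since previous transaction), mirroring the sample rows
TRANSACTIONS = [
    (1, 1, 10, HOUR), (1, 2, 10, HALF_YEAR), (1, 3, 10, HOUR),
    (1, 4, 10, HALF_YEAR), (1, 5, 10, HOUR / 10), (1, 6, 10, HOUR / 10),
    (2, 1, 10, HOUR), (2, 2, 10, HALF_YEAR), (2, 3, 10, HOUR / 10),
    (2, 4, 10, HOUR / 10),
    (3, 1, 10, HOUR),
]

def balance(transactions):
    """Return {user: balance}, restarting the sum at every half-year gap."""
    totals = {}
    for user, _ord, amount, gap in sorted(transactions, key=lambda t: (t[0], t[1])):
        if gap >= HALF_YEAR:
            totals[user] = 0      # earlier bonuses are burned
        totals[user] = totals.get(user, 0) + amount
    return totals

print(balance(TRANSACTIONS))
```

Because the running total is reset rather than filtered, users without any half-year gap still get a row, which is the case the expected output shows for user 3.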

Why using same field when filtering cause different execution time? (different index usage)

When I run the query and filter by agreement_id it is slow,
but when I filter by the alias id it is fast (see the end of the query).
Why does filtering on the same field lead to different execution times?
Links to explain analyze:
slow1, slow2
fast1, fast2
The difference starts at node #20, where different indexes are used:
Index Cond: (o.sys_period @> sys_time()) VS Index Cond: (o.agreement_id = 38)
PS. It would be nice if I could contact the developer of this feature (I have one more similar problem).
UPD: I did some experiments. When I remove the window functions from my query, it is fast in either case. So why do window functions prevent index usage in some cases? How can I avoid or work around that?
dbfiddle with minimal test case
Server version is v13.1
Full query:
WITH gconf AS
-- https://www.postgresql.org/docs/current/queries-with.html#QUERIES-WITH-SELECT
NOT MATERIALIZED -- force it to be merged into the parent query
-- it gives a net savings because each usage of the WITH query needs only a small part of the WITH query's full output.
tstzrange( '2021-05-01', '2021-05-01', '[]') AS acc_period,
(o).agreement_id AS id, -- Required to passthrough WINDOW FUNCTION
(o).id AS order_id,
(ic).consumed_period AS consumed_period,
dense_rank() OVER ( PARTITION BY (o).agreement_id, (o).id ORDER BY (ic).consumed_period ) AS nconf,
row_number() OVER ( wconf ORDER BY (c).sort_order NULLS LAST ) AS nitem,
(sum( ocd.item_cost ) OVER wconf)::numeric( 10, 2) AS conf_cost,
max((ocd.ic).consumed) OVER wconf AS consumed,
THEN (sum( ocd.item_suma ) OVER wconf)::numeric( 10, 2 )
ELSE (sum( ocd.item_cost ) OVER wconf)::numeric( 10, 2 )
END AS conf_suma
FROM order_cost_details( tstzrange( '2021-05-01', '2021-05-01', '[]') ) ocd
WHERE true OR (ocd.ic).consumed_period @> lower( tstzrange( '2021-05-01', '2021-05-01', '[]') )
WINDOW wconf AS ( PARTITION BY (o).agreement_id, (o).id, (ic).consumed_period )
gorder AS (
(conf_suma/6)::numeric( 10, 2 ) as conf_nds,
sum( conf_suma ) FILTER (WHERE nitem = 1) OVER worder AS order_suma
FROM gconf
WINDOW worder AS ( PARTITION BY gconf.id, (o).id )
-- TODO: Ask PG developers: Why changing to (o).agreement_id slows down query?
-- WINDOW worder AS ( PARTITION BY (o).agreement_id, (o).id )
u.id, consumed_period, nconf, nitem,
(c).id as item_id,
COALESCE( (c).sort_order, pd.sort_order ) as item_order,
COALESCE( st.display, st.name, rt.display, rt.name ) as item_name,
COALESCE( item_qty, (c).amount/rt.unit ) as item_qty,
COALESCE( (p).label, rt.label ) as measure,
item_price, item_cost, item_suma,
conf_cost, consumed, conf_suma, conf_nds, order_suma,
(order_suma/6)::numeric( 10, 2 ) as order_nds,
sum( conf_suma ) FILTER (WHERE nitem = 1 ) OVER wagreement AS total_suma,
sum( (order_suma/6)::numeric( 10, 2 ) ) FILTER (WHERE nitem = 1 AND nconf = 1) OVER wagreement AS total_nds,
pkg.id as package_id,
pkg.link_1c_id as package_1c_id,
COALESCE( pkg.display, pkg.name ) as package,
FROM gorder u
LEFT JOIN resource_type rt ON rt.id = (c).resource_type_id
LEFT JOIN service_type st ON st.id = (c).service_type_id
LEFT JOIN package pkg ON pkg.id = (o).package_id
LEFT JOIN package_detail pd ON pd.package_id = (o).package_id
AND pd.resource_type_id IS NOT DISTINCT FROM (c).resource_type_id
AND pd.service_type_id IS NOT DISTINCT FROM (c).service_type_id
-- WHERE (o).agreement_id = 38 -- slow
WHERE u.id = 38 -- fast
WINDOW wagreement AS ( PARTITION BY (o).agreement_id )
As a workaround, we can additionally SELECT an alias for the column used in the PARTITION BY expression. PG then applies the optimization and uses the index.
The answer to the question could be: PG does not apply this optimization when the column is accessed through a composite type. Notice how it works:
See this dbfiddle
create table agreement ( ag_id int, name text, cost numeric(10,2) );
create index ag_idx on agreement (ag_id);
insert into agreement (ag_id, name, cost) values ( 1, '333', 22 ),
(1,'333', 33), (1, '333', 7), (2, '555', 18 ), (2, '555', 2), (3, '777', 4);
select * from agreement;
create function initial ()
returns table( agreement_id int, ag agreement ) language sql stable AS $$
select ag_id, t from agreement t;
$$;
select * from initial() t;
explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
select *,
sum( (t.ag).cost ) over ( partition by agreement_id ) as total
from initial() t
)
select * from totals_by_ag t
where (t.ag).ag_id = 1; -- index is NOT USED
explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
select *,
sum( (t.ag).cost ) over ( partition by agreement_id ) as total
from initial() t
)
select * from totals_by_ag t
where agreement_id = 1; -- index is used when alias for column is used
explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
select *,
sum( (t.ag).cost ) over ( partition by (t.ag).ag_id ) as total --renamed
from initial() t
)
select * from totals_by_ag t
where agreement_id = 1; -- index is NOT USED because grouping by original column
explain( analyze, costs, buffers, verbose ) with totals_by_ag as (
select *,
sum( (t.ag).cost ) over ( partition by (t.ag).ag_id ) as total --renamed
from initial() t
)
select * from totals_by_ag t
where (t.ag).ag_id = 1; -- index is NOT USED even if at both cases original column
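
Some background on why the partitioning column matters at all: a filter can only be moved below a window function safely when it tests a column that the window partitions by, because then it removes whole partitions and leaves the remaining window results unchanged. The Python sketch below illustrates that property on the `(ag_id, cost)` values from the sample table; the helper `window_sum` is an assumption standing in for `sum(cost) over (partition by ag_id)`, and nothing here models the planner itself.

```python
from collections import defaultdict

# (ag_id, cost) pairs taken from the agreement sample rows
ROWS = [(1, 22), (1, 33), (1, 7), (2, 18), (2, 2), (3, 4)]

def window_sum(rows):
    """Emulate sum(cost) over (partition by ag_id): one output row per input row."""
    totals = defaultdict(int)
    for ag_id, cost in rows:
        totals[ag_id] += cost
    return [(ag_id, cost, totals[ag_id]) for ag_id, cost in rows]

# Filter on the partition key: the same answer whether applied before or after.
after_key = [r for r in window_sum(ROWS) if r[0] == 1]
before_key = window_sum([r for r in ROWS if r[0] == 1])
print(after_key == before_key)

# Filter on a non-key column: pushing it down changes the window totals.
after_cost = [r for r in window_sum(ROWS) if r[1] > 10]
before_cost = window_sum([r for r in ROWS if r[1] > 10])
print(after_cost == before_cost)
```

So the planner has to recognise that the filtered expression and the PARTITION BY expression are the same column before it can push the condition down to the index scan, which is consistent with the alias making a difference in the experiments above.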

How to `sum( DISTINCT <column> ) OVER ()` using window function?

I have the following data:
Here I have already calculated the total for each conf_id, but I also want to calculate the total for the whole partition, e.g.:
calculate the total suma per agreement from each of its orders (not from the goods in an order, which are rounded slightly differently).
How do I sum 737.38 and 1238.3, i.e. take only one number from each group?
(I cannot use sum( item_suma ), because it would return 1975.67. Note the rounding applied to conf_suma as an intermediate step.)
Full query. Here I want to calculate the rounded suma for each group, and then the total suma over those groups.
SELECT app_period( '2021-02-01', '2021-03-01' );
target_date AS ( SELECT '2021-02-01'::timestamptz ),
target_order as (
tstzrange( '2021-01-01', '2021-02-01') as bill_range,
FROM ( SELECT * FROM "order_bt" WHERE sys_period @> sys_time() ) o
OR o.agreement_id = 3385 and o.period_id = 10
o.agreement_id as agreement_id,
o.id AS order_id,
(dense_rank() over (PARTITION BY o.agreement_id ORDER BY o.id )) as zzzz_id,
(dense_rank() over (PARTITION BY o.agreement_id, o.id ORDER BY (ocd.ic).consumed_period )) as conf_id,
sum( ocd.item_suma ) OVER( PARTITION BY (ocd.o).agreement_id ) AS agreement_suma2,
(sum( ocd.item_suma ) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )) AS x_suma,
(sum( ocd.item_cost ) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period )) AS x_cost,
(sum( ocd.item_suma ) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period ))::numeric( 10, 2) AS conf_suma,
(sum( ocd.item_cost ) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period ))::numeric( 10, 2) AS conf_cost,
max((ocd.ic).consumed) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id, (ocd.ic).consumed_period ) AS consumed,
(sum( ocd.item_suma ) OVER( PARTITION BY (ocd.o).agreement_id, (ocd.o).id )) AS order_suma2
FROM target_order o
LEFT JOIN order_cost_details( o.bill_range ) ocd
ON (ocd.o).id = o.id AND (ocd.ic).consumed_period && o.app_period
(conf_suma/6) ::numeric( 10, 2 ) as group_nds,
(SELECT sum(x) from (SELECT sum( DISTINCT conf_suma ) AS x FROM usage sub_u WHERE sub_u.agreement_id = usage.agreement_id GROUP BY agreement_id, order_id) t) as total_suma,
(SELECT sum(x) from (SELECT (sum( DISTINCT conf_suma ) /6)::numeric( 10, 2 ) AS x FROM usage sub_u WHERE sub_u.agreement_id = usage.agreement_id GROUP BY agreement_id, order_id) t) as total_nds
My old question
I found a solution. See dbfiddle.
To run a window function over distinct values, I need to take only the first value from each peer group. To achieve this I:
aggregate the IDs of the rows in the peer group,
lag this aggregation by one row,
mark rows that have not been aggregated yet (i.e. the first row of each peer group) as _distinct,
apply sum( ) FILTER ( WHERE _distinct ) over ( ... ).
Voila. You get a sum over DISTINCT values within the target PARTITION,
which PostgreSQL does not implement yet.
with data as (
select * from (values
( 1, 1, 1, 1.0049 ), (2, 1,1,1.0049), ( 3, 1,1,1.0049 ) ,
( 4, 1, 2, 1.0049 ), (5, 1,2,1.0057),
( 6, 2, 1, 1.53 ), ( 7,2,1,2.18), ( 8,2,2,3.48 )
) t (id, agreement_id, order_id, suma)
),
intermediate as (select
*,
sum( suma ) over ( partition by agreement_id, order_id ) as fract_order_suma,
sum( suma ) over ( partition by agreement_id ) as fract_agreement_total,
(sum( suma::numeric(10,2) ) over ( partition by agreement_id, order_id )) as wrong_order_suma,
(sum( suma ) over ( partition by agreement_id, order_id ))::numeric( 10, 2) as order_suma,
(sum( suma ) over ( partition by agreement_id ))::numeric( 10, 2) as wrong_agreement_total,
id as xid,
array_agg( id ) over ( partition by agreement_id, order_id ) as agg
from data),
distinc as (select *,
lag( agg ) over ( partition by agreement_id ) as prev,
id = any (lag( agg ) over ()) is not true as _distinct, -- allow to match first ID from next peer
order_suma as xorder_suma, -- repeat column to easily visually compare with _distinct
(SELECT sum(x) from (SELECT sum( DISTINCT order_suma ) AS x FROM intermediate sub_q WHERE sub_q.agreement_id = intermediate.agreement_id GROUP BY agreement_id, order_id) t) as correct_total_suma
from intermediate
)
select *,
sum( order_suma ) filter ( where _distinct ) over ( partition by agreement_id ) as also_correct_total_suma
from distinc;
A better approach (dbfiddle):
Assign a row number within each order: row_number() over (partition by agreement_id, order_id ) as nrow
Take only the first suma of each order: filter (where nrow = 1)
with data as (
select * from (values
( 1, 1, 1, 1.0049 ), (2, 1,1,1.0049), ( 3, 1,1,1.0049 ) ,
( 4, 1, 2, 1.0049 ), (5, 1,2,1.0057),
( 6, 2, 1, 1.53 ), ( 7,2,1,2.18), ( 8,2,2,3.48 )
) t (id, agreement_id, order_id, suma)
),
intermediate as (select
*,
row_number() over (partition by agreement_id, order_id ) as nrow,
(sum( suma ) over ( partition by agreement_id, order_id ))::numeric( 10, 2) as order_suma
from data)
select *,
sum( order_suma ) filter (where nrow = 1) over (partition by agreement_id)
from intermediate;
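
To see why the per-order rounding matters, the same arithmetic can be replayed in Python on the sample rows. This is only a sketch of the intent (round each order's sum to two decimals, then add one rounded value per order); the function name `agreement_totals` is an assumption, and `ROUND_HALF_UP` is used on the assumption that it matches how `numeric(10,2)` rounds these positive values.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

# (id, agreement_id, order_id, suma) rows from the sample; suma kept as text
ROWS = [
    (1, 1, 1, "1.0049"), (2, 1, 1, "1.0049"), (3, 1, 1, "1.0049"),
    (4, 1, 2, "1.0049"), (5, 1, 2, "1.0057"),
    (6, 2, 1, "1.53"), (7, 2, 1, "2.18"), (8, 2, 2, "3.48"),
]
CENT = Decimal("0.01")

def agreement_totals(rows):
    """Sum each order's rounded total exactly once per agreement."""
    order_sums = defaultdict(Decimal)
    for _id, agreement_id, order_id, suma in rows:
        order_sums[(agreement_id, order_id)] += Decimal(suma)
    totals = defaultdict(Decimal)
    for (agreement_id, _order_id), fractional in order_sums.items():
        totals[agreement_id] += fractional.quantize(CENT, rounding=ROUND_HALF_UP)
    return dict(totals)

def naive_totals(rows):
    """Round only once at the end, which is the result the question rejects."""
    totals = defaultdict(Decimal)
    for _id, agreement_id, _order_id, suma in rows:
        totals[agreement_id] += Decimal(suma)
    return {k: v.quantize(CENT, rounding=ROUND_HALF_UP) for k, v in totals.items()}

print(agreement_totals(ROWS))
print(naive_totals(ROWS))
```

For agreement 1 the two functions disagree by one cent, which is the discrepancy that the `filter (where nrow = 1)` trick avoids in SQL by counting each order's rounded sum once.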

SQL 2017 - Comparing values between two tables where certain values can be NULL

I have the following tables with the following data:
CREATE TABLE TestSource (
InstrumentID int,
ProviderID int,
KPI1 int,
Col2 varchar(255),
KPI3 int
);
CREATE TABLE TestTarget (
InstrumentID int,
ProviderID int,
KPI1 int,
Col2 varchar(255),
KPI3 int
);
INSERT INTO TestSource (InstrumentID,ProviderID,KPI1,Col2,KPI3)
VALUES (123, 27, 1, 'ABC', 10.0 ),
(1234, 27, 2, 'DEF', 10.0 ),
(345, 27, 1, NULL, 0.00 );
INSERT INTO TestTarget (InstrumentID,ProviderID,KPI1,Col2,KPI3)
VALUES (123, 27, 1, 'ABC', 10.0 ),
(1234, 27, 2, 'DEF', 10.0 ),
(345, 27, 1, 'ABC', 0.0 );
I'm trying to compare the values between tables. Here's the query logic I am currently using:
compare_source (InstrumentID,ProviderID,
/*** Source columns to compare ***/
Col1Source, Col2Source,Col3Source
as (
select InstrumentID
--,ISNULL(Col2,'NA') as Col2
from TestSource
group by
compare_target (InstrumentID,ProviderID,
/*** Target columns to compare ***/
from TestTarget
group by
SELECT @Result = STRING_AGG ('InstrumentID = ' + CONVERT(VARCHAR,InstrumentID)
+ ', Col1: ' + CONVERT(VARCHAR,Col1Source) + ' vs ' + CONVERT(VARCHAR,Col1Target)
+ ', Col2: ' + CONVERT(VARCHAR,Col2Source) + ' vs ' + CONVERT(VARCHAR,Col2Target)
+ ', Col3: ' + CONVERT(VARCHAR,Col3Source) + ' vs ' + CONVERT(VARCHAR,Col3Target)
, CHAR(13) + CHAR(10)
from compare_source s
left join compare_target t on t.InstrumentID = s.InstrumentID and t.ProviderID = s.ProviderID
where not exists
select 1 from compare_target t where
s.InstrumentID = t.InstrumentID AND
( s.Col1Source = t.Col1Target ) OR (ISNULL(s.Col1Source, t.Col1Target) IS NULL) AND
( s.Col2Source = t.Col2Target ) OR (ISNULL(s.Col2Source, t.Col2Target) IS NULL) AND
( s.Col3Source = t.Col3Target ) OR (ISNULL(s.Col3Source, t.Col3Target) IS NULL)
) diff
PRINT @Result
When there are no NULL values in my tables, the comparison works well. However, as soon as I insert NULLs into either of the tables, my comparison logic breaks down and no longer reports the differences between the tables' values.
I know that I could easily do an ISNULL on my columns in my individual selects, however, I'd like to keep it as generic as possible and to only do my comparison checks and NULL checks in my final NOT EXISTS comparison WHERE clause.
I've also tried the following logic in my comparison logic without success:
select 1 from compare_target t where
s.InstrumentID = t.InstrumentID AND
( s.Col1Source = t.Col1Target OR (s.Col1Source IS NULL AND t.Col1Target IS NULL) ) AND
( s.Col2Source = t.Col2Target OR (s.Col2Source IS NULL AND t.Col2Target IS NULL) ) AND
( s.Col3Source = t.Col3Target OR (s.Col3Source IS NULL AND t.Col3Target IS NULL) )
Another issue I am having is that my query cannot distinguish between data formats (for example, it sees the value 0.00 as equivalent to 0.0)
I'm not totally certain as to what I am missing.
Any help to put me on the right path would be great.
Well, the two problems I see are these:
The WHERE clause at the bottom needs extra parentheses to group each OR inside the surrounding ANDs, so that the order of precedence is correct:
select 1 from compare_target t where
s.InstrumentID = t.InstrumentID AND
(( s.Col1Source = t.Col1Target ) OR (ISNULL(s.Col1Source, t.Col1Target) IS NULL)) AND
(( s.Col2Source = t.Col2Target ) OR (ISNULL(s.Col2Source, t.Col2Target) IS NULL)) AND
(( s.Col3Source = t.Col3Target ) OR (ISNULL(s.Col3Source, t.Col3Target) IS NULL))
Once you make that change, the one row that is returned has a NULL value in the Col2Source column. When you build the string that is passed to STRING_AGG, that NULL sits in the middle of the concatenation, so the entire string becomes NULL. You will therefore need to use ISNULL either in the subquery in your FROM clause or within the STRING_AGG() call, or, I suppose, right where you had it commented out.
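
Both points can be illustrated outside SQL Server with a small Python sketch, using `None` in place of NULL. The helper names (`null_safe_equal`, `rows_match`, `describe`) are assumptions made for illustration, and Python's two-valued booleans only approximate SQL's three-valued logic, so treat this as a model of the intent rather than of T-SQL itself.

```python
def null_safe_equal(a, b):
    """Model of (a = b) OR (a IS NULL AND b IS NULL)."""
    if a is None or b is None:
        return a is None and b is None
    return a == b

def rows_match(source, target):
    """Every column pair must match; each pair is evaluated as its own group."""
    return all(null_safe_equal(s, t) for s, t in zip(source, target))

def describe(value, placeholder="NA"):
    """Stand-in for ISNULL(col, 'NA') so a missing value cannot blank the message."""
    return placeholder if value is None else str(value)

source = (1, None, 0)    # KPI1, Col2, KPI3 for InstrumentID 345 in TestSource
target = (1, "ABC", 0)   # the same instrument in TestTarget

if not rows_match(source, target):
    message = ", ".join(
        f"Col{i}: {describe(s)} vs {describe(t)}"
        for i, (s, t) in enumerate(zip(source, target), start=1)
    )
    print(message)

# Precedence: AND binds tighter than OR, so "A or B and C" is "A or (B and C)".
print((True or False and False) == (True or (False and False)))
```

Keeping the NULL handling inside one helper per column pair is the equivalent of the extra parentheses in the WHERE clause, and `describe` corresponds to wrapping each column in ISNULL before concatenation.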