How to translate SQL to DAX, Need to add FILTER - tsql

I want to create a calculated table that summarizes In_Force Premium from the existing table fact_Premium.
How can I filter the result by saying:
TODAY() has to be between `fact_Premium[EffectiveDate]` and (SELECT TOP 1 fact_Premium[ExpirationDate] ORDER BY QuoteID DESC)
In SQL I'd do it like this:
WHERE CONVERT(date, getdate()) between CONVERT(date, tblQuotes.EffectiveDate)
and (
select top 1 q2.ExpirationDate
from Table2 Q2
where q2.ControlNo = Table1.controlno
order by quoteid desc
)
Here is my DAX statement so far:
In_Force Premium =
FILTER(
ADDCOLUMNS(
SUMMARIZE(
//Grouping necessary columns
fact_Premium,
fact_Premium[QuoteID],
fact_Premium[Division],
fact_Premium[Office],
dim_Company[CompanyGUID],
fact_Premium[LineGUID],
fact_Premium[ProducerGUID],
fact_Premium[StateID],
fact_Premium[ExpirationDate]
),
"Premium", CALCULATE(
SUM(fact_Premium[Premium])
),
"ControlNo", CALCULATE(
DISTINCTCOUNT(fact_Premium[ControlNo])
)
), // Here I need to make sure TODAY() falls between fact_Premium[EffectiveDate] and (SELECT TOP 1 fact_Premium[ExpirationDate] ORDER BY QuoteID DESC)
)
Also, which would be more efficient: creating the calculated table from fact_Premium, or building the same table with a SQL statement (--> Get Data --> SQL Server)?

There are two potential ways in T-SQL to get the next effective date. One is to use LEAD() and the other is to use an APPLY operator. As there are few details to work with, here are samples:
select *
from (
select *
, lead(EffectiveDate) over(partition by CompanyGUID order by quoteid desc) as NextEffectiveDate
from Table1
join Table2 on ...
) d
or
select table1.*, oa.NextEffectiveDate
from Table1
outer apply (
select top(1) q2.ExpirationDate AS NextEffectiveDate
from Table2 Q2
where q2.ControlNo = Table1.controlno
order by quoteid desc
) oa
N.B. An outer apply is somewhat similar to a left join in that it still returns rows for which the subquery finds no match (NextEffectiveDate is NULL for those rows); if that is not needed, then use cross apply instead.
In both these approaches you may refer to NextEffectiveDate in a final where clause, but I would prefer to avoid using the convert function if that is feasible (this depends on the data).

Related

Redshift Cross join ignoring where clause

I have the following query:
WITH MY_CTE as
(
select
....
.....
)
SELECT
MY_CTE.*
,tt.currency as most_used_currency
from MY_CTE
cross join
(select t.currency
from My_CTE t
group by t.currency
order by count(*) desc
limit 1
) tt
where MY_CTE.currency = 'EUR'
but the cross join is ignoring my where clause.
How can I enforce that it processes the where clause before working on the cross join please?
Sample data returned:
This is obviously wrong because I said do not include currency SEK, and yet it is saying it's the most popular currency.
I cannot put the where clause inside of the cross join because I will be using this in Tableau and need the users to be able to filter on certain criteria, e.g. currency.
The most popular currency should be EUR if the MY_CTE is filtered to show only EUR currency
The WHERE condition in this case has nothing to do with the cross join; it only filters the outer rows after the join has been performed, while the subquery tt still counts every row of MY_CTE. If you need to report on a single currency only, the two simplest places to add the currency filter are (marked as comments in the SQL):
1) Option 1 - add filter already in CTE statement
2) Option 2 - add filter at the end (as already done) and within tt part.
WITH MY_CTE as
(
select
....
.....
/* OPTION 1*/
)
SELECT
MY_CTE.*
,tt.currency as most_used_currency
from MY_CTE
cross join
(select t.currency
from My_CTE t
/* OPTION 2 first place*/
group by t.currency
order by count(*) desc
limit 1
) tt
where MY_CTE.currency = 'EUR' /* OPTION 2a second place*/
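The behaviour described above can be reproduced with a small sketch. It uses SQLite from Python rather than Redshift, and the `payments` table and its five rows are invented for the example; the query shape is the one from the question, with the inner filter (OPTION 2, first place) switched on or off.

```python
import sqlite3

# Hypothetical data: SEK is the most common currency overall.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (customer_id INTEGER, currency TEXT);
    INSERT INTO payments VALUES
        (1, 'EUR'), (2, 'EUR'),
        (3, 'SEK'), (4, 'SEK'), (5, 'SEK');
""")

def most_used(filter_inside):
    # The inner filter is the only difference between the two runs.
    inner_filter = "WHERE t.currency = 'EUR'" if filter_inside else ""
    sql = f"""
        WITH my_cte AS (SELECT customer_id, currency FROM payments)
        SELECT my_cte.customer_id, my_cte.currency,
               tt.currency AS most_used_currency
        FROM my_cte
        CROSS JOIN (
            SELECT t.currency
            FROM my_cte t
            {inner_filter}
            GROUP BY t.currency
            ORDER BY COUNT(*) DESC
            LIMIT 1
        ) tt
        WHERE my_cte.currency = 'EUR'
        ORDER BY my_cte.customer_id
    """
    return conn.execute(sql).fetchall()

outer_only = most_used(filter_inside=False)
both = most_used(filter_inside=True)
print(outer_only)  # [(1, 'EUR', 'SEK'), (2, 'EUR', 'SEK')]
print(both)        # [(1, 'EUR', 'EUR'), (2, 'EUR', 'EUR')]
```

With the filter only on the outer query, the derived table still sees the SEK rows and reports SEK; once the same filter is repeated inside the derived table, it reports EUR.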
The alias tt will return the most popular currency overall, which is SEK. If you want to filter for separate currencies, you'll need to put them in the inner query as well as the outer one. However, if that isn't an option, you'll want to return all currencies with their popularity, and filter on the most popular one you allow.
....
....
SELECT
LAST_VALUE(MY_CTE.customer_id)
OVER (partition by customer_id
ORDER BY tt.popularity
rows between unbounded preceding and unbounded following)
.... /* rest of your columns */
, LAST_VALUE(tt.currency)
OVER (partition by customer_id
ORDER BY tt.popularity
rows between unbounded preceding and unbounded following)
from MY_CTE
cross join
(select t.currency,
count(*) popularity
from My_CTE t
group by t.currency
order by count(*) desc
/* removed limit 1 */
) tt
where MY_CTE.currency = 'EUR'
AND tt.currency IN ('EUR') /* Added tt.currency filter */

Unable to get Percentile_Cont() to work in Postgresql

I am trying to calculate a percentile using the percentile_cont() function in PostgreSQL using common table expressions. The goal is to find the top 1% of accounts with regard to their balances (called amount here). My logic is to find the 99th percentile, which will return those whose account balances are greater than 99% of their peers (and thus find the 1 percenters).
Here is my query
--ranking subquery works fine
with ranking as(
select a.lname,sum(c.amount) as networth from customer a
inner join
account b on a.customerid=b.customerid
inner join
transaction c on b.accountid=c.accountid
group by a.lname order by sum(c.amount)
)
select lname, networth, percentile_cont(0.99) within group
order by networth over (partition by lname) from ranking ;
I keep getting the following error.
ERROR: syntax error at or near "order"
LINE 2: ...ame, networth, percentile_cont(0.99) within group order by n..
I am thinking that perhaps I forgot a closing brace etc. but I can't seem to figure out where. I know it could be something with the order keyword but I am not sure what to do. Can you please help me to fix this error?
This tripped me up, too.
It turns out percentile_cont is not supported in postgres 9.3, only in 9.4+.
https://www.postgresql.org/docs/9.4/static/release-9-4.html
So you have to use something like this:
with ordered_purchases as (
select
price,
row_number() over (order by price) as row_id,
(select count(1) from purchases) as ct
from purchases
)
select avg(price) as median
from ordered_purchases
where row_id between ct/2.0 and ct/2.0 + 1
That query is courtesy of https://www.periscopedata.com/blog/medians-in-sql (section: "Median on Postgres")
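To see how the row_number() workaround behaves for even and odd row counts, here is a small sketch that runs the same query against SQLite from Python (window functions need SQLite 3.25 or later). The `purchases` rows are made up for the example. Note that this query computes a median, i.e. the 50th percentile, not the 99th percentile from the question.

```python
import sqlite3

def median_price(prices):
    """Load hypothetical prices into an in-memory table and return the
    median computed with the row_number() query shown above."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (price REAL)")
    conn.executemany("INSERT INTO purchases VALUES (?)", [(p,) for p in prices])
    row = conn.execute("""
        WITH ordered_purchases AS (
            SELECT price,
                   ROW_NUMBER() OVER (ORDER BY price) AS row_id,
                   (SELECT COUNT(1) FROM purchases) AS ct
            FROM purchases
        )
        SELECT AVG(price) AS median
        FROM ordered_purchases
        WHERE row_id BETWEEN ct / 2.0 AND ct / 2.0 + 1
    """).fetchone()
    conn.close()
    return row[0]

print(median_price([10, 20, 30, 40]))  # 25.0 (average of rows 2 and 3)
print(median_price([10, 20, 30]))      # 20.0 (row 2 only)
```

For four rows the BETWEEN range is 2.0 to 3.0, so the two middle rows are averaged; for three rows it is 1.5 to 2.5, which matches only the middle row.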
You are missing the brackets in the within group (order by x) part.
Try this:
with ranking
as (
select a.lname,
sum(c.amount) as networth
from customer a
inner join account b on a.customerid = b.customerid
inner join transaction c on b.accountid = c.accountid
group by a.lname
order by networth
)
select lname,
networth,
percentile_cont(0.99) within group (
order by networth
) over (partition by lname)
from ranking;
I want to point out that you don't need a subquery for this:
select c.lname, sum(t.amount) as networth,
percentile_cont(0.99) within group (order by sum(t.amount)) over (partition by lname)
from customer c inner join
account a
on c.customerid = a.customerid inner join
transaction t
on a.accountid = t.accountid
group by c.lname
order by networth;
Also, when using table aliases (which should be always), table abbreviations are much easier to follow than arbitrary letters.

TSQL - LEAD for Next Different Row

Is there a way to use the lead function such that I can get the next row where something has changed, as opposed to the next row where it is the same?
In this example, the RowType can be 'in' or 'out', for each 'in' I need to know the next RowNumber where it has become 'out'. I have been playing with the lead function as it is really fast, however I haven't been able to get it working. I just need to do the following really, which is partition by a RowType which isn't the one in the current row.
select
RowNumber
,RowType --In this case I am only interested in RowType = 'In'
, Lead(RowNumber)
OVER (partition by "RowType = out" --This is the bit I am stuck on--
order by RowNumber ASC) as NextOutFlow
from table
order by RowNumber asc
Thanks in advance for any help
Rather than using lead() I would use an outer apply that returns the next row with type out for all rows with type in:
select RowNumber, RowType, nextOut
from your_table t
outer apply (
select min(RowNumber) as nextOut
from your_table
where RowNumber > t.RowNumber and RowType='Out'
) oa
where RowType = 'In'
order by RowNumber asc
Given sample data like:
RowNumber RowType
1 in
2 out
3 in
4 in
5 out
6 in
This would return:
RowNumber RowType nextOut
1 in 2
3 in 5
4 in 5
6 in NULL
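The result above can be checked with a short sketch. It uses SQLite from Python, which has no OUTER APPLY, so a correlated scalar subquery is swapped in; for a single MIN() value the two forms return the same rows. The table name `your_table` follows the answer, and the string comparison is case-sensitive here, so the lowercase values from the sample data are used throughout.

```python
import sqlite3

# Sample data from the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE your_table (RowNumber INTEGER PRIMARY KEY, RowType TEXT);
    INSERT INTO your_table VALUES
        (1, 'in'), (2, 'out'), (3, 'in'), (4, 'in'), (5, 'out'), (6, 'in');
""")

# For each 'in' row, find the smallest later RowNumber whose type is 'out'.
rows = conn.execute("""
    SELECT t.RowNumber, t.RowType,
           (SELECT MIN(o.RowNumber)
              FROM your_table o
             WHERE o.RowNumber > t.RowNumber
               AND o.RowType = 'out') AS nextOut
    FROM your_table t
    WHERE t.RowType = 'in'
    ORDER BY t.RowNumber
""").fetchall()

for row in rows:
    print(row)
# (1, 'in', 2)
# (3, 'in', 5)
# (4, 'in', 5)
# (6, 'in', None)
```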
I think this will work.
If you used a bit field for in/out you would get better performance.
;with cte1 as
(
SELECT [inden], [OnOff]
, lag([OnOff]) over (order by [inden]) as [lagOnOff]
FROM [OnOff]
), cte2 as
(
select [inden], [OnOff], [lagOnOff]
, lead([inden]) over (order by [inden]) as [Leadinden]
from cte1
where [OnOff] <> [lagOnOff]
or [lagOnOff] is null
)
select [inden], [OnOff], [lagOnOff], [Leadinden]
from cte2
where [OnOff] = 'true'
This is probably slower, but it may work if you have the right indexes:
select t1.rowNum as 'rowNumIn', min(t2.rownum) as 'nextRowNumOut'
from your_table t1
join your_table t2
on t1.rowType = 'In'
and t2.rowType = 'Out'
and t2.rowNum > t1.rowNum
and t2.rowNum < t1.rowNum + 1000 -- if you can constrain it
group by t1.rowNum

multiple extract() with WHERE clause possible?

So far I have come up with the below:
WHERE (extract(month FROM orderdate)) =
(SELECT min(extract(month from orderdate))
FROM orders)
However, that will consequently return zero to many rows, and in my case, many, because many orders exist within that same earliest (minimum) month, i.e. 4th February, 9th February, 15th Feb, ...
I know that a WHERE clause can contain multiple columns, so why wouldn't the below work?
WHERE (extract(day FROM orderdate)), (extract(month FROM orderdate)) =
(SELECT min(extract(day from orderdate)), min(extract(month FROM orderdate))
FROM orders)
I simply get: SQL Error: ORA-00920: invalid relational operator
Any help would be great, thank you!
Sample data:
02-Feb-2012
14-Feb-2012
22-Dec-2012
09-Feb-2013
18-Jul-2013
01-Jan-2014
Output:
02-Feb-2012
14-Feb-2012
Desired output:
02-Feb-2012
I recreated your table and found out you just messed up the brackets a bit. The following works for me:
where
(extract(day from OrderDate),extract(month from OrderDate))
=
(select
min(extract(day from OrderDate)),
min(extract(month from OrderDate))
from orders
)
Use something like this:
with cte1 as (
select
extract(month from OrderDate) date_month,
extract(day from OrderDate) date_day,
OrderNo
from tablename
), cte2 as (
select min(date_month) min_date_month, min(date_day) min_date_day
from cte1
)
select cte1.*
from cte1
where (date_month, date_day) = (select min_date_month, min_date_day from cte2)
A common table expression enables you to restructure your data and then use this data to do your select. The first CTE block (cte1) selects the month and the day for each of your table rows. cte2 then selects min(month) and min(day). The last select then combines both CTEs to select all rows from cte1 that have the desired month and day.
There is probably a shorter solution to that, however I like common table expressions as they are almost all the time better to understand than the "optimal, shortest" query.
If that is really what you want, as bizarre as it seems, then as a different approach you could forget the extracts and the subquery against the table to get the minimums, and use an analytic approach instead:
select orderdate
from (
select o.*,
row_number() over (order by to_char(orderdate, 'MMDD')) as rn
from orders o
)
where rn = 1;
ORDERDATE
---------
01-JAN-14
The row_number() effectively adds a pseudo-column to every row in your original table, based on the month and day in the order date. The rn values are unique, so there will be one row marked as 1, which will be from the earliest day in the earliest month. If you have multiple orders with the same day/month, say 01-Jan-2013 and 01-Jan-2014, then you'll still only get exactly one with rn = 1, but which is picked is indeterminate. You'd need to add further order by conditions to make it deterministic, but I have no idea what you might want.
That is done in the inner query; the outer query then filters so that only the record marked with rn = 1 is returned, so you get exactly one row back from the overall query.
This also avoids the situation where the earliest day number is not in the earliest month number - say if you only had 02-Jan-2014 and 01-Feb-2014; comparing the day and month separately would look for 01-Jan-2014, which doesn't exist.
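A small sketch of that pitfall, using two hypothetical dates (02-Jan-2014 and 01-Feb-2014) where the smallest day number and the smallest month number sit in different rows. It runs against SQLite from Python, so strftime() stands in for Oracle's extract() and to_char(), and the dates are stored as ISO text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderdate TEXT);  -- ISO dates stored as text
    INSERT INTO orders VALUES ('2014-01-02'), ('2014-02-01');
""")

# Minimum day and minimum month taken separately: looks for 01-Jan.
separate = conn.execute("""
    SELECT orderdate FROM orders
    WHERE CAST(strftime('%d', orderdate) AS INTEGER) =
              (SELECT MIN(CAST(strftime('%d', orderdate) AS INTEGER)) FROM orders)
      AND CAST(strftime('%m', orderdate) AS INTEGER) =
              (SELECT MIN(CAST(strftime('%m', orderdate) AS INTEGER)) FROM orders)
""").fetchall()

# Month and day ordered together as one 'MMDD' key.
combined = conn.execute("""
    SELECT orderdate FROM (
        SELECT orderdate,
               ROW_NUMBER() OVER (ORDER BY strftime('%m%d', orderdate)) AS rn
        FROM orders
    )
    WHERE rn = 1
""").fetchall()

print(separate)  # []
print(combined)  # [('2014-01-02',)]
```

The separate comparison returns no rows because no order falls on 01-Jan, while ordering on the combined key still returns the earliest month/day that actually exists.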
SQL Fiddle (with Thomas Tschernich's answer thrown in too, giving the same result for this data).
To join the result against your invoice table, you don't need to join to the orders table again - especially not with a cross join, which is skewing your results. You can do the join (at least) two ways:
SELECT
o.orderno,
to_char(o.orderdate, 'DD-MM-YYYY'),
i.invno
FROM
(
SELECT o.*,
row_number() over (order by to_char(orderdate, 'MMDD')) as rn
FROM orders o
) o, invoices i
WHERE i.invno = o.invno
AND rn = 1;
Or:
SELECT
o.orderno,
to_char(o.orderdate, 'DD-MM-YYYY'),
i.invno
FROM
(
SELECT orderno, orderdate, invno
FROM
(
SELECT o.*,
row_number() over (order by to_char(orderdate, 'MMDD')) as rn
FROM orders o
)
WHERE rn = 1
) o, invoices i
WHERE i.invno = o.invno;
The first looks like it does more work but the execution plans are the same.
SQL Fiddle with your pastebin-supplied query that gets two rows back, and these two that get one.

In Firebird, how to aggregate the first N rows?

I would like to do something like this:
CNT=2;
//[edit]
select avg(price) from (
select first :CNT p.Price
from Price p
order by p.Date desc
);
This does not work, Firebird does not allow :cnt as a parameter to FIRST. I need to average the first CNT newest prices. The number 2 changes so it can not be hard-coded.
This can be broken out into a FOR SELECT loop and break when a count is reached. Is that the best way though? Can this be done in a single SQL statement?
Creating the SQL as a string and running it is not the best fit either. It is important that the database compile my SQL statement.
You don't have to use CTE, you can do it directly:
select avg(price) from (
select first :cnt p.Price
from Price p
order by p.Date desc
);
You can use a CTE (Common Table Expression) (see http://www.firebirdsql.org/refdocs/langrefupd21-select.html#langrefupd21-select-cte) to select the data before calculating the average.
See example below:
with query1 as (
select first 2 p.Price
from Price p
order by p.Date desc
)
select avg(price) from query1