How to write date filter in MDX where clause? - tsql

I am new to MDX. Could you please suggest how to write the T-SQL query below in MDX?
T-SQL:
SELECT wp.date,
Sum(wp.bbls_oil) AS BBLSOIL_TOTAL,
Sum(wp.bbls_water) AS BBLSWATER_TOTAL,
Sum(wp.mcf_prod) AS MCF_PROD_TOTAL,
Sum(wp.vent_flare) AS VENT_FLARE_TOTAL
FROM well_prod_bst_horiz_og_2_yrs wp, well_index wi
WHERE wp.fileno = wi.fileno
AND wp.date >= :startDate
AND wp.date <= :endDate
AND wi.apino IN (:wellids)
GROUP BY wp.date
ORDER BY wp.date ASC
In the above query, Start and End date values are supplied dynamically.

Assuming you have measures named BBLSOIL, BBLSWATER, MCF_PROD, and VENT_FLARE_TOTAL, your date attribute is named [Date].[Date], :startDate contains [Date].[Date].&[20120101], :endDate contains [Date].[Date].&[20141231], and your cube is named Name of your Cube, you would write:
SELECT {
Measures.[BBLSOIL],
Measures.[BBLSWATER],
Measures.[MCF_PROD],
Measures.[VENT_FLARE_TOTAL]
}
ON COLUMNS,
[Date].[Date].&[20120101] : [Date].[Date].&[20141231]
ON ROWS
FROM [Name of your Cube]
That is, you put an MDX set containing the list of required measures on the columns axis, and you put a range (specified by :) on the rows axis. Aggregations like Sum and GROUP BY are not necessary in MDX; these are handled by the cube definition.
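Since the start and end dates are supplied dynamically, the query text has to be assembled at run time. Below is a minimal Python sketch of that step. It assumes the member keys follow the yyyymmdd pattern used above; the helper names member_key and build_mdx are hypothetical, and the measure and cube names are the placeholder names from this answer, not something a real cube is guaranteed to expose.

```python
from datetime import date


def member_key(d: date) -> str:
    # Assumption: the [Date].[Date] attribute is keyed as yyyymmdd.
    return f"[Date].[Date].&[{d:%Y%m%d}]"


def build_mdx(start: date, end: date, cube: str = "Name of your Cube") -> str:
    # Build the MDX text for a date range on rows and the measures on columns.
    if start > end:
        raise ValueError("start must not be after end")
    measures = ["BBLSOIL", "BBLSWATER", "MCF_PROD", "VENT_FLARE_TOTAL"]
    measure_set = ",\n".join(f"Measures.[{m}]" for m in measures)
    return (
        "SELECT {\n"
        f"{measure_set}\n"
        "}\n"
        "ON COLUMNS,\n"
        f"{member_key(start)} : {member_key(end)}\n"
        "ON ROWS\n"
        f"FROM [{cube}]"
    )


query = build_mdx(date(2012, 1, 1), date(2014, 12, 31))
print(query)
```

Taking date objects rather than raw strings keeps arbitrary user input out of the query text, which matters because the MDX is built by string formatting.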

Selecting max value grouped by specific column

Focused DB tables:
Task:
For given location ID and culture ID, get max(crop_yield.value) * culture_price.price (let's call this multiplication monetaryGain) grouped by year, so something like:
[
{
"year":2014,
"monetaryGain":...
},
{
"year":2015,
"monetaryGain":...
},
{
"year":2016,
"monetaryGain":...
},
...
]
Attempt:
SELECT cp.price * max(cy.value) AS monetaryGain, EXTRACT(YEAR FROM cy.date) AS year
FROM culture_price AS cp
JOIN culture AS c ON cp.id_culture = c.id
JOIN crop_yield AS cy ON cy.id_culture = c.id
WHERE c.id = :cultureId AND cy.id_location = :locationId AND cp.year = year
GROUP BY year
ORDER BY year
The problem:
"columns "cp.price", "cy.value" and "cy.date" must appear in the GROUP BY clause or be used in an aggregate function"
If I put these three columns in GROUP BY, I won't get the expected result: it obviously won't be grouped just by year.
Does anyone have an idea how to fix or rewrite this query to get the task result?
Thanks in advance!
The fix
Rewrite monetaryGain to be:
max(cp.price * cy.value) AS monetaryGain
That way you are not required to group by cp.price, because it is no longer output as a group member but used inside an aggregate.
Why?
When you write a GROUP BY query, you can output only columns that are in the GROUP BY list and aggregate function values. This is expected: you get a single row per group, but a column that is not in the grouping list may have several distinct values within that group.
For the same reason you cannot use a non-grouping column in arithmetic or any other (non-aggregate) function, because this would produce several results for a single row, and there would be no way to display them.
This is a VERY loose explanation, but I hope it helps to grasp the concept.
Aliases in GROUP BY
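Also, you should not use the alias in GROUP BY here. Use:
GROUP BY EXTRACT(YEAR FROM cy.date)
PostgreSQL resolves a name in GROUP BY to an input column before an output alias, so GROUP BY year in this query groups by cp.year rather than by the extracted year. This link might explain why: https://www.postgresql.org/message-id/7608.1259177709%40sss.pgh.pa.us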
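To see the aggregate-inside-MAX fix end to end, here is a small Python sketch that runs the query against an in-memory SQLite database. SQLite is only a stand-in for PostgreSQL, so strftime replaces EXTRACT, and the table layout and sample rows are assumptions made up for the illustration. The sketch also joins the price to the yield on the extracted year, which the original attempt tried to express with cp.year = year.

```python
import sqlite3


def monetary_gain_by_year(conn, culture_id, location_id):
    # The grouping expression is repeated instead of using the alias.
    query = """
        SELECT CAST(strftime('%Y', cy.date) AS INTEGER) AS year,
               MAX(cp.price * cy.value) AS monetaryGain
        FROM culture_price AS cp
        JOIN crop_yield AS cy
          ON cy.id_culture = cp.id_culture
         AND cp.year = CAST(strftime('%Y', cy.date) AS INTEGER)
        WHERE cp.id_culture = :cultureId
          AND cy.id_location = :locationId
        GROUP BY CAST(strftime('%Y', cy.date) AS INTEGER)
        ORDER BY 1
    """
    params = {"cultureId": culture_id, "locationId": location_id}
    rows = conn.execute(query, params)
    return [{"year": y, "monetaryGain": g} for y, g in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical schema and data, only for this demonstration.
conn.executescript("""
    CREATE TABLE culture_price (id_culture INTEGER, year INTEGER, price REAL);
    CREATE TABLE crop_yield (id_culture INTEGER, id_location INTEGER,
                             date TEXT, value REAL);
    INSERT INTO culture_price VALUES (1, 2014, 2.0), (1, 2015, 3.0);
    INSERT INTO crop_yield VALUES
        (1, 7, '2014-05-01', 10.0),
        (1, 7, '2014-09-01', 12.0),
        (1, 7, '2015-06-01', 9.0),
        (1, 9, '2015-06-01', 99.0);
""")
result = monetary_gain_by_year(conn, 1, 7)
print(result)
```

Each year yields exactly one row, because every output column is either the grouping expression or an aggregate.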

How to group by date and calculate the averages at the same time

I am quite new to this, so here goes: I am trying to convert from Unix time to a date and then group by that date while calculating the average of another column. This is in MariaDB.
CREATE OR REPLACE
VIEW `history_uint_view` AS select
`history_uint`.`itemid` AS `itemid`,
date(from_unixtime(`history_uint`.`clock`)) AS `mydate`,
AVG(`history_uint`.`value`) AS `value`
from
`history_uint`
where
((month(from_unixtime(`history_uint`.`clock`)) = month((now() - interval 1 month))) and ((`history_uint`.`value` in (1,
0))
and (`history_uint`.`itemid` in (54799, 54810, 54821, 54832, 54843, 54854, 54865, 54876, 54887, 54898, 54909, 54920, 58165, 58226, 59337, 59500, 59503, 59506, 60621, 60624, 60627, 60630, 60633, 60636, 60639, 60642, 60645, 60648, 60651, 60654, 60657, 60660, 60663, 60666, 60669, 60672, 60675, 60678, 60681, 60684, 60687, 60690, 60693, 60696, 60699, 64610)))
GROUP by 'itemid', 'mydate', 'value'
When you select aggregate functions (like AVG) alongside columns without aggregate functions, you should list every column except the aggregated ones in the GROUP BY clause.
So your GROUP BY should look like:
GROUP by itemid, mydate
If you use single quotes (like 'itemid'), MariaDB treats them as string literals, not column names.
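The corrected grouping can be tried out with a short Python sketch against an in-memory SQLite database. SQLite stands in for MariaDB here, so date(clock, 'unixepoch') replaces date(from_unixtime(clock)) and works in UTC; the sample rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical sample data: clock holds Unix timestamps in seconds.
conn.executescript("""
    CREATE TABLE history_uint (itemid INTEGER, clock INTEGER, value INTEGER);
    INSERT INTO history_uint VALUES
        (54799, 1700000000, 1),
        (54799, 1700003600, 0),
        (54799, 1700090000, 1),
        (54810, 1700000000, 1);
""")

# Group by the unquoted column and the date alias; the averaged column
# stays out of GROUP BY.
daily_avg = conn.execute("""
    SELECT itemid,
           date(clock, 'unixepoch') AS mydate,
           AVG(value) AS avg_value
    FROM history_uint
    GROUP BY itemid, mydate
    ORDER BY itemid, mydate
""").fetchall()

for row in daily_avg:
    print(row)
```

The first item has two readings on the same day, so they collapse into one row whose average lies between 0 and 1.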

Tableau - Calculating average where date is less than value from another data source

I am trying to calculate the average of a column in Tableau, but I need to use a single date value (chosen by a filter) from another data source, so that the average only includes rows where the exam date is <= that filtered date.
Note: Parameters will not work for me here, since new date values are being added constantly to the set.
I have tried many different approaches, but the simplest was trying to use a calculated field that pulls in the filtered exam date from the other data source.
It successfully can pull the filtered date, but the formula does not work as expected. 2 versions of the calculation are below:
IF DATE(ATTR([Exam Date])) <= DATE(ATTR([Averages (Tableau Test Scores)].[Updated])) THEN AVG([Raw Score]) END
IF DATEDIFF('day', DATE(ATTR([Exam Date])), DATE(ATTR([Averages (Tableau Test Scores)].[Updated]))) > 1 THEN AVG([Raw Score]) END
Basically, I am looking for the equivalent of this in SQL Server:
SELECT AVG([Raw Score]) WHERE ExamDate <= (Filtered Exam Date)
Below is a workbook that shows an example of what I am trying to accomplish. Currently it returns all blanks, likely due to the many-to-one comparison I am trying to use in my calculation.
Any feedback is greatly appreciated!
Tableau Test Exam Workbook
I was able to solve this by using Custom SQL to join the tables together and calculate the average based on my conditions, to get the column results I wanted.
Would still be great to have this ability directly in Tableau, but whatever gets the job done.
Edit:
SELECT
[AcademicYear]
,[Discipline]
--Get the number of student takers
,COUNT([Id]) AS [Students (N)]
--Get the average of the Raw Score
,CAST(AVG(RawScore) AS DECIMAL(10,2)) AS [School Mean]
--Get the number of failures based on an "adjusted score" column
,COUNT(CASE WHEN [AdjustedScore] < 70 THEN 1 END) AS [School Failures]
--This is the column used as the cutoff point for including scores
,[Average_Update].[Updated]
FROM [dbo].[Average] [Average]
FULL OUTER JOIN [dbo].[Average_Update] [Average_Update] ON ([Average_Update].[Id] = [Average].UpdateDateId)
--The meat of joining data for accurate calculations
FULL OUTER JOIN (
SELECT DISTINCT S.[Id], S.[LastName], S.[FirstName], S.[ExamDate], S.[RawScoreStandard], S.[RawScorePercent], S.[AdjustedScore], S.[Subject], P.[Id] AS PeriodId
FROM [StudentScore] S
FULL OUTER JOIN
(
--Get only the 1st attempt
SELECT DISTINCT [NBOMEId], S2.[Subject], MIN([ExamDate]) AS ExamDate
FROM [StudentScore] S2
GROUP BY [NBOMEId],S2.[Subject]
) B
ON S.[NBOMEId] = B.[NBOMEId] AND S.[Subject] = B.[Subject] AND S.[ExamDate] = B.[ExamDate]
--Group in "Exam Periods" based on the list of periods w/ start & end dates in another table.
FULL OUTER JOIN [ExamPeriod] P
ON S.[ExamDate] >= P.PeriodStart AND S.[ExamDate] <= P.PeriodEnd
WHERE S.[Subject] = B.[Subject]
GROUP BY P.[Id], S.[Subject], S.[ExamDate], S.[RawScoreStandard], S.[RawScorePercent], S.[AdjustedScore], S.[NBOMEId], S.[NBOMELastName], S.[NBOMEFirstName], S.[SecondYrTake]) [StudentScore]
ON
([StudentScore].PeriodId = [Average_Update].ExamPeriodId
AND [StudentScore].Subject = [Average].Subject
AND [StudentScore].[ExamDate] <= [Average_Update].[Updated])
--End meat
--Joins to pull in relevant data for normalized tables
FULL OUTER JOIN [dbo].[Student] [Student] ON ([StudentScore].[NBOMEId] = [Student].[NBOMEId])
INNER JOIN [dbo].[ExamPeriod] [ExamPeriod] ON ([Average_Update].ExamPeriodId = [ExamPeriod].[Id])
INNER JOIN [dbo].[AcademicYear] [AcademicYear] ON ([ExamPeriod].[AcademicYearId] = [AcademicYear].[Id])
--This will pull only the latest update entry for every academic year.
WHERE [Updated] IN (
SELECT DISTINCT MAX([Updated]) AS MaxDate
FROM [Average_Update]
GROUP BY [ExamPeriodId])
GROUP BY [AcademicYear].[AcademicYearText], [Average].[Subject], [Average_Update].[Updated]
ORDER BY [AcademicYear].[AcademicYearText], [Average_Update].[Updated], [Average].[Subject]
I couldn't download your file to test with your data, but try reversing the order of taking the average, i.e.
AVG(IF DATE(ATTR([Exam Date])) <= DATE(ATTR([Averages (Tableau Test Scores)].[Updated])) THEN [Raw Score] END)
As written, I believe you'll be averaging the data before returning it from the IF statement, whereas you want to return the data, then average it.
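The order of operations described here (pick the rows first, then average) can be sketched outside Tableau. The Python snippet below is only an illustration of that logic with invented scores and an invented cutoff date; it is not Tableau syntax, and average_up_to is a hypothetical helper name.

```python
from datetime import date
from statistics import mean

# Hypothetical (exam date, raw score) rows.
scores = [
    (date(2020, 1, 10), 80.0),
    (date(2020, 2, 15), 90.0),
    (date(2020, 3, 20), 40.0),
]
cutoff = date(2020, 2, 28)  # stands in for the filtered [Updated] value


def average_up_to(rows, cutoff_date):
    # Filter row by row first, then average what is left.
    kept = [score for exam_date, score in rows if exam_date <= cutoff_date]
    return mean(kept) if kept else None


avg = average_up_to(scores, cutoff)
print(avg)
```

Rows after the cutoff never reach the average, which is the SQL WHERE behaviour the question asks for.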

MDX Query with Date Range Filter

I am new to MDX queries. I am writing an MDX query to select a measure value across months, with a date range as a filter to restrict the number of months returned. For example, I want Sales Revenue for each month in the date range 01-Jan-2014 to 30-Jun-2014. Ideally, it should give me the sales value for six months, i.e. Jan, Feb, Mar, Apr, May and June. However, when I write the query below, I get an error.
Select NON EMPTY {[Measures].[Target Plan Value]} ON COLUMNS,
NON EMPTY {[Realization Date].[Hierarchy].[Month Year].Members} ON ROWS
From [Cube_BCG_OLAP]
( { [Realization Date].[Hierarchy].[Date].&[20140101] :
[Realization Date].[Hierarchy].[Date].&[20141231] })
The error I get is: The Hierarchy hierarchy already appears in the Axis1 axis. Here Date and Month Year belong to the same dimension table, named Realization Date. Please help me. Thanks in advance.
You were missing the WHERE clause, but I guess that was a typo. As the error message says, you can't have members of the same hierarchy on two or more axes. In situations like this, you can use something like the query below, which in MDX terminology is called a subselect.
Select NON EMPTY {[Measures].[Target Plan Value]} ON COLUMNS,
NON EMPTY {[Realization Date].[Hierarchy].[Month Year].Members} ON ROWS
From (
SELECT
[Realization Date].[Hierarchy].[Date].&[20140101] :
[Realization Date].[Hierarchy].[Date].&[20141231] ON COLUMNS
FROM [Cube_BCG_OLAP]
)
I like the exists function in this situation:
SELECT
NON EMPTY {[Measures].[Target Plan Value]}
ON COLUMNS,
NON EMPTY
EXISTS(
[Realization Date].[Hierarchy].[Month Year].Members
, {
[Realization Date].[Hierarchy].[Date].&[20140101] :
[Realization Date].[Hierarchy].[Date].&[20141231]
}
)
ON ROWS
FROM [Cube_BCG_OLAP]
Select
{[Measures].[Target Plan Value]} On Columns,
{
[Realization Date].[Hierarchy].[Date].&[20140101].Parent :
[Realization Date].[Hierarchy].[Date].&[20140630].Parent
}
On Rows
From [Cube_BCG_OLAP]
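Taking the parent of the first and last date turns a range of days into a range of months. As a rough check of which months such a range should return, here is a small Python sketch; months_in_range is a hypothetical helper, and the yyyy-mm labels are only for display, not the cube's member names.

```python
from datetime import date


def months_in_range(start: date, end: date) -> list:
    # Walk month by month from the start date's month to the end date's month.
    months = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        months.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


expected_months = months_in_range(date(2014, 1, 1), date(2014, 6, 30))
print(expected_months)
```

For 01-Jan-2014 to 30-Jun-2014 this lists six months, matching the result the question expects on the rows axis.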
Alternatively, create a copy of the same dimension in the cube that is used only for filtering, for example dimension_filter -> hierarchy_filter -> level_filter.

How to create an exclusion filter set in Tableau

I am working in Tableau and trying to figure out how to create a filter exclusion. For example I have the following fields.
Hospital CallType CallDate
I want to filter out all hospitals where one of the calls has a CallType of ColdCall and a CallDate between X and Y.
I can do this easily in SQL but don't have access to this data in the SQL Database. It would be the following:
Select
Hospital
,CallType
,CallDate
Into
#TempTable
From
Database
Select
Hospital
,CallType
,CallDate
Into
#ExclusionTable
From
Database
Where
CallType = 'Cold'
and
CallDate Between X and Y
Select
Hospital
,CallType
,CallDate
From
#TempTable
Where
Hospital not in
(Select
Hospital
From
#ExclusionTable)
Any suggestions would be greatly appreciated.
Thanks,
Simple. Create a calculated field Filter:
IF CallType = "Cold" AND CallDate >= X AND CallDate <= Y
THEN 1
ELSE 0
END
Then drag Hospital to filter, go to the Condition tab, select by field, pick your Filter field, and use sum = 0. That keeps only hospitals with no matching call: every call that doesn't meet the conditions contributes zero, so if at least one call matches, the sum is over 0 and the hospital is dropped. (A condition of sum > 0 would instead keep only the hospitals that have at least one such call.)
For X and Y, I'd create parameters. It's easier (and safer) than trying to write the dates directly in the field, and you can manipulate them more easily too.
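The flag-and-sum idea can be sketched in plain Python to check the logic before building it in Tableau. The rows, the date bounds and the helper name flag are all invented for the illustration: each call gets a 1 or 0, the flags are summed per hospital, and hospitals whose sum is above zero are the ones to drop.

```python
from collections import defaultdict
from datetime import date

# Hypothetical (Hospital, CallType, CallDate) rows.
calls = [
    ("A", "Cold", date(2020, 1, 15)),
    ("A", "Warm", date(2020, 3, 1)),
    ("B", "Warm", date(2020, 1, 20)),
    ("C", "Cold", date(2019, 12, 1)),
]
x, y = date(2020, 1, 1), date(2020, 1, 31)  # stand-ins for the X and Y parameters


def flag(call_type, call_date):
    # 1 when the call is a cold call inside the date window, else 0.
    return 1 if call_type == "Cold" and x <= call_date <= y else 0


flag_sum = defaultdict(int)
for hospital, call_type, call_date in calls:
    flag_sum[hospital] += flag(call_type, call_date)

# Keep every row of the hospitals that have no flagged call at all.
kept = [call for call in calls if flag_sum[call[0]] == 0]
print(sorted({hospital for hospital, _, _ in kept}))
```

Hospital A disappears entirely, including its warm call, which mirrors the NOT IN subquery from the question.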