I am new to MDX. I am writing a query that selects a measure value across months, with a date range as a filter to restrict the number of months returned. For example, I want Sales Revenue for each month in the date range 01-Jan-2014 to 30-Jun-2014. Ideally, it should give me the sales value for six months, i.e. Jan, Feb, Mar, Apr, May and June. However, when I run the query below, I get an error.
Select NON EMPTY {[Measures].[Target Plan Value]} ON COLUMNS,
NON EMPTY {[Realization Date].[Hierarchy].[Month Year].Members} ON ROWS
From [Cube_BCG_OLAP]
( { [Realization Date].[Hierarchy].[Date].&[20140101] :
[Realization Date].[Hierarchy].[Date].&[20141231] })
The error I get is: "The Hierarchy hierarchy already appears in the Axis1 axis." Here, Date and Month Year belong to the same dimension table, named Realization Date. Please help me. Thanks in advance.
You were missing the WHERE clause, but I guess that was a typo. As the error message says, you can't have members of the same hierarchy on two or more axes, and the slicer in the WHERE clause counts as an axis too. In situations like this, you can use something like the query below, which in MDX terminology is called a subselect.
Select NON EMPTY {[Measures].[Target Plan Value]} ON COLUMNS,
NON EMPTY {[Realization Date].[Hierarchy].[Month Year].Members} ON ROWS
From (
SELECT
[Realization Date].[Hierarchy].[Date].&[20140101] :
[Realization Date].[Hierarchy].[Date].&[20141231] ON COLUMNS
FROM [Cube_BCG_OLAP]
)
I like the Exists function in this situation:
SELECT
NON EMPTY {[Measures].[Target Plan Value]}
ON COLUMNS,
NON EMPTY
EXISTS(
[Realization Date].[Hierarchy].[Month Year].Members
, {
[Realization Date].[Hierarchy].[Date].&[20140101] :
[Realization Date].[Hierarchy].[Date].&[20141231]
}
)
ON ROWS
FROM [Cube_BCG_OLAP]
Select
{[Measures].[Target Plan Value]} On Columns,
{
[Realization Date].[Hierarchy].[Date].&[20140101].Parent :
[Realization Date].[Hierarchy].[Date].&[20140630].Parent
}
On Rows
From [Cube_BCG_OLAP]
You need to add a copy of the same dimension to the cube that is used only for filtering, for example dimension_filter -> hierarchy_filter -> level_filter.
Source data
I am working on an ELT project to load data from CSV files into PostgreSQL where I will transform it. The CSV files have many columns that are consistent across files, but also contain activity columns that are inconsistent with names like Date (05/19/2020), Type (05/19/2020), etc.
In the loading script I am merging all of the columns with dates in the column name into one jsonb column so I don't have to constantly add new columns to the raw data table.
The resulting jsonb column in the raw data table looks like this:
id | activity
12345678 | {"Date (05/19/2020)": null, "Type (05/19/2020)": null, "Date (06/03/2020)": "06/01/2020", "Type (06/03/2020)": "E"}
98765432 | {"Date (05/19/2020)": "05/18/2020", "Type (05/19/2020)": "B", "Date (10/23/2020)": "10/26/2020", "Type (10/23/2020)": "T"}
JSON to columns
Using the amazing create_jsonb_flat_view function from this post I can convert the jsonb to columns like this:
id | Date (05/19/2020) | Type (05/19/2020) | Date (06/03/2020) | Type (06/03/2020) | Date (10/23/2020) | Type (10/23/2020)
12345678 | null | null | 06/01/2020 | E | |
98765432 | 05/18/2020 | B | | | 10/26/2020 | T
Need to move part of column name to row
Now, this is where I'm stuck. I need to remove the portion of the column name that is the Activity Date (e.g. (05/19/2020)) and create a row for each id and ActivityDate with additional columns for Date and Type like this:
id | ActivityDate | Date | Type
12345678 | 05/19/2020 | null | null
12345678 | 06/03/2020 | 06/01/2020 | E
98765432 | 05/19/2020 | 05/18/2020 | B
98765432 | 10/23/2020 | 10/26/2020 | T
I followed your link to the create_jsonb_flat_view article yesterday and then forgot this question. While I thank you for pointing me there, I think that mentioning it worked against you.
A more conventional approach using regexp_replace() works here. I left the date values as strings, but you can convert them with to_date() if needed:
with parse as (
select id, e.k, e.v,
regexp_replace(e.k, '\s+\([0-9/]{10}\)', '') as k_no_date,
regexp_replace(e.k, '^.+([0-9/]{10}).+', '\1') as k_date_only
from rawinput
cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
k_date_only as activity_date,
min(v) filter (where k_no_date = 'Date') as date,
min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;
@Mike-Organek's answer works beautifully!
However, I was curious whether the regexp_replace() calls might be slowing the query down, and it seemed I could get the same results with a simpler function.
Since Mike gave me a great example to start with, I modified it to split on the space between Date and (05/19/2020).
For 20,000 rows, the average run time on my local machine went from 7 sec to 0.9 sec.
Here is the resulting query:
with parse as (
select id, e.k, e.v,
split_part(e.k, ' ', 1) as k_no_date,
trim(split_part(e.k, ' ', 2),'()') as k_date_only
from rawinput
cross join lateral jsonb_each_text(activity) as e(k, v)
)
select id,
k_date_only as activity_date,
min(v) filter (where k_no_date = 'Date') as date,
min(v) filter (where k_no_date = 'Type') as type
from parse
group by id, k_date_only;
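To make the key-splitting step concrete outside the database, here is a minimal Python sketch of the same idea: split each key on the first space, strip the parentheses from the date part, and collect Date and Type per (id, activity date). The function name pivot_activity and the sample row are illustrative assumptions, not part of the original query.

```python
from collections import defaultdict

def pivot_activity(rows):
    """Turn (id, activity-dict) pairs into one record per id and activity date.

    Hypothetical helper. For keys with a single space it mirrors
    split_part(k, ' ', 1) and trim(split_part(k, ' ', 2), '()').
    """
    grouped = defaultdict(dict)
    for row_id, activity in rows:
        for key, value in activity.items():
            name, _, rest = key.partition(" ")   # "Date", "(05/19/2020)"
            activity_date = rest.strip("()")     # "05/19/2020"
            grouped[(row_id, activity_date)][name] = value
    return [
        {"id": row_id, "activity_date": day,
         "date": cols.get("Date"), "type": cols.get("Type")}
        for (row_id, day), cols in sorted(grouped.items())
    ]

# Sample row copied from the raw data table in the question.
sample = [
    (12345678, {"Date (05/19/2020)": None, "Type (05/19/2020)": None,
                "Date (06/03/2020)": "06/01/2020", "Type (06/03/2020)": "E"}),
]
result = pivot_activity(sample)
for record in result:
    print(record)
```

The sketch only shows the string handling; it says nothing about why the SQL version got faster, which depends on PostgreSQL's own function costs.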
Focused DB tables:
Task:
For given location ID and culture ID, get max(crop_yield.value) * culture_price.price (let's call this multiplication monetaryGain) grouped by year, so something like:
[
{
"year":2014,
"monetaryGain":...
},
{
"year":2015,
"monetaryGain":...
},
{
"year":2016,
"monetaryGain":...
},
...
]
Attempt:
SELECT cp.price * max(cy.value) AS monetaryGain, EXTRACT(YEAR FROM cy.date) AS year
FROM culture_price AS cp
JOIN culture AS c ON cp.id_culture = c.id
JOIN crop_yield AS cy ON cy.id_culture = c.id
WHERE c.id = :cultureId AND cy.id_location = :locationId AND cp.year = year
GROUP BY year
ORDER BY year
The problem:
"columns "cp.price", "cy.value" and "cy.date" must appear in the GROUP BY clause or be used in an aggregate function"
If I put these three columns in GROUP BY, I won't get the expected result: it obviously won't be grouped just by year.
Does anyone have an idea on how to fix/write this query better in order to get task result?
Thanks in advance!
The fix
Rewrite monetaryGain to be:
max(cp.price * cy.value) AS monetaryGain
That way you are not required to group by cp.price, because it is no longer output as a group member but is used inside an aggregate.
Why?
When you write a GROUP BY query, you can output only columns that are in the GROUP BY list and aggregate function values. This is expected: you get a single row per group, but a column that is not in the grouping list may have several distinct values within that group.
For the same reason, you cannot use a non-grouping column in arithmetic or any other non-aggregate function, because that would produce several results for a single row, and there would be no way to display them.
This is a VERY loose explanation, but I hope it helps you grasp the concept.
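As a rough illustration of the max(cp.price * cy.value) fix, here is a self-contained Python sketch using an in-memory SQLite database. The table layouts and sample numbers are assumptions made up for the example (the real schema is not shown in text), SQLite's strftime stands in for PostgreSQL's EXTRACT(YEAR FROM ...), and joining the price row to the yield row on the year is my assumption about the intent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema and made-up sample data.
conn.executescript("""
    CREATE TABLE culture_price (id_culture INTEGER, year INTEGER, price REAL);
    CREATE TABLE crop_yield (id_culture INTEGER, id_location INTEGER, date TEXT, value REAL);
    INSERT INTO culture_price VALUES (1, 2014, 2.0), (1, 2015, 3.0);
    INSERT INTO crop_yield VALUES
        (1, 10, '2014-05-01', 5.0),
        (1, 10, '2014-09-01', 7.0),
        (1, 10, '2015-06-01', 4.0);
""")

# The product sits inside the aggregate, so only the year expression is grouped.
rows = conn.execute("""
    SELECT CAST(strftime('%Y', cy.date) AS INTEGER) AS year,
           MAX(cp.price * cy.value) AS monetaryGain
    FROM culture_price AS cp
    JOIN crop_yield AS cy
      ON cy.id_culture = cp.id_culture
     AND cp.year = CAST(strftime('%Y', cy.date) AS INTEGER)
    WHERE cp.id_culture = ? AND cy.id_location = ?
    GROUP BY CAST(strftime('%Y', cy.date) AS INTEGER)
    ORDER BY 1
""", (1, 10)).fetchall()

print(rows)  # [(2014, 14.0), (2015, 12.0)]
```

SQLite is more permissive than PostgreSQL about bare columns in grouped queries, so this sketch demonstrates the shape of the corrected query, not the original error.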
Aliases in GROUP BY
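Also, avoid relying on the alias in GROUP BY here. Use:
GROUP BY EXTRACT(YEAR FROM cy.date)
PostgreSQL does accept output aliases in GROUP BY, but when a name matches both an input column and an output alias, the input column wins. Your query also uses a cp.year column, so GROUP BY year groups by cp.year rather than by the extracted year. This link might explain why: https://www.postgresql.org/message-id/7608.1259177709%40sss.pgh.pa.us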
What MDX query logic could I use for this example to get two rows in the result set for HRID = 1: the first row with 1/1/16 as the min date (Start), where someAttribute shows up on the column with value 'A', and the second row with 1/15/16 as the min date (Start), where someAttribute has the value 'B'? In both rows, [Measures].[Whatevers] should show its aggregation for whatever data corresponds to that dimension row.
I'm trying to look at January 2016 only.
Everything I've tried either returns min date values of 1/1/1900, returns 1/1/2016 for both rows, or raises errors.
Here's my MDX sample:
WITH MEMBER [Measures].[Start] as
(
-- min date on which the combination of someAttribute and HRID has a certain
-- value within the range of the WHERE clause restriction of January 2016
)
SELECT {
[Measures].[Start]
, [Measures].[Whatevers]
} ON COLUMNS
, NON EMPTY {
[Agent].[HRID].children
* [Agent].[someAtribute].Members
} ON ROWS
FROM [RADM_REPORTING]
WHERE (
[Date].[Date View].[Month].&[201601]
)
This works, but it feels like a hack, or at least not robust. I am not familiar enough with MDX to make that call.
WITH MEMBER [Measures].[Start] as
filter([Date].[Date View].[Month].&[201601].children,
[Measures].[Whatevers]).item(0).membervalue
Here is a potential direction that is more general:
WITH
MEMBER [Measures].[Start] AS
Min
(
(EXISTING
[Date].[Date].[Date].MEMBERS)
,IIF
(
[Measures].[Internet Sales Amount] = 0
,NULL
,[Date].[Date].CurrentMember.MemberValue
)
)
SELECT
NON EMPTY
{
[Measures].[Start]
,[Measures].[Internet Sales Amount]
} ON COLUMNS
,NON EMPTY
[Product].[Product Categories].[Product] ON ROWS
FROM [Adventure Works]
WHERE
[Date].[Calendar].[Calendar Year].&[2005];
It gives the following:
I am new to MDX. Could you please suggest how to write the T-SQL query below in MDX?
T-SQL:
SELECT wp.date,
Sum(wp.bbls_oil) AS BBLSOIL_TOTAL,
Sum(wp.bbls_water) AS BBLSWATER_TOTAL,
Sum(wp.mcf_prod) AS MCF_PROD_TOTAL,
Sum(wp.vent_flare) AS VENT_FLARE_TOTAL
FROM well_prod_bst_horiz_og_2_yrs wp, well_index wi
WHERE wp.fileno = wi.fileno
AND wp.date >= :startDate
AND wp.date <= :endDate
AND wi.apino IN (:wellids)
GROUP BY wp.date
ORDER BY wp.date ASC;
In the above query, Start and End date values are supplied dynamically.
Assuming you have measures named BBLSOIL, BBLSWATER, MCF_PROD, and VENT_FLARE_TOTAL, your date attribute is named [Date].[Date], your :startDate contains [Date].[Date].&[20120101], your :endDate contains [Date].[Date].&[20141231], and your cube is named [Name of your Cube], you would write:
SELECT {
Measures.[BBLSOIL],
Measures.[BBLSWATER],
Measures.[MCF_PROD],
Measures.[VENT_FLARE_TOTAL]
}
ON COLUMNS,
[Date].[Date].&[20120101] : [Date].[Date].&[20141231]
ON ROWS
FROM [Name of your Cube]
That is, you put an MDX set containing the list of required measures on the columns axis, and a range (specified by :) on the rows axis. Aggregations like Sum and GROUP BY are not necessary in MDX; these are handled by the cube definition.
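For readers coming from SQL, here is a minimal Python sketch of what that MDX query asks the cube to return: for each date in the range, the aggregate (assumed here to be a sum) of each requested measure. The fact rows and the function name sum_by_date are invented for illustration; a real cube computes this from its own storage and measure definitions.

```python
from collections import defaultdict

def sum_by_date(facts, start, end, measures):
    """Sum each measure per date, for dates within [start, end] inclusive."""
    totals = defaultdict(lambda: dict.fromkeys(measures, 0))
    for fact in facts:
        if start <= fact["date"] <= end:
            for measure in measures:
                totals[fact["date"]][measure] += fact[measure]
    return dict(sorted(totals.items()))

# Made-up fact rows; date keys use the same yyyymmdd form as the member keys.
facts = [
    {"date": "20120101", "BBLSOIL": 10, "BBLSWATER": 3},
    {"date": "20120101", "BBLSOIL": 5, "BBLSWATER": 1},
    {"date": "20150101", "BBLSOIL": 99, "BBLSWATER": 99},  # outside the range
]
result = sum_by_date(facts, "20120101", "20141231", ["BBLSOIL", "BBLSWATER"])
print(result)  # {'20120101': {'BBLSOIL': 15, 'BBLSWATER': 4}}
```

The range on the rows axis plays the role of the WHERE date filter plus GROUP BY date, and the measures on the columns axis play the role of the Sum(...) columns.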
I am pretty new to SSAS, OLAP and MDX syntax.
I have this MDX, which queries the cube from T-SQL (through a linked server to SSAS), and it works fine:
select * from openquery(GCUBE,
'SELECT NON EMPTY { [Measures].[Valore] } ON COLUMNS,
NON EMPTY {
( [Prodotti].[Top Marca].[Top Marca].ALLMEMBERS
* [Prodotti].[Top Codice].[Top Codice].ALLMEMBERS
* [Agenti].[Vw Agenti].[Vw Agenti].ALLMEMBERS
* [Calendario].[AnnoMese].[Mese].ALLMEMBERS
* [Prodotti].[Ordinamento].[Ordinamento].ALLMEMBERS
* [Prodotti].[Top].[Top].ALLMEMBERS )
}
DIMENSION PROPERTIES MEMBER_CAPTION
ON ROWS FROM ( SELECT ( { [Calendario].[Anno].&[2012] } )
ON COLUMNS FROM ( SELECT ( { [Agenti].[Vw Agenti].&[005] } )
ON COLUMNS FROM [Vendite])) WHERE ( [Calendario].[Anno].&[2012] )'
)
Well, the [Prodotti].[Top Marca] is a dimension based on a table with the 50 top selling brands and this MDX is filtered by a specific ID Agent [Vw Agenti] = 005.
The purpose of the query is to find out how the agent is selling the company's 50 top selling brands.
The query works fine but there is one brand not sold by this agent and I need to show the empty row.
The figure below shows the missing record relative to the position (rank) 31.
I understand the concept about NON EMPTY but I can't find the right syntax to also show the empty record.
How should I modify the MDX?
I tried to remove NON EMPTY but I get a generic error:
Cannot execute the query against OLE DB provider "MSOLAP" for linked server "GCUBE"
Do I need to change the dimension Top Marca in the cube?
Thanks in advance to anyone who can help me or give the right tips to solve this.
I'm not a specialist in SSAS/T-SQL, but I would try a simpler request first:
SELECT
[Measures].[Valore] ON COLUMNS,
NON EMPTY [Prodotti].[Top Marca].[Top Marca].ALLMEMBERS ON ROWS
FROM ( SELECT { [Calendario].[Anno].&[2012] } ON COLUMNS
FROM ( SELECT { [Agenti].[Vw Agenti].&[005] } ON COLUMNS
FROM [Vendite]
)
)
Is there any way to run it without the T-SQL wrapper?