MDX: Calculated Dimensions? - filtering

I don't know if this is possible or not, or if my limited knowledge of MDX is pushing me in the wrong direction...
The cube I'm dealing with has two different dimensions for dates, [Statement Dates] and [Premium Dates]. Further in the hierarchy of each looks like this:
[Statement Dates].[Statement Year].[2008]
[Payment Dates].[Payment Year].[2008]
For the business logic I'm implementing I need to do something like:
select
({ [Measures].[Commission] }) on columns,
({[Products].[Product Category]}) on rows
from [Cube]
where
(
IIF( [Products].[Product Category].CurrentMember.Name = "Some Category",
[Statement Dates].[Statement Year].[2008],
[Payment Dates].[Payment Year].[2008] )
)
So I need it to discriminate what dimension to use for filtering the year based on what product category is being used.
This parses ok and the query runs, but the numbers seem to suggest that the IIF statement is always returning false.

Because the WHERE clause gets evaluated first the .CurrentMember function in the IIF will only be seeing "All Product Cateogories". In which case the [Products].[Product Category].CurrentMember.Name will never be equal to "Some Category" as the only product category in context is "All Product Cateogories".
One possible work around would be to do a calculation like the following:
WITH MEMBER Measures.Commission2 as
SUM(
{[Products].[Product Category].[Product Category].Members}
, IIF( [Products].[Product Category].CurrentMember.Name = "Some Category"
, ([Statement Dates].[Statement Year].[2008],[Measures].[Commission])
, ([Payment Dates].[Payment Year].[2008].[Measures].[Commission]) )
)
select
({ [Measures].[Commission2] }) on columns
, ({[Products].[Product Category]}) on rows
from [Cube]
You could also do a scoped assignement in the calculation script in your cube to do this sort of thing.

Related

PostgreSQL: json object where keys are unique array elements and values are the count of times they appear in the array

I have an array of strings, some of which may be repeated. I am trying to build a query which returns a single json object where the keys are the distinct values in the array, and the values are the count of times each value appears in the array.
I have built the following query;
WITH items (item) as (SELECT UNNEST(ARRAY['a','b','c','a','a','a','c']))
SELECT json_object_agg(distinct_values, counts) item_counts
FROM (
SELECT
sub2.distinct_values,
count(items.item) counts
FROM (
SELECT DISTINCT items.item AS distinct_values
FROM items
) sub2
JOIN items ON items.item = sub2.distinct_values
GROUP BY sub2.distinct_values, items.item
) sub1
DbFiddle
Which provides the result I'm looking for: { "a" : 4, "b" : 1, "c" : 2 }
However, it feels like there's probably a better / more elegant / less verbose way of achieving the same thing, so I wondered if any one could point me in the right direction.
For context, I would like to use this as part of a bigger more complex query, but I didn't want to complicate the question with irrelevant details. The array of strings is what one column of the query currently returns, and I would like to convert it into this JSON blob. If it's easier and quicker to do it in code then I can, but I wanted to see if there was an easy way to do it in postgres first.
I think a CTE and json_object_agg() is a little bit of a shortcut to get you there?
WITH counter AS (
SELECT UNNEST(ARRAY['a','b','c','a','a','a','c']) AS item, COUNT(*) AS item_count
GROUP BY 1
ORDER BY 1
)
SELECT json_object_agg(item, item_count) FROM counter
Output:
{"a":4,"b":1,"c":2}

oracle: grouping on merged columns

I have a 2 tables FIRST
id,rl_no,adm_date,fees
1,123456,14-11-10,100
2,987654,10-11-12,30
3,4343,14-11-17,20
and SECOND
id,rollno,fare,type
1,123456,20,bs
5,634452,1000,bs
3,123456,900,bs
4,123456,700,bs
My requirement is twofold,
1, i first need to get all columns from both tables with common rl_no. So i used:
SELECT a.ID,a.rl_no,a.adm_date,a.fees,b.rollno,b.fare,b.type FROM FIRST a
INNER JOIN
SECOND b ON a.rl_no = b.rollno
The output is like this:
id,rl_no,adm_date,fees,rollno,fare,type
1,123456,14-11-10,100,123456,20,bs
1,123456,10-11-12,100,123456,900,bs
1,123456,14-11-17,100,123456,700,bs
2,Next i wanted to get the sum(fare) of those rollno that were common between the 2 tables and also whose fare >= fees from FIRST table group by rollno and id.
My query is:
SELECT x.ID,x.rl_no,,x.adm_date,x.fees,x.rollno,x.type,sum(x.fare) as "fare" from (SELECT a.ID,a.rl_no,a.adm_date,a.fees,b.rollno,b.fare,b.type FROM FIRST a
INNER JOIN
SECOND b ON a.rl_no = b.rollno) x, FIRST y
WHERE x.rollno = y.rl_no AND x.fare >= y.fees AND x.type IS NOT NULL GROUP BY x.rollno,x.ID ;
But this is throwing in exceptions.
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
The expected output will be like this:
id,rollno,adm_date,fare,type
1,123456,14-11-10,1620,bs
So could someone care to show an oracle newbie what i'm doing wrong here?
It looks like there's a couple different problems here;
Firstly, you're trying to group by an x.ID column which doesn't exist; it looks like you'll want to add ID to the selected columns in your sub-query.
Secondly, when aggregating with GROUP BY, all selected columns need to be either listed in the GROUP BY statement or aggregated. If you're grouping by rollno and ID, what do you want to have happen to all the extra values for adm_date, fees, and type? Are those always going to be the same for each distinct rollno and ID pair?
If so, simply add them to the GROUP BY statement, ie,
GROUP BY adm_date, fees, type, rollno, ID
If not, you'll need to work out exactly how you want to select which one to be output; If you've got output like your example (adding in an ID column here)
ID,adm_date,fees,rollno,fare,type
1,14-11-10,100,123456,20,bs
1,10-11-12,100,123456,900,bs
1,14-11-17,100,123456,700,bs
Call that result set 'a'. If I run;
SELECT a.ID, a.rollno, SUM(a.fare) as total_fare
FROM a
GROUP BY a.ID, a.rollno
Then the result will be a single row;
ID,rollno,total_fare
1,123456,1620
So, if you also select the adm_date, fees, and type columns, oracle has no idea what you mean to do with them. You're not using them for grouping, and you're not telling oracle how you want to pick which one to use.
You could do something like
SELECT a.ID,
FIRST(a.adm_date) as first_adm_date,
FIRST(a.fees) as first_fees,
a.rollno,
SUM(a.fare) as total_fare,
FIRST(a.type) as first_type
FROM a
GROUP BY a.ID, a.rollno
Which would give the result;
ID,first_adm_date,first_fees,rollno,total_fare,first_type
1,14-11-10,100,123456,1620,bs
I'm not sure if that's what you mean to do though.

MDX: Define dimension sub set and show the total

Since in MDX you can specify the member [all] to add the aggregation between all the members of the dimension, if I want to show the totals of a certain measure I can build a query like
SELECT {
[MyGroup].[MyDimension].[MyDimension].members,
[MyGroup].[MyDimension].[all]
} *
[Measures].[Quantity] on 0
FROM [MyDatabase]
Now I want to filter MyDimension for a bunch of values and show the total of the selected values, but of course if I generate the query
SELECT {
[MyGroup].[MyDimension].[MyDimension].&[MyValue1],
[MyGroup].[MyDimension].[MyDimension].&[MyValue2],
[MyGroup].[MyDimension].[all]
} *
[Measures].[Quantity] on 0
FROM [MyDatabase]
it shows the Quantity for MyValue1, MyValue2 and the total of all MyDimension members, not just the ones I selected.
I investigated a bit and came up to a solution that include the generation of a sub query to filter my values
SELECT {
[MyGroup].[MyDimension].[MyDimension].members, [MyGroup].[MyDimension].[all]
} * [Measures].[Quantity] ON 0
FROM (
SELECT {
[MyGroup].[MyDimension].[MyDimension].&[MyValue1],
[MyGroup].[MyDimension].[MyDimension].&[MyValue2]
} ON 0
FROM [MyDatabase]
)
Assuming this works, is there a simplier or more straight forward approach to achieve this?
I tried to use the SET statement to define my custom tuple sets but then I couldn't manage to show the total.
Keep in mind that in my example I kept things as easy as possible, but in real cases I could have multiple dimension on both rows and columns as well as multiple calculated measures defined with MEMBER statement.
Thanks!
What you have done is standard - it is the simple way!
One thing to bear in mind when using a sub-select is that it is not a full filter, in that the original All is still available. I think this is in connection with the query processing of the clauses in mdx - here is an example of what I mean:
WITH
MEMBER [Product].[Product Categories].[All].[All of the Products] AS
[Product].[Product Categories].[All]
SELECT
[Measures].[Internet Sales Amount] ON 0
,NON EMPTY
{
[Product].[Product Categories].[All] //X
,[Product].[Product Categories].[All].[All of the Products] //Y
,[Product].[Product Categories].[Category].MEMBERS
} ON 1
FROM
(
SELECT
{
[Product].[Product Categories].[Category].&[4]
,[Product].[Product Categories].[Category].&[1]
} ON 0
FROM [Adventure Works]
);
So line marked X will be the sum of categories 4 and 1 but line Y will sill refer to the whole of Adventure Works:
This behavior is useful although a little confusing when using All members in the WITH clause.

MDX Query with Date Range Filter

I am new to the MDX queries. I am writing a MDX query to select a Measure value across months and I am putting date Range as filter here just to restrict no of Months returned. For eg I want Sales Revenue for each month in Date Range of 01-Jan-2014 to 30-Jun-2014. Ideally, it should give me sales value for six months i.e Jan, Feb, Mar, Apr, May and June. However when i write below query, I get error. PFB the below enter code here`ow query.
Select NON EMPTY {[Measures].[Target Plan Value]} ON COLUMNS,
NON EMPTY {[Realization Date].[Hierarchy].[Month Year].Members} ON ROWS
From [Cube_BCG_OLAP]
( { [Realization Date].[Hierarchy].[Date].&[20140101] :
[Realization Date].[Hierarchy].[Date].&[20141231] })
The error I get is The Hierarchy hierarchy already appears in the Axis1 axis. Here Date and Month Year belong to same dimension table named as Realization Date. Please help me. Thanks in advance.
You were missing the WHERE clause but I guess that was a typo. As your error message tells, you can't have members of the same hierarchy on two or more axes. In situations like this, you can use something like below which in MDX terminology is called Subselect.
Select NON EMPTY {[Measures].[Target Plan Value]} ON COLUMNS,
NON EMPTY {[Realization Date].[Hierarchy].[Month Year].Members} ON ROWS
From (
SELECT
[Realization Date].[Hierarchy].[Date].&[20140101] :
[Realization Date].[Hierarchy].[Date].&[20141231] ON COLUMNS
FROM [Cube_BCG_OLAP]
)
I like the exists function in this situation:
SELECT
NON EMPTY {[Measures].[Target Plan Value]}
ON COLUMNS,
NON EMPTY
EXISTS(
[Realization Date].[Hierarchy].[Month Year].Members
, {
[Realization Date].[Hierarchy].[Date].&[20140101] :
[Realization Date].[Hierarchy].[Date].&[20141231]
}
)
ON ROWS
FROM [Cube_BCG_OLAP]
Select
[Measures].[Target Plan Value]} On Columns
{
[Realization Date].[Hierarchy].[Date].&[20140101].Parent :
[Realization Date].[Hierarchy].[Date].&[20140631].Parent
}
On Rows
From [Cube_BCG_OLAP]
You need to create this same dimension only for filter in the cube, for example, dimension_filter -> hierarchy_filter -> level_filter

Select most reviewed courses starting from courses having at least 2 reviews

I'm using Flask-SQLAlchemy with PostgreSQL. I have the following two models:
class Course(db.Model):
id = db.Column(db.Integer, primary_key = True )
course_name =db.Column(db.String(120))
course_description = db.Column(db.Text)
course_reviews = db.relationship('Review', backref ='course', lazy ='dynamic')
class Review(db.Model):
__table_args__ = ( db.UniqueConstraint('course_id', 'user_id'), { } )
id = db.Column(db.Integer, primary_key = True )
review_date = db.Column(db.DateTime)#default=db.func.now()
review_comment = db.Column(db.Text)
rating = db.Column(db.SmallInteger)
course_id = db.Column(db.Integer, db.ForeignKey('course.id') )
user_id = db.Column(db.Integer, db.ForeignKey('user.id') )
I want to select the courses that are most reviewed starting with at least two reviews. The following SQLAlchemy query worked fine with SQlite:
most_rated_courses = db.session.query(models.Review, func.count(models.Review.course_id)).group_by(models.Review.course_id).\
having(func.count(models.Review.course_id) >1) \ .order_by(func.count(models.Review.course_id).desc()).all()
But when I switched to PostgreSQL in production it gives me the following error:
ProgrammingError: (ProgrammingError) column "review.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT review.id AS review_id, review.review_date AS review_...
^
'SELECT review.id AS review_id, review.review_date AS review_review_date, review.review_comment AS review_review_comment, review.rating AS review_rating, review.course_id AS review_course_id, review.user_id AS review_user_id, count(review.course_id) AS count_1 \nFROM review GROUP BY review.course_id \nHAVING count(review.course_id) > %(count_2)s ORDER BY count(review.course_id) DESC' {'count_2': 1}
I tried to fix the query by adding models.Review in the GROUP BY clause but it did not work:
most_rated_courses = db.session.query(models.Review, func.count(models.Review.course_id)).group_by(models.Review.course_id).\
having(func.count(models.Review.course_id) >1) \.order_by(func.count(models.Review.course_id).desc()).all()
Can anyone please help me with this issue. Thanks a lot
SQLite and MySQL both have the behavior that they allow a query that has aggregates (like count()) without applying GROUP BY to all other columns - which in terms of standard SQL is invalid, because if more than one row is present in that aggregated group, it has to pick the first one it sees for return, which is essentially random.
So your query for Review basically returns to you the first "Review" row for each distinct course id - like for course id 3, if you had seven "Review" rows, it's just choosing an essentially random "Review" row within the group of "course_id=3". I gather the answer you really want, "Course", is available here because you can take that semi-randomly selected Review object and just call ".course" on it, giving you the correct Course, but this is a backwards way to go.
But once you get on a proper database like Postgresql you need to use correct SQL. The data you need from the "review" table is just the course_id and the count, nothing else, so query just for that (first assume we don't actually need to display the counts, that's in a minute):
most_rated_course_ids = session.query(
Review.course_id,
).\
group_by(Review.course_id).\
having(func.count(Review.course_id) > 1).\
order_by(func.count(Review.course_id).desc()).\
all()
but that's not your Course object - you want to take that list of ids and apply it to the course table. We first need to keep our list of course ids as a SQL construct, instead of loading the data - that is, turn it into a derived table by converting the query into a subquery (change the word .all() to .subquery()):
most_rated_course_id_subquery = session.query(
Review.course_id,
).\
group_by(Review.course_id).\
having(func.count(Review.course_id) > 1).\
order_by(func.count(Review.course_id).desc()).\
subquery()
one simple way to link that to Course is to use an IN:
courses = session.query(Course).filter(
Course.id.in_(most_rated_course_id_subquery)).all()
but that's essentially going to throw away the "ORDER BY" you're looking for and also doesn't give us any nice way of actually reporting on those counts along with the course results. We need to have that count along with our Course so that we can report it and also order by it. For this we use a JOIN from the "course" table to our derived table. SQLAlchemy is smart enough to know to join on the "course_id" foreign key if we just call join():
courses = session.query(Course).join(most_rated_course_id_subquery).all()
then to get at the count, we need to add that to the columns returned by our subquery along with a label so we can refer to it:
most_rated_course_id_subquery = session.query(
Review.course_id,
func.count(Review.course_id).label("count")
).\
group_by(Review.course_id).\
having(func.count(Review.course_id) > 1).\
subquery()
courses = session.query(
Course, most_rated_course_id_subquery.c.count
).join(
most_rated_course_id_subquery
).order_by(
most_rated_course_id_subquery.c.count.desc()
).all()
A great article I like to point out to people about GROUP BY and this kind of query is SQL GROUP BY techniques which points out the common need for the "select from A join to (subquery of B with aggregate/GROUP BY)" pattern.