Hierarchical SQL Queries, SSRS 2008, and Summing Columns - ssrs-2008

I am building an SSRS report and am having a bit of trouble figuring out how to implement it.
First, here is the basic structure of the data:
Level 1a
    Level 2a
    Level 2b
    Level 2c
        Level 3a
Level 1b
Level 1c
In the database, each level has an id and a parent id where all parent ids at the root (parents of level 1) are 0. Additionally, each level has a column with a numeric value. For example, I might have:
Level1a
    ParentId  = 0
    Id        = 1
    DataValue = 42
Level2a
    ParentId  = 1
    Id        = 2
    DataValue = 1
Currently, I have implemented a common table expression to get the hierarchy structure and levels of the data:
WITH cteTable (id, parentId, data, level) AS
(
    SELECT id,
           parentId,
           data,
           1 AS level
    FROM sampleTable
    WHERE parentId = 0
    UNION ALL
    SELECT b.id,
           b.parentId,
           b.data,
           c.level + 1 AS level
    FROM sampleTable AS b
    INNER JOIN cteTable AS c ON b.parentId = c.id
)
SELECT id,
       parentId,
       data,
       level
FROM cteTable
So, with the sample data from earlier, the results of the query would be:
+----+----------+------+-------+
| id | parentId | data | level |
+----+----------+------+-------+
|  1 |        0 |   42 |     1 |
|  2 |        1 |    1 |     2 |
+----+----------+------+-------+
From here, I need to build a report that will sum up the data to the root level. For example, the report would show that the data sum for id 1 is 43. One thing I don't know for certain is how deep the levels go - there will be some instances where there are no child levels and others where the tree goes a few levels deep.
There are two options I am considering right now. One is to create a drill-down report showing the data at each level, with a sum where available. For example:
-Level1a  SUM_LEVEL2_AND_LEVEL1a_DATA
    Level2a  DATA
    Level2b  DATA
    -Level2c  SUM_LEVEL3_AND_LEVEL2c_DATA
        Level3a  DATA
Level1b  DATA
Level1c  DATA
The other is to sum the data to the root and show a simple table in the report with the parent totals only. For example:
Level1a SUM_LEVEL1A_AND_ALL_CHILDREN_DATA
Level1b SUM_LEVEL1B_AND_ALL_CHILDREN_DATA
Level1c SUM_LEVEL1C_AND_ALL_CHILDREN_DATA
I can't seem to figure out how to get the drill-down report working with the googling I've done (there was a link that seemed useful, but I just couldn't get it to work).
Changing the database schema is not an option.
Does anyone know of any good resources I could use to get started or have any ideas on how to proceed with this?
Let me know if I am missing anything...

I finally found a solution to this one.
The common table expression (CTE) was a good start. The only change I made was in the recursive part: instead of carrying the id of the immediate parent for each child, I carried down the parent id that the immediate parent had (the grandparent's id). This meant every child row ended up tagged with its ultimate root ancestor's id instead of moving up only one step in the hierarchy. From there, it was pretty straightforward to use this in the report.
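As a minimal sketch of that idea (table and column names follow the question; the rootId column is my own addition, carrying the root ancestor's id down explicitly):

WITH cteTable (id, parentId, rootId, data) AS
(
    -- Anchor: root rows (parentId = 0) are their own root
    SELECT id, parentId, id AS rootId, data
    FROM sampleTable
    WHERE parentId = 0
    UNION ALL
    -- Recursion: each child inherits its parent's rootId,
    -- so every row ends up tagged with its ultimate ancestor
    SELECT b.id, b.parentId, c.rootId, b.data
    FROM sampleTable AS b
    INNER JOIN cteTable AS c ON b.parentId = c.id
)
SELECT rootId,
       SUM(data) AS dataSum  -- 43 for id 1 with the sample data
FROM cteTable
GROUP BY rootId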
I put the CTE in a stored procedure of its own, whose only purpose is to bring back the entire hierarchy. From there, I created another stored procedure that combined the CTE's output with a different stored procedure; at that point I applied the filters and returned the final result set.
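The consuming procedure can capture the hierarchy with INSERT ... EXEC; a rough sketch with hypothetical procedure names, assuming usp_GetHierarchy simply SELECTs the CTE shown above:

CREATE PROCEDURE dbo.usp_GetRootTotals
AS
BEGIN
    -- Capture the full hierarchy from the dedicated hierarchy procedure
    DECLARE @hierarchy TABLE (id int, parentId int, rootId int, data int);
    INSERT INTO @hierarchy
    EXEC dbo.usp_GetHierarchy;

    -- Apply any report filters here, then hand SSRS the final result set
    SELECT rootId, SUM(data) AS dataSum
    FROM @hierarchy
    GROUP BY rootId;
END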

Related

PostgreSQL: group into predefined groups, where one group acts as a catch-all for anything that's left

I have a database table with data similar to this.
create table DataTable (
    name  text,
    value numeric
);
insert into DataTable values
    ('A', 1), ('A', 2), ('B', 3), ('Other', 5), ('C', 1);
I have three predefined group names: A, B, and Other.
I want to sum the values so that C gets aggregated into Other.
Expected output
Name  | sum
A     | 3
B     | 3
Other | 6
I know the result could be achieved with two subqueries: one that groups only A, B, and Other, and another that aggregates everything that isn't A, B, or Other into Other, then combine the two.
My question is: is there a way to do this without multiple subqueries? Reason being, the actual query in the application is a monster with multiple joins and aggregations; repeating the same piece of code multiple times would make it very hard to maintain. So I'm hoping there is a clever trick.
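For illustration, the two-subquery shape described above might look like this (a sketch against the sample table):

SELECT name, SUM(value) AS sum
FROM (
    -- rows already in a predefined group
    SELECT name, value FROM DataTable WHERE name IN ('A', 'B', 'Other')
    UNION ALL
    -- everything else is remapped to Other
    SELECT 'Other', value FROM DataTable WHERE name NOT IN ('A', 'B', 'Other')
) t
GROUP BY name;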
Simple CASE might do the job:
SELECT
    CASE
        WHEN name != 'A' AND name != 'B' THEN 'Other'
        ELSE name
    END AS mapped_name,
    SUM(value) AS sum
FROM DataTable
GROUP BY mapped_name;
without any subquery at all.

Tableau Calculated Field Aggregate SUM for Lowest Level & MIN for All Others

I am having a difficult time aggregating the lowest level of a row group with SUM while aggregating all higher row headers with MIN.
My data is structured in a row-group hierarchy like this:
Assembly
    Parent
        Child
            Orders
            Inventory
And an example of my dataset might look like:
Assembly | Parent | Child | Supply Type | Available Quantity
A1       | P1     | C1    | Orders      | 0
A1       | P1     | C1    | Inventory   | 50
A1       | P1     | C2    | Orders      | 100
A1       | P1     | C2    | Inventory   | 0
A1       | P2     | C3    | Orders      | 50
A1       | P2     | C3    | Inventory   | 100
I want to SUM the measures in the Supply Type rows when I collapse up to the Child row group, but show the MIN when I collapse to the Parent and Assembly row groups. For instance, if I collapse at the Child level I want to SUM, so that C1 shows a total of 50 and C2 shows a total of 100; but when I collapse at the Parent level I want to see the MIN, so that P1 shows 50 and P2 shows 150.
I was able to partially achieve what I am trying to do with:
{ INCLUDE [Child],[Parent],[Assembly] : MIN({ EXCLUDE [Supply Type] : SUM([Available Quantity])})}
But now, at the lowest level, the Orders and Inventory measures show the same numbers (which is not accurate), since the EXCLUDE is essentially ignoring the categories at the Supply Type row level.
OK, briefly: my recommendation is to (slightly) alter your approach. There are ways to do literally what you describe, but they are more complicated and brittle than most people would want just for the convenience of expanding and collapsing levels with the + and - controls (not to mention a bit confusing to the user), especially when there are alternatives that are simpler and likely more reliable.
Here is one alternative method using parameters and calculated fields -- which is a pretty general approach for many things in Tableau.
Define a parameter that specifies how many levels of your dimension hierarchy you wish to display. Call it, say, Display_Level or whatever makes sense to you. Give it the data type string, and provide a list of allowable values, say "Assembly", "Parent" etc.
Show your parameter control and choose how you wish it to look, say a combo box or radio buttons.
Define one calculated field to be the displayed dimension as follows, and put on the Rows shelf or others as desired.
case [Display_Level]
when "Assembly" then [Assembly]
when "Parent" then [Assembly] + " - " + [Parent]
when "Child" then [Assembly] + " - " + [Parent] + " - " + [Child]
...
end
Define another calculated field to be the displayed measure as follows, and put on the Columns shelf or others as desired.
case [Display_Level]
when "Assembly" then MIN([Available Quantity])
when "Parent" then MIN([Available Quantity])
when "Child" then SUM([Available Quantity])
...
end
This first draft may not be exactly what you want, but the approach can easily be extended. You are just using a parameter to control the display detail instead of clicking on the + or - next to a field name.
If you don't like the full dimension names separated by dashes, you can create multiple dimension fields to serve as labels, one per level, and have each either return the corresponding field or null depending on the parameter setting. For example, the Parent Label field could be defined as:
if [Display_Level] <> "Assembly" then [Parent] end
Do something similar for the others, then put all your Label fields on the viz to create row or column headers as desired. This will behave much like expand/collapse, except that you control the detail using the parameter. There are several ways to keep the Null from displaying.
If your measure calculations are more complex than MIN() or SUM() - say, they use LOD calcs to roll up values first - then just modify your measure calculation accordingly.

backtesting in kdb; updating/passing table as we parse each row of table

I have a trade table of historical execution records with columns timestamp, ric, side, price, and quantity (the rics are all equities). Additionally, I have aj'ed the futures price snapshot table onto each execution time, so the trade table now contains: timestamp, ric, side, price, quantity, futures_price.
I am trying to create an intra-day backtesting system where, as each execution record is processed (via { BACKTESTING_LOGIC_HERE } each trade), a different set of logic is used to decide hedge timing.
Is it possible to create a hedge table recording the timestamp of each futures execution, the execution price, trade_qty, and cumulative_qty, without writing to disk? Basically, I want to know whether it is possible to update the hedge table dynamically and pass it along as each execution record is processed.
I was looking at over or scan, but I wasn't sure if that was the right approach. Can you provide some insight on this?
Thank you!
Yes, it sounds like over is what you need to use in this case.
It allows you to pass in an initial state, update it, and pass it back in for the next iteration.
For example:
q){.[x;(y`sym;`cnt);+;1]}/[([sym:()] cnt:`long$());trades]
sym | cnt
----| ---
ORCL| 114
YHOO| 110
AAPL| 105
IBM | 124
NOK | 120
CSCO| 112
MSFT| 95
DELL| 109
GOOG| 111
In this simple example, a trade table is iterated over (it is the second argument to over), and the initial state is the simple keyed table ([sym:()] cnt:`long$()).
On each iteration, we simply add 1 to the count for the relevant sym. In real usage, you would perform your backtesting here and return the updated hedge table from the lambda function; that updated table is then passed back into the function on the next iteration (e.g. in this example, each time the cnt for a sym is increased, the next iteration receives the table with that cnt increased).

Crystal Reports - Create subreport with column range [col1...col60] as datasource?

I am adept in both SQL and CR, but this is something I've never had to do.
In CR, I load a table that will always contain exactly one record. It has a range of columns (like Column1, Column2 ... Column60). (Bad design, I know, but I can't do anything to change that.)
Thanks to this old design I have to manually add each column in my report like this:
-----------
|  TABLE  |
-----------
| Column1 |
| Column2 |
| Column3 |
| ...     |
-----------
Now I would like to create a subreport and a datasource for it in such a way that [Column1...Column60] becomes a collection [Row1...Row60]. I could then use the detail section of the subreport to generate the table dynamically. That would save me a lot of time.
Is there any way to do that? Maybe a different approach to what I had in mind?
Edit
@Siva: I'll describe it the best way I can. The table consists of 500+ columns and will only ever hold one record (never more). Because normalization was never taken into account when these tables were created (Objective-C / DBF ages), columns like Brand01, Brand02, Brand03 ... Brand60 should have been placed in a separate table named "Brands".
The document itself is pretty straightforward considering there's only one record, but some columns have to be pivoted (stacked vertically) and placed in a table layout on the document, which is a lot of work to do manually. That's why I wanted to feed a range of columns into my subreport, so I could use its detail section to generate the table layout automatically.
OK, got it... I will try to answer to the extent possible.
You need two columns in the report: the first showing the 60 column names as 60 rows, and the second showing the corresponding data for each column. I can think of two ways to do this.
If the columns are static and the report needs to be developed only once, then (though it's a tough job) manually create 120 formulas: 60 for the row names, where you write the column names, and 60 for the data of the respective columns, and place them in the report. Since you have only one record, you will get the correct data. Like below:
Formula 1:
column1 name // write manually
Formula 2:
databasefield for column1 // this has the data for column1
The two together form one row in the report; continuing like this, you end up with 120 formulas and 60 rows, and you don't need a subreport here - the main report will do the job.
Since you are expecting dynamic behaviour (though the columns are static), you can create a view on the database side, or a datatable (please note I have no idea about datatables; use them at your convenience). Create it in such a way that it has two columns, and in the report use a cross-tab; that will give you the dynamic behaviour.
In the cross-tab, column 1 will be the rows part and column 2 will be the data.
Here, too, I don't see any requirement for a subreport; you can use the main report directly. If you want a subreport, you can use one as well - no harm, since you only have one record.
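A sketch of what such a view could look like in plain SQL (table, view, and column names are hypothetical, following the Brand01...Brand60 example from the question; only the first three columns are shown):

CREATE VIEW BrandRows AS
SELECT 'Brand01' AS ColumnName, Brand01 AS ColumnValue FROM WideTable
UNION ALL
SELECT 'Brand02', Brand02 FROM WideTable
UNION ALL
SELECT 'Brand03', Brand03 FROM WideTable;
-- ...and so on for the remaining columns up to Brand60

The report (or cross-tab) then just consumes ColumnName/ColumnValue pairs as ordinary rows.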

postgres ANY() with BETWEEN Condition

In case someone is wondering: I am recycling a different question I answered myself, because I realized that my problem has a different root cause than I thought.
My question actually seems pretty simple, but I cannot find a way:
How do I query postgres to find out whether any element of an array is between two values?
The documentation states that a BETWEEN b AND c is equivalent to a >= b AND a <= c.
This, however, does not work on arrays:
ANY({1,101}) BETWEEN 10 AND 20 has to be false,
while
ANY({1,101}) > 10 AND ANY({1,101}) < 20 has to be true
({1,101} meaning an array containing the two elements 1 and 101).
How can I solve this problem without resorting to workarounds?
Regards,
BillDoor
EDIT, for clarity: the scenario I have is that I am querying an XML document via xpath(), but for this problem a column containing an array of type int[] does the job.
id::int | numbers::int[] | name::text
1       | {1,3,200}      | Alice
2       | {21,100}       | Bob
I want all names where there is a number between 20 and 30 - so I want Bob.
The query
SELECT name FROM table WHERE ANY(numbers) > 20 AND ANY(numbers) < 30
will return Alice and Bob, because Alice has numbers > 20 as well as other numbers < 30.
The BETWEEN syntax is not allowed in this case; however, BETWEEN only gets mapped to >= 20 AND <= 30 internally anyway.
Quoting the docs on the BETWEEN operator's mapping to plain comparisons:
There is no difference between the two respective forms apart from the
CPU cycles required to rewrite the first one into the second one
internally.
PS: just to avoid adding a new question for this, how can I solve
id::int | numbers::int[] | name::text
1       | {1,3,200}      | Alice
2       | {21,100}       | Alicia
SELECT id FROM table WHERE ANY(name) LIKE 'Alic%'
desired result: 1, 2
I can only find examples of matching one value against multiple patterns, but not of matching one pattern against a set of values. Besides, the shown syntax is invalid: ANY has to be the second operand, but the second operand of LIKE has to be the pattern.
exists (select * from (select unnest(array[1,101]) x) q1 where x between 10 and 20)
You can create a function based on this query.
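Applied to the example table from the question (a sketch; the table is called tbl here, since "table" is a reserved word):

SELECT name
FROM tbl
WHERE EXISTS (
    SELECT 1
    FROM unnest(numbers) AS x  -- one row per array element
    WHERE x BETWEEN 20 AND 30
);
-- returns only Bob for the sample data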
A second approach:
select int4range(10,20,'[]') @> any(array[1, 101])
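Against the sample table, that becomes (again assuming the table is named tbl):

SELECT name
FROM tbl
WHERE int4range(20, 30, '[]') @> ANY(numbers);  -- true if the range contains some element
-- returns Bob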
For timestamps and dates it looks like:
select tsrange('2015-01-01'::timestamp, '2015-05-01'::timestamp, '[]') @> any(array['2015-05-01', '2015-05-02']::timestamp[])
For more info, read the documentation on range operators.