I'm trying to design a cube in SSAS 2008 for data whose base unit is Member-Month, meaning that for each member there is demographic data, certain other indicators that may change, and dollar amounts paid per month. I feel like I need to include MemberID and MonthKey in the same dimension, but this seems like the wrong approach in the case when I just want to see dollars by month. If so, would I put both a Month Key and the Member-Month Key in the fact table? Or use a surrogate key in the Member-Month dimension, but include the MemberID and MonthKey in it? It seems wrong to have Month in two different places (Member-Month and Date). Any help is appreciated!
If I understand your question correctly, you should create a member table, a month (or date) table, and a fact table with the columns FactKey, MemberKey, MonthKey, and Amount. Then you can create the Member and Month dimensions.
You should not add month data to the Member dimension. The relationship between the Month and Member dimensions is already established by the fact table, which holds everything needed to determine which member/month combinations exist.
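As a rough illustration of that layout, here is a minimal sketch using Python's built-in sqlite3 module in place of SSAS. The table names (DimMember, DimMonth, FactMemberMonth), the Gender column, and the sample amounts are assumptions made up for the example; only FactKey, MemberKey, MonthKey, and Amount come from the answer.

```python
import sqlite3

# In-memory database standing in for the relational source behind the cube.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimMember (MemberKey INTEGER PRIMARY KEY, Gender TEXT);
CREATE TABLE DimMonth  (MonthKey  INTEGER PRIMARY KEY, MonthName TEXT);
CREATE TABLE FactMemberMonth (
    FactKey   INTEGER PRIMARY KEY,
    MemberKey INTEGER REFERENCES DimMember(MemberKey),
    MonthKey  INTEGER REFERENCES DimMonth(MonthKey),
    Amount    REAL
);
INSERT INTO DimMember VALUES (1, 'F'), (2, 'M');
INSERT INTO DimMonth  VALUES (201701, 'Jan 2017'), (201702, 'Feb 2017');
-- One fact row per member per month (hypothetical amounts).
INSERT INTO FactMemberMonth VALUES
    (1, 1, 201701, 100.0),
    (2, 2, 201701,  50.0),
    (3, 1, 201702,  75.0);
""")

# Dollars by month: only the Month dimension is involved.
dollars_by_month = conn.execute("""
    SELECT m.MonthName, SUM(f.Amount)
    FROM FactMemberMonth f
    JOIN DimMonth m ON m.MonthKey = f.MonthKey
    GROUP BY m.MonthKey
    ORDER BY m.MonthKey
""").fetchall()

# Dollars by member: only the Member dimension is involved.
dollars_by_member = conn.execute("""
    SELECT f.MemberKey, SUM(f.Amount)
    FROM FactMemberMonth f
    GROUP BY f.MemberKey
    ORDER BY f.MemberKey
""").fetchall()
```

Month appears in exactly one dimension here, and the member/month pairing lives only in the fact rows.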
This is a very simple design problem and is easily implemented in SSAS.
Hope this helps.
I'm new to Crystal and trying to figure out how to bring in a value from a table with multiple matches without creating duplicate rows.
I have a table of inventory items, and each inventory item has several matches in a costs table. For example inventory item 1234 may have a match for a labor cost, a material cost, and an overhead cost.
What I'd like to do is to have a single row for the inventory item, a column for its labor cost, a column for its material cost, etc.
So far the best I've been able to do is to create a formula field for the labor cost and use a conditional to show the cost if the row has the labor cost type ID, but that just resulted in multiple rows where only one showed the cost and the rest showed zero.
What I would probably do in SQL (right or wrong) would be to create common table expressions for each type with the item id and the cost so each CTE would have only one cost value, then link those CTEs to the main query.
I know this is probably a really basic thing to do, but I couldn't seem to find any answers with Google. I'm happy to read over a resource if someone could just point me in the right direction.
Thanks in advance.
Add the Cost table to the report several times, once for each cost type.
This will create aliases that you can name based on the intended type.
Do the join on item ID but also add a record selection criterion for each alias to restrict it to only records of the desired cost type.
There are a couple of other options but the solution above is the cleanest approach.
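The same idea can be sketched in SQL terms, run here through Python's sqlite3 module rather than Crystal. The table and column names (Item, Cost, CostType, Amount) and the values are hypothetical, and the cost-type filter is placed in each join condition as a stand-in for the per-alias selection criterion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (ItemId INTEGER PRIMARY KEY);
CREATE TABLE Cost (ItemId INTEGER, CostType TEXT, Amount REAL);
INSERT INTO Item VALUES (1234);
INSERT INTO Cost VALUES
    (1234, 'LABOR',    10.0),
    (1234, 'MATERIAL', 25.0),
    (1234, 'OVERHEAD',  5.0);
""")

# The Cost table is joined three times under different aliases,
# each alias restricted to a single cost type.
rows = conn.execute("""
    SELECT i.ItemId, labor.Amount, material.Amount, overhead.Amount
    FROM Item i
    LEFT JOIN Cost labor
           ON labor.ItemId = i.ItemId AND labor.CostType = 'LABOR'
    LEFT JOIN Cost material
           ON material.ItemId = i.ItemId AND material.CostType = 'MATERIAL'
    LEFT JOIN Cost overhead
           ON overhead.ItemId = i.ItemId AND overhead.CostType = 'OVERHEAD'
""").fetchall()
```

Each item comes back as a single row with one column per cost type, because every alias can match at most one cost record per item.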
I ended up accomplishing this by creating sub reports for each of the values that I was wanting to see, linking them on the item id, and filtering each report on the cost type. If there's a better way I'd love to know.
Does anyone know if I can add two rows together so that I end up with just one row in Tableau (see screenshot)? So, if both rows are city Aachen and one row has a value for cost but not for purchasing power and the other row has a value for purchasing power but not cost, I would want just one row with both values. I am not interested in the columns "Table Name" and "Document Index(...". Thankful for any help!
Manipulating data like that in Tableau is usually a no-go. Nevertheless, you can try Tableau Prep, which should be able to do what you need here, or maybe a different tool (even Excel).
With that said, even though you have the info in two rows, the default approach for Tableau is always to aggregate data, so even if you have many rows with similar cases, once you take it to a viz using City (for example) as a dimension, this issue shouldn't really matter.
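To see why aggregation makes the two-row layout harmless, here is a small Python sketch of what summing by City does. The field names and numbers are invented for the example, and the null handling (empty values are skipped when summing) is written out by hand rather than taken from Tableau.

```python
# Two source rows for the same city, each holding only one of the measures.
rows = [
    {"City": "Aachen", "Cost": 120.0, "PurchasingPower": None},
    {"City": "Aachen", "Cost": None, "PurchasingPower": 98.5},
]

def sum_by_city(rows, measures=("Cost", "PurchasingPower")):
    """Sum each measure per city, skipping empty (None) values."""
    totals = {}
    for row in rows:
        city_totals = totals.setdefault(row["City"], {m: 0.0 for m in measures})
        for m in measures:
            if row[m] is not None:
                city_totals[m] += row[m]
    return totals

collapsed = sum_by_city(rows)
```

With City as the only dimension, the two partial rows end up as one line carrying both values.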
I have volume data for specific customers. The customer names come from Salesforce and the volume comes from another table. When I add each in Tableau, I get a nice table that seems to be working.
We can see that there are 19 values (~500). My ultimate goal is to sum these based on filters.
One way I discovered to do that is to use the syntax
{ FIXED [Account Id]: count([Volume]) }
But when I do that, I get: (screenshot not available)
When I change my function to count([volume]), I get a count of all joined rows (~250k).
My question is: how do I make this respect individual entries in the database rather than all the joined values? If there were a way to do the sum over distinct timestamps in another field, that would also work. Any other advice from you Tableau experts would be helpful.
Thanks!
I think I got it. In the database table I was trying to calculate from, there were 20 rows that needed to be summed. When the data was joined in SF, the rows were duplicated. The trick here was to take the sum of the max for each primary key:
SUM({ FIXED [Pk], [Name1] : MAX([Volume]) })
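The effect of that expression can be sketched in plain Python. The rows below are made-up stand-ins for the joined data: taking MAX per (Pk, Name1) collapses the duplicates the join created, and the outer SUM then counts each original row once.

```python
# (Pk, Name1, Volume) rows after the join; each source row appears several times.
joined_rows = [
    ("pk1", "Acme", 500),
    ("pk1", "Acme", 500),
    ("pk2", "Acme", 480),
    ("pk2", "Acme", 480),
    ("pk2", "Acme", 480),
]

# Naive sum counts every duplicate.
naive_total = sum(volume for _, _, volume in joined_rows)

# Equivalent of { FIXED [Pk], [Name1] : MAX([Volume]) }: one value per key.
max_per_key = {}
for pk, name, volume in joined_rows:
    key = (pk, name)
    max_per_key[key] = max(volume, max_per_key.get(key, volume))

# Equivalent of the outer SUM(...).
deduplicated_total = sum(max_per_key.values())
```

This only gives the right answer when the duplicated rows carry identical Volume values, which is the situation described above.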
I will try to explain the problem on an abstract level first:
I have X amount of data as input, which is always going to have a field DATE. Before, the dates that came as input were (after some processing) put in a table as output. Now, I am asked to output both the input dates and every date between the minimum date received and one year from that moment. If there was originally no input for some day between these two dates, all fields must come with 0, or equivalent.
Example. I have two inputs, one with '18/03/2017' and the other with '18/03/2018'. I now need to create output data for all the missing dates between '18/03/2017' and '18/04/2017'. So, output '19/03/2017' with every field set to 0, and the same for the 20th and 21st and so on.
I know how to do this programmatically, but not in PowerCenter. I've been told to do the following (which I have done, but I would like to know of a better method):
Get the minimum date, day0. Then, with an aggregator, create 365 fields holding day0+1, day0+2, and so on, to create an artificial year.
After that we do several transformations, such as sorting the dates and a union between them, to get the data ready for a joiner. The idea of the joiner is to do a full outer join between the original data and the data from the previous aggregator, whose fields will all be 0.
Then a router sends the rows that had actual dates (and fields without nulls) to one group and the rows where all fields are null to another; those null fields are set to 0 before everything is finally written to a table.
I am wondering how this can be achieved by, for starters, removing the need to add 365 days to a date. If I were to do this same process for 10 years instead of one, the task gets ridiculous really quickly.
I was wondering about an XOR type of operation, or some other function that would cut the number of steps needed for what I (maybe wrongly) feel is a simple task. Currently I need 5 steps just to know which dates are missing between two dates, a minimum and one year from that point.
I have tried to be as clear as possible, but if I failed at any point please let me know!
I'm not sure what the aggregator is supposed to do.
The same goes for the 'full outer' join. A normal join on a constant port is fine :)
Can you calculate the needed number of 'duplicates' before the joiner? In that case a lookup configured to return 'all rows' with a less-than-or-equal predicate can help make the mapping much more readable.
In any case you will need a helper table (or file) with a sequence of numbers between 1 and the number of potential duplicates (or more).
I use the time dimension in our warehouse, which has one row per day from 1753-01-01 through the following 200000 days, and an integer primary key column with values from 1 upward.
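The helper-table idea can be sketched outside PowerCenter with Python's sqlite3 module. The table and column names (Numbers, Input, StartDate, DaysNeeded) are hypothetical, and the date arithmetic uses SQLite's date() function; the part that carries over is the less-than-or-equal join against a table of sequential numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Helper table: a plain sequence of integers, 1..1000.
conn.execute("CREATE TABLE Numbers (n INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO Numbers VALUES (?)", [(i,) for i in range(1, 1001)])

# One input row saying how many consecutive days are needed from a start date.
conn.execute("CREATE TABLE Input (StartDate TEXT, DaysNeeded INTEGER)")
conn.execute("INSERT INTO Input VALUES ('2017-03-18', 4)")

# The <= predicate returns one helper row per needed day,
# and each helper row becomes one generated date.
generated = [
    row[0]
    for row in conn.execute("""
        SELECT date(i.StartDate, '+' || (num.n - 1) || ' days')
        FROM Input i
        JOIN Numbers num ON num.n <= i.DaysNeeded
        ORDER BY num.n
    """)
]
```

Going from one year to ten only changes DaysNeeded, provided the helper table has enough rows.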
You've said you know how to do this programmatically, and to be fair this problem is better suited to that sort of solution. That doesn't rule out PowerCenter by any means: just feed the two dates into a Java transformation and apply some code that produces all the dates between them, outputting one record for each. The Java transformation is ideal for record generation.
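The record-generation logic itself is short. Below is a sketch in Python (used here only to show the logic, not the Java transformation API); the function name fill_missing_dates and the span_days parameter are made up for the example, and the range is treated as inclusive of both ends, which is an assumption.

```python
from datetime import date, timedelta

def fill_missing_dates(values_by_date, span_days=365):
    """Return (date, value) for every day from the minimum input date
    through span_days later, using 0 where no input exists."""
    start = min(values_by_date)
    filled = []
    for offset in range(span_days + 1):
        day = start + timedelta(days=offset)
        filled.append((day, values_by_date.get(day, 0)))
    return filled

# Hypothetical input: two real dates with a gap between them.
sample = {date(2017, 3, 18): 7, date(2017, 3, 20): 3}
filled = fill_missing_dates(sample, span_days=3)
```

Because the missing days are produced directly with their zero values, the separate full outer join and router steps are not needed in this formulation.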
OK, so you could override your Source Qualifier to achieve this in the selection query itself (I'm giving an Oracle-based example as it's what I'm used to, and I'm assuming your input data comes from a table). I looked up the CONNECT BY syntax here:
SQL to generate a list of numbers from 1 to 100
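-- Compute the minimum date once, then add 0..364 days to it.
-- Oracle does not accept AS before a table alias, and MIN() cannot be mixed
-- with a non-aggregated column unless it is computed in its own subquery.
SELECT (mindate.d + levquery.n - 1) AS Port1
FROM (SELECT MIN(DATEFIELD) AS d FROM tablea) mindate,
     (SELECT LEVEL n FROM DUAL CONNECT BY LEVEL <= 365) levquery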
(Check if the query works for you - haven't access to pc to test it at the minute)
I'm trying to sum the data points by Months instead of individual days. The data is originating from an SQL Query so I'm thinking this may be the only way to do that. However, I would much rather do this inside of Report Builder 3.0. Any hints on how to do this?
For example, I want to see the number of tickets for the months of December and January as only two separate data points.
Can you create a new field (calculated, perhaps) on the dataset and group by that?
Otherwise, you should be able to create an expression on the graph's group that groups by the month of a certain field.
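Either way, the grouping comes down to reducing each date to its year and month before counting. A minimal Python sketch of that reduction, with invented ticket dates (in Report Builder the equivalent would be a calculated field or group expression, not Python):

```python
from collections import Counter
from datetime import date

# Hypothetical ticket dates returned by the query, one per ticket.
ticket_dates = [
    date(2012, 12, 3),
    date(2012, 12, 17),
    date(2013, 1, 5),
]

# Group key is (year, month), so December and January each become one data point.
tickets_per_month = Counter((d.year, d.month) for d in ticket_dates)
```

Including the year in the key keeps the same month from different years as separate points.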