Tableau - Bins based on Fixed LOD Customer Units

What I'm trying to do: Create a histogram showing the distribution of customers based on their annual ordering size (1-10 units, 11-50, and so on), where a customer is defined by a Combined Field (Child + Zip Code).
Problem: I cannot figure out a way to calculate the different bins correctly. I've seen plenty of posts on using bins in Tableau, but none calculated off a unique id like mine. It seems the customers are being put into every category (1-10, 11-20, etc.) instead of a single category once their unit sales go beyond the <= threshold. Perhaps I'm misunderstanding FIXED LOD calcs.
End goal: Get a count of the customers in these different ordering ranges to display on a histogram.
Having no luck with this formula:
IF { FIXED [UID_Cust] : SUM([Units]) } <= 10 THEN '1-10'
ELSEIF { FIXED [UID_Cust] : SUM([Units]) } <= 20 THEN '11-20'
ELSEIF { FIXED [UID_Cust] : SUM([Units]) } <= 50 THEN '21-50'
ELSEIF { FIXED [UID_Cust] : SUM([Units]) } <= 250 THEN '51-250'
ELSE '>250'
END
Here is a picture of what I'm currently getting. Everything would be perfect if I could replace those little blocks with just one number, the count of the customers in that range.

Turns out the problem was the LOD calc. I needed to add the year inside the calculation, since I forgot that FIXED LODs are computed before ordinary worksheet filters are applied.
{ FIXED [UID_Cust], YEAR([Order_Date]) = 2017 : SUM([Units]) }
Then I saved this as its own calculated field, "UID_Sales":
IF [UID_Sales] <= 10 THEN '1-10'
ELSEIF [UID_Sales] <= 20 THEN '11-20'
ELSEIF [UID_Sales] <= 50 THEN '21-50'
ELSEIF [UID_Sales] <= 250 THEN '51-250'
ELSE '>250'
END
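From there, to build the histogram, one approach (just a sketch; it assumes the IF/ELSEIF calculation above is saved as a dimension, e.g. "Customer Size Bin") is to put that bin field on Columns and a distinct count of customers on Rows:
COUNTD([UID_Cust])
Because the FIXED LOD returns a single total per UID_Cust, each customer lands in exactly one bin, so COUNTD gives the number of customers in each ordering range.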

Related

Looking for advice on improving a custom function in AnyLogic

I'm estimating last-mile delivery costs in a large urban network using by-route distances. I have over 8000 customer agents and over 100 retail store agents plotted on a GIS map using lat/long coordinates. Each customer receives deliveries from its nearest store (by route). The goal is to get two distance measures in this network for each store:
d0_bar: the average distance from a store to all of its assigned customers
d1_bar: the average distance between all pairs of customers assigned to the same store
I've written a startup function with a simple foreach loop to assign each customer to a store based on by-route distance (customers have a parameter, "customer.pStore" of Store type). This function also adds, in turn, each customer to the store agent's collection of customers ("store.colCusts"; it's an array list with Customer type elements).
Next, I have a function that iterates through the store agent population and calculates the two average distance measures above (d0_bar & d1_bar) and writes the results to a txt file (see code below). The code works, fortunately. However, the problem is that with such a massive dataset, the process of iterating through all customers/stores and retrieving distances via the openstreetmap.org API takes forever. It's been initializing ("Please wait...") for about 12 hours. What can I do to make this code more efficient? Or, is there a better way in AnyLogic of getting these two distance measures for each store in my network?
Thanks in advance.
//for each store, record all customers assigned to it
for (Store store : stores)
{
    distancesStore.print(store.storeCode + "," + store.colCusts.size() + "," + store.colCusts.size()*(store.colCusts.size()-1)/2 + ",");

    //calculates average distance from store j to customer nodes that belong to store j
    double sumFirstDistByStore = 0.0;
    int h = 0;
    while (h < store.colCusts.size())
    {
        sumFirstDistByStore += store.distanceByRoute(store.colCusts.get(h));
        h++;
    }
    //distanceByRoute returns metres; dividing by 1609.34 converts the average to miles
    distancesStore.print((sumFirstDistByStore/store.colCusts.size())/1609.34 + ",");

    //calculates average of distances between all customer nodes belonging to store j
    double custDistSumPerStore = 0.0;
    int loopLimit = store.colCusts.size();
    int i = 0;
    while (i < loopLimit - 1)
    {
        int j = 1;
        while (j < loopLimit)
        {
            custDistSumPerStore += store.colCusts.get(i).distanceByRoute(store.colCusts.get(j));
            j++;
        }
        i++;
    }
    distancesStore.print((custDistSumPerStore/(loopLimit*(loopLimit-1)/2))/1609.34);
    distancesStore.println();
}
Firstly a few simple comments:
Have you tried timing a single distanceByRoute call? E.g. can you try running store.distanceByRoute(store.colCusts.get(0)); just to see how long a single call takes on your system. Routing is generally pretty slow, but it would be good to know what the speed limit is.
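For example, a quick way to time one call from within a function that has a store reference (just a sketch using the names from your code):
long start = System.currentTimeMillis();
double d = store.distanceByRoute(store.colCusts.get(0));
traceln("one distanceByRoute call took " + (System.currentTimeMillis() - start) + " ms, distance = " + d);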
The first simple change is to use Java parallelism. Instead of using this:
for (Store store : stores)
{ ...
use this:
stores.parallelStream().forEach(store -> {
...
});
This will process the stores in parallel using the standard Java Streams API.
It also looks like the second loop, where the average distance between customers is calculated, doesn't take mirroring into account. That is to say, distance a->b is equal to b->a. Hence, for example, 4 customers require only 6 calculations: 1->2, 1->3, 1->4, 2->3, 2->4, 3->4. Whereas with 4 customers your second while loop performs 9 calculations: i=0, j in {1,2,3}; i=1, j in {1,2,3}; i=2, j in {1,2,3}, which seems wrong unless I am misunderstanding your intention.
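One way to count each unordered pair exactly once (a sketch based on the loop in your code) is to start the inner index at i + 1; the n*(n-1)/2 divisor you already use in the print statement then matches the number of distance calls:
int i = 0;
while (i < loopLimit - 1)
{
    int j = i + 1; // start after i so each pair (i, j) is visited once
    while (j < loopLimit)
    {
        custDistSumPerStore += store.colCusts.get(i).distanceByRoute(store.colCusts.get(j));
        j++;
    }
    i++;
}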
Generally, for long-running operations it is a good idea to include some traceln calls to show progress, with associated timing.
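For example, something along these lines around each store's calculations (a sketch; storeCode and colCusts come from your model):
long t0 = System.currentTimeMillis();
// ... the distance calculations for this store ...
traceln(store.storeCode + ": " + store.colCusts.size() + " customers processed in " + (System.currentTimeMillis() - t0) + " ms");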
Please have a look at the above and post your results. With more information, additional performance improvements may be possible.

Manipulating last two rows if there's data based on a Cut date

This question is a slightly varied version of this one...
Now I'm using Measures instead of Calculated columns and the date is static instead of having it based on a dropdown list.
Here's the Power BI test .pbix file:
https://drive.google.com/open?id=1OG7keqhdvDUDYkFQFMHyxcpi9Zi6Pn3d
This printscreen describes what I'm trying to accomplish:
Basically, the date in the P6 Update table is used as a cut date and will be fixed/static. It's imported from an Excel sheet where the user can customize it however they want.
Here's what should happen when a matching row in Test data table is found for P6 Update date:
column Earned Daily - must have its value summed with the next row if there's one;
column Earned Cum - must grab the next row's value;
all the previous rows should remain intact, that is, their values won't change;
all subsequent rows must have their values assigned 0.
So for example:
If P6 Update is 1-May-2018, this is the expected result (Date, Earned Daily, Earned Cum):
1-May 7,498 52,106
2-May 0 0
If P6 Update is 30-Apr-2018, this is the expected result:
30-Apr 13,173 50,699
1-May 0 0
2-May 0 0
If P6 Update is 29-Apr-2018, this is the expected result:
29-Apr 11,906 44,608
30-Apr 0 0
1-May 0 0
2-May 0 0
and so on...
Hope this makes sense.
This is easier in Excel, but trying to do this in Power BI is making me go nuts.
I will ignore previously asked related questions and start from scratch.
First, create a measure:
Current Earn =
CALCULATE (
    SUM ( 'Test data'[Value] ),
    'Test data'[Act Rem] = "Actual Units",
    'Test data'[Type] = "Current"
)
This measure will be used in other measures, to save you from typing all these conditions ("Actual Units" and "Current") again and again. It's a great practice to re-use measures in other measures - saves work, makes code cleaner and easier to refactor.
Create another measure:
Cut Date = SELECTEDVALUE('P6 Update'[Date])
We will use this measure whenever we need a cut off date. Please note that it does not have to be hard-coded - if P6 table contains a list of dates, you can create a pull-down slicer from the dates, and can choose the cut-off date dynamically. The formula will work properly.
Create a third measure:
Next Earn =
VAR Cut_Date = [Cut Date]
VAR Current_Date = MAX ( 'Test data'[Date] )
VAR Next_Date = Current_Date + 1
VAR Current_Earn = [Current Earn]
VAR Next_Earn = CALCULATE ( [Current Earn], 'Test data'[Date] = Next_Date )
RETURN
    SWITCH (
        TRUE (),
        Current_Date < Cut_Date, Current_Earn,
        Current_Date = Cut_Date, Current_Earn + Next_Earn,
        BLANK ()
    )
I am not sure if "Next Earn" is a good name for it; hopefully you will find a more intuitive name. The way it works: we save all the necessary inputs into variables, and then use the SWITCH function to define the results. Hopefully it's self-explanatory. (Note: if you need 0 rather than blank for the dates after the Cut Date, replace BLANK() with 0.)
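For instance, to match the expected results above (0 in the rows after the cut date), the variant mentioned in the note would look like this; it is the same measure with only the default branch changed:
Next Earn =
VAR Cut_Date = [Cut Date]
VAR Current_Date = MAX ( 'Test data'[Date] )
VAR Next_Date = Current_Date + 1
VAR Current_Earn = [Current Earn]
VAR Next_Earn = CALCULATE ( [Current Earn], 'Test data'[Date] = Next_Date )
RETURN
    SWITCH (
        TRUE (),
        Current_Date < Cut_Date, Current_Earn,
        Current_Date = Cut_Date, Current_Earn + Next_Earn,
        0
    )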
Finally, we define a measure for cumulative earn. It does not require any special logic, because the previous measure takes care of it properly:
Cum Earn =
VAR Current_Date = MAX ( 'Test data'[Date] )
RETURN
    CALCULATE (
        [Next Earn],
        FILTER ( ALL ( 'Test data'[Date] ), 'Test data'[Date] <= Current_Date )
    )
Result:

MDX Calculated member filter by dimension attribute and member value

I have MDX (similar to the one asked and answered here):
(
[PX Market].[PX MARKET NAME].&[Elbas],
[Measures].[PX QUANTITY]
)
It works for me (it filters the measure to the value "Elbas" only). But I need additional filtering: to keep only values that are less than or greater than 0. There should be some condition similar to "[Measures].[PX QUANTITY] < 0", but I do not know how to implement it.
Thanks for any advice.
Ondra
The table looks similar to this:
PX_MARKET_NAME; PX_QUANTITY
Elbas; 5
Elbas; -3
Elspot; 4
In the result I need only the 2nd value (-3), which belongs to Elbas and is smaller than 0.
So far I have tried this, but it is not working :(
FILTER
(
[PX Market].[PX MARKET NAME].&[Elbas],
[Measures].[PX PURCHASE]
) < 0
Try this:
IIF(
([PX Market].[PX MARKET NAME].&[Elbas],[Measures].[PX QUANTITY]) < 0,
([PX Market].[PX MARKET NAME].&[Elbas],[Measures].[PX QUANTITY]),
NULL
)
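If you need this as a reusable calculated member in a query, the same IIF can be wrapped in a WITH MEMBER clause. A sketch, where the member name and [Your Cube] are placeholders for your own names:
WITH MEMBER [Measures].[Elbas Negative Quantity] AS
    IIF(
        ([PX Market].[PX MARKET NAME].&[Elbas], [Measures].[PX QUANTITY]) < 0,
        ([PX Market].[PX MARKET NAME].&[Elbas], [Measures].[PX QUANTITY]),
        NULL
    )
SELECT
    [Measures].[Elbas Negative Quantity] ON COLUMNS
FROM [Your Cube]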

How to join a string and float value in a calculated field using Tableau?

I am trying to create a calculated field where the output is "x% Over the Goal", but I cannot because of the mix of string and float values. Essentially, what I want is the following:
IF Actuals > Goal THEN Actuals/Goal+'Over Goal'
ELSEIF Actuals < Goal then Actuals/Goal+'Under Goal'
ELSE 'At Goal' END
Is something like this possible? I've tried creating two separate calculated fields and concatenating them, but that does not work either.
Any help would be greatly appreciated.
You can achieve this in a single calculated field:
IF [Actuals] > [Goal] THEN STR(FLOAT([Actuals] / [Goal])) + " Over Goal"
ELSEIF [Actuals] < [Goal] THEN STR(FLOAT([Actuals] / [Goal])) + " Under Goal"
ELSE "At Goal" END

SSRS - Expression to count the number of dates in a column

In SSRS, how do I count the number of dates present in a column?
I am developing a report where I need to display the total number of dates where Date_of_Delivery.Value is updated in a specific month, and I also need to display the same count where Date_of_Delivery.Value is not updated.
Please assist me.
If you want a count of the number of times a date falls in a certain time period, you would use IIF to perform the check and then SUM the results.
=SUM(IIF(Fields!Date_of_Delivery.Value >= CDATE("01/01/2016") AND Fields!Date_of_Delivery.Value <= CDATE("01/31/2016"), 1, 0))
The IIF checks whether the Date of Delivery is between the two dates and returns 1 if true, otherwise 0. The SUM then adds up all the results.
You should probably use some Parameters for your date so you can just change the parameters instead of the code in the report.
=SUM(IIF(Fields!Date_of_Delivery.Value >= Parameters!START_DATE.Value AND Fields!Date_of_Delivery.Value <= Parameters!END_DATE.Value, 1, 0))
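For the count where Date_of_Delivery.Value is not updated, assuming "not updated" means the field is empty, a similar expression using the built-in IsNothing function should work:
=SUM(IIF(IsNothing(Fields!Date_of_Delivery.Value), 1, 0))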