How to analyze/group by text in Power BI - group-by

I'm quite new to Power BI and wonder how to analyze the interests of persons over time.
My table looks like this:
User, InterestedIn, Date
UserA, Sports, 2018-10-02
UserB, Sports, 2018-10-05
UserC, Reading, 2018-10-05
UserC, Math, 2018-11-03
....
I now want to build visualisations showing how the interests develop over a period of time. I therefore related the Date column to my date table, but when I drag the "InterestedIn" and Date columns into a visual, I'm not able to generate a meaningful output.
Do I have to add something like a "Count" column with numeric values?
Thanks in advance for your help!

Best practice would be to use the Modeling ribbon to add a count measure, e.g.
# Interests = COUNTROWS ( Interests )
and/or
# Users = DISTINCTCOUNT ( Interests[User] )
Then add the measure to the Values well or similar (depending on which visuals you want).
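To see what the two measures compute once a visual slices them by month and interest, here is a minimal Python sketch using the sample rows from the question. The function name counts_by_month and the "YYYY-MM" month bucket are illustrative assumptions, not part of Power BI.

```python
from collections import defaultdict

# Sample rows from the question: (user, interest, ISO date).
rows = [
    ("UserA", "Sports", "2018-10-02"),
    ("UserB", "Sports", "2018-10-05"),
    ("UserC", "Reading", "2018-10-05"),
    ("UserC", "Math", "2018-11-03"),
]


def counts_by_month(rows):
    """Return {(month, interest): (row_count, distinct_user_count)}."""
    row_counts = defaultdict(int)
    users = defaultdict(set)
    for user, interest, day in rows:
        key = (day[:7], interest)  # "YYYY-MM" acts as the month bucket
        row_counts[key] += 1       # analogue of COUNTROWS
        users[key].add(user)       # analogue of DISTINCTCOUNT on User
    return {key: (row_counts[key], len(users[key])) for key in row_counts}


result = counts_by_month(rows)
```

Each key corresponds to one cell of a month-by-interest visual; the first number plays the role of "# Interests" and the second of "# Users".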


Power BI: Filter sales table with multiple locations on respective start dates from different table

I've tried to find a similar thread on this, but have not been able to do so. I'm pretty new to Power BI, so I might not know what I'm looking for. I could really use some advice.
I have a sales table ('SalesTable') that contains all the sales from different store locations. The table includes all the sales from each store beginning in January 2021, but the stores were incorporated on different dates in 2021, so I need a filter that returns only the sales for each store from the date on which that store was incorporated.
Simplified, the tables look like this:
'SalesTable'
'Stores'
The two tables are joined on storeID. SalesTable is also connected to a DAX-created Calendar table. The Stores table is not connected to the Calendar table (maybe it should be?).
I need to be able to filter the report so that it only returns sales dated on or after the respective incorporated date.
Like this:
'Desired output'
I am not sure what the optimal way to go about this is: whether I should make a calculated table from SalesTable, or whether a measure is sufficient to filter the report. Any suggestions, tips or solutions would be highly appreciated :)
You can use this measure (inc and IncSale appear to be the answerer's names for the stores and sales tables):
sumIncorp =
VAR __maxIncorp =
    CALCULATE ( MAX ( inc[IncorporatedDate] ), FILTER ( inc, inc[StoreID] = SELECTEDVALUE ( IncSale[StoreID] ) ) )
RETURN
    CALCULATE ( SUM ( IncSale[Amount] ), FILTER ( IncSale, IncSale[Date] >= __maxIncorp ) )
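The filtering logic behind the measure can be sketched outside DAX as follows. This is only an illustration in Python: the store IDs, dates and amounts are hypothetical (the original tables were posted as images), and sum_since_incorporation is an assumed helper name.

```python
from datetime import date

# Hypothetical data: store_id -> incorporated date.
stores = {1: date(2021, 3, 1), 2: date(2021, 6, 15)}

# Hypothetical sales rows: (store_id, sale_date, amount).
sales = [
    (1, date(2021, 1, 10), 100.0),  # before store 1 was incorporated
    (1, date(2021, 3, 1), 50.0),    # on the incorporation date: kept
    (2, date(2021, 5, 1), 70.0),    # before store 2 was incorporated
    (2, date(2021, 7, 1), 30.0),
]


def sum_since_incorporation(sales, stores):
    """Sum amounts per store, keeping only sales on or after incorporation."""
    totals = {}
    for store_id, sale_date, amount in sales:
        if sale_date >= stores[store_id]:
            totals[store_id] = totals.get(store_id, 0.0) + amount
    return totals


totals = sum_since_incorporation(sales, stores)
```

As in the measure, the comparison is evaluated per store, so each store is cut off at its own date rather than at one global date.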

Post-Aggregation Join of two tables in Tableau

I'm new to Tableau and I think I need to do some kind of post-aggregation join. My goal is to match some data from Google Search Console to some other regional data concerning hotels. This way, I hope to see whether hotels in a certain region perform better or worse than their popularity in Google searches would suggest.
I have one table with the hotel data, which looks like this:
Table 1
Here we have three hierarchical region levels: country, state and region (and some KPI that is aggregated according to the drill-down level).
Table 2
Table 2 does not follow the hierarchical structure of Table 1, but has the same regions.
What I want Tableau to do:
I want Tableau to join the regions on the lowest region level, but NOT to aggregate the impressions KPI. So, when I drill up to the country level, I want the "random KPI" to be summed to 389, but the impressions should be 40.000 only. You might ask yourself why: it's a different thing if somebody searches only for "country 1" or if they search for a state or region of this country. For this analysis, the goal is not to aggregate the impressions across regions.
I would be glad for any hints on how to do this. I thought about doing a blend, which I thought was a kind of post-aggregation join, but I found out that if I join the lowest region level of Table 1 with the region variable of Table 2, the impressions always get aggregated.
Thanks everyone!
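To make the desired behaviour concrete, here is a small Python sketch of it: the KPI is summed up to the chosen drill-down level first, and only then is the impressions figure looked up by the name at that level. It is not a Tableau solution. All rows are hypothetical except the totals 389 and 40000 quoted above, and kpi_with_impressions is an assumed helper name.

```python
from collections import defaultdict

# Hypothetical hotel rows: (country, state, region, kpi). The KPI sums to 389.
hotels = [
    ("Country 1", "State 1", "Region 1", 100),
    ("Country 1", "State 1", "Region 2", 150),
    ("Country 1", "State 2", "Region 3", 139),
]

# Hypothetical impressions per search term; these are never summed across levels.
impressions = {
    "Country 1": 40000,
    "State 1": 12000,
    "State 2": 9000,
    "Region 1": 3000,
    "Region 2": 2500,
    "Region 3": 4000,
}

LEVELS = {"country": 0, "state": 1, "region": 2}


def kpi_with_impressions(hotels, impressions, level):
    """Aggregate the KPI to `level`, then attach impressions after aggregation."""
    column = LEVELS[level]
    kpi = defaultdict(int)
    for row in hotels:
        kpi[row[column]] += row[3]
    # The lookup happens on the aggregated rows, so impressions stay unsummed.
    return {name: (total, impressions.get(name)) for name, total in kpi.items()}


by_country = kpi_with_impressions(hotels, impressions, "country")
by_state = kpi_with_impressions(hotels, impressions, "state")
```

The point of the sketch is the order of operations: aggregate first, join second, which is what "post-aggregation join" means here.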

How do I use Tableau to populate the count of each dimension over a time period?

How do I populate the number of purchases and sales per day in Tableau?
Here is my Sample Data:
In my first attempt, sales numbers are not counted to the exact date.
In my second attempt, I tried to tabulate by dropping sales date into the rows. However, it returned two figures - purchases and sales.
I have also tried a calculated field, but Tableau cannot do a "for loop" like Python.
First attempt:
After dropping Sales Date into Rows, this is what I get:
Is there any way to populate it like this? Please help, I am still new to Tableau. Special thanks to Fabio Fantoni for the first solution!
Desired Format:
I have another sample data (refer to sample data 2) which I would like to populate in the desired format (refer to desired format 2). In Sample Data 2, the purchase date "15/12/2020" is not reflected in sold dates.
My apologies, but I may require some guidance as I am still new to Tableau. Thank you in advance.
Sample Data 2:
Desired Format 2:
Based on this sample:
To avoid the double count across the two date columns, you may want to join your original data with a copy of itself on original.Purchase = support.Sold, like this:
Doing so, you just have to create two calculated fields:
count Purchase:
COUNT([Purchase Date])
count Sold:
COUNT([Purchase Date (Foglio11)])
The only thing you have to pay attention to is that in the second calculation you have to count Purchase Date, because of the "inverted" join.
You should get something like this:
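Independently of Tableau, the per-day counting the question asks for can be sketched in Python. Note that this counts each date column separately and merges the results by day, rather than reproducing the join described above. The rows are hypothetical (the sample data were posted as images); only the detail that 15/12/2020 appears as a purchase date but never as a sold date comes from the question.

```python
from collections import Counter
from datetime import datetime

# Hypothetical rows: (purchase_date, sold_date) in dd/mm/yyyy format.
rows = [
    ("13/12/2020", "14/12/2020"),
    ("13/12/2020", "16/12/2020"),
    ("14/12/2020", "14/12/2020"),
    ("15/12/2020", "16/12/2020"),
]


def daily_counts(rows):
    """Return {day: (purchases, sold)} for every day seen in either column."""
    purchases = Counter(purchase for purchase, _ in rows)
    sold = Counter(sale for _, sale in rows)
    days = sorted(
        set(purchases) | set(sold),
        key=lambda day: datetime.strptime(day, "%d/%m/%Y"),
    )
    # A Counter returns 0 for missing keys, so days with no sales still appear.
    return {day: (purchases[day], sold[day]) for day in days}


counts = daily_counts(rows)
```

Taking the union of both date sets is what keeps a day such as 15/12/2020 in the result even though nothing was sold on it.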

Identifying next closest record by date in tableau

I have a table of users and another table of transactions.
The transactions all have a date against them. What I am trying to ascertain for each user is the average time between transactions.
User | Transaction Date
-----+-----------------
A | 2001-01-01
A | 2001-01-10
A | 2001-01-12
Consider the above transactions for user A. I am basically looking for the distance from one transaction to the next chronologically to determine the distances.
There are 9 days between transactions one and two, and 2 days between transactions two and three. The average of these is 5.5, so I would want to identify the average time between user A's transactions as 5.5 days.
Any idea of how to achieve this in Tableau?
I am trying to create a calculated field for each transaction to identify the date of the "next" transaction but I am struggling.
{ FIXED [user id] : MIN(IF [Transaction Date] > **this transaction date** THEN [Transaction Date]) }
I am not sure what to replace this transaction date with or whether this is the right approach at all.
Any advice would be greatly appreciated.
LODs don't have access to previous values directly, so you need to create a self join in your data connection. Follow the steps below to achieve what you want.
Create a self join of your data with the following criteria
Create an LOD calculation as below
{FIXED [User],[Transaction Date]:
MIN(DATEDIFF('day',[Transaction Date],[Transaction Date (Data1)]))
}
Build the View
PS: If you want to improve performance, custom SQL might be the way to go.
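The self-join idea can be sketched in Python with the rows from the question. The join criteria in the answer were shown in an image, so the condition used here (same user, strictly later date) is an assumption, and days_to_next and average_gap are illustrative names.

```python
from datetime import date

# Rows from the question: (user, transaction_date).
transactions = [
    ("A", date(2001, 1, 1)),
    ("A", date(2001, 1, 10)),
    ("A", date(2001, 1, 12)),
]


def days_to_next(transactions):
    """Self-join each row to later rows of the same user; keep the smallest gap."""
    gaps = {}
    for user, current in transactions:
        later = [
            (other - current).days
            for other_user, other in transactions
            if other_user == user and other > current  # assumed join criteria
        ]
        if later:  # a user's last transaction has no successor
            gaps[(user, current)] = min(later)
    return gaps


def average_gap(transactions, user):
    """Average the per-transaction gaps for one user, or None if there are none."""
    values = [gap for (u, _), gap in days_to_next(transactions).items() if u == user]
    return sum(values) / len(values) if values else None


gaps = days_to_next(transactions)
```

Taking the minimum over all later rows mirrors the MIN(DATEDIFF(...)) inside the FIXED expression. The join compares every pair of rows per user, which is why it can get slow on large data.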
The only type of calculation that can take order sequence into account (e.g., when the value for a calculated field depends on the value of the immediately preceding row) is a table calc. You can't use an LOD calc for this kind of problem.
You'll need to understand how partitioning and addressing work with table calcs, along with specifying your sort order criteria. See the online help. You can then do something like, for example, define days_since_last_transaction as:
IF FIRST() < 0 THEN MIN([Transaction Date]) -
LOOKUP(MIN([Transaction Date]), -1) END
If you have very large data, or for other reasons want to do your calculations in the database instead of in Tableau with a table calc, you can use SQL window (aka analytic) queries instead via Tableau's custom SQL.
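The "look at the previous row" logic of the table calc can be sketched in Python as follows, using the rows from the question in shuffled order to show that sorting matters. Partitioning by user and ordering by date are the assumptions here, and days_since_last is an illustrative name.

```python
from datetime import date
from itertools import groupby

# Rows from the question, deliberately unsorted: (user, transaction_date).
transactions = [
    ("A", date(2001, 1, 10)),
    ("A", date(2001, 1, 1)),
    ("A", date(2001, 1, 12)),
]


def days_since_last(transactions):
    """Return {user: [gap_in_days, ...]} between consecutive transactions."""
    result = {}
    ordered = sorted(transactions)  # sort by user, then by date
    for user, group in groupby(ordered, key=lambda row: row[0]):
        dates = [day for _, day in group]
        # Pair each date with its predecessor, like LOOKUP(..., -1);
        # the first row of each partition has no predecessor and is skipped.
        result[user] = [(cur - prev).days for prev, cur in zip(dates, dates[1:])]
    return result


gaps = days_since_last(transactions)
average_a = sum(gaps["A"]) / len(gaps["A"])
```

The groupby call plays the role of partitioning and the sort plays the role of addressing order. A SQL window function such as LAG does the same thing on the database side.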
Please attach an example workbook and anything you tried, along with any error you got.
This might not be useful if you cannot set the User ID field as a filter.
So, you can set User ID as a filter. Following the steps mentioned here will then lead you to calculating the difference between any two dates. Ideally, if you select a single value in the filter, the calculated field from the link should give you the difference between the dates in the transaction date column.

Grouping By with missing data

Image of Data and desired result:
I'm trying to aggregate volunteer hours from a Google spreadsheet for a non-profit I volunteer for. We collect each volunteer's e-mail address and the time that volunteer has contributed. Each volunteer only enters their e-mail the first time. I've found examples online on how to send e-mails, but I'm having trouble aggregating the data. I think the trouble might be that not every row has an e-mail address associated with it.
I've been able to get the sum of hours worked by volunteer using QUERY(data, "select A, sum(C) Group By A", ) but can't figure out how to get the e-mail associated with each individual.
Thanks for the advice! The VLOOKUP and ArrayFormula functions were new to me. Here's how I solved it:
QUERY(data, "select A, B where B <>'' ", -1)
This allowed me to get the Key-Value pair (Name, Email) for each volunteer (solving the problem of people who volunteered multiple times, but only left their e-mail once). From there, I was able to generate the 'Name:Hours Worked' table off to the right with:
QUERY(data, "select A, sum(C) Group By A", )
Then, I used VLOOKUP to query my Name-Email table to get the desired result of:
Name-Email-aggregatedHours
Thanks!
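The three steps (name-to-email pairs, hours per name, lookup) can be sketched in Python like this. The rows are hypothetical because the original data were posted as an image, and hours_with_email is an illustrative name.

```python
# Hypothetical rows: (name, email, hours). The email is filled in only once per volunteer.
rows = [
    ("Ann", "ann@example.org", 2.0),
    ("Bob", "bob@example.org", 1.5),
    ("Ann", "", 3.0),
    ("Bob", "", 0.5),
    ("Ann", "", 1.0),
]


def hours_with_email(rows):
    """Return [(name, email, total_hours)], one entry per volunteer."""
    emails = {}  # name -> first non-empty email (the Name-Email table)
    hours = {}   # name -> summed hours (the "select A, sum(C) group by A" step)
    for name, email, worked in rows:
        if email:
            emails.setdefault(name, email)
        hours[name] = hours.get(name, 0.0) + worked
    # The final lookup corresponds to the VLOOKUP against the Name-Email table.
    return [(name, emails.get(name, ""), total) for name, total in hours.items()]


report = hours_with_email(rows)
```

A volunteer who never entered an e-mail address still appears in the report, with an empty e-mail field.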
You can't achieve this with QUERY. But you could apply VLOOKUP to a sorted table:
=ArrayFormula(VLOOKUP(UNIQUE(FILTER(A2:A,A2:A<>"")),SORT(A2:B,2,0),2,0))
and get the email list for unique names.
First, clean up your data. You should be certain that at least one column has no typos and that this column appropriately identifies which data corresponds to each volunteer. This is called a key value. This could also be done by, but is not limited to, filling in the missing values for each row. If this will be hard, then
Create a volunteer list without missing data.
Calculate the time contributed by each volunteer. If you were able to fill in the missing values, then you could use QUERY; in this case the QUERY formula would have to group by name and email. If not, then use SUMIF.