Creating a calculated field with ratio of number of complaints to population in Tableau?

Count of complaints per State by Year
I have the count of complaints per state by year in the first picture. I have another dataset which has the population of each state by year. How do I account for population and present only the ratios of count of complaints to population?
Snapshot of first dataset
Snapshot of second dataset
My thought was to create a calculated field to create a ratio but I'm having trouble with adding up the number of complaint counts within a certain year and then dividing by population year. How do I write the formula that only counts complaints within 2011, 2012, etc and dividing it by that population year?
Let me know if there's an easier way to do it as well, thanks for your time.
Edit 1:
Second dataset Pivoted
Population & Complaint Count
I've pivoted my second dataset and now I'm trying to graph both the population counts and the complaint counts. The population count increases across the years, but the count of complaints stays exactly the same: it's the sum of all complaints for that particular state across all the years.
Also, when I graph population count with 'Date Received' from the first dataset, I get the total population count across all the years instead of that particular year, like so:
Population per year
How do I properly 'blend' in the two date variables so that it works with both population count and complaint counts in both datasets?
Edit 2:
Blended Year
I changed the [Years] datatype in data source 2 into a date to match the date type of [Date Received] in data source 1. I also took only the 'year' parts because it would only count things on 1/1 of each year if I used [Years] in data source 2.
Now the graphs look similar, except that when I use [Years] instead of [Date Received], all the values are off by several thousand. I tried adding another relationship, this time on month, and then it only counted values for that month.
How do I account for the discrepancy and make [Years] work just like [Date Received] ?

Reshape data source 2, following these instructions.
Then you'll be able to blend the 2 data sources on State and Month and Year.
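To make the reshape-then-join idea concrete outside Tableau, here is a minimal Python sketch. The sample figures and the names `complaints`, `population_wide`, and `ratios` are made up for illustration, and it joins on state and year only, assuming the population data is yearly as in the snapshots.

```python
from collections import Counter

# Hypothetical data source 1: one row per complaint (state, date received).
complaints = [
    ("CA", "2011-03-14"),
    ("CA", "2011-11-02"),
    ("CA", "2012-06-30"),
    ("TX", "2011-01-09"),
    ("TX", "2012-02-17"),
    ("TX", "2012-08-21"),
]

# Hypothetical data source 2 in its original wide layout: one column per year.
population_wide = {
    "CA": {"2011": 1000, "2012": 1100},
    "TX": {"2011": 500, "2012": 600},
}

# Reshape (pivot) wide -> long so each row is keyed by (state, year).
population_long = {
    (state, int(year)): pop
    for state, by_year in population_wide.items()
    for year, pop in by_year.items()
}

# Count complaints per (state, year), taking only the year part of the date.
complaint_counts = Counter((state, int(date[:4])) for state, date in complaints)

# Join on the shared key and divide: complaints per person, per state and year.
ratios = {
    key: complaint_counts[key] / population_long[key]
    for key in population_long
}

for key in sorted(ratios):
    print(key, ratios[key])
```

The point of the sketch is that both sides must be at the same grain (state and year) before dividing; if the complaint count is not split by year, every year gets the all-years total, which is the symptom described in Edit 1.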

Related

Why are my values multiplying when I apply Month/Year to my values?

When I apply Month/Year to Cases or Deaths from my data, the values explode. For Cases it goes from approximately 48 million to over 1 billion, and for Deaths it goes from about 700 thousand to over 22 million. However, when I try the same thing with Initial Claims or the Stringency Index, my values remain correct. I'm trying to find the month over month percentage change by the way. And I'm using the Date column. I only select 2020 and 2021 in the filter for Year.
What I'm asking about is Sheet 21.
Link to workbook: https://public.tableau.com/app/profile/nilajah.rivers/viz/CoronaVirusProject_16323687296770/Sheet21
Your problem is that the data points are daily cumulative deaths. If you change the date aggregation to anything other than days, Tableau will default to summing the numbers for all the days in the month. This will give the wrong result, obviously.
If you want to show the correct total deaths or cases regardless of the time aggregation (months, days, weeks etc.) then you could use the New Case or New Death numbers plus a running sum table calculation. This will always give the correct total for the time period.
Table calculations will also allow automatic calculation of the period to period % change from the same data fields.
This is a common problem when working with datasets that offer pre-calculated aggregations. Tableau doesn't need them, as it can dynamically calculate the aggregation of a field over any given time period, but it is easy to forget which field has pre-aggregated data and which has raw data. Pre-aggregated fields assume a particular time period and can't be used for different time periods without disentangling that assumption, which is unnecessary if you also have the raw data (in this case, daily new deaths/cases).
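A small Python sketch of the arithmetic, using made-up daily numbers (the dictionary `daily_new` and its values are hypothetical), shows why summing a cumulative field inflates the totals while summing the new-per-day field does not:

```python
from itertools import accumulate

# Hypothetical daily NEW deaths for two short "months".
daily_new = {
    "2020-03": [2, 0, 3, 1, 4],
    "2020-04": [5, 1, 3, 2, 4],
}

# The dataset's cumulative column is a running total of the new values.
march_cumulative = list(accumulate(daily_new["2020-03"]))

# Summing the cumulative column over the month counts early days many times.
sum_of_cumulative = sum(march_cumulative)

# Summing the raw new values gives the correct total for each month.
monthly_new = {month: sum(values) for month, values in daily_new.items()}

# A running sum over the monthly totals recovers the cumulative figure.
running_total = dict(zip(monthly_new, accumulate(monthly_new.values())))

# Month-over-month percentage change, computed from the same raw field.
pct_change = (
    (monthly_new["2020-04"] - monthly_new["2020-03"])
    / monthly_new["2020-03"] * 100
)

print(march_cumulative, sum_of_cumulative)
print(monthly_new, running_total, pct_change)
```

In Tableau terms, `monthly_new` corresponds to SUM of the new-cases field at month level, and `running_total` to a running sum table calculation on top of it.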

Tableau Summing up aggregated data with FIXED

Data granularity is per customer, per invoice date, per product type.
Generally the idea is simple:
We have a moving average calculation of the volume per week. MA based on last 12 weeks (MA Volume):
window_sum(sum([Volume]),-11,0)/window_count(count([Volume]), -11,0)
We need to see the deviation of the current week vs the MA for that week (Vol DIFF):
SUM([Volume])-[MA Volume]
We need to sum up the deviations for a fixed period of time (Year/Month)
Basically this should show us whether on average, for a given period of time, we deviate positively or negatively vs the base.
Unfortunately I get errors like:
"Argument to SUM (an aggregate function) is already an aggregation, and cannot be further aggregated."
Or
"Level of detail expressions cannot contain table calculations or the ATTR function"
Any ideas how I can go around this one?
Managed to solve this one. I needed to add months to the view and then just use WINDOW_SUM([Vol DIFF]).
Simple as that!
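For anyone who wants to check the numbers outside Tableau, here is a rough Python sketch of the same three steps: trailing moving average, deviation from it, and the sum of deviations per month. The weekly figures are invented, and the window is shortened to 3 weeks (the question uses 12) purely to keep the example small.

```python
from collections import defaultdict


def trailing_mean(values, window):
    """Mean of the current value and up to window-1 preceding values.

    Like WINDOW_SUM(...)/WINDOW_COUNT(...) with offsets -(window-1)..0,
    the window simply shrinks near the start of the series.
    """
    result = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        result.append(sum(chunk) / len(chunk))
    return result


# Hypothetical weekly volumes, each tagged with the month it falls in.
weekly = [
    ("2023-01", 10),
    ("2023-01", 20),
    ("2023-01", 30),
    ("2023-02", 40),
    ("2023-02", 20),
]

volumes = [volume for _, volume in weekly]
moving_avg = trailing_mean(volumes, window=3)  # assumption: 3, not 12

# Deviation of each week from its own moving average.
deviation = [v - ma for v, ma in zip(volumes, moving_avg)]

# Sum the weekly deviations within each month.
deviation_by_month = defaultdict(float)
for (month, _), dev in zip(weekly, deviation):
    deviation_by_month[month] += dev

print(moving_avg)
print(dict(deviation_by_month))
```

The last step is a plain sum over already-computed weekly values, which is why a second table calculation works in Tableau where SUM() or a FIXED expression around a table calculation does not.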

Matlab average number of customers during a single day

I'm having problems creating a graph of the average number of people inside a 24h shopping complex. I have two columns of data on a spreadsheet of the times a customer comes in (intime) and when he leaves (outtime). The data spans a couple of years and is in datetime format (dd-mm-yyyy hh:mm:ss).
I want to make a graph of the data with time of day as x-axis, and average number of people as y-axis. So the graph would display the average number of people inside during the day.
Problems arise because the place is open 24h and the timespan of data is years. Also customer intime & outtime might be on different days.
Example:
intime 2.1.2017 21:50
outtime 3.1.2017 8:31
Any idea how to display the data easily using Matlab?
Been on this for multiple hours without any progress...
Seems like you need to decide what defines a customer being in the shop during the day. Is 1 minute enough? Is there a minimum visit length below which you don't want to count it as a visit?
In the former case you shouldn't be concerned with the hours at all, and just count it as 1 entry if the entry and exit are in the same day or as 2 different entries if not.
It's been a couple of years since I coded actively in matlab and I don't have a handy IDE but if you add the code you got so far, I can fix it for you.
I think you need to start by just plotting the raw count of people in the complex at the given times. Once that is visualized it may help you determine how you want to define "average people per day" and how to go about calculating it. Does that mean average at a given time or total "ins" per day? Ex. 100 people enter the complex in a day ... but on average there are only 5 in the complex at a given time. Which stat is more important? Maybe you want both.
Here is an example of how to get the raw plot of # of people at any given time. I simulated your in & out time with random numbers.
inTime = cumsum(rand(100,1)); %They show up randomly
outTime = inTime + rand(100,1) + 0.25; % Stay for 0.25 to 1.25 hrs
inCount = ones(size(inTime)); %Add one for each entry
outCount = ones(size(outTime))*-1; %Subtract one for each exit.
allTime = [inTime; outTime]; %Stick them together.
allCount = [inCount; outCount];
[allTime, idx] = sort(allTime); % Sort the timestamps
allCount = allCount(idx); % Reorder the +1/-1 events to match the sorted timestamps
allCount = cumsum(allCount); % Running total = number of people inside at each event
plot(allTime, allCount); % Plot occupancy over time
Note that the x-values are not uniformly spaced.
If you decide you are more interested in total customers per day, then you could just find the inTime values within a given time range (each day) and probably ignore the outTime values altogether.
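If the goal is instead the average number of people inside at each time of day, one possible approach is to sample the occupancy on the hour for every day in the data and average the samples per hour. The sketch below is in Python rather than MATLAB, uses invented visits (only the first comes from the question), and assumes that hourly sampling is fine-grained enough; none of this is taken from the asker's actual data.

```python
from datetime import datetime, time, timedelta

FMT = "%d.%m.%Y %H:%M"

# Hypothetical visits as (intime, outtime); the first one spans midnight.
visits = [
    ("2.1.2017 21:50", "3.1.2017 8:31"),
    ("2.1.2017 22:10", "2.1.2017 23:30"),
    ("3.1.2017 7:15", "3.1.2017 9:05"),
]
parsed = [
    (datetime.strptime(t_in, FMT), datetime.strptime(t_out, FMT))
    for t_in, t_out in visits
]

first_day = min(t_in for t_in, _ in parsed).date()
last_day = max(t_out for _, t_out in parsed).date()
n_days = (last_day - first_day).days + 1

# For every day and every full hour, count the visits in progress.
totals = [0] * 24
for offset in range(n_days):
    day = first_day + timedelta(days=offset)
    for hour in range(24):
        instant = datetime.combine(day, time(hour))
        totals[hour] += sum(1 for t_in, t_out in parsed if t_in <= instant < t_out)

# Average occupancy per hour of day across all days in the data.
average_by_hour = [total / n_days for total in totals]

for hour, avg in enumerate(average_by_hour):
    print(f"{hour:02d}:00  {avg:.2f}")
```

Because each sample is compared against full timestamps, visits that cross midnight need no special handling. For several years of data the nested loop would be slow; the sorted +1/-1 event approach shown above scales better, and this version is only meant to pin down the definition.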

Trying to Average number of accounts by hour, day of week, and month

I'm in healthcare and we're trying to assess the number of discharges we have per hour of day, but we'd also like to be able to filter them down by day of week, or specific month, or even a particular day of week in a particular month (e.g. " what is the average number of discharges per hour on Mondays in January?")
I'm confident that Tableau can do this, but haven't been able to make the averages show up in my line graph... every time that I convert it from COUNT to AVG, the line simply goes straight. I got close when I did a table calculation to find the Average (dividing the count per hour by the number of days captured in the report), but when I add a filter for either the month or day of week, selecting one of the options of the filter reduces the total number that is being counted, rather than re-averaging the non-filtered items. (i.e. if the average of the 7 days of the week is "10" for a particular hour, and I deselect the first three days of the week, it's now saying that my average for that hour is roughly 6, despite the fact that all of the days are very close to 10 at that hour.)
Currently, my data table has the following columns:
Account#/MonthYear/HourOfDay/DayOfWeek
e.g. 12345678 / Jan-17 / 12 / Sunday
I would just create a few calculated fields to differentiate the parts of the calendar you might want to filter or aggregate on. Mixing the month and day of the week with filtering is pretty straightforward with the calculated fields. Then do standard summing to get what you are looking for, because an average count of records is always one unless you are throwing some other calculation into the mix. I threw a quick example up on Tableau Public for you to get the idea.
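The re-averaging problem from the question comes down to the denominator: the count per hour has to be divided by the number of days that survive the filter, not by the number of days in the whole report. A hedged Python sketch of that logic follows; it assumes each discharge has a full date, which the table described in the question does not show, and the rows and the helper `average_per_hour` are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical discharges as (discharge date, hour of day).
discharges = [
    (date(2017, 1, 2), 12),  # Monday
    (date(2017, 1, 2), 12),
    (date(2017, 1, 2), 13),
    (date(2017, 1, 9), 12),  # Monday
    (date(2017, 1, 3), 12),  # Tuesday
    (date(2017, 1, 3), 12),
    (date(2017, 1, 3), 12),
]


def average_per_hour(rows, weekdays=None):
    """Average discharges per hour over the days left after filtering.

    weekdays: optional set of weekday numbers (Monday == 0) to keep.
    Days with no discharges at all are not counted, a known limitation.
    """
    if weekdays is not None:
        rows = [(d, h) for d, h in rows if d.weekday() in weekdays]
    days = {d for d, _ in rows}          # denominator follows the filter
    counts = Counter(h for _, h in rows)  # numerator: discharges per hour
    return {hour: count / len(days) for hour, count in counts.items()}


print(average_per_hour(discharges))                # all days
print(average_per_hour(discharges, weekdays={0}))  # Mondays only
```

In Tableau the equivalent denominator would be a distinct count of dates computed after the filter is applied, so the data would need a date field rather than only MonthYear and DayOfWeek.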

Aggregating data from the US stock market in Tableau, using different time frames

I am a very basic user of tableau and I have not found an answer to my question.
I have a txt file that has historical daily data for 98% of all the stocks in the US, with their daily capitalization. Each stock has its TICKER, daily Market Value for every trading day of the year, and its SECTOR.
I did a simple time series that displays SUM([Mktval]) (the sum of all individual market values) across all stocks on a daily basis, where I can see that the total value as of 2016 is about 24 trillion USD, as in the image below.
When I change the view column from DAY to YEAR, I don't see the right values, but something a lot larger. So I realized that I need to do SUM([Mktval])/252 to get the right value for a year (there are 252 trading days in a year).
If I change the view to MONTH, as in the chart below, the numbers are again wrong because 252 is not the right value to use in the division.
Is there any way that Tableau can adjust the values automatically to reflect the AVG MktVal across different time intervals?
Thanks
Replace SUM(Mktval) on the Rows shelf with the following calculated field:
avg({ fixed datetrunc('day', [Date1]) : sum([Mktval]) })
That solution is all in one step. It is perhaps a bit clearer to use 2 steps. First, create a calculated field called total_daily_market_value defined as
{ fixed datetrunc('day', [Date1]) : sum([Mktval]) }
Then make sure that calculated field is a measure. It is an LOD calculation that you can think of as a separate table with one value for each day showing the total market value for that day. Note that DATETRUNC('day', ...) keeps each calendar date distinct, whereas the DAY() function returns only the day of the month (1 to 31) and would merge dates from different months.
Drag that measure to a shelf, and then change the aggregation function to AVG(), MEDIAN(), MIN(), MAX() or STDEV() as desired. Tableau will aggregate the total_daily_market_value using your chosen aggregation function for whatever values of Date1 are in your view.
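As a sanity check of what the two-level aggregation does, here is a minimal Python sketch with invented tickers and values: first sum across stocks within each day, then average those daily totals within each month. The naive monthly sum is included only to show where the inflated numbers in the question come from.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (date, ticker, market value).
rows = [
    ("2016-01-04", "AAA", 100.0),
    ("2016-01-04", "BBB", 50.0),
    ("2016-01-05", "AAA", 110.0),
    ("2016-01-05", "BBB", 60.0),
    ("2016-02-01", "AAA", 120.0),
    ("2016-02-01", "BBB", 40.0),
]

# Step 1 (the LOD part): total market value per calendar day.
daily_total = defaultdict(float)
for day, _ticker, mktval in rows:
    daily_total[day] += mktval

# Step 2 (the outer AVG): average the daily totals within each month.
totals_by_month = defaultdict(list)
for day, total in daily_total.items():
    totals_by_month[day[:7]].append(total)
monthly_avg = {month: mean(totals) for month, totals in totals_by_month.items()}

# For contrast: summing every row in the month scales with the number of days.
naive_monthly_sum = defaultdict(float)
for day, _ticker, mktval in rows:
    naive_monthly_sum[day[:7]] += mktval

print(dict(daily_total))
print(monthly_avg)
print(dict(naive_monthly_sum))
```

Swapping `mean` for `max`, `min`, or `statistics.median` mirrors changing the aggregation on the shelf, with no need to hard-code the number of trading days.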