Tableau - Return Name of Column with Max Value - tableau-api

I am new to Tableau visualization and need some help.
I have a set of shipping lanes, each with whole-number values broken out by the duration of the shipment.
Ex:
| Lane Name | 0 Day | 1 Day | 2 Day | 3 Day | 4 Day |
| SFO-LAX | 0 | 30 | 60 | 10 | 0 |
| JFK-LAX | 0 | 10 | 20 | 50 | 80 |
For each Lane Name, I want to return the column header based on the max value.
i.e. for SFO-LAX I would return '2 Day', for JFK-LAX I would return '4 Day', etc.
I then want to set this as a filter to only show 2 Day and 3 Day results in my Tableau data set.
Can someone help?

Two steps to this.
The first step is pretty easy, pivot your data. Read the Tableau help to learn how to PIVOT your data for analysis - i.e. make it look to Tableau as a longer 3 column data set with Lane, Duration, Value as the 3 columns. Tableau's PIVOT feature will let you view your data in that format (which makes analysis much easier) without actually changing the format of your data files.
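To make the target shape concrete, here is a minimal Python sketch of what the pivot produces. It is not Tableau code; the function name `pivot_longer` and the output field names are only illustrative, and the rows are copied from the question.

```python
# Wide rows as in the question: one column per shipment duration.
wide_rows = [
    {"Lane Name": "SFO-LAX", "0 Day": 0, "1 Day": 30, "2 Day": 60, "3 Day": 10, "4 Day": 0},
    {"Lane Name": "JFK-LAX", "0 Day": 0, "1 Day": 10, "2 Day": 20, "3 Day": 50, "4 Day": 80},
]


def pivot_longer(rows, id_column):
    """Turn every non-id column into its own (Lane, Duration, Value) record."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue  # the id column is repeated on each output record instead
            long_rows.append({"Lane": row[id_column], "Duration": column, "Value": value})
    return long_rows


long_rows = pivot_longer(wide_rows, "Lane Name")
for record in long_rows[:3]:
    print(record)
```

Each lane now contributes five rows instead of one, which is the layout the ranking step expects.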
The second step is a bit trickier than you'd expect at first glance, and there are a few ways to accomplish it. The Tableau features that can be used for this are LOD calcs, table calcs, filters and possibly sets. These are some of the more powerful but complicated parts of Tableau, so worth your time to learn about, but expect to take a while to spin up on them.
The easiest solution is probably to use one of the RANK() functions - start with a quick table calc. Set your partitioning and addressing so that the ranks are computed over the blocks of data you want - say partitioning on Lane and addressing (computing) by Duration. Then, when you are happy with the ranks you see, move the rank calculation to the filter shelf and only display data where rank = 1.
This is a quick solution once you get the hang of it, but it can get slow for very large data sets since the rank calculations are done on the client side, which requires fetching all the data, including the rows you end up not displaying. If performance becomes an issue, you might want to look at other solutions that do more of the calculation server side - possibly using LOD calcs, or analytic (aka windowing) queries invoked from custom SQL.
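The rank-then-filter logic can be sketched outside Tableau as follows. This is plain Python under stated assumptions, not a table calc: the data is the question's example already in long form, and ties are resolved by keeping the first duration encountered, which is a simplification.

```python
from collections import defaultdict

# Long-format records (Lane, Duration, Value), i.e. the data after pivoting.
records = [
    ("SFO-LAX", "0 Day", 0), ("SFO-LAX", "1 Day", 30), ("SFO-LAX", "2 Day", 60),
    ("SFO-LAX", "3 Day", 10), ("SFO-LAX", "4 Day", 0),
    ("JFK-LAX", "0 Day", 0), ("JFK-LAX", "1 Day", 10), ("JFK-LAX", "2 Day", 20),
    ("JFK-LAX", "3 Day", 50), ("JFK-LAX", "4 Day", 80),
]


def top_duration_per_lane(rows):
    """Partition by lane, rank durations by value (descending), keep rank 1."""
    by_lane = defaultdict(list)
    for lane, duration, value in rows:
        by_lane[lane].append((duration, value))
    top = {}
    for lane, items in by_lane.items():
        ranked = sorted(items, key=lambda item: item[1], reverse=True)
        top[lane] = ranked[0][0]  # the duration whose rank is 1
    return top


top = top_duration_per_lane(records)

# Second filter from the question: keep lanes whose top duration is 2 Day or 3 Day.
kept = {lane: duration for lane, duration in top.items() if duration in {"2 Day", "3 Day"}}
print(top)
print(kept)
```

Note that the whole data set is scanned before anything is discarded, which is the same cost the answer warns about for client-side rank calculations.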

Tableau - Calculated field of different columns based on different partition of the same table

Sorry for the stupid question.
Situation: I have a partitioned table (the partition is the week of the year) with some metrics (e.g. frequency of some keywords); I need to run an analysis of metrics belonging to different partitions (e.g. the trend between the frequency of a keyword in week 32 compared to week 3). The ultimate purpose is to create a dashboard where the user can choose the week of the year and is presented with the calculated analysis on the go.
So far I have used a live query with two parameters (week_1 and week_2) that joins data from the same table based on the two different parameters. As you can imagine, the dashboard recomputes everything whenever the user changes one of the parameters. To avoid long waiting times, I have set the two parameters to a non-existent default value (0, zero), so that the dashboard opens very quickly. Then I prompt the user to stop the dashboard, insert the new parameters of choice, and restart the dashboard to load the new computations.
My question is: is it possible to achieve the same by using an extract of the table? The table itself should not be excessively big (it should be 15 million records spanning 3 years) and as far as I know the extracts are performant with those numbers.
I am quite new to Tableau, so I would like to know from more expert people if there is a more optimal way to do such a thing without using live queries.
Please, feel free to ask more information if I was not clear! However, I cannot share my workbook, as it contains sensitive information.
Edit:
+-----------+---------+-----------+
| partition | keyword | frequency |
+-----------+---------+-----------+
| 202032    | hello   | 5000      |
| 202032    | ciao    | 567       |
| ...       |         |           |
| 202031    | hello   | 2323      |
| 202031    | ciao    | 34567     |
| ...       |         |           |
| 20203     | hello   | 2         |
| 20203     | ciao    | 1000      |
+-----------+---------+-----------+
With the live query, I can join the table where partition = 202032 with the same table where partition = 20203 and build a new table with a column where I compute, e.g., a trend between the two frequencies:
+---------+---------------------+-------------+
| keyword | partitions_compared | trend       |
+---------+---------------------+-------------+
| hello   | 202032 - 20203      | +1billion % |
| ciao    | 202032 - 20203      | +1K %       |
+---------+---------------------+-------------+
With the live query I join on the keywords.
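For reference, the join described above can be sketched in plain Python. The post does not say how the trend is calculated, so percent change is an assumption here, and `compare_partitions` is a hypothetical name; the rows come from the example table.

```python
# Rows of (partition, keyword, frequency), taken from the example table.
rows = [
    (202032, "hello", 5000), (202032, "ciao", 567),
    (202031, "hello", 2323), (202031, "ciao", 34567),
    (20203, "hello", 2), (20203, "ciao", 1000),
]


def compare_partitions(rows, week_1, week_2):
    """Join the two partitions on keyword and compute a percent change."""
    first = {kw: freq for part, kw, freq in rows if part == week_1}
    second = {kw: freq for part, kw, freq in rows if part == week_2}
    result = {}
    for keyword in first.keys() & second.keys():  # inner join on keyword
        old, new = second[keyword], first[keyword]
        # Percent change is an assumed definition of "trend"; undefined when old is 0.
        result[keyword] = None if old == 0 else (new - old) / old * 100.0
    return result


trend = compare_partitions(rows, 202032, 20203)
print(trend)
```

Only the two selected partitions are read, so the same calculation could in principle run against an extract filtered by the two parameters.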
Thanks a lot in advance and have a great day!
Cheers

Overlapping classifications

I'm trying to create classifications based on date that are overlapped between each other.
Taking "Sample - EU SuperStore" as a reference, I want to do the following:
Show measures as rows, for example Sum of Profit and sum of Sales
Create two columns: 2016 and Q1 2016.
Output Example:
+-------------+---------+---------+
| Measures | 2016 | Q1 2016 |
+-------------+---------+---------+
| sum(Profit) | 49,544 | 3,811 |
| sum(Sales) | 484,247 | 74,448 |
+-------------+---------+---------+
Is there a way to achieve this without changing the underlying data model?
I've tried using parameters, but when I put two parameters together they are treated as the same column with different "hierarchies". See image below (the parameters are called 1 and 2).
If your periods of interest only overlap because one of them is a total of some of the others, you may be able to just turn on the totals and subtotals of interest from the menu.
Otherwise, you can define calculated fields that select your values of interest for some records, and null for other records. Some people call those conditional fields or conditional calculations. Since aggregation functions like SUM(), MAX() ignore null values, you can then use those fields as measures to get the effect you want.
For instance, if you create a calculated field called [Sales During Promotion] as
If [Date] >= #2/15/2020# and [Date] <= #3/15/2020# then [Sales] end
Then SUM([Sales During Promotion]) will be the sum of all sales that fell within the specified period.
The last trick to understand the calculated field above is to know that the default behavior if there is no else branch specified is to return null.
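The same idea can be sketched in Python, purely as an illustration of the logic rather than Tableau syntax. The order records are made up, and `None` stands in for Tableau's null.

```python
from datetime import date

# Hypothetical order records; the field names mirror [Date] and [Sales].
orders = [
    {"Date": date(2020, 2, 10), "Sales": 100.0},
    {"Date": date(2020, 2, 20), "Sales": 250.0},
    {"Date": date(2020, 3, 15), "Sales": 50.0},
    {"Date": date(2020, 4, 1), "Sales": 75.0},
]


def sales_during_promotion(order):
    """Row-level calculation: the sale inside the window, otherwise None (null)."""
    if date(2020, 2, 15) <= order["Date"] <= date(2020, 3, 15):
        return order["Sales"]
    return None  # no else branch in the Tableau formula, so the result is null


def sum_ignoring_nulls(values):
    """Sum the non-null values, mirroring how SUM() skips nulls."""
    return sum(value for value in values if value is not None)


promo_total = sum_ignoring_nulls(sales_during_promotion(order) for order in orders)
all_total = sum_ignoring_nulls(order["Sales"] for order in orders)
print(promo_total, all_total)
```

Because both totals are computed from the same rows, overlapping periods such as 2016 and Q1 2016 can sit side by side as separate measures.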

Is it possible to make multiple fields default to the same date, but also be individually editable?

I am VERY new to Access - I was sort of thrust into designing a database for a research project I'm involved in. So, please bear with me because I know next to nothing :) The problem I am having is thus:
My database is for a medical research project, and is very time and date dependent, by which I mean I need to capture the date and time for each piece of data so that we end up with a sort of timeline of events for each subject.
As is, I have something like the following for each piece of data (each in its own field):
ArrivalDate
ArrivalTime
HeartRateDate
HeartRateTime
HeartRateData
TemperatureDate
TemperatureTime
TemperatureData
BloodPressureDate
BloodPressureTime
BloodPressureData
There are around 200 similar pieces of data that I need to collect for each patient. To avoid having to re-enter the same data over and over, and also to reduce the potential for error, I would like to have all of the date fields in a given patient record default to the first one that is entered, in this case "Arrival Date". However, I also need each date field to be editable without affecting the others. The reason for this is that in the event that a patient's visit occurs over the span of a few days we can accurately record that.
I have tried messing around with the default value setting, as well as setting the control source to reference the "Arrival Date" field, but then of course any changes to one field affect them all. I am not even sure that what I am trying to do is possible but I will appreciate any help and/or suggestions!
Thank you in advance
Having all this data in separate columns of a big table isn't going to work. You don't measure things like temperature or blood pressure only once per patient, do you?
This is a classic one-to-many relation.
You should have a separate Measurements table, looking e.g. like this:
+--------+-----------+---------------+------------------+-----------+
| MeasID | PatientID | MeasType | MeasDateTime | MeasValue |
+--------+-----------+---------------+------------------+-----------+
| 1 | 1 | Temperature | 2017-05-17 14:30 | 38.2 |
| 2 | 1 | BloodPressure | 2017-05-17 14:30 | 130/90 |
| 3 | 1 | Temperature | 2017-05-17 18:00 | 38.5 |
| 4 | 2 | Temperature | etc. | |
+--------+-----------+---------------+------------------+-----------+
As Barmar wrote, there is no reason to have separate columns for date and time.
In the form where measurements are entered, you can use the BeforeInsert event to set MeasDateTime to the current time, with the Now() function.
So the user never has to enter it manually, but they can edit it if the measurement was at a different time than entering the data.
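The "default to now, but still editable" behaviour can be illustrated with a small Python sketch. This is not Access or VBA; the class and field names simply mirror the Measurements table above and are otherwise hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Measurement:
    patient_id: int
    meas_type: str
    meas_value: str
    # Defaults to the moment the record is created, but stays an ordinary editable field.
    meas_datetime: datetime = field(default_factory=datetime.now)


temp = Measurement(1, "Temperature", "38.2")
bp = Measurement(1, "BloodPressure", "130/90")

# Correct one record without touching the other.
bp.meas_datetime = datetime(2017, 5, 17, 14, 30)
print(temp.meas_datetime, bp.meas_datetime)
```

Each row carries its own timestamp, so editing one measurement never changes another, which is the behaviour the question asks for.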

Time-series Stock Data in Matlab

I'm a MATLAB beginner, and have no idea what I'm doing.
I have stock data in CSV format which is something like this:
+--------+--------+------+------+-----+-------+
| Ticker | Date | Open | High | Low | Close |
+--------+--------+------+------+-----+-------+
| AAPL | 25-Oct | 10 | 12 | 9 | 12 |
| XYZ | 25-Oct | 10 | 12 | 9 | 12 |
| AAPL | 26-Oct | 12 | 15 | 10 | 15 |
+--------+--------+------+------+-----+-------+
There are many stock tickers each day. The file is many rows long, listing daily stock prices for each ticker on a particular stock exchange.
I'm aiming to do some fun time-series analysis on the 'close' price for each ticker.
To start with making simple charts of a single ticker over time, or multiple tickers over time would be awesome.
Questions:
1. Best way to import data.
I have a big long CSV. But am lost as to which import method is best. Column Vectors, Numeric Matrix, Cell Array or Table?
2. I need to create a time-series object for each ticker, right?
How would one go about that? I've been looking at this guide, but I'm unsure how to make an object for each ticker, over the span of time defined in the file.
http://www.mathworks.com/help/matlab/ref/timeseries-class.html
Any advice, pointers and resources that are good for beginners are appreciated massively!
Thanks!
There are a ton of ways to import data into MATLAB. Before you import data, I would make sure numeric columns hold ONLY numeric data or MATLAB can complain. Some options in my personal order of preference:
1. d = readtable('mycsvfile.csv'); % puts data in nice table datatype. I find it makes code more readable.
2. d = csvread('myfile.csv',1,0); % the 1 skips the first row which is probably header names for the csv file. Puts all the data in a matrix and you have to keep track of what column is what.
3. xlsread is good for reading Excel files.
4. Copy and paste the data into a variable in your workspace. Do save blahblah.mat so you can easily load the data later.
I personally wouldn't bother with financial time series objects. It's just going to complicate your life if you're new to MATLAB. If you loaded the data using readtable (i.e. option 1) you can then execute something like:
aapl_indicator = strcmp(d.Ticker, 'AAPL');
to get a vector indicating whether a row in your table is AAPL or not. Then:
close_price_aapl = d.Close(aapl_indicator);
will give you a vector of Apple's closing prices.
When you get down to doing math, you want to be using the matrices.
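For readers who want to check the indicator-vector logic outside MATLAB, here is a rough Python equivalent. The inline CSV is modelled on the question's table, with the ticker spelled AAPL as in the snippets above; it stands in for the real file.

```python
import csv
import io

# Inline sample standing in for the CSV file.
csv_text = """Ticker,Date,Open,High,Low,Close
AAPL,25-Oct,10,12,9,12
XYZ,25-Oct,10,12,9,12
AAPL,26-Oct,12,15,10,15
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# One boolean per row, like strcmp(d.Ticker, 'AAPL') in the MATLAB snippet.
aapl_indicator = [row["Ticker"] == "AAPL" for row in rows]

# Keep the closing prices where the indicator is true, converted to numbers.
close_price_aapl = [
    float(row["Close"]) for row, keep in zip(rows, aapl_indicator) if keep
]
print(close_price_aapl)
```

The resulting list is in file order, so if the file is sorted by date it can be plotted directly as a price series.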

How To Aggregate Up From A Text Table With Calculated Fields?

With Tableau I am able to act on some data tables to get to a text table, which I would then like to treat as a brand-new table for further aggregation. You will see from my example what I actually want to do; acting on a text table as if it were a new table seems to be one solution, if that is possible. I am open to other solutions to the same problem if you have any.
Say I have two tables.
Table A
Date | Purchases
'2014-05-02' | 5
'2014-05-03' | 6
Table B
Date Bucket | Bake Rate
0-1 Month | .20
2-3 Month | .50
First I created a calculated field for Table A, called Date Bucket, that puts each line item's date into the corresponding date bucket by working out how much time has passed since a certain date. Then I made a relationship between Date Bucket in Table B and the newly formed dimension in Table A, also called Date Bucket. From here I could essentially join on date bucket and get a Bake Rate from Table B for each line item.
Then I divide each purchase by the corresponding bake rate, as determined by its Age Bucket.
So I ended up with a text table like the following.
Date | Age Bucket | Purchases | Baked Purchases
'2014-05-02' | 0-1 Month | 5 | 25
'2014-05-03' | 0-1 Month | 6 | 30
Ideally, from here I'd like to be able to get the sum of the baked purchases and aggregate by whatever other dimensions I have. For example here, get the sum of baked purchases by month.
Any Ideas?!
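To pin down the arithmetic being asked for, here is a small Python sketch of the intended result. It is not a Tableau solution. The bucketing rule in `age_bucket` and the reference date are hypothetical, since the post does not state them; the purchases and bake rates come from Tables A and B.

```python
from collections import defaultdict
from datetime import date

purchases = [(date(2014, 5, 2), 5), (date(2014, 5, 3), 6)]  # Table A
bake_rates = {"0-1 Month": 0.20, "2-3 Month": 0.50}  # Table B


def age_bucket(purchase_date, as_of):
    """Hypothetical bucketing rule: whole calendar months elapsed up to as_of."""
    months = (as_of.year - purchase_date.year) * 12 + (as_of.month - purchase_date.month)
    return "0-1 Month" if months <= 1 else "2-3 Month"


def baked_purchases_by_month(purchases, bake_rates, as_of):
    """Divide each purchase by its bucket's bake rate, then sum per (year, month)."""
    totals = defaultdict(float)
    for purchase_date, quantity in purchases:
        rate = bake_rates[age_bucket(purchase_date, as_of)]
        totals[(purchase_date.year, purchase_date.month)] += quantity / rate
    return dict(totals)


monthly = baked_purchases_by_month(purchases, bake_rates, as_of=date(2014, 5, 31))
print(monthly)
```

The key point is that the division happens per row before the sum, so the monthly total is the sum of the row-level baked purchases (25 + 30 in the example), not the monthly purchases divided by a single rate.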