I'm trying to apply a rank that is based on 3 other columns.
I've tried to use the formula
{FIXED Column1,Column2 : RANK(MIN(Column3),'asc') }
but I got the error message "Level of detail expressions cannot contain table calculations or the ATTR function" in Tableau.
What I want is a rank within each combination of Column1 and Column2, ordered by the dates (Column3).
Here is an example of the data (hope it helps).
You can't call a table calc function from within a LOD calculation.
LOD calcs execute at the data source relatively early in the order of operations. Table calcs execute much later on the client side, operating on the summary query results returned by the data source.
So essentially, table calcs can see the results of LOD calcs and take them as input, but not the other way around.
Table calcs operate on multiple rows in a summary table at a time, and so can compute values that look across whole sections of that table, such as ranks, running sums, percents, etc. Table calcs are the only calculations native to Tableau that take the order of rows in a table into account. Read the help material on table calcs to learn about partitioning and addressing - essential concepts for using table calcs.
All you have to do to create a quick table calc for RANK(Column3) is: right-click on the measure and select Edit Table Calculation --> choose Specific Dimensions and tick both Column1 and Column2 --> set Restart every to Column2.
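As a minimal sketch (using the field names from the question), the calculated field itself is just a table calculation rather than an LOD expression - the partitioning comes from the Edit Table Calculation settings described above:

// Ranks rows by the earliest Column3 date, ascending, within each partition
// (partitioning/addressing is set in the Edit Table Calculation dialog, not in the formula)
RANK(MIN([Column3]), 'asc')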
Related
I'm fairly sure what I'm attempting is not the ideal way to do things, due to my lack of knowledge of Power BI, but here goes:
I have two tables in the form of:
One has the actual power against wind and the other is a reference
I created calculated columns that add a corresponding binned speed to each row (so 1-2, 2-3, 3-4 etc)
I have filters and slicers applied on the page / visual that will keep changing.
What I want is to create a pivot or a grouped table that changes dynamically based on my filters.
The reason I want this is that the table I've currently got has totals that are averaged (because each individual row is averaged), but I want a sum of averages by category. If I can have this as a calculated table instead of a visual (picture below), I would likely be able to aggregate it again to get what I want.
So in the above table I want the totals to be the sum of the individual rows. I also want to be able to use these totals to carry out other calculations (simple stuff like total divided by a fixed number, etc.).
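Something along the lines of the measure below is what I'm after (a rough sketch only - Readings, Readings[SpeedBin] and Readings[Power] stand in for my actual table and columns): average within each visible speed bin, then sum those per-bin averages, while still responding to the slicers:

Sum Of Bin Averages =
SUMX (
    VALUES ( Readings[SpeedBin] ),               -- one row per speed bin still visible after filters
    CALCULATE ( AVERAGE ( Readings[Power] ) )    -- average power within that bin
)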
I would like to randomly sample n rows from a table using Impala. I can think of two ways to do this, namely:
SELECT * FROM TABLE ORDER BY RANDOM() LIMIT <n>
or
SELECT * FROM TABLE TABLESAMPLE SYSTEM(1) LIMIT <n>
In my case I set n to 10000 and sample from a table of over 20 million rows. If I understand correctly, the first option essentially creates a random number between 0 and 1 for each row and orders by this random number.
The second option creates many different 'buckets' and then randomly samples at least 1% of the data (in practice this always seems to be much greater than the percentage provided). In both cases I then select only the first 10000 rows.
Is the first option reliable to randomly sample the 10K rows in my case?
Edit: some additional context. The structure of the data is why the random sampling or shuffling of the entire table seems quite important to me. Additional rows are added to the table daily. For example, one of the columns is country, and usually the incoming rows are first all from country A, then from country B, etc. For this reason I am worried that the second option would sample too many rows from a single country, rather than randomly. Is that a justified concern?
Related thread that reveals the second option: What is the best query to sample from Impala for a huge database?
I beg to differ, OP. I prefer the second option.
With the first option, you are assigning a value between 0 and 1 to every row and then picking the first 10000 records. So basically, Impala has to process (and sort) all rows in the table, and the operation will be slow on a 20-million-row table.
With the second option, Impala randomly picks rows from the underlying data files based on the percentage you provide. Since this works at the file level, the returned row count may differ from the percentage you specified. This is also the method Impala uses to compute statistics. So performance-wise this is much better, but the correctness of the randomness can be a problem.
Final thought -
If you are worried about the randomness and correctness of your sample, go for option 1. But if you are not too worried about randomness and just want sample data with quick performance, pick the second option. Since Impala uses it for COMPUTE STATS, I pick this one :)
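A middle ground you could also try (just a sketch combining the two queries above, not something either option requires): grab a cheap TABLESAMPLE slice first, then shuffle only that slice and keep the 10000 rows, so only the sampled slice needs to be sorted:

-- Sample ~1% of the data blocks cheaply, then randomize within that slice
-- (TABLE is the placeholder table name from the question)
SELECT *
FROM TABLE TABLESAMPLE SYSTEM(1)
ORDER BY random()
LIMIT 10000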
EDIT: After looking at your requirement, I have a method to sample over a particular field (or fields).
We will use a window function to assign a random row number within each country group, and then pick 1% (or whatever % you want) from each group.
This will make sure the data is evenly distributed between countries and each country contributes the same % of rows to the result set.
select * from
(
    select
        -- random ordering within each country gives every row an equal chance
        row_number() over (partition by country order by random()) rn,
        -- total row count per country, used to size the 1% cut below
        count(*) over (partition by country) cntpartition,
        tab.*
    from dat.mytable tab
) rs
where rs.rn between 1 and cntpartition * 1/100 -- This is for 1% data
screenshot from my data -
HTH
I am trying to show the change in moving average by county on a map.
Currently, I have the calculated field for this:
IF ISNULL(LOOKUP(SUM([Covid Count]),-14)) THEN NULL ELSE
WINDOW_AVG(SUM([Covid Count]), -7, 0)-WINDOW_AVG(SUM([Covid Count]), -14, -7)
END
This works in creating a line graph where I filter the dates to only include 15 consecutive dates. This results in one point with the correct change in average.
I would like this number to be plotted on a map, but it says there are just null values.
The formula is only one part of defining a table calculation (a class of calculations performed client-side by Tableau, operating on the aggregate query results returned from the data source).
Equally critical are the dimensions in play on the view, which determine the level of detail of the query, and the instructions you provide to tell Tableau how to slice up or lay out the query results before applying the table calc formula. This critical step is known as setting the "partitioning and addressing" for the table calc, sometimes also called setting the "compute using". Read about it in the online help for table calcs. You can experiment using the Edit Table Calc dialog by clicking on the corresponding pill.
In short, you probably have to add a dimension, such as your Date field, to some shelf - likely the Detail shelf - and then set the partitioning and addressing, probably to partition by county and address by date.
If you have more than a couple of weeks of data, then you’ll get multiple marks per county. You may need to decide how to handle that on your map.
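One common way to handle that (just a sketch, and not the only option) is to keep only the most recent mark in each county's partition by wrapping the calculation in a LAST() test:

// Returns a value only for the last date in each partition; earlier dates become null
IF LAST() = 0 THEN
    WINDOW_AVG(SUM([Covid Count]), -7, 0) - WINDOW_AVG(SUM([Covid Count]), -14, -7)
END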
I'm trying to filter on a measure that is a table calculation and the grand total doesn't change. It only changes when the filter is on a dimension.
I tried to duplicate the data source but that didn't work.
When filtering on the table calc, how do I get it to provide a new grand total?
Unfortunately, a table calculation filter won't run before the aggregations have been made. That's why you aren't seeing a different Grand Total when filtering by the table calculation.
This is explained in Tableau's Order of Operations.
If you want to see a different Grand Total, your filter will need to come before the measure aggregation in the Order of Operations.
You can think of table calculation filters as visual filters: they can change what is rendered on the screen, but they won't affect the underlying data.
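As a sketch of what "coming earlier in the Order of Operations" can look like in practice (the Customer and Sales fields below are made up for illustration): a condition you might otherwise express as a table calculation filter can sometimes be rebuilt as an LOD expression, which filters rows before aggregation and therefore changes the Grand Total:

// Filter calculation: keep only customers whose total sales exceed 1000
// (unlike a table calc filter, this acts before the view's aggregation)
{ FIXED [Customer] : SUM([Sales]) } > 1000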
Simple pivot table:
In this case, I'm pulling back 5 fields from the database:
Category
Year
Quarter
Numerator
Divisor
Two unfortunate facts. First, the years/quarters drift to provide a rolling 8-quarter view, so there will usually be 1 full year and 2 partial years with their respective quarters. Second, the measures to be displayed are ratios of numerator to divisor. Naturally, Crystal assumes that I want to divide every row and then total the results, which is not correct.
How do you get the pivot table totals to calculate correctly as SUM({Numerator})/SUM({Denominator})? Since there are multiple levels in play, Sum({Numerator}, {Attribute})/Sum({Denominator}, {Attribute}) doesn't seem to work, or I'm missing an extra element.
This crosstab is intended to replace a report that individually calculated every cell, and is not viable for long-term maintenance. If the totals can't be corrected, we'll have to revert back to that format.
Once you create a crosstab, you can insert a separate column or row inside the existing ones using the Embedded Summary option.
Right-click ---> Embedded Summary ---> Insert Embedded Summary.
This will insert a row labeled Edit This Formula.
Now, on the newly created cell:
Right-click ---> Embedded Summary ---> Edit Calculation Formula.
This will open a window where you can write your division formula.
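For the division formula itself, a minimal sketch using the field names as written in the question (depending on your crosstab's grouping you may need to pass the group field as a second argument to Sum, as in the question's own attempt):

// Ratio of sums, not a sum of per-row ratios
Sum({Numerator}) / Sum({Denominator})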