Grouping factors ending in state name - group-by

I am using RStudio. I have a variable column with the names of different counties in different U.S. states, over 3000 data points. I'd like to group these counties by state alone, so that the number of factor levels in the variable shrinks to the number of states. I suspect I need to use dplyr's group_by function, but I'm not sure of the syntax.

Related

Form data evaluation

I have form data, and in Tableau I am using the calculation below to find out which choices were selected for a multiple-value question. The question is: "Could you please indicate one or two areas where we fell short?"
The value can be explosives, vehicles, cement, etc.
I have accounted for each choice individually with a calculated field like the one below:
int(contains(lower([Could you please indicate one / two of the following areas where we fell short of meeting your expectations?]),'factory'))
I have done the same for the other values.
The calculated fields are hasvehicle, hasfactory, etc.
But the problem is: how can I visualize these in the form of bars?
All I am able to do is this:
How can I visualise them as bars side by side?
When you have one measure on Rows, drag the second measure and drop it on the existing axis.
This puts Measure Names on the Columns shelf and Measure Values on the Rows shelf.
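As a rough sketch of what indicator fields such as hasvehicle and hasfactory compute, and why summing them gives one bar per choice, here is the same logic in Python. The sample responses and the `choice_counts` helper are made up for illustration; they are not part of Tableau.

```python
def choice_counts(responses, choices):
    """Count how many responses mention each choice (case-insensitive)."""
    counts = {}
    for choice in choices:
        # Same idea as int(contains(lower(response), choice)), summed over rows
        counts[choice] = sum(int(choice in r.lower()) for r in responses)
    return counts

# Hypothetical form responses; one response may mention several choices.
responses = ["Factory, Vehicles", "vehicles", "Cement", "vehicles; cement"]
counts = choice_counts(responses, ["factory", "vehicle", "cement"])

# One "bar" per choice, side by side, like Measure Names / Measure Values
for name, n in counts.items():
    print(f"{name:8} {'#' * n} {n}")
```

Each indicator is its own measure, so a response that mentions two choices contributes to two bars.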

Displaying change in moving average on map

I am trying to show the change in moving average by county on a map.
Currently, I have the calculated field for this:
IF ISNULL(LOOKUP(SUM([Covid Count]),-14)) THEN NULL ELSE
WINDOW_AVG(SUM([Covid Count]), -7, 0)-WINDOW_AVG(SUM([Covid Count]), -14, -7)
END
This works for creating a line graph where I filter the dates to include only 15 consecutive dates. This results in one point with the correct change in average.
I would like this number to be plotted on a map, but Tableau says there are only null values.
The formula is only one part of defining a table calculation (a class of calculations performed client-side by Tableau on the aggregate query results returned from the data source).
Equally critical are the dimensions in play on the view to determine the level of detail of the query, and the instructions you provide to tell Tableau how to slice up or layout the query results before applying the table calc formula. This critical step is known as setting the “partitioning and addressing” for the table calc, sometimes also as setting the “compute using”. Read about it in the online help for table calcs. You can experiment with using the Edit Table Calc dialog by clicking on the corresponding pill.
In short, you probably have to add a dimension, such as your Date field, to some shelf (likely the Detail shelf), and then set the partitioning and addressing, probably to partition by county and address by date.
If you have more than a couple of weeks of data, then you’ll get multiple marks per county. You may need to decide how to handle that on your map.
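To make the partitioning and addressing concrete, here is a minimal Python sketch of the calculation from the question, assuming the data is partitioned by county and ordered (addressed) by date. The function name, the tuple layout and the integer day numbers are illustrative assumptions, and the sketch evaluates the formula only at the latest date of each county, which is one possible way to end up with a single value per county.

```python
from collections import defaultdict

def change_in_moving_avg(rows):
    """rows: (county, day, count) tuples. Returns {county: change or None}."""
    by_county = defaultdict(list)             # partition by county
    for county, day, count in rows:
        by_county[county].append((day, count))

    result = {}
    for county, series in by_county.items():
        values = [c for _, c in sorted(series)]   # address (order) by date
        if len(values) < 15:
            # Mirrors ISNULL(LOOKUP(..., -14)): there is no row 14 steps back
            result[county] = None
            continue
        i = len(values) - 1                   # evaluate at the latest date
        recent = values[i - 7 : i + 1]        # offsets -7..0
        earlier = values[i - 14 : i - 6]      # offsets -14..-7
        result[county] = sum(recent) / len(recent) - sum(earlier) / len(earlier)
    return result

# Hypothetical data: county A has 15 days of counts, county B only 3.
rows = [("A", d, d - 1) for d in range(1, 16)] + [("B", d, 5) for d in range(1, 4)]
result = change_in_moving_avg(rows)
print(result)
```

Without the date in the partition's ordering there is no "row 14 steps back" to look up, which is consistent with the map showing only nulls.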

How to pass a vector from tableau to R

I need to pass a vector of arguments to Rserve from Tableau. Specifically, I am running IRR calculations in R (on Rserve), and I want to pass a vector of cash flows that are stored as columns in my table (instead of rows/measures). So I want to collect all those cash flows in a vector and pass it to Rserve. Passing them one at a time slows down IO.
SCRIPT_REAL("r_func(c(.arg1, .arg2, .arg3))",sum(cf1), sum(cf2), sum(cf3))
cf1..cfn are cash flows corresponding to various periods. The code above works well when there are only a few cash flows but takes a long time when I have a few hundred. Further, the time is spent not in calculation but in IO when communicating with the remote Rserve. With a local Rserve, this calculation finishes in a few seconds, while on a remote one it takes well over a minute.
I also want to point out that Tableau / Rserve sets one argument after another, and that takes time. My expectation is that once I have a vector, there would be just one transfer and one setting of arguments, and therefore this should speed things up.
The first step in understanding how Tableau interacts with R or Python is understanding how Tableau's table calcs work.
Tableau Script_XXX() functions are table calculations which means that you invoke them on a vector of aggregate query results and the corresponding R or Python code needs to return a vector usually of the same size. (I think you may be able to return a scalar or smaller vector which gets replicated to appear like a vector of the same size as the argument -- but not certain)
You can control how your data is partitioned into vectors, and also the ordering of data in the vectors, by editing the table calc to specify the partitioning and addressing for that calc.
Partitioning determines how your aggregate query results are broken up into vectors for calculation purposes. Addressing determines how the elements of each vector are ordered. You can either do that based on the physical layout of the table structure, or (better) based on the specific dimensions.
See the Tableau online help for table calcs for more info, and look for online training videos from Tableau or blog entries (especially from anyone named Bora).
One way to test your understanding of these concepts is to create a Tableau table (i.e., a viz with a mark type of text) with several dimensions on the row and column shelves. Then create calculated fields for INDEX() and SIZE() and display them on text. Finally, change the partitioning and addressing in different ways by editing those table calcs. Try several different permutations. When you can confidently predict what those functions will produce for different settings, then you're ready to do more complex tasks, such as talking to R.
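The INDEX() and SIZE() exercise can be imitated outside Tableau. The following Python sketch is only an analogy: `index_and_size`, the dictionary rows and the region/year dimensions are hypothetical. It splits rows into vectors by the partitioning dimensions, orders each vector by the addressing dimensions, and then reports each row's position and its vector's length.

```python
from collections import defaultdict

def index_and_size(rows, partition_by, address_by):
    """Return each row with INDEX()- and SIZE()-like values added."""
    partitions = defaultdict(list)
    for row in rows:
        key = tuple(row[d] for d in partition_by)   # which vector the row joins
        partitions[key].append(row)

    out = []
    for members in partitions.values():
        # Addressing fixes the order of elements inside each vector
        members.sort(key=lambda r: tuple(r[d] for d in address_by))
        for position, row in enumerate(members, start=1):
            out.append({**row, "index": position, "size": len(members)})
    return out

# Hypothetical aggregate query results with two dimensions
rows = [
    {"region": "East", "year": 2021},
    {"region": "West", "year": 2020},
    {"region": "East", "year": 2020},
    {"region": "West", "year": 2021},
    {"region": "West", "year": 2022},
]
by_region = index_and_size(rows, partition_by=["region"], address_by=["year"])
by_year = index_and_size(rows, partition_by=["year"], address_by=["region"])
for r in by_region:
    print(r)
```

Swapping the two dimension lists changes both values for the same row, which is the effect the exercise is meant to make predictable.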
It is also instructive to experiment with FIRST(), LAST(), LOOKUP(), WINDOW_SUM() etc -- and finally dig into PREVIOUS_VALUE(). Warning, PREVIOUS_VALUE() is a bit odd, and does not behave the way you probably assume it does. Still, it is a useful technique that can implement a recursive calculation, and is about as close to a for loop as Tableau gets.
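As an illustration of the recursive flavor of PREVIOUS_VALUE(), here is a small Python sketch of a running product, roughly what a table calc of the form SUM([x]) * PREVIOUS_VALUE(1) computes along one ordered partition. It assumes, as I understand the function, that the argument is used only as the seed for the first row and that later rows see the calculation's own prior result. The `running_product` name and the growth factors are invented for the example.

```python
def running_product(values, seed=1):
    """Mimic value * PREVIOUS_VALUE(seed) along one ordered partition."""
    results = []
    previous = seed            # used only for the first row of the partition
    for value in values:
        current = value * previous
        results.append(current)
        previous = current     # later rows see this calc's own prior result
    return results

growth = [1.10, 0.95, 1.20]    # hypothetical per-period growth factors
products = running_product(growth)
print(products)
```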

Show number of elements in multiple sets in a chart

I created about 10 sets using my Tableau data. I want to show the number of elements in all sets in a chart, for example a bubble chart or a bar chart. When I move a single set to the sheet, select the number of records, and filter on the "in" elements, I can see the number of elements in that set. However, I want to see the number of records in multiple sets simultaneously.
When I try to put multiple sets on, for example, a bubble chart, Tableau creates one single bubble instead of multiple bubbles.
Sets are very useful, but may not be the best approach when you have a very large number of similar groupings to compare side by side when you are using them as dimensions.
Remember the purpose of dimensions is to partition your data into non overlapping blocks prior to aggregating measures. Since a data row may belong to multiple sets, using sets as dimensions doesn't fit the particular application you describe. (but using sets as filters or building blocks for calculations might)
So here is one approach that will give you some flexibility. Define a calculated field for each set to return 1 if the record is in set 1, null otherwise (One way to think of sets is as a boolean function)
Number of Set 1 Records
if [Set_1] then 1 end
Then you can use SUM([Number of Set 1 Records]) as a measure as desired. You can use Measure Values to display multiple measures together.
This way your set definitions are used for calculating your measures, but not for partitioning the data rows.
If your sets are completely defined by a condition, and this is the only way you use them, you could simplify by using the condition directly in the calculated fields above and not creating the corresponding sets.
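The idea of treating each set as a boolean function and summing an indicator can be sketched in Python as follows. The records, the set names and their conditions are invented for illustration. Note that a record may count toward several sets, which is why sets do not behave like a partitioning dimension.

```python
# Hypothetical data rows
records = [
    {"id": 1, "sales": 120, "region": "East"},
    {"id": 2, "sales": 80, "region": "East"},
    {"id": 3, "sales": 150, "region": "West"},
    {"id": 4, "sales": 40, "region": "West"},
    {"id": 5, "sales": 200, "region": "East"},
]

# Each set is a boolean function of a record (hypothetical definitions)
sets = {
    "Big orders": lambda r: r["sales"] > 100,
    "East": lambda r: r["region"] == "East",
    "Small orders": lambda r: r["sales"] < 50,
}

# Equivalent of SUM(if [Set] then 1 end), computed once per set
set_sizes = {
    name: sum(1 for r in records if in_set(r))
    for name, in_set in sets.items()
}
print(set_sizes)
```

The sizes can add up to more than the number of records because the sets overlap; each size is an independent measure, as with Measure Values.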

Aggregate bins in Tableau

I want to aggregate bins in tableau.
See the following figure:
I want to aggregate (merge) the NumberM values from 6 to 16 into one category, for example "5+" (6 and higher), and sum the values of 6-16 in that category. I think this can be done with a few simple clicks, but I have not managed to do it.
Thanks in advance,
Tim
There are several ways to classify data rows into different groups or classes: each with different strengths.
Create a calculated field: As emh mentioned, one approach is to create a calculated field that assigns a value to a new field indicating which group each data row belongs to. For the effect you want, the calculated field should be discrete (blue). If your calculation doesn't return a value in some case, e.g. an if statement without an else clause, then the field will be null in that case, which is a group in itself. This is a very general approach and can handle much more complex cases. The only downsides are the need to maintain the calculated field definition and the fact that the cutoff values are hard coded and can't be changed dynamically via a control on the view. BUT those issues can be easily resolved by using a parameter instead of a numeric literal in your calculated field. In fact, that's probably the number one use case for parameters. If you think in SQL, a discrete field on a shelf is like a group by clause.
Use a filter: If you only want a subset of the data in your view, e.g. data rows with NumberM in [6, 16], then you can drag the NumberM field onto the filters shelf and select the range you want. Note that for continuous (green) numeric fields, filter ranges include their endpoints. Filters are very quick and easy to drop on a view. They can be made dynamically adjustable by right clicking on them and creating a quick filter. It's obvious from the view that a filter is in use, and the caption will include the filter settings in its description. But a filter doesn't let you define multiple bins. If you think in SQL, a filter is like a where clause (or in some cases, using the condition tab, like a having clause).
Define histogram bins: If you want to create regular sized bins to cover a numeric range, such as values in [1,5], [6,10], [11,15] ..., Tableau can create the bin field for you automatically. Just right click on a numeric field and select Create Bins.
Define a group: Very useful for aggregating discrete values, such as string fields, into categories. Good for rolling up detail or handling multiple spellings or variants in your data. Just right click on a field and select Create Group. Or select some discrete values on an axis or legend and press the paperclip option. If you then edit a group, you'll see what's going on. If you think in SQL, a group is like a SQL case statement.
Define a set: Another way to roll up values. The definition of a set can be dynamically computed or a hard coded list of members. Both kinds are useful. You can combine sets with union, intersection, and set difference operators, and can test set membership in calculated fields. Sets are useful for binary decisions: rows are divided into those that are members of the set and those that are not.
Filters, sets, groups, calculated fields and parameters can often be combined to accomplish different effects.
Most if not all of these features can be implemented using calculated fields, especially if the business rules get complicated. But if a filter, bin, group or set fits your problem well, then it's often best to start with that, rather than define a calculated field for each and every situation. That said, learning about the 4 kinds of calculated fields really makes a difference in being able to use Tableau well.
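For the calculated-field approach with an adjustable cutoff, a minimal Python sketch of the logic might look like this. The `bin_label` and `aggregate` helpers, the label format and the sample values are assumptions for illustration; in Tableau the cutoff would be a parameter and the label a discrete calculated field.

```python
def bin_label(number_m, cutoff):
    """Keep values up to the cutoff as-is; merge everything above into one label."""
    return f"{cutoff + 1}+" if number_m > cutoff else str(number_m)

def aggregate(rows, cutoff=5):
    """Sum values per label, like a discrete field acting as a group by."""
    totals = {}
    for number_m, value in rows:
        label = bin_label(number_m, cutoff)
        totals[label] = totals.get(label, 0) + value
    return totals

# Hypothetical (NumberM, value) rows
rows = [(1, 10), (2, 7), (5, 3), (6, 4), (9, 2), (16, 1)]
totals = aggregate(rows)                  # default cutoff of 5
totals_low = aggregate(rows, cutoff=2)    # same data, different "parameter"
print(totals)
print(totals_low)
```

Changing the cutoff regroups the rows without touching the grouping logic, which is what a parameter buys you over a hard coded literal.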
You can do this with calculated fields.
Go to Analysis > Create Calculated Field.
Then use this formula:
IF [NumberM] > 5 THEN "OVER 5" END
You can then use that calculated field as a filter on the worksheet in your screenshot.
Answering my own question:
With Tableau 9 this can be easily done with the increased flexibility of the level of detail expressions (LOD). I can really recommend this blog on that subject and many more Tableau functions.