In Tableau how do you change y-axis to be calculated by custom function? - tableau-api

I am working with a 2 y-axis graph; one axis is generally between 40K and 60K, the other between 5K and 10K. What I would like to do is set the start of the first axis dynamically: if the MIN = 42K, start at 40K and increment by 5K. If it is 38K, start at 35K. Similarly for the 2nd y-axis, do the same but based on 2K increments. When I set it to automatic I get basically straight lines, or if I say do not include 0 I get huge drastic swings. I can set the start and the increment manually, but that means every day I would have to go in and verify that it still works; for example, 40K is a good start today, but one day that may be too high or too low. (I suppose the fact that it is 2 axes has nothing to do with it, but I mention it in case it does.) The key is that the axis changes dynamically based on the result set.

If there is a better way to do this, I would love it. However, this got me close to what I wanted. First, I created 2 calculated fields, MIN and MAX using a windowed function on the data. They look something like this below. Note I did 2x the differences to give a window that is roughly 5x the total distance from min to max. Better math could give a better sizing.
Max_Ln = WINDOW_MAX(SUM([Profit])) + (WINDOW_MAX(SUM([Profit])) - WINDOW_MIN(SUM([Profit]))) * 2
Min_Ln = WINDOW_MIN(SUM([Profit])) - (WINDOW_MAX(SUM([Profit])) - WINDOW_MIN(SUM([Profit]))) * 2
I then added both to Detail on the Marks card and used them to add reference lines. I set each reference line to show no label and no line. This causes the automatic axis range to take them into account without showing anything. From there I did the same on the 2nd y-axis, and now everything looks good and adjusts dynamically.
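To sanity-check the arithmetic outside Tableau, here is a minimal Python sketch. `padded_bounds` mirrors the Min_Ln / Max_Ln fields above, and `snap_down` shows the "start at 40K when the MIN is 42K" rounding the question asks for. Both function names are hypothetical and this is only an illustration of the math, not a Tableau feature.

```python
def padded_bounds(values, pad_factor=2):
    """Mirror Min_Ln / Max_Ln: extend each side by pad_factor times the range."""
    lo, hi = min(values), max(values)
    spread = hi - lo
    return lo - spread * pad_factor, hi + spread * pad_factor


def snap_down(value, step):
    """Round value down to the nearest multiple of step (e.g. 42K -> 40K for 5K)."""
    return (value // step) * step


low, high = padded_bounds([42_000, 58_000])
# The padded window is 5x the original min-to-max distance.
window_ratio = (high - low) / (58_000 - 42_000)
```

With a padding factor of 2 on each side, the window is the original range plus four more ranges, which is where the "roughly 5x" in the answer comes from.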

Related

Changing max node capacity in M-tree affects the results

Posting the code for the entire tree for this problem would be pointless (too long and chaotic), and I've tried to fix this problem for a while now, so I don't really want some concrete solution, but more like ideas as to why this might be happening. So:
I have a dataset of 1.000.000 coordinates and I insert them into the tree. I do a range search after and for MaxCapacity=10 I get the correct results (and for any number >= 10). If I switch to MaxCapacity=4 results are wrong. But if I shrink the dataset to about 20.000 coordinates the results are again correct for MaxCapacity=4.
So to me, this looks like an incorrect split algorithm that only shows up for small MaxCapacities and large datasets, where we have an enormous number of splits. But the algorithm checks out for almost everything, so I can't really find a mistake there. Any other ideas? The tree is written in Scala. The promotion policy promotes the two points that are the furthest away from each other, and for the split policy we iterate through the entries of the overflowing node and put each entry into the group of the promoted point it is closer to.
Don't know if anyone will be interested in this but I found the reasons causing this. I thought the problem was in split but I was wrong. The problem was when I was choosing in the Insert Recursion algorithm what node to jump to next in order to place the entry. So I was choosing this node by calculating the distance between each node's center and the entry's point. The node with minimum said distance was chosen.
This works fine if the entry happens to reside inside the radius of one or more nodes; in that case the minimum distance works as intended. But what if the entry doesn't reside inside any node's radius? In that case we have to expand a radius to contain the entry, so we need to find the node whose radius would expand the least if it were to include the entry among its children. A node's distance from the entry point might be the minimum, yet the expansion needed might be catastrophically big. I had not considered this case, and as a result entries were placed in the wrong nodes, causing huge expansions, which in turn caused huge overlaps. When I implemented this case the problem was fixed!
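The two cases described above can be sketched as follows. This is a minimal illustration in Python rather than the original Scala, assuming Euclidean distance and a hypothetical `Node` type holding only a center and a covering radius; it is not the poster's actual code.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    center: tuple   # routing object (hypothetical representation)
    radius: float   # covering radius


def choose_subtree(nodes, point):
    """Pick the child node an entry should descend into during insert."""
    # Case 1: the point already lies inside one or more covering radii.
    # Among those nodes, take the one whose center is closest.
    covering = [n for n in nodes if math.dist(n.center, point) <= n.radius]
    if covering:
        return min(covering, key=lambda n: math.dist(n.center, point))
    # Case 2: no node covers the point, so some radius must grow.
    # Take the node that needs the smallest radius increase.
    return min(nodes, key=lambda n: math.dist(n.center, point) - n.radius)


a = Node((0.0, 0.0), 1.0)
b = Node((10.0, 0.0), 5.0)
# (4, 0) is nearer to a's center, but b needs less expansion (1 versus 3).
chosen = choose_subtree([a, b], (4.0, 0.0))
```

Choosing by center distance alone would pick `a` here and triple its radius, which is the kind of expansion and overlap the answer describes.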

Graph a counter from zero in prometheus/grafana

In prometheus, I have a monotonically increasing counter (ifHCInOctets from IF-MIB, in this case).
In Grafana, I can create a graph using the simple query ifHCInOctets{job='snmp',instance='$Device',ifDescr=~'eth0'} and see the counter graphed over different time ranges by selecting the desired range in the upper-right.
This is almost exactly what I want. However, I would like the graph to always start at zero and increase from there. The use-case is that I want to visualize my data usage over the course of a month to see how quickly I am approaching my data cap. (I already create a gauge object using increase(ifHCInOctets{...}[$__range]) function which shows me how much I have used in total over the given time range, but I'd like to be able to visualize that usage over time.)
Basically, I want ifHCInOctets{...} - X where X is the value of ifHCInOctets at the start of the range. My first thought was:
ifHCInOctets{...} - ifHCInOctets{...} offset $__range
But that seems to show me each data point minus the data point $__range time prior to it (rather than just subtracting the starting value from all points).
I then tried creating a query variable with the query query_result(ifHCInOctets{...} offset $__range) and setting it to update on time range change. This almost seemed to work, but the resulting graph always seemed to start slightly negative, depending on the time range selected, which made me think it wasn't doing what I thought it was.
I have also tried various forms of sum, sum_over_time, and increase, all to no avail.
You're probably looking for something like this:
ifHCInOctets
-
min_over_time(
(ifHCInOctets
and
(month(timestamp(ifHCInOctets)) == scalar(month(vector($__to / 1000)))))[31d:]
)
But it doesn't take into account counter resets. And is ugly and inefficient as hell. It's basically the current value minus the min_over_time calculated over samples in the previous 31 days that fell into the same month as Grafana's $__to timestamp.
You probably want to set up a recording rule based on this expression (that adds year, month and day labels to a metric) and then calculate the increase() over any given month (including the current month). That takes into account both counter resets and counters that did not exist at the beginning of the month.
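To make the counter-reset point concrete, here is a small Python sketch of "graph the counter from zero" that tolerates resets. It is a simplified illustration of the idea only: the function name is hypothetical, the input is a plain list of raw samples, and it does not reproduce the extrapolation that Prometheus's own increase() performs.

```python
def usage_from_zero(samples):
    """Turn raw counter samples into cumulative usage that starts at 0."""
    usage = []
    total = 0
    previous = None
    for value in samples:
        if previous is not None:
            delta = value - previous
            # A drop means the counter was reset; treat the new value
            # as the growth since the reset instead of a negative delta.
            total += delta if delta >= 0 else value
        usage.append(total)
        previous = value
    return usage


series = usage_from_zero([100, 150, 170, 20, 50])
```

Simply subtracting the first (or minimum) sample from every point would go wrong at the reset, where the raw counter falls from 170 to 20.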

How do I sort this scatter plot?

I would like to sort this scatter plot, which is summarized with a Band that includes Minimum, Average, and Maximum.
I would like to sort it in 2 ways:
by Average
by Widest Range (ie difference between Minimum and Maximum values)
Tableau Public workbook
If you can't view this or I'm not allowed to post external resources on stackoverflow, then perhaps you can show me on this screenshot what I would click to get started on the following sort
Also, bonus question, is there a way to create a control for the user to toggle between the 2 sort methods in the same chart? Or do I have to duplicate the chart with a different sort type for each?
One note is that I only have Tableau Public version since I'm evaluating the product. Until I get a paid version, I can't open a workbook file unless you publish it to Tableau Public cloud. But rather than give me the workbook answer, I would just appreciate it if you gave me instructions to do this as this is more of a learning exercise.
Thanks!
Somewhat unfortunately, you'll have to replicate the min,avg,max by creating 3 calculated fields. Tableau cannot operate on the values placed on the view via reference lines.
These calculations might look something like these:
{Fixed [Cwe]: Min([Cvss Score])}
{Fixed [Cwe]: Avg([Cvss Score])}
{Fixed [Cwe]: Max([Cvss Score])}
In general, from there, you should pretty easily be able to apply them to the view and sort. Average will be easy. The difference between Min and Max will just need a subtracting calculated field to sort by. Once they're on the view, I'd put them as a dimension (column) to verify that the numbers look correct.
Take note that FIXED LOD calculations are computed before ordinary dimension filters, so you'll want to put the Cvss filter you have there 'in context' by right-clicking it and clicking 'Add to Context'. Context filters are applied before FIXED calculations.
Here is how I would complete the sorts:
Start with all the above calculations on 'Rows' and make sure they are 'Dimensions' (blue).
Then right-click [Sub-Category] on 'Rows', choose "Sort...", and select which field to sort by.
From there, the calculated fields can be taken off 'Rows'. (They were only there so that you could check that the sort took place; they don't need to be in the view for the sort to work.)
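As a rough check of what the two sorts should produce, the same per-category min/avg/max and the two sort keys can be sketched in Python. The data and function names here are made up for illustration; the `by` argument is a hypothetical stand-in for whichever sort the user picks.

```python
from statistics import mean


def summarize(rows):
    """rows: iterable of (category, score) -> {category: (min, avg, max)}."""
    groups = {}
    for category, score in rows:
        groups.setdefault(category, []).append(score)
    return {c: (min(s), mean(s), max(s)) for c, s in groups.items()}


def sort_categories(summary, by="average"):
    """Return categories sorted descending by average or by max - min."""
    if by == "average":
        key = lambda c: summary[c][1]
    elif by == "range":
        key = lambda c: summary[c][2] - summary[c][0]
    else:
        raise ValueError(f"unknown sort: {by}")
    return sorted(summary, key=key, reverse=True)


rows = [("A", 2.0), ("A", 9.0), ("B", 7.0), ("B", 8.0), ("C", 1.0), ("C", 3.0)]
summary = summarize(rows)
by_average = sort_categories(summary, by="average")
by_range = sort_categories(summary, by="range")
```

The two orderings differ because a category with a wide spread (A) does not necessarily have the highest average (B).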

Step function between two alternating values

I'm looking to plot connectivity over time to see connection duration and amount of disconnects. Here is the graph I currently have.
This graph is misleading though. It makes it seem like the machine is slowly disconnecting between Sep 29th and Oct 3rd, when in reality it is connected that whole time before a brief disconnection.
I'd like the line to remain at 1 / connected until it is not connected.
Thanks in advance for any help!
Tableau is doing this because it draws a line between all data points in the view along the x-axis. I'm assuming you don't have a 1 before October 3rd, so it just slowly slopes to the next point which happens to be a 0.
There are a few approaches you could use to visualize this type of data. If the system is always connected when not disconnected, then you could just visualize the points that are disconnects. Additionally, switching to a bar plot may sometimes communicate your intent better than a line in this situation.
Depending on the structure of your underlying data, and on your assumptions about how the disconnects/connects are ordered in it, you could create a table calculation that uses the last value in the partition to determine its value (connected vs. disconnected).
You could also resample the data to turn your irregular time series into a regular one. This would add a large number of data points, depending on the time interval you are looking for (about 1.3 million for 15 days at 1-second intervals).
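The resampling idea, carrying the last known state forward onto a regular grid, can be sketched in Python as below. This assumes the events are already sorted by time and uses plain integers as timestamps; the function name and sample data are hypothetical, and the resampling would happen before the data reaches Tableau.

```python
def forward_fill(events, start, end, step):
    """Resample sorted (timestamp, state) events onto a regular grid,
    carrying the last known state forward until the next event."""
    result = []
    state = None  # unknown until the first event has been seen
    i = 0
    t = start
    while t <= end:
        # Consume every event that happened at or before this grid point.
        while i < len(events) and events[i][0] <= t:
            state = events[i][1]
            i += 1
        result.append((t, state))
        t += step
    return result


events = [(0, 1), (5, 0), (6, 1)]  # connected, brief disconnect, reconnected
grid = forward_fill(events, start=0, end=7, step=1)
states = [s for _, s in grid]
```

The grid step has to be no longer than the shortest disconnect you care about, otherwise a brief drop can fall between two grid points and vanish.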
A few suggestions:
Clarify the units on your x-axis: days? hours? secs?
Try using dots instead of a line connector
Flip the visualization around: plot a transform of your data where 'connected'=0 and 'disconnected'=1

Tableau Dual Axis with different filters

I am trying to create a graph with two lines, with two filters from the same dimension.
I have a dimension which has 20+ values. I'd like one line to show data based on just one of the selected values and the other line to show a line excluding that same value.
I've tried the following:
-Creating a duplicate/copy dimension and filtering the original one with the first, and the copy with the 2nd. When I do this, the graphic disappears.
-Creating a calculated field that tries to split the measures up. This isn't letting me track the count.
I want this on the same axis; the best I've been able to do is create two sheets, one with the first filter and one with the 2nd, and stack them in a dashboard.
My end user wants the lines in the same visual, otherwise I'd be happy with the dashboard approach. Right now, though, I'd also like to know how to do this.
It is a little hard to tell exactly what you want to achieve, but the problem with filtering is common.
The principle that is important is that Tableau will filter the whole dataset by row. So duplicating the dimension you want to filter won't help as the filter on the original dimension will also filter the corresponding rows in the second dimension. Any solution has to be clever enough to work around this issue.
One solution is to build two new dimensions that use a calculation rather than a filter to create the new result. Let's say you have a dimension, [size] that has a range of numbers from 1 to 10 and you want to compare the total number of rows including and excluding the number 5. You could create a new field using a formula like if [size] <> 5 then 1 else 0 end
Summing the new field will give a count of the number of rows that don't contain a 5 and this can be compared directly to a rowcount of the original [size] field which will give the number including the value 5.
This basic principle can be extended to much more complex logic. The essential point is to realise that filters act on every row in your data and can't, by themselves, show comparisons with alternative filter choices on a single visualisation.
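The flag-and-sum idea can be shown with a few lines of Python. This is only an illustration of the logic behind the calculated field, using made-up data and hypothetical function names; in Tableau the equivalent is the if/then/else field summed on the view.

```python
def count_excluding(sizes, excluded):
    """Equivalent of summing: if [size] <> excluded then 1 else 0 end."""
    return sum(1 if size != excluded else 0 for size in sizes)


def count_only(sizes, wanted):
    """The complementary flag: count rows that have exactly this value."""
    return sum(1 if size == wanted else 0 for size in sizes)


sizes = [1, 5, 3, 5, 10, 7]
without_five = count_excluding(sizes, 5)
only_five = count_only(sizes, 5)
all_rows = len(sizes)
```

Because no row is filtered out, both counts come from the same unfiltered data and can sit side by side in one visualization.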
Depending on the nature of your problem there may be other solutions worth looking at including sets and groups but you would need to provide more specific details for users here to tell you whether they would be useful.
We can make a set out of the values of the dimension and then place it on the required shelf. You will then have your dimension, which plots as usual, and the set, which shows the data you need; a filter alone can't give you that independence.