Can't compute an aggregation of an aggregation in Tableau - tableau-api

I'm trying to compute the standard deviation of a nested calculated measure.
In this example, different countries produce a number of items every month, each with a specific colour. I'm trying to sort countries by the standard deviation of the monthly ratio between warm and cold colours for every country.
The underlying data is as follows (each row is an item produced at a certain date by a certain country):
date country colour
-------------------------------
2020-03-01 France Blue
2020-03-01 UK Red
2020-03-02 USA Green
2020-03-03 Belgium Red
2020-03-04 UK Green
The first calculated measures identify all the items which are either warm or cold colours:
WARM_COLOUR:
{INCLUDE [Colour]: SUM(If [Colour] = 'Red' or [Colour] = 'Orange' or [Colour] = 'Yellow' THEN 1 ELSE 0 END)}
COLD_COLOUR:
{INCLUDE [Colour]: SUM(If [Colour] = 'Blue' or [Colour] = 'Green' THEN 1 ELSE 0 END)}
Then, I compute the ratio between warm and cold colours:
WARM_COLD_RATIO:
SUM([WARM_COLOUR]) / (SUM([WARM_COLOUR]) + SUM([COLD_COLOUR]))
Finally, I want to compute, for every country, the standard deviation of this ratio, but this produces an error:
{INCLUDE [Country]: STDEV([WARM_COLD_RATIO])}
^^^^^ Error: argument to STDEV is already an aggregation and can't be aggregated further
Ultimately, I want to sort countries in descending order of the standard deviation of the warm/cold colour ratio per time period (e.g. month). A country whose warm/cold ratio varies a lot from month to month should come first, whereas a country with the same warm/cold ratio every month should come last.

Table calculations can't be inside LOD calculations.
Is there any reason it really needs to be an LOD? A table calculation such as WINDOW_STDEV may be a good alternative:
WINDOW_STDEV([WARM_COLD_RATIO])
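To make the two aggregation levels explicit outside Tableau, here is a minimal Python sketch of the intended computation: first the warm ratio per country and month, then the sample standard deviation of those monthly ratios per country. The rows are made-up illustration data (not from the question), the colour sets are taken from WARM_COLOUR and COLD_COLOUR above, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import stdev

WARM = {"Red", "Orange", "Yellow"}
COLD = {"Blue", "Green"}

# Hypothetical rows (date, country, colour); illustration only.
rows = [
    ("2020-03-01", "France", "Blue"),
    ("2020-03-05", "France", "Red"),
    ("2020-04-02", "France", "Blue"),
    ("2020-04-09", "France", "Red"),
    ("2020-03-01", "UK", "Red"),
    ("2020-03-04", "UK", "Green"),
    ("2020-04-03", "UK", "Red"),
    ("2020-04-07", "UK", "Red"),
]

def ratio_stdev_by_country(rows):
    # Level 1: count warm and cold items per (country, month).
    counts = defaultdict(lambda: [0, 0])  # [warm, cold]
    for date, country, colour in rows:
        month = date[:7]  # "YYYY-MM"
        if colour in WARM:
            counts[(country, month)][0] += 1
        elif colour in COLD:
            counts[(country, month)][1] += 1
    # Monthly ratio warm / (warm + cold), collected per country.
    ratios = defaultdict(list)
    for (country, _month), (warm, cold) in counts.items():
        if warm + cold:
            ratios[country].append(warm / (warm + cold))
    # Level 2: sample standard deviation of the monthly ratios
    # (needs at least two months per country).
    return {c: stdev(r) for c, r in ratios.items() if len(r) >= 2}

result = ratio_stdev_by_country(rows)
# Countries whose ratio varies most from month to month come first.
ranking = sorted(result, key=result.get, reverse=True)
print(ranking)
```

The second aggregation runs over the results of the first, which is the step Tableau refuses to do when STDEV is applied directly to an already aggregated field.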

Related

How to assign colours on a geo map to a data column based on conditions in Tableau?

I have data as below,
pincode value
1010 null
1020 0
1020 0.2
1030 0.55
1132 0.4
1124 0.8
1010 1
1020 null
1020 0
1030 0.66
1132 0.5
1124 0.3
I want to plot these values on a geo map in Tableau based on pincode. The following conditions must also be satisfied:
when value = null -> colour should be blue
when value is between 0 and 0.5 -> colours should vary in shades of red (light red to dark red)
when value is between 0.5 and 1 -> colours should vary in shades of green (light green to dark green)
How can I do this in Tableau? I tried splitting the values into three different columns by creating calculated fields, but then I am unable to assign three different columns to the colour field.
Can anyone suggest a better solution or idea?
Thank you:)
Why not create one calculated field called "pincode value groups" or similar, and use IF and ELSEIF logic to set all three groups in one field?
The output would be either "no value", "Between 0 and 0.5", or "greater than 0.5".
Then you can set colours for each of these values.
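The grouping logic of such a calculated field can be sketched in Python as follows. This only illustrates the IF/ELSEIF branching, not Tableau syntax; the function name is hypothetical, and putting the boundary value 0.5 into the lower group is an assumption, since the question does not say where it belongs.

```python
def pincode_value_group(value):
    # None stands in for a null value in the data.
    if value is None:
        return "no value"
    # Assumption: exactly 0.5 is placed in the lower group.
    elif value <= 0.5:
        return "Between 0 and 0.5"
    else:
        return "greater than 0.5"

values = [None, 0, 0.2, 0.55, 0.4, 0.8, 1]
groups = [pincode_value_group(v) for v in values]
print(groups)
```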

Charting OHLC candle with SMA 200 using mplfinance plot function

I'm using the mplfinance plot function to draw an OHLC candlestick chart of a symbol. The OHLC data is on a 2-minute timeframe. I'm also plotting the 20-period and 200-period SMAs on the same chart. Because of the SMA 200, the number of candles displayed on the chart is quite large (almost two days of 2-minute candles).
Since the moving average is calculated internally by the plot function, I have to pass two days of 2-minute candles to it in order to get some SMA 200 data points. The candlestick chart is saved as a PNG file. With around 300 candles on the chart (plus the SMA 20 and SMA 200 lines), the candles are not displayed very clearly.
Is there a way to restrict the number of candles displayed on the chart? If I slice my dataframe to, say, 30 candles, the SMA 200 cannot be calculated because there are too few candles. What I need is the SMA 200 calculated over the complete dataset, but with only a fixed number of candles, or a fixed duration such as the last hour, displayed on the chart.
mpf.plot(df, type='candle', style='charles',
         title=title,
         ylabel='Price',
         ylabel_lower='Shares \nTraded',
         mav=(20,200),
         savefig=file)
I would suggest that you calculate your own moving average, and plot it using mpf.make_addplot(). This will allow you to calculate a moving average based on one-minute or two-minute candles, while plotting five-minute or ten-minute candles. For example:
# calculate mav values on the two-minute data
mav20 = twominute_df['Close'].rolling( 20).mean()
mav200 = twominute_df['Close'].rolling(200).mean()
# resample:
resample_ohlcmap = {'Open'  :'first',
                    'High'  :'max',
                    'Low'   :'min',
                    'Close' :'last',
                    'Volume':'sum'
                   }
tenminute_df = twominute_df.resample('10min').agg(resample_ohlcmap)
# addplot data must have the same length as the plotted dataframe,
# so keep the last two-minute mav value in each ten-minute bin:
mav20_10min = mav20.resample('10min').last()
mav200_10min = mav200.resample('10min').last()
# plot ten-minute candles with two-minute mavs:
apmavs = [ mpf.make_addplot(mav20_10min),
           mpf.make_addplot(mav200_10min) ]
mpf.plot(tenminute_df, type='candle', style='charles',
         title=title, ylabel='Price', ylabel_lower='Shares \nTraded',
         addplot=apmavs, savefig=file)
References:
resampling
moving average calculation
Thanks, Daniel, for your help. I'm now able to plot a chart of 60 candles with SMA 20 and SMA 200.
I don't need resampling, since my chart timeframe and moving-average timeframe are the same.
Here is my code snippet:
# get list of close prices from symbol_docs. symbol_docs contain 2 min OHLC.
close_list = list(map(lambda a: a['close'], symbol_docs))
# sma20 and 200 calculated using ta-lib
sma20 = sma(close_list, 20)
sma200 = sma(close_list, 200)
# call to plot_chart function
plot_chart('TCS', symbol_docs, sma20, sma200)
def plot_chart(symbol, docs, sma20, sma200):
    df = pd.DataFrame(docs)
    df = df.set_index(['time'])
    df.rename(columns={'open': 'Open', 'close': 'Close', 'high': 'High', 'low': 'Low'},
              inplace=True)
    title = symbol.upper() + ' - 2min'
    file = saved_chart_image_abs_path + symbol + '.png'
    df['sma20'] = sma20
    df['sma200'] = sma200
    df_sliced = df[-60:]
    apmavs = [mpf.make_addplot(df_sliced['sma20']), mpf.make_addplot(df_sliced['sma200'])]
    mpf.plot(df_sliced, type='candle', style='charles',
             title=title,
             ylabel='Price',
             ylabel_lower='Shares \nTraded',
             addplot=apmavs,
             savefig=file)
    telegram_message_sender.send_document(file)
    os.remove(file)
The resulting chart is sent as a document to my Telegram group :)
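The order of operations is what makes this work: the moving average is computed over the full history and only then sliced for display. A minimal stdlib-only sketch of that idea follows. The helper `simple_moving_average` is a hypothetical stand-in for the ta-lib call, and the close prices are made up.

```python
def simple_moving_average(values, window):
    """Return a list the same length as `values`; entries are None
    until `window` values are available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out

closes = [100.0 + i for i in range(300)]  # hypothetical 2-minute closes

# Compute on the full history first, then keep the last 60 points.
sma200_then_slice = simple_moving_average(closes, 200)[-60:]

# Slicing first leaves too few candles for a 200-period average.
slice_then_sma200 = simple_moving_average(closes[-60:], 200)

print(sma200_then_slice[-1])
print(all(v is None for v in slice_then_sma200))
```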

What is the area of geom field?

I want to check the area of geometry values.
The geometry values are POLYGON, POINT or MULTIPOLYGON.
The field has the type geometry.
I check the srid of the geom field:
select st_srid(geometry)
from my_table
And I got srid=32636.
I checked here:
https://epsg.io/32636 and it seems that the units are in meters.
Now I want to get the area (in square metres) of each value:
select st_area(geometry)
from my_table
And I'm getting very small values (0.0002, or 0.000097 or 0.33, ....).
I want to be sure:
Do those values mean square metres (m^2)?
So the values are less than 1 square metre?
Since your SRS unit is metre, ST_Area will return the area in square metres. The following example calculates the area of a polygon using SRS's that have different units:
WITH j (geom) AS (
VALUES ('SRID=32636;
POLYGON((-1883435.029648588 6673769.700215263,-1883415.1158478875 6673776.142528819,-1883411.8478185558 6673765.073005969,-1883431.7724919873 6673758.967942359,-1883435.029648588 6673769.700215263))'::GEOMETRY))
SELECT
ST_Area(geom) AS sqm,
ST_Area(
ST_Transform(geom,2249)) AS sqft
FROM j;
sqm | sqft
-------------------+-------------------
237.6060612927441 | 2341.135411173445
EPSG 32636: Units are metres (Ellipsoid WGS84)
EPSG 2249: Units are feet (Ellipsoid GRS1980)
To your questions:
Do those values mean square metres (m^2)?
Yes.
So the values are less than 1 square metre?
Yes. I'm curious what your geometries represent. Perhaps you mixed up different SRSs?
Unrelated note: Spatial operations with SRS's that have the same unit might still deliver different results, as they might also use different ellipsoids. The example below will calculate the area of the same geometry using SRS's that have metre as unit but a different ellipsoid. Note the difference in the result:
WITH j (geom) AS (
VALUES ('SRID=32636;
POLYGON((-1883435.029648588 6673769.700215263,-1883415.1158478875 6673776.142528819,-1883411.8478185558 6673765.073005969,-1883431.7724919873 6673758.967942359,-1883435.029648588 6673769.700215263))'::GEOMETRY))
SELECT
ST_Area(geom) AS sqm_32636,
ST_Area(
ST_Transform(geom,26986)) AS sqm_26986
FROM j;
sqm_32636 | sqm_26986
-------------------+--------------------
237.6060612927441 | 217.49946674261872
EPSG 32636: Units are metres (Ellipsoid WGS84)
EPSG 26986: Units are metres (Ellipsoid GRS1980)
But if you stick to the same ellipsoid and convert to the same unit, the math makes more sense:
WITH j (geom) AS (
VALUES ('SRID=32636;
POLYGON((-1883435.029648588 6673769.700215263,-1883415.1158478875 6673776.142528819,-1883411.8478185558 6673765.073005969,-1883431.7724919873 6673758.967942359,-1883435.029648588 6673769.700215263))'::GEOMETRY))
SELECT
ST_Area(
ST_Transform(geom,2249)) AS sqft_2249,
ST_Area(
ST_Transform(geom,2249)) * 0.3048 ^ 2 AS sqm_2249, -- manually converted from sqft to sqm
ST_Area(
ST_Transform(geom,26986)) AS sqm_26986
FROM j;
sqft_2249 | sqm_2249 | sqm_26986
-------------------+--------------------+--------------------
2341.135411173445 | 217.49859674966302 | 217.49946674261872
Demo: db<>fiddle
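The small remaining gap between sqm_2249 and sqm_26986 can be checked with plain arithmetic. The sketch below assumes that EPSG:2249 is defined in US survey feet (its EPSG name carries the ftUS suffix) rather than international feet. The two area values are copied from the query output above; the constant names are only illustrative.

```python
INTERNATIONAL_FOOT_M = 0.3048
US_SURVEY_FOOT_M = 1200 / 3937  # definition of the US survey foot in metres

sqft_2249 = 2341.135411173445    # ST_Area result in EPSG:2249
sqm_26986 = 217.49946674261872   # ST_Area result in EPSG:26986

# Square feet to square metres: multiply by (metres per foot) squared.
sqm_international = sqft_2249 * INTERNATIONAL_FOOT_M ** 2
sqm_us_survey = sqft_2249 * US_SURVEY_FOOT_M ** 2

print(sqm_international, abs(sqm_international - sqm_26986))
print(sqm_us_survey, abs(sqm_us_survey - sqm_26986))
```

Under that assumption, the conversion with the US survey foot lands much closer to the EPSG:26986 result than the conversion with 0.3048.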

Excluding Outliers from Avg Calculations at the dimension level in Qlik

All,
I need to plot a monthly average trend. Each month behaves differently, hence the need to exclude outliers at the month level. The calculation needs to be dynamic: if I look at the trend by a certain category, the outliers need to be recalculated at the category level for each month.
Here is what I tried.
I created a variable:
vOutlier = Fractile(count , 0.75 ) + 1.5 *(Fractile(count , 0.75 ) - Fractile(count , 0.25 ))
Then I used it in set analysis:
avg({$<Values = {"<=Fractile(Values , 0.75 ) + 1.5 *(Fractile(Values , 0.75 ) - Fractile(Values , 0.25 ))"}>} Values)
and avg({$<Values = {"<=$(=$(vOutliers))"}>} Values)
If you need a sample QVF, please find it here on the Qlik forum.
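For reference, the intended calculation (an upper fence of Q3 + 1.5 * IQR recomputed for every month, then an average of the values at or below it) can be sketched in Python. This is not Qlik syntax. The sample rows and the function name are hypothetical, and it is an assumption that Python's "inclusive" quantile method interpolates the same way as Fractile.

```python
from collections import defaultdict
from statistics import mean, quantiles

def avg_without_high_outliers(rows):
    """rows: iterable of (month, value). Returns {month: average of the
    values at or below that month's own upper fence}."""
    by_month = defaultdict(list)
    for month, value in rows:
        by_month[month].append(value)
    result = {}
    for month, values in by_month.items():
        # Quartiles are recomputed per month (needs >= 2 values per month).
        q1, _median, q3 = quantiles(values, n=4, method="inclusive")
        upper_fence = q3 + 1.5 * (q3 - q1)
        kept = [v for v in values if v <= upper_fence]
        result[month] = mean(kept)
    return result

# Hypothetical data: 100 is an outlier in January but ordinary in February.
rows = [("2020-01", v) for v in (10, 12, 11, 13, 100)]
rows += [("2020-02", v) for v in (100, 110, 105, 95, 102)]

monthly_avg = avg_without_high_outliers(rows)
print(monthly_avg)
```

Grouping by (category, month) instead of month alone would give the per-category recalculation described above.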

PostgreSQL: if column1 contains x, y or z set column2 to x,y or z

I have a list of products with uncleaned descriptions from which I need to extract attributes.
e.g.:
'large (blue) tank top'
'med wht. W shorts'
etc.
needs to become:
size    color  style     gender
large   Blue   Tank top
Medium  White  Shorts    women's
The list of possible variations is fairly long. Is there a way in Postgres to say:
If description contains [blue,white,red] set Color to [returned found color]
There are multiple ways. The most direct is to use case:
select (case when description like '%blue%' then 'blue'
             when description like '%white%' then 'white'
             when description like '%red%' then 'red'
        end) as color
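The CASE expression returns the value for the first pattern that matches. The same first-match idea, extended with a lookup table so that abbreviations such as 'wht' map to a canonical colour, is sketched below in Python. This is only an illustration of the logic outside the database; the table contents and the function name are hypothetical.

```python
import re

# Hypothetical variant -> canonical colour table; extend as needed.
COLOUR_VARIANTS = {
    "blue": "Blue",
    "white": "White",
    "wht": "White",
    "red": "Red",
}

def extract_colour(description):
    """Return the canonical colour for the first matching word, else None."""
    # Split into lowercase words so 'red' does not match inside 'shredded',
    # which a pattern like '%red%' would.
    for token in re.findall(r"[a-z]+", description.lower()):
        if token in COLOUR_VARIANTS:
            return COLOUR_VARIANTS[token]
    return None

print(extract_colour("large (blue) tank top"))
print(extract_colour("med wht. W shorts"))
```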