Tableau : Create a calculated column which adds up all the values of another column - tableau-api

I have data in Tableau like this:
City Number of office Total Offices
Chicago 5 20
Houston 6 20
Dallas 4 20
I want to create a calculated field that holds the difference [Total Offices] - SUM([Number of Office]), where the sum runs over all rows. It should work out like this:
(20 - (5+6+4)) = 5
City Number of office Total Offices Difference(Calculated Field)
Chicago 5 20 5
Houston 6 20 5
Dallas 4 20 5
How can I achieve this?

Simply this; the braces make the sum a table-wide level of detail (LOD) expression:
[Total Offices] - {sum([Number of Office])}
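To make the arithmetic explicit, here is a small Python sketch of what the calculated field evaluates to. It is not Tableau code; the `rows` list and the `office_sum` and `difference` names are illustrative assumptions. The point is that the braced sum is computed once over the whole table and then subtracted from each row's [Total Offices].

```python
# Sample rows from the question: (city, number_of_office, total_offices).
rows = [
    ("Chicago", 5, 20),
    ("Houston", 6, 20),
    ("Dallas", 4, 20),
]

# Table-wide sum, analogous to {sum([Number of Office])}.
office_sum = sum(number for _, number, _ in rows)

# Row-level result: each row's total minus the table-wide sum.
difference = {city: total - office_sum for city, _, total in rows}

print(office_sum)   # 15
print(difference)   # {'Chicago': 5, 'Houston': 5, 'Dallas': 5}
```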

Related

How to roll numbers up in Tableau for aggregation?

I have a data structure issue. I have a problem where I need to roll up my data within tableau so that aggregated numbers do not skew in a certain manner.
Example of current data
ID Model_Number Value
123 fff 2
123 ggg 2
123 hhh 2
123 uuu 2
124 yyy 5
124 qqq 5
124 eee 5
Avg: NA 3.28
Ideal state of data and aggregation
ID Value
123 2
124 5
Avg 3.5
As you can see, since the data is at two different grains, the aggregated number (avg) will be different. I would like to roll up my numbers to the distinct values of ID and then calculate my average, which will result in a different (correct in my context) aggregated number.
Here is one calculated field that could help.
{ FIXED [ID] : AVG([Value]) }
This gives you the average value per ID. You can then use a grand total (avg) to get 3.5.
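The difference between the two grains can be reproduced with a short Python sketch. This is plain Python rather than Tableau, and the variable names are illustrative assumptions: it first averages per ID (the role of the FIXED expression) and then averages those per-ID values (the role of the grand total).

```python
from collections import defaultdict
from statistics import mean

# Sample rows from the question: (ID, Model_Number, Value).
rows = [
    (123, "fff", 2), (123, "ggg", 2), (123, "hhh", 2), (123, "uuu", 2),
    (124, "yyy", 5), (124, "qqq", 5), (124, "eee", 5),
]

# Averaging at row level weights ID 123 four times and ID 124 three times.
row_level_avg = mean(value for _, _, value in rows)

# Roll up to one value per ID first, analogous to { FIXED [ID] : AVG([Value]) }.
values_by_id = defaultdict(list)
for id_, _, value in rows:
    values_by_id[id_].append(value)
per_id_avg = {id_: mean(values) for id_, values in values_by_id.items()}

# Then average the per-ID values, analogous to the grand total (avg).
rolled_up_avg = mean(per_id_avg.values())

print(rolled_up_avg)  # 3.5
```

Here `row_level_avg` is 23/7 (about 3.29), while `rolled_up_avg` is 3.5.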

Pyspark : Filter dataframe based on null values in two columns

I have a dataframe like this
id customer_name city order
1 John dallas 5
2 steve 4
3 austin 3
4 Ryan houston 2
5 6
6 nyle austin 4
I want to filter out the rows where customer_name and city are both null. If either of them has a value, the row should be kept. The result should be:
id customer_name city order
1 John dallas 5
2 steve 4
3 austin 3
4 Ryan houston 2
6 nyle austin 4
I can only find filter conditions based on one column. How do I filter based on two columns?
Use coalesce.
from pyspark.sql.functions import coalesce
df.filter(coalesce('customer_name', 'city').isNotNull())
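Alternatively, import the functions module under the alias f (import pyspark.sql.functions as f) and keep every row where at least one of the two columns is not null:
df.filter(f.col("customer_name").isNotNull() | f.col("city").isNotNull())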
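The condition the question asks for can be checked without Spark. The sketch below is plain Python, not PySpark: `coalesce` here is a hand-written stand-in for the Spark function of the same name, and it assumes the blank cells in the sample are real nulls (None) rather than empty strings.

```python
# Sample rows from the question: (id, customer_name, city, order).
rows = [
    (1, "John", "dallas", 5),
    (2, "steve", None, 4),
    (3, None, "austin", 3),
    (4, "Ryan", "houston", 2),
    (5, None, None, 6),
    (6, "nyle", "austin", 4),
]

def coalesce(*values):
    """Return the first value that is not None, or None if all are None."""
    return next((v for v in values if v is not None), None)

# Keep a row when the first non-null of (customer_name, city) exists.
kept_coalesce = [r[0] for r in rows if coalesce(r[1], r[2]) is not None]

# Equivalent explicit test: at least one of the two columns is non-null.
kept_either = [r[0] for r in rows if r[1] is not None or r[2] is not None]

# For comparison: requiring BOTH columns to be non-null is a stricter filter.
kept_both = [r[0] for r in rows if r[1] is not None and r[2] is not None]

print(kept_coalesce)  # [1, 2, 3, 4, 6]
print(kept_both)      # [1, 4, 6]
```

Both the coalesce test and the explicit "either column is non-null" test keep ids 1, 2, 3, 4 and 6, which is the requested result; requiring both columns to be non-null would also drop ids 2 and 3.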

Postgresql conditional sum

I have a table with the following data for sales and inventory (oh):
category sales oh item_num
Clothes 12 10 1
Clothes 11 10 1
Clothes 10 10 1
Clothes 5 10 1
Clothes 8 10 1
Clothes 4 10 1
Clothes 23 10 2
Clothes 5 10 2
Clothes 20 10 2
Clothes 5 10 2
Clothes 13 10 2
Clothes 9 10 2
Food 6 25 3
Food 8 25 3
Food 7 25 3
Food 14 25 3
I am trying to query this table to get a sum of both the sales and oh columns by category:
SELECT category, SUM(sales) AS sales, SUM(oh) AS oh
FROM data
GROUP BY category
However, the problem is I need the SUM(oh) to only sum distinct items but the SUM(sales) to sum all the values. So the result should be:
category sales oh
Clothes 125 20
Food 35 25
I tried SUM(DISTINCT oh), but that only works for distinct oh values not distinct items. I really need something like SUM(DISTINCT(item_num) oh).
I experimented with various window functions, but could not come up with a solution. Does anyone know how to return this kind of sum on a unique key?
Here's how I'd do it:
SELECT category, SUM(sales) AS sales, SUM(oh) AS oh
FROM (
SELECT category, SUM(sales) AS sales, oh
FROM data
GROUP BY category, item_num, oh
) ttl
GROUP BY category;
Basically, tackle the problem in stages. First group by category and item number (plus oh) to get the sum of sales per item alongside a single oh value for that item, then group by category and sum both columns.
Result:
category | sales | oh
----------+-------+----
Food | 35 | 25
Clothes | 125 | 20
(2 rows)
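The staged query can be tried end to end with Python's built-in sqlite3 module. SQLite stands in for PostgreSQL here purely for convenience, and the ORDER BY is an addition to make the row order deterministic; the rest of the query is the one from the answer.

```python
import sqlite3

# Sample data from the question: (category, sales, oh, item_num).
rows = [
    ("Clothes", 12, 10, 1), ("Clothes", 11, 10, 1), ("Clothes", 10, 10, 1),
    ("Clothes", 5, 10, 1), ("Clothes", 8, 10, 1), ("Clothes", 4, 10, 1),
    ("Clothes", 23, 10, 2), ("Clothes", 5, 10, 2), ("Clothes", 20, 10, 2),
    ("Clothes", 5, 10, 2), ("Clothes", 13, 10, 2), ("Clothes", 9, 10, 2),
    ("Food", 6, 25, 3), ("Food", 8, 25, 3), ("Food", 7, 25, 3),
    ("Food", 14, 25, 3),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (category TEXT, sales INTEGER, oh INTEGER, item_num INTEGER)"
)
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", rows)

# Stage 1 (inner query): one row per item, with its summed sales and its oh.
# Stage 2 (outer query): sum both columns per category.
query = """
SELECT category, SUM(sales) AS sales, SUM(oh) AS oh
FROM (
    SELECT category, SUM(sales) AS sales, oh
    FROM data
    GROUP BY category, item_num, oh
) ttl
GROUP BY category
ORDER BY category
"""
result = conn.execute(query).fetchall()
print(result)  # [('Clothes', 125, 20), ('Food', 35, 25)]
```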

duplicating table columns in KDB

Consider the code below:
q)tab:flip `items`sales`prices!(`nut`bolt`cam`cog;6 8 0 3;10 20 15 20)
q)tab
items sales prices
------------------
nut 6 10
bolt 8 20
cam 0 15
cog 3 20
I would like to duplicate the prices column. I can write a query like this:
q)update prices_copy: prices from tab
I also can write a query like this:
q)select items, sales, prices, prices_copy: first prices by items from tab
Both would work. I would like to know how the "by" version works and the motivation for writing each version. I cannot help but think the "by" version thinks more in terms of rows.
Your first query (the update) is ideally what you want for duplicating a column.
The by creates groups of the column items in your example and collapses every other column in the select query according to the indices calculated from grouping items. More info on by here - http://code.kx.com/wiki/Reference/select and http://code.kx.com/wiki/JB:QforMortals2/queries_q_sql#The_by_Phrase
In your example, the column items is already unique, so no real collapsing into groups takes place; however, by still turns the other columns into nested lists (i.e. lists of lists). Applying first to prices un-nests it, so prices_copy comes out as a normal (long-typed) vector.
When the grouping is finished the by columns are used as the key column[s] of the result and you will see this by the use of a vertical line to the right hand side of the key column[s]. All other columns within the select query are placed to the right hand side of the key.
The logic of the by version coincidentally creates a copy of prices. But by changes the order:
q)ungroup select sales, prices by items from tab
items sales prices
------------------
bolt 8 20
cam 0 15
cog 3 20
nut 6 10
q)tab
items sales prices
------------------
nut 6 10
bolt 8 20
cam 0 15
cog 3 20
The by version works only because items is unique. For a table with repeated item values, e.g. 8#tab, the query produces only 4 values for prices_copy, one per distinct item:
q)select items, sales, prices, prices_copy: first prices by items from 8#tab
items| items     sales prices prices_copy
-----| ----------------------------------
bolt | bolt bolt 8 8   20 20  20
cam  | cam  cam  0 0   15 15  15
cog  | cog  cog  3 3   20 20  20
nut  | nut  nut  6 6   10 10  10
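The collapsing described above can be mimicked in plain Python. This is a sketch of the grouping idea only, not of how kdb+ implements by; the `groups` and `prices_copy` names are illustrative assumptions, and the lists reproduce 8#tab (the four rows repeated twice).

```python
# Columns of 8#tab: the original four rows, cycled to eight.
items = ["nut", "bolt", "cam", "cog"] * 2
prices = [10, 20, 15, 20] * 2

# Grouping by items turns prices into one nested list per distinct item.
groups = {}
for item, price in zip(items, prices):
    groups.setdefault(item, []).append(price)

# Applying "first" to each nested list leaves one atom per group,
# with the keys in sorted order as in the q output.
prices_copy = {item: groups[item][0] for item in sorted(groups)}

print(groups["bolt"])  # [20, 20]
print(prices_copy)     # {'bolt': 20, 'cam': 15, 'cog': 20, 'nut': 10}
```

Eight input rows yield only four values of `prices_copy`, which is why the by version is not a general way to duplicate a column.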
There is a fundamental difference between a simple update and update by queries.
Let's explore it by adding an extra column brand to the table
tab2:flip `items`sales`prices`brand!(`nut`bolt`cam`cog`nut`bolt`cam`cog;6 8 0 3 1 2 3 4;10 20 15 20 30 40 50 60;`b1`b1`b1`b1`b2`b2`b2`b2)
The following simply copies the column:
asc update prices_copy: prices from tab2
However, the following query takes the first price of each item (here the b1 price) and assigns it to every row of that item, regardless of brand:
asc ungroup select sales, prices,brand, prices_copy: first prices by items from tab2
items sales prices brand prices_copy
------------------------------------
bolt 2 40 b2 20 //b1 price
bolt 8 20 b1 20
cam 0 15 b1 15
cam 3 50 b2 15 //b1 price
cog 3 20 b1 20
cog 4 60 b2 20 //b1 price
nut 1 30 b2 10 //b1 price
nut 6 10 b1 10
update by might be useful in the case where you want to copy the max price of the items regardless of the brand or some other aggregation query.
asc ungroup select sales, prices,brand, prices_copy: max prices by items from tab2
items sales prices brand prices_copy
------------------------------------
bolt 2 40 b2 40
bolt 8 20 b1 40 //max price in bolts regardless of the brand
cam 0 15 b1 50
cam 3 50 b2 50
cog 3 20 b1 60
cog 4 60 b2 60
nut 1 30 b2 30
nut 6 10 b1 30
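The contrast between a plain copy and a grouped aggregate spread back over the rows can be sketched in Python. This only imitates the results shown above and is not q; `copy_by_item` is a hypothetical helper introduced for illustration.

```python
# tab2 from the answer, as (items, sales, prices, brand) rows.
tab2 = list(zip(
    ["nut", "bolt", "cam", "cog"] * 2,
    [6, 8, 0, 3, 1, 2, 3, 4],
    [10, 20, 15, 20, 30, 40, 50, 60],
    ["b1"] * 4 + ["b2"] * 4,
))

def copy_by_item(rows, aggregate):
    """Aggregate prices per item, then attach the result to every row of that item."""
    groups = {}
    for item, _, price, _ in rows:
        groups.setdefault(item, []).append(price)
    per_item = {item: aggregate(prices) for item, prices in groups.items()}
    return sorted(row + (per_item[row[0]],) for row in rows)

# Plain copy: every row keeps its own price.
plain_copy = sorted(row + (row[2],) for row in tab2)

# Grouped variants: every row of an item gets that item's first or max price.
first_copy = copy_by_item(tab2, lambda prices: prices[0])
max_copy = copy_by_item(tab2, max)

print(first_copy[0])  # ('bolt', 2, 40, 'b2', 20)
print(max_copy[1])    # ('bolt', 8, 20, 'b1', 40)
```

Only the plain copy preserves each row's own price; the grouped versions are useful when the new column is meant to carry a per-item aggregate.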

take sum of similar column from multiple data table based on unique id in crystal report

I have four datatables like
Table 1
id name Afee Insfee
1 a 100 10
2 b 100 10
Table 2
id name Bfee Insfee
2 b 100 10
1 a 100 10
3 c 100 10
Table 3
id name Cfee Insfee
1 a 100 10
3 c 100 10
Table 4
id name Dfee Insfee
1 a 100 10
2 b 100 10
In the Crystal Report I want to get the result as:
Name Afee Bfee Cfee Dfee Insfee total
a 100 100 100 100 40 440
b 100 100 0 100 30 330
c 0 100 100 0 20 220
where Insfee should be the sum from all four tables for a particular ID, and total should be the sum of that row in the report.
How can I do this in SAP Crystal Reports?
To get the sum of Insfee, create a formula that adds the Insfee field from all the tables using the "+" sign, and place it next to Afee, Bfee, etc.
To get the total, create formulas for all the fee fields (Afee, Bfee, etc.); in the code below I named those a, a 1 and a 2.
Then create another formula for "total" containing the code below.
Place the formulas in the details section and you will get the result.
EvaluateAfter({#a});
EvaluateAfter({#a 1});
EvaluateAfter({#a 2});
{#a}+{#a 1}+{#a 2}
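As a cross-check of the expected numbers, the same roll-up can be written in plain Python. This is not Crystal Reports syntax, only a sketch of the arithmetic the report should reproduce; the `report` dictionary and loop structure are illustrative assumptions, and an id missing from a table contributes 0 for that table's fee.

```python
# The four tables from the question, as (id, name, fee, Insfee) rows.
table1 = [(1, "a", 100, 10), (2, "b", 100, 10)]
table2 = [(2, "b", 100, 10), (1, "a", 100, 10), (3, "c", 100, 10)]
table3 = [(1, "a", 100, 10), (3, "c", 100, 10)]
table4 = [(1, "a", 100, 10), (2, "b", 100, 10)]

fee_columns = ["Afee", "Bfee", "Cfee", "Dfee"]
report = {}

for fee_name, table in zip(fee_columns, [table1, table2, table3, table4]):
    for id_, name, fee, insfee in table:
        # Start every id with zero fees so missing tables show up as 0.
        row = report.setdefault(
            id_, {"Name": name, **{c: 0 for c in fee_columns}, "Insfee": 0}
        )
        row[fee_name] += fee
        row["Insfee"] += insfee  # Insfee accumulates across all tables

# total is the sum of the four fee columns plus the accumulated Insfee.
for row in report.values():
    row["total"] = sum(row[c] for c in fee_columns) + row["Insfee"]

for id_ in sorted(report):
    print(report[id_])
```

For the sample data this gives totals of 440, 330 and 220 for a, b and c, matching the desired report.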