How to roll numbers up in Tableau for aggregation? - tableau-api

I have a data structure issue: I need to roll up my data within Tableau so that aggregated numbers are not skewed by IDs that appear on more rows than others.
Example of current data
ID Model_Number Value
123 fff 2
123 ggg 2
123 hhh 2
123 uuu 2
124 yyy 5
124 qqq 5
124 eee 5
Avg: NA 3.29
Ideal state of data and aggregation
ID Value
123 2
124 5
Avg 3.5
As you can see, since the data is at two different grains, the aggregated number (avg) will be different. I would like to roll up my numbers to the distinct value of ID and then calculate my average, which will result in a different (correct in my context) aggregated number.

Here is one calculated field that could help.
{ FIXED [ID] : AVG([Value]) }
This gives you the average value per ID. You can then use a grand total (avg) to get 3.5.
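To make the two-step logic concrete, here is a minimal Python sketch of the same arithmetic on the sample rows. It is not Tableau code; the variable names are made up for illustration, and it only mirrors what the FIXED expression plus a grand-total average would compute.

```python
from collections import defaultdict
from statistics import mean

# Sample rows from the question: (ID, Model_Number, Value)
rows = [
    (123, "fff", 2), (123, "ggg", 2), (123, "hhh", 2), (123, "uuu", 2),
    (124, "yyy", 5), (124, "qqq", 5), (124, "eee", 5),
]

# Row-level average: every model row counts once, so ID 123 weighs more.
row_level_avg = mean([value for _, _, value in rows])

# Step 1 (the role of FIXED [ID]): one average per ID.
values_by_id = defaultdict(list)
for id_, _, value in rows:
    values_by_id[id_].append(value)
avg_by_id = {id_: mean(vals) for id_, vals in values_by_id.items()}

# Step 2 (the role of the grand total): average the per-ID values.
rolled_up_avg = mean(list(avg_by_id.values()))

print(row_level_avg, rolled_up_avg)
```

The row-level figure is 23/7 (about 3.29), while the rolled-up figure is 3.5, because each ID contributes exactly once in the second step.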

Related

Is it possible in Postgresql to use one column as source of names of columns?

Suppose we have a table with the fields ID, type, date, any_value,
and it holds values such as these (select * from my_table):
ID type date any_value
1 alfa 2022-01-01 50
2 beta 2022-01-01 70
3 alfa 2022-01-02 111
4 beta 2022-01-02 444
...
Is it possible to write a query that produces the following result (and if so, how)?
date alfa beta
2022-01-01 50 70
2022-01-02 111 444
....
Of course, this should happen automatically, not by writing a simple "left join" for each column: the set of values in the "type" column may differ from month to month (one set in May, another in July), and it should be a single query.
(If "gamma" appears in the "type" column, there should be a "gamma" column after the beta column; if "delta" appears, a "delta" column.)

Select row with one column for each row in another select

An existing query returns the following table:
UserID Sector Value
1 1 111
1 2 122
1 3 133
2 2 222
2 3 233
3 1 311
But I would like to "reformat" it in the following way:
UserID  Sector 1  Sector 2  Sector 3
1       111       122       133
2                 222       233
3       311
The maximum number of sectors is variable. Since I am new to SQL, I am not sure whether this is database-specific, so a solution that works for PostgreSQL is appreciated.
If this is something that should not be done in the database, that is also okay. I am still figuring out what belongs in the database and what does not.
I know the title is not good. Please suggest a more precise one if you have one.
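Unfortunately, you cannot have a dynamic number of columns: the output column list must be spelled out in the query.
Here is the SQL for the sample data you've provided. You must have the tablefunc extension installed. The second argument lists the sector values, so a user with a missing sector gets NULL in that column instead of having later values shifted left.
SELECT *
FROM crosstab(
'select userid, sector, value
from user_sector
order by 1,2',
'select generate_series(1,3)')
AS ct(userid int, sector1 int, sector2 int, sector3 int);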
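Since the question leaves open doing this outside the database, here is a minimal Python sketch of the same pivot done on the client side. The `pivot` helper and the hard-coded rows are hypothetical, standing in for whatever the existing query returns; the point is that the number of sector columns can be discovered from the data at run time, and that missing cells stay empty rather than shifting.

```python
# Rows as returned by the existing query: (userid, sector, value)
rows = [
    (1, 1, 111), (1, 2, 122), (1, 3, 133),
    (2, 2, 222), (2, 3, 233),
    (3, 1, 311),
]

def pivot(rows, sectors):
    """Return {userid: [value per sector, or None where no row exists]}."""
    column_of = {sector: i for i, sector in enumerate(sectors)}
    table = {}
    for userid, sector, value in rows:
        cells = table.setdefault(userid, [None] * len(sectors))
        cells[column_of[sector]] = value
    return table

# Discover the sector columns from the data instead of hard-coding them.
sectors = sorted({sector for _, sector, _ in rows})
result = pivot(rows, sectors)

print("UserID", *[f"Sector {s}" for s in sectors])
for userid in sorted(result):
    print(userid, *result[userid])
```

User 2 has no row for sector 1, so that cell is None while 222 and 233 stay under sectors 2 and 3.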

Cannot access the value of the kdb table column

I am new to KDB and cannot understand why I can access the order column for the table stock but not for trader. Below is my code with the error.
q)trader
item brand | price order
---------------| -----------
soda fry | 1.5 200
bacon prok | 1.99 180
mushroom veggie| 0.88 110
eggs veggie| 1.55 210
tomatoes veggie| 1.35 100
q)trader.order
'order
[0] trader.order
^
q)stock.order
50 82 45 92
q)stock
item brand price order
---------------------------
soda fry 1.5 50
bacon prok 1.99 82
mushroom veggie 0.88 45
eggs veggie 1.55 92
q)trader.order
'order
[0] trader.order
^
Your table trader is keyed, so you cannot use trader.order to select the order column.
You can use this instead if you want:
(0!trader)`order
The reason is that trader.order is really an indexing operation, the same as list.index. A simple table behaves like a list of dictionaries, and the dot (.) indexes into it by column name. A keyed table does not have that structure (it is a dictionary that maps a table of keys to a table of values), so you have to unkey it first.
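As a loose analogy only, the difference can be modelled in Python. This is not how q stores tables internally; the `column` and `unkey` helpers are hypothetical names chosen to mirror column indexing and `0!`.

```python
# A simple table, modelled as a list of row dictionaries.
stock = [
    {"item": "soda", "brand": "fry", "price": 1.5, "order": 50},
    {"item": "bacon", "brand": "prok", "price": 1.99, "order": 82},
    {"item": "mushroom", "brand": "veggie", "price": 0.88, "order": 45},
    {"item": "eggs", "brand": "veggie", "price": 1.55, "order": 92},
]

# A keyed table, modelled as a mapping from key rows to value rows.
trader = {
    ("soda", "fry"): {"price": 1.5, "order": 200},
    ("bacon", "prok"): {"price": 1.99, "order": 180},
    ("mushroom", "veggie"): {"price": 0.88, "order": 110},
    ("eggs", "veggie"): {"price": 1.55, "order": 210},
    ("tomatoes", "veggie"): {"price": 1.35, "order": 100},
}

def column(table, name):
    """Index a simple table by column name (the role of stock.order)."""
    return [row[name] for row in table]

def unkey(keyed, key_names):
    """Flatten a keyed table into a simple one (the role of 0!)."""
    return [dict(zip(key_names, key), **value) for key, value in keyed.items()]

stock_orders = column(stock, "order")
# "order" is not one of trader's keys, so it must be unkeyed first.
trader_orders = column(unkey(trader, ("item", "brand")), "order")
print(stock_orders, trader_orders)
```

In this model, looking up "order" directly in `trader` fails for the same reason as in q: the top-level keys are (item, brand) pairs, not column names.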

Compare number of instances in a column for two different dates in SAS

I have the following dataset (items) with transactions on a given date and the amount paid on the next business day.
The amount paid on the next business day is $10 for each row of an id whose rate is > 5.
My task is to check that the number of instances where rate > 5 is in line with the amount paid on the next business day (the payment row has the standard code 121).
For instance, there are four instances with rate > 5 on 4/14/2017, so the amount $40 (4*10) is paid on 4/17/2017.
Date id rate code batch
4/14/2017 1 12 100 A1
4/14/2017 1 2 101 A1
4/14/2017 1 13 101 A1
4/14/2017 1 10 100 A1
4/14/2017 1 10 100 A1
4/17/2017 1 40 121
4/20/2017 2 12 100 A1
4/20/2017 2 2 101 A1
4/20/2017 2 3 101 A1
4/20/2017 2 10 100 A1
4/20/2017 2 10 100 A1
4/21/2017 2 30 121
My code
proc sql;
create table items2 as select
count(id) as id_count,
(case when code='121' then rate/10 else 0 end) as rate_count
from items
group by date,id;
quit;
This has not yielded the desired result and the challenge I have here is to check the transaction dates (4/14/2017 and 4/20/2017) and next business day dates (4/17/2017,4/21/2017).
Appreciate your help.
The LAG function will do the trick here: we can use lagged values to build the condition we want without having to test rate>5 directly.
Here is the solution:
Data items;
set items;
Lag_dt=LAG(Date);
Lag_id=LAG(id);
Lag_rate=LAG(rate);
if ((id=lag_id) and (code=121) and (Date>lag_dt)) then rate_count=(rate/lag_rate);
else rate_count=0;
Drop lag_dt lag_id lag_rate;
run;
Hope this helps.
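For readers who do not use SAS, the lag logic above can be sketched in Python. This is only a model of the data step on the sample rows: `add_rate_count` is a hypothetical name, `prev` plays the role of the LAG values, and the batch column is omitted. Note that the division uses the previous row's rate, which happens to be 10 in the sample data.

```python
from datetime import date

# Sample rows from the question: (date, id, rate, code)
items = [
    (date(2017, 4, 14), 1, 12, 100),
    (date(2017, 4, 14), 1, 2, 101),
    (date(2017, 4, 14), 1, 13, 101),
    (date(2017, 4, 14), 1, 10, 100),
    (date(2017, 4, 14), 1, 10, 100),
    (date(2017, 4, 17), 1, 40, 121),
    (date(2017, 4, 20), 2, 12, 100),
    (date(2017, 4, 20), 2, 2, 101),
    (date(2017, 4, 20), 2, 3, 101),
    (date(2017, 4, 20), 2, 10, 100),
    (date(2017, 4, 20), 2, 10, 100),
    (date(2017, 4, 21), 2, 30, 121),
]

def add_rate_count(items):
    """Append rate_count to each row, comparing it with the previous row."""
    out = []
    prev = None  # previous row, like LAG(); None before the first row
    for row in items:
        day, id_, rate, code = row
        if prev is not None and id_ == prev[1] and code == 121 and day > prev[0]:
            rate_count = rate / prev[2]  # payment divided by the lagged rate
        else:
            rate_count = 0
        out.append(row + (rate_count,))
        prev = row
    return out

result = add_rate_count(items)
for row in result:
    print(row)
```

Only the two payment rows (code 121) get a non-zero rate_count, which can then be compared with the number of rate > 5 rows on the preceding transaction date.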

Difference between rows in KDB/Q

I'm new to KDB/Q and have a question around getting the difference between two (not necessarily adjacent) rows.
I have only one table, which looks like the below:
q)tickers:`ibm`bac`dis`gs`ibm`gs`dis`bac
q)pxs:100 50 30 250 110 240 45 48
q)dates:2013.05.01 2013.01.05 2013.02.03 2013.02.11 2013.06.17 2013.06.21 2013.04.24 2013.01.06
q)trades:([tickers;dates];pxs)
q)trades
tickers dates | pxs
------------------| ---
ibm 2013.05.01| 100
bac 2013.01.05| 50
dis 2013.02.03| 30
gs 2013.02.11| 250
ibm 2013.06.17| 110
gs 2013.06.21| 240
dis 2013.04.24| 45
bac 2013.01.06| 48
I would like either another column in the table that stores the difference between the current and the previous price, or another structure of similar shape. The key question the result needs to answer is "by how much did the stock change compared to the previous time a price was recorded?"
So far I've tried something along the lines of:
select tickers, dates, pxs - pxs(dates bin (exec dates from trades where tickers = trades.tickers)) from trades
which doesn't work at all, most likely because I'm trying to write SQL-like queries with a row-oriented mindset.
Below is an example of the sought-after answer:
q)trades: do magic with trades
q)trades
tickers dates | pxs | delta
------------------| --- | -----
ibm 2013.05.01| 100 | 0
bac 2013.01.05| 50 | 0
dis 2013.02.03| 30 | 0
gs 2013.02.11| 250 | 0
ibm 2013.06.17| 110 | 10
gs 2013.06.21| 240 | -10
dis 2013.04.24| 45 | 15
bac 2013.01.06| 48 | -2
Thanks for your help,
Dan
q)update delta:{0,1_deltas x}pxs by tickers from trades
tickers dates | pxs delta
------------------| ---------
ibm 2013.05.01| 100 0
bac 2013.01.05| 50 0
dis 2013.02.03| 30 0
gs 2013.02.11| 250 0
ibm 2013.06.17| 110 10
gs 2013.06.21| 240 -10
dis 2013.04.24| 45 15
bac 2013.01.06| 48 -2
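The grouping logic behind this update can be sketched in Python for comparison. This is a model of the idea rather than q semantics in full: `deltas_by_ticker` is a hypothetical name, and, like the q expression, it relies on rows appearing in chronological order within each ticker rather than sorting by date.

```python
# Rows from the question: (ticker, date, price)
trades = [
    ("ibm", "2013.05.01", 100),
    ("bac", "2013.01.05", 50),
    ("dis", "2013.02.03", 30),
    ("gs", "2013.02.11", 250),
    ("ibm", "2013.06.17", 110),
    ("gs", "2013.06.21", 240),
    ("dis", "2013.04.24", 45),
    ("bac", "2013.01.06", 48),
]

def deltas_by_ticker(trades):
    """Append the change versus the previous price seen for the same ticker."""
    last_px = {}  # most recent price per ticker
    out = []
    for ticker, day, px in trades:
        # The first record for a ticker has no predecessor, so its delta is 0.
        delta = px - last_px[ticker] if ticker in last_px else 0
        out.append((ticker, day, px, delta))
        last_px[ticker] = px
    return out

result = deltas_by_ticker(trades)
for row in result:
    print(row)
```

Each row keeps its original position, which matches how update .. by writes the grouped results back into the table.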
If you do:
select pxs by tickers from trades
you will have a nested column (pxs) holding the list of prices for each ticker. You can then apply deltas:
select deltas pxs by tickers from trades
which gives the running difference per ticker. The first value in each list is the original price, though, so you'll need to set it to 0.
EDIT
Just re-read and having looked at your result, you'll need to join back to your original trade table
update delta:(0N,(1_ pxs) - (-1_ pxs)) by tickers from trades
Here is how it works:
select pxs by tickers from trades
creates a table in which each row contains a ticker and its list of pxs.
So in every row we have a list:
tickers| pxs
-------| -------
bac | 50 48
dis | 30 45
gs | 250 240
ibm | 100 110
Now we have to apply a function that calculates the delta. The best choice is the deltas function mentioned above, but my version does about the same.
If we use select, we get a table with tickers, a list of pxs, and a list of deltas; if we use update .. by, the grouped values are ungrouped back into the original rows.
You can get the same results using the prev function. One thing worth highlighting is that prev automatically adds a null (0N) as the first element. This matters because no previous price is available for the first record, whereas putting 0 there suggests that the price did not change. Which one you want depends on how you need to handle the first record.
q)update delta:pxs-prev[pxs] by tickers from trades
tickers dates | pxs delta
------------------| ---------
ibm 2013.05.01| 100
bac 2013.01.05| 50
dis 2013.02.03| 30
gs 2013.02.11| 250
ibm 2013.06.17| 110 10
gs 2013.06.21| 240 -10
dis 2013.04.24| 45 15
bac 2013.01.06| 48 -2
Using deltas to get the same results (0N instead of 0):
q)update delta:{0N,1_deltas x}pxs by tickers from trades