Tableau count number of Records that have the Max value for that field - aggregate

I have a field where I'd like to count the number of instances in which the field holds the max value for that column. For example, if the max value for a given column is 20, I want to know how many 20's are in that column. I've tried the following formula, but I received a "Cannot mix aggregate and non-aggregate arguments with this function." error:
IF [Field1] = MAX([Field1])
THEN 1
ELSE 0
END

Try
IF ATTR([Field1]) = MAX([Field1])
THEN 1
ELSE 0
END
ATTR() is an aggregation, which will allow you to compare aggregate and non-aggregate values. As long as the field you wrap in ATTR() contains a single unique value per partition, this won't have an impact on your data (when it doesn't, ATTR() returns *).
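If your version of Tableau supports Level of Detail expressions (9.0 or later), an alternative sketch avoids ATTR() entirely: compare each row against a table-scoped LOD maximum and sum the matches.
SUM(IF [Field1] = {MAX([Field1])} THEN 1 ELSE 0 END)
Here {MAX([Field1])} is computed once over the whole table, so the comparison stays row-level and the outer SUM counts how many rows hit the maximum.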


Remove duplicates from a query sqlalchemy postgres func.max group_by

I'm having a problem with one list. Field X.1 is duplicated, and I would like to use group_by and func.max to keep only the X.1 rows that have the max value of some column. There is not much to choose from; I would prefer the timestamp, but it is not working for me even with an int field. Do you know what I'm doing wrong?
q = self.session.query(
    X.0,  # object id
    X.1,  # these are duplicated; I want to keep only the rows with the max
    X.2,
    X.3,
    X.4,
    X.5,
    X.6,
    X.7,
    X.8,
    X.9,
    X.10  # this is the timestamp
).filter(X.3 == filter_value1)
q.group_by(
    X.0,  # object id
    X.1,  # these are duplicated; I want to keep only the rows with the max
    X.2,
    X.3,
    X.4,
    X.5,
    X.6,
    X.7,
    X.8,
    X.9,
    func.max(X.10))  # timestamp
q.all()
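One common pattern (a sketch, not a drop-in fix) is to compute the per-group maximum in a subquery and join back to it, rather than putting func.max() inside group_by(): aggregates belong in the select list, and group_by() returns a new query, so its result must be assigned (q = q.group_by(...)). Assuming an ORM model X with hypothetical attribute names key (the duplicated X.1), filter_col (X.3), and ts (the X.10 timestamp):

from sqlalchemy import func

# Per-key maximum timestamp; "key", "ts" and "filter_col" are
# hypothetical stand-ins for the numbered columns above.
latest = (
    session.query(X.key, func.max(X.ts).label("max_ts"))
    .group_by(X.key)
    .subquery()
)

# Join back so only the row carrying the max timestamp per key survives.
q = (
    session.query(X)
    .join(latest, (X.key == latest.c.key) & (X.ts == latest.c.max_ts))
    .filter(X.filter_col == filter_value1)
)
rows = q.all()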

Druid query count for multiple columns

I have a query that counts null values in a column. How can I adapt it to return the count of null values across multiple columns? I tried adding a list of fields, e.g. ['ip_address','user_agent'], to the dimension field, but this didn't work.
{"intervals":["2019-05-26T00:00:00.000Z/2019-06-25T00:00:00.000Z"],
"granularity":"all",
"context":{"timeout":60000,
"queryId":"71fe66b2-e654-45dc-8a8c-38ed160e79f5"},
"queryType":"timeseries",
"dataSource":"dataset-tablename”,
"aggregations":[{"type":"count",
"name":"count"}],
"filter":{"type":"and",
"fields":[{"type":"selector",
"dimension":"ip_address",
"value":"null"}]}}
This returns two columns:
Timestamp | Count
2019-04-27T04:55:01.000Z | 246,933
which is the count of ip_address records with null values in the timeframe. How can I return the counts for other additional fields?
You can use filtered aggregators:
{"intervals":["2019-05-26T00:00:00.000Z/2019-06-25T00:00:00.000Z"],
"granularity":"all",
"context":{"timeout":60000, "queryId":"71fe66b2-e654-45dc-8a8c-38ed160e79f5"},
"queryType":"timeseries",
"dataSource":"dataset-tablename",
"aggregations":[
{"type":"filtered", "filter":{"type":"selector", "dimension":"ip_address", "value":"null"},
"aggregator": {"type":"count", "name":"null_ip_address_count"}},
{"type":"filtered", "filter":{"type":"selector", "dimension":"user_agent", "value":"null"},
"aggregator": {"type":"count", "name":"null_user_agent_count"}}]
}
That is, instead of applying the filter to the entire query, apply the filter to individual aggregators.
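If the list of columns grows, the aggregator list can be generated; here is a minimal sketch in Python (the broker URL and dimension list are assumptions), posting the native query to the broker's /druid/v2 endpoint:

import json
import requests

BROKER_URL = "http://localhost:8082/druid/v2"  # hypothetical broker address

dimensions = ["ip_address", "user_agent"]  # add more columns here

query = {
    "queryType": "timeseries",
    "dataSource": "dataset-tablename",
    "intervals": ["2019-05-26T00:00:00.000Z/2019-06-25T00:00:00.000Z"],
    "granularity": "all",
    # one filtered aggregator per column, as in the answer above
    "aggregations": [
        {
            "type": "filtered",
            "filter": {"type": "selector", "dimension": d, "value": "null"},
            "aggregator": {"type": "count", "name": "null_%s_count" % d},
        }
        for d in dimensions
    ],
}

resp = requests.post(BROKER_URL, json=query, timeout=60)
print(json.dumps(resp.json(), indent=2))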

How to convert null rows to 0 and sum the entire column using DB2?

I'm using the following query to sum entire columns. In the TO_REMOVEALLPRIV column, I have both integer and null values.
I want to sum over both the null and the integer values and print the total.
Here is my query, which prints the sum as null:
select
sum(URT.PRODSYS) as URT_SUM_PRODSYS,
sum(URT.Users) as URT_SUM_USERS,
sum(URT.total_orphaned) as URT_SUM_TOTAL_ORPHANED,
sum(URT.Bp_errors) as URT_SUM_BP_ERRORS,
sum(URT.Ma_errors) as URT_SUM_MA_ERRORS,
sum(URT.Pp_errors) as URT_SUM_PP_ERRORS,
sum(URT.REQUIREURTCBN) as URT_SUM_CBNREQ,
sum(URT.REQUIREURTQEV) as URT_SUM_QEVREQ,
sum(URT.REQUIREURTPRIV) as URT_SUM_PRIVREQ,
sum(URT.cbnperf) as URT_SUM_CBNPERF,
sum(URT.qevperf) as URT_SUM_QEVPERF,
sum(URT.privperf) as URT_SUM_PRIVPERF,
sum(URT.TO_REMOVEALLPRIV) as TO_REMOVEALLPRIV_SUM
from
URTCUSTSTATUS URT
inner join CUSTOMER C on URT.customer_id=C.customer_id;
Output: TO_REMOVEALLPRIV_SUM comes back as null.
Expected output: instead of null, I need the sum of whichever rows have integers.
The SUM function automatically handles that for you. You said the column has a mix of NULL and numbers; SUM ignores the NULL values and returns the sum of the numbers. You can read it in the IBM Knowledge Center:
The function is applied to the set of values derived from the argument values by the elimination of null values.
Note: all aggregate functions ignore NULL values except COUNT(*), which counts rows rather than values. Example: if you have two records with values 5 and NULL, the SUM and AVG functions both return 5, COUNT of the column returns 1, and COUNT(*) returns 2.
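You can see this with a quick experiment (a sketch using Db2's VALUES table constructor):
select sum(v) as sum_v, avg(v) as avg_v,
       count(v) as count_v, count(*) as count_all
from (values 5, cast(null as integer)) as t(v);
-- sum_v = 5, avg_v = 5, count_v = 1, count_all = 2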
However, it seems that you misunderstood why you're getting NULL as a result. It's not because the column contains null values; it's because there are no records selected. That is the only case in which the SUM function returns NULL. If you want to return zero in this case, you can use the COALESCE or IFNULL function. Both behave the same in this scenario:
COALESCE(sum(URT.TO_REMOVEALLPRIV), 0) as TO_REMOVEALLPRIV_SUM
or
IFNULL(sum(URT.TO_REMOVEALLPRIV), 0) as TO_REMOVEALLPRIV_SUM
I'm guessing that you want to do the same to all other columns in your query, so I'm not sure why you only complained about the TO_REMOVEALLPRIV column.
What you're looking for is the COALESCE function:
select
sum(URT.PRODSYS) as URT_SUM_PRODSYS,
sum(URT.Users) as URT_SUM_USERS,
sum(URT.total_orphaned) as URT_SUM_TOTAL_ORPHANED,
sum(URT.Bp_errors) as URT_SUM_BP_ERRORS,
sum(URT.Ma_errors) as URT_SUM_MA_ERRORS,
sum(URT.Pp_errors) as URT_SUM_PP_ERRORS,
sum(URT.REQUIREURTCBN) as URT_SUM_CBNREQ,
sum(URT.REQUIREURTQEV) as URT_SUM_QEVREQ,
sum(URT.REQUIREURTPRIV) as URT_SUM_PRIVREQ,
sum(URT.cbnperf) as URT_SUM_CBNPERF,
sum(URT.qevperf) as URT_SUM_QEVPERF,
sum(URT.privperf) as URT_SUM_PRIVPERF,
sum(COALESCE(URT.TO_REMOVEALLPRIV,0)) as TO_REMOVEALLPRIV_SUM
from
URTCUSTSTATUS URT
inner join CUSTOMER C on URT.customer_id=C.customer_id;

Min value with GROUP BY in Power BI Desktop

id datetime new_column datetime_rankx
1 12.01.2015 18:10:10 12.01.2015 18:10:10 1
2 03.12.2014 14:44:57 03.12.2014 14:44:57 1
2 21.11.2015 11:11:11 03.12.2014 14:44:57 2
3 01.01.2011 12:12:12 01.01.2011 12:12:12 1
3 02.02.2012 13:13:13 01.01.2011 12:12:12 2
3 03.03.2013 14:14:14 01.01.2011 12:12:12 3
I want to create a new column that holds the minimum datetime value for each group of rows sharing an id, as shown in new_column above. How can I do that in Power BI Desktop using DAX?
Use this expression:
NewColumn =
CALCULATE(
    MIN(Table[datetime]),
    FILTER(Table, Table[id] = EARLIER(Table[id]))
)
In Power BI, applied to a table with your data, this produces the new_column values shown in the sample above.
UPDATE: Explanation and EARLIER function usage.
Basically, the EARLIER function gives you access to the values of a different row context.
When you use the CALCULATE function, it creates a row context over the whole table; theoretically, it iterates over every table row. The same happens when you use the FILTER function: it iterates over the whole table and evaluates every row against the filter condition.
So far we have two row contexts: the one created by CALCULATE and the one created by FILTER. Note that FILTER uses EARLIER to get access to CALCULATE's row context. That said, in our case, for every row in the outer (CALCULATE's) row context, FILTER returns the set of rows that match the current id in the outer context.
If you have a programming background, this may make it clearer: it is similar to a nested loop.
I hope this Python code conveys the main idea:
outer_context = ['row1', 'row2', 'row3', 'row4']
inner_context = ['row1', 'row2', 'row3', 'row4']

for outer_row in outer_context:        # CALCULATE's row context
    for inner_row in inner_context:    # FILTER's row context
        if inner_row == outer_row:     # this line is what FILTER and EARLIER do
            # calculate the min datetime using the filtered rows
            ...
UPDATE 2: Adding a ranking column.
To get the desired rank you can use this expression:
RankColumn =
RANKX(
    CALCULATETABLE(Table, ALLEXCEPT(Table, Table[id])),
    Table[datetime],
    Table[datetime],
    1
)
This yields the rank values shown in the datetime_rankx column of the sample data above.
Let me know if this helps.
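For reference, an equivalent way to write the rank column uses FILTER with EARLIER instead of CALCULATETABLE/ALLEXCEPT (a sketch under the same table-name assumptions):
RankColumn =
RANKX(
    FILTER(Table, Table[id] = EARLIER(Table[id])),
    Table[datetime],
    ,
    ASC
)
Both versions restrict the ranking to the rows that share the current row's id; ASC makes the oldest datetime rank 1, matching the sample data.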

Numeric range data type in PostgreSQL

I have a strange situation in the design of my DB: the type of value of a field can be either a single integer or a number inside a range. Let me explain with an example: the column age can hold a single number (18) or a range (18-30). How can I represent this in PostgreSQL?
Thanks!
An integer range can represent both a single integer value and a range. The single value:
select int4range(18,18,'[]');
int4range
-----------
[18,19)
The ")" in the result above means exclusive.
The range:
select int4range(18,30,'[]');
int4range
-----------
[18,31)
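A nice property of this representation is that a single containment check then covers both the single-value case and the range case:
select int4range(18,30,'[]') @> 21;
 ?column?
-----------
 t
The same @> test against int4range(18,18,'[]') is true only for 18.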
There are a couple of different ways to do this:
Store a VARCHAR.
Store two values, a lower bound and an upper bound (see the sketch below).
If there is only a select set of ranges, create a lookup table for that set and store a foreign key to it.
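A minimal sketch of the two-bound option (table and column names hypothetical); a single value is simply stored with equal bounds:
create table person (
    id       serial primary key,
    age_low  integer not null,
    age_high integer not null,
    check (age_low <= age_high)
);
insert into person (age_low, age_high) values (18, 18);  -- just 18
insert into person (age_low, age_high) values (18, 30);  -- the range 18-30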
You can pack both numbers into one bigger number, for example 18 x 1000 + 0 = 18000 for 18, and 18 x 1000 + 30 = 18030 for (18, 30). When you retrieve it, you take first = floor(number / 1000) for the first number and second = number - first x 1000 for the second.
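A minimal sketch of that round trip in Python, assuming the second number is always less than 1000:
SCALE = 1000

def encode(low, high=0):
    # 18 -> 18000; (18, 30) -> 18030
    return low * SCALE + high

def decode(number):
    low = number // SCALE         # integer division, not round()
    high = number - low * SCALE   # subtract low * SCALE, not just low
    return low, high

assert encode(18) == 18000
assert encode(18, 30) == 18030
assert decode(18030) == (18, 30)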
You can also store them as a point: http://www.postgresql.org/docs/9.4/static/datatype-geometric.html#AEN6730