AVG function in PostgreSQL ignores NULL values when it calculates the average. But what if I want to count the average value of multiple columns with many NULL values?
None of the commands below work:
AVG(col1,col2,col3)
AVG(col1)+AVG(col2)+AVG(col3) -> adding the averages gives the wrong value because of how NULLs are handled
This question is similar to "Average of multiple columns", but is there a simple solution for the PostgreSQL-specific case?
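For reference, here is a minimal Python sketch of the two things "average of multiple columns" could mean when NULLs are skipped the way AVG skips them: a per-row average across the columns, and a single pooled average over every non-NULL value. The sample rows and the helper names `row_avg` and `pooled_avg` are made up for illustration; this only shows the intended arithmetic and is not a PostgreSQL query.

```python
from typing import Optional, Sequence

Row = Sequence[Optional[float]]

def row_avg(row: Row) -> Optional[float]:
    """Average the non-NULL values of one row; None if every value is NULL."""
    known = [v for v in row if v is not None]
    return sum(known) / len(known) if known else None

def pooled_avg(rows: Sequence[Row]) -> Optional[float]:
    """Average every non-NULL value across all rows and columns."""
    known = [v for row in rows for v in row if v is not None]
    return sum(known) / len(known) if known else None

# Hypothetical sample data: (col1, col2, col3); None stands in for NULL.
rows = [
    (1.0, None, 3.0),
    (None, None, 6.0),
    (2.0, 4.0, None),
]

per_row = [row_avg(r) for r in rows]  # one average per row
overall = pooled_avg(rows)            # one average over all known values
print(per_row, overall)
```

Both results divide by the number of known values only, which is the behaviour the question expects from an AVG over several columns.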
Requirement -
In the picture attached, consider the first 3 columns as my raw data. Some rows have a NULL value in the quantity column, which is exactly what I want to fill in.
In an ideal case, I would fill any NULL value with the previous known value.
Spark Imputer seemed to be an easy-to-use library that can help me fill missing values.
But the issue is that Spark Imputer is limited to a mean or median calculated over all non-NULL values present in the data frame, so I don't get the desired result (4th column in the picture).
Logic -
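import org.apache.spark.ml.feature.Imputer

// Fill NULLs in "quantity" with the mean of all non-NULL values in the column
val imputer = new Imputer()
  .setInputCols(Array("quantity"))
  .setOutputCols(Array("quantity_imputed"))
  .setStrategy("mean")
val model = imputer.fit(new_combinedDf)
model.transform(new_combinedDf).show()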
Result -
Now, is it possible to limit the mean calculation for each NULL value to the mean of the last n values?
For example, for 2020-09-26, where we get the first NULL value, is it possible to tweak Spark Imputer to calculate the mean over the last n values only, instead of over all non-NULL values in the dataframe?
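The logic being asked for can be sketched outside Spark in plain Python: walk the column in date order and replace each missing value with the mean of the last n known values. This is only an illustration of the desired behaviour, not an Imputer setting. The function name `fill_with_trailing_mean`, the sample quantities, the choice to leave leading gaps unfilled, and the choice not to feed imputed values into later means are all assumptions.

```python
from collections import deque
from typing import List, Optional

def fill_with_trailing_mean(values: List[Optional[float]], n: int) -> List[Optional[float]]:
    """Replace each None with the mean of the last n known (original) values."""
    if n < 1:
        raise ValueError("n must be at least 1")
    window: deque = deque(maxlen=n)  # keeps only the n most recent known values
    filled: List[Optional[float]] = []
    for v in values:
        if v is None:
            # With no known value yet, the gap is left as None.
            filled.append(sum(window) / len(window) if window else None)
        else:
            window.append(v)
            filled.append(v)
    return filled

# Hypothetical quantity column, ordered by date; None marks a missing value.
quantity = [10.0, 20.0, 30.0, None, 40.0, None, None]

print(fill_with_trailing_mean(quantity, n=2))
print(fill_with_trailing_mean(quantity, n=1))
```

With n=1 the function reduces to filling each gap with the previous known value, which is the ideal case described above.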
I am trying to divide the average value of column1 by the average value of column 2, which will give me an average price from my data. I believe there is a problem with my syntax / structure of my code, or I am making a rookie mistake.
I have searched Stack Overflow and cannot find many examples of dividing two averaged columns, and I have checked the Postgres documentation.
The individual average query is working fine (as shown here)
SELECT (AVG(CAST("Column1" AS numeric(4,2))),2) FROM table1
But when I combine two of them in an attempt to divide, it simply does not work.
SELECT (AVG(CAST("Column1" AS numeric(4,2))),2) / (AVG(CAST("Column2" AS numeric(4,2))),2) FROM table1
I am receiving the following error: "ERROR: row comparison operator must yield type boolean, not type numeric". I have also tried a few other variations, most of which gave me syntax errors.
I don't know what you are trying to do with your current approach. However, if you want to take the ratio of two averages, you could also just take the ratio of the sums:
SELECT SUM(CAST("Column1" AS numeric(4,2))) / SUM(CAST("Column2" AS numeric(4,2)))
FROM table1;
Note that SUM() takes a single input, not two. The sums can stand in for the averages because an average divides both the numerator and the denominator by the same amount, the number of rows in table1, so that factor cancels out. This holds as long as neither column contains NULLs; AVG() skips NULLs, so with NULLs present the two averages could be divided by different counts.
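A quick way to check the cancellation argument is with a few numbers in Python. The values below are invented, and `avg` and `total` are hypothetical helpers that skip None the way AVG and SUM skip NULL. With no missing values both ratios agree; once one column has a NULL, the two averages are divided by different counts and the ratios drift apart.

```python
from typing import List, Optional

def avg(values: List[Optional[float]]) -> float:
    """Mean of the non-None values, mirroring how AVG skips NULLs."""
    known = [v for v in values if v is not None]
    return sum(known) / len(known)

def total(values: List[Optional[float]]) -> float:
    """Sum of the non-None values, mirroring how SUM skips NULLs."""
    return sum(v for v in values if v is not None)

# Invented data with no NULLs: both ratios are the same.
col1 = [10.0, 20.0, 30.0]
col2 = [2.0, 4.0, 6.0]
ratio_of_avgs = avg(col1) / avg(col2)      # 20 / 4
ratio_of_sums = total(col1) / total(col2)  # 60 / 12

# Same data with one NULL in col2: the row counts no longer cancel.
col2_with_null = [2.0, None, 6.0]
ratio_of_avgs_null = avg(col1) / avg(col2_with_null)      # 20 / 4
ratio_of_sums_null = total(col1) / total(col2_with_null)  # 60 / 8

print(ratio_of_avgs, ratio_of_sums, ratio_of_avgs_null, ratio_of_sums_null)
```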
I am trying to find the difference between 2 columns in Tableau. The catch here is that each column is ranked based on a value. The difference I need is between these 2 ranked columns.
The rank is computed using the table calculation RANK function. I am attaching a picture for more information.
I am assuming "current" and "prior" are calculated fields.
Just create a new calculated field; here I'll call it "Result". In this field, subtract one from the other, like so:
[Current] - [Prior]
Then pull this new field into Measure Values on your sheet.
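To make the arithmetic concrete, here is a rough Python sketch of what the "Result" field works out to per row. It assumes a descending rank where ties share a rank; the category names, the sample values, and the helper `competition_rank` are hypothetical and stand in for whatever the two ranked fields are actually computed from.

```python
from typing import Dict, List

def competition_rank(values: List[float]) -> List[int]:
    """Rank values in descending order; ties share a rank (1, 2, 2, 4)."""
    return [1 + sum(1 for other in values if other > v) for v in values]

# Hypothetical measure per category for two periods.
categories = ["X", "Y", "Z"]
current_values = [300.0, 100.0, 200.0]
prior_values = [150.0, 250.0, 50.0]

current_rank = competition_rank(current_values)
prior_rank = competition_rank(prior_values)

# "Result" is the row-by-row difference between the two ranks.
result: Dict[str, int] = {
    c: cur - pri for c, cur, pri in zip(categories, current_rank, prior_rank)
}
print(result)
```

A negative result here means the category moved up the ranking compared with the prior period.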
I can't seem to find a solution for my issue anywhere, and I hope that I can ask my question correctly here to find an answer:
I am having a problem finding a count of the number of values from only the distinct rows in my dataset.
In my Tableau sheet, I am trying to find the number of cases that are closed as "FCR" - and in the database, this is represented as true or false.
My formula:
SUM(if [IsFCR] = true then 1 else 0 end)
The problem that I am running into is that this counts across all of the rows in my data. What I need is the total number of distinct cases, with the FCR value counted once for each of THOSE cases.
As seen in this image: Screen capture of Tableau formula
I am returning 9,000 distinct rows - but 46,000 counts of FCR being true.
Notably, I have tried wrapping my formula in a COUNTD (and it returns 2), in Window_Count (returns 1), among many other guesses as to how I could wrap this calculated field in a way to only count the unique rows.
If only I could use a foreach logic...
Any help that you can give is greatly appreciated.
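The foreach logic wished for above amounts to collapsing the data to one row per case before counting. Here is a rough Python sketch of that idea. The `case_id` field, the sample rows, and the rule that a case counts as FCR if any of its rows is flagged are all assumptions, since the post does not show the underlying data.

```python
from typing import Dict, List, Tuple

# Hypothetical rows: (case_id, is_fcr). A case repeats once per detail row.
rows: List[Tuple[str, bool]] = [
    ("A", True), ("A", True), ("A", True),
    ("B", False), ("B", False),
    ("C", True),
]

# Counting over every row inflates the result, as in the question.
row_level_count = sum(1 for _, is_fcr in rows if is_fcr)

# Collapse to one flag per case first, then count the cases.
fcr_by_case: Dict[str, bool] = {}
for case_id, is_fcr in rows:
    fcr_by_case[case_id] = fcr_by_case.get(case_id, False) or is_fcr

distinct_cases = len(fcr_by_case)
distinct_fcr_cases = sum(1 for flag in fcr_by_case.values() if flag)

print(row_level_count, distinct_cases, distinct_fcr_cases)
```

In Tableau the same shape might be expressed as a distinct count of the case identifier restricted to FCR rows, for example COUNTD(IF [IsFCR] THEN [Case ID] END), where [Case ID] is a placeholder for whichever field identifies a case.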
I have a column with dollar amounts, such as:
$1.00
$4.00
$100.00
and so on. These values are put into the column by calculating a percentage of another column. I want to get the sum of all the dollar amounts in that column. I can't use Sum(data) because it's static data in the column. Is there an easy way to get the sum?
Thanks!
There are a couple of different ways to deal with this:
If it's the same percentage being applied to each row, sum the original column and apply the same percentage to that sum.
If the original source of the data is from a database, do the percent calculation in your Dataset's SQL query, then use the Sum function in your Tablix as usual.
If these don't apply, you'll need to give more detailed information about your problem.
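The first option relies on the percentage factoring out of the sum. Here is a small Python check with made-up source amounts and a made-up 10% rate; `Decimal` is used so the currency arithmetic stays exact.

```python
from decimal import Decimal

# Hypothetical source column and a single percentage applied to every row.
source_amounts = [Decimal("10.00"), Decimal("40.00"), Decimal("1000.00")]
rate = Decimal("0.10")

# Per-row values, i.e. what the report column displays.
per_row = [amount * rate for amount in source_amounts]

# Summing the displayed values equals applying the rate to the summed source.
sum_of_rows = sum(per_row, Decimal("0"))
rate_of_sum = sum(source_amounts, Decimal("0")) * rate

print(per_row, sum_of_rows, rate_of_sum)
```

If each row is rounded before it is displayed, the two totals can differ by the accumulated rounding error, so the equality only holds exactly for unrounded values.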