In Python, None != 1 returns True.
But why does "Null_column" != 1 evaluate to false in PySpark?
Example:
from pyspark.sql.functions import lit  # lit() builds a literal column

data = [(1, 5), (2, 5)]
columns = ["id", "test"]
df_null = spark.createDataFrame(data, columns)
df_null = df_null.withColumn("nul_val", lit(None))
df_null.printSchema()
df_null.show()
But df_null.filter(df_null.nul_val != 1).count() returns 0.
See NULL Semantics (Spark 3.0.0 documentation) for how to handle comparisons with null in Spark.
To summarize: in Spark, null means undefined, so any comparison with null evaluates to null rather than to True or False. filter keeps only the rows where the condition is True, and since null is not True, the count in your case is 0.
Apache spark supports the standard comparison operators such as ‘>’, ‘>=’, ‘=’, ‘<’ and ‘<=’. The result of these operators is unknown or NULL when one of the operands or both the operands are unknown or NULL.
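To make the rule concrete, here is a minimal pure-Python sketch of that behaviour. It is only a model, not Spark code: None stands in for NULL, and sql_ne and sql_filter are hypothetical helper names invented for the illustration.

```python
from typing import Callable, List, Optional


def sql_ne(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """Model of SQL's `a != b`: the result is NULL (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a != b


def sql_filter(rows: List[Optional[int]],
               predicate: Callable[[Optional[int]], Optional[bool]]) -> List[Optional[int]]:
    """Model of WHERE/filter: a row is kept only when the predicate is exactly True."""
    return [row for row in rows if predicate(row) is True]


rows = [None, 1, 2]
kept = sql_filter(rows, lambda value: sql_ne(value, 1))
print(kept)  # [2]
```

The None row is dropped for the same reason the real filter returns 0 rows: its comparison result is "unknown", and unknown is not True.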
If you want to test equality against a column that might contain null, use the null-safe equality operator <=>. It never returns null: it returns False if exactly one operand is null and True if both are null:
In order to compare the NULL values for equality, Spark provides a null-safe equal operator (‘<=>’), which returns False when one of the operand is NULL and returns True when both the operands are NULL.
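As a rough illustration, the truth table of a null-safe equality can be modelled in plain Python. null_safe_eq is a hypothetical helper written for this sketch, with None standing in for NULL; in PySpark itself the corresponding Column method is eqNullSafe.

```python
from typing import Optional


def null_safe_eq(a: Optional[int], b: Optional[int]) -> bool:
    """Model of `a <=> b`: always True or False, never NULL."""
    if a is None and b is None:
        return True   # both NULL: considered equal
    if a is None or b is None:
        return False  # exactly one NULL: not equal
    return a == b     # no NULL: ordinary equality


print(null_safe_eq(None, None))  # True
print(null_safe_eq(None, 1))     # False
print(null_safe_eq(1, 1))        # True
```

Because the result is never unknown, negating it is also safe: "not null_safe_eq(value, 1)" is True for a NULL value.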
So, back to your problem. To solve it, I would combine a null check with the comparison with 1:
df_null.filter((df_null.nul_val.isNull()) | (df_null.nul_val != 1)).count()
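The filter above works because OR in three-valued logic is True as soon as one side is True, even if the other side is unknown. A small pure-Python model of that, assuming None stands in for NULL (sql_ne and sql_or are hypothetical helpers, not Spark functions):

```python
from typing import Optional


def sql_ne(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """Model of SQL's `a != b`: NULL (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a != b


def sql_or(p: Optional[bool], q: Optional[bool]) -> Optional[bool]:
    """Model of SQL's three-valued OR."""
    if p is True or q is True:
        return True   # one True side decides the result
    if p is None or q is None:
        return None   # no True side and at least one unknown
    return False


values = [None, 1, 5]
# Equivalent of: nul_val IS NULL OR nul_val != 1
kept = [v for v in values if sql_or(v is None, sql_ne(v, 1)) is True]
print(kept)  # [None, 5]
```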
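Another solution would be to replace null with 0, if that does not break any other logic. Note that fill only touches columns whose type matches the replacement value, so nul_val needs a numeric type (for example lit(None).cast("int")) rather than the untyped null created above:
df_null.na.fill(value=0, subset=["nul_val"]).filter("nul_val != 1").count()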
SELECT * FROM Entity e WHERE e.Status <> ANY(ARRAY[1,2,3]);
Here Status is a nullable integer column. Using the above query, I am unable to fetch the records whose Status value is NULL.
SELECT * FROM Entity e WHERE (e.Status is NULL OR e.Status = 4);
This query does the trick. Could someone explain why the first query does not work as expected?
NULL kinda means "unknown", so the expressions
NULL = NULL
and
NULL != NULL
are neither true nor false, they're NULL. Because it is not known whether an "unknown" value is equal or unequal to another "unknown" value.
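You can watch this happen from Python's built-in sqlite3 module. This is only a quick sketch: SQLite is used here as a stand-in because it follows the same rule for = and !=, and it reports NULL as None and true as 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Both comparisons come back as NULL (None); only IS NULL gives a definite answer.
row = conn.execute("SELECT NULL = NULL, NULL != NULL, NULL IS NULL").fetchone()
print(row)  # (None, None, 1)
```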
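Since <> ANY compares e.Status with each array element using <>, a NULL Status makes every one of those comparisons NULL, so the result of ANY is NULL and the row is filtered out.
So your second query, with its explicit IS NULL check, is correct.
It is spelled out in the docs under Array ANY: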
If the array expression yields a null array, the result of ANY will be null. If the left-hand expression yields null, the result of ANY is ordinarily null (though a non-strict comparison operator could possibly yield a different result). Also, if the right-hand array contains any null elements and no true comparison result is obtained, the result of ANY will be null, not false (again, assuming a strict comparison operator). This is in accordance with SQL's normal rules for Boolean combinations of null values.
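The quoted rules can be sketched as a small Python model. This is not how PostgreSQL implements it, just the same decision table with None standing in for NULL; strict_ne and sql_any are hypothetical names made up for the example.

```python
from typing import Callable, Optional, Sequence


def strict_ne(a: Optional[int], b: Optional[int]) -> Optional[bool]:
    """Model of a strict `<>`: NULL (None) if either operand is NULL."""
    if a is None or b is None:
        return None
    return a != b


def sql_any(left: Optional[int],
            op: Callable[[Optional[int], Optional[int]], Optional[bool]],
            array: Optional[Sequence[Optional[int]]]) -> Optional[bool]:
    """Model of `left op ANY(array)` for a strict comparison operator."""
    if array is None:
        return None                      # null array: result is null
    results = [op(left, element) for element in array]
    if any(r is True for r in results):
        return True                      # one true comparison is enough
    if any(r is None for r in results):
        return None                      # no true result, but some unknown
    return False


print(sql_any(None, strict_ne, [1, 2, 3]))  # None: the row is filtered out
print(sql_any(4, strict_ne, [1, 2, 3]))     # True
print(sql_any(1, strict_ne, [1, None]))     # None
print(sql_any(1, strict_ne, [1]))           # False
```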
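FYI: for a null-safe inequality test, a condition such as
e.Status IS NULL OR e.Status <> 1
can be shortened to:
e.Status IS DISTINCT FROM 1
per Comparison operators. Note that e.Status IS NOT DISTINCT FROM 4 is not a shorthand for e.Status IS NULL OR e.Status = 4: it is false when e.Status is NULL.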
When I run the program, it shows "Failed to build basics1_dart:basics1_dart:
bin/basics1_dart.dart:38:12: Error: The operator '<' isn't defined for the class 'String?'.
Try correcting the operator to an existing operator, or defining a '<' operator.
if(num1<0)"
What should I do?
You are asking about the String? type, which is a String that may also hold null.
There is a comment from @jamesdlin under the question that could almost be an answer, but the < operator does not exist for the non-nullable String type either. You can use the compareTo function instead to compare it with another String: if (num1 != null && num1.compareTo("0") < 0)
You can compare your value easily as long as it does not actually contain null. As @jamesdlin said, you can check the value against null first. If you are sure it is not null, you can also use the ! operator as a compact cast to the non-nullable type: if (num1!.compareTo("0") < 0). This cast throws an exception if the value is null, so use it only when you are sure.
I also wonder why num1 is a nullable string when you compare it with a number. Maybe you need to convert it to a numeric type first? You can use int.parse(String) for that, either storing the parsed integer in another variable or using it directly in the comparison: if (int.parse(num1!) < 0)
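The same "check for null, then parse, then compare" order can be sketched in Python for comparison. This is an analogue rather than Dart code, and is_negative is a hypothetical helper; Optional[str] plays the role of String?. It also shows why comparing the strings themselves is risky: string comparison is character by character, not numeric.

```python
from typing import Optional


def is_negative(raw: Optional[str]) -> bool:
    """Return True only if raw is present and parses to a negative integer."""
    if raw is None:
        return False       # nothing to compare, so not negative
    return int(raw) < 0    # raises ValueError if raw is not an integer


print(is_negative("-5"))   # True
print(is_negative("10"))   # False
print(is_negative(None))   # False

# Comparing the strings directly is lexicographic, which is not numeric order:
print("10" < "9")          # True
```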
I have a requirement to load null if the total hours is less than the previous total hours, and the difference otherwise:
iif(lesser(TOTAL_HOURS, PREVIOUS_TOTAL_HOURS),null(),TOTAL_HOURS-PREVIOUS_TOTAL_HOURS)
It gives me "expression could not be evaluated".
Not all rows have values for these fields; some of them are null. They are numeric fields in the database.
I just want to replace negative results with null.
If you look at the documentation for iif, it says:
iif(<condition> : boolean, <true_expression> : any, [<false_expression> : any]) => any
Based on a condition applies one value or the other. If other is unspecified it is considered NULL. Both the values must be compatible(numeric, string...).
Now as per your expression:
iif(lesser(TOTAL_HOURS, PREVIOUS_TOTAL_HOURS),null(),TOTAL_HOURS-PREVIOUS_TOTAL_HOURS)
Because the first value you pass is an untyped null, the expression expects TOTAL_HOURS-PREVIOUS_TOTAL_HOURS to be of that same type, which it is not, so the two values are not compatible.
What you can try is:
iif(lesser(TOTAL_HOURS, PREVIOUS_TOTAL_HOURS),toInteger(null()),TOTAL_HOURS-PREVIOUS_TOTAL_HOURS)
OR
case(TOTAL_HOURS < PREVIOUS_TOTAL_HOURS, toInteger(null()), minus(TOTAL_HOURS,PREVIOUS_TOTAL_HOURS) )
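For reference, the intended row-level logic can be written out as a short Python sketch. This is only a model of the requirement, not data flow expression syntax; hours_delta is a hypothetical name, and returning None when either input is missing is an assumption, since the question does not say what should happen for null inputs.

```python
from typing import Optional


def hours_delta(total: Optional[int], previous: Optional[int]) -> Optional[int]:
    """Difference total - previous, or None when it would be negative."""
    if total is None or previous is None:
        return None                 # assumption: missing input gives a null result
    if total < previous:
        return None                 # negative differences are replaced with null
    return total - previous


print(hours_delta(10, 4))    # 6
print(hours_delta(3, 5))     # None
print(hours_delta(None, 5))  # None
```

Both branches return the same kind of value (an integer or null), which is the point of wrapping null() in toInteger() above.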
I need to compare a nullable entity property via IN expression within the following JPQL query:
@NamedQuery(name = "query",
query = "SELECT e FROM MyEntity e WHERE e.status IN :statuses")
Now, I would like the collection-valued input parameter statuses shown above to optionally contain null as an element:
final List<MyEntity> actual = entityManager.createNamedQuery("query", MyEntity.class)
.setParameter("statuses", Arrays.asList(null, 1L))
.getResultList();
However, with Hibernate/Derby the actual result list contains only entities with status 1L, not those with status null.
I have not found anything in the JPA 2.2 specification about this case. Did I miss something or is this vendor-specific?
The answers to this question only solve part of my problem. In their proposed solutions, the null comparison is hard-baked into the query and cannot be controlled via the collection-valued parameter.
For a Java programmer, used to null == null yielding true, it might come as a surprise that in SQL (and JPQL) null = null is itself null, which is "falsy". As a result, null IN (null) yields null as well.
Instead, you need to treat null separately with an IS NULL check: e.status IS NULL OR e.status IN :statuses.
This is described in 4.11 Null Values of the JPA Specification:
Comparison or arithmetic operations with a NULL value always yield an unknown value.
Two NULL values are not considered to be equal, the comparison yields an unknown value.
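Here is a minimal sketch of the effect, and of one way to keep the null handling controlled by the list itself. It uses Python's built-in sqlite3 as a stand-in for Hibernate/Derby, so it demonstrates the SQL semantics rather than JPQL; the table name my_entity and the helper build_status_filter are made up for the example.

```python
import sqlite3
from typing import List, Optional, Sequence, Tuple


def build_status_filter(statuses: Sequence[Optional[int]]) -> Tuple[str, List[int]]:
    """Split None out of the list and build a WHERE condition plus bind values."""
    values = [s for s in statuses if s is not None]
    clauses = []
    if values:
        placeholders = ", ".join("?" for _ in values)
        clauses.append(f"status IN ({placeholders})")
    if any(s is None for s in statuses):
        clauses.append("status IS NULL")   # None in the list turns on the null check
    where = " OR ".join(clauses) if clauses else "1 = 0"  # empty list matches nothing
    return where, values


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_entity (id INTEGER PRIMARY KEY, status INTEGER)")
conn.executemany("INSERT INTO my_entity (id, status) VALUES (?, ?)",
                 [(1, 1), (2, None), (3, 2)])

# NULL inside IN never matches, so only the row with status 1 comes back.
naive = conn.execute(
    "SELECT id FROM my_entity WHERE status IN (NULL, 1) ORDER BY id").fetchall()

where, params = build_status_filter([None, 1])
fixed = conn.execute(
    f"SELECT id FROM my_entity WHERE {where} ORDER BY id", params).fetchall()

print(naive)  # [(1,)]
print(fixed)  # [(1,), (2,)]
```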
I am using Python 3.6 and py-postgresql==1.2.1.
I have the following statement:
db.prepare("SELECT * FROM seasons WHERE user_id=$1 AND season_id=$2 LIMIT 1"), where season_id can be NULL.
I want to be able to get the latest record with a NULL season_id by passing None as the $2 param, but it does not work. Instead, I need to create this second statement:
db.prepare("SELECT * FROM seasons WHERE user_id=$1 AND season_id IS NULL LIMIT 1")
It must have something to do with season_id = NULL not working while season_id IS NULL does, but is there a way to make this work?
From Comparison Functions and Operators:
Do not write expression = NULL because NULL is not “equal to” NULL. (The null value represents an unknown value, and it is not known whether two unknown values are equal.)
Some applications might expect that expression = NULL returns true if expression evaluates to the null value. It is highly recommended that these applications be modified to comply with the SQL standard. However, if that cannot be done the transform_null_equals configuration variable is available. If it is enabled, PostgreSQL will convert x = NULL clauses to x IS NULL.
and:
19.13.2. Platform and Client Compatibility
transform_null_equals (boolean)
When on, expressions of the form expr = NULL (or NULL = expr) are treated as expr IS NULL, that is, they return true if expr evaluates to the null value, and false otherwise. The correct SQL-spec-compliant behavior of expr = NULL is to always return null (unknown). Therefore this parameter defaults to off.
You could rewrite your query:
SELECT *
FROM seasons
WHERE user_id = $1
AND (season_id = $2 OR ($2 IS NULL AND season_id IS NULL))
-- ORDER BY ...
-- LIMIT without ORDER BY returns an arbitrary matching row,
-- so specify the sort order explicitly
LIMIT 1;
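To check the rewritten condition with a None parameter, here is a small self-contained sketch. It uses Python's built-in sqlite3 instead of py-postgresql so that it runs without a server; the named :season_id placeholder plays the role of $2, and ordering by id to get the "latest" row is an assumption, since the question does not say which column defines recency.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE seasons (id INTEGER PRIMARY KEY, user_id INTEGER, season_id INTEGER)")
conn.executemany(
    "INSERT INTO seasons (id, user_id, season_id) VALUES (?, ?, ?)",
    [(1, 7, 3), (2, 7, None), (3, 7, None)])

QUERY = """
SELECT id
FROM seasons
WHERE user_id = :user_id
  AND (season_id = :season_id OR (:season_id IS NULL AND season_id IS NULL))
ORDER BY id DESC  -- assumption: a higher id means a later record
LIMIT 1
"""


def latest_season(user_id: int, season_id: Optional[int]) -> Optional[int]:
    """Return the id of the latest matching row, or None if there is none."""
    row = conn.execute(QUERY, {"user_id": user_id, "season_id": season_id}).fetchone()
    return row[0] if row else None


print(latest_season(7, 3))     # 1
print(latest_season(7, None))  # 3
```

One statement now covers both cases: a real season_id goes through the ordinary equality, and None falls through to the IS NULL branch.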