What should my WHERE clause be in a SQL statement where I want to return the rows in which column A is null or column B is null, but not the rows where both are null?
WHERE (ColA is NULL AND ColB is NOT NULL)
OR (ColB is NULL AND ColA is NOT NULL)
(A IS NULL OR B IS NULL) AND NOT (A IS NULL AND B IS NULL)
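Both forms express "exactly one of the two is NULL" and return the same rows. A minimal sketch using SQLite (table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(1, 1), (1, None), (None, 2), (None, None)])

# exactly one of a, b is NULL
rows = conn.execute("""
    SELECT a, b FROM t
    WHERE (a IS NULL AND b IS NOT NULL)
       OR (b IS NULL AND a IS NOT NULL)
""").fetchall()
print(rows)  # only the (1, None) and (None, 2) rows
```

The second form, `(a IS NULL OR b IS NULL) AND NOT (a IS NULL AND b IS NULL)`, produces the identical result set.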
Related
I'd like to write a query whose result ignores a condition in the WHERE clause when that condition's parameter is null, but applies the condition when the parameter is not null.
A query I made is
select * from mytable where (num_lot = :num_lot or :num_lot is null) and date_work between :date_start and :date_stop
When :num_lot is null, the result does not depend on num_lot, which is what I wanted.
But when :date_start and :date_stop are null, no rows are returned, instead of the date filter being ignored.
SELECT * FROM mytable WHERE
num_lot=COALESCE(:num_lot,num_lot) AND
date_work BETWEEN COALESCE(:date_start,date_work) and COALESCE(:date_stop,date_work)
When the bound value is NULL, it is replaced with the column's own value, so the comparison is always true. Note one caveat: rows where the column itself is NULL are still excluded, because NULL = NULL is not true.
Use coalesce() to check if both :date_start and :date_stop are null:
select *
from mytable
where (num_lot = :num_lot or :num_lot is null)
and (date_work between :date_start and :date_stop or coalesce(:date_start, :date_stop) is null)
or:
select *
from mytable
where (num_lot = :num_lot or :num_lot is null)
and ((date_work between :date_start and :date_stop) or (:date_start is null and :date_stop is null))
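The corrected pattern can be seen in a quick sketch with SQLite and named parameters (table and data are made up): with all parameters null every row comes back, and with the dates bound the range filter applies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (num_lot INTEGER, date_work TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)",
                 [(1, "2020-01-10"), (2, "2020-02-10")])

query = """
    SELECT * FROM mytable
    WHERE (num_lot = :num_lot OR :num_lot IS NULL)
      AND ((date_work BETWEEN :date_start AND :date_stop)
           OR (:date_start IS NULL AND :date_stop IS NULL))
"""

# all parameters NULL: the filters are ignored, every row comes back
all_rows = conn.execute(query, {"num_lot": None, "date_start": None,
                                "date_stop": None}).fetchall()

# date range bound: only the matching row is returned
jan = conn.execute(query, {"num_lot": None, "date_start": "2020-01-01",
                           "date_stop": "2020-01-31"}).fetchall()
```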
How to remove columns containing only null values from a table? Suppose I have a table -
SnapshotDate CreationDate Country Region CloseDate Probability BookingAmount RevenueAmount SnapshotDate1 CreationDate1 CloseDate1
null null null null null 25 882000 0 null null null
null null null null null 25 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
null null null null null 0 882000 0 null null null
So I would just like to have Probability, BookingAmount and RevenueAmount columns and ignore the rest.
Is there a way to dynamically select the columns?
I am using Spark 1.6.1.
I solved this with a global groupBy. This works for numeric and non-numeric columns:
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, when}

case class Entry(id: Long, name: String, value: java.lang.Float)

val results = Seq(
  Entry(10, null, null),
  Entry(10, null, null),
  Entry(20, null, null)
)
val df: DataFrame = spark.createDataFrame(results)

// mark each value with 0 where null and 1 otherwise, then take the
// per-column max: 1 means the column has at least one non-null value
val row = df
  .select(df.columns.map(c => when(col(c).isNull, 0).otherwise(1).as(c)): _*)
  .groupBy().max(df.columns: _*)
  .first

// keep only the columns whose max is 1
val colKeep = row.getValuesMap[Int](row.schema.fieldNames)
  .map { c => if (c._2 == 1) Some(c._1) else None }
  .flatten.toArray

// the aggregated field names come back as "max(name)", so strip that wrapper
df.select(row.schema.fieldNames.intersect(colKeep)
  .map(c => col(c.drop(4).dropRight(1))): _*).show(false)
+---+
|id |
+---+
|10 |
|10 |
|20 |
+---+
Edit: I removed the shuffling of columns. The new approach keeps the given order of the columns.
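The same idea works independently of Spark; a short plain-Python sketch (column names are illustrative): count non-null values per column and keep only the columns that have at least one.

```python
# rows as dicts; None plays the role of NULL
rows = [
    {"id": 10, "name": None, "value": None},
    {"id": 10, "name": None, "value": None},
    {"id": 20, "name": None, "value": None},
]

columns = list(rows[0])
# a column survives if any row has a non-null value in it
keep = [c for c in columns if any(r[c] is not None for r in rows)]
result = [{c: r[c] for c in keep} for r in rows]
print(keep)    # ['id']
print(result)  # [{'id': 10}, {'id': 10}, {'id': 20}]
```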
You can register a custom UDF and use it in Spark SQL:
sqlContext.udf.register("ISNOTNULL", (str: String) => Option(str).getOrElse(""))
Then with Spark SQL you can do:
SELECT ISNOTNULL(Probability) Probability, ISNOTNULL(BookingAmount) BookingAmount, ISNOTNULL(RevenueAmount) RevenueAmount FROM df
I created a constraint intended to require that, for the completed column to be marked true, certain other columns must have a value.
But for some reason the constraint does not complain when I leave one of those columns blank while completed is set to true. I have also deliberately inserted NULL into a specified column, and the constraint still does not fire.
Any ideas?
CREATE TABLE info (
id bigserial PRIMARY KEY,
created_at timestamptz default current_timestamp,
posted_by text REFERENCES users ON UPDATE CASCADE ON DELETE CASCADE,
title character varying(31),
lat numeric,
lng numeric,
contact_email text,
cost money,
description text,
active boolean DEFAULT false,
activated_date date,
deactivated_date date,
completed boolean DEFAULT false,
images jsonb,
CONSTRAINT columns_null_check CHECK (
(completed = true
AND posted_by != NULL
AND title != NULL
AND lat != NULL
AND lng != NULL
AND contact_email != NULL
AND cost != NULL
AND description != NULL
AND images != NULL) OR completed = false)
);
From the PostgreSQL manual, Chapter 9, Functions and Operators:
To check whether a value is or is not null, use the predicates:
expression IS NULL
expression IS NOT NULL
or the equivalent, but nonstandard, predicates:
expression ISNULL
expression NOTNULL
Therefore you cannot use value != NULL to check for null values; you can only use value IS NULL and value IS NOT NULL. Any comparison with NULL evaluates to NULL (unknown) rather than false, and a CHECK constraint rejects a row only when its expression evaluates to false, so your constraint silently passes.
The same applies to boolean values:
Boolean values can also be tested using the predicates
boolean_expression IS TRUE
boolean_expression IS NOT TRUE
boolean_expression IS FALSE
boolean_expression IS NOT FALSE
boolean_expression IS UNKNOWN
boolean_expression IS NOT UNKNOWN
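The behavior is easy to reproduce; a minimal SQLite sketch (the principle is the same in PostgreSQL): `x != NULL` evaluates to NULL, and a CHECK constraint only rejects a row when its expression is false, so a NULL result lets the row through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# any comparison with NULL is NULL (unknown), not true or false
print(conn.execute("SELECT 1 != NULL").fetchone()[0])  # None

# a CHECK whose expression evaluates to NULL does not reject the row
conn.execute("CREATE TABLE t (completed INTEGER, title TEXT, "
             "CHECK (completed = 0 OR title != NULL))")
conn.execute("INSERT INTO t VALUES (1, NULL)")  # accepted, no error raised

# with IS NOT NULL the constraint fires as intended
conn.execute("CREATE TABLE t2 (completed INTEGER, title TEXT, "
             "CHECK (completed = 0 OR title IS NOT NULL))")
```

Inserting `(1, NULL)` into `t2` raises an integrity error, which is the behavior the original constraint was after.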
I get NULL output from SELECT for the following Hive table.
Describe studentdetails;
clustername string
schemaname string
tablename string
primary_key map<string,int>
schooldata struct<alternate_aliases:string,application_deadline:bigint,application_deadline_early_action:string,application_deadline_early_decision:bigint,calendaring_system:string,fips_code:string,funding_type:string,gender_preference:string,iped_id:bigint,learning_environment:string,mascot:string,offers_open_admission:boolean,offers_rolling_admission:boolean,region:string,religious_affiliation:string,school_abbreviation:string,school_colors:string,school_locale:string,school_term:string,short_name:string,created_date:bigint,modified_date:bigint,percent_students_outof_state:float> from deserializer
deletedind boolean
truncatedind boolean
versionid bigint
select * from studentdetails limit 3;
Output :
NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL
I have used the following properties while creating the table.
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ("ignore.malformed.json" = "true")
And the following properties while selecting the data.
SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
ADD JAR s3://emr/hive/lib/hive-serde-1.0.jar;
Thank you for the comments; I have found the solution.
The issue was that the column names in my JSON file differed from the column names I used while creating the table.
Once I synced the column names between the Hive table and the JSON file, the issue was resolved.
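What happened can be sketched in a few lines of Python (the field names here are made up): a JSON SerDe looks up each table column by name in the parsed record, and a name that is not present simply yields NULL rather than an error.

```python
import json

record = json.loads('{"cluster_name": "c1", "schema_name": "s1"}')

# column names as declared in the Hive DDL -- note the missing underscores
table_columns = ["clustername", "schemaname"]

# a lookup by a non-matching name silently returns None (NULL in Hive)
values = [record.get(c) for c in table_columns]
print(values)  # [None, None]

# once the names match, the values appear
fixed_columns = ["cluster_name", "schema_name"]
print([record.get(c) for c in fixed_columns])  # ['c1', 's1']
```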
This is something very basic, but I can't understand it, and the manual is not helping:
declare @rule int =
    (select id from menu_availability_rules
     where (daily_serving_start = null or
           (daily_serving_start is null and null is null)) and
           (daily_serving_end = null or
           (daily_serving_end is null and null is null)) and
           (weekly_service_off = 3 or
           (weekly_service_off is null and 3 is null)) and
           (one_time_service_off = null or
           (one_time_service_off is null and null is null)));
print @rule;
-- syntax error here --\/
if (@rule is not null) raiseerror ('test error', 42, 42);
if @rule is not null
begin
    delete from menu_availability
    where menu_id = 5365 and rule_id = @rule
    delete from menu_availability_rules
    where (daily_serving_start = null or
          (daily_serving_start is null and null is null)) and
          (daily_serving_end = null or
          (daily_serving_end is null and null is null)) and
          (weekly_service_off = 3 or
          (weekly_service_off is null and 3 is null)) and
          (one_time_service_off = null or
          (one_time_service_off is null and null is null))
    and not exists
        (select rule_id from menu_availability
         where rule_id = @rule)
end
Why is it a syntax error? How would I write it? I need to throw error for debugging purposes, just to make sure the code reached the conditional branch.
I can just replace the raiseerror with select 1 / 0 and I will get what I need, but why can't I do it normally?
The correct name is RAISERROR (with a single E), e.g. RAISERROR('test error', 16, 1). Note also that severity must be between 0 and 25, and levels above 18 require elevated privileges, so a severity of 42 would be rejected even once the spelling is fixed.