Hive select outputs null values

I get the select output as null for the following Hive table.
Describe studentdetails;
clustername string
schemaname string
tablename string
primary_key map<string,int>
schooldata struct<alternate_aliases:string,application_deadline:bigint,application_deadline_early_action:string,application_deadline_early_decision:bigint,calendaring_system:string,fips_code:string,funding_type:string,gender_preference:string,iped_id:bigint,learning_environment:string,mascot:string,offers_open_admission:boolean,offers_rolling_admission:boolean,region:string,religious_affiliation:string,school_abbreviation:string,school_colors:string,school_locale:string,school_term:string,short_name:string,created_date:bigint,modified_date:bigint,percent_students_outof_state:float> from deserializer
deletedind boolean
truncatedind boolean
versionid bigint
select * from studentdetails limit 3;
Output :
NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL
NULL NULL NULL NULL NULL NULL NULL NULL
I have used the following properties while creating the table.
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ("ignore.malformed.json" = "true")
And the following properties while selecting the data.
SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
ADD JAR s3://emr/hive/lib/hive-serde-1.0.jar;

Thank you for the comments; I have found the solution.
The issue was that the column names in my JSON file differed from the column names I used when creating the table. The JSON SerDe matches JSON keys to Hive column names, so a column with no matching key comes back as NULL.
Once I synced the column names between the Hive table and the JSON file, the issue was resolved.
Thanks & Regards,
Srivignesh KN
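
If renaming the keys in the JSON file is not practical, the OpenX JSON SerDe also accepts "mapping.<column>" properties that point a Hive column at a differently named JSON key. The following is only a sketch: the table name studentdetails_mapped and the JSON keys cluster, schema and table are hypothetical, since the original file's key names are not shown in the question.

```sql
-- Hypothetical input line: {"cluster": "c1", "schema": "s1", "table": "t1"}
-- Table name and JSON key names are assumptions for illustration only.
CREATE TABLE studentdetails_mapped (
  clustername string,
  schemaname  string,
  tablename   string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
  "ignore.malformed.json" = "true",
  -- "mapping.<hive column>" = "<json key>"
  "mapping.clustername" = "cluster",
  "mapping.schemaname"  = "schema",
  "mapping.tablename"   = "table"
)
STORED AS TEXTFILE;
```

With a mapping like this in place, select * should return the values stored under the mapped keys rather than NULL, provided the keys really exist in the data.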

Related

Check if value is not null and not empty in a select query

I have created a function in PostgreSQL and specified the return type as TABLE (id uuid, data boolean).
This is the code that I have tried:
BEGIN
RETURN QUERY SELECT table.id, (table.data <> '') as data FROM table;
END
But it returns NULL for "data" when data is NULL in the table. I was expecting it to return FALSE.
The data column stores JSON, and I am trying to check that the stored value is neither null nor empty.
How can I make this code work?

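Use is distinct from for a null-safe comparison. It never returns NULL, but note that it yields true when data is NULL, because NULL is distinct from '':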
SELECT table.id, table.data is distinct from '' as data
FROM table;
To get false for both NULL and the empty string, treat an empty string like null:
SELECT table.id, nullif(table.data, '') is not null as data
FROM table;
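
To see how these expressions behave on each kind of input, the following sketch runs them side by side over three sample values (NULL, an empty string, and a non-empty string). The inline VALUES list and the coalesce variant are additions for illustration and are not part of the original answer.

```sql
SELECT v.data,
       v.data IS DISTINCT FROM ''     AS distinct_from_empty, -- true for NULL
       nullif(v.data, '') IS NOT NULL AS not_null_not_empty,  -- false for NULL
       coalesce(v.data <> '', false)  AS coalesce_variant     -- false for NULL
FROM (VALUES (NULL::text), (''), ('{"a": 1}')) AS v(data);
```

Only the last two columns return false for the NULL row, which is the behaviour the question asks for.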

How to insert null in column array datatype in ksql

I am creating a stream with a column of type ARRAY. Null values sometimes arrive, so to handle them I am creating a second stream on top of the first one, with a CASE expression that substitutes [] when the value is null. But it is not working.
I have tried [], {}, [0] and 0 in the CASE expression.
CREATE STREAM stream1
(
id VARCHAR,
tags ARRAY<INT>,
feed_id VARCHAR,
status INT,
updated_at VARCHAR
)
WITH (kafka_topic='origin_topic', value_format='JSON');
Create Stream stream2 AS
select
id AS id,
case when tags is NULL THEN [] END ELSE tags END as tags,
case when feed_id is NULL THEN '0' ELSE feed_id END as feed_id,
case when status is NULL THEN 0 ELSE status END as status,
case when updated_at is NULL THEN '0' ELSE updated_at END as updated_at
from stream1 PARTITION BY id;

ksqlDB supports an array constructor ARRAY[], though if you're running an old version this may not be available. Also, it does not yet support empty arrays, so you'll need to pass at least one parameter, e.g. ARRAY[1] will create an array with a single INT/BIGINT element.
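
Applied to the query from the question, this could look roughly like the sketch below. It assumes a ksqlDB version that supports the ARRAY[...] constructor, uses 0 as an arbitrary placeholder element, and also drops the stray END that appeared before ELSE in the original tags expression.

```sql
CREATE STREAM stream2 AS
  SELECT
    id AS id,
    -- ARRAY[0] is a placeholder default; an empty array is not available here
    CASE WHEN tags IS NULL THEN ARRAY[0] ELSE tags END AS tags,
    CASE WHEN feed_id IS NULL THEN '0' ELSE feed_id END AS feed_id,
    CASE WHEN status IS NULL THEN 0 ELSE status END AS status,
    CASE WHEN updated_at IS NULL THEN '0' ELSE updated_at END AS updated_at
  FROM stream1
  PARTITION BY id;
```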

Constraint on columns based on single column not firing

I created a constraint so that completed can only be set to true if certain other columns have a value.
But for some reason the constraint does not complain when I leave one of those columns blank while completed is true. I have also deliberately inserted NULL into one of those columns, and the constraint still does not fire.
Any ideas?
CREATE TABLE info (
id bigserial PRIMARY KEY,
created_at timestamptz default current_timestamp,
posted_by text REFERENCES users ON UPDATE CASCADE ON DELETE CASCADE,
title character varying(31),
lat numeric,
lng numeric,
contact_email text,
cost money,
description text,
active boolean DEFAULT false,
activated_date date,
deactivated_date date,
completed boolean DEFAULT false,
images jsonb,
CONSTRAINT columns_null_check CHECK (
(completed = true
AND posted_by != NULL
AND title != NULL
AND lat != NULL
AND lng != NULL
AND contact_email != NULL
AND cost != NULL
AND description != NULL
AND images != NULL) OR completed = false)
);

From the PostgreSQL documentation, Chapter 9, Functions and Operators:
To check whether a value is or is not null, use the predicates:
expression IS NULL
expression IS NOT NULL
or the equivalent, but nonstandard, predicates:
expression ISNULL
expression NOTNULL
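Therefore you cannot use value != NULL to check for null values; you can only use value IS NULL and value IS NOT NULL. A comparison such as value != NULL always evaluates to NULL rather than true or false, and a CHECK constraint is satisfied when its expression evaluates to true or NULL, which is why the constraint never fires.
Boolean values have analogous predicates: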
Boolean values can also be tested using the predicates
boolean_expression IS TRUE
boolean_expression IS NOT TRUE
boolean_expression IS FALSE
boolean_expression IS NOT FALSE
boolean_expression IS UNKNOWN
boolean_expression IS NOT UNKNOWN
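
Applied to the table from the question, the constraint could be rewritten along these lines. This is a sketch that keeps the original logic and only swaps != NULL for IS NOT NULL; note that ADD CONSTRAINT is rejected if existing rows already violate the new check.

```sql
-- Drop the constraint that never fires, then recreate it with IS NOT NULL.
ALTER TABLE info DROP CONSTRAINT columns_null_check;

ALTER TABLE info ADD CONSTRAINT columns_null_check CHECK (
    (completed = true
        AND posted_by IS NOT NULL
        AND title IS NOT NULL
        AND lat IS NOT NULL
        AND lng IS NOT NULL
        AND contact_email IS NOT NULL
        AND cost IS NOT NULL
        AND description IS NOT NULL
        AND images IS NOT NULL)
    OR completed = false
);
```

If completed itself is NULL, the expression still evaluates to NULL and the row is accepted; declaring completed as NOT NULL closes that gap if it matters.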

PostgreSQL: how to insert null value to uuid

I need to insert a null value into a field of type uuid that has no NOT NULL specification (it is not the primary key).
When I try to insert '', this returns:
ERROR: invalid input syntax for uuid: ""
When I try to insert null, this returns:
ERROR: null value in column "uuid" violates not-null constraint
How can I do this?
psql 9.3.5
SQL:
INSERT INTO inv_location (address_id)
VALUES (null)

In Postgres, use the uuid_nil() function to simulate an empty uuid (it returns 00000000-0000-0000-0000-000000000000):
INSERT INTO inv_location (address_id)
VALUES (uuid_nil())
uuid_nil() is provided by the uuid-ossp extension, so install it first if it is not already present (only once per database):
create extension if not exists "uuid-ossp";

If the column is defined NOT NULL, you cannot enter a NULL value. Period.
In this error message:
ERROR: null value in column "uuid" violates not-null constraint
"uuid" is the name of the column, not the data type of address_id. And this column is defined NOT NULL:
uuid | character varying(36) | not null
Your INSERT statement does not include "uuid" in the target list, so that column defaults to NULL in the absence of a different column default. Boom.
Goes to show how basic type names (ab)used as identifiers lead to confusing error messages.
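
One way around this is to supply a value for the "uuid" column explicitly, so that only address_id receives NULL. This is a sketch: the literal below is a placeholder value, not something taken from the question, and it assumes address_id itself is nullable.

```sql
-- Placeholder uuid value for illustration; use a real, unique value in practice.
INSERT INTO inv_location (uuid, address_id)
VALUES ('123e4567-e89b-12d3-a456-426614174000', NULL);
```

Alternatively, giving the "uuid" column a default would let the original INSERT succeed unchanged.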

Change column type and set not null

How do you change the column type and also set that column to not null together?
I am trying:
ALTER TABLE mytable ALTER COLUMN col TYPE character varying(15) SET NOT NULL
This returns an error.
What is the right syntax?

This should be correct:
ALTER TABLE mytable
ALTER COLUMN col TYPE character varying(15),
ALTER COLUMN col SET NOT NULL
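
Note that SET NOT NULL is rejected if the column already contains NULLs. If that might be the case, one option is to backfill them first in the same transaction. This is a sketch, and the empty string used as the replacement value is only an assumption; pick whatever default fits the data.

```sql
BEGIN;

-- Hypothetical backfill value; choose one that makes sense for the column.
UPDATE mytable SET col = '' WHERE col IS NULL;

ALTER TABLE mytable
    ALTER COLUMN col TYPE character varying(15),
    ALTER COLUMN col SET NOT NULL;

COMMIT;
```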

Also, if you want to remove the NOT NULL constraint in PostgreSQL:
ALTER TABLE mytable
ALTER COLUMN email DROP NOT NULL;
