I have a Hive table with 5 columns, no data loaded, and the table is not partitioned - hiveql

I have a Hive table with 5 columns; no data has been loaded into it, and the table is not partitioned.
SELECT * FROM emp;
Here are the types of the columns:
emp_id int
emp_name string
emp_dept int
businesseffectivedate date
How can I convert businesseffectivedate to a partition key?
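Since the table is empty, one option is simply to drop it and recreate it with the date column declared as the partition key. A minimal sketch, assuming the table and column names above (in Hive, a partition column is declared only in the PARTITIONED BY clause and must not be repeated in the regular column list):

```sql
-- Assumes the table holds no data, so it can be dropped and recreated.
DROP TABLE IF EXISTS emp;

CREATE TABLE emp (
  emp_id   INT,
  emp_name STRING,
  emp_dept INT
)
PARTITIONED BY (businesseffectivedate DATE);
```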

Related

Create Partitioned Table from Partitioned Table - Postgresql

Let's say I have a partitioned table A.
create table A (
col1 timestamp,
col2 int
)
partition by range (col2);
create table partition1 partition of A for values from (minvalue) to (y);
create table partition2 partition of A for values from (y) to (maxvalue);
copy A from '/some/csv/file';
The above code gives me a partitioned table A with the data populated. I want to create another table using -
create table B as (
select *,
col2 * 3 as col3 -- Add a new column
from A
);
Can I save A as a partitioned CSV/'insert_format' file?
Is it possible for B to be partitioned the same way A is?
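CREATE TABLE ... AS does not copy A's partitioning, so B has to be declared as a partitioned table explicitly and then populated with INSERT ... SELECT. A sketch, reusing the same placeholder bound y from above:

```sql
-- B must be declared partitioned up front; CTAS cannot do this.
CREATE TABLE B (
  col1 timestamp,
  col2 int,
  col3 int
) PARTITION BY RANGE (col2);

-- Partition names here are made up; bounds mirror A's.
CREATE TABLE b_partition1 PARTITION OF B FOR VALUES FROM (MINVALUE) TO (y);
CREATE TABLE b_partition2 PARTITION OF B FOR VALUES FROM (y) TO (MAXVALUE);

INSERT INTO B
SELECT *, col2 * 3 AS col3  -- add the new column while copying
FROM A;
```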

How to update numRows on Amazon Spectrum based on a count from a query

Hi, I'm trying to automate a workflow in Airflow where I append rows to an external Spectrum table daily, and I need to update numRows on the Spectrum table by adding the existing table's row count to the count of the rows I am appending.
CREATE EXTERNAL TABLE spectrum.my_external_table
(
id INTEGER,
barkdata_timestamp timestamp,
created_at timestamp,
updated_at timestamp
)
PARTITIONED BY (asofdate timestamp)
STORED AS PARQUET
LOCATION 's3://<SOME BUCKET>/manifest'
TABLE PROPERTIES ('numRows'='<some number>');
ALTER TABLE spectrum.my_external_table
ADD PARTITION (asofdate='2021-03-03 00:00:00') LOCATION 's3://<SOME BUCKET>/asofdate=2021-03-03 00:00:00/';
ALTER TABLE spectrum.couponable_coupon
SET TABLE PROPERTIES ('numRows'='<HELP HERE should be count(*) from my_external_table + count(*) from table_I_unloaded_to_s3 where asofdate='2021-03-03 00:00:00'>');
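SET TABLE PROPERTIES only accepts a literal for numRows, so a subquery cannot be embedded in the ALTER statement; a common pattern is to run the counts first, add them in the orchestration layer (e.g. the Airflow task), and substitute the total in. A sketch, where '12345' stands in for the computed sum:

```sql
-- Step 1: fetch the existing row count from the orchestration layer.
SELECT COUNT(*) FROM spectrum.my_external_table;

-- Step 2: add the count of rows being appended, then issue the ALTER
-- with the combined total as a literal ('12345' is a placeholder):
ALTER TABLE spectrum.my_external_table
SET TABLE PROPERTIES ('numRows' = '12345');
```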

How to get data from multiple tables in one query with the PostgreSQL JSONB data type

How can I fetch data from both tables in one query? I have the tables below:
Table Name: calorieTracker
CREATE TABLE "calorieTracker"(c_id serial NOT NULL PRIMARY KEY, "caloriesConsumption" jsonb);
INSERT INTO public."calorieTracker" ("caloriesConsumption")
VALUES ('[{"C_id":"1","calorie":88,"date":"19/08/2020"},{"C_id":2,"date":"19/08/2020","calorie":87}]');
Table Name: watertracker
create table watertracker(wt_id serial not null primary key, wt_date varchar, wt_goal float,wt_cid int);
INSERT INTO public.watertracker (wt_id,wt_date,wt_goal,wt_cid)
VALUES (2,'2020-08-19',5.5,2);
What I am looking for here is a query that returns data where the date is 19/08/2020 (in both the calorieTracker and watertracker tables), wt_cid is 2 (watertracker table), and c_id is 2 (calorieTracker table).
As you have not mentioned what output you want, I am assuming you want the JSON objects from caloriesConsumption which match the condition mentioned in the question.
Based on that assumption, try this query:
with cte as (
select
c_id,
jsonb_array_elements("caloriesConsumption") "data"
from "calorieTracker"
)
select
t1.*
from cte t1 inner join watertracker t2
on t2.wt_cid=cast(t1.data->>'C_id' as int)
and t2.wt_date=t1.data->>'date'
If you want the result from watertracker instead, just replace t1.* with t2.*.

index on composite primary key columns

I have a table defined as
CREATE TABLE process (
batch_id Integer
,product_id Integer
,machine_id Integer
,created_date DATE
,updated_date DATE
,primary key(batch_id,product_id,machine_id)
)
But I generally run queries like
SELECT *
FROM process
WHERE product_id = 123
AND machine_id = 1
When I check the SQL plan for this query, it does not use the primary key index.
Do I need to create another index on both columns?
The database is DB2.
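A composite primary-key index on (batch_id, product_id, machine_id) is generally only usable when the leading column batch_id is constrained; since this query filters only the trailing columns, a dedicated index on those two columns typically helps. A sketch (the index name is made up):

```sql
-- Hypothetical index covering the two filtered columns, so DB2 can
-- use it for: WHERE product_id = ? AND machine_id = ?
CREATE INDEX ix_process_product_machine
  ON process (product_id, machine_id);
```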

Can we set String column as partitionColumn?

The table's only primary column is the String column EMPLOYEE_ID; how can I partition on it?
val destination = spark.read.options(options).jdbc(options("url"), options("dbtable"), "EMPLOYEE_ID", P00100001, P00100005000000, 10, new java.util.Properties()).rdd.map(_.mkString(","))
Is there any other way to Read JDBC table and process it.
It is not possible: only integer columns can be used as the partition column here. If your database supports some variant of rowid which is an integer or can be cast to one, you can extract it in a query (pseudocode):
(SELECT CAST(rowid AS INTEGER), * FROM TABLE) AS tmp
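Concretely, the pseudocode above corresponds to passing a subquery as the JDBC table expression and using the cast column as the partition column. A sketch; whether rowid exists, and under what name, depends on the database, and the table alias and column name here are made up:

```sql
-- Hypothetical subquery used as the JDBC "dbtable" value;
-- the integer alias "rid" then serves as the partition column.
(SELECT CAST(rowid AS INTEGER) AS rid, t.* FROM employee t) AS tmp
```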