Hive how to load data into partitioned table from unpartitioned table with where condition - hiveql

The samp1 table is partitioned on the year and month columns.
The data is stored in the ORCFile format.
Insert the records from January 2008 of the samp2 table into the
appropriate partition of weather_partitioned.

INSERT OVERWRITE TABLE dest_table_name PARTITION (year, month) SELECT all other columns,year,month FROM source_table_name;
When using dynamic partitions, partition columns should come as the last columns in the select query.
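To cover the "where condition" part of the question, here is a minimal sketch of the same dynamic-partition insert restricted to January 2008. The column names station_id and temperature are hypothetical stand-ins for "all other columns", and it assumes samp2 carries year and month as ordinary columns.

```sql
-- Enable dynamic partitioning for this session (standard Hive settings).
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- station_id and temperature are hypothetical column names.
-- The partition columns come last, in the same order as in the PARTITION clause.
INSERT OVERWRITE TABLE weather_partitioned PARTITION (year, month)
SELECT station_id, temperature, year, month
FROM samp2
WHERE year = 2008 AND month = 1;
```

Since the filter selects a single month, a static partition spec such as PARTITION (year=2008, month=1) would also work; in that case year and month are left out of the select list.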

Related

KSQL Persistent Query not writing data to KSQL Table

I have two KSQL tables, each having the same key. I am running the following query on them:
CREATE TABLE TEMP1 AS SELECT
b.MFG_DATE,
b.rowtime as bd_rowtime,
s.rowtime as sd_rowtime,
b.EXPIRY_DATE as EXP_DATE,
b.BATCH_NO as BATCH_NO,
s.rowkey as SD_ID
FROM GR_SD4 s
INNER JOIN GR_BD4 b ON b.rowkey = s.rowkey;
PARTITION BY s.rowkey;
The resulting table does not get populated with data, but when I run the select query separately it returns data. I am confused about why the table is not being populated.
The issue may be related to the PARTITION BY clause in your query. As posted, the semicolon after the join condition ends the CREATE TABLE statement, so PARTITION BY s.rowkey; is submitted as a separate statement. Since both tables have the same key and are joined on it, the resulting table should already be keyed by that rowkey, so the clause should not be needed. Try removing it (together with the stray semicolon) and check whether the table then gets populated.

I want to create a partition table month wise only 12 tables for multiple years

I am new to table partitioning. I want to range-partition a table on its inserted_on column; around 40,000 records are inserted into this table daily.
I have tried creating a partition as:
CREATE TABLE My.table_name_fy2022_01 PARTITION OF My.table_name FOR VALUES FROM ('2022-01-01') TO ('2022-02-01');
But this way I will have to create 12 tables per year, which I don't want to do.
My question is: how can I create the partitions so that there are only 12 of them (one per month), each storing the data for its month regardless of the year?
For example:
partition table June
record of 2022-06-20 inserted into June,
record of 2023-06-16 inserted into June,
record of 2024-06-10 inserted into June,
and so on.
PARTITION BY HASH can be used, as in this MySQL-style clause:
PARTITION BY HASH(MONTH(use_time)) PARTITIONS 12;
Here use_time stands for the date column (inserted_on in the question).
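The PARTITION OF ... FOR VALUES syntax in the question is PostgreSQL's, which as far as I know has no MONTH() function or PARTITIONS n clause. A rough PostgreSQL equivalent is list partitioning on an expression that extracts the month. This is only a sketch: the table and column names other than inserted_on are hypothetical, and it assumes inserted_on is a date (or timestamp without time zone), since the partition expression has to be immutable.

```sql
-- Hypothetical parent table; the partition key is the month number (1..12).
CREATE TABLE my_table (
    id          bigint,
    inserted_on date NOT NULL
) PARTITION BY LIST ((EXTRACT(MONTH FROM inserted_on)));

-- One partition per calendar month; repeat for the remaining months.
CREATE TABLE my_table_m01 PARTITION OF my_table FOR VALUES IN (1);
CREATE TABLE my_table_m06 PARTITION OF my_table FOR VALUES IN (6);

-- Rows from any year whose month is June are routed to my_table_m06.
INSERT INTO my_table (id, inserted_on)
VALUES (1, '2022-06-20'), (2, '2023-06-16'), (3, '2024-06-10');
```

Two caveats with this design: PostgreSQL does not allow primary key or unique constraints on a partitioned table whose partition key contains an expression, and partition pruning only helps queries that filter on the same EXTRACT(MONTH FROM inserted_on) expression.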

Postgres partition size increase but no selectable data

I have a postgres table set up with a list partition defined on a column like so:
create table table1 (
val varchar(10),
t_type varchar(10)
) partition by list(t_type);
create table table1_xvals partition of table1 for values in ('xval1','xval2');
create table table1_yvals partition of table1 for values in ('yval1','yval2');
As I insert data into table1, I can see the size of the table and of the individual partitions increasing. However, when I try to select data from any of those tables, nothing shows up (select * from ). Is there anything wrong with how I'm creating the tables or selecting data?
That is normal. You are loading the data in a single database transaction, and the effects of a database transaction only become visible when the transaction is completed (committed). If you are loading the data with a single INSERT or COPY statement, the transaction will end as soon as the statement is done.
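The behaviour can be reproduced with two sessions. This is a small sketch against the table1 definition from the question; the inserted values are made up for illustration.

```sql
-- Session A: load inside an explicit transaction.
BEGIN;
INSERT INTO table1 (val, t_type) VALUES ('a', 'xval1'), ('b', 'yval2');

-- Session B, while A is still open:
--   SELECT count(*) FROM table1;
-- does not see the two new rows, although the files on disk have already grown.

-- Session A: make the rows visible to other sessions.
COMMIT;

-- Session B, after the commit: the same SELECT now sees both rows.
```

If the loading tool or driver has autocommit turned off, the COMMIT has to be issued explicitly, otherwise the rows stay invisible to other sessions and are rolled back when the connection closes.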

Range Partitions by Date with One Partition for NULL dates

I would like to have a table with a deleted column containing the date the item was soft-deleted. Rows with a NULL value in the deleted column are the active ones. I was not able to figure out the syntax to create a partition for NULL values in the deleted column. What is the syntax for creating such a partition?
create table my_table_pointing(street_id int, p_city_id int, name varchar(10), deleted date)
PARTITION BY RANGE (deleted);
CREATE TABLE my_table_pointing_2020 PARTITION OF my_table_pointing
FOR VALUES FROM ('2020-01-01') TO ('2021-01-01');
CREATE TABLE my_table_pointing_active PARTITION OF my_table_pointing
"for all rows where date is null"...
Thanks!
Provided you are on PG11 or later, you can create a default partition, and rows where deleted is null will be routed there.
create table my_table_pointing_active partition of my_table_pointing default;
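To check where rows end up, something like the following can be run once the tables above exist. The inserted values are made up for illustration.

```sql
-- One active row (deleted IS NULL) and one row soft-deleted in 2020.
INSERT INTO my_table_pointing (street_id, p_city_id, name, deleted)
VALUES (1, 10, 'Main St', NULL),
       (2, 10, 'Old Rd',  '2020-06-15');

-- tableoid::regclass reports the partition that physically stores each row.
SELECT tableoid::regclass AS part, street_id, deleted
FROM my_table_pointing;
```

Note that the default partition also receives any non-null deleted dates that fall outside the ranges of the other partitions, so it holds only active rows as long as every possible deletion date is covered by a range partition.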

Hive partitioning external table based on range

I want to partition an external table in hive based on range of numbers. Say numbers with 1 to 100 go to one partition. Is it possible to do this in hive?
I am assuming here that you have a table with some records from which you want to load data to an external table which is partitioned by some field say RANGEOFNUMS.
Now, suppose we have a table called testtable with columns name and value. The contents are like
India,1
India,2
India,3
India,3
India,4
India,10
India,11
India,12
India,13
India,14
Now, suppose we have an external table called testext with some columns along with a partition column, say RANGEOFNUMS.
Now you can do one thing,
insert into table testext partition(rangeofnums="your value")
select * from testtable where value>=1 and value<=5;
This way all records from the testtable having value 1 to 5 will come into one partition of the external table.
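If there are many ranges, writing one insert per range gets tedious. A possible alternative, sketched here under the assumption that testext has the columns name and value plus a string partition column rangeofnums, is to compute the range label in the query and let Hive's dynamic partitioning route the rows. The labels '1-5', '6-10' and 'other' are arbitrary choices.

```sql
-- Enable dynamic partitioning for this session (standard Hive settings).
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- The computed label is the last select column, so it becomes the partition value.
INSERT INTO TABLE testext PARTITION (rangeofnums)
SELECT name,
       value,
       CASE
         WHEN value BETWEEN 1 AND 5  THEN '1-5'
         WHEN value BETWEEN 6 AND 10 THEN '6-10'
         ELSE 'other'
       END AS rangeofnums
FROM testtable;
```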
The scenario is my assumption only. Please comment if this is not the scenario you have.
Achyut