hive insert current date into a table using date function errors - postgresql

I have to insert the current date (timestamp) into a table via a Hive query. The query is failing for some reason. Can someone please help me out?
CREATE EXTERNAL TABLE IF NOT EXISTS dataFlagTest(
date string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION 's3://bckt1/hive_test/dateFlag/';
Now, to insert into it, I run the following query:
INSERT OVERWRITE TABLE dataFlagTest
SELECT from_unixtime(unix_timestamp()) ;
It failed with the following error :
FAILED: NullPointerException null

The solution is to do the select from a table: Hive (at least the version in question) cannot run a SELECT without a FROM clause.
So create a sample table with one row, or use an existing table, like below:
INSERT OVERWRITE TABLE dataflagtest SELECT from_unixtime(unix_timestamp()) AS date FROM EXISTING_TABLE TABLESAMPLE(1 ROWS);
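If no existing table is handy, a common workaround is a one-row dummy table; a sketch (the dual name and the local file path, which should contain a single line of text, are placeholders):
CREATE TABLE IF NOT EXISTS dual (dummy STRING);
LOAD DATA LOCAL INPATH '/tmp/one_row.txt' OVERWRITE INTO TABLE dual;
INSERT OVERWRITE TABLE dataflagtest
SELECT from_unixtime(unix_timestamp()) FROM dual;
Newer Hive releases (0.13+, if memory serves) also accept SELECT without a FROM clause, which makes the workaround unnecessary.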

Related

My PSQL after insert trigger fails to insert into another table when ON DUPLICATE encounters a duplicate

I am slowly working through a feature where I am importing large CSV files. There is a chance that an uploaded file's contents will trigger a uniqueness conflict. I've combed Stack Overflow for similar resources but I still can't seem to get my trigger to update another table when a duplicate entry is found. The following code is what I have currently implemented for this process. It lives in a Rails app, but the underlying SQL is the following.
When a user uploads a file, the following happens when it's processed.
CREATE TEMP TABLE codes_temp ON COMMIT DROP AS SELECT * FROM codes WITH NO DATA;
create or replace function log_duplicate_code()
returns trigger
language plpgsql
as
$$
begin
insert into duplicate_codes(id, campaign_id, code_batch_id, code, code_id, created_at, updated_at)
values (gen_random_uuid(), excluded.campaign_id, excluded.code_batch_id, excluded.code, excluded.code_id, now(), now());
return null;
end;
$$;
create trigger log_duplicate_code
after insert on codes
for each row execute procedure log_duplicate_code();
INSERT INTO codes SELECT * FROM codes_temp ct
ON CONFLICT (campaign_id, code)
DO update set updated_at = excluded.updated_at;
DROP TRIGGER log_duplicate_code ON codes;
When I try to run this process, nothing happens at all. If I upload a CSV file containing CODE01 and then upload CODE01 again, the duplicate_codes table doesn't get populated at all, and I don't understand why. No error gets raised, so it seems like DO UPDATE ... is doing something. What am I missing here?
I also have some questions that come to mind even if this were to work as intended; for example, I am uploading millions of these codes.
1) Should my trigger be a statement-level trigger instead of a row-level one, for scalability?
2) What if someone else uploads another file with millions of codes? I have my code wrapped in a transaction. Would a new, separate trigger be created? Would it conflict with an upload that is still being processed?
####### EDIT #1 #######
Thanks to Adriens' comment I do see that an AFTER INSERT trigger does not have the OLD record. I updated my code to use EXCLUDED and I receive the following error for the trigger:
ERROR: missing FROM-clause entry for table "excluded" (PG::UndefinedTable)
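(For what it's worth, EXCLUDED is only visible inside an ON CONFLICT DO UPDATE clause; in a row-level trigger function the freshly inserted row is exposed as NEW. A minimal sketch of the trigger body using NEW, with the same columns as above:
insert into duplicate_codes(id, campaign_id, code_batch_id, code, code_id, created_at, updated_at)
values (gen_random_uuid(), NEW.campaign_id, NEW.code_batch_id, NEW.code, NEW.code_id, now(), now());
return null;
Note also that when ON CONFLICT DO UPDATE takes the update path, the conflicting row fires UPDATE triggers rather than INSERT triggers, so an AFTER INSERT trigger never sees the duplicates.)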
Finally, here are the S.O. posts I've used to try to tailor my code but I just can't seem to make it work.
####### EDIT #2 #######
I have a little more context on how this is implemented.
When the CSV is loaded, a staging table called codes_temp is created and dropped at the end of the transaction. This table contains no unique constraints. From what I read, only the actual table I want to insert the codes into should raise the unique constraint error.
In my INSERT statement, the DO UPDATE SET updated_at = excluded.updated_at; doesn't trigger a unique constraint error. As of right now, I don't know whether it should or not. I borrowed this logic from the S.O. question "postgresql log into another table with on conflict"; it seemed to me like I had to update something if I specify the DO UPDATE SET clause.
Last, the correct criteria for codes in the database are the following. For example, these are entries in my codes table:
id, campaign_id, code
1, 1, CODE01
2, 1, CODE02
3, 1, CODE03
If any of these codes appears again somewhere, it should not be inserted into the codes table; it needs to be inserted into the duplicate_codes table instead, because it was already uploaded before.
id, campaign_id, code
1, 1, CODE01
2, 1, CODE02
3, 1, CODE03
As for the codes_temp table, I don't have any unique constraints on it, so there are no criteria for selecting the right row.
postgresql log into another table with on conflict
Postgres insert on conflict update using other table
Postgres on conflict - insert to another table
How to do INSERT INTO SELECT and ON DUPLICATE UPDATE in PostgreSQL 9.5?
Seems to me something like:
INSERT INTO codes
SELECT DISTINCT ON (campaign_id, code) *
FROM codes_temp ct
ORDER BY campaign_id, code, id DESC;
Assuming id was assigned sequentially, the above would select the most recent row into codes.
Then:
INSERT INTO duplicate_codes
SELECT ct.*
FROM codes_temp AS ct
LEFT JOIN codes ON ct.id = codes.id
WHERE codes.id IS NULL;
The above would select into the duplicates table the rows in codes_temp that were not selected into codes.
Obviously not tested on your data set. I would create a small test data set that has uniqueness conflicts and test with that.
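For instance, a minimal test setup might look like this (column lists trimmed to the essentials; the real tables have more columns):
CREATE TABLE codes (id bigint, campaign_id int, code text, UNIQUE (campaign_id, code));
CREATE TABLE duplicate_codes (id bigint, campaign_id int, code text);
CREATE TABLE codes_temp (id bigint, campaign_id int, code text);
-- two rows that collide on (campaign_id, code), one that does not
INSERT INTO codes_temp VALUES (1, 1, 'CODE01'), (2, 1, 'CODE01'), (3, 1, 'CODE02');
After running the two INSERTs above, codes should hold ids 2 and 3 (the latest id per pair) and duplicate_codes should hold id 1.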

UPSERT from table with different table sizes

I'm getting the error:
ERROR: column "some_col_name" does not exist Hint: There is a column named "some_col_name" in table "usert_test", but it cannot be referenced from this part of the query.
This happens on UPSERT. The cause of this error is that the source table (read in from an API) doesn't always have the same number of fields as the table I'm looking to UPSERT into. Is there a way to handle this within the UPSERT process? So far I've tried the below:
INSERT INTO scratch."usert_test" (many_cols)
SELECT *
FROM scratch.daily_scraper
ON CONFLICT (same_unique_id)
DO UPDATE
SET
many_fields = excluded.many_fields;
Name each column specifically in every instance.
insert into scratch."usert_test" (column_name1, column_name2, column_name3, column_name4)
select cola, colb, colc, colf
from scratch.daily_scraper
on conflict (column_name1, column_name4)
do update
set
column_name3 = excluded.column_name3
, column_name2 = excluded.column_name2;
However many columns you have, properly name every one, which (IMHO) you should always do anyway.
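If the source is missing a column that the target requires, one option is to pad the SELECT list with a placeholder; a sketch (supposing colc were the missing field, with the NULL cast purely illustrative):
insert into scratch."usert_test" (column_name1, column_name2, column_name3, column_name4)
select cola, colb, NULL::text, colf
from scratch.daily_scraper
on conflict (column_name1, column_name4)
do update
set column_name2 = excluded.column_name2;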

Snowflake: How to get records that failed in a COPY command

Is it possible to get the records that failed during a COPY command in Snowflake, from an internal stage into a Snowflake table?
I am trying to load the error records into an error table during COPY command execution. The COPY command used:
COPY INTO table (col1, col2, col3, col4) FROM (SELECT $1, $2, $3, 56 FROM @%table) ON_ERROR = CONTINUE;
To get all the bad records, you can run the COPY with VALIDATION_MODE = 'RETURN_ERRORS'. Then use RESULT_SCAN on the validation run in an INSERT statement.
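A sketch of that two-step flow (the error_table name and its column layout are assumptions; note that VALIDATION_MODE does not support the transforming SELECT form of COPY, so the plain form is shown):
-- validate only; no rows are loaded
COPY INTO table FROM @%table VALIDATION_MODE = 'RETURN_ERRORS';
-- persist the rejected records from the validation run
INSERT INTO error_table
SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));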
If one of your columns is unique (e.g. col1), maybe you can compare the rows in the table with the rows in the stage:
select $1 from @%table
MINUS
select col1 from table;
Check the select statement below after the COPY command:
select rejected_record from table(validate(test_copy, job_id => '_last'));

How does SELECT INTO work with SAS

I'm new to SAS and I'm trying to port my code from Access VBA into SAS.
In Access I often use SELECT INTO, but it seems this function does not exist in SAS.
I have two tables, I receive new data each day, and I want to update my table with the new lines. I need to check whether new lines have appeared; if yes, insert those lines into the old table.
I tried some code from Stack Overflow and other results from Google, but I didn't find anything that works.
INSERT INTO OLD_TABLE T
VALUES (GRVID = VTGONR)
FROM NEW_TABLE V
WHERE not exists (SELECT V.VTGONR FROM NEW_TABLE V WHERE T.GRVID = V.VTGONR);
Not sure what the purpose of using the VALUES keyword is in your example. PROC SQL uses VALUES() to list static values. Like:
VALUES (100)
SAS just uses normal SQL syntax instead. See for example: https://www.techonthenet.com/sql/insert.php
To specify the observations to insert, just use SELECT. You can add a WHERE clause as part of the SELECT to limit the rows you insert. To tell INSERT which columns to insert into, list them inside () after the table name; otherwise it will expect the order of the columns listed in the SELECT statement to match the order of the columns in the target table.
insert into old_table(GRVID)
select VTGONR from new_table
where VTGONR not in (select GRVID from old_table)
;
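In context, the statement runs inside a PROC SQL block, e.g.:
proc sql;
  insert into old_table(GRVID)
  select VTGONR from new_table
  where VTGONR not in (select GRVID from old_table);
quit;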

Rollback doesn't work with Amazon Redshift

I am practicing with Redshift. I have:
1) created a table
2) inserted values from another table
3) deleted the data from the table
I have tried ROLLBACK on both the insert and the delete, but it doesn't work. What is wrong with this? I don't understand.
Open two psql terminals connected to the same Redshift instance and database, say terminal-1 and terminal-2.
Execute the following queries in terminal-1.
create table sales(
salesid integer not null Identity,
commission decimal(8,2),
saledate date,
description varchar(255),
created_at timestamp default sysdate,
updated_at timestamp);
begin;
insert into sales(commission,saledate,description,created_at,updated_at) values('3.55','2018-12-10','Test description','2018-05-17 23:54:51','2018-05-17 23:54:51');
insert into sales(commission,saledate,description,created_at,updated_at) values('5.67','2018-11-10','Test description1','2018-05-17 23:54:51','2018-05-17 23:54:51');
Hold on here and go to terminal-2; don't close terminal-1. Execute the following query:
select * from sales;
You will not see the two records inserted from terminal-1.
Hold on here, go back to terminal-1, and execute the query below.
commit;
Hold on here and go to terminal-2; execute the following query again:
select * from sales;
Now you will see both records.
Point proven.
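The same experiment covers ROLLBACK: it only undoes statements issued after an explicit BEGIN, and with autocommit on (the psql default) each statement commits immediately, which is the usual reason ROLLBACK appears to do nothing. A quick check using the sales table above:
begin;
insert into sales(commission,saledate,description,created_at,updated_at) values('9.99','2018-12-11','Rollback test','2018-05-17 23:54:51','2018-05-17 23:54:51');
rollback;
-- the insert is discarded; only the committed records remain
select * from sales;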