Redshift insert a date value into a table - amazon-redshift

insert into table1 (ID,date)
select
ID,sysdate
from table2
Assume I insert a record into table2 with the values ID: 1, date: 2023-1-1.
The expected result is that the ID in table1 is set from the ID in table2, and the date in table1 is set from the sysdate value selected from table2.
select *
from table1;
The expected result after running the insert statement will be:
ID | date
1 | 2023-1-6
But what I get is:
ID | date
1 | 2023-1-1

I see a few possibilities based on the information given:
You say "the expected result is update the ID of table1 base on the ID from table2" and this begs the question - did ID = 1 exist in table1 BEFORE you ran the INSERT statement? If so are you expecting that the INSERT will update the value for ID #1? Redshift doesn't enforce or check uniqueness of primary keys and you would get 2 rows in the table1 in this case. Is this what is happening?
SYSDATE on Redshift provides the start timestamp of the current transaction, NOT the current statement. Have you had the current transaction open since the 1st?
You didn't COMMIT the results (or the statement failed) and are checking from a different session. It could also be that the transaction in the second session started before the COMMIT completed. Working with MVCC across multiple sessions can trip anyone up.
There are likely other possible explanations. If you could provide DDL, sample data, and a simple test case so that others can recreate what you are seeing it would greatly narrow down the possibilities.

After insert trigger failing to join two tables with values inserted by a database transaction

I have two tables, invoice_header and invoice_line, with the details below:
invoice_header columns: invoice_id, customer_id
invoice_line columns: invoice_id, line_id, item_id, quantity, line_flag
They are joined by the common column invoice_id
Values in these 2 tables are inserted using a database transaction by the application
Tables are on Microsoft SQL Server
I created a trigger on invoice_line to update line_flag to zero if customer_id is 10. However the trigger is not working, I believe because it's failing to find a matching line in invoice_header since these two tables are inserted by a database transaction at the same time.
Below is the trigger
update invoice_line
set line_flag = 0
from invoice_line l
inner join Inserted v on v.line_id = l.line_id
inner join invoice_header h on h.invoice_id = v.invoice_id
where h.customer_id = 10
If I don't join the tables, it works but updates all the lines.
I have also tried rewriting the trigger on invoice_header to update the lines, but it still doesn't work.
Is there a way to write an after insert trigger that joins tables inserted by a database transaction?
I believe because it's failing to find a matching line in
invoice_header since these two tables are inserted by a database
transaction at the same time.
This is not true; the data may not yet be committed for "outsiders", but your own transaction can see the data it has itself already modified.
the trigger is not working
While your question has some details, you don't mention why the trigger doesn't work. Do you mean it does nothing? Or is there an error? If so, which one?
update invoice_line
I think here's the cause for the error. You want to update l instead, since you have already aliased your table.
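The point about a transaction seeing its own uncommitted rows can be checked with a small sketch. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, so the trigger uses SQLite's row-level NEW reference rather than the Inserted table, and the trigger name trg_line_flag, the default line_flag of 1, and the sample rows are all hypothetical.

```python
import sqlite3

# SQLite stand-in for the SQL Server schema in the question (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice_header (invoice_id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE invoice_line (
    invoice_id INTEGER,
    line_id    INTEGER PRIMARY KEY,
    item_id    INTEGER,
    quantity   INTEGER,
    line_flag  INTEGER DEFAULT 1   -- hypothetical default, for illustration only
);
-- Hypothetical trigger: clear line_flag when the header belongs to customer 10.
CREATE TRIGGER trg_line_flag AFTER INSERT ON invoice_line
BEGIN
    UPDATE invoice_line
    SET line_flag = 0
    WHERE line_id = NEW.line_id
      AND EXISTS (SELECT 1
                  FROM invoice_header h
                  WHERE h.invoice_id = NEW.invoice_id
                    AND h.customer_id = 10);
END;
""")

# Header and lines are inserted in one transaction that is NOT yet committed.
conn.execute("INSERT INTO invoice_header VALUES (100, 10)")
conn.execute("INSERT INTO invoice_header VALUES (200, 20)")
conn.execute("INSERT INTO invoice_line (invoice_id, line_id, item_id, quantity) VALUES (100, 1, 7, 3)")
conn.execute("INSERT INTO invoice_line (invoice_id, line_id, item_id, quantity) VALUES (200, 2, 8, 5)")
still_open = conn.in_transaction

# The trigger already saw the uncommitted header rows of its own transaction.
flags = dict(conn.execute("SELECT line_id, line_flag FROM invoice_line"))
print(still_open, flags)
conn.commit()
```

Only the line that belongs to customer 10 gets its flag cleared, even though nothing has been committed when the trigger fires.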

Is a subquery able to select columns from outer query? [duplicate]

I have the following select:
SELECT DISTINCT pl
FROM [dbo].[VendorPriceList] h
WHERE PartNumber IN (SELECT DISTINCT PartNumber
FROM [dbo].InvoiceData
WHERE amount > 10
AND invoiceDate > DATEADD(yyyy, -1, CURRENT_TIMESTAMP)
UNION
SELECT DISTINCT PartNumber
FROM [dbo].VendorDeals)
The issue here is that the table [dbo].VendorDeals has NO PartNumber column, yet no error is raised and the query runs using only the first part of the union.
What's more, IntelliSense also accepts and recognizes PartNumber. The query fails only when it is used inside a more complex statement.
It is pretty obvious that if you qualify the column names, the mistake becomes evident.
This isn't a bug in SQL Server/the T-SQL dialect parsing, no, this is working exactly as intended. The problem, or bug, is in your T-SQL; specifically because you haven't qualified your columns. As I don't have the definition of your table, I'm going to provide sample DDL first:
CREATE TABLE dbo.Table1 (MyColumn varchar(10), OtherColumn int);
CREATE TABLE dbo.Table2 (YourColumn varchar(10), OtherColumn int);
And then an example that is similar to your query:
SELECT MyColumn
FROM dbo.Table1
WHERE MyColumn IN (SELECT MyColumn FROM dbo.Table2);
First, this will parse; it is a valid query. Second, provided that dbo.Table2 contains at least one row, every row from dbo.Table1 where MyColumn has a non-NULL value will be returned. Why? Well, let's qualify the columns with the table names, as SQL Server would resolve them:
SELECT Table1.MyColumn
FROM dbo.Table1
WHERE Table1.MyColumn IN (SELECT Table1.MyColumn FROM dbo.Table2);
Notice that the column inside the IN is also referencing Table1, not Table2. By default, if a column's qualifier is omitted in a subquery, the column is assumed to reference the table(s) defined in that subquery. If, however, none of the tables in the subquery has a column by that name, it is assumed to reference a table in the outer scope where that column does exist; in this case Table1.
Let's, instead, take a different example, using the other column in the tables:
SELECT OtherColumn
FROM dbo.Table1
WHERE OtherColumn IN (SELECT OtherColumn FROM dbo.Table2);
This would be parsed as the following:
SELECT Table1.OtherColumn
FROM dbo.Table1
WHERE Table1.OtherColumn IN (SELECT Table2.OtherColumn FROM dbo.Table2);
This is because OtherColumn exists in both tables. As, in the subquery, OtherColumn isn't qualified it is assumed the column wanted is the one in the table defined in the same scope, Table2.
So what is the solution? Alias and qualify your columns:
SELECT T1.MyColumn
FROM dbo.Table1 T1
WHERE T1.MyColumn IN (SELECT T2.MyColumn FROM dbo.Table2 T2);
This will, unsurprisingly, error as Table2 has no column MyColumn.
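The same name-resolution behaviour can be reproduced outside SQL Server. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in (SQLite resolves an unqualified column in a subquery to the outer table in the same way). The dbo schema prefix is dropped and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (MyColumn TEXT, OtherColumn INTEGER);
CREATE TABLE Table2 (YourColumn TEXT, OtherColumn INTEGER);
INSERT INTO Table1 VALUES ('a', 1), ('b', 2), (NULL, 3);  -- made-up rows
INSERT INTO Table2 VALUES ('x', 9);
""")

# Unqualified: MyColumn in the subquery silently resolves to Table1.MyColumn,
# so every Table1 row with a non-NULL MyColumn matches itself.
unqualified = sorted(
    row[0]
    for row in conn.execute(
        "SELECT MyColumn FROM Table1 "
        "WHERE MyColumn IN (SELECT MyColumn FROM Table2)"
    )
)

# Qualified: the engine now has to find MyColumn in Table2 and cannot.
try:
    conn.execute(
        "SELECT T1.MyColumn FROM Table1 T1 "
        "WHERE T1.MyColumn IN (SELECT T2.MyColumn FROM Table2 T2)"
    )
    qualified_error = None
except sqlite3.OperationalError as exc:
    qualified_error = str(exc)

print(unqualified)
print(qualified_error)
```

The unqualified query quietly returns rows, while the qualified one is rejected before it runs, which is the failure you actually want to see.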
Personally, I suggest that unless you have only one table being referenced in a query, you alias and qualify all your columns. This not only ensures that the wrong column can't be referenced (such as in a subquery) but also means that other readers know exactly which columns are being referenced. It also prevents failures in the future. I have honestly lost count of how many times over the years I have had a process fall over with the "ambiguous column" error because a table's definition was changed and a query referencing that table wasn't properly qualified by the developer...

How to create INSERT logs from SELECTs?

As school work we're supposed to create a table that logs all operations done by users on another table. To be more clear, say I have table1 and logtable: table1 can contain any info (names, ids, job, etc.), and logtable contains info on who did what, and when, on table1. Using a function and a trigger I managed to get the INSERT, DELETE and UPDATE operations logged in logtable, but we're also supposed to keep a log of SELECTs. To be more specific about the SELECTs: if you do a SELECT on a view, this is supposed to be logged into logtable via an INSERT, so essentially the logtable is supposed to get a new row recording that somebody did a SELECT. My problem is that I can't figure out any way to accomplish this, as SELECTs can't make use of triggers and in turn can't make use of functions, and rules don't allow two different operations to take place. The only thing that came close was using query logs, but as the database is the school's and not mine I can't make any use of them.
Here is a rough example of what I'm working with (in reality tstamp has hours minutes and such):
id operation hid tablename who tstamp val_new val_old
x INSERT x table1 name YYYY-MM-DD newValues previousValues
That works as intended, but what I also need to get to work is this (Note: Whether val_new and old come out as empty or not in this case is not a concern):
id operation hid tablename who tstamp val_new val_old
x SELECT x table1 name YYYY-MM-DD NULL previousValues
Any and all help is appreciated.
Here is an example:
CREATE TABLE public.test (id integer PRIMARY KEY, value integer);
INSERT INTO test VALUES (1,42),(2,13);
CREATE TABLE test_log(id serial primary key, dbuser varchar,datetime timestamp);
-- get_test() inserts username / timestamp into log, then returns all rows
-- of test
CREATE OR REPLACE FUNCTION get_test() RETURNS SETOF test AS '
INSERT INTO test_log (dbuser,datetime)VALUES(current_user,now());
SELECT * FROM test;'
language 'sql';
-- now a view returns the full row set of test by instead calling our function
CREATE VIEW test_v AS SELECT * FROM get_test();
SELECT * FROM test_v;
id | value
----+-------
1 | 42
2 | 13
(2 rows)
SELECT * FROM test_log;
id | dbuser | datetime
----+----------+----------------------------
1 | postgres | 2020-11-30 12:42:00.188341
(1 row)
If your table has many rows and/or the selects are complex, you don't want to use this view for performance reasons.

Primary key duplicate in a table-valued parameter in stored procedure

I am using the following code to insert data via a table-valued parameter in my SP. It works when there is one record in my TVP, but when there is more than one record it raises the following error:
Violation of Primary key constraint 'PK_ReceivedCash'. Cannot insert duplicate key in object 'Banking.ReceivedCash'. The statement has been terminated.
insert into banking.receivedcash(ReceivedCashID,Date,Time)
select (select isnull(Max(ReceivedCashID),0)+1 from Banking.ReceivedCash),t.Date,t.Time from #TVPCash as t
Your query is indeed flawed if there is more than one row in #TVPCash. The subquery that retrieves the maximum ReceivedCashID returns one constant value, and that same value is then used for every row of #TVPCash inserted into Banking.ReceivedCash.
I strongly suggest finding alternatives rather than doing it this way. Multiple users might run this query and retrieve the same maximum. If you insist on keeping the query as it is, try running the following:
insert into banking.receivedcash(
ReceivedCashID,
Date,
Time
)
select
(select isnull(Max(ReceivedCashID),0) from Banking.ReceivedCash)+
ROW_NUMBER() OVER(ORDER BY t.Date,t.Time),
t.Date,
t.Time
from
#TVPCash as t
This uses ROW_NUMBER to count the row number in #TVPCash and adds this to the maximum ReceivedCashID of Banking.ReceivedCash.
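To see both behaviours side by side, here is a small sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. It is an illustration under assumptions: the schema prefix is dropped, a plain table named TVPCash replaces the table-valued parameter, IFNULL replaces ISNULL, the sample rows are made up, and SQLite 3.25 or later is assumed for ROW_NUMBER support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ReceivedCash (ReceivedCashID INTEGER PRIMARY KEY, Date TEXT, Time TEXT);
CREATE TABLE TVPCash (Date TEXT, Time TEXT);  -- stands in for the TVP
INSERT INTO ReceivedCash VALUES (1, '2024-01-01', '09:00'), (2, '2024-01-01', '10:00');
INSERT INTO TVPCash VALUES ('2024-01-02', '08:00'), ('2024-01-02', '08:30'),
                           ('2024-01-03', '12:00');
""")

# Original shape: MAX(...)+1 is one constant, reused for every incoming row.
flawed = """
INSERT INTO ReceivedCash (ReceivedCashID, Date, Time)
SELECT (SELECT IFNULL(MAX(ReceivedCashID), 0) + 1 FROM ReceivedCash), t.Date, t.Time
FROM TVPCash AS t
"""
try:
    conn.execute(flawed)
    flawed_error = None
except sqlite3.IntegrityError as exc:
    flawed_error = str(exc)
conn.rollback()  # discard anything left over from the failed attempt

# Suggested shape: add a per-row ROW_NUMBER to the current maximum.
fixed = """
INSERT INTO ReceivedCash (ReceivedCashID, Date, Time)
SELECT (SELECT IFNULL(MAX(ReceivedCashID), 0) FROM ReceivedCash)
       + ROW_NUMBER() OVER (ORDER BY t.Date, t.Time),
       t.Date, t.Time
FROM TVPCash AS t
"""
conn.execute(fixed)
conn.commit()

ids = [row[0] for row in conn.execute(
    "SELECT ReceivedCashID FROM ReceivedCash ORDER BY ReceivedCashID")]
print(flawed_error)
print(ids)
```

Note that this sketch runs in a single session, so it says nothing about the concurrency concern raised above: two sessions can still read the same maximum at the same time.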

Why is this Postgres query throwing "duplicate key value violates unique constraint"? [duplicate]

I've implemented a simple update/insert query like this:
-- NOTE: :time is replaced in real code, ids are placed statically for example purposes
-- set status_id=1 to existing rows, update others
UPDATE account_addresses
SET status_id = 1, updated_at = :time
WHERE account_id = 1
AND address_id IN (1,2,3)
AND status_id IN (2);
-- filter values according to what that update query returns, i.e. construct query like this to insert remaining new records:
INSERT INTO account_addresses (account_id, address_id, status_id, created_at, updated_at)
SELECT account_id, address_id, status_id, created_at::timestamptz, updated_at::timestamptz
FROM (VALUES (1,1,1,:time,:time),(1,2,1,:time,:time)) AS sub(account_id, address_id, status_id, created_at, updated_at)
WHERE NOT EXISTS (
SELECT 1 FROM account_addresses AS aa2
WHERE aa2.account_id = sub.account_id AND aa2.address_id = sub.address_id
)
RETURNING id;
-- throws:
-- PG::UniqueViolation: ERROR: duplicate key value violates unique constraint "..."
-- DETAIL: Key (account_id, address_id)=(1, 1) already exists.
The reason I'm doing it this way is that the record MAY already exist with status_id=2. If so, set status_id=1.
Then insert the new records. If a record already exists but was not affected by the first UPDATE query (i.e. rows with status_id=3), ignore it.
This works nicely, but when it runs concurrently it fails with a duplicate key error because of a race condition.
But why is a race condition occurring if I'm trying to do that "insert-where-not-exists" atomically?
Ah. I just searched a little more and insert where not exists is not atomic.
Quote from http://www.postgresql.org/message-id/26970.1296761016@sss.pgh.pa.us :
Mage writes:
The main question is that isn't "insert into ... select ... where not
exists" atomic?
No, it isn't: it will fail in the presence of other transactions
doing the same thing, because the EXISTS test will only see rows that
committed before the command started. You might care to read the
manual's chapter about concurrency:
http://www.postgresql.org/docs/9.0/static/mvcc.html
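The interleaving described in the quote can be replayed deterministically. The sketch below is only a simulation under stated assumptions: it uses Python's built-in sqlite3 module with a single connection, and the two "sessions" A and B are just labels whose existence checks both run before either insert, mimicking two PostgreSQL commands that each started before the other committed. The helper names row_missing and insert_row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE account_addresses (
    account_id INTEGER,
    address_id INTEGER,
    status_id  INTEGER,
    UNIQUE (account_id, address_id)
)
""")

def row_missing(account_id, address_id):
    """The NOT EXISTS half of the statement: is the row absent right now?"""
    cur = conn.execute(
        "SELECT 1 FROM account_addresses WHERE account_id = ? AND address_id = ?",
        (account_id, address_id),
    )
    return cur.fetchone() is None

def insert_row(account_id, address_id):
    """The INSERT half of the statement."""
    conn.execute(
        "INSERT INTO account_addresses VALUES (?, ?, 1)",
        (account_id, address_id),
    )

# Both sessions run their existence test before either one has inserted.
a_sees_missing = row_missing(1, 1)
b_sees_missing = row_missing(1, 1)

outcomes = []
for session, sees_missing in (("A", a_sees_missing), ("B", b_sees_missing)):
    if sees_missing:
        try:
            insert_row(1, 1)
            outcomes.append((session, "inserted"))
        except sqlite3.IntegrityError:
            outcomes.append((session, "duplicate key"))

print(outcomes)
```

Both checks pass, yet only the first insert succeeds; the second hits the unique constraint, which is the error reported in the question.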