I have tried to create a TEMP table using the "Database SQL Executor" node with this expression:
CREATE TEMPORARY TABLE tmp_table AS
SELECT type_id, created_at
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t2.id = (SELECT id
               FROM t2
               WHERE id = t1.id
                 AND event_date <= created_at
               ORDER BY event_date DESC
               LIMIT 1);
Unfortunately, although there is no mistake in the syntax itself (the code above works when run from a psql console) and the node executes without error, the table does not exist in the database afterwards.
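A likely explanation (an assumption on my part, since the node's connection handling isn't shown): PostgreSQL temporary tables are private to the session that created them and are dropped as soon as that session ends, so if the executor node opens its own connection and closes it after running the statement, the table is already gone. A minimal sketch of a workaround using a non-temporary table:
-- Assumption: the node's session closes after execution, taking any
-- TEMPORARY table with it. A regular (or UNLOGGED) table survives.
DROP TABLE IF EXISTS tmp_table;
CREATE UNLOGGED TABLE tmp_table AS
SELECT type_id, created_at
FROM t1
LEFT JOIN t2 ON t1.id = t2.id
WHERE t2.id = (SELECT id
               FROM t2
               WHERE id = t1.id
                 AND event_date <= created_at
               ORDER BY event_date DESC
               LIMIT 1);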
EDIT : Thanks to #SABER-FICTIONALCHARACTER
I'm currently migrating from SQL Server to PostgreSQL and got confused by an update query in Postgres.
I have query like this in SQL Server:
UPDATE t1
SET col1 = 'xx'
FROM table1 t1
LEFT JOIN table2 t2 ON t1.id = t2.id
WHERE t2.id is null
how do you do this in postgres?
thanks in advance
The left join is used to simulate a "NOT EXISTS" condition, so you can rewrite it to:
update table1 t1
set col1 = 'xx'
where not exists (select *
                  from table2 t2
                  where t1.id = t2.id);
As a side note: Postgres handles this differently than SQL Server, and in general you should not repeat the target table of an UPDATE statement in the FROM clause.
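For the cases where you do need columns from the second table, Postgres has its own join-update form in which the target table appears only once; the table and column names here are illustrative:
-- Postgres join-update: the joined table goes in FROM, the join
-- condition in WHERE, and the target table is not repeated.
UPDATE table1 t1
SET col1 = t2.col1
FROM table2 t2
WHERE t1.id = t2.id;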
Trying to do a simple FULL OUTER JOIN on a timestamp and it is outputting the full Cartesian product instead of matching identical dates. What is wrong here?
SQL Fiddle with example data
CREATE TABLE A (
id INT,
time TIMESTAMP
);
CREATE TABLE B (
id INT,
time TIMESTAMP
);
Query:
SELECT A.Id AS a_id, A.Time AS a_time, B.Id AS b_id, B.Time AS b_time
FROM A
FULL OUTER JOIN B ON A.Time = B.Time
-- This works:
-- SELECT A.id, A.time, B.id, B.time
-- FROM A
-- FULL OUTER JOIN B ON A.id = B.id
You are using the wrong format pattern in TO_DATE() in your INSERTs. It is easy to test if you run:
SELECT * FROM A;
SELECT * FROM B;
Instead of
TO_DATE('01-01-2002', '%d-%m-%Y')
it should be:
TO_DATE('01-01-2002', 'DD-MM-YYYY')
PostgreSQL's TO_DATE() uses template patterns such as DD, MM, and YYYY, not C-style % specifiers.
SQL DEMO
In your SQL Fiddle all the inserted dates end up the same because the date pattern is wrong. Try using TO_DATE('01-01-2002', 'DD-MM-YYYY') instead of TO_DATE('01-01-2002', '%d-%m-%y').
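For completeness, the inserts would then look something like this (a sketch; the actual ids and dates in the fiddle may differ):
-- 'DD-MM-YYYY' matches two-digit day, two-digit month, four-digit year.
INSERT INTO A (id, time) VALUES (1, TO_DATE('01-01-2002', 'DD-MM-YYYY'));
INSERT INTO B (id, time) VALUES (1, TO_DATE('02-01-2002', 'DD-MM-YYYY'));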
I want to copy rows from one table, t2, to another, t1, while excluding rows with values already existing in t1. The usual approach of NOT IN works fine, but only as long as there are no multiple occurrences of the same value in the source table t2.
Now, assuming I have two tables with the schema:
CREATE TABLE t1 ( id INTEGER );
CREATE TABLE t2 ( id INTEGER );
then insert data into them like:
INSERT INTO t1 VALUES (1);
INSERT INTO t2 VALUES (1);
INSERT INTO t2 VALUES (2);
Now, I try to insert all data from t2 into t1, excluding values pre-existing in t1:
INSERT INTO t1 (id) SELECT t2.id FROM t2
WHERE t2.id NOT IN ( SELECT t1.id FROM t1 WHERE t1.id = t2.id );
it works flawlessly; the row in t2 with the value '1' did not get inserted a second time into t1:
SELECT * FROM t1;
id
----
1
2
(2 rows)
But when there are multiple occurrences of the same value in t2, it doesn't check whether they exist in t1 for each individual insert; it seems to check only once for the whole statement. Let's continue my example with:
DELETE FROM t1;
INSERT INTO t2 VALUES (2);
SELECT * FROM t2;
id
----
1
2
2
(3 rows)
INSERT INTO t1 (id) SELECT t2.id FROM t2
WHERE t2.id NOT IN ( SELECT t1.id FROM t1 WHERE t1.id = t2.id );
SELECT * FROM t1;
id
----
1
2
2
(3 rows)
The same result is achieved with WHERE NOT EXISTS as well.
Does anyone have an idea how to check for existing values in t1 on an individual row level, to prevent multiple occurrences?
I could use ON CONFLICT DO ... as well, but I'd rather not, since the idea is to split the data coming from t2 into a "clean" t1 and a "dirty" t1_faulty where all the rows are collected which do not fit some given criteria (one of which is the uniqueness of id, which is what I am asking about here).
I think you could simply filter the records you want from the source table (t2).
You might use DISTINCT ON:
INSERT INTO t1 (id) SELECT distinct on (t2.id) t2.id FROM t2
WHERE t2.id NOT IN ( SELECT t1.id FROM t1 WHERE t1.id = t2.id );
or GROUP BY:
INSERT INTO t1 (id) SELECT t2.id FROM t2
WHERE t2.id NOT IN ( SELECT t1.id FROM t1 WHERE t1.id = t2.id )
GROUP BY t2.id;
or, if you want only the records that are already unique in t2, add a HAVING COUNT = 1:
INSERT INTO t1 (id) SELECT t2.id FROM t2
WHERE t2.id NOT IN ( SELECT t1.id FROM t1 WHERE t1.id = t2.id )
GROUP BY t2.id
HAVING COUNT(t2.id) = 1;
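Complementing that last variant, the duplicates themselves could then be routed to the reject table the question mentions (a sketch; I'm assuming t1_faulty has the same single-column schema as t1):
-- Collect ids that occur more than once in t2 into the "dirty" table.
INSERT INTO t1_faulty (id)
SELECT t2.id
FROM t2
WHERE NOT EXISTS ( SELECT 1 FROM t1 WHERE t1.id = t2.id )
GROUP BY t2.id
HAVING COUNT(t2.id) > 1;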
In my database when I run the following query, I get 1077 as output.
select count(distinct a_t1) from t1;
Then, when I run this query, I get 459.
select count(distinct a_t1) from t1
where a_t1 in (select a_t1 from t1 join t2 using (a_t1_t2) where a_t2=0);
The above is the same as this query, which also gives 459:
select count(distinct a_t1) from t1 join t2 using (a_t1_t2) where a_t2=0
But when I run this query, I get 0 instead of the 618 I was expecting:
select count(distinct a_t1) from t1
where a_t1 not in (select a_t1 from t1 join t2 using (a_t1_t2) where a_t2=0);
I am running PostgreSQL 9.1.5, though that probably isn't relevant. Please point out my mistake in the above query.
UPDATE 1:
I created a new table and stored the result of the above subquery in it. Then, I ran a few queries:
select count(distinct a_t1) from t1
where a_t1 not in (select a_t1 from sub_query_table order by a_t1 limit 10);
And hooray! Now I get 10 as the answer! I was able to increase the limit up to 450. After that, I started getting 0 again.
UPDATE 2:
The sub_query_table has 459 values in it. Finally, this query gives me the required answer:
select count(distinct a_t1) from t1
where a_t1 not in (select a_t1 from sub_query_table order by a_t1 limit 459);
Whereas this one gives 0 as the answer:
select count(distinct a_t1) from t1
where a_t1 not in (select a_t1 from sub_query_table);
But, why is this happening?
NOT IN behaves unintuitively with NULLs: if the subquery returns even one NULL, then a_t1 NOT IN (...) evaluates to NULL instead of true for every row, so nothing matches. That is why your LIMIT experiments stopped working once the limit reached the NULL values. Filter the NULLs out of the subquery:
select count(distinct a_t1) from t1
where a_t1 not in (select a_t1 from t1 join t2 using (a_t1_t2)
                   where a_t2 = 0 and a_t1 is not null);
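An alternative that sidesteps the NULL issue entirely is NOT EXISTS, which simply never matches on NULL; a sketch using the question's column names:
-- NOT EXISTS is NULL-safe: a NULL a_t1 in the inner query just never
-- matches, so no rows are silently filtered out of the outer query.
select count(distinct o.a_t1)
from t1 o
where not exists (select 1
                  from t1
                  join t2 using (a_t1_t2)
                  where a_t2 = 0
                    and t1.a_t1 = o.a_t1);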
I need to delete a huge amount of data without driving up CPU on the SQL server.
Here is an example of my query. The subquery returns about 999K rows, and I need to delete them one by one. But the problem is that it deletes the first thousand and then gives this error:
Msg -2, Level 11, State 0, Line 0
Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
DECLARE @i INT
CREATE TABLE #TempListingTable (ID INT Primary Key IDENTITY(1,1), ListingID INT)
DECLARE @numrows INT
DECLARE @ListingID INT
INSERT #TempListingTable
SELECT T1.ListingID
FROM Table T1 WITH(NOLOCK)
LEFT OUTER JOIN Table T2
ON T1.ID = T2.ID
WHERE T1.ID IS NULL AND T1.ID IS NOT NULL
SET @i = 1
SET @numrows = (SELECT COUNT(*) FROM #TempListingTable)
IF @numrows > 0
WHILE (@i <= (SELECT MAX(ID) FROM #TempListingTable))
BEGIN
SET @ListingID = (SELECT ListingID FROM #TempListingTable WHERE ID = @i)
DELETE Listing WHERE ListingID = @ListingID
SET @i = @i + 1
END
If I delete with a join instead, as you can see below, the CPU goes up and it gives timeouts:
DELETE T1
FROM Table T1 WITH(NOLOCK)
LEFT OUTER JOIN Table T2
ON T1.ID = T2.ID
WHERE T1.ID IS NULL AND T1.ID IS NOT NULL
What would be the best approach in this case?
You need to fix your SQL first; it should never delete any rows, since T1.ID IS NULL AND T1.ID IS NOT NULL can never both be true. Fix that and use something like this:
WHILE 1 = 1
BEGIN
DELETE TOP (1000) T1
FROM Table T1 WITH(NOLOCK)
LEFT OUTER JOIN Table T2
ON T1.ID = T2.ID
WHERE 1 = 2
-- replace the 'where' statement with a proper WHERE statement.
-- I assume this is the 'where' statement you want
--WHERE T2.ID IS NULL
--AND T1.ID IS NOT NULL
IF @@ROWCOUNT = 0 BREAK
END
That's a .NET timeout, and it has nothing to do with SQL Server. You don't need to increase the timeout; you need to batch the deletes. Since you say it deletes the first thousand and then gives an error, take your delete script and put it in a proc that limits the number of rows deleted to 1000 using DELETE TOP (1000), then have the application fire that proc repeatedly. Have the proc give a return value (0 if there are no more rows to delete, 1 if there are) to control the application firing the proc.
Otherwise, can you elaborate on why you need to delete almost a million rows one at a time?
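For illustration, a sketch of such a proc (the proc and table names are placeholders, and the WHERE clause is the corrected condition assumed in the answer above):
-- Hypothetical batching proc: deletes up to 1000 rows per call and returns
-- 1 while rows remain, 0 when done, so the application knows when to stop.
CREATE PROCEDURE dbo.DeleteListingBatch
AS
BEGIN
    DELETE TOP (1000) T1
    FROM Listing T1
    LEFT OUTER JOIN OtherTable T2
        ON T1.ID = T2.ID
    WHERE T2.ID IS NULL;   -- assumed intent: rows with no match in OtherTable

    IF @@ROWCOUNT = 0
        RETURN 0;          -- nothing left to delete
    RETURN 1;              -- more rows remain; call the proc again
END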