How would I do the following TSQL query in DB2? I'm having problems creating a temp table based on the results from a query.
SELECT
COLUMN_1, COLUMN_2, COLUMN_3
INTO #TEMP_A
FROM TABLE_A
WHERE COLUMN_1 = 1 AND COLUMN_2 = 2
The error message is:
"Error: SQL0104N An unexpected token "#TEMP_A" was found following "". Expected tokens may include: ":". SQLSTATE=42601"
You have to declare a temp table in DB2 before you can use it. You can either base the definition on the query you are running:
DECLARE GLOBAL TEMPORARY TABLE SESSION.YOUR_TEMP_TABLE_NAME AS (
SELECT COLUMN_1, COLUMN_2, COLUMN_3
FROM TABLE_A
) DEFINITION ONLY
Or "manually" define the columns:
DECLARE GLOBAL TEMPORARY TABLE SESSION.YOUR_TEMP_TABLE_NAME (
COLUMN_1 CHAR(10)
,COLUMN_2 TIMESTAMP
,COLUMN_3 INTEGER
)
Then populate it:
INSERT INTO SESSION.YOUR_TEMP_TABLE_NAME
SELECT COLUMN_1, COLUMN_2, COLUMN_3
FROM TABLE_A
WHERE COLUMN_1 = 1
AND COLUMN_2 = 2
It's not quite as straightforward as in SQL Server. :)
And even though it's called a "global" temporary table, it only exists for the current session. Note that all temp tables should be prefixed with the SESSION schema. If you do not provide a schema name, then SESSION will be implied.
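Putting the steps above together for the query in the question, a sketch for DB2 LUW might look like the following. The ON COMMIT PRESERVE ROWS and NOT LOGGED clauses are additions that the answer above does not mention: by default a declared temporary table uses ON COMMIT DELETE ROWS, so with autocommit enabled the rows would be gone right after the INSERT. Declaring a temp table also requires a user temporary table space that your user is allowed to use.

```sql
-- Sketch for DB2 LUW; TEMP_A is the table name from the question.
DECLARE GLOBAL TEMPORARY TABLE SESSION.TEMP_A AS (
    SELECT COLUMN_1, COLUMN_2, COLUMN_3
    FROM TABLE_A
) DEFINITION ONLY
ON COMMIT PRESERVE ROWS   -- assumption: keep the rows across commits
NOT LOGGED;

-- Populate the temp table with the filtered rows.
INSERT INTO SESSION.TEMP_A
SELECT COLUMN_1, COLUMN_2, COLUMN_3
FROM TABLE_A
WHERE COLUMN_1 = 1
  AND COLUMN_2 = 2;

-- The table is visible only to the current session.
SELECT * FROM SESSION.TEMP_A;
```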
Maybe the WITH clause (a common table expression) is what you are looking for:
with TEMP_A as (
SELECT COLUMN_1, COLUMN_2, COLUMN_3
FROM TABLE_A
WHERE COLUMN_1 = 1 AND COLUMN_2 = 2
)
-- now use TEMP_A
select * from TEMP_A
As it turned out, I did not have permissions to create temp tables.
We have a static database we constantly update with loader scripts. These loader scripts get current information from third-party sources, clean it, and upload it to the database.
I have already made some SQL scripts to ensure that the required schemas and tables exist. Now I'd like to check that each table has the expected row count.
I did something like this:
select case when count(*) = <someNumber>
then 'someSchema.someTable OK'
else 'someSchema.someTable BAD row count' end
from someSchema.someTable;
But writing this kind of query for ~300 tables is cumbersome.
Now I was thinking maybe there's a way to have a table like:
create table expected_row_count (
schema_name varchar,
table_name varchar,
row_count bigint
);
And somehow test all listed tables and only output the ones that fail the count check. But I'm kind of lost now... Should I try to write a function? Can a table like this be used to build queries and execute them?
Full credit goes to @a_horse_with_no_name; I'm posting an answer for completeness:
Check row count
First let's create some data to test the query:
create schema if not exists data;
create table if not exists data.test1 (nothing int);
create table if not exists data.test2 (nothing int);
insert into data.test1 (nothing)
(select random() from generate_series(1, 28));
insert into data.test2 (nothing)
(select random() from generate_series(1, 55));
create table if not exists public.expected_row_count (
table_schema varchar not null default '',
table_name varchar not null default '',
row_count bigint not null default 0
);
insert into public.expected_row_count (table_schema, table_name, row_count) values
('data', 'test1', (select count(*) from data.test1)),
('data', 'test2', (select count(*) from data.test2))
;
Now the query to check the data:
select * from (
select
table_schema,
table_name,
(xpath('/row/cnt/text()', xml_count))[1]::text::bigint as row_count
from (
select
table_schema,
table_name,
query_to_xml(format('select count(*) as cnt from %I.%I', table_schema, table_name), false, true, '') as xml_count
from information_schema.tables
where table_schema = 'data' --<< change here for the schema you want
) infs ) as r
inner join expected_row_count erc
on r.table_schema = erc.table_schema
and r.table_name = erc.table_name
and r.row_count != erc.row_count
;
The query above should return an empty result if all counts are OK; otherwise it returns the tables whose row counts differ from the expected ones. To check it, update the count for some table in expected_row_count and re-run the query. For example:
update expected_row_count set row_count = 666 where table_name = 'test1';
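Since the question also asks whether a function could do this, here is a minimal PL/pgSQL sketch of the same check driven by expected_row_count. The function name check_row_counts and its output column names are made up for illustration. It assumes every listed table exists; a missing table makes the EXECUTE raise an error.

```sql
-- Hypothetical helper: returns only the tables whose actual row count
-- differs from the expected one stored in public.expected_row_count.
create or replace function public.check_row_counts()
returns table (schema_name text, tbl_name text, expected_count bigint, actual_count bigint)
language plpgsql
as $$
declare
    rec record;
begin
    for rec in
        select erc.table_schema, erc.table_name, erc.row_count
        from public.expected_row_count erc
    loop
        -- %I quotes the identifiers, so unusual table names are handled safely
        execute format('select count(*) from %I.%I', rec.table_schema, rec.table_name)
            into actual_count;

        if actual_count <> rec.row_count then
            schema_name := rec.table_schema;
            tbl_name := rec.table_name;
            expected_count := rec.row_count;
            return next;  -- emit one row per failing table
        end if;
    end loop;
end;
$$;

-- Empty result means all counts match.
select * from public.check_row_counts();
```

Unlike the query_to_xml version, this only counts the tables listed in expected_row_count rather than every table in a schema.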
Let's say we have the following table:
CREATE TABLE foo (
column_1 bigint,
column_2 bytea DEFAULT gen_random_bytes(2),
PRIMARY KEY (column_1, column_2)
);
Note: We want column_2 to be random & cryptographically strong.
How do we insert a row without causing a primary key conflict?
I guess we'd have to do a loop until gen_random_bytes(2) returns a unique result? If so, can we do this loop with pure SQL, maybe with recursive CTE, instead of with plpgsql?
insert into t (col1, col2)
select 1, ('\x' || right('000' || to_hex(i), 4))::bytea
from (
select generate_series(0, 65535) i
except
select get_byte(col2, 0) * 256 + get_byte(col2, 1)
from t
where col1 = 1
) s
order by random()
limit 1
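If column_2 has to keep coming from gen_random_bytes (the query above picks a free value with random(), which is not cryptographically strong), one alternative is to let the column default generate the value and simply skip the insert on a collision. This is only a sketch: it assumes PostgreSQL 9.5 or later for ON CONFLICT, and it makes a single attempt, so the caller has to re-run the statement whenever it returns no row.

```sql
-- Uses the table foo from the question; column_2 is filled in by its
-- DEFAULT gen_random_bytes(2), which requires the pgcrypto extension.
INSERT INTO foo (column_1)
VALUES (1)
ON CONFLICT (column_1, column_2) DO NOTHING
RETURNING column_2;
-- One row returned: the insert succeeded and this is the generated value.
-- No row returned: the random value collided with an existing key; retry.
```

With only two random bytes there are 65536 possible values per column_1, so retries become more frequent as those fill up.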
Hello what is the easiest way to duplicate a DB record over the same table?
My problem is that the table where I am doing this has many columns, like 100+, and I don't like how the solution looks. Here is what I do (this is inside a plpgsql function):
...
1. duplicate record
INSERT INTO history
(SELECT NEXTVAL('history_id_seq'), col_1, col_2, ... , col_100
FROM history
WHERE history_id = 1234
ORDER BY datetime DESC
LIMIT 1)
RETURNING
history_id INTO new_history_id;
2. update some columns
UPDATE history
SET
col_5 = 'test_5',
col_23 = 'test_23',
datetime = CURRENT_TIMESTAMP
WHERE history_id = new_history_id;
Here are the problems I am attempting to solve:
1. Listing all these 100+ columns looks lame.
2. When a new column is eventually added, the function has to be updated too.
3. On separate DB instances the column order might differ, which would cause the function to fail.
I am not sure if I can list them once more (solving issue 3) like insert into <table> (<columns_list>) values (<query>) but then the query looks even uglier.
I would like to achieve something like 'insert into ', but this seems impossible because the unique primary key constraint will raise a duplication error.
Any suggestions?
Thanks in advance for your time.
This isn't pretty or particularly optimized but there are a couple of ways to go about this. Ideally, you might want to do this all in an UPDATE trigger though you could implement a duplication function something like this:
-- create source table
CREATE TABLE history (history_id serial not null primary key, col_2 int, col_3 int, col_4 int, datetime timestamptz default now());
-- add some data
INSERT INTO history (col_2, col_3, col_4)
SELECT g, g * 10, g * 100 FROM generate_series(1, 100) AS g;
-- function to duplicate record
CREATE OR REPLACE FUNCTION fn_history_duplicate(p_history_id integer) RETURNS SETOF history AS
$BODY$
DECLARE
cols text;
insert_statement text;
BEGIN
-- build list of columns
SELECT string_agg(quote_ident(column_name), ',' ORDER BY ordinal_position) INTO cols
FROM information_schema.columns
WHERE (table_schema, table_name) = ('public', 'history')
AND column_name <> 'history_id';
-- build insert statement
insert_statement := 'INSERT INTO history (' || cols || ') SELECT ' || cols || ' FROM history WHERE history_id = $1 RETURNING *';
-- execute statement
RETURN QUERY EXECUTE insert_statement USING p_history_id;
RETURN;
END;
$BODY$
LANGUAGE plpgsql;
-- test
SELECT * FROM fn_history_duplicate(1);
history_id | col_2 | col_3 | col_4 | datetime
------------+-------+-------+-------+-------------------------------
101 | 1 | 10 | 100 | 2013-04-15 14:56:11.131507+00
(1 row)
As I noted in my original comment, you might also take a look at the colnames extension as an alternative to querying the information schema.
You don't need the update anyway, you can supply the constant values directly in the SELECT statement:
INSERT INTO history
SELECT NEXTVAL('history_id_seq'),
col_1,
col_2,
col_3,
col_4,
'test_5',
...
'test_23',
...,
col_100
FROM history
WHERE history_id = 1234
ORDER BY datetime DESC
LIMIT 1
RETURNING history_id INTO new_history_id;
I have a table in which unique records are denoted by a composite key, such as (COL_A, COL_B).
I have checked and confirmed that I have duplicate rows in my table by using the following query:
select COL_A, COL_B, COUNT(*)
from MY_TABLE
group by COL_A, COL_B
having count(*) > 1
order by count(*) desc
Now, I would like to remove all duplicate records but keep only one.
Could someone please shed some light on how to achieve this with 2 columns?
EDIT:
Assume the table only has COL_A and COL_B
1st solution
It is flexible, because you can add more columns than COL_A and COL_B:
-- create table with an identity field
-- using the identity we can decide which row to delete
create table MY_TABLE_COPY
(
id int identity,
COL_A varchar(30),
COL_B varchar(30)
/*
other columns
*/
)
go
-- copy data
insert into MY_TABLE_COPY (COL_A,COL_B/*other columns*/)
select COL_A, COL_B /*other columns*/
from MY_TABLE
group by COL_A, COL_B
having count(*) > 1
-- delete data from MY_TABLE
-- only duplicates (!)
delete MY_TABLE
from MY_TABLE_COPY c, MY_TABLE t
where c.COL_A=t.COL_A
and c.COL_B=t.COL_B
go
-- copy data without duplicates
insert into MY_TABLE (COL_A, COL_B /*other columns*/)
select t.COL_A, t.COL_B /*other columns*/
from MY_TABLE_COPY t
where t.id = (
select max(id)
from MY_TABLE_COPY c
where t.COL_A = c.COL_A
and t.COL_B = c.COL_B
)
go
2nd solution
If MY_TABLE really has only these two columns, you can use:
-- create table and copy data
select distinct COL_A, COL_B
into MY_TABLE_COPY
from MY_TABLE
-- delete data from MY_TABLE
-- (the copy holds every distinct pair, so this removes all matching rows)
delete MY_TABLE
from MY_TABLE_COPY c, MY_TABLE t
where c.COL_A=t.COL_A
and c.COL_B=t.COL_B
go
-- copy data without duplicates
insert into MY_TABLE
select t.COL_A, t.COL_B
from MY_TABLE_COPY t
go
Try:
-- Copy Current Table
SELECT * INTO #MY_TABLE_COPY FROM MY_TABLE
-- Delete all rows from the current table
DELETE FROM MY_TABLE
-- Insert only unique values, removing your duplicates
INSERT INTO MY_TABLE
SELECT DISTINCT * FROM #MY_TABLE_COPY
-- Remove Temp Table
DROP TABLE #MY_TABLE_COPY
That should work as long as you don't break any foreign keys when deleting rows from MY_TABLE.
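Another option that avoids copying the table at all is to number the rows within each (COL_A, COL_B) group and delete everything after the first. This is a sketch assuming SQL Server 2005 or later, where ROW_NUMBER() is available and a DELETE can target a CTE; the CTE name numbered and the column alias rn are made up for illustration.

```sql
-- Number the rows inside each (COL_A, COL_B) group.
WITH numbered AS (
    SELECT ROW_NUMBER() OVER (
               PARTITION BY COL_A, COL_B
               ORDER BY (SELECT NULL)   -- no preference for which copy survives
           ) AS rn
    FROM MY_TABLE
)
-- Deleting through the CTE removes the underlying rows of MY_TABLE,
-- leaving exactly one row per (COL_A, COL_B) pair.
DELETE FROM numbered
WHERE rn > 1;
```

If the table has other columns and it matters which duplicate is kept, replace ORDER BY (SELECT NULL) with an ordering on those columns.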
I'm using PostgreSQL 9.0 and I have a table with just an artificial key (auto-incrementing sequence) and another unique key. (Yes, there is a reason for this table. :)) I want to look up an ID by the other key or, if it doesn't exist, insert it:
SELECT id
FROM mytable
WHERE other_key = 'SOMETHING'
Then, if no match:
INSERT INTO mytable (other_key)
VALUES ('SOMETHING')
RETURNING id
The question: is it possible to save a round-trip to the DB by doing both of these in one statement? I can insert the row if it doesn't exist like this:
INSERT INTO mytable (other_key)
SELECT 'SOMETHING'
WHERE NOT EXISTS (SELECT * FROM mytable WHERE other_key = 'SOMETHING')
RETURNING id
... but that doesn't give the ID of an existing row. Any ideas? There is a unique constraint on other_key, if that helps.
Have you tried to union it?
Edit - this requires Postgres 9.1:
create table mytable (id serial primary key, other_key varchar not null unique);
WITH new_row AS (
INSERT INTO mytable (other_key)
SELECT 'SOMETHING'
WHERE NOT EXISTS (SELECT * FROM mytable WHERE other_key = 'SOMETHING')
RETURNING *
)
SELECT * FROM new_row
UNION
SELECT * FROM mytable WHERE other_key = 'SOMETHING';
results in:
id | other_key
----+-----------
1 | SOMETHING
(1 row)
No, there is no special SQL syntax that allows you to do select or insert. You can do what Ilia mentions and create a sproc, which means it will not do a round trip from the client to the server, but it will still result in two queries (three actually, if you count the sproc itself).
Using 9.5, I successfully tried this, based on Denis de Bernardy's answer:
only 1 parameter
no union
no stored procedure
atomic, thus no concurrency problems (I think...)
The Query:
WITH neworexisting AS (
INSERT INTO mytable(other_key) VALUES('hello 1')
ON CONFLICT(other_key) DO UPDATE SET existed=true -- need some update to return sth
RETURNING *
)
SELECT * FROM neworexisting
first call:
id|other_key|created |existed|
--|---------|-------------------|-------|
6|hello 1 |2019-09-11 11:39:29|false |
second call:
id|other_key|created |existed|
--|---------|-------------------|-------|
6|hello 1 |2019-09-11 11:39:29|true |
First create your table ;-)
CREATE TABLE mytable (
id serial NOT NULL,
other_key text NOT NULL,
created timestamptz NOT NULL DEFAULT now(),
existed bool NOT NULL DEFAULT false,
CONSTRAINT mytable_pk PRIMARY KEY (id),
CONSTRAINT mytable_uniq UNIQUE (other_key) --needed for on conflict
);
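You can use a stored procedure (PL/pgSQL body shown):
-- NOT EXISTS is used because comparing a missing row's id with "< 0" yields NULL, so the INSERT would never run
IF NOT EXISTS (SELECT 1 FROM mytable WHERE other_key = 'SOMETHING') THEN
INSERT INTO mytable (other_key) VALUES ('SOMETHING');
END IF;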
I have an alternative to Denis answer, that I think is less database-intensive, although a bit more complex:
create table mytable (id serial primary key, other_key varchar not null unique);
WITH table_sel AS (
SELECT id
FROM mytable
WHERE other_key = 'test'
UNION
SELECT NULL AS id
ORDER BY id NULLS LAST
LIMIT 1
), table_ins AS (
INSERT INTO mytable (id, other_key)
SELECT
COALESCE(id, NEXTVAL('mytable_id_seq'::REGCLASS)),
'test'
FROM table_sel
ON CONFLICT (id) DO NOTHING
RETURNING id
)
SELECT * FROM table_ins
UNION ALL
SELECT * FROM table_sel
WHERE id IS NOT NULL;
In the table_sel CTE I look for the matching row. If there is none, the union with SELECT NULL ensures that table_sel still returns one row.
In the table_ins CTE I try to insert the row I was just looking for. COALESCE(id, NEXTVAL('mytable_id_seq'::REGCLASS)) means: if id was found, use it; if id is null, take the next value from the sequence and insert the row with that new id. The ON CONFLICT clause ensures that nothing is inserted if id is already in mytable.
At the end I combine table_ins and table_sel with UNION ALL, so that I get my id either way and both CTEs are executed.
This query searches for other_key only once, and as a plain lookup ("find this value") rather than a NOT EXISTS check ("verify that this value is not in the table"), which is heavier; Denis's alternative uses other_key in both kinds of search. In my query the only existence check is on id, an integer primary key, which is fast by construction.
Minor tweak a decade late to Denis's excellent answer:
-- Create the table with a unique constraint
CREATE TABLE mytable (
id serial PRIMARY KEY
, other_key varchar NOT NULL UNIQUE
);
WITH new_row AS (
-- Only insert when we don't find anything, avoiding a table lock if
-- possible.
INSERT INTO mytable ( other_key )
SELECT 'SOMETHING'
WHERE NOT EXISTS (
SELECT *
FROM mytable
WHERE other_key = 'SOMETHING'
)
RETURNING *
)
(
-- This comes first in the UNION ALL since it'll almost certainly be
-- in the query cache. Marginally slower for the insert case, but also
-- marginally faster for the much more common read-only case.
SELECT *
FROM mytable
WHERE other_key = 'SOMETHING'
-- Don't check for duplicates to be removed
UNION ALL
-- If we reach this point in iteration, we needed to do the INSERT and
-- lock after all.
SELECT *
FROM new_row
) LIMIT 1 -- Just return whatever comes first in the results and allow
-- the query engine to cut processing short for the INSERT
-- calculation.
;
The UNION ALL tells the planner it doesn't have to collect results for de-duplication. The LIMIT 1 at the end allows the planner to short-circuit further processing/iteration once it knows there's an answer available.
NOTE: There is a race condition here and in the original answer. If two transactions run the query concurrently for an entry that does not exist yet, one of the INSERTs will fail with a unique constraint violation. The error can be suppressed with ON CONFLICT DO NOTHING, but the query will then return an empty set instead of the new row. This is a difficult problem because getting that info from another transaction would violate the I in ACID.
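For reference, the ON CONFLICT DO NOTHING variant mentioned in the note could be sketched as follows, assuming PostgreSQL 9.5 or later and the mytable definition above (the CTE name ins is made up). It does not remove the race: when another transaction inserts the same key concurrently, the INSERT is skipped and the SELECT may not see the other transaction's row yet, so the statement can return no row and the caller has to run it again.

```sql
WITH ins AS (
    -- Skips the insert instead of raising a unique violation.
    INSERT INTO mytable (other_key)
    VALUES ('SOMETHING')
    ON CONFLICT (other_key) DO NOTHING
    RETURNING id
)
SELECT id FROM ins                       -- the freshly inserted row, if any
UNION ALL
SELECT id FROM mytable                   -- otherwise the row that already existed
WHERE other_key = 'SOMETHING'
LIMIT 1;
-- An empty result means a concurrent insert was in flight: retry.
```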