How can I create a table with 10 MB of test data in DB2 Express-C?
Can you show me a basic example of how to insert random data?
CREATE TABLE topic_sources (
topic_id integer NOT NULL,
platform varchar(50) NOT NULL,
keywords varchar(50) default NULL,
PRIMARY KEY (topic_id,platform)
);
You can use a recursive query for that, something like
insert into topic_sources (topic_id, platform, keywords)
with tmp (i) as (
select 1 from sysibm.sysdummy1
union all
select i+1 from tmp where i < 1000000
)
select
int(rand()*10000),
'platform'||int(rand()*10),
'keyword'||int(rand()*100)
from tmp
Adjust the random number ranges and the number of rows as appropriate; note that (topic_id, platform) is the primary key, so the ranges have to be wide enough relative to the row count to avoid duplicate key errors.
The idea is taken from here (slide 14).
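If you want to check that the table has actually reached roughly 10 MB, one option (a sketch, assuming a DB2 version where the SYSIBMADM.ADMINTABINFO administrative view is available) is to look at the physical object sizes, which are reported in KB:
SELECT tabname,
data_object_p_size + index_object_p_size + lob_object_p_size AS total_size_kb
FROM sysibmadm.admintabinfo
WHERE tabschema = CURRENT SCHEMA
AND tabname = 'TOPIC_SOURCES';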
Related
How can I use default constraints, triggers, or some other mechanism to automatically insert multiple successive values from a sequence into multiple columns on the same row of a table?
A standard use of a sequence in SQL Server is to combine it with default constraints on multiple tables to essentially get a cross-table identity. See for example the section "C. Using a Sequence Number in Multiple Tables" in the Microsoft documentation article "Sequence Numbers".
This works great if you only want to get a single value from the sequence for each row inserted. But sometimes I want to get multiple successive values. So theoretically I would create a sequence and table like this:
CREATE SEQUENCE DocumentationIDs;
CREATE TABLE Product
(
ProductID BIGINT NOT NULL IDENTITY(1, 1) PRIMARY KEY
, ProductName NVARCHAR(100) NOT NULL
, MarketingDocumentationID BIGINT NOT NULL DEFAULT ( NEXT VALUE FOR DocumentationIDs )
, TechnicalDocumentationID BIGINT NOT NULL DEFAULT ( NEXT VALUE FOR DocumentationIDs )
, InternalDocumentationID BIGINT NOT NULL DEFAULT ( NEXT VALUE FOR DocumentationIDs )
);
Unfortunately this will insert the same value in all three columns. This is by design:
If there are multiple instances of the NEXT VALUE FOR function specifying the same sequence generator within a single Transact-SQL statement, all those instances return the same value for a given row processed by that Transact-SQL statement. This behavior is consistent with the ANSI standard.
Increment by hack
The only suggestion I could find online was to use a hack where you have the sequence increment by the number of columns you need to insert (three in my contrived example) and manually add to the NEXT VALUE FOR function in the default constraint:
CREATE SEQUENCE DocumentationIDs START WITH 1 INCREMENT BY 3;
CREATE TABLE Product
(
ProductID BIGINT NOT NULL IDENTITY(1, 1) PRIMARY KEY
, ProductName NVARCHAR(100) NOT NULL
, MarketingDocumentationID BIGINT NOT NULL DEFAULT ( NEXT VALUE FOR DocumentationIDs )
, TechnicalDocumentationID BIGINT NOT NULL DEFAULT ( ( NEXT VALUE FOR DocumentationIDs ) + 1 )
, InternalDocumentationID BIGINT NOT NULL DEFAULT ( ( NEXT VALUE FOR DocumentationIDs ) + 2 )
)
This does not work for me because not all tables using my sequence require the same number of values.
One possible way is to use an AFTER INSERT trigger.
The table definition needs to be changed slightly (the DocumentationID columns should default to 0, or be allowed to be nullable):
CREATE TABLE Product
(
ProductID BIGINT NOT NULL IDENTITY(1, 1)
, ProductName NVARCHAR(100) NOT NULL
, MarketingDocumentationID BIGINT NOT NULL CONSTRAINT DF_Product_1 DEFAULT (0)
, TechnicalDocumentationID BIGINT NOT NULL CONSTRAINT DF_Product_2 DEFAULT (0)
, InternalDocumentationID BIGINT NOT NULL CONSTRAINT DF_Product_3 DEFAULT (0)
, CONSTRAINT PK_Product PRIMARY KEY (ProductID)
);
And here is the trigger that does the job:
CREATE TRIGGER Product_AfterInsert ON Product
AFTER INSERT
AS
BEGIN
SET NOCOUNT ON;
IF NOT EXISTS (SELECT 1 FROM INSERTED)
RETURN;
-- One row per inserted product and per documentation column (1..3)
CREATE TABLE #DocIDs
(
ProductID BIGINT NOT NULL
, Num INT NOT NULL
, DocID BIGINT NOT NULL
, PRIMARY KEY (ProductID, Num)
);
-- Draw three successive sequence values for each inserted row
INSERT INTO #DocIDs (ProductID, Num, DocID)
SELECT
i.ProductID
, r.n
, NEXT VALUE FOR DocumentationIDs OVER (ORDER BY i.ProductID, r.n)
FROM INSERTED i
CROSS APPLY (VALUES (1), (2), (3)) r(n)
;
-- Pivot the three values per product back into one row and apply them
WITH Docs (ProductID, MarketingDocID, TechnicalDocID, InternalDocID)
AS (
SELECT ProductID, [1], [2], [3]
FROM #DocIDs d
PIVOT (MAX(DocID) FOR Num IN ([1], [2], [3])) pvt
)
UPDATE p
SET
p.MarketingDocumentationID = d.MarketingDocID
, p.TechnicalDocumentationID = d.TechnicalDocID
, p.InternalDocumentationID = d.InternalDocID
FROM Product p
JOIN Docs d ON d.ProductID = p.ProductID
;
END
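To see the trigger in action, a quick test might look like this (a sketch; the product names are made up for illustration):
INSERT INTO Product (ProductName)
VALUES (N'Widget'), (N'Gadget');

-- Each row now has three successive values from DocumentationIDs
-- instead of the zeros set by the default constraints.
SELECT ProductID, ProductName,
MarketingDocumentationID, TechnicalDocumentationID, InternalDocumentationID
FROM Product;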
I have a database about weather that updates every second.
It contains temperature and wind speed.
This is my database:
CREATE TABLE `new_table`.`test` (
`id` INT(10) NOT NULL,
`date` DATETIME NOT NULL,
`temperature` VARCHAR(25) NOT NULL,
`wind_speed` INT(10) NOT NULL,
`humidity` FLOAT NOT NULL,
PRIMARY KEY (`id`))
ENGINE = InnoDB
DEFAULT CHARACTER SET = utf8
COLLATE = utf8_bin;
I need to find the average temperature every hour.
This is my code:
SELECT AVG( temperature ), date
FROM new_table
GROUP BY HOUR ( date )
My query works, but the problem is that I want to store the averaged value and its date in another table.
This is the table:
CREATE TABLE `new_table`.`table1` (
`idsea_state` INT(10) NOT NULL,
`dateavg` DATETIME NOT NULL,
`avg_temperature` VARCHAR(25) NOT NULL,
PRIMARY KEY (`idsea_state`))
ENGINE = InnoDB
DEFAULT CHARACTER SET = utf8
COLLATE = utf8_bin;
Is it possible? Can you show me the code?
To insert new rows into a table based on data obtained from another table, set up an INSERT query targeting the destination table, then use a sub-query to pull the data from the source table; the result set returned by the sub-query supplies the values for the INSERT command.
Here is the basic structure, note that the VALUES keyword is not used:
INSERT INTO `table1`
(`dateavg`, `avg_temperature`)
SELECT `date` , avg(`temperature`)
FROM `test`;
It's also important to note that the columns returned by the sub-query are matched positionally to the columns in the INSERT list of the outer query;
e.g. if you had a query
INSERT INTO table1 (`foo`, `bar`, `baz`)
SELECT `a`, `y`, `g` FROM table2
a would be inserted into foo
y would go into bar
g would go into baz
due to their respective positions
I have made a working demo - http://www.sqlfiddle.com/#!9/ff740/4
I made the below changes to simplify the example and just demonstrate the concept involved.
Here are the DDL changes I made to your original code:
CREATE TABLE `test` (
`id` INT(10) NOT NULL AUTO_INCREMENT,
`date` DATETIME NOT NULL,
`temperature` FLOAT NOT NULL,
`wind_speed` INT(10),
`humidity` FLOAT ,
PRIMARY KEY (`id`))
ENGINE = InnoDB
DEFAULT CHARACTER SET = utf8
COLLATE = utf8_bin;
CREATE TABLE `table1` (
`idsea_state` INT(10) NOT NULL AUTO_INCREMENT,
`dateavg` VARCHAR(55),
`avg_temperature` VARCHAR(25),
PRIMARY KEY (`idsea_state`))
ENGINE = InnoDB
DEFAULT CHARACTER SET = utf8
COLLATE = utf8_bin;
INSERT INTO `test`
(`date`, `temperature`) VALUES
('2013-05-03', 7.5),
('2013-06-12', 17.5),
('2013-10-12', 37.5);
INSERT INTO `table1`
(`dateavg`, `avg_temperature`)
SELECT `date` , avg(`temperature`)
FROM `test`;
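To load the hourly averages the question actually asks for, rather than one overall average, a sketch (assuming MySQL's DATE_FORMAT to truncate the timestamp to the hour) could be:
INSERT INTO `table1`
(`dateavg`, `avg_temperature`)
SELECT DATE_FORMAT(`date`, '%Y-%m-%d %H:00:00'), AVG(`temperature`)
FROM `test`
GROUP BY DATE_FORMAT(`date`, '%Y-%m-%d %H:00:00');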
I want to generate a big data sample (almost 1 million records) for studying tuplesort.c's polyphase merge in PostgreSQL, and I would like the schema to be as follows:
CREATE TABLE Departments (code VARCHAR(4), UNIQUE (code));
CREATE TABLE Towns (
id SERIAL UNIQUE NOT NULL,
code VARCHAR(10) NOT NULL, -- not unique
article TEXT,
name TEXT NOT NULL, -- not unique
department VARCHAR(4) NOT NULL REFERENCES Departments (code),
UNIQUE (code, department)
);
How can I use generate_series and random() to do this? Thanks a lot!
To insert one million rows into Towns
insert into towns (
code, article, name, department
)
select
left(md5(i::text), 10),
md5(random()::text),
md5(random()::text),
left(md5(random()::text), 4)
from generate_series(1, 1000000) s(i)
Since id is a serial it is not necessary to include it.
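Note that Towns.department references Departments (code), so Departments has to be filled first or the insert above will fail with a foreign-key violation. Since left(md5(...), 4) produces 4-character hexadecimal strings, one way to cover every possible code is (a sketch):
insert into departments (code)
select lpad(to_hex(i), 4, '0')
from generate_series(0, 65535) s(i);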
I'm using PostgreSQL 9.0 and I have a table with just an artificial key (auto-incrementing sequence) and another unique key. (Yes, there is a reason for this table. :)) I want to look up an ID by the other key or, if it doesn't exist, insert it:
SELECT id
FROM mytable
WHERE other_key = 'SOMETHING'
Then, if no match:
INSERT INTO mytable (other_key)
VALUES ('SOMETHING')
RETURNING id
The question: is it possible to save a round-trip to the DB by doing both of these in one statement? I can insert the row if it doesn't exist like this:
INSERT INTO mytable (other_key)
SELECT 'SOMETHING'
WHERE NOT EXISTS (SELECT * FROM mytable WHERE other_key = 'SOMETHING')
RETURNING id
... but that doesn't give the ID of an existing row. Any ideas? There is a unique constraint on other_key, if that helps.
Have you tried to union it?
Edit - this requires Postgres 9.1:
create table mytable (id serial primary key, other_key varchar not null unique);
WITH new_row AS (
INSERT INTO mytable (other_key)
SELECT 'SOMETHING'
WHERE NOT EXISTS (SELECT * FROM mytable WHERE other_key = 'SOMETHING')
RETURNING *
)
SELECT * FROM new_row
UNION
SELECT * FROM mytable WHERE other_key = 'SOMETHING';
results in:
id | other_key
----+-----------
1 | SOMETHING
(1 row)
No, there is no special SQL syntax that allows you to do select-or-insert. You can do what Ilia mentions and create a sproc, which means it will not do a round trip from the client to the server, but it will still result in two queries (three actually, if you count the sproc itself).
Using 9.5 I successfully tried this, based on Denis de Bernardy's answer:
only 1 parameter
no union
no stored procedure
atomic, thus no concurrency problems (I think...)
The Query:
WITH neworexisting AS (
INSERT INTO mytable(other_key) VALUES('hello 1')
ON CONFLICT(other_key) DO UPDATE SET existed=true -- some update is needed so RETURNING returns the existing row
RETURNING *
)
SELECT * FROM neworexisting
first call:
id|other_key|created |existed|
--|---------|-------------------|-------|
6|hello 1 |2019-09-11 11:39:29|false |
second call:
id|other_key|created |existed|
--|---------|-------------------|-------|
6|hello 1 |2019-09-11 11:39:29|true |
First create your table ;-)
CREATE TABLE mytable (
id serial NOT NULL,
other_key text NOT NULL,
created timestamptz NOT NULL DEFAULT now(),
existed bool NOT NULL DEFAULT false,
CONSTRAINT mytable_pk PRIMARY KEY (id),
CONSTRAINT mytable_uniq UNIQUE (other_key) --needed for on conflict
);
You can use a stored procedure:
IF (SELECT id FROM mytable WHERE other_key = 'SOMETHING' LIMIT 1) IS NULL THEN
INSERT INTO mytable (other_key) VALUES ('SOMETHING');
END IF;
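Wrapped up as a function it might look like this (a minimal PL/pgSQL sketch; the function name get_or_create_id is made up for illustration). Like the plain two-statement approach, it can still hit a unique-constraint violation if two sessions race on the same key:
CREATE OR REPLACE FUNCTION get_or_create_id(p_key varchar) RETURNS integer AS $$
DECLARE
    v_id integer;
BEGIN
    -- Try to find the existing row first
    SELECT id INTO v_id FROM mytable WHERE other_key = p_key;
    IF v_id IS NULL THEN
        -- Not found: insert it and capture the generated id
        INSERT INTO mytable (other_key) VALUES (p_key) RETURNING id INTO v_id;
    END IF;
    RETURN v_id;
END;
$$ LANGUAGE plpgsql;

-- usage: SELECT get_or_create_id('SOMETHING');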
I have an alternative to Denis's answer that I think is less database-intensive, although a bit more complex:
create table mytable (id serial primary key, other_key varchar not null unique);
WITH table_sel AS (
SELECT id
FROM mytable
WHERE other_key = 'test'
UNION
SELECT NULL AS id
ORDER BY id NULLS LAST
LIMIT 1
), table_ins AS (
INSERT INTO mytable (id, other_key)
SELECT
COALESCE(id, NEXTVAL('mytable_id_seq'::REGCLASS)),
'test'
FROM table_sel
ON CONFLICT (id) DO NOTHING
RETURNING id
)
SELECT * FROM table_ins
UNION ALL
SELECT * FROM table_sel
WHERE id IS NOT NULL;
In the table_sel CTE I look for the right row. If I don't find it, I make sure table_sel still returns at least one row by unioning with a SELECT NULL.
In the table_ins CTE I try to insert the same row I was looking for earlier. COALESCE(id, NEXTVAL('mytable_id_seq'::REGCLASS)) says: if id is defined, use it; if id is null, increment the sequence and use the new value to insert a row. The ON CONFLICT clause ensures that if id is already in mytable nothing is inserted.
At the end I put everything together with a UNION between table_ins and table_sel, so that I'm sure to get my id value and have both CTEs executed.
This query searches for other_key only once, and as a "find this value" lookup rather than a "check that this value does not exist in the table" check, which is much heavier; in Denis's alternative other_key is used in both kinds of search. In my query the "check that the value does not exist" is done only on id, which is an integer primary key and therefore fast by construction.
Minor tweak a decade late to Denis's excellent answer:
-- Create the table with a unique constraint
CREATE TABLE mytable (
id serial PRIMARY KEY
, other_key varchar NOT NULL UNIQUE
);
WITH new_row AS (
-- Only insert when we don't find anything, avoiding a table lock if
-- possible.
INSERT INTO mytable ( other_key )
SELECT 'SOMETHING'
WHERE NOT EXISTS (
SELECT *
FROM mytable
WHERE other_key = 'SOMETHING'
)
RETURNING *
)
(
-- This comes first in the UNION ALL since it'll almost certainly be
-- in the query cache. Marginally slower for the insert case, but also
-- marginally faster for the much more common read-only case.
SELECT *
FROM mytable
WHERE other_key = 'SOMETHING'
-- Don't check for duplicates to be removed
UNION ALL
-- If we reach this point in iteration, we needed to do the INSERT and
-- lock after all.
SELECT *
FROM new_row
) LIMIT 1 -- Just return whatever comes first in the results and allow
-- the query engine to cut processing short for the INSERT
-- calculation.
;
The UNION ALL tells the planner it doesn't have to collect results for de-duplication. The LIMIT 1 at the end allows the planner to short-circuit further processing/iteration once it knows there's an answer available.
NOTE: There is a race condition present here and in the original answer. If the entry does not already exist, the INSERT will fail with a unique constraint violation. The error can be suppressed with ON CONFLICT DO NOTHING, but the query will return an empty set instead of the new row. This is a difficult problem because getting that info from another transaction would violate the I in ACID.
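One way to narrow that window (a sketch, assuming PostgreSQL 9.5 or later for ON CONFLICT) is to combine ON CONFLICT DO NOTHING with a fallback SELECT in the same statement; in the rare case where the conflicting row was inserted by a concurrent transaction whose row is not visible to this statement's snapshot, it can still return an empty result:
WITH new_row AS (
    INSERT INTO mytable ( other_key )
    VALUES ('SOMETHING')
    ON CONFLICT (other_key) DO NOTHING
    RETURNING id, other_key
)
SELECT id, other_key FROM new_row
UNION ALL
SELECT id, other_key FROM mytable WHERE other_key = 'SOMETHING'
LIMIT 1;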
Running the following query (SQL Server 2000), the execution plan shows an index seek, and Profiler shows 71 reads with a duration of 0.
select top 1 id from table where name = '0010000546163' order by id desc
Contrast that with the following, which uses an index scan with 8500 reads and a duration of about a second.
declare #p varchar(20)
select #p = '0010000546163'
select top 1 id from table where name = #p order by id desc
Why is the execution plan different? Is there a way to change the second method to seek?
thanks
EDIT
Table looks like
CREATE TABLE [table] (
[Id] [int] IDENTITY (1, 1) NOT NULL ,
[Name] [varchar] (13) COLLATE Latin1_General_CI_AS NOT NULL)
Id is primary clustered key
There is a non-unique index on Name and a unique composite index on id/name
There are other columns - left them out for brevity
Now that you've added the schema, please try this. SQL Server treats different lengths as different data types and will convert the varchar(13) column to match the varchar(20) variable:
declare #p varchar(13)
If not, what about collation coercion? Is the DB or server collation different from the column's?
declare @p varchar(13) COLLATE Latin1_General_CI_AS
If not, add this before and post results
SET SHOWPLAN_TEXT ON
GO
If the name column is NVARCHAR then you need your parameter to be of the same type. It should then pick it up with an index seek.
declare #p nvarchar(20)
select #p = N'0010000546163'
select top 1 id from table where name = #p order by id desc