How do I specify multiple sort key columns? - amazon-redshift

Another question indicates that you can have multiple sort key columns, but I cannot figure out the correct syntax. This works fine for one column:
create table elt.tmptmp (
val1 smallint sortkey,
val2 smallint
);
This is how I'd assume it would work for multiple columns, but it results in an error:
create table elt.tmptmp (
val1 smallint,
val2 smallint,
sortkey(val1, val2)
);
ERROR: syntax error at or near "("
How do I specify a sort key on multiple columns?

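Put the sortkey clause after the closing parenthesis of the column list, not inside it:
create table tablename (...) sortkey (..., ...);
In your case, this should work:
create table elt.tmptmp (
val1 smallint,
val2 smallint
)
sortkey(val1, val2);
See CREATE TABLE in the Amazon Redshift documentation.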

Related

Does column order matter when defining unique constraints

How is this
CREATE TABLE foo (
id SERIAL PRIMARY KEY,
col1 VARCHAR(50) NOT NULL,
col2 VARCHAR(50) NOT NULL,
col3 DOUBLE PRECISION NULL,
UNIQUE(col1, col2)
);
Different from this?
CREATE TABLE foo (
id SERIAL PRIMARY KEY,
col1 VARCHAR(50) NOT NULL,
col2 VARCHAR(50) NOT NULL,
col3 DOUBLE PRECISION NULL,
UNIQUE(col2, col1) -- reversed column ordering
);
From what I understand, both commands will generate an index on the two columns to enforce the unique constraint, but with different column ordering.
So in either case I would not need a separate index to speed up queries like this:
SELECT id, col3 FROM foo WHERE col1 = 'stack' AND col2 = 'overflow'
However, if future queries will also filter by col2 alone, as below, the latter form is preferred because the index will still be usable, right?
SELECT id, col3 FROM foo WHERE col2 = 'overflow'

The order matters if you expect queries to filter on only a leading subset of the indexed columns. For example, suppose you had a unique index on (col1, col2), and you wanted to optimize the following query:
SELECT col1, col2 FROM foo WHERE col1 = 'stack';
The index on (col1, col2) could still be used here, because col1, which appears in the WHERE clause, is the leftmost column of the index. Had you defined the unique constraint on (col2, col1), the index could not be used efficiently for this query. By the same reasoning, your query on col2 alone is served by the (col2, col1) ordering but not by (col1, col2).
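The leftmost-column rule can be checked quickly with a small script. This is only a sketch: it uses SQLite (through Python's sqlite3 module) rather than PostgreSQL, so the planner details differ, and the helper name plan_for is made up for this illustration. It asks SQLite for its query plan and shows whether a lookup on one column searches the unique index or scans the table.

```python
import sqlite3

def plan_for(unique_cols, where_col):
    """Return SQLite's plan text for an equality lookup on where_col.

    unique_cols is the column list of the UNIQUE constraint, e.g. "col1, col2".
    Both arguments are trusted demo input, so plain string formatting is used.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE foo ("
        " id INTEGER PRIMARY KEY,"
        " col1 TEXT NOT NULL,"
        " col2 TEXT NOT NULL,"
        " col3 REAL,"
        f" UNIQUE({unique_cols}))"
    )
    rows = conn.execute(
        f"EXPLAIN QUERY PLAN SELECT id, col3 FROM foo WHERE {where_col} = 'x'"
    ).fetchall()
    conn.close()
    # The last field of each plan row is the human-readable description.
    return " ".join(row[-1] for row in rows)

for cols, where in [("col1, col2", "col1"),
                    ("col1, col2", "col2"),
                    ("col2, col1", "col2")]:
    print(f"UNIQUE({cols}), WHERE {where} = ...: {plan_for(cols, where)}")
```

A plan beginning with SEARCH means the index was used for the lookup; a plan beginning with SCAN means the whole table was read.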

Postgresql - retrieving referenced fields in a query

I have a table created like
CREATE TABLE data
(value1 smallint references labels,
value2 smallint references labels,
value3 smallint references labels,
otherdata varchar(32)
);
and a second 'label holding' table created like
CREATE TABLE labels (id serial primary key, name varchar(32));
The rationale behind it is that value1-3 are a very limited set of strings (6 options) and it seems inefficient to enter them directly in the data table as varchar types. On the other hand these do occasionally change, which makes enum types unsuitable.
My question is, how can I execute a single query such that instead of the label IDs I get the relevant labels?
I looked at creating a function for it and stumbled at the point where I needed to pass the label holding table name to the function (there are several such (label holding) tables across the schema). Do I need to create a function per label table to avoid that?
create or replace function translate
(ref_id smallint,reference_table regclass) returns varchar(128) as
$$
begin
select name from reference_table where id = ref_id;
return name;
end;
$$
language plpgsql;
And then do
select
translate(value1, labels) as foo,
translate(value2, labels) as bar
from data;
This however errors out with
ERROR: relation "reference_table" does not exist
All suggestions welcome - at this point I can still alter just about anything.

CREATE TABLE labels
( id smallserial primary key
, name varchar(32) UNIQUE -- <<-- might want this, too
);
CREATE TABLE data
( value1 smallint NOT NULL REFERENCES labels(id) -- <<-- here
, value2 smallint NOT NULL REFERENCES labels(id)
, value3 smallint NOT NULL REFERENCES labels(id)
, otherdata varchar(32)
, PRIMARY KEY (value1,value2,value3) -- <<-- added primary key here
);
-- No need for a function here.
-- For small sizes of the `labels` table, the planner will typically
-- choose hash joins to perform the lookups in the query below.
SELECT l1.name AS name1, l2.name AS name2, l3.name AS name3
, d.otherdata AS the_data
FROM data d
JOIN labels l1 ON l1.id = d.value1
JOIN labels l2 ON l2.id = d.value2
JOIN labels l3 ON l3.id = d.value3
;
Note: labels.id -> labels.name is a functional dependency (id is the primary key), but that doesn't mean that you need a function. The query just acts like a function.
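To see the join approach working end to end, here is a minimal sketch using SQLite through Python's sqlite3 module instead of PostgreSQL. The label names ('low', 'medium', 'high') and the two data rows are invented sample values, and the column types are simplified to SQLite's; the table and column names follow the ones above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE labels (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE data (
    value1 INTEGER NOT NULL REFERENCES labels(id),
    value2 INTEGER NOT NULL REFERENCES labels(id),
    value3 INTEGER NOT NULL REFERENCES labels(id),
    otherdata TEXT
);
-- Sample values, made up for this illustration.
INSERT INTO labels (id, name) VALUES (1, 'low'), (2, 'medium'), (3, 'high');
INSERT INTO data VALUES (1, 3, 2, 'row a'), (2, 2, 1, 'row b');
""")

# One join per referencing column; each alias resolves one label ID to its name.
rows = conn.execute("""
SELECT l1.name, l2.name, l3.name, d.otherdata
FROM data d
JOIN labels l1 ON l1.id = d.value1
JOIN labels l2 ON l2.id = d.value2
JOIN labels l3 ON l3.id = d.value3
ORDER BY d.otherdata
""").fetchall()

for row in rows:
    print(row)
```

Each row comes back with label names in place of the three IDs, without any helper function.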

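You can pass the label table name as a string, build the query as a string, and run it with EXECUTE (inside a PL/pgSQL function in which sql and name are declared variables):
-- quote_ident guards the table name; note the leading space before "where"
sql := 'select name from ' || quote_ident(reference_table_name) || ' where id = ' || ref_id;
EXECUTE sql INTO name;
RETURN name;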

Access database, Sql query , Error "Syntax error in DROP TABLE or DROP INDEX."

This is the query. I am running it from C# and getting the above error.
"DROP TABLE IF EXISTS `NATIONAL_ID_ISSUANCE_CENTER`;
CREATE TABLE `NATIONAL_ID_ISSUANCE_CENTER` (
`ID` INTEGER NOT NULL AUTO_INCREMENT,
`NAME` VARCHAR(100),
`APPLICATION_ID` INTEGER,
`STATUS` INTEGER,
`CREATED_BY` INTEGER,
`UPDATED_BY` INTEGER,
`CREATED_DATE` DATETIME,
`UPDATED_DATE` DATETIME,
`THIRD_PARTY_ID` INTEGER,
`PROVINCE_ID` INTEGER,
INDEX (`APPLICATION_ID`),
PRIMARY KEY (`ID`),
INDEX (`PROVINCE_ID`),
INDEX (`THIRD_PARTY_ID`)
)"

You can't put an IF clause inside DROP and CREATE statements this way. Any time you want to drop a table that may not exist, use the following (this is T-SQL, i.e. SQL Server syntax):
IF(OBJECT_ID('[Database].[Schema].[TableName]') is not null)
BEGIN
DROP TABLE [Database].[Schema].[TableName];
END;
Please note you should replace [Database], [Schema], and [TableName] with the appropriate database, schema, and table names, respectively.

Use a sequence to populate multiple columns with successive values

How can I use default constraints, triggers, or some other mechanism to automatically insert multiple successive values from a sequence into multiple columns on the same row of a table?
A standard use of a sequence in SQL Server is to combine it with default constraints on multiple tables to essentially get a cross-table identity. See for example the section "C. Using a Sequence Number in Multiple Tables" in the Microsoft documentation article "Sequence Numbers".
This works great if you only want to get a single value from the sequence for each row inserted. But sometimes I want to get multiple successive values. So theoretically I would create a sequence and table like this:
CREATE SEQUENCE DocumentationIDs;
CREATE TABLE Product
(
ProductID BIGINT NOT NULL IDENTITY(1, 1) PRIMARY KEY
, ProductName NVARCHAR(100) NOT NULL
, MarketingDocumentationID BIGINT NOT NULL DEFAULT ( NEXT VALUE FOR DocumentationIDs )
, TechnicalDocumentationID BIGINT NOT NULL DEFAULT ( NEXT VALUE FOR DocumentationIDs )
, InternalDocumentationID BIGINT NOT NULL DEFAULT ( NEXT VALUE FOR DocumentationIDs )
);
Unfortunately this will insert the same value in all three columns. This is by design:
If there are multiple instances of the NEXT VALUE FOR function specifying the same sequence generator within a single Transact-SQL statement, all those instances return the same value for a given row processed by that Transact-SQL statement. This behavior is consistent with the ANSI standard.
Increment by hack
The only suggestion I could find online was to use a hack where you have the sequence increment by the number of columns you need to insert (three in my contrived example) and manually add to the NEXT VALUE FOR function in the default constraint:
CREATE SEQUENCE DocumentationIDs START WITH 1 INCREMENT BY 3;
CREATE TABLE Product
(
ProductID BIGINT NOT NULL IDENTITY(1, 1) PRIMARY KEY
, ProductName NVARCHAR(100) NOT NULL
, MarketingDocumentationID BIGINT NOT NULL DEFAULT ( NEXT VALUE FOR DocumentationIDs )
, TechnicalDocumentationID BIGINT NOT NULL DEFAULT ( ( NEXT VALUE FOR DocumentationIDs ) + 1 )
, InternalDocumentationID BIGINT NOT NULL DEFAULT ( ( NEXT VALUE FOR DocumentationIDs ) + 2 )
)
This does not work for me because not all tables using my sequence require the same number of values.

One possible way is to use an AFTER INSERT trigger, as follows.
The table definition needs to be changed slightly (the DocumentationID columns should default to 0, or be made nullable):
CREATE TABLE Product
(
ProductID BIGINT NOT NULL IDENTITY(1, 1)
, ProductName NVARCHAR(100) NOT NULL
, MarketingDocumentationID BIGINT NOT NULL CONSTRAINT DF_Product_1 DEFAULT (0)
, TechnicalDocumentationID BIGINT NOT NULL CONSTRAINT DF_Product_2 DEFAULT (0)
, InternalDocumentationID BIGINT NOT NULL CONSTRAINT DF_Product_3 DEFAULT (0)
, CONSTRAINT PK_Product PRIMARY KEY (ProductID)
);
The trigger that does the job is the following:
CREATE TRIGGER Product_AfterInsert ON Product
AFTER INSERT
AS
BEGIN
SET NOCOUNT ON;
IF NOT EXISTS (SELECT 1 FROM INSERTED)
RETURN;
CREATE TABLE #DocIDs
(
ProductID BIGINT NOT NULL
, Num INT NOT NULL
, DocID BIGINT NOT NULL
, PRIMARY KEY (ProductID, Num)
);
INSERT INTO #DocIDs (ProductID, Num, DocID)
SELECT
i.ProductID
, r.n
, NEXT VALUE FOR DocumentationIDs OVER (ORDER BY i.ProductID, r.n)
FROM INSERTED i
CROSS APPLY (VALUES (1), (2), (3)) r(n)
;
WITH Docs (ProductID, MarketingDocID, TechnicalDocID, InternalDocID)
AS (
SELECT ProductID, [1], [2], [3]
FROM #DocIDs d
PIVOT (MAX(DocID) FOR Num IN ([1], [2], [3])) pvt
)
UPDATE p
SET
p.MarketingDocumentationID = d.MarketingDocID
, p.TechnicalDocumentationID = d.TechnicalDocID
, p.InternalDocumentationID = d.InternalDocID
FROM Product p
JOIN Docs d ON d.ProductID = p.ProductID
;
END
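The allocation logic of the trigger can be modelled outside the database. The sketch below is plain Python, not T-SQL, and only mirrors the idea: every inserted row is crossed with the numbers 1..3, and successive sequence values are handed out in (ProductID, n) order. The function name assign_doc_ids and the counter starting at 1 are assumptions made for this illustration; they are not part of the trigger above.

```python
from itertools import count

def assign_doc_ids(product_ids, sequence, columns=3):
    """Give each product `columns` successive values drawn from `sequence`.

    Mirrors the trigger's ordering: products in ascending ID order,
    and within one product the slots n = 1..columns in order.
    """
    assigned = {}
    for product_id in sorted(product_ids):
        # One sequence value per (product, n) pair, like the CROSS APPLY rows.
        assigned[product_id] = [next(sequence) for _ in range(columns)]
    return assigned

# Stand-in for the shared sequence; a plain counter from 1 is an assumption.
doc_ids = count(1)

first_insert = assign_doc_ids([10, 11], doc_ids)   # a two-row INSERT
second_insert = assign_doc_ids([12], doc_ids)      # a later one-row INSERT

print(first_insert)
print(second_insert)
```

Because the number of values per row is a parameter here, another table sharing the same sequence could request a different count, which is the case the increment-by hack could not handle.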

Multi-column partitioning in Greenplum

I am trying to do multi-column partitioning in Greenplum database using PostgreSQL. However I keep getting an error -
ERROR: partition key has 2 columns but 1 columns specified in VALUES clause
LINE 15: VALUES ('10001','2014-03-11'),
^
SQL state: 42P16
Character: 341
This is the query that I used:
CREATE TABLE EMP_TABLE
(
EMP_ID CHARACTER VARYING(9) NOT NULL,
JOB_ID CHARACTER VARYING(10) NOT NULL,
DT_OF_JOIN DATE NOT NULL,
SALARY NUMERIC(20,8) NOT NULL
-- CONSTRAINT ENTITY_MODEL_SCORE_PKEY PRIMARY KEY (ENTITY_ID, MODEL_ID, MODEL_RUN_DT)
)
WITH (
OIDS=FALSE
)
DISTRIBUTED BY (EMP_ID)
PARTITION BY LIST(EMP_ID,DT_OF_JOIN)
(
VALUES ('10001','2014-03-11'),
VALUES ('10002','2014-03-12')
)
I am not sure what I am missing. Can someone help me with the right syntax to do multi-column partition in Greenplum using PostgreSQL?

You can try the following, using subpartitions:
CREATE TABLE sandbox.EMP_TABLE
(
EMP_ID CHARACTER VARYING(9) NOT NULL,
JOB_ID CHARACTER VARYING(10) NOT NULL,
DT_OF_JOIN date NOT NULL,
SALARY NUMERIC(20,8) NOT NULL
-- CONSTRAINT ENTITY_MODEL_SCORE_PKEY PRIMARY KEY (ENTITY_ID, MODEL_ID, MODEL_RUN_DT)
)
WITH (
OIDS=FALSE
)
DISTRIBUTED BY (JOB_ID)
PARTITION BY LIST(EMP_ID)
SUBPARTITION BY LIST(DT_OF_JOIN)
SUBPARTITION TEMPLATE
(
SUBPARTITION year1 VALUES ('2014-03-11'),
SUBPARTITION year2 VALUES ('2014-03-12')
)
(
values ('10001'),
values('10002')
)
