Generate a column value automatically from other columns' values and use it as a PRIMARY KEY - postgresql

I have a table with columns named "source" and "id". This table is populated from open data DBs.
"id" can't be UNIQUE, since my data comes from other databases, each with its own id system. There is a real risk of getting the same id for really different data.
I want to create another column which combines source and id into a single value:
"openDataA" + 123456789 -> "openDataA123456789"
"openDataB" + 123456789 -> "openDataB123456789"
I have seen examples that use || and functions to concatenate values. This is good, but I want to make this third column my PRIMARY KEY, to avoid duplicates and create a really unique id that I can query without much computation, and that I can use as a foreign key constraint for other tables.
I think Composite Types are what I'm looking for, but instead of setting the value manually each time, I want it derived automatically from "source" and "id" alone.
I'm fairly new to postgresql, so any help is welcome.
Thank you.

You could just have a composite key in your table:
CREATE TABLE mytable (
  source VARCHAR(10),
  id VARCHAR(10),
  PRIMARY KEY (source, id)
);
If you really want a joined column, you could create a view to display it:
CREATE VIEW myview AS
SELECT *, source || id AS primary_key
FROM mytable;
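If you are on PostgreSQL 12 or later, a stored generated column is another way to materialize the combined value (a sketch; the column and index names are made up, and note that plain concatenation can collide, e.g. 'a' || 'bc' equals 'ab' || 'c', so a separator is safer):

```sql
CREATE TABLE mytable (
  source  varchar(10) NOT NULL,
  id      varchar(10) NOT NULL,
  -- PostgreSQL 12+: the value is computed and stored automatically
  full_id text GENERATED ALWAYS AS (source || ':' || id) STORED,
  PRIMARY KEY (source, id)
);

-- Alternatively, if you don't want the extra column at all,
-- a unique index on the expression gives the same duplicate protection:
CREATE UNIQUE INDEX mytable_source_id_uniq ON mytable ((source || ':' || id));
```

Either way, the composite PRIMARY KEY (source, id) already guarantees uniqueness; the generated column is only a convenience for querying and joining on a single value.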

Related

How do I represent an array of tuples in postgresql?

Here's the easiest way I can think of to explain this. Imagine a user wants to bookmark a bunch of webpages. There's a url table with a UrlID and the actual url. I'd like the user to have a list of UrlIDs which are unique (but I don't need the constraint) and a 32-bit int value such as an epoch date. The only two things I care about are 1) being able to check whether a UrlID is in this list or not and 2) getting the entire list and sorting it by date (or the second value).
If it helps, I'm expecting no more than 8K bookmarks, but most likely it will be <128.
If you really want to avoid the extra table to express the relationship, you can do something like this:
CREATE TABLE "user" (
  id integer primary key,
  name text not null,
  bookmarks integer[] not null
);
CREATE TABLE url (
  id integer primary key,
  time timestamp with time zone not null,
  val text not null
);
Then finding all bookmarks for a particular user (say with id 66) would involve doing something like this:
SELECT url, time
FROM (SELECT bookmarks FROM "user" WHERE id = 66) u
JOIN url ON url.id = ANY(bookmarks)
ORDER BY time;
Now here's why I don't like this schema. First, adding a new bookmark requires rewriting the bookmarks array and hence the entire user row (so adding n bookmarks, one after the other, takes Θ(n^2) time). Secondly, you cannot use foreign keys on the elements of the array. Thirdly, many queries become more complicated to write; e.g., in order to retrieve all bookmarks for all users, you have to do something like this:
SELECT "user".id, "user".name, url.val, url.time
FROM "user",
  LATERAL unnest((SELECT bookmarks)) b
  LEFT JOIN url ON b = url.id;
Edit: So here's the schema I would use, and which I think fits best with the relational paradigm:
CREATE TABLE "user" (
  id integer primary key,
  name text not null
);
CREATE TABLE url (
  id integer primary key,
  val text not null
);
CREATE TABLE bookmark (
  user_id integer not null REFERENCES "user",
  url_id integer REFERENCES url,
  time timestamp with time zone not null,
  UNIQUE (user_id, url_id)
);
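With the bookmark table in place, the per-user query from earlier becomes a plain join (a sketch, using the same user id 66 as above):

```sql
SELECT url.val, bookmark.time
FROM bookmark
JOIN url ON url.id = bookmark.url_id
WHERE bookmark.user_id = 66
ORDER BY bookmark.time;
```

Inserting a bookmark is now a single-row INSERT into bookmark, so adding n bookmarks takes Θ(n) time instead of Θ(n^2), and both user_id and url_id get proper foreign key checks.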

Unique Identifier in multiple schemas

As the title suggests, I want to have a unique ID as a primary key but across multiple schemas. I know about UUID, but it's just too costly.
Is there any way to work around this with a serial?
You can create a global sequence and use that in your table instead of the automatic sequence that a serial column creates.
create schema global;
create schema s1;
create schema s2;

create sequence global.unique_id;

create table s1.t1
(
  id integer default nextval('global.unique_id') primary key
);
create table s2.t1
(
  id integer default nextval('global.unique_id') primary key
);
The difference from a serial column is that the sequence unique_id doesn't "know" it's used by the id columns. A "serial sequence" is automatically dropped when the corresponding column (or table) is dropped, which is not what you want with a global sequence.
There is one drawback, however: you can't prevent duplicate values across those two tables from being inserted manually. If you want to make sure the sequence is always used to generate values, you can create a trigger that always fetches a value from the sequence.
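Such a trigger might look like this (a sketch; the function and trigger names are made up, and the same trigger would be created on each table sharing the sequence):

```sql
CREATE FUNCTION use_global_id() RETURNS trigger AS $$
BEGIN
  -- Overwrite whatever the client supplied with the next sequence value
  NEW.id := nextval('global.unique_id');
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t1_use_global_id
  BEFORE INSERT ON s1.t1
  FOR EACH ROW EXECUTE PROCEDURE use_global_id();
```

With this in place, manually supplied id values are silently replaced, so duplicates across the tables can no longer be introduced by hand.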

Building a primary key with Json columns

I am trying to set a unique constraint across several columns, some of which are JSON data types. Since there's no way to make a JSON column a primary key, I thought maybe I could hash the desired columns and build a primary key on the hash. For example:
CREATE TABLE petshop(
  name text,
  fav_food jsonb,
  md5sum uuid);
I can do the following:
SELECT md5(name || fav_food::text) FROM petshop;
But I want that to be performed by default and/or with a trigger which inserts the md5 sum into the column md5sum, and then build a pkey on that column.
But really, I just want to know whether the JSON object is unique, without restricting the keys in the JSON. So if anyone has a better idea, that helps!
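One way to get uniqueness on the hash without maintaining an extra column or a trigger at all is a unique index on the expression itself (a sketch; the index name is made up):

```sql
-- Enforced on every INSERT/UPDATE; no md5sum column needed
CREATE UNIQUE INDEX petshop_name_food_uniq
  ON petshop (md5(name || fav_food::text));
```

This rejects any row whose (name, fav_food) hash already exists, which matches the stated goal of checking JSON uniqueness without restricting the JSON's keys. It is not a PRIMARY KEY, though, so if other tables need to reference petshop, a plain surrogate key (e.g. a serial column) alongside this index is the usual approach.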

T-SQL create table with primary keys

Hello, I want to create a new table based on another one, and create primary keys as well.
Currently this is how I'm doing it. Table B has no primary keys defined, but I would like to create them in table A. Is there a way using this SELECT TOP 0 statement to do that? Or do I need to do an ALTER TABLE after I have created tableA?
Thanks
select TOP 0 *
INTO [tableA]
FROM [tableB]
SELECT INTO does not support copying any of the indexes, constraints, triggers or even computed columns and other table properties, aside from the IDENTITY property (as long as you don't apply an expression to the IDENTITY column).
So, you will have to add the constraints after the table has been created and populated.
The short answer is NO. SELECT INTO will always create a HEAP table and, according to Books Online:
Indexes, constraints, and triggers defined in the source table are not
transferred to the new table, nor can they be specified in the
SELECT...INTO statement. If these objects are required, you must
create them after executing the SELECT...INTO statement.
So, after executing SELECT INTO you need to execute an ALTER TABLE or CREATE UNIQUE INDEX in order to add a primary key.
Also, if dbo.TableB does not already have an IDENTITY column (or if it does and you want to leave it out for some reason), and you need to create an artificial primary key column (rather than using an existing column in dbo.TableB as the new primary key), you can use the IDENTITY function to create a candidate key column. But you still have to add the constraint to TableA after the fact to make it a primary key, since the IDENTITY function/property alone does not make it so.
-- This statement will create a HEAP table
SELECT Col1, Col2, IDENTITY(INT, 1, 1) AS Col3
INTO dbo.MyTable
FROM dbo.AnotherTable;

-- This statement will create a clustered PK
ALTER TABLE dbo.MyTable
  ADD CONSTRAINT PK_MyTable_Col3 PRIMARY KEY (Col3);

How to AUTO_INCREMENT in db2?

I thought this would be simple, but I can't seem to use AUTO_INCREMENT in my DB2 database. I did some searching and people seem to be using "Generated by Default", but this doesn't work for me.
If it helps, here's the table I want to create, with sid being auto-incremented:
create table student(
  sid integer NOT NULL <auto increment?>,
  sname varchar(30),
  PRIMARY KEY (sid)
);
Any pointers are appreciated.
What you're looking for is called an IDENTITY column:
create table student (
  sid integer not null GENERATED ALWAYS AS IDENTITY (START WITH 1 INCREMENT BY 1)
  ,sname varchar(30)
  ,PRIMARY KEY (sid)
);
A sequence is another option for doing this, but you need to determine which one is proper for your particular situation. Read this for more information comparing sequences to identity columns.
You will have to create an auto-increment field with the sequence object (this object generates a number sequence).
Use the following CREATE SEQUENCE syntax:
CREATE SEQUENCE seq_person
  MINVALUE 1
  START WITH 1
  INCREMENT BY 1
  CACHE 10
The code above creates a sequence object called seq_person, that starts with 1 and will increment by 1. It will also cache up to 10 values for performance. The cache option specifies how many sequence values will be stored in memory for faster access.
To insert a new record into the "Persons" table, we will have to use the NEXT VALUE FOR expression (this is DB2's syntax for retrieving the next value from the seq_person sequence):
INSERT INTO Persons (P_Id, FirstName, LastName)
VALUES (NEXT VALUE FOR seq_person, 'Lars', 'Monsen')
The SQL statement above inserts a new record into the "Persons" table. The "P_Id" column is assigned the next number from the seq_person sequence, the "FirstName" column is set to "Lars", and the "LastName" column is set to "Monsen".
If you are still not able to make the column AUTO_INCREMENT while creating the table, as a workaround first create the table:
create table student(
  sid integer NOT NULL,
  sname varchar(30),
  PRIMARY KEY (sid)
);
and then explicitly alter the column by using the following:
alter table student alter column sid set GENERATED BY DEFAULT AS IDENTITY
Or:
alter table student alter column sid set GENERATED BY DEFAULT AS IDENTITY (START WITH 100)
Added a few optional parameters for creating "future-safe" sequences:
CREATE SEQUENCE <NAME>
  START WITH 1
  INCREMENT BY 1
  NO MAXVALUE
  NO CYCLE
  CACHE 10;