Does PostgreSQL create an internal key (probably an int type) as primary key for a table without a primary key specified? - postgresql

From https://stackoverflow.com/a/40597571/3284469
If you don't specify a primary key, RDBMS will help you choose an unique and non-null key, OR create an internal key (probably an int type) as primary key for this table.
Could you give some examples for the "OR" case, where a RDBMS (PostgreSQL in particular, and possibly also MySQL or SQL Server) create an "internal key (probably an int type) as primary key" for a table without a primary key specified?
Does PostgreSQL have something similar to MySQL?
Thanks.

for Postgres:
From "5.4. System Columns":
oid
The object identifier (object ID) of a row. This column is only present if the table was created using WITH OIDS, or if the default_with_oids configuration variable was set at the time. This column is of type oid (same name as the column); see Section 8.18 for more information about the type.
and
ctid
The physical location of the row version within its table. Note that although the ctid can be used to locate the row version very quickly, a row's ctid will change if it is updated or moved by VACUUM FULL. Therefore `ctid is useless as a long-term row identifier. The OID, or even better a user-defined serial number, should be used to identify logical rows.
Both come close to what you're searching for but have restrictions as you can read in the documentation. So, as the manual states, using a user-defined PK is the better choice.
for SQL Server:
There is the undocumented pseudo column %%physloc%%. It describes the physical location of a row. That, however, might be subject to change if the row gets physically moved for whatever reason. And it's undocumented, that is, its behavior might change any time between releases or even just patches or it might be removed completely without further notice. So using a user-defined PK is the better choice here either.

Related

DB2 access specific row, in an non Unique table, for update / delete operations

Can I do row-specific update / delete operations in a DB2 table Via SQL, in a NON QUNIQUE Primary Key Context?
The Table is a PHYSICAL FILE on the NATIVE SYSTEM of the AS/400.
It was, like many other Files, created without the unique definition, which leads DB2 to the conclusion, that The Table, or PF has no qunique Key.
And that's my problem. I can't override the structure of the table to insert a unique ID ROW, because, I would have to recompile ALL my correlating Programs on the AS/400, which is a serious issue, much things would not work anymore, "perhaps". Of course, I can do that refactoring for one table, but our system has thousands of those native FILES, some well done with Unique Key, some without Unique definition...
Well, I work most of the time with db2 and sql on that old files. And all files which have a UNIQUE Key are no problem for me to do those important update / delete operations.
Is there some way to get an additional column to every select with a very unique row id, respective row number. And in addition, what is much more important, how can I update this RowNumber.
I did some research and meanwhile I assume, that there is no chance to do exact alterations or deletes, when there is no unique key present. What I would wish would be some additional ID-ROW which is always been sent with the table, which I can Refer to when I do my update / delete operations. Perhaps my thinking here has an fallacy as non Unique Key Tables are purposed to be edited in other ways.
Try the RRN function.
SELECT RRN(EMPLOYEE), LASTNAME
FROM EMPLOYEE
WHERE ...;
UPDATE EMPLOYEE
SET ...
WHERE RRN(EMPLOYEE) = ...;

Errors creating constraint trigger

Let me start by saying that I’m a Linux/Unix admin. That being said my manager has tasked me with moving older PostgreSQL databases to a RedHat server running 8.4.20. I was successful moving a 7.2.1 db but I’m running into issues moving a 7.4.20 db.
I use pg_dump –c filename and psql < filename. For the problematic db everything runs until I get to a CREATE CONSTRAINT TRIGGER statement. If I run it as it is in the file I get :
NOTICE: ignoring incomplete trigger group for constraint "" FOREIGN KEY data(ups) REFERENCES upsinfo(ups)
DETAIL: Found referenced table's DELETE trigger.
CREATE TRIGGER
If I run set schema 'pg_catalog'; I get:
ERROR: relation "upsinfo" does not exist
The tables (I think) involved are:
CREATE TABLE upsinfo (
ups text NOT NULL,
ipaddr inet,
rcomm text,
wcomm text,
reachable boolean,
managed boolean,
comments text,
region text
);
CREATE TABLE data (
date timestamp with time zone,
ups text,
mib text,
value text
);
The trigger problem trigger statement:
CREATE CONSTRAINT TRIGGER "<unnamed>"
AFTER DELETE ON upsinfo
FROM data
NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW
EXECUTE PROCEDURE "RI_FKey_cascade_del"('<unnamed>', 'data', 'upsinfo', 'UNSPECIFIED', 'ups', 'ups');
I know that the RI_FKey_cascade_del function is defined differently in the different versions of pg_catalog. Note that search_path is set to ‘public, pg_catalog’ so I’m also confused why I have to set the schema.
Again I’m not a real PostgreSQL DBA so try to be kind.
Oof, those are really old postgres versions, including the version you're upgrading to (8.4 was released in 2009, and support ended in 2014).
The short answer is that, as long as upsinfo and data are being created and populated, you're probably fine, and good to go. But one of your foreign key relationships is broken.
The long answer, well, let me see if I can explain what is going on (or, at least, what I think is going on).
I'm guessing that the original table definition of data included something like FOREIGN KEY (ups) REFERENCES upsinfo (ups) ON DELETE CASCADE. That causes postgres to automatically make some trigger constraints: 1- every time there's a new row for data, make sure that its ups column matches an existing row in upsinfo, and 2- every time you delete a row from upsinfo, delete the corresponding rows in data, based on the matching ups value.
That (not very informative) error message can come up when the foreign key relationship doesn't work. In order for a foreign key to make sense, the referenced value needs to be unique -- there should be only one row in upsinfo for each distinct value of ups. In order for postgres to know that, there needs to be a unique index or primary key on upsinfo.ups.
In this case, one of a couple things could be breaking it:
There's no primary key or unique index on upsinfo.ups (postgres should not have allowed a foreign key, but may have in very old versions)
There used to be a unique index, but it hadn't properly enforced uniqueness, so it didn't get successfully imported (a bug, again likely from a very old version)
In either case, if that foreign key relationship is important, you can try to fix it once the import is complete. Start by trying to make a unique index on upsinfo.ups, and see if you have problems. If you do, resolve the duplicate entries, and try again till it works. Then issue something like:
ALTER TABLE data
ADD FOREIGN KEY (ups) REFERENCES upsinfo (ups) ON DELETE CASCADE;
Of course, if things are working, it's possible you don't need to fix the foreign key, in which case you're probably able to ignore those errors and just move forward.
Hope that helps, and good luck!
This seems to be a part of ON DELETE CONSTRAINT. If I were you I would delete all such statements and replace them with a proper constraint definition on the target table.
Table definition should then look like this:
CREATE TABLE bookings (
boo_id serial NOT NULL,
boo_hotelid character varying NOT NULL,
boo_roomid integer NOT NULL,
CONSTRAINT pk_bookings
PRIMARY KEY (boo_id),
CONSTRAINT fk_bookings_boo_roomid
FOREIGN KEY (boo_roomid)
REFERENCES rooms (roo_id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE
) WITHOUT OIDS;
And this part is what will internally create the trigger:
CONSTRAINT fk_bookings_boo_roomid
FOREIGN KEY (boo_roomid)
REFERENCES rooms (roo_id) MATCH SIMPLE
ON UPDATE CASCADE ON DELETE CASCADE
But, to be honest, I do not have an understanding for an upgrade to an unsupported version. You know the Postgres is version 9.5 now, right?

Way to migrate a create table with sequence from postgres to DB2

I need to migrate a DDL from Postgres to DB2, but I need that it works the same as in Postgres. There is a table that generates values from a sequence, but the values can also be explicitly given.
Postgres
create sequence hist_id_seq;
create table benchmarksql.history (
hist_id integer not null default nextval('hist_id_seq') primary key,
h_c_id integer,
h_c_d_id integer,
h_c_w_id integer,
h_d_id integer,
h_w_id integer,
h_date timestamp,
h_amount decimal(6,2),
h_data varchar(24)
);
(Look at the sequence call in the hist_id column to define the value of the primary key)
The business logic inserts into the table by explicitly providing an ID, and in other cases, it leaves the database to choose the number.
If I change this in DB2 to a GENERATED ALWAYS it will throw errors because there are some provided values. On the other side, if I create the table with GENERATED BY DEFAULT, DB2 will throw an error when trying to insert with the same value (SQL0803N), because the "internal sequence" does not take into account the already inserted values, and it does not retry with a next value.
And, I do not want to restart the sequence each time a provided ID was inserted.
This is the problem in BenchmarkSQL when trying to port it to DB2: https://sourceforge.net/projects/benchmarksql/ (File sqlTableCreates)
How can I implement the same database logic in DB2 as it does in Postgres (and apparently in Oracle)?
You're operating under a misconception: that sources external to the db get to dictate its internal keys. Ideally/conceptually, autogenerated ids will never need to be seen outside of the db, as conceptually there should be unique natural keys for export or reporting. Still, there are times when applications will need to manage some ids, often when setting up related entities (eg, JPA seems to want to work this way).
However, if you add an id value that you generated from a different source, the db won't be able to manage it. How could it? It's not efficient - for one thing, attempting to do so would do one of the following
Be unsafe in the face of multiple clients (attempt to add duplicate keys)
Serialize access to the table (for a potentially slow query, too)
(This usually shows up when people attempt something like: SELECT MAX(id) + 1, which would require locking the entire table for thread safety, likely including statements that don't even touch that column. If you try to find any "first-unused" id - trying to fill gaps - this gets more complicated and problematic)
Neither is ideal, so it's best to not have the problem in the first place. This is usually done by having id columns be autogenerated, but (as pointed out earlier) there are situations where we may need to know what the id will be before we insert the row into the table. Fortunately, there's a standard SQL object for this, SEQUENCE. This provides a db-managed, thread-safe, fast way to get ids. It appears that in PostgreSQL you can use sequences in the DEFAULT clause for a column, but DB2 doesn't allow it. If you don't want to specify an id every time (it should be autogenerated some of the time), you'll need another way; this is the perfect time to use a BEFORE INSERT trigger;
CREATE TRIGGER Add_Generated_Id NO CASCADE BEFORE INSERT ON benchmarksql.history
NEW AS Incoming_Entity
FOR EACH ROW
WHEN Incoming_Entity.id IS NULL
SET id = NEXTVAL FOR hist_id_seq
(something like this - not tested. You didn't specify where in the project this would belong)
So, if you then add a row with something like:
INSERT INTO benchmarksql.history (hist_id, h_data) VALUES(null, 'a')
or
INSERT INTO benchmarksql.history (h_data) VALUES('a')
an id will be generated and attached automatically. Note that ALL ids added to the table must come from the given sequence (as #mustaccio pointed out, this appears to be true even in PostgreSQL), or any UNIQUE CONSTRAINT on the column will start throwing duplicate-key errors. So any time your application needs an id before inserting a row in the table, you'll need some form of
SELECT NEXT VALUE FOR hist_id_seq
FROM sysibm.sysdummy1
... and that's it, pretty much. This is completely thread and concurrency safe, will not maintain/require long-term locks, nor require serialized access to the table.

T-SQL implicit conversion between 2 varchars

I have some T-SQL (SQL Server 2008) that I inherited and am trying to find out why some of queries are running really slow. In the Actual Execution Plan I have three clustered index scans which are costing me 19%, 21% and 26%, so this seems to be the source of my problem.
The contents of the fields are usually numeric (but some job numbers have an alpha prefix)
The database design (vendor supplied) is pretty poor. The max length of a job number in their application is 12 chars, but in the tables that are joined it is defined as varchar(50) in some places and varchar(15) in others. My parameter is a varchar(12), but I get same thing if I change it to a varchar(50)
The node contains this:
Predicate: [Live_Costing].[dbo].[TSTrans].[JobNo] as [sts1].[JobNo]=CONVERT_IMPLICIT(varchar(50),[#JobNo],0)
sts1 is a derived table, but the table it pulls jobno from is a varchar(50)
I don't understand why it's doing an implicit conversion between 2 varchars. Is it just because they are different lengths?
I'm fairly new to the execution plan
Is there an easy way to figure out which node in the exec plan relates to which part of the query?
Is the predicate, the join clause?
Regards
Mark
Some variables can have collation: enter link description here
Regardless you need to verify your collations, which can be specified at server, DB, table, and column level.
First, check your collation between tempdb and the vendor supplied database. It should match. If it doesn't, it will tend to do implicit conversions.
Assuming you cannot modify the vendor supplied code base, one or more of the following should help you:
1) Predefine your temp tables and specify the same collation for the key field as in the db in use, rather than tempdb.
2) Provide collations when doing string comparisons.
3) Specify collation for key values if using "select into" with a temp table
4) Make sure your collations on your tables and columns match your database collation (VERY important if you imported only specific tables from a vendor into an existing database.)
If you can change the vendor supplied code base, I would suggest reviewing the cost for making all of your char keys the same length and NOT varchar. Varchar has an overhead of 10. The caveat is that if you create a fixed length character field not null, it will be padded to the right (unavoidable).
Ideally, you would have int keys, and only use varchar fields for user interaction/lookup:
create table Products(ProductID int not null identity(1,1) primary key clustered, ProductNumber varchar(50) not null)
alter table Products add constraint uckProducts_ProductNumber unique(ProductNumber)
Then do all joins on ProductID, rather than ProductNumber. Just filter on ProductNumber.
would be perfectly fine.

How to maintain record history on table with one-to-many relationships?

I have a "services" table for detailing services that we provide. Among the data that needs recording are several small one-to-many relationships (all with a foreign key constraint to the service_id) such as:
service_owners -- user_ids responsible for delivery of service
service_tags -- e.g. IT, Records Management, Finance
customer_categories -- ENUM value
provider_categories -- ENUM value
software_used -- self-explanatory
The problem I have is that I want to keep a history of updates to a service, for which I'm using an update trigger on the table, that performs an insert into a history table matching the original columns. However, if a normalized approach to the above data is used, with separate tables and foreign keys for each one-to-many relationship, any update on these tables will not be recognised in the history of the service.
Does anyone have any suggestions? It seems like I need to store child keys in the service table to maintain the integrity of the service history. Is a delimited text field a valid approach here or, as I am using postgreSQL, perhaps arrays are also a valid option? These feel somewhat dirty though!
Thanks.
If your table is:
create table T (
ix int identity primary key,
val nvarchar(50)
)
And your history table is:
create table THistory (
ix int identity primary key,
val nvarchar(50),
updateType char(1), -- C=Create, U=Update or D=Delete
updateTime datetime,
updateUsername sysname
)
Then you just need to put an update trigger on all tables of interest. You can then find out what the state of any/all of the tables were at any point in history, to determine what the relationships were at that time.
I'd avoid using arrays in any database whenever possible.
I don't like updates for the exact reason you are saying here...you lose information as it's over written. My answer is quite simple...don't update. Not sure if you're at a point where this can be implemented...but if you can I'd recommend using the main table itself to store historical (no need for a second set of history tables).
Add a column to your main header table called 'active'. This can be a character or a bit (0 is off and 1 is on). Then it's a bit of trigger magic...when an update is preformed, you insert a row into the table identical to the record being over-written with a status of '0' (or inactive) and then update the existing row (this process keeps the ID column on the active record the same, the newly inserted record is the inactive one with a new ID).
This way no data is ever lost (admittedly you are storing quite a few rows...) and the history can easily be viewed with a select where active = 0.
The pain here is if you are working on something already implemented...every existing query that hits this table will need to be updated to include a check for the active column. Makes this solution very easy to implement if you are designing a new system, but a pain if it's a long standing application. Unfortunately existing reports will include both off and on records (without throwing an error) until you can modify the where clause