Delete duplicate entries in PostgreSQL

I have a users table with multiple identical user entries, and I need to delete the duplicates. How can I skip the entries that are referenced by foreign keys and delete the rest? For example, these are the rows in the table. I need to delete the duplicate entries that are not related to foreign keys. Could anyone please guide me on how to proceed with this in PostgreSQL?
 id   | name              | email              | role_id
------+-------------------+--------------------+---------
 2512 | Raja (Contractor) | raja_test#test.com | 5
 6    | Raja (Contractor) | raja_test#test.com | 5
 5    | Raja (Contractor) | raja_test#test.com | 5
I have tried the query below:
delete from users a using users b where a.email=b.email ;
ERROR: update or delete on table "users" violates foreign key constraint "fk_rails_c5e2af0763" on table "devices"
DETAIL: Key (id)=(14) is still referenced from table "devices".
Devices table:
 id | mac_address | model | user_id
----+-------------+-------+---------
 14 | 14:5E:BE:26 | Arris | 6

You can use:
ALTER TABLE users disable TRIGGER ALL;
-- your delete query
ALTER TABLE users enable TRIGGER ALL;
When you run DISABLE TRIGGER ALL in PostgreSQL, the internal triggers that enforce foreign-key constraints on the selected table are disabled along with the user-defined ones. Be aware that deleting in this state bypasses the foreign-key checks entirely, so it can leave orphaned rows in referencing tables such as devices.
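Instead of disabling triggers, the delete itself can be written to skip rows that are still referenced, which is what the question asks for. A minimal sketch, assuming devices is the only referencing table and that each duplicated email has at least one row referenced from devices (emails with no referenced row would need an extra tie-break, such as keeping the smallest id):

```sql
-- Delete users that are (a) not referenced from devices and
-- (b) duplicates of a user row that IS referenced from devices.
DELETE FROM users u
WHERE NOT EXISTS (SELECT 1 FROM devices d WHERE d.user_id = u.id)
  AND EXISTS (
        SELECT 1
        FROM users keep
        WHERE keep.email = u.email
          AND keep.id <> u.id
          AND EXISTS (SELECT 1 FROM devices d WHERE d.user_id = keep.id)
      );
```

With the sample data this removes ids 2512 and 5 and keeps id 6, the row that devices points at, without touching any triggers.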

Related

Creating one-many relationship between street sections and car scans?

I have two tables and need to create a one-to-many relationship between them. tlbsections represents a series of street sections as lines in a city; each street section has its own id.
tlbscans represents an on-street scan of a street section, counting the cars on it. I need to relate tlbscans to tlbsections, since a street section can have more than one scan. What is a good way to do this with the example data below?
tlbsections
 id (PK) | geom | section
---------+------+---------
 1       | xy   | 5713
 2       | xy   | 5717
tlbscans
 section | a  | b
---------+----+----
 5713    | 30 | 19
 5717    | 2  | 1
The overwhelming question is: is the column section unique in tlbsections? If it is, then create a unique constraint on it, and then create a FK on column section in table `tlbscans` referencing it. Assuming the tables already exist:
alter table tlbsections
  add constraint section_uk
  unique (section);

alter table tlbscans
  add constraint scans_section_fk
  foreign key (section)
  references tlbsections(section);
If column section in tlbsections is not unique, then you cannot build the relationship as currently defined. Without much more detail, I suggest you add a column to hold tlbsections.id, create a FK on the new column, and then drop the column section from tlbscans.
alter table tlbscans
  add tlbsections_id <data type>;

alter table tlbscans
  add constraint sections_fk
  foreign key (tlbsections_id)
  references tlbsections(id);

alter table tlbscans
  drop column section;
There may be other options, but none are apparent from what was provided.
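One step the second sequence leaves implicit is populating the new column before section is dropped. A sketch of that backfill, assuming the table and column names above:

```sql
-- Copy the matching id across from the old natural key; run this
-- after adding tlbsections_id and before dropping section.
UPDATE tlbscans s
SET tlbsections_id = t.id
FROM tlbsections t
WHERE t.section = s.section;
```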

duplicate key in a PostgreSQL index

I want to move my ownCloud database to a new server, but the operation fails during restore.
pg_restore: [archiver (db)] COPY failed for table "oc_storages": ERROR: duplicate key value violates unique constraint "storages_id_index"
DETAIL: Key (id)=(local::/var/www/owncloud_data/) already exists.
Indeed, a simple query on the oc_storages table shows that there is a duplicate.
ocl=# select * from oc_storages where id ~* 'owncloud_data';
id | numeric_id | available | last_checked
--------------------------------+------------+-----------+--------------
local::/var/www/owncloud_data/ | 491 | 1 |
local::/var/www/owncloud_data/ | 838 | 1 |
(2 rows)
but at the same time, PostgreSQL managed to create a unique index for this table based on the id (storages_id_index). How is it possible that PostgreSQL accepts this duplicate in the table?
ocl=# SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'oc_storages';
indexname | indexdef
-------------------+-------------------------------------------------------------------------------------
oc_storages_pkey | CREATE UNIQUE INDEX oc_storages_pkey ON public.oc_storages USING btree (numeric_id)
storages_id_index | CREATE UNIQUE INDEX storages_id_index ON public.oc_storages USING btree (id)
(2 rows)
How do I get out of this impasse: delete one of the two rows? If so, which one?
Thanks in advance.
Ernest.
There are usually two explanations for this:
Hardware problems leading to data corruption. Then remove conflicting rows manually, export the database and import it into a newly created cluster to get rid of potential lurking data corruption.
You upgraded the C library on the operating system and the collations changed, corrupting the index. Then remove conflicting rows manually and REINDEX the indexes with string columns.
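Applied to this case, the manual repair could look like the sketch below. Which numeric_id to keep is an assumption here: check which of 491 and 838 the rest of the ownCloud schema (e.g. oc_filecache) actually references before deleting anything.

```sql
-- Drop the duplicate row (assuming 838 is the unreferenced one),
-- then rebuild the possibly corrupted index on the string column.
DELETE FROM oc_storages WHERE numeric_id = 838;
REINDEX INDEX storages_id_index;
```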
This is one of those semantic annoyances I have with Postgres: creating a UNIQUE INDEX on a table does not actually add a table constraint (the index enforces uniqueness while it is intact, but no constraint is recorded in the catalog).
You need to explicitly add the constraint USING the created index, e.g.:
CREATE UNIQUE INDEX oc_storages_pkey ON public.oc_storages USING btree (numeric_id);
ALTER TABLE public.oc_storages ADD CONSTRAINT oc_storages_pkey UNIQUE USING INDEX oc_storages_pkey;
If you do have such a table constraint already, then this would be a case of corruption.

How can I update or insert a row by checking whether the data already exists in 2 of the columns?

My table looks like this:
id | u_id | server_id | user_id | pts | level | count | timestamp
When inserting or updating, I want to check whether the values being written already exist in both the server_id and user_id columns. In other words, no two rows can have the same server_id and user_id.
Add a unique constraint to the table:
alter table tablename add constraint ids_unique unique (server_id, user_id);
and then handle possible exceptions.
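On PostgreSQL 9.5+ the exception handling can be avoided with ON CONFLICT, which turns the insert into an upsert against that constraint. A sketch with made-up values; the assumption is that pts, level, count and timestamp should be refreshed when the (server_id, user_id) pair already exists:

```sql
INSERT INTO tablename (u_id, server_id, user_id, pts, level, count, timestamp)
VALUES (1, 42, 7, 100, 3, 1, now())
ON CONFLICT ON CONSTRAINT ids_unique
DO UPDATE SET pts       = EXCLUDED.pts,
              level     = EXCLUDED.level,
              count     = EXCLUDED.count,
              timestamp = EXCLUDED.timestamp;
```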

Cause of PostgreSQL foreign key violation?

My PostgreSQL (9.2) database contains two tables registrations and attributes with a foreign key constraint:
postgres=# \d+ registrations;
Table "public.registrations"
Column | Type | Modifiers | Storage | Stats target | Description
---------+-------+-----------+----------+--------------+-------------
name | text | not null | extended | |
parent | text | | extended | |
storage | bytea | | extended | |
Indexes:
"registrations_pkey" PRIMARY KEY, btree (name)
Referenced by:
TABLE "attributes" CONSTRAINT "attributes_cname_fkey" FOREIGN KEY (cname) REFERENCES registrations(name) ON DELETE CASCADE
Has OIDs: no
postgres=# \d+ attributes;
Table "public.attributes"
Column | Type | Modifiers | Storage | Stats target | Description
--------+-------+-----------+----------+--------------+-------------
cname | text | not null | extended | |
aname | text | not null | extended | |
tags | text | | extended | |
value | bytea | | extended | |
Indexes:
"attributes_pkey" PRIMARY KEY, btree (cname, aname)
Foreign-key constraints:
"attributes_cname_fkey" FOREIGN KEY (cname) REFERENCES registrations(name) ON DELETE CASCADE
Has OIDs: no
At some point I realised that some rows violated the foreign key constraint:
postgres=# SELECT COUNT(*) FROM attributes LEFT JOIN registrations ON attributes.cname=registrations.name WHERE registrations.name IS NULL;
count
-------
71
(1 row)
Could you help me understand how this corruption could happen?
A constraint marked as NOT VALID is the one case you might expect to see violations, but the NOT VALID clause would show up in the psql \d+ output. (I believe it's possible to manually update this flag in the catalog, but I hope for your sake that this isn't the issue...)
As far as I know, the only supported way of bypassing a foreign key check is to SET session_replication_role TO replica before modifying the data. This is there for the benefit of replication processes, operating under the assumption that the constraint has already been validated on the master - though this can certainly go wrong if your replicator is buggy or misconfigured.
It's also possible for a superuser to manually disable the constraint's underlying triggers (and it's often tempting for someone trying to speed up a bulk import). The following will tell you if the triggers are currently active (tgenabled should be 'O'):
SELECT *
FROM pg_trigger
WHERE tgname ~ '^RI_ConstraintTrigger'
  AND tgrelid IN ('registrations'::regclass, 'attributes'::regclass);
I don't think there's any way of knowing whether this was temporarily changed in the past, though if you have statement logging enabled, you might find an ALTER TABLE ... DISABLE TRIGGER statement in there somewhere.
There is also at least one loophole in the foreign key enforcement, and of course, it's always possible that you've found a bug...
This can happen if the FK constraint was created with a NOT VALID clause (don't do this):
CREATE TABLE one
( one_id INTEGER NOT NULL PRIMARY KEY
);
CREATE TABLE two
( two_id INTEGER NOT NULL PRIMARY KEY
, one_id INTEGER NOT NULL
);
INSERT INTO one(one_id)
SELECT gs FROM generate_series(0,12,2) gs;
INSERT INTO two(two_id,one_id)
SELECT gs, gs FROM generate_series(0,12,3) gs;
ALTER TABLE two
ADD CONSTRAINT omg FOREIGN KEY (one_id) references one(one_id)
-- DEFERRABLE INITIALLY DEFERRED
NOT VALID
;
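With the example above, the violating rows can be listed with an anti-join, and validating the constraint afterwards makes Postgres perform the same check itself:

```sql
-- Rows in two whose one_id has no match in one; with the data above
-- these are the rows with one_id 3 and 9.
SELECT t.two_id, t.one_id
FROM two t
LEFT JOIN one o ON o.one_id = t.one_id
WHERE o.one_id IS NULL;

-- This fails until the offending rows are fixed or removed:
ALTER TABLE two VALIDATE CONSTRAINT omg;
```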

Split table with duplicates into 2 normalized tables?

I have a table with some duplicate rows that I want to normalize into 2 tables.
user | url | keyword
-----|-----|--------
fred | foo | kw1
fred | bar | kw1
sam | blah| kw2
I'd like to start by normalizing this into two tables (user, and url_keyword). Is there a query I can run to normalize this, or do I need to loop through the table with a script to build the tables?
You can do it with a few queries, though I'm not that familiar with PostgreSQL. Create a table users first, with an identity column, and also add a column userID to the existing table.
Then something along these lines:
INSERT INTO users (userName)
SELECT DISTINCT "user" FROM url_keyword;  -- "user" is reserved in PostgreSQL; unquoted it returns the current role

UPDATE url_keyword
SET userID = (SELECT ID FROM users WHERE userName = "user");
Then you can drop the old user column, create the foreign key constraint, etc.
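Put together as PostgreSQL, the whole migration could look like the sketch below; the names users, username and user_id are assumptions, and the source table is taken to be url_keyword with columns "user", url, keyword:

```sql
CREATE TABLE users (
    id       serial PRIMARY KEY,
    username text   NOT NULL UNIQUE
);

-- "user" is a reserved word in PostgreSQL, hence the quotes.
INSERT INTO users (username)
SELECT DISTINCT "user" FROM url_keyword;

ALTER TABLE url_keyword ADD COLUMN user_id integer;

-- Backfill the new column, then lock it down and drop the old one.
UPDATE url_keyword uk
SET user_id = u.id
FROM users u
WHERE u.username = uk."user";

ALTER TABLE url_keyword
    ALTER COLUMN user_id SET NOT NULL,
    ADD CONSTRAINT url_keyword_user_fk
        FOREIGN KEY (user_id) REFERENCES users(id);

ALTER TABLE url_keyword DROP COLUMN "user";
```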