Split table with duplicates into 2 normalized tables?

I have a table with some duplicate rows that I want to normalize into 2 tables.
user | url  | keyword
-----|------|--------
fred | foo  | kw1
fred | bar  | kw1
sam  | blah | kw2
I'd like to start by normalizing this into two tables (user, and url_keyword). Is there a query I can run to normalize this, or do I need to loop through the table with a script to build the tables?

You can do it with a few queries, although I'm not very familiar with PostgreSQL. First create a users table with an identity column, and add a userID column to the existing table.
Then run something along these lines (user is a reserved word in PostgreSQL, so the column name has to be double-quoted):
INSERT INTO users (userName)
SELECT DISTINCT "user" FROM url_keyword;
UPDATE url_keyword
SET userID = (SELECT ID FROM users WHERE users.userName = url_keyword."user");
Then you can drop the old user column, create the foreign key constraint, etc.
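The steps above can be tried end to end with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made only so the example runs without a server), with the sample rows from the question. The INTEGER PRIMARY KEY column plays the role of the identity column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE url_keyword ("user" TEXT, url TEXT, keyword TEXT);
    INSERT INTO url_keyword VALUES
        ('fred', 'foo',  'kw1'),
        ('fred', 'bar',  'kw1'),
        ('sam',  'blah', 'kw2');

    -- 1. Lookup table with a generated key, plus the new column.
    CREATE TABLE users (ID INTEGER PRIMARY KEY, userName TEXT UNIQUE);
    ALTER TABLE url_keyword ADD COLUMN userID INTEGER REFERENCES users(ID);

    -- 2. One users row per distinct name.
    INSERT INTO users (userName)
    SELECT DISTINCT "user" FROM url_keyword;

    -- 3. Point every original row at its users row.
    UPDATE url_keyword
    SET userID = (SELECT ID FROM users
                  WHERE users.userName = url_keyword."user");
""")

users = conn.execute("SELECT ID, userName FROM users").fetchall()
# Joining back through userID should reproduce the original user names.
joined = conn.execute("""
    SELECT u.userName, k.url
    FROM url_keyword k JOIN users u ON u.ID = k.userID
    ORDER BY k.url
""").fetchall()
```

Once the join reproduces the original data, the old user column is redundant and can be dropped.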

Related

Creating one-many relationship between street sections and car scans?

I have two tables and need to create a one-to-many relationship between them. tlbsections represents a series of street sections as lines in a city. Each street section has its own id.
tlbscans represents an on-street scan of a street section, counting the cars on it. I need to relate tlbscans to tlbsections, as a street section can have more than one scan. What is a good way to do this with the example data below?
tlbsections
ID(PK) | geom | section
1      | xy   | 5713
2      | xy   | 5717
tlbscans
section | a  | b
5713    | 30 | 19
5717    | 2  | 1

The key question: is the column section unique in tlbsections? If it is, create a unique constraint on it, then create a FK on column section in table tlbscans referencing it. Assuming the tables already exist:
alter table tlbsections
add constraint section_uk
unique (section);
alter table tlbscans
add constraint scans_section_fk
foreign key (section)
references tlbsections(section);
If column section in tlbsections is not unique, then you cannot build the relationship as currently defined. Without much more detail, I suggest you add a column to tlbscans to contain tlbsections.id, populate it, create a FK on the new column, then drop column section from tlbscans.
alter table tlbscans
add tlbsections_id <data type>;
alter table tlbscans
add constraint sections_fk
foreign key (tlbsections_id)
references tlbsections(id);
alter table tlbscans
drop column section;
There may be other options, but none are apparent from what is provided.
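As a rough illustration of the first option (section is unique), here is a sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL. SQLite cannot add constraints with ALTER TABLE, so the constraints are declared when the tables are created, which is an assumption of this sketch and not what the answer prescribes. The third scan row (28, 17) is made-up data added to show several scans for one section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE tlbsections (
        id      INTEGER PRIMARY KEY,
        geom    TEXT,
        section INTEGER UNIQUE          -- the "one" side must be unique
    );
    CREATE TABLE tlbscans (
        section INTEGER REFERENCES tlbsections(section),
        a INTEGER,
        b INTEGER
    );
    INSERT INTO tlbsections VALUES (1, 'xy', 5713), (2, 'xy', 5717);
    INSERT INTO tlbscans VALUES (5713, 30, 19), (5717, 2, 1);
    -- A second scan of the same section is allowed (one-to-many).
    INSERT INTO tlbscans VALUES (5713, 28, 17);
""")

def try_orphan_scan():
    """Insert a scan for a section that does not exist; return True if rejected."""
    try:
        conn.execute("INSERT INTO tlbscans VALUES (9999, 1, 1)")
    except sqlite3.IntegrityError:
        return True
    return False

scans_per_section = conn.execute(
    "SELECT section, COUNT(*) FROM tlbscans GROUP BY section ORDER BY section"
).fetchall()
```

The foreign key lets one section own many scans while rejecting scans that point at a section that is not in tlbsections.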

Delete duplicate entries in Postgresql

I have a users table that contains multiple entries for the same user, and I need to delete the duplicates. How can I skip the entries that are referenced by foreign keys and delete the rest? The entries in my table are shown below. I need to delete only the duplicates that are not referenced by a foreign key. Could anyone please explain how to do this in PostgreSQL?
id   | name              | email              | role_id
2512 | Raja (Contractor) | raja_test#test.com | 5
6    | Raja (Contractor) | raja_test#test.com | 5
5    | Raja (Contractor) | raja_test#test.com | 5
I have tried the query below:
delete from users a using users b where a.email=b.email ;
ERROR: update or delete on table "users" violates foreign key constraint "fk_rails_c5e2af0763" on table "devices"
DETAIL: Key (id)=(14) is still referenced from table "devices".
Devices table
id | mac_address | model | user_id
14 | 14:5E:BE:26 | Arris | 6

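You can use:
ALTER TABLE users DISABLE TRIGGER ALL;
-- your delete query
ALTER TABLE users ENABLE TRIGGER ALL;
DISABLE TRIGGER ALL in PostgreSQL also disables the internal system triggers that enforce foreign keys, which requires superuser privileges. The delete will then succeed, but be careful: nothing is checked while the triggers are off, so rows in devices can be left pointing at users that no longer exist.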
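An alternative that leaves the foreign keys enabled is to delete only the duplicates that nothing references. The sketch below is one way to express that, using Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption made so the example runs on its own). The rule for which row survives (keep referenced rows, otherwise keep the smallest id) is also an assumption, not something stated in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, role_id INTEGER);
    CREATE TABLE devices (
        id INTEGER PRIMARY KEY, mac_address TEXT, model TEXT,
        user_id INTEGER REFERENCES users(id)
    );
    INSERT INTO users VALUES
        (2512, 'Raja (Contractor)', 'raja_test#test.com', 5),
        (6,    'Raja (Contractor)', 'raja_test#test.com', 5),
        (5,    'Raja (Contractor)', 'raja_test#test.com', 5);
    INSERT INTO devices VALUES (14, '14:5E:BE:26', 'Arris', 6);
""")

# A row is deletable when no device references it AND there is another row
# with the same email that is either referenced by a device or has a smaller id.
DELETABLE = """
    SELECT u.id FROM users u
    WHERE NOT EXISTS (SELECT 1 FROM devices d WHERE d.user_id = u.id)
      AND EXISTS (
            SELECT 1 FROM users k
            WHERE k.email = u.email AND k.id <> u.id
              AND (k.id < u.id
                   OR EXISTS (SELECT 1 FROM devices d WHERE d.user_id = k.id))
      )
"""
# Collect the ids first, then delete, so the selection is not affected
# by rows disappearing while it is being evaluated.
doomed = sorted(row[0] for row in conn.execute(DELETABLE))
conn.executemany("DELETE FROM users WHERE id = ?", [(i,) for i in doomed])
conn.commit()
remaining = [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]
```

With the sample data, the row referenced by the device is the one that survives, and no foreign key error is raised.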

How can I update or insert a where by looking to see the data already exists in 2 of the columns?

My table looks like this:
id | u_id | server_id | user_id | pts | level | count | timestamp
I want to check when inserting new data or updating data if the values from data to be inserted already exist in both the server_id or user_id column. In other words, no two rows can have the same server_id or user_id.

Add a unique constraint to the table:
alter table tablename add constraint ids_unique unique (server_id, user_id);
and then handle the unique-violation errors raised when an insert or update conflicts with an existing row.
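Handling the exception might look like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the table name scores, the reduced column list, and the helper insert_or_update are all hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE scores (
        id        INTEGER PRIMARY KEY,
        server_id INTEGER NOT NULL,
        user_id   INTEGER NOT NULL,
        pts       INTEGER NOT NULL DEFAULT 0,
        CONSTRAINT ids_unique UNIQUE (server_id, user_id)
    )
""")

def insert_or_update(server_id, user_id, pts):
    """Insert a row; if the (server_id, user_id) pair exists, update it instead."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO scores (server_id, user_id, pts) VALUES (?, ?, ?)",
                (server_id, user_id, pts),
            )
        return "inserted"
    except sqlite3.IntegrityError:
        with conn:
            conn.execute(
                "UPDATE scores SET pts = ? WHERE server_id = ? AND user_id = ?",
                (pts, server_id, user_id),
            )
        return "updated"

first = insert_or_update(1, 42, 10)
second = insert_or_update(1, 42, 25)   # same pair: falls back to UPDATE
third = insert_or_update(2, 42, 5)     # same user on another server: new row
rows = conn.execute(
    "SELECT server_id, user_id, pts FROM scores ORDER BY server_id"
).fetchall()
```

Note that a composite constraint only forbids repeating the pair. If server_id and user_id must each be unique on their own, two separate single-column unique constraints would be needed instead.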

ERROR: cannot create a unique index without the column "date_time" (used in partitioning)

I just started using TimescaleDB with PostgreSQL. I have a database named storage_db which contains a table named day_ahead_prices.
After installing TimescaleDB, I was following the "Migrate from the same PostgreSQL database" guide to migrate my storage_db to TimescaleDB.
When I did (indexes included):
CREATE TABLE tsdb_day_ahead_prices (LIKE day_ahead_prices INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES);
select create_hypertable('tsdb_day_ahead_prices', 'date_time');
It gave me the following error:
ERROR: cannot create a unique index without the column "date_time" (used in partitioning)
But when I did (indexes excluded):
CREATE TABLE tsdb_day_ahead_prices (LIKE day_ahead_prices INCLUDING DEFAULTS INCLUDING CONSTRAINTS EXCLUDING INDEXES);
select create_hypertable('tsdb_day_ahead_prices', 'date_time');
It was successful. Following which, I did
select create_hypertable('tsdb_day_ahead_prices', 'date_time');
and it gave me the following output:
create_hypertable
------------------------------------
(3,public,tsdb_day_ahead_prices,t)
(1 row)
I am a bit new to this so can anyone please explain to me what is the difference between both of them and why was I getting an error in the first case?
P.S.:
My day_ahead_prices looks as follows:
id | country_code | values | date_time
----+--------------+---------+----------------------------
1 | LU | 100.503 | 2020-04-11 14:04:30.461605
2 | LU | 100.503 | 2020-04-11 14:18:39.600574
3 | DE | 106.68 | 2020-04-11 15:59:10.223965
Edit 1:
I created the day_ahead_prices table in python using flask and flask_sqlalchemy and the code is:
class day_ahead_prices(db.Model):
    __tablename__ = "day_ahead_prices"
    id = db.Column(db.Integer, primary_key=True)
    country_code = db.Column(avail_cc_enum, nullable=False)
    values = db.Column(db.Float(precision=2), nullable=False)
    date_time = db.Column(db.DateTime, default=datetime.now(tz=tz), nullable=False)

    def __init__(self, country_code, values):
        self.country_code = country_code
        self.values = values

When you execute CREATE TABLE tsdb_day_ahead_prices (LIKE day_ahead_prices INCLUDING DEFAULTS INCLUDING CONSTRAINTS INCLUDING INDEXES); you're telling the database to create the tsdb_day_ahead_prices table using day_ahead_prices as a template (same columns, same types for those columns), and also to copy the default values, constraints and indexes defined on the original table to the new table.
Then you execute the TimescaleDB command that makes the tsdb_day_ahead_prices table a hypertable. A hypertable is an abstraction that hides away the partitioning of the physical table (https://www.timescale.com/products/how-it-works). You are telling TimescaleDB to make tsdb_day_ahead_prices a hypertable using the date_time column as the partitioning key.
When creating hypertables, one constraint that TimescaleDB imposes is that the partitioning column (in your case date_time) must be included in any unique indexes (and primary keys) for that table (https://docs.timescale.com/latest/using-timescaledb/schema-management#indexing-best-practices).
The first error you get, cannot create a unique index without the column "date_time", is exactly because of this. You copied the primary key definition on the id column, and a primary key is backed by a unique index, so that primary key prevents the table from becoming a hypertable.
The second time, you created the tsdb_day_ahead_prices table without copying the indexes from the original table, so the primary key (which is really a unique index) is not defined, and the creation of the hypertable was successful.
The output of the create_hypertable function tells you that you have a new hypertable: it lists the internal id that TimescaleDB uses for it, the schema (public), the name of the hypertable, and whether it was created.
So now you can use tsdb_day_ahead_prices as normal, and TimescaleDB will make sure the data goes into the proper partitions/chunks underneath.
Does the id need to be unique for this table? If you're going to be keeping time-series data, then each row may not really be unique for each id, but may be uniquely identified by the id at a given time.
You can create a separate table for the items that you're identifying, items(id PRIMARY KEY, country_code), and have the hypertable be day_ahead_prices(time, value, item_id REFERENCES items(id)).
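The rule about unique indexes can be restated as a tiny check. The sketch below is plain Python and does not call TimescaleDB. The helper name and the index name tsdb_day_ahead_prices_pkey are hypothetical, chosen only to illustrate why the copied primary key on id is rejected while a key that also covers date_time would not be.

```python
def unique_indexes_missing_partition_column(unique_indexes, partition_column):
    """Return names of unique indexes whose columns omit the partitioning column."""
    return [
        name
        for name, columns in unique_indexes.items()
        if partition_column not in columns
    ]

# Unique indexes as copied by INCLUDING INDEXES: only the primary key on id.
copied = {"tsdb_day_ahead_prices_pkey": ("id",)}
# A composite key that includes the partitioning column.
composite = {"tsdb_day_ahead_prices_pkey": ("id", "date_time")}

blocked = unique_indexes_missing_partition_column(copied, "date_time")
allowed = unique_indexes_missing_partition_column(composite, "date_time")
```

Any index name returned by the helper corresponds to an index that would trigger the error from the question. An empty list means the rule is satisfied.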

Understanding the output of PSQL's \dp and \z

I'm having trouble selecting from a database in psql. This is the output for the table I'm interested in. Can someone decipher the access privileges for me? I know that arwdRxt means append, read, write, etc., but the syntax is confusing to me: what exactly do the slashes and equals signs mean in the access privileges column? Please let me know if my question isn't clear.
Access privileges
schema | name | type | access privileges
--------+---------------+------+-------------------------
public | table_name | view | amazonuser=arwdRxt/amazonuser+
| | | readonly=r/amazonuser

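It is described in detail in the docs. The role before the = is the one that holds the permissions, the letters between the = and the / are the permissions themselves, and the role after the / is the one that granted them. An empty name before the = means PUBLIC.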

From the docs:
Privilege  | Abbreviation | Applicable Object Types
SELECT     | r (“read”)   | LARGE OBJECT, SEQUENCE, TABLE (and table-like objects), table column
INSERT     | a (“append”) | TABLE, table column
UPDATE     | w (“write”)  | LARGE OBJECT, SEQUENCE, TABLE, table column
DELETE     | d            | TABLE
TRUNCATE   | D            | TABLE
REFERENCES | x            | TABLE, table column
TRIGGER    | t            | TABLE
CREATE     | C            | DATABASE, SCHEMA, TABLESPACE
CONNECT    | c            | DATABASE
TEMPORARY  | T            | DATABASE
EXECUTE    | X            | FUNCTION, PROCEDURE
USAGE      | U            | DOMAIN, FOREIGN DATA WRAPPER, FOREIGN SERVER, LANGUAGE, SCHEMA, SEQUENCE, TYPE
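To make the grantee=privileges/grantor layout concrete, here is a small Python sketch that splits one entry from the access privileges column and expands the letters using the table above. The function name parse_acl_entry is hypothetical. Letters that are not in the table (such as the R in the question's output) are passed through unchanged instead of being guessed at.

```python
PRIVILEGES = {
    "r": "SELECT", "a": "INSERT", "w": "UPDATE", "d": "DELETE",
    "D": "TRUNCATE", "x": "REFERENCES", "t": "TRIGGER", "C": "CREATE",
    "c": "CONNECT", "T": "TEMPORARY", "X": "EXECUTE", "U": "USAGE",
}

def parse_acl_entry(entry):
    """Split 'grantee=privs/grantor' into its three parts."""
    entry = entry.strip().rstrip("+")       # psql marks a wrapped line with '+'
    grantee, _, rest = entry.partition("=")
    letters, _, grantor = rest.partition("/")
    return {
        "grantee": grantee or "PUBLIC",      # empty grantee means PUBLIC
        "privileges": [PRIVILEGES.get(ch, ch) for ch in letters],
        "grantor": grantor,
    }

owner = parse_acl_entry("amazonuser=arwdRxt/amazonuser+")
reader = parse_acl_entry("readonly=r/amazonuser")
```

Here reader shows that the readonly role holds only SELECT on the view, granted by amazonuser.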
