SQLAlchemy - Many to Many - Permitting Duplicates - PostgreSQL

I have the many-to-many setup below, which works as expected: I can have many Access_Paths linked to many Datasets. However, I would like to allow duplicate instances of a Dataset within an Access_Path. Currently any attempt to add multiple instances results in just one.
class Association(Base):
    __tablename__ = 'link_dataset_url'
    __table_args__ = {'autoload': True}
    dataset = Column('dataset_id', ForeignKey('dataset_attributes.id'), primary_key=False)
    accesspath = Column('access_path_id', ForeignKey('access_path.id'), primary_key=False)

class Dataset(Base):
    __tablename__ = 'dataset'
    __table_args__ = {'autoload': True}
    access_path = relationship("Access_Path", secondary='link_dataset_url')

class Access_Path(Base):
    __tablename__ = 'access_path'
    __table_args__ = {'autoload': True}
    datasets = relationship("Dataset", secondary='link_dataset_url')
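One possible direction, sketched below (not from the original post; it assumes the link_dataset_url table has, or can be given, its own surrogate id primary key): with a plain secondary= relationship the link table is usually keyed on the two foreign keys, so a second identical pairing is either rejected or collapsed. The association-object pattern maps the relationship through the Association class instead, so every append creates a distinct link row. Names mirror the classes above, and the other columns are still autoloaded as in the original.

from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import relationship

class Association(Base):
    __tablename__ = 'link_dataset_url'
    id = Column(Integer, primary_key=True)  # surrogate key, so duplicate pairings are allowed
    dataset_id = Column(ForeignKey('dataset_attributes.id'))
    access_path_id = Column(ForeignKey('access_path.id'))
    dataset = relationship('Dataset', back_populates='links')
    access_path = relationship('Access_Path', back_populates='links')

class Dataset(Base):
    __tablename__ = 'dataset'
    __table_args__ = {'autoload': True}
    links = relationship('Association', back_populates='dataset')

class Access_Path(Base):
    __tablename__ = 'access_path'
    __table_args__ = {'autoload': True}
    links = relationship('Association', back_populates='access_path')

# Appending two Association rows that point at the same Dataset now yields two link rows:
# path.links.append(Association(dataset=ds))
# path.links.append(Association(dataset=ds))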


How to Initialise & Populate a Postgres Database with Circular ForeignKeys in SQLModel?

Goal:
I'm trying to use SQLModel (a wrapper that ties together pydantic and sqlalchemy) to define and interact with the back-end database for a cleaning company. Specifically, I'm trying to model a system where customers can have multiple properties that need to be cleaned, and each customer has a single lead person who has a single mailing property (to contact them at). Ideally, I want to be able to use a single table for both the mailing properties and the cleaning properties (as in most instances they will be the same).
Constraints:
Customers can be either individual people or organisations
A lead person must be identifiable for each customer
Each person must be matched to a property (so that their mailing address can be identified)
A single customer can have multiple properties attached to them (e.g. for a landlord that includes cleaning as part of the rent)
The issue is that the foreign keys have a circular dependency.
Customer -> Person based on the lead_person_id
Person -> Property based on the mailing_property_id
Property -> Customer based on the occupant_customer_id
Code to reproduce the issue:
# Imports
from typing import Optional, List
from sqlmodel import Session, Field, SQLModel, Relationship, create_engine
import uuid as uuid_pkg

# Defining schemas
class Person(SQLModel, table=True):
    person_id: uuid_pkg.UUID = Field(default_factory=uuid_pkg.uuid4, primary_key=True, index=True, nullable=True)
    first_names: str
    last_name: str
    mailing_property_id: uuid_pkg.UUID = Field(foreign_key='property.property_id')
    customer: Optional['Customer'] = Relationship(back_populates='lead_person')
    mailing_property: Optional['Property'] = Relationship(back_populates='person')

class Customer(SQLModel, table=True):
    customer_id: uuid_pkg.UUID = Field(default_factory=uuid_pkg.uuid4, primary_key=True, index=True, nullable=True)
    lead_person_id: uuid_pkg.UUID = Field(foreign_key='person.person_id')
    contract_type: str
    lead_person: Optional['Person'] = Relationship(back_populates='customer')
    contracted_properties: Optional[List['Property']] = Relationship(back_populates='occupant_customer')

class Property(SQLModel, table=True):
    property_id: uuid_pkg.UUID = Field(default_factory=uuid_pkg.uuid4, primary_key=True, index=True, nullable=True)
    occupant_customer_id: uuid_pkg.UUID = Field(foreign_key='customer.customer_id')
    address: str
    person: Optional['Person'] = Relationship(back_populates='mailing_property')
    occupant_customer: Optional['Customer'] = Relationship(back_populates='contracted_properties')

# Initialising the database
engine = create_engine(f'postgresql://{DB_USERNAME}:{DB_PASSWORD}@{DB_URL}:{DB_PORT}/{DB_NAME}')
SQLModel.metadata.create_all(engine)

# Defining the database entries
john = Person(
    person_id = 'eb7a0f5d-e09b-4b36-8e15-e9541ea7bd6e',
    first_names = 'John',
    last_name = 'Smith',
    mailing_property_id = '4d6aed8d-d1a2-4152-ae4b-662baddcbef4'
)
johns_lettings = Customer(
    customer_id = 'cb58199b-d7cf-4d94-a4ba-e7bb32f1cda4',
    lead_person_id = 'eb7a0f5d-e09b-4b36-8e15-e9541ea7bd6e',
    contract_type = 'Landlord Premium'
)
johns_property_1 = Property(
    property_id = '4d6aed8d-d1a2-4152-ae4b-662baddcbef4',
    occupant_customer_id = 'cb58199b-d7cf-4d94-a4ba-e7bb32f1cda4',
    address = '123 High Street'
)
johns_property_2 = Property(
    property_id = '2ac15ac9-9ab3-4a7c-80ad-961dd565ab0a',
    occupant_customer_id = 'cb58199b-d7cf-4d94-a4ba-e7bb32f1cda4',
    address = '456 High Street'
)

# Committing the database entries
with Session(engine) as session:
    session.add(john)
    session.add(johns_lettings)
    session.add(johns_property_1)
    session.add(johns_property_2)
    session.commit()
Results in:
ForeignKeyViolation: insert or update on table "customer" violates foreign key constraint "customer_lead_person_id_fkey"
DETAIL: Key (lead_person_id)=(eb7a0f5d-e09b-4b36-8e15-e9541ea7bd6e) is not present in table "person".
This issue is specific to Postgres, which, unlike SQLite (used in the docs), enforces foreign key constraints as rows are inserted; SQLite does not enforce them by default. I.e. replacing engine = create_engine(f'postgresql://{DB_USERNAME}:{DB_PASSWORD}@{DB_URL}:{DB_PORT}/{DB_NAME}') with engine = create_engine('sqlite:///test.db') lets the entries be committed without an error - however my use-case is with a Postgres DB.
Attempted Solutions:
Used link tables between customers/people and properties/customers - no luck
Used Session.exec with this code from SO to temporarily remove foreign key constraints then add them back on - no luck
Used primary joins instead of foreign keys as described in this SQLModel Issue - no luck
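A hedged sketch of one further avenue (not something the original question tried): Postgres can be told to postpone the foreign key checks until COMMIT by declaring the constraints DEFERRABLE INITIALLY DEFERRED, which lets all three rows be inserted in a single transaction regardless of order. In SQLModel that means handing the field an explicit sa_column; the snippet below mirrors Person.mailing_property_id from the models above, with the Relationship definitions omitted for brevity.

import uuid as uuid_pkg
from sqlalchemy import Column, ForeignKey
from sqlalchemy.dialects.postgresql import UUID as PG_UUID
from sqlmodel import Field, SQLModel

class Person(SQLModel, table=True):
    person_id: uuid_pkg.UUID = Field(default_factory=uuid_pkg.uuid4, primary_key=True)
    first_names: str
    last_name: str
    # Deferred FK: Postgres checks it only at COMMIT, so this row can be
    # inserted before the Property it points at exists.
    mailing_property_id: uuid_pkg.UUID = Field(
        sa_column=Column(
            PG_UUID(as_uuid=True),
            ForeignKey('property.property_id', deferrable=True, initially='DEFERRED'),
        )
    )

Declaring Customer.lead_person_id and Property.occupant_customer_id the same way breaks the cycle: the checks run once at session.commit(), when all three rows are present. An alternative under the same assumptions is to make one of the foreign key columns nullable, insert the rows, and then update it once the referenced row exists.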

Two-level join in SQLAlchemy

I have three Model classes, representing three tables in my PostgreSQL database: Project, Label, ProjectLabel. Many projects can have multiple labels:
class Project(db.Model):
    __tablename__ = 'projects'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String())
    labels = db.relationship('ProjectLabel')

class Label(db.Model):
    __tablename__ = 'labels'
    label_id = db.Column(db.Integer, primary_key=True)
    label_name = db.Column(db.String())

class ProjectLabel(db.Model):
    __tablename__ = 'projects_labels'
    projectlabel_id = db.Column(db.Integer, primary_key=True)
    projectlabel_projectid = db.Column(db.Integer, db.ForeignKey('projects.id'))
    projectlabel_labelid = db.Column(db.Integer, db.ForeignKey('labels.label_id'))
How can I query the Project model so that I can get objects from the labels table?
Specifically, how can I get the label_name of the labels assigned to a Project? I somehow need to connect the Project (labels) -> ProjectLabel -> Label classes.
This will get the related labels in long form:
db.session.query(Project.id, Label.label_name)\
    .filter(ProjectLabel.projectlabel_projectid == Project.id)\
    .filter(Label.label_id == ProjectLabel.projectlabel_labelid)\
    .order_by(Project.id.asc()).all()
If you want the labels as comma-delimited lists, aggregate them per project. Since the database is PostgreSQL, the aggregate is func.string_agg() (func.group_concat() is the MySQL/SQLite equivalent and does not exist in Postgres):
from sqlalchemy import func

db.session.query(Project.id,
                 func.string_agg(Label.label_name, ', ').label('related_labels'))\
    .filter(ProjectLabel.projectlabel_projectid == Project.id)\
    .filter(Label.label_id == ProjectLabel.projectlabel_labelid)\
    .group_by(Project.id)\
    .order_by(Project.id.asc()).all()
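A hedged alternative (not part of the original answer) is to spell out the same two-level join with explicit join() calls, which reads more clearly than chained filters:

db.session.query(Project.id, Label.label_name)\
    .join(ProjectLabel, ProjectLabel.projectlabel_projectid == Project.id)\
    .join(Label, Label.label_id == ProjectLabel.projectlabel_labelid)\
    .order_by(Project.id.asc()).all()

Because the join conditions are given explicitly, this does not depend on the relationship() configured on Project.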

Is there a way of viewing the columns for relationships within pgAdmin?

I've begun populating the following tables inside my database:
class ModelItem(Base):
    __tablename__ = 'item'
    name = Column('name', String, primary_key=True)
    set_id = Column(String, ForeignKey('set.name'))

class ModelSet(Base):
    __tablename__ = 'set'
    name = Column('name', String, primary_key=True)
    items = relationship('ModelItem', backref='set')
Everything seems to be working fine, since I can query the children of the parent record and get the expected data back in my code. I'm just wondering if there's a way to see that same items column in pgAdmin, like I can with all the other columns of the parent table.

Return set of unique values from multiple rows of arrays in sqlalchemy

I have two tables, one called Tasks and another called TaskUpdates (a one-to-many relation). TaskUpdates has a column called tags, which is an array. I am trying to get back an array that contains only the unique values from TaskUpdates.tags.
class Task(BASE):
    __tablename__ = 'tasks'
    id = Column(Integer, primary_key=True)
    # One Task to Many Updates (back_populates must be set on both sides)
    updates = relationship("TaskUpdate", back_populates="task")

class TaskUpdate(BASE):
    __tablename__ = 'task_updates'
    # Columns
    id = Column(Integer, primary_key=True)
    tags = Column(ARRAY(String(255)))
    task_id = Column(Integer, ForeignKey('tasks.id'))
    task = relationship('Task', back_populates="updates")
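A minimal sketch of one way to get this in Postgres (not from the original post; it assumes session is an open SQLAlchemy Session and task_id is the id of the Task of interest): unnest the arrays and let DISTINCT collapse the repeated tags.

from sqlalchemy import func

rows = (
    session.query(func.unnest(TaskUpdate.tags))   # one row per tag per update
    .filter(TaskUpdate.task_id == task_id)
    .distinct()                                   # SELECT DISTINCT unnest(tags) ...
    .all()
)
unique_tags = [tag for (tag,) in rows]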

Deleting from many-to-many in SQLAlchemy and PostgreSQL

I'm trying to delete a child object from a many-to-many relationship in sql-alchemy.
I keep getting the following error:
StaleDataError: DELETE statement on table 'headings_locations' expected to delete 1 row(s); Only 2 were matched.
I have looked at a number of existing Stack Exchange questions regarding this
(SQLAlchemy DELETE Error caused by having a both lazy-load AND a dynamic version of the same relationship, SQLAlchemy StaleDataError on deleting items inserted via ORM sqlalchemy.orm.exc.StaleDataError, SQLAlchemy Attempting to Twice Delete Many to Many Secondary Relationship, Delete from Many to Many Relationship in MySQL),
as well as read the documentation, and I can't figure out why it isn't working.
My code defining the relationships is as follows:
import datetime

# db is the Flask-SQLAlchemy instance (db = SQLAlchemy(app))
headings_locations = db.Table('headings_locations',
    db.Column('id', db.Integer, primary_key=True),
    db.Column('location_id', db.Integer(), db.ForeignKey('location.id')),
    db.Column('headings_id', db.Integer(), db.ForeignKey('headings.id')))

class Headings(db.Model):
    __tablename__ = "headings"
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80))
    version = db.Column(db.Integer, default=1)
    special = db.Column(db.Boolean(), default=False)
    content = db.relationship('Content', backref=db.backref('heading'), cascade="all, delete-orphan")
    # Pass the callable (no parentheses) so the timestamp is taken per row, not once at import time
    created_date = db.Column(db.Date, default=datetime.datetime.utcnow)
    modified_date = db.Column(db.Date, default=datetime.datetime.utcnow, onupdate=datetime.datetime.utcnow)

    def __init__(self, name):
        self.name = name

class Location(db.Model):
    __tablename__ = "location"
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80), unique=True)
    account_id = db.Column(db.Integer, db.ForeignKey('account.id'))
    version = db.Column(db.Integer, default=1)
    created_date = db.Column(db.Date, default=datetime.datetime.utcnow)
    modified_date = db.Column(db.Date, default=datetime.datetime.utcnow)
    location_prefix = db.Column(db.Integer)
    numbers = db.relationship('Numbers', backref=db.backref('location'), cascade="all, delete-orphan")
    headings = db.relationship('Headings', secondary=headings_locations,
                               backref=db.backref('locations', lazy='dynamic', cascade="all"))

    def __init__(self, name):
        self.name = name
And my delete code is as follows:
@content_blueprint.route('/delete_content/<int:location_id>/<int:heading_id>')
@login_required
def delete_content(location_id, heading_id):
    import pdb
    pdb.set_trace()
    location = db.session.query(Location).filter_by(id=location_id).first()
    heading = db.session.query(Headings).filter_by(id=heading_id).first()
    location.headings.remove(heading)
    # db.session.delete(heading)
    db.session.commit()
    flash('Data Updated, thank-you')
    return redirect(url_for('content.add_heading', location_id=location_id))
Whichever way I try to remove the child object (db.session.delete(heading) or location.headings.remove(heading)), I still get the same error.
Any help is much appreciated.
My database is PostgreSQL.
Edit:
My code which adds the relationship:
new_heading = Headings(form.new_heading.data)
db.session.add(new_heading)
location.headings.append(new_heading)
db.session.commit()
I would assume that the error message is correct: your database really does contain two rows linking that Location and Headings pair. In that case you should find out where and why this happened in the first place, and prevent it from happening again.
First, to confirm this assumption, you could run the following query against your database:
import sqlalchemy as sa

q = session.query(
    headings_locations.c.location_id,
    headings_locations.c.headings_id,
    sa.func.count().label("# connections"),
).group_by(
    headings_locations.c.location_id,
    headings_locations.c.headings_id,
).having(
    sa.func.count() > 1
)
Assuming that is confirmed, fix it by manually deleting the duplicates in your database (leaving just one row for each pair).
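A hedged sketch of that manual clean-up (not part of the original answer; it relies on the surrogate id column of the headings_locations table defined above): keep the lowest id of each duplicated pair and delete the rest.

import sqlalchemy as sa

# Delete every link row that has a lower-id twin with the same pair of foreign keys.
db.session.execute(sa.text("""
    DELETE FROM headings_locations a
    USING headings_locations b
    WHERE a.location_id = b.location_id
      AND a.headings_id = b.headings_id
      AND a.id > b.id
"""))
db.session.commit()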
After that, add a UniqueConstraint to your headings_locations table:
headings_locations = db.Table('headings_locations',
    db.Column('id', db.Integer, primary_key=True),
    db.Column('location_id', db.Integer(), db.ForeignKey('location.id')),
    db.Column('headings_id', db.Integer(), db.ForeignKey('headings.id')),
    db.UniqueConstraint('location_id', 'headings_id', name='UC_location_id_headings_id'),
)
Note that you need to add the constraint to the database itself; it is not enough to add it to the SQLAlchemy model.
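A hedged sketch of doing that with an Alembic migration (assuming Alembic is what manages schema changes here; a hand-written ALTER TABLE ... ADD CONSTRAINT achieves the same thing):

from alembic import op

def upgrade():
    op.create_unique_constraint(
        'UC_location_id_headings_id',   # same name as in the model above
        'headings_locations',
        ['location_id', 'headings_id'],
    )

def downgrade():
    op.drop_constraint('UC_location_id_headings_id', 'headings_locations', type_='unique')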
Now the code that inserts the duplicates by mistake will fail with a unique constraint violation, and you can fix the root of the problem.