SQLAlchemy migrating MySQL followers system to PostgreSQL issue - postgresql

Today I decided to move my fairly new project from MySQL to PostgreSQL, and I've run into a few issues along the way. Here is one of them.
The follower / followed model shown after the error message worked fine on MySQL, but now I keep getting this SQLAlchemy error:
sqlalchemy.exc.ArgumentError: Relationship User.followed could not determine any unambiguous local/remote column pairs based on join condition and remote_side arguments. Consider using the remote() annotation to accurately mark those elements of the join condition that are on the remote side of the relationship.
followers = db.Table(
    'followers',
    db.Column('follower_id', db.Integer(), db.ForeignKey('users.id'), primary_key=True),
    db.Column('followed_id', db.Integer(), db.ForeignKey('users.id'), primary_key=True)
)

class Base(db.Model):
    __abstract__ = True
    id = db.Column(db.Integer, primary_key=True)
    created_at = db.Column(db.DateTime,
                           default=datetime.datetime.utcnow)
    updated_at = db.Column(db.DateTime,
                           default=datetime.datetime.utcnow,
                           onupdate=datetime.datetime.utcnow)

class User(Base, UserJsonSerializer, UserMixin):
    __tablename__ = 'users'
    email = db.Column(db.String(255), unique=True)
    followed = db.relationship(
        'User',
        secondary=followers,
        primaryjoin=(followers.c.follower_id == id),
        secondaryjoin=(followers.c.followed_id == id),
        backref=db.backref('followers', lazy='dynamic'),
        lazy='dynamic'
    )
I am very new to PostgreSQL, so I am rather lost as to what the issue is here, and since this is a fairly complicated relationship, I am even more at a loss as to what to do.

So I found a solution myself, not sure if it will work but so far the error seems to be gone.
Cheers.
followed = db.relationship('User',
                           secondary=followers,
                           foreign_keys=[followers.c.follower_id])
followers = db.relationship('User',
                            secondary=followers,
                            foreign_keys=[followers.c.followed_id])
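
One possible cause worth checking, which the thread itself does not confirm: in the model above, id is declared on the abstract Base rather than on User, and names inherited from a parent class are not in scope inside a subclass body. If that is what happened, the bare id in primaryjoin and secondaryjoin would resolve to Python's built-in id() function instead of the column. The sketch below uses plain classes with stand-in values (no SQLAlchemy, names are illustrative only) to show that lookup rule:

```python
class Base:
    # Stand-in for the real db.Column(...) declared on the abstract base.
    id = "Base.id column"


class User(Base):
    # Inherited attributes are NOT visible as bare names in a class body,
    # so this `id` falls through to the module/builtin scope.
    captured = id


# The bare name resolved to the built-in function, not the inherited attribute.
resolved_to_builtin = User.captured is id

# Attribute access on the class still finds the inherited value.
inherited_value = User.id
```

If this is the cause, passing the join conditions as strings, for example primaryjoin='followers.c.follower_id == User.id', should let SQLAlchemy evaluate them later against the mapped class rather than at class-body time.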

Related

How to query SQLAlchemy/PostgreSQL table for existence of an object where PK is UUID

Apologies if this is poorly phrased as I'm pretty new to SQLAlchemy. Suppose I have the following setup:
class Student(Base):
    __tablename__ = 'student'
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    name = Column(String(50), nullable=False)
...and I create a new Student and add it to the table:
new_student = Student(name="Nick")
session.add(new_student)
session.commit()
Assuming that primary key is solely the UUID, how can I check the table for the existence of the specific object referenced by new_student? Is it possible to do something like:
student = session.query(Student).filter(new_student)
Everything I've seen in my searches requires you to use a table column, but the only guaranteed unique column in my table is the UUID, which I don't know until after it's created.
Thanks for the help!
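
A hedged note on the question above: default=uuid.uuid4 is a Python-side default, so the UUID is generated by the client when the row is flushed, and new_student.id should be populated once session.commit() has run. That value can then serve as the lookup key, for instance with session.query(Student).get(new_student.id). The sketch below shows the same idea with only the standard library (sqlite3 instead of PostgreSQL and the ORM, which is an assumption made so it runs anywhere); add_student and student_exists are hypothetical helper names.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id TEXT PRIMARY KEY, name TEXT NOT NULL)")


def add_student(name):
    """Insert a student with a client-generated UUID and return that UUID."""
    student_id = uuid.uuid4()  # known to the caller before the INSERT runs
    with conn:  # commits on success
        conn.execute(
            "INSERT INTO student (id, name) VALUES (?, ?)",
            (str(student_id), name),
        )
    return student_id


def student_exists(student_id):
    """Return True if a row with this primary key is present."""
    row = conn.execute(
        "SELECT EXISTS(SELECT 1 FROM student WHERE id = ?)",
        (str(student_id),),
    ).fetchone()
    return bool(row[0])


nick_id = add_student("Nick")
```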

How to Initialise & Populate a Postgres Database with Circular ForeignKeys in SQLModel?

Goal:
I'm trying to use SQLModel (a wrapper that ties together pydantic and sqlalchemy) to define and interact with the back-end database for a cleaning company. Specifically, trying to model a system where customers can have multiple properties that need to be cleaned and each customer has a single lead person who has a single mailing property (to contact them at). Ideally, I want to be able to use a single table for the mailing properties and cleaning properties (as in most instances they will be the same).
Constraints:
Customers can be either individual people or organisations
A lead person must be identifiable for each customer
Each person must be matched to a property (so that their mailing address can be identified)
A single customer can have multiple properties attached to them (e.g. for a landlord that includes cleaning as part of the rent)
The issue is that the foreign keys have a circular dependency.
Customer -> Person based on the lead_person_id
Person -> Property based on the mailing_property_id
Property -> Customer based on the occupant_customer_id
Code to reproduce the issue:
# Imports
from typing import Optional, List
from sqlmodel import Session, Field, SQLModel, Relationship, create_engine
import uuid as uuid_pkg

# Defining schemas
class Person(SQLModel, table=True):
    person_id: uuid_pkg.UUID = Field(default_factory=uuid_pkg.uuid4, primary_key=True, index=True, nullable=True)
    first_names: str
    last_name: str
    mailing_property_id: uuid_pkg.UUID = Field(foreign_key='property.property_id')
    customer: Optional['Customer'] = Relationship(back_populates='lead_person')
    mailing_property: Optional['Property'] = Relationship(back_populates='person')

class Customer(SQLModel, table=True):
    customer_id: uuid_pkg.UUID = Field(default_factory=uuid_pkg.uuid4, primary_key=True, index=True, nullable=True)
    lead_person_id: uuid_pkg.UUID = Field(foreign_key='person.person_id')
    contract_type: str
    lead_person: Optional['Person'] = Relationship(back_populates='customer')
    contracted_properties: Optional[List['Property']] = Relationship(back_populates='occupant_customer')

class Property(SQLModel, table=True):
    property_id: uuid_pkg.UUID = Field(default_factory=uuid_pkg.uuid4, primary_key=True, index=True, nullable=True)
    occupant_customer_id: uuid_pkg.UUID = Field(foreign_key='customer.customer_id')
    address: str
    person: Optional['Person'] = Relationship(back_populates='mailing_property')
    occupant_customer: Optional['Customer'] = Relationship(back_populates='contracted_properties')

# Initialising the database
engine = create_engine(f'postgresql://{DB_USERNAME}:{DB_PASSWORD}@{DB_URL}:{DB_PORT}/{DB_NAME}')
SQLModel.metadata.create_all(engine)

# Defining the database entries
john = Person(
    person_id = 'eb7a0f5d-e09b-4b36-8e15-e9541ea7bd6e',
    first_names = 'John',
    last_name = 'Smith',
    mailing_property_id = '4d6aed8d-d1a2-4152-ae4b-662baddcbef4'
)
johns_lettings = Customer(
    customer_id = 'cb58199b-d7cf-4d94-a4ba-e7bb32f1cda4',
    lead_person_id = 'eb7a0f5d-e09b-4b36-8e15-e9541ea7bd6e',
    contract_type = 'Landlord Premium'
)
johns_property_1 = Property(
    property_id = '4d6aed8d-d1a2-4152-ae4b-662baddcbef4',
    occupant_customer_id = 'cb58199b-d7cf-4d94-a4ba-e7bb32f1cda4',
    address = '123 High Street'
)
johns_property_2 = Property(
    property_id = '2ac15ac9-9ab3-4a7c-80ad-961dd565ab0a',
    occupant_customer_id = 'cb58199b-d7cf-4d94-a4ba-e7bb32f1cda4',
    address = '456 High Street'
)

# Committing the database entries
with Session(engine) as session:
    session.add(john)
    session.add(johns_lettings)
    session.add(johns_property_1)
    session.add(johns_property_2)
    session.commit()
Results in:
ForeignKeyViolation: insert or update on table "customer" violates foreign key constraint "customer_lead_person_id_fkey"
DETAIL: Key (lead_person_id)=(eb7a0f5d-e09b-4b36-8e15-e9541ea7bd6e) is not present in table "person".
This issue is specific to Postgres, which, unlike SQLite (used in the docs), enforces foreign key constraints when data is being added. That is, replacing engine = create_engine(f'postgresql://{DB_USERNAME}:{DB_PASSWORD}@{DB_URL}:{DB_PORT}/{DB_NAME}') with engine = create_engine('sqlite:///test.db') lets the database be initialised and populated without an error. However, my use case is with a Postgres DB.
Attempted Solutions:
Used link tables between customers/people and properties/customers - no luck
Used Session.exec with this code from SO to temporarily remove foreign key constraints then add them back on - no luck
Used primary joins instead of foreign keys as described in this SQLModel Issue - no luck
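
One approach not listed among the attempts above is to declare the circular foreign keys as deferrable, so that they are checked at commit time rather than after each statement. PostgreSQL accepts DEFERRABLE INITIALLY DEFERRED on foreign key constraints, and SQLAlchemy's ForeignKey takes deferrable and initially arguments for this. The sketch below only illustrates the idea: it uses the standard-library sqlite3 module instead of Postgres and SQLModel (an assumption made so it runs anywhere), trims the tables to their key columns, and uses made-up short ids. insert_rows is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

deferred = "DEFERRABLE INITIALLY DEFERRED"
conn.executescript(f"""
CREATE TABLE person (
    person_id TEXT PRIMARY KEY,
    mailing_property_id TEXT NOT NULL
        REFERENCES property(property_id) {deferred}
);
CREATE TABLE customer (
    customer_id TEXT PRIMARY KEY,
    lead_person_id TEXT NOT NULL
        REFERENCES person(person_id) {deferred}
);
CREATE TABLE property (
    property_id TEXT PRIMARY KEY,
    occupant_customer_id TEXT NOT NULL
        REFERENCES customer(customer_id) {deferred}
);
""")


def insert_rows(person, customer, prop):
    """Insert one row per table in a single transaction; True on success."""
    try:
        with conn:  # commits at the end; deferred FKs are checked only then
            conn.execute("INSERT INTO person VALUES (?, ?)", person)
            conn.execute("INSERT INTO customer VALUES (?, ?)", customer)
            conn.execute("INSERT INTO property VALUES (?, ?)", prop)
    except sqlite3.IntegrityError:
        conn.rollback()  # discard the rows of the failed transaction
        return False
    return True


# The person row references a property that is inserted two statements later.
cycle_ok = insert_rows(("p1", "prop1"), ("c1", "p1"), ("prop1", "c1"))

# A reference that is still dangling at commit time is rejected.
dangling_ok = insert_rows(("p2", "missing"), ("c2", "p2"), ("prop2", "c2"))
```

The design point is that the whole cycle must be inserted inside one transaction; the constraints still protect against references that remain unresolved at commit.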

Deleting from many-to-many SQL-Alchemy and Postgresql

I'm trying to delete a child object from a many-to-many relationship in SQLAlchemy.
I keep getting the following error:
StaleDataError: DELETE statement on table 'headings_locations' expected to delete 1 row(s); Only 2 were matched.
I have looked at a number of the existing stackexchange questions
(SQLAlchemy DELETE Error caused by having a both lazy-load AND a dynamic version of the same relationship, SQLAlchemy StaleDataError on deleting items inserted via ORM sqlalchemy.orm.exc.StaleDataError, SQLAlchemy Attempting to Twice Delete Many to Many Secondary Relationship, Delete from Many to Many Relationship in MySQL)
regarding this as well as read the documentation and can't figure out why it isn't working.
My code defining the relationships is as follows:
headings_locations = db.Table('headings_locations',
    db.Column('id', db.Integer, primary_key=True),
    db.Column('location_id', db.Integer(), db.ForeignKey('location.id')),
    db.Column('headings_id', db.Integer(), db.ForeignKey('headings.id')))

class Headings(db.Model):
    __tablename__ = "headings"
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80))
    version = db.Column(db.Integer, default=1)
    special = db.Column(db.Boolean(), default=False)
    content = db.relationship('Content', backref=db.backref('heading'), cascade="all, delete-orphan")
    created_date = db.Column(db.Date, default=datetime.datetime.utcnow())
    modified_date = db.Column(db.Date, default=datetime.datetime.utcnow(), onupdate=datetime.datetime.utcnow())

    def __init__(self, name):
        self.name = name

class Location(db.Model):
    __tablename__ = "location"
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80), unique=True)
    account_id = db.Column(db.Integer, db.ForeignKey('account.id'))
    version = db.Column(db.Integer, default=1)
    created_date = db.Column(db.Date, default=datetime.datetime.utcnow())
    modified_date = db.Column(db.Date, default=datetime.datetime.utcnow())
    location_prefix = db.Column(db.Integer)
    numbers = db.relationship('Numbers', backref=db.backref('location'), cascade="all, delete-orphan")
    headings = db.relationship('Headings', secondary=headings_locations,
                               backref=db.backref('locations', lazy='dynamic', cascade="all"))

    def __init__(self, name):
        self.name = name
And my delete code is as follows:
@content_blueprint.route('/delete_content/<int:location_id>/<int:heading_id>')
@login_required
def delete_content(location_id, heading_id):
    import pdb
    pdb.set_trace()
    location = db.session.query(Location).filter_by(id = location_id).first()
    heading = db.session.query(Headings).filter_by(id = heading_id).first()
    location.headings.remove(heading)
    #db.session.delete(heading)
    db.session.commit()
    flash('Data Updated, thank-you')
    return redirect(url_for('content.add_heading', location_id=location_id))
Whichever way I try to remove the child object (db.session.delete(heading) or location.headings.remove(heading)), I still get the same error.
Any help is much appreciated.
My database is postgresql.
Edit:
My code which adds the relationship:
new_heading = Headings(form.new_heading.data)
db.session.add(new_heading)
location.headings.append(new_heading)
db.session.commit()
I would assume that the error message is correct: your database really does contain 2 rows linking the same Location and Headings instances. If so, you should find out where and why this happened in the first place, and prevent it from happening again.
First, to confirm this assumption, you could run the following query against your database (sa here is the imported sqlalchemy module):
q = session.query(
    headings_locations.c.location_id,
    headings_locations.c.headings_id,
    sa.func.count().label("# connections"),
).group_by(
    headings_locations.c.location_id,
    headings_locations.c.headings_id,
).having(
    sa.func.count() > 1
)
If the assumption is confirmed, fix it by manually deleting all the duplicates in your database (leaving just one row for each pair).
After that, add a UniqueConstraint to your headings_locations table:
headings_locations = db.Table('headings_locations',
    db.Column('id', db.Integer, primary_key=True),
    db.Column('location_id', db.Integer(), db.ForeignKey('location.id')),
    db.Column('headings_id', db.Integer(), db.ForeignKey('headings.id')),
    db.UniqueConstraint('location_id', 'headings_id', name='UC_location_id_headings_id'),
)
Note that you need to add the constraint to the database itself; adding it to the SQLAlchemy model is not enough.
Now the code where the duplicates are inserted by mistake will fail with the unique constraint violation exception, and you can fix the root of the problem.
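
The three steps described in this answer (find the duplicates, delete all but one, enforce uniqueness in the database) can be sketched in plain SQL. The example below is an illustration only: it runs against an in-memory SQLite database through the standard-library sqlite3 module rather than PostgreSQL, the sample rows are made up, and keeping the lowest id of each pair is an arbitrary choice. On PostgreSQL the last step would more likely be an ALTER TABLE ... ADD CONSTRAINT ... UNIQUE (...) statement or a migration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE headings_locations ("
    "id INTEGER PRIMARY KEY, location_id INTEGER, headings_id INTEGER)"
)
# Made-up sample data: the pair (1, 10) is linked twice.
conn.executemany(
    "INSERT INTO headings_locations (location_id, headings_id) VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 11), (2, 10)],
)

# Step 1: find (location_id, headings_id) pairs that occur more than once.
duplicates = conn.execute(
    "SELECT location_id, headings_id, COUNT(*) FROM headings_locations "
    "GROUP BY location_id, headings_id HAVING COUNT(*) > 1"
).fetchall()

# Step 2: keep the lowest id of each pair and delete the other rows.
conn.execute(
    "DELETE FROM headings_locations WHERE id NOT IN ("
    "SELECT MIN(id) FROM headings_locations "
    "GROUP BY location_id, headings_id)"
)
remaining = conn.execute("SELECT COUNT(*) FROM headings_locations").fetchone()[0]

# Step 3: enforce uniqueness in the database itself.
conn.execute(
    "CREATE UNIQUE INDEX uc_location_id_headings_id "
    "ON headings_locations (location_id, headings_id)"
)

# A repeated link is now rejected by the database.
try:
    conn.execute(
        "INSERT INTO headings_locations (location_id, headings_id) VALUES (1, 10)"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```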

SQLAlchemy + PostgreSQL quoted table name for User, with Flask-Login

Consider the following declarative User model in SQLAlchemy:
class User(Base):
    id = Column(Integer, primary_key=True)
    username = Column(String(50), unique=True)
    email = Column(String(1024), unique=True)
    points = Column(Integer, default=0)
    achievements = relationship('Achievement',
                                secondary=achievement_association_table,
                                backref='users')
    reviews = relationship('Review', backref='author', lazy='dynamic')
    moderated = Column(Boolean, default=True)
When I do a SELECT * FROM user, I noticed that the query was not returning all of my columns, and showed only the "current_user" column which I can only surmise is a result of using Flask-Login.
Making the query User.query.all() resulted in the following SQL:
SELECT "user".created AS user_created, "user".modified AS user_modified, "user".id AS user_id, "user".username AS user_username, "user".email AS user_email, "user".points AS user_points, "user".moderated AS user_moderated
Can anyone help me understand why this table was created double quoted? None of my other (similarly defined) declarative models exhibit this behavior.
Thanks in advance!
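user is a reserved word and thus needs to be quoted. In PostgreSQL an unquoted user is equivalent to current_user, which is why SELECT * FROM user showed the current user instead of the rows of your table.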
More details about quoted identifiers are in the manual:
http://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-IDENTIFIERS
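
The effect of quoting a reserved word can be tried without a PostgreSQL server. The sketch below uses the standard-library sqlite3 module as a stand-in; SQLite does not reserve user, so the reserved word order is used instead (both substitutions are assumptions made for the sake of a runnable example). The quoting rule itself, double quotes around the identifier, is the same one SQLAlchemy applies to "user" in the generated SQL above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Unquoted, the reserved word is parsed as a keyword and the statement fails.
try:
    conn.execute("CREATE TABLE order (id INTEGER PRIMARY KEY)")
    unquoted_ok = True
except sqlite3.OperationalError:
    unquoted_ok = False

# Double quotes turn the same word into an ordinary identifier.
conn.execute('CREATE TABLE "order" (id INTEGER PRIMARY KEY)')
conn.execute('INSERT INTO "order" (id) VALUES (1)')
rows = conn.execute('SELECT id FROM "order"').fetchall()
```

A common way to sidestep the issue altogether is to give the model a non-reserved table name, for example by setting __tablename__ = 'users'.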

Entity Framework Timeout

I have been trying to figure out how to optimize the following query for the past few days and just not having much luck. Right now my test db is returning about 300 records with very little nested data, but it's taking 4-5 seconds to run and the SQL being generated by LINQ is awfully long (too long to include here). Any suggestions would be very much appreciated.
To sum up this query, I'm trying to return a somewhat flattened "snapshot" of a client list with current status. A Party contains one or more Clients who have Roles (ASPNET Role Provider), Journal is returning the last 1 journal entry of all the clients in a Party, same goes for Task, and LastLoginDate, hence the OrderBy and FirstOrDefault functions.
Guid userID = 'some user ID'
var parties = Parties.Where(p => p.BrokerID == userID).Select(p => new
{
    ID = p.ID,
    Title = p.Title,
    Goal = p.Goal,
    Groups = p.Groups,
    IsBuyer = p.Clients.Any(c => c.RolesInUser.Any(r => r.Role.LoweredName == "buyer")),
    IsSeller = p.Clients.Any(c => c.RolesInUser.Any(r => r.Role.LoweredName == "seller")),
    Journal = p.Clients.SelectMany(c => c.Journals).OrderByDescending(j => j.OccuredOn).Select(j => new
    {
        ID = j.ID,
        Title = j.Title,
        OccurredOn = j.OccuredOn,
        SubCatTitle = j.JournalSubcategory.Title
    }).FirstOrDefault(),
    LastLoginDate = p.Clients.OrderByDescending(c => c.LastLoginDate).Select(c => c.LastLoginDate).FirstOrDefault(),
    MarketingPlanCount = p.Clients.SelectMany(c => c.MarketingPlans).Count(),
    Task = p.Tasks.Where(t => t.DueDate != null && t.DueDate > DateTime.Now).OrderBy(t => t.DueDate).Select(t => new
    {
        ID = t.TaskID,
        DueDate = t.DueDate,
        Title = t.Title
    }).FirstOrDefault(),
    Clients = p.Clients.Select(c => new
    {
        ID = c.ID,
        FirstName = c.FirstName,
        MiddleName = c.MiddleName,
        LastName = c.LastName,
        Email = c.Email,
        LastLogin = c.LastLoginDate
    })
}).OrderBy(p => p.Title).ToList()
I think posting the SQL could give us some clues, as small things like whether OrderBy comes before or after the projection could make a big difference.
But regardless, try extracting the Clients in a separate query; this will probably simplify your query. Then include other tables like Journal and Tasks before projecting and see how this affects your query:
// am not sure what the exact query would be, and project it using ToList()
var clients = GetClientsForParty();
var parties = Parties.Include("Journal").Include("Tasks")
    .Where(p => p.BrokerID == userID).Select(p => {
        ....
        // then use the in-memory clients
        IsBuyer = clients.Any(c => c.RolesInUser.Any(r => r.Role.LoweredName == "buyer")),
        ...
    }
    )
In all cases, install EF profiler and have a look at how your query is affected. EF can be quite surprising. Things like putting OrderBy before the projection, and the same for all these FirstOrDefault or SingleOrDefault calls, can have a big effect.
And go back to the basics: if you are searching on LoweredRoleName, make sure it is indexed so that the query is fast (even though that could be useless, since EF could end up not making use of the covering index when it is querying so many other columns).
Also, since this query only reads data (you will not alter it), don't forget to turn off entity tracking; that will give you some performance boost as well.
And last, don't forget that you can always write your SQL query directly and project to a ViewModel rather than an anonymous type (which I see as good practice anyhow). Create a class called PartyViewModel that holds the flattened view you are after, and use it with your hand-crafted SQL:
// use your optimized SQL query that you write, or even call a stored procedure
db.Database.SqlQuery<PartyViewModel>("select * from .... join .... on");
I am writing a blog post about these issues around EF. The post is still not finished, but all in all, just be patient, use some of these tricks and observe their effect (and measure it) and you will reach what you want.