Race condition when manually setting the id in sqlalchemy - postgresql

I want all my objects to have a unique id that PostgreSQL assigns (a serial column), plus a second id that is derived from the first one.
If I set the second id after saving a new object, I get an INSERT followed by an UPDATE on the table, which is not ideal.
To end up with a single INSERT, I fetch the id from the PostgreSQL sequence myself and set it on the object, instead of letting PostgreSQL assign it at the INSERT stage.
I'm pretty new to SQLAlchemy and want to be sure that this approach is free of race conditions.
Thanks for your thoughts on this idea.
from sqlalchemy import text
class MyModel:
    def __init__(self, session, **data):
        """
        Base constructor for almost all model classes, performing common tasks
        """
        cls = type(self)
        if session:
            # To avoid having an UPDATE right after the INSERT, we manually
            # fetch the next available id from the PostgreSQL sequence:
            #   SELECT nextval(pg_get_serial_sequence('events', 'id'));
            # This needs the table's name and the sequence column's name;
            # luckily we use the same column name in all our models.
            table_name = cls.__tablename__
            qry = text(f"SELECT nextval(pg_get_serial_sequence('{table_name}', 'id'))")
            # scalar() returns the first column of the first row
            next_id = session.execute(qry).scalar()
            # manually set the object id
            self.id = next_id
            # set the external id before saving the object in the database
            self.ex_id = cls.ex_id_prefix + str(self.id)
            session.add(self)
            session.flush([self])

If you are targeting PostgreSQL 12 or later, you can use a generated column. SQLAlchemy's Computed construct will create such a column, and we can pass it an SQL expression to compute the value.
The model would look like this:
class MyModel(Base):
    __tablename__ = 't68225046'
    ex_id_prefix = 'prefix_'
    id = sa.Column(sa.Integer, primary_key=True)
    ex_id = sa.Column(sa.String,
                      sa.Computed(sa.text(":p || id::varchar").bindparams(p=ex_id_prefix)))
producing this DDL
CREATE TABLE t68225046 (
    id SERIAL NOT NULL,
    ex_id VARCHAR GENERATED ALWAYS AS ('prefix_' || id::varchar) STORED,
    PRIMARY KEY (id)
)
and a single insert statement
2021-09-19 ... INFO sqlalchemy.engine.Engine BEGIN (implicit)
2021-09-19 ... INFO sqlalchemy.engine.Engine INSERT INTO t68225046 DEFAULT VALUES RETURNING t68225046.id
2021-09-19 ... INFO sqlalchemy.engine.Engine [generated in 0.00014s] {}
2021-09-19 ... INFO sqlalchemy.engine.Engine COMMIT
For earlier releases of PostgreSQL, or if you don't need to store the value in the database, you could simulate it with a hybrid property.
import sqlalchemy as sa
from sqlalchemy import orm
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.sql import cast
Base = orm.declarative_base()
class MyModel(Base):
    __tablename__ = 't68225046'
    ex_id_prefix = 'prefix_'
    id = sa.Column(sa.Integer, primary_key=True)
    @hybrid_property
    def ex_id(self):
        # Python-side value, used on loaded instances
        return self.ex_id_prefix + str(self.id)
    @ex_id.expression
    def ex_id(cls):
        # SQL-side expression, used when querying on the class attribute
        # See https://stackoverflow.com/a/54487891/5320906
        return cls.ex_id_prefix + cast(cls.id, sa.String)

Related

How to query SQLAlchemy/PostgreSQL table for existence of an object where PK is UUID

Apologies if this is poorly phrased as I'm pretty new to SQLAlchemy. Suppose I have the following setup:
class Student(Base):
    __tablename__ = 'student'
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    name = Column(String(50), nullable=False)
...and I create a new Student and add it to the table:
new_student = Student(name="Nick")
session.add(new_student)
session.commit()
Assuming that primary key is solely the UUID, how can I check the table for the existence of the specific object referenced by new_student? Is it possible to do something like:
student = session.query(Student).filter(new_student)
Everything I've seen in my searches requires you to use a table column, but the only guaranteed unique column in my table is the UUID, which I don't know until after it's created.
Thanks for the help!

How to Initialise & Populate a Postgres Database with Circular ForeignKeys in SQLModel?

Goal:
I'm trying to use SQLModel (a wrapper that ties together pydantic and sqlalchemy) to define and interact with the back-end database for a cleaning company. Specifically, trying to model a system where customers can have multiple properties that need to be cleaned and each customer has a single lead person who has a single mailing property (to contact them at). Ideally, I want to be able to use a single table for the mailing properties and cleaning properties (as in most instances they will be the same).
Constraints:
Customers can be either individual people or organisations
A lead person must be identifiable for each customer
Each person must be matched to a property (so that their mailing address can be identified)
A single customer can have multiple properties attached to them (e.g. for a landlord that includes cleaning as part of the rent)
The issue is that the foreign keys have a circular dependency.
Customer -> Person based on the lead_person_id
Person -> Property based on the mailing_property_id
Property -> Customer based on the occupant_customer_id
Code to reproduce the issue:
# Imports
from typing import Optional, List
from sqlmodel import Session, Field, SQLModel, Relationship, create_engine
import uuid as uuid_pkg
# Defining schemas
class Person(SQLModel, table=True):
    person_id: uuid_pkg.UUID = Field(default_factory=uuid_pkg.uuid4, primary_key=True, index=True, nullable=True)
    first_names: str
    last_name: str
    mailing_property_id: uuid_pkg.UUID = Field(foreign_key='property.property_id')
    customer: Optional['Customer'] = Relationship(back_populates='lead_person')
    mailing_property: Optional['Property'] = Relationship(back_populates='person')
class Customer(SQLModel, table=True):
    customer_id: uuid_pkg.UUID = Field(default_factory=uuid_pkg.uuid4, primary_key=True, index=True, nullable=True)
    lead_person_id: uuid_pkg.UUID = Field(foreign_key='person.person_id')
    contract_type: str
    lead_person: Optional['Person'] = Relationship(back_populates='customer')
    contracted_properties: Optional[List['Property']] = Relationship(back_populates='occupant_customer')
class Property(SQLModel, table=True):
    property_id: uuid_pkg.UUID = Field(default_factory=uuid_pkg.uuid4, primary_key=True, index=True, nullable=True)
    occupant_customer_id: uuid_pkg.UUID = Field(foreign_key='customer.customer_id')
    address: str
    person: Optional['Person'] = Relationship(back_populates='mailing_property')
    occupant_customer: Optional['Customer'] = Relationship(back_populates='contracted_properties')
# Initialising the database
engine = create_engine(f'postgresql://{DB_USERNAME}:{DB_PASSWORD}@{DB_URL}:{DB_PORT}/{DB_NAME}')
SQLModel.metadata.create_all(engine)
# Defining the database entries
john = Person(
    person_id='eb7a0f5d-e09b-4b36-8e15-e9541ea7bd6e',
    first_names='John',
    last_name='Smith',
    mailing_property_id='4d6aed8d-d1a2-4152-ae4b-662baddcbef4'
)
johns_lettings = Customer(
    customer_id='cb58199b-d7cf-4d94-a4ba-e7bb32f1cda4',
    lead_person_id='eb7a0f5d-e09b-4b36-8e15-e9541ea7bd6e',
    contract_type='Landlord Premium'
)
johns_property_1 = Property(
    property_id='4d6aed8d-d1a2-4152-ae4b-662baddcbef4',
    occupant_customer_id='cb58199b-d7cf-4d94-a4ba-e7bb32f1cda4',
    address='123 High Street'
)
johns_property_2 = Property(
    property_id='2ac15ac9-9ab3-4a7c-80ad-961dd565ab0a',
    occupant_customer_id='cb58199b-d7cf-4d94-a4ba-e7bb32f1cda4',
    address='456 High Street'
)
# Committing the database entries
with Session(engine) as session:
    session.add(john)
    session.add(johns_lettings)
    session.add(johns_property_1)
    session.add(johns_property_2)
    session.commit()
Results in:
ForeignKeyViolation: insert or update on table "customer" violates foreign key constraint "customer_lead_person_id_fkey"
DETAIL: Key (lead_person_id)=(eb7a0f5d-e09b-4b36-8e15-e9541ea7bd6e) is not present in table "person".
This issue is specific to Postgres, which unlike SQLite (used in the docs) enforces foreign key constraints when data is being added. I.e. replacing engine = create_engine(f'postgresql://{DB_USERNAME}:{DB_PASSWORD}@{DB_URL}:{DB_PORT}/{DB_NAME}') with engine = create_engine('sqlite:///test.db') lets the database be initialised without causing an error. However, my use case is with a Postgres DB.
Attempted Solutions:
Used link tables between customers/people and properties/customers - no luck
Used Session.exec with this code from SO to temporarily remove foreign key constraints then add them back on - no luck
Used primary joins instead of foreign keys as described in this SQLModel Issue - no luck
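One option that does not appear in the list above is to declare the foreign keys as deferrable, so that the database checks them at COMMIT rather than after each INSERT. PostgreSQL supports this with DEFERRABLE INITIALLY DEFERRED. The sketch below only illustrates the mechanism and makes several assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (with foreign key enforcement switched on), and the cut-down tables and short ids ('c1', 'p1', 'h1') are hypothetical. It has not been tried with SQLModel.

```python
import sqlite3

# Autocommit mode, so transactions are controlled explicitly with BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

# Hypothetical, cut-down versions of the three tables with the same cycle:
# customer -> person -> property -> customer
conn.executescript("""
CREATE TABLE person (
    person_id TEXT PRIMARY KEY,
    mailing_property_id TEXT NOT NULL
        REFERENCES property(property_id) DEFERRABLE INITIALLY DEFERRED
);
CREATE TABLE customer (
    customer_id TEXT PRIMARY KEY,
    lead_person_id TEXT NOT NULL
        REFERENCES person(person_id) DEFERRABLE INITIALLY DEFERRED
);
CREATE TABLE property (
    property_id TEXT PRIMARY KEY,
    occupant_customer_id TEXT NOT NULL
        REFERENCES customer(customer_id) DEFERRABLE INITIALLY DEFERRED
);
""")

# The rows reference each other in a cycle, so no insert order can satisfy
# an immediate check. With deferred constraints the check runs at COMMIT.
conn.execute("BEGIN")
conn.execute("INSERT INTO customer VALUES ('c1', 'p1')")
conn.execute("INSERT INTO person VALUES ('p1', 'h1')")
conn.execute("INSERT INTO property VALUES ('h1', 'c1')")
conn.execute("COMMIT")
customer_count = conn.execute("SELECT count(*) FROM customer").fetchone()[0]

# A reference that is still dangling at COMMIT time is rejected.
conn.execute("BEGIN")
conn.execute("INSERT INTO customer VALUES ('c2', 'missing')")
try:
    conn.execute("COMMIT")
    dangling_rejected = False
except sqlite3.IntegrityError:
    conn.execute("ROLLBACK")
    dangling_rejected = True
```

In SQLAlchemy, the corresponding switches are the deferrable and initially arguments of ForeignKey; how best to pass them through SQLModel's Field is left open here.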

How do I prevent sql alchemy from inserting the None value to field?

The Alembic migration script :
def upgrade():
    uuid_gen = saexp.text("UUID GENERATE V1MC()")
    op.create_table(
        'foo',
        sa.Column('uuid', UUID, primary_key=True, server_default=uuid_gen),
        sa.Column(
            'inserted',
            sa.DateTime(timezone=True),
            server_default=sa.text("not null now()")),
        sa.Column('data', sa.Text)
    )
This is my Base class for SQL Alchemy:
class Foo(Base):
    __tablename__ = 'foo'
    inserted = Column(TIMESTAMP)
    uuid = Column(UUID, primary_key=True)
    data = Column(TEXT)
It has a static method for inserts:
    @staticmethod
    def insert(session, jsondata):
        foo = Foo()
        foo.data = jsondata['data']
        if 'inserted' in jsondata:
            foo.inserted = jsondata['inserted']
        if 'uuid' in jsondata:
            foo.uuid = jsondata['uuid']
        session.add(foo)
        return foo
The purpose of the two ifs is to simplify testing: this way I can "inject" a uuid and an inserted date to get predictable data for my tests.
When trying to insert data
foo = Foo()
foo.insert(session, {"data": "foo bar baz"})
session.commit()
I get an IntegrityError :
[SQL: 'INSERT INTO foo (inserted, data) VALUES (%(inserted)s, %(data)s) RETURNING foo.uuid'] [parameters: {'data': 'foo bar baz', 'inserted': None}]
which seems normal to me, because the insert violates the NOT NULL constraint in the Postgres database.
How do I prevent SQLAlchemy from inserting the None value into the inserted field?
While playing and testing around, I found that if the "inserted" column is defined as a primary key, SQLAlchemy does not include the field in the insert statement.
def upgrade():
    uuid_gen = saexp.text("UUID GENERATE V1MC()")
    op.create_table(
        'foo',
        sa.Column('uuid', UUID, primary_key=True, server_default=uuid_gen),
        sa.Column(
            'inserted',
            sa.DateTime(timezone=True),
            primary_key=True,
            server_default=sa.text("not null now()")),
        sa.Column('data', sa.Text)
    )
But this is not what I want.
The primary problem is the server_default which is missing in the inserted member in class Foo. It's only present in the alembic script. Note that the alembic definitions are only used when running the migrations. They do not affect the application. For this reason, it's a good idea to copy the exact same definitions from the alembic script to your application (or vice-versa).
Because no value is defined in the model definition, sqlalchemy seems to set this to None when the class is instantiated. This will then be sent to the DB which will complain. To fix this, either set default or server_default on the model definition (the class inheriting from Base).
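The difference between leaving the column out and sending an explicit NULL can be seen at the SQL level. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for PostgreSQL; the table mirrors foo from the question, with CURRENT_TIMESTAMP standing in for now().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE foo ("
    " data TEXT,"
    " inserted TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP)"
)

# Column omitted from the INSERT: the server-side default fills it in.
conn.execute("INSERT INTO foo (data) VALUES (?)", ("foo bar baz",))
inserted = conn.execute("SELECT inserted FROM foo").fetchone()[0]

# Column present with an explicit NULL (what the failing INSERT in the
# question sends): the default is not applied and NOT NULL rejects the row.
try:
    conn.execute("INSERT INTO foo (data, inserted) VALUES (?, ?)", ("x", None))
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True
```

Once server_default is declared on the mapped column, SQLAlchemy leaves the column out of the INSERT when no value was set on the object, which corresponds to the first statement above.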
Some additional notes/questions:
Where does UUID GENERATE V1MC() come from? The official docs look different. I replaced it with func.uuid_generate_v1mc().
The server_default value in your case contains not null, which is incorrect. You should set nullable=False on your column instead (see below).
alembic script
# revision identifiers, used by Alembic.
revision = '1b7e145f2138'  # THIS IS DIFFERENT ON EACH INSTANCE!
down_revision = None
branch_labels = None
depends_on = None
from alembic import op
import sqlalchemy as sa
from sqlalchemy.dialects.postgresql import UUID
def upgrade():
    op.create_table(
        'foo',
        sa.Column('uuid', UUID, primary_key=True,
                  server_default=sa.func.uuid_generate_v1mc()),
        sa.Column(
            'inserted',
            sa.DateTime(timezone=True),
            nullable=False,
            server_default=sa.text("now()")),
        sa.Column('data', sa.Text)
    )
def downgrade():
    op.drop_table('foo')
tester.py
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, create_engine, func
from sqlalchemy.orm import scoped_session, sessionmaker
from sqlalchemy.dialects.postgresql import (
    TEXT,
    TIMESTAMP,
    UUID,
)
engine = create_engine('postgresql://michel@/michel')
Session = scoped_session(sessionmaker(autocommit=False,
                                      autoflush=False,
                                      bind=engine))
Base = declarative_base()
class Foo(Base):
    __tablename__ = 'foo'
    # server_default mirrors the alembic script, so unset values are
    # left out of the INSERT and filled in by the database
    inserted = Column(TIMESTAMP, nullable=False,
                      server_default=func.now())
    uuid = Column(UUID, primary_key=True,
                  server_default=func.uuid_generate_v1mc())
    data = Column(TEXT)
    @staticmethod
    def insert(session, jsondata):
        foo = Foo()
        foo.data = jsondata['data']
        if 'inserted' in jsondata:
            foo.inserted = jsondata['inserted']
        if 'uuid' in jsondata:
            foo.uuid = jsondata['uuid']
        session.add(foo)
        return foo
if __name__ == '__main__':
    session = Session()
    Foo.insert(session, {"data": "foo bar baz"})
    session.commit()
    session.close()
output after execution
[9:43:54] michel@BBS-nexus [1 background job(s)]
/home/users/michel/tmp› psql -c "select * from foo"
                 uuid                 |           inserted            |    data
--------------------------------------+-------------------------------+-------------
 71f5fd32-0602-11e6-aebb-27be4bbac26e | 2016-04-19 09:43:45.297191+02 | foo bar baz
(1 row)

Deleting from many-to-many SQL-Alchemy and Postgresql

I'm trying to delete a child object from a many-to-many relationship in sql-alchemy.
I keep getting the following error:
StaleDataError: DELETE statement on table 'headings_locations' expected to delete 1 row(s); Only 2 were matched.
I have looked at a number of the existing stackexchange questions
(SQLAlchemy DELETE Error caused by having a both lazy-load AND a dynamic version of the same relationship, SQLAlchemy StaleDataError on deleting items inserted via ORM sqlalchemy.orm.exc.StaleDataError, SQLAlchemy Attempting to Twice Delete Many to Many Secondary Relationship, Delete from Many to Many Relationship in MySQL)
regarding this as well as read the documentation and can't figure out why it isn't working.
My code defining the relationships is as follows:
headings_locations = db.Table('headings_locations',
    db.Column('id', db.Integer, primary_key=True),
    db.Column('location_id', db.Integer(), db.ForeignKey('location.id')),
    db.Column('headings_id', db.Integer(), db.ForeignKey('headings.id')))
class Headings(db.Model):
    __tablename__ = "headings"
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80))
    version = db.Column(db.Integer, default=1)
    special = db.Column(db.Boolean(), default=False)
    content = db.relationship('Content', backref=db.backref('heading'), cascade="all, delete-orphan")
    # pass the callable (no parentheses) so the value is computed per row, not once at import
    created_date = db.Column(db.Date, default=datetime.datetime.utcnow)
    modified_date = db.Column(db.Date, default=datetime.datetime.utcnow, onupdate=datetime.datetime.utcnow)
    def __init__(self, name):
        self.name = name
class Location(db.Model):
    __tablename__ = "location"
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(80), unique=True)
    account_id = db.Column(db.Integer, db.ForeignKey('account.id'))
    version = db.Column(db.Integer, default=1)
    created_date = db.Column(db.Date, default=datetime.datetime.utcnow)
    modified_date = db.Column(db.Date, default=datetime.datetime.utcnow)
    location_prefix = db.Column(db.Integer)
    numbers = db.relationship('Numbers', backref=db.backref('location'), cascade="all, delete-orphan")
    headings = db.relationship('Headings', secondary=headings_locations,
                               backref=db.backref('locations', lazy='dynamic', cascade="all"))
    def __init__(self, name):
        self.name = name
And my delete code is as follows:
@content_blueprint.route('/delete_content/<int:location_id>/<int:heading_id>')
@login_required
def delete_content(location_id, heading_id):
    import pdb
    pdb.set_trace()
    location = db.session.query(Location).filter_by(id=location_id).first()
    heading = db.session.query(Headings).filter_by(id=heading_id).first()
    location.headings.remove(heading)
    #db.session.delete(heading)
    db.session.commit()
    flash('Data Updated, thank-you')
    return redirect(url_for('content.add_heading', location_id=location_id))
Whichever way I try to remove the child object (db.session.delete(heading) or location.headings.remove(heading)), I still get the same error.
Any help is much appreciated.
My database is postgresql.
Edit:
My code which adds the relationship:
new_heading = Headings(form.new_heading.data)
db.session.add(new_heading)
location.headings.append(new_heading)
db.session.commit()
I would assume that the error message is correct: your database really does contain 2 rows linking the same Location and Headings instances. In that case you should find out where and why this happened in the first place, and prevent it from happening again.
First, to confirm this assumption, you could run the following query against your database:
q = session.query(
    headings_locations.c.location_id,
    headings_locations.c.headings_id,
    sa.func.count().label("# connections"),
).group_by(
    headings_locations.c.location_id,
    headings_locations.c.headings_id,
).having(
    sa.func.count() > 1
)
If the assumption is confirmed, fix it by manually deleting the duplicates in your database (leaving just one row for each pair).
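The check and the cleanup can be tried out on a throwaway table first. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for PostgreSQL and made-up sample rows; which duplicate to keep (here the one with the lowest id) is an arbitrary choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE headings_locations ("
    " id INTEGER PRIMARY KEY,"
    " location_id INTEGER,"
    " headings_id INTEGER)"
)
# Made-up data: the pair (1, 10) is linked twice, the pair (1, 11) once.
conn.executemany(
    "INSERT INTO headings_locations (location_id, headings_id) VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 11)],
)

# Same check as the query above: pairs that occur more than once.
duplicates = conn.execute(
    "SELECT location_id, headings_id, COUNT(*) FROM headings_locations"
    " GROUP BY location_id, headings_id HAVING COUNT(*) > 1"
).fetchall()

# Keep the row with the lowest id for each pair and delete the rest.
conn.execute(
    "DELETE FROM headings_locations WHERE id NOT IN ("
    " SELECT MIN(id) FROM headings_locations"
    " GROUP BY location_id, headings_id)"
)
remaining = conn.execute(
    "SELECT location_id, headings_id FROM headings_locations ORDER BY id"
).fetchall()
```

After the DELETE, rerunning the duplicate check returns no rows.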
After that, add a UniqueConstraint to your headings_locations table:
headings_locations = db.Table('headings_locations',
    db.Column('id', db.Integer, primary_key=True),
    db.Column('location_id', db.Integer(), db.ForeignKey('location.id')),
    db.Column('headings_id', db.Integer(), db.ForeignKey('headings.id')),
    db.UniqueConstraint('location_id', 'headings_id', name='UC_location_id_headings_id'),
)
Note that you need to add it to the database as well; adding it to the SQLAlchemy model alone is not enough.
Now the code that inserts the duplicates by mistake will fail with a unique constraint violation, and you can fix the root of the problem.
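To see what that failure looks like at the database level, here is a small sketch, again assuming Python's built-in sqlite3 module as a stand-in for PostgreSQL. The constraint name is taken from the table definition above; the inserted values are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE headings_locations ("
    " id INTEGER PRIMARY KEY,"
    " location_id INTEGER,"
    " headings_id INTEGER,"
    " CONSTRAINT UC_location_id_headings_id UNIQUE (location_id, headings_id))"
)

insert_sql = "INSERT INTO headings_locations (location_id, headings_id) VALUES (?, ?)"
conn.execute(insert_sql, (1, 10))  # first link is accepted

# Linking the same pair a second time now fails instead of silently
# creating the duplicate that later breaks the DELETE.
try:
    conn.execute(insert_sql, (1, 10))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

link_count = conn.execute("SELECT COUNT(*) FROM headings_locations").fetchone()[0]
```

On an existing PostgreSQL table the constraint would be added with ALTER TABLE ... ADD CONSTRAINT ... UNIQUE (...), typically through a migration.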

Default value doesn't work in SQLAlchemy + PostgreSQL + aiopg + psycopg2

I've found an unexpected behavior in SQLAlchemy. I'm using the following versions:
SQLAlchemy (0.9.8)
PostgreSQL (9.3.5)
psycopg2 (2.5.4)
aiopg (0.5.1)
This is the table definition for the example:
import asyncio
from aiopg.sa import create_engine
from sqlalchemy import (
MetaData,
Column,
Integer,
Table,
String,
)
metadata = MetaData()
users = Table('users', metadata,
    Column('id_user', Integer, primary_key=True, nullable=False),
    Column('name', String(20), unique=True),
    Column('age', Integer, nullable=False, default=0),
)
Now if I execute a simple insert into the table, populating only id_user and name, the age column should be filled in automatically, right? Let's see...
@asyncio.coroutine
def go():
    engine = yield from create_engine('postgresql://USER@localhost/DB')
    data = {'id_user': 1, 'name': 'Jimmy'}
    stmt = users.insert(values=data, inline=False)
    with (yield from engine) as conn:
        result = yield from conn.execute(stmt)
loop = asyncio.get_event_loop()
loop.run_until_complete(go())
This is the resulting statement with the corresponding error:
INSERT INTO users (id_user, name, age) VALUES (1, 'Jimmy', null);
psycopg2.IntegrityError: null value in column "age" violates not-null constraint
I didn't provide the age column, so where is that age = null value coming from? I was expecting something like this:
INSERT INTO users (id_user, name) VALUES (1, 'Jimmy');
Or, if the default flag actually works, it should be:
INSERT INTO users (id_user, name, age) VALUES (1, 'Jimmy', 0);
Could you shed some light on this?
This issue has been confirmed as an aiopg bug. It seems that, at the moment, aiopg ignores the default argument on data manipulation.
I've fixed the issue using server_default instead:
users = Table('users', metadata,
    Column('id_user', Integer, primary_key=True, nullable=False),
    Column('name', String(20), unique=True),
    Column('age', Integer, nullable=False, server_default='0'))
I think you need to use inline=True in your insert. This turns off 'pre-execution'.
The docs are a bit cryptic about what exactly this 'pre-execution' entails, but they mention default parameters:
:param inline:
    if True, SQL defaults present on :class:`.Column` objects via
    the ``default`` keyword will be compiled 'inline' into the statement
    and not pre-executed. This means that their values will not
    be available in the dictionary returned from
    :meth:`.ResultProxy.last_updated_params`.
This piece of docstring is from the Update class, but Update shares this behavior with Insert.
Besides, that's the only way they test it:
https://github.com/zzzeek/sqlalchemy/blob/rel_0_9/test/sql/test_insert.py#L385