SQLAlchemy + Postgres: synthetic/artificial id mixin with a sequence

I've found the mixin pattern to be really handy for staying DRY, but I am having trouble with sequences. Note: I'm using Postgres.
We use Alembic migrations, and I'd really like --autogenerate to work with this sequence, though I understand this might not be possible right now. However, it looks like setting up the sequence without an ORM identifier prevents the sequence from being dropped later if I wanted to perform a downgrade.
Through googling, I found some explanation of how to properly set up a sequence. Essentially: separate the id and its sequence.
My current code looks a bit like this:
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declared_attr

class AutoIdMixin(object):
    """Generates a synthetic identifier primary key."""
    # See: http://docs.sqlalchemy.org/en/latest/core/defaults.html#associating-a-sequence-as-the-server-side-default

    @declared_attr
    def id_seq(cls):
        bases = cls.__bases__
        Base = bases[0]
        sequence_prefix = 'seq'
        schema = cls._schema_name
        sequence_id = '_'.join((sequence_prefix, schema, cls.__tablename__, 'id'))
        sequence = sa.Sequence(sequence_id, 1, 1, metadata=Base.metadata)
        return sequence

    @declared_attr
    def id(cls):
        column_id = sa.Column(sa.types.Integer, cls.id_seq.next_value(), primary_key=True)
        return column_id
With the code above, I end up with an unhelpful error:
AttributeError: Neither 'next_value' object nor 'Comparator' object has an attribute '_set_parent_with_dispatch'

In an RTM moment, it looks like I missed a keyword: server_default.
@declared_attr
def id(cls):
    sequence = cls.id_seq
    column_id = sa.Column(sa.types.Integer, server_default=sequence.next_value(), primary_key=True)
    return column_id
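
For completeness, here is a minimal sketch of how a model might consume the fixed mixin. The Widget model, its _schema_name value, and the base-class ordering are illustrative assumptions (the mixin reads cls.__bases__[0] as the declarative Base, so Base is listed first):

import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Widget(Base, AutoIdMixin):  # Base first: the mixin treats __bases__[0] as Base
    __tablename__ = 'widget'      # hypothetical table name
    _schema_name = 'public'       # hypothetical schema prefix used in the sequence name
    name = sa.Column(sa.types.String(50))

# create_all should now emit CREATE SEQUENCE seq_public_widget_id and give
# widget.id a server-side default of nextval('seq_public_widget_id'):
# engine = sa.create_engine('postgresql://...')
# Base.metadata.create_all(engine)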


Slick insert not working while trying to return inserted row

My goal here is to retrieve the Board entity upon insert. If the entity exists then I just want to return the existing object (which coincides with the argument of the add method). Otherwise I'd like to return the new row inserted in the database.
I am using Play 2.7 with Slick 3.2 and MySQL 5.7.
The implementation is based on this answer, which is more than insightful.
Also, from Essential Slick:
exec(messages returning messages +=
  Message("Dave", "So... what do we do now?"))
DAO code
@Singleton
class SlickDao @Inject()(db: Database)(implicit val playDefaultContext: ExecutionContext) extends MyDao {

  override def add(board: Board): Future[Board] = {
    val insert = Boards
      .filter(b => b.id === board.id).exists.result.flatMap { exists =>
        if (!exists) Boards returning Boards += board
        else DBIO.successful(board) // no-op - return the board that was passed in
      }.transactionally
    db.run(insert)
  }
}
EDIT: I also tried replacing the += part with
Boards returning Boards.map(_.id) into { (b, boardId) => b.copy(id = boardId) } += board
and this does not work either.
The table definition is the following:
object Board {

  val Boards: TableQuery[BoardTable] = TableQuery[BoardTable]

  class BoardTable(tag: Tag) extends Table[BoardRow](tag, "BOARDS") {
    // columns
    def id = column[String]("ID", O.Length(128))
    def x = column[String]("X")
    def y = column[Option[Int]]("Y")

    // foreign key definitions
    .....

    // primary key definitions
    def pk = primaryKey("PK_BOARDS", (id, y))

    // default projection
    def * = (id, x, y).mapTo[BoardRow]
  }
}
I would expect a new row to appear in the table, but although the exists query gets executed

select exists(select `ID`, `X`, `Y`
  from `BOARDS`
  where ((`ID` = '92f10c23-2087-409a-9c4f-eb2d4d6c841f')));

and the result is false, there is no insert.
There is also no logging in the database indicating that any insert statements are received (I am referring to the general_log file).
First of all, the problem with the query execution was a mishandling of the futures that the DAO produced: I was assigning the insert statement to a future, but that future was never submitted to an execution context. My bad, all the more so because I did not mention it in the description of the problem.
But when this was actually fixed I could see the actual error in the logs of my application. The stack trace was the following:
slick.SlickException: This DBMS allows only a single column to be returned from an INSERT, and that column must be an AutoInc column.
    at slick.jdbc.JdbcStatementBuilderComponent$JdbcCompiledInsert.buildReturnColumns(JdbcStatementBuilderComponent.scala:67)
    at slick.jdbc.JdbcActionComponent$ReturningInsertActionComposerImpl.x$17$lzycompute(JdbcActionComponent.scala:659)
    at slick.jdbc.JdbcActionComponent$ReturningInsertActionComposerImpl.x$17(JdbcActionComponent.scala:659)
    at slick.jdbc.JdbcActionComponent$ReturningInsertActionComposerImpl.keyColumns$lzycompute(JdbcActionComponent.scala:659)
    at slick.jdbc.JdbcActionComponent$ReturningInsertActionComposerImpl.keyColumns(JdbcActionComponent.scala:659)
So this is, at its core, a MySQL limitation. I had to redesign my schema to make this retrieval-after-insert possible. The redesign introduces a dedicated primary key (completely unrelated to the business logic), which is also an AutoInc column, as the stack trace prescribes.
In the end that solution became too involved, so I decided instead to return the actual argument of the add method if the insert was successful. The implementation of the add method ended up being something like this:
override def add(board: Board): Future[Board] = {
  db.run(Boards.insertOrUpdate(board).map(_ => board))
}
with appropriate Future error handling in the controller that invokes the underlying repo.
If you're lucky enough not to be using MySQL with Slick, I suppose you might be able to do this without a dedicated AutoInc primary key. If not, then I suppose this is a one-way road.

Using multiple POSTGRES databases and schemas with the same Flask-SQLAlchemy model

I'm going to be very specific here, because similar questions have been asked, but none of the solutions work for this problem.
I'm working on a project that has four Postgres databases, but for the sake of simplicity let's say there are two, namely A and B.
A and B represent two geographical locations, but the tables and schemas in each database are identical.
Sample model:
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import *
from sqlalchemy.ext.declarative import declarative_base

db = SQLAlchemy()
Base = declarative_base()

class FRARecord(Base):
    __tablename__ = 'tb_fra_credentials'

    recnr = Column(db.Integer, primary_key=True)
    fra_code = Column(db.Integer)
    fra_first_name = Column(db.String)
This model is replicated in both databases, but with different schemas, so to make it work in A, I need to do:
__table_args__ = {'schema' : 'A_schema'}
I'd like to use a single content provider that is given the database to access, but has identical methods:
class ContentProvider():
    def __init__(self, database):
        self.database = database

    def get_fra_list(self):
        logging.debug("Fetching fra list")
        fra_list = db.session.query(FRARecord.fra_code)

The two problems are: how do I decide which database to point to, and how do I avoid replicating the model code for the different schemas (the latter is a Postgres-specific problem)?
Here's what I've tried so far:
1) I've made separate files for each of the models and inherited them, so:
class FRARecordA(FRARecord):
    __table_args__ = {'schema': 'A_schema'}
This doesn't seem to work, because I get the error:
"Can't place __table_args__ on an inherited class with no table."
Meaning that I can't set that argument once the parent's table has already been declared.
2) So I tried to do the same with multiple inheritance,
class FRARecord():
    recnr = Column(db.Integer, primary_key=True)
    fra_code = Column(db.Integer)
    fra_first_name = Column(db.String)

class FRARecordA(Base, FRARecord):
    __tablename__ = 'tb_fra_credentials'
    __table_args__ = {'schema': 'A_schema'}
but got the predictable error:
"CompileError: Cannot compile Column object until its 'name' is assigned."
Obviously I can't move the Column objects to the FRARecordA model without having to repeat them for B as well (and there are actually 4 databases and a lot more models).
3) Finally, I'm considering doing some sort of sharding (which seems to be the correct approach), but I can't find an example of how I'd go about this. My feeling is that I'd just use a single object like this:
class FRARecord(Base):
    __tablename__ = 'tb_fra_credentials'

    @declared_attr
    def __table_args__(cls):
        # something where I go through the values in the bind keys, like:
        for key, value in db.app.config['SQLALCHEMY_BINDS'].iteritems():
            # Return based on the current session maybe? And then have
            # different sessions in the content provider?
            pass

    recnr = Column(db.Integer, primary_key=True)
    fra_code = Column(db.Integer)
    fra_first_name = Column(db.String)
Just to be clear, my intention for accessing the different databases was as follows:
app.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql://%(user)s:%(pw)s@%(host)s:%(port)s/%(db)s' % POSTGRES_A
app.config['SQLALCHEMY_BINDS'] = {
    'B': 'postgresql://%(user)s:%(pw)s@%(host)s:%(port)s/%(db)s' % POSTGRES_B,
    'C': 'postgresql://%(user)s:%(pw)s@%(host)s:%(port)s/%(db)s' % POSTGRES_C,
    'D': 'postgresql://%(user)s:%(pw)s@%(host)s:%(port)s/%(db)s' % POSTGRES_D,
}
where the POSTGRES dictionaries contain all the keys needed to connect to the databases.
I assumed that with the inherited objects I'd just connect to the correct one like this (so the SQLAlchemy query would automatically know):
class FRARecordB(FRARecord):
    __bind_key__ = 'B'
    __table_args__ = {'schema': 'B_schema'}
Finally found a solution to this.
Essentially, I didn't create new classes for each database; I just used a different database connection for each.
This method on its own is pretty common; the tricky part (which I couldn't find examples of) was handling the schema differences. I ended up doing this:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

Session = sessionmaker()

class ContentProvider():
    db = None
    connection = None
    session = None

    def __init__(self, center):
        if center == 'A':
            self.db = create_engine('postgresql://%(user)s:%(pw)s@%(host)s:%(port)s/%(db)s' % POSTGRES_A, echo=echo, pool_threadlocal=True)
            self.connection = self.db.connect()
            # It's not very clean, but this was the extra step. You could also
            # set specific connection params if you have multiple schemas.
            self.connection.execute('set search_path=A_schema')
        elif center == 'B':
            self.db = create_engine('postgresql://%(user)s:%(pw)s@%(host)s:%(port)s/%(db)s' % POSTGRES_B, echo=echo, pool_threadlocal=True)
            self.connection = self.db.connect()
            self.connection.execute('set search_path=B_schema')
        # Bind a session to this connection so queries run with the
        # search_path that was just set.
        self.session = Session(bind=self.connection)

    def get_fra_list(self):
        logging.debug("Fetching fra list")
        fra_list = self.session.query(FRARecord.fra_code)
        return fra_list
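
For illustration, usage might then look like the following minimal sketch (it assumes the POSTGRES_* dictionaries and the FRARecord model defined earlier; each provider queries its own database and schema through its connection-bound session):

# Hypothetical usage of the ContentProvider above.
provider_a = ContentProvider('A')
provider_b = ContentProvider('B')

# Identical method, different database/schema behind each call.
fra_codes_a = provider_a.get_fra_list().all()
fra_codes_b = provider_b.get_fra_list().all()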

Django 1.8 (Django REST Framework 3.3.1) and PostgreSQL 9.3 - ProgrammingError

I've created views in my Postgres database that are exposed through Django REST Framework.
I have about 20 views altogether. On 7 of them I keep getting this programming error:
ProgrammingError at /api/reports/
column report.event_type_id_id does not exist
LINE 1: SELECT "report"."id", "report"."ev...
All 7 show the exact same error message. All of the views are based on one table with the same column names. The column in the table is event_type_id, NOT event_type_id_id, so I'm not sure where that is coming from. If all of them had the same error it would make a little more sense, but it's only 7.
I'm not sure where to even begin correcting this issue because I can't pinpoint what the exact problem is. I'm assuming it's on the database side and how Django expects to receive something from the database, but I'm not entirely sure. Any ideas??
Thanks in advance!
UPDATE:
Exception Location:
/usr/local/lib/python2.7/dist-packages/django/db/backends/utils.py in execute
def execute(self, sql, params=None):
    self.db.validate_no_broken_transaction()
    with self.db.wrap_database_errors:
        if params is None:
            return self.cursor.execute(sql)
        else:
            return self.cursor.execute(sql, params)

def executemany(self, sql, param_list):
    self.db.validate_no_broken_transaction()
    with self.db.wrap_database_errors:
        return self.cursor.executemany(sql, param_list)
Report Model Update:
class Report(models.Model):
    event_type_id = models.ForeignKey(EventTypeRef, default='01')
    event_at = models.DateTimeField("Event Time")
    xxxxxx_id = models.ForeignKey(xxxxxx)
    xxxx = models.BigIntegerField(blank=True, null=True)
    color = models.IntegerField(blank=True, null=True)
    xxxxx_info = models.CharField(db_column='xxxxxInfo', max_length=56, blank=True)
    xxxx_tag = models.ForeignKey(xxxxxxxx, blank=True, null=True)
    hidden = models.BooleanField(default=False)

    def __unicode__(self):  # __unicode__ on Python 2
        return self.event_type_id

    class Meta:
        managed = False
        db_table = u'report'
        verbose_name_plural = "XXXXX Reports"
According to the Django ForeignKey documentation:
"Behind the scenes, Django appends "_id" to the field name to create its database column name. In the above example, the database table for the Car model will have a manufacturer_id column. (You can change this explicitly by specifying db_column.) However, your code should never have to deal with the database column name, unless you write custom SQL. You'll always deal with the field names of your model object."

So, you should rename event_type_id to event_type.
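
For illustration, a trimmed-down sketch of the renamed field (other fields omitted; the unmanaged Meta stays as in the question):

class Report(models.Model):
    # The field name "event_type" makes Django derive the column name
    # "event_type_id", which matches the existing column in the table.
    event_type = models.ForeignKey(EventTypeRef, default='01')
    event_at = models.DateTimeField("Event Time")

    class Meta:
        managed = False
        db_table = u'report'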

How to access a field in a related django model other than the primary key

This seems a silly, simple question. I'm going round in circles trying to get this to work, but I can't see the wood for the trees.
Given a simple model such as (I've skipped the imports):
class Location(models.Model):
    description = models.CharField(max_length=40)
    address1 = models.CharField(max_length=40)
    # ..... (and so on)
    tel = models.CharField(max_length=12)
and another with a relationship to it:
class InformationRequest(models.Model):
    source = models.ForeignKey(Location)
    request_date = models.DateField(default=datetime.now)
    # ..... (and so on)

How do I add a field that references the 'tel' field from the Location model in such a way that it can be populated automatically, or from a select list in Django admin?
OK, if I get this right then you are, nomen est omen, thoroughly confusing the way that relational databases work :] One of the key principles is to eliminate redundancy: the very same piece of data shouldn't be stored in two tables that are related to one another.
I think that your current models are correct. Given these instances (I'm ignoring the fact that you have other, non-nullable fields)...
>>> loc = Location()
>>> loc.tel = "123"
>>> loc.save()
>>> info = InformationRequest()
>>> info.source = loc
>>> info.save()
...you can access tel from InformationRequest instance just like this:
>>> info.source.tel
'123'
You can also create a method...
class InformationRequest(models.Model):
    source = models.ForeignKey(Location, related_name="information_requests")
    request_date = models.DateField(default=datetime.now)
    # ..... (and so on)

    def contact_tel(self):
        return self.source.tel
... and get it like this:
>>> info.contact_tel()
'123'
You can even trick it into being an attribute...
class InformationRequest(models.Model):
    source = models.ForeignKey(Location, related_name="information_requests")
    request_date = models.DateField(default=datetime.now)
    # ..... (and so on)

    @property
    def contact_tel(self):
        return self.source.tel
... and get it without parentheses:
>>> info.contact_tel
'123'
Anyway, you should work your way around it programmatically. Hope that helps.
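
Since the question also mentions Django admin: the property above can be reused to show the related tel in the admin without duplicating it in the database. A minimal sketch (the admin class name is illustrative):

from django.contrib import admin

class InformationRequestAdmin(admin.ModelAdmin):
    # "contact_tel" resolves to the property on the model, so the
    # related Location's tel appears read-only in the changelist.
    list_display = ('request_date', 'source', 'contact_tel')

admin.site.register(InformationRequest, InformationRequestAdmin)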

How to use composite data types (e.g. geomval) in SQLAlchemy?

I'm trying to replicate a nested raw PostgreSQL/PostGIS raster query using SQLAlchemy (0.8) / GeoAlchemy2 (0.2.1) and can't figure out how to access the components of a geomval data type. It's a compound data type that returns a 'geom' and a 'val'.
Here is the raw query that works:
SELECT (dap).val, (dap).geom
FROM (SELECT ST_DumpAsPolygons(rast) as dap FROM my_raster_table) thing
And the SQLAlchemy query I'm currently working with:
import geoalchemy2 as ga2
from sqlalchemy import *
from sqlalchemy.orm import sessionmaker

metadata = MetaData()
my_raster_table = Table('my_raster_table', metadata,
                        Column('rid', Integer),
                        Column('rast', ga2.Raster))

engine = create_engine(my_conn_str)
session = sessionmaker(engine)()

q = session.query(ga2.func.ST_DumpAsPolygons(my_raster_table.c.rast).label('dap'))
And then I'd like to access that in a subquery, something like this:
q2 = session.query(ga2.func.ST_Area(q.subquery().c.dap.geom))
But that syntax doesn't work, or I wouldn't be posting this question ;). Anyone have ideas? Thanks!
The solution ended up being fairly simple:
First, define a custom GeomvalType, inheriting geoalchemy2's CompositeType and specifying a typemap specific to geomval:
class GeomvalType(ga2.types.CompositeType):
    typemap = {'geom': ga2.Geometry('MULTIPOLYGON'), 'val': Float}
Next, use type_coerce to cast the ST_DumpAsPolygons result to the GeomvalType in the initial query:
q = session.query(type_coerce(ga2.func.ST_DumpAsPolygons(my_raster_table.c.rast), GeomvalType()).label('dap'))
Finally, access it (successfully!) from the subquery as I was trying to before:
q2 = session.query(ga2.func.ST_Area(q.subquery().c.dap.geom))
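
For completeness, a short sketch of how both components might then be consumed under the same setup (the typemap above exposes both .geom and .val on the labeled column; q3 is a hypothetical follow-up query):

# Pull both composite components: the raw value and the polygon's area.
dap = q.subquery().c.dap
q3 = session.query(dap.val, ga2.func.ST_Area(dap.geom))
for val, area in q3:
    print(val, area)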