What is the SQL query equivalent to a SQLAlchemy query? - postgresql

I am having issues iterating over results from an SQL query using encode/databases (https://pypi.org/project/databases/),
but this SQLAlchemy query works fine for the Celery tasks:
query = session.query(orders).filter(orders.shipped == True)
I have tried the following (celery task unable to iterate over multiple rows from postgresql database with python) but it does not work:
def check_all_orders():
    query = "SELECT * FROM orders WHERE shipped=True"
    return database.fetch_all(query)
...
...
...
@app.task
async def check_orders():
    query = await check_all_orders()
    today = datetime.utcnow()
    for q in query:
        if q.last_notification is not None:
            if (today - q.last_notification).total_seconds() < q.cooldown:
                continue
Does anyone know what SQL statement will generate something I can iterate over, the way it does for this SQLAlchemy query?
query = session.query(orders).filter(orders.shipped == True)

SQLAlchemy's resulting query depends on which particular backend engine you use. For example, filter(orders.shipped == True) can be converted to something like WHERE shipped = 't' for PostgreSQL. You can always log the query it sends to the database backend. For your particular case, SELECT * FROM orders WHERE shipped should be enough.
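For comparison, the same boolean filter iterates fine when run as plain SQL through any DB-API driver. A minimal sketch, using sqlite3 as a stand-in for Postgres (the orders columns here are assumptions based on the question):

```python
import sqlite3

# Stand-in table; in the real app this is the Postgres "orders" table.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, shipped BOOLEAN)")
conn.executemany("INSERT INTO orders (shipped) VALUES (?)", [(1,), (0,), (1,)])

# The raw-SQL equivalent of session.query(orders).filter(orders.shipped == True)
rows = conn.execute("SELECT * FROM orders WHERE shipped = TRUE").fetchall()
for row in rows:
    print(row["id"], row["shipped"])
```

With encode/databases the same statement is passed to database.fetch_all(), and each returned record supports attribute-style access much like the rows above.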

Related

Sqlalchemy asyncio translate postgres query for GROUP_BY clause

I want to translate the Postgres query below into SQLAlchemy asyncio format, but so far I could only retrieve the first column, or the whole row at once, while I need to retrieve exactly two columns per record:
SELECT
table.xml_uri,
max(table.created_at) AS max_1
FROM
table
GROUP BY
table.xml_uri
ORDER BY
max_1 DESC;
I arrived at the translation below, but it only returns the first column, xml_uri, while I need both columns. I left the order_by clause commented out for now, as it also generates the error below when enabled:
SQLAlchemy query:
from sqlalchemy.ext.asyncio import AsyncSession

query = "%{}%".format(query)
records = await session.execute(
    select(BaseModel.xml_uri, func.max(BaseModel.created_at))
    .order_by(BaseModel.created_at.desc())
    .group_by(BaseModel.xml_uri)
    .filter(BaseModel.xml_uri.like(query))
)
# Get all the records
result = records.scalars().all()
Error generated when the order_by clause is enabled:
column "table.created_at" must appear in the GROUP BY clause or be used in an aggregate function
The query returns a result set consisting of two-element tuples, and .scalars() takes only the first element of each tuple. Calling .all() directly on the result of session.execute() provides the desired behaviour.
It's not permissible to order by the date column directly, as it isn't part of the projection, but you can give the max column a label and use that to order.
Here's an example script:
import sqlalchemy as sa
from sqlalchemy import orm

Base = orm.declarative_base()


class MyModel(Base):
    __tablename__ = 't73018397'
    id = sa.Column(sa.Integer, primary_key=True)
    code = sa.Column(sa.String)
    value = sa.Column(sa.Integer)


engine = sa.create_engine('postgresql:///test', echo=True, future=True)
Base.metadata.drop_all(engine)
Base.metadata.create_all(engine)
Session = orm.sessionmaker(engine, future=True)

with Session.begin() as s:
    for i in range(10):
        # Split values based on odd or even
        code = 'AB'[i % 2 == 0]
        s.add(MyModel(code=code, value=i))

with Session() as s:
    q = (
        sa.select(MyModel.code, sa.func.max(MyModel.value).label('mv'))
        .group_by(MyModel.code)
        .order_by(sa.text('mv desc'))
    )
    res = s.execute(q)
    for row in res:
        print(row)
which generates this query:
SELECT
t73018397.code,
max(t73018397.value) AS mv
FROM t73018397
GROUP BY t73018397.code
ORDER BY mv desc
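The same label-then-order pattern can be checked with plain SQL; here is a minimal sketch using sqlite3 as a convenient stand-in for Postgres, with the asker's column names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (xml_uri TEXT, created_at INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a.xml", 1), ("a.xml", 5), ("b.xml", 3)],
)

# GROUP BY the key, aggregate the timestamp, and ORDER BY the alias --
# ordering by the raw created_at column is what Postgres rejects.
rows = conn.execute(
    "SELECT xml_uri, max(created_at) AS max_1 "
    "FROM t GROUP BY xml_uri ORDER BY max_1 DESC"
).fetchall()
print(rows)
```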

Expressing Postgresql VALUES command in SQLAlchemy ORM?

How to express the query
VALUES ('alice'), ('bob') EXCEPT ALL SELECT name FROM users;
(i.e. "list all names in VALUES that are not in table 'users'") in SQLAlchemy ORM? In other words, what should the statement 'X' below be like?
def check_for_existence_of_all_users_in_list(list):
    logger.debug(f"checking that each user in {list} is in the database")
    query = X(list)
(There is sqlalchemy.values which could be used like this:
query = sa.values(sa.column('name', sa.String)).data(['alice', 'bob']) # .???
but it appears that it can only be used as an argument to INSERT or UPDATE.)
I am using SQLAlchemy 1.4.4.
This should work for you:
user_names = ['alice', 'bob']
q = values(column('name', String), name="temp_names").data([(_,) for _ in user_names])
query = select(q).except_all(select(users.c.name)) # 'users' is Table instance
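The SQL that this builds can be sanity-checked with a plain query. A sketch using sqlite3 as a stand-in (note sqlite3 only supports EXCEPT; Postgres's EXCEPT ALL additionally keeps duplicate rows):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('bob')")

# Names listed in VALUES that are missing from the users table.
missing = conn.execute(
    "SELECT * FROM (VALUES ('alice'), ('bob')) EXCEPT SELECT name FROM users"
).fetchall()
print(missing)  # only 'alice' is absent from the table
```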

How to properly parameterize my postgresql query

I'm trying to parameterize my PostgreSQL query in order to prevent SQL injection in my Ruby on Rails application. The SQL query will sum a different value in my table depending on the input.
Here is a simplified version of my function:
def self.calculate_value(value)
  calculated_value = ""
  if value == "quantity"
    calculated_value = "COALESCE(sum(amount), 0)"
  elsif value == "retail"
    calculated_value = "COALESCE(sum(amount * price), 0)"
  elsif value == "wholesale"
    calculated_value = "COALESCE(sum(amount * cost), 0)"
  end
  query = <<-SQL
    select CAST(? AS DOUBLE PRECISION) as ? from table1
  SQL
  return Table1.find_by_sql([query, calculated_value, value])
end
If I call calculate_value("retail"), it will execute the query like this:
select location, CAST('COALESCE(sum(amount * price), 0)' AS DOUBLE PRECISION) as 'retail' from table1 group by location
This results in an error. I want it to execute without the quotes like this:
select location, CAST(COALESCE(sum(amount * price), 0) AS DOUBLE PRECISION) as retail from table1 group by location
I understand that the addition of quotation marks is what prevents the SQL injection, but how would I prevent it in this case? What is the best way to handle this scenario?
NOTE: This is a simplified version of the queries I'll be writing and I'll want to use find_by_sql.
A prepared statement cannot change the query structure: table or column names, the ORDER BY clause, function names, and so on. Only literals can be passed this way.
Where is the SQL injection? You are not putting a user-supplied value into the query text. Instead, you check the given value against an allowed list and use only SQL fragments you wrote yourself. In this case, there is no danger of SQL injection.
I also want to link to this article. It is safe to build query text dynamically if you control all parts of that query, and it's much better for the RDBMS than some smart logic in the query.
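The allow-list idea above, sketched in Python rather than Ruby (sqlite3 and the column names stand in for the Rails/Postgres setup; calculate_value is a hypothetical port of the method in the question):

```python
import sqlite3

# Map each allowed input to a SQL fragment we wrote ourselves; anything
# else is rejected, so no user-controlled text ever reaches the query.
EXPRESSIONS = {
    "quantity": "COALESCE(sum(amount), 0)",
    "retail": "COALESCE(sum(amount * price), 0)",
    "wholesale": "COALESCE(sum(amount * cost), 0)",
}

def calculate_value(conn, value):
    expr = EXPRESSIONS.get(value)
    if expr is None:
        raise ValueError(f"unsupported value: {value!r}")
    # The fragment is interpolated, but it comes only from our dict above.
    sql = f"SELECT CAST({expr} AS DOUBLE PRECISION) FROM table1"
    return conn.execute(sql).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (amount INTEGER, price REAL, cost REAL)")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)",
                 [(2, 1.5, 1.0), (3, 2.0, 1.2)])
print(calculate_value(conn, "retail"))
```

Because the interpolated fragment is chosen from a fixed dictionary and never taken from the caller's string, the structure of the query stays under the application's control.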

What is the right way to work with slick's 3.0.0 streaming results and Postgresql?

I am trying to figure out how to work with Slick streaming. I use Slick 3.0.0 with the Postgres driver.
The situation is as follows: the server has to give the client sequences of data split into chunks limited by size (in bytes). So I wrote the following Slick query:
val sequences = TableQuery[Sequences]
def find(userId: Long, timestamp: Long) = sequences.filter(s ⇒ s.userId === userId && s.timestamp > timestamp).sortBy(_.timestamp.asc).result
val seq = db.stream(find(0L, 0L))
I combined seq with an akka-streams Source, and wrote a custom PushPullStage that limits the size of the data (in bytes) and finishes the upstream when it reaches the size limit. It works just fine. The problem is that when I look into the Postgres logs, I see a query like this:
select * from sequences where user_id = 0 and timestamp > 0 order by timestamp;
So at first glance there appears to be much (and unnecessary) database querying going on, only to use a few bytes in each query. What is the right way to do streaming with Slick so as to minimize database querying and make the best use of the data transferred in each query?
The "right way" to do streaming with Slick and Postgres involves three things:
Must use db.stream()
Must disable autoCommit in the JDBC driver. One way is to make the query run in a transaction by appending .transactionally.
Must set fetchSize to something other than 0, or else Postgres will push the whole resultSet to the client in one go.
Ex:
DB.stream(
  find(0L, 0L)
    .transactionally
    .withStatementParameters(fetchSize = 1000)
).foreach(println)
Useful links:
https://github.com/slick/slick/issues/1038
https://github.com/slick/slick/issues/809
The correct way to stream in Slick, as provided in the documentation, is:
val q = for (c <- coffees) yield c.image
val a = q.result
val p1: DatabasePublisher[Blob] = db.stream(
  a.withStatementParameters(
    rsType = ResultSetType.ForwardOnly,
    rsConcurrency = ResultSetConcurrency.ReadOnly,
    fetchSize = 1000 /* your fetch size */
  ).transactionally
)
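The underlying idea, pulling rows from a single query in bounded chunks instead of materialising the whole result set, can be illustrated outside Slick with Python's DB-API, where fetchmany() plays the role of fetchSize (sqlite3 used as a stand-in; the table mirrors the question's sequences table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sequences (user_id INTEGER, timestamp INTEGER)")
conn.executemany("INSERT INTO sequences VALUES (0, ?)",
                 [(i,) for i in range(1, 26)])

# One query, consumed in fixed-size batches rather than all at once.
cur = conn.execute(
    "SELECT * FROM sequences WHERE user_id = ? AND timestamp > ? "
    "ORDER BY timestamp",
    (0, 0),
)
chunks = 0
while True:
    batch = cur.fetchmany(10)  # analogous to fetchSize = 1000
    if not batch:
        break
    chunks += 1
print(chunks)  # 25 rows in batches of 10 -> 3 chunks
```

In Postgres via JDBC, the same effect requires the transaction plus non-zero fetchSize described above, so the server keeps a cursor open instead of shipping the entire resultSet.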

Groupby on multiple objects generates invalid SQL in slick

I am writing a query that calculates the possible score for a QuestionAnswer; when executing the query I get a PSQLException.
Info about the model
A QuestionAnswer can have several (at least one) questionAnswerPossibilities, since there are multiple ways to answer the question correctly.
Every questionAnswerPossibility has several questionAnswerParts; in the query below we query the score per questionAnswerPossibility.
The Problematic Query
The query itself does generate SQL, but the SQL cannot be executed.
def queryMogelijkePuntenByVragenViaOpenVragen()(implicit session: Session) = {
  (for {
    ovam <- OpenVraagAntwoordMogelijkheden // questionAnswerPossibilities
    ovad <- OpenVraagAntwoordOnderdelen if ovad.ovamId === ovam.id // questionAnswerParts
    ova <- OpenVraagAntwoorden if ovam.ovaId === ova.id // questionAnswers
  } yield ((ova, ovam), ovad.punten))
    .groupBy { case ((ova, ovam), punten) => (ova, ovam) }
    .map { case ((ova, ovam), query) => (ova, ovam, query.map(_._2).sum) }
}
Here is the generated SQL (PostgreSQL):
select x2."id", x2."vraag_id", x3."id", x3."volgorde", x3."ova_id", sum(x4."punten")
from "open_vraag_antwoord_mogelijkheden" x3, "open_vraag_antwoord_onderdelen" x4, "open_vraag_antwoorden" x2
where (x4."ovam_id" = x3."id") and (x3."ova_id" = x2."id")
group by (x2."id", x2."vraag_id"), (x3."id", x3."volgorde", x3."ova_id")
The problem is that the SQL cannot be executed; I get the following error:
play.api.Application$$anon$1:
Execution exception[[
PSQLException: ERROR: column "x2.id" must appear in the GROUP BY clause or be used in an aggregate function
Position: 8]]
The SQL that is generated contains too many brackets; the last part of the SQL should be:
group by x2."id", x2."vraag_id", x3."id", x3."volgorde", x3."ova_id"
However, Slick generates it with brackets. Am I doing something wrong here, or is this a bug?
I solved the issue:
...
} yield ((ova.id, ovam.id), ovad.punten))
Because I now only yield the necessary IDs and not all the data, the generated SQL does not contain the unnecessary brackets that caused the SQL error.
I really wanted more data than just those IDs, but I can work around this by using this query as a subquery; the outer query will fetch all the needed data for me.
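The subquery workaround described above can be sketched in plain SQL. A minimal sketch using sqlite3 as a stand-in, with simplified English table names in place of the Dutch ones: the inner query groups on the ID only, and the outer join pulls back the remaining columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE possibilities (id INTEGER PRIMARY KEY, answer_id INTEGER);
CREATE TABLE parts (id INTEGER PRIMARY KEY, possibility_id INTEGER,
                    points INTEGER);
INSERT INTO possibilities VALUES (1, 10), (2, 10);
INSERT INTO parts VALUES (1, 1, 2), (2, 1, 3), (3, 2, 4);
""")

# Group by the id alone in a subquery, then join to fetch the extra columns,
# so no non-aggregated column is missing from the GROUP BY clause.
rows = conn.execute("""
SELECT p.id, p.answer_id, s.total
FROM possibilities p
JOIN (SELECT possibility_id, sum(points) AS total
      FROM parts GROUP BY possibility_id) s
  ON s.possibility_id = p.id
ORDER BY p.id
""").fetchall()
print(rows)
```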