How to use the asynchronous feature of psycopg2? - postgresql

I'm trying to execute 3 different PostgreSQL queries against different tables. Each query takes 2 seconds to execute. I was wondering if it's possible to run all 3 queries at the same time so that I can save 4 seconds. I tried using the asynchronous feature of psycopg2, but it only returns the result of the last query. Can anyone point out what I'm doing wrong?
import select
import psycopg2
import psycopg2.extensions

def wait(conn):
    while 1:
        state = conn.poll()
        if state == psycopg2.extensions.POLL_OK:
            break
        elif state == psycopg2.extensions.POLL_WRITE:
            select.select([], [conn.fileno()], [])
        elif state == psycopg2.extensions.POLL_READ:
            select.select([conn.fileno()], [], [])
        else:
            raise psycopg2.OperationalError("poll() returned %s" % state)

aconn = psycopg2.connect(
    dbname=pg_name,
    user=pg_username,
    host=pg_host,
    password=pg_password,
    async=1)
wait(aconn)
acurs = aconn.cursor()
acurs.execute(
    "SELECT 1;"
    "SELECT ST_Length(ST_GeomFromText"
    "('LINESTRING(743238 2967416,743238 2967450)',4326));"
    "SELECT 3;"
)
wait(acurs.connection)
result = acurs.fetchall()
print result
This only prints: "result": [[3]]

Per the Psycopg Introduction:
[Psycopg] is a wrapper for the libpq, the official PostgreSQL client library.
Then, looking at the libpq documentation for PQexec() (the function used to send SQL queries to the PostgreSQL database), we see the following note (emphasis mine):
Multiple queries sent in a single PQexec call are processed in a single transaction, unless there are explicit BEGIN/COMMIT commands included in the query string to divide it into multiple transactions. Note however that the returned PGresult structure describes only the result of the last command executed from the string.
So, unfortunately, what you're trying to do is simply not supported by psycopg2 and libpq. (This isn't to say that other client interfaces to PostgreSQL don't support it, though, but that's out of scope for this question.)
So, to answer your question: what you're doing wrong is executing multiple SQL queries in a single execute() call and then trying to retrieve all of their results, which simply isn't possible. You need to execute each query and retrieve its results individually, or else find another PostgreSQL client interface that supports returning multiple result sets at once.
The Python Database API 2.0 specification does allow for the optional nextset() method to be implemented by the library which moves the cursor to the next result set returned from the queries executed, but this method is not implemented in psycopg2 (for obvious reasons) and in fact raises a NotSupportedError exception if you try to call it (see the docs).
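For example, executing each statement separately and fetching its result looks like the sketch below. This uses an ordinary blocking connection and reuses the connection parameters from the question, so the queries still run one after another (it doesn't save any time by itself; see the next answer for running them concurrently).

import psycopg2

# Minimal sketch: one statement per execute() call, results collected per query.
conn = psycopg2.connect(dbname=pg_name, user=pg_username,
                        host=pg_host, password=pg_password)
cur = conn.cursor()

queries = [
    "SELECT 1;",
    "SELECT ST_Length(ST_GeomFromText"
    "('LINESTRING(743238 2967416,743238 2967450)',4326));",
    "SELECT 3;",
]

results = []
for q in queries:
    cur.execute(q)                    # exactly one statement per call
    results.append(cur.fetchall())    # fetch this query's result set

print(results)                        # three result sets, one per query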

It looks like asynchronous support is available as of psycopg2 version 2.2:
def wait(conn):
    while True:
        state = conn.poll()
        if state == psycopg2.extensions.POLL_OK:
            break
        elif state == psycopg2.extensions.POLL_WRITE:
            select.select([], [conn.fileno()], [])
        elif state == psycopg2.extensions.POLL_READ:
            select.select([conn.fileno()], [], [])
        else:
            raise psycopg2.OperationalError("poll() returned %s" % state)
Source: https://www.psycopg.org/docs/advanced.html#asynchronous-support
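To actually run the three queries from the question concurrently, one rough, untested sketch (reusing the wait() helper and the connection parameters above) is to open one asynchronous connection per query. Note that newer psycopg2 releases spell the keyword argument async_, because async became a reserved word in Python 3.7.

import psycopg2
import psycopg2.extensions

queries = [
    "SELECT 1;",
    "SELECT ST_Length(ST_GeomFromText"
    "('LINESTRING(743238 2967416,743238 2967450)',4326));",
    "SELECT 3;",
]

cursors = []
for q in queries:
    # async=1 (async_=1 on newer versions) makes connect() return immediately
    aconn = psycopg2.connect(dbname=pg_name, user=pg_username,
                             host=pg_host, password=pg_password, async=1)
    wait(aconn)            # wait until the connection is established
    acurs = aconn.cursor()
    acurs.execute(q)       # returns immediately; the query runs on the server
    cursors.append(acurs)

# All three queries are now in flight, so waiting on each connection in turn
# costs roughly the duration of the slowest query, not the sum of all three.
results = []
for acurs in cursors:
    wait(acurs.connection)
    results.append(acurs.fetchall())

print(results)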

Related

Can't declare a variable in ArangoDB

I'm using ArangoDB 3.4.6-1 and would like to delete vertices with AQL (as stated here) in the online console.
In the first step, according to the tutorial, you are supposed to save your edges into a variable. In my case the statement looks like this:
LET edgeKeys = (FOR v, e, p IN 1..100 INBOUND 'Node/N3' GRAPH 'graph' RETURN e._key)
The FOR itself, without the brackets, returns the correct result:
[
"E3"
]
Yet, running the whole statement with the brackets just throws the following error:
Query: AQL: syntax error, unexpected end of query string near ')' at position 1:83 (while parsing)
I tried using a comparable command with other graphs or other returned values and objects, but always get the same error.
So far I wasn't able to find a proper solution online. The tutorial provides the following example code (copied 1:1):
LET edgeKeys = (FOR v, e IN 1..1 ANY 'persons/eve' GRAPH 'knows_graph' RETURN e._key)
And I'm getting exactly the same error; it doesn't even get as far as checking the collections.
What am I doing wrong?
Defining a variable with LET alone is not a valid AQL query.
From the AQL Syntax documentation:
An AQL query must either return a result (indicated by usage of the RETURN keyword) or execute a data-modification operation (indicated by usage of one of the keywords INSERT, UPDATE, REPLACE, REMOVE or UPSERT). The AQL parser will return an error if it detects more than one data-modification operation in the same query or if it cannot figure out if the query is meant to be a data retrieval or a modification operation.
Using the full AQL block stated in the tutorial, execution works as expected, since the query performs a data-modification operation (REMOVE in this case). A RETURN inside the LET variable declaration alone is not sufficient to run an AQL query. Removing the LET also makes the query work, because the AQL query then directly returns the result.
Complete AQL query:
LET edgeKeys = (FOR v, e IN 1..1 ANY 'persons/eve' GRAPH 'knows_graph' RETURN e._key)
LET r = (FOR key IN edgeKeys REMOVE key IN knows)
REMOVE 'eve' IN persons
An additional RETURN also makes the query work:
LET edgeKeys = (FOR v, e IN 1..1 ANY 'persons/eve' GRAPH 'knows_graph' RETURN e._key)
RETURN edgeKeys

synchronous queries to MongoDB

Is there a way to make synchronous queries to MongoDB?
I'd like to run some code only after I've retrieved all my data from the DB.
Here is a sample snippet.
Code Snippet A
const brandExists = Brands.find({name: trxn.name}).count();
Code Snippet B
if (brandExists == 0) {
    Brands.insert({
        name: trxn.name,
        logo: "default.png",
    });
    Trxs.insert({
        userId,
        merchant_name,
        amt,
    });
}
I'd like Code Snippet B to run only after Code Snippet A has completed its data retrieval from the DB. How would one go about doing that?
You can use a simple async function; an async function always returns a promise.
async function brandExist() {
    return Brands.find({
        name: trxn.name
    }).count();
}

brandExist().then((brandExists) => {
    // Your code comes here
    if (brandExists == 0) {
        Brands.insert({
            name: trxn.name,
            logo: "default.png",
        });
        Trxs.insert({
            userId,
            merchant_name,
            amt,
        });
    }
});
I don't think using an if statement like the one you have makes sense: the queries are sent one after another, so it is possible that someone else creates a brand with the same name as the one you are working with between your two queries to the database.
MongoDB has something called unique indexes you can use to enforce values being unique. You should be able to use name as a unique index. Then when you insert a new document into the collection, it will fail if there already exists a document with that name.
https://docs.mongodb.com/manual/core/index-unique/
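If it helps to see the idea in isolation, here is a rough sketch using the standalone pymongo driver rather than Meteor (the database and collection names are made up): with a unique index on name, inserting a duplicate fails instead of silently creating a second brand.

from pymongo import MongoClient
from pymongo.errors import DuplicateKeyError

# Hypothetical database/collection names, for illustration only.
brands = MongoClient()["mydb"]["brands"]
brands.create_index("name", unique=True)

try:
    brands.insert_one({"name": "Acme", "logo": "default.png"})
except DuplicateKeyError:
    # A brand with this name already exists; nothing is inserted.
    pass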
In Meteor, MongoDB queries are synchronous, so it already delivers what you need. No need to make any changes: snippet B will only run after snippet A has completed.
When we say a function is asynchronous, we mean that calling it is non-blocking: our program calls the function and keeps going, rather than waiting for the response we need.
If our function is synchronous, our program calls that function and waits for its response before continuing with the rest of the program.
Meteor is based on Node, which is asynchronous by nature, but coding with only asynchronous functions can lead to what developers call "callback hell".
On the server side, Meteor decided to go with Fibers, which allows functions to wait for the result, resulting in synchronous-style code.
There are no Fibers on the client side, so every time your client calls a server method, that call will be asynchronous (you'll have to worry about callbacks).
Your code is server-side code, and thanks to Fibers you can be sure that snippet B will only run after snippet A.

How to insert similar value into multiple locations of a psycopg2 query statement using dict? [duplicate]

I have a Python script that runs a pgSQL file through SQLAlchemy's connection.execute function. Here's the block of code in Python:
results = pg_conn.execute(sql_cmd, beg_date = datetime.date(2015,4,1), end_date = datetime.date(2015,4,30))
And here's one of the places where the variable gets used in my SQL:
WHERE
( dv.date >= %(beg_date)s AND
dv.date <= %(end_date)s)
When I run this, I get a cryptic Python error:
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) argument formats can't be mixed
…followed by a huge dump of the offending SQL query. I've run this exact code with the same variable convention before. Why isn't it working this time?
I encountered a similar issue to Nikhil's. I have a query with LIKE clauses which worked until I modified it to include a bind variable, at which point I received the following error:
DatabaseError: Execution failed on sql '...': argument formats can't be mixed
The solution is not to give up on the LIKE clause; it would be pretty odd if psycopg2 simply didn't permit LIKE clauses. Rather, we can escape the literal % as %%. For example, the following query:
SELECT *
FROM people
WHERE start_date > %(beg_date)s
AND name LIKE 'John%';
would need to be modified to:
SELECT *
FROM people
WHERE start_date > %(beg_date)s
AND name LIKE 'John%%';
More details in the psycopg2 docs: http://initd.org/psycopg/docs/usage.html#passing-parameters-to-sql-queries
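For illustration (the connection details and table here are made up), the doubled %% lets psycopg2 keep the LIKE pattern literal while still binding the named parameter:

import datetime
import psycopg2

# Hypothetical connection and table, for illustration only.
conn = psycopg2.connect(dbname="mydb")
cur = conn.cursor()
cur.execute(
    "SELECT * FROM people "
    "WHERE start_date > %(beg_date)s "
    "AND name LIKE 'John%%';",          # literal % doubled because parameters are bound
    {"beg_date": datetime.date(2015, 4, 1)},
)
rows = cur.fetchall()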
As it turned out, I had used a SQL LIKE operator in the new SQL query, and its % wildcard was interfering with Python's parameter escaping. For instance:
dv.device LIKE 'iPhone%' or
dv.device LIKE '%Phone'
Another answer offered a way to un-escape and re-escape, which I felt would add unnecessary complexity to otherwise simple code. Instead, I used PostgreSQL's regex support to modify the SQL query itself. This changed the above portion of the query to:
dv.device ~ E'iPhone.*' or
dv.device ~ E'.*Phone$'
So for others: you may need to change your LIKE operators to regex '~' to get it to work. Just remember that it'll be WAY slower for large queries. (More info here.)
For me, it turned out I had a % in a SQL comment:
/* Any future change in the testing size will not require
a change here... even if we do a 100% test
*/
This works fine:
/* Any future change in the testing size will not require
a change here... even if we do a 100pct test
*/

Intersystems Cache - Maintaining Object Code to ensure Data is Compliant with Object Definition

I am new to using InterSystems Caché and face an issue where I am querying data stored in Caché, exposed by classes which do not seem to accurately represent the data in the underlying system. The data stored in the globals is almost always larger than what is defined in the object code.
As such I get errors like the one below very frequently.
Msg 7347, Level 16, State 1, Line 2
OLE DB provider 'MSDASQL' for linked server 'cache' returned data that does not match expected data length for column '[cache]..[namespace].[tablename].columname'. The (maximum) expected data length is 5, while the returned data length is 6.
Does anyone have any experience with implementing some type of quality process to ensure that the object definitions (SQL mappings) are maintained in such a way that they can accommodate the data being persisted in the globals?
Property columname As %String(MAXLEN = 5, TRUNCATE = 1) [ Required, SqlColumnNumber = 2, SqlFieldName = columname ];
In this particular example the system has the column defined with a maximum length of 5, but the data stored in the system is 6 characters long.
How can I proactively monitor and repair such situations?
/*
I did not create these object definitions in cache
*/
It's not completely clear what "monitor and repair" would mean for you, but:
How much control do you have over the database side? Cache runs code for a data-type on converting from a global to ODBC using the LogicalToODBC method of the data-type class. If you change the property types from %String to your own class, AppropriatelyNamedString, then you can override that method to automatically truncate. If that's what you want to do. It is possible to change all the %String property types programmatically using the %Library.CompiledClass class.
It is also possible to run code within Cache to find records with properties that are above the (somewhat theoretical) maximum length. This obviously would require full table scans. It is even possible to expose that code as a stored procedure.
Again, I don't know what exactly you are trying to do, but those are some options. They probably do require getting deeper into the Cache side than you would prefer.
As far as preventing the bad data in the first place, there is no general answer. Cache allows programmers to directly write to the globals, bypassing any object or table definitions. If that is happening, the code doing so must be fixed directly.
Edit: Here is code that might work in detecting bad data. It might not work if you are doing certain funny stuff, but it worked for me. It's kind of ugly because I didn't want to break it up into methods or tags. This is meant to run from a command prompt, so it would probably have to be modified for your purposes.
{
    S ClassQuery=##CLASS(%ResultSet).%New("%Dictionary.ClassDefinition:SubclassOf")
    I 'ClassQuery.Execute("%Library.Persistent") b q
    While ClassQuery.Next(.sc) {
        If $$$ISERR(sc) b Quit
        S ClassName=ClassQuery.Data("Name")
        I $E(ClassName)="%" continue
        S OneClassQuery=##CLASS(%ResultSet).%New(ClassName_":Extent")
        I '$IsObject(OneClassQuery) continue //may not exist
        try {
            I 'OneClassQuery.Execute() D OneClassQuery.Close() continue
        }
        catch
        {
            D OneClassQuery.Close()
            continue
        }
        S PropertyQuery=##CLASS(%ResultSet).%New("%Dictionary.PropertyDefinition:Summary")
        K Properties
        s sc=PropertyQuery.Execute(ClassName) I 'sc D PropertyQuery.Close() continue
        While PropertyQuery.Next()
        {
            s PropertyName=$G(PropertyQuery.Data("Name"))
            S PropertyDefinition=""
            S PropertyDefinition=##CLASS(%Dictionary.PropertyDefinition).%OpenId(ClassName_"||"_PropertyName)
            I '$IsObject(PropertyDefinition) continue
            I PropertyDefinition.Private continue
            I PropertyDefinition.SqlFieldName=""
            {
                S Properties(PropertyName)=PropertyName
            }
            else
            {
                I PropertyName'="" S Properties(PropertyDefinition.SqlFieldName)=PropertyName
            }
        }
        D PropertyQuery.Close()
        I '$D(Properties) continue
        While OneClassQuery.Next(.sc2) {
            B:'sc2
            S ID=OneClassQuery.Data("ID")
            Set OneRowQuery=##class(%ResultSet).%New("%DynamicQuery:SQL")
            S sc=OneRowQuery.Prepare("Select * FROM "_ClassName_" WHERE ID=?") continue:'sc
            S sc=OneRowQuery.Execute(ID) continue:'sc
            I 'OneRowQuery.Next() D OneRowQuery.Close() continue
            S PropertyName=""
            F S PropertyName=$O(Properties(PropertyName)) Q:PropertyName="" d
            . S PropertyValue=$G(OneRowQuery.Data(PropertyName))
            . I PropertyValue'="" D
            .. S PropertyIsValid=$ZOBJClassMETHOD(ClassName,Properties(PropertyName)_"IsValid",PropertyValue)
            .. I 'PropertyIsValid W !,ClassName,":",ID,":",PropertyName," has invalid value of "_PropertyValue
            .. //I PropertyIsValid W !,ClassName,":",ID,":",PropertyName," has VALID value of "_PropertyValue
            D OneRowQuery.Close()
        }
        D OneClassQuery.Close()
    }
    D ClassQuery.Close()
}
The simplest solution is to increase the MAXLEN parameter to 6 or larger. Caché only enforces MAXLEN and TRUNCATE when saving. Within other Caché code this is usually fine, but unfortunately ODBC clients tend to expect this to be enforced more strictly. The other option is to write your SQL like SELECT LEFT(columnname, 5)...
The simplest solution, which I use for all Integration Services packages for example, is to create a query that casts all nvarchar or char data to the correct length. In this way, my data never fails due to truncation.
Optional:
First run a query like: SELECT MAX(DATALENGTH(mycolumnname)) FROM cachenamespace.tablename
Your new query:
SELECT CAST(mycolumnname AS varchar(6)) AS mycolumnname,
CONVERT(varchar(8000), memo_field) AS memo_field
FROM cachenamespace.tablename
Your pain of getting the data will be lessened but not eliminated.
If you use any type of OLE DB provider, or if you use OPENQUERY in SQL Server,
the casts must occur in the query sent to the InterSystems Caché db, not in the outer query that retrieves data from the inner OPENQUERY.

Can the Sequence of RecordSets in a Multiple RecordSet ADO.Net resultset be determined, controlled?

I am using code similar to this Support / KB article to return multiple recordsets to my C# program.
But I don't want the C# code to be dependent on the physical sequence of the recordsets returned in order to do its job.
So my question is: "Is there a way to determine which set of records from a multiple-recordset resultset I am currently processing?"
I know I could probably decipher this indirectly by looking for a unique column name or something per resultset, but I think/hope there is a better way.
P.S. I am using Visual Studio 2008 Pro & SQL Server 2008 Express Edition.
No, because the SqlDataReader is forward only. As far as I know, the best you can do is open the reader with KeyInfo and inspect the schema data table created with the reader's GetSchemaTable method (or just inspect the fields, which is easier, but less reliable).
I spent a couple of days on this. I ended up just living with the physical order dependency. I heavily commented both the code method and the stored procedure with !!!IMPORTANT!!!, and included an #If...#End If to output the result sets when needed to validate the stored procedure output.
The following code snippet may help you.
Helpful Code
Dim fContainsNextResult As Boolean
Dim oReader As DbDataReader = Nothing
oReader = Me.SelectCommand.ExecuteReader(CommandBehavior.CloseConnection Or CommandBehavior.KeyInfo)

#If DEBUG_ignore Then
    'load method of data table internally advances to the next result set
    'therefore, must check to see if reader is closed instead of calling next result
    Do
        Dim oTable As New DataTable("Table")
        oTable.Load(oReader)
        oTable.WriteXml("C:\" + Environment.TickCount.ToString + ".xml")
        oTable.Dispose()
    Loop While oReader.IsClosed = False

    'must re-open the connection
    Me.SelectCommand.Connection.Open()

    'reload data reader
    oReader = Me.SelectCommand.ExecuteReader(CommandBehavior.CloseConnection Or CommandBehavior.KeyInfo)
#End If

Do
    Dim oSchemaTable As DataTable = oReader.GetSchemaTable
    '!!!IMPORTANT!!! PopulateTable expects the result sets in a specific order
    ' Therefore, if you suddenly start getting exceptions that only a novice would make
    ' the stored procedure has been changed!
    PopulateTable(oReader, oDatabaseTable, _includeHiddenFields)
    fContainsNextResult = oReader.NextResult
Loop While fContainsNextResult
Because you're explicitly stating the order in which to execute the SQL statements, the results will appear in that same order. In any case, if you want to programmatically determine which recordset you're processing, you still have to identify some columns in the result.