Behaviour of fetchrow_hashref in Perl

I am trying to execute a stored procedure from Perl and store the results in a text file (on Windows). I am using DBI's fetchrow_hashref() to fetch row results. The stored procedure I am trying to execute returns more than 5 million rows. I want to understand what happens behind the scenes, particularly during the fetchrow_hashref() call. For example: Perl executes the procedure, the procedure returns all the affected rows and keeps them in a pool (on the database side, or on the calling machine's side?), and then Perl fetches the rows from the result set one by one. Does it happen that way, or some other way?

This is a difficult question to answer as you've not said which Perl database driver you are using. I'm assuming you are using DBD::ODBC and the MS SQL Server ODBC Driver in this answer.
When you call prepare on the SQL calling the procedure, the ODBC driver sends it to MS SQL Server, where the procedure is parsed. On calling execute, the procedure is started (how you progress through the procedure depends on a lot of things). Assuming the first thing in your procedure is a select, a cursor will be created for the query and MS SQL Server will start sending the rows back to the ODBC driver (it uses the TDS protocol). In the meantime DBD::ODBC will make ODBC calls to the driver which tell it there is a result-set (SQLNumResultCols returns a non-zero value). DBD::ODBC will then query the driver for the types of the columns in the result-set and bind them (SQLBindCol).
Each time you call fetchrow_hashref, DBD::ODBC calls SQLFetch; the ODBC driver reads a row from the socket and copies the data into the bound buffers.
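In Perl terms, a minimal sketch of that sequence might look like the following; the DSN, procedure name, and output path are invented for illustration:

    #!/usr/bin/perl
    # Sketch of the prepare/execute/fetch flow described above.
    # DSN, procedure name, and output file are hypothetical.
    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:ODBC:DSN=mydsn', 'user', 'pass',
                           { RaiseError => 1 });

    my $sth = $dbh->prepare('{call my_big_proc}');   # parsed on the server
    $sth->execute();                                 # procedure starts running

    open my $out, '>', 'results.txt' or die "cannot open results.txt: $!";
    # Each fetchrow_hashref maps to one SQLFetch; the driver copies the row
    # from its bound column buffers into a fresh Perl hash.
    while (my $row = $sth->fetchrow_hashref) {
        print {$out} join("\t", map { $row->{$_} // '' } sort keys %$row), "\n";
    }
    close $out or die "cannot close results.txt: $!";
    $dbh->disconnect;

Note that fetchrow_hashref builds a new hash for every row; with 5 million rows, the DBI documentation's advice to prefer bind_columns with fetchrow_arrayref in speed-critical loops is worth following.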
There are important things to realise here. By default MS SQL Server will write a lot of rows to the socket up front, even though the ODBC driver has probably not read them yet. As a result, if you close your statement early, the driver has to read all those pending rows from the socket and throw them away. If you use a non-default cursor or enable Multiple Active Statements (MAS) in the driver, then rows are sent back to the driver one at a time, so the ODBC driver can ask the server to move forward or backward in the result-set, or request a row from result-set 1 and then result-set 2.
There are other areas that are a little unusual when using procedures, such as whether NOCOUNT is enabled or not, and the fact that you progress through your procedure's statements using SQLMoreResults (odbc_more_results in DBD::ODBC). Also, a procedure's output parameters are not available until SQLMoreResults returns false.
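As a hedged illustration (the procedure name and output parameter are invented), walking through all of a procedure's result-sets with DBD::ODBC looks roughly like this:

    # Sketch: drain every result-set, then read an output parameter.
    # Procedure name and parameter are hypothetical.
    my $retval;
    my $sth = $dbh->prepare('{call my_proc(?)}');
    $sth->bind_param_inout(1, \$retval, 100);    # bind the output parameter
    $sth->execute();

    do {
        while (my $row = $sth->fetchrow_hashref) {
            # process each row of the current result-set here
        }
    } while ($sth->{odbc_more_results});         # advances via SQLMoreResults

    # Only now, after SQLMoreResults has returned false, is $retval set.
    print "output parameter: $retval\n";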
You may find the article Multiple Active Statements (MAS) and DBD::ODBC of some interest, or maybe some of the other articles there. You may also want to read about the TDS protocol.

Related

C# and PostgreSQL Single Row Mode

When streaming large volumes of data out of PostgreSQL into C#, using Npgsql, does the command default to Single Row Mode or will it extract the entire result set before returning? I can find no mention of Single Row Mode in the Npgsql documentation, and nothing in the source code to suggest that it is optional one way or the other.
When Npgsql sends the SQL query you give it, PostgreSQL will immediately send back all the rows. If you pass CommandBehavior.SingleRow (or SingleResult) to NpgsqlCommand.ExecuteReader, Npgsql will simply not return those rows to the user; it will consume them internally, but they are still sent from the server. In other words, if you expect these options to reduce the network bandwidth used, that won't work; your only way to do that is to limit the resultset in the SQL itself, via a LIMIT clause. This is in general a better idea anyway.
See https://github.com/npgsql/npgsql/issues/410 for a bit more detail on why we didn't implement something more aggressive.
From my experience, the default in Npgsql is to stream the result set, fetching rows as you process them: basically, when you invoke reader.Read() you get a row from the server into the client driver. There might be some buffering taking place, but streaming the result is the norm.

What would happen if I run two SQL commands using the same DB connection?

I'm writing a program to run mass calculation and output results into PostgreSQL.
My platform is Windows Server 2008, PostgreSQL 10. My program is written in C.
The results are produced group by group; the completion of each group creates an extra thread to write the output.
Since the output threads are created one by one, it is possible that two or more SQL commands will be issued simultaneously, or that a previous one is still being processed when a new thread calls the function.
So my questions are:
(1) What would happen if one thread is in the middle of SQL processing and another thread calls PQexec(PGconn *conn, const char *query)? Would they affect each other?
(2) What if I use a separate PGconn for each thread? Would that speed things up?
If you try to call PQexec on a connection that is already executing an SQL statement, you will cause a protocol violation. That just doesn't work.
Processing could certainly be made faster if you use several database connections in parallel; concurrent transactions are something that PostgreSQL is designed for.
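As a hedged sketch of the one-connection-per-worker pattern (in Perl with DBD::Pg to match the rest of this page; the table and connection details are invented), it looks like this. The same rule holds for a C program using libpq: give each thread or process its own PGconn and never share a live connection between them.

    #!/usr/bin/perl
    # One PostgreSQL connection per worker process (hypothetical details).
    use strict;
    use warnings;
    use DBI;

    my @groups = (1 .. 4);    # stand-in for the calculation groups
    my @pids;

    for my $group (@groups) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: open its OWN connection; sharing one connection
            # between concurrent workers is what breaks the protocol.
            my $dbh = DBI->connect('dbi:Pg:dbname=results', 'user', 'pass',
                                   { RaiseError => 1, AutoCommit => 1 });
            $dbh->do('INSERT INTO output (group_id, payload) VALUES (?, ?)',
                     undef, $group, "result for group $group");
            $dbh->disconnect;
            exit 0;
        }
        push @pids, $pid;
    }
    waitpid($_, 0) for @pids;    # wait for all writers to finish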

How to count the number of queries executed in postgres?

I have 2 SQL files, each containing 100 queries.
I need to execute the first 10 queries from the first SQL file, then the first 10 queries from the second SQL file. After those 10 queries from the second file have executed, the 11th query of the first file should start executing.
Is there a way to keep count of how many queries have completed?
How can I pause the query execution in the first file and resume it after a certain number of queries have completed?
You can't do this with the psql command-line client; its include-file handling is limited to reading the file and sending the whole contents to the server query by query.
You'll want to write a simple Perl or Python script, using DBD::Pg (Perl) or psycopg2 (Python) that reads the input files and sends queries.
Splitting the input means parsing the SQL, which requires a bit of care: you can't just split into queries on semicolons. You must handle quoted "identifier"s and 'literal's, as well as E'escape literals' and $dollar$ dollar quoting $dollar$. You may be able to find existing code to help you with this, or use functionality from the database client driver to do it.
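For example, here is a hedged Perl/DBD::Pg sketch of such a script. The file names are invented, and the splitter is the naive split-on-semicolon-at-end-of-line kind, so it does not handle the quoting cases just mentioned; treat it as a starting point only.

    #!/usr/bin/perl
    # Interleave two SQL files in batches of 10 queries (naive splitter).
    use strict;
    use warnings;
    use DBI;

    sub read_queries {
        my ($path) = @_;
        open my $fh, '<', $path or die "cannot open $path: $!";
        local $/;                      # slurp the whole file
        my $sql = <$fh>;
        # Naive: assumes no semicolons inside strings, identifiers,
        # or $dollar$ quotes.
        return grep { /\S/ } split /;\s*\n/, $sql;
    }

    my @first  = read_queries('first.sql');     # hypothetical file names
    my @second = read_queries('second.sql');

    my $dbh = DBI->connect('dbi:Pg:dbname=test', 'user', 'pass',
                           { RaiseError => 1, AutoCommit => 1 });

    my $batch = 10;
    my ($i, $j) = (0, 0);
    while ($i < @first or $j < @second) {
        # The counters double as the "how many have completed" count.
        for (1 .. $batch) {
            last if $i >= @first;
            $dbh->do($first[$i++]);
        }
        for (1 .. $batch) {
            last if $j >= @second;
            $dbh->do($second[$j++]);
        }
    }
    $dbh->disconnect;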
Alternatively, if you can modify the input files to insert interlock entries into them, you can potentially run them using multiple psql instances and use advisory locking to make them wait for each other at set points. For details, see the PostgreSQL documentation on explicit locking.
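The interlock itself is just SQL, so whether it is embedded in the files for psql or issued from a driver, it amounts to something like this (the lock key 42 is an arbitrary example):

    # Hedged sketch: pause here until the other session releases lock 42.
    $dbh->do('SELECT pg_advisory_lock(42)');
    # ... run this file's next batch of queries ...
    $dbh->do('SELECT pg_advisory_unlock(42)');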

Progress ABL procedure to SQL Insert

We have a software solution that involves syncing some data between a Progress database and SQL server. Unfortunately, we do not have any Progress gurus in house, so I'm working kinda blind here and would welcome any advice that is on offer.
For the workflow that is already in place, what would work very well for us is the ability to do an external call to insert a row into an SQL database from within an ABL procedure's FOR EACH loop.
Is anyone able to direct me to any code snippets or articles that might help me achieve this?
Many thanks,
In case your SQL database is MS SQL Server, you might want to have a look at OpenEdge DataServer for Microsoft SQL Server (web.progress.com/en/openedge/dataserver-microsoft.html, documentation.progress.com/output/OpenEdge102b/pdfs/dmsql/dmsql.pdf).
The DataServer provides you with ABL access to a non-Progress database so you can use standard Progress statements, e.g. CREATE to add new records or FOR EACH to retrieve query results.
OpenEdge DataServers are also available for Oracle (using Oracle Call Interface), DB2 and Sybase (using ODBC). The DataServer for MS SQL Server uses ODBC behind the scenes as well. web.progress.com/docs/datasheets/openedge/openedge_dataservers.pdf
You don't need the DataServer; a connection via ADODB works fine in ABL. You can even call stored procedures with the Command object, though the user you connect with will have to be granted EXEC rights on the SQL Server to do that.
I'm not a Progress guru, but I did do some work in it for a while. AFAIK there is no way to have ABL code connect to a non-Progress database (part of that whole vendor lock-in strategy Progress Corp. leverages).
Your best bet is probably to have the ABL code serialize the records to XML, and use something like ActiveMQ (or even a plain socket or named pipe/FIFO depending on your setup) to send them to a program written in a more capable language to do the SQL insert.

SQLAnywhere: Watcom SQL or T-SQL

A general question.
I am developing for Sybase SQL Anywhere 10. For backwards compatibility reasons, almost all of our stored procedures are written in Transact-SQL.
Are there any advantages or disadvantages to using T-SQL instead of the Watcom dialect?
Advantages of TSQL:
greater compatibility with Sybase ASE and Microsoft SQL Server
Disadvantages of TSQL:
some statements and functionality are only available in Watcom-SQL procedures. Some examples:
greater control over EXECUTE IMMEDIATE behavior in Watcom-SQL
LOAD TABLE, UNLOAD TABLE, REORGANIZE (among others) are only available in Watcom-SQL
the FOR statement for looping over the results of a query and automatically declaring variables to contain the values is very useful, but not available in TSQL
error reporting is less consistent since TSQL procedures are assumed to handle their own errors, while Watcom-SQL procedures report errors immediately. Watcom-SQL procedures can contain an EXCEPTION clause to handle errors
statements are not delimited by semi-colons, so TSQL procedures are more difficult to parse (and read). Syntax errors can sometimes fail to point to the actual location of the error
no ability to explicitly declare the result set of a procedure
no support for row-level triggers in TSQL
event handlers can only be written using Watcom-SQL
The documentation for SQL Anywhere T-SQL compatibility is available online. There are some database options that change behaviour to more closely match what you would expect from Sybase ASE. In addition, there are some functions that can be used to translate from one syntax to another.
Note that if you want to start adding statements in the Watcom dialect to an existing stored procedure, you will need to change the SP so that it is written entirely in the Watcom dialect. You cannot mix syntaxes in an SP, trigger, or batch.
What KM said - on the other hand, the "Watcom" dialect is much closer to ISO/ANSI-standard SQL, so that dialect is more likely to match other products and to appeal to people familiar with the SQL standards.
If you ever try to port to SQL Server (or go for a job working with SQL Server), Sybase T-SQL is very close to SQL Server T-SQL. Sybase and MS joined up back in the day, so the cores of those languages are very similar.