What would happen if I run two SQL commands using the same DB connection? - postgresql

I'm writing a program that runs mass calculations and outputs the results into PostgreSQL.
My platform is Windows Server 2008 with PostgreSQL 10. My program is written in C.
The results are produced group by group; when each group finishes, an extra thread is created to write the output.
Since the output threads are created one by one, it is possible that two or more SQL commands will be issued simultaneously, or that the previous one is still being processed when a new thread calls the function.
So my questions are:
(1) What happens if one thread is still processing an SQL command and another thread calls PQexec(PGconn *conn, const char *query)? Would they affect each other?
(2) What if I use a separate PGconn for each thread? Would that speed things up?

If you try to call PQexec on a connection that is in the process of executing an SQL statement, you cause a protocol violation. That just doesn't work; a single PGconn can only process one statement at a time.
Processing could certainly be made faster if you use several database connections in parallel; concurrent transactions are something that PostgreSQL is designed for.
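Here is a minimal sketch of that second approach, assuming a placeholder connection string and output table: each worker thread opens its own PGconn and calls PQexec only on that private connection, so no handle is ever shared between threads.

    /* One libpq connection per output thread: PQexec is safe here
       because no PGconn is ever shared between threads.
       Build with something like: gcc demo.c -lpq -lpthread */
    #include <libpq-fe.h>
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4   /* say, one thread per finished group */

    static void *write_group(void *arg)
    {
        int group = *(int *)arg;
        /* hypothetical connection string and table; adjust for your setup */
        PGconn *conn = PQconnectdb("host=localhost dbname=results");
        if (PQstatus(conn) != CONNECTION_OK) {
            fprintf(stderr, "connect: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return NULL;
        }
        char sql[128];
        snprintf(sql, sizeof sql,
                 "INSERT INTO output(group_id) VALUES (%d)", group);
        PGresult *res = PQexec(conn, sql);    /* private to this thread */
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "insert: %s", PQerrorMessage(conn));
        PQclear(res);
        PQfinish(conn);
        return NULL;
    }

    int main(void)
    {
        pthread_t t[NTHREADS];
        int ids[NTHREADS];
        for (int i = 0; i < NTHREADS; i++) {
            ids[i] = i;
            pthread_create(&t[i], NULL, write_group, &ids[i]);
        }
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(t[i], NULL);
        return 0;
    }

Opening a fresh connection per group is the simplest safe pattern; a small pool of persistent connections avoids the reconnect cost, as long as each PGconn is still used by only one thread at a time.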

Related

Does q/kdb+ ever act asynchronously when synchronously instructed to write tables to disk?

I've had a strange bug pop up: when I write to a partitioned table and then immediately do a select on that same table, I get an error like:
./2018.04.23/ngbarx/cunadj. OS reports: Operation not permitted.
This error does not appear if I wait a few seconds after writing the table. To me this points towards a caching situation, where q responds before an operation is complete, but as far as I know everything I am doing should be synchronous.
I would love to understand what I am doing wrong, what exactly causes this error, and which commands are executing asynchronously.
The horrible details:
I am writing from Python, connected to q synchronously using the qpython3 package.
The q session is launched with slaves, i.e. -s 4.
To write the partitioned table, I am using the unofficial function .Q.dcfgnt, which can be found here.
I write to a q session that was initialized with a database directory, as is usual when dealing with partitioned tables.
After writing the table with .Q.dcfgnt, but before doing the select, I also run .Q.chk`:/db/; system"l /db/"; .Q.cn table, in that order, just to be sure the table is up and ready to use in the q session. These might be both overkill and in the wrong order, but as far as I know they are all synchronous calls; please correct me if I am wrong.
The trigger for the error is a 10#select from table; I understand why this is a bad idea to do in general on a partitioned table, but from my understanding it shouldn't be triggering the particular error that I am getting.

DB2 LUW Parallel Jobs Execution

I have been working in a DB2 LUW database, and I want to submit procedures as parallel jobs. I have a procedure which runs some DDL and DML statements against one table. The table holds a huge amount of data, and the same procedure needs to run against a few more tables in parallel.
I submit each job using the DBMS_JOB.SUBMIT statement and execute it using the DBMS_JOB.RUN statement. I have a job-handler procedure which helps to do this in parallel.
But the jobs execute sequentially: the first job completes, then the second job starts; after the second job completes, the third one starts.
**My first question**
How do I run DBMS_JOB jobs in parallel?
The second issue I'm facing is that the current session keeps waiting until all the jobs are complete. I can't use that particular session; only once all the jobs have completed can I use it again.
**My second question**
How do I keep the session accessible instead of having it wait for all jobs to complete?
Please help me sir/madam.
DBMS_JOB is an interface to the Administrative Task Scheduler (ATS) of Db2-LUW, provided for some compatibility with Oracle RDBMS. However, you can also use the ATS directly, independently of DBMS_JOB, via ADMIN_TASK_ADD and related procedures.
My experience is that db2acd (the process that implements autonomic actions, including the ATS) is unreliable, especially when ulimits are misconfigured, and in some circumstances it silently won't run jobs. It also wakes up only every 5 minutes to check for new jobs, which can be frustrating, and it requires an already activated database, which is inconvenient for some use cases.
I would not recommend usage of the Db2 ATS for application layer functionality. Full function enterprise schedulers exist for good reasons.
For parallel invocations, I would use an enterprise scheduling tool if available, or failing that, use the scheduler supplied by the operating system, either on the Db2-server or at worst on the client side, taking care in both cases that each stored-procedure invocation is its own scheduled job with its own Db2-connection.
By using a Db2-connection per stored-procedure invocation, and concurrently scheduling them, they run in parallel as long as their actions don't cause mutual contention.
Apart from the above, I believe the ATS will start jobs in parallel provided that the job definitions are correct.
Examine the contents of both ADMIN_TASK_LIST and ADMIN_TASK_STATUS administrative views, and corroborate with db2diag entries (diaglevel 4 may give more detail, even if you must use it only temporarily).
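For completeness, a direct ATS registration might look like the sketch below, here driven from a C/CLI client. The database, credentials, schema, and procedure names are placeholders, and the ten-argument form of ADMIN_TASK_ADD follows IBM's documentation, so verify it against your Db2 level before relying on it.

    /* Sketch (C + Db2 CLI): register two one-shot ATS tasks that call
       different stored procedures, so the scheduler can start them in
       parallel. Error handling is omitted for brevity; the ATS must be
       enabled and the database activated for the tasks to run. */
    #include <sqlcli1.h>

    int main(void)
    {
        SQLHANDLE env, dbc, stmt;
        SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
        SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc);
        SQLConnect(dbc, (SQLCHAR *)"SAMPLE", SQL_NTS,   /* placeholder DB */
                   (SQLCHAR *)"user", SQL_NTS, (SQLCHAR *)"pass", SQL_NTS);
        SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt);

        /* A NULL schedule means: run once at begin_timestamp (per the docs). */
        SQLExecDirect(stmt, (SQLCHAR *)
            "CALL SYSPROC.ADMIN_TASK_ADD('JOB1', CURRENT_TIMESTAMP, NULL,"
            " NULL, NULL, 'MYSCHEMA', 'UPDATE_T1', NULL, NULL, NULL)", SQL_NTS);
        SQLExecDirect(stmt, (SQLCHAR *)
            "CALL SYSPROC.ADMIN_TASK_ADD('JOB2', CURRENT_TIMESTAMP, NULL,"
            " NULL, NULL, 'MYSCHEMA', 'UPDATE_T2', NULL, NULL, NULL)", SQL_NTS);

        /* Afterwards, query SYSTOOLS.ADMIN_TASK_STATUS to corroborate. */
        SQLFreeHandle(SQL_HANDLE_STMT, stmt);
        SQLDisconnect(dbc);
        SQLFreeHandle(SQL_HANDLE_DBC, dbc);
        SQLFreeHandle(SQL_HANDLE_ENV, env);
        return 0;
    }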
Calls to SQL PL (or PL/SQL) stored procedures are synchronous relative to the caller, which means that the Db2-connection is blocked until the stored procedure returns. You cannot "make the session accessible" if it is waiting for a stored procedure to complete, but you can open a new connection.
Different options exist for stored procedures that are written in C, C++, Java, or C++/CLR; they have more freedom. Other options exist for messaging/broker-based solutions. Much depends on available skillsets, toolsets, and experience. But in general it's wiser to keep things simple.

How to count the number of queries executed in postgres?

I have 2 SQL files, each containing 100 queries.
I need to execute the first 10 queries from the first SQL file and then the first 10 queries from the 2nd SQL file. After those 10 queries from the 2nd file have executed, the 11th query of the 1st file should start executing.
Is there a way to keep count of how many queries have completed?
How do I pause the query execution in the 1st file and resume it after a certain number of queries have completed?
You can't do this with the psql command line client; its include-file handling is limited to reading the file and sending the whole contents to the server query by query.
You'll want to write a simple Perl or Python script, using DBD::Pg (Perl) or psycopg2 (Python), that reads the input files and sends queries.
Splitting the input requires parsing the SQL, which requires a bit of care: you can't just split into queries on semicolons. You must handle quoted "identifier"s and 'literal's, as well as E'escape literals' and $dollar$ quoting $dollar$. You may be able to find existing code to help you with this, or use functionality from the database client driver to do it.
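To make the shape of such a script concrete, below is a rough C/libpq sketch of the counting and interleaving logic. It sidesteps the parsing problem by assuming exactly one statement per line; the file names and connection string are placeholders, and a Perl or Python version would have the same structure.

    /* Sketch: alternate between two SQL files, 10 statements at a time.
       Assumes exactly one statement per line, which avoids real SQL
       splitting; error handling is minimal. */
    #include <libpq-fe.h>
    #include <stdio.h>
    #include <string.h>

    /* Run up to n statements from f; return how many were executed. */
    static int run_batch(PGconn *conn, FILE *f, int n)
    {
        char line[4096];
        int done = 0;
        while (done < n && fgets(line, sizeof line, f)) {
            if (line[strspn(line, " \t\r\n")] == '\0')
                continue;                     /* skip blank lines */
            PGresult *res = PQexec(conn, line);
            PQclear(res);                     /* real code checks the status */
            done++;
        }
        return done;
    }

    int main(void)
    {
        PGconn *conn = PQconnectdb("dbname=test");   /* placeholder */
        FILE *a = fopen("first.sql", "r");           /* placeholder names */
        FILE *b = fopen("second.sql", "r");
        if (!a || !b || PQstatus(conn) != CONNECTION_OK)
            return 1;
        int ran_a = 1, ran_b = 1, total = 0;
        while (ran_a > 0 || ran_b > 0) {
            ran_a = run_batch(conn, a, 10);   /* 10 from file 1 ... */
            ran_b = run_batch(conn, b, 10);   /* ... then 10 from file 2 */
            total += ran_a + ran_b;
        }
        printf("%d statements executed\n", total);
        fclose(a); fclose(b); PQfinish(conn);
        return 0;
    }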
Alternately, if you can modify the input files to insert extra statements into them, you can potentially run them using multiple psql instances and use advisory locking as an interlock, so that they wait for each other at set points. For details see explicit locking.

Multiple threads in db2luw

I am very new to DB2, and I have a question. I have developed a few procedures which perform some operations on a DB2 database. My question is how to create multiple threads on the DB2 server concurrently. I have a database with 70,000 tables, each having more than 1,000 records. I have a procedure which will update all these 70,000 tables, so time consumption is the main factor here. I want to divide my update statement into 10 threads, where each thread updates 7,000 tables, and I want to run all 10 threads simultaneously.
Can someone kindly let me know the way to achieve this?
DB2 Express-C on Windows.
There's nothing in DB2 for creating multiple threads.
The enterprise level version of DB2 will automatically process a single statement across multiple cores when and where needed. But that's not what you're asking for.
I don't believe any SQL-based RDBMS allows a stored procedure to create its own threads. The whole point of SQL is that it's a higher level of abstraction; you don't have access to those kinds of details.
You'll need to write an external app, in a language that supports threads, that opens 10 connections to the DB simultaneously. But depending on the specifics of the update you're doing and the hardware you have, you might find that 10 connections is too many.
To elaborate on Charles's correct answer, it is up to the client application to parallelize its DML workload by opening multiple connections to the database. You could write such a program on your own, but many ETL utilities provide components that enable parallel workflows similar to what you've described. Aside from reduced programming, another advantage of using an ETL tool to define and manage a multi-threaded database update is built-in exception handling, making it easier to roll back all of the involved connections if any of them encounter an error.
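If you end up writing that external app yourself, a skeleton in C using the Db2 CLI might look like the sketch below. The credentials and the UPDATE_SLICE procedure are hypothetical; the important point is simply one private connection per thread.

    /* Ten worker threads, each with a private Db2 CLI connection, each
       calling the update procedure for its own slice of the tables. */
    #include <sqlcli1.h>
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 10

    static void *worker(void *arg)
    {
        int slice = *(int *)arg;
        SQLHANDLE env, dbc, stmt;
        SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
        SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc);
        SQLConnect(dbc, (SQLCHAR *)"SAMPLE", SQL_NTS,   /* placeholders */
                   (SQLCHAR *)"user", SQL_NTS, (SQLCHAR *)"pass", SQL_NTS);
        SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt);

        char call[128];
        /* hypothetical procedure that updates slice n of the 70,000 tables */
        snprintf(call, sizeof call, "CALL MYSCHEMA.UPDATE_SLICE(%d)", slice);
        SQLExecDirect(stmt, (SQLCHAR *)call, SQL_NTS);

        SQLFreeHandle(SQL_HANDLE_STMT, stmt);
        SQLDisconnect(dbc);
        SQLFreeHandle(SQL_HANDLE_DBC, dbc);
        SQLFreeHandle(SQL_HANDLE_ENV, env);
        return NULL;
    }

    int main(void)
    {
        pthread_t t[NTHREADS];
        int ids[NTHREADS];
        for (int i = 0; i < NTHREADS; i++) {
            ids[i] = i;
            pthread_create(&t[i], NULL, worker, &ids[i]);
        }
        for (int i = 0; i < NTHREADS; i++)
            pthread_join(t[i], NULL);
        return 0;
    }

As Charles notes, start smaller than 10 connections and measure; contention on the server may make fewer connections faster.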

Multistatement Queries in Postgres

I'm looking to send multiple read queries to a Postgres database in order to reduce the number of trips that need to be made to a painfully remote database. Is there anything in libpq that supports this behavior?
Yes, you can use the asynchronous handling functions in libpq. On the linked page it says:
Using PQsendQuery and PQgetResult solves one of PQexec's problems: If a command string contains multiple SQL commands, the results of those commands can be obtained individually. (This allows a simple form of overlapped processing, by the way: the client can be handling the results of one command while the server is still working on later queries in the same command string.)
For example, you should be able to call PQsendQuery with a string containing multiple queries, then repeatedly call PQgetResult to get the result sets. PQgetResult returns NULL when there are no more result sets to obtain.
If desired, you can also avoid your application blocking while it waits for these queries to execute (described in more detail on the linked page).
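A minimal sketch of that pattern, assuming a placeholder connection string and tables:

    /* Send several queries in one round trip, then drain the results. */
    #include <libpq-fe.h>
    #include <stdio.h>

    int main(void)
    {
        PGconn *conn = PQconnectdb("host=remote dbname=test"); /* placeholder */
        if (PQstatus(conn) != CONNECTION_OK) {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            return 1;
        }

        /* One network round trip carries all three statements. */
        if (!PQsendQuery(conn,
                "SELECT count(*) FROM t1; "
                "SELECT count(*) FROM t2; "
                "SELECT count(*) FROM t3;")) {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            return 1;
        }

        /* PQgetResult yields one PGresult per statement,
           then NULL once the whole command string is done. */
        PGresult *res;
        while ((res = PQgetResult(conn)) != NULL) {
            if (PQresultStatus(res) == PGRES_TUPLES_OK)
                printf("rows: %s\n", PQgetvalue(res, 0, 0));
            PQclear(res);
        }
        PQfinish(conn);
        return 0;
    }

Note that this differs from plain PQexec with the same string, which would discard all but the last result set.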