Temporary tables and connection pool - jboss

I use connection pooling provided by JBoss 7 and I do a connection.close() after every action.
But when I create a temporary table, it does not remain for the current user because he is using a new session from the database (due to connection.close() and pooling).
For example, I create a temp table in an action. The user changes page. The new action has to do queries in temp table but this does not exist anymore.
So, I really don't know how to provide temporary tables with this kind of architecture.

Temporary tables in combination with connection pooling means request scope for the temporary tables. You'd like to have session scope temporary tables.
I am not aware of an out-of-the box solution. So options are
Create normal tables with a prefix/suffix to associate these with a user
Use the same connection for a user throughout the whole session
Rewrite the application: Instead of temporary tables, read the data completely, and store it in the HTTP session. Possibly the database can paginate so there is no need to cache intermediate results etc.
Options 1 and 2 require cleanup, especially if the session is not shut down properly. Option 2 possibly requires many resources. So I would investigate the 3rd option, even if it seems to be the most cumbersone one.

Related

Optimize the trigger to add audit log

I have a local database which is the production database, on which all operations are being done real time. I am storing the log on each action in an audit log table in another database via trigger. It basically checks if any change is made in any of the row's column it will remove that row and add it AGAIN (which is not a good way I think as it should simply update it but due to some reasons I need to delete and add it again).
There are some tables on which operations are being done rapidly like 100s of rows are being added in database. This is slowing the process of saving the data into audit log table. Now if trigger has to like delete 100 rows and add 100 again it will affect the performance obviously and if number of rows increases it will reduce the performance more.
What should be the best practice to tackle this, I have been looking into Read Replica and Foreign Data Wrapper but as for Read Replica it's only Readable and not writable for PostgreSQL and I don't really get to know how Foreign Data Wrapper gonna help me as this was suggested by one of my colleague.
Hope someone can guide me in right direction.
A log is append-only by definition. Loggers should never be modifying or removing existing entries.
Audit logs are no different. Audit triggers should INSERT an entry for each change (however you want to define "change"). They should never UPDATE or DELETE anything*.
The change and the corresponding log entry should be written to the same database within the same transaction, to ensure atomicity/consistency; logging directly to a remote database will always leave you with a window where the log is committed but the change is not (or vice versa).
If you need to aggregate these log entries and push them to a different database, you should do it from an external process, not within the trigger itself. If you need this to happen in real time, you can inform the process of new changes via a notification channel.
* In fact, you should revoke UPDATE/DELETE privileges on the audit table from the user inserting the logs. Furthermore, the trigger should ideally be a SECURITY DEFINER function owned by a privileged user with INSERT rights on the log table. The user connecting to the database should not be given permission to write to the log table directly.
This ensures that if your client application is compromised (whether due to a malfunction, or a malicious user e.g. exploiting an SQL injection vulnerability), then your audit log retains a complete and accurate record of everything it changed.

Insert data into remote DB tables from multiple databases through trigger or replication or foreign data wrapper

I need some advice about the following scenario.
I have multiple embedded systems supporting PostgreSQL database running at different places and we have a server running on CentOS at our premises.
Each system is running at remote location and has multiple tables inside its database. These tables have the same names as the server's table names, but each system has different table name than the other systems, e.g.:
system 1 has tables:
sys1_table1
sys1_table2
system 2 has tables
sys2_table1
sys2_table2
I want to update the tables sys1_table1, sys1_table2, sys2_table1 and sys2_table2 on the server on every insert done on system 1 and system 2.
One solution is to write a trigger on each table, which will run on every insert of both systems' tables and insert the same data on the server's tables. This trigger will also delete the records in the systems after inserting the data into server. The problem with this solution is that if the connection with the server is not established due to network issue than that trigger will not execute or the insert will be wasted. I have checked the following solution for this
Trigger to insert rows in remote database after deletion
The second solution is to replicate tables from system 1 and system 2 to the server's tables. The problem with replication will be that if we delete data from the systems, it'll also delete the records on the server. I could add the alternative trigger on the server's tables which will update on the duplicate table, hence the replicated table can get empty and it'll not effect the data, but it'll make a long tables list if we have more than 200 systems.
The third solution is to write a foreign table using postgres_fdw or dblink and update the data inside the server's tables, but will this effect the data inside the server when we delete the data inside the system's table, right? And what will happen if there is no connectivity with the server?
The forth solution is to write an application in python inside each system which will make a connection to server's database and write the data in real time and if there is no connectivity to the server than it will store the data inside the sys1.table1 or sys2.table2 or whatever the table the data belongs and after the re-connect, the code will send the tables data into server's tables.
Which option will be best according to this scenario? I like the trigger solution best, but is there any way to avoid the data loss in case of dis-connectivity from the server?
I'd go with the fourth solution, or perhaps with the third, as long as it is triggered from outside the database. That way you can easily survive connection loss.
The first solution with triggers has the problems you already detected. It is also a bad idea to start potentially long operations, like data replication across a network of uncertain quality, inside a database transaction. Long transactions mean long locks and inefficient autovacuum.
The second solution may actually also be an option if you you have a recent PostgreSQL versions that supports logical replication. You can use a publication WITH (publish = 'insert,update'), so that DELETE and TRUNCATE are not replicated. Replication can deal well with lost connectivity (for a while), but it is not an option if you want the data at the source to be deleted after they have been replicated.

libpq code to create, list and delete databases (C++/VC++, PostgreSQL)

I am new to the PostgreSQL database. What my visual c++ application needs to do is to create multiple tables and add/retrieve data from them.
Each session of my application should create a new and distinct database. I can use the current date and time for a unique database name.
There should also be an option to delete all the databases.
I have worked out how to connect to a database, create tables, and add data to tables. I am not sure how to make a new database for each run or how to retrieve number and name of databases if user want to clear all databases.
Please help.
See the libpq examples in the documentation. The example program shows you how to list databases, and in general how to execute commands against the database. The example code there is trivial to adapt to creating and dropping databases.
Creating a database is a simple CREATE DATABASE SQL statement, same as any other libpq operation. You must connect to a temporary database (usually template1) to issue the CREATE DATABASE, then disconnect and make a new connection to the database you just created.
Rather than creating new databases, consider creating new schema instead. Much less hassle, since all you need to do is change the search_path or prefix your table references, you don't have to disconnect and reconnect to change schemas. See the documentation on schemas.
I question the wisdom of your design, though. It is rarely a good idea for applications to be creating and dropping databases (or tables, except temporary tables) as a normal part of their operation. Maybe if you elaborated on why you want to do this, we can come up with solutions that may be easier and/or perform better than your current approach.

report migration from SQL server to Oracle

I have a report in SQL server and I am migrating this to Oracle.
The approach I used in SQL server is load sum(sales) , person for given month into temporary tables (hash tables) and use this table to join with other transaction tables show the details, but when it comes to oracle I am not sure if I can use the same method here, because hash tables (temporary tables in SQL server) are specific to session and might not create any problem with output, please advise if there is anything in oracle which is analogous to that.
I came to know there are global temp tables in oracle, do they work in the manner I mentinoed above, also
If a user has no create/drop table privileges can they still use gloabal temp tables?
please help me.
You'll have to show some code or atleast some pseudo-code of how your process runs for anyone to help you. Having said that...
One thing that is different in oracle compared to temporary tables in other databases is that you do not create them each time you need them. You create them once and the data in the table is present either until you commit/rollback (transaction based) or until you end your session (session-based global temporary tables). Also, The data in a temporary table is visible only to the session that inserts the data into the table..
If you are generating the output files once and you don't need that data later, then Global temporary tables would probably fit in cleanly, with some minor changes.
Since you do not create the temporary tables each time you use them, you don't need the create/drop privilege. All you'd need is the insert/read privilege. Just read will not help because you cannot read another session's data anyways, so there is no use for it.

What are the major challenges of building an iPhone application that synchronizes data with a server via web APIs?

I want to build an application that utilizes the data from a server, and it needs to synchronize the data in the application with the data entered by other client applications.
So, there are some questions:
How to design the database schema efficiently? Should it replicate the same database schema on the server or should it add some more fields & entities?
What are the strategies to synchronize the data, on each application start or during some idle state of the application, or something else...
How to handle conflict of the data entered by the user within the application and data enter ed by another client application.
Any response is welcomed.
Well, you've identified the main challenges in your original question. The real answer is that this has little to do with the iPhone - database replication is just really hard.
Here are some rules of thumb I can offer:
one-way replication of data is a million times easier than two-way replication, if you can get away with it.
replication is always easier if the database schema is identical on the client and the server.
to do two-way replication, you either need to store timestamps for each row on each end, or to store the complete contents of one end on the other end. (ie. the server needs to know the client's most recent status, or the client needs to know the server's most recent status).
to allow adding rows from disconnected clients, you need to identify your rows using a GUID (or hash, eg. SHA-1), not an autoincrement field. It's possible to keep new client-added rows as "identifierless" until you sync them with the server, but that way lies madness.
there is no actual good way to do conflict resolution. The imperfect options include last-writer-wins (last person who syncs a modified record gets their copy of the record inserted), three-way-merge (when someone sends a modified record, check which columns they have changed, and change only those columns, thus not overwriting any changes to other columns), split-into-two-records (if two people make changes to the same record, just make two records and assume someone will fix it eventually), and "ask the user" (which is technically the most sound, but requires a lot of UI work and users rarely understand what a conflict even is).