I have a PostgreSQL database containing a table with several 'timestamp with time zone' fields.
I have a tool (DBSync) that I want to use to transfer the contents of this table to another server/database.
When I transfer the data to an MSSQL server, all datetime values are replaced with '1753-01-01'. When I transfer the data to a PostgreSQL database, all datetime values are replaced with '0001-01-01'. In each case that is the smallest possible date for the target system.
I then recreated the source table (including contents) in a different database on the same PostgreSQL server. The only difference: the source table is in a different database. Same server, same routing; only the ports are different.
The user is different, but I have the same rights in each database.
How can it be that the database is responsible for an apparently different interpretation of the data? Do PostgreSQL databases have database-specific settings that can cause such behaviour? Which database settings can/should I check?
To be clear, I am not looking for another way to transfer the data; I have several available. What I am trying to understand is: how can it be that an application reading datetime info from table A in database Y on server X gives me the wrong date, while reading the same table from database Z on server X gives me the data as it should be?
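A generic sketch of how per-database overrides and session-level values can be inspected (nothing here is specific to my setup):

-- List per-database / per-role overrides created with ALTER DATABASE/ROLE ... SET
-- (DateStyle, TimeZone and similar settings would show up here):
SELECT d.datname, r.rolname, s.setconfig
FROM pg_db_role_setting s
LEFT JOIN pg_database d ON d.oid = s.setdatabase
LEFT JOIN pg_roles r ON r.oid = s.setrole;

-- And the values the current session actually sees on each connection:
SHOW DateStyle;
SHOW TimeZone;
SELECT version();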
It turns out that the cause is probably the difference in server version. One is PostgreSQL 9 (works OK), the other is PostgreSQL 10 (does not work OK).
They are different instances on the same machine. Somehow I missed that (blush).
By transferring I mean that I am reading records from a source database (PostgreSQL) and inserting them into a target database (MSSQL 2017).
This is done through the application; I am not sure which drivers it uses.
I will work with the people who made the application.
For those wondering: it is this application: https://dbconvert.com/mssql/postgresql/
When a solution is found I will update this answer with the found solution.
I created a DB long ago using Django. Now, as we are migrating the application, I need all the CREATE TABLE SQL queries that Django might have run to create the entire DB for our service (which has around 70-80 tables, each with 30-70 columns on average).
Both the old and new servers use Postgres for their databases.
But the technology stack is completely different (a 3rd-party proprietary application will host the service instead of Django).
If I start to write all the tables again from scratch, it will take at least a week or two.
Is there any way, either from Postgres or from Django, to generate the CREATE TABLE SQL schema for the entire DB, keeping all the relationships as they are?
Also, I have to make minor modifications to that schema as per customer requirements.
P.S. - pg_dump won't work, as I need the actual schema itself to get it reviewed by the client.
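For reference, a rough sketch of the commands usually suggested for this; the database and app names are placeholders:

# Plain DDL for the whole database, straight from Postgres:
pg_dump --schema-only --no-owner --no-privileges mydb > schema.sql

# Or the SQL behind a single Django migration:
python manage.py sqlmigrate myapp 0001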
I need some advice about the following scenario.
I have multiple embedded systems, each running a PostgreSQL database, located at different places, and we have a server running CentOS at our premises.
Each system runs at a remote location and has multiple tables inside its database. These tables have the same names as the server's tables, but each system's table names are different from the other systems', e.g.:
system 1 has tables:
sys1_table1
sys1_table2
system 2 has tables:
sys2_table1
sys2_table2
I want to update the tables sys1_table1, sys1_table2, sys2_table1 and sys2_table2 on the server on every insert done on system 1 and system 2.
One solution is to write a trigger on each table that runs on every insert on both systems' tables and inserts the same data into the server's tables. The trigger would also delete the records from the systems after inserting the data into the server. The problem with this solution is that if the connection with the server cannot be established due to a network issue, the trigger will not execute, or the insert will be wasted. I have checked the following solution for this:
Trigger to insert rows in remote database after deletion
The second solution is to replicate the tables from system 1 and system 2 to the server's tables. The problem with replication is that if we delete data from the systems, it will also delete the records on the server. I could add an alternative trigger on the server's tables that updates a duplicate table, so that the replicated table can be emptied without affecting the data, but that would make for a long list of tables if we have more than 200 systems.
The third solution is to create a foreign table using postgres_fdw or dblink and update the data inside the server's tables (a rough sketch of this option is included below, after the question). But will this affect the data inside the server when we delete the data inside the system's table? And what will happen if there is no connectivity with the server?
The fourth solution is to write an application in Python inside each system that makes a connection to the server's database and writes the data in real time; if there is no connectivity to the server, it stores the data inside sys1_table1, sys2_table2 or whichever table the data belongs to, and after reconnecting, the code sends that data to the server's tables.
Which option is best for this scenario? I like the trigger solution best, but is there any way to avoid data loss in case of disconnection from the server?
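For concreteness, a rough sketch of what the third option could look like, run on system 1; the host, credentials and column definitions are placeholders:

-- Requires the postgres_fdw extension on the embedded system.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER central_srv
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'server.example.com', dbname 'central', port '5432');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER central_srv
    OPTIONS (user 'sync_user', password 'sync_password');

-- Maps the server's sys1_table1 so it can be written to from system 1
-- (the column list must match the real table).
CREATE FOREIGN TABLE remote_sys1_table1 (
    id         integer,
    payload    text,
    created_at timestamptz
) SERVER central_srv OPTIONS (schema_name 'public', table_name 'sys1_table1');

-- Pushing rows fails immediately if the server is unreachable,
-- so whatever runs this has to handle retries.
INSERT INTO remote_sys1_table1 SELECT * FROM sys1_table1;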
I'd go with the fourth solution, or perhaps with the third, as long as it is triggered from outside the database. That way you can easily survive connection loss.
The first solution with triggers has the problems you already detected. It is also a bad idea to start potentially long operations, like data replication across a network of uncertain quality, inside a database transaction. Long transactions mean long locks and inefficient autovacuum.
The second solution may actually also be an option if you have a recent PostgreSQL version that supports logical replication. You can use a publication WITH (publish = 'insert,update') so that DELETE and TRUNCATE are not replicated. Replication can deal well with lost connectivity (for a while), but it is not an option if you want the data at the source to be deleted after it has been replicated.
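A minimal sketch of the logical replication route, assuming PostgreSQL 10 or later on both sides; table names and the connection string are placeholders:

-- On system 1 (the publisher; requires wal_level = logical):
CREATE PUBLICATION sys1_pub
    FOR TABLE sys1_table1, sys1_table2
    WITH (publish = 'insert, update');

-- On the central server (the subscriber), where tables with the same
-- names and column definitions must already exist:
CREATE SUBSCRIPTION sys1_sub
    CONNECTION 'host=system1.example.com dbname=sysdb user=repl_user password=secret'
    PUBLICATION sys1_pub;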
We recently migrated a large DB2 database to a new server. It got trimmed a lot in the migration; for instance, 10 years of data were chopped down to 3. But now I find that I need certain data from the old server until after tax season.
How can I run a UNION query in DBeaver that pulls data from two different connections? What's the proper syntax for the table identifiers in the FROM and JOIN clauses?
I use DBeaver for my regular SQL work, and I cannot work out how to span a UNION query across two different connections. However, I also use Microsoft Access, and there I easily did it with two pass-through queries fed into a native Microsoft Access union query.
But how can it be done in DBeaver? I can't understand how to use two connections at the same time.
For instance, my two connections are ASP7 and OLD, and I need something like this...
SELECT *
FROM ASP7.F_CERTOB.LDHIST
UNION
SELECT *
FROM OLD.VIPDTAB.LDHIST
...but I get the following error, to which I say "No kidding! That's what I want!", lol... =-)
SQL Error [56023]: [SQL0512] Statement references objects in multiple databases.
How can this be done?
This is not a feature of DBeaver. DBeaver can only access the data that the DB gives it, and this is restricted to a single connection at a time (save for import/export operations). This feature is being considered for development, so keep an eye out for this answer to be outdated sometime in 2019.
You can export data from your OLD database and import it into ASP7 using DBeaver (although vendor tools are typically more efficient for this). Then you can do your union as suggested.
Many RDBMS offer a way to logically access foreign databases as if they were local, in which case DBeaver would then be able to access the data from the OLD database (as far as DBeaver is concerned in this situation, all the data is coming from a single connection). In Postgres, for example, one can use a foreign data wrapper to access foreign data.
I'm not familiar with DB2, but a quick Google search suggests that you can set up foreign connections within DB2 using nicknames or three-part names.
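Purely as a sketch of what the nickname (federation) route might look like; I have not verified this against DB2, so treat every type, name and option below as a placeholder to check against the documentation for your DB2 edition:

-- Run on the ASP7 side after enabling federation:
CREATE WRAPPER DRDA;
CREATE SERVER OLDSRV TYPE DB2/400 VERSION '7.3' WRAPPER DRDA
    AUTHORIZATION "remote_user" PASSWORD "remote_pwd"
    OPTIONS (DBNAME 'OLD');
CREATE USER MAPPING FOR USER SERVER OLDSRV
    OPTIONS (REMOTE_AUTHID 'remote_user', REMOTE_PASSWORD 'remote_pwd');
CREATE NICKNAME OLD_LDHIST FOR OLDSRV.VIPDTAB.LDHIST;

-- The UNION then runs over the single ASP7 connection:
SELECT * FROM F_CERTOB.LDHIST
UNION
SELECT * FROM OLD_LDHIST;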
If you check this GitHub issue: https://github.com/dbeaver/dbeaver/issues/3605
you'll see that the way to solve this is to create a task and execute it in different connections: https://github.com/dbeaver/dbeaver/issues/3605#issuecomment-590405154
I am new to SSIS and am after some assistance in creating an SSIS package for a specific task. My data is stored remotely in a MySQL database and is downloaded to a SQL Server 2014 database. What I want to do is create a package where I can enter two dates that are compared against the create date/date modified per record on a number of tables, to give me a snapshot and compare the MySQL data to the SQL Server data, so that I can see if there are any rows missing from my local SQL database or any that need to be updated. Some tables have no dates, so for those I just want a record count of what (if anything) is missing between the two. If this is better achieved through T-SQL, I am happy to hear other suggestions or to be pointed at sites where similar things have been done.
In relation to your query, Tab:
"Hi Tab, what happens at the moment is that our master data is stored in a MySQL database, and the data was downloaded to a SQL Server database as a one-off. I currently have an SSIS package that uses the MAX ID, which can be found on most of the tables, to work out which records are new and just downloads or updates them. What I want to do is run separate checks on the tables to make sure that nothing has been missed during the download and everything is in sync. In an ideal world I would like to pass a date range, say a calendar week, into an SSIS package or T-SQL stored procedure, which would then check for any differences between the remote MySQL database tables and the local SQL Server tables. For now it only has to identify issues; correcting them may come later, or changes would need to be made to the existing sync package. Hope this makes more sense."
Thanks P
To do this, you need to implement a Type 1 Slowly Changing Dimension type data flow in SSIS. There are a number of ways to do this, including a built-in transformation aptly called the Slowly Changing Dimension transformation. Whilst this is easy to set up, it is a pain to maintain and it runs horrendously slowly.
There are numerous ways to set this up using other transformations or even SQL merge statements which are detailed here: https://bennyaustin.wordpress.com/2010/05/29/alternatives-to-ssis-scd-wizard-component/
I would recommend that you use Lookup transformations: they perform better than the Slowly Changing Dimension transformation, while offering better diagnostics and error handling than the (even faster) SQL merge statement.
Before you do this, you will need to add a Checksum or HashBytes column to your SQL Server data for ease of comparison with the incoming MySQL data.
In short: calculate some sort of repeatable checksum as the data is downloaded into your SQL Server, then use it in an SSIS Lookup, matching on the row key, to check for changes. Where the checksum value is different for the same row key, the row needs updating; where there is no matching row key in your SQL Server data, you need to insert the new row.
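A minimal T-SQL sketch of the hashing side, with table, key and column names as placeholders (note that HASHBYTES input is limited to 8000 bytes before SQL Server 2016):

-- Repeatable per-row hash over the columns you care about:
SELECT Id,
       HASHBYTES('SHA2_256',
                 CONCAT(Col1, '|', Col2, '|',
                        CONVERT(varchar(23), ModifiedDate, 121))) AS RowHash
FROM dbo.MyTable;

The MySQL extract has to produce exactly the same bytes (it has SHA2 and CONCAT_WS for this), so the formatting of dates and numbers must be made identical on both sides before the hashes are comparable.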
How can I obtain the creation date or time of an IBM DB2 database without connecting to that database first? Solutions like:
select min(create_time) from syscat.tables
and:
db2 list tables for schema SYSIBM
require me to connect to the database first, like:
db2 connect to dbname user userName using password
Is there another way of doing this through a DB2 command instead, so I wouldn't need to connect to the database?
Can the db2look command be used for that?
Edit 01: Background Story
Since more than one person asked why I need to do this and for what reasons, here is the background story.
I have a server with the DB2 DBMS that many people and automated scripts use to create databases for temporary tasks and tests. These databases are never meant to keep data for a long time. However, for one reason or another (e.g. a developer not cleaning up after himself, or tests being stopped forcefully before they can do the cleanup), some databases never get dropped, and they accumulate until the hard disk eventually fills up. So the idea of the app is to look up the age of each database and drop it if it's older than 6 months (for example).
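For illustration, the check I can script today still has to connect first, which is exactly the step I would like to avoid (dbname, userName and password are placeholders, as in the commands above):

# Per-database age check; still needs a connection.
db2 connect to dbname user userName using password
db2 -x "select min(create_time) from syscat.tables"
db2 connect reset
# ...and only if that timestamp is older than ~6 months:
db2 drop database dbname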