Postgres ODBC Bulk Loading Slow on IBM SPSS - postgresql

I have the official Postgres ODBC driver installed and am using IBM SPSS to try to load 4 million records from an MS SQL Server data source. I have the option set to bulk load via ODBC, but the performance is really slow. When I go SQL Server to SQL Server the performance is good, and Postgres to Postgres is also good, but SQL Server to Postgres takes about 2.5 hours to load the records.
It's almost as if it's not bulk loading at all. Looking at the output, it seems to read each batch of 10,000 records from the source very quickly, but the insert into the Postgres side takes forever. Watching the record count, it jumps from 0 to 10,000, but each batch takes minutes to get there when it should take seconds.
Interestingly, I downloaded a third-party driver from DevArt and the load went from 2.5 hours to 9 minutes. Still not super quick, but much better. Either Postgres ODBC does not support bulk load (unlikely, since Postgres to Postgres loads so quickly) or there's some configuration option at play in either the ODBC driver config or the SPSS config.
Has anybody experienced this? I've been looking at options for the ODBC driver, but can't really see anything related to bulk loading.
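One quick way to check whether the ODBC path really is inserting row by row is to time a plain row-at-a-time insert against a true bulk COPY on the same table, outside of SPSS. Below is a minimal sketch, assuming Python with psycopg2; the connection string and the throwaway load_test table are placeholders, not anything taken from SPSS or the driver.

    # Rough diagnostic: compare per-row INSERTs with a single bulk COPY.
    # Connection string and table name are placeholders.
    import io
    import time

    import psycopg2

    conn = psycopg2.connect("dbname=target user=loader password=secret host=pghost")
    rows = [(i, "value_%d" % i) for i in range(100000)]

    with conn.cursor() as cur:
        cur.execute("CREATE TABLE IF NOT EXISTS load_test (id int, payload text)")

        # Path 1: one INSERT per row, the pattern a non-bulk ODBC load falls into.
        start = time.time()
        for r in rows:
            cur.execute("INSERT INTO load_test VALUES (%s, %s)", r)
        conn.commit()
        print("row-by-row: %.1fs" % (time.time() - start))

        # Path 2: a single COPY from an in-memory buffer (a true bulk load).
        buf = io.StringIO("".join("%d\t%s\n" % (i, p) for i, p in rows))
        start = time.time()
        cur.copy_expert("COPY load_test FROM STDIN", buf)
        conn.commit()
        print("COPY: %.1fs" % (time.time() - start))

    conn.close()

If the two timings differ by an order of magnitude or more, the 2.5-hour load is almost certainly going over the wire one row at a time rather than being bulk loaded.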

IBM SPSS Statistics uses the IBM SPSS Data Access Pack (SDAP). These are third-party drivers from Progress/DataDirect. I can't speak to performance using other ODBC drivers, but if you are using the IBM SPSS Data Access Pack "IBM SPSS OEM 7.1 PostgreSQL Wire Protocol" ODBC driver, then there are resources for you.
The latest release of the IBM SPSS Data Access Pack (SDAP) is version 8.0. It is available from Passport Advantage (where you would have downloaded your IBM SPSS Statistics software) as "IBM SPSS Data Access Pack V8.0 Multiplatform English (CC0NQEN)".
Once installed, see the Help. On Windows it will be here:
C:\ProgramData\Microsoft\Windows\Start Menu\Programs\IBM SPSS OEM Connect and ConnectXE for ODBC 8.0\

Related

Automate data loading to Google Sheet from PostgreSQL database

I would like to set up automated data pulls from our PostgreSQL database into a Google Sheet. I've tried the JDBC service, but it doesn't work, possibly because of incorrect variables/config. Has anyone tried doing this already? I'd also like to schedule the extraction to run every hour.
According to the documentation, only Google Cloud SQL MySQL, MySQL, Microsoft SQL Server, and Oracle databases are supported by Apps Script's JDBC service. You may have to either move to a supported database or develop your own API service to handle the connection.
As for scheduling by the hour, you can use Apps Script's installable triggers.
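If you end up going the "develop your own API service" route, the sketch below shows roughly what that could look like: a small HTTP endpoint that returns PostgreSQL rows as JSON, which an hourly installable trigger could fetch (for example with UrlFetchApp) and write into the sheet. It assumes Python with Flask and psycopg2; the DSN, route, table, and column names are all made up for illustration.

    # Hypothetical endpoint a Google Sheet could poll hourly via Apps Script.
    from flask import Flask, jsonify
    import psycopg2
    import psycopg2.extras

    app = Flask(__name__)
    DSN = "dbname=reports user=reader password=secret host=db.example.com"  # placeholder

    @app.route("/latest-metrics")
    def latest_metrics():
        # Return the most recent rows as JSON for the sheet to consume.
        with psycopg2.connect(DSN) as conn:
            with conn.cursor(cursor_factory=psycopg2.extras.RealDictCursor) as cur:
                cur.execute(
                    "SELECT id, metric, recorded_at FROM metrics "
                    "ORDER BY recorded_at DESC LIMIT 100"
                )
                return jsonify(cur.fetchall())

    if __name__ == "__main__":
        app.run(port=8080)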

How to replicate a postgresql database from local to web server

I am new to the forum and also new to PostgreSQL.
Normally I use MySQL for my projects, but I've decided to start migrating towards PostgreSQL for some good reasons I found in this database.
Expanding on the problem:
I need to analyze data with some mathematical formulas, but to do this I need to get the data from the software via its API.
The software, the API, and PostgreSQL v11.4, which I installed on a desktop, all run on Windows. So far I've managed to pull the data via the API and import it into PostgreSQL.
My problem is how to transfer this data from the local PostgreSQL (on the PC) to a PostgreSQL instance on a web server running Linux.
For example, if I take the data from the software via the API every five minutes and put it in the local PostgreSQL database, how can I transfer this data (automatically, if possible) to the database on the Linux web server? I rejected a full data dump because importing the whole database every time is not viable.
What I would like is to transfer only the new five-minute batch of data, which gradually adds to the previous data.
I also rejected the idea of a master-slave architecture because I don't know the total amount of data in advance: the web server has almost 2 TB of disk, while the local PC has only one hard disk that serves only to collect the data and then send it on to the web server for analysis.
Could someone please help by giving some good advice regarding how to achieve this objective?
Thanks to all for any answers.
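One common pattern for this is a small watermark-based sync job run every five minutes from cron (or Task Scheduler on the Windows PC): ask the remote database for the newest timestamp it already has, pull anything newer from the local database, and insert it remotely. The sketch below is only illustrative; it assumes Python with psycopg2, a samples table with a recorded_at timestamp column, and placeholder connection strings, so adjust it to the real schema.

    # Hypothetical incremental sync: copy only rows newer than what the
    # remote server already holds. Table and column names are placeholders.
    import psycopg2
    from psycopg2.extras import execute_values

    LOCAL_DSN = "dbname=api_data user=loader host=localhost"
    REMOTE_DSN = "dbname=api_data user=loader host=web.example.com"

    def sync_once():
        local = psycopg2.connect(LOCAL_DSN)
        remote = psycopg2.connect(REMOTE_DSN)
        try:
            with local.cursor() as lcur, remote.cursor() as rcur:
                # The remote side tells us how far we have already copied.
                rcur.execute("SELECT COALESCE(MAX(recorded_at), 'epoch'::timestamp) FROM samples")
                watermark = rcur.fetchone()[0]

                # Pull only the rows the remote side has not seen yet.
                lcur.execute(
                    "SELECT id, recorded_at, payload FROM samples "
                    "WHERE recorded_at > %s ORDER BY recorded_at",
                    (watermark,),
                )
                rows = lcur.fetchall()
                if rows:
                    execute_values(
                        rcur,
                        "INSERT INTO samples (id, recorded_at, payload) VALUES %s",
                        rows,
                    )
                    remote.commit()
        finally:
            local.close()
            remote.close()

    if __name__ == "__main__":
        sync_once()

This only works if the table has a monotonically increasing column (a timestamp or serial id) so each run knows where the previous one stopped; PostgreSQL 11's built-in logical replication (CREATE PUBLICATION / CREATE SUBSCRIPTION) is another option if continuously replicating whole tables is acceptable.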

How to find out why import fails on Google Cloud SQL

I generate a .sql file on my laptop that contains around 11 million INSERT statements into several tables.
Locally I run a MySQL database, into which I import this file. It takes a while, but it succeeds without any problems. The local MySQL version is:
mysql Ver 14.14 Distrib 5.6.16, for osx10.7 (x86_64) using EditLine wrapper
I want to import this file into a Google Cloud SQL instance. To do so, I first gzip the .sql file and upload it to a bucket in Google Cloud Storage.
Then I create a D0 pay-per-use instance (the least powerful/cheapest), click 'Import', and enter the name of the file in Cloud Storage.
The import starts, but after a while (around a day) it fails, stating: "An unknown error occurred."
I tried this using both a MySQL 5.5 and an experimental 5.6 instance; both failed at different inserts (I can see what the latest successful insert was).
My problem is, I cannot find out what MySQL thinks is the problem.
How can I get the Google Developers Console to show me a log? I tried the Google APIs page, which has a 'Logs' tab, but it just gives me "An error has occurred. Please retry later."
Maybe Google Cloud SQL has some limits on the insert statements that my local MySQL does not have?
One of the fields is a MEDIUMTEXT, which I believe can be larger than 65,536 bytes.
Any advice is appreciated.
---------- UPDATE -----------
I emailed the Cloud SQL team and they confirmed the problem was that the import timed out.
So indeed, 24 hours is the maximum time an import may take on Cloud SQL.
Solutions are: use a more powerful instance for the import (and use asynchronous replication), or split the .sql file into multiple parts.
Another approach is to put several value tuples in each INSERT statement; just make sure each line does not exceed 4 MB, which is the value of max_allowed_packet on Cloud SQL. This speeds up the inserts greatly.
In fact, this makes it possible for the D0 instance to import the file in a few hours, so I don't need to bump it up to a more powerful one.
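To illustrate the multi-value rewrite, here is a rough sketch that collapses runs of single-row INSERT statements for the same table into one statement with many value tuples, flushing before a line would exceed the 4 MB max_allowed_packet limit. It assumes Python and a simple dump with one complete INSERT per line; the file names and the regex are assumptions, and a dump containing embedded newlines or other statement styles would need a real SQL parser.

    # Hypothetical rewrite of "one row per INSERT" into multi-value INSERTs,
    # keeping each output line safely under the 4MB max_allowed_packet.
    import re

    MAX_LINE = 4 * 1024 * 1024 - 1024  # leave headroom under 4MB
    INSERT_RE = re.compile(r"^INSERT INTO (\S+) VALUES (\(.*\));$")

    def rewrite(in_path, out_path):
        with open(in_path) as src, open(out_path, "w") as dst:
            table, values, size = None, [], 0

            def flush():
                nonlocal table, values, size
                if values:
                    dst.write("INSERT INTO %s VALUES %s;\n" % (table, ",".join(values)))
                table, values, size = None, [], 0

            for line in src:
                m = INSERT_RE.match(line.strip())
                if not m:
                    flush()          # pass through DDL and anything unrecognised
                    dst.write(line)
                    continue
                t, tup = m.groups()
                if t != table or size + len(tup) + 1 > MAX_LINE:
                    flush()
                    table = t
                values.append(tup)
                size += len(tup) + 1

            flush()

    rewrite("dump.sql", "dump_batched.sql")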

How do I setup DB2 Express-C Data Federation for a Sybase data source?

I wish to make fields in a remote public Sybase database, described at http://www.informatics.jax.org/software.shtml#sql, appear locally in our DB2 project's schema. To do this, I was going to use data federation; however, I can't seem to install the data source library (the Sybase-specific file libdb2ctlib.so for Linux), because only DB2 and Informix work out of the box with DB2 Express-C v9.5 (which is the version we're currently running; I also tried the latest v9.7).
From unclear IBM documentation and forum posts, the best I can gather is that we would need to spend $675 on http://www-01.ibm.com/software/data/infosphere/federation-server/ to get support for Sybase, but budget-wise that's a bit out of the question.
So is there a free method using previous tool versions (it seems DB2 Information Integrator was rebranded as InfoSphere Federation Server) to set up DB2 data wrappers for Sybase? Alternatively, is there another non-MySQL approach we can use, such as switching our local DBMS from DB2 to PostgreSQL? Does the latter support data integration/federation?
DB2 Express-C does not allow federated links to any remote database, not even other DB2 databases. You are correct that InfoSphere Federation Server is required to federate DB2 to a Sybase data source. I don't know if PostgreSQL supports federated links to Sybase.
Derek, there are several ways in which one can create a federated database. One is by using the federated database capability that is built into DB2 Express-C. However, DB2 Express-C can only federate data from specific data sources, i.e., other DB2 databases and industry-standard web services. To add Sybase to this list you must purchase the IBM InfoSphere Federation Server product.
The other way is to leverage DB2's ability to create user-defined functions in DB2 Express-C that use the OLE DB API to access other data sources. Because OLE DB is a Windows-based technology, only DB2 servers running on Windows can do this. What you do is create a table UDF that you can then use anywhere you would expect a table result set, e.g., in a view definition. For example, you could define a view that uses your UDF to materialize the results. These results would come from a query (via OLE DB) of your Sybase data (or any other OLE DB-compliant data source).
You can find more information here: http://publib.boulder.ibm.com/infocenter/idm/v2r2/index.jsp?topic=/com.ibm.datatools.routines.doc/topics/coledb_cont.html

How to back up Oracle 10g

How can I back up Oracle 10g, like backup and restore in SQL Server?
I want to back up tables and data.
Thanks in advance.
Oracle has a comprehensive backup and recovery suite which is formally called Recovery Manager but is universally known as RMAN. Find out more.
The typical term for this is 'migrate', not 'backup'. You'll then find results in your favorite search engine pointing to tools like:
SQL Server Migration Assistant for Oracle
If you only want the table structures and data (no code), you can use a SQL Server Integration Services (SSIS) package for this.
Read up here on the various methods and their associated speeds:
http://sqlcat.com/technicalnotes/archive/2008/08/09/moving-large-amounts-of-data-between-oracle-and-sql-server-findings-and-observations.aspx
Oracle Data Pump export (expdp) is also a good utility for taking a logical backup of your schema. Data Pump has very nice features and a lot of flexibility. You can follow the link below for further reading.
http://docs.oracle.com/cd/B19306_01/server.102/b14215/dp_export.htm