Communicating with Informix from PostgreSQL? [closed] - postgresql

Closed. This question is off-topic. It is not currently accepting answers.
Want to improve this question? Update the question so it's on-topic for Stack Overflow.
Closed 10 years ago.
Improve this question
Please help me to setup connectivity from PostgreSQL to Informix (latest versions for both). I would like to be able to perform a query on Informix from PostgreSQL. I am looking for a solution that will not require data exports (from Informix) and imports (to PostgreSQL) for every query.
I am very new in PostgreSQL and need detailed instructions.

As Chris Travers said, what you're seeking to do is not easy to do.
In theory, if you were working with Informix and needed to access PostgreSQL, you could (buy and) use the Enterprise Gateway Manager (EGM) and use the ODBC driver for PostgreSQL to allow Informix to connect to PostgreSQL. The EGM would do its utmost to appear to be another Informix database while actually accessing PostgreSQL. (I've not validated that PostgreSQL is supported, but EGM basically needs an ODBC driver to work, so there shouldn't be any problem — 'famous last words', probably.) This will include an emulation of 2PC (two-phase commit); not perfect, but moderately close.
For the converse connection (working with PostgreSQL and connecting to Informix), you will need to look to the PostgreSQL tool suite — or other sources.

You haven't said which version you are using. There are some limitations to be aware of but there are a wide range of choices.
Since you say this is import/export, I will assume that read-only options are not sufficient. That rules out PostgreSQL 9.1's foreign data wrapper system.
Depending on your version David Fetter's DBI-Link may suit your needs since it can execute queries on remote tables (see https://github.com/davidfetter/DBI-Link). It hasn't been updated in a while but the implementation should be pretty stable and usable across versions. If that fails you can write stored procedures in an untrusted language (PL/PythonU, PL/PerlU, etc) to connect to Informix and run the queries there. Note there are limits regarding transaction handling you will run into in this case so you may want to run any queries on the other tables using deferred constraint triggers so everything gets run at commit time.
Edit: A cleaner way occurred to me: use foreign data wrappers for import and a separate client app for export.
In this approach, you are going to have four basic components but this will be loosely coupled and subject to proper transactional controls. You can even use two-phase commit if you want. The four components are (not providing a complete working example here but at least a roadmap to one):
Foreign data wrappers for data import, allowing you to see data from Informix.
Views of data to be exported.
External application which manages the export aspect, written in a language of your choice. This listens on a channel like LISTEN export_informix;
Triggers on underlying tables which make view of data to be exported which raise a NOTIFY export_informix
The notifications are riased on the commit and so basically you have two stages to your transaction in this way:
Write data in PostgreSQL, flag data to be exported. Commit.
Read data from PostgreSQL, export to Informix. Commit on both sides (TPC?).

Related

Import data from cvs excel file to DB2 (mac terminal)

I have a cvs excel file on my computer and I would like to import the data from the file into a table in DB2. Does anyone know how to do this? Thank you
If you are able to connect to the Db2-database at the terminal, then you have several ways to add content into the Db2-database . Each method has different requirements (e.g. the type of the Db2-client, the privileges of your userid , the target Db2-server version and capabilities and platform, and many others...) , and you have to understand the requirements before you can properly use any of the commands. Most of these commands require the fat client of Db2 to be installed on the workstation. If your workstations lack a fat client and if you can transfer the file to a location accessible to the Db2-server then you can also use a stored procedure to run these commands at the Db2-server, but run this from your terminal, see ADMIN_CMD.
All of these options are fully documented in the free online Db2 Knowledge Centre, and each has many special command-line options to control the exact behaviour of the action. Therefore careful study is needed, along with careful rehearsal and testing. Stackoverflow is not a substitute for your education or your training. So it is wise to study the documentation before asking questions that are already answered in the documentation. Sometimes you need to study all the linked pages in the documentation and rehearse each option to fully understand how to use these things for best advantage.
Your options include:
The import command. Slow (logged), and flexible with many options.
Details here
The ingest command. Faster than import and also programmable. Details here.
The load command. Fastest but requires skill and experience to properly harness its power properly especially in high-availability environments. Details here.
Another option is to use external tables, if your Db2-server-platform-and-version supports external tables and if it has the necessary fixes already applied. This lets you represent a flat file (e.g. a CSV file) as a table in the database. In this case, you can use INSERT INTO target-table ....select ... from your-external-table. Requires skill and competence. Details here and many related pages.

Is PostgreSQL a NoSQL database? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 5 years ago.
Improve this question
I am getting confused about PostgreSQL. In some places people are saying it's NoSQL, and in some places people are saying it's not NoSQL. I have seen something Postgres Plus which is actual NoSQL.
Can you tell me what is true?
You have been confused my marketing and buzzwords.
“NoSQL” is a buzzword describing a diverse collection of database systems that focus on “semi-structured” data (that do not fit well into a tabular representation), sharding and high concurrency at the expense of transactional integrity and consistency, the latter being among the basic tenets of relational database management systems (RDBMS).
Since SQL is the language normally used to interact with an RDBMS, the term “NoSQL” is used as a name for all these systems. Perhaps the name was also chosen because SQL, being verbose and often hard to understand, evokes negative reactions in many programmers.
Now PostgreSQL, like many other RDBMS, has added support for JSON data, which is the most popular format for semi-structured data commonly stored in NoSQL systems. Now you can say that PostgreSQL supports a certain feature commonly found in NoSQL databases.
Still, SQL is the only way to interact with a PostgreSQL database, so you couldn't call it a NoSQL database and keep a straight face unless you were in marketing.
Postgres Plus is a closed source fork of PostgreSQL, so the same applies to it.
PostgreSQL is not NoSQL.
PostgreSQL is a classical, relational database server (and syntax) supporting most of the SQL standards.
On a sidenote, I suggest doing some research into the differences and advantages. They both have a solid place and time.
PostgreSQL prides itself in standards compliance. Its SQL implementation strongly conforms to the ANSI-SQL:2008 standard. It has full support for subqueries (including subselects in the FROM clause), read-committed and serializable transaction isolation levels. And while PostgreSQL has a fully relational system catalog which itself supports multiple schemas per database, its catalog is also accessible through the Information Schema as defined in the SQL standard.
Source

Data mining with postgres in production environment - is there a better way?

There is a web application which is running for a years and during its life time the application has gathered a lot of user data. Data is stored in relational DB (postgres). Not all of this data is needed to run application (to do the business). However form time to time business people ask me to provide reports of this data data. And this causes some problems:
sometimes these SQL queries are long running
quires are executed against production DB (not cool)
not so easy to deliver reports on weekly or monthly base
some parts of data is stored in way which is not suitable for such
querying (queries are inefficient)
My idea (note that I am a developer not the data mining specialist) how to improve this whole process of delivering reports is:
create separate DB which regularly is update with production data
optimize how data is stored
create a dashboard to present reports
Question: But is there a better way? Is there another DB which better fits for such data analysis? Or should I look into modern data mining tools?
Thanks!
Do you really do data mining (as in: classification, clustering, anomaly detection), or is "data mining" for you any reporting on the data? In the latter case, all the "modern data mining tools" will disappoint you, because they serve a different purpose.
Have you used the indexing functionality of Postgres well? Your scenario sounds as if selection and aggregation are most of the work, and SQL databases are excellent for this - if well designed.
For example, materialized views and triggers can be used to process data into a scheme more usable for your reporting.
There are a thousand ways to approach this issue but I think that the path of least resistance for you would be postgres replication. Check out this Postgres replication tutorial for a quick, proof-of-concept. (There are many hits when you Google for postgres replication and that link is just one of them.) Here is a link documenting streaming replication from the PostgreSQL site's wiki.
I am suggesting this because it meets all of your criteria and also stays withing the bounds of the technology you're familiar with. The only learning curve would be the replication part.
Replication solves your issue because it would create a second database which would effectively become your "read-only" db which would be updated via the replication process. You would keep the schema the same but your indexing could be altered and reports/dashboards customized. This is the database you would query. Your main database would be your transactional database which serves the users and the replicated database would serve the stakeholders.
This is a wide topic, so please do your diligence and research it. But it's also something that can work for you and can be quickly turned around.
If you really want try Data Mining with PostgreSQL there are some tools which can be used.
The very simple way is KNIME. It is easy to install. It has full featured Data Mining tools. You can access your data directly from database, process and save it back to database.
Hardcore way is MADLib. It installs Data Mining functions in Python and C directly in Postgres so you can mine with SQL queries.
Both projects are stable enough to try it.
For reporting, we use non-transactional (read only) database. We don't care about normalization. If I were you, I would use another database for reporting. I will desing the tables following OLAP principals, (star schema, snow flake), and use an ETL tool to dump the data periodically (may be weekly) to the read only database to start creating reports.
Reports are used for decision support, so they don't have to be in realtime, and usually don't have to be current. In other words it is acceptable to create report up to last week or last month.

PostgreSql or SQL Server 2008 R2 should be use with .Net application using entity framework?

I have a database in PostgreSQL with millions of records and I have to develop a website that will use this database using Entity Framework (using dotnetConnect for PostgreSQL driver in case of PostgreSQL database).
Since SQL Server and .Net are both native to the Windows platform, should I migrate the database from PostgreSQL to SQL Server 2008 R2 for performance reasons?
I have read some blogs comparing the two RDBMS' but I am still confused about which system I should use.
There is no clear answer here, as its subjective, however this is what I would consider:
The overhead of learning a new DBMS and its tools.
The SQL dialects each RDBMS uses and if you are using that dialect currently.
The cost (monetary and time) required to migrate from PostgreSQL to another RDBMS
Do you or your client have an ongoing budget for the new RDBMS? If not, don't make the mistake of developing an application to use a RDBMS that will never see the light of day.
Personally if your current database is working well I wouldn't change. Why fix what isn't broke?
You need to find out if there is actually a problem, and if moving to SQL Server will fix it before doing any application changes.
Start by ignoring the fact you've got .net and using entity framework. Look at the queries that your web application is going to make, and try them directly against the database. See if its returning the information quick enough.
Only if, after you've tuned indexes etc. you can't make the answers come back in a time you're happy with should you decide the database is a problem. At that point it makes sense to try the same tests against a SQL Server database, but don't just assume SQL Server is going to be faster. You might find out that neither can do what you need, and you need to use faster disks or more memory etc.
The mechanism you're using to talk to a database (DotConnect or Microsoft drivers) will likely be a very minor performance consideration, considering the amount of information flowing (SQL statements in one direction and result sets in the other) is going to be almost identical for both technologies.

Sybase SQLAnywhere jConnect routines?

I have a database which is part of a closed system and the end-user of the system would like me to write some reports using the data contains in a Sybase SQL Anywhere Database. The system doesn't provide the reports that they are looking for, but access to the data is available by connecting to this ASA database.
The vendor of the software would likely prefer I not update the database and I am basically read-only as I am just doing some reporting. All is good, seal is not broken, warranty still intact, etc,etc..
My main problem is that I am using jConnect in order to read from the database, and jConnect requires some "jConnect Routines" to be installed into the database. I've found that I can make this happen by just doing an "Alter Database Upgrade JConnect On", but I just don't fully understand what this does and if there is any risks associated with it.
So, my question is does anyone know exactly what jConnect routines are and how are they used? Is there any risk adding these to a database? Should I be worried about this?
If the vendor wants you to write reports using jConnect they will have to allow the installation of the JConnect tables.
These are quite safe, where I work the DBA team install these as a matter of course and we run huge databases in production with no impact.
There is an alternative driver that you could use called jTDS. Its open source and supports MS SQL Server and Sybase. I'm not sure if they require the JConnect tables or not.
I think that the additional tables are a bit of anachronism in this day and age.
Looking at ASA 10 docs, there is another driver: the iAnywhere JDBC driver which seems to be going through the ODBC driver, and as such, probably will not require an alteration of the database.
On the other hand, installing the "jConnect system objects" is done by running the script scrits/jcatalog.sql... You can show it the DBAs, if you want to reassure them. It creates some procedures, tables, variables.
The need for this script probably comes from the fact that jConnect talks to both ASE (Sybase) and iAnywhere databases, so it needs a compatibility layer installed in the database...