Any idea how to go about doing that through a tool (preferred)? Are there any alternative ways to do it?
You can check out the migration studio from EnterpriseDB here, although I have no experience with it.
There is no comparison to doing it yourself though - if you're not familiar with Postgres then this will get you familiar, and if you are, then aside from the data entry aspect, this should be old hat.
Use the MaxDB tools to generate a SQL text export of the database, then import this file into PostgreSQL. Luckily, you won't need to preprocess the data dump.
We have a series of modifications to a Postgres database, which can generally be written all in SQL. So it seems Flyway would be a great fit to automate these.
However, they also include imports from files to tables, such as
COPY mytable FROM '${PWD}/mydata.sql';
And secondarily, we'd like not to rely on Postgres' use of file paths like this, which apparently must reside on the server. It should be possible to run any migration from a remote client -- as in Amazon's RDS documentation (last section).
Are there good approaches to handling this kind of scenario already in Flyway? Or alternate approaches to avoid this issue altogether?
Currently, it looks like it'd work to implement the whole migration in Java and use the Postgres driver's CopyManager to import the data. However, that means most of our migrations have to be done in Java, which seems much clumsier. (As far as I can tell, hybrid Java+SQL migrations are not expected?)
I'm new to Flyway, so I thought I'd ask what other alternatives might exist within Flyway, since I'd expect importing a table during a migration to be pretty common.
Starting with Flyway 3.1, you can use COPY FROM STDIN statements within your migration files to accomplish this. The SQL execution engine will automatically use PostgreSQL's CopyManager to transfer the data.
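As a rough illustration of what such a migration file could contain, here is a minimal Python sketch that turns tabular data into a COPY ... FROM STDIN statement followed by inline rows in PostgreSQL's COPY text format (tab-separated fields, \N for NULL, and a \. line to end the data). The table name comes from the question; the column names and sample rows are hypothetical, and the exact inline layout Flyway accepts is assumed here to match what psql accepts, so check it against your Flyway version.

```python
import csv
import io


def copy_escape(value):
    """Encode one field for PostgreSQL's COPY text format."""
    if value is None:
        return "\\N"  # default NULL marker in COPY text format
    text = str(value)
    # Escape the backslash first, then the delimiter and line breaks.
    return (
        text.replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def build_copy_migration(table, columns, rows):
    """Return migration text: a COPY ... FROM STDIN statement plus inline data."""
    lines = [f"COPY {table} ({', '.join(columns)}) FROM STDIN;"]
    for row in rows:
        lines.append("\t".join(copy_escape(v) for v in row))
    lines.append("\\.")  # end-of-data marker
    return "\n".join(lines) + "\n"


# Hypothetical sample data standing in for the contents of the data file.
csv_source = io.StringIO("id,name\n1,alpha\n2,beta\n")
reader = csv.reader(csv_source)
header = next(reader)
migration = build_copy_migration("mytable", header, reader)
print(migration)
```

The generated text could be saved as an ordinary versioned SQL migration, which keeps the data next to the schema change and avoids any dependency on file paths on the database server.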
Let me start by saying that what I know about Pentaho wouldn't fill a single paragraph. I'm more knowledgeable about PostgreSQL. I'm working with some contractors who are building a set of monthly reports in Pentaho (v. 4.5) for my company. Some of the data needs to go through an ETL process and get rolled up for reporting purposes. From a DBA(ish) point of view, I would like to move these tables into a separate PostgreSQL schema.
I know that Pentaho is often times used with MySQL (which doesn't have schemas) and I'm concerned this might cause problems. I've done some "googlin'" and I don't turn up a lot of hits on the topic, but I did find a closed bug from a few years ago - thus implying that the functionality should be supported.
Before I do this, I would like to see if anyone knows of a reason this will fail or be a bad idea (or if you've done it and it works great, please let me know that, too).
Final notes: I'm using PostgreSQL 9.1.5, and I don't have access to a Pentaho instance to even test this myself. I'm hoping the good folks in the Stack Overflow community will share their expertise and save me from having to install one and spend hours playing/testing to get an idea of whether this is a bad idea.
EDIT:
I sort of knew this question was a bit vague, but I was hoping that someone would read it and share any experience they have. So, let me spell it out more clearly and ask more explicit questions.
I have not done anything. I don't know Pentaho. I don't want to learn Pentaho (not that there is anything wrong with Pentaho... It's just not where my interests are right now). My company hired contractors (I did not hire them). They have experience with Pentaho, but with MySQL. They don't really know anything about PostgreSQL. There are some important differences between PostgreSQL and MySQL, including the fact that PostgreSQL supports schemas (whereas MySQL uses separate databases, which are similar in concept but behave differently in some ways). Some ORMs (and tools) don't really like this... for example, the Django framework still doesn't fully support schemas in PostgreSQL (I know this because I use Python and Django often, and my life is much better when I keep things in the "public" schema). Because of my experience with Django and PostgreSQL schemas, I'm a bit leery of moving this data to a new schema.
I do understand that wherever the tables are, the users will need permissions to be able to access the data.
My explicit questions:
Do you use Pentaho with a PostgreSQL database to access tables in schemas other than "public" (the default)?
If so, does it just work (no problems)?
If you had problems, would you be willing to share with me (and the Stack Overflow community) any online resources that helped you? Or would you be willing to detail what you remember here?
Do you know of anything that just won't work correctly? For example, an open bug in Pentaho related to this topic.
Again, it's not your standard kind of question. I'm hoping that someone out there has experience and is willing to share it here and save me from having to spend time setting up a new Pentaho instance and trying to learn Pentaho well enough to test it, etc.
Thanks.
Two paths you can take:
1) What previous post said ("Pentaho steps (table inputs, outputs, etc.) usually allow you to specify a database schema.")
2) In database connection, advanced tab, "The preferred schema name".
If you're working with different schemas, you can create one database connection per schema. With this approach you can leave the schema field in input/output steps empty.
We use MS SQL Server and I can tell you that Pentaho does struggle with the idea of a schema. Many of their apps allow you to select a schema, but Pentaho, like you said, is built to use something like MySQL.
Make your Pentaho database user work the way it would in MySQL.
We made the database user default to dbo, then we structured our tables like dbo.dimDimension, dbo.factFactTable, etc. Basically, only use dbo for Pentaho purposes. (Or whatever schema you want to default to.)
I use PDI and PgSQL extensively every day with a bunch of different schemas. It works fine. The only trouble you might run into is Pg's troublesome practice of forcing unquoted identifiers to lower instead of upper case. I soon realized everything was easier when I set the Advanced connection property to "Quote all in database".
Yes, you have to quote everything when you type SQL if PDI doesn't do it for you, but it works quite well. Haven't experimented with forcing all identifiers to lower case, but I expect that would work as well.
And yes, use the "Preferred schema name" as well, but be aware that some steps use that option and others don't. You can't, for example, expect it to add schema names to SQL you type into a Table Input step.
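To make the quoting and schema points concrete, here is a rough Python model of how PostgreSQL treats identifiers, plus a helper that builds the schema-qualified, quoted names you would type by hand into a Table Input step. It is a simplification (it ignores reserved words, length limits, and non-ASCII folding details), and the schema and table names are hypothetical.

```python
def fold_identifier(ident):
    """Approximate how PostgreSQL resolves an identifier as written in SQL."""
    if len(ident) >= 2 and ident[0] == '"' and ident[-1] == '"':
        # Quoted: case is preserved; a doubled quote stands for one quote.
        return ident[1:-1].replace('""', '"')
    # Unquoted: folded to lower case.
    return ident.lower()


def quote_identifier(name):
    """Wrap a name in double quotes so that its case survives."""
    return '"' + name.replace('"', '""') + '"'


def qualify(schema, table):
    """Build a schema-qualified, quoted table reference."""
    return f"{quote_identifier(schema)}.{quote_identifier(table)}"


# Hypothetical names, for illustration only.
print(fold_identifier("DimCustomer"))    # unquoted, so folded to lower case
print(fold_identifier('"DimCustomer"'))  # quoted, so case is kept
print(qualify("reporting", "DimCustomer"))
```

This is why a table created with a quoted mixed-case name can only be reached with the same quoted spelling, and why hand-written SQL needs the schema prefix spelled out.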
The only other issues you might run into are the limits of Pg's JDBC driver. It's not as good as SQL Server's or DB2's, but the only thing I've ever had trouble with was sending error rows from a Table Output step to another step when the Table Output step was in batch mode.
Have fun learning PDI. It makes a great complement to your DBA skills.
Brian
Pentaho steps (table inputs, outputs, etc.) usually allow you to specify a database schema.
I did a quick test using PDI and our 8.4 Postgres instance and was able to explore, read from and write to tables in different schemas.
So, I think this is a reasonable direction. Hope this helps.
I am new to DB2. I have written procedures in Oracle.
I need to convert those procedures from Oracle to DB2.
I want to know how procedures are created and compiled in DB2.
Thanks in advance.
A good walkthrough can be found here.
But then you always have the redbooks that IBM puts out such as this or this.
It is worth pointing out this migration tool that you might find useful.
I have a database which is part of a closed system, and the end user of the system would like me to write some reports using the data contained in a Sybase SQL Anywhere database. The system doesn't provide the reports that they are looking for, but access to the data is available by connecting to this ASA database.
The vendor of the software would likely prefer I not update the database, and I am basically read-only as I am just doing some reporting. All is good, seal is not broken, warranty still intact, etc., etc.
My main problem is that I am using jConnect in order to read from the database, and jConnect requires some "jConnect Routines" to be installed into the database. I've found that I can make this happen by just doing an "Alter Database Upgrade JConnect On", but I don't fully understand what this does or whether there are any risks associated with it.
So, my question is: does anyone know exactly what jConnect routines are and how they are used? Is there any risk in adding these to a database? Should I be worried about this?
If the vendor wants you to write reports using jConnect they will have to allow the installation of the JConnect tables.
These are quite safe. Where I work, the DBA team installs these as a matter of course, and we run huge databases in production with no impact.
There is an alternative driver that you could use called jTDS. It's open source and supports MS SQL Server and Sybase. I'm not sure whether it requires the JConnect tables or not.
I think that the additional tables are a bit of an anachronism in this day and age.
Looking at ASA 10 docs, there is another driver: the iAnywhere JDBC driver which seems to be going through the ODBC driver, and as such, probably will not require an alteration of the database.
On the other hand, installing the "jConnect system objects" is done by running the script scripts/jcatalog.sql... You can show it to the DBAs if you want to reassure them. It creates some procedures, tables, and variables.
The need for this script probably comes from the fact that jConnect talks to both ASE (Sybase) and iAnywhere databases, so it needs a compatibility layer installed in the database...
I am looking for a tool/command which can compare data between two PostgreSQL databases. The reason to do this is to have some external verification that the SQL script responsible for data migration from one PostgreSQL database to the other has been written correctly.
Any pointers would be appreciated
regards
Sameer
EMS makes a tool which can do this.
http://sqlmanager.net/en/products/postgresql/dbcomparer
The PostgreSQL site has a nice Software catalogue. Peruse the Administration/development tools category.
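If a dedicated tool is not an option, a scripted check is another possibility. The sketch below is only an illustration: it computes a row count and a SHA-256 digest per table, reading rows in key order, so that two copies of a table can be compared. It uses Python's built-in sqlite3 module as a stand-in so that it runs anywhere; against PostgreSQL you would pass connections from a DB-API driver instead (psycopg2, for example). The table and column names are hypothetical.

```python
import hashlib
import sqlite3


def table_fingerprint(conn, table, key_column):
    """Return (row_count, sha256 hex digest) for a table, reading rows in key order."""
    cur = conn.cursor()
    # Names are interpolated directly into the SQL, so pass trusted names only.
    cur.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    digest = hashlib.sha256()
    count = 0
    for row in cur:
        digest.update(repr(row).encode("utf-8"))
        digest.update(b"\n")
        count += 1
    return count, digest.hexdigest()


def make_demo_db(rows):
    """Build an in-memory stand-in database with one hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    return conn


source = make_demo_db([(1, "alpha"), (2, "beta")])
target_ok = make_demo_db([(2, "beta"), (1, "alpha")])   # same data, different insert order
target_bad = make_demo_db([(1, "alpha"), (2, "BETA")])  # one value differs

src = table_fingerprint(source, "customers", "id")
print(src == table_fingerprint(target_ok, "customers", "id"))
print(src == table_fingerprint(target_bad, "customers", "id"))
```

A matching fingerprint only says that the rows compared equal as the driver returned them. A mismatch tells you which table to inspect, not which rows differ, so a row-level diff would still be a separate step.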