I've used the Data Compare tool to update the schema between the same DBs on different servers, but what if so many things have changed (including data) that I simply want to REPLACE the target database?
In the past I've just used T-SQL: take a backup, then restore onto the target with the REPLACE option and/or MOVE if the data and log files are on different drives. I'd rather have an easier way to do this.
You can use Schema Compare (also by Red Gate) to compare the schema of your source database to a blank target database (and update), then use Data Compare to compare the data in them (and update). This should leave you with the target the same as the source. However, it may well be easier to use the backup/restore method in that instance.
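If you do go the backup/restore route, a minimal T-SQL sketch is below; the database names, logical file names and drive paths are placeholders for illustration, not values from your environment:

    -- On the source server: take a full backup (path is a placeholder).
    BACKUP DATABASE SourceDB
        TO DISK = N'C:\Backups\SourceDB.bak'
        WITH INIT;

    -- On the target server: check the logical file names inside the backup first.
    RESTORE FILELISTONLY FROM DISK = N'C:\Backups\SourceDB.bak';

    -- Overwrite the existing target database and relocate the data/log files.
    -- 'SourceDB_Data' and 'SourceDB_Log' stand in for whatever FILELISTONLY reports.
    RESTORE DATABASE TargetDB
        FROM DISK = N'C:\Backups\SourceDB.bak'
        WITH REPLACE,
             MOVE N'SourceDB_Data' TO N'D:\Data\TargetDB.mdf',
             MOVE N'SourceDB_Log'  TO N'E:\Logs\TargetDB_log.ldf';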
I am using RESTORE DATABASE to import a backup into a test-environment database, and that works fine, but I need to extend this import process to several backups from several dates into a unique test-environment DB... What is the command to append backups into a unique database...
Thanks,
Phil
You write "I get only full offline database backup files from several time date (2 weeks of production) that I need to restore in a test database for analysis... example 4 backups files of 2 weeks of data = 2 months of data ...."
and you also write "What is the command to append backups into a unique database..."
While Db2 LUW has no explicit method for combining full offline backup images, there's always another way to get what you need, given the right skills and tools.
If you have a full backup image, it can either be restored into a new database or fully overwrite an existing database. If you have four full backup images, each can be restored into its own (uniquely named) database, or each can overwrite one of four existing databases.
You can also restore specific tablespaces from a backup image, if the database is properly configured. Some sites design discrete tablespaces for specific time periods (one per day/week/month) to help with such activities. Some sites design their tables to be range partitioned, with each time period having its own partition (and sometimes dedicated tablespaces as well), which makes subsequent merging of content easier with the right skills.
If you are competent with scripting, you can restore the first (earliest) image, export the relevant table contents to flat files, restore the next backup image and export the relevant tables to new flat files (repeat as needed), then load those flat files into a single table for analysis. If your database is small, this can be a keep-it-simple approach.
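As a rough illustration of that loop, here is a hedged Db2 CLP sketch; the database names, schema/table names, timestamp and paths are all placeholders:

    -- Restore one offline image into a uniquely named analysis database.
    -- Depending on the logging configuration you may also need WITHOUT ROLLING FORWARD.
    RESTORE DATABASE PRODDB FROM /backups/week1 TAKEN AT 20240101120000
        INTO ANALYSISDB REPLACE EXISTING;

    -- Export the relevant table contents to a flat file.
    CONNECT TO ANALYSISDB;
    EXPORT TO /staging/sales_week1.del OF DEL SELECT * FROM MYSCHEMA.SALES;
    CONNECT RESET;

    -- Repeat the restore/export for each image, then load all flat files
    -- into one table used for the analysis.
    CONNECT TO ANALYSISDB;
    LOAD FROM /staging/sales_week1.del OF DEL INSERT INTO MYSCHEMA.SALES_ALL;
    LOAD FROM /staging/sales_week2.del OF DEL INSERT INTO MYSCHEMA.SALES_ALL;
    CONNECT RESET;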
You can also do clever things with federation if you restore to discrete databases.
Separately purchasable tools exist that let you extract selected content from a backup image (which can then be loaded into a Db2 database) without needing to do a restore. These are not included with the Db2 product, so you could extract specific table contents from a backup image if you pay for the right tools and learn how to use them. Speak with your IBM salesperson. Note that such tools may require currently supported versions of Db2.
Someone in my org created a Data Extract. There is an issue in one of the worksheets that uses it, and we suspect it's due to a mistake in how the Union was built.
But since it's a Data Extract, I can't see the UI for the data merge. Is there any way to take a current Data Extract and view the logic that created it?
Download the extract from the server (I'm assuming you're using Tableau Server), then open that extract using Tableau Desktop. You should be able to see the details of it.
Before going too deep into extract details, note that extracts are not intended to be permanent systems of record for data - just an efficient way to work with query results for optimized reporting. So in general, you should always be able to throw away the extract and look at the original source - or recreate the extract on command. But life isn't always perfect so ...
If you use Tableau Desktop to look at your worksheet, and look at the data source icon at the top of the data pane in the left sidebar, do you see an icon for your data source that looks like two databases, one on top of (shadowing) the other? If so, you can right-click the data source icon and view its properties to see the source database table or file path. You can then even try disabling the extract to view the original source data.
If instead you see a single database icon, you have a "naked" extract where the reference to the original source has been discarded (unless it is stored in the catalog mentioned below).
If your organization purchased the Data Management Add-on for Tableau Server (strongly recommended), then if your data source is published to Tableau Server you can trace its history and origin by exploring the Tableau Catalog. That is especially valuable if the extract was built by a Tableau Prep Flow.
If instead, someone built the extract another way, say by writing a custom app using the Tableau Data Extract API, then the answer is to find that program.
One last point: in recent versions of Tableau, extracts are stored in an efficient relational-style database file called Hyper. A Hyper extract can contain either a single table (say, serializing the results of a query that joins multiple tables) or multiple tables (say, caching individual tables and deferring the join until later).
That may not be relevant to your question, but could turn out to matter as you reverse engineer how the extract was created.
I need to implement a schema migration mechanism for PostgreSQL.
Just to remove ambiguity: by schema migration I mean that I need to upgrade my database structures to the latest version regardless of their current state on a particular server instance.
For example, in version one I created some tables, in version two I renamed some columns, and in version three I removed one table and created another. I have multiple servers, and some of them are at version one, some at version three, and so on.
My idea:
Generate a hash of the output produced by
pg_dump --schema-only
every time before I change my database schema. This should be a reliable way to identify the database version that a future patch applies to.
Keep a list of patches together with the hashes they apply to (a sketch of such a mapping follows this list).
When I need to upgrade a database, run an application that searches for the hash corresponding to the current database structure (by calculating the hash of the local database and comparing it with the set of known hashes) and applies the associated patch.
Repeat until no matching hash is found.
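To make the mapping concrete, it could be as simple as a table (or a flat file shipped with the upgrade application); the names below are purely illustrative and not part of any standard:

    -- Illustrative hash-to-patch mapping consulted by the upgrade application.
    CREATE TABLE schema_patch (
        source_hash text PRIMARY KEY,  -- e.g. sha256 of the `pg_dump --schema-only` output
        patch_file  text NOT NULL,     -- script that upgrades a database in that state
        result_hash text NOT NULL      -- expected hash after the patch has been applied
    );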
Could you please point out any weaknesses of this approach?
Have you ever heard of https://pgmodeler.io ? At the company where I work we decided to go with it, since it can perform a schema diff even between a local and a remote database. We are very satisfied with it.
Otherwise, if you prefer a free solution, you could develop a migration tool that applies the migrations you store in a single repo. This tool could rely on a migration table kept in a separate schema, so that each of your databases always knows which migrations have and have not been applied.
The beauty of this approach is that migrations can cover both schema changes and data changes.
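A minimal PostgreSQL sketch of such a migration table might look like this (the schema and column names are illustrative, not a standard):

    CREATE SCHEMA IF NOT EXISTS migrations;

    CREATE TABLE IF NOT EXISTS migrations.applied_migration (
        version    integer PRIMARY KEY,            -- ordinal of the migration script
        filename   text NOT NULL,                  -- e.g. 0003_add_orders_table.sql
        applied_at timestamptz NOT NULL DEFAULT now()
    );

    -- The runner applies, in order, every script whose version is greater than
    -- SELECT coalesce(max(version), 0) FROM migrations.applied_migration;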
I hope this can give you some ideas.
I have a personal license for Tableau. I am using it to connect to .csv and .xlsx files currently but am running into some issues.
1) The .csv files are massive (10+ GB).
2) The Excel files are starting to hit the 1-million-row limit.
3) I sometimes need to add certain columns to the .csv files (like a unique ID and a few formulas), which means I have to open sections of them in Excel, modify what I need to, and save a new file.
Would it be better to create an extract for each of these files and then connect the Tableau workbook to the extract instead of the file? Currently I am connected directly to the files, extract data from there, and refresh every day.
I don't know about others, but I use exactly that approach. I have some workbooks that simply serve to extract data from some data source (be it SQL, xlsx, csv, mdb, or any other), and all analysis is performed in other workbooks that connect only to the .tde files.
The advantages are:
1) Whenever you need to update a data source, you only update it once (and replace the .tde file), and all your workbooks are up to date. If you connect to the same data source and extract to different .tde files, you have to refresh all of those extracts (and worry about whether the extract in a specific workbook is up to date). And even if you extract to the same .tde from several workbooks (which doesn't make much sense), it can be confusing (am I connected to the .tde or to the file? Did the extract I made in the other workbook update this one too? Well, yes it did, but it can be confusing).
2) You don't have to worry about replacing a data source, especially when it's a csv, xlsx or mdb file. You can keep many different versions of those files and choose which one is best. For instance, I'll have table_v1.mdb, table_v2.mdb, ..., and a single table_v1.tde, which is the extract of one of those mdb files. And I still have the previous versions in case I need them.
3) When you have a SQL connection, or anything else that is not a file (csv, xlsx, mdb), extracts are very handy for basically the same reasons above, with (at least) one extra upside: you don't need to connect to a server every time you want to perform an analysis. That means you can do everything offline, and the person using Tableau doesn't need access to the SQL table (or any other source).
One good practice is always keeping a backup when updating a .tde (because, well, shit happens).
10 GB of CSV, wow. Yes, you should absolutely use a data extract; that would be much quicker. For that much data you could also look at other connections, such as MS Access or a SQL instance.
If your data has that many rows, I would set up a small MySQL instance on your local machine and keep the data there instead. You would be able to connect Tableau directly to the MySQL instance and could easily edit the source data.
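If you try that, a rough MySQL sketch for loading one of the large CSVs might look like this; the table, column and file names are placeholders, so adjust them to match your data:

    -- Placeholder table; define columns to match the CSV layout.
    CREATE TABLE sales_raw (
        id        BIGINT,
        sale_date DATE,
        amount    DECIMAL(12,2)
    );

    -- Bulk-load the CSV (requires local_infile to be enabled on client and server).
    LOAD DATA LOCAL INFILE '/data/big_export.csv'
    INTO TABLE sales_raw
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\n'
    IGNORE 1 LINES;

    -- The extra columns mentioned in the question can then be added with
    -- ALTER TABLE / UPDATE statements instead of editing the CSV in Excel.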
I'm currently working on a GIS database project using Manifold Ultimate.
I am able to import data from PostGIS via the database console, and edit the data as a table object within Manifold.
How do I 'commit' these changes back to PostGIS?
I am required to submit the exported database. What format is expected for a PostGIS export and how is the exporting done?
#mdsumner is correct. Linking the PostGIS data is the way to go.
If you have exported the complete table and edited records, it's not simple to replace the data in PostGIS with a new export. This will fail until you have deleted all the tables, along with the indexes, triggers and sequences whose names are derived from the name of the exported drawing (with inconsistent handling of lower case). It's not enough to drop the table.
Note that with Manifold's linked storage model there is no client-side buffer of edited, added or deleted records that gets written back when a transaction is committed. Every edit of every single column is written to PostGIS immediately.
Concerning your second question: that depends on the target system. Manifold exports GEOMETRY-type geometries. Other PostGIS clients may only digest a single type (point, line or polygon). You can edit the type in geometry_columns.type as long as you have added only one type of object to the drawing.
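For illustration, on older PostGIS releases where geometry_columns is a plain table, that edit could look like the following; the drawing and column names are placeholders, and in PostGIS 2.x and later geometry_columns is a view maintained from type modifiers, so this does not apply there:

    -- Restrict the registered geometry type for one exported drawing (old PostGIS only).
    UPDATE geometry_columns
       SET type = 'POLYGON'
     WHERE f_table_name = 'my_drawing'
       AND f_geometry_column = 'geom';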
I think that if you imported the data it is no longer linked to the DB, and you would need to export it and replace what is in the DB. If you link the data, the edits you make are committed "live", as the data is not a copy but remains stored in the DB.
I'm not that familiar with this, but that's what the Database Console topic in help describes.