How to copy Tableau Data Extract logic? - tableau-api

Someone in my org created a Data Extract. There is an issue in one of the worksheets that uses it, and we suspect it's due to a mistake in how the Union was built.
But since it's a Data Extract, I can't see the UI for the data merge. Is there any way to take a current Data Extract and view the logic that creates it?

Download the extract from the server (I'm assuming you're using Tableau Server), then open that extract in Tableau Desktop. You should be able to see the details of it.

Before going too deep into extract details, note that extracts are not intended to be permanent systems of record for data - just an efficient way to work with query results for optimized reporting. So in general, you should always be able to throw away the extract and look at the original source - or recreate the extract on command. But life isn't always perfect so ...
If you use Tableau Desktop to look at your worksheet, and look at the data source icon at the top of the data pane in the left sidebar, do you see an icon for your data source that looks like two databases with one on top of (shadowing) the other? If so, you can right-click on the data source icon and view its properties to see the source database table or file path. You can then even try disabling the extract to view the original source data.
If instead you see a single database icon, you have a "naked" extract where the reference to the original source has been discarded (unless it is recorded in the Tableau Catalog mentioned below).
If your organization purchased the Data Management Add-on for Tableau Server (strongly recommended), and your data source is published to Tableau Server, you can trace its history and origin by exploring the Tableau Catalog. That is especially valuable if the extract was built by a Tableau Prep flow.
If instead, someone built the extract another way, say by writing a custom app using the Tableau Data Extract API, then the answer is to find that program.
One last point: in recent versions of Tableau, extracts are stored in an efficient relational database file format called Hyper. A Hyper extract can either contain a single table (say, serializing the results of a query joining multiple tables) or multiple tables (say, caching individual tables and deferring the join until later).
That may not be relevant to your question, but could turn out to matter as you reverse engineer how the extract was created.
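If you do end up reverse engineering a naked extract, you can at least see what's inside it. Here's a minimal sketch in Python using the tableauhyperapi package that lists the tables and columns in a .hyper file (the file name Extract.hyper is just a placeholder):

    from tableauhyperapi import HyperProcess, Connection, Telemetry

    # Open the extract and list every schema, table, and column it contains.
    with HyperProcess(telemetry=Telemetry.DO_NOT_SEND_USAGE_DATA_TO_TABLEAU) as hyper:
        with Connection(endpoint=hyper.endpoint, database="Extract.hyper") as conn:
            for schema in conn.catalog.get_schema_names():
                for table in conn.catalog.get_table_names(schema):
                    print(table)
                    for column in conn.catalog.get_table_definition(table).columns:
                        print(f"  {column.name}: {column.type}")

If that prints several tables, the join was deferred until query time; a single table suggests the join (and any union) was baked in when the extract was created.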

Related

When you create a freeform report in MicroStrategy, is it possible to do automatic mapping?

When you finish a freeform query in MicroStrategy, the next step is to map the columns.
Is there any way to do it automatically? At least generate the list of columns with their names.
Thanks!
Sadly, this isn't possible. You will have to map all columns manually.
While this functionality isn't possible with freeform reporting specifically, MicroStrategy Data Import will let you create Data Import cubes. These cubes can be configured as live connections, meaning they execute against the selected data source every time they are used, rather than being the typical snapshot cube. Data Imports from a database can be sourced from a database query. This effectively allows you to write your own SQL, with the end result being a report for which you did not have to map columns manually.

Changing Datasource to SQL Server in TABLEAU

We are changing our data source from MS Access to SQL Server. Questions:
Will we need to redevelop the worksheets and dashboards?
Is there a way to have a worksheet connect to the new data source?
The tables are the same between MS Access and SQL Server.
Thanks
No, you don't need to redevelop all the worksheets and dashboards; just change the data source you use in Tableau. Create your new data source, which hopefully has very similar field names and data types to your original data source. Then go to the Data menu and choose Replace Data Source. Tableau will change your existing worksheets to reference the new data source.
Once that is done, go to each of your worksheets and fix any problems; usually you'll see a few fields that differ for some reason. You can replace the references to those fields if necessary (right-click on any dimension or measure, then choose Replace References). You might also need to do some other minor surgery: delete old fields, repair a group, and so on.
It should be all good. When you’re sure you are done with the old data source, you can close it from the data menu.

Using Data Compare to copy one database over another

I've used the Data Compare tool to update schema between the same DBs on different servers, but what if so many things have changed (including data) that I simply want to REPLACE the target database?
In the past I've just used T-SQL: take a backup, then restore onto the target with the REPLACE option, and/or MOVE if the data and log files are on different drives. I'd rather have an easier way to do this.
You can use Schema Compare (also by Red Gate) to compare the schema of your source database to a blank target database (and update), then use Data Compare to compare the data in them (and update). This should leave you with the target the same as the source. However, it may well be easier to use the backup/restore method in that instance.
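If you do go the backup/restore route, it's easy to script. A rough sketch in Python with pyodbc; the server name, paths, and the logical file names in the MOVE clauses are placeholders you'd replace with your own:

    import pyodbc

    # Restore a source backup over the target database. Assumes SourceDb.bak
    # was taken on the source server and copied somewhere the target SQL
    # Server instance can read. All names below are placeholders.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};SERVER=target-server;"
        "Trusted_Connection=yes;",
        autocommit=True)  # BACKUP/RESTORE cannot run inside a transaction
    cur = conn.cursor()

    cur.execute(
        "RESTORE DATABASE TargetDb "
        "FROM DISK = 'C:\\backups\\SourceDb.bak' "
        "WITH REPLACE, "                                       # overwrite the target
        "MOVE 'SourceDb' TO 'D:\\data\\TargetDb.mdf', "        # relocate data file
        "MOVE 'SourceDb_log' TO 'E:\\logs\\TargetDb_log.ldf'") # relocate log file
    while cur.nextset():  # drain progress messages so the restore runs to completion
        pass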

Tableau TDE or connect to files directly?

I have a personal license for Tableau. I am using it to connect to .csv and .xlsx files currently but am running into some issues.
1) The .csv files are massive (10+ gig)
2) The Excel files are starting to reach the 1mil row limit
3) I need to add certain columns to the .csv files sometimes (like unique ID and a few formulas) which means that I need to open sections of them in Excel, modify what I need to, then save a new file
Would it be better to create an extract for each of these files and then connect the Tableau Workbook to the extract instead of the file? Currently I am connected directly to files and then extract data from there and refresh everyday.
I don't know about others, but I'm using exactly that guideline. I'll have some workbooks that serve simply to extract data from some data source (be it SQL, xlsx, csv, mdb, or any other), and all analysis is performed in other workbooks, which connect only to the tde files.
The advantages are:
1) Whenever you need to update a data source, you only need to update once (replacing the tde file) and all your workbooks will be up to date. If you connect to the same data source and extract to different tde files, you'll have to refresh every one of those extracts (and worry about whether the extract in that specific workbook is up to date). And even if you extract to the same tde from multiple workbooks (which doesn't make much sense), it can be confusing: am I connected to the tde or to the file? Did the extract I made in the other workbook update this one too? (Well, yes it did, but it can be confusing.)
2) You don't have to worry about replacing a datasource, especially when it's a csv, xlsx or mdb file. You can keep many different versions of those files, and choose which one is the best one. For instance, I'll have table_v1.mdb, table_v2.mdb, ..., and a single table_v1.tde, which will be the extract of one of those mdb files. And I still have the previous versions in case I need them.
3) When you have a SQL connection, or anything that is not a file (csv, xlsx, mdb), extracts are very handy for basically the same reasons above, with (at least) one upside. You don't need to connect to a server every time you want to perform an analysis. That means you can do everything offline, and the person using Tableau doesn't need to have access to the SQL table (or any other source).
One good practice is always keeping a back-up when updating a tde (because, well, shit happens)
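On newer versions of Tableau (where extracts are .hyper files instead of .tde) you can even build the extract outside Tableau entirely, which also takes care of adding columns to a giant csv without opening pieces of it in Excel. A rough sketch using pandas and the pantab package; the file, table, and column names are made up:

    import pandas as pd
    import pantab

    # Stream the 10+ GB csv in chunks so it never has to fit in memory,
    # add the extra columns (e.g. a unique row ID) in code instead of Excel,
    # and write everything into a single .hyper extract.
    next_id = 1
    mode = "w"  # overwrite on the first chunk, append after that
    for chunk in pd.read_csv("huge_file.csv", chunksize=500_000):
        chunk["row_id"] = range(next_id, next_id + len(chunk))
        next_id += len(chunk)
        pantab.frame_to_hyper(chunk, "huge_file.hyper",
                              table="Extract", table_mode=mode)
        mode = "a"

Your analysis workbooks then connect to huge_file.hyper just as they would to a tde.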
10 gig CSV, wow. Yes, you should absolutely use a data extract; that would be much quicker. For that much data you could also look at other connections, such as MS Access or a SQL instance.
If your data have that many rows, I would try to set up a small MySQL instance on your local machine and keep the data there instead. You could then connect Tableau directly to the MySQL instance and easily edit the source data.
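A sketch of that approach, loading the csv into a local MySQL database with pandas and SQLAlchemy (the connection string, file, and table names are placeholders):

    import pandas as pd
    from sqlalchemy import create_engine

    # Local MySQL instance with a database called "staging" (placeholder).
    engine = create_engine("mysql+pymysql://user:password@localhost/staging")

    # Stream the csv into MySQL in chunks; Tableau then connects to MySQL,
    # and the source data can be edited with plain SQL.
    for chunk in pd.read_csv("huge_file.csv", chunksize=500_000):
        chunk.to_sql("big_table", engine, if_exists="append", index=False)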

What applications do you use for data entry and retrieval via ODBC?

What apps or tools do you use for data entry into your database? I'm trying to improve our existing (cumbersome) PHP web-based system, which enters data one ... item ... at ... a ... time.
My current solution is to use a spreadsheet. It works well with text and numbers that are human readable, but not with foreign keys that are used to join with other tables' rows.
Imagine that I want a row of data to include what city someone lives in. The column holding this is id_city, which is keyed to the "city" table which has two columns: id (serial) and name (text).
I envision being able to extend the spreadsheet's capabilities to include dropdown menus for every row of the id_city column that would allow the user to select a city (displaying the city names as text) while actually storing the chosen city id. This way, the spreadsheet would:
(1) show a great deal of data on each screen and
(2) could be exported as a csv file and thrown to our existing scripts that manually insert rows into the database.
I have been playing around with MS Excel and Access, as well as OpenOffice's suite, but have not found something that gives me the functionality I mention above.
Other items on my wish-list:
(1) dynamically fetch the names of the cities that can be selected by the user.
(2) allow the user to push the data directly into the backend (not via external files/scripts).
(3) If any of the columns of the rows of data gets changed in the backend, the user could refresh the data on the screen to reflect any recent changes.
Do you know how I could improve the process of data entry? What tools do you use? I use PostgreSQL for the backend and have access to MS Office, OpenOffice, as well as web based solutions. I would love a solution that is flexible, powerful, and doesn't require much time to develop or deploy (I know, dream on...)
I know that pgAdmin3 has similar functionality, but from what I have seen, it is more of an administrative tool rather than something for users to use.
As j_random_hacker noted, I've used MS Access for years (since Access 97) to connect to an ODBC Data Source.
You can do this by linking to external tables (in Access 2010):
New -> Blank Database
External Data -> ODBC Database -> Link to Data Source
Machine Data Source -> New -> System Data Source -> Select Driver (Oracle, or whatever) -> Finish
Enter a new name for your DSN, then all of the connection parameters, and click OK
Select the newly created DSN and hit OK
You can do so much once Access sees your external table as a linked table, including sorting, filtering, etc. There's one caveat: as far as I can tell, ALL operations happen on the client side unless you're using a pass-through query. That's fine if you're looking at a table with 3000 records. With 2,000,000 records, that hurts. To be clear, all data in the table comes down to the workstation, for all tables being joined, and the join happens client-side, NOT server-side.
There are usually standalone tools for basic database management - e.g., for Oracle and MySQL a free tool called SQL Developer suffices for basic database data entry.
For more complex types (especially those involving CLOBs) I can usually knock an application together in Java+SWT in a day if we already have the model and DAOs available on the Java side. Yes, you have to put some effort in, but if it will be used regularly in the future then it is probably worth it.
In your case (well, the case where you have bulk imports of data) knocking up some Perl that reads from the CSV and does the city id lookup would be trivial to implement. Maybe a waste for a one-off thing? Depends on the amount of data to import.
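For what it's worth, that import script really is small. A sketch in Python (rather than Perl) with psycopg2; the city(id, name) table is the one from the question, while people.csv and the person table are made-up examples:

    import csv
    import psycopg2

    conn = psycopg2.connect("dbname=mydb user=me")  # placeholder connection
    cur = conn.cursor()

    with open("people.csv", newline="") as f:
        for row in csv.DictReader(f):
            # Look up the city id by name, creating the city if it's new.
            cur.execute("SELECT id FROM city WHERE name = %s", (row["city"],))
            hit = cur.fetchone()
            if hit is None:
                cur.execute("INSERT INTO city (name) VALUES (%s) RETURNING id",
                            (row["city"],))
                hit = cur.fetchone()
            cur.execute("INSERT INTO person (name, id_city) VALUES (%s, %s)",
                        (row["name"], hit[0]))

    conn.commit()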
I would be surprised if MS Access can't do what you're looking for -- this is basically the exact use case for it. Namely, quickly throwing together a nice UI for a simple CRUD DB application that a spreadsheet doesn't quite stretch to.
This is an answer, technically, but not a recommendation:
I've used Excel and SSIS for importing simple data entry files into MS SQL, but it's not adequate - there's very little ability to control the data, and SSIS is so very touchy, especially when working with Excel.
MS Access does not work well with some non-Microsoft databases. There is an open-source equivalent called Apache OpenOffice Base you may want to try.