What Tableau API does InterWorks use in their Workbook SDK? - tableau-api

InterWorks has a Workbook SDK as part of its Power Tools for Tableau product. Does anyone know how they are able to do this? The SDK can access a workbook without Tableau Server so I don't think it's the JavaScript or REST API.

A Tableau workbook (.twb) file is in XML format. The structure may change between versions, but is relatively straightforward to follow. Most Tableau file formats are also XML. The formats ending in an x (like .twbx) are zipped directories that contain the XML file along with other files.
This means it is not too tough to read information from these XML files, or even modify them. I've edited them by hand in rare cases. Usually there is a better choice than hacking the XML internals, but you can. Just back up your file first, and don't expect Tableau support to help you if it leads to strange behavior in your workbook.
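For example, a minimal Python sketch (the file names are hypothetical, and element names can vary between Tableau versions) that unpacks a .twbx and lists the data sources declared in the embedded .twb:

```python
import glob
import xml.etree.ElementTree as ET
import zipfile

# A packaged workbook (.twbx) is just a zip archive; unpack it first.
with zipfile.ZipFile("example.twbx") as twbx:      # hypothetical file name
    twbx.extractall("extracted")

# The .twb inside is plain XML.
twb_path = glob.glob("extracted/*.twb")[0]
root = ET.parse(twb_path).getroot()

# List the data sources declared in the workbook
# (attribute names may differ between Tableau versions).
for ds in root.iter("datasource"):
    print(ds.get("caption") or ds.get("name"))
```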
In addition to the InterWorks SDK (which is a COTS product), Chris Gerrard published a free Ruby library for accessing Tableau workbooks, https://rubygems.org/gems/twb (or gem install twb), and released the source on GitHub at https://github.com/ChrisGerrard/TWB, along with a few (but not all) of the scripts he's written that use the twb classes: https://github.com/ChrisGerrard/TableauToolsRuby.
Chris gives some useful examples and scripts on his blog, Tableau Friction, including this clever article on automatically documenting the relationships among calculated fields: http://tableaufriction.blogspot.com/2015/02/more-calculated-field-analysis-fields.html
Using twb, you can easily write simple Ruby scripts to look at workbook structure. Since Tableau can change the format when they release new versions of the software, using the SDK or the twb Ruby gem can help isolate your scripts from changes to the format.
Tableau has also released a Document API that supports a modest number of common changes to workbooks - so you can write a script to, say, update the connection string on a set of workbooks (a short sketch appears after the list below).
So you have at least four choices:
Use the InterWorks SDK and tools, which come with support, documentation and a price tag.
Use the free open source twb library, and either reuse existing Ruby scripts or develop the ones you need. And, ideally, contribute the source back if you extend it.
Roll your own XML parsing scripts.
Use the Tableau Document API if it supports your use case. https://github.com/tableau/document-api-python
In all cases, do backups and be prepared for some adjustments when Tableau releases a major or minor version update. Patch releases are pretty safe.
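As a rough illustration of the Document API option, a sketch like the following (the directory and server names are hypothetical; check the project's README for the current interface) updates the connection server across a set of workbooks:

```python
from glob import glob

from tableaudocumentapi import Workbook   # pip install tableaudocumentapi

# Point every connection in a set of workbooks at a new database server.
for path in glob("workbooks/*.twb"):       # hypothetical directory
    wb = Workbook(path)
    for datasource in wb.datasources:
        for connection in datasource.connections:
            connection.server = "new-db.example.com"   # hypothetical server
    wb.save()
```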

Related

Documentation and version control

For a project I'm about to start, there will be documentation produced.
What is the best practice for this?
Should the documents live with the code and assets or should there be a separate documentation store?
Edit
I'd like a wiki, but I will need to print the documents, etc. It's a university project.
It really depends on your team. Where I work, we keep documentation in a wiki which is linked in with our team website. For the purposes of shipping documentation, the wiki can be exported and we run it through a parser that "fancifies" the look and feel of the documentation for customer purposes.
Storing the documentation with the code (typically in your source repository) is not a bad idea. Just make sure to keep them separated. For example, keep a docs folder at the same level as your src folder in your repository. This way, you can quickly ship the current documentation, you can easily track revisions, and anybody new to the project can immediately jump in without having to go to multiple locations for information.
Storing it in source control is fine.
This is an interesting question -- basically, what others are saying about generated documentation is right: source files and templates etc. should be stored in source control and the output generated during your build process.
As far as requirements/specs/etc. documentation, I have worked both ways, and I very much prefer using SharePoint or a Wiki/document portal that is designed for document sharing/versioning. The reason is, most non-developer folks aren't comfortable working with source control systems, and you don't gain any of the advantages of intelligent merging if you are using a binary format like Word. Plus it's nice to have internet-based access so you can reference and work on the docs in a distributed team without people having to install extra software.
Here's a 2017 summary of the options and my experience:
(extreme 1) Completely external (e.g. a wiki, Google Docs, LaTeX, MS Word, MS OneDrive)
People aren't bothered about keeping it up to date (half of them don't even know where to find the page that needs updating since it's so out of the trenches).
Wiki platforms are “captive user interfaces” - your data gets stored in their proprietary schemas and is not easy to examine with a simple text editor (Confluence is even worse in that you have no access to the plaintext content at all anymore).
(extreme 2) Completely internal (e.g. javadoc)
pollutes the source code, and is usually too low level to be of any use. Well-written source code is still the best form of low level documentation.
However, I feel package-info.java files are underutilized.
(balance) Colocated documentation (e.g. README.md)
A good halfway solution, with the benefits of version control. If a single README.md file is not enough, consider a doc/ folder. The only drawback I've seen is deciding whether to source-control helpful graphics (e.g. PNG files) and risk bloating the repo.
One interesting way to avoid this problem is to use plaintext diagram tools (I find Grapheasy and Text Diagram to be a breath of fresh air).
Plaintext can be easily read even if your rendering engine changes as the years go by.
GitHub's success is in no small part thanks to its README.md located in the root of the project.
One tiny disadvantage of this approach, though, is that your continuous integration system will trigger a new build each time you edit the README.md file.
If you are writing versioned user documentation associated with each release of the product, then it makes sense to put the documentation in source control along with its associated product release.
If you are writing internal developer documentation, use automated internal source code documentation (javadoc, doxygen, .net annotations, etc) for source level documentation and a project wiki for design level documentation.
I think most of us in the industry are not really following best practices, and of course it also depends a lot on your situation.
In an agile environment, where you have a very iterative release process, you will want to "travel light". In this particular case, Jason's suggestion of a separate Wiki really works great.
In a waterfall/big-bang model, you will have a better opportunity to produce a decent documentation update with each new release. You will also need to clearly document which version of the requirements was agreed on, and keep extensive documentation for every tiny change you make to requirements (due to the effects it has on subsequent stages). Often it is best if the documentation can live together with the version-controlled source code.
Are you using any sort of auto-documentation or is it completely manual? Assuming that you are using an auto-documentation system, the documentation is more or less generated on the fly, and would be part of the code itself.
To me (assuming that it's possible with whatever language you are using), this would be the preferred method of handling it, as you wouldn't need to maintain the documentation source at all.
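For example, with Python the documentation can live in docstrings and be rendered on demand by tools like pydoc or Sphinx autodoc, so there is no separate document to keep in sync (a trivial sketch):

```python
import os


def total_size(paths):
    """Return the combined size in bytes of the given files.

    Docstrings like this one are the documentation source itself: tools
    such as pydoc or Sphinx autodoc extract them, so the documentation
    is versioned and regenerated together with the code it describes.
    """
    return sum(os.path.getsize(p) for p in paths)

# Render on demand from the command line, e.g.:
#   python -m pydoc <module_name>
```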

PowerBuilder 11.5 & Version Control

What is the best version control system to implement with PowerBuilder 11.5?
If you have examples of how you have done branching/trunk/tags, that would be awesome. We have tried to wrap our heads around it a few times and always run into problems because we use shared libraries such as PFC/PFE in multiple applications.
Right now we are only using PBNative, and it sucks.
Agent SVN is an MS-SCCI Subversion plug-in that works with PowerBuilder.
Here is a link that describes how to setup Agent SVN to work with PowerBuilder and Subversion.
We currently use Perforce and its P4SCC plugin, which works very well. In fact, I'm sure I read somewhere that the guys at Sybase who write PowerBuilder actually use Perforce themselves.
So, to be fair, let's start out by saying that while you're asking about version control, PBNative is source control. If you compare something that is intended to have more features than just keep two developers from editing the same piece of source, then yes, PBNative will suck. The Madone SL may be an incredible bicycle, but if you're trying to take a couple of laps around an Indy track, it will suck.
"Best" is a pretty subjective word. There are lots of features available in version control and configuration management tools. You can get tons of features, but you'll pay through the nose. StarTeam has some nice features like being able to trace a client change request or bug report all the way through to the changed code, and being able to link in a customized diff tool (which is particularly useful in PB). Then again, if cost is your key criteria rather than features, there are lots of free options that will get the job done. As long as the tool supports the Microsoft SCC interface, you should be OK.
There is a relatively active NNTP newsgroup that focuses on source control with PowerBuilder, which you can also access via the web. You can probably find some already-posted opinions there.
Many years ago I used StarTeam to control PB applications. PowerBuilder, needless to say, is an outdated bear, and it has to export each and every object from its "libraries" into source control.
Currently our legacy PB apps have their libraries saved whole into Subversion, without any support for diffs, etc.
We use Visual SourceSafe. We don't use PFC, but we do have libraries that are shared among several projects. Till now, each project was developed separately from the others, and so the shared libraries were duplicated. To have them synchronized, they were all shared at the VSS level. Lately we've reorganized our sources so all projects are near each other, and there's only one instance of the shared libraries.
VSS is definitely not the best source control system, to say the least, but it integrates into PB without the need for any bridges. PB has an inherent problem working with source control, so it probably won't make much of a difference working with one instead of the other (at least from the PB point of view).
Now, on a personal note, I'd like to say PB 11.5 is a piece of sh*t. It constantly crashes, full of unbelievable UI nuisance and just brings productivity to its knees. It's probably the worst IDE ever created. Stay away if possible.
FYI: The new PB12 (PB.NET) will integrate with SCC systems so you can easily choose which source control system you want to use. Since we have basically dropped PBLs (they are now directories), files can be checked in/out individually - even with a plain vanilla editor, since files are now normal (Unicode) text files.
StarTeam integrates so beautifully with the PB IDE. I used that combination at my previous company (PB9 and ST5.x) for several years. You should be managing your code at the object level - don't log the entire PBL into ST...
If you're having problems with that setup, hit me up offline. phoran at sybase dot com.
We use Merant Version Manager for older projects and TFS for newer work. The only issues we have are that TFS does not support keyword expansion, and changing the "read the flowerbox comments" attitude people have. Some folks are nervous about losing the inline versioning history.
We use StarTeam and have been very pleased with it. It combines bug tracking with version control. Unfortunately though we don't store our files on the object level. We just store the PBL files directly in source control. Anything that supports the SCC interface theoretically should work correctly in PowerBuilder.
PB9: We used PVCS but had stability problems with pbl corruption and also problems co-existing with later versions of Crystal Reports (dll conflict) so now we use PB9 with Dynamsoft's Source Anywhere Standalone. This system is more primitive; it is missing the more advanced features for promotion levels and for pulling out an older milestone version of all objects to make a patch build.
What we are looking for now is something which will allow more advanced "change management", to support promotion levels at the change level (rather than at the object level). Would it be better to use perforce, starteam, or (harvest change manager + HarPB), or something else? Any advice on these combinations would be greatly appreciated.
You can always use Plastic SCM with PowerBuilder through SCC. Plastic is pretty advanced in terms of graphics, tools, replica and so on, so it's always a good choice to keep in mind.

Do you version "derived" files?

Using online interfaces to a version control system is a nice way to have a published location for the most recent versions of code. For example, I have a LaTeX package here (which is released to CTAN whenever changes are verified to actually work):
http://github.com/wspr/pstool/tree/master
The package itself is derived from a single file (in this case, pstool.tex) which, when processed, produces the documentation, the readme, the installer file, and the actual files that make up the package as it is used by LaTeX.
In order to make it easy for users who want to download this stuff, I include all of the derived files mentioned above in the repository itself as well as the master file pstool.tex. This means that I'll have double the number of changes every time I commit because the package file pstool.sty is a generated subset of the master file.
Is this a perversion of version control?
@Jon Limjap raised a good point:
Is there another way for you to publish your generated files elsewhere for download, instead of relying on your version control to be your download server?
That's really the crux of the matter in this case. Yes, released versions of the package can be obtained from elsewhere, so it really does make more sense to only version the non-generated files.
On the other hand, @Madir's comment that:
the convenience, which is real and repeated, outweighs cost, which is borne behind the scenes
is also rather pertinent in that if a user finds a bug and I fix it immediately, they can then head over to the repository and grab the file that's necessary for them to continue working without having to run any "installation" steps.
And this, I think, is the more important use case for my particular set of projects.
We don't version files that can be automatically generated using scripts included in the repository itself. The reason is that after a checkout, these files can be rebuilt with a single click or command. In our projects we always try to make this as easy as possible, thus preventing the need to version these files.
One scenario I can imagine where this could be useful is "tagging" specific releases of a product, for use in a production environment (or any non-development environment) where the tools required for generating the output might not be available.
We also use targets in our build scripts that can create and upload archives with a released version of our products. These can be uploaded to a production server, or to an HTTP server for downloading by users of your products.
I am using Tortoise SVN for small-system ASP.NET development. Most code is interpreted ASPX, but there are around a dozen binary DLLs generated by a manual compile step. Whilst it doesn't make a lot of sense in theory to have these binaries version-controlled, it certainly makes it convenient to ensure they are correctly mirrored from the development environment onto the production system (one click). Also - in case of disaster - the rollback to the previous step is again one click in SVN.
So I bit the bullet and included them in the SVN archive - the convenience, which is real and repeated, outweighs cost, which is borne behind the scenes.
Not necessarily, although best practices for source control advise that you do not include generated files, for obvious reasons.
Is there another way for you to publish your generated files elsewhere for download, instead of relying on your version control to be your download server?
Normally, derived files should not be stored in version control. In your case, you could build a release procedure that created a tarball that includes the derived files.
As you say, keeping the derived files in version control only increases the amount of noise you have to deal with.
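For example, a minimal release step in Python along those lines (the version number and list of generated files are illustrative) might bundle the derived files into a tarball instead of committing them:

```python
import tarfile
from pathlib import Path

VERSION = "1.2.3"                                  # hypothetical release number
DERIVED = ["pstool.sty", "pstool.pdf", "README"]   # illustrative output files

# Bundle the generated files into a versioned tarball for distribution,
# so only the master source (pstool.tex) has to live in version control.
with tarfile.open(f"pstool-{VERSION}.tar.gz", "w:gz") as tar:
    for name in DERIVED:
        if Path(name).exists():
            tar.add(name)
```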
In some cases we do, but it's more of a sysadmin type of use case, where the generated files (say, DNS zone files built from a script) have intrinsic interest in their own right, and the revision control is more linear audit trail than branching-and-tagging source control.

Solution deployment, CM, InstallShield

People,
We have 4 or 5 utilities that work in conjunction with our application. These utilities are .bat files, VB apps, PowerBuilder apps, etc. I am trying to manage these utils in source control, and am trying to figure out a better way to assign versions to them. Right now, the developers use the version control system's metadata -- specifically labels -- to store the version number of the tool.
My goal is to have individual InstallShield packages for each utility, and an easy means to manage and assign version numbers to these packages.
Would you recommend a separate .ini file with the info, or store the info in InstallShield .ism file itself, or just use the meta-data info from version control tool?
UPDATE:
I like the idea, Orion. I have one concern, though: the script that increments the version number cannot be intelligent enough to increment the major number, etc., right? E.g. if one of the utils has version 1.2.3 and we are at a point where the new version is 2.0.0, the script may not be able to handle this.
I think this has a lot to do with our branching techniques -- we don't have any. The folks thought that since the utils are so small, the source may not need branches.
PowerBuilder in particular has a nice trick you can do to incorporate the build number from an ini file into the compiled application.
Details here: http://www.pbdr.com/pbtips/ex/autorev.htm
We have an ini file in source control that stores the build number, and its value is used in our build scripts to determine what label to apply to the source tree after a successful build. Works very nicely for our needs. When we branch, we do have to manually bump the file to increment the proper number, though.
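A rough sketch of that arrangement in Python (the file name and label format are hypothetical):

```python
import configparser

# build.ini lives in source control next to the code, e.g.:
#   [build]
#   number = 42
config = configparser.ConfigParser()
config.read("build.ini")                   # hypothetical file name

build_number = config.getint("build", "number")
label = f"BUILD_{build_number}"            # hypothetical label format

# After a successful build, apply this label to the source tree using
# your version control tool's tag/label command.
print(f"Would apply label {label}")
```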
I managed our build system at my last job, which seemed to have some parallels to what you're asking.
There were ~30 C++ projects which needed compiling, various .NET/Java things, and the odd Perl script.
This was all built on our build machine using NAnt - If I were doing it today I'd use rake, but the idea is the same.
We basically had an auto-incrementing build number which was stored in a version.txt file in the root of the repository.
Each time we did a build (automatically done each night, or on demand if necessary) the script would increment this number and check the file back into source control.
All the other apps referenced this file for their version number; for things which didn't support working like this, the script would set environment variables or perform other workarounds.
I'm pretty sure that our InstallShield programs referenced an environment variable for their version number, but we deprecated them in favour of WiX, as InstallShield really did suck.
In the case of Visual Studio, grep/replace the number within the .csproj files, and check them back in.
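A rough Python equivalent of that version.txt / grep-and-replace approach (the file names and the .csproj element are illustrative; adapt to whatever holds the version in your project format):

```python
import re
from pathlib import Path

# Bump the build number stored at the root of the repository.
version_file = Path("version.txt")                  # hypothetical location
major, minor, build = version_file.read_text().strip().split(".")
new_version = f"{major}.{minor}.{int(build) + 1}"
version_file.write_text(new_version + "\n")

# Grep/replace the version inside each Visual Studio project file.
# (The element that carries the version depends on your project format.)
for csproj in Path(".").rglob("*.csproj"):
    text = csproj.read_text()
    text = re.sub(r"<Version>[^<]*</Version>",
                  f"<Version>{new_version}</Version>", text)
    csproj.write_text(text)

# The build script would then check version.txt and the .csproj files
# back into source control as part of the automated build.
```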
Hope this gives you some ideas
Using the metadata from your version control system should keep things simpler. It's how your developers already use the system. There is no additional file to maintain. My personal experience has taught me to version the satellite applications with the same version as the main app. K.I.S.S.

What is the best/a very good meta-data reader library?

Right now, I'm particularly interested in reading the data from MP3 files (ID3 tags?), but the more it can do (e.g. EXIF from images?) the better, without compromising the ID3 tag reading abilities.
I'm interested in making a script that goes through my media (right now, my music files) and makes sure the file name and directory path correspond to the file's metadata, then creates a log of mismatched files so I can check which is accurate and make the proper changes. I'm thinking Ruby or Python (see a related question specifically for Python) would be best for this, but I'm open to using any language really (and would actually probably prefer an application language like C, C++, Java, or C# in case this project takes off).
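For illustration, a minimal Python sketch of the kind of check I have in mind, assuming the third-party mutagen library (an assumption on my part) and a hypothetical Artist/Album/Title.mp3 layout:

```python
from pathlib import Path

from mutagen import File as MediaFile   # pip install mutagen (assumed library)

MUSIC_ROOT = Path("~/Music").expanduser()          # hypothetical library root
mismatches = []

for path in MUSIC_ROOT.rglob("*.mp3"):
    audio = MediaFile(str(path), easy=True)
    tags = (audio.tags or {}) if audio else {}
    artist = (tags.get("artist") or [""])[0]
    title = (tags.get("title") or [""])[0]
    # Expecting an Artist/Album/Title.mp3 layout (also an assumption).
    if artist != path.parent.parent.name or title != path.stem:
        mismatches.append(f"{path}: tags say {artist!r} / {title!r}")

# Log the mismatches for later review.
Path("mismatches.log").write_text("\n".join(mismatches))
```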
There is a great post on using PowerShell and TagLibSharp on Joel "Jaykul" Bennet's site. You could use TagLibSharp to read the metadata with any .NET-based language, but PowerShell is quite appropriate for what you are trying to do.
Use exiftool (it supports ID3 too). It's written in Perl, but can also be used from the command line, and it has compiled Windows and Mac versions.
It is light-years ahead of any other metadata tool, supporting almost all known audio, video and image files; it supports writing (not just reading), and knows about all the custom/extended tags used by software (such as Photoshop) and hardware (many camera manufacturers).
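For example, exiftool's -json output is easy to consume from a script; a rough Python sketch (the file name is hypothetical, and exiftool is assumed to be on the PATH):

```python
import json
import subprocess

# exiftool can emit its output as JSON, which is easy to post-process.
result = subprocess.run(
    ["exiftool", "-json", "song.mp3"],        # hypothetical file name
    capture_output=True, text=True, check=True,
)
metadata = json.loads(result.stdout)[0]       # one object per input file
print(metadata.get("Artist"), "-", metadata.get("Title"))
```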
@Thomas Owens: PowerShell is now part of the Common Engineering Criteria (as of Microsoft's 2009 product line) and, starting with Server 2008, is included as a feature. It stands as much of a chance of being installed as Python or Ruby. You also mentioned that you were willing to go to C#, which could use TagLibSharp. Or you could use IronPython...
@Thomas Owens: TagLibSharp is a nice library to use. I always lean to PowerShell first, one to promote the language, and two because it is spreading fast in the Microsoft domain. I have nothing against using other languages; I just lean towards what I know and like. :) Good luck with your project.
Further to Anon's answer - exiftool is very powerful and supports a huge range of file types, not just images, but video, audio and numerous document formats.
A Ruby interface for exiftool is available in the form of the mini_exiftool gem; see http://miniexiftool.rubyforge.org/