What is the best/a very good meta-data reader library?

Right now, I'm particularly interested in reading the data from MP3 files (ID3 tags?), but the more it can do (e.g. EXIF from images?) the better, as long as that doesn't compromise the ID3 tag reading abilities.
I'm interested in making a script that goes through my media (right now, my music files), makes sure the file name and directory path correspond to the file's metadata, and then creates a log of mismatched files so I can check which is accurate and make the proper changes. I'm thinking Ruby or Python (see a related question specifically for Python) would be best for this, but I'm open to using any language really (and would actually probably prefer an application language like C, C++, Java, C# in case this project takes off).

There is a great post on using PowerShell and TagLibSharp on Joel "Jaykul" Bennett's site. You could use TagLibSharp to read the metadata with any .NET-based language, but PowerShell is quite appropriate for what you are trying to do.

Use exiftool (it supports ID3 too). It is written in Perl, but can also be used from the command line, and it has compiled Windows and Mac versions.
It is light-years ahead of any other metadata tool: it supports almost all known audio, video and image files, supports writing (not just reading), and knows about the custom/extended tags used by software (such as Photoshop) and hardware (many camera manufacturers).
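A minimal Python sketch of the check described in the question, built on exiftool's `-json` output. It assumes exiftool is on the PATH and that the collection is laid out as Artist/Album/Title.mp3; that layout, the choice of the `Artist`, `Album` and `Title` tags, and the helper names are assumptions made for illustration, not something the answer specifies.

```python
import json
import subprocess
from pathlib import PurePosixPath


def read_tags(paths):
    """Run exiftool once and return one dict of tags per file."""
    result = subprocess.run(
        ["exiftool", "-json", *paths],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def expected_parts(tags):
    """Path components implied by the assumed Artist/Album/Title.mp3 layout."""
    return (
        str(tags.get("Artist", "")),
        str(tags.get("Album", "")),
        f"{tags.get('Title', '')}.mp3",
    )


def find_mismatches(records):
    """Return (actual path, expected tail) for files whose path disagrees with the tags."""
    mismatched = []
    for tags in records:
        actual = PurePosixPath(tags["SourceFile"])
        want = expected_parts(tags)
        # Compare only the last three components: artist dir, album dir, file name.
        if actual.parts[-3:] != want:
            mismatched.append((str(actual), "/".join(want)))
    return mismatched
```

Calling `find_mismatches(read_tags(list_of_mp3_paths))` would give the list to write to the log; characters that are legal in tags but not in file names would need extra handling that this sketch leaves out.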

@Thomas Owens PowerShell is now part of the Common Engineering Criteria (as of Microsoft's 2009 Product Line) and, starting with Server 2008, is included as a feature. It stands as much of a chance of being installed as Python or Ruby. You also mentioned that you were willing to go to C#, which could use TagLibSharp. Or you could use IronPython...

@Thomas Owens TagLibSharp is a nice library to use. I always lean to PowerShell first, one to promote the language, and two because it is spreading fast in the Microsoft domain. I have nothing against using other languages, I just lean towards what I know and like. :) Good luck with your project.

Further to Anon's answer - exiftool is very powerful and supports a huge range of file types, not just images, but video, audio and numerous document formats.
A Ruby interface for exiftool is available in the form of the mini_exiftool gem; see http://miniexiftool.rubyforge.org/

Finding built-in Unity scripts

I would like to look into the core files of the Unity engine. For example, I've tried using Unity's built-in "Fog" effect and would like to see how it works on a deeper level (code). Is this something one can find, or is it encrypted in some way?
You can take a look at the unofficial decompiled Unity repo to check whether the sources you're looking for are there. It's a decompiled version, so there is no guarantee that it's the actual code. Also, a big part of Unity's source is written in C++, and the C# scripts just call into that native C++ part, so the repo is really incomplete.
The other option: some big companies, Unity partners with the highest support level, have access to the official source code. So maybe you can find someone with access to the sources.

Objective-C Documentation Generators: HeaderDoc vs. Doxygen vs. AppleDoc

I need to implement a documentation generation solution for my workplace and have narrowed it down to the three mentioned in the title. I have been able to find very little information in the way of formalized comparisons between these solutions, and I'm hoping that those of you with experience in one or more of the above can weigh in:
Here is what I have been able to glean from my initial pass:
HeaderDoc Pros: Consistent with Apple's existing docs, compatibility with making Apple docsets
HeaderDoc Cons: Difficult to modify behavior, project is not actively worked on, many have switched away from it (meaning there must be something deficient, though I can't quantify it)
Doxygen Pros: Active support community because of its wide user base, very customizable, most output types (LaTeX etc.)
Doxygen Cons: Takes work to make it look/behave consistently with Apple's docs, compatibility with Apple docsets is not as simple
AppleDoc Pros: Looks consistent with Apple's existing docs, compatibility with making Apple docsets
AppleDoc Cons: Issues with documentation of typedefs, enums, and functions; actively being developed
Does this sound accurate? Our desired solution will have:
Consistent look and feel with Apple's Objective-C class reference
Ability for option-click to pull up the documentation reference from within Xcode, and then link to the doc (just like Apple's classes)
Smart handling of categories, extensions, and the like (even custom categories of Apple's classes)
Ability to create our own reference pages (like this page: Loading…) that can include images and be linkable from generated class references seamlessly, like how Apple's UIViewController class reference links to the linked page
Easy to run command line commands that can be integrated into build scripts
Graceful handling of very large codebase
Based on all of the information above, are any of the above solutions clearly better than the others? Any suggestions or information to add would be extremely appreciated.
As the creator and lead developer of doxygen, let me also provide my perspective
(obviously biased as well ;-)
If you are looking for a 100% faithful replica of Apple's own documentation style, then AppleDoc is a better choice in that respect. With doxygen you'll have a hard time getting that exact same look, so I would not recommend trying.
With respect to Xcode docsets: Apple provides instructions on how to set that up with doxygen (written at the time Xcode 3 was released). For Xcode 4 there is also a nice guide on how to integrate doxygen.
As of version 1.8.0, doxygen supports Markdown markup, as well as a large number of additional markup commands.
With doxygen you can include documentation on the main page (@mainpage) as well as on subpages (using @subpage or @page). Inside a page you can create sections and subsections. In fact, doxygen's user manual was completely written using doxygen. Besides that, you can group classes or functions together (using @defgroup and @ingroup) and inside a class make custom sections (using @name).
Doxygen uses a configuration file as input. You can generate a template with default values using doxygen -g or use a graphical editor to create and edit one. You can also pipe options through doxygen via a script using doxygen - (see question 17 of the FAQ for an example)
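As a rough illustration of the `doxygen -` form mentioned above, here is a Python sketch that appends a few overrides to an existing configuration and feeds the result to doxygen on standard input. It assumes doxygen is on the PATH and relies on a later assignment replacing an earlier one, which is how the FAQ-style piping trick is normally used; the helper names are made up for this example.

```python
import subprocess


def with_overrides(base_config, overrides):
    """Append KEY = VALUE lines after the base configuration text."""
    lines = [base_config.rstrip("\n")]
    lines += [f"{key} = {value}" for key, value in overrides.items()]
    return "\n".join(lines) + "\n"


def run_doxygen(config_text):
    """Feed the configuration to doxygen on standard input (the 'doxygen -' form)."""
    subprocess.run(["doxygen", "-"], input=config_text, text=True, check=True)
```

A build script could then call something like `run_doxygen(with_overrides(open("Doxyfile").read(), {"PROJECT_NUMBER": "1.2"}))` without touching the checked-in Doxyfile.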
Doxygen is not limited to Objective-C, it supports a large range of languages including C, C++, and Java. Doxygen is also not limited to the Mac platform, e.g. it runs on Windows and Linux too. Doxygen's output also supports more than just HTML; you can generate PDF output (via LaTeX) or RTF and man pages.
Doxygen also goes beyond pure documentation; doxygen can create various graphs and diagrams from the source code (see the dot related options). Doxygen can also create a browsable and syntax highlighted version of your code, and cross-reference that with the documentation (see the source browser related options).
Doxygen is very fast for small to medium sized projects (the diagram generation can be slow though, but nowadays runs on multiple CPU cores in parallel and graphs from one run are reused in the next run).
For very large projects (e.g. millions of lines of code) doxygen allows the projects to be split into multiple parts and can then link the parts together as I explained here.
A nice real-life example of using doxygen for Objective-C can be found here.
The development of doxygen highly depends on user feedback. We have an active mailing list for questions and discussions and a bug tracker for both bugs and feature requests.
Most users of doxygen use it for C and C++ code, so naturally these languages have the most mature support and the output is more tuned towards the features and needs of these languages. That said, wishes for and issues with other languages are also taken seriously.
Note that I do nearly all doxygen development and most testing on a Mac myself.
I'm the author of appledoc, so this answer may be biased :) I tried all mentioned generators though (and more) but got frustrated as none produced results I wanted to have (similar goals as you).
Regarding your points (I only mention appledoc and doxygen; I don't recollect headerdoc that well):
Consistent look: appledoc out of the box; the others need CSS tweaks, but it's probably doable.
Generation of documentation sets (for Xcode references): appledoc full support for searchable and option-clickable documentation out of the box, doxygen generates xml and makefile which you need to invoke yourself. Additionally appledoc supports published docsets out of the box.
Categories: appledoc allows you to merge categories to known classes or leave them separate, foundation & other apple class categories are listed separately in index file. doxygen: this wasn't working best when I tried it.
Custom reference pages: appledoc supports out of the box using either markdown or custom html, doxygen: you can include custom documentation to main page, don't know if you can include more pages.
Easy command line: depends how you look at it: appledoc can take all arguments through command line switches (but also supports optional global and project settings plist files) so it should be very easy to integrate with build scripts. doxygen requires usage of configuration file to setup all parameters.
Large codebases: all tools should support this, although didn't compare timewise. Also not sure if any tool supports cached values (running over previously collected data in order to save some time) - I am looking into adding this for next major release.
It's been some time since I tried the other tools, so the above-mentioned issues with doxygen/headerdoc may have been addressed! appledoc itself also has disadvantages: like you mention, there's no support for enums, structs, functions etc. (there was some work done in this direction, check this fork), and it has its own set of issues that may prevent you from using it, depending on your requirements.
I am currently working on a major update that will cover the most glaring issues, including support for enums, structs etc. I'm regularly pushing new stuff to the experimental branch as soon as I finish larger chunks and make them stable enough, so you can follow the progress. But it's still very early and progress depends on my time, so it may take quite a while until there is a working solution.
Xcode 5 will now parse your comments to search for documentation and display it.
You don't have to use appledoc or doxygen anymore (at least when you don't want to export your docs). More information can be found here

awk powered CMS

I have been lurking for a long time on this forum and I have found it to be the most useful. This is my first question, so forgive me if it is not phrased properly. I am looking for a simple nawk-based CMS (the server doesn't belong to me, so I cannot install gawk even if I wanted to) or a collection of shell/awk scripts to help me manage my growing collection of pure XHTML 1.0/CSS files which make up my personal website. I tried TinyTim and Blis on my personal computer. Apart from being non-portable (sorry, but Bash and gawk are not standard Unix tools), I found them not to be fully functional. Can anybody suggest any other solutions? I have my own growing collection of quick scripts, but I need something more robust. I am willing to consider a simple Perl-based solution. Python would be a stretch, but I really like the language and I use it daily for scientific computing, so I am willing at least to learn about that option.
I wrote a static site generator using awk and sh called Zodiac. It supports Markdown and plain HTML, a main site layout and metadata, and it's written in POSIX awk and sh. This could be the awk-based content management system you're looking for.
An interesting question! But this is not a traditional answer. I have numerous comments that won't fit well into the S.O. comment format, so please forgive this violation of etiquette.
As much as I like *awk, I can see several obstacles.
1. I'm not aware of any CMS tools created with nawk. I have a wide range of experience of what is available with awk, and as you've discovered, there are a few (TinyTim and Blis), but they're based on bash/gawk and they're not as fully featured as you require.
When I went to the mother-ship of awk (www.awk.info), I got the distinct impression that the site has been hacked. I did find A tiny CMS in awk, but I assume it is a gawk-based system. The two sites have related authors, so I'm afraid it may be hacked too. Beware!
2. It sounds like you are thinking of a traditional awk command-line and shell script based system. If so, my limited experience with CMS systems has been that they are GUI based systems for content creation and management, so a GUI page creator, AND THEN a GUI wrapper around something similar to a traditional unix repository/SCCS system. CMS experts are welcome to enumerate the differences.
So, why not just make some wrapper scripts around CVS or similar that allow you to control your repository as you need?
3. System effectiveness I: using CVS as a placeholder for the repository side of your CMS system, think how big that source code is, and that it is written in C, which gives much finer access and control to sub-systems related to file ownership and security issues (as well as many others) than you can get in nawk or any shell. (Compiled C executes much faster of course, but in this day of 3 GHz+ processors, it's not an absolute requirement to insist on compiled code.)
4. System effectiveness II: You say you want to store mostly XHTML 1.0/CSS type files. That is a major setback for your project: awk is a regex-based language and can't effectively parse XML-like data. Have you lurked enough here to have read parse xml in bash OR complex conversions?
Of course, the post I was really looking for, I can't find! Search for phrases like 'friends don't let friends do XML in sed/awk/bash' ;-)!
5. Re TinyTim and Blis: Reconsider your objection to gawk/bash: these two excellent languages are supersets of nawk and ksh(88). Depending on how little/much the scripts rely on gawk/bash-specific features, at the easy end you may only need to change the 'she-bang' at the top of the file to #!/bin/nawk, #!/bin/ksh OR, more realistically, make that change and then rewrite some code for nawk/ksh. Worst case is that the gawk and bash code relies so heavily on specific 'branded' features that it is really impractical to rewrite. It's worth a look.
To complete the picture, also see gawkxml.
Obviously a gawk system, but I did make a conversion to nawk with some code changes. It worked for my needs, but I didn't try to fix the case of the self-verifying aspect of the code that didn't work ;-(
EDIT
6. Finally, look at the range of systems from the original awk creators in their classic book 'The Awk Programming Language', Chap 4 Reports and Databases, 'A relational database system' AND Chap. 6, Little Languages. There may be ideas there for you (no prebaked CMS however ;-).
So, given that perl and python both have good-to-great XML processing built-in via imported modules, I think you have to seriously consider them OR install something like xmlstarlet (per the S.O. links above) and write your shell system wrappers to work with it.
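To make the point about built-in XML processing concrete, here is a small Python sketch that pulls the title and link targets out of an XHTML page with the standard library's `xml.etree.ElementTree`. The function name is hypothetical, and it assumes the pages are well-formed XHTML 1.0 in the usual XHTML namespace and do not depend on named entities such as `&nbsp;`, which this parser does not resolve on its own.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so every path step needs the prefix.
XHTML_NS = {"x": "http://www.w3.org/1999/xhtml"}


def page_summary(xhtml_text):
    """Return the page title and all link targets from a well-formed XHTML page."""
    root = ET.fromstring(xhtml_text)
    title = root.findtext("x:head/x:title", default="", namespaces=XHTML_NS)
    links = [a.get("href") for a in root.iterfind(".//x:a[@href]", XHTML_NS)]
    return title, links
```

A wrapper script could run this over every page to build an index or check internal links, which is the kind of job that gets fragile quickly with regular expressions in awk.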
I hope this helps.
Try Jekyll:
http://jekyllrb.com/
You just write up some text files using some simple, intuitive syntax. Then when you run Jekyll, it generates a whole folder full of plain HTML files, ready to upload.
The code can be extended using Ruby plugins which add extra functionality.
It is supported on GitHub Pages: if you upload a repository full of Markdown files, GitHub will run Jekyll on it automatically and host it on your personal subdomain.
There's also Hyde which is written in Python, but I haven't tried it.
A Google search for "static website generator" will yield millions of results. Try a few and pick what you like!

Tools for manual translation of Constants/Messages .properties files

I'm looking for some tools that could be used by human translators during the process of translating our GWT application into other languages.
Currently, we have the English version of the .properties files containing constants and messages, and we need to create the files for other languages. The tool should be easy to use, so that even someone who is not an IT enthusiast can master it.
Or do you suggest another method for translating the texts?
I heard the "community" approach is becoming quite popular. By that I mean that one uploads the texts to some (?) forum, and the community there creates the translations into other languages, but as I said, I don't know much about this.
Are there any online platforms for this purpose?
Any other ideas?
See my SO answer for "VB 6 source code, speech text is in french want to translate to english". The same answer works if you replace the computer language "VB6" by "JavaScript".

MS Word is evil! Is there a good alternative? [closed]

As a developer I really don't like writing documentation but when I have to I'd like to make the process as painless as possible.
The problem with Word is that it constantly gets in my way. I worry more about the layout than about the actual content ... that's why I'd like to get rid of Word.
Ideally I'd like to write my content and then 'compile' it into a document.
I've heard of LaTeX but I don't have any experience with it whatsoever. Would this be the right technology for the job? What editor (Windows) should I use? Is it a good idea to start with LyX?
EDIT: I'm not asking about documenting code (I use Sandcastle for that).
Update 2014:
We have now switched to GFM (GitHub Flavored Markdown).
It's really easy to work with.
Write code & documentation in the same IDE!
Everything can be versioned!
Get great output either as raw txt, html or pdf!
My solution to this was to invest some time in creating a decent Word Template for myself.
The important thing to do is make sure you have a Style defined for everything you can put in the document.
Once you have all the Styles defined and all of the document content tagged with the correct Style instead of formatted in an ad hoc fashion, you'll be surprised how easy it is to produce good looking Word documents quickly every time.
The wider problem here is that everyone spends hours in Word and yet it is very rare for companies to invest in Word training. At some point you have to bite the bullet and take the time to teach yourself how to use it properly, just like you would with any other tool.
Anything you can do with LyX you can do with LaTeX. LaTeX is suitable for all sorts of things; it has been used for everything from manuals to lecture slides to novels.
I think LaTeX is probably worth looking into as an option; if you've ever wanted to "code" for your word processor, LaTeX is for you. At the simplest level you can define new commands to do things for you, but there's a lot of power there. And the output looks really neat.
In my opinion, LyX is fantastic in certain circumstances, handy in others, and occasionally just gets in your way. I think it should be seen as a productivity booster for LaTeX. In other words, learn to use LaTeX before trying LyX. Both are of course free and available for Windows, though the learning curve is quite steep compared with MS Word. For long documents, or plenty of similar documents, LaTeX/LyX is probably a worthwhile investment.
I've found that wikis can be good for this. Find a wiki you like that lets you do a bit of formatting, but nothing really heavy. Ideally it should let you format code easily too - to be honest, the markdown available on SO is probably a good start.
That way:
You have change tracking built-in (assuming a decent wiki)
You can edit from anywhere
Everyone always sees the same documentation (instant distribution)
You can concentrate on content instead of formatting
You could write your documentation using your own XML format and then transform it into any format with XSL (e.g. PDF via FOP+XSL-FO ).
See also the DocBook XML format.
LaTeX is an extremely powerful tool and might well be overkill here as it is designed for scientific/mathematical literature. It has a (relatively) steep learning curve and can be tricky to coax to do exactly as you want if you're new to it. I LOVE LaTeX, but it is not really a general purpose word processor.
Have you considered OpenOffice instead?
LaTeX is really a very powerful language if you need to write documents.
Perhaps you can try texmaker, a cross-platform LaTeX editor:
Texmaker is a clean, highly configurable LaTeX editor with good hot key support and extensive Latex documentation. Texmaker integrates many tools needed to develop documents with LaTeX, in just one application. It has some nice features such as syntax highlighting, insertion of 370 mathematical symbols with only one click, and "structure view" of the document for easier navigation.
What about using HTML? This way you could then publish the documentation if there will be need for many people to access it from many places.
Despite all efforts and reasonable expectation I don't think Word Processing has been "solved" yet.
My response to what I also personally find a deeply frustrating experience with MS Word is to avoid it altogether and use an auto-documenting tool like GhostDoc to generate XML from what I've already written in the code (DRY!) and deal with the XML from an XSLT based intranet site or similar later.
Are you talking about documenting your actual code? If so, I recommend Doxygen for unmanaged code and Sandcastle for managed code. Both will compile your help or build it as a website for you.
Both applications will read special tags above functions / classes / variables and compile that into the help.
Well, I've never found anything wrong with MS Word in the first place (i.e. if you take the time to learn how to use it effectively). OpenOffice is indeed an amazing and credible free alternative, but if you hate MS Word for layout-related problems, the same problems are going to occur with OpenOffice too.
I've never tried the LaTeX system myself, but I have heard it's good for scientific work. I think using some HTML WYSIWYG editor would be best for you, if you want to just focus on the content.
I considered a wiki, but I decided to go with a modified Markdown notation, for the simple reason, that a wiki's content isn't easily exported and distributed outside of the wiki itself, while the Markdown can be rendered into HTML.
Answer to chris' question about my workflow: I write the documentation with a Notepad-like application (TextWrangler, only because of its word-wrapping feature) in its raw Markdown format. Then I have a small localhost documentation website with my modified Markdown parser (extended for a few features and a bit more HTML-oriented functionality) that checks for the timestamps for the documentation files - if a file has been updated, it parses that file into HTML, and stores the file in a cache.
This way I'm able to edit the source documentation on my desktop, and just press F5 in my browser to see the results immediately.
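The timestamp check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and `render_placeholder` is a stand-in that merely escapes the text, where the real setup would call the modified Markdown parser.

```python
import html
from pathlib import Path


def render_placeholder(markdown_text):
    """Stand-in for a real Markdown parser: escapes the text and wraps it in <pre>."""
    return f"<pre>{html.escape(markdown_text)}</pre>"


def cached_html(source, cache_dir, render=render_placeholder):
    """Return HTML for source, re-rendering only when the source is newer than the cache."""
    source, cache_dir = Path(source), Path(cache_dir)
    cached = cache_dir / (source.stem + ".html")
    # Re-render when there is no cached copy yet or the source was modified after it.
    if not cached.exists() or cached.stat().st_mtime < source.stat().st_mtime:
        cache_dir.mkdir(parents=True, exist_ok=True)
        cached.write_text(render(source.read_text(encoding="utf-8")), encoding="utf-8")
    return cached.read_text(encoding="utf-8")
```

A local web handler would call `cached_html` on each request, so pressing F5 after saving the source file is enough to trigger a fresh render.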
I haven't got around to trying it yet, but I've always thought AsciiDoc would be good for this kind of thing.
If you want something simpler than LaTeX, you can have a look at ReStructured Text
Read this book: http://en.wikipedia.org/wiki/The_Pragmatic_Programmer . One recurring idea in it is that documentation should be built automatically. Think about using your IDE for this, or look for some additional tools. Most modern languages support generating documentation as you write the code. This keeps your docs in sync with the latest changes in the code.
I prefer to use an RTF editor, which is a lot less clunky than Word. This way the formatting and all the headers/footers nonsense will not take up half your time. WordPad has worked for me on several occasions. I'm stuck with Word for now though :(
There are a lot of possible ways:
embedded documentation, e.g. javadoc: good for describing APIs, not so good for the "big picture"
plain html: can be checked in under version control, a definite plus
a wiki, e.g. confluence -- great for collaboration, but has version control different from your source
LaTeX or somesuch: better suited for books or papers than typical documentation; support for graphics is cumbersome
an Office clone, e.g. OpenOffice: mostly the same as Word+Visio, but open source, with a nicer document format
I usually document the software structure (the "metaphors" of a project, component interrelations, external systems) up front, using Visio, in "freeform" UML. These are then embedded in confluence, which can be converted to PDF if someone wants a printout.
LyX
LyX is a WYSIWYM front end to LaTeX: You get the convenience of a document processor (somewhat similar to Word) with the consistency and power of LaTeX: It doesn't get in your way and can do a lot of things that professional writers need.
Note: The correct answer for you really depends on your way of thinking; we can't decide this for you. This answer simply shows an excellent choice if you think of documentation as documents and want something similar to Word (where Word is good) that doesn't suck like Word (where Word is bad for programmers).
But many programmers think of documentation differently and hence prefer different metaphors. I myself had the same problem years ago, worked with LaTeX (as I am a mathematician), found LyX and finally settled on a Wiki/Source system that I wrote myself.
Vim is the solution for anything that means writing plain text in the most efficient possible way. If you need formatting, then use XML, Latex or something similar (in Vim).
Vim changed my life!
Simple answer: LaTeX sounds like just what you are looking for.
I use it for writing documentation myself. I will never go back to Word if I have the option.
At phc, we started with LaTeX, then moved to DocBook, and have settled (permanently I hope) on reStructuredText/Sphinx.
LaTeX was chosen because we are academics, and LaTeX is the tool of choice. I believe it didn't generate good enough HTML.
DocBook was chosen for power, but it was very unwieldy. It put us off writing any documentation: code had to be manually formatted, we kept forgetting the syntax, and it was difficult to read. The learning curve was also steep.
Finally, we moved to reST, using Sphinx, and that was a great decision. Documentation is now very easy to write, and both PDF and HTML versions look beautiful (though the PDF could do with some customization). It's very easy to customize too.
The best bit about reST, though, is that it's human-readable in source form. That is a wonderful advantage. I've switched to using reST for all my stuff now, especially anything over the web (except of course academic papers, where one would be foolish to use anything but LaTeX).
You may want to look into doxygen at http://www.doxygen.nl/, see their nice examples. In this case, the documentation is presented by tags in comments in the source.
Another option would be to use an online system like trac from http://trac.edgewall.org/ which is a wiki/doc/issuetracking system that lives on top of subversion.