I'm curious as to how many people are using Open XML (OOXML) these days (either pure or via the SDK) in closed and commercial environments. I'm fairly aware of what's going on on the 'public web' (MSDN, OpenXMLDeveloper.org, etc.), but am wondering about SO people's experience with it, both good and bad.
Are most people opting out of VBA and VSTO in favor of working directly with the OOXML formats? What benefits are you getting from OOXML that you're not getting from the object model? I'd love to learn more about why you're using it or why you're not, what you're using it for, etc.
I'm just trying to get a feel from the community on OOXML as an approach to document automation and other uses. I'm not finding community forums (this one or others) to be especially active with questions and users (check the number of questions on this post's tags), so I'm wondering if I'm one of the very few using OOXML extensively.
Now that Microsoft Office 2007 (and especially Excel) support Open XML, I'm finding it far easier to work with than Office Automation. A few major reasons for this are:
Better performance;
No flaky IPC issues (e.g. somebody leaves Excel open at the Save As dialog and the automation call crashes);
No dependency on Office itself, or any external components whatsoever;
Fairly easy to write LINQ extensions and queries against in C# (see the sketch after this list);
Can be used in server environments without any problems or risks.
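To illustrate the LINQ point, here's a minimal sketch using the Open XML SDK (the file name is a placeholder; the query just counts non-empty cells per sheet):

```csharp
using System;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

class SheetStats
{
    static void Main()
    {
        // Open the workbook read-only; no Excel installation involved.
        using (var doc = SpreadsheetDocument.Open("report.xlsx", false))
        {
            WorkbookPart wbPart = doc.WorkbookPart;

            // One LINQ query: sheet name -> count of non-empty cells.
            var counts =
                from sheet in wbPart.Workbook.Descendants<Sheet>()
                let wsPart = (WorksheetPart)wbPart.GetPartById(sheet.Id)
                select new
                {
                    Name = (string)sheet.Name,
                    Cells = wsPart.Worksheet.Descendants<Cell>()
                                  .Count(c => c.CellValue != null)
                };

            foreach (var s in counts)
                Console.WriteLine("{0}: {1} non-empty cells", s.Name, s.Cells);
        }
    }
}
```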
Given that Office XP/2003 users can open Office 2007 files with the Compatibility Pack, I don't see any reason to continue using the old automation or "OfficeML" methods. There's a bit of a learning curve, but it's arguably the best option today: it's free, it's reliable, and best of all it's the native format used by Office 2007, so you don't need any stupid tricks to get it to work (like attaching the XLS content type to HTML as we did for XL2003, and having XL2007 complain about an incorrect extension).
I wouldn't say it's an outright replacement for VBA/VSTO - the thing about those is that they're usually part of a solution where the requirement is to integrate with the Office environment itself. Using OOXML would generally require you to write an entire application around it. But for simple import/export, which is probably what 90% of automation has been used for in the past, definitely, OOXML is the way to go.
Libraries like Simple OOXML can also greatly help with the learning curve.
We are using the Open XML SDK for export to Excel. I have to say it is quite slow, so we had to do some caching of our own for shared strings (see the sketch below). The library is just an object representation of the Open XML format. Sometimes that is a good thing, sometimes not, especially because you have to know the Open XML standard very well; the SDK won't handle anything for you. You have to know all the restrictions the format brings, which elements you cannot omit in xlsx or docx, etc. It even lets you create inconsistent Excel spreadsheets or Word documents, which is not good. Well, at least it's free :) Better than nothing.
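To give an idea of the shared-string caching we mean, here's a minimal sketch against the Open XML SDK (the class and its members are our own invention, not part of the SDK): keep a string-to-index dictionary so the SharedStringTable is never rescanned per cell.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;

// Caches shared-string indices so each unique string is appended once
// and looked up in O(1), instead of scanning the table for every cell.
class SharedStringCache
{
    private readonly Dictionary<string, int> _index = new Dictionary<string, int>();
    private readonly SharedStringTablePart _part;
    private int _next;

    public SharedStringCache(SharedStringTablePart part)
    {
        _part = part;
        if (_part.SharedStringTable == null)
            _part.SharedStringTable = new SharedStringTable();

        // Seed the cache with whatever the table already contains.
        foreach (SharedStringItem item in _part.SharedStringTable.Elements<SharedStringItem>())
            _index[item.InnerText] = _next++;
    }

    // Returns the index for the text, appending a new item only if needed.
    public int GetOrAdd(string text)
    {
        int idx;
        if (_index.TryGetValue(text, out idx))
            return idx;

        _part.SharedStringTable.AppendChild(new SharedStringItem(new Text(text)));
        idx = _next++;
        _index[text] = idx;
        return idx;
    }
}
```

Cells then store the returned index as their value, with DataType set to CellValues.SharedString.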
I have used the Open XML SDK for document generation in SharePoint, PPTX generation for custom presentations (pulling the data from XLSX), and for "building" composite documents from multiple document segments.
The format and the SDK are great. No worries on the server in ASP.NET or SharePoint scenarios, and great speed. I have not found many scenarios where the SDK or "brute force" XML cannot accomplish a goal. One instance is password protection and DRM for documents, but these are corner cases. I would agree with aaronaught that this is not an exclusive solution; SharePoint, VSTO and others are all tools in the tool belt for document-generation solutions.
I am covering a scenario in which a country's sales performance is delivered as a PPTX deck generated with Open XML.
I looked at this technology a lot. I program in VBA, but it was way too complex for my needs. There are some great things about it that my e-learning work would benefit from, like LINQ and XML, but the bar for going from VBA to the underlying formats is just too high, and I don't have the luxury of the time and money it would take to invest in learning VS.NET and the Office Open XML formats.
But the one thing I think it would really help with is meta-tagging PowerPoint content for an LMS.
I've used it to parse pptx files, looking for special comments in shapes. These comments are links to other resources (URIs most often, PDF files, etc.). I then use Deep Zoom software to render the pptx and render the URIs inside the shapes. Fun, but slow. I use it to help with research and as a "novel" way to look at posters. But this isn't LOB.
We migrated from Word Interop to Open XML very recently.
Our application is used for creating invoices in Word and PDF format. When Interop was used, a decent-sized invoice took about 5-10 minutes to generate.
Now, using Open XML and SSRS, that time has been reduced to approximately 30 seconds.
The only problem area with Word was that some features were not backward compatible from Word 2010 to 2007, and that took some time to fix and get up and running: things like creating the table of contents, merging documents, etc.
Other than that, I think LINQ, MSDN and Eric White's blog are enough to get you going in the right direction.
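For the document merging mentioned above, the usual Open XML SDK route is an altChunk; a minimal sketch (file names are placeholders):

```csharp
using System.IO;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

class MergeDocs
{
    static void Main()
    {
        // Append section.docx to the end of main.docx via an altChunk.
        using (var main = WordprocessingDocument.Open("main.docx", true))
        {
            const string chunkId = "mergedChunk1";
            AlternativeFormatImportPart chunk =
                main.MainDocumentPart.AddAlternativeFormatImportPart(
                    AlternativeFormatImportPartType.WordprocessingML, chunkId);

            using (FileStream fs = File.OpenRead("section.docx"))
                chunk.FeedData(fs);

            // Word expands the chunk into real content when the file is opened.
            main.MainDocumentPart.Document.Body.AppendChild(new AltChunk { Id = chunkId });
            main.MainDocumentPart.Document.Save();
        }
    }
}
```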
Around 15 years ago, I was writing .xll addins for Excel using the C-API, mostly for spreadsheet formulae (complex and time-consuming calcs), and written in C++ (very much my preferred language). I subsequently did a few things with C# wrappers.
After a long break, during which Excel and Visual Studio have moved on apace, I am once again looking at building an add-in to use compiled code for a lengthy optimization, rather than using a VBA script. The optimization involves taking large arrays of data out of worksheets, crunching them and then writing the results back into the workbook (as I want to chart them etc. later).
NB: This is for my own use; I am not planning on distributing any solution.
In pseudo Excel VBA code what I'd like to have is something like this:
Dim obj As New MyOptimizer
Dim rngInputData As Range
obj.SetData rngInputData
obj.Optimize
Dim v As Variant
v = obj.GetResult
I am a little bewildered by the options available (in no particular order):
code a C-style DLL with a VBA .xla wrapper
code a C-style DLL with an Excel-DNA C# wrapper
code a COM object of some sort (though I am not sure which Visual Studio template to use)
Something else ...
A COM object has its attractions as it will hold state, and I can create an instance of my object without having to write a clunky handle-passing system. The other consideration is the amount of marshalling between different layers (I'll be using a lot of 2-D arrays, though only of doubles), and the type-checking of the inputs.
I'm looking for some guidance at this crossroads, before I go down the rabbit hole of one particular method ... then of course, I'll be back with more questions!
Thanks!
If C++ is still very much your preferred language, you should certainly consider that. It will give you the best performance possible and leverage your knowledge of the C API. If you can afford it, I would also recommend you have a look at the XLL+ library. While expensive, I think if you look at the features you'll find it brings a lot of value.
I develop Excel-DNA and very much prefer to avoid C++ in favour of the .NET platform.
So, while biased, I think it will be an even better fit for your requirements. It is a vastly easier environment to work in compared to C++ (in my opinion - no XLOPERs!) However, there is some overhead in the marshaling and so you'd have some compromise on the performance. You don't indicate what range sizes or how fast you expect the interchange to happen. With Excel-DNA, I expect you can easily read, process and write a block of a million numbers in about a second (see my answer here: https://stackoverflow.com/a/3868370/44264). Similarly, a small block of 10 cells with numbers can be read and written more than 10,000 times in a second. Things will get a bit slower if you are working with long strings rather than numbers, but you suggest this is not the case.
While you should expect calculation code (apart from the Excel interop) to be very fast with C# and .NET, you can also talk to C libraries from .NET very efficiently. So keeping the core calculation libraries in C and then using the P/Invoke mechanism to call them from .NET is a completely viable plan, and you still get the (large) benefit of a more pleasant environment for the Excel parts.
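A hedged sketch of that P/Invoke route; the DLL name, the exported function and its signature are hypothetical stand-ins for whatever your C library actually exports:

```csharp
using System;
using System.Runtime.InteropServices;

static class NativeOptimizer
{
    // Hypothetical C export: int run_optimizer(double* input, int n, double* output)
    [DllImport("optimizer.dll", CallingConvention = CallingConvention.Cdecl)]
    private static extern int run_optimizer(double[] input, int length, double[] output);

    public static double[] Optimize(double[] input)
    {
        var output = new double[input.Length];
        int rc = run_optimizer(input, input.Length, output);
        if (rc != 0)
            throw new InvalidOperationException("optimizer returned " + rc);
        return output;
    }
}
```

Arrays of doubles are blittable, so the marshaling here is essentially just pinning; there is no per-element copy.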
VBA and COM approaches to reading / writing with Excel will not perform quite as well as the Excel-DNA / .NET approach (which also uses the C API for this kind of thing under the hood, rather than COM). Still, when used properly the COM overhead is not terrible, and might not on its own be a showstopper for you.
I am interested in this kind of optimization approach myself, so would be happy to help if you do take the Excel-DNA approach. The best place for Excel-DNA questions is the Excel-DNA Google group.
Getting started with Excel-DNA would look like this:
Download and install the (free) Visual Studio 2019 Community Edition.
Select the ".NET desktop development" workload when installing. You don't need to check the Office options when installing for Excel-DNA add-ins.
Then make a new C# "Class Library (.NET Framework)" project. It's important at this step not to pick ".NET Standard" or ".NET Core" (long story...).
Then in your project install the "ExcelDna.AddIn" package from NuGet.
Read and follow the instructions in the readme that pops up.
Paste in the code snippet from here: https://stackoverflow.com/a/3868370/44264, press F5 and test in Excel.
After the slow Visual Studio install, it will only take a few minutes, and you'll get some idea of what's involved.
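To give a flavour of what you end up writing (a sketch in the same spirit as the linked snippet, not a copy of it): an Excel-DNA UDF is just a public static method with an attribute, and Excel hands ranges over as double[,]:

```csharp
using ExcelDna.Integration;

public static class OptimizerFunctions
{
    // Exposed to Excel as a worksheet function; the input range arrives
    // as double[,] and the returned array is written back to the sheet.
    [ExcelFunction(Description = "Doubles every value in the input range")]
    public static double[,] dnaDoubleEverything(double[,] input)
    {
        int rows = input.GetLength(0), cols = input.GetLength(1);
        var result = new double[rows, cols];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                result[i, j] = 2.0 * input[i, j];
        return result;
    }
}
```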
We just finished a small project with the primary aim of giving Microsoft StreamInsight a try.
The technology looks fine, but I have a concern about its industry traction. When we ran into issues there were only a handful of materials on the web and generally I miss a vibrant community around it.
Should we expand our use of StreamInsight, or will it go down the drain in a few years like Silverlight did?
My opinion: it won't go the way of Silverlight.
Silverlight was, essentially, replaced by other technologies - specifically, HTML 5. StreamInsight doesn't have that. You are correct in that there isn't a whole lot around it. But that is because it's a relatively new technology that has a radically different paradigm, isn't very well known and has more limited use cases than something like ASP.NET MVC. But CEP is an initiative not just from Microsoft (with StreamInsight) but also Oracle, IBM and others. And as data volumes continue to increase, these technologies will be even more important.
I was at a SQL user group meeting about three months ago. I asked that exact question of the speaker (Microsoft Blue Badge). He didn't make a definitive statement but from what he did say it was clear he thought StreamInsight had a big future.
I am working with a small team (2 others) of developers that are geographically dispersed, and I'm looking for good ways for us to collaborate on specs... We're thinking we might use Google Docs to write the spec in so we can all have access to modify it in a central location.
What have you done? What good ideas do you have?
If you have an intranet or VPN, I would actually consider installing and using a small Wiki for these specs.
Compared to Google Docs, you get:
Much better versioning and change tracking (IMHO)
Much easier to start new documents for subsections
Actual markup rather than WYSIWYG (a matter of taste; I prefer LaTeX to Word).
Possible to attach a variety of other file types
Very easy to backup
Very easy to create an offline version
You don't have to worry about storing sensitive materials elsewhere.
The disadvantage is that it is not WYSIWYG, which may or may not be an issue for you.
Of course, you can pick a wiki implementation that supports a better editor, and possibly even synchronous collaboration.
Google Wave - collaboration is exactly what it's meant for.
IMHO, a word processor is the wrong tool for a programmer. A spec should be written in a plain text editor, and utilize lightweight markup such as reStructuredText, AsciiDoc etc.
The benefits of such an approach are:
There are excellent tools to manage plain text, that are already in the hands of programmers (VCS, automated build systems, diff, patch, programming editors, grep, etc.)
A markup language allows you to express intent rather than formatting.
With that in mind, a wiki seems to be the obvious choice.
Personally my tool chain of choice is:
reStructuredText as the markup language.
Trac as a Wiki
Firefox + the It's All Text! extension
Emacs + rst-mode
The choice of technology is one issue, and Google Docs is a good choice IMHO. But the real challenge is how to manage the process, e.g. how to divide the tasks.
My suggestion is to first make sure that the platform and all related technologies are decided upon as best as feasible. Then compose a thorough table of contents. A well-designed TOC will allow you to divide tasks properly and not "step" on each other's work. From then on, you each "flesh out" your assigned sections and review each other's work.
In effect, each TOC subsection becomes an atomic unit of work that can be assigned and maintained by an individual who is also accountable for said section(s).
Good luck!
I think it depends on
How heavily into writing the specs you all are
If you're likely writing at the same time
Whether you intend to publish the specs.
Google Docs is nice and easy to get started with. It's also great that you can now export folders all at once. Still, for something that's going to be published to the web, a wiki or general CMS is a better presentation vehicle. A wiki will also integrate with your existing site.
If you've got small specs, primarily written by one person then use whatever tool is available where you're hosting the project code or website. If you're not likely to be editing at the same time then a wiki is good.
I've done the wiki thing, the passed document thing and the Google Docs thing.
The wiki thing has a low starting effort and lasts a pretty long time. At a certain size it does get to be a pain.
The passed-document thing (write, email, edit, email, etc.) only works while one person is starting everything up. As soon as there are even minor edits, it sucks.
The Google Docs thing is fine until you have several docs and several editors or want to publish it online.
hth
This isn't programming related, but I've personally used Google Docs to write shared documents and found it easy to use.
I would suggest enabling Google Gears, however, in case the Google servers go down momentarily or an internet connection isn't available.
For writing specs collaboratively, you could try Gingko.
It's a card-tree editor, which means it's a mix between index cards and an outliner, with real-time collaboration and full Markdown support (as well as basic LaTeX).
We are still missing several features (version history, comments, etc), but for some the benefits of having everything in a tree structure outweigh these drawbacks.
Writing specs with it is great, because you can create a card for each user story, and drill into it as much as you like (and organize them into categories if you'd like).
http://gingkoapp.com
I want to convert a website to use a content management system for updating its large number of content pages. The current site is mostly ASP.NET, but I am considering converting to PHP if it means better integration with the "CMS of choice" in the market. I have heard of Joomla! and other CMSes, but I would like some answers as to which ones are considered better. Features I need include custom sidebar and tab menus (with expandable JavaScript drop-downs, for example). Can anyone suggest a good solution?
You should look at opensourcecms.com. It's a site that hosts demos for the majority of the open source CMSes out there, in both PHP and ASP.NET. You can try each one out and read the features and reviews. It's a good way to find one that meets your needs without actually installing them.
Joomla and Drupal are your most common and popular PHP based CMS solutions.
On the .NET side I would suggest only DotNetNuke. The amount of development that goes on in that CMS is second to none and there is a huge marketplace for content, modules, themes, etc. There is pretty much everything available in DNN to meet your potential needs.
The "best" CMS really, really depends on your requirements.
I will say that Joomla is pretty much typical PHP spaghetti, and I hate it, but it might work for you.
Kentico (a .NET CMS) is a pretty decent one that I've deployed a few times. Microsoft CMS is supposed to be decent, I haven't tried it though.
Without knowing specifically your requirements, I find it impossible to give a solid recommendation, though.
I haven't worked with these applications yet, but AFAIK TYPO3 and ezPublish (both PHP) are considered much more professional than, e.g., Joomla.
Drupal has a long history, proven track record of success (many high profile use cases, including the Obama campaign, Mozilla Firefox, and MTV in the UK), and a boatload of free modules and themes so you can start somewhere good. Drupal is also highly customizable in terms of how data is stored in terms of content types. Drupal has excellent consulting and contracting help.
Joomla is a strong second, but take a quick look at the Joomla criticism on Wikipedia and I think the choice gets much clearer. Two of the three criticisms of Drupal on Wikipedia are that it's too complicated, which is really a subjective matter compared to the shortcomings of Joomla.
If web development is a hobby for you, then use an open source CMS such as those mentioned. If it is your profession, consider working towards writing your own that meets your needs. The first few will likely be a little rough, but in the long run it can prove very fulfilling and much more customizable than anything off the shelf.
Writing your own also forces you to consistently expand your skills and learn the intricacies of the programming language.
We've been tasked with creating a MOSS workflow that, as its final step, will convert a document (most likely Word 2003 or 2007) to PDF and watermark it with the current date.
So far I haven't seen a definitive way to do this. I have looked at using the MS Word Interop DLLs, but we will not be installing Word (or Office) on the server, so that's not doable. Another option I've looked at is using the Aspose libraries for conversion.
From a topology standpoint, I'm wondering whether using one server exclusively for document conversion is a good way to implement this. (I've read that this approach is recommended for larger organizations.)
If anyone, preferably someone who has done this sort of thing before, can give me some pointers or best practices, I'd really appreciate it.
Thanks
I would think that starting with one server is the best way to go. Then just monitor the workload on the machine, and if it gets to be too much for one, pop another in there. That's the beauty of MOSS.
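If the Aspose route from the question is taken, the conversion step itself is small. A hedged sketch with Aspose.Words (recent versions expose Document.Watermark; older releases require inserting a WordArt shape into the headers by hand, and the paths here are placeholders):

```csharp
using System;
using Aspose.Words;

class DocToPdf
{
    static void Main()
    {
        var doc = new Document(@"C:\drop\contract.docx");

        // Stamp the current date as a text watermark.
        doc.Watermark.SetText(DateTime.Today.ToString("yyyy-MM-dd"));

        // Output format is inferred from the .pdf extension.
        doc.Save(@"C:\drop\contract.pdf");
    }
}
```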