Excel::Writer::XLSX (Perl) sheet introspecion - perl

For some time I've wondered if/when this module could get some introspection abilities, beyond just hacking on the object.
For example:
Once a sheet has been written, how can I know query the sheet object to know things like how many rows and columns it has?
What I want to do is write a number of sheets, then go back to each and write more rows to them. I could keep track of last row/column on my own, but before I do that I'm wondering if I can get that data from the already-written objects, before the final workbook->close.
I think I could count the number of keys in a sheet object's _table hash, but that may be too close to the metal to be "official." I remember John saying not to do that somewhere in the CPAN docs.

I could keep track of last row/column on my own, but before I do that I'm wondering if I can get that data from the already-written objects, before the final workbook->close.
No. That isn't possible. Excel::Writer::XLSX doesn't provide any tools for introspecting the data once it crosses the APIs. This is a deliberate design decision. You should treat an Excel::Writer::XLSX object as a black box and not some sort of database.
The best way to do what you want is to track the range data in your program.

Related

How can I perform automated tests against MS Word documents using PowerShell?

We regularly need to perform a handful of relatively simple tests against a bunch of MS Word documents. As these checks are currently done manually, I am striving for a way to automate this. For example:
Check if every page actually has a page number and verify that it is correct.
Verify that a version identifier in the page header is identical across all pages.
Check if the document has a table of contents.
Check if the document has a table of figures.
Check if every figure has a caption.
et cetera. Is this reasonably feasible using PowerShell in conjunction with a Word API?
Powershell can access Word via its object model/Interop (on Windows, at any rate) and AIUI can also work with the Office Open XML OOXML) API, so really you should be able to write any checks you want on the document content. What is slightly less obvious is how you verify that the document content will result in a particular "printed appearance". I'm going to start with some comments on the details first.
Just bear in mind that in the following notes I'm just pointing out a few things that you might have to deal with. If you're examining documents produced by an organisation where people are already broadly speaking following the same standards, it may be easier.
Of the 5 examples you give, without checking the details I couldn't say exactly how you would do them, and there could be difficulties with all of them, but for example
Check if every page actually has a page number and verify that it is correct.
Difficult using either OOXML or the object model, because what you would really be checking is that the header for a particular section had a visible { PAGE } field code. Because that field code might be nested inside other fields that say "if don't display this field code", it's not so easy to be sure that there would be a page number.
Which is what I mean by checking the document's "printed appearance" - if, for example, you can use the object model to print to PDF and have some mechanism that lets PS inspect the PDF's content, that might be a better approach.
Verify that a version identifier in the page header is identical across all pages.
Similar problem to the above, IMO. It depends partly on how the version identifier might be inserted. Is it just a piece of text? Could it be constructed from a number of fields? Might it reference Document Properties or Variables, or Custom XML content?
Check if the document has a table of contents.
Perhaps enough to look for a TOC field that does not have certain options, such as a \c option that a Table of Figures would contain.
Check if the document has a table of figures.
Perhaps enough to check for a TOC field that does have a \c option, perhaps with a specific parameter such as "Figure"
Check if every figure has a caption.
Not sure that you can tell whether a particular image is "a Figure". But if you mean "verify that every graphic object has a caption", you could probably iterate through the inline and floating graphics in the document and verify that there was something that looked like a Word standard caption paragraph within a certain distance of that object. Word has two standard field code patterns for captions AFAIK (one where the chapter number is included and one where it isn't), so you could look for those. You could measure a distance between the image and the caption by ensuring that they were no more than a predefined number of paragraphs apart, or in the case of a floating image, perhaps that the paragraph anchoring the image was no more than so many paragraphs away from the caption.
A couple of more general problems that you might have to deal with:
- just because a document contains a certain feature, such as a ToC field, does not mean that it is visible. A TOC field might have been formatted as not visible. Even harder to detect, it could have been formatted as colored white.
- change tracking. You might have to use the Word object model to "accept changes" before checking whether any given feature is actually there or not. Unless you can find existing code that would help you do that using the OOXML representation of the document, that's probably a strong case for doing checks via the object model.
Some final observations
for future checks, perhaps worth noting that in principle you could create a "DocumentInspector" that users could call from Word BackStage to perform checks on a document. Not sure you can force users to run it, or that you could create it in PS, but perhaps a useful tool.
longer term, if you are doing a very large number of checks, perhaps worth considering whether you could train a ML model to try to detect problems.

Make a Class Schedule Report

How can I make my Crystal Report look like the attached image? I have had no success creating it with a crosstab.
The short answer is that Crystal Reports isn't really equipped to handle the format you're dealing with. And here's why:
Let's assume for a moment you've already figured out how to interpret your query into something usable. Since we aren't using a Cross Table, the best you could hope for would be setting a Details section for each individual time slot and arranging a large number of formulas into a grid shape:
The problem is that every Formula would need to be unique; interpreting whether there is a Class at that Time and Date, and which Class it is. There would be up to 168 of those formulas and you'd have to manually go in and modify each one to check for their own unique combination of Date and Time. Which defeats the whole purpose of using a computer - to make repeated tasks easier.
Plus you'll have difficulty with the formatting: You'd need to program every "cell" to use a unique set of colors based on the displayed Class. That part is technically doable, but there's no way to "merge the cells" when classes last longer than a half hour. You'd end up with something like this:
So don't torture yourself trying to make this happen in Crystal. Even with all the time and effort it would take to formulate the grid, there's no good way to make it look like your screenshot.
That said, it looks as though you managed to put a schedule together in Excel. Is there any reason you can't use Excel instead? It's a much more powerful tool, and a cursory Google search suggests it can handle queries as well.

Dynamic data range references for charts

I have an OpenOffice Calc spread sheet that I'm using to track some data. I have three charts made from the data. I periodically add more data to the spreadsheet. My current way to propagate this to the chart is to alter the data ranges manually of each chart. I'd like to automate this, or at least not have to redundantly change each chart separately.
My current idea was to do something like $A$1:$A{$F$1} for the ranges where $F$1 holds the current last line. Unfortunately, OpenOffice doesn't recognize this, but I thought there might be a function or work around for it. I haven't been able to find one yet.
So, is there a way to execute my idea, or perhaps a better way to do it?
There is a very similar question to this, but the asker asked for many more features and the answer was to use something other than a spreadsheet. It was never answered whether this specific feature was possible.
Also:
First method is to extend the range of the graph way down, with lots of empty space.
Second method is to include only one extra line of data in the graph and when you add data, always insert it above that line.

Any tips or best practices for adding a new item to a history while maintaining a maximum total number of items?

I'm working on some basic logging/history functionality for a Core Data iPhone app. I want to maintain a maximum number of history items.
My general plan is to ignore the maximum when adding a new item and enforce it whenever I need to fetch all the items anyway (e.g. for searching or browsing the history). Alternatively, I could do it when adding a new item: fetch the current items, add the new one, and delete the oldest one if we're at the maximum. The second way seems less efficient, since I would be fetching all the items when I otherwise wouldn't need to.
So, the questions:
Which way is better? Is there an even better way to do this that I'm not considering?
How many items would be a reasonable maximum? The history is used for text field autocompletion, so more items means better usability, unless the number of items is so huge that it's slowing stuff down.
Thanks!
Whichever method is easier to implement is the right one. You shouldn't bother with a more efficient/more complicated implementation unless it proves it's needed.
If these objects are in a to-many relationship of some kind, I'd use the relationship to manage the maximum number. (Override add<Whatever>Object: and delete the extraneous items then).
If you're just fetching them, then that's really your only opportunity to filter them out. If you're using an NSArrayController, you might be able to implement a subclass that detects when new objects are added and chops off the extra ones.
If the items are added manually by the user, then you can safely use the method of cleaning up later. With text data, a user won't enter more a few hundred items at most and text data takes up very little room. If the items are added by software, you have to check every so many entries or risk spill over.
You might not want to spend a lot of time on this. Autocomplete is not that big, usually just a few hundred entries. I would right it the simplest way, with clean up later, and then fiddle with it only if you hit a definite performance bottleneck.
Remember, premature optimization is the root of all programming evil. That and the dweebs in marketing.

What are some of the advantage/disadvantages of using SQLDataReader?

SqlDataReader is a faster way to process the stored procedure. What are some of the advantage/disadvantages of using SQLDataReader?
I assume you mean "instead of loading the results into a DataTable"?
Advantages: you're in control of how the data is loaded. You can ask for specific data types, and you don't end up loading the whole set of data into memory all at the same time unless you want to. Basically, if you want the data but don't need a data table (e.g. you're going to populate your own kind of collection) you don't get the overhead of the intermediate step.
Disadvantages: you're in control of how the data is loaded, which means it's easier to make a mistake and there's more work to do.
What's your use case here? Do you have a good reason to believe that the overhead of using a normal (or strongly typed) data table is significantly hurting performance? I'd only use SqlDataReader directly if I had a good reason to do so.
The key advantage is obviously speed - that's the main reason you'd choose a SQLDataReader.
One potential disadvantage not already mentioned is that the SQLDataReader is forward only, so you can only go through the records once in sequence - that's one of the things that allows it to be so fast. In many cases that's fine but if you need to iterate over the records more than once or add/edit/delete data you'll need to use one of the alternatives.
It also remains connected until you've worked through all the records and close the reader (of course, you can opt to close it earlier, but then you can't access any of the remaining records). If you're going to perform any lengthy processing on the records as you iterate over them, you may find that you impact other connections to the database.
It depends on what you need to do. If you get back a page of results from the database (say 20 records), it would be better to use a data adapter to fill a DataSet, and bind that to something in the UI.
But if you need to process many records, 1 at a time, use SqlDataReader.
Advantages: Faster, less memory.
Disadvantages: Must remain connected, must remember to close the reader.
The data might not be concluesive and you are not in control of your actions that why the milk man down the road has always got to carry data with him or else they gona get cracked by the data and the policeman will not carry any data because they think that is wrong to keep other people's data and its wrong to do so. There is a girl who lives in Sheffield and she loves to go out and play most the times that she s in the house that is why I dont like to talk to her because her parents and her other fwends got taken to peace gardens thats a place that everyone likes to sing and stay. usually famous Celebs get to hang aroun dthere but there are always top security because we dont want to get skanked down them ends. KK see u now I need 2 go and chill in the west end PEACE!!!£"$$$ Made of MOney MAN$$$$