JasperReports: Ordering records from XMLDataSource

I recently converted a JasperReport to use the XMLDataSource instead of getting the data from the DB. This was done for performance reasons.
One of the requests was to group certain records together.
I thought I had it working, but only because the records that needed grouping happened to follow sequentially in the XML file I used for testing, so they were already "grouped/ordered" in the XML.
Now that the report is being used in a Live environment, we have picked up that the grouping is not actually working.
After some searching and reading, it seems this cannot be done easily, because the records coming from the XMLDataSource cannot be sorted.
So my question: is there a way to sort/order the records from the XMLDataSource so that they will group correctly, without using an XSLT?
I only want to transform the XML as a last resort and am hoping there is another way to do it.

Why can't you use a sort inside iReport? See the example sketched below.
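JasperReports can sort the records of any data source, including an XML data source, in memory through sortField declarations in the JRXML, so the grouping no longer depends on the order of the nodes in the XML. A minimal sketch, assuming hypothetical DEPARTMENT and EMPLOYEE_ID fields that are already declared as <field> elements and that drive the group:

    <!-- Sketch only: field names are assumptions; sortField elements go right after the field declarations -->
    <sortField name="DEPARTMENT" order="Ascending"/>
    <sortField name="EMPLOYEE_ID" order="Ascending"/>

The engine then sorts the records before evaluating the groups, which avoids an XSLT pre-transformation.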

Related

Compare tables to ensure non-regression in PostgreSQL

Here is my issue: I often need to compare the same PostgreSQL tables (or views that depend on them) before and after some ETL code refactoring, to check for non-regressions in my developments.
Let's say I have an ETL process I want to refactor, which regularly uploads data into a table. Currently, once my modifications are done, I first download my data from PostgreSQL as a .csv file, then empty the table, fill it again using my refactored code, and download the data again. Then I compare the .csv files, for instance with Python in a Jupyter Notebook.
That does not seem like the way to go at all. Notably, it assumes I am the only one using that table during the operation, among many other issues I can't list here.
Is there a better way?
It sounds to me like you have the correct approach. There's no magic to the CSV export operation: whatever tool you use runs a query and formats its result set into the file. Any other before-and-after comparison would have to run the same query.
If you're doing this sort of regression test on an active database, it's probably wise to put some sort of distinctive tag on your test records, maybe prepend ETLTEST- to your customer names, so it's ETLTEST-John Bull. Then you can make your queries handle only your test records. And make sure you do something reliable for ORDER BY.
Jupyter seems a complex way to diff your .csv files. Most operating systems have lightweight, fast diff tools.
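If you would rather compare inside the database instead of diffing CSV exports, one possible sketch (target_table and the snapshot name are placeholders): snapshot the table before the refactored run, then compare the two row sets with EXCEPT in both directions:

    -- Sketch only: table names are placeholders.
    -- 1. Keep a copy of what the old ETL code produced
    CREATE TABLE target_table_before AS SELECT * FROM target_table;

    -- 2. Empty target_table, re-run the refactored ETL, then diff both ways
    SELECT * FROM target_table_before
    EXCEPT
    SELECT * FROM target_table;          -- rows the new run no longer produces

    SELECT * FROM target_table
    EXCEPT
    SELECT * FROM target_table_before;   -- rows the new run added

If both queries return zero rows, the runs agree. Note that EXCEPT removes duplicate rows, so if exact duplicate counts matter you would need a count-based comparison (GROUP BY all columns and compare the counts).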

Spring Batch - preload chunk-related data

I am reading records from a file and I need to associate them with records that are already in the database.
The related database record is specified within each line of the file (the line contains the id of that record). Each item read should have one related record in the database. I do not want to read a single record from the database per item, because of the performance cost that would have.
Therefore I would like to read, in one go, all database records related to the lines currently being processed within a chunk. Is there a way? Or is there a way to access all items being processed as part of a single chunk (they should all be in memory anyway)?
I know I could preload all records that are likely to be needed, but assume there are millions of such records in the database and I am only processing a file with a few thousand lines.
This is clearly a case for a custom reader. Remember that Spring Batch is simply a framework that tries to give structure to your code and infrastructure; it doesn't impose many restrictions on what logic or code you write yourself, as long as it conforms to the interfaces.
Having said that, if you are not transforming the read items in an ItemProcessor, the full List of items read from the file as part of the chunk is available in the ItemWriter (see the sketch after these ideas).
If your file is really small, you can read all items in one go using your own file reader/parser instead of the line-by-line reader provided by the API, and then load only the matching records from the DB in one go.
Instead of a single-step job, you can have a two-step job where the first step dumps the records read from the file into a DB table, and the second step does a SQL join between the two tables to find the common records.
These are simply broad ideas, and the implementation is up to you. It becomes hard if you start looking for ready-made APIs for every custom case encountered in practical scenarios.
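As a rough sketch of the first idea, assuming the pre-Spring-Batch-5 ItemWriter signature, a hypothetical LineItem with a getRelatedId() accessor, and a related_record table: the writer collects the ids of the whole chunk and resolves them with a single IN query instead of one query per item.

    import java.util.*;

    import org.springframework.batch.item.ItemWriter;
    import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;

    // Sketch only: LineItem, getRelatedId() and the related_record table are assumptions.
    public class RelatedRecordWriter implements ItemWriter<LineItem> {

        private final NamedParameterJdbcTemplate jdbc;

        public RelatedRecordWriter(NamedParameterJdbcTemplate jdbc) {
            this.jdbc = jdbc;
        }

        @Override
        public void write(List<? extends LineItem> chunk) {
            // Collect the foreign ids of every item in the current chunk
            Set<Long> ids = new HashSet<>();
            for (LineItem item : chunk) {
                ids.add(item.getRelatedId());
            }

            // One IN query for the whole chunk instead of one query per item
            List<Map<String, Object>> rows = jdbc.queryForList(
                    "SELECT id, name FROM related_record WHERE id IN (:ids)",
                    Collections.singletonMap("ids", ids));
            Map<Long, String> relatedById = new HashMap<>();
            for (Map<String, Object> row : rows) {
                relatedById.put(((Number) row.get("id")).longValue(), (String) row.get("name"));
            }

            // Associate each item with its related record and persist it
            for (LineItem item : chunk) {
                String related = relatedById.get(item.getRelatedId());
                // ... write the item together with 'related' here ...
            }
        }
    }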
I do not want to read single record from database per item due to performance issues it might have.
What if you read all related items at once for the current item? You can achieve that using the driving query pattern: the idea is to use an item processor that queries the database to fetch all records related to the current item.
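A minimal sketch of that idea, again with a hypothetical LineItem and a related_record table; the file line "drives" the step and the processor enriches each item with its related data:

    import java.util.List;
    import java.util.Map;

    import org.springframework.batch.item.ItemProcessor;
    import org.springframework.jdbc.core.JdbcTemplate;

    // Sketch only: LineItem, its accessors and the related_record table are assumptions.
    public class RelatedRecordProcessor implements ItemProcessor<LineItem, LineItem> {

        private final JdbcTemplate jdbc;

        public RelatedRecordProcessor(JdbcTemplate jdbc) {
            this.jdbc = jdbc;
        }

        @Override
        public LineItem process(LineItem item) {
            // Fetch everything related to the current item in a single query
            List<Map<String, Object>> related = jdbc.queryForList(
                    "SELECT id, name FROM related_record WHERE id = ?",
                    item.getRelatedId());
            item.setRelated(related);   // hypothetical setter on the item
            return item;
        }
    }

Compared with the writer-side sketch above, this issues one query per item rather than one per chunk, so it is best suited to cases where the per-item lookup is cheap or cacheable.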

Import match from Excel unreliable

I have a script set up in FileMaker 11 which should import data from an Excel sheet. There's a field containing a unique number in both the FileMaker database and the .xlsx file, which is used to match already existing entries. "Update matching records in found set" and "Add remaining data as new record" are both enabled.
Unfortunately, FileMaker seems to behave completely arbitrarily here. Using the same script and the same .xlsx file several times in a row, the results are completely unpredictable. Sometimes already existing records are correctly skipped or updated; sometimes they are added a second (or third or fifth …) time.
Is this a bug, maybe specific to version 11, which was sorted out later? Am I missing something about importing?
Here's my official answer to your question:
Imports into FileMaker databases are found set sensitive.
If you're importing records in a method that acts on existing records (update matching), FileMaker uses the found set showing in the layout of your active window to do the matching, rather than the entire table.
Regarding its localness (per your comment above), it allows you to do more precise imports. In a case where you want to make sure you only match specific records (e.g. you have a spreadsheet of data from employees at company A and you only want to update employee records for company A), you could perform a find and limit the found set to just your desired records before importing with matching turned on. This way the import will ONLY look at the records in your found set to do its evaluation. This means less processing, because FileMaker has to look at fewer records, and also less risk that you're going to find a match that you didn't want (depending on what your match criteria are).
I'm having a hard time finding a good and up-to-date reference for you. All I can find is this one, which is from the FM10 days. I would suggest bookmarking the FileMaker 13 help pages. It's the same set of help documents available when you use the Help menu in FileMaker Pro, but I find it much easier to search the docs via a browser.

Force indexing of a filestream in SQL Server 2012

Is it possible to force somehow the indexing service of MS SQL Server 2012 to index a particular filestream/record of a filetable?
If not, is there any way to know if a filestream/record has been indexed?
Thank you very much!
Edit: I found something. I'm not able to index a single file, but I may be able to understand what files have been indexed.
Using this query: EXEC sp_fulltext_keymappings @table_id; you'll know every record that has been indexed, which is better than nothing...
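For reference, a small sketch of that check, with dbo.Documents standing in for the FileTable name:

    -- Sketch only: dbo.Documents is a placeholder for your FileTable.
    DECLARE @table_id int = OBJECT_ID(N'dbo.Documents');

    -- Returns one (docid, key) row per record currently present in the full-text index
    EXEC sp_fulltext_keymappings @table_id;

The returned key values correspond to the table's full-text key index column, so they can be joined back to the FileTable to see which files have not been indexed yet.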
It sounds like you want to full-text index a subset of the files within a single FileTable. (If it's otherwise, clarify your question and I'll edit the answer.) There are two ways you could approach this.
One approach is to use two distinct FileTables (MyTable_A and MyTable_B), place the files you want indexed in MyTable_A and the non-indexed ones in MyTable_B, then apply a full-text index to A but not B. If you need the files to appear in a unified fashion within SQL, just gate access through a view that UNIONs the two FileTables. A potential pitfall is that this requires two distinct directory structures; if you need a unified file system structure, this approach won't work.
Another approach is to create an INDEXED VIEW of the files you want full-text indexed, then apply a full-text index to the view. Disclaimer: I have not tried this approach, but apparently it works.

How can I limit DataSet.WriteXML output to typed columns?

I'm trying to store a lightly filtered copy of a database for offline reference, using ADO.NET DataSets. There are some columns I need to leave behind. So far, it looks like my options are:
Put up with the columns
Get unmaintainably clever about the way I SELECT rows for the DataSet
Hack at the XML output to delete the columns
I've deleted the columns' entries in the DataSet designer. WriteXml still outputs them, to my dismay. If there's a way to limit WriteXml's output to typed rows, I'd love to hear it.
I tried to filter the columns out with careful SELECT statements, but ended up with a ConstraintException I couldn't solve. Replacing one table's query with SELECT * did the trick. I suspect I could solve the exception given enough time. I also suspect it could come back again as we evolve the schema. I'd prefer not to hand such a maintenance problem to my successors.
All told, I think it'll be easiest to filter the XML output. I need to compress it, store it, and later load, decompress, and read it back into a DataSet. Filtering the XML is only one more step and, better yet, will only need to happen once a week or so.
Can I change DataSet's behaviour? Should I filter the XML? Is there some fiendishly simple way I can query pretty much, but not quite, everything without running into ConstraintException? Or is my approach entirely wrong? I'd much appreciate your suggestions.
UPDATE: It turns out I copped ConstraintException for a simple reason: I'd forgotten to delete a strongly typed column from one DataTable. It wasn't allowed to be NULL. When I selected all the columns except that column, the value was NULL, and… and, yes, that's profoundly embarrassing, thank you so much for asking.
It's as easy as Table.Columns.Remove("UnwantedColumnName"). I got the lead from Mehrdad's wonderfully terse answer to another question. I was delighted when Table.Columns turned out to be malleable.
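A minimal sketch of that approach, assuming a hypothetical Customers table with an UnwantedColumnName column (fill the DataSet however you do today):

    using System.Data;

    class OfflineExport
    {
        static void Main()
        {
            // Sketch only: table and column names are assumptions;
            // fill the DataSet from your adapters as usual before this point.
            DataSet ds = new DataSet();
            // ... ds filled here ...

            // Drop the unwanted columns before serializing
            DataTable customers = ds.Tables["Customers"];
            if (customers != null && customers.Columns.Contains("UnwantedColumnName"))
                customers.Columns.Remove("UnwantedColumnName");

            // WriteXml now emits only the remaining columns
            ds.WriteXml("offline-copy.xml", XmlWriteMode.WriteSchema);
        }
    }

If a column participates in a relation or constraint, Remove will throw, so clear those first or remove the constraint along with the column.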