XPath: select nodes based on another, related node

I have an XML document that contains two groups of related values:
<Rows>
<!-- first group -->
<Row>
<Sequence>100</Sequence>
<Value>+</Value>
</Row>
<Row>
<Sequence>105</Sequence>
<Value>+</Value>
</Row>
<Row>
<Sequence>110</Sequence>
<Value>-</Value>
</Row>
<!-- second group -->
<Row>
<Sequence>150</Sequence>
<Value>20</Value>
</Row>
<Row>
<Sequence>155</Sequence>
<Value>15</Value>
</Row>
<Row>
<Sequence>160</Sequence>
<Value>90</Value>
</Row>
</Rows>
Each element of the first group is related to an element of the second group: sequence -> sequence + 50.
I need to get a node set of the nodes from the second group whose related nodes in the first group contain a + sign (the nodes with sequences 150 and 155).
These nodes are needed for later sorting and enumerating.
/Rows/Row[contains(/Rows/Row[Sequence = (./Sequence - 50)]/Value, '+')]
I tried the XPath above, but it fails because ./ refers to the current context (inside the inner brackets), whereas I need to refer to the outer one (inside the first brackets).
Does anybody know a solution for that?
Regards, Aikin
P.S. substring(./Sequence - 50, 1, 3) is used to get

Just turn the query upside down:
/Rows/Row[/Rows/Row[contains(Value, '+')]/Sequence = Sequence - 50]
It's also worth noting that xsl:key may be useful here to speed things up.
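Outside XSLT, the idea behind xsl:key (build an index once, then look rows up in it) can be sketched in Python with the standard library. This is only an illustration that assumes the sample document from the question as input; the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Sample document from the question.
XML = """<Rows>
  <Row><Sequence>100</Sequence><Value>+</Value></Row>
  <Row><Sequence>105</Sequence><Value>+</Value></Row>
  <Row><Sequence>110</Sequence><Value>-</Value></Row>
  <Row><Sequence>150</Sequence><Value>20</Value></Row>
  <Row><Sequence>155</Sequence><Value>15</Value></Row>
  <Row><Sequence>160</Sequence><Value>90</Value></Row>
</Rows>"""

rows = ET.fromstring(XML).findall("Row")

# Index step (the role xsl:key would play): sequences of rows whose Value contains '+'.
plus_sequences = {
    int(row.findtext("Sequence"))
    for row in rows
    if "+" in row.findtext("Value")
}

# Lookup step: keep the rows whose Sequence - 50 is found in the index.
selected = [
    row for row in rows
    if int(row.findtext("Sequence")) - 50 in plus_sequences
]

selected_sequences = [row.findtext("Sequence") for row in selected]
print(selected_sequences)  # ['150', '155']
```

The index is built in one pass, so each row needs only a set lookup instead of a rescan of all rows.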

There is a distinction between the context node (.) and the current node (current()) in XPath/XSLT. See also this related question:
Current node vs. Context node in XSLT/XPath?
In your example you would need to use the XSLT function current() to refer to the current node, e.g.
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="Row">
<Row1>
<xsl:copy-of select="."/>
</Row1>
<Row2>
<xsl:copy-of select="/Rows/Row[./Sequence = (current()/Sequence + 50) and current()/Value = '+']"/>
</Row2>
</xsl:template>
</xsl:stylesheet>
The above short XSLT snippet will copy each row with a corresponding row from the second group to the output document with the condition you gave in your question (difference is 50 and the first row has a + value).
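The difference between the two nodes can also be pictured as two nested loops. The following Python sketch is only an analogy, with hypothetical names: the outer loop variable stands for the node the template matched (current()), and the inner one for the node being tested inside the predicate (.).

```python
from collections import namedtuple

# Hypothetical stand-in for the <Row> elements of the question.
Row = namedtuple("Row", ["sequence", "value"])

rows = [
    Row(100, "+"), Row(105, "+"), Row(110, "-"),
    Row(150, "20"), Row(155, "15"), Row(160, "90"),
]

def related_rows(current, all_rows):
    """Mimic /Rows/Row[Sequence = current()/Sequence + 50 and current()/Value = '+']."""
    return [
        context                      # '.' inside the predicate
        for context in all_rows
        if context.sequence == current.sequence + 50
        and current.value == "+"     # 'current()' stays fixed while the predicate runs
    ]

# One entry per matched row, as the template would produce.
pairs = {
    current.sequence: [r.sequence for r in related_rows(current, rows)]
    for current in rows
}
print(pairs[100], pairs[105], pairs[110])
```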

Related

Burst Mode Vs Fully Streaming XSLT

I have written an XSLT stylesheet to transform a huge incoming XML file to JSON using burst-mode streaming. I am new to XSLT and have heard that fully streaming XSLT code is more efficient and faster than burst mode.
Can someone please help me understand:
1. What is the difference between burst mode and fully streaming?
2. How can I convert the XSLT code below to fully streaming to improve the performance?
Below is my burst-mode XSLT code:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:wd="urn:com.workday.report/INT1109_CR_REV_Customer_Invoices_to_Connect" exclude-result-prefixes="xs" version="3.0">
<xsl:mode streamable="yes" on-no-match="shallow-skip"/>
<xsl:output method="text" encoding="UTF-8" indent="no"/>
<xsl:template match="wd:Report_Data">
<xsl:iterate select="wd:Report_Entry/copy-of()">
<!--Define Running Totals for Statistics -->
<xsl:param name="TotalHeaderCount" select="0"/>
<xsl:param name="TotalLinesCount" select="0"/>
<!--Write Statistics -->
<xsl:on-completion>
<xsl:text>{"Stats": </xsl:text>
<xsl:text>{"Total Header Count": </xsl:text>
<xsl:value-of select="$TotalHeaderCount"/>
<xsl:text>,</xsl:text>
<xsl:text>"Total Lines Count": </xsl:text>
<xsl:value-of select="$TotalLinesCount"/>
<xsl:text>}}</xsl:text>
</xsl:on-completion>
<!--Write Header Details -->
<xsl:text>{"id": "</xsl:text>
<xsl:value-of select="wd:id"/>
<xsl:text>",</xsl:text>
<xsl:text>"revenue_stream": "</xsl:text>
<xsl:value-of select="wd:revenue_stream"/>
<xsl:text>",</xsl:text>
<!--Write Line Details -->
<xsl:text>"lines": [ </xsl:text>
<!-- Count the number of lines for an invoice -->
<xsl:variable name="Linescount" select="wd:total_lines"/>
<xsl:iterate select="wd:lines">
<xsl:text> {</xsl:text>
<xsl:text>"sequence": </xsl:text>
<xsl:value-of select="wd:sequence"/>
<xsl:text>,</xsl:text>
<xsl:text>"sales_item_id": "</xsl:text>
<xsl:value-of select="wd:sales_item_id"/>
<xsl:text>",</xsl:text>
</xsl:iterate>
<xsl:text>}]}
</xsl:text>
<!--Store Running Totals -->
<xsl:next-iteration>
<xsl:with-param name="TotalHeaderCount" select="$TotalHeaderCount + 1"/>
<xsl:with-param name="TotalLinesCount" select="$TotalLinesCount + $Linescount"/>
</xsl:next-iteration>
</xsl:iterate>
</xsl:template>
</xsl:stylesheet>
Here is the sample XML -
<?xml version="1.0" encoding="UTF-8"?>
<wd:Report_Data xmlns:wd="urn:com.workday.report/INT1109_CR_REV_Customer_Invoices_to_Connect">
<wd:Report_Entry>
<wd:id>CUSTOMER_INVOICE-6-1</wd:id>
<wd:revenue_stream>TESTA</wd:revenue_stream>
<wd:total_lines>1</wd:total_lines>
<wd:lines>
<wd:sequence>ab</wd:sequence>
<wd:sales_item_id>Administrative Cost</wd:sales_item_id>
</wd:lines>
</wd:Report_Entry>
<wd:Report_Entry>
<wd:id>CUSTOMER_INVOICE-6-10</wd:id>
<wd:revenue_stream>TESTB</wd:revenue_stream>
<wd:total_lines>1</wd:total_lines>
<wd:lines>
<wd:sequence>ab</wd:sequence>
<wd:sales_item_id>Data - Web Access</wd:sales_item_id>
</wd:lines>
</wd:Report_Entry>
</wd:Report_Data>
If the order of properties in the JSON doesn't matter, then you could directly create XSLT/XPath 3 maps and arrays with xsl:map/xsl:map-entry (or the XPath 3.1 map constructor) and the Saxon-specific extension element saxon:array (unfortunately the XSLT 3 language standard lacks an instruction to create an array). Furthermore, most of your iteration parameters seem to be easily implemented as accumulators:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:saxon="http://saxon.sf.net/"
extension-element-prefixes="saxon"
xpath-default-namespace="urn:com.workday.report/INT1109_CR_REV_Customer_Invoices_to_Connect"
exclude-result-prefixes="#all"
version="3.0">
<xsl:output method="adaptive" indent="yes"/>
<xsl:mode streamable="yes" use-accumulators="#all" on-no-match="shallow-skip"/>
<xsl:accumulator name="header-count" as="xs:integer" initial-value="0" streamable="yes">
<xsl:accumulator-rule match="Report_Entry" select="$value + 1"/>
</xsl:accumulator>
<xsl:accumulator name="lines-count" as="xs:integer" initial-value="0" streamable="yes">
<xsl:accumulator-rule match="Report_Entry/total_lines/text()" select="$value + xs:integer(.)"/>
</xsl:accumulator>
<xsl:template match="Report_Data">
<xsl:apply-templates/>
<xsl:sequence
select="map {
'Stats': map {
'Total Header Count' : accumulator-after('header-count'),
'Total Lines Count' : accumulator-after('lines-count')
}
}"/>
</xsl:template>
<xsl:template match="Report_Entry">
<xsl:map>
<xsl:apply-templates/>
</xsl:map>
</xsl:template>
<xsl:template match="Report_Entry/id | Report_Entry/revenue_stream | lines/sequence | lines/sales_item_id">
<xsl:map-entry key="local-name()" select="string()"/>
</xsl:template>
<xsl:template match="Report_Entry/lines">
<xsl:map-entry key="local-name()">
<saxon:array>
<xsl:apply-templates/>
</saxon:array>
</xsl:map-entry>
</xsl:template>
</xsl:stylesheet>
The example uses output method adaptive as your current sample doesn't create a single JSON object and I have simply tried to create the same output as your current code; the JSON output method would need a single map or array as the main sequence result.
The code works with streaming in Saxon EE 9.9.1.1 in oXygen; unfortunately, 9.8 doesn't consider the code streamable.
As for general rules, there are limits as to what you can achieve with accumulators and template matching when streaming; as you can see, the accumulator to sum up the values from the total_lines elements needs to match on the text child to not consume the element in the accumulator (Saxon has another extension of capturing accumulators to ease such tasks however).
So far I would say it is more important to find a way to get past the streamability analysis and to have the streamable code return the same, correct result as the non-streamable code. For instance, I experimented with generating JSON with streaming in two transformation steps: sample data similar to yours is the input, the first transformation produces the XML representation of JSON, and xml-to-json applied to that result is supposed to produce the JSON. While doing so I ran into a Saxon bug: https://saxonica.plan.io/issues/4215.
With streaming, it seems there is not enough test coverage or implementation maturity to be able to combine features reliably in a complex and scalable way, partly due to a complex spec, partly due to the limited use of that stuff by the XSLT community.
So if you find a working way for a particular problem to use streaming to keep memory consumption lower or manageable compared to the normal XSLT 2/3 tree based processing then you can of course experiment with changes to improve performance but it is easy to break things.
One general observation is that streaming allows you to access all attributes of the currently processed/matched element but not its children. It can therefore help immensely to insert a processing step that transforms elements into attributes if you have a simple child element structure. That way you can often avoid any copy-of(). Of course you then need a way to combine two stylesheets; Saxon allows that with its API, but doing so requires writing Java or .NET code.
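As a rough analogy outside XSLT, the burst-mode pattern from the question (materialize one Report_Entry at a time as a small tree, write it out, keep running totals) can be sketched with Python's incremental parser. This is not XSLT 3.0 streaming and says nothing about Saxon's behaviour. The function name is hypothetical and the input is the sample XML from the question.

```python
import io
import json
import xml.etree.ElementTree as ET

NS = "{urn:com.workday.report/INT1109_CR_REV_Customer_Invoices_to_Connect}"

SAMPLE = b"""<wd:Report_Data xmlns:wd="urn:com.workday.report/INT1109_CR_REV_Customer_Invoices_to_Connect">
  <wd:Report_Entry>
    <wd:id>CUSTOMER_INVOICE-6-1</wd:id>
    <wd:revenue_stream>TESTA</wd:revenue_stream>
    <wd:total_lines>1</wd:total_lines>
    <wd:lines>
      <wd:sequence>ab</wd:sequence>
      <wd:sales_item_id>Administrative Cost</wd:sales_item_id>
    </wd:lines>
  </wd:Report_Entry>
  <wd:Report_Entry>
    <wd:id>CUSTOMER_INVOICE-6-10</wd:id>
    <wd:revenue_stream>TESTB</wd:revenue_stream>
    <wd:total_lines>1</wd:total_lines>
    <wd:lines>
      <wd:sequence>ab</wd:sequence>
      <wd:sales_item_id>Data - Web Access</wd:sales_item_id>
    </wd:lines>
  </wd:Report_Entry>
</wd:Report_Data>"""

def burst_convert(source):
    """Yield one JSON string per Report_Entry, then a final statistics object."""
    header_count = 0
    lines_count = 0
    # iterparse reports an element once its end tag has been read,
    # so each Report_Entry is available as a complete small subtree.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != NS + "Report_Entry":
            continue
        entry = {
            "id": elem.findtext(NS + "id"),
            "revenue_stream": elem.findtext(NS + "revenue_stream"),
            "lines": [
                {
                    "sequence": line.findtext(NS + "sequence"),
                    "sales_item_id": line.findtext(NS + "sales_item_id"),
                }
                for line in elem.findall(NS + "lines")
            ],
        }
        # Running totals, comparable to the iterate parameters / accumulators.
        header_count += 1
        lines_count += int(elem.findtext(NS + "total_lines"))
        yield json.dumps(entry)
        elem.clear()  # drop the children of the finished entry
    yield json.dumps({"Stats": {"Total Header Count": header_count,
                                "Total Lines Count": lines_count}})

results = list(burst_convert(io.BytesIO(SAMPLE)))
for line in results:
    print(line)
```

Only one entry's subtree is populated at any time, which is the property that keeps memory use manageable for large inputs.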

Generate XML root element with attribute for XML in Spark(scala) using databricks

I want to create nested XML from a CSV/DataFrame in Scala Spark. I am using the Databricks spark-XML library to convert the DataFrame to XML format.
I am trying to create output like the one below, but I am unable to achieve it:
<rows version="1">
<row>
<name id=10>Mahashree</name>
</row>
</rows>
I have tried with a struct:
{"_VALUE":"Mahashree","#id":10,"__version":1}
but the result was:
<rows>
<row>
<name id=10>Mahashree
<__version>1</__version>
</name>
</row>
</rows>
Can anyone help me put the attribute on the root element?
Thanks in advance.

FileMaker: values from related table

I have 2 tables, DATA and IMAGES, in a relationship based on item_number.
Each record in DATA has 3 related records in IMAGES.
For example, a record with item_number 010050 is linked to these records in IMAGES:
010050.eps
010050_table.tif
010050_drawing.png
In the corresponding record in DATA I have the fields:
main
table
drawing
My aim is to set the fields in DATA like:
010050.eps => main
010050_table.tif => table
010050_drawing.png => drawing
I tried:
ExecuteSQL("SELECT filename FROM images WHERE filename = ?"; ""; ""; "010050_drawing")
Could anyone give me a hint?
I couldn't figure out how to do that in XSLT.
To take a simplified example, suppose you have exported from your Data table, with only two fields included in the export: Data::ItemNumber and Images::FileName. Your raw export would then look something like this (two records shown):
XML
<?xml version="1.0" encoding="UTF-8"?>
<FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">
<ERRORCODE>0</ERRORCODE>
<PRODUCT BUILD="" NAME="FileMaker" VERSION=""/>
<DATABASE DATEFORMAT="" LAYOUT="" NAME="" RECORDS="" TIMEFORMAT=""/>
<METADATA>
<FIELD EMPTYOK="YES" MAXREPEAT="1" NAME="ItemNumber" TYPE="NUMBER"/>
<FIELD EMPTYOK="YES" MAXREPEAT="1" NAME="Images::FileName" TYPE="NUMBER"/>
</METADATA>
<RESULTSET FOUND="2">
<ROW MODID="0" RECORDID="1">
<COL>
<DATA>010050</DATA>
</COL>
<COL>
<DATA>010050.eps</DATA>
<DATA>010050_table.tif</DATA>
<DATA>010050_drawing.png</DATA>
</COL>
</ROW>
<ROW MODID="0" RECORDID="2">
<COL>
<DATA>2345</DATA>
</COL>
<COL>
<DATA>2345_extra.gif</DATA>
<DATA>2345_table.gif</DATA>
<DATA>2345_drawing.jpg</DATA>
<DATA>2345.tiff</DATA>
</COL>
</ROW>
</RESULTSET>
</FMPXMLRESULT>
Note the different order as well as the extra image in the second record.
After applying the following stylesheet:
XSLT
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fmp="http://www.filemaker.com/fmpxmlresult"
exclude-result-prefixes="fmp">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<root>
<xsl:for-each select="fmp:FMPXMLRESULT/fmp:RESULTSET/fmp:ROW">
<xsl:variable name="item-no" select="fmp:COL[1]/fmp:DATA"/>
<item>
<number>
<xsl:value-of select="$item-no"/>
</number>
<main>
<xsl:value-of select="fmp:COL[2]/fmp:DATA[starts-with(., concat($item-no, '.'))]"/>
</main>
<table>
<xsl:value-of select="fmp:COL[2]/fmp:DATA[starts-with(., concat($item-no, '_table.'))]"/>
</table>
<drawing>
<xsl:value-of select="fmp:COL[2]/fmp:DATA[starts-with(., concat($item-no, '_drawing.'))]"/>
</drawing>
</item>
</xsl:for-each>
</root>
</xsl:template>
</xsl:stylesheet>
you would obtain:
Result
<?xml version="1.0" encoding="UTF-8"?>
<root>
<item>
<number>010050</number>
<main>010050.eps</main>
<table>010050_table.tif</table>
<drawing>010050_drawing.png</drawing>
</item>
<item>
<number>2345</number>
<main>2345.tiff</main>
<table>2345_table.gif</table>
<drawing>2345_drawing.jpg</drawing>
</item>
</root>
Honestly, unless there's more happening than what you've described, you're making this too hard.
If there will always be exactly 3 IMAGE records for each DATA record, and especially if you want to directly relate each image to a specific type, you don't need the IMAGE table. A much simpler solution would be to add 3 container fields to your DATA table (say, main_image, table_image, drawing_image). You wouldn't need SQL or a portal, and you wouldn't have to worry about keeping related images linked to their type.
OTOH, if you truly need a separate table, consider hard-wiring three dedicated 1:1 relationships, one for each image type. To do this, add 3 special key fields in DATA: (main_image_id, etc.) and create a relationship for each image type, each between one of these fields and the primary IMAGE. Then, whenever you create a DATA record and 3 related IMAGE records, put the key of the matching image in the matching DATA::xxx_image_id field. That way, you'd have a clean 1:1 relationship, and each related image would be locked to its type.
Thanks for your replies.
I need the values of the images in those fields so that they can be exported to XML and InDesign (into table cells).
The issue is that I know the filenames, namely the item_number followed by _drawing or _table, but I don't know the extension beforehand (there are many files from different sources),
so the file formats differ.
My solution now is:
- I imported all the images into a related FMP table (related on item_number, which is also part of the filename)
- I calculated whether the file is a table, drawing or illustration and put that value in a field 'kind' by:
Case (
PatternCount ( filename ; "_drawing." ) ;
"drawing" ;
PatternCount ( filename ; "_table." ) ;
"table" ;
Length ( filename )=10 ;
"main" ;
""
)
- then I did the query in each of the fields, main - table - drawing. The one below is the calculation in the 'table' field.
ExecuteSQL("
SELECT path_xml
FROM images
WHERE item_number = ? and kind = ?";
"";""; item_number ; "table"
)
Perhaps a bit clumsy, but it works now.
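For comparison, the classification step can be sketched outside FileMaker. The Python below is a hypothetical helper, not part of the solution above. It compares the filename without its extension to the item number, where the Case calculation checks Length ( filename ) = 10, so it does not depend on the length of the item number or of the extension.

```python
from pathlib import PurePath

def classify(filename, item_number):
    """Return 'main', 'table', 'drawing', or '' for an image filename."""
    stem = PurePath(filename).stem  # filename without its last extension
    if stem == f"{item_number}_drawing":
        return "drawing"
    if stem == f"{item_number}_table":
        return "table"
    if stem == item_number:
        return "main"
    return ""  # anything else, e.g. an extra image

filenames = ["010050.eps", "010050_table.tif", "010050_drawing.png"]
kinds = {name: classify(name, "010050") for name in filenames}
print(kinds)
```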

XPath: what's the difference between "begin with one slash" and "begin with 2 slashes"?

I have read some XPath code; some expressions begin with "/xxxx" and some begin with "//xxxx". What is the difference? Do they behave differently only in "Select", or in other places as well?
I didn't find corresponding explanations on this site, any hints?
Thanks.
An XPath that begins with a single slash starts at the root of the document, so /xxxx will match only an <xxxx> element that is the root element of the XML.
Example:
<?xml version="1.0"?>
<xxxx> <!-- this one will match -->
<level>
<xxxx /> <!-- this one won't -->
</level>
</xxxx>
Whereas //xxxx will match all <xxxx> elements anywhere in the document.
Example:
<?xml version="1.0"?>
<xxxx> <!-- this one will match -->
<level>
<xxxx /> <!-- this one will match as well -->
<sublevel>
<xxxx /> <!-- and also this one -->
</sublevel>
</level>
</xxxx>
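The two results can be counted with a short Python sketch. Python's standard ElementTree module supports only a subset of XPath and does not accept absolute paths on an element, so the two expressions are emulated here with plain Python rather than evaluated as XPath.

```python
import xml.etree.ElementTree as ET

DOC = """<xxxx>
  <level>
    <xxxx/>
    <sublevel>
      <xxxx/>
    </sublevel>
  </level>
</xxxx>"""

root = ET.fromstring(DOC)

# /xxxx: only the document element, and only if it is named xxxx.
single_slash = [root] if root.tag == "xxxx" else []

# //xxxx: every xxxx element at any depth, the document element included.
double_slash = list(root.iter("xxxx"))

print(len(single_slash), len(double_slash))  # 1 3
```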

Xforms checkbox - replacing True / False with Y / N

I am writing a form (betterFORMs/Xforms) to be displayed to the user with a selection of checkboxes.
If a checkbox is empty the form should bind an "N" into the element. When ticked, a "Y".
I realise there are answers to this question already but I have tried all answered solutions with no luck.
The first solution I attempted to use is here - stackoverflow link
(the first solution looks good, but I have had more success with solution 2 as I am not using Orbeon)
The answer given is what I am looking for, but I am having trouble implementing it in my form. I am not using Xhtml or Orbeon, so the binding I use seems to be different from the one used in the solution. I have tried tweaking this code to fit into my form, but I get a repeated error from the XML parser every time I load the page, and it doesn't point me to anything related to the new code.
The next solution I have tried is found here - stackoverflow link
This answer has had the best results in my code because the checkbox values change to N when not used and when they are deselected. The problem I have with this solution is that the Y set in the form element is enclosed in square brackets: [].
output example:
<addressProof>N</addressProof><other>[Y]</other><otherText>_text_</otherText>
Here is a snippet of my form:
model:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns="http://www.w3.org/2002/06/xhtml2" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xforms="http://www.w3.org/2002/xforms" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsl:output method="xml" />
<xsl:template match="/">
<xforms:model id="documentsChecklist">
<xforms:instance>
<actionform xmlns="">
<detail>
<other></other>
<otherText></otherText>
</detail>
</actionform>
</xforms:instance>
<xforms:bind id="other" nodeset="/actionform/detail/other" calculate="choose(. = 'Y', ., 'N')"/>
<xforms:bind nodeset="/actionform/detail/otherBox" relevant="/actionform/detail/other ='Y'" />
</xforms:model>
form:
<div id="formBody"><br />
<xforms:select bind="other" appearance="full" align="right">
<xforms:item>
<xforms:label>Other</xforms:label>
<xforms:value>Y</xforms:value>
</xforms:item>
</xforms:select>
<xforms:input ref="/actionform/detail/otherText">
<xforms:label>Please specify:*</xforms:label>
</xforms:input>
</div>
</xsl:template>
</xsl:stylesheet>
Why does the checkbox value now get set to "[Y]" instead of "Y"? (Could it be something to do with an Array?) Thanks.
PS. I know I could do this using a boolean for each checkbox, with the checkbox value corresponding to the boolean, which in turn updates the bind value. I would rather not have a big block of boolean elements and binds, as I have a large number of checkboxes. This solution has an example here - stackoverflow link
A select control allows you to select more than one item, and I wonder whether that is why the XForms implementation you are using adds square brackets (according to the specifications, selected values have to be separated by a space character, which is not always very convenient, by the way).
I am afraid that XForms 1.1 and XForms 2.0 require the use of extra intermediate nodes and bindings. It would be useful to be able to add 2 XPath expressions for bindings: one to convert the node value to the control value and another to convert the control value back to the node value.
As a workaround, I use another extension in XSLTForms: XSLT stylesheets for converting instances.
-Alain