Kafka connect - file pulse - 'xml attribute' extraction - apache-kafka

I am trying to use the file pulse connector to read XML file.
I am new to Kafka/Kafka Connect/XML processing
For file like below, I'd like to keep the data "unit", and the "string1", "string2".
currently, by default the processed payload drop them.
<?xml version="1.0" encoding="UTF-8"?>
<data>
<someField>someValue</someField>
<anotherField-I-Forced-the-type-to-Array>
<value unit="string1">123</value>
<value unit="string2">456</value>
</anotherField-I-Forced-the-type-to-Array>
<lastField>lastValue</lastField>
</data>
Does some kind of configruation already exist?
I have not found the configuration in the doc https://streamthoughts.github.io/kafka-connect-file-pulse/docs/developer-guide/file-readers/
Please help and maybe give some examples if there are already solution exist.
currently I got this payload. You can see unit and its value string1, string2 are gone.
"anotherField-I-Forced-the-type-to-Array": [
{
"value": [
"123",
"456"
]
}
],
ps. The version I used is 1.5.2 downloaded zip from here https://github.com/streamthoughts/kafka-connect-file-pulse/releases
curious, based on this article: https://medium.com/streamthoughts/streaming-data-into-kafka-s01-e02-loading-xml-file-21b5e69c645
the playlist does have 'name' attribute' and it was not lost.
<playlist name="BestOfStarWars">

FYI This is now fixed in 1.5.3 version very quickly

Related

Automatically create i18n directory for VSCode extension

I am trying to understand the workflow presented in https://github.com/microsoft/vscode-extension-samples/tree/master/i18n-sample for localizing Visual Studio Code extensions.
I cannot figure out how the i18n directory gets created to begin with, as well as how the set of string keys in that directory get maintained over time.
There is one line in the README.md which says "You could have created this folder by hand, or you could have used the vscode-nls-dev tool to extract it."...how would one use vscode-nls-dev tool to extract it?
What I Understand
I understand that you can use vscode-nls, and wrap strings like this: localize("some.key", "My String") to pick up the localized version of that string at runtime.
I am pretty sure I understand that vscode-nls-dev is used at build time to substitute the content of files in the i18n directory into the transpiled JavaScript code, as well as creating files like out/extension.nls.ja.json
What is missing
Surely it is not expected that: for every file.ts file in your project you create an i18n/lang/out/file.i18n.json for every lang you support...and then keep the set of keys in that file up to date manually with every string change.
I am assuming that there is some process which automatically goes "are there any localize("key", "String") calls in file.ts for new keys not yet in file.i18n.json? If so, add those keys with some untranslated values". What is that process?
I have figured this out, referencing https://github.com/Microsoft/vscode-extension-samples/issues/74
This is built to work if you use Transifex for your translator. At the bare minimum you need to use .xlf files as your translation file format.
I think that this is best illustrated with an example, so lets say you wanted to get the sample project working after you had deleted the i18n folder
Step 1: Clone that project, and delete the i18n directory
Step 2: Modify the gulp file so that the compile function also generates nls metadata files in the out directory. Something like:
function compile(buildNls) {
var r = tsProject.src()
.pipe(sourcemaps.init())
.pipe(tsProject()).js
.pipe(buildNls ? nls.rewriteLocalizeCalls() : es.through())
.pipe(buildNls ? nls.createAdditionalLanguageFiles(languages, 'i18n', 'out') : es.through())
.pipe(buildNls ? nls.bundleMetaDataFiles('ms-vscode.node-debug2', 'out') : es.through())
.pipe(buildNls ? nls.bundleLanguageFiles() : es.through())
Step 3: Run the gulp build command. This will generate several necessary metadata files in the out/ directory
Step 4: Create and run a new gulp function to export the necessarry translations to the xlf file. Something like:
gulp.task('export-i18n', function() {
return gulp.src(['package.nls.json', 'out/nls.metadata.header.json', 'out/nls.metadata.json'])
.pipe(nls.createXlfFiles("vscode-extensions", "node-js-debug2"))
.pipe(gulp.dest(path.join('vscode-translations-export')));
}
Step 5: Get the resulting xlf file translated. Or, add some dummy values. I cant find if/where there is documentation for the file format needed, but this worked for me (for the extension):
<?xml version="1.0" encoding="utf-8"?>
<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
<file original="package" source-language="en" target-language="ja" datatype="plaintext"><body>
<trans-unit id="extension.sayHello.title">
<source xml:lang="en">Hello</source>
<target>JA_Hello</target>
</trans-unit>
<trans-unit id="extension.sayBye.title">
<source xml:lang="en">Bye</source>
<target>JA_Bye</target>
</trans-unit>
</body></file>
<file original="out/extension" source-language="en" target-language="ja" datatype="plaintext"><body>
<trans-unit id="sayHello.text">
<source xml:lang="en">Hello</source>
<target>JA_Hello</target>
</trans-unit>
</body></file>
<file original="out/command/sayBye" source-language="en" target-language="ja" datatype="plaintext"><body>
<trans-unit id="sayBye.text">
<source xml:lang="en">Bye</source>
<target>JA_Bye</target>
</trans-unit>>
</body></file>
</xliff>
Step 6: Stick that file in some known location, let's say /path/to/translation.xlf. Then add/run another new gulp task to import the translation. Something like:
gulp.task('i18n-import', () => {
return es.merge(languages.map(language => {
console.log(language.folderName)
return gulp.src(["/path/to/translation.xlf"])
.pipe(nls.prepareJsonFiles())
.pipe(gulp.dest(path.join('./i18n', language.folderName)));
}));
});
Step 7: Run the gulp build again.
The i18n/ directory should now be recreated correctly! Running the same build/export/translate/import/build steps will pick up any new changes to the localize() calls in your TypeScript code
Obviously this is not perfect, there are a lot of hardcoded paths and such, but hopefully it helps out anyone else who hits this issue.

XSLT streaming not streaming

I'm using Saxon-EE for the purpose of streaming XSLT transformation of large XML. The transformation works fine but it seems it's not really streaming since the java.exe process is inflating: for a 100 MB XML, process memory increases ~1GB. This is the XSLT:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:fo="http://www.w3.org/1999/XSL/Format"
xmlns:bb="urn:xx-zz-1.1"
xmlns:aa="urn:xx-yy-1.1">
<xsl:mode streamable="yes"/>
<xsl:output method="text" omit-xml-declaration="yes" indent="no"/>
<xsl:template match="/">
<xsl:for-each select="aa:LevelOne/aa:LevelTwo">
<xsl:iterate select="bb:LevelThree! copy-of(.)">
<xsl:value-of select="concat(bb:fieldOne,',',bb:fieldTwo,'
')"/>
</xsl:iterate>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
This is the XML:
<?xml version="1.0" encoding="utf-8"?>
<aa:LevelOne xmlns="urn:xx-zz-1.1" xmlns:aa="urn:xx-yy-1.1">
<aa:LevelTwo xmlns="urn:xx-zz-1.1" xmlns:aa="urn:xx-yy-1.1">
<LevelThree xmlns="urn:xx-zz-1.1">
<fieldOne>f1</fieldOne>
<fieldTwo>f2</fieldTwo>
</LevelThree>
<!-- Level three is repeated many times -->
</aa:LevelTwo>
</aa:LevelOne>
I would like to know if there (& what) is a problem with the XSLT above.
The code I use:
net.sf.saxon.s9api.Processor processor = new net.sf.saxon.s9api.Processor(true);
processor.setConfigurationProperty(Feature.STREAMABILITY, "standard");
XsltCompiler compiler = processor.newXsltCompiler();
XsltExecutable stylesheet = compiler.compile(new StreamSource(stylesheetFile));
Serializer out = processor.newSerializer(outputCsvFile);
Xslt30Transformer transformer = stylesheet.load30();
transformer.applyTemplates(new StreamSource(xmlFile), out);
EDIT: Fixed the XSLT so it compiles & added XML example.
Remark: using command java -cp "<path>\test;<path>\saxon9ee.jar" com.example.test.Test -t does not ouput additional info (only the printlns in the code). java -cp "<path>\test;<path>\saxon9ee.jar" -t com.example.test.Test outputs: Unrecognized option: -Xt Error: Could not create the Java Virtual Machine. If I change the XSLT to non-streamable rule e.g. remove the iterate line, program outputs Template rule is not streamable, also without -t option. In this case if I remove the streamability requirement from code/xslt, the error goes away.
Thanks.
Probably the most likely reason Saxon would fall back to non-streaming mode is that it hasn't located a Saxon-EE license. The easiest way to test that is (unintuitively!) by calling processor.isSchemaAware() - that will only be true if you're running Saxon-EE code with a recognized license, which is exactly the same condition to enable streaming.
If it hasn't found a license, the Saxon documentation includes a section on troubleshooting license problems at http://www.saxonica.com/documentation/index.html#!about/license
Also, try it from the command line with option -t; that will give you more information (a) about streaming, and (b) about loading of license files.
I think, if the data is as simple and as regular as shown in the question, then you can avoid the use of copy-of() and simply use
<xsl:iterate select="bb:LevelThree">
<xsl:value-of select="bb:*" separator=","/>
<xsl:text>
</xsl:text>
</xsl:iterate>
That in a quick test shows a reduced memory consumption compared to your posted approach.
As for your posted approach not using streaming with Saxon EE 9.9, I have tested the posted XSLT and the input sample with Saxon 9.9 EE from the command line with the -t option and it shows the input is streamed.
I also think the Java code shown is fine to process the file with streaming with Saxon EE.
For a detailed analysis of the memory consumption and any problems you encounter with that it might be better to raise an issue with all details on the Saxonica support site. I am not sure how the memory info Saxon outputs relates exactly to the one you say you see for java.exe.

Set the name of an input control in a jrxml file. Is it possible?

I'd like to set the name of an input control in the jrxml file where it is defined; is that possible?
I know how to set the name of the input control via the Repository Explorer in Jaspersoft Studio, and I know how to set the name of an input control via the Jaspersoft Server.
However, I'd like to set the name of an input control in the jrxml file so that it will be set automatically upon being published to the server. Is there a property to use, similar to the following:
<parameter name="status_date_minimum" class="java.sql.Date">
<property name="some.property.key" vhalue="Minimum Status Date"/>
<defaultValueExpression><![CDATA[java.sql.Date.valueOf(java.time.LocalDate.now().minusYears(10).withMonth(1).withDayOfMonth(1))]]></defaultValueExpression>
</parameter>
As noted by #Siddharth in comments and suggested to me by a co-worker, there is a way to specify the label for the control outside of the user interface.
JasperReports Server associates each report with an XML file that it appears to create around the time it publishes your report to the server. The XML file contains, among other information, the labels for any input controls.
For an example of the XML file, first publish your report to a location on JasperReports Server. For the purpose of this example, the report file name is report.jrxml and the location is path/to/your; JasperReports Server appears to publish your report to path/to/your/report/Main jrxml (per JasperSoft Studio Repository Explorer) or path/to/your/report (per JasperReports Server Web UI).
Second, export your report from JasperReports Server (via the Web UI or via the command line); JasperReports Server will produce a zip file with the following content:
/index.xml
/resources/path/.folder.xml
/resources/path/to/.folder.xml
/resources/path/to/your/.folder.xml
/resources/path/to/your/report.xml
/resources/path/to/your/report_files/main_jrxml.data
main_jrxml.data contains the data from report.jrxml; report.xml contains the labels for any input controls. The content of report.xml may be similar to the following:
<?xml version="1.0" encoding="UTF-8"?>
<reportUnit exportedWithPermissions="true">
<folder>/resources/path/to/your</folder>
<name>report</name>
<version>2</version>
<label>report</label>
<description></description>
<creationDate>2018-03-21T18:12:41.759-04:00</creationDate>
<updateDate>2018-03-21T18:48:35.602-04:00</updateDate>
<mainReport>
<localResource
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
exportedWithPermissions="false" dataFile="main_jrxml.data" xsi:type="fileResource">
<folder>/resources/path/to/your/report_files</folder>
<name>main_jrxml</name>
<version>4</version>
<label>Main jrxml</label>
<creationDate>2018-03-21T18:12:41.759-04:00</creationDate>
<updateDate>2018-03-21T18:48:35.410-04:00</updateDate>
<fileType>jrxml</fileType>
</localResource>
</mainReport>
<inputControl>
<localResource
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
exportedWithPermissions="false" xsi:type="inputControl">
<folder>/resources/path/to/your/report_files</folder>
<name>status_date_minimum</name>
<version>1</version>
<label>status_date_minimum</label>
<creationDate>2018-03-21T18:48:35.602-04:00</creationDate>
<updateDate>2018-03-21T18:48:35.602-04:00</updateDate>
<type>2</type>
<mandatory>false</mandatory>
<readOnly>false</readOnly>
<visible>true</visible>
<dataType>
<localResource exportedWithPermissions="false" xsi:type="dataType">
<folder>/resources/path/to/your/report_files/status_date_minimum_files</folder>
<name>myDatatype</name>
<version>0</version>
<label>myDatatype</label>
<creationDate>2018-03-21T18:48:35.602-04:00</creationDate>
<updateDate>2018-03-21T18:48:35.602-04:00</updateDate>
<type>3</type>
<strictMin>false</strictMin>
<strictMax>false</strictMax>
</localResource>
</dataType>
</localResource>
</inputControl>
<inputControl>
<localResource
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
exportedWithPermissions="false" xsi:type="inputControl">
<folder>/resources/path/to/your/report_files</folder>
<name>status_date_maximum</name>
<version>1</version>
<label>status_date_maximum</label>
<creationDate>2018-03-21T18:48:35.602-04:00</creationDate>
<updateDate>2018-03-21T18:48:35.602-04:00</updateDate>
<type>2</type>
<mandatory>false</mandatory>
<readOnly>false</readOnly>
<visible>true</visible>
<dataType>
<localResource exportedWithPermissions="false" xsi:type="dataType">
<folder>/resources/path/to/your/report_files/status_date_maximum_files</folder>
<name>myDatatype</name>
<version>0</version>
<label>myDatatype</label>
<creationDate>2018-03-21T18:48:35.602-04:00</creationDate>
<updateDate>2018-03-21T18:48:35.602-04:00</updateDate>
<type>3</type>
<strictMin>false</strictMin>
<strictMax>false</strictMax>
</localResource>
</dataType>
</localResource>
</inputControl>
<alwaysPromptControls>true</alwaysPromptControls>
<controlsLayout>1</controlsLayout>
</reportUnit>
You may edit the content of the reportUnit/inputControl/localResource/label element to change the name of the label.
Once edited, you may import the data into the JasperReports Server. If you import through the command line, I recommend importing the directory, not the zip file - it appears that the command line import is picky about the zip format. Also, if you import through the command line, you must restart the JasperReports Server before you may run your changed report.

How to generate test overview in Frisby

As mentioned in frisby official document (http://frisbyjs.com/) , I am using --junitreport something like following
jasmine-node ./demo/validation_spec.js/ --junitreport --output
C:\Users\Administrator\Documents\script/Reports
which is generating 15 xml files. As I am calling 15 time post using for loop. File contains data like following
<?xml version="1.0" encoding="UTF-8" ?>
<testsuites>
<testsuite name="Frisby Test:blank action " errors="0" tests="1" failures="0" time="0.284" timestamp="2017-03-21T13:55:15">
<testcase classname="Frisby Test: blank action " name="
[ POST https://localhost:8443/api/v2/settings/pointwise ]" time="0.282"></testcase>
</testsuite>
</testsuites>
so my difficulty is that I need to read each and every file to check if tests is pass or not. I want some kind of overview page like http://maven.apache.org/surefire/maven-surefire-report-plugin/surefire-report.html
which contain information how many test passed, fail etc. Is it possible to get that.

using curl to interact with REST API - error: "Content not allowed in prolog"

I am sending the following curl command from terminal in Mac OSX:
curl -d "OPERATION_NAME=ADD_REQUEST&TECHNICIAN_KEY=1AD….4&INPUT_DATA=TestData.xml" http://xxx.xx.xx.xx/sdpapi/request/
I am getting back the response:
FailedError when performing - ADD_REQUEST - Content is not allowed in prolog.
Here is my xml file:
<?xml version="1.0" encoding="UTF-8"?>
<Operation>
<Details>
<requester>Me</requester>
<subject>Test</subject>
<description>Testing curl input</description>
</Details>
</Operation>
I have checked and my xml file is indeed a UTF-8 file. I can tell from google searches that this most likely has to do with my encoding, however, I can't find how to fix it.
I've also tried saving the xml file as ANSI on my pc, xml file looks like this:
<?xml version="1.0"?>
<Operation>
<Details>
<requester>Me</requester>
<subject>Test</subject>
<description>Testing curl input</description>
</Details>
</Operation>
I downloaded Notepad++ and checked the encoding, which is UTF8 with no BOM.
I am still getting the same error. Can anyone see what I am doing wrong?
Updated 9/19 to add:
In addition to everything I've tried in the comments below, I've also tried this:
curl -d "OPERATION_NAME=ADD_REQUEST&TECHNICIAN_KEY=xxxxxxxxxxxxxxxxx&INPUT_DATA=<?xml version="1.0" encoding="utf-8"?><Operation><Details><requester>Me</requester><subject>Test</subject><description>Testing curl input</description></Details></Operation>" http://xxx.xx.xx.xx/sdpapi/request/
The error I'm getting now is: "Error when performing - ADD_REQUEST - The value following "version" in the XML declaration must be a quoted string."
Anyone have any thoughts?
I replaced <?xml version="1.0" encoding="utf-8"?> with <?xml version=%221.0%22 encoding=%22utf-8%22?>
It is now working.