Ant Jar task corrupts manifest encoding

Per the JAR specification, the manifest encoding has to be UTF-8.
In some scenarios (e.g. merge), manifests produced by Ant's jar task get corrupted and special characters end up double-encoded.
Original manifest (utf-8):
...
Application-Name: spécial
...
Final manifest (utf-8) after being processed by Ant's jar task:
...
Application-Name: spÃ©cial
...

Because the jar task can process filesets, it lets the developer specify the character encoding of the original manifest.
Unfortunately, although the mandatory (final) encoding is UTF-8, there is no such default in Ant's jar task, so reading the original manifest relies on the platform default: Windows-1252 in my case, whereas the original manifest (coming from another jar) is actually in UTF-8.
Solution: specify the encoding in the task attribute:
<jar destfile="final.jar" filesetmanifest="merge" manifestencoding="UTF-8">
    <zipfileset src="original.jar">
        [...]
    </zipfileset>
</jar>
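The double encoding described above can be reproduced outside Ant. This is only a minimal sketch, assuming the manifest bytes are UTF-8 and the reader falls back to windows-1252; the class and method names are made up for illustration and are not part of Ant.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ManifestDoubleEncoding {
    private static final Charset CP1252 = Charset.forName("windows-1252");

    // Decode UTF-8 bytes with the wrong charset, as a reader relying on a
    // windows-1252 platform default would.
    static String misread(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), CP1252);
    }

    // Undo the misread: turn the characters back into the bytes they came
    // from, then decode those bytes as UTF-8.
    static String repair(String corrupted) {
        return new String(corrupted.getBytes(CP1252), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "spécial", written with an escape so the source file encoding does not matter
        String original = "sp\u00e9cial";
        String corrupted = misread(original);
        System.out.println("original bytes as UTF-8:  "
                + original.getBytes(StandardCharsets.UTF_8).length);
        // Writing the misread text back out as UTF-8 is the second encoding step.
        System.out.println("corrupted bytes as UTF-8: "
                + corrupted.getBytes(StandardCharsets.UTF_8).length);
        System.out.println("repair restores original: "
                + repair(corrupted).equals(original));
    }
}
```

The repair step only works when every misread character maps back to a single windows-1252 byte, so it is a diagnostic aid rather than a substitute for setting manifestencoding.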

I've just found my old bug report about this for NetBeans.
As a workaround, I added the manifestEncoding="${source.encoding}" attribute to the copylibs tag in build-impl.xml.

Related

Ant replace task doesn't work

I have a PowerShell script. Its first line is:
$installation_folder = #aaa#
I also have an Ant build file with this task:
<target name="prepare-install-script" description="Preparation of installation script">
    <replace file="install.ps1" propertyfile="${template-properties}">
        <replacefilter token="#aaa#" value="installation.dir"/>
    </replace>
</target>
All files are initialized. The log says:
[replace] Replacing in c:\Users\install.ps1: #aaa# --> sdfsdf
But nothing changed in the file.
What could be the cause?
You have to change the encoding. This will work:
<target name="prepare-install-script" description="Preparation of installation script">
    <replace file="install.ps1" token="#aaa#" value="installation.dir" encoding="UTF-16"/>
</target>
The problem was that when you write a script in the default Windows PowerShell IDE, it is saved as something like a "binary" file with some system information. That's why Ant can't do the replacement.
I fixed it by copying the script into a simple text editor and saving it as .ps1.
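To check whether a script really is stored as UTF-16 before changing the encoding attribute, you can look at its first bytes. This is a rough sketch, assuming the editor wrote a byte order mark; BomSniffer and its methods are hypothetical helpers, not part of Ant, and the built-in sample just mimics the first line of the script above.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class BomSniffer {
    // Guess a charset from a leading byte order mark; null if none is present.
    // (A UTF-32LE mark also starts with FF FE; that case is ignored here.)
    static String charsetFromBom(byte[] head) {
        if (head.length >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) return "UTF-8";
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) return "UTF-16LE";
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) return "UTF-16BE";
        return null;
    }

    // Decode the bytes with the given charset and look for the token.
    static boolean containsToken(byte[] data, String charsetName, String token) {
        return new String(data, Charset.forName(charsetName)).contains(token);
    }

    public static void main(String[] args) throws IOException {
        byte[] data;
        if (args.length > 0) {
            data = Files.readAllBytes(Paths.get(args[0]));
        } else {
            // Built-in sample: FF FE mark followed by UTF-16LE text.
            byte[] text = "$installation_folder = #aaa#".getBytes(StandardCharsets.UTF_16LE);
            data = new byte[text.length + 2];
            data[0] = (byte) 0xFF;
            data[1] = (byte) 0xFE;
            System.arraycopy(text, 0, data, 2, text.length);
        }
        System.out.println("BOM suggests: " + charsetFromBom(data));
        System.out.println("token visible as ISO-8859-1: " + containsToken(data, "ISO-8859-1", "#aaa#"));
        System.out.println("token visible as UTF-16: " + containsToken(data, "UTF-16", "#aaa#"));
    }
}
```

Run it with the script path as the only argument (for example java BomSniffer install.ps1). If the token is only visible when decoded as UTF-16, the encoding attribute is the likely fix.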

antlr4 Jar has duplicate classes in different packages - don't know which are referred to by internal code

I am using antlr4.3 (complete) jar.
It has many duplicates in org.antlr.runtime and org.antlr.v4.runtime packages.
In code when I explicitly use 'v4.runtime' - at runtime, classpath picks up 'runtime'.
So I extracted the jar and recreated it without org.antlr.runtime.
But apparently some classes, like RecognitionException, are now not found.
How should I resolve this other than:
Exploding the latest Jar and specifying org.antlr.v4.runtime BEFORE org.antlr.runtime so that a duplicate class will be picked up from v4.runtime, and if there isn't one in it, it will look at org.antlr.runtime...??
To add to the above, here's the code snippet which gives a problem: the jars are in the classpath.
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CharStream;

public class AntlrMain {
    public static void main(String[] args) {
        System.out.println("Start Hello World");
        try {
            InputStream is = new FileInputStream(
                    "/home/ecworkspace/antlrCentral/DSL/mydev.dsl");
            ANTLRInputStream input = new ANTLRInputStream(is);
            org.antlr.v4.runtime.CharStream cs = (org.antlr.v4.runtime.CharStream) input;
            VCFGLexer lexer = new VCFGLexer(cs);
            // ...
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Initially, in the AntlrMain class, I wasn't using the explicit org.antlr.v4.runtime. prefix, but that failed at runtime with 'CharStream not found'.
Then I changed it to include the full path of the class.
Then I changed the ANTLR4 jar to 'exclude' org.antlr.runtime (it still has org.antlr.v4.runtime). That's when the 'RecognitionException not found' error occurred.
The grammar by the way, compiles OK, generating all my VCFG*.java and tokens classes, where VCFG is the grammar name.
UPDATE 1
In line with everyone's suggestions, I removed my answer to my own question and am adding it to the original question.
In antlr-4.2-complete.jar, I see:
/tmp/x/ $ jar -xf antlr-4.2-complete.jar
/tmp/x/ $ ls org/antlr
runtime stringtemplate v4
/tmp/x/ $ ls org/antlr/v4
analysis codegen parse semantics Tool$1UndefChecker.class Tool$OptionArgType.class
automata misc runtime tool Tool.class Tool$Option.class
/tmp/x/ $ ## The 2 runtimes above: org.antlr.runtime and org.antlr.v4.runtime
/tmp/x/ $ ## which to use where, along with same-name classes in
/tmp/x/ $ ## org.antlr and org.antlr.v4
So, in build.xml, I use the above jar as follows:
1. java -jar antlr-4.2-complete.jar grammar.g4 => compiles and gives me VCFG*.java and VCFG*.tokens.
2. javac -cp "antlr-4.2-complete.jar" VCFG*.java => succeeds. I have the VCFG*.class collection.
3. Then I compile my code AntlrMain.java (which uses ANTLRInputStream etc.), again with the above antlr jar and some 3rd-party jars (logging, commons) => successful.
4. Finally, the run of java -cp "antlr-4.2-complete.jar:log4j.jar" -jar myJar => FAILS on 'CharStream' not found.
UPDATE 2
Adding, based on your response.
I have only recently started posting questions on Stackoverflow. So pointers about whether to respond to my question to provide more info, or to comment to a reply etc. are welcome.
-cp <3rd-party> is -cp "log4j.jar:commonsLang.jar".
By -cp "above-jar" I meant -cp "antlr-4.2-complete.jar".
And if I have not mentioned it, that is an oversight: for every 'java' and 'javac' command, I have included antlr-4.2-complete.jar.
BUT I see you indicating antlr-runtime-4.2.jar. So there ARE separate antlr-runtime jar and antlr-complete jars.
In the 4 steps below, I am leaving out -cp for convenience, but I am including antlr-4.2-complete.jar at 'every' step.
I believe I should be using the antlr-runtime and antlr-complete jars at different steps:
1. (java MyGrammar.java)
2. (javac MyGrammar*.java)
3. javac MyOwnCode.java
4. Run my code (java MyCode) ...
Which of the two ANTLR jars (runtime and complete, and their versions) should I then use at each of the above 4 steps?
The jar file does not contain duplicate classes. The code generation portion of the ANTLR 4.3 Tool relies on the ANTLR 3.5.2 Runtime library, which is included in the "complete" jar file. While some of the classes have the same name as classes in ANTLR 4, they are not duplicates and cannot be used interchangeably.
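When it is unclear which jar a same-named class is actually loaded from, asking the class itself can help. This is a small sketch using the standard Class.getProtectionDomain() API; ClassOrigin is a hypothetical helper name, and the ANTLR class names in the usage line are only examples to pass on the command line.

```java
import java.security.CodeSource;

public class ClassOrigin {
    // Report where a class was loaded from. Classes loaded by the bootstrap
    // loader (java.lang.String, for example) have no code source.
    static String originOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Each argument is a fully qualified class name to look up on the classpath.
        for (String name : args) {
            System.out.println(name + " <- " + originOf(Class.forName(name)));
        }
    }
}
```

For example: java -cp "antlr-4.3-complete.jar:." ClassOrigin org.antlr.v4.runtime.CharStream org.antlr.runtime.RecognitionException. A ClassNotFoundException here means the class is not on the classpath actually in effect. Keep in mind that when a program is started with java -jar, the -cp option is ignored, so anything the jar needs has to be inside it or listed in its manifest Class-Path.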
@280Z28 / Sam:
I am mortified, but have to admit that the simplest answer is most often the correct one.
I spent time picking the JAR apart, making multiple JAR files out of it, including one for compile, one for run, and on and on.
The answer is succinctly explained in the Ant build.xml snippet below, where I produce the 'final' production JAR file, which is the only JAR then included while executing my Main program:
<jar destfile="${p_productionJar}">
    <fileset dir="${p_buildDir}" casesensitive="yes">
        <include name="**/*.class"/>
    </fileset>
    <zipfileset includes="**/*.class" src="${p_genCodeJar}"/>
    <!-- How did I miss including p_antlrJar earlier?? -->
    <zipfileset includes="**/*.class" src="${p_antlrJar}"/>
    <zipfileset includes="**/*.class" src="${p_jschJar}"/>
    <zipfileset includes="**/*.class" src="${p_log4jJar}"/>
    <zipfileset includes="**/*.class" src="${p_commonslangJar}"/>
    <manifest>
        <attribute name="Main-Class" value="AntlrMain"/>
        .....
The production JAR was missing ${p_antlrJar}, which is antlr-4.3-complete.jar!
You did mention this in your answer... but it was a pretty silly mistake to make, and I didn't think I had made it...
Thank you.

Command line compiler for XTend

Hi all, I've found XTend (http://xtend-lang.org) and it really sounds great! But I can't see any standalone command line compiler for this language. It seems to run only under Eclipse. I've done some research and found some people saying that it has a command line compiler, but I can't find a download link.
Does the compiler exist standalone, or do you need Eclipse to use it?
Regards
It is not documented, but there is indeed a command line compiler in the Xtend code base: the same one used by the Maven plug-in (which is documented on the Xtend homepage).
If the Maven plug-in does not work for you, you could download the standalone jar directly from the Maven repository at http://build.eclipse.org/common/xtend/maven/org/eclipse/xtend/org.eclipse.xtend.standalone/2.3.1/ (for version 2.3.1), and execute the org.eclipse.xtend.core.compiler.batch.Main class from it.
This class executes the xtend compiler, and usage information can be displayed (also readable from the source file).
You can use the xtend standalone compiler. For my case I copied the following .jar files to a folder named xtendc:
com.google.guava_21.0.0.v20170206-1425.jar
com.google.inject_3.0.0.v201312141243.jar
javax.inject_1.0.0.v20091030.jar
org.antlr.runtime_3.2.0.v201101311130.jar
org.apache.log4j_1.2.15.v201012070815.jar
org.eclipse.emf.common_2.15.0.v20180914-1817.jar
org.eclipse.emf.ecore.xmi_2.15.0.v20180706-1146.jar
org.eclipse.emf.ecore_2.16.0.v20181124-0637.jar
org.eclipse.equinox.common_3.10.200.v20181021-1645.jar
org.eclipse.jdt.core_3.16.0.v20181130-1748.jar
org.eclipse.xtend.core_2.16.0.v20181203-1347.jar
org.eclipse.xtend.lib.macro_2.16.0.v20181203-0507.jar
org.eclipse.xtext.common.types_2.16.0.v20181203-0528.jar
org.eclipse.xtext.util_2.16.0.v20181203-0514.jar
org.eclipse.xtext.xbase.lib_2.16.0.v20181203-0507.jar
org.eclipse.xtext.xbase_2.16.0.v20181203-0528.jar
org.eclipse.xtext_2.16.0.v20181203-0514.jar
org.objectweb.asm_7.0.0.v20181030-2244.jar
And then, in that folder I executed the CLI main class of the batch compiler:
java -cp "*" org.eclipse.xtend.core.compiler.batch.Main -d <path-to-xtend-gen-folder> -useCurrentClassLoader <path-to-src-folder>
The CLI usage of the main class is documented as follows:
Usage: Main <options> <source directories>
where possible options include:
  -d <directory>                       Specify where to place generated xtend files
  -tp <path>                           Temp directory to hold generated stubs and classes
  -cp <path>                           Specify where to find user class files
  -encoding <encoding>                 Specify character encoding used by source files
  -javaSourceVersion <version>         Create Java Source compatible to this version. Can be: 1.5, 1.6, 1.7, 1.8, 9, 10
  -noSuppressWarningsAnnotation        Don't put @SuppressWarnings() into generated Java Code
  -generateGeneratedAnnotation         Put @Generated into generated Java Code
  -includeDateInGeneratedAnnnotation   If -generateGeneratedAnnotation is used, add the current date/time.
  -generateAnnotationComment <string>  If -generateGeneratedAnnotation is used, add a comment.
  -useCurrentClassLoader               Use current classloader as parent classloader
  -writeTraceFiles                     Write Trace-Files
so you will need to pass your classpath there.
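If you would rather start the batch compiler from a build script or another Java program than type the command by hand, the same invocation can be assembled as an argument list. This is only a sketch: XtendcCommand and its parameters are hypothetical names, and it assumes the jars listed above sit in one folder and that java is on the PATH.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class XtendcCommand {
    // Build the same command line as shown above: every jar in jarDir goes on
    // the classpath through the launcher's "*" wildcard.
    static List<String> build(String jarDir, String outputDir, String sourceDir) {
        List<String> command = new ArrayList<>();
        command.add("java");
        command.add("-cp");
        command.add(jarDir + File.separator + "*");
        command.add("org.eclipse.xtend.core.compiler.batch.Main");
        command.add("-d");
        command.add(outputDir);
        command.add("-useCurrentClassLoader");
        command.add(sourceDir);
        return command;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        if (args.length != 3) {
            System.out.println("usage: XtendcCommand <jar-folder> <xtend-gen-folder> <src-folder>");
            return;
        }
        List<String> command = build(args[0], args[1], args[2]);
        System.out.println(String.join(" ", command));
        // Run the compiler as a child process and forward its output.
        int exitCode = new ProcessBuilder(command).inheritIO().start().waitFor();
        System.out.println("exit code: " + exitCode);
    }
}
```

The "*" is passed to the java launcher unexpanded, which is what the launcher expects for a classpath wildcard.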

Can't make Ant write proper version info with unicode (c) character

After upgrading Ant from 1.6 to 1.8.3, the version-info resources of Windows .dlls built with Ant became corrupted.
Previously this value was properly saved to the version-info resource:
product.copyright=\u00a9 Copyright 20xx-20xx yyyyyyyyyy \u2122 (so (c) and TM symbols were properly displayed).
After upgrading, Ant's default encoding changed to UTF-8, which is expected, but the Copyright string now looks like this:
Â© Copyright 20xx-20xx yyyyyy â„¢
This is not a console issue - I checked with hex editor and File Properties dialog - both display it incorrectly.
Looking at the file's hexdump, I see that the following (obviously incorrect) mapping occurs:
\u00a9 -> 0x00c2 0x00a9
\u2122 -> 0x00e2 0x201e 0x00a2
The problem here is that Ant encodes UTF-8 bytes (not Unicode string) into 16-bit characters and writes it to version-info.
Although this looks like a bug in ant, I would ask if anyone managed to find any workarounds for this or similar problems.
Here are some snippets from the script:
Project properties file:
...
product.copyright=(c) Copyright 2005-2012 Clarabridge
....
Files included into build.xml:
<versioninfo id="current-version" if="is-windows"
fileversion="${product.version}"
productversion="${product.version}"
compatibilityversion="1"
legalcopyright="${product.copyright}"
companyname="${product.company}"
filedescription="${ant.project.name}"
productname="${ant.project.name}"
/>
...
<cc objdir="${target.dir}/${target.platform}/obj"
outfile="${target.dir}/${target.platform}/${ant.project.name}"
subsystem="other"
failonerror="true"
incremental="false"
outtype="shared"
runtime="dynamic"
>
<versioninfo refid="current-version" />
<compiler refid="compiler-shared-${target.platform}" />
<compiler refid="rc-compiler" />
<linker extends="linker-${target.platform}">
<libset dir="${target.dir}/${target.platform}/lib" libs="${lib.list}" />
</linker>
<fileset dir="${src.dir}" casesensitive="false">
<include name="*.cpp"/>
</fileset>
</cc>
Your bug is that something is misinterpreting the UTF-8 characters as 8-bit ones!!!
BTW, Java doesn’t use 16-bit characters; that would be UCS-2. Java uses UTF-16, which is just as much a variable-width encoding as UTF-8 is. Distressing how many Java programmers screw this up!
UTF-8 has 8-bit code units where UTF-16 has 16-bit code units; neither one supports an “8-bit character” or a “16-bit character”. If you catch yourself writing code that thinks they do, you’ve just written buggy code.
Your output is the result of erroneously displaying UTF-8 as though it were in Latin1, which does use 8-bit characters. You, however, do not.
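The mapping in the question's hexdump can be reproduced by encoding the two symbols as UTF-8 and decoding the bytes with a single-byte charset. This is a small sketch with made-up class and method names. Note that 0x201e in the hexdump matches windows-1252, where byte 0x84 is a low double quotation mark; strict Latin1 would give 0x0084 there.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CopyrightMojibake {
    // Encode as UTF-8, then decode those bytes with a single-byte charset.
    static String misdecode(String text, String charsetName) {
        return new String(text.getBytes(StandardCharsets.UTF_8), Charset.forName(charsetName));
    }

    // Render each UTF-16 code unit of the string as 0xNNNN.
    static String codeUnits(String text) {
        StringBuilder out = new StringBuilder();
        for (char unit : text.toCharArray()) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(String.format("0x%04x", (int) unit));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] symbols = {"\u00a9", "\u2122"}; // copyright sign, trade mark sign
        for (String symbol : symbols) {
            System.out.println(codeUnits(symbol)
                    + " -> windows-1252: " + codeUnits(misdecode(symbol, "windows-1252"))
                    + " | ISO-8859-1: " + codeUnits(misdecode(symbol, "ISO-8859-1")));
        }
    }
}
```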

ANT Javac and special characters

I have an ANT task defined like so:
<javac source="1.5" target="1.5" srcdir="${src.dir}" destdir="${classes.dir}"
       deprecation="on" debug="on" classpathref="classpath"
       fork="true" memoryMaximumSize="512m" encoding="UTF-8">
    <include name="${app.directory}/**/*.java"/>
</javac>
This works fine, but when I have classes with special characters in their names it gives me the following error:
[iosession] Compiling 131 source files to /C24/PUB/io-stds/trunk/standards/GSIT/build/test/deployment/build/classes
[iosession] javac: file not found: /C24/PUB/io-stds/trunk/standards/GSIT/build/test/deployment/src/java/biz/c24/io/minos/AléaChiffréClass.java
[iosession] Usage: javac <options> <source files>
[iosession] use -help for a list of possible options
[iosession] Target compile finished
[iosession]
[iosession] Building unsuccessful 2 seconds
When I remove fork="true" it works, but then the memoryMaximumSize setting is ignored. I also tried the nested approach, but to no avail.
Any ideas?
It's perhaps not the answer you expect, but my advice would be to remove all non-ASCII letters from the names of methods and classes. I'm French-speaking too, and I've never seen any company, even in France and using French as its development language, accept accented letters in class and method names. It's just not good practice, simply because it would be very hard for a non-French developer, without accents on their keyboard, to use these classes and methods.
If you use a good IDE, it should allow you to refactor your code easily.
Apache did confirm that the encoding attribute only applies to the file contents and not file names. I reverted back to using fork only when needed and kept encoding="UTF-8".
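To find out which source files would be affected before deciding where fork is safe, the source tree can be scanned for non-ASCII file names. This is a minimal sketch with hypothetical names (NonAsciiSourceNames, find); it only inspects file names, not file contents.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class NonAsciiSourceNames {
    // True when every character of the name is in the 7-bit ASCII range.
    static boolean isAscii(String name) {
        return name.chars().allMatch(c -> c < 128);
    }

    // Collect all .java files under root whose file name contains non-ASCII characters.
    static List<Path> find(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(p -> p.toString().endsWith(".java"))
                    .filter(p -> !isAscii(p.getFileName().toString()))
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Scan the directory given as the first argument, or the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path path : find(root)) {
            System.out.println(path);
        }
    }
}
```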