[sakai2-tcc] Are we supporting Java 1.5 for 2.8?

Anthony Whyte arwhyte at umich.edu
Fri Feb 18 05:56:11 PST 2011


If we want to drop 1.5 then we need to make sure that the code compiles after adjusting Maven's compiler plugin settings from 1.5 to 1.6.  Last time I checked (early December) the kernel failed to build because kernel-util's validator.java triggered "unmappable character for encoding UTF-8" errors (53 cases).

I have a fix for that which I shared with Beth et al (see below), but I've not implemented it in trunk yet.  I can commit the fix as soon as there is consensus here and then begin the process of making sure the rest of the code (including indies) compiles.

            <plugin>
                <inherited>true</inherited>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <configuration>
                    <source>1.5</source>  [change to 1.6]
                    <target>1.5</target>  [change to 1.6]
                </configuration>
            </plugin>



Begin forwarded message:

> From: Anthony Whyte <arwhyte at umich.edu>
> Date: December 16, 2010 4:56:44 PM EST
> To: Kirschner Beth <bkirschn at umich.edu>
> Cc: Horwitz David <david.horwitz at uct.ac.za>, Alan Berg <A.M.Berg at uva.nl>
> Subject: Kernel-utils: validator.java and use of escape sequences for Latin-1 characters
> 
> Beth--I wrote some test code to convince myself (and you) that the original proposal to replace the characters that Maven considers unmappable with unicode escape sequences in kernel-utils validator.java results in no change in behavior.  It's attached as converter.tgz.
> 
> The results:
> 
> protected static final String MAP_TO_A = "\u00E2\u00E4\u00E0\u00E5\u00C4\u00C5\u00E1\u00E1";
> 
> is equivalent to
> 
> protected static final String MAP_TO_A = "âäàåÄÅáá";
> 
> Java as designed interprets the escape sequences correctly; it does not treat the escaped string above as an undifferentiated run of backslashes, letters and digits with a length of 48.  Rather, it reports a length of 8 characters and outputs the string as âäàåÄÅáá.
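
A minimal sketch of that equivalence check, not the attached converter.java. The class and method names are made up for illustration. The comparison string is built from numeric code units so the result does not depend on how the source file itself is encoded.

```java
class UnicodeEscapeDemo {
    // Same constant as quoted in the message, written with Unicode escapes.
    static final String MAP_TO_A = "\u00E2\u00E4\u00E0\u00E5\u00C4\u00C5\u00E1\u00E1";

    // The same eight characters assembled from their numeric code units.
    static String fromCodeUnits() {
        char[] units = {0x00E2, 0x00E4, 0x00E0, 0x00E5, 0x00C4, 0x00C5, 0x00E1, 0x00E1};
        return new String(units);
    }

    public static void main(String[] args) {
        // The compiler translates each escape before the string literal is formed,
        // so the constant holds 8 characters, not 48.
        System.out.println(MAP_TO_A.length());                // prints 8
        System.out.println(MAP_TO_A.equals(fromCodeUnits())); // prints true
    }
}
```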
> 
> converter.java (the test code) includes the set of string constants (e.g., MAP_TO_A, etc.) and the static method escapeResourceName(String id) that are included in validator.java.  I've also included a couple of convenience methods that are not needed in validator.java, plus a couple of additional constants that I use for testing.
> 
> If you pass the character "ß" to escapeResourceName(String id), having defined the constant MAP_TO_B = "\u00DF\u00DF" instead of "ßß" (as is current practice), the character is properly escaped as "b" -- there is no change in behavior.  The same is true in my second test: "â" is also properly escaped using MAP_TO_A with the escape sequences above.
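
The mapping behavior described here can be sketched as follows. This is a hypothetical stand-in, not the Sakai implementation of escapeResourceName(): the message does not show that method's body, so the indexOf() lookup and the name mapLatin1 are assumptions made only for illustration.

```java
class ResourceNameSketch {
    // Constants as quoted in the message, written with Unicode escapes.
    static final String MAP_TO_A = "\u00E2\u00E4\u00E0\u00E5\u00C4\u00C5\u00E1\u00E1";
    static final String MAP_TO_B = "\u00DF\u00DF";

    // Hypothetical mapping step: replaces any character found in a MAP_TO_*
    // constant with its ASCII stand-in and leaves everything else untouched.
    static String mapLatin1(String id) {
        StringBuilder out = new StringBuilder(id.length());
        for (int i = 0; i < id.length(); i++) {
            char c = id.charAt(i);
            if (MAP_TO_A.indexOf(c) >= 0) {
                out.append('a');
            } else if (MAP_TO_B.indexOf(c) >= 0) {
                out.append('b');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(mapLatin1("\u00DF")); // prints b
        System.out.println(mapLatin1("\u00E2")); // prints a
    }
}
```

Because the constants hold the same characters whether they are typed literally or as escapes, a lookup of this kind cannot tell the two spellings apart.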
> 
> I also tried the new Java SE 6 Normalizer.normalize() method, which allows you to decompose characters, separating the base character from its associated diacritical mark.  It works well for accented characters but cannot handle characters such as "ß", so it is no easy substitute for the conditional logic of escapeResourceName().
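
A short sketch of the decomposition approach, assuming NFD followed by a regex that drops the combining marks (the message does not say how converter.java removes them, and the method name is hypothetical). It shows why "ß" is left alone: it has no canonical decomposition, so there is no mark to strip.

```java
import java.text.Normalizer;

class NormalizerSketch {
    // Decomposes each character (NFD), then removes the combining marks
    // that the decomposition separated from the base letters.
    static String stripDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    public static void main(String[] args) {
        // a-circumflex splits into 'a' plus a combining circumflex.
        System.out.println(stripDiacritics("\u00E2"));                  // prints a
        // Sharp s does not decompose, so it comes back unchanged.
        System.out.println(stripDiacritics("\u00DF").equals("\u00DF")); // prints true
    }
}
```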
> 
> I also played around with the commons-lang escapeJava() and unescapeJava() methods: the return values show that Java interprets escape sequences correctly.
> 
> converter.java output:
> 
> 1. Equality tests
> LATIN_1_A=âäàåÄÅáá
> LATIN_1_A length=8
> MAP_TO_A=âäàåÄÅáá
> MAP_TO_A unicode string length=8
> LATIN_1_B=ßß
> LATIN_1_B length=2
> MAP_TO_B=ßß
> MAP_TO_B unicode string length=2
> âäàåÄÅáá=âäàåÄÅáá
> ßß=ßß
> 
> 2. Escape Latin-1 characters using escapeResourceName()
> escaped LATIN_1_A characters: a
> escaped LATIN_1_B characters: b
> 
> 3. commons-lang StringEscapeUtils.escapeJava()
> Latin-1 B escapeJava=\u00DF
> Latin-1 B unescapeJava=ß
> 
> 4. Normalizer.normalize()
> Latin-1 A normalizer after decomposition=a
> Latin-1 B normalizer after decomposition=ß [FAILURE]
> 
> The easiest way to run the code and view all characters it outputs is to untar the archive and take a look at it in Eclipse.  Right click on converter.java then Run as...-> Java application.  You can run the jar in the terminal but accented characters will be displayed as "?".
> 
> java -jar converter.jar
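
If the "?" characters come from the JVM's default output encoding not matching the terminal (an assumption; the message does not identify the cause), wrapping System.out in a PrintStream with an explicit encoding is one possible workaround. The class and helper names below are hypothetical, and the terminal itself must also be set to UTF-8 for this to help.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8ConsoleSketch {
    // Wraps a stream so that text is always encoded as UTF-8,
    // regardless of the platform default encoding.
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8(System.out);
        out.println("\u00E2\u00E4\u00DF");
    }
}
```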
> 
> 
> 
> 
> If these tests satisfy you I'll Jira the change, update validator.java (replacing the string constants as currently defined) and then finish up reconfiguring Maven to compile the Kernel in Java 6.  After that, I will do a second commit adding missing characters and removing duplicate characters.
> 
> Cheers,
> 
> Anth
> 
> 
> 3.1 Unicode
> http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#95413
> 
> 3.3 Unicode escapes
> http://java.sun.com/docs/books/jls/third_edition/html/lexical.html#100850
> 
> Normalizer
> http://download.oracle.com/javase/6/docs/api/java/text/Normalizer.Form.html#NFD
> 
> 
> 
> Begin forwarded message:
> 
>> From: Beth Kirschner <bkirschn at umich.edu>
>> Date: December 3, 2010 8:40:25 AM EST
>> To: Anthony Whyte <arwhyte at umich.edu>
>> Cc: Stephen Marquard <smarquard at gmail.com>, Theriault Seth <slt at columbia.edu>, Horwitz David <david.horwitz at uct.ac.za>, Thomas Zach <zach at aeroplanesoftware.com>, Swinsburg Steve <steve.swinsburg at gmail.com>, Eng Jim <jimeng at umich.edu>, Severance Charles <csev at umich.edu>
>> Subject: Re: Kernel: compiler plugin and Java 1.6: local tests
>> 
>> Hi Anthony,
>> 
>>   Looking at Validator.java, it looks like it's already using unicode characters and what you're proposing is changing these to ascii-encoded unicode, which is different, and I expect the code wouldn't work the same (i.e. validation would fail). Another option would be to move those unicode strings into a properties file (this is where your ascii-encoded unicode would be appropriate, since the ResourceBundle.getString() method will convert these into unicode characters when loading the property).
>> 
>> - Beth
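
A sketch of the properties-file alternative, using an in-memory string in place of a real .properties file (the key name and helper are hypothetical). It shows the bundle loader turning the ASCII escape text into the actual characters at load time.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

class BundleEscapeSketch {
    // Loads properties-format text and returns the value stored under key.
    static String load(String propertiesText, String key) {
        try {
            ResourceBundle bundle =
                new PropertyResourceBundle(new StringReader(propertiesText));
            return bundle.getString(key);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Doubled backslashes keep each escape as six plain ASCII characters,
        // which is how it would appear inside a properties file.
        String fileText = "MAP_TO_B=\\u00DF\\u00DF\n";
        String value = load(fileText, "MAP_TO_B");
        System.out.println(value.length());                 // prints 2
        System.out.println(value.equals("\u00DF\u00DF"));   // prints true
    }
}
```

The compiler does the same translation for escapes written directly in a .java file, which is why the forwarded test above reports no change in behavior.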
>> 
>> On Dec 2, 2010, at 4:51 PM, Anthony Whyte wrote:
>> 
>>> Reconfiguring the maven compiler plugin to 1.6 and then attempting a kernel trunk build results in a build failure due to the presence of unmappable characters in kernel-utils validator.java.  Locally, I replaced the characters with Unicode escape sequences, after which the kernel compiled successfully with maven-compiler-plugin reconfigured for 1.6.
>>> 
>>> Is unicode substitution an acceptable solution?
>>> 
>>> I also reconfigured the javadoc and pmd plugins for 1.6.  Javadoc jars generated successfully (mvn javadoc:jar).  PMD report generated successfully (mvn pmd:pmd).
>>> 
>>> 
>>> UNICODE
>>> 
>>> In replacing the characters with unicode I did so in the order specified in the current set of strings.  If a character was repeated I repeated it.
>>> 
>>> Questions (if the solution is acceptable)
>>> 
>>> 1.  Does character order in the strings matter?
>>> 2.  Are the repeated characters deliberate or duplicates?
>>> 	example: protected static final String MAP_TO_B =  "ßß ";
>>> 	example: protected static final String MAP_TO_E = "...ÆÆ";
>>> 3.  A fair number of Latin-1 characters are not included in the strings (particularly capitalizations).  Perhaps they should be added.
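
Questions 1 and 2 can be explored with a small check like the one below. It assumes the constants are only ever consulted as character sets through indexOf(), which this thread does not confirm; under that assumption neither order nor repeats would change the outcome. The class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class DuplicateCheckSketch {
    // Constant as quoted in this thread, written with Unicode escapes.
    static final String MAP_TO_A = "\u00E2\u00E4\u00E0\u00E5\u00C4\u00C5\u00E1\u00E1";

    // Returns the characters of s in first-seen order with repeats dropped.
    static String withoutRepeats(String s) {
        Set<Character> seen = new LinkedHashSet<Character>();
        for (char c : s.toCharArray()) {
            seen.add(c);
        }
        StringBuilder out = new StringBuilder();
        for (char c : seen) {
            out.append(c);
        }
        return out.toString();
    }

    // True when both strings accept exactly the same characters under indexOf().
    static boolean sameMembership(String a, String b) {
        for (char c : a.toCharArray()) {
            if (b.indexOf(c) < 0) return false;
        }
        for (char c : b.toCharArray()) {
            if (a.indexOf(c) < 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        String trimmed = withoutRepeats(MAP_TO_A);
        System.out.println(MAP_TO_A.length() + " -> " + trimmed.length()); // prints 8 -> 7
        System.out.println(sameMembership(MAP_TO_A, trimmed));             // prints true
    }
}
```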
>>> 
>>> 
>>> NEW PROPERTIES
>>> 
>>> I also created two properties <compile.source> and <compile.target> with values of "1.6" and use the variables with the compiler, javadoc and pmd plugins.  This makes it possible to keep all three plugins in sync with respect to our compiler of choice.
>>> 
>>> 
>>> MY ENVIRONMENT
>>> 
>>> Mac OS X 10.6.5
>>> java version "1.6.0_22"
>>> Java(TM) SE Runtime Environment (build 1.6.0_22-b04-307-10M3261)
>>> Java HotSpot(TM) 64-Bit Server VM (build 17.1-b03-307, mixed mode)
>>> 
>>> 
>>> TESTS
>>> 
>>> Quick Test 1: configure compiler plugin for 1.6 (see local modification section below).  Perform build (mvn clean install)
>>> Result: Build Failure: validator.java unmappable character for encoding UTF-8 (53 cases)
>>> 
>>> Quick Test 2: replace the unmappable characters in validator.java with Unicode escape sequences (see below)
>>> Result: Build successful
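
One plausible explanation for the two results above, assuming the file was saved in a single-byte encoding such as ISO-8859-1 (this thread does not state the file's actual encoding): the raw bytes are not valid UTF-8, while escape sequences are plain ASCII and decode cleanly. The helper name is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

class UnmappableSketch {
    // Strict UTF-8 decode: reports bad input instead of silently replacing it.
    static boolean decodesAsUtf8(byte[] bytes) {
        try {
            Charset.forName("UTF-8").newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Two sharp-s characters as a Latin-1 editor would store them: bytes DF DF.
        byte[] latin1 = "\u00DF\u00DF".getBytes(Charset.forName("ISO-8859-1"));
        // The same constant spelled as escape text: twelve ASCII bytes.
        byte[] escaped = "\\u00DF\\u00DF".getBytes(Charset.forName("US-ASCII"));

        System.out.println(decodesAsUtf8(latin1));  // prints false
        System.out.println(decodesAsUtf8(escaped)); // prints true
    }
}
```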
>>> 
>>> Quick Test 3: generate javadocs (mvn javadoc:jar)
>>> Result: Build successful
>>> 
>>> Quick Test 4: generate pmd (mvn pmd:pmd)
>>> Result: Build successful
>>> 
>>> 
>>> CURRENT STATE (run mvn help:effective-pom)
>>> . . .
>>> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
>>> 
>>> maven-compiler-plugin
>>> <version>2.0.2</version>  [NOTE currently we don't specify the version -- not ideal as Maven is choosing an old version for us]
>>> <source>1.5</source>
>>> <target>1.5</target>
>>> 
>>> maven-javadoc-plugin (pluginManagement)
>>> <version>2.5</version>
>>> <source>1.5</source>
>>> <target>1.5</target>
>>> 
>>> maven-pmd-plugin (reporting)
>>> <version>2.5</version>  [NOTE currently we don't specify the version]
>>> <sourceEncoding>utf-8</sourceEncoding>
>>> <targetJdk>1.5</targetJdk>
>>> 
>>> maven-javadoc-plugin (reporting)
>>> <version>2.7</version>
>>>        <link>http://download.oracle.com/javase/1.5.0/docs/api/</link>
>>>        <link>http://java.sun.com/products/servlet/2.3/javadoc/</link>
>>> 
>>> 
>>> LOCAL MODIFICATIONS (kernel base pom)
>>> 
>>> New <properties>
>>>        <compile.source>1.6</compile.source>
>>>        <compile.target>1.6</compile.target>
>>>        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>  (existing property)
>>> 
>>> maven-compiler-plugin
>>> 	<version>2.3.2</version> (latest)
>>> 	<source>${compile.source}</source>
>>>        <target>${compile.target}</target>
>>>        <encoding>${project.build.sourceEncoding}</encoding>
>>> 
>>> maven-javadoc-plugin (pluginManagement)
>>>    <version>2.7</version> (latest)
>>>    <source>${compile.source}</source>
>>>    <target>${compile.target}</target>
>>> 
>>> maven-pmd-plugin (reporting)
>>> 	<version>2.5</version> (latest 2.5; 2.3 adds 1.6 support)
>>> 	<sourceEncoding>${project.build.sourceEncoding}</sourceEncoding>  (removed hard-coded "utf-8")
>>> 	<targetJdk>${compile.source}</targetJdk>
>>> 
>>> maven-javadoc-plugin (reporting)
>>> <link>http://download.oracle.com/javase/6/docs/api/</link>
>>> <link>http://download.oracle.com/docs/cd/E17802_01/products/products/servlet/2.3/javadoc/</link>
>>> 
>>> 
>>> org/sakaiproject/util/Validator.java
>>> 
>>> Changed:
>>> 
>>> 	protected static final String MAP_TO_A = "\u00E2\u00E4\u00E0\u00E5\u00C4\u00C5\u00E1\u00E1";
>>> 	
>>> 	protected static final String MAP_TO_B = "\u00DF\u00DF";
>>> 
>>> 	protected static final String MAP_TO_C = "\u00C7\u00E7\u00A2\u00A2";
>>> 	
>>> 	protected static final String MAP_TO_E = "\u00E9\u00EA\u00EB\u00E8\u00C9\u00E6\u00C6\u00C6";
>>> 	
>>> 	protected static final String MAP_TO_I = "\u00EF\u00EE\u00EC\u00ED\u00ED";
>>> 
>>> 	protected static final String MAP_TO_L = "\u00A3\u00A3";
>>> 
>>> 	protected static final String MAP_TO_N = "\u00F1\u00D1\u00D1";
>>> 
>>> 	protected static final String MAP_TO_O = "\u00F4\u00F6\u00F2\u00D6\u00F3\u00F3";
>>> 		
>>> 	protected static final String MAP_TO_U = "\u00FC\u00FB\u00F9\u00DC\u00FA\u00FA";
>>> 
>>> 	protected static final String MAP_TO_Y = "\u00FF\u00A5\u003F\u003F";
>>> 
>>> 	protected static final String MAP_TO_X = "\u003F\u003F\u003F\u00A7\u00A9\u00AA\u00AE\u00B1\u003F\u00B4\u00B5\u00B6\u00BF\u003F";
>>> 




On Feb 18, 2011, at 7:26 AM, Matthew Buckett wrote:

> On 18 February 2011 00:32, Steve Swinsburg <steve.swinsburg at gmail.com> wrote:
>> 
>> Are we supporting Java 1.5 for Sakai 2.8? A resounding 'no' would be ideal.
> 
> I think this sounds good.
> 
> I can't see any real reason to support 1.5, it's not like Sakai shares
> VMs with other applications, and backward compatibility is very good
> for locally written code.
> 
> -- 
>   Matthew Buckett
>   VLE Developer, LTG, Oxford University Computing Services
> 
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: converter.tgz
Type: application/octet-stream
Size: 7043 bytes
Desc: not available
Url : http://collab.sakaiproject.org/pipermail/sakai2-tcc/attachments/20110218/e4175963/attachment-0001.obj 