Monday, January 4, 2010

Puzzle 15: Hello Whirled





















The following program is a minor variation on an old chestnut. What does it print?





/**
 * Generated by the IBM IDL-to-Java compiler, version 1.0
 * from F:\TestRoot\apps\a1\units\include\PolicyHome.idl
 * Wednesday, June 17, 1998 6:44:40 o'clock AM GMT+00:00
 */
public class Test {
    public static void main(String[] args) {
        System.out.print("Hell");
        System.out.println("o world");
    }
}






Solution 15: Hello Whirled



This puzzle looks fairly straightforward. The program contains two statements. The first prints Hell and the second prints o world on the same line, effectively concatenating the two strings. Therefore, you might expect the program to print Hello world. You would be sadly mistaken. In fact, it doesn't compile.



The problem is in the third line of the comment, which contains the characters \units. These characters begin with a backslash (\) followed by the letter u, which denotes the start of a Unicode escape. Unfortunately, these characters are not followed by four hexadecimal digits, so the Unicode escape is ill-formed, and the compiler is required to reject the program. Unicode escapes must be well formed, even if they appear in comments.
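The rule above can be sketched as a small checker that scans raw source text for a backslash followed by u that is not completed by four hexadecimal digits. This is a minimal illustration, not the compiler's actual implementation: the class name `UnicodeEscapeCheck` and the method `findIllFormedEscapes` are hypothetical. It assumes the language specification's eligibility rule, under which a backslash starts a Unicode escape only if it is preceded by an even number of contiguous backslashes, and it allows the letter u to be repeated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class UnicodeEscapeCheck {

    // True for the ASCII hexadecimal digits 0-9, a-f, A-F.
    static boolean isHexDigit(char c) {
        return (c >= '0' && c <= '9')
            || (c >= 'a' && c <= 'f')
            || (c >= 'A' && c <= 'F');
    }

    // Returns the offset of every backslash that begins an
    // ill-formed Unicode escape in the given raw source text.
    static List<Integer> findIllFormedEscapes(String source) {
        List<Integer> bad = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            if (source.charAt(i) != '\\') {
                i++;
                continue;
            }
            // Measure the run of contiguous backslashes.
            int start = i;
            while (i < source.length() && source.charAt(i) == '\\') {
                i++;
            }
            int run = i - start;
            // In an odd-length run, the last backslash is preceded by an
            // even number of backslashes, so it may begin an escape.
            if (run % 2 == 1 && i < source.length() && source.charAt(i) == 'u') {
                int j = i;
                while (j < source.length() && source.charAt(j) == 'u') {
                    j++; // the letter u may be repeated
                }
                boolean ok = j + 4 <= source.length();
                for (int k = j; ok && k < j + 4; k++) {
                    ok = isHexDigit(source.charAt(k));
                }
                if (!ok) {
                    bad.add(i - 1); // offset of the offending backslash
                }
                i = j;
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // The path fragment from the puzzle's comment is flagged.
        System.out.println(findIllFormedEscapes("a1\\units"));
        // A well-formed escape is not flagged.
        System.out.println(findIllFormedEscapes("Ah\\u00E9"));
    }
}
```

Note that the string literals in this sketch double each backslash, so the sketch itself contains no ill-formed escape and compiles.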



It is legal to place a well-formed Unicode escape in a comment, but there is rarely a reason to do so. Programmers sometimes use Unicode escapes in Javadoc comments to generate special characters in the documentation:





// Questionable use of Unicode escape in Javadoc comment

/**
 * This method calls itself recursively, causing a
 * <tt>StackOverflowError</tt> to be thrown.
 * The algorithm is due to Peter von der Ah\u00E9.
 */




This technique represents an unnecessary use of Unicode escapes. Use HTML entity escapes instead of Unicode escapes in Javadoc comments:





/**
 * This method calls itself recursively, causing a
 * <tt>StackOverflowError</tt> to be thrown.
 * The algorithm is due to Peter von der Ah&eacute;.
 */




Either of the preceding comments should cause the name to appear in the documentation as "Peter von der Ahé," but the latter comment is also understandable in the source file.



In case you were wondering, the comment in this puzzle was derived from an actual bug report. The program was machine generated, which made it difficult to track the problem down to its source, an IDL-to-Java compiler. To avoid placing other programmers in this position, tools must not put Windows filenames into comments in generated Java source files without first processing them to eliminate backslashes.
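One way a code generator might follow this advice is to rewrite the path before emitting it into a comment. The sketch below is an assumption about how such a tool could be structured, not a description of the IDL-to-Java compiler mentioned above: the class `GeneratedHeader` and its methods `sanitizePath` and `headerComment` are hypothetical names. It simply replaces every backslash with a forward slash, which keeps the path readable while removing any chance of an accidental Unicode escape.

```java
// Hypothetical generator helper, for illustration only.
class GeneratedHeader {

    // Replaces every backslash with a forward slash so that the path
    // can never contain a backslash followed by the letter u.
    static String sanitizePath(String path) {
        return path.replace('\\', '/');
    }

    // Builds a header comment that names the (sanitized) source file.
    static String headerComment(String sourcePath) {
        return "/**\n"
             + " * Generated from " + sanitizePath(sourcePath) + "\n"
             + " */";
    }

    public static void main(String[] args) {
        // Backslashes are doubled in the literal so that this sketch compiles.
        String path = "F:\\TestRoot\\apps\\a1\\units\\include\\PolicyHome.idl";
        System.out.println(headerComment(path));
    }
}
```

A real tool would probably also need to guard against other troublesome sequences in file names, such as the characters that end a comment; this sketch deals only with backslashes.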



In summary, ensure that the characters \u do not occur outside the context of a valid Unicode escape, even in comments. Be particularly wary of this problem in machine-generated code.
























