<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>bitwalker.nl &#187; i18n</title>
	<atom:link href="http://www.bitwalker.nl/blog/category/i18n/feed" rel="self" type="application/rss+xml" />
	<link>http://www.bitwalker.nl</link>
	<description>agile software development</description>
	<lastBuildDate>Sun, 15 Aug 2010 20:49:53 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Java in a global world &#8211; some internationalization pitfalls</title>
		<link>http://www.bitwalker.nl/blog/java-in-a-global-word-some-internationalization-pitfalls</link>
		<comments>http://www.bitwalker.nl/blog/java-in-a-global-word-some-internationalization-pitfalls#comments</comments>
		<pubDate>Sat, 03 Mar 2007 20:26:37 +0000</pubDate>
		<dc:creator>Harald Walker</dc:creator>
				<category><![CDATA[i18n]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[server]]></category>
		<category><![CDATA[spring]]></category>

		<guid isPermaLink="false">http://www.bitwalker.nl/blog/java-in-a-global-word-some-internationalization-pitfalls</guid>
		<description><![CDATA[For the second time in the past few years I had to get a Java-based web application ready for international use, which means supporting various international character sets. The first time I didn&#8217;t have much experience yet and the application was difficult to debug, so it was a tiresome, lengthy process of trial and error. [...]]]></description>
			<content:encoded><![CDATA[<p>For the second time in the past few years I had to get a Java-based web application ready for international use, which means supporting various international character sets. The first time I didn&#8217;t have much experience yet and the application was difficult to debug, so it was a tiresome, lengthy process of trial and error. The second time it was easier, since we had a better architecture and better tools, but even then some unexpected problems showed up.</p>
<p>Three aspects made internationalization difficult:</p>
<ol>
<li>All levels of a system and application are affected (e.g. operating system, I/O operations, database, web tier, web services). If the settings don&#8217;t match at some point, the result can be wrong character conversion. Usually this means we have to look at database encoding, file system encoding and HTTP request/response encoding.</li>
<li>A lot of the software in use (e.g. 3rd party components and libraries) falls back to a default character encoding, which is usually ISO-8859-1 (Latin-1). Is this western ignorance?</li>
<li>You don&#8217;t have the time to understand the complicated and often unclear issues surrounding character set encoding. If the world could just decide to switch to one global encoding, our lives would be much easier.</li>
</ol>
<p>The goal is to use only one character encoding within the Java application (in our case UTF-8 seems to be fine for the job), so we only have to handle different character sets at the entry and exit points (web, files, &#8230;).</p>
<p>Let&#8217;s look at places which might need some attention:</p>
<p><strong>Application servers</strong></p>
<p>Start these with the correct JVM arguments.</p>
<p>Default file encoding<br />
The default file encoding is used by InputStreamReader and OutputStreamWriter whenever no charset is passed to them explicitly. If it is not set, the file encoding of the operating system is used, which can lead to unexpected results if your team works on different systems or if the deployment system differs from the development environment. Set it with -Dfile.encoding=UTF-8</p>
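<p>Independent of the JVM argument, code that you control can pass the charset explicitly, so it no longer depends on the default at all. The following is only a minimal sketch of that idea; the class name ExplicitCharsetIo and its two helper methods are made up for illustration and are not taken from our application.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
final class ExplicitCharsetIo {

    // Encodes text through an OutputStreamWriter with an explicit charset,
    // so the result does not depend on -Dfile.encoding or the operating system.
    static byte[] write(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Decodes bytes through an InputStreamReader with the same explicit charset.
    static String read(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            char[] buf = new char[256];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // German greeting with u-umlaut and sharp s, written with escapes
        // so that this source file itself stays plain ASCII.
        String original = "Gr\u00fc\u00dfe";
        byte[] encoded = write(original);
        System.out.println("round trip ok: " + original.equals(read(encoded)));
    }
}
```

<p>The same pattern applies to real files and sockets: wrap the stream in a reader or writer and name the charset there, instead of relying on whatever the machine happens to use.</p>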
<p>Next, check and, if necessary, configure the default character encoding of the application server.</p>
<p>Resin<br />
The default value is ISO-8859-1.<br />
see <a href="http://www.caucho.com/resin-3.0/config/env.xtp#character-encoding" title="Specify the default character encoding for the environment.">Specify the default character encoding for the environment.</a></p>
<p>Tomcat<br />
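Tomcat 5 does not use UTF-8 everywhere by default: following the Servlet specification, request parameters are decoded as ISO-8859-1 unless the request declares a charset. You can change this in $CATALINA_BASE/conf/web.xml or in your webapp&#8217;s own web.xml.</p>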
<p>WebSphere<br />
Default character encoding is UTF-8. For more information see: <a href="http://www-306.ibm.com/software/globalization/j2ee/encoding.jsp" title="Developing J2EE Global Applications : Character Encoding">Developing J2EE Global Applications : Character Encoding</a></p>
<p><strong>Database</strong></p>
<p>Switch the complete database, individual tables or individual columns to UTF-8. How this can be done differs per database system.</p>
<p>For some JDBC database drivers you have to specify the encoding explicitly; other drivers are smart enough to determine the database encoding automatically.</p>
<p>Oracle 10g<br />
To review the current settings, run SELECT * FROM V$NLS_PARAMETERS;<br />
NLS_CHARACTERSET and NLS_LENGTH_SEMANTICS are the interesting parameters for us. Oracle recommends using the Unicode character set AL32UTF8 for all new system deployments.<br />
If you don&#8217;t want to change the settings for the database, you can use the NCHAR, NVARCHAR2, and NCLOB datatypes instead. Their default encoding is AL16UTF16.</p>
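<p>NLS_LENGTH_SEMANTICS matters because a column length can be counted in bytes or in characters, and in a UTF-8 encoding those two numbers differ for non-ASCII text. The sketch below only illustrates that difference on the Java side; the class LengthSemantics and the 5-byte limit are assumptions chosen for the example, not values from a real schema.</p>

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
final class LengthSemantics {

    // Number of UTF-16 code units, which is what String.length() reports.
    static int charLength(String s) {
        return s.length();
    }

    // Number of bytes the same text occupies once encoded as UTF-8.
    static int utf8ByteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // True if the text fits into a column whose limit is counted in bytes.
    static boolean fitsByteLimit(String s, int maxBytes) {
        return maxBytes >= utf8ByteLength(s);
    }

    public static void main(String[] args) {
        // German greeting with u-umlaut and sharp s (escaped to keep the source ASCII).
        String text = "Gr\u00fc\u00dfe";
        System.out.println("chars: " + charLength(text));
        System.out.println("UTF-8 bytes: " + utf8ByteLength(text));
        System.out.println("fits a 5-byte column: " + fitsByteLimit(text, 5));
    }
}
```

<p>A value that passes a character-based length check in the application can therefore still be rejected by a column that is limited in bytes.</p>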
<p>Additional information:<br />
<a href="http://appsdbablog.com/blog/2006/10/changing_the_character_set_in.html" title="Visit page outside Confluence" rel="nofollow">Changing The Character Set In Oracle Applications</a><br />
<a href="http://www.oracle-base.com/articles/9i/CharacterSemanticsAndGlobalization9i.php" title="Visit page outside Confluence" rel="nofollow">Character Semantics and Globalization</a></p>
<p><strong>Spring Framework</strong></p>
<p>Since Spring handles your requests, it needs some extra configuration:<br />
<a href="http://mrj.woo.dk/squareroot/2006/02/16/character-encoding-in-submitted-forms/" title="Visit page outside Confluence" rel="nofollow">Add filter to web.xml and spring configuration<br />
</a>In this case the Spring framework does most of the work for you. Without such a framework you might have to convert between different character encodings yourself.</p>
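<p>To make the failure mode concrete, here is a small sketch of what happens when UTF-8 form data is decoded with the ISO-8859-1 default, and of the manual conversion you would otherwise have to do yourself. It simulates the request bytes in memory and does not use the Servlet or Spring APIs; the class MojibakeDemo and its methods are hypothetical.</p>

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only.
final class MojibakeDemo {

    // Simulates a container that receives UTF-8 bytes from the browser
    // but decodes them with the ISO-8859-1 default.
    static String decodeAsLatin1(String submitted) {
        byte[] wire = submitted.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    // Manual repair: turn the garbled text back into the original bytes
    // and decode those bytes as UTF-8.
    static String repair(String garbled) {
        byte[] wire = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // German greeting with u-umlaut and sharp s (escaped to keep the source ASCII).
        String original = "Gr\u00fc\u00dfe";
        String garbled = decodeAsLatin1(original);
        System.out.println("garbled length: " + garbled.length());
        System.out.println("repaired equals original: " + original.equals(repair(garbled)));
    }
}
```

<p>The repair only works because ISO-8859-1 maps every byte to exactly one character. It is a workaround, not a fix: setting the request encoding before the parameters are read avoids the problem in the first place.</p>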
<p><strong>JavaServer Pages (JSP)</strong></p>
<p>Use: &lt;%@ page pageEncoding="UTF-8" contentType=<span class="code-quote">"text/html; charset=UTF-8" %&gt;</span></p>
<p>To set the default page encoding for all JSP files, add the following to web.xml:</p>
<p>&lt;jsp-property-group&gt;<br />
(&#8230;)<br />
&lt;page-encoding&gt;utf-8&lt;/page-encoding&gt;<br />
&lt;/jsp-property-group&gt;</p>
<p>Additional information:<br />
<a href="http://java.sun.com/j2ee/1.4/docs/tutorial/doc/JSPIntro13.html" title="Setting Properties for Groups of JSP Pages">Setting Properties for Groups of JSP Pages</a></p>
<p>It is a good idea to add<br />
&lt;META http-equiv="Content-Type" content="text/html; charset=UTF-8"&gt;<br />
to the html as well.</p>
<p><strong>Templating Engines and layout frameworks</strong></p>
<p>Sitemesh<br />
By default Sitemesh uses ISO-8859-1. All JSP pages involved should declare UTF-8 as their encoding. If you have various decorators and includes, these must all use the same encoding.</p>
<p>Velocity<br />
Velocity also uses ISO-8859-1 by default. This was the big pitfall on my first internationalization project. I wasted a lot of time before I found out.<br />
Velocity allows you to specify the character encoding of your template resources on a template-by-template basis. The output encoding is an application-specific setting and can be set in the runtime configuration with the following configuration key: output.encoding (it might be a good idea to set input.encoding as well)<br />
More information: <a href="http://velocity.apache.org/engine/devel/developer-guide.html" title="Velocity Developer Guide">Velocity Developer Guide</a></p>
<p>Freemarker<br />
You can specify the charset of the template in the <a href="http://freemarker.sourceforge.net/docs/ref_directive_ftl.html" title="template itself">template itself</a> and the charset of the output with the<br />
setOutputEncoding(outputCharset) method of a Freemarker processing environment.</p>
<p><strong>Resource bundles</strong></p>
<p>Edit message bundles for non-Western languages in UTF-8 mode and then convert the file to an ASCII format for Java. Call native2ascii and specify that the original file is UTF-8 encoded:<br />
native2ascii -encoding UTF-8 messages_cn.txt messages_cn.properties</p>
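<p>For readers who wonder what the converted file contains: every character outside ASCII is replaced by a backslash-u escape with four hex digits. The following is a rough sketch of that idea and not the actual implementation of the tool; the class AsciiEscaper is hypothetical, and the real output may differ in details.</p>

```java
// Hypothetical helper class for illustration only.
final class AsciiEscaper {

    // Replaces every char outside 7-bit ASCII with a backslash-u escape
    // followed by four hex digits; ASCII chars are copied unchanged.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c > 127) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A message bundle line with a Chinese greeting as its value
        // (written with escapes so this source file stays ASCII).
        String line = "greeting=\u4f60\u597d";
        System.out.println(escape(line));
    }
}
```

<p>The escaped file contains only ASCII, so it reads the same no matter which default encoding the JVM uses when it loads the bundle.</p>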
<p><strong>Additional information<br />
</strong><br />
<a href="http://www.javaworld.com/javaworld/jw-04-2004/jw-0419-multibytes.html?page=2"> Java World: Multibyte-character processing in J2EE</a><br />
<a href="http://www.javaworld.com/javaworld/jw-01-1998/jw-01-indepth_p.html">An in-depth look at Java&#8217;s character type</a><br />
<a href="http://www.cs.tut.fi/~jkorpela/chars.html"> A tutorial on character code issues</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.bitwalker.nl/blog/java-in-a-global-word-some-internationalization-pitfalls/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
