VAST 8 / Seaside Tips: Handling Accented Characters


Seaside is made for UTF-8. Or, let’s say, it ignores the issue of character encodings by default and assumes everybody spits out UTF-8 these days.

VA Smalltalk is not yet UTF-aware, but Instantiations is working on it with high priority.

What am I talking about? Well, if you, like me, live in a country with accented characters, such as German umlauts, you may have seen the effect of rendering something like this in VAST 8 under Seaside:

html text: 'Die größten Äpfel wachsen an den Bäumen in Böhmen'.

Your browser will show a lot of cryptic characters instead of umlauts, because Seaside by default sends out a meta tag that looks like this:

<meta content="text/html;charset=utf-8" http-equiv="Content-Type"/>

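BUT: VAST / SST neither manages nor encodes the string constants in your Smalltalk source as UTF-8. The bytes that reach the browser are in whatever single-byte code page your image uses, while the page claims to be UTF-8.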
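
To see why the browser shows garbage, it helps to look at the bytes. The following is only a sketch, written in Python because the byte-level behaviour does not depend on the language; it is not VAST code. It assumes the image holds the string in a single-byte code page such as iso-8859-1, and the function name as_browser_sees_it is made up for illustration.

```python
def as_browser_sees_it(text, stored_as="iso-8859-1", declared_as="utf-8"):
    """Encode text the way the server holds it, then decode it the way
    the browser was told to. Bytes that are not valid in the declared
    charset come out as U+FFFD, the replacement character."""
    raw = text.encode(stored_as)          # bytes sent over the wire
    return raw.decode(declared_as, errors="replace")


# Page declares utf-8, but the bytes are iso-8859-1: umlauts are lost.
shown = as_browser_sees_it("Die größten Äpfel")
print(shown)

# Page declares the charset the bytes are really in: umlauts survive.
fixed = as_browser_sees_it("Die größten Äpfel", declared_as="iso-8859-1")
print(fixed)
```

The second call is the situation you get once the declared charset matches what the server actually sends.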

There is a “solution” for this problem, which is probably far from perfect. You can configure your application to send a meta tag with another encoding, which is easy. Just set the #charSet preference in your application registration method like this:

(WAAdmin register: self asApplicationAt: 'Apfelwissen.de')
    preferenceAt: #charSet put: 'iso-8859-1'.

Of course, if your country is not Germany, you will probably need a different charset for the nice little snakes and circles around your letters 😉

Now you can render umlauts from Smalltalk strings.

But that is only half the battle, maybe even less. What about form data? I mean the stuff you type into text input fields etc. in your browser. You might have already guessed it: it comes out WRONG!

Again, Seaside by default sends out FORM tags with an accept-charset="utf-8" attribute. And again, you can override this:

html form
    acceptCharset: 'iso-8859-1';
    with: [
        html textInput....

So as long as you make sure all your forms accept the same character set that your pages declare, everything will work fine. If you enter a person’s name such as Jörg Überhör, it will be displayed perfectly.
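
What goes wrong with a mismatched form can be sketched the same way. This is an illustration in Python, not VAST code: it assumes the browser submits the field as UTF-8 while the server treats the incoming bytes as iso-8859-1. The function name as_server_reads_it is hypothetical.

```python
def as_server_reads_it(typed, submitted_as="utf-8", server_assumes="iso-8859-1"):
    """Encode what the user typed in the form's charset, then decode it
    in the charset the server assumes for incoming bytes."""
    raw = typed.encode(submitted_as)
    return raw.decode(server_assumes)


# Form submits utf-8, server assumes iso-8859-1: each umlaut turns into
# two unrelated characters.
garbled = as_server_reads_it("Jörg Überhör")
print(repr(garbled))

# Form and server agree on iso-8859-1: the name arrives intact.
matching = as_server_reads_it("Jörg Überhör", submitted_as="iso-8859-1")
print(matching)
```

Every accented letter takes two bytes in UTF-8, so the garbled name is longer than what was typed, which is a quick way to spot this kind of mismatch in stored data.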
Are we there yet?
I am not sure. What about strings coming from your data storage or another application server? As long as they are encoded in iso-8859-1 (or whatever charset you use in your country), things will work nicely. But what if your database contains UTF-8?
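
One stopgap would be to transcode such strings at the boundary, before they are rendered. The sketch below shows the idea in Python; it is not VAST code and does not claim that VAST offers an equivalent call. The function name utf8_to_page_charset is hypothetical, and replacing unrepresentable characters with '?' is just one possible policy.

```python
def utf8_to_page_charset(raw, page_charset="iso-8859-1"):
    """Decode UTF-8 bytes from storage and re-encode them in the charset
    the pages declare. Characters that charset cannot represent are
    replaced by '?' instead of raising an error."""
    text = raw.decode("utf-8")
    return text.encode(page_charset, errors="replace")


# Bytes as they might come out of a UTF-8 database column.
from_db = "Bäume in Böhmen".encode("utf-8")
for_page = utf8_to_page_charset(from_db)
print(for_page)

# The euro sign has no place in iso-8859-1, so it degrades to '?'.
print(utf8_to_page_charset("€".encode("utf-8")))
```

The second call shows the limit of this approach: anything outside the page charset is lost, which is why it is only a workaround.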

The real solution will come when VAST is capable of handling UTF-8 in user data as well as in source code, or at least offers some code and guidelines for encoding strings correctly.
Until then, we’ll have to worry about such stuff ourselves. Hopefully not for too long.

Maybe we’ll learn more about when UTF support is coming and how we can handle it in John’s presentation at this year’s ESUG.