VA Smalltalk, Seaside, Ajax and … unicode!


I recently found the reason for an annoying bug in my Seaside application in VA Smalltalk. It is once again related to encoding/decoding of Ajax request parameters. Once again I’ve been bitten by the fact that VAST is not Unicode-ready yet (well, we’ve been waiting for Unicode support in VAST for more than 4 years now, as you can see in this post of mine dating back to August 2009).

But first things first: What was the actual problem?

For some input fields I need a specialized auto-completer that asks the server for a rendered list of search results (a hierarchical list, so nothing the out-of-the-box autocompleter of jQuery could accomplish). So I wrote a little jQuery plugin that reacts to each keyup event of the text input and sends an Ajax request back to the Seaside server. The callback on the server then uses the user input from the input field to search for possible hits and renders a drop-down div with these results in hierarchical form (and some more stuff).

Here’s the snippet from my JavaScript plugin that builds the Ajax request:

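input = $(this).val();
if (input.length >= options.minLength) {

  // don't wait for old results: cancel a request that is still running
  if (jqXHR) { jqXHR.abort(); }

  jqXHR = $.ajax({
    "url": options.url,
    // query string: the serialized form fields plus the raw user input
    "data": [
      options.queryFields,
      options.serializeCallback + "=" + input
    ].join("&")
  }).always(drawTarget()); // note: drawTarget() runs immediately and must return the callback
}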

Don’t worry too much about jqXHR and options. They are not really relevant here.

This works quite nicely: You enter a few letters, the server searches objects in the db and renders a list of hits.

Until you enter a percent sign.

When the trouble started

The first thing that happened was an exception in the server image in WAUrl class>>#decodePercent: (read more about it here) when I entered a String like ‘test%’. And that is when it all began…

At first I thought this was a bug in Seaside (which it probably even is). But then again, % is not just any character in a URL: it introduces a percent-encoded byte and must be followed by two hex digits. And an Ajax call is nothing but an HTTP request to the server, which consists of a URL and possibly some additional data.
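
To see why a lone % upsets a percent-decoder, here is a minimal sketch in plain JavaScript. It uses the built-in decodeURIComponent() as a stand-in for the server-side decoder; that is an assumption for illustration only, and WAUrl's own implementation may differ in detail. The helper name tryDecode is made up.

```javascript
// Hypothetical helper: returns the decoded string,
// or the name of the error if decoding fails.
function tryDecode(raw) {
  try {
    return decodeURIComponent(raw);
  } catch (e) {
    return e.name; // malformed percent sequences raise a URIError
  }
}

console.log(tryDecode("test%25")); // → "test%"    (properly encoded percent sign)
console.log(tryDecode("test%"));   // → "URIError" (bare % without two hex digits)
```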

Hmm. Then I thought I’d simply remove the percent sign from the search string in the JavaScript code, right before I send the Ajax request. Like this, for example:

input = $(this).val().replace (/%/g, ""); // remove %

That seemed to work. But wait! What if there is more than just % that causes trouble? In the end, what we’re handling here is going to be part of a URL, so maybe there is a better way to do this?

Always encode Strings before you send them as Ajax request Parameters

In fact, there is: encodeURI() is a JavaScript function made for such cases.

So I tried this:

input = encodeURI($(this).val());

And it seemed to work equally well and feels much better, because now I can be sure that all the encoding/decoding stuff is handled correctly. And if it’s not, it’s not my fault.
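
One caveat is worth a quick check, though: encodeURI() is meant for whole URLs, so it leaves separators such as & and = untouched, while encodeURIComponent() escapes them as well. The following sketch compares the two on a single parameter value; the helper buildQuery and the parameter name "q" are made up for illustration and are not part of my plugin.

```javascript
// Hypothetical helper: builds "name=value" with the given encoding function.
function buildQuery(name, value, encode) {
  return name + "=" + encode(value);
}

var sample = "50% & more=less";

// encodeURI() escapes % and spaces, but keeps & and = as they are,
// so the value would be split into two parameters on the server.
console.log(buildQuery("q", sample, encodeURI));
// → "q=50%25%20&%20more=less"

// encodeURIComponent() escapes the separators too.
console.log(buildQuery("q", sample, encodeURIComponent));
// → "q=50%25%20%26%20more%3Dless"
```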

Great. Joachim’s a Hero. He can read api.jquery.com and search on stacktrace.com. Wow! Applause!

…but you have to do it right

But wait: It only seemed to work well. Until I searched for the term ‘Büromöbel’. The search results were wrong. No hits, even though there should have been more than one…

I know I’ve been running in circles around this whole topic several times before, but I fall flat on my nose every single time. I need to remember: Ajax request parameters are UTF-8 encoded by default (not UTF-16; encodeURI() always emits UTF-8 byte sequences). I’ve tried several times to change this, but couldn’t. So the search term that gets encoded in JavaScript is handed through to my callback method as is, UTF-8 bytes and all. On Pharo Smalltalk, where Seaside is being developed, nobody has to worry about encodings and character sets; it just works.

There are a few things I did in the past to keep Unicode out of my application: I found a way to create the SQL database so that it explicitly uses the ISO-8859-15 code set, I configured the Seaside application to use the ISO-8859-15 charset, and I even patched WAFormTag to explicitly render each HTML form in ISO-8859-15, so that submits always send back ISO-8859-15. That all worked nicely. But still, Ajax wouldn’t play by the rules.
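
The mismatch can be reproduced without any server. The sketch below encodes the search term the way the browser does and then decodes each %XX as one single-byte character, which is roughly what a non-Unicode server would do. The function decodePercentAsLatin1 is a hypothetical stand-in written for this illustration (it uses Latin-1, not the exact ISO-8859-15 table); it is not what VAST or Seaside actually run.

```javascript
// Hypothetical stand-in for a single-byte server:
// every %XX becomes exactly one character with that code.
function decodePercentAsLatin1(encoded) {
  return encoded.replace(/%([0-9A-Fa-f]{2})/g, function (match, hex) {
    return String.fromCharCode(parseInt(hex, 16));
  });
}

// The browser side: each umlaut turns into two UTF-8 bytes.
var sent = encodeURIComponent("Büromöbel");
console.log(sent);                        // → "B%C3%BCrom%C3%B6bel"

// The single-byte side: two bytes become two wrong characters.
console.log(decodePercentAsLatin1(sent)); // → "BÃ¼romÃ¶bel"
```

A search for the garbled term naturally finds nothing, which matches the empty result list I got.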

So back to Yahoo, stackoverflow and friends.

How I escaped the tragedy

There is yet another JavaScript function that seems to be exactly what I need: escape().

Since I had tried so many things before, it couldn’t hurt to try this one as well. So my JavaScript code ends up looking like this:

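input = escape($(this).val());
if (input.length >= options.minLength) {

  // don't wait for old results: cancel a request that is still running
  if (jqXHR) { jqXHR.abort(); }

  jqXHR = $.ajax({
    "url": options.url,
    // query string: the serialized form fields plus the escaped user input
    "data": [
      options.queryFields,
      options.serializeCallback + "=" + input
    ].join("&")
  }).always(drawTarget()); // note: drawTarget() runs immediately and must return the callback
}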

And there we are: all input reaches the callback with correct German umlauts, characters like %, _ and & are decoded correctly, and so it seems this story finally has a happy ending. The reason is that escape() writes every character up to code point 255 as a single %XX byte (ü becomes %FC) instead of a UTF-8 byte sequence, which is exactly what a server working in ISO-8859 expects.

Here’s a very good explanation of the differences between escape(), encodeURI() and encodeURIComponent() that helped me look and experiment in the right direction: http://xkr.us/articles/javascript/encode-compare/
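
For the record, here is a small sketch of what escape() produces for a few inputs. The helper escapeSamples is made up for this post. Two caveats that I have not tested against my server are visible in it: characters above code point 255 (the euro sign, for instance) come out in a %uXXXX form that a percent-decoder may not understand, and + is not escaped at all. Also keep in mind that escape() is listed as a legacy feature in the ECMAScript specification.

```javascript
// Hypothetical helper: collects escape() results for a few sample inputs.
function escapeSamples() {
  return {
    umlauts: escape("Büromöbel"), // one byte per umlaut
    percent: escape("test%"),     // % itself is escaped
    amp:     escape("a&b"),       // so is the parameter separator
    euro:    escape("€"),         // above 255: %uXXXX form
    plus:    escape("a+b")        // + is left alone
  };
}

var samples = escapeSamples();
console.log(samples.umlauts); // → "B%FCrom%F6bel"
console.log(samples.percent); // → "test%25"
console.log(samples.amp);     // → "a%26b"
console.log(samples.euro);    // → "%u20AC"
console.log(samples.plus);    // → "a+b"
```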

So what can we learn from this exercise?

  1. If you write a piece of JavaScript code that uses a String as a parameter of an Ajax call to any kind of server, don’t forget to encode the String before sending it. Otherwise, some characters are going to break your server logic (because they will be interpreted as separators in the URL instead of as part of a parameter).
  2. There are three options you should check for encoding: encodeURI(), encodeURIComponent() and escape().
  3. If your server application does not handle Unicode and you want to avoid adding a decoding step on the server side (an option that I discarded for performance reasons right from the start), escape() will most probably be the best option for your case.
  4. Sometimes, a bug goes undetected for months. When it finally shows up, it sends you on a journey through far-away countries of knowledge you never dreamed of. This can be fun and joy and eye-opening. But it can also be frustrating. Try to enjoy it 😉