Today I was on the hunt. And I finally shot the beast that has killed hundreds of good men over the last decade or so.
Well, at least it seems it’s dead, and I don’t really know why. I know it’s dead and I was hunting for it, but I am not sure I pulled the trigger.
But let’s start at the beginning. Yesterday I had a shiny new version of my Seaside application, and I had cross-packaged it for Linux and tested it on two different Linux machines. And it crashed. It crashed without much comment: the server just didn’t respond any more, the process was dead, and all it left for me was a vmtrap.log and a vmtrap.img file in the image directory. Funnily, it crashed very reliably, within only a few web page requests, but it died in many different places. The vmtrap.log almost always showed the image in a method different from the one in any of the previous crashes.
The funny thing was that the packaged image of the day before ran nicely, without any crashes. It ran using the very same VM and all the same external libs and stuff. The only difference was the image itself. The code changes were only in application code: no external calls, no strange extensions to UndefinedObject, no evil class initialization stuff. Just a few bug fixes and stuff.
So to sum the situation up: the dev image on Windows ran, a dev image on Linux ran as well, and the packaged image of the previous day ran – even on the same machines using the same path and VM and libs and ini and whatever. The VM running the new packaged image crashed in various places, and I could not find any pattern that would explain why.
I even created two completely fresh XD images and repackaged and went through the .es files and whatnot. I also repackaged the image of the previous day and all was the same again: old one was good, new one was broken.
So I was somewhat out of luck. I asked the wizard of G and found a very old post that seemed to talk about the very same problem. It came up with the (crude?) theory that someone must have screwed up bytecodes in the library, and recompiling the methods fixed their problem. (Maybe I should mention that I use a Linux XD image for packaging while I develop on Windows, so this is not related to any image problems. I throw images away very often to avoid such problems right from the start.)
So I thought there’s not much to lose. I fired up a Config Map Browser, browsed the changes between the previous day and the current version (incl. required maps) and simply touched every changed method by entering a blank or newline somewhere in it and saving it. Luckily, I hadn’t changed that much code yesterday, maybe 50 methods. Have I ever mentioned I really love Envy?
And what can I say? I fired up the Linux VM, cross-packaged the new (unchanged, but recompiled) code and the problem seems to be solved. The packaged image is running and seems to be stable.
So what could have gone wrong? I package once or twice a week and have never seen this before.
Ah, and before you ask: None of the methods I had changed were ever mentioned in the vmtrap.log files…
But here’s my theory (please take it with a grain of salt):
I remember two things that happened to me yesterday. One was that I added an instance variable with accessors (using RB) named ‘class’, which wasn’t a good idea, so I later renamed the variable (but not the accessors) to ‘cssClass’. The other was that I later couldn’t version/release that class because it was “inconsistent with the edition in the Library”, an issue that is easy to solve most of the time.
I remember I once had lots of variables and accessors to rename in a very poor code base, and I had similar problems with accessor methods that obviously didn’t work right. The source code said a method would do one thing, but the result of executing it was different… Back then, changing the accessor methods by hand and saving/compiling new editions of them solved the issue.
I am not sure this theory is any good and am not sure how I could ever build a test for this stuff or write a somewhat useful support case. I only hope I remember this post if I ever encounter such a strange problem again.
So what do we learn from this?
- Version your code frequently to have milestones to check for changes!
- Envy is one of your best friends when it comes to finding differences (handling them is another story, and Envy is not yet good at that)
- Package early. Package and test the packaged code often so that you know when things go wrong (and so that you don’t have too many of these surprises the night before the shipping date)
- Sometimes the fix of a bug doesn’t feel right. How can I know the next build tomorrow isn’t screwed again?