Well, a developer’s day is full of victories and waterloos. So I’ve had my developer’s day today.
After I had discussed my Server startup problems related to ports already in use, Marten came up with an explanation for VAST’s behavior on the VAST Support Group:
The reason for this behaviour is located in the method: SstTcpTransport>>#basicOpenListeningSocket and it seems to be correctly handled by VASmalltalk.
In this method a socket option (“SOREUSEADDR”) is set to true (if the configuration of the tcp network allows it, which is normally the case of VASmalltalk).
Hmm. Even if it’s not incorrect, I still don’t like it.
So my first waterloo was that Marten proved me wrong: this is not a bug or wrong behavior – from a very technical standpoint. I still thinnk the server pretends to be ready when it is not, and I don’t like it if my software tells me lies right into my face, especially if that even costs me lots of time…
The thread at the discussion group then continues with suggestions like changing the setting Marten mentioned in some very low level method for binding to a Socket. That is of course very deep down and you never know what you break if you open ALL TCP/IP Sockets as exclusive listeners.
So I decided to stick with my workaround that I described in my initial post on the subject (the reworked one, of course). It ran great – on Windows.
So here was my second waterloo. When I had XD packaged this and tried to start the same image twice on the Linux test machine, the Adaptor never left the #isStarting phase. The log contained lots of entries telling me we’re waiting for the adaptor to finish starting:
'2013-04-24 17:32:48,917: [INFO] Waiting for the Adapter to finish starting' '2013-04-24 17:32:49,519: [INFO] Waiting for the Adapter to finish starting' '2013-04-24 17:32:50,121: [INFO] Waiting for the Adapter to finish starting' '2013-04-24 17:32:50,723: [INFO] Waiting for the Adapter to finish starting' '2013-04-24 17:32:51,325: [INFO] Waiting for the Adapter to finish starting'
.. and so on. So the Socket behavior differs between Windows and Linux. On Windows, the retry loop was never run, the Adapter was not running immediately.
So I decided to extend my ugly workaround a little more with a retry count. I hate the code as it looks now, because it reminds me of really dark days in my programmer career. I’ll show it to you nevertheless, because it may be useful for others as well (I know, it’s not rocket science, but copying it from here is faster than thinking about it – you’re welcome!):
startServerAdaptor "self startServerAdaptor" |adaptor maxRetries retries| maxRetries := 5. retries := 1. [ (adaptor := WASstServerAdaptor port: self port) start. [adaptor isStarting and: [retries <= maxRetries]] whileTrue: [ EsLogManager info: ('Waiting for the Adapter to finish starting, retry count = %1' bindWith: retries asString). (Delay forSeconds: 0.5) wait. retries := retries +1. ]. adaptor isRunning ifFalse: [Error signal: ('Adaptor is not Running - maybe port %1 is already in use?' bindWith: self port asString)]. EsLogManager info: 'WASStServerAdaptor started on port: ' , port asString] on: Error do: [:ex | EsLogManager error: 'Seaside Adaptor couldn''t start due to: ' , ex description. ex pass]
This is not a victory to be proud of, but one that will help me waste far less time lookinkg for phantom bugs. Here’s what happens in the Server ssh session when I start the image a second time:
UIProcess reportError: Adaptor is not Running - maybe port xxxxx is already in use? Dumping walkback to file: walkback.log
And the server process is immediately exiting. When I look into the application log I can see the server didn’t start because the port was in use. Great! Why not be happy about a small victory that may lay the ground for faster progress. It’s not much ado about nothing, and II’m pretty sure it pays back one day.
