Glorp and embedded objects – how null is nil and why not?

Hi, it’s been a while. I am still trying to find the time and energy to write posts more regularly. One of the things I had planned was to write about things I learned, especially when I learn something new about (or even solve a problem with) Glorp, my favorite Object-Relational mapper.

It’s almost New Year’s Eve, so this post is proof that it’s never too late to work on your new year’s resolutions, even if they were made a few years ago. So, to make things complete on this side thread: Happy New Year 2022 everybody! Let’s hope we’re finally getting ahead of the pandemic wave and get back to a new normal life.

But now for the problem I am proud to have solved on my own 😉

I have this Smalltalk class named DateRange in my application. It only has two instance variables: #startDate and #endDate. But it bundles a lot of logic having to do with whether a DateRange overlaps with another one, whether it contains a Date, whether it starts or ends before or after a certain Date or another DateRange, how objects can be sorted by the DateRange they cover, etc. You get the picture.
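
A minimal sketch of what two of these methods could look like (the selectors are made up for illustration, and both bounds are assumed to be non-nil):

includesDate: aDate
	"Answer whether aDate lies within the receiver, both bounds included."
	^aDate >= self startDate and: [aDate <= self endDate]

overlaps: aDateRange
	"Answer whether the receiver and aDateRange share at least one day."
	^self startDate <= aDateRange endDate
		and: [aDateRange startDate <= self endDate]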

The business logic of this DateRange is so helpful in my domain (Accounting) and occurs so often that I wrote this class and was looking for ways to store DateRange instances in our relational database as two columns in the table of the containing business objects. So a business year has a start and end date, the duration of usage of some good you use in your business has a start and an end date, a Tax Report has a startDate and an endDate. But it makes absolutely no sense to store all DateRanges in their own DateRange table in the database and use 1:n relations to load/store these DateRange objects. It is much better to store the #startDate and #endDate as columns in the BusinessYear table or the TaxReport table, but still have Glorp retrieve the combination of these two attributes as an instance of DateRange.

The good news: Glorp can do that. It provides an EmbeddedValueOneToOneMapping. This mapping is super handy and clever: it lets you define the concrete column names for your startDate and endDate in each concrete table, so you can even store multiple DateRanges in a row if a business object has more than one DateRange (like, say, a period of availability and a period of validity). Each of them gets separate column names, but they are all still mapped to the same class.

The Mapping in the Descriptor looks like this:

(aDescriptor newMapping: EmbeddedValueOneToOneMapping)
    attributeName: #availabilityPeriod;
    fieldTranslation: (
        (Join new)
            addSource: (table fieldNamed: 'avail_start_date')
                target: ((self tableNamed: 'DATERANGE_EMBEDDED') fieldNamed: 'startDate');
            addSource: (table fieldNamed: 'avail_end_date')
                target: ((self tableNamed: 'DATERANGE_EMBEDDED') fieldNamed: 'endDate');
            yourself).
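
A second DateRange in the same row would get its own mapping with its own source columns, while the targets stay the same. A sketch, assuming an attribute #validityPeriod and columns named valid_start_date and valid_end_date (all three are hypothetical):

(aDescriptor newMapping: EmbeddedValueOneToOneMapping)
    attributeName: #validityPeriod;
    fieldTranslation: (
        (Join new)
            "hypothetical column names, the targets are the same as above"
            addSource: (table fieldNamed: 'valid_start_date')
                target: ((self tableNamed: 'DATERANGE_EMBEDDED') fieldNamed: 'startDate');
            addSource: (table fieldNamed: 'valid_end_date')
                target: ((self tableNamed: 'DATERANGE_EMBEDDED') fieldNamed: 'endDate');
            yourself).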

For this to work, you also need a special table definition that tells Glorp which fields make up a DateRange, so that it can convert two dates to a DateRange and back. It describes an embedded table and looks like this:

tableForDATERANGE_EMBEDDED: aTable
  aTable createFieldNamed: 'startDate' type: platform date.

  aTable createFieldNamed: 'endDate' type: platform date.
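
The DateRange class also needs a descriptor that maps its two instance variables to the fields of this embedded table. A minimal sketch, assuming the usual descriptor system naming convention and plain DirectMappings (adapt it to your own descriptor system):

descriptorForDateRange: aDescriptor
	"Map the two instance variables to the fields of the embedded table."
	| table |
	table := self tableNamed: 'DATERANGE_EMBEDDED'.
	aDescriptor table: table.
	(aDescriptor newMapping: DirectMapping)
		from: #startDate to: (table fieldNamed: 'startDate').
	(aDescriptor newMapping: DirectMapping)
		from: #endDate to: (table fieldNamed: 'endDate').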

This is extremely powerful and works great. You can store and load objects with DateRanges in instance variables and get DateRange instances back from the database.

…until the DateRange is nil. And by “is nil” I mean a DateRange that is not present in an object. In our example above, we have a business object whose #availabilityPeriod is nil instead of an instance of DateRange.

Glorp is clever enough to save this DateRange as two NULL values in the avail_start_date and avail_end_date columns. So far so good.

If you are as old as I am, you know that movies back in the '70s and '80s had time. By this I mean you could have sequences in movies where nothing happened for a minute or even longer. You just watched some object flying away from or towards you in space, or the camera showed you a long road through a desert and after 20 seconds or so you could see a car appearing on the horizon and you watched it for 50 seconds while it drove towards the camera, dipping down some hills, disappearing from view and coming back into view a few times while growing bigger with each hill.

They don’t make movies like that any more, but I like the idea that someone is still reading and wants to learn more about my Glorp problem, so take this as the long winding road leading up to the climax of my post 😉

When Glorp reads back an embedded object, it will always instantiate an object of the mapped class, in our case the DateRange. Only after instantiating the DateRange does it populate its instance variables.

So the end result of storing the content of the instance variable availabilityPeriod that is nil and reading it back will be a DateRange with a startDate of nil and an endDate of nil. Which, obviously, is not the same as nil. This may or may not cause all kinds of problems, especially if your business code does #isNil checks on your availabilityPeriod variable. Before you save an object to the database, availabilityPeriod is nil. As soon as you refresh the object or load it back from the database it will not be nil but a DateRange containing the startDate nil and the endDate nil.
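
In code, the round trip looks roughly like this. It is a sketch that assumes a Product with id 5483 whose availabilityPeriod was nil when it was saved, and a session held in myDbSession; the comments reflect the behaviour described above:

| product |
product := myDbSession readOneOf: Product where: [:each | each id = 5483].
product availabilityPeriod isNil.            "false: Glorp built an empty DateRange"
product availabilityPeriod startDate isNil.  "true"
product availabilityPeriod endDate isNil.    "true"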

Ouch.

Let’s watch the evil guy enter his car and drive back to wherever he came from for a while. In former times, we’d be watching this and enjoy this quiet moment and maybe feel a host of mixed emotions that the director of the movie wanted us to feel. These days, viewers aren’t trained to enjoy such scenes any more. Maybe you take this moment to think about the consequences of Glorp’s strange behaviour and ways to possibly solve this problem instead.

I am far from being perfect as a developer, but here are options that I came up with:

  1. I could try to make DateRange behave like nil (objects are all about polymorphism, right?) when all its instance variables are nil
  2. Wait: there must be an option to tell the mapping to not instantiate anything if all columns are NULL and return nil instead

While the first option sounds not too bad at first, let me tell you it is actually a really bad idea. Using DateRange will turn into a game of Minesweeper without the hints on uncovered fields. Almost every method in a DateRange will have to start with the philosophical problem of whether this DateRange is possibly not actually a real DateRange but something nillish. Or something like that. Let’s not go into the details here, you’re still reading, so I assume you accept I tried and proved the idea to be bad.
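
Just to give a taste of it: with option 1, even a simple method would need a guard of its own, and so would every other method in the class. A sketch, with #isEmptyRange and #includesDate: as made-up selectors:

isEmptyRange
	"Answer whether the receiver is one of those nillish DateRanges."
	^startDate isNil and: [endDate isNil]

includesDate: aDate
	"The guard has to come first, otherwise the comparison is sent to nil."
	self isEmptyRange ifTrue: [^false].
	^aDate >= startDate and: [aDate <= endDate]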

Of course I started my journey with option 2 of the above list. I was looking for ways to tell Glorp to not instantiate a DateRange if both startDate and endDate are nil (or better: their respective columns in the DB are NULL). But I couldn’t find any attribute in EmbeddedValueOneToOneMapping that is capable of doing it. (If you don’t believe that I looked and also searched for help on this, follow this link to see I did). And I gave up.

Until I encountered this problem a second time, and this time I wasn’t ready to give up.

And it turns out there is a way, but it doesn’t work by configuring the mapping.

I guess you are not interested in whether the bad guy’s car is red or what brand it is, so I’ll skip this section and come right to the solution. I’ll even save you from reading what I did to find it. It’s an interesting story per se, but you’re not paying me for anecdotes, are you?

During the building phase of an object, each newly created and populated instance gets sent the message #glorpPostFetchValidate:.

If this method explicitly answers false instead of self (which, as you know, is the default return value of every Smalltalk method), Glorp treats the object as invalid and populates the mapped instance variable with nil. So I implemented this instance method in DateRange:

glorpPostFetchValidate: aSession
	"This allows us to do post-read notification of the objects. Note that if this method explicitly returns a false, then we will treat that as meaning that the object is invalid and should not be read. Yes, this is kind of a hack."
	
	(self startDate isNil and: [self endDate isNil]) ifTrue: [^false]

So if I now read objects with DateRange mappings, the objects will contain a real nil as value in their instance variables.

myDbSession read: Product where: [:each | each id = 5483].

The Product with id 5483 now has an availabilityPeriod of nil. All existing code still works.

You’ve reached the end of the story. Thanks for staying and reading. I know there’s little plot for a long post. Like in a cult movie from the past.

But wait, there is something strange I would like to mention.

Remember the movie Coming To America starring Eddie Murphy as Prince Akeem Joffer? If you stood up when the end credits start scrolling over the screen, you missed the best part. These jokes after the credits still occur in movies today, so even if you are young enough not to know the movie, you probably know the parts of Ice Age when Scrat fights the facts of life after the credits. If you don’t, never mind.

There is a part of EmbeddedValueOneToOneMapping that leaves me stunned.

If you query the database with a query like this:

session read: Product where: [:p| p availabilityPeriod = nil].

Glorp still knows how to convert this to correct SQL, because it will create a statement like this:

SELECT t1.id, ..., t1.AVAIL_START_DATE, t1.AVAIL_END_DATE
 FROM PRODUCT t1
 WHERE ((t1.AVAIL_START_DATE IS NULL) AND (t1.AVAIL_END_DATE IS NULL))

So Glorp is pretty much aware of what nillishness of an embedded object means. It just doesn’t use this knowledge at read/build time. Or I am still missing something – maybe someone can enlighten me.

I wonder if this is more a bug or shortcoming in Glorp or if I am just expecting too much…