Dynamic Web Development with Seaside

25.1 Image-Based Persistence

While a Smalltalk image should not really be used as an artifact of code management (you should use Monticello packages, change sets, or Store packages to manage your code and build your image from packages), it can be used to store objects. With some precautions you can use the image as a simple and powerful object-oriented database. You can therefore delay the need to hook up a database during much of your development, and often during deployment too. The point is to find the adequate solution for your problem.

On his blog Ramon Leon advocates that not all applications need a relational-database back-end, and he explains some of the advantages of a lighter-weight approach at http://onsmalltalk.com/simple-image-based-persistence-in-squeak/.

Choosing the right level of database is important since it lowers the stress on development. This is why solutions like Prevayler, based on the Command design pattern, have emerged over the years. Such approaches mimic the notion of the Smalltalk image, even if they offer a better storage granularity.

Not using a relational database directly will also ease the evolution of your application, and during the prototyping phase it will certainly evolve. In addition, working with full objects all the way down is more productive. So let’s have a look at image storage mechanisms.

The simplest approach is the following: you save your image. Now if the image crashes, for example because your disk is full, you are in trouble. The second level is to perform several backups. Later on you can switch to an object-oriented database approach such as GOODS, Magma or GemStone. Of course, saving an image does not work well if you have to share data between different applications that do not run in the same image. So you get the simplicity, and the limits of simplicity too.

Saving an image. The expression SmalltalkImage current saveSession saves an image, i.e., all the objects that are accessible in your system. Now based on that we can build a small utility class. Let us define ImageSaver as a class for saving the image.

Object subclass: #ImageSaver
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'ImageSaver'

ImageSaver class>>saveImage
    SmalltalkImage current saveSession

Now the question is when to save our data. For the ToDo application, saving each time an item is changed, added or removed is one possibility. Having an explicit save button is another. We let you decide for your application.

On a MacBook Pro with around ten applications running in parallel, it takes about 1100 ms to save the Seaside image. This time depends heavily on the size of the image and will therefore influence your choice. For two lines of code, this is a good tradeoff. Now we will use the solution proposed by Ramon Leon to make the approach more robust against crashes.

Backing Up images. Using the image itself as a database is not free of problems. An average image is well over 30 megabytes, saving it takes a bit of time, and saving it while processing HTTP requests is a risk you want to avoid. In addition, you want to avoid having several processes saving the image at the same time.

ReferenceStream provides a solution to serialize objects to disk. On every change you just snapshot the entire model. Note that this isn’t as crazy as it might sound: most applications just don’t have that much data. If you’re going to have a lot of data, clearly this is a bad approach, but if you’re already thinking about how to use the image for simple persistence because you know your data will fit in RAM, here’s how Ramon Leon does it.

We define a simple abstract class that you can subclass for each project. With a couple of lines you get a Squeak image-based persistence solution that is fairly robust and crash proof, and capable enough to let you use the image without the need for an external database.

Object subclass: #SMFileDatabase
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'SimpleFileDb'

SMFileDatabase class
    instanceVariableNames: 'lock'

All the methods that follow are class-side methods. First, we’ll need a method to fetch the directory where rolling snapshots are kept. Note that we use the name of the class as the directory entry.

SMFileDatabase class>>backupDirectory
    ^ (FileDirectory default directoryNamed: self name) assureExistence.

The approach here is simple: a subclass should implement repositories to return the root object to be serialized. We often just return an array containing the root collection of each domain class.

SMFileDatabase class>>repositories
    self subclassResponsibility

The subclass should also implement restoreRepositories: which will restore those repositories back to wherever they belong in the image for the application to use them.

SMFileDatabase class>>restoreRepositories: someRepositories
    self subclassResponsibility
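
As a sketch of the array convention mentioned above, a subclass with two root collections might implement both hooks as follows. The classes ShopFileDatabase, Customer and Order, and the class-side accessors repository and repository:, are hypothetical names used only for illustration; they are not part of the original code.

```smalltalk
"Hypothetical subclass of SMFileDatabase; Customer and Order are made-up domain classes."
ShopFileDatabase class>>repositories
    "Answer one array holding the root collection of each domain class."
    ^ Array
        with: Customer repository
        with: Order repository

ShopFileDatabase class>>restoreRepositories: someRepositories
    "Put each root collection back, in the same order used by repositories."
    Customer repository: (someRepositories at: 1).
    Order repository: (someRepositories at: 2)
```

The only contract between the two methods is the order of the array elements, so both should be changed together whenever a new root collection is added.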

Should the image crash for any reason, we want the last backup to be fetched from disk and restored. So we need a method to detect the latest version of the backup file, which we will tag with a version number when saving.

SMFileDatabase class>>lastBackupFile
    ^ self backupDirectory fileNames
        detectMax: [:each | each name asInteger]
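
To see why the version number has to be compared as an integer, the following workspace sketch applies detectMax: to a few made-up file names. It isolates the numeric suffix explicitly with copyAfterLast:, which is a slight variation on the method above rather than the original code, and it does not touch any files.

```smalltalk
"Workspace sketch with made-up file names; no files are read or written."
| names latest |
names := #('ToDoFileDatabase.1' 'ToDoFileDatabase.12' 'ToDoFileDatabase.3').
"Compare the numeric suffixes 1, 12 and 3 rather than the strings themselves."
latest := names detectMax: [ :each | (each copyAfterLast: $.) asInteger ].
latest
```

Here latest is 'ToDoFileDatabase.12', whereas a plain string comparison would rank the name ending in .3 after the one ending in .12. On an empty collection detectMax: answers nil, a case that the method lastBackup guards against.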

Once we have the file name, we’ll deserialize it with a read-only reference stream.

SMFileDatabase class>>lastBackup
    | lastBackup |
    lastBackup := self lastBackupFile.
    lastBackup ifNil: [ ^ nil ].
    ^ ReferenceStream
        readOnlyFileNamed: (self backupDirectory fullNameFor: lastBackup)
        do: [ :f | f next ]

This requires you to extend the class ReferenceStream with readOnlyFileNamed:do: as follows. This way you do not have to remember to close your streams.

ReferenceStream class>>readOnlyFileNamed: aName do: aBlock
    | file |
    file := self oldFileNamed: aName.
    ^ file isNil
        ifFalse: [ [ aBlock value: file ] ensure: [ file close ] ]

Now we can provide a method to actually restore the latest backup. Later, we will make sure this happens automatically.

SMFileDatabase class>>restoreLastBackup
    self lastBackup
        ifNotNilDo: [ :backup | self restoreRepositories: backup ]

We provide a hook with a default value representing the number of old versions to keep.

SMFileDatabase class>>defaultHistoryCount
    ^ 15

Now we define a method trimBackups that deletes the older versions so that we do not fill up the disk with more data than needed.

SMFileDatabase class>>trimBackups
    | entries versionsToKeep |
    versionsToKeep := self defaultHistoryCount.
    entries := self backupDirectory entries.
    entries size < versionsToKeep ifTrue: [ ^ self ].
    ((entries sortBy: [ :a :b | a first asInteger < b first asInteger ])
        allButLast: versionsToKeep)
            do: [ :entry | self backupDirectory deleteFileNamed: entry first ]

Note that you can change this strategy and keep more versions.
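
One way to keep more versions, sketched here under the assumption that you use the ToDoFileDatabase subclass introduced later in this section, is to override the hook on the class side of that subclass. The value 50 is arbitrary.

```smalltalk
ToDoFileDatabase class>>defaultHistoryCount
    "Keep the 50 most recent snapshots instead of the inherited 15 (50 is an arbitrary choice)."
    ^ 50
```

Because trimBackups reads the value through self defaultHistoryCount, each subclass can choose its own history size without touching SMFileDatabase.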

Serializing Data. Now we are ready to actually serialize the data. Since we want to prevent multiple processes from saving our data at the same time, we do all the work within a critical section: we invoke trimBackups, figure out the next version number, serialize the data (using the method newFileNamed:do:), and make sure it is flushed to disk before continuing. Let’s define the method newFileNamed:do: and then the method saveRepository as follows.

ReferenceStream class>>newFileNamed: aName do: aBlock
    | file |
    file := self newFileNamed: aName.
    ^ file isNil
        ifFalse: [ [ aBlock value: file ] ensure: [ file close ] ]

SMFileDatabase class>>saveRepository
    | version |
    lock critical: [
        self trimBackups.
        version := self lastBackupFile isNil
            ifTrue: [ 1 ]
            ifFalse: [ self lastBackupFile name asInteger + 1 ].
        ReferenceStream
            newFileNamed: (self backupDirectory fullPathFor: self name) , '.' , version asString
            do: [ :f | f nextPut: self repositories ; flush ] ]
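
The effect of the critical section can be observed in isolation with a small workspace sketch. It uses only a Semaphore, two forked processes and a log collection; the variable names and the delays are illustrative assumptions, and nothing is saved to disk.

```smalltalk
"Workspace sketch: two processes compete for the same mutual-exclusion semaphore."
| lock log |
lock := Semaphore forMutualExclusion.
log := OrderedCollection new.
"The first process holds the lock while it pretends to do a slow save."
[ lock critical: [
    log add: #first.
    (Delay forMilliseconds: 50) wait.
    log add: #firstDone ] ] fork.
"The second process has to wait until the first one leaves its critical block."
[ lock critical: [ log add: #second ] ] fork.
"Give both processes time to finish before inspecting the log."
(Delay forMilliseconds: 200) wait.
log asArray
```

The second block cannot add its entry until the first one has finished, which is the same guarantee that saveRepository relies on to keep two snapshots from computing the same version number.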

So far so good, let’s automate it. In Squeak, we can register classes so that their method shutDown: is called when the image is saved or quit, and their method startUp: is called when the image is booting. Using this mechanism, we can make sure that a backup is automatically performed when the image quits and automatically restored at startup time. This way, if your computer crashes, relaunching the image will automatically load the latest backup.

We’ll add a method to schedule the subclass to be added to the start up and shutdown sequence. Note that you must call this for each subclass, not for this class itself. This method also initializes the lock, so it must be called before saveRepository, which relies on the lock. To achieve this behavior, we use the addToStartUpList: and addToShutDownList: messages as follows:

SMFileDatabase class>>enablePersistence
    lock := Semaphore forMutualExclusion.
    Smalltalk addToStartUpList: self.
    Smalltalk addToShutDownList: self

So on shutdown, if the image is actually going down, we just save the current data to disk by specializing the method shutDown:.

SMFileDatabase class>>shutDown: isGoingDown
    isGoingDown ifTrue: [ self saveRepository ]

And on startup we can restore the last backup by specializing the method startUp:.

SMFileDatabase class>>startUp: isComingUp
    isComingUp ifTrue: [ self restoreLastBackup ]

Now, if you want a little extra snappiness and you do not need the user to wait for the flush to disk, we’ll add a little convenience method for saving the repository on a background thread.

SMFileDatabase class>>takeSnapshot
    [self saveRepository]
        forkAt: Processor systemBackgroundPriority
        named: 'snapshot: ' , self class name

Now for the ToDo application. We create ToDoFileDatabase as a subclass of the class SMFileDatabase.

SMFileDatabase subclass: #ToDoFileDatabase
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'ImageSaver'

We make sure that persistence is enabled by specializing the class-side initialize method as follows:

ToDoFileDatabase class>>initialize
    "self initialize"
    self enablePersistence

Now the list of items is the only root of our object model, so we specify it as the entry point for the store in the repositories method.

ToDoFileDatabase class>>repositories
    ^ ToDoList default

Since we need a way to change the current list of todo items, we extend the class ToDoList with the class-side method default:, and we use it in restoreRepositories: to put the restored list back in place:

ToDoList class>>default: aToDoList
    Default := aToDoList

ToDoFileDatabase class>>restoreRepositories: someRepositories
    ToDoList default: someRepositories

We modify the method renderContentOn: of the ToDoListView to offer the possibility of saving.

ToDoListView>>renderContentOn: html
    html heading: self model title.
    html form: [
        html unorderedList: [ self renderItemsOn: html ].
        html submitButton
            text: 'Save' ;
            callback: [ ToDoFileDatabase saveRepository].
        html submitButton
            callback: [ self add ];
            text: 'Add' ].
    html render: editor
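
If the Save button should not hold up the request while the snapshot is written, its callback could call takeSnapshot instead of saveRepository. The helper below is only a sketch: renderSnapshotButtonOn: is a hypothetical method that is not part of the original view, and it would have to be invoked from inside the form block of renderContentOn:.

```smalltalk
ToDoListView>>renderSnapshotButtonOn: html
    "Hypothetical helper: saves in a background process so the request returns at once."
    html submitButton
        text: 'Save in background';
        callback: [ ToDoFileDatabase takeSnapshot ]
```

The trade-off is that the page is rendered before the data has necessarily reached the disk, so a crash in that short window would lose the latest change.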

The expression ToDoFileDatabase restoreLastBackup lets you restore the latest backup.
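
The whole cycle can be tried from a workspace with the sketch below, which assumes that SMFileDatabase, ToDoFileDatabase and ToDoList are loaded as defined above. The file location mentioned in the comments follows from backupDirectory and saveRepository and should be treated as an expectation rather than a guarantee.

```smalltalk
"Workspace sketch; evaluate the lines one at a time."
ToDoFileDatabase initialize.          "creates the lock and registers the startup and shutdown hooks"
ToDoFileDatabase saveRepository.      "writes a new numbered file under the ToDoFileDatabase directory"
ToDoFileDatabase lastBackupFile.      "answers the name of the newest backup file"
ToDoFileDatabase restoreLastBackup.   "reads that file and hands its contents to restoreRepositories:"
```

Calling initialize first matters because saveRepository sends critical: to the lock, which is nil until enablePersistence has run.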

This solution offers a simple persistence mechanism that is more robust and easier than just saving an image. It works for those small projects where you really don’t want to bother with a real database. Just sprinkle a few MyFileDbSubclass saveRepository or MyFileDbSubclass takeSnapshot calls around your application code wherever you feel it is important, and you’re done.

On the one hand, saving the image is easy, but on the other it saves all the data. Let us now take a look at other approaches that can select what is saved.

Copyright © 19 March 2024 Stéphane Ducasse, Lukas Renggli, C. David Shaffer, Rick Zaccone
This book is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 license.

This book is published using Seaside, Magritte and the Pier book publishing engine.