CQRS/ES with Scala, Akka and Lift

Wow, more than a year has passed since the last blog post. Time for an update.

When we started creating fleetdna.com, we iterated quickly using the ORM built into Lift: Mapper. It is a fairly simple (in a good way!) ORM based on the Active Record pattern, and it comes with built-in support for things like CRUD screens. This allowed us to create functionality quickly, which is essential for a new company trying to get a product to market.

Fast forward a few years. As product complexity grew, the simplicity of Mapper and the tight coupling of data to UI became a burden. It was not possible to write the more advanced queries without running into the N+1 problem. We also needed to present and edit information in the UI that is not directly related to single records. Basically, we are moving towards a more task-based UI.

Most of our data is timeline-based, and we spent some time adding temporal database concepts to Mapper. While it worked, an Active Record-based ORM is not really suited to these scenarios.

So we started to look at Event Sourcing for the core parts of the application and haven't looked back. There are numerous good resources on ES, so I won't go into the details here. The basic idea is that instead of recording the state of an object, we record the (business) events that have happened. The current state of the object is then a left fold of all the stored events.

It’s important to realise that we haven’t chosen this architecture because we need to scale to very high loads. We don’t. The application we provide (SaaS based fleet management) is very much a niche B2B product. We have chosen the ES architecture because we believe it provides a clean approach to the business problems we are trying to solve. Also, we still have some functionality using Mapper, but this is mostly admin-style data such as users, tenants and other base data.

Our once simple one-project product now comprises 8 subprojects: 3 Bounded Contexts, 1 Shared Kernel, the UI and some general-purpose libraries which may someday be open-sourced :-). In the following I'll try to highlight some of the choices we've made and how we've used some of Scala's features to improve the code.

Domain Modelling

We have started using Domain Driven Design (DDD) for the core features of the product. We employ a Shared Kernel that contains common base data used throughout the application. The kernel is mostly implemented using Mapper.

Immutable Domain Models

For the core domain, we are creating immutable domain models using Scala’s type system along the lines described in the excellent blog posts by Erik Rozendaal and Martin Krasser.

Consider the following business problem:

  • We need to keep track of vehicles
  • A vehicle can be Live or Expired
  • A Live vehicle can be expired
  • An Expired vehicle can be revived

The simple, and perhaps standard, way to model this in OOP is something along the lines of this:

class Vehicle {
  private var isLive: Boolean = true

  // Only a runtime check: nothing stops a caller from expiring an already expired vehicle.
  def expire(): Unit = {
    if (!isLive) throw new RuntimeException("Vehicle not live")
    isLive = false
  }

  def revive(): Unit = {
    if (isLive) throw new RuntimeException("Vehicle already live")
    isLive = true
  }
}

A better way is to use the type system to encode this constraint: make it a compile-time error to try to expire a Vehicle that is not live:

sealed trait Vehicle {
  def licensePlate: String
  def VIN: String
  // ... other common fields
}

case class LiveVehicle(licensePlate: String, VIN: String) extends Vehicle {
  def expire: ExpiredVehicle = ExpiredVehicle(licensePlate, VIN)
}

case class ExpiredVehicle(licensePlate: String, VIN: String) extends Vehicle {
  def revive: LiveVehicle = LiveVehicle(licensePlate, VIN)
}

The main benefit of this is that we end up with Intention Revealing interfaces since it is simply not possible to call e.g. revive on a LiveVehicle.

In practice, there are of course constraints that cannot be encoded in the type system. We are using Validation from the Scalaz project to capture either the DomainValidation failures or an updated aggregate. Since we use ES, we also need to capture the events that result from executing a business action. We combine both of these in a State monad similar to Martin's Update. So a domain method may end up looking like this:

sealed trait VehicleEvent extends DomainEvent
case class VehicleExpired(when: LocalDate) extends VehicleEvent

sealed trait Vehicle extends Aggregate[VehicleEvent, Vehicle] { /* ... */ }

case class LiveVehicle(/* fields elided */) extends Vehicle {
  def expire(when: LocalDate): Update[VehicleEvent, Vehicle] =
    if (okToExpireVehicle) update(VehicleExpired(when))
    else reject("error.uhhohh")
}

Each aggregate also has a handle method that applies the events passed to it. The update method applies the event (by invoking handle) to the current aggregate and returns the resulting aggregate as well as the event, all wrapped in an Update.

// Record the event and wrap the successful result.
def accept[E, A](event: E, result: A): Update[E, A] =
  Update[E, A](events => (event :: events, Success(result)))

// Apply the event to this aggregate and record it.
def update(event: VehicleEvent): Update[VehicleEvent, Vehicle] =
  accept(event, handle(event))

// Transition the aggregate according to the event.
def handle: VehicleEvent => Vehicle = {
  case VehicleExpired(at) => ExpiredVehicle(/* fields elided */)
  case _                  => this
}
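
For reference, here is a minimal sketch of what such an Update type could look like. This is not our production code, just an illustration of the idea; in particular the plain String error channel is a simplification:

import scalaz._

// Minimal sketch only: Update threads the list of recorded events together with
// a Scalaz Validation of the resulting aggregate.
case class Update[E, A](run: List[E] => (List[E], Validation[String, A])) {
  def map[B](f: A => B): Update[E, B] =
    Update(events => { val (es, v) = run(events); (es, v map f) })

  def flatMap[B](f: A => Update[E, B]): Update[E, B] =
    Update(events => run(events) match {
      case (es, Success(a)) => f(a).run(es)
      case (es, Failure(e)) => (es, Failure(e))
    })
}

// reject (used above) produces a failed Update without recording any events.
def reject[E, A](error: String): Update[E, A] =
  Update(events => (events, Failure(error)))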

Encoding our aggregate Vehicle as an ADT as shown above has a number of other benefits. When retrieving the aggregate from the repository, we get an object of type Vehicle. This is fine if we only need the functionality common to all the states a Vehicle can be in. If we need functionality specific to e.g. a LiveVehicle, we can do a pattern match. The compiler will then warn if there are cases we are not handling, which comes in handy when adding new states (i.e. new subtypes of Vehicle). This of course means we should never pattern match on “_” 🙂
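
As a small, hypothetical illustration, a helper that needs state-specific behaviour can match on the concrete subtypes. Because Vehicle is sealed and we avoid the catch-all, the compiler flags this match as non-exhaustive the moment a new subtype is added:

// Hypothetical example: handle each state of the Vehicle ADT explicitly,
// relying on the compiler's exhaustiveness check instead of a catch-all.
def statusLabel(vehicle: Vehicle): String = vehicle match {
  case _: LiveVehicle    => "Live"
  case _: ExpiredVehicle => "Expired"
}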

Tagged Types

We have encoded most fields in our domain models with unique types, unless it's clear we're just storing e.g. numbers without a unit or things like names. It's fairly easy to create ADTs in Scala.

But for types that are not really product types, it seems a bit clunky to define a case class just to hold a single value.

case class CompanyId(id: Long)
val x = CompanyId(42)
...
println("The id is %d".format(x.id))

Fortunately it's possible to avoid this using a technique called Tagged Types. Scalaz includes this in a slightly different form, so we end up doing something like this:

import scalaz._ // brings @@ and Tag into scope

trait CompanyIdTag
type CompanyId = Long @@ CompanyIdTag
def CompanyId(id: Long): CompanyId = Tag[Long, CompanyIdTag](id)
...
val x = CompanyId(42)
println("The id is %d".format(x))

We have a number of Entities maintained in Mapper. They are referenced by Longs. Aggregates are referenced by UUID. We encode all of these using tagged types so the compiler will complain when we mix e.g. CompanyIds with UserIds. As we’ll see later we also utilise this type information to drive the rendering.
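
As a contrived illustration (UserId and deactivateUser are made-up names, not from our code base), mixing up two kinds of ids is now caught at compile time:

trait UserIdTag
type UserId = Long @@ UserIdTag
def UserId(id: Long): UserId = Tag[Long, UserIdTag](id)

def deactivateUser(id: UserId): Unit = println("Deactivating user " + id)

deactivateUser(UserId(7))       // compiles
// deactivateUser(CompanyId(7)) // rejected by the compiler: a CompanyId is not a UserId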

Event Storage and Projection

To minimise the number of operational dependencies we’ve implemented a simple persistent event storage and queuing system on top of PostgreSQL. It’s loosely based on the concepts described by Greg Young in his CQRS material. Events are simply serialised using JSON.
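
The details of the store are not important for this post, but its interface boils down to something along these lines (a sketch; the method names and versioning scheme here are simplified assumptions):

import java.util.UUID

// Sketch only: an append-only event store keyed by aggregate id. Events are
// serialised to JSON before being written to a PostgreSQL table, and interested
// consumers (the projectors described below) are notified of new events.
trait EventStore {
  // Append events for an aggregate; fails if expectedVersion does not match the stored version.
  def append(aggregateId: UUID, expectedVersion: Long, events: List[DomainEvent]): Unit

  // Replay all events for an aggregate in the order they were appended.
  def load(aggregateId: UUID): List[DomainEvent]
}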

Read Models

You don't normally query your event storage to extract information for e.g. the UI and reporting. Instead, you generate a number of read models from the events. This nicely separates the updates caused by business events (which only write to the event store) from the reading needed to generate UIs and serve the various reporting needs.
This is the essence of the CQRS pattern. It both allows for scalability in read-heavy systems (it's much easier to scale the read side than the write side) and allows the read models to change independently of the write model (as long as your events contain the required information).

In our application, each of the Bounded Contexts defines an EventProjector which is responsible for handling the events relevant for generating all the read models used in that BC. An EventProjector is simply an Akka actor that responds to the events being generated and updates the corresponding read models.
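
In outline, such a projector might look like the sketch below; the class and method names are illustrative rather than taken from our actual code, and the LocalDate import assumes Joda-Time:

import akka.actor.Actor
import org.joda.time.LocalDate

// Illustrative sketch: an actor that keeps a read model in sync by reacting to
// domain events after they have been appended to the event store.
class VehicleEventProjector(readModel: VehicleReadModel) extends Actor {
  def receive = {
    case VehicleExpired(when) => readModel.markExpired(when)
    case _                    => // events this projector does not care about
  }
}

// The read-model side is just ordinary table updates (see the next section).
trait VehicleReadModel {
  def markExpired(when: LocalDate): Unit
}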

For now, each of our read-models consists of a single table. We define a case class for each table and use Squeryl to read & write the read-model. Having the model defined in code makes it very easy to generate test cases for the event projector.

Our queries are very simple since most of the information is denormalised during event-projection (there are a few exceptions where we reference entities in the Shared Kernel).
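
As a rough sketch of what this looks like (the row, table and column names below are invented for illustration), a read-model definition and a typical query could be:

import org.squeryl.Schema
import org.squeryl.PrimitiveTypeMode._

// One case class per read-model table; the projector writes these rows, the UI reads them.
case class VehicleView(id: Long, licensePlate: String, isLive: Boolean)

object ReadModelSchema extends Schema {
  val vehicleViews = table[VehicleView]("vehicle_view")
}

// Queries stay trivial because the projector has already denormalised the data.
def liveVehicles: Seq[VehicleView] = transaction {
  from(ReadModelSchema.vehicleViews)(v => where(v.isLive === true) select v).toSeq
}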

Rendering the UI

For each of the view models we create a case class representing the data to display. Using Squeryl, we can query the read-models without too much ceremony, reusing the tagged types used in the domain.

We have defined a Displayable type class to generate either a String or HTML representation of a value:

import scala.xml.{NodeSeq, Text}

trait Displayable[T] {
  def toDisplayString(in: T): String
  def toDisplayHtml(in: T): NodeSeq = Text(toDisplayString(in))
}

Since we're using tagged types for all values representing an id, we can easily display rich content even for these values. The latest SNAPSHOT versions of Lift also support type-class-based rendering when using the CSS transforms, so we can derive a CanBind instance from a Displayable:

implicit def displayableToCanBind[T: Displayable]: CanBind[T] = new CanBind[T] {
  def apply(it: => T)(ns: NodeSeq) = implicitly[Displayable[T]].toDisplayHtml(it)
}

This makes it quite easy to render a rich UI based on the type information.

case class QuoteFile(leasingCompany: LcId, file: UploadedFileId)

val qfs: Seq[QuoteFile] = ...

".quoteFileRow" #> qfs.map(qf =>
  ".lc"   #> qf.leasingCompany &
  ".file" #> qf.file)

The above code looks up the leasing company id, resolves its name and renders the name with a link to further details. It also generates a download link for the uploaded file id which, when clicked, produces an authenticated S3 URL for downloading the file.
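
To give an idea of how the id ends up as rich content, a Displayable instance for LcId could look roughly like this; the lookup function and URL scheme are made-up placeholders:

// Rough sketch only: resolve the leasing-company name for the id and render a link.
implicit val lcIdDisplayable: Displayable[LcId] = new Displayable[LcId] {
  def toDisplayString(in: LcId): String = lookupLeasingCompanyName(in) // hypothetical lookup
  override def toDisplayHtml(in: LcId): NodeSeq =
    <a href={"/leasing-companies/" + in}>{toDisplayString(in)}</a>
}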

Future improvements

We’ve been running with this new architecture in production for almost a year. During this time, we’ve added new functionality without too much fuss and the underlying structure still feels clean and solid. But there are of course always things that can be improved :-).

Since it may take some time for the read models to be updated, we do some optimistic client-side updates when initiating an action. This means the UI may come out of sync in case of errors etc. So I've toyed with the idea of making most of the UI reflect the read-model state using comet (which has excellent support in Lift!).

I've also been looking at AngularJS as a way to push some of the rendering to the client. At that point it may make sense simply to make the read models JSON files that are accessed directly by the client. But I'm still working on this one…

Martin has started the Eventsourced project for doing event sourcing with Akka. This looks interesting, but so far our homegrown solution seems to work fine for our needs. At some point we may need more features and it may make sense to look at it, but as always it's a tradeoff: it adds complexity in both code and operational dependencies.
