Latest Tweets

 

Modelling Yokes in the App Engine datastore

In my first technical post about Yoke and I thought I’d describe how I tackled building a scalable system for linking (yoking) users and items using App Engine.

Links between Users and Items are called Yokes, they are first class objects and are stored in the datastore as entities. This allows them to contain indexable information about how items relate to users.

In a SQL database this would be modelled naturally with two one-to-many relationships from Users to Yokes, and from Items to Yokes. The equivalent data model in App Engine involves adding two key properties to Yoke entities that associate each of them with a User and an Item. In App Engine there is the added consideration of entity groups. Entity groups are the unit of transactionality in App Engine and have a heavily constrained write rate as a result.

Since any given Item or User may be yoked extremely frequently, this has the consequence that Yoke entities must be root entities and cannot therefore partake in a transaction that also modifies the properties of an Item or User. But why would a datastore transaction which, say, creates a Yoke need to modify properties on the User or the Item or some other entity?

App Engine lacks the familiar SQL aggregation operations such as COUNT, SUM and AVG, so answering queries such as “how many times has this item been yoked?” requires that the aggregation be performed by the application by maintaining a counter in the datastore, and this necessarily involves at least one other entity. To also answer queries like “how many times has this user been yoked?” naturally requires two counters.

On what entities should we store our counters? Naively, we might think that the User and Item entities would be good choices, but this is not the case. To see why, it’s necessary to look at the distribution of yokes. Specifically, some items/users will be extremely popular and will be yoked very frequently. This is a problem if we store counters as part of the User/Item entities because those entities will become heavily contended.

To overcome this we need a new kind of entity, which the Yoke platform calls a Standing. Standings aggregate information about Yokes, for both Users and Items. So when the Yoke platform records a new Yoke it actually needs to create or update three different entities: the Yoke, the User’s Standing and the Item’s Standing. The process by which this occurs is necessarily quite complicated on App Engine because:

  1. In the HR implementation of the datastore, eventual consistency makes it difficult to avoid creating the same entities twice.
  2. A failure might occur at any time during (or between) any operation on any entity.
  3. The Yoke server application needs to stay responsive, even in the case of high datastore latency.

To help meet these requirements, Yoke transactionally enqueues a task when it creates (or deletes) a Yoke entity. The task wraps-up the work of updating the Standings. This means that the Yoke is created (or deleted) immediately, but the work of reconciling the Standings is deferred (usually only briefly), can take as long as necessary (up to 30mins), and will be automatically retried in the event of an error. These are standard characteristics of tasks in App Engine, unfortunately tasks also introduce a couple more complications:

  1. Tasks can execute in any order, so the code that deals with standings needs to handle out-of-order updates (eg. a count might temporarily be negative).
  2. The same task may occasionally fire multiple times, so it must be possible to identify that the task has already run (even if only partially completed).

To handle this, each task that adjusts a Standing is allocated an ID. At each step in the process of adjusting the Standing (including the first step which involves adding or deleting the Yoke) a temporary Marker is created containing the same ID. Before the task starts its work, it attempts to get each possible Marker from the datastore and, based on which Markers are present, it resumes its work at a particular step. After the task has finished its work, all the Marker entities are deleted.

Since it’s a possibility (though only a remote one) that a task may fail to clean up all of its Markers, a cron job sweeps through periodically and deletes any Markers it identifies as being stale. The only loose-end this leaves is the possibility that two tasks will ‘race’ to create the same Standing entity (creation of Standings is deferred since many Items and Users will never be yoked). This is handled by a separate mechanism that I may describe in a later post.

The absence of unrestricted transactions in the App Engine datastore does significantly increase the work needed to perform operations that one would consider to be basic in SQL databases or their equivalents, but any given approach (there is more than possible approach here) can be generalized and reused. So perhaps its best to think of App Engine as providing a lower-level persistence service that can then be tailored to meet specific needs.