The Proliferation of Registries


04.13.09 Posted in Courant News, Software Development by Max

As I’ve discussed before, one of the core design tenets of Courant News was the ability for news orgs to customize and add on to our core platform without having to modify the code of the platform itself. Building a cohesive platform is one thing; letting outside code hook into it without actually modifying the platform is more difficult.

One common approach, adopted by Django’s built-in admin app as well as a number of reusable Django apps like django-tagging and django-mptt, is a registry system. I’ve been joking with one of my Courant cohorts, Robert Baskin (@rsbaskin), on Twitter about registries, and I thought it was time to let everyone else in on the discussion.

An Example

One way to manage functionality across various content types is to define the set of content types in your settings file. A recent example is django-shorturls, which lets you generate short URLs for your content by specifying the content types to expose, each with a given prefix/abbreviation.

# in settings.py
SHORTEN_MODELS = {
    'A': 'myapp.animal',
    'V': 'myapp.vegetable',
    'M': 'myapp.mineral'
}

This works wonderfully when you have full control of the codebase, but if we were to include django-shorturls in Courant itself, it would require news orgs to modify our code* to add in their own custom content types or tweak how our standard configuration works.

You can solve this problem by creating a registry system. Each model can then register itself with the URL shortening app, passing along the prefix parameter of its choice. For example:

# in myapp/models.py
from django.db import models
from courant.core.shorturls import shorturls
 
class Planet(models.Model):
    name = models.CharField(max_length=100)
    ...
shorturls.register(Planet, 'P')

The courant.core.shorturls app keeps track of all the models that register with it, and can then use this registry in its internal code. It also means that if you want to change the prefix for, say, the built-in Article model from ‘A’ to ‘S’ (for story), you could do this:

# in myapp/models.py or any other place that will get automatically run by Django, such as an __init__.py
from courant.core.news.models import Article
from courant.core.shorturls import shorturls
 
shorturls.unregister(Article)
shorturls.register(Article, 'S')

In this manner you can tweak the default configuration, while also hooking your own new content types into our functionality.
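
To make the idea concrete, here is a minimal sketch of what such a registry might look like internally. This is not Courant's actual code (the short URLs app has not been written): the ShortURLRegistry class, the exception names, and the prefix_for/model_for helpers are all assumptions made for illustration, and plain Python classes stand in for Django models.

```python
class AlreadyRegistered(Exception):
    """Raised when a model or prefix is registered a second time."""


class NotRegistered(Exception):
    """Raised when unregistering a model that was never registered."""


class ShortURLRegistry:
    """Hypothetical registry that maps URL prefixes to model classes."""

    def __init__(self):
        self._prefix_by_model = {}
        self._model_by_prefix = {}

    def register(self, model, prefix):
        # Refuse duplicates so two models can never share a prefix.
        if model in self._prefix_by_model:
            raise AlreadyRegistered("%s is already registered" % model.__name__)
        if prefix in self._model_by_prefix:
            raise AlreadyRegistered("prefix %r is already taken" % prefix)
        self._prefix_by_model[model] = prefix
        self._model_by_prefix[prefix] = model

    def unregister(self, model):
        if model not in self._prefix_by_model:
            raise NotRegistered("%s is not registered" % model.__name__)
        prefix = self._prefix_by_model.pop(model)
        del self._model_by_prefix[prefix]

    def prefix_for(self, model):
        return self._prefix_by_model[model]

    def model_for(self, prefix):
        return self._model_by_prefix[prefix]


# Module-level instance, so callers can write shorturls.register(...).
shorturls = ShortURLRegistry()


# Stand-ins for Django models, for illustration only.
class Article:
    pass


class Planet:
    pass


# The platform registers its defaults; a news org adds its own model.
shorturls.register(Article, 'A')
shorturls.register(Planet, 'P')

# The news org overrides a default prefix without touching platform code.
shorturls.unregister(Article)
shorturls.register(Article, 'S')
```

Keeping both a model-to-prefix and a prefix-to-model mapping is one possible design choice: it makes lookups cheap in either direction and lets register reject a prefix that is already in use.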

The Proliferation Problem

The short URL app above is a nice, clear example of the type of situation where registries make sense. But as I mentioned at the beginning of this post, a number of other apps use registries, both within Courant and in the general Django community. On the outside, there’s the Django admin, django-tagging, django-mptt (hierarchical relationships), and django-comment-utils. Within Courant, we also have the get tag, which I described previously, and a registry for the search system (upcoming post once a few bugs are fixed). Our Article model currently looks like this (much simplified):

# courant/core/news/models.py
from django.db import models
from courant.core.discussions.moderation import moderator, CourantModerator
from courant.core.gettag import gettag
from courant.core.search import search
 
class Article(models.Model):
    heading = ...
    ...
moderator.register(Article, CourantModerator)
gettag.register(Article, name_field='heading')
search.register(Article,
                fields=('heading', 'subheading', 'summary', 'body'),
                filter_fields=('section', 'display_type', 'status'),
                date_field='published_at',
                use_delta=True)

We haven’t actually written the short URLs app I described above, but likely will, in which case you can tack on yet another registration call there. As you can see, these calls rapidly build up for the more commonly used models, although those are admittedly a rather small percentage of all of our models.

For those who follow Rob or me on Twitter, you may have noticed us joking about meta-registries: a registry to help manage all of these individual registries. How that would actually work is up for debate, and it is really nothing more than an inside joke (though not so inside anymore).

Alternatives?

In light of this potential problem of runaway registry creation, we’ve been considering some other options. For some of the more complicated registries, like the one for search, we’d most likely be better served by going to a declarative syntax like that of models themselves. django-haystack has taken this approach, and I actually much prefer it in many respects; I just haven’t yet gotten around to building something similar to work with django-sphinx (our search tool of choice; I’ll explain that decision in the future post on search). It still requires a small registry of its own, but the registration call is handled in a separate file from the model itself, similar to how the admin system works.
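
As a rough illustration of the declarative style, the search options from the Article example could move into a class of their own. This sketch is not django-haystack's API and not something Courant has built: SearchIndex, SearchRegistry, ArticleIndex, and index_for are hypothetical names, and a plain class stands in for the Article model.

```python
class SearchIndex:
    """Hypothetical base class holding default search options."""
    fields = ()
    filter_fields = ()
    date_field = None
    use_delta = False


class SearchRegistry:
    """Hypothetical registry pairing each model with its index class."""

    def __init__(self):
        self._indexes = {}

    def register(self, model, index_class):
        self._indexes[model] = index_class

    def index_for(self, model):
        return self._indexes[model]


search = SearchRegistry()


# Stand-in for the real Article model, for illustration only.
class Article:
    pass


# The lines below would live in a separate file next to models.py,
# much like admin.py does for the admin.
class ArticleIndex(SearchIndex):
    fields = ('heading', 'subheading', 'summary', 'body')
    filter_fields = ('section', 'display_type', 'status')
    date_field = 'published_at'
    use_delta = True


search.register(Article, ArticleIndex)
```

The registration call shrinks to a single line, and the options become class attributes that can be inherited or overridden by subclassing.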

Another option could be to tack additional options onto the models’ inner Meta classes and customize the Python metaprogramming that Django does to build Python objects from your model definitions. A registration process would still be occurring behind the scenes, but you wouldn’t be required to interact with it directly. For the above Article example, it might look like this (again, the surrounding parts of the model are much simplified):

# courant/core/news/models.py
from django.db import models
from courant.core.discussions.moderation import CourantModerator
 
class Article(models.Model):
    heading = ...
    ...
 
    class Meta:
        # standard Django meta options
        ordering = ['-published_at']
 
        # custom Courant meta options
        short_url_prefix = 'A'
        get_tag = {'name_field': 'heading'}
        moderator = CourantModerator
    ...

This is maybe slightly cleaner because you don’t need all the imports at the top of the file, but I’m not sure it gains you much in the end. I personally rather like the explicitness of importing everything and manually registering it. Note that the get_tag meta option in this example uses a Python dictionary because there are actually a number of other optional parameters that you can pass to it, and you wouldn’t want to have a meta option for every single possible parameter. My biggest hesitation with going this route is that it probably involves mucking with metaprogramming and doing some behind-the-scenes magic, which I don’t really think is a favorable cost/benefit tradeoff.
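
For a sense of what that behind-the-scenes magic involves, here is a minimal sketch in plain Python, without Django. Everything in it is an assumption for illustration: CourantModelBase, the Model stand-in, and the two dictionaries acting as registries. As far as I know, Django itself rejects Meta attributes it does not recognize, so a real version would also have to strip the custom options before Django's own model metaclass processes Meta.

```python
# Hypothetical registries; a real version would use the apps' registry objects.
SHORT_URL_PREFIXES = {}  # prefix -> model class
GET_TAG_OPTIONS = {}     # model class -> dict of get tag options


class CourantModelBase(type):
    """Metaclass that registers a class based on its inner Meta options."""

    def __new__(mcs, name, bases, attrs):
        cls = super().__new__(mcs, name, bases, attrs)

        # Only look at a Meta declared directly on this class, so that
        # subclasses do not re-register their parent's options.
        meta = attrs.get('Meta')
        if meta is None:
            return cls

        prefix = getattr(meta, 'short_url_prefix', None)
        if prefix is not None:
            SHORT_URL_PREFIXES[prefix] = cls

        get_tag = getattr(meta, 'get_tag', None)
        if get_tag is not None:
            GET_TAG_OPTIONS[cls] = dict(get_tag)

        return cls


class Model(metaclass=CourantModelBase):
    """Stand-in for a model base class, for illustration only."""


class Article(Model):
    class Meta:
        short_url_prefix = 'A'
        get_tag = {'name_field': 'heading'}
```

The registration still happens, just at class-creation time and out of sight, which is exactly the tradeoff described above.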

Conclusion

I hope that explains some of my Twitter ramblings over the past couple of weeks, and gives some additional insight into how Courant will enable customization and extension of the platform without forking the code. Please post comments if you have any questions, thoughts, or opinions.

* Technically, this example resides in the settings file, which the news org has full control over anyway and thus would have no problem modifying. But I think it makes more sense to define the prefixes with the models they are related to, not hidden away in a setting.


