Make Delegation Work in Python

The Delegation Problem

Let’s look at a problem that all coders have faced; something that I call the delegation problem. To explain, I’ll use an example. Here’s an example class you might see in a content management system:

class WebPage:
    def __init__(self, title, category="General", date=None, author="Jeremy"):
        self.title,self.category,self.author = title,category,author
        self.date = date or datetime.now()
        ...

Then, you want to add a subclass for certain types of page, such as a product page. It should have all the details of WebPage, plus some extra stuff. One way to do it would be with inheritance, like this:

class ProductPage(WebPage):
    def __init__(self, title, price, cost, category="General", date=None, author="Jeremy"):
        super().__init__(title, category=category, date=date, author=author)
        ...

But now we’re violating the Don’t Repeat Yourself (DRY) principal. We’ve duplicated both our list of parameter names, and the defaults. So later on, we might decide to change the default author to “Rachel”, so we change the definition in WebPage.__init__. But we forget to do the same in ProductPage, and now we have a bug 🐛! (When writing the fastai deep learning library I’ve created bugs many times in this way, and sometimes they’ve been extremely hard to track down, because differences in deep learning hyper-parameters can have very subtle and hard to test or detect implications.)

To avoid this, perhaps we could instead write it this way:

class ProductPage(WebPage):
    def __init__(self, title, price, cost, **kwargs):
        super().__init__(title, **kwargs)
        ...

The key to this approach is the use of **kwargs. In python **kwargs in a parameter like means “put any additional keyword arguments into a dict called kwarg. And **kwargs in an argument list means “insert all key/value pairs in the kwargs dict as named arguments here”. This approach is used in many popular libraries, such as matplotlib, in which the main plot function simply has the signature plot(*args, **kwargs). The plot documentation says “The kwargs are Line2D properties” and then lists those properties.

It’s not just python that uses this approach. For instance, in the R language the equivalent to **args is simply written ... (an ellipsis). The R documentation explains: “Another frequent requirement is to allow one function to pass on argument settings to another. For example many graphics functions use the function par() and functions like plot() allow the user to pass on graphical parameters to par() to control the graphical output.This can be done by including an extra argument, literally ‘…’, of the function, which may then be passed on”.

For more details on using **kwargs in python, Google will find you many nice tutorials, such as this one. The **kwargs solution appears to work quite nicely:

p = ProductPage('Soap', 15.0, 10.50, category='Bathroom', author="Sylvain")
p.author
'Sylvain'

However, this makes our API quite difficult to work with, because now the environment we’re using for editing our Python code (examples in this article assume we’re using Jupyter Notebook) doesn’t know what parameters are available, so things like tab-completion of parameter names and popup lists of signatures won’t work 😢. In addition, if we’re using an automatic tool for generating API documentation (such as fastai’s show_doc or Sphinx), our docs won’t include the full list of parameters, and we’ll need to manually add information about these delegated parameters (i.e. category, date, and author, in this case). In fact, we’ve seen this already, in matplotlib’s documentation for plot.

Another alternative is to avoid inheritance, and instead use composition, like so:

class ProductPage:
    def __init__(self, page, price, cost):
        self.page,self.price,self.cost = page,price,cost
        ...

p = ProductPage(WebPage('Soap', category='Bathroom', author="Sylvain"), 15.0, 10.50)
p.page.author
'Sylvain'

This has a new problem, however, which is that the most basic attributes are now hidden underneath p.page, which is not a great experience for our class users (and the constructor is now rather clunky compared to our inheritance version).

Solving the problem with delegated inheritance

The solution to this that I’ve recently come up with is to create a decorator that is used like this:

@delegates()
class ProductPage(WebPage):
    def __init__(self, title, price, cost, **kwargs):
        super().__init__(title, **kwargs)
        ...

…which looks exactly like what we had before for our inheritance version with kwargs, but has this key difference:

print(inspect.signature(ProductPage))
(title, price, cost, category='General', date=None, author='Jeremy')

It turns out that this approach, which I call delegated inheritance, solves all of our problems; in Jupyter if I hit the standard “show parameters” key Shift-Tab while instantiating a ProductPage, I see the full list of parameters, including those from WebPage. And hitting Tab will show me a completion list including the WebPage parameters. In addition, documentation tools see the full, correct signature, including the WebPage parameter details.

To decorate delegating functions instead of class __init__ we use much the same syntax. The only difference is that we now need to pass the function we’re delegating to:

def create_web_page(title, category="General", date=None, author="Jeremy"):
    ...

@delegates(create_web_page)
def create_product_page(title, price, cost, **kwargs):
    ...

print(inspect.signature(create_product_page))
(title, price, cost, category='General', date=None, author='Jeremy')

I really can’t overstate how significant this little decorator is to my coding practice. In early versions of fastai we used kwargs frequently for delegation, because we wanted to ensure my code was as simple as possible to write (otherwise I tend to make a lot of mistakes!) We used it not just for delegating __init__ to the parent, but also for standard functions, similar to how it’s used in matplotlib’s plot function. However, as fastai got more popular, I heard more and more feedback along the lines of “I love everything about fastai, except I hate dealing with kwargs”! And I totally empathized; indeed, dealing with ... in R APIs and kwargs in Python APIs has been a regular pain-point for me too. But here I was, inflicting it on my users! 😯

I am, of course, not the first person to have dealt with this. The Use and Abuse of Keyword Arguments in Python is a thoughtful article which concludes “So it’s readability vs extensibility. I tend to argue for readability over extensibility, and that’s what I’ll do here: for the love of whatever deity/ies you believe in, use **kwargs sparingly and document their use when you do”. This is what we ended up doing in fastai too. Last year Sylvain spent a pleasant (!) afternoon removing every kwargs he could and replaced it with explicit parameter lists. And of course now we get the occassional bug resulting from one of us failing to update parameter defaults in all functions…

But now that’s all in the past. We can use **kwargs again, and have the simpler and more reliable code thanks to DRY, and also a great experience for developers. 🎈 And the basic functionality of delegates() is just a few lines of code (source at bottom of article).

Solving the problem with delegated composition

For an alternative solution, let’s look again at the composition approach:

class ProductPage:
    def __init__(self, page, price, cost): self.page,self.price,self.cost = page,price,cost

page = WebPage('Soap', category='Bathroom', author="Sylvain")
p = ProductPage(page, 15.0, 10.50)
p.page.author
'Sylvain'

How do we make it so we can just write p.author, instead of p.page.author. It turns out that Python has a great solution to this: just override __getattr__, which is called automatically any time an unknown attribute is requested:

class ProductPage:
    def __init__(self, page, price, cost): self.page,self.price,self.cost = page,price,cost
    def __getattr__(self, k): return getattr(self.page,k)

p = ProductPage(page, 15.0, 10.50)
p.author
'Sylvain'

That’s a good start. But we have a couple of problems. The first is that we’ve lost our tab-completion again… But we can fix it! Python calls __dir__ to figure out what attributes are provided by an object, so we can override it and list the attributes in self.page as well.

The second problem is that we often want to control which attributes are forwarded to the composed object. Having anything and everything forwarded could lead to unexpected bugs. So we should consider providing a list of forwarded attributes, and use that in both __getattr__ and __dir__.

I’ve created a simple base class called GetAttr that fixes both of these issues. You just have to set the default property in your object to the attribute you wish to delegate to, and everything else is automatic! You can also optionally set the _xtra attribute to a string list containing the names of attributes you wish to forward (it defaults to every attribute in default, except those whose name starts with _).

class ProductPage(GetAttr):
    def __init__(self, page, price, cost):
        self.page,self.price,self.cost = page,price,cost
        self.default = page

p = ProductPage(page, 15.0, 10.50)
p.author
'Sylvain'

Here’s the attributes you’ll see in tab completion:

[o for o in dir(p) if not o.startswith('_')]
['author', 'category', 'cost', 'date', 'default', 'page', 'price', 'title']

So now we have two really nice ways of handling delegation; which you choose will depend on the details of the problem you’re solving. If you’ll be using the composed object in a few different places, the composition approach will probably be best. If you’re adding some functionality to an existing class, delegated inheritance might result in a cleaner API for your class users.

See the end of this post for the source of GetAttr and delegates.

Making delegation work for you

Now that you have this tool in your toolbox, how are you going to use it?

I’ve recently started using it in many of my classes and functions. Most of my classes are building on the functionality of other classes, either my own, or from another module, so I often use composition or inheritance. When I do so, I normally like to make available the full functionality of the original class available too. By using GetAttr and delegates I don’t need to make any compromises between maintainability, readability, and usability!

I’d love to hear if you try this, whether you find it helpful or not. I’d also be interested in hearing about other ways that people are solving the delegation problem. The best way to reach me is to mention me on Twitter, where I’m @jeremyphoward.

A brief note on coding style

PEP 8 shows the “coding conventions for the Python code comprising the standard library in the main Python distribution”. They are also widely used in many other Python projects. I do not use PEP 8 for data science work, or for teaching more generally, since the goals and context are very different to the goals and context of the Python standard library (and PEP 8’s very first point is “A Foolish Consistency is the Hobgoblin of Little Minds”. Generally my code tends to follow the fastai style guide, which was designed for data science and teaching. So please:

Source code

Here’s the delegates() function; just copy it somewhere and use it… I don’t know that it’s worth creating a pip package for:

def delegates(to=None, keep=False):
    "Decorator: replace `**kwargs` in signature with params from `to`"
    def _f(f):
        if to is None: to_f,from_f = f.__base__.__init__,f.__init__
        else:          to_f,from_f = to,f
        sig = inspect.signature(from_f)
        sigd = dict(sig.parameters)
        k = sigd.pop('kwargs')
        s2 = {k:v for k,v in inspect.signature(to_f).parameters.items()
              if v.default != inspect.Parameter.empty and k not in sigd}
        sigd.update(s2)
        if keep: sigd['kwargs'] = k
        from_f.__signature__ = sig.replace(parameters=sigd.values())
        return f
    return _f

And here’s GetAttr. As you can see, there’s not much to it!

def custom_dir(c, add): return dir(type(c)) + list(c.__dict__.keys()) + add

class GetAttr:
    "Base class for attr accesses in `self._xtra` passed down to `self.default`"
    @property
    def _xtra(self): return [o for o in dir(self.default) if not o.startswith('_')]
    def __getattr__(self,k):
        if k in self._xtra: return getattr(self.default, k)
        raise AttributeError(k)
    def __dir__(self): return custom_dir(self, self._xtra)