Object-capability Model
The object-capability model is a really interesting model of programming that I don’t think is well taught, and it’s certainly not well practiced, or even encouraged, by most modern programming languages.
Think about the concrete actions that your application can take: making a
web request, reading/
Imagine trying to track down all the places in a code-base where these
actions could be taken, because you’re trying to trace a bug or you’re
trying to see what actions a library call might make. In most modern
programming languages this is really hard. Let’s pick as an example just
making web requests. You might start by searching for all references to
HttpClient (or similar class in your preferred language). But what if it
uses a different HTTP primitive, like HttpWebRequest? Or what if the usage
of the primitive is
hidden inside one of the libraries you import, like a standard client
library for a particular service, and you don’t have the source code
available? It’s basically impossible to find everything, even for something
this important and this ubiquitous.
The problem is that the “make a web request” capability is available
statically to every part of your program, or any of its dependencies, via
public classes like HttpClient. If I’m writing a piece of code
that needs to make a web request, nothing can stop me from doing it. I
don’t need to be given permission/
The way that the object-capability model handles this is to dictate how parts of a program can access each capability, and you can use this to track usage of a capability across your program. The object-capability model defines four ways that a part of a program can access a capability:
- Initial conditions: The runtime of your program may make certain capabilities available to anybody/everybody. This can be a global capability like most language primitives, or the availability can be limited to only parts of the program by classes marked as protected, private or internal.
- Parenthood: If an object A creates another object B, then A usually gets a reference to B automatically. This means that A can likely access any capability that B can access.
- Endowment: If an object A creates another object B, then A can pass B any/all references it wishes, e.g. via the arguments of the constructor.
- Introduction: If an object A has a reference to other objects B and C, then A can give B a reference to C via one of B’s methods.
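To make the four mechanisms a little more concrete, here is a minimal sketch. It is written in Python rather than C# purely for brevity, and the Logger and Worker classes are hypothetical names invented for illustration, not part of any library.

```python
class Logger:
    """A hypothetical capability: the ability to record messages."""

    def __init__(self):
        self.lines = []

    def write(self, message):
        self.lines.append(message)


class Worker:
    """Holds only the capabilities it has been handed."""

    def __init__(self, logger=None):
        # Endowment: the creator chose to pass a logger at construction time.
        self.logger = logger

    def set_logger(self, logger):
        # Introduction: someone who already holds a logger shares it later.
        self.logger = logger

    def work(self):
        if self.logger is None:
            return False  # no capability, so no way to log
        self.logger.write("did some work")
        return True


# Initial conditions: names like `print` and `open` are available to all
# code without being passed in by anyone.
initially_available = callable(print) and callable(open)

# Parenthood: this code creates the logger, so it holds the first reference.
logger = Logger()

# Endowment: the worker receives the logger through its constructor.
endowed = Worker(logger)

# Introduction: a worker created without a logger is given one later by
# code that holds references to both.
introduced = Worker()
before = introduced.work()
introduced.set_logger(logger)
after = introduced.work()
endowed.work()
```

The worker that was never given a logger simply cannot log until some holder of the logger introduces it, which is the property that makes capability flow traceable.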
The reason why it’s so hard to track the “make a web request” capability in
most modern programming languages is that it’s widely available through
initial conditions—
Imagine instead if the only place that you were able to call the
HttpClient constructor was in Program.cs, and you weren’t able to create
subclasses. Basically, imagine if we limited the availability of
HttpClient via initial conditions to a single file.
If any other part of our program wants to use HttpClient, it would need to
use some other mechanism for transferring capabilities: that code would
essentially need to request permission for one via a constructor or method
argument. We could track the flow of HttpClients via constructor and method
signatures, knowing that they can only possibly originate in Program.cs.
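Here is a rough sketch of what that flow could look like. It is in Python for brevity, and HttpFetcher, WeatherService, ReportFormatter and the weather.example URL are all hypothetical stand-ins. Python does nothing to enforce the "only constructed in one place" rule, so in this sketch it holds by convention only.

```python
class HttpFetcher:
    """Hypothetical stand-in for HttpClient; a real one would do network I/O."""

    def __init__(self):
        self.requested = []

    def get(self, url):
        # Record the URL instead of touching the network.
        self.requested.append(url)
        return f"response from {url}"


class WeatherService:
    # The constructor signature shows that this class can make web requests.
    def __init__(self, http):
        self._http = http

    def forecast(self, city):
        return self._http.get(f"https://weather.example/{city}")


class ReportFormatter:
    # No fetcher in the signature, so this class was never handed the
    # capability to make web requests.
    def format(self, text):
        return text.upper()


def main():
    # The only place in this sketch where the capability is created,
    # playing the role of Program.cs.
    http = HttpFetcher()
    weather = WeatherService(http)
    formatter = ReportFormatter()
    report = formatter.format(weather.forecast("oslo"))
    return http, report
```

Reading the constructor signatures is enough to see that WeatherService may touch the network and ReportFormatter was given no way to.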
This is very similar to how dependency injection and inversion of control
work in object-oriented programming languages, but we usually only use
these techniques for our own code—
Another key point in the object-capability model is that references to
objects should be unforgeable. Unless an object is given a reference
through one of the four means above—
Most programming languages don’t allow you to forge object references. Some
do, like how in C you can cast an arbitrary integer to a pointer to obtain
a reference to an object/
But not all capabilities are objects in your runtime, and not all references are object references. Think about access to a database or a web service endpoint, which usually have a connection string or a URL. This is a reference, but it’s just an ordinary string.
Your appsettings.json or web.config file is an initial condition for
obtaining such a reference to a database or web service endpoint, and
usually access to these values is limited to Startup.cs so you can track
the flow of these references via constructor and method arguments. From the
perspective of the object-capability model, this is great news.
Unfortunately they can also be hard-coded into the program source anywhere,
or reconstructed from a template and environment name, etc. These
references are forgeable—
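A small sketch of the difference between a string reference and an object reference, in Python for brevity. The Database class, the read_settings helper and the connection string values are hypothetical, made up for illustration.

```python
class Database:
    """Hypothetical stand-in for a database handle."""

    def __init__(self, connection_string):
        self.connection_string = connection_string


def read_settings():
    # Stands in for reading appsettings.json at startup (made-up value).
    return {"db": "Server=db.prod.example;Database=orders"}


def well_behaved(connection_string):
    # Receives the reference through an argument, so its access is traceable.
    return Database(connection_string)


def badly_behaved():
    # Forges an equal reference from a template and an environment name,
    # without ever being handed the configured value.
    environment = "prod"
    return Database(f"Server=db.{environment}.example;Database=orders")


granted = well_behaved(read_settings()["db"])
forged = badly_behaved()

# The two strings are indistinguishable, so the forged one works just as well.
same_target = granted.connection_string == forged.connection_string

# Object references differ: a newly created object is never the same object
# as an existing one, so it cannot be rebuilt from its description.
token = object()
cannot_be_rebuilt = object() is not token
```

Nothing in the signature of badly_behaved hints that it reaches the database, which is exactly why string references are harder to track than object references.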
Why does this all matter? So what if some arbitrary part of my code-base can make a web request without my knowledge? What’s the big deal? Well, for one, this could be a security problem: a library you depend on could suddenly start exfiltrating data, because it has full access to your filesystem and the internet. This isn’t a hypothetical threat either; something just like this happened with the event-stream npm package.
It also makes it really hard to reason about program behaviour when you have no way of knowing which parts of your program have access to which capabilities. If your program has a bug and you trace the cause to an unwanted call to another service or local process, and there’s no limit to which parts of your code-base could be making that call, it’s a lot harder to track down the source of the problem.
The object-capability model addresses these problems by limiting the ways that capabilities can be passed around your application to one of four methods: initial conditions, parenthood, endowment, and introduction. By further restricting which capabilities are freely available across our entire codebase via initial conditions, we can restrict which pieces of code can access each capability, track how the capabilities are passed from one class to another via constructor or method arguments, and have a better understanding of what our program is doing and where.