The Ancient Sendmail Capabilities Issue

The following was once this file: http://userweb.kernel.org/~morgan/sendmail-capabilities-war-story.html but that URL became invalid when the kernel.org site was rebuilt after a malicious break-in. I'm reproducing it here because it retains value as a cautionary tale.

A number of folk have asked about this, and so here is a summary of the sendmail+capabilities bug (of old - long since fixed) wrapped up in a discussion of a patch I submitted to the kernel reviving securebits as per-process control bits.

Here is the bugtraq posting from the sendmail folk:

For what it's worth, the explanation of the bug in the advisory is not as correct as the one I've given below, and I know for a fact that I wrote the patch that fixed the problem. But the sendmail folk were, I believe, correct that they were likely not the only application to be exposed to this issue and the safest thing was to use a new kernel. It is somewhat unfortunate that this has become known as the sendmail-capabilities bug, and you can tell from the tone of their advisory that they felt this way too... Hopefully this write-up will go some way to undo that impression.

Andrew Morton said: 

> Can you please provide us with a reprise of

> - what was the bug which caused us to cripple capability inheritance back

>   in the days of yore?  (Some sendmail thing?) 

The bug, at its heart, was a misfire of the original filesystem-capability deficient implementation. 

Working backwards from the observed bug:  Sendmail with ([e]uid=0) did setuid(something other than 0) "expecting" (no check) it to return success, and by implication drop all remnant of the process having privilege. 

With capabilities trying to implement setuid-0-ness, it was possible for an unprivileged user to block capable(CAP_SETUID) from sendmail and cause this and only this system call to fail. Since sendmail contained the (historically valid) assumption that this couldn't fail it blindly continued executing believing it had, indeed, dropped privilege when it actually hadn't even changed uid. Using this hole, an unprivileged user could gain root privilege... 
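The unchecked call described above can be contrasted with a defensive version. What follows is a minimal sketch, not sendmail's actual fix: drop_to_uid_checked() is a hypothetical helper name, and the final "try to get root back" probe is an extra check added here for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: switch to `uid` and confirm the switch really
 * happened before the caller carries on.
 * Returns 0 on success, -1 if privilege may still be held.
 */
static int drop_to_uid_checked(uid_t uid)
{
    if (setuid(uid) != 0) {
        perror("setuid");       /* the failure the old code never looked at */
        return -1;
    }
    /* Both the real and the effective uid must now be `uid`. */
    if (getuid() != uid || geteuid() != uid)
        return -1;
    /* Having left uid 0, it must no longer be possible to return to it. */
    if (uid != 0 && setuid(0) == 0)
        return -1;
    return 0;
}
```

A caller would treat a -1 return as fatal and exit rather than continue with whatever privilege it still holds.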

If that had been the whole thing it might have truly been only a sendmail bug, one that the sendmail folk very promptly fixed, end of story. What caused us to change the kernel was the observation (1) that this kind of denial of syscall-service attack was so easy for a regular user to perform. A more subtle issue (2) also existed with the set*uid() (and other) system calls which added weight to wanting to "cripple" capabilities a bit before moving forward again.

To understand (1), we need to review the capability model. You can read about it in depth in the last draft of the POSIX.1e spec, but here is a quick summary:

Capabilities come in 3 flavors (Inheritable, Permitted, and Effective) and 2 sets: process capabilities and file capabilities. I'll abbreviate them as: (pI,pP,pE) (fI,fP,fE).

A process can change its own capabilities directly using the capset() system call - but generally wrapped, as per POSIX.1e, in the API of libcap. The basic model is that a process can drop anything from any of its three p? capability sets at any time, but only add to its pI or pE set capabilities that are already present in pP.
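The "drop anything, add only from pP" rule can be written down as a small predicate. This is a toy model under stated assumptions: one bit per capability in a 64-bit word, invented names (capset_t, struct proc_caps, capset_allowed), and pE checked against the new pP. It is not the kernel's capset() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one bit per capability; not the kernel's data structure. */
typedef uint64_t capset_t;

struct proc_caps {
    capset_t inheritable;   /* pI */
    capset_t permitted;     /* pP */
    capset_t effective;     /* pE */
};

/* True if every bit raised in `a` is also raised in `b`. */
static bool is_subset(capset_t a, capset_t b)
{
    return (a & ~b) == 0;
}

/* Would an unprivileged change from `before` to `after` be allowed? */
static bool capset_allowed(struct proc_caps before, struct proc_caps after)
{
    /* pP can only shrink. */
    if (!is_subset(after.permitted, before.permitted))
        return false;
    /* pI may keep its old bits, or gain bits that are present in pP. */
    if (!is_subset(after.inheritable, before.inheritable | before.permitted))
        return false;
    /* pE may only hold bits that are present in (the new) pP. */
    return is_subset(after.effective, after.permitted);
}
```

In this model a process with pP=pE=0 can still clear any bit of pI, which is the property that matters later in the story.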

Privileged operations are permitted, capable(CAP_FOO) != 0, if pE has the CAP_FOO capability raised. Capability-aware applications can, and should, exercise privilege carefully by raising pE bits only around the critical sections of code that need privilege.

The capability rules for evolving state as processes evolve are: fork() lets each of the p? capabilities get duplicated exactly; and exec() convolutes the capabilities as follows:

   pI' = pI
   pP' = (X & fP) | (pI & fI)
   pE' = fE & pP'

where p?' signifies the post-exec() capabilities owned by the process. X was not specified by the POSIX.1e doc - it was a detail left to the discretion of each implementation [X has subsequently become cap_bset, but we will treat it as ~0 in this abbreviated history].

At a high level the pP' rule is where privilege comes from: the union of forced capabilities, fP, and optionally inheritable fI bits. 
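The exec() transformation can be sketched as a pure function. This assumes the POSIX.1e draft form of the rule, pI' = pI, pP' = (X & fP) | (pI & fI), pE' = fE & pP', which is consistent with the results quoted later in this text. The type and function names are invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t capset_t;   /* toy model: one bit per capability */

struct proc_caps { capset_t inheritable, permitted, effective; };
struct file_caps { capset_t inheritable, permitted, effective; };

/*
 * Compute the post-exec() process sets from the pre-exec() process sets `p`,
 * the file sets `f`, and the implementation-defined mask `x` (treated as ~0
 * in this history).
 */
static struct proc_caps caps_after_exec(struct proc_caps p,
                                        struct file_caps f, capset_t x)
{
    struct proc_caps next;

    next.inheritable = p.inheritable;                        /* pI' = pI */
    next.permitted = (x & f.permitted)                       /* forced   */
                   | (p.inheritable & f.inheritable);        /* optional */
    next.effective = f.effective & next.permitted;           /* pE'      */

    /* By construction, nothing is ever effective without being permitted. */
    assert((next.effective & ~next.permitted) == 0);
    return next;
}
```

The fP term grants capabilities regardless of the caller, while the fI term only passes on what the caller already carries in pI.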

The kernel's initial implementation of capabilities had no support for any of the f?s, which meant that nothing could ever get pE' != 0. For most users executing non-privileged applications this was fine: they were supposed to have pP'=pE'=0! However, legacy setuid-0 programs and processes run by root needed a mechanism to obtain pE' != 0.

To emulate the superuser, the fateful decision (mine) was that the kernel gave all users pI=~0 (that is, the potential to gain all privilege). Ignoring the special case of init (which had its capabilities specified in the kernel), the superuser concept could then be mapped by the following rules:

   if (uid==0 or file-to-exec-is-setuid0) {
     fE = fI = ~0;
     fP = 0;
   } else {
     fI = fP = fE = 0;
   }

If you follow the rules above you will see that this gives pE'=pP'=~0 to privileged programs.  What this didn't do was offer any way for non-uid=0 processes to gain pE bits - they could never be capable() of anything. Non-root users getting capabilities is a key feature of the POSIX.1e model, but there was no need to hack in such a feature because we had an active effort underway to implement filesystem capabilities leveraging Andreas Grünbacher's extended attributes [and in my kernel sandbox filesystem capabilities worked fine:   http://www.kernel.org/pub/linux/libs/security/linux-privs/old/ *-fcap  but that is getting off-track].

Getting back to (1), the problem with this simple model was that, as per POSIX.1e, capset() allowed any unprivileged (pE=0) user to drop any pI bit, and that meant that any subsequently invoked setuid-0 program could be forced to only get access to a subset of all of the capabilities: pE' = pP' = (pI & fI)...  This is how an unprivileged user could invoke setuid-0 sendmail and have its privilege dropping setuid() call fail.
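The attack can be replayed in a toy model of the original mapping. The sketch assumes the exec() rule pP' = (X & fP) | (pI & fI), pE' = fE & pP' with X = ~0, and uses bit 7 for CAP_SETUID (its number in linux/capability.h). The MODEL_ names and the function name are invented here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t capset_t;   /* toy model: one bit per capability */

#define MODEL_ALL_CAPS   (~(capset_t)0)
#define MODEL_CAP_SETUID ((capset_t)1 << 7)

/* pE' after exec() under the ORIGINAL superuser emulation. */
static capset_t old_effective_after_exec(capset_t pI, bool uid0_or_setuid0)
{
    capset_t fI = uid0_or_setuid0 ? MODEL_ALL_CAPS : 0;
    capset_t fE = fI;
    capset_t fP = 0;
    capset_t pP = (MODEL_ALL_CAPS & fP) | (pI & fI);   /* X taken as ~0 */

    return fE & pP;
}

/* Does a setuid-0 program started with this pI get CAP_SETUID? */
static bool old_setuid0_can_setuid(capset_t pI)
{
    return (old_effective_after_exec(pI, true) & MODEL_CAP_SETUID) != 0;
}
```

With pI = ~0 the setuid-0 program is fully capable. If the invoking user first clears bit 7 from pI, the same program runs without CAP_SETUID and its setuid() call fails.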

One could argue that clearly every legacy privileged program had the same potential problem, and it was judged (I particularly recall Ted Ts'o being a strong advocate) that we should avoid this ASAP. It turned out that with a simple change - a few lines of kernel code - one could eradicate this issue, and this was the change that "crippled capabilities".

The change was as follows.  Init would give all processes pI=0 (that is *no* inheritable potential for gaining any privilege, and nothing for an unprivileged user to drop). And the rules for uid-0 exec() mapping would be:

   if (uid==0 or file-to-exec-is-setuid0) {
     fE = fP = ~0;
     fI = 0;
   } else {
     fI = fP = fE = 0;
   }

If you follow the rules above you will see that this gives pE'=pP'=~0 to privileged programs just like the old mapping, but has no mechanism for an unprivileged user to influence it.
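The same toy model shows why the revised mapping is immune. Again this assumes pP' = (X & fP) | (pI & fI), pE' = fE & pP' with X = ~0; the names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t capset_t;   /* toy model: one bit per capability */

#define MODEL_ALL_CAPS (~(capset_t)0)

/* pE' after exec() under the REVISED ("crippled") superuser emulation. */
static capset_t new_effective_after_exec(capset_t pI, bool uid0_or_setuid0)
{
    capset_t fP = uid0_or_setuid0 ? MODEL_ALL_CAPS : 0;
    capset_t fE = fP;
    capset_t fI = 0;
    /* Because fI is 0, the (pI & fI) term vanishes whatever pI holds. */
    capset_t pP = (MODEL_ALL_CAPS & fP) | (pI & fI);   /* X taken as ~0 */

    return fE & pP;
}
```

Privilege now arrives only through the forced fP term, so nothing the invoking user does to pI changes what a setuid-0 program receives.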

The more subtle issue (2) was that setuid(), and all of its very many variants, have very strange semantics with respect to privilege. Nominally, for transitions not involving *uid=0, this class of system call simply changes the *uid of the current process or fails to. Its return value is a clear indication of whether the transition succeeded or not. If a transition involves *uid=0 then all sorts of subtle and hard to follow other things also occur...

With respect to capabilities, this system call can only return one status: success or not. For a process that is capable(CAP_SETUID), how does the calling application differentiate between 'just changed uid' and 'just changed uid, and dropped all privilege'? In a pure capability mode you want the former and in a legacy mode you want the latter. But in neither mode is any status value from the system call available to tell you that you got what you wanted.  By suppressing the sendmail class of bugs we put off this subtle issue for another day... And I was still happy because filesystem capabilities worked fine in my sandbox, etc.  Hopefully that explains what 'the sendmail bug' was.

> - Why was that security hole considered unfixable?

The lesson of the above is that taking privilege away can cause subtle and problematic things to happen with legacy privileged applications. The capability API has a way for capability-aware apps to understand they don't have enough privilege, but this is not true of legacy applications. Where such a power has been added to the kernel, it has since generally required privilege to effect - lest we inadvertently re-enable 'the sendmail bug'.

> - How does this change avoid reintroducing that hole?

After a lot of half-baked hacking around with capabilities, extended attributes made it into the kernel, and then, courtesy of Serge Hallyn, we now have filesystem capability support in Linux 2.6!  The patch under discussion requires privilege to suppress legacy support. It is certainly important to know what you are doing when you use it, but given that it requires privilege to enable, it does not reintroduce the "sendmail bug".
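The privileged switch in question is the prctl(PR_SET_SECUREBITS, 0x2f) call that appears in the quoted patch description further down. As a sketch of what that magic number means, the value can be rebuilt from individual flags, assuming the bit layout found in current <linux/securebits.h>. The SB_ names and the function are local stand-ins for the kernel's SECBIT_ constants, redefined so the example is self-contained.

```c
#include <assert.h>

/* Local copies of the securebits flag values (assumed layout, see above). */
enum {
    SB_NOROOT                 = 1 << 0,  /* uid 0 gains no caps on exec()   */
    SB_NOROOT_LOCKED          = 1 << 1,
    SB_NO_SETUID_FIXUP        = 1 << 2,  /* set*uid() leaves the cap sets   */
    SB_NO_SETUID_FIXUP_LOCKED = 1 << 3,
    SB_KEEP_CAPS              = 1 << 4,  /* left clear in 0x2f              */
    SB_KEEP_CAPS_LOCKED       = 1 << 5
};

/* The combination that switches off uid-0 privilege and locks that choice. */
static unsigned legacy_off_securebits(void)
{
    return SB_NOROOT | SB_NOROOT_LOCKED |
           SB_NO_SETUID_FIXUP | SB_NO_SETUID_FIXUP_LOCKED |
           SB_KEEP_CAPS_LOCKED;          /* KEEP_CAPS itself stays off */
}
```

Each _LOCKED bit makes the neighbouring setting irreversible for the process and its descendants, which is why the call needs CAP_SETPCAP in the first place.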

>> Filesystem capability support makes it possible to do away with

>> (set)uid-0 based privilege and use capabilities instead. That is, with

>> filesystem support for capabilities but without this present patch,

>> it is (conceptually) possible to manage a system with capabilities

>> alone and never need to obtain privilege via (set)uid-0.

>>

>> Of course, conceptually isn't quite the same as currently possible

>> since few user applications, certainly not enough to run a viable

>> system, are currently prepared to leverage capabilities to exercise

>> privilege. Further, many applications exist that may never get

>> upgraded in this way, and the kernel will continue to want to support

>> their setuid-0 base privilege needs.

>

> Are you saying that plain old setuid(0) apps will fail to work with

> CONFIG_SECURITY_FILE_CAPABILITIES=y?

No.  I'm basically interested in evolving the capability implementation back to the POSIX.1e model and making it whole - but most certainly without crippling legacy superuser support in the process.  As folk get more comfortable with this full capability model, I believe we can delete more cruft from the main kernel, but even that cleanup will leave a fully functional legacy model in place. I feel it should be for something like init, or one of its children, to be able to run subsystems in capability-only or legacy modes.

>> Where pure-capability applications evolve and replace setuid-0

>> binaries, it is desirable that there be mechanisms by which they

>> can contain their privilege. In addition to leveraging the per-process

>> bounding and inheritable sets, this should include suppressing the

>> privilege of the uid-0 superuser from the process' tree of children.

>>

>> The feature added by this patch can be leveraged to suppress the

>> privilege associated with (set)uid-0. This suppression requires

>> CAP_SETPCAP to initiate, and only immediately affects the 'current'

>> process (it is inherited through fork()/exec()). This

>> reimplementation differs significantly from the historical support for

>> securebits which was system-wide, unwieldy and which has ultimately

>> withered to a dead relic in the source of the modern kernel.

>> 

>> With this patch applied a process, that is capable(CAP_SETPCAP), can

>> now drop all legacy privilege (through uid=0) for itself and all

>> subsequently fork()'d/exec()'d children with:

>> 

>>   prctl(PR_SET_SECUREBITS, 0x2f);

>>

>> Applying the following patch to progs/capsh.c from libcap-2.05

>> adds support for this new prctl interface to capsh.c:
>>
>> ...

>>

>> Acked-by: Serge Hallyn

>

> Really?  I'd feel a lot more comfortable if yesterday's version 1 had led  

Yep. Really: [..Quoting Serge..] 

> > Cool, I'd certainly say it's ready, and please feel free to add

> > Acked-by: Serge Hallyn

> > at this point.  Of course Andrew Morton's fears are not uncalled for,

> > the whole trouble with subtle security interactions is that they're

> > subtle  :)  and easy to miss.  For what it's worth I'll run a few ltp

> > tests against your next version and try to verify that there are no

> > changes, and no changes when composed with selinux. [..]

> to a stream of comments from suitably-knowledgeable kernel developers which

> indicated that those developers had scrutinised this code from every

> conceivable angle and had declared themselves 100% happy with it.

FWIW I've submitted this patch as an RFC about 2 or 3 times over the last few months to LSM with very good feedback at each turn. My goal was to make it 100% backwardly compatible with existing code - and where possible to absorb all of the capability relevant code into the capability LSM.

> Maybe I'm over-reacting here.  Feel free to tell me if I am :) But as I

My sense is that this is a necessary feature to make file capabilities a viable implementation (whole and free from all that setuid fixup cruft), but in an evolutionary way and not the failed revolution that global securebits offered. 

Finally, I completely agree that the whole filesystem capability model needs extensive review and the benefit of practical use to establish its viability. So, more eyes and more use are needed before we drop its EXPERIMENTAL config label.

> told you outside the bathroom today: I _really_ don't want to read about

> this patch on bugtraq two years hence.  

Me neither.

Cheers

Andrew G. Morgan <morgan@kernel.org>

2008/02/02