We present a new type of attack against Intel SGX, which we call transient snapshot attacks. To the best of our knowledge, this type of attack has not been reported before. It affects every version of the Intel SGX SDK since v1.7, released in December 2016. That release introduced the Intel SGX Protected File System Library (SGX-PFS), which is vulnerable to the new attack. SGX-PFS secures file I/O for SGX applications and is popular among SGX users. Given how long this feature has existed and how widely it is used, the vulnerability may affect a large portion of SGX SDK users.
SGX-PFS provides a protected file API for SGX enclaves. The API resembles its libc counterpart (fopen, fread, fwrite, etc.) but adds the protection that SGX enclaves require: protected files are encrypted before they are saved to the untrusted disk during a write operation, and they are decrypted and verified for integrity during a read operation. SGX-PFS also offers freshness in the sense that a protected file cannot be partially rolled back to an older version. A protected file is also guaranteed to remain consistent across crashes.
SGX-PFS is known to have some security limitations. For example, protection against swapping attacks (swapping two protected files of the same name) is a documented non-objective. Another example is side-channel attacks (using side channels such as file sizes and file offsets). These are known issues that are well documented by Intel and well understood by end users. However, the threat we are about to describe, the transient snapshot attack, is new and has not been documented by Intel.
For our discussion, we define a snapshot as the persistent state of a system at a point in time. For example, the snapshots of a Docker container are the file changes in the container saved with "docker commit". An enclave can also have snapshots. If an enclave uses SGX-PFS protected files to persist its state, then these protected files constitute the snapshot of the enclave.
With snapshots defined, we can now explain transient snapshots. Transient snapshots of an enclave are those that the application logic inside the enclave does not expect. Because their content is unexpected, transient snapshots may cause unexpected behaviors of the enclave. In the worst case, unexpected behaviors can lead to security loopholes (as we will show in the attack demo).
Exploiting transient snapshots to attack enclaves is possible due to three factors.
1. Transient snapshots can be generated by enclaves. To maximize performance, the storage I/O stack, whether in normal or SGX environments, generally gives users few guarantees about the ordering and timing of writes. For example, the I/O elevator in an OS kernel (including a library OS for SGX) may reorder or merge writes to block devices. As another example, according to the POSIX standard, file writes are not guaranteed to persist on disk until the fsync or fdatasync system call is invoked. Consequently, the storage I/O stack may produce intermediate on-disk states that are neither visible to nor expected by users, or in our words, transient snapshots. This conclusion also applies to SGX-PFS.
2. Transient snapshots can be captured by the adversary. Transient snapshots are not a security issue in normal environments simply because they are usually not accessible to adversaries. But in SGX environments, the adversary can control everything beyond enclaves, including monitoring every I/O operation and saving every bit of on-disk data. Thus, the adversary can capture the snapshots of an enclave at an arbitrary time, including the snapshots of every protected file of SGX-PFS.
3. Transient snapshots can be replayed by the adversary. After capturing a snapshot of an enclave, the adversary can start new instances of the enclave with that snapshot. For SGX-PFS, different snapshots of a protected file share the same root key; the enclave cannot differentiate between them. So snapshots of protected files can be replayed.
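The first factor can be illustrated with a toy model. The sketch below is not SGX-PFS code and does not model any real kernel or I/O scheduler; it assumes only that pending writes to distinct blocks may reach the disk in any order when no fsync barrier separates them. All function names are hypothetical. It enumerates the on-disk states an observer could see and subtracts those that program order would produce, leaving the states the application never expected.

```python
from itertools import permutations


def _freeze(disk):
    """Turn a block->value mapping into a hashable, order-independent state."""
    return tuple(sorted(disk.items()))


def program_order_states(initial, writes):
    """On-disk states the application expects: writes land in issue order."""
    disk = dict(initial)
    states = {_freeze(disk)}
    for block, value in writes:
        disk[block] = value
        states.add(_freeze(disk))
    return states


def reachable_states(initial, writes):
    """On-disk states an observer may see when pending writes to distinct
    blocks can be flushed in any order (assumption: no barrier between them)."""
    states = set()
    for order in permutations(writes):
        disk = dict(initial)
        states.add(_freeze(disk))
        for block, value in order:
            disk[block] = value
            states.add(_freeze(disk))
    return states


def transient_states(initial, writes):
    """States that can reach the disk but that program order never produces."""
    return reachable_states(initial, writes) - program_order_states(initial, writes)


if __name__ == "__main__":
    initial = {"A": 0, "B": 0}
    writes = [("A", 1), ("B", 1)]  # issued as A first, then B
    print(transient_states(initial, writes))
```

With two writes issued as A then B, the only state outside program order is the one where B has been updated but A has not. An application may additionally treat some program-order states as unexpected (for example, if it assumes both writes land together), so this model gives a lower bound on what counts as transient.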
Having observed that an enclave generates transient snapshots, the adversary can capture and replay them in an attempt to cause unexpected behaviors of the application logic inside the enclave. This is what we call a transient snapshot attack.
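To make the capture-and-replay steps concrete, here is a minimal simulation under loudly stated assumptions. It is a hypothetical scenario, not the attack demo referred to earlier and not the SGX-PFS implementation: the "enclave" is an ordinary Python class, the "disk" is a dictionary the adversary fully controls, and integrity is modeled with an HMAC under one fixed key standing in for the shared root key. Encryption is omitted because the adversary in this sketch never needs to read file contents. The names `ToyEnclave`, `seal`, `unseal`, and `replay` are invented for illustration.

```python
import hashlib
import hmac
import json

ROOT_KEY = b"toy-root-key"  # hypothetical stand-in for the shared root key


def seal(state):
    """Serialize a state and prepend an integrity tag (toy protected file)."""
    body = json.dumps(state, sort_keys=True).encode()
    tag = hmac.new(ROOT_KEY, body, hashlib.sha256).hexdigest().encode()
    return tag + b"\n" + body


def unseal(blob):
    """Verify the tag and return the state; any snapshot sealed under the
    same key passes, no matter when it was captured."""
    tag, body = blob.split(b"\n", 1)
    expected = hmac.new(ROOT_KEY, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed")
    return json.loads(body)


class ToyEnclave:
    """Persists its state in two toy protected files: 'credit' and 'voucher'."""

    def __init__(self, disk, observer=None):
        self.disk = disk          # untrusted storage
        self.observer = observer  # adversary hook, called after every write

    def _write(self, name, state):
        self.disk[name] = seal(state)
        if self.observer:
            self.observer(dict(self.disk))  # adversary copies the whole disk

    def _read(self, name):
        return unseal(self.disk[name])

    def init(self):
        self._write("credit", {"amount": 0})
        self._write("voucher", {"spent": False})

    def redeem(self):
        """Redeem the one-time voucher for 10 credits."""
        if self._read("voucher")["spent"]:
            return False
        amount = self._read("credit")["amount"]
        self._write("credit", {"amount": amount + 10})
        # Between these two writes the disk holds a transient snapshot.
        self._write("voucher", {"spent": True})
        return True

    def credit(self):
        return self._read("credit")["amount"]


def replay(snapshot):
    """Start a fresh instance on a captured snapshot, redeem once more,
    and return the resulting credit."""
    instance = ToyEnclave(dict(snapshot))
    instance.redeem()
    return instance.credit()


if __name__ == "__main__":
    captured = []
    enclave = ToyEnclave({}, observer=captured.append)
    enclave.init()
    enclave.redeem()
    # captured[2] is the state after the credit write but before the voucher write.
    print("credit after replaying the transient snapshot:", replay(captured[2]))
    print("credit after replaying the final snapshot:", replay(captured[3]))
```

In this toy scenario, replaying the intermediate snapshot lets a single voucher be redeemed twice, while replaying the final snapshot does not. Both snapshots pass the integrity check because they were sealed under the same key, which mirrors the third factor above.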
Transient snapshot attacks may also apply to other SGX systems (such as Occlum) that offer some form of secure file I/O, and even to TEEs other than SGX (e.g., Intel TDX).