Motivating example
Consider the example of using curl through its official Docker image. What threats are we exposed to in the software supply chain? (We choose curl simply because it is a popular open-source package, not to single it out.)
The first problem is figuring out the actual supply chain. This requires significant manual effort, guesswork, and blind trust. Working backwards:
- The “latest” tag in Docker Hub points to 7.72.0.
- It claims to have come from a Dockerfile in the curl/curl-docker GitHub repository.
- That Dockerfile reads the following artifacts, assuming there are no further fetches during build time:
  - Docker Hub image: registry.hub.docker.com/library/alpine:3.11.5
  - Alpine packages: libssh2 libssh2-dev libssh2-static autoconf automake build-base groff openssl curl-dev python3 python3-dev libtool curl stunnel perl nghttp2
  - File at URL: https://curl.haxx.se/ca/cacert.pem
- Each of the dependencies has its own supply chain, but let’s look at curl-dev, which contains the actual “curl” source code.
- The package, like all Alpine packages, has its build script defined in an APKBUILD in the Alpine git repo. There are several build dependencies:
  - File at URL: https://curl.haxx.se/download/curl-7.72.0.tar.xz.
    - The APKBUILD includes a sha256 hash of this file. It is not clear where that hash came from.
  - Alpine packages: openssl-dev nghttp2-dev zlib-dev brotli-dev autoconf automake groff libtool perl
- File at URL: https://curl.haxx.se/download/curl-7.72.0.tar.xz.
  - The source tarball was presumably built from the actual upstream GitHub repository curl/curl@curl-7_72_0, by running the command ./buildconf && ./configure && make && ./maketgz 7.72.0. That command has a set of dependencies, but those are not well documented.
- Finally, there are the systems that actually ran the builds above. We have no indication about their software, configuration, or runtime state whatsoever.
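The hash check that the APKBUILD performs on the source tarball can be sketched as follows. This is a minimal Python illustration, not how Alpine's build tooling is implemented; the function name `verify_sha256` and the sample bytes are hypothetical. Note what the check does and does not show: a match proves the file equals whatever was hashed when the value was pinned, but says nothing about where that file came from.

```python
import hashlib
import hmac


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of data matches expected_hex."""
    actual_hex = hashlib.sha256(data).hexdigest()
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(actual_hex, expected_hex.lower())


# Stand-in bytes for a downloaded tarball (hypothetical content).
tarball = b"pretend this is curl-7.72.0.tar.xz"
# The hash as it would have been recorded when the dependency was pinned.
pinned = hashlib.sha256(tarball).hexdigest()

print(verify_sha256(tarball, pinned))                # True
print(verify_sha256(tarball + b"tampered", pinned))  # False
```

If the tarball was already malicious when the hash was recorded, this check passes anyway.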
Suppose some developer’s machine is compromised. What attacks could potentially be performed unilaterally with only that developer’s credentials? (None of these are confirmed.)
- Directly upload a malicious image to Docker Hub.
- Point the CI/CD system to build from an unofficial Dockerfile.
- Upload a malicious Dockerfile (or other file) in the curl/curl-docker git repo.
- Upload a malicious https://curl.haxx.se/ca/cacert.pem.
- Upload a malicious APKBUILD in Alpine’s git repo.
- Upload a malicious curl-dev Alpine package to the Alpine repository. (Not sure if this is possible.)
- Upload a malicious https://curl.haxx.se/download/curl-7.72.0.tar.xz. (Won’t be detected by APKBUILD’s hash if the upload happens before the hash is computed.)
- Upload a malicious change to the curl/curl git repo.
- Attack any of the systems involved in the supply chain, as in the SolarWinds attack.
SLSA intends to cover all of these threats. When all artifacts in the supply chain have a sufficient SLSA level, consumers can gain confidence that most of these attacks are mitigated, first via self-certification and eventually through automated verification.
Finally, note that all of this is just for curl’s own first-party supply chain steps. The dependencies, namely the Alpine base image and packages, have their own similar threats. And they too have dependencies, which have other dependencies, and so on. Each dependency has its own SLSA level and the composition of SLSA levels describes the entire supply chain’s security.
For another look at Docker supply chain security, see Who’s at the Helm? For a much broader look at open source security, including these issues and many more, see Threats, Risks, and Mitigations in the Open Source Ecosystem.
Vision: Case Study
Let’s consider how we might secure curlimages/curl from the motivating example using the SLSA framework.
Incrementally reaching SLSA 4
Let’s start by incrementally applying the SLSA principles to the final Docker image.
SLSA 0: Initial state
Initially the Docker image is SLSA 0. There is no provenance. It is difficult to determine who built the artifact and what sources and dependencies were used.
The diagram shows that the (mutable) locator curlimages/curl:7.72.0 points to (immutable) artifact sha256:3c3ff….
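The difference between a mutable locator and an immutable artifact can be illustrated with a small sketch. The `registry` dictionary below is a hypothetical in-memory stand-in for a registry's tag table, and the image bytes are placeholders. A tag can be repointed at any time, while a digest is derived from the content itself.

```python
import hashlib


def digest(content: bytes) -> str:
    """Content-derived identifier in the familiar 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


# Hypothetical in-memory stand-in for a registry's tag -> digest table.
registry = {}

image_v1 = b"image bytes, first push"
image_v2 = b"image bytes, pushed later under the same tag"

registry["curlimages/curl:7.72.0"] = digest(image_v1)
pinned = registry["curlimages/curl:7.72.0"]  # consumer records the digest

# The same tag is later repointed to different content.
registry["curlimages/curl:7.72.0"] = digest(image_v2)

print(registry["curlimages/curl:7.72.0"] == pinned)  # False: the tag moved
print(digest(image_v1) == pinned)                    # True: the digest did not
```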
SLSA 1: Provenance
We can reach SLSA 1 by scripting the build and generating provenance. The build script was already automated via make, so we use simple tooling to generate the provenance on every release. Provenance records the output artifact hash, the builder (in this case, our local machine), and the top-level source containing the build script.
In the updated diagram, the provenance attestation says that the artifact sha256:3c3ff… was built from curl/curl-docker@d6525….
At SLSA 1, the provenance does not protect against tampering or forging but may be useful for vulnerability management.
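To make the three recorded fields concrete, here is a minimal sketch of what such "simple tooling" might emit. The `Provenance` class, its field names, and the `local-machine` builder label are hypothetical and do not follow any official provenance schema; the image bytes are a placeholder. The record is unsigned, which is why it offers no protection against forging at this level.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Provenance:
    artifact_digest: str  # hash of the output artifact
    builder: str          # who or what ran the build
    source: str           # top-level source containing the build script


def make_provenance(artifact: bytes, builder: str, source: str) -> Provenance:
    """Build an (unsigned) provenance record for the given artifact bytes."""
    artifact_digest = "sha256:" + hashlib.sha256(artifact).hexdigest()
    return Provenance(artifact_digest, builder, source)


# Placeholder for the built image; the commit is elided as in the text.
image = b"stand-in for the built image bytes"
record = make_provenance(
    image,
    builder="local-machine",
    source="curl/curl-docker@d6525…",
)

# Serialize the record so it can be published alongside the release.
print(json.dumps(asdict(record), indent=2, ensure_ascii=False))
```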
SLSA 2 and 3: Build service
To reach SLSA 2 (and later SLSA 3), we must switch to a hosted build service that generates provenance for us. This updated provenance should also include dependencies on a best-effort basis. SLSA 3 additionally requires the source and build platforms to implement additional security controls, which might need to be enabled.
In the updated diagram, the provenance now lists some dependencies, such as the base image (alpine:3.11.5) and apk packages (e.g. curl-dev).
At SLSA 3, the provenance is significantly more trustworthy than before. Only highly skilled adversaries are likely able to forge it.
SLSA 4: Hermeticity and two-person review
SLSA 4 requires two-party source control and hermetic builds. Hermeticity in particular guarantees that the dependencies are complete. Once these controls are enabled, the Docker image will be SLSA 4.
In the updated diagram, the provenance now attests to its hermeticity and includes the cacert.pem dependency, which was absent before.
At SLSA 4, we have high confidence that the provenance is complete and trustworthy and that no single person can unilaterally change the top-level source.
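The notion of "complete" dependencies can be sketched as a set comparison between what the provenance declares and what the build actually read. How a build service observes those reads is out of scope here; the `observed` set and the function name `undeclared_dependencies` are hypothetical.

```python
from typing import Set


def undeclared_dependencies(declared: Set[str], observed: Set[str]) -> Set[str]:
    """Return inputs the build read that the provenance does not list."""
    return observed - declared


cacert = "https://curl.haxx.se/ca/cacert.pem"

# Best-effort dependency list, as in the SLSA 3 diagram.
declared_best_effort = {"alpine:3.11.5", "curl-dev"}
# Hypothetical record of everything the build fetched.
observed = {"alpine:3.11.5", "curl-dev", cacert}

print(undeclared_dependencies(declared_best_effort, observed))
# {'https://curl.haxx.se/ca/cacert.pem'}

# Complete dependency list, as in the SLSA 4 diagram.
declared_complete = declared_best_effort | {cacert}
print(undeclared_dependencies(declared_complete, observed))
# set()
```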
Full graph
We can recursively apply the same steps above to lock down dependencies. Each non-source dependency gets its own provenance, which in turn lists more dependencies, and so on.
The final diagram shows a subset of the graph, highlighting the path to the upstream source repository (curl/curl) and the certificate file (cacert.pem).
In reality, the graph is intractably large due to the fanout of dependencies. There will need to be some way to trim the graph to focus on the most important components. While this can reasonably be done by hand, we do not yet have a solid vision for how best to do this in a scalable, generic, automated way. One idea is to use ecosystem-specific heuristics. For example, Debian packages are built and organized in a very uniform way, which may allow Debian-specific heuristics.
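As a rough illustration of walking and trimming such a graph, the sketch below follows dependency edges breadth-first and stops at a fixed depth. The depth cutoff is only a crude stand-in for trimming, not one of the heuristics discussed above, and the `GRAPH` contents are a hypothetical, heavily reduced subset of the example.

```python
from typing import Dict, List

# Hypothetical provenance graph: artifact -> direct dependencies.
GRAPH: Dict[str, List[str]] = {
    "curlimages/curl": ["alpine:3.11.5", "curl-dev", "cacert.pem"],
    "curl-dev": ["curl-7.72.0.tar.xz", "openssl-dev"],
    "curl-7.72.0.tar.xz": ["curl/curl@curl-7_72_0"],
}


def walk(graph: Dict[str, List[str]], root: str, max_depth: int) -> List[str]:
    """Return artifacts reachable from root within max_depth edges (BFS order)."""
    seen = {root}
    order = [root]
    frontier = [root]
    for _ in range(max_depth):
        next_frontier = []
        for node in frontier:
            for dep in graph.get(node, []):
                if dep not in seen:  # visit each artifact once
                    seen.add(dep)
                    order.append(dep)
                    next_frontier.append(dep)
        frontier = next_frontier
    return order


print(walk(GRAPH, "curlimages/curl", 1))
# ['curlimages/curl', 'alpine:3.11.5', 'curl-dev', 'cacert.pem']
print(len(walk(GRAPH, "curlimages/curl", 3)))
# 7
```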
Composition of SLSA levels
An artifact’s SLSA level is not transitive, so some aggregate measure of security risk across the whole supply chain is necessary. In other words, each node in our graph has its own, independent SLSA level. Just because an artifact’s level is N does not imply anything about its dependencies’ levels.
In our example, suppose that the final curlimages/curl Docker image were SLSA 4 but its curl-dev dependency were SLSA 0. Then this would imply a significant security risk: an adversary could potentially introduce malicious behavior into the final image by modifying the source code found in the curl-dev package. That said, even being able to identify that it has a SLSA 0 dependency has tremendous value because it can help focus efforts.
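Identifying such weak dependencies can be sketched as a simple traversal. This is not the aggregate risk measure, only a way to list transitive dependencies below a chosen level. The level assigned to alpine:3.11.5 is made up for illustration, and treating an unknown level as 0 is an assumption.

```python
from typing import Dict, List

# Hypothetical dependency edges and per-artifact SLSA levels.
DEPS: Dict[str, List[str]] = {
    "curlimages/curl": ["alpine:3.11.5", "curl-dev"],
}
LEVELS: Dict[str, int] = {
    "curlimages/curl": 4,
    "alpine:3.11.5": 4,  # made-up level for illustration
    "curl-dev": 0,
}


def weak_dependencies(
    root: str, deps: Dict[str, List[str]], levels: Dict[str, int], minimum: int
) -> List[str]:
    """List transitive dependencies of root whose level is below minimum."""
    stack = list(deps.get(root, []))
    seen = set()
    weak = []
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        # Assumption: an artifact with no recorded level counts as SLSA 0.
        if levels.get(node, 0) < minimum:
            weak.append(node)
        stack.extend(deps.get(node, []))
    return sorted(weak)


print(LEVELS["curlimages/curl"])                                # 4
print(weak_dependencies("curlimages/curl", DEPS, LEVELS, 4))    # ['curl-dev']
```

The top-level image reports level 4 on its own, yet the traversal still surfaces the SLSA 0 dependency.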
Formation of this aggregate risk measure is left for future work. It is perhaps too early to develop such a measure without real-world data. Once SLSA becomes more widely adopted, we expect patterns to emerge and the task to get a bit easier.
Accreditation and delegation
Accreditation and delegation will play a large role in the SLSA framework. It is not practical for every software consumer to fully vet every platform and fully walk the entire graph of every artifact. Auditors and/or accreditation bodies can verify and assert that a platform or vendor meets the SLSA requirements when configured in a certain way. Similarly, there may be some way to “trust” an artifact without analyzing its dependencies. This may be particularly valuable for closed source software.