Add container build backend and build verify command #2525

Draft
leighmcculloch wants to merge 54 commits into main from feat/reproducible-builds-via-docker

Conversation

Member

@leighmcculloch leighmcculloch commented Apr 27, 2026

What

Add two container build backends to stellar contract build (and deploy/upload) — --backend docker and --backend docker-all — that produce reproducible wasms by running the build inside a linux/amd64 container, recording the resolved image digest plus the source git remote/sha, the per-package build options, and which backend produced the wasm in contract metadata. Add a stellar contract build verify subcommand that reads everything it needs from the wasm's metadata, rebuilds with the same toolchain, image, and build options, and reports which (if any) rebuilt artifact is byte-identical to the original. Add a mainnet warning on stellar contract deploy when the wasm is missing the meta entries needed for independent verification.

Why

Contract builds vary across host OS, architecture, and toolchain, preventing third parties from independently confirming a deployed contract was built from given source. Pinning the build to an image digest plus the rust toolchain version makes builds reproducible, recording the source repo + commit + per-package build options lets verifiers rebuild the exact same artifact, and the new verify subcommand automates the rebuild-and-compare check.

Closes #2506.

Two backends: docker vs docker-all

Two container-based backends are introduced. They differ in how the build pipeline is split between the host CLI and the in-container CLI.

| | --backend docker | --backend docker-all |
|---|---|---|
| Container image | user-chosen rust image (default rust:latest) | layered image: FROM <user-chosen rust> + rustup target add wasm32v1-none + cargo install --locked --git https://github.com/stellar/stellar-cli --rev <host's commit sha> stellar-cli |
| What runs in the container | cargo rustc only | the entire stellar contract build pipeline (cargo + meta injection + spec filtering + optional wasm-opt) |
| What runs on the host | post-processing (meta injection, spec filtering, optional wasm-opt) | only orchestration (build the layered image, run the in-container CLI, copy outputs to --out-dir if requested) |
| bldimg records | the resolved base image digest | the resolved base image digest (the layered image is reconstructable from bldimg + cliver) |
| bldbkd records | docker | docker-all |
| What a verifier needs | the right docker image and the right host stellar-cli version (host post-processing is part of the build, so a host-CLI-version mismatch makes the rebuild diverge) | only the right docker image — the host CLI version is irrelevant because the in-container CLI is what writes the wasm |
| First-run cost | docker pull of the base image | docker pull of the base image + a one-time cargo install --git --rev <sha> stellar-cli to build the layered image (cached locally afterward) |

Both backends embed the same set of build-recording meta entries (see table below), so a wasm built with either backend can be deployed identically and verified by any user with docker. Choosing between them is a pragmatic trade: docker is cheaper to run but couples reproducibility to the host CLI version; docker-all captures the whole pipeline at the cost of an extra layered-image build.

Discussion of the design choices: #2525 (comment).

How it works

Three parts: build-time recording, deploy-time warning, and verify-time reproduction.

Build

stellar contract build --backend local                                  # default; host build
stellar contract build --backend docker                                 # cargo in rust:latest, post-processing on host
stellar contract build --backend docker=docker.io/library/rust:1.83
stellar contract build --backend docker=docker.io/library/rust@sha256:e4f0…
stellar contract build --backend docker-all                             # entire pipeline in a layered stellar-cli image
stellar contract build --backend docker-all=docker.io/library/rust:1.83
stellar contract build --backend docker-all=docker.io/library/rust@sha256:e4f0…

For all backends (including local), the build:

  • Records bldbkd (one of local, docker, docker-all) so anyone inspecting the wasm can see which build path produced it.
  • Detects whether the workspace is a clean git checkout. If clean and there's an origin remote, embeds source_repo (URL canonicalized to https://…), source_rev (full HEAD SHA), and per-package build options (bldopt_manifest_path relative to git root, bldopt_package, bldopt_profile, optional bldopt_optimize). The manifest path is auto-inserted whether or not --manifest-path was passed on the CLI.
  • If the working tree has uncommitted changes, prints a warning and omits source_repo / source_rev / bldopt_*.
  • If the workspace is not a git repo, these entries are silently omitted.
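The URL canonicalization mentioned above (SSH-style remotes rewritten to https form) can be sketched as follows. This is illustrative only; the function name and edge-case handling are assumptions, not code from the PR:

```rust
// Hypothetical sketch: rewrite an SSH-style git remote to its https form,
// as described for the `source_repo` meta entry. Not the PR's actual code.
fn canonicalize_remote(url: &str) -> String {
    // Strip a trailing ".git" suffix first.
    let url = url.strip_suffix(".git").unwrap_or(url);
    if let Some(rest) = url.strip_prefix("git@") {
        // git@host:path → https://host/path
        if let Some((host, path)) = rest.split_once(':') {
            return format!("https://{host}/{path}");
        }
    }
    // Already-https (or other) URLs pass through with only the suffix stripped.
    url.to_string()
}

fn main() {
    assert_eq!(
        canonicalize_remote("git@github.com:user/repo.git"),
        "https://github.com/user/repo"
    );
    assert_eq!(
        canonicalize_remote("https://github.com/user/repo.git"),
        "https://github.com/user/repo"
    );
    println!("ok");
}
```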

For --backend docker, additionally:

  1. Pulls the chosen image (default docker.io/library/rust:latest) with --platform=linux/amd64 over the Docker daemon's HTTP API (via bollard — no docker binary required), and resolves a fully-qualified <registry>/<path>@sha256:<digest> reference. The pull runs only once per build invocation; multi-contract workspaces reuse the cached digest.
  2. Bind-mounts on the container:
    • <git_root or workspace_root> → /workspace (rw, source)
    • <target_dir> → /target (rw, build output; the host reads the wasm afterward)
    • host ~/.cargo/registry → /usr/local/cargo/registry (rw, cached crate downloads)
    • The host's ~/.rustup is not mounted — host toolchain binaries are platform-specific (e.g. Mach-O on macOS) and can't run inside the linux/amd64 container. The image's pre-installed rustup state is used; the wasm target is installed at the start of each container run.
  3. The container runs as the host uid:gid, so files written to the bind mounts are readable/writable by the host user.
  4. Runs rustup --quiet target add [--toolchain <pin>] wasm32v1-none && exec cargo [+<pin>] rustc … inside the container — the wasm target is always installed up-front, so the build doesn't depend on the workspace's rust-toolchain.toml.
  5. Forces --locked. Sets SOURCE_DATE_EPOCH from git log -1 --format=%ct. Sets CARGO_TERM_COLOR=always. --remap-path-prefix uses container paths so wasms don't leak host paths.
  6. Streams pull progress and cargo output to the user's terminal in execution order: pull progress first, then the cargo invocation line, then cargo's own output (Compiling …, Finished …).
  7. Post-processing (inject_meta, spec shaking, optional wasm-opt) runs on the host as before.
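Step (4)'s in-container command can be sketched as string assembly. The function below is a hypothetical illustration of the shape of the command, including the optional toolchain pin used during verify; it is not the PR's implementation:

```rust
// Illustrative sketch of the command run inside the container (step 4):
// the wasm target is installed up-front, then cargo rustc is exec'd.
// `pin` is Some(rust version) only when verify pins the toolchain.
fn container_command(pin: Option<&str>) -> String {
    let toolchain_flag = pin.map(|p| format!(" --toolchain {p}")).unwrap_or_default();
    let cargo_pin = pin.map(|p| format!("+{p} ")).unwrap_or_default();
    format!(
        "rustup --quiet target add{toolchain_flag} wasm32v1-none && \
         exec cargo {cargo_pin}rustc --locked --crate-type=cdylib --target=wasm32v1-none --release"
    )
}

fn main() {
    let build = container_command(None);
    assert!(build.starts_with("rustup --quiet target add wasm32v1-none"));
    let verify = container_command(Some("1.83.0"));
    assert!(verify.contains("--toolchain 1.83.0"));
    assert!(verify.contains("cargo +1.83.0 rustc"));
    println!("ok");
}
```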

For --backend docker-all, instead of (4)–(7) above:

  1. Pulls and resolves the chosen base image as for docker.
  2. Builds a layered image locally from an embedded Dockerfile:
    ARG BASE_IMAGE
    FROM ${BASE_IMAGE}
    ARG WASM_TARGET
    RUN rustup target add ${WASM_TARGET}
    ARG STELLAR_CLI_REPO
    ARG STELLAR_CLI_REV
    RUN cargo install --locked --git ${STELLAR_CLI_REPO} --rev ${STELLAR_CLI_REV} stellar-cli
    STELLAR_CLI_REV is the full 40-char commit sha of the host stellar-cli (extracted from version::git()); the layered image therefore contains the same CLI version as the host. The image is tagged stellar-cli-build:<short-hash-of(base_digest, cli_rev)> so docker's layer cache hits across runs.
  3. Runs the in-container stellar contract build --backend local --bldimg <base-digest> --bldbkd docker-all … against the same bind mounts as docker. The in-container CLI does cargo + meta injection + spec filtering + optional wasm-opt itself; the host only copies outputs to --out-dir if requested.
  4. The host needs no rust toolchain or wasm-opt — the layered image owns the entire build pipeline.

Verification on a different machine reconstructs the layered image from bldimg (base) + cliver (CLI version) — the layered image digest itself doesn't need to be recorded because the recipe is deterministic.
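The deterministic tag derivation can be sketched as below. This is an assumption-laden illustration: the real PR's hash function may differ, and `DefaultHasher` is used here only to keep the sketch dependency-free (its output is not guaranteed stable across Rust releases):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Illustrative only: derive a short, deterministic local tag from the
// (base image digest, stellar-cli rev) pair so docker's layer cache can
// hit across runs with the same pair.
fn layered_image_tag(base_digest: &str, cli_rev: &str) -> String {
    let mut h = DefaultHasher::new();
    base_digest.hash(&mut h);
    cli_rev.hash(&mut h);
    format!("stellar-cli-build:{:016x}", h.finish())
}

fn main() {
    let a = layered_image_tag("sha256:e4f0", "abc1234");
    let b = layered_image_tag("sha256:e4f0", "abc1234");
    let c = layered_image_tag("sha256:ffff", "abc1234");
    assert_eq!(a, b); // same inputs → same tag → cache hit
    assert_ne!(a, c); // new base image → new tag → fresh build
    println!("{a}");
}
```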

The wasm's contractmetav0 custom section is populated with up to nine entries:

| key | value | injected by |
|---|---|---|
| cliver | 26.0.0#abc1234 (CLI version + git rev) | stellar-cli |
| bldbkd | local / docker / docker-all | stellar-cli (this PR) |
| bldimg | docker.io/library/rust@sha256:… (fully-qualified) | stellar-cli (this PR; with --backend docker[-all]) |
| rsver | 1.83.0 (resolved rustc version) | soroban-sdk |
| source_repo | https://github.com/user/repo (clean repo's origin) | stellar-cli (this PR) |
| source_rev | full HEAD SHA | stellar-cli (this PR) |
| bldopt_manifest_path | e.g. contracts/foo/Cargo.toml (relative to git root) | stellar-cli (this PR) |
| bldopt_package | cargo package name being built | stellar-cli (this PR) |
| bldopt_profile | cargo profile (e.g. release) | stellar-cli (this PR) |
| bldopt_optimize | true (only present when --optimize was used) | stellar-cli (this PR) |

For full reproducibility from day one, pin the image with --backend docker[-all]=<name>@sha256:… and commit before building.

--backend and --docker-host are also exposed on stellar contract deploy and stellar contract upload (which auto-build when no --wasm / --wasm-hash is given), so the same flags work end-to-end.

Deploy

stellar contract deploy against mainnet now warns when the wasm is missing any of cliver, bldimg, rsver, source_repo, source_rev, bldopt_manifest_path, bldopt_package, bldopt_profile:

⚠ the wasm being deployed is missing reproducibility meta entries: ["bldimg", "source_repo", "source_rev", "bldopt_manifest_path", "bldopt_package", "bldopt_profile"]. The deployed wasm may not be independently verifiable. To make it reproducible, build with `stellar contract build --backend docker` (or `--backend docker-all`) in a clean git repository.

The check is mainnet-only (matches network passphrase against Public Global Stellar Network ; September 2015); on testnet/futurenet/local the wasm deploys silently.
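The check described above reduces to a passphrase gate plus a set difference. The sketch below is hypothetical (names and structure are illustrative, not from the PR):

```rust
use std::collections::HashSet;

const MAINNET_PASSPHRASE: &str = "Public Global Stellar Network ; September 2015";

// Illustrative sketch of the deploy-time check: on mainnet only, report
// which reproducibility meta keys the wasm is missing.
fn missing_repro_meta(passphrase: &str, present: &HashSet<&str>) -> Vec<&'static str> {
    if passphrase != MAINNET_PASSPHRASE {
        return Vec::new(); // testnet/futurenet/local: deploy silently
    }
    ["cliver", "bldimg", "rsver", "source_repo", "source_rev",
     "bldopt_manifest_path", "bldopt_package", "bldopt_profile"]
        .into_iter()
        .filter(|k| !present.contains(k))
        .collect()
}

fn main() {
    let present = HashSet::from(["cliver", "rsver"]);
    let missing = missing_repro_meta(MAINNET_PASSPHRASE, &present);
    assert!(missing.contains(&"bldimg"));
    assert!(!missing.contains(&"cliver"));
    assert!(missing_repro_meta("Test SDF Network ; September 2015", &present).is_empty());
    println!("ok");
}
```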

Verify

verify is a subcommand of build — it lives at stellar contract build verify, and works on multi-contract workspaces by rebuilding and finding the match.

stellar contract build verify --contract-id CXXX… --network mainnet
stellar contract build verify --wasm-hash <hash>  --network mainnet
stellar contract build verify --wasm contract.wasm
  1. Fetches the original wasm (file path, hash, or contract id, same flags as contract info).
  2. Reads cliver, bldimg, bldbkd, rsver, bldopt_manifest_path, bldopt_package, bldopt_profile, optional bldopt_optimize from the wasm's meta. bldbkd is treated as docker if absent (legacy wasms predating this PR). If bldbkd is local, verify errors out (local builds aren't reproducible).
  3. Picks the backend to use for the rebuild from bldbkd:
    • docker: uses Backend::Docker { image: bldimg }. Errors if the running CLI's cliver doesn't match the wasm's (host post-processing makes this mismatch fatal).
    • docker-all: uses Backend::DockerAll { image: bldimg }. Skips the cliver mismatch check — the in-container CLI is what matters, and it's installed at the wasm's cliver regardless of the host CLI.
  4. Resolves bldopt_manifest_path against the cwd's git top-level (via git rev-parse --show-toplevel) so verify works from anywhere inside the checkout.
  5. Rebuilds with the chosen backend, the toolchain pinned to <rsver> (i.e. cargo invoked as cargo +<rsver> rustc …), and the recorded --manifest-path/--package/--profile/--optimize flags. rustup uses that exact rust version regardless of rust-toolchain.toml's channel.
  6. Hashes every rebuilt artifact and looks for a match against the original. Prints ✅ on match (with the matching crate's name); ⚠ + non-zero exit on mismatch (with each rebuilt artifact's name + hash).

The user is responsible for checking out the matching commit before running verify; verify rebuilds from the working tree. (source_repo and source_rev are embedded in meta to help users find the right commit, but verify itself doesn't clone — that would add a separate trust path.)
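Step 6's compare is ultimately a byte-identity check over the rebuilt artifacts. The sketch below uses direct byte equality rather than hashing (the same predicate); names are illustrative, not the PR's code:

```rust
// Illustrative sketch of rebuild-and-compare: find which rebuilt artifact,
// if any, is byte-identical to the original wasm.
fn find_match<'a>(original: &[u8], rebuilt: &'a [(String, Vec<u8>)]) -> Option<&'a str> {
    rebuilt
        .iter()
        .find(|(_, bytes)| bytes.as_slice() == original)
        .map(|(name, _)| name.as_str())
}

fn main() {
    let original = b"\0asm...foo".to_vec();
    let rebuilt = vec![
        ("bar".to_string(), b"\0asm...bar".to_vec()),
        ("foo".to_string(), b"\0asm...foo".to_vec()),
    ];
    match find_match(&original, &rebuilt) {
        // ✅ on match, with the matching crate's name
        Some(name) => println!("Verified: rebuilt {name} wasm matches"),
        // ⚠ + non-zero exit on mismatch in the real command
        None => std::process::exit(1),
    }
}
```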

End-to-end example (docker backend)

$ stellar contract build --backend docker
ℹ Pulling from library/rust
   Digest: sha256:e4f09e8fe5a2366e7d3dc35e08bd25821151e3ed8fdbd3a6a16b51555f0c551d
   Status: Image is up to date for rust:latest
ℹ CARGO_BUILD_RUSTFLAGS='--remap-path-prefix=/usr/local/cargo/registry/src/= --remap-path-prefix=/workspace=' SOROBAN_SDK_BUILD_SYSTEM_SUPPORTS_SPEC_SHAKING_V2=1 cargo rustc --locked --manifest-path=/workspace/contracts/foo/Cargo.toml --crate-type=cdylib --target=wasm32v1-none --release
   Compiling foo v…
    Finished `release` profile [optimized] target(s) in 1.09s
ℹ Build Summary:
   Wasm File: target/wasm32v1-none/release/foo.wasm (907 bytes)
   Wasm Hash: 9f86d081…
✅ Build Complete

$ stellar contract info meta --wasm target/wasm32v1-none/release/foo.wasm
cliver=26.0.0#abc1234
bldbkd=docker
bldimg=docker.io/library/rust@sha256:e4f0…0c551d
rsver=1.83.0
source_repo=https://github.com/user/my-contract
source_rev=abc1234567890abcdef…
bldopt_manifest_path=contracts/foo/Cargo.toml
bldopt_package=foo
bldopt_profile=release

# Later, on a different machine, with the matching commit checked out
# AND the matching stellar-cli version installed:
$ stellar contract build verify --wasm-hash <hash> --network mainnet
ℹ Loading contract from network...
ℹ Loading meta from contract...
   Original wasm hash: 9f86d081…
   stellar-cli version: 26.0.0#abc1234
   rust version: 1.83.0
   Docker image: docker.io/library/rust@sha256:e4f0…0c551d
   Build backend: docker
   Manifest path: contracts/foo/Cargo.toml
   Package: foo
   Profile: release
ℹ Pulling from library/rust
   Digest: sha256:e4f0…0c551d
ℹ CARGO_BUILD_RUSTFLAGS=… cargo +1.83.0 rustc --locked --manifest-path=/workspace/contracts/foo/Cargo.toml --package=foo --profile=release ...
   Compiling foo v…
✅ Build Complete
✅ Verified: rebuilt foo wasm matches 9f86d081…

End-to-end example (docker-all backend)

$ stellar contract build --backend docker-all
ℹ Pulling from library/rust
   Digest: sha256:e4f09e8fe5a2366e7d3dc35e08bd25821151e3ed8fdbd3a6a16b51555f0c551d
ℹ Building stellar-cli build image stellar-cli-build:a1b2c3d4e5f6g7h8 (base docker.io/library/rust@sha256:e4f0…, stellar-cli abc1234567890abcdef…)
   Step 1/6 : ARG BASE_IMAGE
   Step 2/6 : FROM ${BASE_IMAGE}
   Step 3/6 : ARG WASM_TARGET
   Step 4/6 : RUN rustup target add ${WASM_TARGET}
   Step 5/6 : ARG STELLAR_CLI_REV
   Step 6/6 : RUN cargo install --locked --git https://github.com/stellar/stellar-cli --rev abc1234… stellar-cli
ℹ stellar contract build --backend local --bldimg docker.io/library/rust@sha256:e4f0… --bldbkd docker-all --manifest-path /workspace/contracts/foo/Cargo.toml --profile release --locked
   Compiling foo v…
✅ Build Complete

$ stellar contract info meta --wasm target/wasm32v1-none/release/foo.wasm
cliver=26.0.0#abc1234
bldbkd=docker-all
bldimg=docker.io/library/rust@sha256:e4f0…0c551d


# Later, on a different machine — host stellar-cli version is irrelevant:
$ stellar contract build verify --wasm-hash <hash> --network mainnet
ℹ Loading meta from contract...
   Build backend: docker-all
ℹ Building stellar-cli build image stellar-cli-build:… (base …, stellar-cli abc1234…)

✅ Verified: rebuilt foo wasm matches 9f86d081…

Notes

  • Communication with the daemon: bollard's HTTP API over the docker socket (/var/run/docker.sock, or whatever --docker-host / DOCKER_HOST points at). Same connect_to_docker helper used by stellar container start/stop/logs, with the same Docker Desktop fallback ($HOME/.docker/run/docker.sock). No shell-out to the docker CLI. A podman socket exposing the Docker API would also work (untested).
  • Caching: the bind-mount of host ~/.cargo/registry lets the container reuse crate downloads the host already has. For docker-all, docker's layer cache also keeps the cargo install stellar-cli layer warm across runs as long as (base_digest, cli_rev) is unchanged.
  • No rust-toolchain.toml dependency: every container build runs rustup target add wasm32v1-none (with --toolchain <pin> when verifying) before cargo, so the workspace's rust-toolchain.toml directives are not relied on.
  • Toolchain pinning: verify uses cargo +<rsver> (rustup's explicit toolchain selector) — this overrides rust-toolchain.toml's channel and ensures the same exact rust version is used across machines and time.
  • Image fully-qualified: bldimg is normalized to <registry>/<path>@sha256:<digest> (e.g. rust:latest → docker.io/library/rust@sha256:…) so verify can resolve it without relying on the local registry config.
  • docker-all requires a host CLI built from a commit: the layered image installs the stellar-cli at the host's full 40-char commit sha. Homebrew/crates.io/cargo-git installs all produce a usable sha. Local cargo-run builds work as long as HEAD has been pushed to origin (so cargo install --git --rev <sha> can fetch it). See #2535 (Normalize stellar-cli version rendering in stellar version and cliver meta) for the in-progress normalization of the cliver rendering.
  • Source URL canonicalization: source_repo is normalized to https://… form (e.g. git@github.com:user/repo.git → https://github.com/user/repo).
  • Build options auto-recorded: bldopt_manifest_path is recorded relative to the git repo root regardless of whether --manifest-path was passed on the CLI. Verify resolves it against the cwd's git top-level so the command works from anywhere inside the checkout.
  • Multi-contract workspaces: container builds run sequentially with a blank line between each contract's output; the docker pull only happens before the first contract; verify adds an additional blank line before its final verdict so the ✅/⚠ stands apart from the per-contract Build Complete lines.
  • Aborted container runs: may leave a stopped container; clean with docker container prune.

Status

This is an experiment in validating the ideas in #2506 and may or may not be destined for merging. We have not yet decided which of docker and docker-all to keep; both exercise real tradeoffs, and the PR ships both so they can be compared in practice.

@github-project-automation github-project-automation Bot moved this to Backlog (Not Ready) in DevX Apr 27, 2026
@leighmcculloch
Member Author

Each meta field needs a very detailed specification of exactly what valid values are expected, and the format(s) those values can take.

For example, depending on how you install the stellar-cli today, the cliver field can contain any of the following formats:

Homebrew:

cliver: 26.0.0#

Cargo crates.io install:

cliver: 26.0.0#60f7458e7ecffddf2f2d91dc6d0d2db4fab03ecc

Cargo git install:

cliver: 26.0.0#v20.0.0-836-gfe07b3678833e07c43235a6caaeccff81e146856

Without a precise spec for each field, downstream tooling (verifiers, registries, indexers) can't reliably parse or validate these values.
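A precise spec would let a parser distinguish the three formats above mechanically. The enum and function below are hypothetical, sketching exactly the kind of per-field specification this comment calls for:

```rust
// Hypothetical sketch of a precise cliver parser. The type names and the
// 40-hex-char heuristic are assumptions, not part of the PR.
#[derive(Debug, PartialEq)]
enum CliRev {
    None,             // "26.0.0#"                 (Homebrew)
    Sha(String),      // "26.0.0#60f7458e…"        (crates.io install)
    Describe(String), // "26.0.0#v20.0.0-836-g…"   (cargo git install)
}

fn parse_cliver(s: &str) -> Option<(String, CliRev)> {
    let (version, rev) = s.split_once('#')?;
    let rev = if rev.is_empty() {
        CliRev::None
    } else if rev.len() == 40 && rev.chars().all(|c| c.is_ascii_hexdigit()) {
        CliRev::Sha(rev.to_string())
    } else {
        CliRev::Describe(rev.to_string())
    };
    Some((version.to_string(), rev))
}

fn main() {
    assert_eq!(parse_cliver("26.0.0#"), Some(("26.0.0".into(), CliRev::None)));
    assert!(matches!(
        parse_cliver("26.0.0#60f7458e7ecffddf2f2d91dc6d0d2db4fab03ecc"),
        Some((_, CliRev::Sha(_)))
    ));
    assert!(matches!(
        parse_cliver("26.0.0#v20.0.0-836-gfe07b3678833e07c43235a6caaeccff81e146856"),
        Some((_, CliRev::Describe(_)))
    ));
    println!("ok");
}
```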

@leighmcculloch
Member Author

Currently only the cargo build step runs inside the container. The image is a stock rust:latest (with the wasm target installed at runtime) — everything the stellar-cli does after the build (meta injection, post-processing, etc.) runs on the host as part of the stellar binary, against the wasm output that comes back through the bind mount.

That works, and the host-side stellar-cli version is captured in the wasm via the cliver meta, so verify does have what it needs to detect a mismatch. But it still splits the build pipeline across two environments: a pinned, reproducible one (the image digest) and a host-resolved one (the user's installed stellar). To reproduce a build, a verifier needs both the right image and the right stellar on their host — the image digest alone isn't sufficient. To verify a build produced by a different stellar version than the one installed locally, the user has to install that other version first.

A way to get the best of both:

Embed a Dockerfile in the stellar-cli that, at build time, layers on top of whatever rust base image is requested:

  1. FROM the user-chosen rust base image (still flexible — users can pin to any rust:<sha> they trust)
  2. Add the wasm target
  3. Install the requested stellar-cli version

Both the rust version and the stellar-cli version are specified at build (and verify) time, the image gets built locally, and the entire build pipeline runs inside it. The user on the host doesn't need to install a matching stellar-cli to verify a build produced by a different version — that version is installed into the image instead.

This avoids the supply-chain cost of us owning and publishing a bespoke stellar build image, keeps the rust base image flexible, and still lets the image digest capture the whole pipeline.

It also resolves the cliver-format issue raised in #2525 (comment) — since the embedded Dockerfile installs the stellar-cli exactly one way, cliver will be rendered exactly one way too, instead of varying by host install method (homebrew vs. crates.io vs. cargo-git).

It also keeps the door open for SDF or someone else to host prebuilt images later as a convenience — but we don't have to figure that out for the first iteration. The embedded-Dockerfile approach defers that decision without foreclosing it.

@leighmcculloch
Member Author

Opened an issue about the cliver inconsistency here:

@leighmcculloch
Member Author

Heads up on a devx tradeoff worth documenting: building inside an amd64 container on Apple Silicon (or any non-amd64 host) runs under emulation — qemu via Docker Desktop / OrbStack — which is a real performance hit on large rust compilations.

Small contracts won't notice it. Workspaces with many contracts, or contracts with heavy dep trees, will see noticeably slower builds inside docker (sometimes a multiple of the native build time). This is a fundamental cost of pinning the build to a single arch for reproducibility, not something the --backend docker[-all] impl can avoid.

It's also not only a perf concern: users on container runtimes that don't ship qemu/binfmt support (some minimal Linux setups, certain rootless or stripped-down podman configs, etc.) won't be able to run amd64 containers on arm64 hosts at all — they'll see exec format error or similar and the build will simply fail.

Worth surfacing in docs so users aren't caught off guard, and worth keeping --backend local as the default for the dev inner loop.

@leighmcculloch
Member Author

leighmcculloch commented May 1, 2026

Another scaling tradeoff worth flagging between the two backends:

--backend docker just pulls a stock rust image and lets the host CLI do post-processing — no per-(rust, cli) image to build.

--backend docker-all needs an image with both rust and a specific stellar-cli rev installed. Ideally that image would be prebuilt for known (rust, cli) combinations and just pulled from a registry — same speed as the docker backend. As implemented in this PR it's layered locally instead, which means for each new (rust, cli) pair the user encounters, the image has to be built via a full cargo install stellar-cli (a real cli compile, minutes). After that, docker's layer cache hits on the same pair across builds.

In practice this is an edge case — most users build with a small set of (rust, cli) combinations and the layered image gets cached the first time.

Development

Successfully merging this pull request may close these issues.

Add --docker option and stellar contract verify for reproducible builds