Reproducibility · Results · Verify
Verify our discharge numbers yourself.
Every routine on /results/ has a discharge percentage next
to it. This kit lets you re-run the proof yourself and confirm the
numbers. ~5 minutes per routine on a modern laptop with gnatprove
installed.
What's in the kit
- README.md — full instructions, install notes, interpretation of unproved residue
- Makefile — one target per routine plus an
alltarget - expected-results.txt — the numbers we publish, diff-against
- routines/ — per-routine subdirectories with
.ads,.adb, build.gpr, and the original Fortran where available - dark-factory-verifier.tar.gz — the whole kit packed for download
The two-command verification
tar xzf dark-factory-verifier.tar.gz cd dark-factory-verifier && make all
The make all target runs gnatprove --level=1 on every routine
and writes a summary to actual-results.txt. Compare against
expected-results.txt with diff. Modulo prover wall-time, the two
files should be identical.
What this kit doesn't include
The factory itself — the translator, the spec expander, the body-fill stage, the algorithms catalogue, the per-stage prompts, the prover-feedback loops. These are how the output gets produced. The factory is the moat; the output is what you can verify.
Said plainly: anyone with gnatprove can confirm what our tool
produced. The question of whether we have a tool that produced
it is a separate one, answered by the cumulative weight of
everything on /results/ — fifteen-plus routines, gnatmake-clean
SPARK, discharge numbers anyone can re-derive, and the
erratum we published
when our own audit caught a stub-shaped body in one of them.
What the unproved residue means
Across the kit, the unproved obligations fall into three classes:
floating-point overflow checks at theoretical extremes (closed by
tighter Pre conditions); proof gaps from unaxiomatised library
functions like Exp (closed by pragma Assume); and
inductive ghost arguments at --level=1 (closed by re-running
at --level=2). None are body-correctness failures. See the
README for the per-class breakdown.
Found a discrepancy?
If your actual-results.txt doesn't match
expected-results.txt we want to know. Email
tony.gair@thedarkfactory.co.uk
with your output. The publication contract works only if the
re-runs match.