Benchmarks
Sephera includes a reproducible benchmark harness for the Rust CLI.
The benchmark scope is intentionally narrow: it measures the local release binary, records enough machine metadata to make results interpretable, and writes reports that are easy to diff or archive.
Default datasets
- small
- medium
- large
Optional datasets:
- repo
- extra-large
extra-large targets roughly 2 GiB of generated source data. It is intended as a manual stress benchmark and is not part of the default workflow or normal CI.
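A deterministic corpus is what makes runs comparable across machines and over time. The following is a minimal, hypothetical sketch of how such a corpus could be generated up to a target size; the function name, file layout, and filler content are illustrative assumptions, not the harness's actual implementation.

```python
# Hypothetical sketch: generate a deterministic synthetic source corpus
# of roughly target_bytes. Names and layout are illustrative only.
from pathlib import Path

def generate_corpus(root: Path, target_bytes: int, files: int = 100) -> int:
    """Write deterministic dummy source files until ~target_bytes is reached."""
    root.mkdir(parents=True, exist_ok=True)
    line = "fn generated() { /* deterministic filler */ }\n"
    per_file = max(1, target_bytes // files)
    written = 0
    for i in range(files):
        # Repeat the filler line, then trim to exactly per_file characters.
        body = line * (per_file // len(line) + 1)
        (root / f"gen_{i:04}.rs").write_text(body[:per_file])
        written += per_file
        if written >= target_bytes:
            break
    return written

# A tiny corpus for demonstration; extra-large would target ~2 GiB instead.
total = generate_corpus(Path("bench_corpus"), target_bytes=10_000, files=4)
```

Because the content is fully deterministic, regenerating the corpus yields byte-identical input, so timing differences across runs reflect the CLI and the machine rather than the data.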
What gets measured
The harness records:
- platform and architecture metadata
- Python version and executable
- exact commands used for each run
- per-run timing samples
- parsed LOC totals from CLI output
- captured stdout and stderr
The CLI also prints a human-readable elapsed-time line, but the benchmark harness keeps its own wall-clock measurement as the primary benchmark metric.
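The per-run measurement above can be sketched as a timed subprocess invocation. This is a simplified illustration, not the harness's actual code: the record shape and field names are assumptions, and the command shown is a placeholder rather than the real CLI invocation.

```python
# Sketch of one benchmark sample: the harness's own wall clock
# (time.perf_counter around the subprocess) is the primary metric,
# with stdout/stderr captured for later parsing. Illustrative only.
import platform
import subprocess
import sys
import time

def measure(cmd: list[str]) -> dict:
    start = time.perf_counter()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return {
        "command": cmd,                   # exact command, for reproducibility
        "elapsed_s": elapsed,             # harness wall clock, primary metric
        "returncode": proc.returncode,
        "stdout": proc.stdout,            # parsed later for LOC totals
        "stderr": proc.stderr,
        "platform": platform.platform(),  # machine metadata
        "python": sys.version.split()[0],
    }

# Placeholder command standing in for the real CLI binary.
sample = measure([sys.executable, "-c", "print('ok')"])
```

Keeping the wall clock outside the child process means the metric does not depend on the CLI's own elapsed-time output, which may be formatted or rounded differently across versions.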
Useful commands
Run the default benchmark suite:
python benchmarks/run.py

Include the current repository:

python benchmarks/run.py --datasets repo small medium large

Run the manual stress benchmark:

python benchmarks/run.py --datasets extra-large --warmup 0 --runs 1

Interpreting results
The most production-representative parts of the benchmark are:
- release-mode binaries
- repeated runs over deterministic corpora
- exact command capture
- the real loc CLI path
The least production-representative parts are:
- local background activity
- thermal throttling or power management
- warmed filesystem cache after early runs
- synthetic corpora, which improve reproducibility but do not mirror every real repository
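Warmup runs and repeated samples exist precisely to dampen these noise sources. As a small illustration (with made-up sample values, not real harness output), discarding warmup samples and reporting the median reduces the influence of cold filesystem caches and background-activity spikes:

```python
# Why repeated runs matter: the first run pays the cold-cache cost,
# and the median is less sensitive to one-off spikes than the mean.
# The timing values below are illustrative, not measured.
from statistics import median

def summarize(samples: list[float], warmup: int = 1) -> dict:
    steady = samples[warmup:] or samples  # keep something if all are warmup
    return {"min": min(steady), "median": median(steady), "max": max(steady)}

timings = [0.91, 0.52, 0.50, 0.53]  # first run: cold filesystem cache
summary = summarize(timings, warmup=1)
```

When comparing results across machines or over time, compare these steady-state summaries rather than individual samples.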
For the full benchmark harness notes kept alongside the codebase, see benchmarks/README.md in the repository.