ghrunners

Role: Personal project
Status: Private/local
Release: local tag v0.5.0
Key outcome: Typed findings and guarded control act on the same authoritative launchd state.
Stack: RustCLImacOS

What it is

ghrunners discovers every actions-runner on a Mac and cross-checks launchctl domain state, the process tree, .runner JSON, the worker logs, and (optionally) the GitHub API. The observability core — status, describe, doctor, logs — surfaces problems as typed findings; a separate, guarded control verb acts on the same state. It is a private, local tool; the snapshot it describes is named under Evidence below.

fig. 3.1 — four evidence sources → typed findings → guarded control

The problem

When a A machine you run yourself to execute GitHub Actions jobs, instead of GitHub's cloud runners — you own the OS, the network, and what persists between jobs. quietly dies on a Mac, the truth is scattered: macOS's system service manager — it starts, stops, and supervises background services (roughly the macOS counterpart to Linux's systemd). holds the domain and last-exit state, the process tree says whether a worker is really alive, the install dir holds the config behind 0700 permissions, and the GitHub API knows whether the job ever landed. You SSH in and reassemble it by hand — and the obvious fix, restarting the runner, is exactly where a careless command can kill an in-progress build or bootstrap it into the wrong launchd domain.

Constraints & key decisions

One-shot observability from authoritative state. Rather than run a monitoring A program that stays resident in the background between jobs. ghrunners is deliberately the opposite — one-shot: it runs, acts, and exits, holding no long-lived state., ghrunners is a single no-state invocation that parses the launchctl print body — readable without sudo — so the run-state and last-exit it reports are authoritative even for a non-root caller, not guessed from ps. Cost: one-shot means no continuous monitoring or historical trend — you run it, or cron it, when you want a snapshot.

Partial output over fail-fast. Runner install dirs sit under another user at mode 0700, so a strict tool would just error out without sudo. Instead the missing pieces degrade gracefully: PID - and REPO ? are not errors, and a permission-denied path becomes an Unreadable finding rather than a fatal stop, while the state column stays authoritative from the sudo-free print body. Cost: every blank cell is a path the tool has to design and explain — permission gap versus real absence — and the output is simply less complete without sudo.

Guarded control, bound to observed state. The obvious next step — start / stop / restart — is the dangerous one, so control (bootstrap / unload / restart) resolves its target from the same authoritative state the observability core reads, never free-form launchctl. --dry-run prints the resolved command without running it; the --domain override is rejected with a usage error anywhere except a DoubleLoaded unload (every other verb has a deterministic target an override would subvert); and --yes is required before a verb may kill a live Runner.Worker. Cost: more preconditions and flags to carry, and the tool deliberately refuses some commands raw launchctl would run.

Typed findings and stable exit codes over free-text logs. "Is this runner healthy?" needs a machine answer, so problems are 13 typed findings with severities — doctor prints only the actionable warn / error ones, each with a fix: line — and the process exits 1 when status or doctor sees a warning or error, so ghrunners doctor >/dev/null || mail … works as a cron health check. Cost: every finding is a typed contract to define, trigger, and assign a severity and fix — a maintained catalogue, not free text.

Evidence

Private snapshot at local tag v0.5.0 — there is no public source to inspect. What backs the claims is the running surface: status, describe, doctor, logs, and the guarded control verbs work today, over a catalogue of 13 typed findings (from LoadedNotRunning and OrphanListener to DoubleLoaded and Unreadable) with cron-friendly exit codes, JSON output, and a --dry-run preview on every control action. The capability arrived in dated steps: control verbs in v0.2, authoritative liveness from the print body in v0.3, and the doctor sweep plus OrphanListener in v0.4.

Candidate scope, not committed — deeper GitHub API enrichment and new findings as edge cases surface on real hosts. A public source link follows if and when the repository becomes reachable.

What it isn't

Not a runner installer or unregister tool.
Not a daemon or persistent monitor.
Not fleet management — control is single-runner and guarded, not bulk orchestration.
Not Linux or Windows tooling.
Private/local — not a public-source project today.

← All work