A World With Human-in-the-Loop Agent Safety in 2030
Published 21 April 2026 · 7 min read
The calm Tuesday
Ruth’s agent books her mother’s GP appointment in Leicester. Automatic tier — no review. Later her agent tries to book a £2,400 overseas specialist consultation. Mandatory review tier — a credentialed health reviewer in the Gera panel sees the proposed action, Ruth’s relevant preferences, and the marketplace’s reputation. Fourteen seconds later a signed decision returns. The consultation goes through. Ruth never saw the review happen; she only saw “waiting 14 seconds longer than usual.”
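What might a “signed decision” look like mechanically? A minimal sketch, assuming an Ed25519 signature (via Python’s `cryptography` package) over a canonical JSON record; the field names and panel identifier are illustrative, not part of any published protocol.

```python
# Sketch: a reviewer signs a decision record; the marketplace verifies it
# before releasing the action. Field names are illustrative assumptions.
import json
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

reviewer_key = Ed25519PrivateKey.generate()   # held by the reviewer
reviewer_pub = reviewer_key.public_key()      # published in the panel registry

decision = {
    "action_id": "act_8f3a",                  # the proposed agent action
    "verdict": "approve",                     # approve | refuse | escalate
    "reviewer_id": "gera-panel-041",          # hypothetical panel member ID
    "reviewed_at": "2030-05-14T09:12:47Z",
}
payload = json.dumps(decision, sort_keys=True).encode()  # canonical form
signature = reviewer_key.sign(payload)

# Anyone holding the reviewer's public key can verify the decision.
try:
    reviewer_pub.verify(signature, payload)
    print("decision verified:", decision["verdict"])
except InvalidSignature:
    print("rejected: signature does not match payload")
```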
The review profession
By 2030 a new job exists: credentialed agent-action reviewer. Specialties include health, finance, safeguarding, cross-border commerce, high-value retail. Reviewers work from anywhere with a secure terminal, are trained for weeks, accept accountability for signed decisions, and are paid in a way that values judgement over volume. The nearest present-day analogue is a loss-adjuster or a compliance officer.
The mistakes that stop happening
- A jailbroken agent authorises £5,000 of bookings no user would approve — caught at Tier 2.
- A bereaved grandparent’s agent accepts a grief-targeted scam — distress signals trigger Tier 3.
- A cross-border transaction lands in a jurisdiction the user has no relationship with — Tier 2 review asks for confirmation.
- A novel-pattern marketplace targets first-time users with manipulative consent cards — sampled Tier 1 review flags it and cuts it off (routing rules like these are sketched below).
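A minimal sketch of the routing logic behind these tiers, consistent with the cases above; the thresholds, signal names, and 5% sampling rate are assumptions for illustration, not a specification.

```python
# Sketch: route a proposed agent action to a review tier. All thresholds
# and signal names are illustrative assumptions.
import random
from dataclasses import dataclass

@dataclass
class ProposedAction:
    amount_gbp: float
    cross_border: bool = False
    distress_signals: bool = False        # e.g. bereavement-context heuristics
    first_time_marketplace: bool = False  # user's first use of this marketplace

def route_tier(action: ProposedAction) -> int:
    """0 = automatic, 1 = sampled review, 2 = mandatory review, 3 = specialist."""
    if action.distress_signals:
        return 3                          # safeguarding specialist reviews
    if action.amount_gbp >= 1_000 or action.cross_border:
        return 2                          # mandatory credentialed review
    if action.first_time_marketplace and random.random() < 0.05:
        return 1                          # 5% sample of novel-pattern traffic
    return 0                              # automatic, no review

print(route_tier(ProposedAction(amount_gbp=2_400)))                      # 2
print(route_tier(ProposedAction(amount_gbp=40, distress_signals=True)))  # 3
```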
The mistakes that still happen
Review is not infallible. Reviewers are human; occasionally they will approve something they should have refused, or refuse something they should have approved. The 2030 floor is “far better than no review, still not perfect.” Dispute paths are robust, and error cases produce public case studies that feed reviewer training.
Dispute volume falls
The counter-intuitive outcome is that total dispute volume falls because bad transactions get refused at the review stage rather than arbitrated after the fact. Arbitration becomes narrower and faster. Marketplaces with high Tier 2 refusal rates lose reputation and adjust or exit.
Compliance becomes measurable
Regulators stop demanding “ethical AI processes” and start demanding published review-rate dashboards. A marketplace’s safety posture becomes a set of measurable numbers: what fraction of high-risk actions receive review, whether each reviewer decision carries a complete audit chain, and what the appeal success rate is. This is legible to regulators in a way current AI-safety talk is not.
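As a sketch, such a dashboard reduces to a handful of ratios over decision records. The record fields below are assumptions, not a published schema.

```python
# Sketch: compute the three dashboard numbers the text names. Record
# fields ("tier", "reviewer_id", "signature", ...) are illustrative.
def dashboard(records: list[dict]) -> dict:
    high_risk = [r for r in records if r["tier"] >= 2]
    reviewed = [r for r in high_risk if r.get("reviewer_id")]
    chained = [r for r in reviewed if r.get("signature")]   # audit chain intact
    appeals = [r for r in records if r.get("appealed")]
    succeeded = [r for r in appeals if r.get("appeal_overturned")]
    return {
        "high_risk_review_rate": len(reviewed) / max(len(high_risk), 1),
        "audit_chain_complete": len(chained) / max(len(reviewed), 1),
        "appeal_success_rate": len(succeeded) / max(len(appeals), 1),
    }

print(dashboard([
    {"tier": 2, "reviewer_id": "r1", "signature": "ed25519:…"},
    {"tier": 2, "reviewer_id": None},                # high-risk, review missed
    {"tier": 0, "appealed": True, "appeal_overturned": True},
]))
# {'high_risk_review_rate': 0.5, 'audit_chain_complete': 1.0, 'appeal_success_rate': 1.0}
```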
The union question
Reviewer jobs matter and should be good jobs. We expect a global reviewer association to form (we would welcome it). Sensitive specialties (safeguarding, mental-health context, health) need mental-health support, shift limits, and the option to decline cases. This is a workforce-design question as much as a protocol-design question.
The risks we track
- Race to the bottom — a competing protocol without review out-competes ours on speed. Mitigation: our approach wins on dispute and regulatory metrics, not raw speed.
- Reviewer monoculture — reviewers drawn from similar backgrounds share the same blind spots. Mitigation: enforced geographic and linguistic diversity in the pool.
- Cost — if review is too expensive, marketplaces route around it. Mitigation: smart tiering, not universal review.
The ordinary shift
Most users in 2030 never think about review. It happens quietly on the transactions where it helps most, and not at all on the ones where it would just add latency. The infrastructure feels lighter because the worst cases do not happen. This is the product.
Help design agent safety that scales.
Join the waitlist