Clinician-in-the-loop¶
"Human in the loop" is the most over-used phrase in clinical AI. It can mean anything from "a clinician glances at outputs in batches" to "every output is independently re-derived." The phrase is meaningful only with detail.
What good review looks like¶
The clinician should be able to:
- See the inputs the system saw. Source notes, retrieved context, structured fields.
- See the system's output and its citations. Each clinical claim mapped to a source.
- Edit any field freely. The system should never make edits hard or punitive.
- Surface or override verifier flags. With a recorded reason where applicable.
- Sign off explicitly. Sign-off is an action, not a default state.
What rubber-stamping looks like¶
- Edit rate trending toward zero.
- Time-to-sign trending toward seconds.
- Sign-off as the path of least resistance.
- No mechanism for the clinician to flag uncertainty.
Rubber-stamping is the most likely way a well-built system harms patients. The system itself must be designed to discourage it.
Designing against rubber-stamping¶
- Friction in the right places. Sign-off should require an active step, not auto-progress.
- Highlight changed fields and citations. Force visual attention to where the model is most likely to have erred.
- Show uncertainty. Where the system is uncertain (low retrieval recall, refused fields, verifier flags), surface it visibly.
- Track and surface edit rates. A clinician whose edit rate has trended to zero gets a dashboard nudge.
- Spot-check audits. A random sample of signed plans is reviewed by a peer; results are shared back as learning.
Review workflow design choices¶
Synchronous vs. asynchronous¶
- Synchronous (clinician interactively reviews the draft as it's produced): better for high-stakes generation, slower at scale.
- Asynchronous (system drafts in batch; clinician reviews in a queue): scales better, risks queue-pressure shortcuts.
Most clinical settings end up hybrid: synchronous for initial assessments and high-risk cases, asynchronous for routine reauthorization or summaries.
Tiered review¶
Not every case needs the same depth of review. Tier by risk:
- Tier 1: complex / high-stakes (initial assessments, severe behavior, medical complexity), substantive synchronous review.
- Tier 2: routine reauth, well-known cases, asynchronous review with attention to changes.
- Tier 3: non-clinical formatting and administrative content, light review.
Tiering must be transparent: the clinician knows what tier this case is in and why.
Escalation paths¶
The system must always offer a path to escalate:
- "I don't agree with this draft" → human-only re-derivation, with reasons captured.
- "I'm not the right reviewer for this" → reassignment.
- "Something is wrong" → adverse event log entry.
Supervision of trainees and paraprofessionals¶
For ABA and similar fields where supervised credentials (RBTs, BCaBAs) implement plans authored by a senior clinician (BCBA), the system must not blur supervision lines:
- Authoring credential is unchanged: the senior clinician owns the plan.
- Trainees and paraprofessionals see the plan, not the AI draft pre-review.
- Supervision documentation (per BACB or analogous standards) reflects the senior clinician's actual review activity.
The clinician is not a guardrail¶
A common framing, "the clinician is the safety check," is true and dangerous. True, because they catch errors the system will make. Dangerous, because it can be used to justify shipping a less-safe system. The clinician's review reduces residual risk; it does not licence raising the baseline risk by relying on them. Build the safest system first; then layer review.