Tags
Agentic UX, Workshop
Author
Wout Helsmoortel
AI systems do not fail because they generate poor text. They fail because they reason poorly. They miss context, misuse sources, and lose traceability once outputs leave the prompt window. In complex environments such as defense, finance, or large enterprise programs, those gaps create real risk. To solve this, teams need a structured way to design how people and AI reason together. They need to define what a good answer looks like, what sources count as truth, and when humans should step in. CAPER offers that structure. It is a simple workshop that turns reasoning into something teams can see, test, and improve. It replaces isolated AI experiments with a shared language for designing reliable, explainable workflows.
What CAPER Is
CAPER is a design framework and a 90-minute workshop that helps teams map how an agent or system should think through a complex, high-stakes task. It is built for work where correctness, traceability, and collaboration matter.
C – Context and Objective
A – Approved Sources and Proof
P – Plan to Answer
E – Escalation for Uncertainty
R – Rationale and Review
A CAPER session results in a shared reasoning map, a filled canvas, an example of the final output, and the metrics to measure improvement over time.
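To make the canvas tangible before a session, it can help to see its five fields as a single data structure. Below is a minimal sketch in Python; the class and field names are illustrative assumptions, not part of the framework itself.

```python
from dataclasses import dataclass, field

@dataclass
class CaperCanvas:
    """One-page CAPER canvas as an illustrative data structure.

    All field names here are hypothetical; adapt them to your own template.
    """
    objective: str                          # C: one-line goal statement
    success_checklist: list[str]            # C: what a good answer looks like
    approved_sources: list[str]             # A: sources that count as truth
    citation_format: str                    # A: how evidence must be cited
    plan_to_answer: list[str]               # P: five to seven reasoning steps
    escalation_questions: dict[str, str]    # E: uncertainty -> clarifying question
    escalation_owner: str                   # E: who reviews unresolved cases
    review_reason_codes: list[str] = field(default_factory=list)  # R: feedback categories
```

A filled instance of this structure is, in effect, the workshop's shared reasoning map.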
When to Use It
Use CAPER when designing or improving reasoning-heavy tasks such as:
Producing Risk Assessment Briefs or compliance summaries
Conducting technical readiness reviews
Drafting or auditing process documentation
Designing explainable AI workflows in enterprise systems
Who Joins
Product managers, UX designers, domain experts, engineers, and, if needed, security or compliance officers. The session works best when at least one participant has hands-on knowledge of the process being redesigned.
What You Need
One concrete task, such as “Prepare a Risk Assessment Brief for a new digital system”
Access to templates, policies, and previous examples
A list of verified information sources
The CAPER Canvas (one page)
A digital or physical whiteboard for notes
The CAPER Workshop, Step by Step
| Step | Purpose | What You Produce |
| --- | --- | --- |
| 1. Context and Objective | Define the task. Who is asking for the brief, and what is its purpose? Clarify what makes a good Risk Assessment Brief. | One-line goal statement and a success checklist. |
| 2. Approved Sources and Proof | List the sources that define risk standards: internal security frameworks, ISO or NIST controls, audit templates, and historic incidents. Exclude drafts or undocumented material. | A list of approved sources with validation rules and citation format. |
| 3. Plan to Answer | Map the reasoning steps. Example: identify system boundaries, gather risk inputs, rate likelihood and impact, propose mitigations, verify completeness. | A five- to seven-step reasoning sequence used by both humans and agents. |
| 4. Escalation for Uncertainty | Identify where the system or analyst might be unsure. For each uncertainty, create one clarifying question. Example: “Is third-party data processing included?” Define who reviews unresolved cases. | A list of fallback questions and escalation paths. |
| 5. Rationale and Review | Design how reasoning and results appear. The final Risk Assessment Brief shows key risks with citations, rationale, and confidence level. Reviewers can accept or comment on each risk. | A mock output with review options and feedback categories. |
Example: Producing a Risk Assessment Brief (RAB)
A security analyst needs to prepare a Risk Assessment Brief for a new internal application. The goal is to provide leadership with a clear, traceable summary of the main risks and their mitigations.
Context and Objective
The Risk Assessment Brief summarizes identified risks, their likelihood, and mitigation plans. Success is defined as completeness, alignment with corporate risk policy, and clarity for non-technical decision-makers.
Approved Sources and Proof
Allowed inputs include the organization’s Risk Management Standard, the corporate threat library, the most recent incident database, and ISO 27005 guidelines. Every risk entry links to its supporting evidence. Outdated or unverified material is flagged.
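One way to operationalize the allow-list is to pair each approved source with a revalidation rule, so outdated or unverified material is flagged rather than silently used. The sketch below is a minimal Python illustration; the revalidation windows are invented assumptions, while the source names come from the paragraph above.

```python
from datetime import date, timedelta

# Hypothetical allow-list: approved source -> how often it must be revalidated.
# The windows are illustrative assumptions, not policy.
APPROVED_SOURCES = {
    "Risk Management Standard": timedelta(days=365),
    "Corporate Threat Library": timedelta(days=90),
    "Incident Database": timedelta(days=30),
    "ISO 27005 Guidelines": timedelta(days=730),
}

def validate_source(name: str, last_verified: date, today: date | None = None) -> str:
    """Classify a cited source as 'approved', 'outdated', or 'unapproved'."""
    today = today or date.today()
    if name not in APPROVED_SOURCES:
        return "unapproved"   # drafts and undocumented material are excluded
    if today - last_verified > APPROVED_SOURCES[name]:
        return "outdated"     # flagged for revalidation, not silently used
    return "approved"
```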
Plan to Answer
The system or analyst follows a fixed plan (a minimal code sketch follows the list):
1. Identify system boundaries and assets.
2. Retrieve similar past risk cases.
3. Match threats to vulnerabilities.
4. Estimate likelihood and impact using internal scoring.
5. Recommend mitigations and residual risk.
6. Verify completeness against the template.
7. Generate a summary with sources and confidence indicators.
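Expressed in code, the plan becomes an ordered pipeline that both humans and agents can follow and audit. This Python sketch mirrors the seven steps above; the `run_plan` helper and its record-keeping are assumptions about how a team might implement it.

```python
from typing import Callable

# Each step reads the case so far and returns an enriched version of it.
Step = Callable[[dict], dict]

# The seven steps from the plan above, in fixed order.
PLAN_STEPS = [
    "identify_system_boundaries_and_assets",
    "retrieve_similar_past_risk_cases",
    "match_threats_to_vulnerabilities",
    "estimate_likelihood_and_impact",
    "recommend_mitigations_and_residual_risk",
    "verify_completeness_against_template",
    "generate_summary_with_sources_and_confidence",
]

def run_plan(case: dict, steps: dict[str, Step]) -> dict:
    """Run the plan in order, recording each completed step for traceability."""
    for name in PLAN_STEPS:
        case = steps[name](case)   # raises KeyError if a step is missing: fail loudly
        case.setdefault("steps_completed", []).append(name)
    return case
```

Keeping the order explicit in one place means the same sequence governs the agent, the analyst's checklist, and the audit trail.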
Escalation for Uncertainty
If the data is incomplete or conflicting, the system asks a clarifying question, for example: “Is external data transfer part of the system design?” If the analyst cannot answer, the case is routed to Security Engineering for review. Each escalation is recorded and reviewed monthly.
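The escalation rule itself is simple enough to sketch. The Python below assumes an in-memory log as a stand-in for whatever store the monthly review actually uses; the reviewer group and the question text come from the example above.

```python
import logging
from datetime import datetime, timezone

log = logging.getLogger("caper.escalation")

# Reviewed monthly, per the workshop output. An in-memory list is a stand-in
# for a real escalation store.
ESCALATION_LOG: list[dict] = []

def escalate_if_unresolved(case_id: str, question: str, answer: str | None) -> bool:
    """Record an escalation when a clarifying question has no answer."""
    if answer is not None:
        return False
    ESCALATION_LOG.append({
        "case_id": case_id,
        "question": question,
        "routed_to": "Security Engineering",   # from the example above
        "at": datetime.now(timezone.utc).isoformat(),
    })
    log.warning("Case %s escalated: %s", case_id, question)
    return True
```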
Rationale and Review
Each risk item shows the reasoning behind its score, the source used, and a short narrative explaining mitigation. Reviewers can accept, request clarification, or reject entries with reason codes such as “missing data,” “unclear impact,” or “duplicate risk.” Feedback is logged to refine templates and prompts.
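A reviewable risk entry can be captured as a small structure that forces every score to carry its rationale, source, and confidence, and every rejection to carry a reason code. The field layout below is an assumption; the three reason codes come from the text above.

```python
from dataclasses import dataclass

# The three reason codes named above; everything else here is illustrative.
REASON_CODES = {"missing data", "unclear impact", "duplicate risk"}

@dataclass
class RiskEntry:
    risk: str
    score_rationale: str        # reasoning behind the likelihood/impact score
    source: str                 # citation to an approved source
    mitigation: str             # short narrative explaining the mitigation
    confidence: float           # 0.0-1.0, shown to reviewers

@dataclass
class ReviewDecision:
    entry: RiskEntry
    decision: str               # "accept", "request clarification", or "reject"
    reason_code: str | None = None

    def __post_init__(self) -> None:
        if self.decision == "reject" and self.reason_code not in REASON_CODES:
            raise ValueError("rejections need a recognized reason code")
```

Logging `ReviewDecision` objects produces the feedback stream used to refine templates and prompts.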
Beyond One Brief: Building Institutional Memory
While the workshop example focuses on a single Risk Assessment Brief, the same CAPER structure can be reused for other evaluation and assurance processes. Over time, teams identify recurring uncertainties, align sources, and capture best-practice reasoning steps. The result is not a self-writing system but a self-improving workflow where expertise, traceability, and learning are built in.
What Executives Need to Enable
Clear boundaries. Define which tasks can use AI assistance and where human review is mandatory.
Governed sources. Maintain updated risk libraries, templates, and evidence repositories.
Useful metrics. Track time to approved brief, reviewer effort, and escalation rate (a sketch of all three follows this list).
Oversight rhythm. Hold short monthly CAPER sessions to validate new tasks and review metrics.
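The three metrics reduce to simple aggregates over per-brief records. The Python sketch below assumes each brief logs its hours to approval, reviewer minutes, and whether it escalated; the field names are invented for illustration.

```python
from statistics import mean

def brief_metrics(records: list[dict]) -> dict:
    """Aggregate the three executive metrics from per-brief log records.

    Each record is assumed to look like:
    {"hours_to_approval": 12.5, "review_minutes": 40.0, "escalated": False}
    """
    return {
        "time_to_approved_brief_hours": mean(r["hours_to_approval"] for r in records),
        "reviewer_effort_minutes": mean(r["review_minutes"] for r in records),
        "escalation_rate": sum(r["escalated"] for r in records) / len(records),
    }
```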
Conclusion
CAPER gives organizations a practical way to design how people and AI work together on reasoning-heavy tasks. It replaces ad hoc prompting with a shared process that makes goals, sources, logic, and uncertainty explicit.
Whether the task is producing a single Risk Assessment Brief or maintaining ongoing risk reviews, CAPER helps teams move from isolated automation experiments to accountable collaboration where reasoning is visible, auditable, and continuously improving.
References and Acknowledgments
Author: Wout Helsmoortel
Founder, Shaped — specializing in agentic AI systems, explainable architectures, and learning transformation for defense and enterprise environments.
References:
Nielsen Norman Group. Redefine Your Design Skills to Prepare for AI.
OpenAI. Working with Evals. Developer guide.
OpenAI. Evaluation Best Practices.
Zheng, L., et al. Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. arXiv.
Gu, J., et al. A Survey on LLM-as-a-Judge. arXiv.
Anthropic. A New Initiative for Developing Third-Party Model Evaluations.
NIST. AI Risk Management Framework 1.0. NIST Publications.
NIST. Generative AI Profile for the AI RMF. NIST Publications.
ISO. ISO/IEC 42001: AI Management Systems.
IBM. AI Factsheets.