Tags
Agentic UX, Workshop
Author
Wout Helsmoortel
AI systems do not fail because they generate poor text. They fail because they reason poorly. They miss context, misuse sources, and lose traceability once outputs leave the prompt window. In complex environments such as defense, finance, or large enterprise programs, those gaps create real risk. To solve this, teams need a structured way to design how people and AI reason together. They need to define what a good answer looks like, what sources count as truth, and when humans should step in. CAPER offers that structure. It is a simple workshop that turns reasoning into something teams can see, test, and improve. It replaces isolated AI experiments with a shared language for designing reliable, explainable workflows.
What CAPER Is
CAPER is a design framework and a 90-minute workshop that helps teams map how an agent or system should think through a complex, high-stakes task. It is built for work where correctness, traceability, and collaboration matter.
C – Context and Objective
A – Approved Sources and Proof
P – Plan to Answer
E – Escalation for Uncertainty
R – Rationale and Review
A CAPER session results in a shared reasoning map, a filled canvas, an example of the final output, and the metrics to measure improvement over time.
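To make the canvas tangible before a session, it can help to see its five fields as a single data structure. Below is a minimal sketch in Python; the class and field names are illustrative assumptions, not part of the framework itself.

```python
from dataclasses import dataclass, field

@dataclass
class CaperCanvas:
    """One-page CAPER canvas as an illustrative data structure.

    All field names here are hypothetical; adapt them to your own template.
    """
    objective: str                          # C: one-line goal statement
    success_checklist: list[str]            # C: what a good answer looks like
    approved_sources: list[str]             # A: sources that count as truth
    citation_format: str                    # A: how evidence must be cited
    plan_to_answer: list[str]               # P: five to seven reasoning steps
    escalation_questions: dict[str, str]    # E: uncertainty -> clarifying question
    escalation_owner: str                   # E: who reviews unresolved cases
    review_reason_codes: list[str] = field(default_factory=list)  # R: feedback categories
```

A filled instance of this structure is, in effect, the workshop's shared reasoning map.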
When to Use It
Use CAPER when designing or improving reasoning-heavy tasks such as:
Producing Risk Assessment Briefs or compliance summaries
Conducting technical readiness reviews
Drafting or auditing process documentation
Designing explainable AI workflows in enterprise systems
Who Joins
Product managers, UX designers, domain experts, engineers, and, if needed, security or compliance officers. The session works best when at least one participant has hands-on knowledge of the process being redesigned.
What You Need
One concrete task, such as “Prepare a Risk Assessment Brief for a new digital system”
Access to templates, policies, and previous examples
A list of verified information sources
The CAPER Canvas (one page)
A digital or physical whiteboard for notes
The CAPER Workshop, Step by Step
| Step | Purpose | What You Produce |
| --- | --- | --- |
| 1. Context and Objective | Define the task. Who is asking for the brief, and what is its purpose? Clarify what makes a good Risk Assessment Brief. | One-line goal statement and a success checklist. |
| 2. Approved Sources and Proof | List the sources that define risk standards: internal security frameworks, ISO or NIST controls, audit templates, and historic incidents. Exclude drafts or undocumented material. | A list of approved sources with validation rules and citation format. |
| 3. Plan to Answer | Map the reasoning steps. Example: identify system boundaries, gather risk inputs, rate likelihood and impact, propose mitigations, verify completeness. | A five- to seven-step reasoning sequence used by both humans and agents. |
| 4. Escalation for Uncertainty | Identify where the system or analyst might be unsure. For each uncertainty, create one clarifying question. Example: “Is third-party data processing included?” Define who reviews unresolved cases. | A list of fallback questions and escalation paths. |
| 5. Rationale and Review | Design how reasoning and results appear. The final Risk Assessment Brief shows key risks with citations, rationale, and confidence level. Reviewers can accept or comment on each risk. | A mock output with review options and feedback categories. |
Example: Producing a Risk Assessment Brief (RAB)
A security analyst needs to prepare a Risk Assessment Brief for a new internal application. The goal is to provide leadership with a clear, traceable summary of the main risks and their mitigations.
Context and Objective
The Risk Assessment Brief summarizes identified risks, their likelihood, and mitigation plans. Success is defined as completeness, alignment with corporate risk policy, and clarity for non-technical decision-makers.
Approved Sources and Proof
Allowed inputs include the organization’s Risk Management Standard, the corporate threat library, the most recent incident database, and ISO 27005 guidelines. Every risk entry links to its supporting evidence. Outdated or unverified material is flagged.
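One way to operationalize the allow-list is to pair each approved source with a revalidation rule, so outdated or unverified material is flagged rather than silently used. The sketch below is a minimal Python illustration; the revalidation windows are invented assumptions, while the source names come from the paragraph above.

```python
from datetime import date, timedelta

# Hypothetical allow-list: approved source -> how often it must be revalidated.
# The windows are illustrative assumptions, not policy.
APPROVED_SOURCES = {
    "Risk Management Standard": timedelta(days=365),
    "Corporate Threat Library": timedelta(days=90),
    "Incident Database": timedelta(days=30),
    "ISO 27005 Guidelines": timedelta(days=730),
}

def validate_source(name: str, last_verified: date, today: date | None = None) -> str:
    """Classify a cited source as 'approved', 'outdated', or 'unapproved'."""
    today = today or date.today()
    if name not in APPROVED_SOURCES:
        return "unapproved"   # drafts and undocumented material are excluded
    if today - last_verified > APPROVED_SOURCES[name]:
        return "outdated"     # flagged for revalidation, not silently used
    return "approved"
```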
Plan to Answer
The system or analyst follows a fixed plan (a minimal code sketch follows the list):
1. Identify system boundaries and assets.
2. Retrieve similar past risk cases.
3. Match threats to vulnerabilities.
4. Estimate likelihood and impact using internal scoring.
5. Recommend mitigations and residual risk.
6. Verify completeness against the template.
7. Generate a summary with sources and confidence indicators.
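Expressed in code, the plan becomes an ordered pipeline that both humans and agents can follow and audit. This Python sketch mirrors the seven steps above; the `run_plan` helper and its record-keeping are assumptions about how a team might implement it.

```python
from typing import Callable

# Each step reads the case so far and returns an enriched version of it.
Step = Callable[[dict], dict]

# The seven steps from the plan above, in fixed order.
PLAN_STEPS = [
    "identify_system_boundaries_and_assets",
    "retrieve_similar_past_risk_cases",
    "match_threats_to_vulnerabilities",
    "estimate_likelihood_and_impact",
    "recommend_mitigations_and_residual_risk",
    "verify_completeness_against_template",
    "generate_summary_with_sources_and_confidence",
]

def run_plan(case: dict, steps: dict[str, Step]) -> dict:
    """Run the plan in order, recording each completed step for traceability."""
    for name in PLAN_STEPS:
        case = steps[name](case)   # raises KeyError if a step is missing: fail loudly
        case.setdefault("steps_completed", []).append(name)
    return case
```

Keeping the order explicit in one place means the same sequence governs the agent, the analyst's checklist, and the audit trail.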
Escalation for Uncertainty
If the data is incomplete or conflicting, the system asks a clarifying question, for example: “Is external data transfer part of the system design?” If the analyst cannot answer, the case is routed to Security Engineering for review. Each escalation is recorded and reviewed monthly.
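The escalation rule itself is simple enough to sketch. The Python below assumes an in-memory log as a stand-in for whatever store the monthly review actually uses; the reviewer group and the question text come from the example above.

```python
import logging
from datetime import datetime, timezone

log = logging.getLogger("caper.escalation")

# Reviewed monthly, per the workshop output. An in-memory list is a stand-in
# for a real escalation store.
ESCALATION_LOG: list[dict] = []

def escalate_if_unresolved(case_id: str, question: str, answer: str | None) -> bool:
    """Record an escalation when a clarifying question has no answer."""
    if answer is not None:
        return False
    ESCALATION_LOG.append({
        "case_id": case_id,
        "question": question,
        "routed_to": "Security Engineering",   # from the example above
        "at": datetime.now(timezone.utc).isoformat(),
    })
    log.warning("Case %s escalated: %s", case_id, question)
    return True
```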
Rationale and Review
Each risk item shows the reasoning behind its score, the source used, and a short narrative explaining mitigation. Reviewers can accept, request clarification, or reject entries with reason codes such as “missing data,” “unclear impact,” or “duplicate risk.” Feedback is logged to refine templates and prompts.
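A reviewable risk entry can be captured as a small structure that forces every score to carry its rationale, source, and confidence, and every rejection to carry a reason code. The field layout below is an assumption; the three reason codes come from the text above.

```python
from dataclasses import dataclass

# The three reason codes named above; everything else here is illustrative.
REASON_CODES = {"missing data", "unclear impact", "duplicate risk"}

@dataclass
class RiskEntry:
    risk: str
    score_rationale: str        # reasoning behind the likelihood/impact score
    source: str                 # citation to an approved source
    mitigation: str             # short narrative explaining the mitigation
    confidence: float           # 0.0-1.0, shown to reviewers

@dataclass
class ReviewDecision:
    entry: RiskEntry
    decision: str               # "accept", "request clarification", or "reject"
    reason_code: str | None = None

    def __post_init__(self) -> None:
        if self.decision == "reject" and self.reason_code not in REASON_CODES:
            raise ValueError("rejections need a recognized reason code")
```

Logging `ReviewDecision` objects produces the feedback stream used to refine templates and prompts.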
Beyond One Brief: Building Institutional Memory
While the workshop example focuses on a single Risk Assessment Brief, the same CAPER structure can be reused for other evaluation and assurance processes. Over time, teams identify recurring uncertainties, align sources, and capture best-practice reasoning steps. The result is not a self-writing system but a self-improving workflow where expertise, traceability, and learning are built in.
What Executives Need to Enable
Clear boundaries. Define which tasks can use AI assistance and where human review is mandatory.
Governed sources. Maintain updated risk libraries, templates, and evidence repositories.
Useful metrics. Track time to approved brief, reviewer effort, and escalation rate (a sketch of all three follows this list).
Oversight rhythm. Hold short monthly CAPER sessions to validate new tasks and review metrics.
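The three metrics reduce to simple aggregates over per-brief records. The Python sketch below assumes each brief logs its hours to approval, reviewer minutes, and whether it escalated; the field names are invented for illustration.

```python
from statistics import mean

def brief_metrics(records: list[dict]) -> dict:
    """Aggregate the three executive metrics from per-brief log records.

    Each record is assumed to look like:
    {"hours_to_approval": 12.5, "review_minutes": 40.0, "escalated": False}
    """
    return {
        "time_to_approved_brief_hours": mean(r["hours_to_approval"] for r in records),
        "reviewer_effort_minutes": mean(r["review_minutes"] for r in records),
        "escalation_rate": sum(r["escalated"] for r in records) / len(records),
    }
```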
Conclusion
CAPER gives organizations a practical way to design how people and AI work together on reasoning-heavy tasks. It replaces ad hoc prompting with a shared process that makes goals, sources, logic, and uncertainty explicit.
Whether the task is producing a single Risk Assessment Brief or maintaining ongoing risk reviews, CAPER helps teams move from isolated automation experiments to accountable collaboration where reasoning is visible, auditable, and continuously improving.
References and Acknowledgments
Author: Wout Helsmoortel
Founder, Shaped — specializing in agentic AI systems, explainable architectures, and learning transformation for defense and enterprise environments.
References:
Nielsen Norman Group. Redefine Your Design Skills to Prepare for AI.
OpenAI. Working with Evals. Developer guide.
OpenAI. Evaluation Best Practices.
Zheng, L., et al. Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. arXiv.
Gu, J., et al. A Survey on LLM-as-a-Judge. arXiv.
Anthropic. A New Initiative for Developing Third-Party Model Evaluations.
NIST. AI Risk Management Framework 1.0. NIST Publications.
NIST. Generative AI Profile for the AI RMF. NIST Publications.
ISO. ISO/IEC 42001: AI Management Systems.
IBM. AI Factsheets.