Tags: Agentic, Workflows, Framework
Author: Wout Helsmoortel
Most organizational processes were built for predictability. They follow fixed rules, predefined roles, and linear steps. That works in stable conditions, but it collapses when environments shift. Regulations evolve, data multiplies, and experts are spread thin. Traditional automation improves efficiency, but not adaptability. Agentic systems do more. They interpret, evaluate, and learn. They understand intent, adjust behavior in context, and scale expertise safely. At Shaped, we design these systems as living processes that evolve through human feedback and continuous evaluation. This is the Shaped Agentic Process Implementation Framework: a model for turning everyday workflows into adaptive, explainable, and self-improving systems.
The Shaped Agentic Process Implementation Framework
Agentic systems aren’t one product or model — they’re ecosystems that combine strategy, design, orchestration, and evaluation.
This framework helps organizations move from identifying opportunities to deploying reliable agentic workflows.
1. Strategic Alignment
The first step is choosing where and why to apply agentic principles.
Not every process benefits equally. The best candidates are those where expert reasoning matters, feedback is frequent, and outcomes are measurable.
Teams map current workflows, identify bottlenecks, and define goals such as reducing review time, improving accuracy, or increasing traceability.
This stage also clarifies what remains human-led and what can become agent-led.
Example:
A defense training command finds that field evaluation reports take weeks to process. Experts spend more time formatting than analyzing insights.
The goal becomes clear: automate structure and validation while keeping humans responsible for interpretation.
2. Knowledge and Context Foundation
Agents are only as good as the data and policies that ground them.
This stage builds the factual and contextual backbone of the system.
Teams curate verified datasets, connect internal repositories, and define ontologies that describe both information and roles.
By understanding relationships between concepts, policies, and users, agents can reason on structured knowledge and adapt their behavior to the person or context they serve.
Compliance rules, policies, and business logic are built in from the start, not checked after the fact.
Example:
An accounting department merges its internal manuals, updated tax rules, and precedent cases into a single retrievable knowledge base.
Each role within the team — trainee, reviewer, auditor — is also defined in the ontology. This lets the agent tailor its reasoning and tone: it can surface guidance for juniors or validation logic for seniors.
When the agent drafts fiscal advice, it draws from verified sources and adapts its response to the user’s role and responsibility.
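As a rough illustration, a role-aware knowledge entry could be modeled with a few simple structures. The classes and field names below (Role, KnowledgeEntry, compose_context) are assumptions made for this sketch, not a prescribed schema; they only show how the same verified fact can be framed differently for a trainee than for an auditor.

```python
from dataclasses import dataclass

# Illustrative roles and knowledge entries from the accounting example.
# The field names are assumptions for this sketch, not a fixed schema.
@dataclass
class Role:
    name: str                # e.g. "trainee", "reviewer", "auditor"
    needs_guidance: bool     # juniors receive step-by-step explanations
    needs_validation: bool   # seniors receive validation logic and effective dates

@dataclass
class KnowledgeEntry:
    topic: str
    content: str
    source: str              # verified source the agent can cite
    effective_from: str      # lets the agent prefer current rules over stale ones

def compose_context(entry: KnowledgeEntry, role: Role) -> str:
    """Assemble the grounding context the agent receives for a given user role."""
    context = f"{entry.topic}: {entry.content} (source: {entry.source})"
    if role.needs_guidance:
        context += "\nExplain the reasoning step by step for a junior colleague."
    if role.needs_validation:
        context += f"\nInclude validation checks; rule effective from {entry.effective_from}."
    return context

trainee = Role("trainee", needs_guidance=True, needs_validation=False)
vat_rule = KnowledgeEntry(
    topic="Small-business VAT threshold",
    content="Apply the updated threshold when drafting fiscal advice.",
    source="internal tax manual (illustrative)",
    effective_from="2025-01-01",
)
print(compose_context(vat_rule, trainee))
```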
3. Agentic Process Design
This is where static workflows become adaptive and context-aware.
Teams identify agentic moments—points in a process where the system should reason, clarify, or escalate. Each moment defines how the agent behaves: when it acts, when it asks, and when it waits for a human.
Prompts are written as structured reasoning patterns: interpret, verify, act, summarize, and evaluate. These shape how the agent “thinks.” Alongside them, workflow logic defines the sequence, dependencies, and what happens if a step fails. Together they form a process that can adapt intelligently rather than follow a script.
Tool and data connections are added through orchestration frameworks like AgentKit or n8n, specifying what the agent can access, how it retrieves context, and when to retry or roll back. Evaluation checkpoints are embedded to measure confidence and trigger human review when needed.
Example:
In an accounting workflow, agents now assist rather than just automate. When a junior accountant uploads a fiscal draft, the agent analyzes it and says:
“It looks like this advice is missing the updated small-business VAT threshold. Would you like me to insert the current rate and reference the source?”
The agent then applies the right policy, justifies its choice, and adds citations. If confidence in any section is low, it flags the report for senior review. The process becomes self-correcting, faster, and consistently aligned with policy.
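To make the idea of an agentic moment concrete, the sketch below shows one way such a decision point could be gated on confidence: the agent acts when it is sure, asks when it is unsure, and escalates to senior review when confidence is low. The thresholds and the finding structure are assumptions for illustration, not values from a specific deployment.

```python
# A minimal sketch of one agentic moment, gated on confidence.
# The thresholds and the finding structure are illustrative assumptions.

ACT_THRESHOLD = 0.85   # above this, the agent applies the fix itself
ASK_THRESHOLD = 0.60   # between the two, it asks a clarifying question first

def decide_action(finding: dict) -> str:
    """Map an agent finding to one of three behaviors: act, ask, or escalate."""
    confidence = finding["confidence"]
    if confidence >= ACT_THRESHOLD:
        return f"act: apply '{finding['fix']}' and cite {finding['source']}"
    if confidence >= ASK_THRESHOLD:
        return f"ask: \"It looks like {finding['issue']}. Should I {finding['fix']}?\""
    return "escalate: flag the section for senior review"

finding = {
    "issue": "this advice is missing the updated small-business VAT threshold",
    "fix": "insert the current rate and reference the source",
    "source": "the internal tax manual",
    "confidence": 0.72,
}
# Confidence 0.72 falls in the "ask" band, so the agent clarifies before editing.
print(decide_action(finding))
```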
4. Orchestration and Execution
Once the process is designed, orchestration brings it to life.
This is where agents coordinate actions, access data, and interact with tools and other agents to complete work in a structured, auditable way.
An orchestrated workflow gives the agent both autonomy and guardrails. It defines how the agent plans tasks, what tools it can use, and how it should respond if something goes wrong. Memory components preserve context across steps so that the agent can reason consistently, while observability layers log each decision, making every action traceable.
Through orchestration frameworks, agents can call APIs, retrieve or update data, and collaborate with peer agents. One agent might extract key figures from a document while another validates them against policy and a third compiles the report and evaluation summary. Together, they operate as a coordinated system rather than isolated bots.
Example:
During a fiscal audit simulation, the orchestrator triggers three specialized agents. The first gathers figures from the accounting database, the second checks each value against current tax thresholds, and the third generates a human-readable summary with source citations.
If an API call fails or returns ambiguous data, the orchestrator pauses execution, requests clarification, and retries. Every step is logged and replayable, ensuring both transparency and reliability.
The outcome is a process that not only performs tasks but manages its own quality and continuity.
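The orchestration pattern above can be sketched in a few lines: steps run in sequence, each is retried on failure, and every attempt is logged so the run stays traceable and replayable. The step functions and retry policy below are illustrative assumptions, not the API of AgentKit, n8n, or any specific framework.

```python
import time

def run_step(name, func, context, log, retries=2):
    """Run one orchestrated step with retries, logging every attempt."""
    for attempt in range(1, retries + 2):
        try:
            result = func(context)
            log.append({"step": name, "attempt": attempt, "status": "ok"})
            return result
        except Exception as exc:
            log.append({"step": name, "attempt": attempt, "status": f"failed: {exc}"})
            time.sleep(1)  # back off; a real orchestrator might pause and ask for clarification
    raise RuntimeError(f"Step '{name}' failed after {retries + 1} attempts")

# Three illustrative specialist agents, reduced to plain functions for the sketch.
def extract_figures(ctx):  return {**ctx, "figures": {"revenue": 120_000}}
def validate_figures(ctx): return {**ctx, "validated": True}
def summarize(ctx):        return {**ctx, "summary": "Figures validated against current thresholds."}

log, context = [], {}
for name, step in [("extract", extract_figures),
                   ("validate", validate_figures),
                   ("summarize", summarize)]:
    context = run_step(name, step, context, log)

print(context["summary"])
print(log)  # the replayable trace of every attempt
```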
5. Evaluation and Learning Loop
This stage is what makes a system truly agentic.
Evaluation isn’t a separate phase at the end of the process; it’s embedded directly into the workflow.
Experts first define what “good” looks like through clear metrics such as accuracy, compliance, and completeness. These become the standards that Evaluator Agents learn to apply automatically. Each output is compared with expert judgments, and the differences feed back into continuous refinements. Over time, alignment improves until the agent can handle most routine evaluations independently, while experts focus on complex or exceptional cases.
Example:
After generating a fiscal report, the Evaluator Agent reviews it for completeness and compliance.
If its judgment diverges from the human reviewer’s, the system logs the discrepancy and adjusts its evaluation logic in the next iteration.
Within weeks, human review time is cut in half while quality remains stable: a closed feedback loop that keeps both agents and processes learning.
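A small sketch of the comparison behind that loop: the Evaluator Agent’s verdicts are checked against the expert’s, disagreements are logged, and agreement is tracked over time. The records and the simple agreement metric are illustrative assumptions.

```python
# Compare evaluator verdicts with expert verdicts and log the discrepancies.
# The records and the agreement metric are illustrative assumptions.

reviews = [
    {"report": "Q1-001", "agent": "compliant",     "expert": "compliant"},
    {"report": "Q1-002", "agent": "compliant",     "expert": "non-compliant"},
    {"report": "Q1-003", "agent": "non-compliant", "expert": "non-compliant"},
]

discrepancies = [r for r in reviews if r["agent"] != r["expert"]]
agreement = 1 - len(discrepancies) / len(reviews)

print(f"Agreement with expert reviewers: {agreement:.0%}")
for d in discrepancies:
    # Each logged discrepancy becomes a signal for refining the evaluator's criteria.
    print(f"Review {d['report']}: agent said {d['agent']}, expert said {d['expert']}")
```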
Read more about how Shaped implements this approach in our article on Agentic Evaluation. ↗
6. Governance and Oversight
Even the most capable agentic systems are only as trustworthy as the oversight that governs them.
Governance ensures that every action, decision, and data interaction remains transparent and accountable.
This stage focuses on traceability and control. Every reasoning step and tool call is logged, each model or prompt version is recorded, and access to data is managed through roles and permissions.
Dashboards surface key indicators such as accuracy, bias, drift, and compliance, allowing teams to monitor system performance and identify when human review is needed.
Example:
In the fiscal reporting environment, auditors can open any completed evaluation, trace how the agent reached its conclusion, and review which datasets or policies were referenced.
If a regulation changes, governance procedures ensure that the relevant knowledge sources are updated and versioned before the agent resumes operation.
The result is a system that builds trust through verifiable accountability.
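One way to picture that accountability is a structured audit record attached to every decision: which model and prompt version ran, which knowledge sources were referenced, and which roles may inspect the result. The field names below are illustrative assumptions, not a prescribed logging schema.

```python
import json
from datetime import datetime, timezone

# An illustrative audit record; field names and values are assumptions for this sketch.
audit_record = {
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "workflow": "fiscal-report-evaluation",
    "model_version": "model-2025-06",
    "prompt_version": "evaluate-report-v3",
    "knowledge_sources": [
        {"name": "internal-tax-manual", "version": "2025.1"},
        {"name": "vat-thresholds", "version": "2025-01-01"},
    ],
    "decision": "flagged for senior review",
    "confidence": 0.74,
    "visible_to_roles": ["auditor", "reviewer"],
}

print(json.dumps(audit_record, indent=2))
```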
7. Human Enablement and Change
Agentic transformation only succeeds when people grow with it.
This stage focuses on giving teams the skills and confidence to guide, supervise, and improve their agentic systems.
Experts move from executing routine work to shaping how the system reasons and evaluates. Training covers how to read agent rationales, interpret scores, and refine evaluation criteria. Over time, these same teams learn to design new agentic workflows themselves, turning human expertise into an evolving knowledge base.
Example:
In the accounting team, analysts now review the reasoning behind evaluator decisions rather than the raw data.
When they spot recurring gaps or blind spots, they adjust the criteria and retrain the agent.
This keeps automation grounded in human judgment and ensures that the system improves through collaboration rather than replacement.
Bringing It All Together
The Shaped Agentic Process Implementation Framework is a continuous ecosystem where each stage strengthens the next.
Strategy defines what matters, knowledge provides the foundation, design gives shape to intent, orchestration brings it to life, evaluation ensures learning, governance maintains accountability, and humans drive improvement.
Together they form a loop that keeps processes alive and adaptable.
Over time, workflows evolve into intelligent systems that scale judgment, maintain transparency, and respond to change with confidence.
Agentic systems represent more than automation. They are a shift toward processes that think, explain, and improve.
By embedding reasoning, evaluation, and feedback inside the workflow, organizations create systems that not only perform tasks but understand why and how they do them.
When this mindset takes hold, operations stop being static procedures and become living networks of expertise and adaptation.
That is what it means to build agentic organizations with Shaped: systems that learn with you, improve with you, and strengthen every time they are used.
References and Acknowledgments
Author: Wout Helsmoortel
Founder, Shaped — specializing in agentic AI systems, explainable architectures, and learning transformation for defense and enterprise environments.
Academic references
Gu, J., et al. (2024). A Survey on LLM-as-a-Judge. arXiv.
Li, H., et al. (2024). LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods. arXiv.
Zheng, L., et al. (2023). Judging LLM-as-a-Judge with MT-Bench and Chatbot Arena. arXiv.
Li, Z., et al. (2024). Split and Merge: Aligning Position Biases in LLM-based Evaluation. EMNLP 2024. ACL Anthology.
Industry and practice
OpenAI. Working with Evals. Developer guide.
Anthropic. (2024). Initiative for Third-Party Model Evaluations.
IBM Research. (2025). The Future of AI Agent Evaluation and ITBench.
Evidently AI. (2025). LLM-as-a-Judge: Complete Guide.