Yue Zhao
hello@fortislabs.ai
00 / Thesis

Decision risk
for AI agents.

A record that survives the deployment. A replay that proves what the agent had to work with. A fix path when the dependency was wrong.

By

Yue Zhao, Assistant Professor of Computer Science at the University of Southern California. Leads the FORTIS Lab. Creator of PyOD.

01 / Heritage in numbers

42M+

PyOD downloads

9.8K+

PyOD GitHub stars

Adopted by OpenAI, Amazon, Walmart, Databricks, Apache Beam, and the European Space Agency.

Recommended in the US Department of Defense CDAO Generative AI Responsible AI Toolkit.

02 / The problem

The hard part is not catching attacks. It is reconstructing why a non-adversarial decision was wrong.

Banks, hospitals, and claims offices are putting AI agents in the loop on decisions that used to take a human. Most of the time the agent is fine. Sometimes it produces a wrong answer that is perfectly internally consistent: every reasoning step is reasonable, every tool call is the one it usually makes, but a single dependency the agent leaned on has drifted from what the firm now considers correct.

When that wrong answer surfaces months later in an audit or a regulator examination, the trace is usually gone, the model has been re-deployed twice, and the engineer who configured it left the firm. What is missing is a record that survives the deployment, a way to re-derive the agent's reasoning under the dependencies that were live at the time, and a way to prove the record has not been edited.

In practice

A broker-dealer's surveillance agent clears a trading flag that should have been escalated. The agent followed the firm's threshold rules, but the customer's KYC tier was provisional at decision time and is no longer treated that way. Three months later, FINRA opens a supervision review on the cleared flag. A replay-and-verify check catches the dependency drift; the firm can show what happened and why before the regulator asks.

03 / Heritage

Operating discipline, new substrate.

The lab's work centers on one question: how do you keep AI systems inspectable, safe, and accountable when they ship to real production work? That agenda builds on a decade of anomaly and outlier detection at scale. PyOD is the clearest deployed proof point: a practical library used across fraud detection, intrusion analysis, operations, and scientific monitoring. Decision risk for AI agents is the next substrate. When an agent's call becomes part of the production system, the same operating discipline applies.

For an agent's decision, the question is whether the call can be replayed, checked against the dependencies that were live at the time, and trusted before regulated work depends on it. The engineering posture (reproducibility, scale, evidence quality) carries over directly.

04 / Approach

Four moves.

Capture, replay, seal, respond. Designed for audit, not for catching attacks.

  1. Capture.

    Save what the agent decided and what evidence it relied on. Keep the record outside the system itself, so it survives the next model swap and the next policy edit. A record sketch follows this list.

  2. Replay.

    Re-run the decision against its original context. If a different value of any single input would have produced a different answer, that is the dependency the audit will want to see.

  3. Seal.

    Chain each step of the trace cryptographically. Anyone editing the record after the fact breaks the chain, and the break is visible to the auditor. It is the same technique a notary or a court would expect. A hash-chain sketch follows this list.

  4. Respond.

    A flag is not enough. When a problem decision is identified, the system has to do something about it: block the call, hand it off to a person, or roll back to the last decision that held up under check. A dispatch sketch follows this list.

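A minimal sketch of the capture record, assuming a Python implementation. DecisionRecord and every field name are illustrative, not a real schema; the concrete values are borrowed from the TR-0712 walkthrough below.

```python
# Illustrative only: one shape a captured decision record could take.
import json
import time
from dataclasses import asdict, dataclass, field

@dataclass
class DecisionRecord:
    trace_id: str    # e.g. "TR-0712"
    agent: str       # which agent made the call
    decision: str    # what the agent decided
    evidence: dict   # the inputs the agent relied on
    deps: dict       # dependency snapshot live at decision time
    ts: float = field(default_factory=time.time)

    def to_bytes(self) -> bytes:
        # Sorted keys so the same record always serializes to the same
        # bytes; the seal step hashes exactly these bytes.
        return json.dumps(asdict(self), sort_keys=True).encode()

# Written to a store outside the agent system itself, so the record
# survives the next model swap and the next policy edit.
record = DecisionRecord(
    trace_id="TR-0712",
    agent="surveillance",
    decision="clear_flag",
    evidence={"customer": "CUST-44910", "score": 0.71},
    deps={"kyc_tier": "provisional", "policy_snapshot": "2026-05-09T18:42Z"},
)
```
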
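A sketch of the seal as a plain hash chain, the standard construction the item above describes. Nothing here is a product API; seal and verify are hypothetical helpers.

```python
# Illustrative hash chain: each entry commits to the previous entry's
# hash, so editing any record breaks every hash after it.
import hashlib
import json

GENESIS = "0" * 64

def seal(entries: list[dict]) -> list[dict]:
    """Chain entries: hash each entry's content together with the previous hash."""
    prev, sealed = GENESIS, []
    for entry in entries:
        digest = hashlib.sha256(
            (json.dumps(entry, sort_keys=True) + prev).encode()
        ).hexdigest()
        sealed.append({**entry, "prev": prev, "hash": digest})
        prev = digest
    return sealed

def verify(sealed: list[dict]) -> bool:
    """Recompute every link; an after-the-fact edit makes this return False."""
    prev = GENESIS
    for entry in sealed:
        body = {k: v for k, v in entry.items() if k not in ("prev", "hash")}
        digest = hashlib.sha256(
            (json.dumps(body, sort_keys=True) + prev).encode()
        ).hexdigest()
        if entry["prev"] != prev or entry["hash"] != digest:
            return False
        prev = entry["hash"]
    return True

chain = seal([{"step": "decision", "trace": "TR-0712"}, {"step": "replay"}])
assert verify(chain)
chain[0]["step"] = "edited"  # tamper with the sealed record
assert not verify(chain)     # the break is visible to the auditor
```
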
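A sketch of the respond step as a three-way dispatch. The routing policy here (block if not yet executed, hand off if a reviewer can still act, roll back otherwise) is an assumption for illustration, not a stated rule.

```python
# Illustrative dispatch: a confirmed violation maps to a concrete action.
from enum import Enum

class Action(Enum):
    BLOCK = "block"        # stop the call before it takes effect
    HANDOFF = "handoff"    # hand the case to a person
    ROLLBACK = "rollback"  # revert to the last decision that held up under check

def respond(executed: bool, reviewable: bool) -> Action:
    if not executed:
        return Action.BLOCK
    if reviewable:
        return Action.HANDOFF
    return Action.ROLLBACK

# TR-0712's cleared flag already executed, but a reviewer can still
# escalate it, so it goes to a person rather than being rolled back.
assert respond(executed=True, reviewable=True) is Action.HANDOFF
```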

05 / Concept walkthrough

What it looks like.

A conceptual control-plane view. Not a screenshot of running software. The violation row is where the dependency drift surfaced at replay.

Dependency drift · Trace TR-0712 · KYC policy snapshot differs from canonical at replay · 2 min ago

TR-0712 · surveillance agent

replayed against the 2026-05-09 18:42 UTC snapshot · 4 steps

  1. decision: surveillance agent cleared trading flag for customer CUST-44910 (score 0.71, retail-tier threshold 0.85)
  2. replay: re-derive the decision under the original dependency snapshot
  3. verify: compare snapshot dependencies to the current canonical policy
  4. violation: customer KYC tier was provisional at trace time, retail at replay time. Current policy requires escalation at 0.71 for provisional tier. Decision would have flipped under the verified dependency.

The agent's reasoning was internally consistent. The dependency it relied on was not. Use one of the actions above (block, hand off, or roll back) to handle the case before it shows up in an audit. The sketch below replays the numbers.

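A sketch of the replay check with the walkthrough's numbers. The threshold table and the replay helper are assumptions for illustration; only the 0.71 score, the 0.85 retail threshold, and the provisional-tier escalation at 0.71 come from the trace.

```python
# Illustrative replay-and-verify for TR-0712.

# Escalation thresholds per KYC tier; values taken from the trace above.
THRESHOLDS = {"retail": 0.85, "provisional": 0.71}

def replay(score: float, kyc_tier: str) -> str:
    """Re-derive the clear/escalate call under a given KYC tier."""
    return "escalate" if score >= THRESHOLDS[kyc_tier] else "clear"

# Steps 1-2: under the tier the agent used (retail, threshold 0.85),
# clearing the flag at score 0.71 was internally consistent.
assert replay(0.71, "retail") == "clear"

# Steps 3-4: verification shows the customer was provisional at trace
# time, and policy escalates provisional at 0.71. The decision flips
# under the verified dependency: that is the drift the audit surfaces.
assert replay(0.71, "provisional") == "escalate"
```
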
06 / About

Yue Zhao is an Assistant Professor of Computer Science at the University of Southern California, where he leads the FORTIS Lab. His research focuses on AI auditing: building methods, benchmarks, and open-source tools that make AI systems inspectable, safe, and accountable.

PyOD is Yue's most visible open-source system, with 42M+ downloads and 9.8K+ GitHub stars. It is used at OpenAI, Apache Beam, Amazon, Walmart, Databricks, and the European Space Agency, and recommended in the US Department of Defense CDAO Generative AI Responsible AI Toolkit.

Other open-source contributions co-developed with collaborators and students at the lab include TrustLLM, a trustworthiness benchmark for LLMs. TrustLLM is cited in a US Senate committee report, a NIST special publication on adversarial machine learning, the US Department of Defense CDAO Generative AI Responsible AI Toolkit, the International AI Safety Report 2026, and three editions of the Future of Life Institute AI Safety Index. Adjacent student-led projects at the lab include agent-audit and Aegis.

Full bio and publication list at yzhao062.github.io.

07 / Contact

hello@fortislabs.ai

Working with AI agents in regulated workflows (financial services, healthcare, claims handling, compliance review)? Reach out.