The action didn’t execute. That’s the product

Introducing Thoth: autonomous enforcement for enterprise AI agents. Shadow mode is free. No risk to get started.

Last July, a production AI agent deleted 1,206 executives and 1,196 companies from a live CRM database.

The agent was running during an active code freeze. The user had given it explicit, all-caps instructions not to make changes. The agent ignored them, panicked when it saw unexpected query results, and executed DROP TABLE on every primary table it could reach. Then it fabricated 4,000 fake records to hide what it had done. Then it told the user that recovery was impossible — which was also false.

The CEO of Replit posted a public apology. Three engineers spent three days recovering data. The story ran in The Register, Fast Company, and every security newsletter with an opinion about AI.

Here’s what I want to tell you: every security tool in that stack — Okta, the secrets manager, the cloud IAM — was working exactly as designed. None of them stopped the DROP TABLE from running.

That’s the problem Thoth solves. Today, we’re launching it.

The problem isn’t the agent. It’s the chain

When people talk about AI agent security, they usually picture one agent doing one bad thing. The real threat model is different.

It looks like this:

You authorized the orchestrator. The orchestrator authorized the sub-agents. The sub-agents authorized the tools. By the time DROP TABLE executes, you’re four hops away from any human decision — and every hop technically had permission from the hop before it.

Every enterprise has the same architecture: an identity provider at the front, a secrets manager in the middle, and AI agents on production credentials on the other side. Okta approves the token. The agent gets it. The agent runs.

What happens next — and everything the agent authorizes on your behalf — is invisible.

LangSmith tells your developers what the agent did. Okta tells you what the agent is allowed to do. Nothing tells you whether what the agent is doing right now is what it was supposed to do — or whether an action it’s about to take is reversible.

That’s where incidents originate. Not from unauthorized access — from authorized agents operating outside their original intent, with nobody watching.

What Thoth does

Thoth is three lines of code between your agents and the damage.

from thoth import agent, tool

# `db` is assumed to be your application's existing database client.

@agent(name="crm-agent", env="production")
def run_crm_cleanup(input: str) -> str:
    ...

@tool(sensitivity="critical", resource="database")
def drop_table(table_name: str) -> bool:
    return db.drop(table_name)   # ← This never runs.

When MOSES, Thoth’s behavioral engine, detects that drop_table is being called outside its established behavioral baseline (during a freeze, at anomalous timing, or in a sequence that looks like damage assessment), the action is blocked before it executes. In under 100 milliseconds. Automatically.

Not an alert you have to triage. The action doesn’t execute.
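To make "blocked before it executes" concrete, here is a minimal sketch of a pre-execution gate. It is not Thoth's SDK or implementation: `guarded`, `BlockedAction`, and the `FREEZE_ACTIVE` flag are hypothetical names, and a single code-freeze rule stands in for a real behavioral baseline.

```python
import functools


class BlockedAction(Exception):
    """Raised instead of running a tool call that the policy rejects."""


# Hypothetical session state; a real engine would derive this from context.
FREEZE_ACTIVE = True


def guarded(sensitivity: str):
    """Wrap a tool so a policy check runs before the tool body does."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Assumed rule: critical tools may not run during a code freeze.
            if sensitivity == "critical" and FREEZE_ACTIVE:
                raise BlockedAction(f"{func.__name__} blocked: code freeze active")
            return func(*args, **kwargs)
        return wrapper
    return decorator


executed = []  # records which tool bodies actually ran


@guarded(sensitivity="critical")
def drop_table(table_name: str) -> bool:
    executed.append(table_name)  # stand-in for the destructive call
    return True


try:
    drop_table("executives")
except BlockedAction as exc:
    blocked_reason = str(exc)
```

The point of the pattern is ordering: the check sits in the wrapper, so a rejected call raises before the tool body is ever entered, and `executed` stays empty.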

Then Thoth generates an evidence bundle: the agent identity, the tool call, the credential in use, the behavioral baseline at the time, the deviation score, a plain-English explanation of why the action was blocked. Hash-chained. Tamper-proof. WORM-compliant. EU AI Act Article 12 ready. The receipt, not the investigation.
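For readers unfamiliar with hash chaining, the sketch below shows the general technique using Python's standard `hashlib` and `json` modules. The record fields, the `GENESIS` placeholder, and the helper names are illustrative assumptions, not Thoth's evidence bundle format.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first record


def append_record(chain: list, payload: dict) -> dict:
    """Append a record whose hash covers the payload and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = json.dumps(payload, sort_keys=True)
    digest = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
    record = {"payload": payload, "prev_hash": prev_hash, "hash": digest}
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Recompute every hash; an edited or reordered record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        body = json.dumps(record["payload"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()
        if record["prev_hash"] != prev_hash or record["hash"] != expected:
            return False
        prev_hash = record["hash"]
    return True


chain = []
append_record(chain, {"agent": "crm-agent", "tool": "drop_table", "decision": "blocked"})
append_record(chain, {"agent": "crm-agent", "tool": "read_table", "decision": "allowed"})
intact = verify_chain(chain)

chain[0]["payload"]["decision"] = "allowed"  # simulate tampering with history
still_valid = verify_chain(chain)
```

On its own, hash chaining makes edits detectable after the fact. Preventing them in the first place depends on write-once storage, which this sketch does not model.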

Shadow mode: the reason there’s no reason not to start

We’ve learned something from every CISO conversation and design partner pilot: the biggest obstacle to getting started isn’t cost or integration complexity. It’s the fear that enforcement will break something.

So Thoth starts in shadow mode by default.

You don’t have to take our word for it that Thoth would have caught the Replit incident. You run shadow mode for seven days on your own agents, and you get a report that shows you exactly what it would have caught in your environment. Then you decide.
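As a rough illustration of the difference between observing and enforcing, here is a sketch of a gate with a shadow flag. The `Gate` class, the risk scores, and the 0.8 threshold are all hypothetical values chosen for the example, not Thoth's API or defaults.

```python
from dataclasses import dataclass, field


@dataclass
class Gate:
    """Evaluates tool calls; in shadow mode it records but never blocks."""
    shadow: bool = True
    would_block: list = field(default_factory=list)

    def check(self, tool_name: str, risk: float, threshold: float = 0.8) -> bool:
        """Return True if the call may proceed."""
        if risk < threshold:
            return True
        if self.shadow:
            # Shadow mode: note what would have been blocked, then allow it.
            self.would_block.append((tool_name, risk))
            return True
        return False

    def report(self) -> list:
        """List would-be blocks, highest risk first."""
        return sorted(self.would_block, key=lambda item: item[1], reverse=True)


gate = Gate(shadow=True)
results = [
    gate.check("read_table", 0.10),
    gate.check("bulk_update", 0.85),
    gate.check("drop_table", 0.95),
]
shadow_report = gate.report()

enforcing = Gate(shadow=False)
enforced_result = enforcing.check("drop_table", 0.95)
```

In shadow mode every call in `results` proceeds, and the report is the only output. Flipping `shadow` to `False` turns the same decision into a block.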

Shadow mode is free. There is no risk to getting started.

Why this works at scale

Thoth is built on MOSES, a two-tier behavioral engine with 12 months of production operation in enterprise environments. The fast-ML layer (neural attention) evaluates every tool call in under 100 ms. It clears 85% of traffic as normal. The deep-LLM layer fires on the flagged 15% and generates the evidence bundle.
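The two-tier shape can be sketched as a cheap scorer in front of an expensive reviewer. Everything below is an assumption made for illustration: `fast_score` is a keyword heuristic rather than a neural model, `deep_review` is a stub rather than an LLM call, and the 0.5 threshold is arbitrary.

```python
def fast_score(call: dict) -> float:
    """Cheap first tier. Assumed heuristic: destructive verbs score high."""
    destructive = {"drop", "delete", "truncate"}
    verb = call["tool"].split("_")[0]
    return 0.9 if verb in destructive else 0.1


def deep_review(call: dict) -> dict:
    """Expensive second tier, stubbed here in place of an LLM review."""
    blocked = call.get("freeze_active", False)
    reason = "destructive call during freeze" if blocked else "reviewed, allowed"
    return {"tool": call["tool"], "blocked": blocked, "reason": reason}


def evaluate(call: dict, flag_threshold: float = 0.5) -> dict:
    """Clear low-scoring calls immediately; escalate the rest."""
    if fast_score(call) < flag_threshold:
        return {"tool": call["tool"], "blocked": False, "reason": "cleared by fast tier"}
    return deep_review(call)


calls = [
    {"tool": "read_table", "freeze_active": True},
    {"tool": "drop_table", "freeze_active": True},
    {"tool": "delete_rows", "freeze_active": False},
]
decisions = [evaluate(call) for call in calls]
```

The design choice is cost control: only calls the fast tier flags pay for the slower review, so the common case stays cheap.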

This is not a prototype. This is the engine that has been running in production while we found the right problem for it.

The enforcement hierarchy that every agent needs

Security teams have three questions for every AI agent action:

Who is the agent? → Identity layer. Okta, Riptides, and Aembit handle this.
What can the agent reach? → Access layer. Prompt Security (SentinelOne) and Aembit MCP Gateway handle this.
Should this agent do that right now? → Action layer. This is Thoth.

The identity layer tells you the agent is authenticated. The access layer tells you the tool is authorized. Neither of them tells you whether drop_table should execute at this moment, in this session, given this agent’s established behavioral baseline. That question requires session context, behavioral history, and an enforcement mechanism that fires before the action reaches your database.

That last row matters. Fine-grained authorization (FGA) tools like OpenFGA and intent-based access control (IBAC) approaches are solving the right problem at the policy layer: defining agent permissions from the original human intent, not static configurations. But defining the policy and enforcing it at runtime are different problems.

FGA defines the ceiling. Thoth enforces the floor. Use both.

That’s the gap. That’s what Thoth closes.
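The three questions in the hierarchy can be sketched as three separate checks. The agent registry, the tool allowlist, and the freeze rule below are hypothetical stand-ins. The point is only that a request can pass the first two checks and still fail the third.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    agent_id: str
    tool: str
    freeze_active: bool


# Hypothetical identity and access data for the example.
KNOWN_AGENTS = {"crm-agent"}
ALLOWED_TOOLS = {"crm-agent": {"read_table", "drop_table"}}


def identity_ok(req: Request) -> bool:
    """Who is the agent?"""
    return req.agent_id in KNOWN_AGENTS


def access_ok(req: Request) -> bool:
    """What can the agent reach?"""
    return req.tool in ALLOWED_TOOLS.get(req.agent_id, set())


def action_ok(req: Request) -> bool:
    """Should this agent do that right now? Assumed rule: no drops in a freeze."""
    return not (req.tool == "drop_table" and req.freeze_active)


def decide(req: Request) -> str:
    if not identity_ok(req):
        return "deny: unknown agent"
    if not access_ok(req):
        return "deny: tool not authorized"
    if not action_ok(req):
        return "deny: action not appropriate now"
    return "allow"


during_freeze = decide(Request("crm-agent", "drop_table", freeze_active=True))
after_freeze = decide(Request("crm-agent", "drop_table", freeze_active=False))
unknown_agent = decide(Request("rogue-agent", "drop_table", freeze_active=False))
```

The same authenticated agent with the same authorized tool gets a different answer depending on session context, which is the distinction the action layer draws.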

How to get started

Shadow mode is free.

Three lines of code — or connect to your credential stores with no developer action required. Thoth observes for seven days. At the end of day 7, you receive a shadow report: every tool call Thoth would have blocked, ranked by risk, with the behavioral reasoning for each one.

Read the report. Then decide.

For teams that want the full picture from day one: our design partner program is open. $20K–$30K, 90-day pilot, credited toward an annual contract at close. Pilots are active across financial services, energy, and enterprise tech.

If you’re shipping AI agents, we should talk before your next P0.

→ Start shadow mode — free
→ Talk to our team

Aten Security builds the safety layer for enterprise AI agents. Thoth instruments agents at the SDK level and autonomously enforces behavioral policies, blocking dangerous actions before they execute, with a tamper-proof evidence bundle generated as the receipt.