Building AI Agents on Top of ERPNext

For a business to be AI-native, its system of record has to be AI-native first.

That's not a marketing line — it's the first thing you bump into when you try to point an LLM at a real company. The agent is only as good as the data surface it can actually see and act on. If your ERP is a black box behind a thick UI, no amount of model capability will save you. The agent will hallucinate, the tool calls will fail, and you'll end up writing glue code forever.

ERPNext is interesting precisely because it inverts that problem. Every doctype is a REST resource. Every field is introspectable. Every workflow is a whitelisted method. It's the most honest ERP surface we've ever wired an agent into — and it's open source, so there are no gatekeepers between you and the data model.

This post is about what we built on top of it, the thesis driving the work, and the specific architectural decisions that made the agents actually work.

The Thesis: AI-Native ERP Starts With an AI-Native System of Record

Most "AI in ERP" stories right now are vendor copilots bolted onto closed systems. You get a sidebar, you get a chat box, you get a demo where someone asks "what were my top 5 customers last quarter" and the answer comes back formatted nicely. That's fine. That's also not what AI-native means.

AI-native means the agent can do things. Create a sales order when a qualified lead asks for one. Reconcile a payment against three overlapping invoices. Flag a purchase order whose receipt never arrived. Spin up a work order when inventory drops below reorder level. Not as demos — as the normal operating state of the business.

For that to work, three things have to be true:

Every business object is addressable. The agent needs to read and write any record without the vendor shipping a new feature.
Schema is introspectable at runtime. The agent can't ship with hardcoded field lists — doctypes evolve, custom fields get added, and the agent has to discover this without a redeploy.
Workflows are callable. Submitting a document, cancelling it, moving it through an approval — these need to be first-class operations, not scraped from a UI.

ERPNext happens to tick all three. The Frappe framework underneath it exposes a full REST surface (/api/resource/<Doctype> for CRUD, /api/method/frappe.client.get_meta for schema, /api/method/frappe.client.submit for workflow transitions). Every field you see in the UI is addressable through the same API the UI itself uses. This isn't a public API layer bolted on top of a private one — it is the API.
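To make the shape of that REST surface concrete, here is a minimal stdlib-only sketch that builds (but does not send) requests for the resource and method endpoints. It is not the helper described later in this post. The site URL, the function names, and the {"doc": ...} body for submit are assumptions for illustration; the token header follows Frappe's API key/secret scheme.

```python
import json
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "https://erp.example.com"  # hypothetical site URL


def auth_headers(api_key: str, api_secret: str) -> dict:
    """Frappe token auth: 'Authorization: token <api_key>:<api_secret>'."""
    return {
        "Authorization": f"token {api_key}:{api_secret}",
        "Accept": "application/json",
        "Content-Type": "application/json",
    }


def resource_url(doctype: str, name: Optional[str] = None, **params) -> str:
    """Build /api/resource/<Doctype>[/<name>] with optional query parameters."""
    # Doctype names contain spaces ("Sales Invoice"), so they must be quoted.
    url = f"{BASE_URL}/api/resource/{quote(doctype)}"
    if name is not None:
        url += f"/{quote(name, safe='')}"
    if params:
        # List-style parameters such as filters and fields travel as JSON strings.
        encoded = {k: v if isinstance(v, str) else json.dumps(v) for k, v in params.items()}
        url += "?" + urlencode(encoded)
    return url


def method_url(dotted_path: str) -> str:
    """Build /api/method/<dotted.path> for whitelisted methods."""
    return f"{BASE_URL}/api/method/{dotted_path}"


def list_request(doctype: str, filters: list, fields: list, headers: dict) -> Request:
    """GET a filtered list of records."""
    return Request(resource_url(doctype, filters=filters, fields=fields),
                   headers=headers, method="GET")


def create_request(doctype: str, doc: dict, headers: dict) -> Request:
    """POST a new record (a draft, for submittable doctypes)."""
    return Request(resource_url(doctype), data=json.dumps(doc).encode(),
                   headers=headers, method="POST")


def submit_request(doc: dict, headers: dict) -> Request:
    """POST to frappe.client.submit; the {"doc": ...} body shape is an assumption."""
    return Request(method_url("frappe.client.submit"),
                   data=json.dumps({"doc": doc}).encode(),
                   headers=headers, method="POST")
```

Passing any of these Request objects to urllib.request.urlopen would perform the call. Keeping construction separate from sending makes the URLs easy to inspect and log before anything touches the site.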

That property alone is what makes building agents on ERPNext fundamentally different from building them on SAP or NetSuite. You're not fighting the platform. You're standing on it.

Why ERPNext Specifically

We chose ERPNext for three reasons, and all three matter.

It's open source, and that's not a philosophical preference — it's an architectural one. When you're building agents, you spend a surprising amount of time reading framework internals. What does this whitelisted method actually do? How does the permission check work when a child table is updated? What happens to linked documents on cancel? With ERPNext you can just read the code. With closed-source ERP, you file a support ticket and wait.

Its doctype coverage is deep. Sales, purchasing, inventory, manufacturing, accounting, HR, projects, assets, quality, CRM: all in one data model, all with the same REST conventions. One integration pattern covers every domain. That's what made the per-domain agent architecture actually tractable.

It has real-world adoption at the SMB layer. ERPNext runs actual businesses — from manufacturers to hospitals to retailers. The agent work isn't a lab exercise. Whatever we build is directly deployable against a production site, which changes how you think about edge cases.

One thing to be clear about: ERPNext has no native AI layer as of v16. There's no built-in LLM hook, no copilot extension point, no "agent runtime." If you want agents, you build them outside the platform and talk to it over REST. That turns out to be the right architecture anyway, for reasons we'll get into.

What We Built

Three things, layered:

1. A portable ERPNext skill

A self-contained skill that teaches Claude the Frappe REST API — auth model, permission gotchas, doctype conventions, schema introspection, workflow transitions. It includes a stdlib-only Python helper (frappe_client.py) that wraps the common operations: list, get, create, update, submit, cancel, count, get_meta. No dependencies, no build step, drops into any Claude Code session.

This is the foundation every agent sits on. The skill carries the institutional knowledge about how Frappe actually works — the stuff that takes weeks to learn the hard way — and makes it available to the model on demand.

2. A web demo app with the Claude Agent SDK

A React + Vite frontend, an Express + WebSocket backend, the Claude Agent SDK driving tool calls, and SQLite for persistent chat + cost + tool-call history. The agent has access to the ERPNext skill via Bash tool invocations — so every conversation is also a live audit trail of exactly which REST calls ran against the site.

The UI shows tool calls streaming in as the agent works: frappe_client.py list "Sales Invoice", frappe_client.py get_meta "Work Order", frappe_client.py submit "Journal Entry" ACC-JV-2026-00042. You see the agent reason, introspect the schema, and then act. When it writes, you see exactly what it wrote.

3. Per-domain agent demos

We're building one agent per ERPNext domain: Sales, Accounts Payable, Inventory, Support, Manufacturing, Accounting, HR. Each runs as the same Claude Agent SDK process, but scoped to a dedicated agent-bot ERPNext user with only the roles relevant to that domain (Sales Manager for the Sales agent, Stock Manager for Inventory, and so on).

The scoping matters. An agent with full System Manager access is a nightmare for security review and a worse nightmare for debugging. An agent with Sales User + Sales Manager roles is blast-radius limited by the platform's own permission model. When it tries to touch something outside its lane, ERPNext returns PermissionError and the agent surfaces that cleanly.
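As a sketch of what "surfaces that cleanly" can look like, the function below turns a raw HTTP response into a short result the agent can report instead of retrying blindly. The ToolResult type, the function name, and the wording of the messages are hypothetical. It also assumes the error body carries an exc_type key and that permission failures arrive as HTTP 403, which should be checked against the running site.

```python
import json
from dataclasses import dataclass


@dataclass
class ToolResult:
    ok: bool
    status: int
    message: str


def interpret_response(status: int, body: str) -> ToolResult:
    """Map an HTTP status and body onto a message the agent can relay."""
    try:
        payload = json.loads(body) if body else {}
    except json.JSONDecodeError:
        payload = {}  # e.g. an HTML error page from a proxy
    if not isinstance(payload, dict):
        payload = {}

    if 200 <= status < 300:
        return ToolResult(True, status, "ok")

    exc_type = payload.get("exc_type", "")  # assumed key in the error body
    if status == 403 or exc_type == "PermissionError":
        return ToolResult(
            False,
            status,
            "Permission denied: this action is outside the agent user's roles. "
            "Do not retry; report it to the operator.",
        )
    return ToolResult(False, status, f"Request failed ({exc_type or 'unknown error'})")
```

The point of the explicit "do not retry" wording is that a permission failure is a scoping decision, not a transient error, so the agent should stop and say so.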

Architecture: External Service + REST, Not In-Process

The single most important architectural call we made was to run the agent as an external service talking to ERPNext over REST, rather than as an in-process custom Frappe app.

The temptation to build it in-process is real. Frappe has a whole app model. You can write Python that runs inside the bench, call Frappe's ORM directly, hook into document events, and skip the REST layer entirely. On paper this looks cleaner. In practice it's a trap.

Here's why external-first wins:

Upgrade coupling. An in-process app ties your agent's lifecycle to Frappe's. Every bench migrate, every ERPNext point release, every patch — you're in the upgrade path. External services update independently.
Runtime freedom. You want to run a specific LLM SDK version, a specific Python version, pin specific dependencies. Inside the bench you inherit whatever Frappe dictates. Outside, you control your own runtime.
Blast radius. A bug in an in-process app can take down the ERP. A bug in an external service degrades to "the agent is down" — the business keeps running.
Deployment topology. External means you can host the agent anywhere, talk to multiple ERPNext sites from one agent, or multiple agents against one site. In-process forces 1:1.

The sole downside — latency of HTTP vs. in-process calls — is negligible for an agent architecture. The agent is already waiting hundreds of milliseconds for model tokens. A 20ms REST call is noise.

One specific anti-pattern we'd call out: don't try to use Frappe Server Scripts as the agent's reasoning layer. Server Scripts run inside RestrictedPython, which blocks the imports you need (HTTP clients, SDKs, almost anything interesting). It's a sandbox designed for small trigger logic, not for hosting an LLM loop. We tried. It doesn't work. Build external.

The Gotchas We Hit

A few things that cost us time so you don't have to.

The permission model is roles-first, not flags. System Manager sounds like the keys-to-the-kingdom role, and for platform admin stuff it is. On its own, though, it does not give the agent working access to ERPNext's business doctypes. If your agent is returning PermissionError: Insufficient Permission for Sales Order, the fix is to add Sales Manager (or Sales User) to the agent's user, not to escalate System Manager further. This trips up every new integrator.

Schema is not in the docs — it's in the API. The official Frappe docs cover transport, auth, and filter syntax well, but they do not list doctype field schemas. Custom fields, client customisations, app-specific extensions all mean that the "real" schema is whatever the running site says it is. frappe.client.get_meta is how you discover it. Every agent we ship calls get_meta before it writes.
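One way to act on "get_meta before it writes" is to check a draft payload against the returned field list before sending it. The sketch below assumes the meta has already been fetched and parsed into a dict with a fields list, where each entry carries fieldname, fieldtype, and reqd. SAMPLE_META is a trimmed, made-up stand-in rather than the real Sales Order schema, and check_payload is a hypothetical name.

```python
# Fieldtypes that only shape the form layout and never hold data.
LAYOUT_FIELDTYPES = {"Section Break", "Column Break", "Tab Break", "HTML", "Button", "Heading"}

# Keys every document may carry regardless of its doctype's field list.
STANDARD_KEYS = {"doctype", "name", "docstatus"}

# Trimmed, illustrative meta. A real site returns many more fields.
SAMPLE_META = {
    "name": "Sales Order",
    "fields": [
        {"fieldname": "customer", "fieldtype": "Link", "options": "Customer", "reqd": 1},
        {"fieldname": "delivery_date", "fieldtype": "Date", "reqd": 0},
        {"fieldname": "section_items", "fieldtype": "Section Break", "reqd": 0},
    ],
}


def check_payload(meta: dict, payload: dict) -> list:
    """Return a list of problems; an empty list means the payload looks writable."""
    fields = {
        f["fieldname"]: f
        for f in meta.get("fields", [])
        if f.get("fieldtype") not in LAYOUT_FIELDTYPES
    }
    problems = []
    # Catch typos and stale field names before the server does.
    for key in payload:
        if key not in fields and key not in STANDARD_KEYS:
            problems.append(f"unknown field: {key}")
    # Catch mandatory fields the agent forgot to fill in.
    for fieldname, field in fields.items():
        if field.get("reqd") and payload.get(fieldname) in (None, "", []):
            problems.append(f"missing required field: {fieldname}")
    return problems
```

Feeding the problem list back to the model as the tool result lets it correct a misspelled or missing field itself, without a round trip to the site.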

Submittable documents aren't just "saved." In Frappe, doctypes like Sales Invoice, Journal Entry, and Work Order have a three-state lifecycle tracked in docstatus: Draft (0) → Submitted (1) → Cancelled (2). A POST to /api/resource/<Doctype> creates a draft. Nothing actually happens (no GL entry, no stock movement) until you submit. Agents need to know which doctypes are submittable and call frappe.client.submit explicitly. Otherwise you end up with a graveyard of draft records that look right but aren't wired into anything.
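The lifecycle can be expressed as a small decision function. This is a sketch, assuming the doctype meta exposes is_submittable and the document exposes docstatus with the values 0, 1, and 2. The function name and the returned action strings are hypothetical.

```python
from enum import IntEnum


class DocStatus(IntEnum):
    DRAFT = 0
    SUBMITTED = 1
    CANCELLED = 2


def next_action(meta: dict, doc: dict) -> str:
    """Decide what, if anything, still has to happen after a save."""
    if not meta.get("is_submittable"):
        return "none"  # non-submittable doctypes take effect on save
    status = DocStatus(doc.get("docstatus", 0))
    if status is DocStatus.DRAFT:
        return "submit"  # saved but not yet posted: no GL entry, no stock movement
    if status is DocStatus.SUBMITTED:
        return "none"  # already in effect
    return "amend"  # a cancelled document is not resubmitted; it is amended into a new draft
```

Running a check like this after every create is a cheap guard against the draft graveyard described above.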

The linked-document web is deeper than it looks. Creating a Sales Invoice isn't one record — it's an invoice plus child table items plus taxes plus terms plus GL entries on submit plus possibly a Payment Entry downstream. The agent needs to understand these dependencies, and the best way to teach it is to let it read the doctype with get_meta and follow the link-type fields. Don't hardcode the graph. Introspect it.
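Introspecting the graph can be sketched as a breadth-first walk over Link and Table fields, where each field's options value names the target doctype. The function names are hypothetical, and get_meta here is any callable that returns a parsed meta dict for a doctype name, so the walk can be exercised without a live site. Dynamic Link fields, whose target is decided per record, are deliberately left out.

```python
from collections import deque
from typing import Callable


def linked_doctypes(meta: dict) -> dict:
    """Map fieldname -> (kind, target doctype) for Link and child-table fields."""
    links = {}
    for field in meta.get("fields", []):
        fieldtype = field.get("fieldtype")
        target = field.get("options")
        if not target:
            continue
        if fieldtype == "Link":
            links[field["fieldname"]] = ("link", target)
        elif fieldtype in ("Table", "Table MultiSelect"):
            links[field["fieldname"]] = ("child", target)
    return links


def walk_links(start: str, get_meta: Callable[[str], dict], max_depth: int = 2) -> list:
    """Breadth-first walk returning (doctype, fieldname, kind, target) edges."""
    seen = {start}
    queue = deque([(start, 0)])
    edges = []
    while queue:
        doctype, depth = queue.popleft()
        if depth >= max_depth:
            continue  # stay shallow: the full graph of a real site is large
        for fieldname, (kind, target) in linked_doctypes(get_meta(doctype)).items():
            edges.append((doctype, fieldname, kind, target))
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return edges
```

A depth limit matters in practice: nearly every doctype links to Company or User eventually, so an unbounded walk ends up pulling in most of the site.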

What We Learned About Agent Architecture in General

Building this has changed how we think about agent design more broadly.

The system of record matters more than the model. You can swap Claude for GPT or whatever comes next, and the agent's behaviour changes modestly. You can swap ERPNext for SAP, and the agent's possibility space collapses entirely. Where you connect matters more than what you think with.

Skills beat prompts. Dumping API docs into a system prompt gets you a mediocre agent. Packaging the knowledge as a skill — with working code, concrete examples, and a triggering description — gets you an agent that reliably does the right thing. The skill pattern is underrated for domain-specific agents.

Scoping via platform permissions is free blast-radius protection. Instead of building a permission layer in your agent, inherit one from the ERP. Give the agent's user exactly the roles it needs and no more. When the agent tries to do something it shouldn't, the platform stops it cold, and the error message tells you what happened. This is much better than trying to sandbox the LLM's output yourself.

Tool-call observability is not optional. Every REST call the agent makes is logged with its arguments, duration, and response. You need this for debugging, for audit, and — frankly — for trust. Users will not let an agent touch their accounting without being able to see exactly what it did.
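A minimal sketch of that kind of logging, using the standard library's sqlite3: every call is recorded with its arguments, duration, outcome, and response, whether it succeeds or raises. The table layout and function names are assumptions for illustration, not the schema of the demo app.

```python
import json
import sqlite3
import time


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the tool-call log. The schema is illustrative."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS tool_calls (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               tool TEXT NOT NULL,
               arguments TEXT NOT NULL,
               duration_ms REAL NOT NULL,
               ok INTEGER NOT NULL,
               response TEXT NOT NULL
           )"""
    )
    return conn


def logged_call(conn: sqlite3.Connection, tool: str, func, **kwargs):
    """Run func(**kwargs) and record the call even if it raises."""
    started = time.perf_counter()
    ok, result = True, None
    try:
        result = func(**kwargs)
        return result
    except Exception as exc:
        ok, result = False, repr(exc)
        raise  # the caller still sees the failure; the log just remembers it
    finally:
        duration_ms = (time.perf_counter() - started) * 1000
        conn.execute(
            "INSERT INTO tool_calls (tool, arguments, duration_ms, ok, response) "
            "VALUES (?, ?, ?, ?, ?)",
            (tool, json.dumps(kwargs, sort_keys=True), duration_ms, int(ok),
             json.dumps(result, default=str)),
        )
        conn.commit()
```

Writing the row in a finally block is the design choice that matters: failed calls are exactly the ones an auditor will ask about, so they must not be the ones missing from the log.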

Where This Goes

The per-domain demos are the starting point, not the endpoint. What we're actually building toward is a Chief-of-Staff-style orchestrator that sits above the domain agents: a Sales question routes to the Sales agent, a stockout alert wakes up the Inventory agent, a month-end close is choreographed through the Accounting agent. Each domain agent stays narrow and composable. The orchestrator does the coordination.

That's the AI-native ERP picture. Not a copilot sidebar. Not a chatbot that reads from your database. An actual operating layer where agents do the coordination work that humans used to do — and the ERP is the substrate they act on, not a destination they query.

ERPNext is the right place to build it. The REST surface is honest, the data model is open, and the permission system gives you guardrails without getting in the way. If you're thinking about AI-native ERP and you haven't looked at the Frappe stack yet, it's worth the afternoon.

If you want to see the demo — or want to talk about building agents on top of your own ERP — get in touch.