LLM Integration

The engine doubles as a pre-filter in front of an LLM. Deterministic rules cover the known cases for free; @Default calls the model only for inputs that fall outside every rule - the ones nobody anticipated.

Architecture

  Incoming fact
        │
        ▼
┌─────────────────┐    match    ┌──────────────┐
│ KnowledgeEngine │ ──────────▶ │ Return result│  ← microseconds, $0
│                 │             └──────────────┘
│                 │   no match
│                 │ ──────────▶ ┌──────────────┐
└─────────────────┘             │     LLM      │  ← ~800ms, API cost
                                └──────────────┘
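The flow in the diagram can be sketched in plain Python, independent of the library. This is only an illustration of the pre-filter pattern, not how the engine is implemented; `RuleEntry`, `match_rules`, `route`, and `fake_llm` are hypothetical names, and `fake_llm` stands in for a real model call.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a rule: a predicate plus the result it yields.
RuleEntry = tuple[Callable[[str], bool], str]

RULES: list[RuleEntry] = [
    (lambda text: "invoice" in text.lower(), "billing"),
    (lambda text: "password" in text.lower(), "auth"),
]

# Records every input that reached the (fake) model.
llm_calls: list[str] = []


def match_rules(text: str) -> Optional[str]:
    """Return the result of the first matching rule, or None on a miss."""
    for predicate, result in RULES:
        if predicate(text):
            return result
    return None


def fake_llm(text: str) -> str:
    """Placeholder for a real model call; logs the input and returns a fixed label."""
    llm_calls.append(text)
    return "general"


def route(text: str, llm: Callable[[str], str] = fake_llm) -> str:
    """Try the deterministic rules first; call the model only when none match."""
    matched = match_rules(text)
    if matched is not None:
        return matched
    return llm(text)
```

Calling `route("Where is my invoice?")` returns from the rule table without touching `fake_llm`, while an input that matches nothing is appended to `llm_calls` before the fallback answers.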

Example: support ticket router

from enum import Enum

from pydantic_ai import Agent
from airules import Default, Fact, KnowledgeEngine, Rule, StringField


class Team(Enum):
    BILLING = "billing"
    AUTH = "auth"
    SHIPPING = "shipping"
    RETURNS = "returns"
    GENERAL = "general"


class Ticket(Fact):
    subject: StringField
    body: StringField


_TEAM_VALUES = ", ".join(t.value for t in Team)

# output_type pairs with result.output below; older pydantic_ai releases
# used result_type / result.data instead.
_agent = Agent(
    "anthropic:claude-haiku-4-5",
    output_type=Team,
    system_prompt=(
        f"Classify a support ticket into exactly one of: {_TEAM_VALUES}. "
        "Reply with the team name only."
    ),
)


class TicketRouter(KnowledgeEngine[Ticket, Team]):

    # Deterministic rules: no API call when one of these matches.
    @Rule(Ticket.subject.contains("billing", case_insensitive=True)
          | Ticket.body.contains("invoice", case_insensitive=True))
    def billing(self, ticket: Ticket) -> Team:
        return Team.BILLING

    @Rule(Ticket.subject.contains("password", case_insensitive=True)
          | Ticket.subject.contains("login", case_insensitive=True))
    def auth(self, ticket: Ticket) -> Team:
        return Team.AUTH

    @Rule(Ticket.subject.contains("return", case_insensitive=True)
          | Ticket.subject.contains("refund", case_insensitive=True))
    def returns(self, ticket: Ticket) -> Team:
        return Team.RETURNS

    # Runs only when no rule above matched.
    @Default
    def llm_fallback(self, ticket: Ticket) -> Team:
        result = _agent.run_sync(f"Subject: {ticket.subject}\n\n{ticket.body}")
        return result.output

A ticket with "invoice" in the body or "password" in the subject never hits the API. A ticket about a broken screen reader on the checkout page matches no rule, so it falls through to @Default - and the model classifies it without you having written a rule for it.

Async variant

Use run_async when your fallback needs to await:

    # Inside TicketRouter: replace the synchronous fallback with a coroutine.
    @Default
    async def llm_fallback(self, ticket: Ticket) -> Team:
        result = await _agent.run(f"Subject: {ticket.subject}\n\n{ticket.body}")
        return result.output

# From async code:
team = await TicketRouter().run_async(ticket)

Synchronous @Rule methods don't need to change - the engine handles the mix.
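One way an engine could support such a mix is to await a handler's return value only when it is awaitable. The sketch below shows that idea with plain asyncio. It is not the library's actual implementation, and `call_handler`, `sync_rule`, and `async_fallback` are hypothetical names.

```python
import asyncio
import inspect
from typing import Any, Callable


async def call_handler(handler: Callable[[str], Any], fact: str) -> Any:
    """Call a handler and await its result only if it is awaitable."""
    result = handler(fact)
    if inspect.isawaitable(result):
        result = await result
    return result


def sync_rule(fact: str) -> str:
    """A synchronous handler: returns its value directly."""
    return "billing"


async def async_fallback(fact: str) -> str:
    """An asynchronous handler: the sleep stands in for an awaited model call."""
    await asyncio.sleep(0)
    return "general"


async def main() -> list[str]:
    # Both kinds of handler go through the same dispatch function.
    return [
        await call_handler(sync_rule, "invoice overdue"),
        await call_handler(async_fallback, "something unexpected"),
    ]


results = asyncio.run(main())
```

Because the dispatch checks the return value rather than the function type, synchronous handlers need no wrapper and no `async` keyword.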

The iterative improvement loop

Your @Default hits are a roadmap:

  1. Observe - track which inputs are hitting the LLM (see Observability)
  2. Analyze - cluster the default facts by field values and common patterns
  3. Add rules - write a new @Rule for each pattern you can enumerate
  4. Repeat - default rate drops, token spend drops, latency drops

In effect, the LLM trains the rules engine: every LLM classification you have confirmed as correct is signal for a new rule that covers it. Over time, the engine handles more of the traffic and the LLM handles less.
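The Analyze step can start very simply. The sketch below assumes you have already logged each default hit as a (subject, team chosen by the LLM) pair. The log contents, `keyword_counts`, `candidate_keywords`, and the `min_share` threshold are all hypothetical and not part of the library. It lists the words that appear in most tickets the model sent to a given team, which are natural candidates for a new `contains` rule.

```python
import re
from collections import Counter

# Hypothetical log of default hits: (ticket subject, team chosen by the LLM).
default_hits = [
    ("Package arrived damaged", "shipping"),
    ("Where is my package?", "shipping"),
    ("Package stuck in transit", "shipping"),
    ("Question about your API", "general"),
]


def keyword_counts(hits: list[tuple[str, str]], team: str) -> Counter:
    """Count, per word, how many of the team's tickets contain that word."""
    counts: Counter = Counter()
    for subject, chosen in hits:
        if chosen == team:
            # A set counts each word at most once per ticket.
            counts.update(set(re.findall(r"[a-z]+", subject.lower())))
    return counts


def candidate_keywords(
    hits: list[tuple[str, str]], team: str, min_share: float = 0.8
) -> list[str]:
    """Return words found in at least min_share of the team's default hits."""
    total = sum(1 for _, chosen in hits if chosen == team)
    if total == 0:
        return []
    counts = keyword_counts(hits, team)
    return sorted(word for word, n in counts.items() if n / total >= min_share)


shipping_candidates = candidate_keywords(default_hits, "shipping")
```

With this sample log, "package" is the only word shared by every shipping ticket, so it is the one candidate returned. A human should still review each candidate before it becomes a rule, since a frequent word is not necessarily a reliable one.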

When not to use a rules engine

  • Highly fuzzy matching where no static condition can be stated - if you genuinely can't articulate a rule, the engine can't express it
  • Simple 2–3 branch logic that will never grow - a plain Python if is the right tool
  • Pure ML pipelines where classification depends on learned embeddings or similarity scores
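For the second case, a plain function is usually enough. A minimal sketch, with a hypothetical `route_simple` helper and made-up keywords:

```python
def route_simple(subject: str) -> str:
    """Two fixed branches and a default: no engine required."""
    lowered = subject.lower()
    if "invoice" in lowered:
        return "billing"
    if "password" in lowered:
        return "auth"
    return "general"
```

If the list of branches starts growing or needs per-branch observability, that is the point at which moving the conditions into rules may start to pay off.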