The market for AI in customer service is exploding, but behind the impressive growth, projected to climb toward $47.82 billion by 2030, lies a complex engineering challenge. Enterprises are no longer impressed by demos; they need secure, reliable automation that resolves issues rather than merely escalating them. To understand how this is done at scale, we spoke with Dominic Jainy, a senior member of the IEEE who has spent years architecting the bridge between human knowledge and executable AI. He walks us through the unglamorous but critical work of turning static support procedures into autonomous agents, focusing on why security permissions often matter more than an agent’s intelligence, how to shape years of tribal knowledge into a reliable asset, and what it truly takes to prove an agent behaves as expected in a high-stakes production environment.
Many support teams struggle to move beyond static playbooks. When transforming a human-readable SOP into an executable AI workflow, what are the first few technical steps you take, and what common “brittle” points tend to break during this conversion?
The very first thing we do is shift our mindset from documentation to execution. An SOP sitting in a PDF is only a suggestion, but an SOP that runs is a resolution. To make that happen, the initial technical step isn’t to write a one-off script, but to build a standardized contribution platform. This establishes a single, repeatable way to translate human steps into machine logic, incorporating a structured SOP format, a central registry for approved tools, and a consistent execution framework. This prevents every new use case from becoming a bespoke, time-consuming engineering project. The most common breaking point is ambiguity. In a human-readable document, a vague step is glossed over. In code, that same step can become a catastrophic infinite loop in production. You feel that pressure when a single missed detail in one of our 350+ automated SOPs could derail our goal of saving 500,000 hours of manual work.
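To give readers a concrete picture of what "removing ambiguity" means, here is a minimal sketch of a structured, executable SOP step. The schema, field names, and the retry cap are our own illustration of the idea, not the platform's actual format:

```python
from dataclasses import dataclass, field

@dataclass
class SOPStep:
    """One executable step of an SOP: no vague instructions allowed."""
    name: str
    tool: str                      # must match an entry in the approved tool registry
    inputs: dict = field(default_factory=dict)
    success_check: str = ""        # explicit, machine-checkable completion condition
    max_attempts: int = 3          # hard cap so an ambiguous step cannot loop forever
    on_failure: str = "escalate"   # default to a human, never to silent retries

# A step a human would treat as "obvious" must be spelled out for the agent.
restart_service = SOPStep(
    name="restart-billing-worker",
    tool="service_restart",
    inputs={"service": "billing-worker", "region": "us-east-1"},
    success_check="health endpoint returns 200 within 120 seconds",
)
```

The point of the explicit success check and attempt cap is exactly the brittle spot Jainy describes: a step a human would gloss over becomes, in code, either a checkable condition or an infinite loop.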
You’ve emphasized that an agent’s permissions are more critical than its intelligence. Could you walk us through how you design a permission boundary for a customer environment and how that “security-first” approach prevents a helpful agent from becoming a liability?
It’s easy to get romantic about an AI’s ability to reason, but in a live enterprise environment, the first and most important question is always, “What is this agent allowed to touch?” If you can’t answer that with absolute precision, you don’t have automation; you have a significant risk. We designed our entire system with a security-first posture, treating access rights as a core part of the workflow itself, not a constraint to be bolted on later. For instance, we architected a “Forward Access Session token” approach, which grants the agent tightly scoped, temporary permissions to perform a specific task and nothing more. The default behavior is always inaction. If the permission boundary is unclear for any reason, the agent is designed to stop and do nothing. This approach required comprehensive application security reviews, because in a support setting, an unauthorized “fix” is far worse than no fix at all.
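The "default to inaction" principle is easier to see in code. The sketch below is a simplified rendering of a scoped, expiring access grant with a default-deny check; the class and function names are illustrative assumptions, not the actual Forward Access Session implementation:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

@dataclass(frozen=True)
class ScopedToken:
    """A short-lived grant limited to one action on one resource."""
    allowed_action: str
    allowed_resource: str
    expires_at: datetime

def is_permitted(token: Optional[ScopedToken], action: str, resource: str) -> bool:
    # Default behavior is inaction: no token, an expired token, or any
    # mismatch means the agent stops and does nothing.
    if token is None:
        return False
    if datetime.now(timezone.utc) >= token.expires_at:
        return False
    return action == token.allowed_action and resource == token.allowed_resource

token = ScopedToken(
    allowed_action="restart",
    allowed_resource="billing-worker",
    expires_at=datetime.now(timezone.utc) + timedelta(minutes=15),
)
assert is_permitted(token, "restart", "billing-worker")
assert not is_permitted(token, "delete", "billing-worker")  # out of scope: do nothing
```

Everything outside the narrow grant falls through to "do nothing," which is the behavior Jainy describes as the safe default.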
Enterprises often have years of tribal knowledge documented in wikis and notes. Instead of just collecting this data, how does your contribution platform actively shape this raw information, and what standards are essential for making it reliably usable by an automated agent?
This is a critical point. Simply throwing years of accumulated tribal knowledge—all those edge cases and half-documented steps—into a large model doesn’t create reliability. It just creates a more confident-sounding version of the same mess. Our platform was designed as a contribution mechanism to actively shape this knowledge from the moment it enters the system. It’s not a passive repository. We enforce a standardized SOP structure, a central tool registry, and vector storage so that information arrives in a clean, executable format. An automated workflow has zero tolerance for the kind of ambiguity a human can easily navigate on a wiki page. This structured, disciplined approach is what makes it possible to automate 117,000 support engineering cases and save 368,000 hours of work. You’re not just collecting knowledge; you’re refining it into a functional asset.
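As a rough sketch of what shaping knowledge at the point of contribution can look like (the registry contents and error messages here are ours, not the platform's), a submitted SOP might be rejected before it is ever embedded into vector storage:

```python
# Hypothetical central registry of approved, reviewed tools.
TOOL_REGISTRY = {"service_restart", "ticket_update", "log_query"}

def validate_contribution(steps: list[dict]) -> list[str]:
    """Reject ambiguity at the door: every step needs an approved tool
    and a machine-checkable success condition before it is stored."""
    errors = []
    for i, step in enumerate(steps, start=1):
        if step.get("tool") not in TOOL_REGISTRY:
            errors.append(f"step {i}: tool {step.get('tool')!r} is not in the approved registry")
        if not step.get("success_check"):
            errors.append(f"step {i}: missing an explicit success check")
    return errors

draft = [
    {"tool": "service_restart", "success_check": "health endpoint returns 200"},
    {"tool": "ssh_into_box"},  # tribal-knowledge shortcut, not an approved tool
]
print(validate_contribution(draft))
# ["step 2: tool 'ssh_into_box' is not in the approved registry",
#  "step 2: missing an explicit success check"]
```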
An AI agent can be helpful but unsafe, or safe but useless. Beyond simple accuracy, what key metrics do you use to evaluate an agent’s true performance, and what does your continuous validation process look like to prevent regressions after launch?
You’ve hit on the central tension of this work. Agent quality isn’t one number; it’s a balance. A helpful agent that isn’t safe is a liability, and a safe one that’s useless provides no value. To manage this, we build evaluation directly into the delivery process, which is the core discipline of MLOps. The platform was designed with modular separation between authoring, storage, orchestration, tools, and execution. This allows us to validate each component in isolation before integrating it. Our continuous validation process involves constant monitoring to ensure that new changes don’t introduce regressions that could silently increase case escalations. We learned early on that shipping an agent isn’t a single launch day event. It is a continuous promise that the workflow will remain correct and safe next week, next month, and next year.
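A simplified version of that kind of regression gate, with metric names and thresholds chosen purely for illustration rather than taken from the team's actual pipeline, might look like this:

```python
def release_gate(candidate: dict, baseline: dict,
                 max_escalation_increase: float = 0.0,
                 max_unsafe_rate: float = 0.0) -> bool:
    """Block a new agent version if it is less safe, less useful, or
    escalates more than the version already in production."""
    if candidate["unsafe_action_rate"] > max_unsafe_rate:
        return False  # helpful but unsafe: reject
    if candidate["resolution_rate"] < baseline["resolution_rate"]:
        return False  # safe but less useful: reject
    if candidate["escalation_rate"] > baseline["escalation_rate"] + max_escalation_increase:
        return False  # silent regression in escalations: reject
    return True

baseline  = {"resolution_rate": 0.82, "escalation_rate": 0.11, "unsafe_action_rate": 0.0}
candidate = {"resolution_rate": 0.85, "escalation_rate": 0.15, "unsafe_action_rate": 0.0}
print(release_gate(candidate, baseline))  # False: escalations crept up despite better resolution
```

The gate captures the tension Jainy describes: a version can improve on one axis and still be blocked because it quietly regressed on another.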
Looking ahead, the ability to prove why an agent took a specific action will be crucial. What kind of “evidence” or audit trail is most important to provide, and how does that transparency build the trust needed to scale automation significantly?
The future of enterprise AI belongs to teams that can prove their agents behave responsibly. The most important evidence isn’t just a log of what the agent did, but a clear, step-by-step trail explaining why it took an action and, crucially, confirming it was authorized to do so. This audit trail must link every action back to a specific, approved tool and a validated permission boundary. This level of transparency is non-negotiable for building trust. Without it, every mistake erodes confidence and slows down the adoption of broader automation. When you can programmatically defend an agent’s behavior in the moments that matter, you build the credibility needed to scale toward a goal like driving a $1.2 billion impact on the support bottom line. In a market crowded with demos, credibility will come from systems that can defend their actions under scrutiny.
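One way to picture that evidence (the field names below are our own, for illustration only) is an audit record that ties every action back to the tool that performed it, the reasoning behind it, and the permission boundary it ran under:

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class AuditRecord:
    """Evidence for a single agent action: not just what it did,
    but why it did it and what authorized it."""
    case_id: str
    sop_id: str
    step_name: str
    tool: str                 # must reference an approved registry entry
    reasoning: str            # why the agent chose this step
    permission_scope: str     # the validated boundary the action ran under
    token_expires_at: str
    outcome: str
    timestamp: str

record = AuditRecord(
    case_id="CASE-48213",
    sop_id="SOP-0091",
    step_name="restart-billing-worker",
    tool="service_restart",
    reasoning="health check failing; SOP-0091 step 3 matched the error signature",
    permission_scope="restart:billing-worker",
    token_expires_at="2025-06-01T12:15:00Z",
    outcome="resolved",
    timestamp=datetime.now(timezone.utc).isoformat(),
)
print(json.dumps(asdict(record), indent=2))
```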
What is your forecast for the future of enterprise support automation?
The future isn’t a world where humans are absent from support. It’s a world where humans stop doing the repetitive, automatable tasks that should never have required their expertise in the first place. The most valuable skill will be the ability to draw clean, intelligent boundaries: defining precisely what can and should be automated, what must be escalated for human judgment, and what evidence is required before either decision is made. As agentic AI systems grow more powerful, the differentiator won’t be capability alone. It will be the demonstrable proof of safe, reliable, and authorized action. The teams that win will be those who can build systems that don’t just act, but act without becoming reckless, and can prove it every single time.
