How Do You Turn SOPs Into Secure AI Agents?

The market for AI in customer service is exploding, but behind the impressive growth—projected to climb toward $47.82 billion by 2030—lies a complex engineering challenge. Enterprises are no longer impressed by demos; they need secure, reliable automation that resolves issues, not just escalates them. To understand how this is being done at scale, we spoke with Dominic Jainy, a senior member of the IEEE who has spent years architecting the bridge between human knowledge and executable AI. He walks us through the unglamorous but critical work of turning static support procedures into autonomous agents, focusing on why security permissions often matter more than an agent’s intelligence, how to shape years of tribal knowledge into a reliable asset, and what it truly takes to prove an agent behaves as expected in a high-stakes production environment.

Many support teams struggle to move beyond static playbooks. When transforming a human-readable SOP into an executable AI workflow, what are the first few technical steps you take, and what common “brittle” points tend to break during this conversion?

The very first thing we do is shift our mindset from documentation to execution. An SOP sitting in a PDF is only a suggestion, but an SOP that runs is a resolution. To make that happen, the initial technical step isn’t to write a one-off script, but to build a standardized contribution platform. This establishes a single, repeatable way to translate human steps into machine logic, incorporating a structured SOP format, a central registry for approved tools, and a consistent execution framework. This prevents every new use case from becoming a bespoke, time-consuming engineering project. The most common breaking point is ambiguity. In a human-readable document, a vague step is glossed over. In code, that same step can become a catastrophic infinite loop in production. You feel that pressure when a single missed detail in one of our 350+ automated SOPs could derail our goal of saving 500,000 hours of manual work.
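The contribution platform described above — a structured SOP format checked against a central registry of approved tools — could be sketched roughly as follows. This is a minimal illustration, not the actual system; all names (`TOOL_REGISTRY`, `SOPStep`, the sample tools) are hypothetical. The point is that ambiguity is caught at contribution time, before an SOP ever runs.

```python
from dataclasses import dataclass, field

# Hypothetical central registry of approved, executable tools.
TOOL_REGISTRY = {"collect_logs", "restart_service", "notify_oncall"}

@dataclass
class SOPStep:
    description: str
    tool: str                       # must resolve to a registered tool
    inputs: dict = field(default_factory=dict)

@dataclass
class SOP:
    name: str
    steps: list

    def validate(self) -> list:
        """Reject ambiguity at contribution time, not in production."""
        errors = []
        for i, step in enumerate(self.steps):
            if step.tool not in TOOL_REGISTRY:
                errors.append(f"step {i}: unknown tool '{step.tool}'")
            if not step.description.strip():
                errors.append(f"step {i}: missing description")
        return errors

sop = SOP("disk-full-recovery", [
    SOPStep("Collect diagnostic logs", "collect_logs"),
    SOPStep("Restart the affected service", "restart_svc"),  # typo: not registered
])
print(sop.validate())  # flags the typo in step 1
```

A human reader would gloss over the `restart_svc` typo; an executable workflow cannot, so the validator blocks the contribution until the step maps to an approved tool.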

You’ve emphasized that an agent’s permissions are more critical than its intelligence. Could you walk us through how you design a permission boundary for a customer environment and how that “security-first” approach prevents a helpful agent from becoming a liability?

It’s easy to get romantic about an AI’s ability to reason, but in a live enterprise environment, the first and most important question is always, “What is this agent allowed to touch?” If you can’t answer that with absolute precision, you don’t have automation; you have a significant risk. We designed our entire system with a security-first posture, treating access rights as a core part of the workflow itself, not a constraint to be bolted on later. For instance, we architected a “Forward Access Session token” approach, which grants the agent tightly scoped, temporary permissions to perform a specific task and nothing more. The default behavior is always inaction. If the permission boundary is unclear for any reason, the agent is designed to stop and do nothing. This approach required comprehensive application security reviews, because in a support setting, an unauthorized “fix” is far worse than no fix at all.
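The default-to-inaction behavior described above could look something like this sketch. The class and method names are hypothetical stand-ins for the "Forward Access Session token" idea: a short-lived grant scoped to specific actions, where anything not explicitly permitted — or anything checked after expiry — results in a no-op rather than an attempted fix.

```python
import time

class ForwardAccessSession:
    """Hypothetical sketch of a tightly scoped, short-lived permission grant."""

    def __init__(self, allowed_actions, ttl_seconds=300):
        self.allowed_actions = frozenset(allowed_actions)
        self.expires_at = time.monotonic() + ttl_seconds

    def permits(self, action: str) -> bool:
        # Default is inaction: any action not explicitly granted,
        # or any expired session, is denied.
        return time.monotonic() < self.expires_at and action in self.allowed_actions

def execute(session: ForwardAccessSession, action: str, handler):
    if not session.permits(action):
        return "no-op: permission boundary unclear or expired"
    return handler()

session = ForwardAccessSession({"read_case", "post_comment"}, ttl_seconds=60)
print(execute(session, "read_case", lambda: "case loaded"))
print(execute(session, "close_case", lambda: "case closed"))  # denied: not in scope
```

The design choice worth noting is that `execute` never raises on a denied action — it returns a no-op, mirroring the principle that an unauthorized "fix" is worse than no fix at all.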

Enterprises often have years of tribal knowledge documented in wikis and notes. Instead of just collecting this data, how does your contribution platform actively shape this raw information, and what standards are essential for making it reliably usable by an automated agent?

This is a critical point. Simply throwing years of accumulated tribal knowledge—all those edge cases and half-documented steps—into a large model doesn’t create reliability. It just creates a more confident-sounding version of the same mess. Our platform was designed as a contribution mechanism to actively shape this knowledge from the moment it enters the system. It’s not a passive repository. We enforce a standardized SOP structure, a central tool registry, and vector storage so that information arrives in a clean, executable format. An automated workflow has zero tolerance for the kind of ambiguity a human can easily navigate on a wiki page. This structured, disciplined approach is what makes it possible to automate 117,000 support engineering cases and save 368,000 hours of work. You’re not just collecting knowledge; you’re refining it into a functional asset.
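Shaping knowledge "from the moment it enters the system" might be sketched as a contribution gate like the one below. The required fields are illustrative assumptions, not the platform's actual schema; the idea is that free-form wiki text is rejected or normalized on the way in, so ambiguity is pushed back to the author instead of into production.

```python
# Hypothetical minimum structure every contributed SOP must carry.
REQUIRED_FIELDS = {"title", "preconditions", "steps", "escalation_path"}

def ingest(raw_entry: dict) -> dict:
    """Contribution gate: shape knowledge on entry rather than
    storing free-form wiki text as-is."""
    missing = REQUIRED_FIELDS - raw_entry.keys()
    if missing:
        # Reject rather than guess: the author fills the gap, not the agent.
        raise ValueError(f"rejected: missing {sorted(missing)}")
    # Normalize steps so every entry stores the same executable shape.
    raw_entry["steps"] = [s.strip() for s in raw_entry["steps"] if s.strip()]
    return raw_entry

entry = ingest({
    "title": "Reset MFA",
    "preconditions": ["identity verified"],
    "steps": ["  Open admin console ", "Revoke active tokens"],
    "escalation_path": "tier-2",
})
print(entry["steps"])
```

A real platform would add more (vector indexing, tool-registry checks, review workflows), but even this thin gate turns "collected" knowledge into a uniformly structured asset.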

An AI agent can be helpful but unsafe, or safe but useless. Beyond simple accuracy, what key metrics do you use to evaluate an agent’s true performance, and what does your continuous validation process look like to prevent regressions after launch?

You’ve hit on the central tension of this work. Agent quality isn’t one number; it’s a balance. A helpful agent that isn’t safe is a liability, and a safe one that’s useless provides no value. To manage this, we build evaluation directly into the delivery process, which is the core discipline of MLOps. The platform was designed with modular separation between authoring, storage, orchestration, tools, and execution. This allows us to validate each component in isolation before integrating it. Our continuous validation process involves constant monitoring to ensure that new changes don’t introduce regressions that could silently increase case escalations. We learned early on that shipping an agent isn’t a single launch day event. It is a continuous promise that the workflow will remain correct and safe next week, next month, and next year.
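One common way to make that continuous promise concrete — a sketch, not the team's actual pipeline — is a release gate that replays a fixed suite of past cases and blocks any change whose resolution rate drops below the shipped baseline. The agent, cases, and baseline figure here are all hypothetical.

```python
def evaluate(agent, golden_cases) -> float:
    """Replay a fixed suite of resolved cases and score the agent."""
    resolved = sum(1 for case in golden_cases if agent(case) == case["expected"])
    return resolved / len(golden_cases)

def gate(agent, golden_cases, baseline: float) -> float:
    """Block release on regression instead of discovering it via escalations."""
    score = evaluate(agent, golden_cases)
    if score < baseline:
        raise RuntimeError(f"regression: {score:.0%} below baseline {baseline:.0%}")
    return score

# Toy agent and golden set, purely for illustration.
golden_cases = [
    {"query": "reset password", "expected": "sop:password-reset"},
    {"query": "disk full", "expected": "sop:disk-cleanup"},
]

def toy_agent(case):
    routing = {"reset password": "sop:password-reset",
               "disk full": "sop:disk-cleanup"}
    return routing.get(case["query"], "escalate")

print(gate(toy_agent, golden_cases, baseline=0.9))
```

Because the modular separation keeps authoring, orchestration, and execution apart, a gate like this can run against each component in isolation before any integrated deploy.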

Looking ahead, the ability to prove why an agent took a specific action will be crucial. What kind of “evidence” or audit trail is most important to provide, and how does that transparency build the trust needed to scale automation significantly?

The future of enterprise AI belongs to teams that can prove their agents behave responsibly. The most important evidence isn’t just a log of what the agent did, but a clear, step-by-step trail explaining why it took an action and, crucially, confirming it was authorized to do so. This audit trail must link every action back to a specific, approved tool and a validated permission boundary. This level of transparency is non-negotiable for building trust. Without it, every mistake erodes confidence and slows down the adoption of broader automation. When you can programmatically defend an agent’s behavior in the moments that matter, you build the credibility needed to scale toward a goal like driving a $1.2 billion impact on the support bottom line. In a market crowded with demos, credibility will come from systems that can defend their actions under scrutiny.
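An audit entry of that shape — every action linked to an approved tool, a permission-check result, and a stated rationale — might look like the following sketch. The field names are assumptions chosen for illustration, not a documented schema.

```python
import json
import time

def audit_record(case_id: str, tool: str, session_id: str,
                 permitted: bool, rationale: str) -> str:
    """Hypothetical append-only audit entry: ties each action back to the
    tool invoked, the permission boundary checked, and the agent's reason."""
    return json.dumps({
        "ts": time.time(),
        "case_id": case_id,
        "tool": tool,               # must be a registered, approved tool
        "session_id": session_id,   # links to the scoped permission grant
        "permitted": permitted,     # outcome of the boundary check
        "rationale": rationale,     # why the agent chose this action
    })

line = audit_record("CASE-1042", "restart_service", "fas-7f3a",
                    True, "service health check failed twice")
print(line)
```

With entries like this, defending an agent's behavior becomes a query over the log rather than a forensic reconstruction.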

What is your forecast for the future of enterprise support automation?

The future isn’t a world where humans are absent from support. It’s a world where humans stop doing the repetitive, automatable tasks that should never have required their expertise in the first place. The most valuable skill will be the ability to draw clean, intelligent boundaries: defining precisely what can and should be automated, what must be escalated for human judgment, and what evidence is required before either decision is made. As agentic AI systems grow more powerful, the differentiator won’t be capability alone. It will be the demonstrable proof of safe, reliable, and authorized action. The teams that win will be the ones who build systems that don’t just act, but act safely and with authorization, and can prove it every single time.
