How to Move Beyond the Portal to a True Developer Platform?

Dominic Jainy stands at the forefront of the modern cloud-native movement, possessing a deep technical mastery of artificial intelligence, machine learning, and blockchain architectures. With years of experience navigating the complexities of large-scale IT infrastructures, he has become a leading voice in the evolution of platform engineering. His perspective is shaped by the practical realities of moving beyond simple automation toward building cohesive, self-healing systems that empower developers without overwhelming them. In this discussion, we explore the structural shift from basic developer portals to sophisticated internal platforms, the critical role of control planes, and the emerging influence of AI agents in orchestrating enterprise-grade environments.

The conversation centers on the limitations of current portal solutions like Backstage and the necessity of a dedicated execution layer to manage live deployments. We delve into the concept of the “messy middle”—the fragile point-to-point integrations that often plague growing organizations—and how to replace them with robust abstractions. Dominic elaborates on the multi-plane architecture required for a state-of-the-art platform, the importance of separating developer-facing intent from infrastructure implementation, and how open-source reference implementations like OpenChoreo are paving the way for the next generation of cloud-native development.

Many organizations find that cataloging services in a portal does not solve the underlying problem of managing live deployments. Why is there such a significant gap between having a service catalog and actually running a functional development platform?

The reality is that a portal like Backstage is fundamentally a solution for the “discovery” problem, not the “execution” problem. When you first roll out a portal, it feels like a victory because you finally have a unified catalog, structured documentation, and golden-path templates that provide a front door to your ecosystem. However, once a developer finds a service in that catalog, they immediately start asking operational questions: “Is this actually running?” or “Where are my logs for the staging environment?” Backstage, which originated at Spotify and is now a CNCF project, was designed to organize information, and it assumes that a sophisticated execution layer already exists beneath it to handle deployments, environment management, and runtime policies. Without that layer, the portal remains a static map of a territory that is constantly changing, leading to a frustrating disconnect between the information developers see and the actual state of their workloads in Kubernetes.

As organizations try to bridge this gap, they often end up with what you describe as the “messy middle.” Could you explain how these point-to-point integrations become a maintenance burden and why they fail over time?

The “messy middle” occurs when you try to force-connect your portal directly to your CI/CD pipelines, GitOps repositories, and observability tools using custom, fragile wiring. It usually starts innocently enough: you might write a script to link a Backstage component to an Argo CD application or a Datadog dashboard. But as your organization scales, these point-to-point connections multiply rapidly, because every new tool may need its own link to each tool already in place. Every time you upgrade a tool or change a security policy, you risk breaking dozens of these brittle links, turning your platform team into a full-time maintenance crew rather than a group that builds new features. You would never design a high-availability production back-end with this many disorganized dependencies, so it is a mistake to accept this lack of architecture for the very system that powers your entire engineering organization. We see teams spending nearly 60% of their time just keeping these integrations alive instead of improving the developer experience, which is a clear signal that the system design is fundamentally flawed.

You’ve emphasized that a platform should be treated both as a product and as a complex system. How does adopting a product mindset change the way a platform team approaches these architectural challenges?

Treating the platform as a product means you start with the developer experience as your primary focus, identifying their specific pain points, such as cognitive load and context switching, and designing a system that addresses them coherently. From a systems engineering perspective, this requires rigorous adherence to principles like the separation of concerns and the definition of clear, extensible interfaces. You have to move away from “organically grown” pipelines stitched together with tribal knowledge and toward a framework where developers and platform engineers interact with well-defined abstractions rather than implementation details hidden in Helm charts. By decoupling the developer-facing abstractions from the underlying infrastructure, you allow both the application code and the infrastructure to evolve independently, ensuring that your platform remains agile enough to adopt new technologies without requiring a complete rewrite of your existing workflows.

One of the most critical elements you mentioned is the need for high-level abstractions to reduce cognitive load. How should these abstractions be structured to speak the developer’s language while still mapping effectively to Kubernetes?

The goal is to meet developers where they are, using a vocabulary that reflects their daily work—concepts like Projects, Components, and Endpoints—rather than forcing them to master the intricacies of Kubernetes primitives. In a well-designed platform, a “Project” acts as a cloud-native application boundary that the platform automatically translates into Kubernetes namespaces and network policies to ensure isolation. When a developer declares a “Dependency” on a specific resource, the platform should be smart enough to inject the necessary environment variables and configure egress and ingress policies automatically, ensuring the dependency graph you see in the portal reflects the actual permitted traffic flow. This approach removes the need for developers to manually manage hundreds of lines of YAML, allowing them to focus on code while the platform handles the complex translation into the 1,000-odd Kubernetes resources required to run a secure, scalable service.
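
To make that translation concrete, here is a minimal Python sketch of how a Project might compile into a namespace and per-component network policies. It is an illustration under stated assumptions, not OpenChoreo's actual implementation: the `Project` and `Component` classes, the `component` pod label, the `<NAME>_URL` variable naming, and the assumption that each component is exposed by a Service of the same name are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    port: int
    depends_on: list[str] = field(default_factory=list)


@dataclass
class Project:
    name: str
    components: list[Component]


def compile_project(project: Project) -> list[dict]:
    """Compile a Project into a Namespace plus one ingress NetworkPolicy per component."""
    manifests = [
        {"apiVersion": "v1", "kind": "Namespace", "metadata": {"name": project.name}}
    ]
    for comp in project.components:
        # Only components that declared a dependency on `comp` may call it.
        callers = [c.name for c in project.components if comp.name in c.depends_on]
        rules = []
        if callers:
            rules.append({
                "from": [{"podSelector": {"matchLabels": {"component": c}}} for c in callers],
                "ports": [{"port": comp.port}],
            })
        manifests.append({
            "apiVersion": "networking.k8s.io/v1",
            "kind": "NetworkPolicy",
            "metadata": {"name": f"{comp.name}-ingress", "namespace": project.name},
            "spec": {
                "podSelector": {"matchLabels": {"component": comp.name}},
                "policyTypes": ["Ingress"],
                "ingress": rules,  # an empty list denies all ingress to these pods
            },
        })
    return manifests


def dependency_env(project: Project, comp: Component) -> dict[str, str]:
    """Environment variables injected for each declared dependency (hypothetical naming)."""
    ports = {c.name: c.port for c in project.components}
    return {
        f"{dep.upper()}_URL": f"http://{dep}.{project.name}.svc.cluster.local:{ports[dep]}"
        for dep in comp.depends_on
    }


shop = Project("shop", [Component("web", 8080, depends_on=["orders"]), Component("orders", 9090)])
manifests = compile_project(shop)
web_env = dependency_env(shop, shop.components[0])
```

In this sketch the developer only states that `web` depends on `orders`; the allowed traffic and the injected connection details both derive from that single declaration, which is why the dependency graph and the permitted network flow cannot drift apart.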

While developer abstractions are vital, you’ve also pointed out that platform engineers need their own set of abstractions. What are the key concepts that platform teams should use to manage the infrastructure side of the equation?

Platform abstractions are the tools that allow us to define standards and enforce policies across the entire organization without writing custom configurations for every single microservice. We work with concepts like Namespaces to define ownership boundaries, Data Planes to represent our Kubernetes clusters, and Environments to define the specific runtime contexts like “dev,” “test,” or “prod.” We also utilize “Traits,” which are reusable capabilities—such as autoscaling, security policies, or observability configurations—that can be attached to any component type to compose complex behaviors without duplicating code. By defining a “Pipeline” as a platform primitive, we can encode our operational processes directly into the system, ensuring that every deployment follows the same rigorous path to production regardless of which team is pushing the code.
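
One way to picture Traits is as small functions that take a component's Deployment and emit the extra Kubernetes resources that capability needs. The Python sketch below is an assumption-laden illustration: the `Trait` type, the `autoscaling` and `disruption_budget` factories, and the `render` helper are hypothetical names, and the disruption budget is simply a second example capability chosen here.

```python
from typing import Callable

Manifest = dict
Trait = Callable[[Manifest], list[Manifest]]


def autoscaling(min_replicas: int, max_replicas: int, cpu_percent: int = 70) -> Trait:
    """Trait factory: the returned function emits an HPA targeting the Deployment."""
    def apply(deployment: Manifest) -> list[Manifest]:
        name = deployment["metadata"]["name"]
        return [{
            "apiVersion": "autoscaling/v2",
            "kind": "HorizontalPodAutoscaler",
            "metadata": {"name": f"{name}-hpa"},
            "spec": {
                "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": name},
                "minReplicas": min_replicas,
                "maxReplicas": max_replicas,
                "metrics": [{
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {"type": "Utilization", "averageUtilization": cpu_percent},
                    },
                }],
            },
        }]
    return apply


def disruption_budget(min_available: int) -> Trait:
    """Trait factory: the returned function emits a PodDisruptionBudget for the same pods."""
    def apply(deployment: Manifest) -> list[Manifest]:
        name = deployment["metadata"]["name"]
        return [{
            "apiVersion": "policy/v1",
            "kind": "PodDisruptionBudget",
            "metadata": {"name": f"{name}-pdb"},
            "spec": {
                "minAvailable": min_available,
                "selector": {"matchLabels": deployment["spec"]["selector"]["matchLabels"]},
            },
        }]
    return apply


def render(deployment: Manifest, traits: list[Trait]) -> list[Manifest]:
    """Return the Deployment followed by every resource its attached traits produce."""
    resources = [deployment]
    for trait in traits:
        resources.extend(trait(deployment))
    return resources


deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "orders"},
    "spec": {"selector": {"matchLabels": {"component": "orders"}}},
}
resources = render(deployment, [autoscaling(2, 10), disruption_budget(1)])
kinds = [r["kind"] for r in resources]
```

Because each trait is written once and attached by reference, two teams that both want autoscaling get the same HorizontalPodAutoscaler shape without copying YAML between repositories.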

If the portal is the front door and the data plane is where workloads run, what is the specific role of the “Control Plane” in bridging these two layers?

The control plane is effectively the “brain” of the platform; it acts as a compiler that takes high-level developer intent and converts it into the low-level infrastructure configurations that Kubernetes understands. It doesn’t just deploy resources once and walk away; it continuously reconciles the state of the environment, monitoring for “drift” between the declared intent in the portal and the actual state of the running workloads. If a configuration changes or a pod fails to meet the defined policy, the control plane is responsible for correcting that divergence and ensuring the system stays in its desired state. Furthermore, it aggregates runtime data from the data plane and maps it back to our high-level abstractions, ensuring that when a developer looks at a “Component” page, they see a unified, accurate story of its health, replicas, and recent deployments.
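
The reconciliation idea can be sketched in a few lines of Python. This is a deliberately simplified model, assuming desired and actual state are plain dictionaries keyed by resource name; a real control plane would watch the Kubernetes API rather than compare in-memory dicts, and the `plan` and `reconcile` names are hypothetical.

```python
def plan(desired: dict, actual: dict) -> list[tuple[str, str]]:
    """Compare declared intent with observed state and list the corrective actions."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))  # drift: the resource exists but differs
    for name in actual:
        if name not in desired:
            actions.append(("delete", name))  # nothing declares this resource anymore
    return actions


def reconcile(desired: dict, actual: dict) -> dict:
    """Run one reconciliation pass, mutating and returning the actual state."""
    for action, name in plan(desired, actual):
        if action == "delete":
            del actual[name]
        else:
            actual[name] = dict(desired[name])
    return actual


desired = {"api": {"replicas": 3}, "worker": {"replicas": 1}}
actual = {"api": {"replicas": 2}, "legacy": {"replicas": 1}}

first_plan = plan(desired, actual)
reconcile(desired, actual)
second_plan = plan(desired, actual)
```

Running the pass repeatedly is safe: once the actual state matches the declared intent, the plan is empty and nothing changes, which is the property that lets a control plane loop continuously instead of deploying once and walking away.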

You mentioned that programmability is a non-negotiable feature for a modern control plane. How do you balance the need for extensibility with the requirement for strict organizational guardrails?

Programmability is what allows a platform to evolve alongside the business, but it must be implemented as “constrained flexibility” to prevent the system from descending back into a messy collection of one-off scripts. You achieve this by allowing customization of how abstractions compile into Kubernetes manifests, while simultaneously preserving core invariants—such as security requirements and resource limits—that must be applied to every workload. This means platform engineers can create new component types or traits to support emerging technologies, but the control plane ensures these new additions still adhere to the organization’s fundamental safety and compliance standards. It is a delicate balance, but by building these guardrails into the compilation logic itself, you provide developers with the freedom to innovate within a protected and governed environment.
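
A rough Python sketch of “constrained flexibility” follows. The assumption is that platform engineers supply a `customize` hook that may reshape the generated Deployment, while an `enforce_invariants` step always runs last. The function names, the default limits, and the choice of `runAsNonRoot` as the example invariant are all hypothetical.

```python
import copy
from typing import Callable, Optional

DEFAULT_LIMITS = {"cpu": "500m", "memory": "512Mi"}  # hypothetical organization defaults


def enforce_invariants(deployment: dict) -> dict:
    """Apply the rules no customization may bypass; returns a modified copy."""
    out = copy.deepcopy(deployment)
    pod_spec = out["spec"]["template"]["spec"]
    # Invariant 1: containers never run as root, whatever the customization said.
    pod_spec.setdefault("securityContext", {})["runAsNonRoot"] = True
    # Invariant 2: every container has resource limits; defaults fill in when none are declared.
    for container in pod_spec["containers"]:
        container.setdefault("resources", {}).setdefault("limits", dict(DEFAULT_LIMITS))
    return out


def compile_component(
    name: str, image: str, customize: Optional[Callable[[dict], dict]] = None
) -> dict:
    """Compile a component into a Deployment, then enforce invariants as the final step."""
    deployment = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "selector": {"matchLabels": {"component": name}},
            "template": {
                "metadata": {"labels": {"component": name}},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }
    if customize is not None:
        deployment = customize(deployment)
    return enforce_invariants(deployment)


def risky_customization(deployment: dict) -> dict:
    """A customization that adds an env var but also tries to relax security."""
    pod_spec = deployment["spec"]["template"]["spec"]
    pod_spec["securityContext"] = {"runAsNonRoot": False}
    pod_spec["containers"][0]["env"] = [{"name": "FEATURE_X", "value": "on"}]
    return deployment


result = compile_component("orders", "registry.example.com/orders:1.0", risky_customization)
pod = result["spec"]["template"]["spec"]
```

The legitimate part of the customization (the extra environment variable) survives, while the attempt to disable `runAsNonRoot` is overwritten, because the guardrail lives in the compilation path rather than in a review checklist.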

How does a robust control plane transform the way developers interact with observability data, and why is this “connected story” so much more effective than traditional dashboards?

In most organizations, developers are forced to become “tooling detectives,” jumping between Kubernetes dashboards, Grafana, and Jaeger to piece together why a service is failing. Because a control plane understands the relationship between the high-level abstraction and the low-level runtime resources, it can aggregate all that telemetry and present it directly within the context of the component the developer is working on. When they open Backstage, they aren’t just seeing a list of pods; they are seeing logs, metrics, and traces that are automatically scoped to that specific component in that specific environment, with clear links to recent deployments and dependency health. This bi-directional flow of information—intent flowing down to become workloads, and state flowing up to become insights—eliminates context switching and allows teams to resolve issues in minutes rather than hours.
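
As a small illustration of that upward flow, the Python sketch below scopes raw log records to one component in one environment. It assumes the control plane has already labeled each record with its component and environment; the `LogRecord` shape, the `component_view` function, and the sample records are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    component: str
    environment: str
    pod: str
    message: str


def component_view(records: list[LogRecord], component: str, environment: str) -> dict:
    """Build the per-component view a portal page could render from raw telemetry."""
    scoped = [
        r for r in records
        if r.component == component and r.environment == environment
    ]
    return {
        "component": component,
        "environment": environment,
        "pods": sorted({r.pod for r in scoped}),
        "logs": [r.message for r in scoped],  # original order is preserved
    }


records = [
    LogRecord("orders", "staging", "orders-7f9c-abc", "connection refused: payments"),
    LogRecord("orders", "prod", "orders-5d2a-xyz", "request served in 12ms"),
    LogRecord("web", "staging", "web-6b1e-def", "GET /cart 200"),
    LogRecord("orders", "staging", "orders-7f9c-qrs", "retrying payments call"),
]
view = component_view(records, "orders", "staging")
```

The developer asks about “orders in staging” and never needs to know which pod names back it; the mapping from abstraction to runtime resources is what makes that query possible.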

With the rise of artificial intelligence, there is a lot of talk about “AI-powered platforms.” Where do you see AI agents fitting into this multi-plane architecture, and how do they leverage the abstractions we’ve discussed?

AI fits into the platform in two distinct ways: as “first-class users” and as “embedded capabilities.” Because we have built a platform with well-defined abstractions and a centralized control plane, AI agents can interact with the system through interfaces like Model Context Protocol (MCP) servers or clear APIs to perform tasks like triggering builds or reasoning about complex dependency chains. We can also deploy specialized agents, like SRE agents that correlate logs and traces to find root causes, or FinOps agents that optimize resource costs across different environments. These agents are significantly more effective because they don’t have to navigate a “messy middle” of fragmented data; they have access to the same unified, connected story that the developers see, allowing them to act as force multipliers for the entire engineering organization.

OpenChoreo was recently accepted as a CNCF sandbox project. How does it serve as a reference implementation for the architecture we have been discussing?

OpenChoreo is a perfect example of this philosophy in action because it explicitly separates concerns across five distinct planes: the Experience, Control, Data, Observability, and Workflow planes. It demonstrates how high-level abstractions can be compiled into Kubernetes resources while maintaining a programmable control plane that reconciles state and enforces project-level isolation through network policies. Whether an organization adopts OpenChoreo directly or simply uses its Backstage plugins and control plane architecture as a guide, it provides a blueprint for moving away from brittle, script-based deployments. It also treats AI as a first-class participant, with built-in SRE agents that use Large Language Models (LLMs) to analyze telemetry and provide actionable insights, proving that a well-structured platform is the essential foundation for the future of AI-driven operations.

What is your forecast for the future of internal developer platforms over the next three to five years?

I believe we are entering an era where the distinction between “writing code” and “managing infrastructure” will almost entirely vanish for the average developer, as platforms become intelligent enough to handle the entire lifecycle of a service autonomously. We will move away from the “front door” portal model toward “invisible platforms” where the control plane doesn’t just respond to manual commands but proactively manages drift, optimizes costs, and repairs security vulnerabilities before a human even realizes there is an issue. AI agents will transition from being simple helpers to being the primary operators of our systems, but this will only be possible for organizations that have done the hard work of building a clean, abstracted architectural foundation. Those who continue to rely on the “messy middle” of point-to-point integrations will find themselves unable to keep pace with the speed and complexity of AI-driven development, while those with a robust multi-plane architecture will see their productivity and innovation reach levels that were previously unimaginable.
