How Is Google Using AI to Transform Code Migration Processes?

Code migration is a critical part of maintaining software applications: it helps improve performance, enhance resilience, keep systems up to date, and eliminate stale or irrelevant code. However, the process can be complex and time-consuming because the code is often distributed across many environments. Artificial intelligence (AI) already assists with various lower-level programming tasks, but it has struggled to handle code migration effectively.

Google reports progress on this challenge with a step-by-step process and a common toolkit in which large language models (LLMs) identify the files that need changes. According to Google, this process has cut the time needed for code migrations by 50%. In a recent experience report, a team from Google Core and Google Ads described their approach, noting that it has the potential to change how code is maintained in large enterprises. Here, we explore the steps Google has taken and highlight a few practical use cases.

1. Combining LLMs With AST-Based Techniques and Heuristics

Google’s primary objective was to identify where LLMs could add value and support scalability without relying on difficult-to-maintain abstract syntax tree (AST) transformations. ASTs have traditionally been used to represent the structure of a program or code snippet, and tools built on them are deterministic: they apply predefined rules and produce predefined outcomes. Code migrations often involve complex constructs that such rules struggle to capture. Google’s teams noted, however, that success with LLM-based migration is not straightforward either, because simple prompting alone is not sufficient for complex migrations. Instead, a combination of AST-based techniques, heuristics, and LLMs is needed to succeed and to roll changes out safely without costly regressions.

Google defined success as at least a 50% reduction in the time required for end-to-end work, including identifying migration locations, rewriting code, conducting reviews, and performing the final rollout. Engineers reported that this milestone was achieved, with 80% of code modifications being fully AI-authored. Anecdotal evidence from developers indicated that even when the changes were not perfect, there was significant value in having an initial version of the changelist already created. That first version often served as the starting point for further refinements and optimizations.

2. Migrating 32-Bit IDs to 64-Bit in Google Ads

Google Ads, one of the largest business units within Google, operates on a codebase of over 500 million lines of code. The system employs dozens of unique numeric ID types that refer to various resources, such as users, merchants, and campaigns. These IDs were usually defined as 32-bit integers in C++ and Java but had to be converted to 64-bit IDs to prevent ID value overflows. The report noted that moving from 32-bit to 64-bit was fraught with difficulties. Within Google’s ecosystem, IDs are sparsely defined and hard to locate, which makes them difficult to find with static tools. Compounding the challenge, the IDs appear in tens of thousands of code locations across Google Ads, which makes manual tracking impractical.
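To see why a 32-bit ID eventually fails, the following minimal sketch emulates fixed-width integer fields in Python with the standard `ctypes` module. The helper names and the ID value are made up for illustration and are not taken from Google's report.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def next_id_32(current: int) -> int:
    """Increment an ID stored in a signed 32-bit field (wraps on overflow)."""
    return ctypes.c_int32(current + 1).value


def next_id_64(current: int) -> int:
    """Increment an ID stored in a signed 64-bit field."""
    return ctypes.c_int64(current + 1).value


if __name__ == "__main__":
    # The 32-bit field wraps around to a negative number,
    # while the 64-bit field simply keeps counting upward.
    print(next_id_32(INT32_MAX))
    print(next_id_64(INT32_MAX))
```

The migration itself is harder than this toy suggests, because every variable, parameter, and field that carries the ID has to be widened together, or a value can still be truncated somewhere along the way.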

In this scenario, Google’s LLM-powered code migration process proved its value. First, an engineer identified the IDs to migrate, the superset of files involved, and the locations that needed changes. The LLM then generated the required changes, which went through a feedback loop of testing and iteration. The engineer reviewed the LLM-generated code as they would any other code, making changes and corrections as necessary. Once this step was complete, the changes were split up and sent for final review to the owners of each part of the code, so that the migration was carried out both efficiently and accurately.
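The workflow described above (generate a change, validate it, iterate, then split the result by owner) can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not Google's toolkit: `migrate_file`, `split_by_owner`, and the toy stand-ins are hypothetical names, the LLM call is replaced by a plain function, and validation is reduced to a string check where a real pipeline would build and run tests.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional


def migrate_file(
    source: str,
    propose_change: Callable[[str, str], str],
    validate: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Request a rewrite, validate it, and retry with the failure as feedback.

    propose_change(source, feedback) stands in for the LLM call (assumption).
    validate(candidate) stands in for builds and tests; it returns None on
    success or an error message on failure.
    """
    feedback = ""
    for _ in range(max_attempts):
        candidate = propose_change(source, feedback)
        error = validate(candidate)
        if error is None:
            return candidate
        feedback = error  # feed the failure into the next attempt
    return None  # no valid candidate; leave the file for a human


def split_by_owner(
    changes: Dict[str, str], owners: Dict[str, str]
) -> Dict[str, List[str]]:
    """Group changed file paths by owner so each owner reviews one batch."""
    batches: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(changes):
        batches[owners[path]].append(path)
    return dict(batches)


def toy_propose(source: str, feedback: str) -> str:
    """Toy stand-in for an LLM: the first attempt misses one occurrence."""
    if not feedback:
        return source.replace("int32_t id", "int64_t id", 1)
    return source.replace("int32_t", "int64_t")


def toy_validate(candidate: str) -> Optional[str]:
    """Toy stand-in for a test run: fail while any 32-bit type remains."""
    return "int32_t still present" if "int32_t" in candidate else None


if __name__ == "__main__":
    print(migrate_file("int32_t id; int32_t other;", toy_propose, toy_validate))
```

Keeping the validation step separate from the generation step is what lets an engineer treat the generated change like any other changelist: failures are surfaced early, and only validated candidates reach the owners for review.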

3. Updating JUnit3 Tests to JUnit4

Another pertinent example involves a large set of test files that still used the now-outdated JUnit3 library, an open-source unit testing framework for Java. Updating these files by hand was a considerable undertaking, yet leaving them in place added technical debt to the codebase. Technical debt tends to replicate itself, as developers might inadvertently copy outdated code when writing new code. To tackle this issue, Google’s developers used LLMs to update a critical mass of JUnit3 tests to the newer JUnit4 library. This automated update enabled the smooth migration of 5,359 files, modifying more than 149,000 lines of code over three months. The effort illustrates how Google moves to up-to-date technologies, which is crucial for maintaining the health and performance of its vast codebase.
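As an illustration of the kind of heuristic that can narrow down candidate files before an LLM is involved, the sketch below flags Java test sources that look like JUnit3. It relies on two well-known JUnit3 conventions, importing from `junit.framework` and extending `TestCase`. The function names and sample sources are hypothetical, and the report does not say that Google used this particular check.

```python
import re
from typing import Dict, List

# JUnit3 tests import from junit.framework and extend TestCase;
# JUnit4 tests use org.junit annotations such as @Test instead.
JUNIT3_IMPORT = re.compile(r"^\s*import\s+junit\.framework\.", re.MULTILINE)
EXTENDS_TESTCASE = re.compile(r"\bextends\s+(?:junit\.framework\.)?TestCase\b")


def looks_like_junit3(java_source: str) -> bool:
    """Return True if the source shows typical JUnit3 markers."""
    return bool(
        JUNIT3_IMPORT.search(java_source) or EXTENDS_TESTCASE.search(java_source)
    )


def find_junit3_files(sources: Dict[str, str]) -> List[str]:
    """Return the sorted paths whose contents look like JUnit3 tests."""
    return sorted(path for path, text in sources.items() if looks_like_junit3(text))


JUNIT3_SAMPLE = """
import junit.framework.TestCase;

public class CartTest extends TestCase {
    public void testTotal() {
        assertEquals(3, 1 + 2);
    }
}
"""

JUNIT4_SAMPLE = """
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class CartTest {
    @Test
    public void total() {
        assertEquals(3, 1 + 2);
    }
}
"""

if __name__ == "__main__":
    files = {"OldCartTest.java": JUNIT3_SAMPLE, "NewCartTest.java": JUNIT4_SAMPLE}
    print(find_junit3_files(files))
```

A text-based check like this is cheap but imprecise; it only produces a candidate list, and the actual rewrite and its validation still have to happen afterwards.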

4. Cleaning Up Stale Experimental Code

In another use case, Google faced the challenge of cleaning up experimental code that had become stale. Obsolete experimental code can lead to inefficiencies and maintenance headaches. Using AI, Google carried out the cleanup in several steps. First, it located the places in the code where flags or experiments were mentioned and removed the references to the flag. Next, it simplified any conditional expressions that depended on the flag and eliminated redundant or dead code. Finally, it updated existing tests and discarded those that had become unnecessary or obsolete. This approach kept the codebase clean, efficient, and scalable, and significantly reduced the time and resources required for manual cleanup.
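To make the flag-removal and conditional-simplification steps concrete, here is a small deterministic sketch that uses Python's standard `ast` module on Python source. It is only a toy illustration of what the cleanup produces: the class name, function name, and `USE_NEW_RANKING` flag are hypothetical, it handles only a bare `if FLAG:` or `if not FLAG:` test, and it is not the LLM-based process described in the report. It assumes Python 3.9 or later for `ast.unparse`.

```python
import ast


class FlagRemover(ast.NodeTransformer):
    """Inline a boolean flag whose final value is known."""

    def __init__(self, flag_name: str, flag_value: bool):
        self.flag_name = flag_name
        self.flag_value = flag_value

    def _known_value(self, test: ast.expr):
        """Return the truth value of `test`, or None if it is not just the flag."""
        if isinstance(test, ast.Name) and test.id == self.flag_name:
            return self.flag_value
        if (
            isinstance(test, ast.UnaryOp)
            and isinstance(test.op, ast.Not)
            and isinstance(test.operand, ast.Name)
            and test.operand.id == self.flag_name
        ):
            return not self.flag_value
        return None

    def visit_Assign(self, node: ast.Assign):
        # Remove the flag definition itself, e.g. `USE_NEW_RANKING = True`.
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == self.flag_name:
                return None
        return node

    def visit_If(self, node: ast.If):
        self.generic_visit(node)  # simplify nested conditionals first
        value = self._known_value(node.test)
        if value is None:
            return node  # condition does not depend solely on the flag
        kept = node.body if value else node.orelse
        # Keep the surviving branch; use `pass` if nothing survives.
        return kept or ast.Pass()


def remove_flag(source: str, flag_name: str, flag_value: bool) -> str:
    """Return `source` with the flag inlined and the dead branch removed."""
    tree = FlagRemover(flag_name, flag_value).visit(ast.parse(source))
    return ast.unparse(tree)


BEFORE = """
USE_NEW_RANKING = True

def rank(items):
    if USE_NEW_RANKING:
        return sorted(items, reverse=True)
    else:
        return list(items)
"""

if __name__ == "__main__":
    print(remove_flag(BEFORE, "USE_NEW_RANKING", True))
```

A rule like this works when the flag appears in a predictable shape. Real code mixes flags into compound conditions, helper functions, and tests, which is the kind of variety where the article's combination of AST-based techniques, heuristics, and LLMs becomes relevant.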

