How Is Google Using AI to Transform Code Migration Processes?

Code migration is critical to maintaining software applications: it improves performance, enhances resilience, keeps systems up to date, and eliminates stale or irrelevant code. However, the process can be exceedingly complex and time-consuming because the code is often distributed across many environments. Although artificial intelligence (AI) already assists with various lower-level programming tasks, it has struggled to handle code migration effectively.

Google has made strides in overcoming this challenge with a step-by-step process and a common toolkit in which large language models (LLMs) identify the files that need changes. According to Google, the process has cut the end-to-end time needed for code migrations by 50%. In a recent experience report, a team from Google Core and Google Ads described the approach and noted its potential to change how code is maintained in large enterprises. Below, we walk through the steps Google has taken and highlight a few practical use cases.

1. Locate Code Spots Where Flags (Experiments) Are Mentioned

Google’s primary objective was to identify where LLMs could add value and support scalability without relying solely on abstract syntax trees (ASTs), which are difficult to maintain. ASTs have traditionally been used to represent the structure of a program or code snippet, but AST-based tooling is deterministic: its outcomes are predefined, and migrations often involve complex constructs that ASTs struggle to represent. Google’s teams also noted that success with LLM-based migration is not straightforward, because simple prompting alone is not sufficient for complex migrations. Instead, a combination of AST-based techniques, heuristics, and LLMs is needed to succeed and to roll changes out safely without costly regressions.

Google defined success as at least a 50% reduction in the time required for the end-to-end work, including identifying migration locations, rewriting code, conducting reviews, and performing the final rollout. Engineers reported that this milestone was achieved, with 80% of code modifications being fully AI-authored. Anecdotal evidence from developers indicated that even when the changes were not perfect, having an initial version of the changelist already created was valuable, because it gave them a starting point for further refinement.

2. Remove Code Mentions of the Flag

Google Ads, one of the largest business units within Google, operates on a codebase of over 500 million lines of code. The system uses dozens of unique numeric ID types that refer to resources such as users, merchants, and campaigns. These IDs are usually defined as 32-bit integers in C++ and Java, but they had to be converted to 64-bit IDs to prevent ID values from overflowing. The report noted that the move from 32-bit to 64-bit was fraught with difficulties. Within Google’s ecosystem, IDs are sparsely defined and hard to locate, which makes them difficult to find with static tools. Compounding the challenge, Google Ads contains tens of thousands of affected code locations, making manual tracking impractical.
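To make the overflow risk concrete, here is a minimal sketch of what happens when an ID counter outgrows a signed 32-bit slot. It uses Python's standard `ctypes` module to emulate fixed-width integers; the helper names `next_id_32` and `next_id_64` are hypothetical and are not taken from Google's report.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def next_id_32(current: int) -> int:
    """Increment an ID stored in a signed 32-bit slot (wraps on overflow)."""
    return ctypes.c_int32(current + 1).value


def next_id_64(current: int) -> int:
    """Increment an ID stored in a signed 64-bit slot."""
    return ctypes.c_int64(current + 1).value


wrapped = next_id_32(INT32_MAX)  # wraps around to a negative number
widened = next_id_64(INT32_MAX)  # still a valid, positive ID

print(wrapped)  # -2147483648
print(widened)  # 2147483648
```

The 32-bit version silently wraps to a negative value, which is the kind of failure the widening to 64-bit IDs is meant to prevent.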

In this scenario, Google’s LLM-powered migration process proved decisive. First, an engineer identified the IDs to migrate, the superset of files involved, and the locations that needed changes. The LLM then generated the required changes, which went through a feedback loop of testing and iteration. The engineer reviewed the LLM-generated code as they would any other code, making changes and corrections as necessary. Once this step was complete, the changes were split up and sent for final review to the owners of each part of the codebase.
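The workflow described above (generate, test, iterate, then split for review) can be sketched roughly as follows. This is an illustrative outline under stated assumptions, not Google's toolkit: `migrate_file`, `split_by_owner`, `stub_model`, and `no_int32_left` are hypothetical names, and the stub stands in for a real LLM call.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple


def migrate_file(
    source: str,
    propose_change: Callable[[str, str], str],
    passes_checks: Callable[[str], Tuple[bool, str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask the model for a change, validate it, and retry with feedback."""
    feedback = ""
    for _ in range(max_attempts):
        candidate = propose_change(source, feedback)
        ok, feedback = passes_checks(candidate)
        if ok:
            return candidate
    return None  # give up and leave the file for a human


def split_by_owner(paths: List[str], owners: Dict[str, str]) -> Dict[str, List[str]]:
    """Group changed files so each owner reviews only their own code."""
    batches: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(paths):
        batches[owners[path]].append(path)
    return dict(batches)


def stub_model(source: str, feedback: str) -> str:
    # Stand-in for an LLM: it only "fixes" the code once it receives feedback.
    return source.replace("int32_t", "int64_t") if feedback else source


def no_int32_left(candidate: str) -> Tuple[bool, str]:
    # Stand-in for builds and tests: reject anything that still uses int32_t.
    if "int32_t" in candidate:
        return False, "int32_t still present"
    return True, ""


migrated = migrate_file("int32_t campaign_id;", stub_model, no_int32_left)
batches = split_by_owner(
    ["ads/b.cc", "core/c.cc", "ads/a.cc"],
    {"ads/a.cc": "ads-team", "ads/b.cc": "ads-team", "core/c.cc": "core-team"},
)
print(migrated)
print(batches)
```

In a real setup the validation step would run builds and tests rather than a string check, and an engineer would still review the result before it is sent to code owners.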

3. Streamline Any Conditional Statements That Rely on the Flag

Another example involves a large set of test files that still used the now-outdated JUnit3 library, an open-source unit-testing framework for Java. Updating these files manually posed a considerable challenge, yet leaving them untouched meant carrying technical debt, which tends to replicate itself as developers inadvertently copy outdated code into new code. To tackle this issue, Google’s developers used LLMs to update a critical mass of JUnit3 tests to the newer JUnit4 library. The automated update migrated 5,359 files and modified more than 149,000 lines of code over three months.
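For a sense of what such a migration changes in each file, the following is a deliberately simple, regex-based sketch of a JUnit3-to-JUnit4 rewrite. It is a heuristic illustration, not the LLM-driven process Google used: the function name `junit3_to_junit4` and the sample test class are hypothetical, and it ignores details such as `setUp`/`tearDown` methods, which become `@Before`/`@After` in JUnit4.

```python
import re

JUNIT3_SOURCE = """\
import junit.framework.TestCase;

public class IdParserTest extends TestCase {
  public void testParsesLargeId() {
    assertEquals(5000000000L, IdParser.parse("5000000000"));
  }
}
"""


def junit3_to_junit4(source: str) -> str:
    # Swap the JUnit3 base-class import for the JUnit4 assert and @Test imports.
    source = source.replace(
        "import junit.framework.TestCase;",
        "import static org.junit.Assert.*;\n\nimport org.junit.Test;",
    )
    # JUnit4 test classes no longer need to extend TestCase.
    source = re.sub(r"\s+extends\s+TestCase\b", "", source)
    # JUnit4 finds tests by annotation, so mark each public void testXxx() method.
    source = re.sub(
        r"^([ \t]*)(public\s+void\s+test\w*\s*\()",
        r"\1@Test\n\1\2",
        source,
        flags=re.MULTILINE,
    )
    return source


migrated = junit3_to_junit4(JUNIT3_SOURCE)
print(migrated)
```

A purely mechanical rewrite like this breaks down on anything outside the pattern it expects, which is where the report's combination of heuristics, AST-based techniques, and LLMs comes in.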

4. Eliminate Any Redundant Code

In another use case, Google faced the challenge of cleaning up experimental code that had become stale. Obsolete experimental code leads to inefficiencies and maintenance headaches. Using AI, Google cleaned up such code in several steps. First, they located the places in the code where flags (experiments) were mentioned and removed the references to the flag. Next, they simplified any conditional expressions that depended on the flag and eliminated the resulting redundant or dead code. Finally, they updated the affected tests and discarded tests that were no longer needed. This approach kept the codebase clean while significantly reducing the time and resources required for manual cleanup.
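The first steps (locating the flag, removing it, simplifying the conditionals, and dropping dead code) can be illustrated with a small AST-based sketch. It uses Python's standard `ast` module (Python 3.9 or later for `ast.unparse`) on Python source purely for illustration. The `flags.enabled("...")` call, the `FlagRemover` class, and `remove_flag` are assumptions made for this example and do not come from Google's report.

```python
import ast

SOURCE = """\
def render(page):
    if flags.enabled("new_layout"):
        return page.render_v2()
    else:
        return page.render_v1()
"""


class FlagRemover(ast.NodeTransformer):
    """Replace `if flags.enabled(<flag>)` with the branch that survives."""

    def __init__(self, flag_name: str, value: bool):
        self.flag_name = flag_name
        self.value = value  # the flag's final, launched value
        self.hits = 0       # how many flag checks were found and removed

    def _is_flag_check(self, node: ast.AST) -> bool:
        return (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "enabled"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "flags"
            and len(node.args) == 1
            and isinstance(node.args[0], ast.Constant)
            and node.args[0].value == self.flag_name
        )

    def visit_If(self, node: ast.If):
        self.generic_visit(node)  # clean up nested blocks first
        if self._is_flag_check(node.test):
            self.hits += 1
            kept = node.body if self.value else node.orelse
            return kept or [ast.Pass()]  # never leave an empty block behind
        return node


def remove_flag(source: str, flag_name: str, value: bool):
    remover = FlagRemover(flag_name, value)
    tree = remover.visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree), remover.hits


cleaned, hits = remove_flag(SOURCE, "new_layout", True)
print(cleaned)
```

With the flag fixed to `True`, only the `render_v2` branch remains and the `render_v1` path is dropped as dead code. Updating and pruning the related tests would be a separate step.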

5. Revise Tests and Discard Unnecessary Tests

Taken together, these use cases show how Google pairs LLMs with AST-based techniques and heuristics to make large migrations tractable. Google reports that the approach cut end-to-end migration time by 50%, and the team from Google Core and Google Ads suggests in its experience report that it could transform code maintenance in large enterprises.
