How Is Google Using AI to Transform Code Migration Processes?

Code migration is a critical process when it comes to maintaining software applications, as it helps improve performance, enhance resilience, keep systems up to date, and eliminate stale or irrelevant code. However, the process can be exceedingly complex and time-consuming because the code is often distributed across a multitude of environments. While artificial intelligence (AI) has already begun assisting with various lower-level programming tasks, it has struggled to handle the convoluted task of code migration effectively.

Google, however, reports progress on this challenge using a step-by-step process and a common toolkit in which large language models (LLMs) identify the files that need changes. According to Google, the process has cut the end-to-end time of code migrations by 50%. In a recent experience report, a team from Google Core and Google Ads described their approach and noted its potential to change how code is maintained in large enterprises. Here, we explore the steps Google has taken and highlight a few practical use cases.

1. Combine LLMs With AST-Based Techniques and Heuristics

Google’s primary objective was to identify where LLMs could add value and scale without relying on difficult-to-maintain transformations built on abstract syntax trees (ASTs). ASTs have traditionally been used to represent the structure of a program or code snippet, but AST-based tooling is deterministic: every outcome has to be defined in advance. Code migrations often involve complex constructs that are hard to capture this way. At the same time, Google’s teams noted that success with LLM-based migration is not straightforward, and that simple prompting alone is not sufficient for complex migrations. Instead, a combination of AST-based techniques, heuristics, and LLMs is needed to achieve good results and to roll changes out safely, avoiding costly regressions.
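One way to picture the combination described above is a deterministic pass that narrows the search to candidate locations, so that only those spots are handed to an LLM for rewriting. The following is a minimal sketch under stated assumptions, not Google's tooling: it works on Python source using the standard `ast` module, and the naming heuristic `ID_NAME` and the function `find_candidate_lines` are hypothetical names introduced for illustration.

```python
import ast
import re

# Heuristic (an assumption for this sketch): names such as `id` or `user_id`.
ID_NAME = re.compile(r"(^|_)id$")


def find_candidate_lines(source: str) -> list[int]:
    """Return sorted line numbers where an ID-like name is assigned."""
    lines = set()
    for node in ast.walk(ast.parse(source)):
        # Only look at names being written to, not names being read.
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            if ID_NAME.search(node.id):
                lines.add(node.lineno)
    return sorted(lines)


sample = """user_id = 41
total = user_id + 1
campaign_id = total
"""
candidates = find_candidate_lines(sample)
print(candidates)  # [1, 3]
```

The AST pass is cheap and predictable, while the judgment about how each candidate should change is left to a later step, which is where an LLM could come in.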

Success for Google was measured by achieving at least a 50% reduction in the time required for end-to-end work, including code rewrites, identifying migration locations, conducting reviews, and performing the final rollout. In the end, engineers reported that this milestone was indeed achieved, with 80% of code modifications being fully AI-authored. Anecdotal evidence from developers indicated that even if changes weren’t perfect, significant value was found in having an initial version of the changelist already created. This initial effort often paved the way for further refinements and optimizations.

2. Migrate 32-Bit IDs to 64-Bit in Google Ads

Google Ads, one of the largest business units within Google, operates on a codebase of over 500 million lines of code. The system uses dozens of numerical unique ID types that refer to resources such as users, merchants, and campaigns. These IDs were typically defined as 32-bit integers in C++ and Java and had to be converted to 64-bit integers to prevent ID values from overflowing. The report noted that the move from 32-bit to 64-bit was fraught with difficulties. Within Google’s ecosystem, the IDs are sparsely defined, which makes them hard to search for and identify with static tools. Compounding the challenge, the IDs are used in tens of thousands of code locations across Google Ads, which makes manual tracking impractical.
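To make the overflow risk concrete, the sketch below uses Python's `ctypes` fixed-width integer types to show what happens when an ID passes the largest signed 32-bit value. The helper names `as_int32` and `as_int64` are hypothetical. The wraparound shown mirrors Java's `int` arithmetic; in C++, signed integer overflow is undefined behavior, so the result there is not guaranteed.

```python
import ctypes

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def as_int32(value: int) -> int:
    """Reinterpret value as a signed 32-bit integer (ctypes does no overflow checking)."""
    return ctypes.c_int32(value).value


def as_int64(value: int) -> int:
    """Reinterpret value as a signed 64-bit integer."""
    return ctypes.c_int64(value).value


next_id = INT32_MAX + 1      # the first ID that no longer fits in 32 bits
wrapped = as_int32(next_id)  # wraps around to a negative number
widened = as_int64(next_id)  # fits comfortably in 64 bits

print(wrapped)  # -2147483648
print(widened)  # 2147483648
```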

In this scenario, Google’s LLM-powered migration process made the work tractable. First, an engineer identified the IDs to migrate, the superset of files involved, and the locations that needed changes. The LLM then generated the required changes, which went through a feedback loop of testing and iteration. The engineer reviewed the LLM-generated code as they would any other code, making changes and corrections as necessary. Once this step was complete, the changes were split up and sent for final review to the owners of each part of the code.
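The generate, test, and iterate loop described above can be sketched roughly as follows. This is an illustration under assumptions, not Google's implementation: `migrate_with_feedback`, `fake_propose`, and `no_int32_left` are hypothetical names, the "LLM" is a stand-in function, and the validation step is a simple string check where a real setup would build the code and run tests.

```python
from typing import Callable, Optional


def migrate_with_feedback(
    source: str,
    propose: Callable[[str, Optional[str]], str],
    validate: Callable[[str], Optional[str]],
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask `propose` for a rewrite, validate it, and retry with the error as feedback."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        candidate = propose(source, feedback)
        feedback = validate(candidate)  # None means the checks passed
        if feedback is None:
            return candidate
    return None  # nothing passed validation; left for an engineer to handle


# Stand-in for an LLM call: the first attempt misses one occurrence.
def fake_propose(source: str, feedback: Optional[str]) -> str:
    if feedback is None:
        return source.replace("int32 user_id", "int64 user_id", 1)
    return source.replace("int32", "int64")


# Stand-in for a build-and-test step.
def no_int32_left(candidate: str) -> Optional[str]:
    return "int32 still present" if "int32" in candidate else None


original = "int32 user_id; int32 campaign_id;"
result = migrate_with_feedback(original, fake_propose, no_int32_left)
print(result)  # int64 user_id; int64 campaign_id;
```

A candidate that passes the automated checks would still go to human review, as the process above describes.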

3. Update JUnit3 Tests to JUnit4

Another example involves a large set of test files that still used the now-outdated JUnit3 library, an open-source unit testing framework for Java. Updating these files manually posed a considerable challenge, and the outdated tests weighed on the codebase as technical debt. Technical debt tends to replicate itself, because developers may inadvertently copy outdated code when writing new code. To tackle this issue, Google’s developers used LLMs to update a critical mass of JUnit3 tests to the newer JUnit4 library. The automated update migrated 5,359 files and modified more than 149,000 lines of code over three months.

4. Clean Up Stale Experimental Flags

In another use case, Google faced the challenge of cleaning up experimental code that had become stale. Obsolete experimental code can lead to inefficiencies and maintenance headaches. Using AI, Google performed several crucial steps to clean up such code. Initially, they located areas in the code where flags or experiments were mentioned, subsequently removing any code references to the flag. Next, they simplified any conditional expressions that depended on the flag and eliminated any redundant or dead code. This meticulous cleanup also involved updating existing tests while discarding any unnecessary or obsolete tests. This comprehensive approach ensured that the codebase remained clean, efficient, and scalable, significantly reducing the time and resources required for manual cleanup.
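Two of the steps above, simplifying conditionals that depend on a flag and removing the resulting dead code, can be illustrated with a small deterministic rewrite. Note that this is a plain AST transformation on Python source, not the LLM-based approach the report describes; the class `FlagRemover` and the flag `ENABLE_NEW_DISCOUNT` are hypothetical, and the sketch only handles `if FLAG:` and `if not FLAG:` tests.

```python
import ast


class FlagRemover(ast.NodeTransformer):
    """Treat a boolean flag as a constant and drop the branch that can no longer run."""

    def __init__(self, flag_name: str, value: bool):
        self.flag_name = flag_name
        self.value = value

    def _is_flag(self, node: ast.AST) -> bool:
        return isinstance(node, ast.Name) and node.id == self.flag_name

    def visit_If(self, node: ast.If):
        self.generic_visit(node)  # clean up nested ifs first
        test = node.test
        if self._is_flag(test):
            kept = node.body if self.value else node.orelse
        elif (isinstance(test, ast.UnaryOp) and isinstance(test.op, ast.Not)
              and self._is_flag(test.operand)):
            kept = node.orelse if self.value else node.body
        else:
            return node  # condition does not depend solely on the flag
        # The surviving statements replace the whole `if`; use `pass` if none survive.
        return kept or ast.Pass()


source = """
def price(amount):
    if ENABLE_NEW_DISCOUNT:
        return amount * 0.9
    else:
        return amount
"""
tree = FlagRemover("ENABLE_NEW_DISCOUNT", True).visit(ast.parse(source))
cleaned = ast.unparse(tree)  # requires Python 3.9+
print(cleaned)
# def price(amount):
#     return amount * 0.9
```

Locating the flag's definition and updating or deleting the tests that exercised the removed branch are separate steps that this sketch does not cover.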
