Generative AI coding assistants, such as GitHub Copilot and Tabnine, are revolutionising the coding process. These tools utilise foundational models (e.g. OpenAI’s GPT, Anthropic’s Claude) to suggest and generate code snippets from natural language inputs, enhancing developer efficiency. GitHub Copilot, for instance, now writes an average of 46% of the code in files where it is enabled and helps developers code up to 55% faster.
McKinsey’s research shows that software development is one of the four biggest opportunities for companies to improve efficiency with generative AI. But how can generative AI further enhance developers’ productivity beyond just code generation?
In this blog article, we’re going to delve into how generative AI can revolutionise the entire software development life cycle. We’ll shine a spotlight on the crucial role of performance optimisation, and explore how AI can turbocharge the code optimisation process.
Generative AI in the Software Development Life Cycle
Before delving into generative AI in software engineering, let’s quickly review the Software Development Life Cycle (SDLC). The SDLC comprises six stages: planning, design, development, testing, deployment, and maintenance, as depicted in Figure 1 below.
- Planning involves determining the software’s purpose, its users, and their desired features.
- Design entails creating the software’s blueprint, including its appearance, functionality, and data requirements.
- Development involves coding, testing, and bug fixing to build the software.
- Testing ensures the software operates correctly under different data and user scenarios.
- Deployment involves making the software accessible to users by installing it on their computers or servers.
- Maintenance keeps the software updated, fixing bugs, adding features, and enhancing performance.
According to a recent Stack Overflow survey, 70% of developers use AI tools or plan to do so within the next few months. Right now, many Generative AI tools for software engineering, like GitHub Copilot, Replit’s Ghostwriter, Sourcegraph’s Cody, and Tabnine, are largely focused on the development and testing stages, offering capabilities such as code generation and code completion. We discussed these capabilities in our previous blog post, “Generative AI for Code: what you need to know in 2023”.
However, the programming landscape is vast, filled with opportunities and numerous ways to enhance developers’ workflows. Figure 2 below offers a clear overview of how Generative AI can assist at each stage of SDLC.
Figure 2. Generative AI Use Cases by SDLC Stage
Code Better, Not Just Faster
Among the various stages of the SDLC, one that often doesn’t get the spotlight but is crucial to the software’s efficiency and effectiveness is Performance Optimisation. This stage, typically embedded within the Continuous Integration and Continuous Deployment (CI/CD) processes, focuses on optimising the code to ensure it runs at peak performance on the target hardware. It’s not just about making the code work; it’s about making the code work efficiently, reducing latency, and improving the overall user experience.
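To make “working efficiently” concrete, here is a toy Python example of our own (not drawn from any particular codebase): two functions that return identical results but differ sharply in performance as inputs grow.

```python
# Toy example: both functions return the same result, but their
# performance differs sharply as the inputs grow.

def find_common_slow(a, b):
    # O(len(a) * len(b)): every membership test scans the whole list.
    return [x for x in a if x in b]

def find_common_fast(a, b):
    # O(len(a) + len(b)): set membership tests are amortised O(1).
    b_set = set(b)
    return [x for x in a if x in b_set]

if __name__ == "__main__":
    import timeit
    a = list(range(10_000))
    b = list(range(5_000, 15_000))
    print("slow:", timeit.timeit(lambda: find_common_slow(a, b), number=3))
    print("fast:", timeit.timeit(lambda: find_common_fast(a, b), number=3))
```

The point is not this particular idiom, but that functionally equivalent code can differ by orders of magnitude in cost, which is exactly what performance optimisation targets.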
And the bar for code performance is rising in an AI-dominated future, for three key reasons we explain below.
Cost of Compute and Profitability
Software is eating the world. Even the allure of modern vehicles often lies in digital features like parking assistance and IoT connectivity. This software explosion is fuelled by generative AI coding assistants that exponentially accelerate code creation. But with great opportunity comes a great cost.
A16Z reports that cloud spending of 75-80% of revenue was common for software companies in 2021. Fast forward to 2023: businesses are racing to embed predictive and generative AI, amplifying computational and efficiency demands. According to ScaleTorch, the demand for AI computing power is expected to skyrocket by a staggering 750x within the next five years.
Clearly, efficient code is not merely a technical goal but a financial necessity, as it can significantly cut cloud costs and boost profit margins for organisations.
Speed, Scale and Customer Experience
In mission-critical applications, optimising code for rapid execution is paramount. For instance, in high-frequency trading, milliseconds saved can yield millions in profits. In autonomous vehicles, high-performance code can ensure safety by enabling swift decision-making. Similarly, lag-free code in video streaming ensures seamless playback for a superior viewer experience.
However, the advent of Generative AI and LLMs brings a new dimension to the speed challenge. Despite their benefits, the extensive processing times associated with LLMs can pose a significant hurdle for real-time and edge applications, particularly as the number of users and applications continues to grow.
Energy Efficiency and ESG
Amidst the rapid expansion of generative AI, the emphasis on Environmental, Social, and Governance (ESG) factors is intensifying, making energy-efficient code an urgent priority.
To put this into perspective, the training of GPT-3 is estimated to have consumed 1,287 MWh of energy, resulting in emissions of over 550 tons of carbon dioxide equivalent. This is comparable to one person making 550 round trips between New York and San Francisco – and that’s before the model is even launched to consumers. The environmental impact doesn’t stop at the training phase. For instance, integrating LLMs into search engines could potentially lead to a fivefold increase in computing power, resulting in substantial carbon emissions.
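As a rough back-of-envelope check of those figures (our own arithmetic, using the numbers cited above plus a commonly quoted rule of thumb of roughly one tonne of CO2e per passenger per New York-San Francisco round trip):

```python
# Back-of-envelope check of the figures cited above (rounded estimates).
training_energy_mwh = 1_287      # estimated energy to train GPT-3
training_emissions_t = 550       # "over 550 tons" of CO2e, per the text

# Implied carbon intensity of the compute used for training:
intensity_t_per_mwh = training_emissions_t / training_energy_mwh
print(f"~{intensity_t_per_mwh:.2f} tCO2e per MWh")   # ~0.43

# Assuming ~1 tCO2e per passenger per NY-SF round trip (rule of thumb),
# the training emissions correspond to about:
round_trips = training_emissions_t / 1.0
print(f"~{round_trips:.0f} round trips")             # ~550
```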
As we highlight in our blog article, “The Need for Green AI”, efficient code plays an important role in curbing emissions while enabling cutting-edge AI software applications.
The Challenges of Performance Optimisation
Performance optimisation, while crucial, is far from straightforward. It presents several challenges, including:
Scarcity of Expertise: Skilled performance engineers, capable of effectively optimising code, are a rare breed. In a city like London, the cost for such expertise can reach up to £500k per year, making it a resource that only tech giants with deep pockets can afford. This scarcity and high cost of expertise can turn performance optimisation into a significant hurdle for many organisations.
Time and Effort: The process of optimisation is iterative and often lengthy. It involves fine-tuning code, testing it, analysing the results, and repeating the process until optimal performance is achieved. Even the most experienced engineers can spend days figuring out the best ways to optimise code. This challenge is magnified in large codebases, where engineers often lack the global view needed to confidently apply optimisations; the profiling sketch after this list shows one way tooling can begin to surface hotspots.
Resource Limitations: Large codebases require significant human resources for improvement. For instance, a codebase with a million lines of code could require up to 70 top developers, including those working on tests, frontend, backend, etc., to review and maintain. Moreover, a new team of developers could spend 2-5 times longer to understand, review, and optimise the code. This requirement for substantial human resources adds to the complexity and cost of performance optimisation.
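As one illustration of how tooling can begin to address that missing global view, here is a generic sketch (not a depiction of any particular product) using Python’s built-in cProfile to surface the functions that dominate a program’s execution time:

```python
import cProfile
import pstats

def slow_hotspot(n):
    # Deliberately inefficient: builds a string by repeated concatenation.
    s = ""
    for i in range(n):
        s += str(i)
    return s

def main():
    for _ in range(50):
        slow_hotspot(5_000)

# Profile a full run, then list the functions with the highest
# cumulative time: the most promising optimisation targets.
cProfile.run("main()", "profile.out")
stats = pstats.Stats("profile.out")
stats.sort_stats("cumulative").print_stats(5)
```

Even this simple report gives engineers a ranked list of candidate hotspots, rather than a million lines of code to read.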
In 2020 alone, the estimated Cost of Poor Software Quality (CPSQ) in the United States was a whopping $2.08 trillion. This staggering figure includes expenditures on rework, lost productivity, and customer dissatisfaction resulting from subpar code. Addressing this trillion-dollar problem demands a new approach to performance optimisation. But is AI code generation the right solution?
The Limitations of AI Code Generators in Performance Optimisation
AI Code Generators can help developers write code more quickly and efficiently. However, as discussed in our previous blog, AI-generated code requires optimisation to meet production quality. Recent studies indicate that GitHub Copilot may produce code with significantly slower runtime performance.
Figure 3. AI-generated Code Requires Optimisation to Meet Production Quality.
Below, we explain the key limitations of AI code generators in performance optimisation:
Lack of Specialisation: Most AI code generators have been trained on natural language text and source code from publicly available sources. While this makes them versatile, they do not function as domain-specific performance optimisers that can address unique performance issues. For example, they cannot fine-tune a 3D game application for optimal framerate performance across different mobile chipsets.
Lack of Context: AI code generators don’t consider the specific constraints of your project. This includes hardware constraints, such as the specifications of the machines your code will run on, and user-specific constraints, like particular performance requirements or usage patterns. For example, the code they suggest may run slower or use more memory on ARM-based devices like smartphones compared to code optimised for the x86 chips commonly used in computers. As a result, developers may need to further refine and optimise any AI-generated suggestions to make sure the code runs efficiently on the intended hardware.
Lack of Flexibility: AI code generators typically rely on specific foundational models. GitHub Copilot, for instance, is built on OpenAI’s GPT-4, Sourcegraph’s Cody utilises Anthropic’s Claude 2, and Google’s Codey employs Google’s PaLM 2. In the rapidly evolving field of AI, diverse models, including the recently unveiled Llama 2 by Meta, are emerging, each potentially better suited to certain use cases. However, the inability to choose or switch between these foundational models limits users from customising the coding process to their unique needs, potentially hindering optimisation.
Lack of Performance Visibility: AI code generators don’t provide real-time feedback on how a given change will affect the performance of the code. This means developers must manually test and analyse the performance impact of each suggestion, adding extra steps to the optimisation process; the benchmarking sketch after this list shows what those manual steps look like.
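To make that last limitation concrete, here is the kind of measurement harness a developer must write by hand today to verify a suggestion. It is a minimal sketch, and both implementations and their names are illustrative examples of our own.

```python
import heapq
import statistics
import timeit

def original_impl(data):
    # The existing code path: sort everything, keep the first ten.
    return sorted(data)[:10]

def suggested_impl(data):
    # A hypothetical AI-suggested alternative using a heap.
    return heapq.nsmallest(10, data)

data = list(range(100_000, 0, -1))

# First confirm the suggestion preserves behaviour...
assert original_impl(data) == suggested_impl(data)

# ...then measure both over several trials, reporting the median.
for name, fn in [("original", original_impl), ("suggested", suggested_impl)]:
    times = timeit.repeat(lambda: fn(data), number=10, repeat=5)
    print(f"{name}: median {statistics.median(times):.4f}s for 10 runs")
```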
Artemis: Hunting Inefficient Code with Generative AI and Predictive ML
So, what will be the answer to performance optimisation in the future of AI-powered coding? Where general-purpose generative AI struggles with complex engineering tasks (Figure 4), TurinTech’s Artemis takes a different approach (Figure 5).
Figure 4. Generative AI May Struggle with Complex Engineering Tasks.
Figure 5. Artemis Automates the Code Optimisation Process.
Designed to amplify code performance for production, Artemis automates the optimisation process while enhancing code efficiency.
Inefficiency Hunting: Artemis automatically hunts for code inefficiencies across the entire codebase, guided by user-defined performance metrics (e.g. runtime, CPU usage). This allows for a comprehensive, context-aware analysis, ensuring no performance issue goes unnoticed. Artemis conducts both static analysis, in which the code is examined to identify and apply improvements, and dynamic analysis, in which the codebase is improved by scanning the code while it runs.
Flexible LLM Selection: Developers can flexibly choose the most appropriate LLMs for generating code candidates, which are potential enhancements to improve code performance. This flexibility also mitigates the risks associated with the deprecation of older models.
Survival of the Fittest: Mirroring natural selection, Artemis automatically refines and tests code candidates against custom performance criteria. It suggests optimal code modifications for specific use cases, ensuring only the highest-performing code prevails; a simplified sketch of this idea follows this list.
Real-Time Performance Review: As changes are made, Artemis provides a real-time performance review of the code. Users can see the impact of different modifications and choose the optimal one, making the process of performance optimisation more transparent and user-friendly.
Adaptive Optimisation with Full Privacy: When deployed on-premise, Artemis trains a private optimisation model directly on your codebase, ensuring your code is never exposed. The more you interact with Artemis, the more it learns from your feedback, better adapting to your codebase and knowledge base.
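To give a flavour of the “survival of the fittest” step, here is a deliberately simplified sketch of the general idea; it is not Artemis’s actual implementation, and all function names are our own. Candidates that change behaviour are rejected, and the fastest surviving variant wins.

```python
import timeit

# Candidate implementations of the same function. In Artemis's setting
# these would be LLM-generated variants; here they are written by hand.
def candidate_a(nums):
    total = 0
    for n in nums:
        total += n * n
    return total

def candidate_b(nums):
    return sum(n * n for n in nums)

def candidate_c(nums):
    return sum(map(lambda n: n * n, nums))

def select_fittest(candidates, reference, test_input, trials=5):
    """Return the fastest candidate that matches the reference output."""
    expected = reference(test_input)
    best_name, best_time = None, float("inf")
    for name, fn in candidates.items():
        if fn(test_input) != expected:
            continue  # reject candidates that change behaviour
        t = min(timeit.repeat(lambda: fn(test_input), number=100, repeat=trials))
        if t < best_time:
            best_name, best_time = name, t
    return best_name, best_time

nums = list(range(10_000))
winner, runtime = select_fittest(
    {"a": candidate_a, "b": candidate_b, "c": candidate_c},
    reference=candidate_a,
    test_input=nums,
)
print(f"fittest candidate: {winner} ({runtime:.4f}s per 100 runs)")
```

In Artemis’s setting, the candidates would be LLM-generated and the fitness function would reflect user-defined performance criteria such as runtime, CPU, or memory usage, as described above.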
Figure 6 illustrates a scenario where Artemis suggests the optimal code version (by changing just two lines of code) from among 64 alternatives. The recommended changes led to a notable improvement over the original code, with a 36.49% reduction in runtime, an 87.86% decrease in CPU usage, and a 4.24% improvement in memory utilisation.
Figure 6. Artemis Suggests the Optimal Code Version.
At its core, Artemis leverages AI to refine and expedite code optimisation, accelerating the journey of your code to production, reducing compute costs, and enhancing user experience. In the thrilling sphere of AI-driven coding, developers will harness diverse AI tools, from code generation to advanced performance optimisation, turbocharging productivity and igniting innovation.
About the Author
Wanying Fang | TurinTech Marketing