
Enhancing Sustainability of Modern Software Systems through Self-adaptive Architectures

Dr. Karthik Vaidhyanathan explains the concept of software sustainability and how his group’s research on self-adaptation is contributing to greener, more sustainable software.

Impact of Uncertainties on Software Sustainability
Imagine receiving an alert that your system has gone down or is consuming an unsustainable amount of power (yes, you read that right!). The immediate response is to identify and resolve the issue – whether it’s a sudden spike in user traffic, unexpected resource constraints, or a misconfiguration. But what if the system could adapt itself dynamically, anticipating and mitigating these problems in real time? Modern software systems, including AI systems, are subject to various types of uncertainties, ranging from unpredictable user behavior and fluctuating workloads to resource constraints in the operating environment, evolving security threats, and real-world variability in data inputs. Studies show that 64% of system outages result from misconfigurations1, and 91% of AI models degrade over time2. Beyond impacting reliability and performance, these uncertainties also have a huge impact on the environment (an increase in user load may trigger the need to add more resources). Recent estimates from the Green Software Foundation3 highlight that software emissions are equivalent to the emissions of air, rail, and shipping combined. As researchers working at the intersection of software architecture and ML, we are constantly trying to enhance the sustainability of modern software systems, including AI-enabled systems. One way we have attempted to tackle this problem is by making systems self-adaptive, so that they adapt their structure or behavior with minimal human intervention.

Enhancing Sustainability through Self-adaptation
More often than not, people associate sustainability with making things green. While that is not incorrect, it is not the only aspect of sustainability. Sustainability is a multi-dimensional quality attribute that encompasses the technical, environmental, social, and economic aspects of a running software system. My research on using self-adaptation to dynamically adapt software systems at runtime to enhance sustainability started during my PhD, where the focus was to continuously monitor the metrics of a running software system, identify potential uncertainties using AI, and then use AI to dynamically reconfigure the system to handle the uncertainty. We applied this concept to IoT systems, where we used AI to dynamically switch processing between edge, fog, and cloud to save battery power on IoT devices while guaranteeing system performance4. That is when we realized that the idea could be applied to a broader class of software systems.
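The monitor–analyze–reconfigure loop described above can be sketched in a few lines. The sketch below is a deliberately minimal illustration, not our published approach: the tier names, thresholds, and metrics are all hypothetical, and the real systems use learned (AI-based) analysis and planning rather than fixed rules.

```python
# Minimal sketch of a self-adaptation (MAPE-style) loop that shifts
# processing between edge, fog, and cloud. Tiers, thresholds, and metric
# names are illustrative assumptions, not the published design.

TIERS = ["edge", "fog", "cloud"]  # ordered from closest to the device outward

def analyze(metrics):
    """Flag an uncertainty when battery runs low or latency degrades."""
    return metrics["battery_pct"] < 20 or metrics["latency_ms"] > 200

def plan(metrics, current_tier):
    """Pick the tier that relieves the detected pressure."""
    if metrics["battery_pct"] < 20:
        # Offload processing away from the device to save its battery.
        return "cloud"
    if metrics["latency_ms"] > 200 and current_tier != "edge":
        # Pull processing one tier closer to the device to cut latency.
        return TIERS[TIERS.index(current_tier) - 1]
    return current_tier

def adapt(metrics, current_tier):
    """One loop iteration: returns the tier to execute on next."""
    if analyze(metrics):
        return plan(metrics, current_tier)
    return current_tier
```

In practice the analyze and plan steps are where the AI sits: models forecast the uncertainty before it materializes instead of reacting to threshold crossings.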

1 https://www.reuters.com/technology/major-tech-outages-recent-years-2024-07-19
2 https://www.nature.com/articles/s41598-022-15245-z
3 https://stateof.greensoftware.foundation/en/insights/software-emissions-are-equivalent-to-air-rail-shipping-combined/
4 Cámara, J., Muccini, H. and Vaidhyanathan, K., 2020, March. Quantitative verification-aided machine learning: A tandem approach for architecting self-adaptive IoT systems. In 2020 IEEE International Conference on Software Architecture (ICSA) (pp. 11-22). IEEE.

Most organizations today are adopting a microservice-based architectural style in which each service is designed around domain boundaries and team compositions. At the same time, organizations have realized, or are starting to realize, the importance of monitoring their carbon footprint and reducing their emissions. In this context, we developed an approach that uses self-adaptive mechanisms to dynamically decide which microservice instance to use in order to balance the trade-off between latency and energy consumption. To illustrate how this works, consider a scenario where a user located in Hyderabad sends a request to log in to an e-commerce application. Typically, such a request will be served by an instance of the microservice located in a data center close to the user, in order to guarantee a quick response. However, that instance might sit in a data center powered by fossil fuels, or it might be consuming more power due to a surge of incoming requests. Hence, it might be beneficial to serve the request from another instance that consumes less power and is located in a data center powered by renewable energy, at the cost of the additional network latency its location introduces. With this broad idea, we developed an approach that leverages reinforcement learning to decide which microservice instance should serve a user request by considering the trade-off between response time and energy consumption. This work has been published in top-tier international conferences5. We also extended this concept to serverless functions in recently published work, where the goal was to decide the optimal instance on which to place a serverless function considering the trade-off between cost and performance6.
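To make the idea concrete, here is one way such a selector could be framed as a simple bandit-style reinforcement learner. This is a hedged sketch under my own assumptions: the instance names, the weighted latency/energy reward, and the epsilon-greedy scheme are illustrative, not the exact formulation in the published work.

```python
import random

# Sketch of an RL-flavored instance selector: each microservice instance is
# an "arm"; the reward penalizes observed latency and energy, so the agent
# gravitates toward instances that balance both. Weights and the
# epsilon-greedy policy are illustrative assumptions.

class InstanceSelector:
    def __init__(self, instances, w_latency=0.5, w_energy=0.5, epsilon=0.1):
        self.instances = instances
        self.w_latency, self.w_energy = w_latency, w_energy
        self.epsilon = epsilon                  # exploration rate
        self.q = {i: 0.0 for i in instances}    # running value estimates
        self.n = {i: 0 for i in instances}      # observation counts

    def select(self):
        # Explore occasionally; otherwise exploit the best-known instance.
        if random.random() < self.epsilon:
            return random.choice(self.instances)
        return max(self.instances, key=lambda i: self.q[i])

    def update(self, instance, latency_ms, energy_j):
        # Reward is higher (less negative) when latency and energy are low.
        reward = -(self.w_latency * latency_ms + self.w_energy * energy_j)
        self.n[instance] += 1
        # Incremental average of observed rewards for this instance.
        self.q[instance] += (reward - self.q[instance]) / self.n[instance]
```

With this framing, a nearby fossil-fuel-powered instance that is fast but energy-hungry can lose out to a slightly slower instance in a renewable-powered data center, exactly the trade-off described above.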

Extending to AI-enabled Systems
The emergence of AI has enhanced many walks of life, albeit at a significant environmental cost. Estimates suggest that data centers already consume 2% of the world’s electricity and require vast amounts of fresh water, with AI taking the bulk of the load, and these numbers are only expected to increase7. We believe self-adaptive systems can play a role in enhancing the sustainability of AI systems without compromising system performance or the accuracy of the AI models. For almost any task today, there is a range of AI models to choose from. For example, if we want to chat, we have ChatGPT, Claude, Llama, or Gemini, to name a few. Similarly, for tasks like object detection, we have families of models such as the YOLO series or the Detectron series. Most of the time we do not need a complex AI model; a simple one suffices, though some scenarios do demand a complex model that can provide high accuracy. Based on this thought process, as a starting point we developed EcoMLS8, which presents the architecture of an ML system that leverages self-adaptation, switching between different AI models at runtime to trade off accuracy against energy consumption. We took object detection as our domain and built a self-adaptive ML system that switches between object detection models at runtime, deciding when to use which model without compromising accuracy while simultaneously reducing power consumption. We found that our approach could reduce energy consumption by close to 80% compared with always using a large, complex model.
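The intuition behind runtime model switching can be sketched as a simple escalation policy: run the small, low-power model first and fall back to the large one only when the small model is unsure. This is an illustrative simplification, not the EcoMLS implementation; the model names, power figures, and the 0.6 confidence threshold are assumptions for the sake of the example.

```python
# Sketch of confidence-based model switching for object detection, in the
# spirit of EcoMLS. Model names, power figures, and the threshold are
# hypothetical; the real system's switching logic is more sophisticated.

MODELS = {
    # name: assumed average power draw in watts (illustrative only)
    "yolo_nano":  5.0,
    "yolo_large": 40.0,
}

def detect_with_switching(run_model, frame, threshold=0.6):
    """run_model(name, frame) -> (detections, confidence).

    Tries the small model first; escalates to the large model only when
    the small model's confidence falls below the threshold.
    """
    detections, confidence = run_model("yolo_nano", frame)
    if confidence >= threshold:
        return detections, "yolo_nano"
    # Small model was unsure: pay the energy cost of the large model.
    detections, _ = run_model("yolo_large", frame)
    return detections, "yolo_large"
```

Since most frames in a typical stream are easy, the expensive model runs only on the hard minority, which is where the large energy savings come from.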

5 Karthik Vaidhyanathan, Mauro Caporuscio, Stefano Florio, and Henry Muccini. 2024. ML-enabled Service Discovery for Microservice Architecture: a QoS Approach. In Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing (SAC ’24)
6 Jain, P., Singhal, P., Pandey, D., Quatrocchi, G., Vaidhyanathan, K. (2025). POSEIDON: Efficient Function Placement at the Edge Using Deep Reinforcement Learning. In: Gaaloul, W., Sheng, M., Yu, Q., Yangui, S. (eds) Service-Oriented Computing. ICSOC 2024.
7 https://www.wired.com/story/true-cost-generative-ai-data-centers-energy
8 M. Tedla, S. Kulkarni and K. Vaidhyanathan, “EcoMLS: A Self-Adaptation Approach for Architecting Green ML-Enabled Systems,” in 2024 IEEE 21st International Conference on Software Architecture Companion (ICSA-C), https://arxiv.org/abs/2404.11411

Further, we extended this concept to the larger MLOps pipeline. Put simply, MLOps can be thought of as a set of practices that combines machine learning (ML), software engineering, and DevOps to streamline and automate the end-to-end lifecycle of ML development, deployment, and management. An MLOps pipeline typically enhances the maintainability of an ML system. However, periodically retraining ML models also has an environmental footprint.

Even though retraining may improve the accuracy of the ML models, that accuracy also needs to be weighed against its environmental impact: a 1% increase in accuracy can sometimes cost tons of CO2 emissions9. To this end, we developed an approach to architecting self-adaptive MLOps pipelines that decide when to retrain models, switch between models, or use a greener cloud instance for retraining, by considering cost, performance, accuracy, and the current carbon footprint. This work was published at the International Conference on Software Architecture (ICSA 2024)10, where it won the best poster award.
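The kind of decision such a pipeline makes can be sketched as a small planning function. The rules, thresholds, and action names below are my own illustrative assumptions, a stand-in for the published approach’s actual decision logic, which weighs cost, performance, accuracy, and carbon footprint together.

```python
# Sketch of a self-adaptive MLOps planner that chooses among switching
# models, retraining on a greener instance, or retraining in place.
# All thresholds and action names are hypothetical.

def plan_adaptation(accuracy, accuracy_floor, drift_detected,
                    grid_carbon_gco2_kwh, fallback_model_ok):
    """Return one adaptation action for the current pipeline state."""
    if accuracy >= accuracy_floor and not drift_detected:
        return "no_action"                 # model is still good enough
    if fallback_model_ok:
        # Cheapest fix: switch to an already-trained alternative model
        # instead of paying the carbon cost of retraining.
        return "switch_model"
    if grid_carbon_gco2_kwh > 300:
        # Current grid is carbon-heavy: move retraining to a greener
        # cloud instance/region rather than retraining here and now.
        return "retrain_on_green_instance"
    return "retrain_now"
```

The key design point is that retraining is the last resort, and even then the pipeline prefers to schedule it where the electricity is cleanest.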

Onward and Forward
Currently, one of the biggest challenges in software engineering is enhancing the sustainability of modern software systems, in particular AI systems. Efforts are already underway at a global level to make software greener, such as those of the Green Software Foundation, where different organizations are involved in defining green software engineering practices. A number of tools for measuring power, carbon footprint, and related metrics have also been made available by both the research and practitioner communities. However, more concerted effort between academia and industry is required to create a wider impact, and in many cases a lack of awareness also needs to be addressed. As far as our research is concerned, we are working on extending our model-switching approach to generative AI-based applications and to the edge-cloud continuum. We are also extending our self-adaptive MLOps pipeline to support different types of AI systems, and we are collaborating with research groups such as the S2 group at VU Amsterdam, the Netherlands, on studies that can help practitioners and researchers build greener software systems. In addition, another compelling research angle, being actively explored in collaboration with Lloyds Technology Centre, is the development of practices that enable architects to arrive at green software design and deployment choices.

9 https://arxiv.org/abs/1906.02243
10 H. Bhatt, S. Arun, A. Kakran and K. Vaidhyanathan, “Towards Architecting Sustainable MLOps: A Self-Adaptation Approach,” in 2024 IEEE 21st International Conference on Software Architecture Companion (ICSA-C) https://arxiv.org/pdf/2404.04572

This article was initially published in the December edition of TechForward Dispatch 

Dr. Karthik Vaidhyanathan is an Assistant Professor at the Software Engineering Research Center, IIIT-Hyderabad, India where he is also associated with the leadership team of the Smart City Living Lab. His main research interests lie in the intersection of software architecture and machine learning with a specific focus on building sustainable software systems in the cloud and the edge. Karthik also possesses more than 5 years of industrial experience in building and deploying ML products/services. Karthik is also an editorial board member of IEEE Software. https://karthikvaidhyanathan.com

 
