Towards green software: tackling the energy cost of scientific software

A rack of ATLAS TDAQ components. Photo by CERN.
The computing farm for the real-time data selection system of the ATLAS experiment at CERN, where Lund particle physics researchers work, analyses up to 100,000 collision events per second when the Large Hadron Collider is running. Image: ATLAS Experiment

Research in particle physics often relies on sizable, cutting-edge computing resources for analysing large datasets, producing simulation samples, or developing and running complex machine learning models. While particle physics has been a pioneer in dealing with many “big science” issues and raised the stakes in the Large Hadron Collider era, today it is by no means isolated.

More and more research fields are facing similar challenges. And one aspect is common to all of them: much of their software is written or used without considering energy efficiency as a metric. In fact, researchers write or provide software and procure computing hardware, but power consumption costs are footed by institutions or laboratories, often creating the impression that, somehow, computational power is an infinite resource.

This is, however, far from reality, as the use of scientific software and ML models can have a significant environmental cost [1]. Unsurprisingly, a growing number of scientists feel that it is their duty to break from unsustainable paradigms and instead estimate and tackle the environmental cost of their research software. Sustainable software development is an active field of research [2], and projects that complement the green computing trends in hardware manufacturing with a bottom-up approach from the researchers themselves are emerging in several research institutions and fields.

A good example is “Tackling the energetic cost of scientific software,” a newly launched collaboration between Lund University in Sweden and the University of Manchester in the UK. It encourages a mindset change for scientific software users and developers towards net zero goals. The project will pilot ways to estimate and improve the energy efficiency of high-throughput physics, astronomy and life sciences software, and will begin training a new generation of researchers to choose and develop the most energetically sustainable software solutions suited to their scientific problem.

Caterina Doglioni, senior lecturer in Particle Physics in Lund with a joint appointment at the University of Manchester, tells us what this project is about: “We analyse a lot of data, and if we can connect small improvements that are also good practices in programming to energy efficiency, then it's a big motivation for new students and collaborators to code better”. And she adds: “Our field needs to do all it can for the environment”. She collaborates on this project with Alexander Ekman, a PhD student also in particle physics, Sonja Ajts from Medicine, and Alexandros Sopasakis from Mathematics.

In practice, how is this going to work? The first challenge is to review and collect robust metrics to evaluate the energy efficiency and environmental impact of scientific data analysis software and machine learning algorithms on different computing architectures. This is a relatively new field, as the first set of principles of green software engineering was introduced in 2019 and updated in 2022 [2]. The starting point will be the basis provided by the Green Software Initiative, where the Software Carbon Intensity (SCI) [3] is used as a metric to inform software choices for users and developers. This metric includes emissions from the energy required by the hardware on which the software operates, as well as emissions associated with manufacturing the hardware.
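As a rough illustration of how the SCI combines these two contributions, the sketch below follows the formula given in the SCI specification [3], SCI = ((E × I) + M) per R, where E is the energy consumed, I the carbon intensity of the electricity, M the embodied (manufacturing) emissions attributed to the software, and R a functional unit such as "per processed event". The function name and all numerical values are illustrative assumptions, not measurements from the project.

```python
def software_carbon_intensity(energy_kwh, grid_intensity_g_per_kwh,
                              embodied_g, functional_units):
    """Return grams of CO2-equivalent per functional unit: ((E * I) + M) / R.

    energy_kwh               -- E, energy consumed by the software (kWh)
    grid_intensity_g_per_kwh -- I, carbon intensity of the electricity (gCO2e/kWh)
    embodied_g               -- M, share of hardware manufacturing emissions (gCO2e)
    functional_units         -- R, e.g. number of events processed
    """
    if functional_units <= 0:
        raise ValueError("functional_units must be positive")
    operational_g = energy_kwh * grid_intensity_g_per_kwh  # emissions from running
    return (operational_g + embodied_g) / functional_units


# Hypothetical job: 12 kWh on a 50 gCO2e/kWh grid, 400 gCO2e embodied share,
# one million events processed. All numbers are made up for illustration.
sci_per_event = software_carbon_intensity(12.0, 50.0, 400.0, 1_000_000)
print(f"SCI: {sci_per_event:.6f} gCO2e per event")
```

Expressing the result per functional unit is what makes two software choices comparable: a framework that uses more energy in total may still score better if it processes proportionally more events.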

In a second stage, pilot studies analysing selected software frameworks and analysis codes from different disciplines, including particle physics, astronomy and medicine, will be conducted. The metrics include the energy consumed by a software system, which can be reduced by writing more efficient code. An important part of the work will be to identify inefficiencies and redundancies in the code via best practices.
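To make the link between redundant code and energy concrete, here is a minimal sketch that times two equivalent implementations and converts runtime into an energy estimate. It assumes a constant average power draw (the hypothetical `ASSUMED_AVG_POWER_W`), which is a crude stand-in for a real power measurement; the function names and the example computation are also illustrative and not taken from the project.

```python
import time

ASSUMED_AVG_POWER_W = 100.0  # hypothetical average power draw of the machine


def estimate_energy_joules(runtime_s, avg_power_w=ASSUMED_AVG_POWER_W):
    """Energy = power x time, under the constant-power assumption."""
    return runtime_s * avg_power_w


def timed(func, *args):
    """Run func(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def normalise_redundant(values):
    # Redundancy: the sum is recomputed for every element (quadratic work).
    return [v / sum(values) for v in values]


def normalise_hoisted(values):
    # Same result, but the sum is computed once (linear work).
    total = sum(values)
    return [v / total for v in values]


values = list(range(1, 2001))
slow_result, slow_time = timed(normalise_redundant, values)
fast_result, fast_time = timed(normalise_hoisted, values)

print(f"redundant: {estimate_energy_joules(slow_time):.4f} J (estimated)")
print(f"hoisted:   {estimate_energy_joules(fast_time):.4f} J (estimated)")
```

Both versions return the same numbers, so any saving comes purely from removing repeated work. A real study would replace the constant-power assumption with measured power.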

But how large can the effect be? As the project is just starting, the answer is still qualitative, but the intuition is that it can have a sizable impact in high-throughput software. And there are several paths to explore. Taking ML models as an example, one can define trade-offs between precision or discrimination and energy consumption, seek optimal architectures for a specific class of problems, or focus on real-time data processing aspects prior to feeding inputs to the ML algorithms, as a way to understand the smallest and most effective subset of information to achieve good performance.
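One simple way to express such a trade-off is to fix the performance a physics analysis needs and then pick the least energy-hungry model that reaches it. The sketch below does exactly that; the `Candidate` class, the model names, and the accuracy and energy figures are all hypothetical and serve only to illustrate the selection logic.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float    # any performance figure where higher is better
    energy_kwh: float  # energy needed to train or run the model


def cheapest_meeting_target(candidates: Sequence[Candidate],
                            min_accuracy: float) -> Optional[Candidate]:
    """Return the lowest-energy candidate reaching min_accuracy, or None."""
    eligible = [c for c in candidates if c.accuracy >= min_accuracy]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c.energy_kwh)


# Made-up candidates: larger models gain little accuracy for much more energy.
models = [
    Candidate("small", accuracy=0.91, energy_kwh=0.5),
    Candidate("medium", accuracy=0.94, energy_kwh=2.0),
    Candidate("large", accuracy=0.95, energy_kwh=9.0),
]

choice = cheapest_meeting_target(models, min_accuracy=0.93)
print(choice.name if choice else "no model reaches the target")
```

Stating the required performance first makes the energy cost of chasing marginal gains explicit.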

The output will be a collection of domain-specific metrics and best practices for sustainable and energy-efficient analysis software and ML algorithms. Hopefully this will inform more comprehensive follow-up studies, and be extended to further disciplines at a later stage. This will begin to close the gap between the state-of-the-art studies on the energy efficiency of individual algorithms and domain-specific challenges and needs.

But the bottom-up change in mindset intended with this project involves another, complementary goal: to increase the literacy of students and young researchers on the energy efficiency of software and best coding practices. This project has just started with the involvement of students and Early Career Researchers from India, Ukraine and the UK, supported by the Google Summer of Code [4], IRIS-HEP [5] and Learning Through Research [6] programs. These students are working together with the Lund researchers to obtain first estimates on selected software frameworks in particle physics by the end of this summer.

Their most important findings and methods will also be included in teaching for undergraduate and graduate students, some of the people who will deploy and lead future software development projects and future facilities.

1. https://arxiv.org/abs/1906.02243

2. https://learn.greensoftware.foundation/introduction

3. https://github.com/Green-Software-Foundation/sci

4. https://summerofcode.withgoogle.com/programs/2023/projects/Nks9akq7

5. https://iris-hep.org/fellows.html

6. https://www.careers.manchester.ac.uk/findjobs/internships/2ndyear/sei/