
The energy behind AI: Why power efficiency matters
In our latest whitepaper, we explain how Nebius improves efficiency across the stack, from software engineering to hardware design and data center operations.
Training and running modern AI models require large-scale compute capacity, specialized hardware and data centers designed for high-density workloads. Each training run and every model output relies on power delivered through physical infrastructure. A lot of power.
As AI adoption accelerates, energy use increasingly sets the boundaries of how far these systems can scale. Power availability, efficiency and infrastructure design are becoming practical constraints. This shift is prompting cloud providers, enterprises and AI practitioners to look for concrete ways to manage the energy footprint of AI systems: optimizing how energy is used and turning those optimizations into measurable efficiency gains.
AI needs power. A lot of it
When we type a question into a chatbot, the response feels instant. But under the hood, the model executes a long chain of mathematical operations across GPUs and other specialized hardware running in data centers around the world. All of this consumes electricity. That is true for inference, the process of generating responses, and even more so for the training that comes before it, where thousands of servers can run continuously for weeks or months.
The link between digital technologies and electricity consumption is not new. Data centers have been significant energy consumers for more than a decade. In 2010, they already accounted for just over 1% of global electricity use¹, a figure that grew to around 1.5% by 2024². What has changed is the nature of the demand. AI introduces more power-dense hardware, more complex clusters and higher cooling requirements.
Electricity consumption from data centers has grown 12% per year over the last five years³, in large part due to the rise of AI. Some projections show that in the U.S. alone, data center electricity use could reach 325–580 TWh per year by 2028⁴, roughly comparable to the annual electricity consumption of the UK or Germany today.
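To put these figures in perspective, here is a back-of-the-envelope calculation: a minimal Python sketch of the compounding implied by 12% annual growth, alongside the scale of the 2028 projection. The UK and Germany values are approximate assumptions added for illustration, not figures from the cited reports.

```python
# Back-of-the-envelope arithmetic; illustrative only.
# The 12% annual growth rate and the 325-580 TWh projection come from the
# sources cited above; the UK and Germany figures are rough assumptions
# (~270 TWh and ~500 TWh per year) used purely for scale.

annual_growth = 0.12
years = 5
cumulative = (1 + annual_growth) ** years
print(f"12% per year over {years} years = {cumulative:.2f}x total growth")  # ~1.76x

us_projection_2028 = (325, 580)   # TWh per year, U.S. data centers (projection)
uk_twh, germany_twh = 270, 500    # assumed national totals, TWh per year
print(f"2028 U.S. data center projection: {us_projection_2028[0]}-{us_projection_2028[1]} TWh")
print(f"For scale: UK ~{uk_twh} TWh, Germany ~{germany_twh} TWh annually")
```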
The direction is clear: AI is becoming a major new driver of power demand, and how this demand is handled matters.
With great power comes great responsibility
This growing energy footprint creates responsibility across the entire AI ecosystem. That responsibility begins with understanding where one’s real influence sits and how it connects to the influence of others, amplifying or limiting it.
From end users generating requests, to engineers designing hardware, to cloud providers operating large-scale data centers — everyone involved influences how much energy AI consumes, how much of that use can be avoided and how much is ultimately converted into useful output. AI operates as a production process spanning models, software, hardware, clusters and facilities, with decisions at each level shaping the final energy footprint.
Energy in, tokens out: Understanding the black box
From the outside, AI development can appear like a black box. Electricity goes in, and AI output comes out. Yet two providers drawing similar amounts of power may deliver very different levels of AI output. Understanding why requires looking inside the system and breaking it down into parts that can be measured, compared and optimized.
Inside the box sits a complex stack of interconnected systems. Hardware choices, cluster design, workload scheduling, resource utilization and data center architecture all influence how much AI value is produced from a given amount of power.
Four layers of energy-to-AI efficiency
Each layer has its own stakeholders, technical constraints and optimization levers. For example, users, including AI developers and adopters, define workload patterns and model choices that determine how much energy is required to deliver a given computational task. Hardware and software vendors, who supply the building blocks of AI infrastructure, set the capabilities and constraints of the systems that execute that task. AI infrastructure providers then design, operate and optimize the environments that bring workloads and underlying components together. Responsibility is shared, but influence is uneven.
Figure 1. Four layers of efficiency we factor into the journey from energy to tokens
This layered view helps clarify where efficiency gains are possible and where trade-offs exist. It also makes it possible to track sustainability impacts through measurable engineering outcomes tied to specific parts of the AI stack.
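As an illustration of how these layers interact, consider the toy calculation below: a minimal Python sketch in which overall energy-to-tokens efficiency is treated as a product of per-layer factors. The layer names follow the framing above, but the numbers are hypothetical and are not drawn from the whitepaper; the point is only that modest differences at each layer compound, which is why two providers drawing similar power can deliver very different amounts of output.

```python
# Toy model, illustrative only: per-layer efficiencies are hypothetical values,
# not measurements. It shows how losses at each layer compound multiplicatively.

def tokens_per_kwh(facility, hardware, utilization, software,
                   peak_tokens_per_kwh=100_000):
    """Overall output = peak output scaled by the efficiency of each layer."""
    return peak_tokens_per_kwh * facility * hardware * utilization * software

# Provider A: unremarkable efficiency at every layer
a = tokens_per_kwh(facility=0.75, hardware=0.80, utilization=0.60, software=0.70)

# Provider B: better facility design, scheduling and software optimization
b = tokens_per_kwh(facility=0.90, hardware=0.85, utilization=0.85, software=0.90)

print(f"Provider A: {a:,.0f} tokens per kWh")
print(f"Provider B: {b:,.0f} tokens per kWh, about {b / a:.1f}x more from the same energy")
```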
Scaling the sector responsibly
As a vertically integrated AI cloud provider, Nebius operates across all layers of efficiency in the energy-to-AI process.
We design and operate data centers, build and test server hardware in-house and develop the software stack that orchestrates workloads. This gives us multiple direct points of influence over the energy footprint of AI workloads.
As demand for AI grows, paying attention to the infrastructure powering it will only become more essential in the years ahead. Through innovation at every layer and an engineering-led approach, we can ensure that every watt is used with purpose.
References
1. Jonathan G. Koomey (2011). Growth in Data Center Electricity Use 2005 to 2010. Oakland, CA: Analytics Press. alejandrobarros.com/wp-content/uploads/old/4363/Growth_in_Data_Center_Electricity_use_2005_to_2010.pdf
2. International Energy Agency (2025). Energy and AI. Paris: IEA. iea.org/reports/energy-and-ai
3. International Energy Agency (2025). Energy and AI: Energy Demand from AI. Paris: IEA. iea.org/reports/energy-and-ai/energy-demand-from-ai
4. Berkeley Lab, Energy Analysis & Environmental Impacts Division (2024). United States Data Center Energy Usage Report. Berkeley, CA: Lawrence Berkeley National Laboratory. escholarship.org/uc/item/32d6m0d1