Darren Burgess, Castrol’s Data Center Cooling

Castrol expands its thermal management empire with strategic investment in ECS

In St. Louis, the rising heat of next-generation AI met its match at SC25 as Castrol announced a strategic investment in Electronic Cooling Solutions (ECS), a Santa Clara–based thermal engineering firm known for its deep bench of CFD modeling, reliability testing, and design-for-deployment expertise. The move signals Castrol’s shift from “fluid supplier” to full-stack thermal partner for data centers navigating the swelling power demands of artificial intelligence and high-performance computing.
 
Between keynotes, we sat down with Darren Burgess, Castrol’s Data Center Cooling specialist from Austin, Texas. In a conversation that bounced from Bitcoin mines to hyperscale design rooms, Burgess laid out why Castrol is betting big on immersion cooling, and why ECS is the linchpin.

Immersion Cooling’s Momentum and Why Single-Phase Leads Today

Burgess described immersion cooling as “the simplest path to big power savings,” emphasizing single-phase immersion as the star of today’s deployments. Bitcoin miners have already paved the way: predictable thermals, easy heat capture, fewer moving parts, and measurable reductions in energy overhead.
 
“The industry is learning what miners figured out early,” Burgess told us. “When the density goes up, air just taps out.”
 
Two-phase immersion may be the future, but Castrol is positioning it carefully. “It’s coming,” Burgess said. “But the industry needs predictable supply chains and stability first. That’s where Castrol’s global network becomes an advantage.”

The Glycol Problem No One Talks About

Castrol’s data center expansion isn’t just about immersion. Burgess highlighted a quieter but critical battleground: the chemistry inside traditional hydronic loops. Specifically, he pointed to propylene glycol (PG 25), a staple in cooling systems whose stability is often taken for granted.
 
“PG is like a living system,” Burgess said. “If you don’t monitor it, corrosion becomes an invisible tax. Fluid health isn’t optional anymore, it’s uptime insurance.”
 
Castrol is developing next-gen formulations, including detoxified ethylene glycol options with higher-temperature tolerance.

ECS + Castrol: A Full-Stack Thermal Alliance

The newly announced investment gives Castrol something it has never possessed at a global scale: deep thermal engineering capabilities that touch every layer of system design.
ECS brings:
  • Room-to-rack thermal modeling
  • System-level CFD
  • Failure-mode and reliability analysis
  • Immersion and liquid cooling design validation
  • Acclimation, condensation, and corrosion forensic services
Their portfolio includes AI module liquid-cooling designs up to 17 kW, corrosion root-cause tracing, and environmental acclimation studies for hyperscale data centers.
 
With Castrol’s investment, Bharat Vats, an industry veteran and former CEO of Atom Power, has been named President and CEO of ECS. His mandate: scale up ECS’s impact across hyperscalers, OEMs, cloud providers, and energy-intensive AI labs.
 
“Working with Castrol opens the door for ECS to reach the entire data center ecosystem,” Vats said. “Together, we can accelerate the shift to more efficient cooling architectures.”

Why This Investment Matters Now

A recent Castrol-commissioned survey found that 74% of data-center experts now believe liquid cooling is the only path forward for today’s AI power densities. Yet many operators hesitate due to integration complexity and a lack of trusted partners.
 
Castrol believes combining its supply-chain muscle with ECS’s engineering precision will remove those barriers.
 
Peter Huang, Castrol’s Global VP of Data Centre Thermal Management, put it plainly: “The industry needs partners that can guide them from whiteboard to deployment. Castrol wants to be that end-to-end partner.”

A Turning Point for AI-Era Data Centers

SC25 has made one thing obvious: thermal is no longer a back-of-house concern. It is the governing constraint of AI. The players who master heat will be the ones who shape the computing landscape of the next decade.
 
With Castrol expanding from automotive lubricants into immersion, hydronics, and now full-stack thermal design, and ECS bringing decades of analysis and validation expertise, the partnership lands at a pivotal moment.
 
Together, they’re sending a clear message to hyperscalers and AI labs everywhere: The future isn’t just faster. The future runs colder.

Abstraction, automation: Scientific computing enters a new era at SC25

At the SC25 show, the ACM and IEEE-CS Award Presentations provided more than recognition; they reflected on the past and future of scientific computing. The keynote, "Abstraction and Automation: From Workflows to Intelligent Systems and the Future of Scientific Discovery," was delivered by Ewa Deelman of the University of Southern California, a pioneer known for leading the development of the Pegasus Workflow Management System.

From Code to Workflows: Making Complexity Human-Scale

Deelman traced the layered evolution of scientific computing. Initially, researchers worked with machine code and manually scheduled tasks. Subsequently, scripts, batch systems, and workflow engines emerged, serving not as mere conveniences but as tools to preserve scientific intent while managing complexity.
 
Pegasus emerged from this philosophy. Rather than requiring scientists to think like system schedulers, Pegasus translated high-level scientific descriptions into reliable execution across diverse environments, ranging from high-performance supercomputers to distributed grids. The aim was not automation alone, but rather reproducibility, transparency, and trust.
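
To make the idea of an abstract workflow concrete, here is a minimal sketch. It is not the Pegasus API: the job names and the plan function are hypothetical, and the ordering step simply uses Python's standard graphlib module. It only illustrates how a scientist can declare what depends on what and leave the execution order to the system.

```python
from graphlib import TopologicalSorter

# Hypothetical abstract workflow: each job maps to the set of jobs it depends on.
# The scientist states intent (dependencies), not a schedule.
workflow = {
    "preprocess": set(),
    "simulate": {"preprocess"},
    "analyze": {"simulate"},
    "archive": {"simulate"},
    "plot": {"analyze"},
}

def plan(jobs):
    """Return one execution order that respects every declared dependency."""
    return list(TopologicalSorter(jobs).static_order())

order = plan(workflow)
print(order)
```

A real workflow system would also handle data staging, retries, and provenance; the sketch covers only the ordering step.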

Automation Arrives, and Changes the Scientific Lifecycle

Deelman shifted to the present, where automation has moved far beyond workflow execution. With artificial intelligence now embedded throughout the research pipeline, systems are:
  • assisting with hypothesis generation
  • optimizing and adapting workflows
  • monitoring results in real time
  • and supporting interpretation and publication
In her words, systems are no longer just running science; they are reasoning about it. For fields where data volumes exceed human capacity, cognitive automation has become essential rather than optional.

Transparency, Trust, and the Human Role

The rise of intelligent automation brings new responsibilities. Deelman raised questions that resonated across the SC25 audience:
  • How do we ensure transparency when systems make autonomous choices?
  • What does scientific accountability look like when recommendations come from models, not humans?
  • Where must human judgment remain non-negotiable?
Rather than replacing scientists, Deelman argued, automation amplifies the need for critical thinking and creativity. Scientific skepticism becomes more, not less, important when systems can produce convincing results without explanation.

Design Principles That Endure Through Change

Despite shifting technologies, Deelman highlighted the principles that have sustained Pegasus for decades:
  • abstraction that clarifies rather than conceals
  • automation that supports scientific intent, not overrides it
  • reproducibility as a foundation, not a feature
These values, she emphasized, must guide the next generation of intelligent systems.

Looking Forward: Machines as Partners in Discovery

Deelman closed with optimism grounded in realism. Intelligent systems will soon help explore parameter spaces unreachable by human reasoning alone, uncover patterns hidden in massive datasets, and accelerate breakthroughs that once took decades.
 
But progress requires discipline: transparent algorithms, accountable design, and a scientific culture that refuses to outsource curiosity.
 
The applause that followed made clear that the supercomputing community understood the moment. At SC25, the message was unmistakable: scientific computing is entering a new era. Not one defined by machines replacing thought, but by machines expanding what thought can reach.

A retrospective on science-driven system architecture, the grand challenges ahead

When John Shalf stepped onto the stage for his award presentation at SC25, the atmosphere was charged with the sense of a field at a pivotal moment. His talk, titled "A Retrospective on Science-Driven System Architecture and Grand Challenges for the New Century," traced his professional journey, which reflected the evolution and challenges of modern high-performance computing.

From Early Experiments to Scientific Breakthroughs

Shalf recounted his early days at Virginia Tech, where his first exposure to reconfigurable computing and DNA sequence comparison drew him into the high-performance computing (HPC) ecosystem. This foundation led him to Oak Ridge, where he contributed to materials-prediction codes and experienced the renowned "frog memo." He later transitioned to the National Center for Supercomputing Applications (NCSA), where he worked on some of the most ambitious codes of that era.

He mentioned key projects such as Enzo for cosmology, Cactus for general relativity, and the SC95 initiative, which linked supercomputers across the continent to create a single distributed machine simulating colliding galaxies in real time.

These milestones were not mere footnotes; they represented significant advances leading to one of the greatest scientific achievements of the century: the detection of gravitational waves. Shalf emphasized that the 15-year gap between the prediction and confirmation of these waves was not a delay, but rather a testament to the kind of sustained, disciplined computation that truly defines scientific progress.

Designing for the Workload, Not the Hype

A key theme of Shalf's talk was clear and direct: future systems should be designed around real workloads rather than aspirational benchmarks. He pointed out the development of the Sustained System Performance (SSP) benchmark at Berkeley Lab, which aims to accurately represent system performance instead of flattering it.

This philosophy relies on workload analysis, the use of algorithm/application matrices, and collaboration with applied mathematicians. The shift from synthetic performance to actionable intelligence marks a significant evolution in HPC thinking.
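
For readers unfamiliar with SSP, a minimal sketch follows. It assumes one commonly described formulation, the geometric mean of per-core application rates scaled by the core count. The function name, the rates, and the core count are hypothetical, and the actual Berkeley Lab methodology may differ in detail.

```python
import math

def ssp(per_core_rates, num_cores):
    """Geometric mean of per-core application rates, scaled by core count.

    Assumed formulation for illustration only.
    """
    log_sum = sum(math.log(rate) for rate in per_core_rates)
    geometric_mean = math.exp(log_sum / len(per_core_rates))
    return geometric_mean * num_cores

# Hypothetical measured rates (GFLOP/s per core) for three application benchmarks.
rates = [1.0, 4.0, 2.0]
score = ssp(rates, num_cores=1000)
print(f"SSP: {score:.1f} GFLOP/s")
```

Because the geometric mean is pulled down by the slowest application, a system cannot score well by excelling on a single favorable code, which fits the goal of representing performance rather than flattering it.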

Bandwidth, Wires, and the Growing Crisis Beneath the Surface

Shalf explored the fundamental technical challenge of the post-Dennard era: as transistors continue to shrink, the reliability of wires decreases. While bandwidth increases, so does congestion. Memory channel speeds may improve, but latency issues remain significant. The once-reliable engineering playbook is now under strain due to power constraints and interconnect limitations.

He reviewed earlier efforts involving heterogeneous architectures, Cell processors, multi-core AMD designs, and initial flexibly assignable switch topologies. This research culminated in the hybrid H-FAST network approach, which reduced packet switching by leveraging persistent communication patterns.

This wasn’t mere tinkering; it provided evidence that structural change is achievable.

The Green Flash Era and the Rebirth of Co-Design

Shalf revisited the Green Flash project, an early initiative aimed at implementing deep hardware/software co-design for climate modeling. Rather than forcing scientific codes to conform to standard architectures, Green Flash directly optimized kernels for the hardware, automatically tuning them across various architectures and prototyping custom accelerators well before the modern RISC-V renaissance.

The project's influence also reached into the industry. Shalf pointed out that Google's TPU lineage owes a conceptual debt to the early co-design culture that was established during this period.

A Historical Link: Berkeley Lab's Breakthrough in Data Transfer

At one point in his talk, Shalf paused to highlight a significant milestone in networking, which was documented in Steve Fisher's report from July 3, 2002, about Berkeley Lab's demonstration of 10-gigabit Ethernet (https://www.supercomputingonline.com/latest/924-berkeley-lab-proves-10-gigabit-ethernet-data-transfer-is-a-reality).

Long before terms like "AI clusters" and "hyperscale fabrics" became common, Berkeley Lab demonstrated that multi-gigabit, wide-area data movement was not merely a concept of science fiction but a crucial component of scientific infrastructure. Their demonstration achieved data transfer at unprecedented speeds, validating 10GbE as a practical backbone for research networks and paving the way for the distributed science workflows we often take for granted today.

This achievement marked the beginning of an era where the bottleneck shifted decisively from computation to communication, an insight that resonates with Shalf's warnings even today.

Energy: The Defining Constraint of Our Time

Shalf's central thesis made a significant impact: energy is the real barrier.

He noted the end of Dennard scaling, emphasizing that wire delays are now surpassing transistor improvements. Additionally, hyperscale AI is driving energy consumption at the grid level.

Shalf argued that the next wave of innovation will not come from brute-force scaling but rather from specialization, advanced packaging, and chiplets. Most importantly, he highlighted the need for an expanded definition of co-design that integrates materials, circuits, architecture, and algorithms into a unified approach.

Reversible Logic and the Frontier Beyond Thermodynamics

In one of the most forward-looking sections of the talk, Shalf introduced reversible computing and topological materials as promising avenues for achieving ultra-low-energy computation. By eliminating the need for bit erasure, reversible logic sidesteps the Landauer limit, the thermodynamic minimum energy cost of erasing a bit, challenging the conventional belief that computation must always incur an energy cost. This serves as a reminder that future breakthroughs may look very different from the incremental improvements of the past decade.
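
The energy cost tied to bit erasure is usually expressed as the Landauer limit, k_B T ln 2 per erased bit. The sketch below computes it at an assumed room temperature of 300 K. The function name and the temperature are illustrative choices and do not come from the talk.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant, exact SI value

def landauer_limit_joules(temperature_kelvin):
    """Minimum energy dissipated when one bit is irreversibly erased."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)

# Assumed operating temperature for illustration.
energy_per_bit = landauer_limit_joules(300.0)
print(f"{energy_per_bit:.3e} J per erased bit")
```

At 300 K the result is roughly 2.87e-21 joules per bit. Reversible logic aims to avoid the erasure step that this bound applies to.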

Closing: A Field Ready for Reinvention

Shalf concluded on an optimistic note, albeit with a sense of urgency. He argued that the future of high-performance computing will not be dominated by massive machines or sheer scaling alone. Instead, it will belong to systems designed with humility, shaped by genuine scientific needs, and developed through collaboration rather than stagnation.

The applause that followed was not simply in recognition of a successful career. It was a response to a vision, a call for computing to reinvent itself, as it has done throughout history whenever science has demanded it.