Computing Sciences Area

A CONVERSATION WITH JONATHAN CARTER

The Computing Sciences Area (CSA) is home to the Computational Research Division (CRD) and two Department of Energy User Facilities: ESnet and NERSC. The Computational Research Division conducts research in applied mathematics, computer science, data science, and computational science. ESnet (short for Energy Sciences Network) links scientists at national labs, universities, and research institutions around the world through high-bandwidth connections, while NERSC (short for National Energy Research Scientific Computing Center) is the primary scientific computing facility for the DOE Office of Science, supporting the work of more than 7,000 scientists every year. Jonathan Carter, the Associate Lab Director for CSA, shares his thoughts on the Area’s mission and priorities.

How does the Computing Sciences Area support the Lab’s mission?


Computing is essential to the modern scientific enterprise. At the Lab, for example, scientific work requires processing data from experiments or running simulations to provide insight into what’s going on in those experiments. The Computing Sciences Area provides not only high-performance computing and the support that enables us to use it effectively, but also a powerful network that connects researchers with their instruments and collaborators.


We also conduct fundamental research and development. Our applied mathematicians have developed techniques to simplify modeling across time and length scales, to create faster and more accurate numerical methods, and to build models that can adapt as a simulation evolves. Our computer scientists are taking computing to the next level by studying and innovating in computing architectures, software, and algorithms. They are also working on techniques to handle the growing data volumes and different data modalities that science teams are now creating. And our computational scientists work closely with scientists in other fields, inside and outside the Lab, to explore new simulation and data analysis methods.


What are the Computing Sciences Area’s top 3 or 4 priorities today? Why are they important?


One of our priorities is to provide more computing power to meet growing demand. Many people have heard of Moore's Law. Gordon Moore, a co-founder of Intel, observed in the mid-‘60s that the number of transistors on a chip doubled about every 18 months, and that doubling led to steadily increasing compute power. Today it is no longer so simple. With current designs, computers consume a great deal of power, and the power we’ll need to meet future demand will become unsustainable.


To get the performance we need for future computing demands at reasonable cost and power, we need to look at different technologies. This means new kinds of microelectronics, new kinds of devices, and perhaps changes in the computing paradigm (for example, quantum computing). These new designs will also allow us to solve kinds of problems that current computers cannot.


Secondly, we are prioritizing the use of artificial intelligence (AI) and machine learning (ML) in science. AI and ML are changing society at large, and they are also changing how we do science. There have been significant advances in the last ten years: AI and ML can now help us classify or find events in scientific data, but often they can’t tell us how and why they reached those conclusions.


Richard Feynman, the famous U.S. physicist, came up with an analogy that I think applies to the current limitations of AI and ML. It’s like the Mayan astronomers, who had sophisticated methods to compute planetary positions accurately over many years but who appear to have had no general concept of planetary orbits. We hope that by bringing applied mathematics and statistics to the table, we can develop better AI and ML algorithms that not only provide robust answers but also reveal the higher-level concepts that lie behind those answers.


Last but not least, we are looking at how to coordinate scientific resources in a way that makes it easy for scientists to create, share, and reuse data. Lab facilities such as the Molecular Foundry’s National Center for Electron Microscopy, the Advanced Light Source, and the Joint Genome Institute create incredible amounts of data. But the utility of that data can be limited if it can’t be moved to a suitable computer for analysis, can’t be stored, or is hard for collaborators to access. So we are looking at scientific workflows, and at the use of instruments and computing platforms along those workflows, to make it easier for scientists to collect, process, reuse, and share data.


Who do you partner with at the Lab to be successful?


With two user facilities, we end up partnering with almost all of the scientists who work with simulations and data at the Lab. Some of those collaborations happen entirely through our facilities, but a number of them are in-depth collaborations involving applied math, computer science, networking, machine learning, and the design and construction of data repositories, to name a few. Three larger collaborations, the Exascale Computing Project, the Scientific Discovery through Advanced Computing (SciDAC) program, and the Center for Advanced Mathematics for Energy Research Applications (CAMERA), are partnerships funded by multiple program offices that involve scientists in almost all the other divisions. I should add that the Area also collaborates closely with UC Berkeley, primarily in applied mathematics, computer science, and quantum information science.


On the operations and facilities front at the Lab, we work very closely with the Facilities Division, the Projects and Infrastructure Modernization Division, and the EHS Division to support our building, electrical, and mechanical needs. NERSC needs a lot of engineering and construction expertise for the upkeep and upgrade of its power and cooling systems, so that’s a critical partnership. Of course, we also partner with Information Technology on networking and plans for IT infrastructure.