What is your area of work?
I am a High Performance Linux Systems Engineer and the team lead for the High Performance Computing (HPC) group. My day-to-day role includes the management of the computing resources within the HPC Datacenter, managing and maintaining division computing resources, and maintaining the batch scheduler system on the Lawrencium SuperComputer (a Linux cluster). My work provides the Berkeley Lab researchers with computing resources so they can bring science solutions to the world.
What big challenge(s) are you hoping to solve with your work in the next 20 years?
Researchers use high performance computing for three main things: collecting data, analyzing data, and solving a problem. My team helps researchers by providing the computing hardware that allows them to collect their data and store it on our global storage filesystem, the graphics processing unit (GPU) nodes and applications they use to analyze that data, and the support that allows them to run their code and produce the results needed for their research. So for the next 20 years, as problems get larger and data gets bigger, we will need to keep up with the needs of the researchers and scale up our resources to support them. Data center space and management could be one of the biggest challenges we will have. That’s the big challenge we’re looking at today.
What steps are you taking today to accomplish this vision?
We’re looking into alternatives or additional space for our data center. We’re currently upgrading power and creating separate server rooms for our compute, storage, and infrastructure systems, with compute being the largest, followed by storage.
We are looking at ways to work more closely with NERSC to provide researchers with similar hardware at a midrange scale, so they can migrate their programs to a larger-scale HPC cluster like Cori or Perlmutter.
Who do you partner with at the Lab to bring this vision to life?
Our focus is partnering with the science divisions. We need to know where they're going with their research, what is their workflow, what are their needs? By engaging with them ahead of time, we can scope out what researchers need now and in the future and can be prepared to meet those needs.
We also work with other sites such as Argonne National Lab, the European Organization for Nuclear Research (CERN), and Michigan State University (MSU). We are collaborating with a slew of other sites to ensure researcher workflows can be accommodated. This includes networking with internal Lab teams like ESnet, the Joint Genome Institute (JGI), the Molecular Foundry, and NERSC.
Who from the past, present, or future would you like to collaborate with? And on what?
From an IT perspective, we need to collaborate with any user facility that our customers are collaborating with. We need to get out of the bounds of Berkeley Lab and collaborate with mid-range computing teams at other labs. I want to collaborate more with the groups that our users collaborate with across the country, especially those smaller, almost hidden groups within a research lab. We do this with UC Berkeley, but not necessarily with other labs. We could learn so much together and be ready for the computing research needs of the future.