by Benjamin Recchie
There are times when you need an exact measurement—but there are also times when an approximation is good enough to work with, and a lot easier to obtain. Even so, knowing how to get the right approximation can still be tricky. Solving that problem is the mission of Robert Gramacy, associate professor of econometrics and statistics at Chicago Booth and fellow in the Computation Institute.
As an industrial statistician, Gramacy is a bit of an odd fit for Booth. He teaches regression to MBA students and Bayesian analysis to PhD candidates, and he has contributed his statistical expertise to projects of his Booth colleagues on more traditional business topics like accounting and finance. But his bread and butter is statistical inference for problems in the physical sciences: extrapolating results about a large number of potential simulations from running just a sample of them. In this way, he hopes to give researchers more powerful tools to cut down on the amount of computational time needed to get an answer.
“The general problem is to understand an input-output relationship for anything,” Gramacy says. “There are a number of variables that describe the configuration of the system, and then as a function of those variables, some computer code is executed and it gives some output.” He creates what’s called a response surface model for the results, which involves running the experiment for a small number of input configurations, then using statistics to extrapolate results from there.
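The idea can be sketched in a few lines of code. This is a minimal, illustrative surrogate model, assuming NumPy is available; the `simulator` function, kernel, and parameter values are stand-ins invented for the example, not Gramacy's actual models or code.

```python
import numpy as np

def simulator(x):
    """Hypothetical stand-in for an expensive computer experiment."""
    return np.sin(2 * np.pi * x) + 0.5 * x

def fit_gp(X, y, lengthscale=0.1, nugget=1e-6):
    """Fit a Gaussian-process surrogate: precompute K^{-1} y on the design runs."""
    K = np.exp(-(X[:, None] - X[None, :])**2 / (2 * lengthscale**2))
    K += nugget * np.eye(len(X))  # small jitter keeps the solve stable
    return np.linalg.solve(K, y)

def predict(Xnew, X, alpha, lengthscale=0.1):
    """Emulate the simulator at new inputs from the fitted response surface."""
    k = np.exp(-(Xnew[:, None] - X[None, :])**2 / (2 * lengthscale**2))
    return k @ alpha

# Run the "experiment" at only a handful of input configurations...
X = np.linspace(0, 1, 10)
y = simulator(X)
alpha = fit_gp(X, y)

# ...then statistically extrapolate the response everywhere else.
Xnew = np.array([0.33, 0.71])
preds = predict(Xnew, X, alpha)
```

The expensive step, running the simulator, happens only ten times here; every later prediction is a cheap matrix-vector product against the fitted surface.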
But even the abbreviated simulations can require thousands or tens of thousands of runs. The computational problem, Gramacy explains, is that determining the response surface from n simulation runs requires calculations that scale as n cubed. To get around this, he developed what he calls a “divide and conquer” methodology: breaking the single large n-cubed calculation into many smaller ones that can take advantage of parallel computing.
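A rough sketch of the divide-and-conquer idea, assuming NumPy is available: rather than one O(n³) solve over all n runs, each prediction point gets its own small model built from only its k nearest design points, so the work becomes many independent O(k³) solves that parallelize trivially. The kernel, the nearest-neighbor rule, and all names here are illustrative assumptions, not Gramacy's actual implementation.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def kernel(a, b, lengthscale=0.1):
    return np.exp(-(a[:, None] - b[None, :])**2 / (2 * lengthscale**2))

def local_predict(x, X, y, k=8, nugget=1e-6):
    """Predict at one point using only its k nearest design points."""
    idx = np.argsort(np.abs(X - x))[:k]      # pick the local sub-design
    Xl, yl = X[idx], y[idx]
    K = kernel(Xl, Xl) + nugget * np.eye(k)  # k x k, cheap to factorize
    alpha = np.linalg.solve(K, yl)           # O(k^3) instead of O(n^3)
    return (kernel(np.array([x]), Xl) @ alpha)[0]

# A design large enough that one global n^3 solve would be costly.
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 2000))
y = np.sin(2 * np.pi * X)

# Each local solve is independent, so they can be farmed out in parallel.
xnew = [0.25, 0.5, 0.75]
with ThreadPoolExecutor() as pool:
    preds = list(pool.map(lambda x: local_predict(x, X, y), xnew))
```

Because no prediction depends on any other, the same loop distributes naturally across the cores of a cluster like Midway.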
One such project, done with collaborators at the University of Michigan, calculates the hydrodynamic shock inside an explosion. His collaborators performed the simulations on Department of Energy computers and came to him for the statistical analysis. “Even with a big computer like Midway,” he says, with traditional methods “you can do at best 10,000 computer modeling runs.” His divide-and-conquer strategy allowed him to simulate 1 million data points on Midway in an afternoon.
Another project involved calculating the drag Earth-orbiting satellites experience. Although space is usually thought of as a vacuum, there are tiny wisps of atmosphere at orbital altitudes, enough to slow down a satellite and eventually cause it to crash. Satellites tend to be ungainly things, with antennas and solar panels sticking out, so knowing how much drag a satellite experiences in different orientations is important to orbital station-keeping. “We’re using the statistical model to give what the simulator would give in a fraction of the time,” Gramacy says—not a simulation, but an emulation. Instead of running a detailed Monte Carlo analysis that could take an hour, “we can now get that drag in fractions of a second to within 1% accuracy using just a statistical model.”
Gramacy’s first attempt at statistically approximating satellite drag was for the Hubble Space Telescope, but he’s preparing to work on the GRACE satellite and the International Space Station next. It’s important for mission managers to be able to make snap decisions about whether to expend limited fuel to adjust a satellite’s orbit. After all, as fast as it is to run a simulation on a high-performance computer, not having to run it is even faster.