# Avoiding The Bitter Lesson in HEC-RAS Modeling

Sometime after publishing RAS-Commander, I came across Rich Sutton's essay "The Bitter Lesson." It contains profound insights that resonate deeply with many of our common experiences in hydraulic modeling, particularly in using HEC-RAS in the post-6.0 2D modeling era. Especially this quote:

"breakthrough progress eventually arrives by an opposing approach based on scaling computation by search and learning"

Given the massive breakthroughs seen in the field of AI, it would be useful to heed their lessons and try to emulate the strategies they have found to be most successful, and bring them back to our field of practice. So let's do just that.


*GPT-4o's visual take on The Bitter Lesson applied to hydraulic modeling*

## Rising Computational Demands of 2D Models and the Exhaustion of Moore's Law

With the release of HEC-RAS 5.0 in 2016, which introduced 2D modeling, the compute demands of hydraulic modeling exploded. Since that release, a trove of useful features, robustness, and steady stability improvements have pushed a significant share of modeling effort into 2D. That release also fatefully coincided with the semiconductor industry's failure to keep pace with Moore's Law. As transistor density gains became more marginal, more chip area was dedicated to additional cores, which provide little if any marginal benefit to HEC-RAS model runtimes (with rapidly diminishing returns beyond just 2 cores). This reality has plagued modelers, whose expectation has always been that the newest machines will support ever-increasing model cell counts and runtime expectations. Especially with the proliferation of relatively low-performing cloud architectures, modelers would be forgiven for feeling that the state of technology in their field of practice has largely stood still, or has even worsened relative to clients' expectation that computers get significantly "better" over time. In comparison to the advances in scaling highly parallelizable AI architectures, well, there is no comparison. There are reasons for that, and they make the bitter lesson all the more pertinent.

## Parallelism Isn't Built In to HEC-RAS

One fundamental limitation of HEC-RAS is its lack of built-in parallelization: plans are typically run serially (one plan must complete before the next is executed). To run multiple plans in parallel, multiple instances of the program must be opened manually. To avoid errors, especially in automated runs, the entire project folder often needs to be copied. This prevents file conflicts that could arise from overlapping preprocessing and model execution stages, but it also complicates parallelization efforts, as results files must then be compiled manually.
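The copy-then-run workflow described above can be sketched in a few lines of Python. This is a minimal illustration, not RAS-Commander's actual implementation: the executable path, the `project.prj` filename, and the idea of launching `Ras.exe` directly on each copy are assumptions for the sketch (real headless automation typically drives the HEC-RAS controller interface or edits plan files directly).

```python
import shutil
import subprocess
from pathlib import Path

def stage_parallel_copies(project_dir, n_workers):
    """Copy a HEC-RAS project folder into n_workers sibling folders so each
    parallel run has its own preprocessor and results files, avoiding the
    file conflicts that arise from overlapping runs in one folder."""
    project_dir = Path(project_dir)
    copies = []
    for i in range(1, n_workers + 1):
        dest = project_dir.parent / f"{project_dir.name}_worker{i:02d}"
        shutil.copytree(project_dir, dest, dirs_exist_ok=True)
        copies.append(dest)
    return copies

def launch_runs(copies, ras_exe=r"C:\Program Files (x86)\HEC\HEC-RAS\6.5\Ras.exe"):
    """Launch one HEC-RAS instance per project copy (paths are illustrative),
    then wait for all of them to finish."""
    procs = [subprocess.Popen([ras_exe, str(c / "project.prj")]) for c in copies]
    for p in procs:
        p.wait()
```

After the runs complete, each worker folder holds its own independent results files, which is exactly why the post-run compilation step mentioned above is needed.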

Previous generations of 1D models simply didn't have runtimes long enough to justify such features, as runs could always be queued to complete over a lunch break, or overnight. But with the aforementioned explosion in model mesh counts and the associated nonlinear increase in compute demand, these workflow assumptions simply don't hold, and they have been effectively obsoleted for most large-scale 2D efforts. A single model run can often take all night, in which case a parallelized effort proves very useful from a real productivity and utilization standpoint.

## The Power of Scaling from 1 to 99 and the "Missing Middle" of RAS Parallelization

Existing Linux-based solutions leveraging containerization do exist for massively parallel runs, allowing anywhere from thousands to millions of runs to be containerized and executed. However, these tools are not accessible to most practicing engineers and are generally proprietary software operated exclusively by specialists. These systems have long leveraged parallel compute at scale to address uncertainty, but smaller-scale uncertainties that can reasonably be mitigated in 5-99 runs were not feasible to address with these methods, simply due to the friction imposed by containerization. There are also quite a few implicit limitations of such architectures that I have covered in previous blog posts. Not to mention compute costs! But they solve different levels of scale: containerized solutions are superior for run sets that far exceed 99 total plans. Between the default of 1 serialized run and 99 parallelized runs lies a wide gulf that is ripe for innovation, which I like to think of as the Missing Middle.

Before RAS-Commander there was a significant gap between solutions capable of running thousands of HEC-RAS runs and the default software, which can only handle 1 or 2 before becoming unmanageable. By allowing the native HEC-RAS software to scale up to 99 parallel plans without special software or platforms, RAS-Commander demonstrated an effective workaround for the "missing middle" of compute scaling within the immediate constraints of current solver and CPU architectures. Furthermore, the approach isn't closed source or particularly novel. Prior to 2015, there was no great need for parallelization outside of coastal applications, and runtimes were tolerable.

There has simply been a failure across the industry to recognize and adapt to the computational reality of running larger models, and to the intersecting technological trends that have changed the physical layout and "shape" of available CPU compute over the last decade. While not currently high on development roadmaps, parallelism will undoubtedly be implemented eventually in every major modeling package as the shape of compute diverges from these implicit assumptions and the bitter lesson slowly becomes more apparent.

## Applying Insights from "The Bitter Lesson"

"The Bitter Lesson" emphasizes the limitations of heuristic methods and highlights the power of scaling computational resources. Moving from one serial operation to ten simultaneous runs on a workstation, and then expanding further to 99 runs, unlocks significant new data-driven approaches that reduce uncertainty and improve accuracy. This mirrors the broader trend in technology, where the most meaningful innovations have been driven by parallel processing since the early 2000s, when physical limits were reached in CPU design. Many approaches have been implicitly limited by Moore's Law wherever parallelism was not a focus of development or is not efficiently achievable.

In a recent keynote, NVIDIA's CEO Jensen Huang highlighted the diminishing returns experienced by CPU-bound processes. As long as we are running x86 processors, this won't change. GPUs, built from large numbers of simple cores with fast interconnects designed for highly parallelizable tasks, are pivotal. AI as we know it is built on massively parallelized systems that are not as easily constrained by Moore's Law.

Similarly, in hydraulic modeling, leveraging more data and compute power allows for more informed decision-making. While the fundamental physics equations will remain tightly coupled and difficult to scale across any particular model domain, parallelism can still be leveraged. RAS-Commander enables more effective model-level parallelism; while it is limited to just 99 runs, that range reasonably covers the most immediate frontier of modeling innovation, which was previously accessible only with proprietary tools.

## Optimizing Within Constraints of Available Technologies

The HEC-RAS ecosystem has long focused on improving resolution in data sets, land cover layers, and mesh cells. These improvements all increase the demands on a single tightly coupled solution set, which is fundamentally constrained by the scaling limitations of CPUs with x86 instruction sets. On these platforms, we have shown that using more than two cores per run quickly becomes inefficient. Even moving to the most advanced server architectures provides marginal benefits (and is often less performant than other options at similar cost).
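The diminishing returns of adding cores to a single run follow directly from Amdahl's law: if only a fraction of the solver's wall time parallelizes, speedup saturates quickly. The 60% parallel fraction below is an illustrative assumption, not a measured HEC-RAS figure, but it shows the shape of the curve: going from 2 to 64 cores barely moves the needle, while 64 fully independent plans would scale at roughly 64x.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Amdahl's law: the upper bound on speedup when only a fraction
    of the work can be spread across cores."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_cores)

# Illustrative assumption: ~60% of a solver's wall time parallelizes well.
for n in (1, 2, 4, 8, 64):
    print(f"{n:>2} cores -> {amdahl_speedup(0.6, n):.2f}x")
# By contrast, n fully independent plans run in parallel scale at ~n times.
```

With these assumed numbers, 2 cores give about a 1.43x speedup and 64 cores only about 2.44x, which is consistent with the observation above that per-run scaling is effectively exhausted after two cores.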

The near-term future of hydraulic modeling lies in overcoming these constraints. By leveraging parallelization on a single workstation, then across multiple workstations, we can effectively unlock another order of magnitude of compute and data, rendering obsolete many previous approaches that were heuristic optimizations of serialized operations. This aligns with the principles of "The Bitter Lesson," advocating for more data and compute resources to achieve better outcomes. In practice, the RAS-Commander tool exemplifies this approach. It bridges the gap between large-scale, specialist-driven solutions and accessible tools for practicing engineers, enabling efficient use of local machines for parallel execution and vastly increasing computational capacity for HEC-RAS projects.

Even when GPU-based solvers arrive, the calculations within any physically linked model domain will remain tightly coupled and probably won't scale like the matrix multiplications of an AI. Waiting for GPU solvers to solve the problem is a dead end. There will still be a need for parallelized approaches to reduce uncertainties at many crucial steps in our hydrologic and hydraulic modeling efforts. In fact, large-scale GPU farms will undoubtedly enable parallelization on a massive scale, supporting even more detailed probabilistic uncertainty analysis than the relatively small gains demonstrated by RAS-Commander (single-digit orders of magnitude). So far, the RAS-Commander approach has been applied to calibration and validation workflows, but countless examples exist where having more than one result at a time would simplify approaches and reduce uncertainty; with these tools available and disseminated as open source software, additional applications will present themselves.

## What's the Lesson?

Restating the Bitter Lesson: waiting for someone to make a "smarter model," or to leapfrog Moore's Law (which we haven't kept up with since 2015), is not a reasonable expectation. Until someone figures out how to tokenize a watershed and feed it to a transformer, developing and leveraging parallelization tools should be a focus of hydrologic and hydraulic modeling software development. By enabling more advanced methods of uncertainty analysis, we gain more data and uncover more potential uncertainties that can be significantly reduced with even the simplest brute force methods.

The breakthrough that has happened in AI will eventually make its way to H&H modeling, and the paradigm shift that enabled that breakthrough is not unrelated to the fundamental constraints of our hydraulic modeling work. Until that breakthrough happens in our field, learning from the Bitter Lesson and adjusting one's mental models of how to scale the technology one uses is pivotal to continued innovation.

By increasing the number of parallel runs from 1-2 up to 99, RAS-Commander represents a natural progression of tailoring the available methods to best fit the available hardware topology. With more cores available than can be effectively utilized on a single model, combined with long runtimes and increasingly detailed and complex datasets and models, the demands and expectations placed on modelers have long exceeded their hardware's ability to keep up. By leveraging orders of magnitude of additional effective compute and data to solve problems that were previously addressed only with heuristic methods, RAS-Commander is an exemplar of "The Bitter Lesson" in practice. Given the technological landscape, this approach may be the only way to outpace the slow growth represented by occasional incremental advances in solvers, transistor density, and SSE extensions.

## How Do We Enable AI-Like Breakthroughs in H&H Modeling?

To answer this question, the Bitter Lesson should be heeded not just by those in AI who were disrupted, but also by others whose work is fundamentally compute constrained. The shape of compute has changed, and the shape of the software designed to leverage that compute must change with it, becoming more flat and parallel to better fit changing capabilities. If we want any hope of innovating faster than the underlying transistor density trends (which have been in marginal decline for a decade), the focus must be on leveraging available compute through parallelization. Especially for CPU-bound operations like HEC-RAS, the strategy of leveraging multiple cores on a single run has far exceeded its effective limit, scaling to barely 8 cores while many platforms now support 64 cores or more. The answer has been, and will always be, to enable more parallelism, especially in the final compute operations.

As end users, we can't implement parallelism within our software packages, nor do we steer the development roadmap. But we can implement program-level parallelism to leapfrog the median performance trendlines and unlock a small bit (from 1 to 99, barely 2 orders of magnitude) of the breakthrough capabilities we are seeing in the fields that are driving hardware innovation. The changing hardware landscape will demand that parallelism move from the user's script level down to the program level, and eventually to the GPU solver level. Even large models today use only 6-8 GB of memory, which in theory enables massive parallelism on current GPU architectures at scale (now offering many terabytes of memory in aggregate), which would in turn enable automated calibration, validation, and parameter optimization routines with ease. This is the future, if we choose to pursue it.
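The memory argument above is simple back-of-envelope arithmetic. The figures below are round-number assumptions (an 8 GB model footprint, an 80 GB datacenter GPU, 8 GPUs per node), and memory is of course not the only constraint on a hypothetical GPU solver, but the headroom is easy to see:

```python
# Back-of-envelope: how many ~8 GB model domains fit in aggregate GPU memory?
model_mem_gb = 8      # large 2D model footprint cited in the text
gpu_mem_gb = 80       # one modern datacenter GPU (assumed round number)
gpus_per_node = 8     # typical dense GPU node (assumed)

concurrent_per_gpu = gpu_mem_gb // model_mem_gb
concurrent_per_node = concurrent_per_gpu * gpus_per_node
print(concurrent_per_gpu, concurrent_per_node)  # 10 per GPU, 80 per node
```

Even a single assumed node could hold dozens of model domains at once, which is the kind of headroom that makes automated calibration and parameter sweeps plausible.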

Until then, there is RAS-Commander!

Now, I'm always guilty of not reading a link at the beginning of a blog post. So if you didn't read it the first time, read The Bitter Lesson again!

I will leave you with a paraphrased quote, before anyone has the chance to utter it aloud:

"They said that 'brute force' search may have won this time, but it was not a general strategy, and anyway it was not how people ~~played chess~~ *do H&H modeling*. These researchers wanted methods based on human input to win and were disappointed when they did not."

William Katzenmeyer, P.E., C.F.M.