The most useful numerical lava flow models are those that are both accurate and efficient, and have the physical complexity necessary for applications ranging from simple flow path prediction to real-time flow forecasting to thermal interactions with the environment. Here we quantify both model validity and computational cost, and then discuss the appropriate applications of each code and the relevance of our benchmark experiments to natural lava flows.
Assessment of model accuracy
The benchmarking process allows direct comparison of the strengths and weaknesses of each code. Overall, all models perform well, but they vary in accuracy and especially computational cost. To test the goodness of fit between model and benchmark, we compare the scaling of the X and Y propagation of flows through time to analytical solutions where they are known (unconfined horizontal and sloping flow). We also measure the offset between the experiment and model extents and thicknesses to calculate the difference in the absolute values.
All the codes perform well in simulating the lateral flow propagation of isoviscous flows, as measured by comparison with the temporal evolution of X(t) and Y(t) predicted by theory (Fig. 7). Although the simulations scale appropriately with time, the flow extents diverge, as reflected in the NRMSE (Table 2). Acceptable values of NRMSE will vary by potential model application, but here we consider an NRMSE <10% to be a good fit.
OpenFOAM best matches the unconfined isoviscous flow advance in X, Y, and R (average NRMSE 3.69%; BM-A and C), but the increase in flow thickness with time is low and unsteady relative to the experiments (Fig. 2). VolcFlow and COMSOL capture the evolution in flow thickness for the Benchmark A most accurately (NRMSE 3.13 and 4.63%, respectively), however they underestimate the thickness where the flow encounters an obstacle (NRMSE 14.22 and 16.09%, respectively, for BM-B; Fig. 3). Overall, OpenFOAM best reproduces the final thickness of flows encountering obstacles (NRMSE 6.62%).
For the cooling benchmarks, FLOW-3D and OpenFOAM provide the best fits to the experimental data. All of the model results for the dome extrusion benchmark (BM-C) exhibit the tightest agreement between each other and the experimental observations (<16% spread between all models in radius and normalized temperature with respect to time). OpenFOAM, FLOW-3D, VolcFlow, and COMSOL also capture the advance of molten basalt (NRMSE <7%; BM-D). In this final benchmark, FLOW-3D performs best overall, largely capturing the downslope and thickness growth (average NRMSE 9.15%). OpenFOAM models flow width the best (NRMSE 7.27%), but the thickness values fall ~20% below the observations. VolcFlow and COMSOL capture the flow advance overall, but do not include cooling.
In assessing the performance of the MOLASSES model, we can only compare flow extent at the end of an experiment. The model produces variable results, with good fits to the length, width, and thickness of Benchmark A, the radius of Benchmark C, and the thickness of the molten basalt Benchmark D (NRMSE <10% for all). The modeled downslope and cross-slope dimensions of Benchmark D differ significantly from the experimental observations (NRMSE 21–38%).
The accuracy of model predictions can largely be explained by model assumptions. For example, the 2D CFD codes (VolcFlow and COMSOL) underestimate the thicknesses against obstacles relative to the experiments, but perform similarly to the other benchmarks when no obstacle is present. This is likely because the obstacle represents very steep topography (a vertical step), which can produce errors in the shallow water approximation solution (Kurganov and Petrova, 2007). Unsteady and low thicknesses seen in the 3D model results (OpenFOAM and FLOW-3D), may be related to the limited vertical resolution defined by the mesh. The MOLASSES model mostly matches the observed thicknesses because the input “critical thickness” parameter of the simulation is taken from the experimental thickness. However, the cellular automata approach does not allow for any dynamic effects, which may limit its accuracy where flows encounter obstacles and velocity of the flow can affect its thickness (Soule et al., 2004; Dietterich et al., 2015).
Different approaches to rheology can also explain some of the misfits between models and experiments. Poor MOLASSES results for length and width of the molten basalt flow may be related to the model assumptions of a Bingham-like fluid with a residual thickness that limits spreading (Kubanek et al., 2015). The chosen “critical thickness” parameter also may not be representative of the modal thickness of the entire flow. Modifying the critical thickness parameter would likely have a strong influence on the result. For example, a lower critical thickness in the molten basalt Benchmark D (Fig. 5) might increase the extent of the MOLASSES simulation to match the experimental observations, but it would produce a thinner flow. Importantly, the 3D cooling models (OpenFOAM, FLOW-3D) do not perform better overall for the flow propagation and thickness than the 2D isothermal models (VolcFlow, COMSOL) in the molten basalt Benchmark D. This demonstrates that the flow behavior was not cooling-limited over this short distance and duration, and that a longer molten basalt experiment, an analogue experiment with a growing surface crust, or a well-documented natural flow would be necessary to fully test models with cooling and solidification.
Assessment of computational efficiency
We assess model computation cost by comparing the total CPU hours required for the model runs for each benchmark (Fig. 8). Because the processing speed of all the processors used was relatively similar (2.5–3.6 GHz), CPU hours is a proxy for computational steps or the total floating point operations performed. The results demonstrate the significantly greater computation time for the CFD models relative to the cellular automata code, and the 3D models relative to the 2D ones. Overall, MOLASSES runs 2–3 orders of magnitude faster than VolcFlow and COMSOL, and 4–5 orders of magnitude faster than OpenFOAM and FLOW-3D. However, it does not produce the same detailed output from physical inputs, only the final extent of a flow for a given volume and minimum thickness parameter. Because differences in computation times are in order of magnitude, smaller differences between machine speeds and memory resources are not considered. For axisymmetric cases, such as the dome extrusion or modeling of lava lakes, OpenFOAM, FLOW-3D, and COMSOL can all run faster simulations as radial 2D slices.
The models presented here also vary in how difficult or time-consuming they are to set up. FLOW-3D and VolcFlow have user interfaces and do not require extensive coding, while OpenFOAM, being a library and not a pre-built software product, requires a more extensive preparation and setup. COMSOL provides a user interface as well as a scripting portal, so it is easy for initial setup and further customization and batch calculation. MOLASSES has no user interface, but only requires a DEM and a single text configuration file of 6–10 parameters that end-users modify.
Implications for model applications
The applications of lava flow modeling to assessing flow hazard, forecasting, impact, and mitigation require varying degrees of accuracy, speed, and complexity. Despite a range in computational cost, there are limited differences in the performances of each code. The largest variations are found in the benchmark comparisons of flow thicknesses against obstacles and molten basalt experiments. These differences in accuracy have implications for assessing the model complexity needed to simulate lava; when combined with the computation cost, they also inform code selection for different applications.
In this benchmarking exercise, the participating codes all perform well for isoviscous flow, both sloping and on a horizontal plane (BM-A, B, and C; Figs. 2, 3 and 4). In the molten basalt case (BM-D), the similarity in results for cooling, temperature dependent viscous flow (OpenFOAM and FLOW-3D) and isoviscous flow (VolcFlow and COMSOL; Fig. 5) suggests that the initial viscosity dominates the flow behavior over the short duration of the experiment, with cooling exerting limited influence. Importantly, this will not be true in natural basaltic lava flows, which cool and crystallize extensively during flow emplacement (e.g., Cashman et al., 1999; Soule et al., 2004; Riker et al., 2009). The main differences between the isoviscous and cooling models are in the flow width, which decreases, and flow thickness, which increases, with cooling. For Benchmark D, the incorporation of thermal effects in the models plays a less important role than the effusion rate, initial viscosity, and gravity. However, over the timescales and lengths of real lava flows, especially where cooling-limited (e.g., Guest et al., 1987; Harris and Rowland, 2009), the thermal effects are crucial to include, correctly, in simulations.
Our results can be used to choose the best code for a given application. For studying or forecasting short duration and volume-limited flows (e.g., Etna 1981 or Kīlauea December 1974; Guest et al., 1987; Soule et al., 2004), models with simple rheologies (VolcFlow and MOLASSES) are reasonably accurate and significantly faster than more complex models. However, if a flow is longer-lived and cooling-limited (e.g., Etna 1983 or Mauna Loa 1984; Guest et al., 1987; Lipman and Banks, 1987), fully thermorheological models (OpenFOAM and FLOW-3D) may better forecast its advance. Models that simulate the thermal interaction of the flow with its surroundings (e.g., OpenFOAM, FLOW-3D, and COMSOL) are necessary to assess thermal lava flow impacts on vegetation and the built environment.
Lava flow interventions for hazard mitigation commonly use diversion barriers, although the past success of this approach has been mixed (e.g., Barberi and Carapezza, 2004). Critically, we suggest that numerical modeling could inform future designs. The benchmarks with obstacles (BM-B) demonstrate that obstacle geometry and orientation influence the extent of thickening of the flow upslope of the obstacle that can lead to overtopping (Fig. 3; Dietterich et al., 2015). For simulations of flow into topographic obstacles, including artificial diversion structures (Scifoni et al., 2010; Fujita et al., 2009) or abrupt confining topography (Dietterich and Cashman, 2014), our results suggest that the 2D VolcFlow and COMSOL models consistently underestimate flow thickening at obstacles. It may therefore be more appropriate to use fully 3D models such as OpenFOAM or FLOW-3D to determine if a flow may overtop an obstacle. Even the best performer, OpenFOAM (Table 2), does not capture the observed evolution in flow thickness through time (Fig. 2b). Further experiments and model runs measuring thickening against gradually rising obstacles, such as cones and earth berms, as well as different planform obstacle geometries, could reveal the true extent of this problem for more realistic topographic obstacles. Furthermore, the design of lava flow interventions based on cooling with water (e.g., Williams and Moore, 1983), requires fully 3D models that can simulate crust formation (e.g., OpenFOAM and FLOW-3D).
To construct a hazard map, lava flow models are often applied probabilistically. All of the codes that we tested could be combined with Monte Carlo methods to estimate lava flow hazard from either a single vent or an entire volcanic edifice (e.g., Del Negro et al., 2013). This type of application would require running many thousands of simulations to both capture the range of possible behavior and achieve robust statistics (e.g., Tarquini and Favalli, 2015). With this in mind, our computational cost analysis shows that MOLASSES produces very quick and often accurate simulations and would thus be an appropriate choice for probabilistic hazard analysis.
For studying controls on flow emplacement, the slower but more accurate 3D CFD models provide more flexibility, as they can include parameters such as substrate properties and rheological complexity. Simulations to reconstruct the emplacement of past flows (i.e., to determine effusion rates) are similarly not time-sensitive and can take advantage of sophisticated models that are too slow for probabilistic applications and rapid disaster response. Alternatively, for robust flow forecasting and uncertainty analysis, an ensemble modeling approach could be developed that combines models with different strengths and weaknesses following practices from weather and storm forecasting (e.g., Bonadonna et al., 2012).
Applicability of experiment benchmarks to natural flows
We have tested codes for lava flow simulation on benchmarks derived from laboratory experiments. While the scale of these experiments is small compared with natural flows, they share fundamental physics, such as low Reynolds numbers, large Bond numbers, and high Peclet numbers (Garel et al., 2012; Lev et al., 2012; Dietterich et al., 2015), which indicate laminar flow, negligible surface tension, and a very thin thermal boundary. However, natural lava flows contain much more complexity than is captured in these benchmarks, including multiphase rheology (bubbles, melt and crystals), growth and strength of a surface crust, and changing effusion rates (e.g., Cashman and Mangan, 2014). Further validation is necessary to test models that include these features.
As lava flows travel away from the vent, their multiphase properties change dramatically, impacting their rheology and behavior (e.g., Cashman et al., 1999; Riker et al., 2009). Loss of bubbles and growth of crystals combine to alter the bulk rheology of the flow (e.g., Soule and Cashman, 2005; Castruccio et al., 2010) and the heat budget through latent heat of crystallization (e.g., Harris and Rowland, 2001). The surface crust also plays a role in resisting the flow, controlling heat loss, and the transition between pāhoehoe and ‘a‘ā lavas (Griffiths et al., 2003; Lyman et al., 2005; Cashman et al., 2006). Critically, the growth of the crust, and its strength and thickness, determines whether flows form tubes or open channels, which dramatically affects how far and fast they travel. It is therefore necessary to fully incorporate phase changes for models designed to span this range of behavior. Two-dimensional models like the COMSOL-based depth-averaged model presented here can approximate solidification on the flow margins, but cannot grow a crust, which is fundamental for governing pāhoehoe emplacement (Chen and Rempel, 2015). Three-dimensional models like OpenFOAM and FLOW-3D have the potential to form a crust and simulate an evolution in bulk rheology (e.g., Vakhrushev et al., 2014), but the current implementation of our lava flow OpenFOAM solver includes only a cooling-induced increase in viscosity, and we have not tested FLOW-3D’s solidification solver for a lava flow scenario. Employing analogue experiments with surface crust growth (e.g., Griffiths et al., 2003; Garel et al., 2014) as benchmarks could help guide development of appropriate models, although most existing analogue materials are not scaled appropriately to quantify crustal growth effects (e.g., Soule and Cashman, 2005).
Lava flows also change the local topography during emplacement. Construction of cones and depositing of tephra near the vent area, flow levees along channel margins, and elongated tumuli formed by flow inflation will all affect the routing of subsequent flows (e.g., Mattox et al., 1993; Wolfe, 1988; Dietterich et al., 2012; Elissondo et al., 2016). Models should therefore be time-dependent and include syn-eruptive alteration of the topography, or they will lose accuracy as the eruption continues (e.g., Hidaka et al., 2005).
Variations in effusion rate during eruption are also common but are not included in our benchmarking study. Surges in effusion rate often lead to breakouts or overflows that can spawn new flow branches that ultimately widen the flow field (e.g., Guest et al., 1987; Hon et al., 1994; Tarquini and de' Michieli Vitturi, 2014). New flow branches can rob downslope flow fronts of lava supply (Dietterich and Cashman, 2014) and thus exert a fundamental control on flow behavior. Experiments that alter effusion rate while incorporating crust growth and levee formation would be valuable to test model implementations of this behavior (e.g., Rader et al., 2015).
Finally, we note the paucity of detailed documentation of natural lava flows that could serve as the preferred datasets for benchmarks, since they inherently capture the complexities of real lava flows. Although experiments offer the advantage of well-constrained input parameters and complete documentation of the resulting flow, they are limited in both extent and kinetic energy, and thus cannot simulate the dynamical and rheological range of real lava flows. Technological advances (including lidar, InSAR, and structure-from-motion photogrammetry) now permit accurate measurements of pre-eruptive topography, as well as repeat measurements of flow geometry that could provide near-real-time data on effusion rate, flow extent, and thickness during emplacement (e.g., Favalli et al., 2010; Poland, 2014; Slatcher et al., 2015). Ideally these measurements would be supplemented by corresponding measurements of thermorheological properties using a combination of field and remotely sensed measurements of temperature (thermocouples and thermal imaging; Lipman and Banks, 1987; Patrick et al., 2017) and rheology (samples and video analysis; Cashman et al., 1999; Lev et al., 2012; Soldati et al., 2016). We urge the community to come together to collect such data on future lava flows.