 Research
 Open Access
 Published:
How many explosive eruptions are missing from the geologic record? Analysis of the quaternary record of large magnitude explosive eruptions in Japan
Journal of Applied Volcanologyvolume 4, Article number: 17 (2015)
Abstract
Large magnitude explosive eruptions in Japan were compiled for the Large Magnitude Explosive Volcanic Eruptions (LaMEVE) database. Here we use this dataset to investigate the underrecording of Japanese explosive eruptions. We identify underrecording of Volcanic Explosivity Index (VEI) 4–5 eruptions on two timescales. Model fitting and Akaike’s information criterion (AIC and AICc) model selection suggest that these trends can be represented with the double exponential decay model, reflecting geologic processes. The time series of the recording rate of larger eruptions (VEI 6 and 7) show a slowly decreasing trend in comparison to smaller eruptions. These time series can be represented with the single exponential decay model. The percentages of missing eruptions are estimated from the fitted models. Our results show an inverse correlation between VEI and degree of underreporting suggesting that even larger VEI eruptions are underrecorded in the Quaternary. For example, 89 % of VEI 4 events, 65–66 % of VEI 5 events, 46–49 % of VEI 6 events and 36–39 % of VEI 7 events are missing from the record at 100 ka, 200 ka, 300 ka, and 500 ka, respectively. Comparison of frequencies of Japanese and global eruptions suggests that underrecording of the global database is 7.9–8.7 times larger than in the Japanese dataset. Therefore, underrecording of events must be taken into account in estimating recurrence rates of explosive eruptions using the geologic record.
Background
Databases of large magnitude volcanic eruptions
Databases of large magnitude volcanic eruptions have been created to provide basic information about explosive volcanism (e.g. Machida and Arai, 2003; Committee for catalog of Quaternary volcanoes in Japan 2000; Siebert and Simkin, 2002; Siebert et al., 2010) and to facilitate assessment of hazards and risks from volcanic eruptions (e.g. Mason et al., 2004; Crosweller et al., 2012). Among these databases, the most accessible global databases of volcanic eruptions are the Smithsonian’s Global Volcanism Program database (Siebert and Simkin, 2002) and the Large Magnitude Explosive Volcanic Eruptions (LaMEVE) database (Crosweller et al., 2012). The Smithsonian’s Global Volcanism Program has compiled documentation of Holocene volcanism around the world for nearly five decades for better understanding of the full range of Earth’s eruptive activity, and to make these data available (online at http://www.volcano.si.edu/) to the everbroadening community interested in volcanism (Siebert and Simkin, 2002). The database contains information about vent location, start and end dates of eruptions, dating method and Volcanic Explosivity Index (VEI; Newhall and Self 1982) of any magnitude of eruption known to have occurred during the past 10,000 years (Siebert and Simkin, 2002). The LaMEVE Quaternary database includes much older events (back to ~2.6 Ma; Crosweller et al., 2012). To support assessments of environmental and societal impacts of volcanism on global, regional and local scales and to provide basic information on global explosive volcanism, the LaMEVE database was created as a component of the Volcanic Global Risk Identification and Analysis Project (VOGRIPA) database, which is being developed as part of the Global Volcano Model (GVM) (Crosweller et al., 2012). The LaMEVE database aims to include all known explosive eruptions for which events are dated, source volcanoes are known and eruption magnitude (M = log _{ 10 } [erupted mass(kg)]  7; Pyle 2000) and/or VEI are ≥ 4. The database contains information about the age of eruptions, deposit type, deposit volumes, VEI, eruption magnitude, intensity, basic geochemistry, source volcano location, data sources, errors and uncertainties with indices of data reliability (Crosweller et al., 2012). This database is publically available online (Crosweller et al., 2012; http://www.bgs.ac.uk/vogripa) and currently contains information on 3,107 Quaternary volcanoes and 1,887 explosive eruption records from the last 2.6 My.
Underrecording of events
Creation of databases of large magnitude explosive volcanic eruptions has clarified that underrecording of the frequency of eruptions and variable recording over time are problems (Deligne et al., 2010; Brown et al., 2014). Potential for underrecording directly impacts hazard and risk assessments (Bebbington 2013). A major challenge is to understand the degree of underrecording (Guttorp and Thompson, 1991; Deligne et al., 2010; Furlan, 2010; Brown et al., 2014).
Guttorp and Thompson (1991) studied patterns of volcanic activity during the last 500 years using the Smithsonian’s Global Volcanism Program database. Their results show that neither the global nor Japanese eruption records are significantly different from a Poisson process during this time interval. They estimated the reporting probability by smoothing the observed number of events with fitting a linear line to a locally chosen number of events, decided by crossvalidation. Deligne et al. (2010) compiled a global dataset of Holocene large explosive volcanic eruptions from the Smithsonian’s Global Volcanism Program database and additional literature. They analyzed the data with the extreme value method, first applied in volcanology by Coles and Sparks (2006). In this method, underrecording of events is taken into account as a function of eruption magnitude. Their results indicate that the level of underrecording is high and fairly constant from the start of the Holocene until about 1 A.D. and then decreases dramatically over the last ~2000 years (Deligne et al., 2010). Furlan (2010) analyzed a catalogue of large eruptions that occurred during the last two millennia. To represent the temporal evolution of the censoring effect in the recording process effectively, the extreme value method was used with a changepoint model. Like Coles and Sparks (2006), Furlan (2010) concluded that underrecording decreases dramatically in the most recent 400 years with increased recording of eruptions spreading throughout the South Pacific in this period. Underreporting is significantly more pronounced during Holocene and historical time when data from smallvolume eruptions (VEI 03) are also considered (Siebert et al., 2010). Brown et al. (2014) analyzed LaMEVE to show that underrecording is a function of magnitude and the proportion of historic and geological data in each magnitude subset. The time at which 50 % of the data are younger is a power law function of magnitude, which was attributed largely to preservation potential of deposits.
Although these analyses focused on the under_{−}recording in different global databases, underrecording varies between regions due to variations in the length of the historic record, preservation of deposits, and number of geological investigations. For example, in the LaMEVE database, Japanese events account for about 39 % (729 out of 1,887 eruptions) of the entire set of eruptive events, whereas Japanese volcanoes only account for about 3.4 % (106 out of 3,107) of the number of volcanoes in the database (Brown et al., 2014).
In this paper we investigate underrecording of a large magnitude explosive eruption database in Japan in order to quantify underrecording of eruptions in regions where a comparatively large amount of geological data are available. We use a threshold of VEI or magnitude 4 to analyze underreporting effects of eruptions with magnitudes large enough to have been documented both in historical and geologic records.
Methods
Data on Japanese explosive eruptions were compiled from the Smithsonian’s Global Volcanism Program database (Siebert and Simkin, 2002), additional Japanese databases and published articles. These data were used to populate LaMEVE. These additional Japanese databases are Machida and Arai (2003), Committee for Catalog of Quaternary volcanoes in Japan (2000), Geological Survey of Japan, AIST (2013) and Hayakawa (2010). Machida and Arai (2003) compiled tephra data into a catalog of tephra in and around Japan to provide basic information for identification and further investigations of regional tephra fallout deposits. As this database is focused on identification of regional tephra layers, it contains major mineral components, type and refractive index of volcanic glasses and type locality of tephra. Products of recent small eruptions are not described in detail. Committee for catalog of Quaternary volcanoes in Japan (2000) edited a Quaternary volcano catalog to understand temporal and spatial variations in magmatism in Japan. Like the database of Machida and Arai (2003), this database does not contain younger and smaller eruptions. In contrast to these databases, Geological Survey of Japan, AIST (2013) compiled published articles into an online 10,000year eruption database of Japan. Compared to the databases of Machida and Arai (2003) and Committee for catalog of Quaternary volcanoes in Japan (2000), this database provides more details of younger and smaller eruptions, although information related to some major volcanoes is still under compilation. Moreover, Hayakawa (2010) organized online databases of 2000year and one millionyear tephra in Japan. The database contains more instances of tephra fallout deposits than other Japanese databases, but these additional units are often not described in terms of eruption age and volume.
For our analysis, we identified 700 explosive volcanic eruptions at 100 volcanoes in Japan from the LaMEVE database (Crosweller et al., 2012, downloaded August 11th 2014). The LaMEVE database is developing continuously. It not only collects newly reported events and volcanoes but also removes events and volcanoes when they are determined to be inaccurate or duplicates. The analyzed version of LaMEVE originally contained 729 eruptions from 106 Japanese volcanoes. A total of 29 events and 6 volcanoes were removed because their exact eruption sources were determined to be uncertain based on their original data source (Hayakawa, 2010). Since the analyzed LaMEVE dataset contains 1887 explosive eruptions and 3107 Quaternary volcanoes throughout the world, these 700 Japanese explosive eruptions and 100 Japanese volcanoes account for about 38 % (=100 × (729–29)/(1887–29)) and 3.2 % (=100 × (106–6)/(3107–6)) of the global records, respectively. Of these eruptions, two are older than 1.8 Ma and the oldest event is 2.25 Ma. A total of 686 eruptions are larger than or equal to VEI 4 (Table 1). Although the other 14 events are larger than or equal to eruption magnitude 4 (as defined by Pyle, 2000), their VEI value is 3. Similarly, a total of 699 eruptions are larger than or equal to eruption magnitude 4 (Table 2); eruption magnitude of the remaining one VEI 4 eruption is 3.7. Smaller eruptions account for the majority of the compiled data: VEI 4 and 5 eruptions account for 82.5 % of the total 686 eruptions greater than or equal to VEI 4 (Table 1) and 74.7 % of the total 699 eruptions (M ≥ 4) are smaller than magnitude 5.5 (Table 2). Brown et al. (2014) stated that analysis of the LaMEVE database at a resolution higher than one order of magnitude bins is not justified due to the large uncertainties in tephra volume estimation and, therefore, analyzed the data in bins of order of magnitude (M44.9, M55.9, M66.9, M77.9 and M88.9). In our analysis, we analyze the data every half an order of magnitude, because the influence of the uncertainty is less significant as the data are binned cumulatively, so that each bin contains eruptions larger than or equal to its eruption magnitude value (Table 2). Plots of cumulative number of events against time (Fig. 1a and b) are strongly nonlinear with the event rate (the derivative of this relationship)  decreasing back in time. Event rate decreases more rapidly as VEI or magnitude increases (Fig. 1c and d). Therefore, it is clear that eruptions are not constantly recorded over long time periods, a feature also observed in global data (Brown et al., 2014).
Except the category of VEI 8 due to a very small number of events (9 events), the number of missing events given the apparent change in frequency with time is estimated in three steps: (1) calculation of time dependent recording rate of events for each VEI (4 to 7) and eruption magnitude (4.0 to 7.0) category, (2) fitting a function to the time series of the observed recording rate, (3) calculation of the percentage of missing events compared to the total number of events, with both estimated from the fitted function. The modeling assumes that there are no missing events at time zero (the present day) under the present global volcano monitoring systems and the present day recurrence rate (the eruption frequency; events/ka) is calculated as the intercept of the fitted function at time zero.
Recording rate calculation
For each VEI and eruption magnitude category (Tables 1 and 2), event recording rate, r _{ dm,} at the m ^{th} youngest event age, d _{ m,} is calculated using a moving average:
where d _{ m+n } and d _{ mn } represent the ages of the n ^{th} older and younger events than the m ^{th} youngest event, respectively, and 2n is the number of events representing a window size, MA, of the moving average calculation. Because the smoothness of the time series of the observed recording rate depends on the window size, multiple calculations with different window sizes are necessary. Although a smaller window size (e.g. MA = 2) is suitable to detect short term recording rate changes, it often gives a zero denominator in equation 1 because some consecutive events have the same age in the database. We therefore chose the window size (MA) of 4 (n = 2), 6 (n = 3) and 8 (n = 4) in this study. This is a simpler approach than the one adopted by Sanchez and Shcherbakov (2012), in which they used logarithmically increasing bin lengths to calculate normalized recording rate.
Function fitting to the observed recording rate
The time series of the calculated recording rate was modeled with two functions. The recording rate of the larger events appears to decrease exponentially as a function of eruption age (Fig. 2a, b and e). Therefore, an exponential decay function, R _{ e }(t), is used to model the trends.
where R _{ 0 } = rλ is the intercept value of the recording rate at time zero, λ is the decay constant of the exponential decay function, f _{ e }(t) is the probability density function of an exponential distribution and r is a scaling factor to fit f _{ e }(t) to the time series of the observed recording rate. Parameter λ is obtained from maximization of the logarithmic likelihood function of λ:
where N is the number of observed recording rate in the time series. From this function, the maximum likelihood estimate of λ is obtained explicitly:
where \( \widehat{d_m} \) is the mean age of the observed recording rate r _{ dm }. For the maximum likelihood estimate of λ,
represents the halflife period of recording rate R(t).
Time series of smaller events show a more rapid drop followed by a slower decreasing trend (Fig. 2c, d and hk). These two trends are difficult to be modeled with a single exponential function. In these cases, a double exponential decay function, R _{ d }(t), is fitted to the time series of recording rate to capture the two decreasing trends observed in the time series (Fig. 2c, d and hk):
The two terms of the double exponential decay function, R _{ d }(t), are attributed to rapid and slow decay of underrecording processes. R _{ 1 } is the initial value of the recording rate of units, which disappear relatively quickly with time, R _{ 2 } is the initial value of the recording rate of units which are preserved for a longer time. The parameters λ _{ 1 } and λ _{ 2 } are decay constants of the two exponential decay segments; f _{ e1 }(t) and f _{ e2 }(t) are the probability density functions of two exponential distributions; f _{ d }(t) is the probability density function of a double exponential distribution; p and r are weighting and scaling factors, respectively. Parameters λ _{ 1 }, λ _{ 2 } and p can be estimated numerically from maximization of the logarithmic likelihood function of λ _{ 1 }, λ _{ 2 } and p:
For the estimated λ _{ 1 } and λ _{ 2,}
represent the halflife periods of rapid and slow decreasing processes, respectively.
The functions R _{ e }(t) and R _{ d }(t) are obtained by estimating the scale factor r after deciding the probability density functions, f _{ e }(t) and f _{ d }(t) with the decay constants, λ, λ _{ 1 } and λ _{ 2 } and the weighting factor p. The scale factor r is estimated by minimizing the square error between the probability density function and the time series of observed recording rate in logarithmic scale (Fig. 2ak).
The critical assumption here is that the recurrence rate of explosive eruptions in Japan was constant during the last 2 Ma. This assumption is plausible because the tectonic setting of arc volcanism has not changed significantly in Japan during most of the Quaternary (e.g., Taira, 2001; Mahony et al., 2012).
Although other functions were also investigated to describe underrecording of events (Fig. 3), they are not suitable for our analysis. For example, Coles and Sparks (2006) and Deligne et al. (2010) introduced a function into the extreme value method to represent the probability of underrecording. Their underrecording probability function may represent the short term underrecording (~2,000 years, Coles and Sparks 2006; ~10,000 years, Deligne et al., 2010). However, we found that their underrecording probability function does not fit the long term dataset (Fig. 3). In addition, the Weibull distribution is often used to model the survivor rate (e.g. Pyke and Thompson, 1986); however, the divergence of the function at t = 0 does not provide an estimate of the current frequency of eruptions. The large recording rate calculated by the function at t ~ 0 (e.g. Fig. 3d, j and k) suggests the divergence of the function. Although the modified Omori’s law (Utsu, 1961) for aftershock decay in seismology shows better fitting than these two functions (Fig. 3), an advantage of our model is the clear physical meanings of parameters in an exponential decay model, specifically the initial value R _{ 0 } (present day recurrence rate) and the halflife T of the recording rate. Parameters of the double exponential decay model correspond to the initial values R _{ 1 }, R _{ 2 } and the halflife T _{ 1 }, T _{ 2 } of recording rates of the rapid and the more slowly decreasing processes. In the latter case, the sum of the initial values of the recording rates of the rapid and slower processes represents the present day recurrence rate R _{ 0 } (R _{ 1 } + R _{ 2 }) of events (Table 3). On the other hand, the window size 2n (MA) for the moving average calculation (equation 1) and function are not uniquely constrained by our method. Therefore, multiple calculations with different window sizes and evaluation of models with corrected Akaike’s information criterion are necessary to compare the models.
For each VEI (4 to 7) and eruption magnitude (4.0 to 7.0) category, one of the two probability density functions, f _{ e }(t) and f _{ d }(t), is selected as a better function based on Akaike’s information criterion (AIC; Akaike, 1974):
and corrected Akaike’s information criterion (AICc; Sugiura, 1978) for small sample size (here, sample size ≤ 50):
where k is the number of parameters in the model, N is the number of recording rate observations in the time series and lnL is the maximized value of the logarithmic likelihood function (equations 5 and 12). AICc was used for the model selection of eruption categories VEI 7 (29 events), M ≥ 6.5 (50 events) and M ≥ 7 (34 events). When the difference of the AIC or AICc values of the two models was larger than 2, the model with the smaller AIC or AICc was selected (Burnham and Anderson, 2002). For our analyses, this difference corresponds to a significance level of 4.6 % (Hudson, 1971; Zhuang et al., 2005). On the other hand, if the difference of the AIC or AICc values was less than 2, results of both models are shown in Table 3, as the models are not regarded as significantly different.
Calculation of the percentage of missing events
Based on the fitted function, R(t), the percentage of underrecording in T years is estimated from the number of recorded events, \( {\displaystyle {\int}_0^TR(t)}dt \), and the total number of events, TR(0), both estimated from the fitted function:
When the AIC or AICc difference of the two models is less than 2, the underrecording was calculated as the weighted average of underrecordings, which are estimated from both exponential and doubleexponential models and weighted with their relative likelihoods (Table 3), given by:
where A _{ s } and A _{ l } are the smaller and larger AIC or AICc values, respectively (Burnham and Anderson 2002). Note that the relative likelihood of the smaller AIC or AICc model is equal to 1 in this equation (Table 3).
Results
The results of model fitting, AIC and AICc model selection suggest that the time series of the recording rate of smaller and larger eruptions can be described by double exponential decay model and exponential decay model, respectively (Fig. 2, Table 3). Good fitting of the functions to the observed recording rates is indicated visually by the small (less than one log unit) residuals (Fig. 4). Eruptions are not constantly recorded over long time periods (Fig. 1a and c). For instance, the VEI 4 recording rate decreases very rapidly with time (Fig. 2d, halflife is about 5600 years; Table 3) and only 4 of 293 VEI 4 eruptions are older than or equal to 1 Ma (Fig. 1a). The recording rate of smaller events, therefore, shows a rapid decreasing trend in the comparatively recent part of the record, and a slight decreasing trend in the older part of the record, suitable to be modeled by the double exponential decay function (Fig. 2c, d and hk). For example, originally high initial recording rates (recurrence rate) of about 22–25 events/ka (VEI 4; MA = 4, 6, 8), 3.2–3.5 events/ka (VEI 5; MA = 4, 6, 8), 26 events/ka (M ≥ 4; MA = 8) and 3.4–3.5 events/ka (M ≥ 5; MA = 6, 8) decreases very rapidly with the halflives of the recording rate of 5.56–5.63 ka (VEI 4; MA = 4, 6, 8), 39.2–40.2 ka (VEI 5; MA = 4, 6, 8), 7.69 ka (M ≥ 4; MA = 8) and 53.0–53.3 ka (M ≥ 5; MA = 6, 8) (Table 3). At older times, the data are characterized by a much less rapid decrease in the recording rate, indicating a different, slower process. Here the halflives of the recording rate of 101–112 ka (VEI 4; MA = 4, 6, 8), 181–202 ka (VEI 5; MA = 4, 6, 8), 163 ka (M ≥ 4; MA = 8) and 265–274 ka (M ≥ 5; MA = 6, 8) are found (Table 3) for time periods prior to 26–27 ka (VEI 4; MA = 4, 6, 8), 114–130 ka (VEI 5; MA = 4, 6, 8), 28 ka (M ≥ 4; MA = 8) and 167–173 ka (M ≥ 5; MA = 6, 8), respectively. The recording rates of VEI 4 (MA = 4, 6, 8), VEI 5 (MA = 4, 6, 8), M ≥ 4 (MA = 8) and M ≥ 5 (MA = 6, 8) at this transition are about 1.7 events/ka, 0.6–0.7 events/ka, 3.9 events/ka and 0.6 events/ka, respectively.
The recording rate of larger eruptions can be reasonably modeled with a single exponential function (Fig. 2a, b and e). For instance, the initial recording rates (recurrence rate) of VEI 6 (about 0.4 events/ka; MA = 4, 6, 8), VEI 7 (about 0.06 events/ka; MA = 4, 6, 8) and M ≥ 7 (about 0.06–0.07 events/ka; MA = 4, 6, 8) eruption categories decrease with the halflives of 137–150 ka, 317–343 ka and 361–389 ka, respectively (Table 3). On the other hand, AIC or AICc difference between the fitted models is less than 2 in the cases of the M ≥ 6 dataset with MA = 8 and M ≥ 6.5 dataset with MA = 4 (Table 3) suggesting that these medium eruption magnitude categories might be represented by either the exponential or double exponential model (Burnham and Anderson, 2002). This transitional feature is not found in the analysis of VEI datasets but in the modeling of the eruption magnitude dataset, probably because the datasets of eruption magnitude are taken at smaller intervals (every half an order of magnitude) and also contain different magnitude eruptions, which are larger than each magnitude category. Furthermore, this transitional feature also indicates the importance of testing different window sizes in the moving average calculation in our analysis.
Figure 5 shows the increase of underrecording with time. Although the percentage is estimated with different window sizes (MA = 4, 6, 8) in the moving average calculation, it does not vary significantly, except in the case of the VEI 5 dataset. The events of smaller VEI (4 and 5) show an increase of underrecording with increasing time in the past. For example, 89 % of VEI 4 events (MA = 4, 6, 8) and 65–66 % of VEI 5 events (MA = 4, 6, 8) are missing from the record at 100 ka and 200 ka, respectively (Fig. 5a). Furthermore, even larger VEI events (VEI 6 and 7) show a significant amount of underrecording, although the increase of underrecording is reduced compared to the smaller VEI events. For instance, 46–49 % of VEI 6 events (MA = 4, 6, 8) and 36–39 % of VEI 7 events (MA = 4, 6, 8) are missing from the record at 300 ka and 500 ka, respectively (Fig. 5a). Similarly, eruption magnitude datasets including smaller eruption magnitude events, show relatively quick increase of underrecording. For example, 83 % of M ≥ 4 eruptions (MA = 8), 58 % of M ≥ 5 eruptions (MA = 6, 8), 46–54 % of M ≥ 6 eruptions (MA = 4, 6, 8), and 33–36 % of M ≥ 7 eruptions (MA = 4, 6, 8) are underrecorded in 100 ka, 200 ka, 300 ka, and 500 ka periods, respectively (Fig. 5b). The total number of eruptions is estimated from the fitted models. During the Quaternary (2.588 Ma; Gibbard et al., 2010), it is estimated that about 56,000–63,000 VEI 4 events (MA = 4, 6, 8), 8,200–9,000 VEI 5 events (MA = 4, 6, 8), 1,100–1,300 VEI 6 events (MA = 4, 6, 8), 150–180 VEI 7 events (MA = 4, 6, 8), 68,000 M ≥ 4 eruptions (MA = 8), 8,700–9,000 M ≥ 5 eruptions (MA = 6, 8), 1,300–1,900 M ≥ 6 eruptions (MA = 4, 6, 8) and 160–190 M ≥ 7 eruptions (MA = 4, 6, 8) occurred in Japan.
Discussion
Underrecording of the global data
Underrecording of the global database can be estimated from the ratio between the Japanese and global eruption frequencies and the actual percentage of the Japanese eruptions in the global database. An almost perfect empirical linear relationship is reported between logarithmic frequency and VEI values by De la CruzReina (1991):
where F is the eruption frequency (events/ka) and I is the VEI or eruption magnitude. When eruption magnitudes are taken instead of VEI values, the slope of the trend line, b, is equivalent to the “bvalue” of GutenbergRichter law in seismology. Between VEI or eruption magnitude categories of I _{ p } and I _{ q } (I _{ p } < I _{ q }), the mean ratio of the global eruption frequency to the Japanese eruption frequency, r _{ g }, is:
where the subscripts g and j denote the parameters obtained from the global and the Japanese data, respectively. In our analysis, the Japanese eruption frequency (events/ka) of each VEI and magnitude category is calculated from the fitted recording rate models with t = 0 on the assumption that there is little possibility of current underrecording given present global volcano monitoring systems. Because the mechanism of very large explosive eruptions may be different from that of smaller ones (Deligne et al., 2010) and larger eruption categories (VEI 7, M ≥ 6.5 and M ≥ 7) do not have many records (Tables 1 and 2), the empirical linear relationship (equation 19) is estimated for the range of VEI 4 to VEI 6 and M ≥ 4 to M ≥ 6 in this study (Fig. 6 and Table 4). In comparison with the Japanese dataset, temporal patterns of global volcanism in the past 500 years yields a _{ g } = 5.494 and b _{ g } = 0.789 (Fig. 6a; De la CruzReina, 1991). Although the time range of the analysis is short and underrecording of the global dataset is not considered, this result suggests that the eruption frequency of the global volcanism is 10.6–11.5 times more frequent than the Japanese eruptions between the VEI values of 4 to 7 (MA = 4, 6, 8; e.g. Fig. 6a), or 8.7–9.4 % of global eruptions occurred in Japan. Comparison between this percentage and the actual percentage of Japanese eruptions in the global database (about 38 %) suggests that the underrecording of the global database is 4.0–4.3 times larger than estimated for the Japanese data. Pyle (1995) compiled explosive eruptions that occurred globally during the last 2000 years and analyzed their frequencysize relationship, timeaveraged eruptive mass and thermal energy release. The frequencies of those eruptions, which are derived from longer time period for larger eruption categories, and the mean volume of those eruption categories (Pyle 1995) yield a _{ g } = 6.126 and b _{ g } = 0.86 for the linear relationship of logarithmic frequency and magnitude with an assumption of tephra density of 1000 kg/m^{3} (Fig. 6b). For the eruption magnitude categories of M ≥ 4 to M ≥ 6 (MA = 6, 8; e.g. Fig. 6b), global eruptions are about 21.1–23.2 times more frequent than the Japanese eruptions, and 4.3–4.7 % of eruptions in the world occurred in Japan. This percentage and the actual percentage of the Japanese eruptions in the global database (about 38 %) suggest that underrecording of the global database is 7.9–8.7 times larger than the Japanese dataset. Because the Japanese volcanoes accounts for about 3.2 % of the LaMEVE database, the results estimated with the Pyle (1995) approach are more concordant than the results obtained using the De la Cruz Reyna (1991) approach.
The reason for many missing events
The results of our analyses suggest a large amount of underrecording of events including not only smaller (e.g. VEI 4 and 5) but also larger (e.g. VEI 6 and 7) magnitude eruptions. The underrecording of events is attributed to the absence of historical records, disappearance of tephra units from the geological record and or overlooking of events  that is, not identifying events that are preserved in the rock record. The main mechanisms of disappearance of tephra units are erosion (e.g. Lavigne 2004; Pierson et al., 2013) and alteration of tephra deposits (e.g. Pollard et al., 2003), burial of tephra deposits by younger deposits (Imura and Kobayashi, 2001) and disappearance of the source volcano itself due to burial (e.g. Kamata, 1989) or erosion (e.g. Machida et al., 1997).
The absence of a historical record is a significant reason for a recording rate decrease in the past 1,000 years. For example, Furlan (2010) showed that recording of large eruptions changed considerably at AD 1900 and 1600. These changes can be explained in terms of factors such as colonization, improvements in newspaper reporting, the telegraph and more recently the internet, and development of modern scientific approaches to reporting natural events. Because of this more rapid decrease in historical data compared with geological data, Brown et al. (2014) excluded eruptions less than 1,000 years BP from their analysis to understand geological data. In our analysis, the dataset includes both historical and geological data and shows the rapid decrease process (Fig. 2c, d and hk). On the other hand, the time scale of this process is much longer (halflife >5,000 years; Table 3) than historical time scale. Therefore, the rapid decrease observed in our analysis must be mainly caused by geological processes. For instance, erosion of tephra deposits must be a significant mechanism responsible for the underrecording of smaller eruption volume events. Especially, recording rates of smaller events show the process of the rapid decrease (Fig. 2c, d and hk) and suggests that erosion is more dominant during this process. Conversely, older smaller events show slower decreasing process of recording rate (long tail of recording rate, Fig. 2c, d and hk) and suggest that some of the tephra deposits are preserved from the rapid erosion process as indicated by Brown et al. (2014). Alteration of tephra deposits, burial of tephra deposits by younger deposits and disappearance of eruption source due to burial or erosion must be the mechanism of this slower decreasing process. They must also affect the long term decrease of recording rate of those large eruptions (Fig. 2a, b and e). Furthermore, overlooking events may happen if geologists are biased toward reporting younger (e.g. active volcano < 10,000 years old) and larger events. Reporting bias must affect the identification of older smaller events more than the reporting of larger and younger events.
Volcanic hazard implications
In using an eruption database, underrecording of events must be taken into account to avoid underestimation of potential hazards and spurious inferences concerning how eruption rates have varied with time. Time averaged frequency of eruptions must be calculated by averaging the number of events of each VEI or eruption magnitude category over different time scales (e.g. Simkin, 1993; Pyle, 1995). The time averaged frequency of smaller eruptions will be underestimated when considering long time periods due to their relatively quick disappearance and the time averaged frequency of larger eruptions will be less reliable when considering short time periods due to the small number of eruptions. Our method estimates the halflife of eruption records, which will help to select the calculation time scale of time averaged frequency of each VEI or eruption magnitude category. In addition to the estimation of the regional underrecording of events, underrecording of events at individual volcanoes can be evaluated by other statistical approaches (e.g. Wang and Bebbington, 2012).
Conclusions
We compiled data on large magnitude explosive eruptions in Japan from the LaMEVE database (Crosweller et al., 2012) and investigated the underrecording of events. Brown et al. (2014) analyzed the spatial and temporal bias of the LaMEVE database and showed that underrecording is a function of magnitude and the proportion of historic and geological data in each magnitude subset. They showed that about 40 % of all recorded eruptions have occurred in Japan. Here, we studied the underrecording of the Japanese dataset and also estimated the underrecording of the global dataset on the basis of the estimated eruption frequency and the number of volcanoes. Although the recording rate of smaller events (VEI 4, 5 and eruption magnitude 4–5.5) are basically high, it drops very rapidly to a small value and then decreases slowly back in time. The model fitting, AIC and AICc model selection suggest that the two trends of smaller eruption categories (VEI 4 and VEI 5) can be represented with the double exponential decay model. The recording rates of larger eruptions (VEI 6, 7 and M ≥ 7) show a more slowly decreasing trend, which does not have a significant initial quick drop and which can be represented by a single exponential decay model. The total number of eruptions and the percentage of missing eruptions are estimated from the fitted models. Our results suggest that even larger VEI events have significant underrecording when considering time periods of hundreds of thousands of years. For example, 89 % of VEI 4 events, 65–66 % of VEI 5 events, 46–49 % of VEI 6 events and 36–39 % of VEI 7 events are missing from the record at 100 ka, 200 ka, 300 ka, and 500 ka, respectively. Comparison of frequencies of Japanese and global eruptions suggests that underrecording of the global database is 7.9–8.7 times larger than the Japanese dataset. Therefore, underrecording of events must be taken into account in longterm volcanic hazard assessments.
Abbreviations
 LaMEVE:

Large Magnitude Explosive Volcanic Eruptions
 VEI:

Volcanic Explosivity Index
 M:

Eruption magnitude
 MA:

Window size
 Dexp:

Double exponential decay function
 Exp:

Exponential decay function
 Lr:

Relative likelihood
References
Akaike H (1974) A new look at the statistical model identification. IEEE Transactions on Automatic Control AC19:716–723
Bebbington MS (2013) Models for Temporal Volcanic Hazard. Stat Volcanol 1:1–24. doi:10.5038/2163338X.1.1
Brown SK, Crosweller HS, Sparks RSJ, Cottrell E, Deligne NI, Guerrero NO, Hobbs L, Kiyosugi K, Loughlin SC, Siebert L, Takarada S (2014) Characterisation of the Quaternary eruption record: analysis of the Large Magnitude Explosive Volcanic Eruptions (LaMEVE) database. J Appl Volcanol 3:5
Burnham KP, Anderson DR (2002) Model Selection and Multimodel Inference: A Practical InformationTheoretic Approach, 2nd edn. Springer, New York
Coles S, Sparks RSJ (2006) Extreme value methods for modeling historical series of large volcanic magnitudes. In: Mader HM, Coles SG, Connor CB, Connor LJ (eds) Statistics in Volcanology. Geological Society of London, Special Publication of IAVCEI, 1, 47–56
Committee for catalog of Quaternary volcanoes in Japan (ed) (2000) Catalog of Quaternary volcanoes in Japan. http://www.geo.chs.nihonu.ac.jp/tchiba/volcano/index.htm. Accessed regularly (in Japanese)
Crosweller HS, Arora B, Brown SK, Cottrell E, Deligne NI, Guerrero NO, Hobbs L, Kiyosugi K, Loughlin SC, Lowndes J, Nayembil M, Siebert L, Sparks RSJ, Takarada S, Venzke E (2012) Global database on large magnitude explosive volcanic eruptions (LaMEVE). J Appl Volcanol 1:4. doi:10.1186/2191504014
De la CruzReyna S (1991) Poissondistributed patterns of explosive eruptive activity. Bull Volcanol 54:57–67
Deligne NI, Coles SG, Sparks RSJ (2010) Recurrence rates of large explosive volcanic eruptions. J Geophys Res 115:B06203
Efron B (1979) Bootstrap Methods: Another Look at the Jackknife. Annal Stat 7(1):1–26
Furlan C (2010) Extreme value methods for modelling historical series of large volcanic magnitudes. Stat Model 10(2):113–132
Geological Survey of Japan, AIST (ed) (2013) Catalog of eruptive events during the last 10,000 years in Japan, version 2.1. https://gbank.gsj.jp/volcano/eruption/index.html. Accessed regularly (in Japanese)
Gibbard PL, Head MJ, Walker MJC (2010) Formal ratification of the Quaternary System/Period and the Pleistocene Series/Epoch with a base at 2.58 Ma. J Quat Sci 25(2):96–102
Guttorp P, Thompson ML (1991) Estimating SecondOrder Parameters of Volcanicity from Historical Data. J Am Stat Assoc 86:578–583
Hayakawa Y (2010) Hayakawa’s 2000year eruption database and one millionyear tephra database. http://www.hayakawayukio.jp/database/. Accessed regularly
Hudson DJ (1971) Interval Estimation from the Likelihood Function. J Royal Stat Soc Ser B 33(2):256–262
Imura R, Kobayashi T (2001) Geological map of Kirishima volcano. Geological Survey of Japan, (in Japanese with English abstract)
Kamata H (1989) Shishimuta caldera, the buried source of the Yabakei pyroclastic flow in the Hohi volcanic zone, Japan. Bull Volcanol 51:41–50
Lavigne F (2004) Rate of sediment yield following smallscale volcanic eruptions: a quantitative assessment at the Merapi and Semeru stratovolcanoes, Java, Indonesia. Earth Surface Process Landforms 29(8):1045–1058
Machida H, Arai F (2003) Atlas of Tephra in and around Japan, revised edition. University of Tokyo Press, Japan (in Japanese)
Machida H, Yamazaki H, Arai F, Fujiwara O (1997) Omine Pyroclastic Flow Deposits: A marker tephra for the study of evolution of the Japan Northern Alps. J Geogr 106(3):432–439 (in Japanese)
Mahony SH, Wallace LM, Miyoshi M, Villamor P, Sparks RSJ, Hasenaka T (2012) Volcanotectonic interactions during rapid plate boundary evolution in the Kyushu region, SW Japan. Bull Geological Soc Am 123:2201–2223
Mason BG, Pyle DM, Oppenheimer C (2004) The size and frequency of the largest explosive eruptions on Earth. Bull Volcanol 66(8):735–748
Newhall CG, Self S (1982) The Volcanic Explosivity Index (VEI): an estimate of explosive magnitude for historical volcanism. J Geophys Res 87(C2):1231–1238
Pierson TC, Major JJ, Amigo Á, Moreno H (2013) Acute sedimentation response to rainfall following the explosive phase of the 2008–2009 eruption of Chaitén volcano, Chile. Bull Volcanol 75:723. doi:10.1007/s0044501307234
Pollard AM, Blockley SPE, Ward KR (2003) Chemical alteration of tephra in the depositional environment: theoretical stability modelling. J Quart Sci 18(5):385–394
Pyke DA, Thompson JN (1986) Statistical analysis of survival and removal rate experiments. Ecology 67(1):240–245
Pyle DM (1995) Mass and energy budgets of explosive volcanic eruptions. Geophys Res Lett 22(5):563–566
Pyle DM (2000) Sizes of volcanic eruptions. In: Sigurdsson H, Houghton BF, McNutt SR, Rymer H, Stix J (eds) Encyclopedia of Volcanoes. Academic, London
Sanchez L, Shcherbakov R (2012) Temporal scaling of volcanic eruptions. J Volcanol Geotherm Res 247–248:115–121
Siebert L, Simkin T (2002) Volcanoes of the World: an Illustrated Catalog of Holocene Volcanoes and their Eruptions. Smithsonian Institution, Global Volcanism Program Digital Information Series, GVP3, (Current URL: http://www.volcano.si.edu/). Accessed regularly
Siebert L, Simkin T, Kimberly P (2010) Volcanoes of the world, 3rd edn. University of California Press, Berkeley
Simkin T (1993) Terrestrial Volcanism in Space and Time. Ann Rev Earth Planet Sci 21:427–452
Sugiura N (1978) Further analysts of the data by akaike’ s information criterion and the finite corrections. Commun Stat Theory Method 7(1):13–26
Taira A (2001) Tectonic evolution of the Japanese island arc system. Annu Rev Earth Planet Sci 29:109–34
Utsu (1961) A statistical study on the occurrence of aftershocks. Geophys Mag 30:521–605
Wang T, Bebbington M (2012) Estimating the likelihood of an eruption from a volcano with missing onsets in its record. J Volcanol Geotherm Res 243–244:14–23
Zhuang J, VereJones D, Guan H, Ogata Y, Ma L (2005) Preliminary Analysis of Observations on the UltraLow Frequency Electric Field in the Beijing Region. Pure appl geophys 162:1367–1396, doi:10.1007/s0002400426743
Acknowledgments
This work was funded by a grant from the European Research Council (VOLDIES grant), the Natural Environment Research Council (Global Volcano Model grant), the British Geological Survey and also Munich Re in the initial stages. Support for KK was provided by the Nuclear Waste Organization of Japan (NUMO) and the Obayashi Corporation. Findings do not necessarily reflect their views. The authors benefited from conversations with R Kazahaya about the modeling of recording rate of volcanic events. We thank two anonymous reviewers and editor J. Barclay for making further improvements to the manuscript.
Author information
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
KK drafted the manuscript, compiled the data and undertook data analysis. CC and RSJS conceived of the study and helped draft the manuscript. HSC, SKB and LS compiled the data and provided comments on the draft manuscript. TW provided advice on data analysis and comments on the draft manuscript. ST provided data and support to KK on Japanese eruptions. All authors read and approved the final manuscript.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Received
Accepted
Published
DOI
Keywords
 Large Magnitude Explosive Volcanic Eruptions database
 Underrecording of volcanic events
 Recurrence rate
 Missing data
 Statistics in volcanology
 Natural hazard