Report on potential sampling biases in the LaMEVE database of global volcanism

We investigate whether the disproportionate contribution of individual volcanoes in the Large Magnitude Explosive Volcanic Eruption database (LaMEVE) potentially compromises the treatment of LaMEVE as a globally representative database of volcanic activity. We find that 41% of volcanoes which contribute at least one eruption to LaMEVE only contribute one eruption (10% of all eruptions), and the six most prolific volcanoes contribute 11% of eruptions. However, there is no systematic bias with respect to the eruption magnitude or date for volcanoes contributing one eruption. Also, no bias can be discerned for when the smallest or largest eruption at a volcano occurs in its eruptive record. Half of the volcanoes contributing one or more eruptions to the LaMEVE database had their first eruption prior to 36.4 ka. We find LaMEVE is representative – while there are well-known issues of eruption under-reporting, LaMEVE is not overly biased by the activity of a few volcanoes.


Introduction
The Large Magnitude Explosive Volcanic Eruption database (LaMEVE) compiles all known explosive Quaternary eruptions with a magnitude (M; Pyle, 2000) or Volcanic Explosivity Index (VEI; Newhall and Self, 1982) of 4 or greater (Crosweller et al., 2012). The database also includes all known Quaternary volcanoes, even if they do not have a qualifying eruption in their current record. Using the terminology introduced in Brown et al. (2014), we refer to volcanoes which contribute at least one eruption to the LaMEVE database as Quaternary Explosive Activity Recorded (QEAR) volcanoes. Under-recording of volcanic eruptions in LaMEVE and companion databases such as the Smithsonian Institution's Global Volcanism Program (Global Volcanism Program, 2013) is a well-known problem (e.g., Newhall and Self, 1982; Simkin, 1993; Siebert et al., 2010; Brown et al., 2014; Rougier et al., 2016): the further back one goes from the present, the fewer eruptions have been reported, with the recording rate of smaller eruptions decaying more rapidly than that of larger eruptions. Various strategies have been used to characterise and correct for the incompleteness of the record (e.g., Coles and Sparks, 2006; Marzocchi and Zaccarelli, 2006; Deligne et al., 2010; Furlan, 2010; Jenkins et al., 2012; Mead and Magill, 2014; Kiyosugi et al., 2015; Rougier et al., 2016) in order to characterise magnitude-frequency relationships and other properties. However, a potential issue is whether LaMEVE disproportionately "samples" a few volcanoes, introducing biases and potentially compromising its analysis. This report investigates this issue, which has not previously been studied.
LaMEVE and companion databases have been used to draw inferences on global or regional activity. An often unstated assumption is that, while there is under-recording of eruptions, the existing data are representative of the behaviour of global volcanism. However, Brown et al. (2014)  In this report we explore the possibility that individual volcanoes, not just regions, may disproportionately contribute to the global record. If this were the case, then an analysis of global data might in fact be an analysis of the behaviour of a few volcanoes and hence could be unrepresentative. We examine version 3 of the LaMEVE database, released in September 2015.

Number of eruptions per volcano
There are 2627 volcanoes in LaMEVE, 480 of which have at least one recorded M and/or VEI ≥ 4 eruption.
Here we will only focus on the subset of volcanoes which contribute at least one eruption to LaMEVE, i.e., QEAR volcanoes.
The uneven contribution of individual volcanoes calls into question the assumption that LaMEVE is globally representative. Here we examine whether LaMEVE is disproportionately influenced by the behaviour of a few volcanoes.
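The cumulative percentages of volcanoes and of eruptions, grouped by the number of eruptions each volcano contributes, could be tabulated along the following lines. This is a minimal sketch rather than the procedure used for this report: the function name, the input layout (one volcano name per eruption record), and the toy records are all hypothetical.

```python
from collections import Counter

def cumulative_contribution(eruption_volcanoes):
    """Return rows of (n, cumulative % of volcanoes, cumulative % of eruptions).

    eruption_volcanoes: one volcano name per eruption record (assumed layout).
    n is the number of eruptions a volcano contributes.
    """
    per_volcano = Counter(eruption_volcanoes)        # volcano -> eruption count
    volcanoes_by_n = Counter(per_volcano.values())   # n -> number of volcanoes with n eruptions
    total_volcanoes = len(per_volcano)
    total_eruptions = len(eruption_volcanoes)
    rows = []
    cum_volcanoes = 0
    cum_eruptions = 0
    for n in sorted(volcanoes_by_n):
        cum_volcanoes += volcanoes_by_n[n]
        cum_eruptions += n * volcanoes_by_n[n]
        rows.append((n,
                     100 * cum_volcanoes / total_volcanoes,
                     100 * cum_eruptions / total_eruptions))
    return rows

# Hypothetical toy records, not LaMEVE data: A and D erupt once, B twice, C four times.
records = ["A", "B", "B", "C", "C", "C", "C", "D"]
rows = cumulative_contribution(records)
for n, pct_volcanoes, pct_eruptions in rows:
    print(n, pct_volcanoes, pct_eruptions)
```

In the toy data, half of the volcanoes contribute one eruption each but account for only a quarter of the eruptions, the same kind of asymmetry examined here.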

Volcanoes contributing one eruption
Here we compare volcanoes that record only one eruption in LaMEVE with the database as a whole. We reiterate that eruptions from volcanoes contributing a single eruption account for 10% of LaMEVE eruptions (Fig. 1), so are unlikely to bias the overall distribution. The magnitude distribution (Fig. 3) and ages of eruptions at volcanoes with only one eruption recorded in LaMEVE are similar to those of the database as a whole. Of the eruptions from volcanoes with only one eruption, 38 (19%) are from 1600 AD onwards and 93 (47%) are from the last 10,000 years. In comparison, for all eruptions in LaMEVE, 200 (10%) and 798 (41%) are from 1600 AD and 10,000 yBP onwards, respectively. These proportions differ slightly, but not enough to suggest a systematic bias towards historic or recent eruptions at volcanoes which contribute only a single eruption to LaMEVE. These data show no evidence for major differences in the size or timing of eruptions from volcanoes which contribute a single eruption to LaMEVE.
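The comparison of recent-eruption shares described above can be sketched as follows. This is an illustration under stated assumptions, not the code behind the reported numbers: the record layout (volcano name, age in years before present), the function name, and the toy data are hypothetical, and the sketch assumes at least one single-eruption volcano is present.

```python
from collections import Counter

def recent_share(records, cutoff_ybp):
    """Fraction of eruptions no older than cutoff_ybp, for single-eruption
    volcanoes and for all eruptions.

    records: list of (volcano, age in years BP) tuples (assumed layout).
    Assumes at least one volcano contributes exactly one eruption.
    """
    counts = Counter(volcano for volcano, _ in records)
    single_ages = [age for volcano, age in records if counts[volcano] == 1]
    all_ages = [age for _, age in records]

    def share(ages):
        return sum(age <= cutoff_ybp for age in ages) / len(ages)

    return share(single_ages), share(all_ages)

# Hypothetical toy records, not LaMEVE data.
toy = [("A", 200), ("B", 5000), ("C", 300),
       ("C", 20000), ("C", 90000), ("D", 50000)]
single_share, overall_share = recent_share(toy, cutoff_ybp=10000)
print(single_share, overall_share)
```

A large gap between the two returned fractions would point to the kind of bias towards recent eruptions that the text finds no evidence for.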

Eruption sequence
We investigated whether a systematic bias might arise if large eruptions destroy the record of smaller, earlier eruptions from the same volcano. There is no discussion of this in the literature, and its occurrence would be difficult to establish in the field. If this does happen, it would result in a systematic bias in which smaller eruptions (bearing in mind that all eruptions in LaMEVE are large, M ≥ 4) are more likely to be recorded if they post-date larger eruptions from the same source, further confounding the already challenging under-reporting issue.
At volcanoes contributing two eruptions to LaMEVE, about a quarter have eruptions of equal magnitude, and the remainder are approximately evenly split between volcanoes where the smaller eruption is older and volcanoes where the larger eruption is older (Table 1). For volcanoes contributing three and four eruptions, we examined the sequence in the individual volcano's record of both the smallest and the largest eruption (Table 1). Where multiple eruptions share the smallest or the largest magnitude, the earliest instance of that magnitude is used. For example, Calabozos volcano (Chile) had a magnitude 5.2 eruption at 810 ka, and magnitude 7.4 eruptions at 800, 300, and 150 ka. Thus, in Table 1, in the column of volcanoes with 4 contributing eruptions, Calabozos is one of the 12 volcanoes with the smallest eruption being the oldest, but is not one of the volcanoes where the largest eruption is the youngest. Similarly, Atacazo volcano (Ecuador) had magnitude 4.0 eruptions at 9903 and 6147 yBP, a magnitude 5.2 eruption at 5040 yBP, and a magnitude 5.3 eruption at 2232 yBP. In this case, Atacazo is counted as a volcano which has both its smallest eruption as the oldest, and its largest eruption as the youngest.
Fig. 1 Cumulative percentage of total number of QEAR volcanoes (triangles) and total number of eruptions (circles) according to the number of eruptions a volcano contributes. For example, this plot shows that ~40% of volcanoes contribute one eruption, and ~10% of eruptions come from volcanoes contributing one eruption
At volcanoes contributing three eruptions, the largest eruption occurred first in a greater proportion of cases than the smallest eruption did, but the difference is not large (the limited data preclude robust statistical analysis). At volcanoes contributing four eruptions, the largest and smallest eruptions have similar distributions.
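The tie-breaking rule used for Table 1 (take the earliest instance when several eruptions share the smallest or largest magnitude) can be sketched as below, using the Calabozos and Atacazo values quoted above. The function name and the (age, magnitude) tuple layout are assumptions made for illustration; ages only need to be in consistent units within one volcano.

```python
def extreme_positions(eruptions):
    """Return (position of smallest, position of largest) eruption.

    eruptions: list of (age, magnitude) tuples, larger age meaning older.
    Positions are 1-based in oldest-to-youngest order. Ties in magnitude
    are resolved by taking the earliest (oldest) instance.
    """
    ordered = sorted(eruptions, key=lambda e: e[0], reverse=True)  # oldest first
    magnitudes = [magnitude for _, magnitude in ordered]
    # list.index returns the first match, i.e. the oldest eruption of that magnitude
    smallest_pos = magnitudes.index(min(magnitudes)) + 1
    largest_pos = magnitudes.index(max(magnitudes)) + 1
    return smallest_pos, largest_pos

# Values quoted in the text: Calabozos ages in ka, Atacazo ages in yBP.
calabozos = [(810, 5.2), (800, 7.4), (300, 7.4), (150, 7.4)]
atacazo = [(9903, 4.0), (6147, 4.0), (5040, 5.2), (2232, 5.3)]

print(extreme_positions(calabozos))  # smallest is oldest; largest is second, not youngest
print(extreme_positions(atacazo))    # smallest is oldest; largest is youngest
```

With four eruptions, position 1 is the oldest and position 4 the youngest, so Calabozos counts towards "smallest eruption oldest" only, whereas Atacazo counts towards both "smallest eruption oldest" and "largest eruption youngest".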

By when is LaMEVE globally representative?
Issues of under-reporting aside, it is useful to determine from what point in time the LaMEVE database is representative of the eruptive history of contributing volcanoes. There is no standard methodology to follow, as there might be in the case of under-reporting. However, we note that half of QEAR volcanoes had their earliest contributing eruption before 36.4 ka (Fig. 4). The record of half of QEAR volcanoes largely predates the radiocarbon era (only 46 of the 809 eruptions older than 36.4 ka are dated with radiocarbon methods). Similarly, half of QEAR volcanoes had their most recent contributing eruption before 6400 yBP. The presence of a record of one or more (large) eruptions at a volcano does not mean that the individual volcano's record is complete: even volcanoes contributing many eruptions to LaMEVE may have missing eruptions. However, we suggest that future workers use 36.4 ka as a rough guide to the time from which individual volcanoes do not bias the LaMEVE eruption record.
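One way to read the 36.4 ka figure is as the median, across QEAR volcanoes, of the age of each volcano's oldest contributing eruption. A minimal sketch of that calculation follows; the function name, the (volcano, age in ka) record layout, and the toy data are hypothetical, and this is not claimed to be the exact procedure used here.

```python
from statistics import median

def median_first_eruption_age(records):
    """Median over volcanoes of the age of the oldest recorded eruption.

    records: list of (volcano, age in ka) tuples (assumed layout).
    """
    oldest = {}
    for volcano, age in records:
        # keep the largest (oldest) age seen so far for each volcano
        oldest[volcano] = max(age, oldest.get(volcano, age))
    return median(oldest.values())

# Hypothetical toy records, not LaMEVE data (ages in ka).
toy_ages = [("A", 2.0), ("A", 50.0), ("B", 10.0), ("C", 120.0), ("C", 0.3)]
print(median_first_eruption_age(toy_ages))
```

Replacing max with min in the loop would give the corresponding median for each volcano's most recent contributing eruption.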

Conclusions
The LaMEVE database has substantial under-reporting (incompleteness) issues. At the individual volcano level, 41% of QEAR volcanoes only contribute one eruption (10% of all eruptions); the largest number of eruptions contributed by a single volcano is 52. However, there are no differences in magnitude distribution or eruption date between those volcanoes which contribute only one eruption and those that contribute more than one. We also find that at volcanoes with two, three, or four eruptions there is no bias in eruption size order at individual volcanoes. Finally, half of QEAR volcanoes had their first eruption prior to 36.4 ka; this may be an appropriate time to consider the database geographically representative. Overall, we find that LaMEVE does not have systematic biases caused by disproportionate contributions by individual volcanoes.
Table 1 (fragment): ... (11); Largest eruption youngest: -, 22.0% (9), 21.9% (7)
Fig. 4 Age of the oldest eruption recorded at every volcano contributing at least one eruption to LaMEVE. A) Volcanoes with oldest LaMEVE eruption (M and/or VEI ≥ 4) in the last 10,000 years, B) volcanoes with oldest eruption between 10 and 36.4 ka, C) volcanoes with oldest eruption between 36.4 and 100 ka, and D) volcanoes with oldest eruption prior to 100 ka