Tephra deposit inversion by coupling TEPHRA2 with the Metropolis-Hastings algorithm and its implications

In this work we couple the Metropolis-Hastings algorithm with the volcanic ash transport model TEPHRA2 , and present the coupled algorithm as a new method to estimate the Eruption Source Parameters of volcanic eruptions based on mass per unit area or thickness measurements 15 of tephra fall deposits. Basic elements in the algorithm and how to implement it are introduced. Experiments are done with synthetic datasets. These experiments are designed to demonstrate that the algorithm works, and to show how inputs aﬀect its performance. Results are presented as sample posterior distribution estimates for variables of interest. Advantages of the algorithm are that it has the ability to i) incorporate prior knowledge; ii) quantify the uncertainty; and iii) 20 capture correlations between variables of interest in the estimated Eruption Source Parameters. A limitation is that some of the inputs need to be specifed subjectively. How and why such inputs aﬀect the performance of the algorithm and how to specify them properly are explained and listed. Correlation between variables of interest are well-explained by the physics of tephra transport. We point out that in tephra deposit inversion, caution is needed in attempting to 25 estimate Eruption Source Parameters, and wind direction and speed at each elevation level, as this increases the number of variables to be estimated. The algorithm is applied to a mass per unit area dataset of the tephra deposit from the 2011 Kirishima-Shinmoedake eruption. Simulation results from TEPHRA2 using posterior means from the algorithm are consistent with ﬁeld observations, suggesting that this approach reliably reconstructs Eruption Source Parameters and 30 wind conditions from deposits.

(m i,j (x, y)) of the deposit is proportional to a 2D Gaussian function, and can be written as: where (x, y) is the spatial coordinates, and f i,j (x, y) is the 2D Gaussian function with its mean and variance depending on the grain size j, released elevation H i , wind speed and direction, and parameters 115 that characterize turbulent diffusion (e.g., turbulent diffusion coefficient). The total mass per unit area at (x, y) is the sum of Eq. 1 for all particle sizes released from the eruptive column. As the eruptive column, a line source, is discretized into many point sources in TEPHRA2, the total mass per unit area can be written as: To run TEPHRA2, ESPs, wind conditions, and locations of interest need to be specified. TEPHRA2 Simply put, Bayes' rule states that our prior knowledge about certain quantities of interest can be updated based on new observations. Assuming that the quantities of interest (i.e., ESPs in this study) 130 is a vector x, the prior knowledge on its value can be denoted as a prior probability distribution P (x).
With a series of observations θ (i.e., mass per unit area of tephra deposit on the ground in this study), our understanding on x could be updated, which is denoted by the posterior probability distribution (P (x|θ)). Bayes' rule is written as: where P (θ|x) is the likelihood function. It denotes the probability of observing θ given x. P (θ) is is 135 the evidence P (θ) = P (θ|x)P (x)dx, and is the total probability of the observations. Here we refer to probability density as probability for convenience.
We can "insert" our knowledge on the process being analyzed into Bayes' rule with the help of the likelihood function. In our case, this knowledge is the model TEPHRA2. We use the simplest case with only one variable of interest unknown, say column height, and one mass per unit area observation 140 (θ * ) to explain the likelihood function. We could apply one value of column height (h * ) to TEPHRA2, and collect the corresponding output d(x = h * ) with d(·) denoting the TEPHRA2 output (assume one location of interest in this example). Neglecting model uncertainty, knowing x = h * is equivalent of knowing d(x = h * ). Further assuming that the likelihood function follows a Gaussian distribution, its mean value could be d(x = h * ), and its variance needs to be determined based on our understanding to accept or reject x * 1 . If it is rejected, x 1 = x 0 . Otherwise, x 1 = x * 1 . The two procedures, namely drawing a new point and rejecting or accepting it, iterate, and after sufficient iterations, a chain of vectors x 0 , x 1 , ..., x n is obtained. By (1) excluding points with relatively small index (e.g., x 0 ,.., x 499 ), and (2) taking points with a fixed interval along the chain (e.g., only taking x 500 , x 600 , ..,x 9900 195 , x 10000 ), the target posterior distribution is obtained through sampling. The first measure is to make sure that the results are not affected by the value of the starting point, and the second is to avoid auto-correlation in the chain (see Chib and Greenberg, 1995;Andrieu et al., 2003;Kaipio and Somersalo, 2006 for more details).
Whether to accept or reject a proposed point follows the rules below:

200
• If P (θ|x * 1 )P (x * 1 ) > P (θ|x 0 )P (x 0 ), then the posterior probability is greater at x * 1 , and the proposed point will be accepted. That is, if the proposal has a higher posterior probability it is automatically accepted. Following this rule ensures that there will be more samples with greater posterior probability in the chain.

205
This allows the algorithm to occassionally sample points with low or relatively low posterior probability, in order to explore the entire possible domain of x. Following this rule implies that: -If P (θ|x * 1 )P (x * 1 ) is a lot smaller than P (θ|x 0 )P (x 0 ), the posterior probability at x * 1 is small, and should be accepted with low probability. This ensures that there will be fewer points with low posterior probability in the chain.

Form of the likelihood function
In the present version of the algorithm, it is assumed that the likelihood function for each observation 220 follows a Gaussian distribution with variable log( observation model output ). The mean of the distribution centers at 0 such that the likelihood function peaks when the observation is identical to the model output.
This setup is consistent with previous works (e.g., Connor and Connor, 2006;Kawabata et al., 2013;White et al., 2017). It states that measurement uncertainty scales with magnitude of the observation.
The variance or scale of the likelihood function, which reflects measurement uncertainty, needs to 225 be specified by users. Its effect on the results of the algorithm will be examined in the following experiments. It should be noted that the form of the likelihood function could be changed to different forms easily.

Two ways to specify the wind profile
The algorithm allows for two ways to specify and estimate the wind profile. In cases with limited 230 observations, it is impossible to estimate the wind speed and direction at each elevation. In such cases, a simplified form of the wind profile can be adopted. It assumes that (1) the wind direction is constant, and does not change with elevation, and (2) wind speed increases from zero to a certain value with elevation, and then decreases to zero with elevation linearly. Four variables define such a simplified wind profile. They are the wind direction, maximum wind speed, elevation that corresponds 235 to the maximum wind speed, and elevation that the wind speed reaches zero. In the second way, all values are possible for the wind direction and speed at each elevation (non-simplified wind profile), and users could estimate the wind direction and speed at all or certain specified elevations.

Running the algorithm
To run the algorithm, users first need to prepare input files. Examples of input files can be found in 240 the Appendix. They have five columns including the "variable name" column. The "initial value" column specifies the starting value of each variable. If the value of one certain variable is known (not to be estimated), its initial value will be fixed during the implementation of the algorithm, and the "prior" column needs to be marked as "Fixed" correspondingly. For such variables, users could leave the last three columns blank. To estimate a non-simplified wind profile, users must specify the wind 245 direction and speed at several elevation levels, depending on the form of the assumed profile.
For variables to be estimated, their initial values do not affect the results as long as the specified number of draws is sufficiently large, and users only need to specify values that are physically reasonable and would not lead to error when running TEPHRA2. Forms of prior distributions for these variables need to be defined in the "prior" column. The current version of the algorithm sup-250 ports "Gaussian" and "Uniform" distributions. If the former is specified, columns "parameter a" and "parameter b" should be filled with the mean and standard deviation of the prior distribution, respectively. Otherwise, the two columns correspond to the minimum and maximum of the uniform prior distribution. The last column "draw scale" specifies the standard deviation of the proposal functions for each variable to be estimated.

255
After the input files are prepared, users need to run all python scripts in order to execute the algorithm. Other than setting up proper file paths to read input files and observed data, and store the results, users only need to specify two values, namely the scale of the likelihood function and the number of draws (length of the chain), to run the algorithm. How to determine the number of draws is universally challenging when working with MCMC methods. It is preferred to have a large number 260 of draws so that measures can be done to reduce autocorrelation in the sampled chain (e.g., leave out samples at the begining of the chain and taking samples along the chain with a fixed interval as the final sample distribution; see Chib and Greenberg, 1995;Andrieu et al., 2003;Kaipio and Somersalo, 2006 for more details), and the results are less likely to be divergent. Here we recommend the number of draws to be ranging from five thousand to one million. This is based on our experiments with the 265 algorithm. Users could check for convergence by running two or more separate runs with identical inputs except for the starting values (the initial value column). If sample distributions from the runs are similar to each other, results from them converge. Otherwise, users need to increase the number of draws, and run the algorithm and check again following the same procedure. See Andrieu et al. (2003) and references within for more information on how to implement the M-H algorithm properly.

270
The primary output from the algorithm is the sampled chain. If 10000 draws are specified, the sampled chain will be a 10000-by-23 (19 variables for the eruptive column plus 4 variables for the wind profile) matrix if a simplified wind profile is adopted. Otherwise, with the non-simplified wind profile scenario, the result will be presented as three separate matrices for the eruptive column (10000-by-19 matrix) and wind speed and direction at each elevation (two 10000-by-40 matrices if the specified to debug and adjust parameters to run the algorithm. The running time of the algorithm depends on the specified number of draws and the running time of TEPHRA2. On an iMac with a 3 GHz Intel Core i5 processor, implementing the algorithm with ten thousand draws takes 394 seconds (∼6.5 minutes).

285
In this section, we generate simulation data using TEPHRA2. The simulated data are treated as field measurements for testing and validation of the algorithm. The specified ESPs and wind conditions to generate these "field measurements" are listed in ESPs and wind conditions used to run TEPHRA2 are listed in Table 1, and are fixed in all following experiments. The simplified wind profile is adopted throughout the 12 experiments. Focusing on just two variables makes it easier for examining and interpreting the results. The 12 experiments are also designed to highlight how the algorithm is affected by its inputs. The first experiment, Experiment # 0, is used as reference for comparison with results from the rest. For the rest experiments, we just 295 modify one input of the algorithm or the observation dataset, and keep the others the same as they are in Experiment # 0. In this way, the impact of each factor on the performance of the algorithm can be isolated and highlighted. Changes made in each experiment are listed in Table 2.  Table 3.
In addition to the 12 experiments, we also test the performance of the algorithm in cases with nonsimplified wind profile. An example of such an experiment is given. In all of our experiments, we find that the resultant sample posterior distributions are symmetric, and resemble Gaussian distributions.
This allows us to use posterior means and standard deviations to characterize the results, and the 310 posterior means also correspond to values with the maximum posterior probability.

Reference experiment
In Experiment # 0, ten observations are sampled at localities far from the source vent downwind (∼ 14 − 30 km from the vent ; sample sites shown in Fig. 1a). Their values range from 50-383 kg/m 2 .
We set the priors as Gaussian distributions for the column height and log-transformed total eruption log-transformed kg (1.88 × 10 11 kg) for the eruption mass. It is noted that 16000 m is slightly greater than the specified column height used to generate the dataset, and the prior mean for the log-scaled eruption mass is identical to the true value ( Table 2).
The results are shown and summarized in Table 3  suggests that the algorithm works. With ten observations and the presented algorithm, the reduced 325 standard deviations and consistency between the posterior means and the true values suggest that our knowledge about the column height and total eruption mass has correctly improved.

Scale of the likelihood function
In Experiment # 1, we increase the scale of the likelihood function from 0.05 to 0.2 (Table 2). A In other words, to the algorithm, new data are collected, but cannot be trusted due to the greater measurement uncertainty. This is also manifested in the acceptance rate of Experiment # 1, which is 84.0%. What the M-H algorithm does here is basically accepting or rejecting points based on the prior distribution, and the role of the likelihood function is largely ignored due to the greater scale of 340 the likelihood function (i.e., greater measurement uncertainty). The posterior distribution of the logtransformed total eruption mass still centers close to the true value with lowered standard deviation compared to its prior, suggesting that estimating total eruption mass is less sensitive to measurement uncertainty.

345
In Experiment # 2, scales of the proposal function are increased from 500 m and 0.05 log-transformed kg to 2000 m and 0.2 log-transformed kg for column height and log-transformed total eruption mass, respectively (Table 3). Results from Experiments # 0 and 2 resemble each other (Table 3)

Prior distributions
Experiments # 3 and 4 are designed to test how prior distributions affect the posterior distribution. the prior needs to be determined with caution.

Number of input observations
Experiments # 5-9 share specifications with Experiments # 0-4, respectively, except that 20 more samples are included in the input. The 30 measurements (see Fig. 1 for sample localities) range from 32-383 kg/m 2 . The results are compared pairwise, and summarized in this section.  (Table 3). Even though, the posterior means are still not consistent with the specified values. It can be seen that more observations "drag" the means of 390 the posterior distributions towards the true values. Results from experiments # 3, 4, 8, and 9 reflect the "wrestling" between prior knowledge and observations.

Sample site locations
In Experiments # 10 and 11, ten measurements at proximal (and closely-spaced) and medial localities are used as input observations, respectively (see sites in Fig. 1a to each other, and as a result, they provide limited useful information to the algorithm to update the priors. In Experiment # 11, the sample sites are spatially medial to the source vent, i.e., neither especially close nor far relative to the entire span of distances (Fig. 1a). The sites are distributed in a more scattered pattern than are those in Experiment # 10. These observations provide more useful 410 information than do the proximal observations, which can be used to update the priors. As shown in Table 3, the posterior means of the column height and log-scaled eruption mass are consistent with the true values, and the posterior distributions are also characterized by smaller standard deviations compared to those in Experiment # 10 and the specified priors.

Experiment with non-simplified wind profile 415
Results from one experiment with non-simplified wind profile are reported in this section. The ESPs and wind conditions to generate the synthetic data are shown in Table 1. Change in wind speed and direction with elevation is taken into account in this experiment. The wind speed is specified to increase from 0 on the ground to a maximum of 70 m/s at 16 km, and then decrease to 10 m/s at 24 km. The wind direction is to the north on the ground, and gradually changes to eastwards with 420 elevation.
We set our ESPs of interest to be column height and log-transformed total eruption mass. For the wind profile, we choose to estimate wind directions and speeds at elevation levels 5, 14, and 18 km a.s.l. We do not choose to estimate wind direction and speed at each elevation (i.e., estimate the complete wind profile) because that would significantly increase the number of variables to be 425 estimated. In such circumstances, the problem tends to be ill-posed, and whether it is solvable or not might become uncertain. Therefore, an experiment that estimates the wind direction and speed at each elevation cannot be used to justify the method works. Neither can it be used to justify that the method does not work. Here, the current experiment is designed to show that the method is able to estimate wind speed and direction at different elevations when the problem is known to be solvable 430 (relatively fewer variables to be estimtated). The problem and difficulty in estimating a non-simplified wind profile is discussed in the following section.
Elevation levels 5, 14, and 18 km are chosen because wind directions at these elevations are 25 • , 65 • and 85 • , respectively. We "collect" tephra mass per unit area at 495 randomly selected sample sites (Fig. 1b). The dataset size is greater than commonly-seen thickness or mass per unit area 435 datasets for tephra fall deposits. Estimating wind speed and direction at a few elevations and having 495 observations are not realistic in studies on tephra fall deposits. However, these are adopted such that we know that the problem is solvable, and can be used to validate the algorithm. In more realistic scenarios, whether the algorithm is able to correctly estimate the ESPs and wind conditions depends on model uncertainty, measurement uncertainty, number of observations, and potentially the  Table   4. They are highly consistent with specified values used to generate the dataset, and the posterior standard deviations are a lot smaller than those of the priors. The results suggest that the algorithm 450 functions when a non-simplified wind profile is adopted. Greater uncertainty is obtained for wind direction and speed at higher elevations (i.e., 14 and 18 km). This is due to the fact that the wind speed at higher elevations is generally greater, and the estimated uncertainty scales with it.
Not applicable 455

Discussion
In this work, we introduce an algorithm coupling the ash dispersal model TEPHRA2 with the Metropolis-Hastings implementation of MCMC. We validated it with synthetic data generated by TEPHRA2. By varying the inputs to the algorithm and observation datasets one at a time, we examine and explain how they affect the performance and efficiency of the algorithm. Thirteen experiments 460 are done. The first twelve of them focus on simple scenarios (i.e., with simplified wind profile) with two variables of interest (i.e., total eruption mass and column height) given ten to thirty observations.
In these experiments, the algorithm is shown to work well, and has the ability to quantify the uncertainty in the estimate. In the thirteenth experiment, we set out to estimate two ESPs, and wind directions and speeds at three elevation levels, with sufficient observations. In this experiment, the 465 wind direction is set to vary with elevation. The results again suggest that the algorithm could work well under such more complex scenarios. As the goal of this work is to present this algorithm and introduce and explain how it works, we avoid using real thickness or mass per unit area datasets of tephra fall deposits in testing the algorithm. This would avoid additional sources of uncertainty from affecting the results, and is considered as a stricter measure for testing and validating a method.

470
Our discussion here focuses on advantages and limitations of the algorithm, interpreting the posterior correlation between column height and total eruption mass in our experiments, and whether we should attempt to estimate wind direction and speed at each elevation in tephra inversion when working with TEPHRA2. The algorithm is then applied to the mass per unit area dataset of the 2011 Kirishima-Shinmoedake tephra deposit to infer the corresponding ESPs. Practically speaking, prior knowledge is being used consistently throughout the implementation of 485 the algorithm. That is, the prior probability helps determine whether to accept or reject a proposed point in each draw of the algorithm. However, for a non-probabilistic inversion method, such as gradient methods, prior knowledge might be used only once in the inversion process-it helps determine the starting point.
In tephra deposit inversion, uncertainty in the estimated ESPs comes from the interplay of multiple 490 sources, which include the uncertainty in the prior, measurement uncertainty, and potential model uncertainty. How they interact with each other and affect the final uncertainty has never been studied systematically for tephra inversion. The algorithm might be able to help address this concern in future work.

495
Whether the priors, number of draws, and standard deviations of the proposal functions are specified properly or not affects the performance of the algorithm. This problem arises as long as the M-H algorithm is adopted, and there is no definitively correct way to determine some of such inputs to the algorithm. What we attempt to do, in this work, is to explain and highlight why and how all the inputs affect the performance of the algorithm through experiments, and present commonly adopted 500 measures and references (Chib and Greenberg, 1995;Andrieu et al., 2003;Kaipio and Somersalo, 2006) on how to use the algorithm properly and how to check if the results converge. We hope that introducing the algorithm in this way helps users in understanding and implementing.

Correlation in the posterior distribution
From examining the sampled posterior distributions, we find that correlation exists between column 505 height and total eruption mass. Here we focus on the correlations in Experiments # 0-11, as their setups are simpler. Correlations in these experiments are shown in Table 3, and the bivariate posterior distributions from selected experiments are shown in Fig. 3.
We find that both negative and positive correlations exist between column height and total eruption mass in the sampled posterior distributions, and whether the correlation is positive or not depends 510 on sample site localities. In Experiments # 10 and 11, the correlation is positive (0.986 and 0.831), and for the rest, the correlation (-0.474--0.856) is negative. This can be explained by the physics of tephra transport. Considering all other ESPs fixed except for column height and total eruption mass, if observations are made at distal localities (Experiments # 0-9), the combination of a greater column height, which allows tephra to be dispersed farther downwind and leads to more tephra deposition at 515 distal sites, and a smaller total eruption mass (eruption mass is proportional to tephra thicknss/mass per unit area everywhere) leads to results similar to those for a lower column height and a greater total eruption mass (within the area where footprints of tephra deposits overlap). Therefore, the correlation is negative in Experiments # 0-9. However, a lower column height leads to a thicker tephra deposit at proximal sites because of less interaction with wind, and less time for turbulent diffusion to disperse 520 tephra to distal localities. Total eruption mass is always proportional to tephra thickness and mass per unit area, i.e., scenarios with (1) greater column height and greater total eruption mass and (2) lower column height and lower total eruption mass lead to similar observed thickness or mass per unit area if the observations are all made at proximal sites. These relationships are well-known (e.g., Suzuki et al., 1983;, but how their interaction with sample site locations leads 525 to and affects the correlation between the estimated column height and total eruption mass in tephra inversion has not been reported even in the simplest scenarios, i.e., our experiments with synthetic data and all other ESPs known. It is worth noting that the correlation in Experiment # 10 is surprisingly high (0.986). This is due to the fact that the sample sites are too close to each other, and the "observed" tephra mass per unit 530 area values at these sites also have similar values. To the algorithm, the ten observations in Experiment # 10 are almost the same (both their locations and their observed values). The high correlation in fact reveals that non-unique solutions exist in Experiment # 10. The presence of correlation in the posterior distribution suggests that the interaction of variables plays a role in tephra inversion. This is consistent with results from sensitivity analysis on TEPHRA2 in the work of Scollo et al. (2008).

535
This finding suggests that the algorithm has the potential to be used to discover intrinsic relationships (interactions) between variables of interest and wind conditions in TEPHRA2 and other dispersion models, and could thus improve our understanding on different sources of uncertainty in tephra inversion.

Whether to estimate the wind direction and speed at each elevation 540
The algorithm can be used with either simplified or non-simplified wind profiles. It is always desired and preferred to have a more exact and detailed understanding on wind conditions in tephra inversion.
At the same time, we also should not ignore its corresponding practicality. It is common to estimate six or more ESPs (e.g., column height, total eruption mass, column mass distribution, mean and standard deviation of grain size distribution, and diffusion coefficient) in tephra inversion. If wind direction 545 and speed are to be estimated at ten elevations, this adds up to at least 26 = 6 + 10 × 2 variables of interest. Considering just two values for each variable, this means 2 26 (which is greater than 60 million) possible combinations for the 26 variables. The number of field measurements for tephra fall deposits rarely exceeds 300. In such circumstances, together with other sources of uncertainties, estimating the ESPs and the complete non-simplified wind profile would be challenging from a technical and non-550 volcanological perspective regardless of which inversion method is adopted. This concern motivates us to propose the two options to specify and estimate the wind profile. Estimating a simplified wind profile conforms to the Ocham's razor.
However, as shown in previous works, estimating wind direction and speed at each elevation together with the ESPs can be done (e.g., White et al., 2017). Prior knowledge on local weather 555 conditions and expertise on the studied tephra deposits (e.g., constructed isomass and isopach maps) provide additional constraints on the ESPs and wind conditions, which help make the problem solvable.
A question naturally arises, that is, under what circumstances (e.g., spatial distribution and number of sample sites, observed values of tephra thickness, and prior knowledge on the wind condition), it is possible to estimate the ESPs and the non-simplified wind profile. This question will be crucial 560 for the efficiency of tephra inversion and interpretability of its results. It involves factors such as measurement uncertainty (i.e., whether a few outlier observations are due to the complex wind profile or measurement error) and how much we trust the prior knowledge. This question cannot be answered without a probabilistic framework, and we think that the presented algorithm might be key to this question. The question of whether it would be practical and meaningful to estimate the ESPs and 565 a non-simplified wind profile is comparable to the problem of detecting multiple lobes in tephra thickness distribution (Kawabata et al., 2013(Kawabata et al., , 2015, which is solved by coupling a semi-empirical model of tephra thickness distribution with statistical methods (Kawabata et al., 2013(Kawabata et al., , 2015. In our case, however, the difference is that the forword model TEPHRA2 is characterized by a lot more variables of interest.

Application to the Kirishima-Shinmoedake dataset
In this section, we apply the M-H algorithm to a dataset containing mass of tephra per unit area for the 2011 Kirishima eruption. The Kirishima-Shinmoedake event took place from 26 to 29 January, 2011, with an eruption of the Shinmoedake volcano. The column height ranged from 6.2-8.6 km above the crater, based on different models and Doppler radar measurements 575 Maeno et al., 2014). The total eruption mass was estimated to be 1.8 − 3.1 × 10 10 kg by Nakada et al. (2013). The mass of tephra erupted from the afternoon of 26 January to the early morning of 27 January, which corresponds to the current dataset, is about 1.4-2.5 × 10 10 kg .
The wind was blowing to the southeast, and the wind profile is reported in Hashimoto et al. (2012).
The tephra deposit data we are using are reported in Miyabuchi et al. (2013), and downloaded 580 from White et al. (2017). The dataset was collected from the main tephra fallout deposit (emplaced from the evening of 26 January to the morning of 27 January). Tephra thickness and grain size distributions were measured at 55 locations downwind from the vent. In addition, tephra thickness was measured at another 63 locations. The thickness measurements were converted to mass per unit area. See Miyabuchi et al. (2013) for sample processing procedures. 585 We set the ESPs to be estimated to be column height, eruption mass, α/β ratio (which characterizes the mass distribution of tephra within the eruptive column), median and standard deviation of the grain size distribution, diffusion coefficient, fall time threshold, and densities of lithic (stony) and pumice fragments. The eddy constant is fixed as 0.04. For the wind condition, a simplified wind profile is adopted, to avoid overcomplication of the problem, as discussed before. We assume that the 590 wind speed increases linearly from 0 to 11 km a.s.l., and then decreases with elevation to 24 km a.s.l.
This setup is based on the wind speed profile reported in (Hashimoto et al., 2012).  Table 5.
We draw fifty thousand points; this process is repeated three times. Sample distributions from the three runs are almost identical to each other, suggesting that the results converge. The first 5000 samples are discarded to exclude the impact from the initial starting point. For the rest of the chain, we collect samples based on a 15-point interval to avoid autocorrelation.

600
The results are summarized in Table 5. Posterior means of column height and total eruption mass are in general consistent with previous studies, and are updated from the priors with posterior standard deviations being much smaller. The simulated mass per unit area data from TEPHRA2 with posterior means as ESPs and wind conditions are plotted against field observations in Fig. 4, which suggests that TEPHRA2 could generally reproduce field observations based on estimated results from 605 the algorithm. Posterior means of column height and total eruption mass are 7.3 km and 9.14 * 10 9 kg, respectively. The former is within the range of the observed column height, and the latter is slighly smaller than estimates from previous work. Posterior distributions of the other ESPs generally lie within commonly-seen ranges (Table 5), and are also altered from the corresponding priors. We note that the posterior mean of the median of the grain size distribution is finer than data reported 610 by Miyabuchi et al. (2013). We think that this could be explained by the fact that data reported by Miyabuchi et al. (2013) represent the grain size distribution at certain sample sites, whereas our estimate focuses on the total grain size distribution.
This dataset has been used in White et al. (2017) for testing another tephra inversion method (i.e., Levenburg-Marquardt algorithm with Tikhonov and subspace regularization). We do not intend 615 to compare our results with theirs for several reasons: (1) the goal of their method is to efficiently implement tephra inversion with the ability to quantify the uncertainty in the estimate, and additional simplifications are assumed in the method; (2) different priors are assumed, and their work estimates 10 ESPs and wind directions and speeds at 11 elevation levels. We think that the two algorithms have their own advantages, and which algorithm to use depends on users' need. We introduce the model TEPHRA2 and basic elements of Bayes' rule, and present intuitive interpretations on the M-H algorithm. How to implement the coupled algorithm and formats of the input files are also introduced.

630
The coupled algorithm is validated with 13 experiments. For the first twelve of them, we focus on two variables of interest. In these experiments, we vary values of the input and size of the synthetic observation dataset one at a time to show that the algorithm works, and also to show how inputs affect the performance of the algorithm. In the 13th experiment, we set eight variables of interest to be estimated, including not only two ESPs, but also wind directions and speeds at three elevation 635 levels. This experiment is set up to show that the algorithm works with a complex wind profile. We do not use real datasets for validating the algorithm, because inversion results based on real datasets could be affected by additional sources of uncertainty (e.g., measurement uncertainty).
The advantages of the algorithm are that it has the ability to incorporate prior knowledge into the estimate in a statistically formal way, and to quantify the uncertainty in the estimate, and it captures 640 correlation between variables of interest in the estimate. Because of these advantages, we think that the algorithm has the potential to improve our understanding on how different sources of uncertainty interact and affect the results in tephra inversion. The main limitation of the algorithm is that there are subjective choices in implementation, which affect its performance. How and why they affect the algorithm are introduced, and commonly adopted measures and references (Chib and Greenberg, 645 1995;Andrieu et al., 2003;Kaipio and Somersalo, 2006) on how to specify the inputs properly are given.
In the 12 experiments with simplified wind profiles, it is found that correlation exists between the two variables of interest, namely column height and total eruption mass, in the sample posterior distributions. Whether the correlation is positive or not depends on sample site locations. These 650 could be well explained by the physics of tephra transport: a greater column height has a positive and negative relationship with tephra mass per unit area at distal and proximal sample sites, respectively, and total eruption mass is always proportional to tephra mass per unit area regardless of sample site location.
The algorithm supports specifying and estimating the wind profile in two ways. The first way takes 655 advantage of a simplified wind profile based on four variables of interest, and assumes that the wind direction does not change with elevation. The second one allows users to estimate wind speed and direction at each elevation. We argue that users need to be cautious in choosing how to specify and estimate the wind profile, because the second way could introduce more variables to be estimated, and potentially make the problem extremely ill-posed. How to choose the appropriate way to specify and 660 estimate the wind profile relies on factors such as prior knowledge of weather conditions and sample site distributions. We think that by experimenting on appropriate synthetic data, this question can be addressed. We apply the algorithm to the 2011 Kirishima-Shinmoedake tephra dataset, and the results are in general consistent with observations and estimates from previous work. We hope that the present work benefits future studies that attempt to implement tephra inversion and quantify the 665 associated uncertainty.

Declarations
Availability of data and materials The data used in this work are mostly generated from sythensized experiments. How to generate these data is specified in the text. The dataset of the tephra deposit from the 2011 Kirishima-Shinmoedake eruption used in this work is from previous publications. Where to find the dataset and corresponding 675 references are given in the text.

Competing interests
There is no competing interest involved in this work.     total eruption mass, and median and standard deviation of grain size distribution are referenced and inferred from Nakada et al., 2013;Miyabuchi et al., 2013;Maeno et al., 2014;White et al., 2017). Priors of α/β ratio, diffusion coefficient, fall time threshold, and pumice and lithic densities are specified as commonly adopted ranges or maximum ranges possible.