Those of us who have spent any time in microseismic will be familiar with questions like “… but what is the accuracy of your locations?” or “… what are the errors on that?”, and it is good that we are: I believe if you are going to provide a result you should also provide an estimate of its error or accuracy. However, methods for estimating error vary, so I am going to give my two cents on the matter. In the process, I’ll give an overview of a more general, and I think more useful, approach for describing location errors as contours or iso-surfaces within which there is a given confidence that the location lies. Whilst the example I discuss here is directed towards seismic event location, the process is equally applicable to things like diffraction and drill-bit location, and more generally the procedure can be adapted to almost any kind of inverse problem.
What follows is a high-level overview of some of the issues with the traditional methods for computing and reporting errors, and of how confidence surfaces work. In the interest of a) keeping things shorter and b) not getting tied up in statistical jargon, I will gloss over a lot of the mathematical details along the way. So hopefully, even if you are not an expert on seismic source location methods, there is still something here for you too.
Motivation
Those familiar with seismic event locations and in particular microseismic analysis know that location errors are often reported as:
- Single values for entire sets of events (e.g. X meters vertically and Y meters horizontally)
- Error bars for each event
- An error ellipsoid via the orientations and magnitudes of its principal axes
Whilst some of these approaches are definitely better than others (one-size-fits-all approaches to errors are particularly bad in my opinion), they all share the problem that they make some kind of assumption about the shape of the error surface, which is rarely, if ever, valid. What’s more, the problem is exacerbated by different measures being reported for different datasets: one dataset might report two-sigma error bars while another reports one-sigma. Assuming the location error is normally distributed, this is the difference between being 95% certain and 68% certain that your location lies within a particular interval.
So is there a better way? Well, yes there is.
Location is an inherently 3D problem, so the way we describe location errors should also be 3D. What we want is to draw a surface (or, in map view, a contour) around a region and be able to say “I am X% certain the location is inside here”. This surface could be any shape and might not even be spatially contiguous (hint: it will rarely, if ever, be an ellipse). It must also be applicable to any type of location scenario (downhole, surface, … teleseismic) and any kind of location metric (travel-time fitting, stacking, etc.), so that errors can be compared between datasets.
Confidence Volumes
The Misfit/Cost Function
Now it turns out location procedures provide us with the ideal starting point, since they all involve either the minimization of a misfit/cost function or the maximization of a coherency or stack function. In this example, I am going to use a misfit-function-based location algorithm, although to work with a stack function the sense of the test is simply reversed.
A misfit function is just a measure of how well your model fits your data, and for source location it could be something as simple as the sum of the squared differences between observed and predicted arrival times (i.e. the sum of the squared residuals). Here, I am using my preferred option, which is the sum of squares over pairs of residuals between different sensors, since this negates the effect of the unknown origin time on the result. The following figure shows a map view at source depth through such a misfit function.
The white star marks the true source position and the black line shows the trend of a well path. This example is from a synthetic experiment, and for our purposes here details such as the acquisition geometry and depths are not particularly important (although they do influence the result). What is important is that the colour background and contours show the levels of the misfit function. Points with low misfit values are “good” solutions for the source location (in that they fit the data), and by extension points with equal misfit values are equally good potential locations for the source. As you can see, there are several areas that provide good source locations, spread over a broad region trending perpendicular to the well; this is a function of the acquisition geometry and the nature of the location experiment.
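To make the pairwise misfit concrete, here is a minimal sketch in Python. Everything in it (the homogeneous velocity model, the random sensor layout, the pick times) is an illustrative assumption, not the setup behind the figures:

```python
import numpy as np

# --- Illustrative setup: homogeneous velocity model, random sensors ---
rng = np.random.default_rng(0)
sensors = rng.uniform(-500.0, 500.0, size=(8, 3))   # 8 sensor positions (m)
true_src = np.array([100.0, -50.0, 0.0])            # true source position (m)
v = 3000.0                                          # P-wave velocity (m/s)
t0 = 1.234                                          # unknown origin time (s)

def travel_time(src, sensors, v):
    """Straight-ray travel times from src to each sensor."""
    return np.linalg.norm(sensors - src, axis=1) / v

observed = t0 + travel_time(true_src, sensors, v)   # arrivals include t0

def pairwise_misfit(src, obs):
    """Sum of squared differences between pairs of sensor residuals.

    Differencing residuals between sensor pairs cancels the unknown
    origin time, which enters every arrival identically.
    """
    r = obs - travel_time(src, sensors, v)
    i, j = np.triu_indices(len(r), k=1)             # all sensor pairs
    return np.sum((r[i] - r[j]) ** 2)

# Evaluate the misfit on a horizontal grid at source depth (the map view)
xs = np.linspace(-300.0, 300.0, 61)
misfit_map = np.array([[pairwise_misfit(np.array([x, y, 0.0]), observed)
                        for x in xs] for y in xs])
```

Note that because the origin time cancels inside `pairwise_misfit`, we never need to solve for it, which is exactly why I prefer this form of the statistic.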
The Misfit Function PDF
Now, the misfit function already tells us quite a bit about location accuracy; however, it is not very quantitative. Nonetheless, we can use it as the basis for a confidence function via a PDF of the misfit statistic. The following plot shows the PDF estimated for the misfit statistic above.
The misfit function PDF will vary from case to case, and should describe the likelihood of obtaining a particular value of the misfit function from a given dataset, taking into account things like:
- variability (or for synthetics expected variability) of the data
- correlated errors and variability between sensors
- nonlinear properties of the misfit statistic
- quality of the underlying data model
- prior knowledge of the system
Whilst that all seems rather vague, there are techniques for estimating the misfit function PDF, the simplest of which is Monte Carlo simulation, as shown in the sketch below. Alternatively, for simple misfit statistics, it may be possible to derive the misfit function PDF directly (e.g. as a chi-squared distribution).
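Continuing the toy setup from the earlier sketch, the Monte Carlo approach is simply: perturb the data with its expected variability many times and record the misfit each realization produces. The independent Gaussian pick noise and its size are assumptions for illustration; correlated errors would instead be drawn from a full covariance matrix:

```python
# Monte Carlo estimate of the misfit PDF (reuses pairwise_misfit, observed
# and true_src from the earlier sketch).
sigma_t = 2e-3            # assumed 2 ms pick uncertainty -- illustrative only
rng = np.random.default_rng(1)

samples = np.array([
    pairwise_misfit(true_src,
                    observed + rng.normal(0.0, sigma_t, observed.shape))
    for _ in range(5000)
])

# Empirical PDF of the misfit statistic (the kind of curve plotted above)
pdf, bin_edges = np.histogram(samples, bins=100, density=True)
```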
The Confidence Function
We now have a misfit function and a PDF that describes the likelihood of obtaining particular values of that misfit function. Putting these together, we can convert misfit values into confidence values, which describe the likelihood of obtaining a misfit lower than the value at each position. In other words, if a point has a confidence value of X%, then there is an X% chance another point could have a lower misfit and hence provide a better source location. So a contour (or iso-surface) at the X% value of the confidence function encloses the region where there is an X% likelihood of the event residing.
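In the running sketch, this conversion is just the empirical CDF of the Monte Carlo misfit samples applied to the misfit map. This is one simple way to do it, not necessarily the exact recipe behind the figures:

```python
# Empirical CDF of the Monte Carlo misfit samples: P(misfit < m)
sorted_samples = np.sort(samples)

def confidence(misfit_values):
    """Fraction of Monte Carlo misfit samples below each misfit value."""
    return np.searchsorted(sorted_samples, misfit_values) / len(sorted_samples)

conf_map = confidence(misfit_map)     # confidence function on the grid
inside_95 = conf_map <= 0.95          # voxels inside the 95% contour
```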
For our example, the computed confidence function looks like this:
As before, the white star and black line mark the true location and the well path respectively, but the colour background now shows the confidence function value. The grey contour lies at the 0.95 confidence value, which means there is a 95% chance our source is located within this area.
Comparing the confidence function map with the misfit function map, the former shows a more precisely defined region for the source (i.e. it has steeper sides). This is because, in this case, the misfit function PDF spans a narrow range of values relative to the variation of the misfit function across the region of interest. In other words, although the misfit varies a lot over the region of interest, we know its value quite well, and this is reflected in the steep sides of the location area. So, in this case, the event location is quite well resolved parallel to the well direction. Perpendicular to the well the resolution is much worse (the confidence surface is elongated in this direction), but that is fine: I deliberately chose this example because it demonstrates a large, non-regular error surface.
The Location PDF
Once we have a confidence volume, it is straightforward to compute the corresponding PDF for the event location, giving the likelihood that the event is located in each particular voxel.
As with the confidence function, the PDF is characterized by a broad area of roughly equal values elongated in the direction perpendicular to the well, indicating the poor spatial resolution of the location in that direction, which, as mentioned above, is a function of the geometry of the experiment.
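One common recipe for this last step, again only a sketch under the same Gaussian-noise assumption (and not necessarily the exact method used here), is to treat the exponentiated misfit as an unnormalized likelihood and normalize it over the grid:

```python
# Location PDF from the misfit map. Each residual pair has variance
# 2 * sigma_t**2 under independent Gaussian pick noise (correlations
# between pairs sharing a sensor are ignored in this simple sketch).
sigma_pair = np.sqrt(2.0) * sigma_t
likelihood = np.exp(-misfit_map / (2.0 * sigma_pair ** 2))
location_pdf = likelihood / likelihood.sum()    # voxel probabilities sum to 1
```

Summing `location_pdf` over the voxels inside any closed surface then gives the confidence that the event lies within it, which ties the PDF back to the confidence volumes above.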
Summary
So that’s it for this post, to sum up:
- Many current methods for reporting location error fall short of the mark, in that:
  - they rely on an assumption about the shape of the error surface (ellipsoid, etc.)
  - they are open to misinterpretation
  - they fail to capture that location is a 3D problem
- Confidence surfaces provide a solution to these issues:
  - they can capture an error surface of any shape, which does not even need to be spatially contiguous
  - they can be applied to any location scenario (downhole, surface, ISM, teleseismic, …)
  - they can be applied to any misfit or maximization metric
In short, confidence surfaces, and by extension location PDFs, provide a very general and hence very powerful description of our knowledge about a particular location. As such, they are a valuable tool for evaluating location accuracy and array design. Beyond location analysis, there is plenty more you can do with confidence surfaces, and this is something I will hopefully come back to in the future.
Stay tuned for further updates…