Application of confidential intervals for verification of reservoir model at interpretation of well test data

S. V. Denisov1 , V. E. Lyalin2 , I. M. Grigorev3 , R. O. Sultanov4

1Ufa State Petroleum Technical University, Ufa, Russia

2, 3, 4Kalashnikov Izhevsk State Technical University, Izhevsk, Russia

1Corresponding author

Vibroengineering PROCEDIA, Vol. 15, 2017, p. 150-156. https://doi.org/10.21595/vp.2017.19427
Received 8 November 2017; accepted 17 November 2017; published 1 December 2017

Copyright © 2017 JVE International Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract.

From the standpoint of Bayesian inference, information about the parameters of an oil reservoir before a well test is expressed through a uniform probability distribution in the parameter space. This article proposes the use of confidence intervals for a quantitative assessment of the information obtained from the analysis of well test results, which is used to update the probability distributions. The use of confidence intervals for assessing the correctness of the choice of reservoir model is demonstrated.

Keywords: well test, confidence intervals, reservoir model, estimation of reservoir parameters.

1. Introduction

If there is no information about the formation parameters before well testing, then, from the point of view of Bayesian inference, this state of knowledge is expressed through a uniform probability distribution in the parameter space. The well test data contain the necessary information about the parameters, and the goal of analyzing the well test results is to extract this information in order to update the probability distributions in the parameter space. Confidence intervals can give a quantitative estimate of the information obtained [1].

Direct application of confidence intervals to well test results requires two conditions. First, the errors, defined as the difference between the measured pressure value and its true value, must be independent and normally distributed about the true pressure response. Second, in a region of the parameter space sufficiently close to the parameter estimates, the function describing the model can be approximated by a linear form by expanding it in a first-order Taylor series.

When these conditions are met, the updated probability distribution of the unknown parameters is a multivariate normal distribution in the parameter space. A multivariate normal distribution is completely characterized by only two quantities: the mean vector and the covariance matrix. In nonlinear regression analysis, the mean vector consists of the parameter estimates, and the covariance matrix is calculated from the inverse Hessian matrix of the objective function evaluated at the final values of the estimates [2, 3].

2. The model of the system

The condition that the function describing the model can be approximated by expanding it in a Taylor series of the first order leads to the following expression:

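$$F(\theta) = F(\hat{\theta}) + \sum_{j=1}^{m} \left.\frac{\partial F}{\partial \theta_j}\right|_{\theta=\hat{\theta}} \left(\theta_j - \hat{\theta}_j\right). \tag{1}$$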
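As a rough illustration of the linearization condition in Eq. (1), the sketch below compares a nonlinear response with its first-order Taylor expansion near and far from the estimate. The two-parameter function `model`, its gradient and all numerical values are hypothetical stand-ins chosen for brevity; they are not the reservoir model used in this article.

```python
import math

def model(theta, x):
    # Hypothetical two-parameter response, NOT the reservoir model of the article.
    a, b = theta
    return a * (1.0 - math.exp(-b * x))

def gradient(theta, x):
    # Analytical partial derivatives of the hypothetical model with respect to a and b.
    a, b = theta
    return (1.0 - math.exp(-b * x), a * x * math.exp(-b * x))

def linearized(theta, theta_hat, x):
    # First-order Taylor expansion about theta_hat, in the form of Eq. (1).
    g = gradient(theta_hat, x)
    return model(theta_hat, x) + sum(
        gj * (tj - thj) for gj, tj, thj in zip(g, theta, theta_hat)
    )

theta_hat = (2.0, 0.5)   # assumed parameter estimate
x = 1.5                  # assumed value of the independent variable
near = (2.02, 0.505)     # 1 % away from the estimate
far = (3.0, 1.0)         # far from the estimate

err_near = abs(model(near, x) - linearized(near, theta_hat, x))
err_far = abs(model(far, x) - linearized(far, theta_hat, x))
print(f"linearization error near: {err_near:.2e}, far: {err_far:.2e}")
```

The error of the linear form is small only close to the estimate, which is why the condition is stated for a neighbourhood of the estimates rather than for the whole parameter space.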

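It is assumed that the observed pressure readings $y_i$ are normally distributed about the true value $F(\theta, x_i)$ with known variance $\sigma^2$:

$$\mathrm{Prob}\left(y_i \mid F(\theta, x_i)\right) = \mathrm{Prob}\left(y_i \mid \theta\right) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{1}{2\sigma^2}\left[y_i - F(\theta, x_i)\right]^2\right\}. \tag{2}$$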

As a result of observations of pressure values, the likelihood function for the parameters has the form:

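$$\begin{aligned}
L(\theta \mid y_1, \ldots, y_n) &= \mathrm{Prob}(y_1, \ldots, y_n \mid \theta) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{1}{2\sigma^2}\left[y_i - F(\theta, x_i)\right]^2\right\} \\
&= \frac{1}{\left(\sqrt{2\pi}\,\sigma\right)^{n}} \exp\left\{-\frac{1}{2\sigma^2}\left[R - J(\theta - \hat{\theta})\right]^T \left[R - J(\theta - \hat{\theta})\right]\right\},
\end{aligned} \tag{3}$$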

where:

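$$R = \begin{bmatrix} y_1 - F(\hat{\theta}, x_1) \\ \vdots \\ y_n - F(\hat{\theta}, x_n) \end{bmatrix}, \qquad J = \left.\begin{bmatrix} \left.\dfrac{\partial F}{\partial \theta_1}\right|_{x_1} & \cdots & \left.\dfrac{\partial F}{\partial \theta_m}\right|_{x_1} \\ \vdots & \ddots & \vdots \\ \left.\dfrac{\partial F}{\partial \theta_1}\right|_{x_n} & \cdots & \left.\dfrac{\partial F}{\partial \theta_m}\right|_{x_n} \end{bmatrix}\right|_{\theta=\hat{\theta}}. \tag{4}$$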

The least-squares estimate corresponds to the maximum of the likelihood function, which is attained if and only if:

$$R^T J = 0. \tag{5}$$

As a result:

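$$L(\theta \mid y_1, \ldots, y_n) = \frac{1}{\left(\sqrt{2\pi}\,\sigma\right)^{n}} \exp\left\{-\frac{1}{2\sigma^2}\left[R^T R + (\theta - \hat{\theta})^T J^T J\, (\theta - \hat{\theta})\right]\right\}. \tag{6}$$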
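The step from Eq. (3) to Eq. (6) can be spelled out as follows. Writing $\Delta = \theta - \hat{\theta}$ (a shorthand introduced only for this sketch), the quadratic form in the exponent of Eq. (3) is expanded and the cross terms vanish by condition (5):

```latex
\begin{aligned}
(R - J\Delta)^T (R - J\Delta)
  &= R^T R - R^T J \Delta - \Delta^T J^T R + \Delta^T J^T J \Delta \\
  &= R^T R + \Delta^T J^T J \Delta ,
  \qquad \text{since } R^T J = 0 \text{ and therefore } J^T R = (R^T J)^T = 0 .
\end{aligned}
```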

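The Hessian matrix in the Gauss-Newton method, divided by 2, is defined as [2]:

$$H = \left.\begin{bmatrix} \displaystyle\sum_{i=1}^{n} \left.\frac{\partial F}{\partial \theta_1}\frac{\partial F}{\partial \theta_1}\right|_{x_i} & \cdots & \displaystyle\sum_{i=1}^{n} \left.\frac{\partial F}{\partial \theta_1}\frac{\partial F}{\partial \theta_m}\right|_{x_i} \\ \vdots & \ddots & \vdots \\ \displaystyle\sum_{i=1}^{n} \left.\frac{\partial F}{\partial \theta_m}\frac{\partial F}{\partial \theta_1}\right|_{x_i} & \cdots & \displaystyle\sum_{i=1}^{n} \left.\frac{\partial F}{\partial \theta_m}\frac{\partial F}{\partial \theta_m}\right|_{x_i} \end{bmatrix}\right|_{\theta=\hat{\theta}}. \tag{7}$$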

Then:

$$J^T J = H. \tag{8}$$
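A minimal sketch of Eqs. (4), (5) and (8) for a deliberately simple case is given below. It assumes a hypothetical model that is linear in its parameters, $F(\theta, x) = \theta_1 + \theta_2 \ln x$, so that the least-squares estimate has a closed form; the data values are invented for illustration and do not come from the article.

```python
import math

# Hypothetical observations (time, pressure); not data from the article.
x = [0.1, 0.3, 1.0, 3.0, 10.0, 30.0]
y = [19.52, 19.41, 19.29, 19.19, 19.06, 18.96]

# Jacobian of the assumed model F = theta1 + theta2 * ln(x): one row per data point.
J = [[1.0, math.log(xi)] for xi in x]

def jtj(jac):
    # H = J^T J, Eq. (8).
    m = len(jac[0])
    return [[sum(row[a] * row[b] for row in jac) for b in range(m)] for a in range(m)]

H = jtj(J)

# Solve the 2x2 normal equations H * theta = J^T y in closed form.
g = [sum(row[a] * yi for row, yi in zip(J, y)) for a in range(2)]
det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
theta_hat = [
    (H[1][1] * g[0] - H[0][1] * g[1]) / det,
    (H[0][0] * g[1] - H[1][0] * g[0]) / det,
]

# Residual vector R of Eq. (4) and the optimality condition R^T J = 0 of Eq. (5).
R = [yi - (theta_hat[0] + theta_hat[1] * math.log(xi)) for xi, yi in zip(x, y)]
RtJ = [sum(r * row[a] for r, row in zip(R, J)) for a in range(2)]
print("theta_hat =", theta_hat)
print("R^T J =", RtJ)  # zero up to rounding error
```

For a genuinely nonlinear reservoir model the Jacobian would be re-evaluated at each iterate and the condition would hold only at convergence.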

If a locally uniform (non-informative) prior probability distribution is used for the parameters, then by Bayes' theorem the posterior probability distribution of the parameters after $n$ observations is [1, 4]:

$$\mathrm{Prob}(\theta \mid y_1, \ldots, y_n) = \frac{L(\theta \mid y_1, \ldots, y_n)\,\mathrm{Prob}(\theta)}{\displaystyle\int_{\mathbb{R}^m} L(\theta \mid y_1, \ldots, y_n)\,\mathrm{Prob}(\theta)\, d\theta} = \frac{L(\theta \mid y_1, \ldots, y_n)}{\displaystyle\int_{\mathbb{R}^m} L(\theta \mid y_1, \ldots, y_n)\, d\theta}, \tag{9}$$

where $\mathrm{Prob}(\theta)$ is a locally uniform prior probability distribution.

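By the definition of the multivariate normal distribution:

$$\int_{\mathbb{R}^m} \frac{|H|^{1/2}}{\left(2\pi\sigma^2\right)^{m/2}} \exp\left\{-\frac{1}{2\sigma^2} (\theta - \hat{\theta})^T H\, (\theta - \hat{\theta})\right\} d\theta = 1. \tag{10}$$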

Therefore, Eq. (9) takes the form:

$$\mathrm{Prob}(\theta \mid y_1, \ldots, y_n) = \frac{|H|^{1/2}}{\left(2\pi\sigma^2\right)^{m/2}} \exp\left\{-\frac{1}{2\sigma^2} (\theta - \hat{\theta})^T H\, (\theta - \hat{\theta})\right\}. \tag{11}$$

That is, the parameters $\theta$ follow a multivariate normal distribution centered at $\hat{\theta}$ with the covariance matrix $\sigma^2 H^{-1}$. Eq. (11) quantifies the uncertainty associated with the parameter estimates.
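The normalization stated in Eq. (10) can be checked numerically for a small case. The sketch below evaluates the density of Eq. (11) for two parameters on a grid and sums it with the midpoint rule; the matrix `H`, the variance `sigma2` and the estimate `theta_hat` are arbitrary illustrative values, not results from the article.

```python
import math

def posterior_density(theta, theta_hat, H, sigma2):
    # Density of Eq. (11) for m = 2 parameters; H is assumed symmetric positive definite.
    d0 = theta[0] - theta_hat[0]
    d1 = theta[1] - theta_hat[1]
    q = H[0][0] * d0 * d0 + 2.0 * H[0][1] * d0 * d1 + H[1][1] * d1 * d1
    det_h = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    m = 2
    norm = math.sqrt(det_h) / (2.0 * math.pi * sigma2) ** (m / 2)
    return norm * math.exp(-q / (2.0 * sigma2))

# Illustrative values only.
H = [[4.0, 1.0], [1.0, 3.0]]
sigma2 = 0.25
theta_hat = (1.0, 2.0)

# Midpoint-rule integration over a square that extends many standard deviations.
step, half_width = 0.025, 2.5
n = int(round(2 * half_width / step))
total = 0.0
for i in range(n):
    t0 = theta_hat[0] - half_width + (i + 0.5) * step
    for j in range(n):
        t1 = theta_hat[1] - half_width + (j + 0.5) * step
        total += posterior_density((t0, t1), theta_hat, H, sigma2) * step * step
print(f"integral of the density = {total:.8f}")
```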

When the variance $\sigma^2$ is unknown, the above reasoning requires a slight refinement. $\sigma^2$ can be estimated by the mean square error $s^2$, which is calculated as:

$$s^2 = \frac{SS_{err}}{n - m}, \tag{12}$$

where:

$$SS_{err} = \sum_{i=1}^{n} \left[y_i - F(\hat{\theta}, x_i)\right]^2. \tag{13}$$
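Eqs. (12) and (13) translate directly into a few lines of code. The function name `mean_square_error` and the sample numbers below are illustrative assumptions; `predicted` stands for the model values at the final parameter estimates.

```python
def mean_square_error(observed, predicted, n_params):
    # s^2 = SS_err / (n - m), Eqs. (12) and (13).
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    n = len(observed)
    if n <= n_params:
        raise ValueError("need more data points than parameters")
    ss_err = sum((y - f) ** 2 for y, f in zip(observed, predicted))
    return ss_err / (n - n_params)

# Made-up numbers: four observations, a two-parameter model.
s2 = mean_square_error([1.0, 2.1, 2.9, 4.2], [1.0, 2.0, 3.0, 4.0], n_params=2)
print(f"s^2 = {s2:.4f}")
```

Dividing by $n - m$ rather than $n$ accounts for the $m$ degrees of freedom consumed by the fitted parameters.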

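In this case $s^2$ is an unbiased estimate of $\sigma^2$, and $\sigma^2$ has an inverse gamma distribution with respect to $s^2$ with $n - m$ degrees of freedom:

$$\mathrm{Prob}(\sigma^2 \mid s^2) = \frac{2}{\Gamma(v/2)} \left(\frac{v s^2}{2}\right)^{v/2} \frac{1}{\sigma^{v+1}} \exp\left(-\frac{v s^2}{2\sigma^2}\right), \tag{14}$$

where $v = n - m$.

Since $\theta$ and $\sigma^2$ are independent random variables, $\hat{\theta}$ does not change when $\sigma^2$ is replaced by $s^2$.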

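As a result, the posterior probability distribution for $\theta$ can be obtained by integrating $\sigma^2$ out of the joint posterior probability density for $\theta$ and $\sigma^2$:

$$\begin{aligned}
\mathrm{Prob}(\theta \mid y_1, \ldots, y_n) &= \int_{0}^{\infty} \mathrm{Prob}(\theta, \sigma^2 \mid y_1, \ldots, y_n)\, d\sigma^2 \\
&= \int_{0}^{\infty} \mathrm{Prob}(\theta \mid y_1, \ldots, y_n, \sigma^2)\, \mathrm{Prob}(\sigma^2 \mid s^2)\, d\sigma^2.
\end{aligned} \tag{15}$$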

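After substituting Eqs. (11) and (14) into Eq. (15), we obtain:

$$\mathrm{Prob}(\theta \mid y_1, \ldots, y_n) = \frac{\Gamma(n/2)\, |H|^{1/2}\, s^{-m}}{\left[\Gamma(1/2)\right]^{m} \Gamma(v/2) \left(\sqrt{v}\right)^{m}} \left[1 + \frac{(\theta - \hat{\theta})^T H\, (\theta - \hat{\theta})}{v s^2}\right]^{-\frac{v+m}{2}}. \tag{16}$$

Therefore, when $\sigma^2$ is unknown, the parameters $\theta$ follow a multivariate Student's t-distribution centered at $\hat{\theta}$ with the covariance matrix $s^2 H^{-1}$ and $n - m$ degrees of freedom.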

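The marginal probability distribution of the parameter $\theta_j$ is obtained by integrating out $\theta_i$ ($i \neq j$, $i = 1, \ldots, m$) over the parameter space:

$$\mathrm{Prob}(\theta_j \mid y_1, \ldots, y_n) = \frac{1}{\sqrt{2\pi \sigma_{\theta_j}^2}} \exp\left\{-\frac{1}{2\sigma_{\theta_j}^2} \left(\theta_j - \hat{\theta}_j\right)^2\right\}, \tag{17}$$

where $\sigma_{\theta_j}$ is the standard deviation, defined as:

$$\sigma_{\theta_j}^2 = \sigma^2 h_{jj}^{-1}, \tag{18}$$

where $h_{jj}^{-1}$ is the $j$th diagonal element of the inverse Hessian matrix computed at the point $\theta = \hat{\theta}$.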

The more information about the parameters is obtained from the well test, the narrower the probability distribution becomes and the shorter its tails. Accordingly, the marginal probability distributions also narrow. Confidence intervals are used to quantify the spread of the marginal probability distributions.

By definition, a 95 % confidence interval covers 95 % of the area under the probability density curve, i.e., it is the range within which the parameter value lies with a confidence probability of 95 %. Since the probabilities are distributed according to the normal law, the marginal distribution of each parameter is symmetric about the estimate of that parameter. This means that the 95 % confidence interval is also symmetric about the parameter estimate.

Usually, there are two types of confidence intervals: the range of absolute values and the range of relative values. Relative values are obtained by dividing the absolute values by the value of the parameter estimate.

In cases where the variance $\sigma^2$ is unknown, the $(1-\alpha)\cdot 100$ % confidence interval for each parameter is determined from the following inequality [1]:

$$\hat{\theta}_j - \sigma_{\theta_j} t_{1-\alpha/2} \le \theta_j \le \hat{\theta}_j + \sigma_{\theta_j} t_{1-\alpha/2}, \tag{19}$$

where $t_{1-\alpha/2}$ is the tabulated quantile of order $1 - \alpha/2$ of Student's t-distribution with $n - m$ degrees of freedom.

In cases where $n - m > 30$, the value $t_{1-\alpha/2}$ can be replaced by the corresponding value for the normal distribution. For $\alpha = 0.05$ this value equals 1.96. Then Eq. (19) takes the form:

$$\hat{\theta}_j - 1.96\, \sigma_{\theta_j} \le \theta_j \le \hat{\theta}_j + 1.96\, \sigma_{\theta_j}. \tag{20}$$

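The $(1-\alpha)\cdot 100$ % confidence interval for the relative values of each parameter is determined from the following inequality:

$$1 - \frac{\sigma_{\theta_j} t_{1-\alpha/2}}{\hat{\theta}_j} \le \frac{\theta_j}{\hat{\theta}_j} \le 1 + \frac{\sigma_{\theta_j} t_{1-\alpha/2}}{\hat{\theta}_j}. \tag{21}$$

Similarly, when $n - m > 30$, the confidence interval for the relative values of each parameter can be represented as:

$$1 - \frac{1.96\, \sigma_{\theta_j}}{\hat{\theta}_j} \le \frac{\theta_j}{\hat{\theta}_j} \le 1 + \frac{1.96\, \sigma_{\theta_j}}{\hat{\theta}_j}. \tag{22}$$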
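The large-sample intervals of Eqs. (20) and (22) can be sketched with the Python standard library alone, since `statistics.NormalDist` provides the normal quantile. The function name `confidence_interval` and the input values are illustrative assumptions; for small $n - m$ the Student quantile of Eq. (19) would have to be supplied instead.

```python
from statistics import NormalDist

def confidence_interval(theta_hat, sigma_theta, alpha=0.05):
    # Normal-approximation interval, valid when n - m > 30 (Eqs. (20) and (22)).
    # The relative form assumes a positive estimate theta_hat.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half = z * sigma_theta
    absolute = (theta_hat - half, theta_hat + half)
    relative = (1.0 - half / theta_hat, 1.0 + half / theta_hat)
    return z, absolute, relative

# Hypothetical estimate and standard deviation.
z, absolute, relative = confidence_interval(0.05, 1.0e-4)
print(f"z = {z:.4f}")
print(f"absolute interval: [{absolute[0]:.6f}, {absolute[1]:.6f}]")
print(f"relative interval: [{relative[0]:.5f}, {relative[1]:.5f}]")
```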

The correlation coefficient between any two parameters is calculated from the off-diagonal elements of the inverse Hessian matrix at the point $\theta = \hat{\theta}$:

$$\rho_{ij} = \frac{h_{ij}^{-1}}{\sqrt{h_{ii}^{-1} h_{jj}^{-1}}}. \tag{23}$$

As long as there are mathematical correlations between the parameters, none of them can be uniquely determined.
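A small sketch of Eq. (23) follows. It inverts the Hessian with a plain Gauss-Jordan routine (written out here to keep the example free of external libraries) and normalizes the off-diagonal elements; the matrix `H` is an arbitrary illustrative value, not one obtained from well test data.

```python
import math

def invert(a):
    # Gauss-Jordan inversion with partial pivoting; assumes a is square and non-singular.
    n = len(a)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(a)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def correlation_matrix(H):
    # rho_ij = h_ij^-1 / sqrt(h_ii^-1 * h_jj^-1), Eq. (23).
    h_inv = invert(H)
    n = len(H)
    return [
        [h_inv[i][j] / math.sqrt(h_inv[i][i] * h_inv[j][j]) for j in range(n)]
        for i in range(n)
    ]

H = [[4.0, 1.0], [1.0, 3.0]]  # illustrative Hessian
rho = correlation_matrix(H)
print(f"rho_12 = {rho[0][1]:.4f}")
```

The variance $\sigma^2$ (or $s^2$) cancels in the ratio, so the correlation coefficients depend only on the Hessian.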

The joint use of confidence intervals and correlation coefficients requires the construction of confidence regions. The $(1-\alpha)\cdot 100$ % confidence region of the parameters is defined as follows:

$$(\theta - \hat{\theta})^T H\, (\theta - \hat{\theta}) \le m\, s^2 F_{1-\alpha}(m, n-m), \tag{24}$$

where $F_{1-\alpha}(m, n-m)$ is the tabulated quantile of order $1-\alpha$ of the F-distribution with $m$ and $n - m$ degrees of freedom.
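Membership in the confidence region of Eq. (24) reduces to one quadratic-form comparison. In the sketch below the F quantile is passed in as a number taken from tables; the value 3.19 used for $F_{0.95}(2, 48)$ is an approximate tabulated figure supplied as an assumption, and `H`, `s2` and the test points are illustrative.

```python
def in_confidence_region(theta, theta_hat, H, s2, f_quantile):
    # True if theta satisfies Eq. (24); f_quantile is F_{1-alpha}(m, n - m) from tables.
    m = len(theta_hat)
    d = [t - th for t, th in zip(theta, theta_hat)]
    q = sum(d[i] * H[i][j] * d[j] for i in range(m) for j in range(m))
    return q <= m * s2 * f_quantile

H = [[4.0, 1.0], [1.0, 3.0]]   # illustrative Hessian
s2 = 0.25                      # illustrative mean square error
theta_hat = (1.0, 2.0)
F_95_2_48 = 3.19               # approximate tabulated value, assumed here

inside = in_confidence_region((1.1, 2.1), theta_hat, H, s2, F_95_2_48)
outside = in_confidence_region((2.0, 3.0), theta_hat, H, s2, F_95_2_48)
print(inside, outside)
```

Unlike the per-parameter intervals, this region is an ellipsoid whose orientation reflects the correlations between the parameters.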

To make confidence intervals convenient for model verification, only the variance of the probability distribution is taken into account, and not the correlation between the parameters. In practice, the following acceptable limits for the confidence intervals are used: ±10 % for the permeability $k$, the wellbore storage coefficient $C$, the distance to the boundary $r_e$ and the fracture length $x_f$; ±20 % for the storativity ratio $\omega$ and the interporosity flow coefficient $\lambda$; ±1 for the skin factor $S$; and ±0.005 MPa for the initial pressure $P_i$. These limits were obtained heuristically on the basis of real experiments on the interpretation of field and simulated well test data.

The key idea is that if the model is chosen correctly and there is enough data, then the confidence intervals of all parameters should be within these acceptable limits. In this case, the model is considered to be selected correctly. Otherwise, the model is considered unacceptable, since the confidence intervals exceed the allowable limits.
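The acceptance rule can be sketched as a simple lookup against the limits listed above. The dictionary keys, function names and example numbers are naming choices made for this illustration only; the limit values themselves are the ones quoted in the text.

```python
# Relative limits (fraction of the estimate) and absolute limits from the text.
RELATIVE_LIMITS = {"k": 0.10, "C": 0.10, "re": 0.10, "xf": 0.10,
                   "omega": 0.20, "lambda": 0.20}
ABSOLUTE_LIMITS = {"S": 1.0, "Pi": 0.005}  # Pi in MPa

def is_acceptable(name, estimate, half_width):
    # half_width is the half-length of the confidence interval of the parameter.
    if name in RELATIVE_LIMITS:
        return half_width / abs(estimate) <= RELATIVE_LIMITS[name]
    if name in ABSOLUTE_LIMITS:
        return half_width <= ABSOLUTE_LIMITS[name]
    raise KeyError(f"no acceptance limit defined for parameter {name!r}")

def model_is_acceptable(intervals):
    # intervals maps parameter name -> (estimate, half_width); all must pass.
    return all(is_acceptable(n, e, h) for n, (e, h) in intervals.items())

# Hypothetical fit results for a three-parameter model (k, S, C).
fit = {"k": (0.05, 0.001), "S": (10.0, 0.4), "C": (0.2, 0.01)}
print("model acceptable:", model_is_acceptable(fit))
```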

The variance of the probability distribution of each parameter is the product of the mean square error and the corresponding diagonal element of the inverse Hessian matrix.

The mean square error is used to represent the variance of errors, which has a finite value, provided that a suitable model is selected. If this condition is met, then the mean square of the errors does not depend on the number of data and the time interval of well test. However, in the case of an incorrect model, the mean square of the errors becomes larger than the real value of the error variance.

The inverse Hessian matrix is a function of the number of parameters of the formation, which is equivalent to the dependence on the choice of the model, the correlation between the parameters, the amount of data and the time interval of well test. The property of the diagonal elements of the inverse Hessian matrix is that their values decrease monotonically with increasing amounts of data.

3. Computational experiments

Let us demonstrate how confidence intervals can be used to assess the correctness of the model. For this purpose, well test data were simulated for a drawdown test. The purpose of the demonstration is to show how confidence intervals perform when it is known in advance whether the reservoir model corresponds to the data or not.

In the first case, the model was chosen correctly. The pressure values for the drawdown test were calculated using the flow model for an infinite reservoir, and random errors were then added to them. The reservoir and fluid data are: wellbore radius $r_w = 0.1$ m, reservoir thickness $h = 5$ m, formation volume factor $B_o = 1$ m$^3$ (reservoir)/m$^3$ (standard), viscosity $\mu = 10^{-3}$ Pa·s, porosity $\phi = 0.2$, initial pressure $P_i = 20$ MPa, total compressibility $c_t = 10^{-4}$ MPa$^{-1}$, production rate $q = 100$ m$^3$/day.

The true values of the parameters are $k = 0.05$ µm$^2$, $S = 10$ and $C = 0.2$ m$^3$/MPa. A random number generator produced a set of random errors distributed according to the normal law with zero mean and variance $2.5 \cdot 10^{-5}$ MPa$^2$. Depending on the number of data points, the following four cases were considered: a) 51 data points, b) 61 data points, c) 71 data points and d) 81 data points. The flow model for an infinite reservoir with three parameters ($k$, $S$ and $C$) was used. The correspondence of the model to the data is illustrated in Fig. 1.
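The noise-generation step of this experiment can be imitated as follows. This is only a sketch of how zero-mean normal errors with variance $2.5 \cdot 10^{-5}$ MPa$^2$ might be drawn; the generator, the seed and the function name `simulate_errors` are assumptions, and the pressure response of the infinite-reservoir model itself is not reproduced here.

```python
import random
import statistics

def simulate_errors(n_points, variance=2.5e-5, seed=0):
    # Independent normal errors with zero mean and the given variance (MPa^2).
    rng = random.Random(seed)
    sigma = variance ** 0.5
    return [rng.gauss(0.0, sigma) for _ in range(n_points)]

errors = simulate_errors(81)                     # case d) with 81 data points
sample_variance = statistics.variance(errors)
print(f"sample variance of the errors: {sample_variance:.3e} MPa^2")
```

With only 51 to 81 points the sample variance scatters noticeably around the nominal value, which is consistent with the spread of the $s^2$ values reported for the four cases.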

For simplicity, the results are given for only one parameter, permeability. Marginal probabilities are shown in Fig. 2. The corresponding 95 % confidence intervals for permeability are summarized in Table 1.

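Table 1. 95 % confidence intervals for permeability in case of correct model

| Number of data points | 51 | 61 | 71 | 81 |
|---|---|---|---|---|
| Parameter estimate | 0.0484 | 0.0498 | 0.0498 | 0.0499 |
| $s^2$ | $2.74\cdot10^{-5}$ | $2.90\cdot10^{-5}$ | $2.67\cdot10^{-5}$ | $2.69\cdot10^{-5}$ |
| $h^{-1}$ | $1.55\cdot10^{-2}$ | $2.09\cdot10^{-3}$ | $6.40\cdot10^{-4}$ | $2.82\cdot10^{-4}$ |
| $\sigma_\theta^2 = s^2 h^{-1}$ | $4.24\cdot10^{-7}$ | $6.07\cdot10^{-8}$ | $1.71\cdot10^{-8}$ | $7.57\cdot10^{-9}$ |
| $\sigma_\theta$ | $6.51\cdot10^{-4}$ | $2.46\cdot10^{-4}$ | $1.31\cdot10^{-4}$ | $8.70\cdot10^{-5}$ |
| Confidence interval | 2.71 % | 0.99 % | 0.52 % | 0.35 % |
| Decision | Acceptable | Acceptable | Acceptable | Acceptable |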
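The rows of Table 1 can be re-derived from the reported $s^2$ and $h^{-1}$ values. The sketch below does so for all four cases; the Student quantiles for $n - 3$ degrees of freedom are approximate tabulated values supplied here as an assumption (the article does not list them), and the ±10 % limit for permeability is the one quoted in Section 2.

```python
import math

# s^2, h^-1 and estimates are taken from Table 1; "t" holds approximate tabulated
# 0.975 quantiles of Student's t for n - 3 degrees of freedom (assumed values).
cases = {
    51: {"estimate": 0.0484, "s2": 2.74e-5, "h_inv": 1.55e-2, "t": 2.011},
    61: {"estimate": 0.0498, "s2": 2.90e-5, "h_inv": 2.09e-3, "t": 2.002},
    71: {"estimate": 0.0498, "s2": 2.67e-5, "h_inv": 6.40e-4, "t": 1.995},
    81: {"estimate": 0.0499, "s2": 2.69e-5, "h_inv": 2.82e-4, "t": 1.991},
}

results = {}
for n, c in cases.items():
    variance = c["s2"] * c["h_inv"]                      # sigma_theta^2 = s^2 * h^-1
    sigma_theta = math.sqrt(variance)
    relative = 100.0 * c["t"] * sigma_theta / c["estimate"]  # half-width in percent
    results[n] = {
        "sigma_theta": sigma_theta,
        "relative_ci_percent": relative,
        "decision": "Acceptable" if relative <= 10.0 else "Unacceptable",
    }
    print(n, f"{sigma_theta:.2e}", f"{relative:.2f} %", results[n]["decision"])
```

Under these assumed quantiles the computed half-widths agree with the tabulated confidence intervals to within rounding.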

Fig. 1. Simulated well test data and its correspondence to the correctly chosen reservoir model: a) 51 points; b) 71 points; c) 61 points; d) 81 points

In fact, only cases b), c) and d) contain useful information on permeability. As follows from Table 1, the permeability estimates are fairly close to the true value of 0.05 µm$^2$. Therefore, in Fig. 2 all probability distributions are grouped around this value. As the number of data points increases, more information about permeability becomes available, and the corresponding standard deviation ($\sigma_\theta$) decreases. The spread of the distributions narrows, and the normal distribution tends toward the Dirac delta function. From the standpoint of confidence intervals, all cases are acceptable, i.e., the model is chosen correctly.

Fig. 2. Marginal densities of the probability distribution in the case of a correctly chosen reservoir model


4. Conclusions

In principle, confidence intervals can be used to accept or reject the selected model. Regardless of whether the model is chosen correctly or not, confidence intervals ultimately yield consistent results. However, it must be taken into account that in practice, when verifying a model, confidence intervals should be determined for all parameters. In addition, confidence intervals are easy to calculate, since all the necessary information is contained in the results of nonlinear regression, and they are not difficult to use for model verification, as was demonstrated above.

However, comparative analysis based on confidence intervals has two drawbacks (practical and theoretical) from the standpoint of discriminant analysis of models.

First, the confidence intervals are determined by the variance of the probability distribution of the parameter, which in turn is the product of the mean square error (estimated variance) $s^2$ and the diagonal element of the inverse Hessian matrix $h_{jj}^{-1}$. An incorrect model inflates $s^2$, but $h_{jj}^{-1}$ decreases as the amount of data grows, so the product can remain small. That is, confidence intervals can be within acceptable limits even if an incorrect model is used.

Secondly, confidence intervals are convenient for verification of models, but are not suitable for their discriminant analysis. In other words, based on confidence intervals, one can determine whether a model is suitable or not, but nothing can be said about which of the models is better. This is due to the fact that the correlation between parameters is not taken into account when calculating confidence intervals. However, in general, reservoir parameters are nonlinearly related to each other, which must be taken into account in verification. Moreover, Eq. (11) indicates that the dimension of the probability distribution of the parameters coincides with their number. That is, models with different numbers of parameters have probability distributions of different dimensions. Therefore, a direct comparison of the corresponding confidence intervals is clearly not enough.

References

  1. Gmurman V. E. Theory of Probability and Mathematical Statistics: 12th Ed., Higher Education, Moscow, 2006, p. 479, (in Russian).
  2. Magnus J. R., Neudecker H. Matrix Differential Calculus with Applications in Statistics and Econometrics. John Wiley and Sons, Chichester, England, 2007, p. 468.
  3. Demidenko E. Z. Linear and Nonlinear Regression. Finance and Statistics, Moscow, 1981, p. 304, (in Russian).
  4. Ash R. Basic Probability Theory. Dover Publications, New York, 2008, p. 350.
  5. Anraku T., Horne R. N. Discrimination between reservoir models in well test analysis. SPE Formation Evaluation, Stanford University, Vol. 10, Issue 2, 1995, p. 114-121.