EFFECT OF UNCERTAINTY OF RELIABILITY DATA OF INSTRUMENTATION ON RISK ASSESSMENT AND PREDICTION IN PROCESS PLANT APPLICATIONS

Some aspects concerning the causes of uncertainty in risk assessment techniques of industrial interest are discussed. Particular attention is paid to the effect of the uncertainty of reliability data of devices and instruments used in process plants on the evaluation of the occurrence frequency of the top event. The influence of measuring equipment, whose contribution to the overall uncertainty appears in some cases very important, is analysed with reference to both operative and environmental aspects.

NOMENCLATURE
n: number of failures recorded in the test period T
m: mean life [h]
m̂: maximum likelihood estimate of the mean life [h]
f: degrees of freedom
T: test period [h]
(1 − α): confidence level
L: lower limit of the confidence interval of the ratio
U: upper limit of the confidence interval of the ratio
α: complement to unity of the confidence level
λ̂: maximum likelihood estimate of the failure rate [h⁻¹]
Δλ: confidence interval of the failure rate [h⁻¹]
Δλ%: 100 · Δλ / λ̂


INTRODUCTION
Risk evaluation in industrial applications is a topic of growing importance, for many reasons and under different aspects. In fact, achieving a "tolerable" risk level involves technical, economic, socio-political and even moral considerations [1], and the consequences of a reliable and quantitative risk evaluation influence industrial operation in different contexts, all of which are of remarkable economic relevance.
Considering just some aspects, it should be noted that:
• quantitative risk analysis allows the identification of the events that are most likely to occur among the accidents with the most serious effects on human health and safety and on the environment. Based on this information, the design of Safety Instrumented Systems (SIS) could be strongly improved and the requirements of safety management standards and rules [2] better fulfilled;
• legal requirements are more and more stringent and require the occurrence probability of hazardous events to be defined, in order to set safety and emergency plans and procedures [3]. They also define social and town-planning limitations, depending on the quantitative evaluation of the occurrence probability of an industrial accident and on the relevance of the event whose effects are considered;
• the capability of setting plant predictive maintenance programs that are effective from both a production and a cost point of view is improved [4].
It has to be pointed out that many consolidated approaches have been set for the risk analysis of complex systems.
Nevertheless, even though these topics have been studied in depth, many aspects still have to be investigated if reliable and accurate data are to be obtained, with reference to all steps of the evaluation procedure. Among the most noteworthy aspects are: the definition and identification of possible faults, the suitable modelling of the process plant and of the component behaviour for accident prevention, the availability of reliability data and the data processing [15].
Furthermore, whichever method is used, some common problems arise, which can be summarized as follows:
• difficulty of completely modelling the behaviour of the production line from a safety point of view, so that simplifying hypotheses have to be assumed;
• quantification of common-cause incidents;
• evaluation of the real working conditions of components and apparatuses;
• incomplete availability of data with reference to specific production lines;
• data accuracy, with reference to the failure rates of the single components to be considered in risk analysis.
All these aspects are important, so that predictions should be used with considerable prudence; among them, particular care should be paid to fixing the failure rate data of the different components, because they constitute the basis of the evaluation, whichever method is chosen.
Furthermore, even when failure databases with a very large number of records are used, as in the case of the databases used for petrochemical applications, in most cases the evaluation may be based on very few data, so that the confidence level of the prediction is strongly affected. Further problems arise if the actual working conditions of particular applications have to be considered, because database information is normally used for risk evaluation also in plants of different types. Therefore, correct "data mining" is an important aspect of risk analysis.
These problems are emphasized when measuring equipment data are considered, due to the generally small number of available data, even though instruments measuring pressure, temperature, flow rate and other thermomechanical quantities are an important percentage of the components to be considered. In fact, a small number of failures is generally detected, so that the reliability evaluation is difficult to carry out.
Furthermore, if the contribution of measuring instruments to the global risk is considered, particular attention should be paid to the effect of environmental conditions on the correct behaviour of the instrumentation; in fact, data are often available for apparatuses that work in quite different measuring conditions. Therefore, in many cases, the reliability of the failure rate data of the different sensors used for production line control and safety is unsatisfactory.
The definition of fault conditions for measuring equipment is strictly connected to the above aspect and is a further contribution to the uncertainty to be estimated for measuring apparatuses. It is important to notice that this problem should be studied taking into account the requirements for the correct working of instrumentation, especially in contexts based on quality, environmental, safety and/or integrated management systems [16].
Bearing in mind the above considerations, the aim of this paper is to analyse quantitatively some aspects on which the accuracy of the computation of the top event frequency depends.
Different steps of the procedure will be studied:
• the method for evaluating the uncertainty of the data that are generally available, referring to both literature information and/or high-level databases for maintenance and risk analysis;
• the aspects related to measurement equipment, in order to get information on the confidence to be attributed to failure rate predictions, in comparison with data concerning different components, like pumps, actuators, motors and so on.
The methodology used to evaluate the uncertainty of the failure rate of different sensors and measuring systems will be described, and the confidence level of reliability predictions discussed from an operating point of view. The computation approaches that are most widespread in practical applications will be used; traditional techniques will be discussed, since at this step the attention is focused on how accuracy depends on other aspects.
Results for different types of elements will be analysed, in order to obtain useful information to improve maintenance and risk assessment activity and prediction.

UNCERTAINTY CAUSES IN RISK ANALYSIS METHODOLOGY
The aim of this study is to evaluate the whole uncertainty of the estimated occurrence frequency of top events in industrial plants for safety applications.
The final uncertainty of data concerning the occurrence frequency of the top events obviously depends on different causes.
The main contributions are the following:
• the risk modelling method and the assumed simplifying hypotheses,
• the uncertainty of the data that are generally available,
• the actual working conditions,
• the type of components to be taken into account.
This section will stress some aspects concerning the risk evaluation procedure.

DIAGNOSTYKA, Vol. 20, No. 1 (2019) D'Emilia G, Natale E.: Effect of uncertainty of reliability data of instrumentation on risk assessment and … 75

Risk modelling methodologies
The most widely used method for qualitative and quantitative hazard assessment is the fault tree [6][7][8][9][10][11]; in some countries, the law itself suggests it as a hazard evaluation technique.
A fault tree is a graphical representation of the logical relations between a particular system accident or other undesired event, the top event, and primary cause events.
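To make these logical relations concrete, the sketch below shows how basic event probabilities could be combined through AND and OR gates up to a top event. It assumes statistically independent basic events; the tree structure, the function names `and_gate` and `or_gate` and the numerical values are hypothetical and are not taken from the plant studied in this paper.

```python
from math import prod


def and_gate(probabilities):
    """Output event occurs only if all (independent) input events occur."""
    return prod(probabilities)


def or_gate(probabilities):
    """Output event occurs if at least one (independent) input event occurs."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical tree: the top event requires the failure of a protection
# layer AND the failure of at least one of two control loops.
control_loops_fail = or_gate([1.0e-3, 2.0e-3])   # illustrative probabilities
protection_fails = 5.0e-2                        # illustrative probability
top_event = and_gate([control_loops_fail, protection_fails])

print(f"Top event probability: {top_event:.3e}")
```

With rare events the OR gate is close to the simple sum of its inputs, which is why the input data uncertainty carries over almost directly to the top event.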
To apply the fault tree method, the following conditions have to be verified:
• The faults are of binary nature, namely a component either works or does not work.
• The transition from the working state to the failed one is instantaneous.
• The failure rate is constant (components without memory).
• The repair rate is constant too.
• The failure rate does not change after every repair.
• The fault tree and the repair tree are equivalent, namely, if a failure produces an effect on some upper-level events, the corresponding repair restores the initial conditions.
A fault tree is essentially a simplified representation of a process system, which is generally very complex. Much work on fault tree methodology is concerned with correcting the oversimplifications [9].
Among these approaches, the Markov technique is becoming more and more widespread for the calculation of the unavailability of repairable systems. This technique is very effective for analysing systems where the sequence of failures is important. Markov modelling can also be applied to the analysis of common cause failures, standby redundancies and state-dependent failure rates. For these applications, the Markov process is taken as a discrete-state, continuous-time model. For the Markov technique, as well as for fault tree analysis, the transition rate from a given state to another is assumed to be the same at all times, past and future. The main drawback of this technique is that for complex trees it can become unmanageable, due to the number of differential equations to be solved; in these cases, even though simplifying methods are used, a proper description of states and transitions becomes a problem.
Anyway, in most cases a combination of fault tree and Markov model appears as an effective approach and will be used in this paper.

Confidence interval
Because the number of components under observation, the number of failures recorded and the surveillance periods are all finite, the failure rate estimates are affected by uncertainty, which can be expressed in terms of intervals corresponding to a set confidence level.
These intervals are estimated according to literature indications [5,17,18], using information collected about failures of components.
The computation is carried out by assuming a constant failure rate, λ, and an exponential distribution of the reliability function, which provides the probability of surviving until a time of interest; this is a typical assumption in the fault tree approach.
Let m̂ be the estimate of the mean life and f the degrees of freedom, as defined in Eqs. (1)-(3). For a time-terminated test, the lower, L, and upper, U, limits of the confidence interval for the ratio, when a (1 − α) confidence level is set, can be computed by means of Eqs. (4) and (5), where the χ² distribution values are obtained from standard tables.
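A common textbook form of such limits, for a constant failure rate and a time-terminated test, can be sketched as follows. The notation $\chi^2_p(f)$ for the $p$-quantile of the chi-square distribution with $f$ degrees of freedom, the interpretation of $T$ as the total accumulated observation time and the degrees of freedom $2n$ and $2n+2$ are assumptions made here; they may differ from the convention adopted in Eqs. (1)-(5).

```latex
\hat{m} = \frac{T}{n}, \qquad \hat{\lambda} = \frac{1}{\hat{m}} = \frac{n}{T}

\lambda_{L} = \frac{\chi^2_{\alpha/2}(2n)}{2T}, \qquad
\lambda_{U} = \frac{\chi^2_{1-\alpha/2}(2n+2)}{2T}

\frac{2n}{\chi^2_{1-\alpha/2}(2n+2)} \;\le\; \frac{m}{\hat{m}} \;\le\; \frac{2n}{\chi^2_{\alpha/2}(2n)}
```

In this form the limits on the ratio depend only on the number of failures n and on the confidence level, while T only scales the absolute values of the failure rate limits.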

Failure rate data
Whatever method is used, it needs experimental observations of the behaviour of plant components; therefore, the problem of effective failure data retrieval arises in any case.
Typical data sources for risk analysis are literature data or databases, which are typically used in industrial studies, also for plant predictive maintenance planning.
Tables 1 and 2 show a summary of fault data concerning different types of transducers (temperature, pressure, level and flow transducers), taken from a high-level, up-to-date database, OREDA (Offshore Reliability Data), often mentioned in literature references [6,7], and from literature data for the chemical industry [5,8].
Remarkable differences arise when data from different sources are compared; therefore, choosing a suitable source appears to be a very important task. Some differences can be explained by taking into account that some literature data [5,8] refer to an earlier period, with poorer instrumentation, due to the natural technological progress. This comparison suggests some further causes of uncertainty of reliability data, due to the use of information often referring to different plants and to different and unknown environmental conditions; this aspect strongly influences the accuracy of the failure rate computation, and therefore of the evaluation of the top event occurrence frequency, in particular with reference to measuring devices, as will be shown in the following.
Further, if measuring instruments are referred to, some more aspects should be considered, with reference to fault definition and identification.
Typically, both "performance faults" and "condition faults" are taken into account, meaning that the requested performance is not achieved. In the former case, the shortfall is due to a measuring error or to a measurement uncertainty not adequate for the process control and safety requirements; in the latter, a fault condition occurs due to electrical or mechanical problems of the measuring apparatus. If a management system for quality [19], environment [20] and safety [2] is in place, the number of faults is obviously reduced as an effect of the procedures that have to be implemented; the prediction capability is also improved, thanks to more and better information concerning plant components and, in particular, instrumentation.
Anyway, the definition of sensor and transducer faults, with reference to a certified quality context, is a topic that needs deeper study.

RESULTS
The above-described methodology for confidence intervals calculation has been applied to some data from the OREDA database.
The computation refers to both transducers and mechanical components, whose incorrect behaviour is very often a basic event in fault trees for industrial hazard analysis, carried out for plants with a risk of major accidents.
Transducers to be examined are in general mechanical and thermal measuring equipment for process control, like temperature, pressure, level and flow rate transducers and transmitters; data referring to motors and pumps have also been considered and analysed for comparison purposes.
The width of the confidence intervals depends on the number of failures collected, which in turn depends on the number of components under observation and on the surveillance interval.
A small number of data is usually a more important problem for transducers than for mechanical and electrical equipment like motors, pumps, actuators and so on. In fact, measurement equipment, which is very important for process productivity, reliability and human and environmental safety, is itself very reliable, so that very few data regarding its faults are in general available.
The results of a preliminary analysis are summarized in Table 3, confirming the above consideration, with the corresponding failure rate, λ̂, and Δλ% parameters at a confidence level of 90%.
A few considerations can be made:
• The percentage confidence interval, Δλ%, depends only on the number of available faults;
• A reduced range of variation of Δλ% can be found for pressure measuring instruments in comparison with other transducers, due to better overall information;
• For pumps and motors Δλ% is much lower than for sensors, therefore allowing more reliable calculations in fault tree analysis.
A preliminary data analysis also shows that, if the number of faults, n, is more than 300, the corresponding percentage confidence interval, Δλ%, is less than 20%; if n is of the order of 50, Δλ% increases to 50% and more.
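The dependence of the percentage confidence interval on the number of faults can be explored with the short sketch below. It assumes the common chi-square limits for a time-terminated test with a constant failure rate (lower limit with 2n degrees of freedom, upper limit with 2n + 2) and takes the confidence interval as the difference between the upper and lower limits; both choices are assumptions about the procedure, not a reproduction of it. The function names are illustrative. Only the Python standard library is used, so the chi-square quantile is obtained by bisection on the exact distribution function, which has a closed form for even degrees of freedom.

```python
import math


def chi2_cdf_even(x, dof):
    """Chi-square CDF for an even number of degrees of freedom (Poisson sum)."""
    if dof <= 0 or dof % 2:
        raise ValueError("dof must be a positive even integer")
    if x <= 0.0:
        return 0.0
    h = x / 2.0
    # P(chi2 > x) = sum_{j=0}^{dof/2 - 1} exp(-h) * h**j / j!
    upper_tail = sum(
        math.exp(-h + j * math.log(h) - math.lgamma(j + 1))
        for j in range(dof // 2)
    )
    return 1.0 - upper_tail


def chi2_quantile_even(p, dof):
    """Value x such that chi2_cdf_even(x, dof) = p, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_cdf_even(hi, dof) < p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf_even(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def failure_rate_interval(n, total_time, confidence=0.90):
    """Return (estimate, lower limit, upper limit) of the failure rate [1/h]."""
    alpha = 1.0 - confidence
    estimate = n / total_time
    lower = chi2_quantile_even(alpha / 2.0, 2 * n) / (2.0 * total_time) if n > 0 else 0.0
    upper = chi2_quantile_even(1.0 - alpha / 2.0, 2 * n + 2) / (2.0 * total_time)
    return estimate, lower, upper


def percent_interval(n, confidence=0.90):
    """Percentage confidence interval 100 * (upper - lower) / estimate, n > 0."""
    estimate, lower, upper = failure_rate_interval(n, 1.0, confidence)
    return 100.0 * (upper - lower) / estimate


for n in (5, 50, 300):
    print(n, round(percent_interval(n), 1))
```

Because the observation time cancels in the ratio, `percent_interval` depends only on n and on the confidence level, in agreement with the first consideration listed above.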
It should be noted that this computation has been carried out on the data collected for the whole categories of transducers, pumps and motors, without any distinction based on the specific working principles.
Further problems arise if a more selective filtering procedure is used to select only the data referring to specified sensor failure typologies or to particular environmental conditions, due to the reduced number of detected faults.
Filtering with different selection criteria strongly influences the number of data available for examination; therefore, the effect of the filtering procedure on the reliability of computations in risk analysis should be taken into account.
In fact, setting a more selective filtering mask modifies the inventory characteristics (typology, environmental conditions, operating situation, etc.) that are considered.
The extent to which this increased number of requested characteristics influences the failure rate of the components under consideration, and its accuracy, should be taken into account.

A practical example
Some aspects related to reliability data accuracy of measuring apparatuses will be discussed with reference to a practical situation, concerning a chemical process plant.
In particular, a stage of a distillation column is considered, whose temperature should be carefully controlled, in order to avoid unwanted temperature increase and a consequent column explosion, which is the "top event", whose occurrence frequency has to be evaluated.
The fault tree that describes the accident dynamics is depicted in Fig. 1.
The basic events of the fault tree that lead to the temperature increase and to the column explosion are all control system failures, in particular of the TIC (Temperature Indicator and Control) and of the LIC (Level Indicator and Control), which prevents a level decrease in the column and a concentration increase of the organic impurities.
A further temperature control, TIC3, is placed as protection against failures of the other temperature controls; therefore, we are interested in its unavailability, which is the probability that a Missed Functioning (M.F.) occurs.
According to literature indications [5] and common professional practice, the failure rates of control loops are evaluated by simply adding the failure rates of all components of the control loop: transducer, control and monitoring block, actuator and valve.
Using, in particular, data from OREDA for a process control globe valve with a pneumatic actuator, we obtain Eq. (6). To calculate the TIC3 unavailability, according to the previous considerations, we will use the Markov method.
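If, as in this case, only one component is considered, the system states are those depicted in the diagram of Fig. 2, where λ and μ represent the failure and maintenance rates respectively. The Markov equations that describe the system behaviour are:

$$\frac{dP_a(t)}{dt} = -\lambda P_a(t) + \mu P_b(t) \qquad (7)$$

$$\frac{dP_b(t)}{dt} = \lambda P_a(t) - \mu P_b(t) \qquad (8)$$

Assuming $P_a(0) = 1$ and $P_b(0) = 0$, the solutions of the differential equations, Eqs. (7)-(8), are:

$$P_a(t) = \frac{\mu}{\lambda+\mu} + \frac{\lambda}{\lambda+\mu}\, e^{-(\lambda+\mu)t} \qquad (9)$$

$$P_b(t) = \frac{\lambda}{\lambda+\mu}\left(1 - e^{-(\lambda+\mu)t}\right) \qquad (10)$$

where $P_b(t)$ represents the system unavailability Q(t):

$$Q(t) = P_b(t) = \frac{\lambda}{\lambda+\mu}\left(1 - e^{-(\lambda+\mu)t}\right) \qquad (11)$$

which is a function of time. Usually the steady-state unavailability Q(∞) is of interest, since in most cases λ << μ and the transient term of Q(t) becomes negligible very quickly in comparison with the Mean Time Between Failures (MTBF). In this situation Q(∞) is evaluated according to Eq. (12):

$$Q(\infty) = \frac{\lambda}{\lambda+\mu} \approx \frac{\lambda}{\mu} \qquad (12)$$

A functioning time equal to 8400 h/year and a repair time of TIC3 equal to 8 h are assumed. Faults leading to both incorrect opening and incorrect closing of the control valve also have to be considered.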
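The two-state model can be illustrated numerically with the sketch below. It uses the standard closed-form solution of a single repairable component that starts in the working state, with constant failure rate `lam` and repair rate `mu`. The component failure rates are hypothetical placeholders, not OREDA values, and taking the repair rate as the reciprocal of the 8 h repair time is an assumption made for the example.

```python
import math


def loop_failure_rate(component_rates):
    """Failure rate of a control loop as the sum of its component rates [1/h]."""
    return sum(component_rates.values())


def unavailability(t, lam, mu):
    """Q(t) of a two-state repairable component that works at t = 0."""
    return lam / (lam + mu) * (1.0 - math.exp(-(lam + mu) * t))


def steady_state_unavailability(lam, mu):
    """Limit of Q(t) for large t; close to lam / mu when lam << mu."""
    return lam / (lam + mu)


# Hypothetical component failure rates [1/h], for illustration only.
rates = {"transducer": 2.0e-6, "controller": 3.0e-6, "actuator_and_valve": 5.0e-6}
lam = loop_failure_rate(rates)
mu = 1.0 / 8.0  # assumed repair rate [1/h] from the 8 h repair time in the text

q_inf = steady_state_unavailability(lam, mu)
q_year = unavailability(8400.0, lam, mu)  # after one year of operation

print(f"loop failure rate: {lam:.2e} 1/h")
print(f"Q(8400 h) = {q_year:.3e}, steady state = {q_inf:.3e}")
```

With a repair time of a few hours the transient term vanishes after a few tens of hours, so over a year of operation the steady-state value is an adequate approximation.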
In order to evaluate the effect of reliability data accuracy, the computation will be carried out with reference to the contribution of the temperature and level sensors.
For the temperature measuring instrument, in particular a resistance transducer, the occurrence frequency of the top event is evaluated using the reference failure rate value λ_T = 1.7 · 10⁻⁶ h⁻¹ and then the extreme values of the 90% confidence interval (λ_minT = 0; λ_maxT = 3.9 · 10⁻⁶ h⁻¹).
The most relevant results of the computation are summarized in Table 4. The obtained results suggest some interesting considerations:
• There are branches of the fault tree that influence the top event occurrence frequency much more than others, depending on the process and control configurations.
• In the studied case, level control is very delicate, due also to the failure rate of the LIC. In particular, its failure rate is nearly one order of magnitude larger than the failure rate of the temperature sensor.
• The uncertainty of the failure rate of the level transmitter is much higher than that of the temperature one and strongly affects the uncertainty of the occurrence frequency of the top event; neglecting these analysis results could compromise the risk evaluation;
• In most cases the uncertainty of the prediction of occurrence rates is due to the sensors.

CONCLUSIONS
In this paper some aspects that are important in industrial applications for evaluating the accuracy of quantitative risk computations have been analysed.
Different steps of the procedure have been studied, with reference to the evaluation method and to the uncertainty of the data of interest, referring to both literature information and high-level databases for maintenance and risk analysis.
Different computation approaches, which are widely used in practical applications, were analysed.
A model for the evaluation of the confidence interval of the failure rate of the components to be analysed for risk evaluation has been set up and used for a practical application.
Significant differences in the results arise between measuring instruments and other components like pumps, actuators, motors and so on, the uncertainty in the failure rate evaluation of transducers being much higher.
In particular, the uncertainty about the reliability of transducers is particularly important with respect to the uncertainty of the top event occurrence frequency. Therefore, special care should be taken in the selection of literature and database data, in the evaluation of the process architecture, in the analysis of the effect of environmental conditions and in the setting of the most suitable theoretical approach and modelling. These considerations seem to be interesting from a practical point of view, for different reasons: the number of sensors in process control is steadily increasing, and the widespread use of new technologies of transducers and transmitters calls for detailed and reliable information about the failure rates of the new sensors themselves.
Further, industrial plants are not the same: the data refer to standard commercial-grade instruments in the process industries; in other applications, where higher instrument costs are acceptable, the failure rate decreases.

Table 1. Failure rate data from OREDA.

Table 3. λ̂ and Δλ% parameters for measuring instruments, pumps and motors.

Table 4. Effect of uncertainty of failure rates of TIC and LIC.