SIMPLE METHOD OF FAILURE DETECTION OF ROTARY MACHINES

This paper proposes a simple unsupervised method for monitoring rotary machines. The method consists of three stages: multi-reference preliminary analysis of the vibration signals, auto-reference preliminary analysis, and probabilistic analysis of the signals. The method was tested using signals from eight simulated machines, and the tests confirmed its effectiveness.


INTRODUCTION AND PROBLEM FORMULATION
Monitoring of rotary machines is a crucial part of their maintenance [6,8,10,14,15]. Automatic monitoring is based on various complex systems such as SCADA [5] or artificial intelligence systems [2,3,4,7]. Apart from their obvious advantages, systems of this type have some drawbacks. First of all, the complexity of the monitoring system can make real-time operation difficult to ensure. Furthermore, most artificial intelligence systems require a tedious learning process. Moreover, when the machine is restarted after a fault, the learning process usually has to be repeated, because the characteristics of the working parameters change and even tiny changes can influence the monitoring process. An additional difficulty is that such monitoring involves on-line processing and analysis of a large data stream; see [1,9,10] and the review paper [11]. Therefore, there is a need for extremely simple but effective rotary machinery monitoring systems that can complement systems based on more complex algorithms. Such a system is proposed in this paper.
The problem is formulated as follows. A group of identical rotary machines starts working at the same time. All of them are monitored with the same frequency. The monitoring is based on vibrodiagnostics. During a certain initial period the machines operate trouble-free. The aim of the monitoring is to detect the moment of a pre-failure, a failure or a fault of each machine as early as possible. Furthermore, the detection algorithm should be as simple as possible. The approach was tested using a simulated model; see Section 3.

DESCRIPTION OF THE METHOD AND ITS APPLICATION
The proposed method consists of three stages: multi-reference preliminary analysis, auto-reference preliminary analysis and probabilistic analysis of the signals. Both preliminary analyses include, among other things, calibration of the method. The multi-reference analysis can be applied if at least two identical machines are monitored; otherwise, this stage has to be omitted. The multi-reference preliminary analysis consists in comparing the signals from two machines. We compare the signals from each waveform, averaged over the run, according to formula (1), where the index i numbers the measurement in a given run, k numbers the run, and j and m number the machines. A superscript in brackets denotes an index, not a power. The number N denotes the total number of measurements in the run. Thus, the signal term in formula (1) is the value of the ith signal in the kth run of the jth machine, and the result of formula (1) is the mean-square error between the jth and mth machines in the kth run. The multi-reference preliminary analysis makes it possible to determine the threshold value above which the signals should be considered different. In addition, the method extends the reference range and gives an initial indication of the first defect in some machines.
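The multi-reference comparison can be illustrated with a minimal sketch. It assumes that formula (1) is the mean of the squared sample-wise differences between the kth runs of two machines; the function names and the toy data are hypothetical and serve only as illustration.

```python
from itertools import combinations


def mean_square_difference(run_a, run_b):
    """Mean of squared sample-wise differences between two runs of equal length."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("runs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(run_a, run_b)) / len(run_a)


def pairwise_differences(machines, k):
    """Compare run k (0-based) of every pair of machines j < m.

    machines maps a machine number to its list of runs;
    the result maps (j, m) to the mean-square difference of their kth runs.
    """
    return {
        (j, m): mean_square_difference(machines[j][k], machines[m][k])
        for j, m in combinations(sorted(machines), 2)
    }


# Toy data: three hypothetical machines, one run of four samples each.
machines = {
    1: [[1.0, 2.0, 3.0, 4.0]],
    2: [[1.0, 2.0, 3.0, 4.0]],
    3: [[1.0, 2.0, 5.0, 4.0]],
}
diffs = pairwise_differences(machines, k=0)
print(diffs)
```

In this toy case machines 1 and 2 agree, while machine 3 differs from both, which is the pattern that lets a fault be attributed to one machine when more than two machines are available.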
In the auto-reference preliminary analysis the successive signals are compared with the first s runs of the same machine, which are treated as the reference runs. The obtained value, calculated from formula (2), is the average difference between the given kth run and the reference runs, minimized over the reference runs. In formula (2), n (1 ≤ n ≤ s) is the number of the reference run, k is the number of the measured run (1 ≤ k ≤ 250), and i is the number of the signal in the run (1 ≤ i ≤ 250000). In order to calibrate the method, the pairwise differences between the reference runs were calculated in the same way, according to formula (3). The auto-reference preliminary analysis supplements the conclusions obtained from the multi-reference analysis, allowing a more accurate determination of the threshold and of the moment of the first failure. However, only the multi-reference analysis allows faults to be distinguished between machines; see the next section.
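A minimal sketch of the auto-reference comparison follows. Two choices in it are assumptions: the difference between two runs is taken as the mean absolute sample-wise difference, and the calibration value is taken as the largest pairwise difference among the reference runs. All function names are hypothetical.

```python
from itertools import combinations


def mean_abs_difference(run_a, run_b):
    """Average absolute sample-wise difference (assumed form of the run difference)."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("runs must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(run_a, run_b)) / len(run_a)


def auto_reference_difference(runs, k, s):
    """Difference between run k and the closest of the first s reference runs.

    Indices are 0-based; for k < s the result is 0 because the run is its own reference.
    """
    return min(mean_abs_difference(runs[k], runs[n]) for n in range(s))


def calibration_value(runs, s):
    """Largest pairwise difference among the first s reference runs (assumed rule)."""
    return max(
        mean_abs_difference(runs[a], runs[b])
        for a, b in combinations(range(s), 2)
    )


# Toy data: two reference runs followed by one measured run.
runs = [[0.0, 0.0], [1.0, 1.0], [4.0, 6.0]]
print(calibration_value(runs, s=2))             # spread of the reference runs
print(auto_reference_difference(runs, k=2, s=2))  # distance of run 2 from its references
```

A run would then be flagged when its auto-reference difference exceeds a threshold chosen somewhat above the calibration value.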
The probabilistic stage uses the chi-square test to check whether two different runs have the same distribution. The measurements in each run were treated as a single sample from the distribution of the feature, and the chi-square test for homogeneity was used for the comparison. This test is convenient because its assumptions are minimal. It is only assumed that the statistical samples are large and that the number of elements in each bin is not less than five. Furthermore, the bin widths have to be equal. However, the variable can be either continuous or discrete, and the distribution function need not be continuous. The statistic is given by formula (4), where $n_1$ and $n_2$ are the numbers of elements in the first and in the second sample, respectively, and $n_{1i}$ and $n_{2i}$ denote the numbers of elements in the i-th bin of the first and the second sample. The probabilistic stage allows the moment of the first failure to be determined precisely.
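The following sketch computes a two-sample chi-square homogeneity statistic with equal-width bins. It assumes that formula (4) has the standard form $n_1 n_2 \sum_i (n_{1i}/n_1 - n_{2i}/n_2)^2 / (n_{1i} + n_{2i})$; the binning range (the common minimum and maximum of both samples), the skipping of bins that are empty in both samples, and the function names are further assumptions.

```python
def histogram(sample, low, high, bins):
    """Counts of sample values in `bins` equal-width bins spanning [low, high]."""
    width = (high - low) / bins
    counts = [0] * bins
    for x in sample:
        # Clamp so that x == high falls into the last bin.
        idx = min(int((x - low) / width), bins - 1)
        counts[idx] += 1
    return counts


def chi_square_homogeneity(sample_1, sample_2, bins):
    """Two-sample chi-square homogeneity statistic over a common binning."""
    low = min(min(sample_1), min(sample_2))
    high = max(max(sample_1), max(sample_2))
    if high == low:
        raise ValueError("samples have zero range; binning is undefined")
    counts_1 = histogram(sample_1, low, high, bins)
    counts_2 = histogram(sample_2, low, high, bins)
    n1, n2 = len(sample_1), len(sample_2)
    total = 0.0
    for n1i, n2i in zip(counts_1, counts_2):
        if n1i + n2i == 0:
            continue  # a bin empty in both samples contributes nothing
        total += (n1i / n1 - n2i / n2) ** 2 / (n1i + n2i)
    return n1 * n2 * total


# Toy samples, far too small for the "at least five per bin" condition;
# they only show the call.
reference_run = [0.0, 0.0, 1.0, 1.0]
measured_run = [0.0, 0.0, 0.0, 1.0]
print(chi_square_homogeneity(reference_run, measured_run, bins=2))
```

The statistic would be compared with the critical value of the chi-square distribution with (number of bins minus 1) degrees of freedom, taken from tables or a statistics library.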

RESULTS
The simulated model consists of two elements: a synthetic model of the vibration signal and a synthetic model of the development of a particular fault of rotary machinery. The signal is defined as a sampled time waveform of vibration acceleration. Depending on the simulated fault mode, consecutive vibration signals are modified differently. Each vibration signal is constructed as a phenomenological-behavioral model with GAD [13] shaft components (AM-FM harmonics) and gearbox components (AM-FM harmonics with multiple double sidebands) as well as GATD [12] rolling-element bearing components (AM-FM cyclo-nonstationary components with additional phase-locked amplitude modulation). Fault development is modeled as a combination of linear, 2nd order polynomial or exponential growth of the amplitudes of individual signal components with relatively low or relatively high variance. The model is based on [9], chapter 8.
Thus, a group of eight identical rotary machines was simulated. For each machine 250 runs were monitored. Each run lasted 10 seconds and was sampled with a frequency of 25 000 Hz, which means that one run consisted of 250 000 measurements. The first 50 seconds of each machine's operation, that is, the first five runs, are trouble-free and were therefore used as reference runs.
Since the first five runs of each machine are reference runs, they were used to calibrate the method. The greatest difference measured by formula (1) for the reference waveforms is between machine 1 and machine 8 and equals 0.03. Thus, the value 0.05 was taken as the limit above which the signals should be considered significantly different. Machines 1 and 2 show no significant difference in signals during the first 230 runs; then their signals start to diverge with an upward trend, but the differences remain below 0.1; see Fig. 1. Likewise, machine 4 shows no appreciable difference from either machine 1 or machine 2 for the first 200 runs; then their signals diverge more and more, and the difference finally reaches 0.125. This means that machines 1 and 2 either continue to run normally, or one of them has a minor fault after 230 runs. Machine 4 malfunctions after 200 runs. In addition, the first 200 runs of machines 1, 2 and 4 can be considered as reference runs for the rest of the machines. A comparison of machines 1, 2 and 4 with machine 3 (see Fig. 1, Fig. 2, Fig. 3) shows the failure of machine 3 at approximately run 180. Likewise, the comparison of machines 1, 2 and 4 with machine 5 (see Fig. 1, Fig. 2, Fig. 4) shows a failure of machine 5 at roughly run 180. Moreover, a comparison of machine 3 with machine 5 (see Fig. 3) shows that they perform the same. The comparison of machines 1, 2 and 4 with machine 6 (see Fig. 1, Fig. 2, Fig. 4) shows the failure of machine 6 also around run 180. However, the behavior of machine 6 differs from that of machines 3 and 5; see Fig. 3 and Fig. 5. A comparison of machines 1, 2 and 4 with machines 7 and 8 (see Fig. 1, Fig. 2 and Fig. 4) shows a failure at approximately run 70. Machines 7 and 8 behave the same; see Fig. 7. To sum up, the multi-reference preliminary analysis led to the following conclusions.
The naive method is only suitable for detecting the first fault, assuming that at least two identical machines are tracked in parallel. If we have only two machines, we cannot decide in which of them the fault occurred; nevertheless, we are able to conclude that a failure has occurred in one of them.

Results obtained by auto-reference preliminary analysis
According to the simulation parameters specified at the beginning of Section 3, the values of the indices in formulae (2) and (3) are as follows: l=5, s=5. The ranges of values of the remaining parameters are the same as for the multi-reference preliminary analysis.
For each of the eight machines, the value obtained from the calibration was approximately 0.2. Therefore 0.25 was taken as the threshold above which the signal should be treated as different from the reference. The results (see Fig. 8) confirmed the predictions of the multi-reference preliminary analysis and refined them: both machine 1 and machine 2 worked flawlessly all the time. The failure of machines 3, 5 and 6 should be moved to approximately run 200, and that of machines 7 and 8 to approximately run 150.
The multi-reference preliminary analysis and the auto-reference preliminary analysis should be treated as complementary. The auto-reference analysis gives a more precise moment of failure, because it refers to the machine's own pattern. However, only the multi-reference analysis made it possible to distinguish the failures of machines 3 and 5 from that of machine 6.
Fig. 8. Comparison of the operation of each machine with its own reference runs, which were the first 5 runs. Horizontal axis: the value of the k parameter; vertical axis: the difference values calculated using formula (2). The red line denotes the limit value equal to 0.25.
[...] occurred: the value of the statistic was partly above and partly below the critical value. Starting from the 131st run, the value of the statistic was permanently above the critical value. Machine 4: up to the 205th run below the critical value, runs 206-229 oscillations, from the 230th run permanently above the critical value. Machine 5 operates in exactly the same way as machine 3. Machine 6: up to the 86th run below the critical value, runs 87-206 oscillations, from the 207th run permanently above the critical value. Machine 7: up to the 84th run below the critical value, runs 85-139 oscillations, from the 140th run permanently above the critical value. Machine 8: isolated values above the critical value for runs 6, 17 and 27, runs 84-139 oscillations, from the 140th run permanently above the critical value.
Fig. 11. Graphs of the statistic of successive runs for all machines; see formula (4). Vertical axis: the value of the statistic. Up to the first vertical red line the values of the statistic are less than the critical value, the area of oscillations lies between the vertical red lines, and to the right of the right vertical red line the values of the statistic are greater than the critical value.
Both preliminary analyses give very similar results and complement each other to a small degree. The probabilistic stage makes it possible to distinguish a period of transitional operation of a machine (oscillations), which is always followed by permanent operation above the critical value of the statistic. This means that the proposed probabilistic approach can detect a failure at a very early stage. Consequently, the onset of the fault should be taken to be the beginning of the oscillations.
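The division of a machine's history into operation below the critical value, oscillations, and permanent operation above the critical value can be sketched as follows. The function name and the decision rule are hypothetical. In particular, this simple rule treats the very first exceedance as the start of the oscillations, so isolated early exceedances such as those reported for machine 8 would need an additional filter.

```python
def operating_phases(statistics, critical_value):
    """Locate the oscillation onset and the start of permanent exceedance.

    Returns (onset, permanent) as 1-based run numbers:
      onset     - first run whose statistic exceeds the critical value,
      permanent - first run from which every later statistic exceeds it.
    Either element is None if the corresponding phase never starts.
    """
    above = [value > critical_value for value in statistics]
    onset = next((i + 1 for i, flag in enumerate(above) if flag), None)

    permanent = None
    # Walk backwards from the last run while the statistic stays above the limit.
    for i in range(len(above) - 1, -1, -1):
        if not above[i]:
            break
        permanent = i + 1
    return onset, permanent


# Toy sequence of statistic values with a hypothetical critical value of 5.
stats = [1, 2, 1, 9, 2, 9, 9, 9]
print(operating_phases(stats, critical_value=5))
```

For the toy sequence the oscillations start at run 4 and the statistic stays above the critical value from run 6 onwards.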

CONCLUDING REMARKS
This paper proposed a simple method for the early detection of failures. The method makes it possible to detect a failure at a very early stage of its occurrence. Owing to its simplicity, the method is extremely easy to implement and very fast. The latter is important in real-time systems, to which intelligent monitoring systems belong.