HxGN EAM Crow AMSAA Reliability Evaluation Using Monte Carlo Simulation with Python Studio

November 19, 2024

Julio Roque

Assessing asset reliability and availability is an important step for any industry that wants to improve its financial performance. Different facilities have different assets with varying reliability requirements. HxGN EAM provides Crow-AMSAA as a technique to evaluate historical failure data and predict future failure events. HxGN EAM takes an asset’s failure history, plots the Crow-AMSAA reliability growth trend and uses a quantitative strategy to predict future failure times. The Crow-AMSAA growth plot illustrates the historical reliability or degradation of assets or equipment and builds a high-level timeline that correlates past failure events to predict future ones. The methodology provides a practical, high-level approach to maintenance and future financial planning.

Introduction to Crow-AMSAA:

The Crow-AMSAA cumulative failure formula, where N(t) is the cumulative number of failures at cumulative time t, can be written as follows:
$N(t) = \lambda t^{\beta}$

After taking the natural logarithm of both sides of the equation, the linear regression form is:
$\ln N(t) = \ln \lambda + \beta \ln t$

This equation describes a straight line on a log-log plot. Here, β is the slope parameter and indicates how the failure rate is changing:
•   β > 1: reliability is deteriorating; failures arrive faster
•   β = 1: reliability is steady, with no change in the failure rate
•   β < 1: reliability is improving; failures arrive more slowly
λ is the intercept parameter: ln λ is the intercept of the line. Here, we mainly discuss the Crow-AMSAA regression method. Crow-AMSAA IEC will also be simulated.
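The regression form above can be sketched in a few lines of Python. This is a minimal illustration, not the HxGN EAM implementation: the function name fit_crow_amsaa_regression and the synthetic data are assumptions made for this example, and the fit uses an ordinary least-squares line through the points (ln t, ln N).

```python
import numpy as np

def fit_crow_amsaa_regression(cumulative_times):
    """Least-squares fit of ln N(t) = ln(lambda) + beta * ln(t).

    Hypothetical helper for illustration; returns (lambda, beta).
    """
    t = np.asarray(cumulative_times, dtype=float)
    n = np.arange(1, len(t) + 1)          # cumulative failure count 1, 2, ..., N
    beta, ln_lambda = np.polyfit(np.log(t), np.log(n), 1)
    return float(np.exp(ln_lambda)), float(beta)

# Synthetic check: cumulative times generated exactly from N(t) = 0.5 * t**1.2
true_lambda, true_beta = 0.5, 1.2
times = (np.arange(1, 21) / true_lambda) ** (1.0 / true_beta)

lam, beta = fit_crow_amsaa_regression(times)
print(f"lambda = {lam:.4f}, beta = {beta:.4f}")
```

Because the synthetic points lie exactly on the model curve, the fit recovers the parameters used to generate them. Real failure data will scatter around the line.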

Once the λ and β parameters have been estimated from the historical failure events, the next failure time can be calculated as:
$t = \left( N(t) / \lambda \right)^{1/\beta}$
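As a small worked example of this formula, the sketch below assumes that the "next" failure is failure number N + 1, so N(t) is set to the current failure count plus one. The function name and the parameter values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def next_failure_time(lam, beta, failures_so_far):
    """Cumulative time at which failure number failures_so_far + 1 is expected.

    Inverts N(t) = lam * t**beta with N(t) = failures_so_far + 1 (an assumption).
    """
    return ((failures_so_far + 1) / lam) ** (1.0 / beta)

# Hypothetical parameters: lambda = 0.5 failures per day, beta = 1 (steady reliability)
t_next = next_failure_time(lam=0.5, beta=1.0, failures_so_far=9)
print(t_next)  # 20.0, since (10 / 0.5) ** 1 = 20
```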

The equation above yields a single number for the next failure time. How should this value be evaluated against the asset's actual next failure event? In other words, how certain is it that the asset will fail near that time, so that a cost-effective maintenance plan can be put in place?

Monte Carlo Simulation:

We start from the failure data of a single asset. For example, here is an asset with 50 recorded failures:

failure_data = [2.23, 25.38, 4.5, 0.97, 16.74, 26.94, 2.93, 28.1, 25.74, 28.81, 19.16, 20.36, 10.82, 12.33, 6.88, 20.67, 8.8, 6.18, 24.14, 10.11, 21.56, 27.85, 13.23, 6.21, 17.62, 5.07, 23.3, 0.17, 8.34, 11.63, 2.97, 9.73, 6.6, 25.06, 2.83, 22.81, 16.84, 25.17, 11.56, 0.03, 29.47, 16.63, 19.47, 7.46, 9.15, 12.53, 28.21, 16.16, 29.7, 15.85]
Note: The failure times are assumed to be in days.
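To make the link between the raw data and a cumulative plot concrete, the sketch below turns failure intervals into cumulative time and cumulative failure count. It assumes the values in failure_data are times between consecutive failures, which is consistent with their mean matching the MTBF reported later in this article. Only the first five values are used to keep the example short.

```python
import numpy as np

# First five values of failure_data, read as days between failures (an assumption)
intervals = np.array([2.23, 25.38, 4.5, 0.97, 16.74])

cumulative_time = np.cumsum(intervals)                   # x-axis of the growth plot
cumulative_failures = np.arange(1, intervals.size + 1)   # y-axis of the growth plot

for t, n in zip(cumulative_time, cumulative_failures):
    print(f"failure {n} at t = {t:.2f} days")
```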

From these failure data, the plot of cumulative failures against cumulative time is as below:

Figure 1
The next cumulative failure time from the Crow-AMSAA regression is approximately 902 days; Crow-AMSAA IEC predicts a next cumulative failure time of approximately 761 days. Which of these two values is more certain? The Monte Carlo simulation method answers this question.
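For readers unfamiliar with the second estimate, the sketch below shows the plain maximum-likelihood estimate for a power-law (Crow-AMSAA) process observed up to its last failure. This is an assumption about what "Crow-AMSAA IEC" refers to: the article does not state which variant HxGN EAM uses, and bias-corrected forms also exist. The function name and the input times are hypothetical.

```python
import math

def fit_crow_amsaa_mle(cumulative_times):
    """Plain MLE for N(t) = lam * t**beta, failure-terminated data.

    beta = N / sum(ln(T / t_i)) over all failures before the last one,
    lam  = N / T**beta, where T is the time of the last failure.
    """
    t = sorted(float(x) for x in cumulative_times)
    n, t_end = len(t), t[-1]
    beta = n / sum(math.log(t_end / ti) for ti in t[:-1])
    lam = n / t_end ** beta
    return lam, beta

# Hypothetical cumulative failure times, in days
lam_mle, beta_mle = fit_crow_amsaa_mle([10, 25, 45, 70, 100])
print(f"lambda = {lam_mle:.4f}, beta = {beta_mle:.4f}")
```

By construction, the fitted curve passes through the last observed point, so lam * T**beta equals the number of failures observed.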

Starting from one asset's failure data, a Monte Carlo simulation can simulate many similar assets, for example, 1000 identical or similar assets running under similar environmental conditions. What will their failure data look like? These 1000 assets will most likely all have different failure data. How much will the failure data differ?
In the Monte Carlo simulation, LOWER_SCALE_FACTOR and UPPER_SCALE_FACTOR establish the lower and upper bounds of a uniform distribution, from which random failure ratios are drawn and applied to the sample asset's failure data.

The scale factors control the size or magnitude of the random samples drawn. Defining a narrow range between the two scale factors means the random samples drawn are clustered tightly around the distribution mean. This narrows the range of possible outcomes and makes the simulation less random, which can provide more precision and certainty in the results. However, it may not cover the entire spectrum of potential outcomes, as it could underestimate outliers and fail to capture genuine variability. This could result in overly optimistic or unrealistic results. The systems being modeled should be considered, along with any limitations and biases, when choosing scale factors.

To prove the approach, we implemented a Monte Carlo simulation of the Crow-AMSAA method in HxGN EAM Flex Python, using the random.uniform method of the Python numpy library as below:

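import numpy
# One row of ratios per simulated asset (1000 assets), one ratio per recorded failure
sample_ratios = numpy.random.uniform(LOWER_SCALE_FACTOR, UPPER_SCALE_FACTOR, (1000, len(failure_data)))
sample_assets_failure_data = sample_ratios * numpy.array(failure_data)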

Note: Choosing meaningful and practical values for LOWER_SCALE_FACTOR and UPPER_SCALE_FACTOR requires expert engineering knowledge of the field.
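The effect of the scale factor bounds on the spread of the simulated data can be checked with a quick experiment. The sketch below is illustrative only: the two pairs of bounds and the single interval of 14.9 days are assumptions chosen for the comparison, and a seeded generator is used so the run is repeatable.

```python
import math
import numpy as np

rng = np.random.default_rng(42)   # fixed seed so the sketch is repeatable
interval = 14.9                   # one hypothetical time between failures, in days

narrow = interval * rng.uniform(0.9, 1.1, 10_000)   # tight bounds
wide = interval * rng.uniform(0.5, 1.5, 10_000)     # loose bounds

# For a uniform distribution, the standard deviation is (upper - lower) / sqrt(12)
print(f"narrow spread: {narrow.std():.2f} days (theory {interval * 0.2 / math.sqrt(12):.2f})")
print(f"wide spread:   {wide.std():.2f} days (theory {interval * 1.0 / math.sqrt(12):.2f})")
```

The wider bounds produce roughly five times the spread of the narrow ones, which is the trade-off between precision and coverage of outliers described above.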

If LOWER_SCALE_FACTOR = 0.5 and UPPER_SCALE_FACTOR = 1.5, one set of sample ratios could look like this:
[1.02288314, 1.45634595, 0.6981283 , 0.7960544 , 0.59421378,
1.19131256, 1.48215976, 1.46831805, 0.50246922, 1.31796817,
1.04895486, 0.64635312, 0.91661579, 1.30231391, 1.345419 ,
1.18388141, 0.9298947 , 1.4283526 , 1.36883547, 1.4547359 ,
1.09069835, 0.89235755, 0.66190575, 1.19232892, 0.79417586,
1.03352172, 1.36648513, 0.60758453, 0.72135672, 1.29239591,
1.4728285 , 1.47041789, 1.17215485, 0.59997316, 1.00921672,
0.91063229, 1.04947185, 0.64975156, 1.38684941, 1.12959921,
1.24096365, 0.75032886, 0.74375125, 0.79658476, 0.80167098,
0.95490523, 1.34015255, 0.66155976, 0.76567869, 0.93693251]

The corresponding sample failure data will be as below:

[2.28102940e+00, 3.69620602e+01, 3.14157735e+00, 7.72172765e-01,
9.94713863e+00, 3.20939604e+01, 4.34272809e+00, 4.12597371e+01,
1.29335576e+01, 3.79706630e+01, 2.00979752e+01, 1.31597495e+01,
9.91778282e+00, 1.60575305e+01, 9.25648275e+00, 2.44708287e+01,
8.18307337e+00, 8.82721909e+00, 3.30436882e+01, 1.47073800e+01,
2.35154564e+01, 2.48521577e+01, 8.75701308e+00, 7.40436258e+00,
1.39933787e+01, 5.23995514e+00, 3.18391034e+01, 1.03289369e-01,
6.01611504e+00, 1.50305645e+01, 4.37430065e+00, 1.43071661e+01,
7.73622200e+00, 1.50353273e+01, 2.85608332e+00, 2.07715224e+01,
1.76731059e+01, 1.63542467e+01, 1.60319792e+01, 3.38879764e-02,
3.65711988e+01, 1.24779690e+01, 1.44808369e+01, 5.94252229e+00,
7.33528944e+00, 1.19649626e+01, 3.78057034e+01, 1.06908057e+01,
2.27406570e+01, 1.48503804e+01]

After generating the data for 1000 samples and calculating the means of the slopes and intercepts, the Crow-AMSAA regression reliability growth graph is as below:

Figure 2
Comparing Figure 2 with Figure 1, it is easy to see that the Monte Carlo simulation gives a better fit to the plotted data than the direct Crow-AMSAA estimate, especially for the Crow-AMSAA IEC model.
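The averaging step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the HxGN EAM Python Studio template: it uses only the first eight values of failure_data, treats them as times between failures, fits each simulated asset with a least-squares line on the log-log scale, and then averages the fitted slopes and intercepts.

```python
import numpy as np

rng = np.random.default_rng(0)    # fixed seed so the sketch is repeatable
LOWER_SCALE_FACTOR, UPPER_SCALE_FACTOR = 0.5, 1.5

# First eight values of failure_data only, to keep the example short
intervals = np.array([2.23, 25.38, 4.5, 0.97, 16.74, 26.94, 2.93, 28.1])

# One row of ratios per simulated asset, then cumulative time per asset
ratios = rng.uniform(LOWER_SCALE_FACTOR, UPPER_SCALE_FACTOR, (1000, intervals.size))
sample_times = np.cumsum(ratios * intervals, axis=1)

# Fit ln N = ln(lambda) + beta * ln t for every simulated asset
log_n = np.log(np.arange(1, intervals.size + 1))
fits = np.array([np.polyfit(np.log(row), log_n, 1) for row in sample_times])

mean_beta, mean_ln_lambda = fits.mean(axis=0)   # columns are (slope, intercept)
print(f"mean beta = {mean_beta:.3f}, mean lambda = {np.exp(mean_ln_lambda):.3f}")
```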

Monte Carlo Simulation Data Distribution:

For a simulation of 1000 samples, one of the simulated distributions of the next failure time from Crow-AMSAA IEC is as below:

Figure 3
From Figure 3, about 154 assets have a next cumulative failure time of around 755 days, and about 110 assets have a next cumulative failure time of around 770 days. Crow-AMSAA IEC predicted a next cumulative failure time of about 761 days. The cumulative percentage of next failures is as below:

Figure 4

Confidence Interval (CI) of Monte Carlo Simulated Future Failures:
With a 95% confidence level, for the 1000 simulated samples:

For Crow-AMSAA regression, the confidence interval (CI) is about (909.6034543747164, 917.6936196759582), which means that, with 95% confidence, the mean simulated next failure time lies within this range. The predicted value of approximately 902 is not within this range.

Similarly, for Crow-AMSAA IEC, the CI range is: (758.9502687272709, 763.6172261296729). The predicted value 761 is within the range.
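A confidence interval of this kind can be computed from the simulated next failure times with a few lines of Python. The sketch below assumes a normal approximation for the mean (mean plus or minus 1.96 standard errors); the article does not state which method was used, and the function name and input values are hypothetical.

```python
import math
import statistics

def mean_confidence_interval_95(values):
    """95% CI for the mean, using the normal approximation (an assumption)."""
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return mean - 1.96 * standard_error, mean + 1.96 * standard_error

# Hypothetical simulated next failure times, in days
simulated = [742.0, 755.5, 761.2, 768.9, 759.4, 771.3, 750.8, 764.1]
low, high = mean_confidence_interval_95(simulated)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

Note that the interval narrows as the number of simulated samples grows, because the standard error shrinks with the square root of the sample size.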

Evaluating the normal cumulative distribution function (CDF) fitted to the simulation data at the Crow-AMSAA predictions, the normal CDF at the regression value 902 is about 42.76%; the normal CDF at the IEC value 761 is about 49.48%.
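The normal CDF used in this comparison can be evaluated with the Python standard library alone. The sketch below is a generic implementation based on the error function; the mean and standard deviation in the example are hypothetical and are not the values from the simulation above.

```python
import math

def normal_cdf(x, mean, std):
    """P(X <= x) for a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

# Hypothetical simulated distribution: mean 761 days, standard deviation 10 days
print(normal_cdf(761.0, mean=761.0, std=10.0))   # 0.5, the prediction sits at the mean
print(normal_cdf(751.0, mean=761.0, std=10.0))   # below 0.5, the prediction is early
```

A CDF value close to 50% means the prediction sits near the center of the simulated distribution, which is why the IEC value reads as the more central of the two.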

Mean Time Between Failures (MTBF):

The MTBF of the 50 observed failures is 14.9 days; the Crow-AMSAA regression simulation MTBF is 14.926041886590866 days; the Crow-AMSAA IEC simulation MTBF is 14.908653567807114 days.
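The observed MTBF can be reproduced directly from the data listed earlier, assuming each value in failure_data is a time between failures in days. The MTBF is then the arithmetic mean of the 50 values.

```python
failure_data = [
    2.23, 25.38, 4.5, 0.97, 16.74, 26.94, 2.93, 28.1, 25.74, 28.81,
    19.16, 20.36, 10.82, 12.33, 6.88, 20.67, 8.8, 6.18, 24.14, 10.11,
    21.56, 27.85, 13.23, 6.21, 17.62, 5.07, 23.3, 0.17, 8.34, 11.63,
    2.97, 9.73, 6.6, 25.06, 2.83, 22.81, 16.84, 25.17, 11.56, 0.03,
    29.47, 16.63, 19.47, 7.46, 9.15, 12.53, 28.21, 16.16, 29.7, 15.85,
]

# MTBF = total operating time / number of failures
mtbf = sum(failure_data) / len(failure_data)
print(round(mtbf, 2))  # 14.9
```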

A comparison of different metrics is shown below (a subset of the simulation results):

For the detailed script and calculations, please see the HxGN EAM Python Studio template.

About the authors:

Peng Tang, Software Lead

Peng has over 15 years of data mining software design, analysis and development experience. He is also skilled at heterogeneous network analysis. He is currently working as a software lead at HxGN EAM, mainly on machine learning algorithm development.
He received a Bachelor’s Degree and a Master’s Degree in Computer Science, a Master’s Degree in Mathematics and a Doctor of Philosophy in Information Technology with research on intelligent robotics and automation algorithms.

Marcel van Wingerden, EAM Product Owner

Marcel has over 35 years of experience in the asset management market in various roles, including system analysis and design, project management, system implementation, training, consultancy and support.
Since 1986, he has worked with the HxGN EAM product and helped customers with successful deployment.
Marcel has a Bachelor’s Degree in Business Administration in Management Information Systems.

Julio P. Roque, Vice President, Software Development

Julio is a technology leader with 20+ years of enterprise software experience, and he is equally comfortable building a new product or leading a major digital transformation. Julio joined Hexagon's Asset Lifecycle Intelligence division through the Infor EAM acquisition and has held roles at top enterprise software leaders, including North America and Asia-Pacific Services General Manager at IONA, Senior Director at SAP, and NOLA-APAC General Manager at Oracle.

At Hexagon's Asset Lifecycle Intelligence division, Julio is one of the leaders of the HxGN EAM team. He leads the software development for HxGN EAM Flex Python, Constraint Optimizer, and Asset Investment Planning. He also performs other functions within the HxGN EAM development team.

Julio attended the University of Puerto Rico and the University of Maryland. He is named in over seven patents and is a proud disabled United States Army Veteran.