International Journal of Mathematical, Engineering and Management Sciences

ISSN: 2455-7749

Reliability Importance of Components in a Real-Time Computing System with Standby Redundancy Schemes

Junjun Zheng
Department of Information Engineering, Hiroshima University, Higashi-Hiroshima, Japan.

Hiroyuki Okamura
Department of Information Engineering, Hiroshima University, Higashi-Hiroshima, Japan.

Tadashi Dohi
Department of Information Engineering, Hiroshima University, Higashi-Hiroshima, Japan.

DOI https://dx.doi.org/10.33889/IJMEMS.2018.3.2-007

Received on March 30, 2017; Accepted on September 27, 2017

Abstract

Component importance analysis measures the effect of component reliabilities on system reliability and is used to guide system design from the reliability point of view. Redundancy is widely applied to guarantee high reliability in real-time computing systems, and one commonly used type is standby redundancy. However, redundancy increases not only the complexity of a system but also the complexity of associated problems such as common-mode errors. In this paper, we consider the component importance analysis of a real-time computing system with warm standby redundancy in the presence of Common-Cause Failures (CCFs). Although CCFs are known to be a risk factor that degrades system reliability, it is difficult to evaluate component importance measures analytically in their presence. This paper introduces a Continuous-Time Markov Chain (CTMC) model for the real-time computing system and applies a CTMC-based component-wise sensitivity analysis, which can evaluate the component importance measures without any structure function of the system. In numerical experiments, we evaluate the effect of CCFs by comparing the system performance measure and component importance of the system without CCFs against those of the system with CCFs. We also compare the effect of CCFs on the system in warm and hot standby configurations.
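To make the idea of CTMC-based sensitivity concrete, the following is a minimal sketch and not the model used in the paper. It assumes a hypothetical four-state chain for one active unit and one warm standby unit, with no repair: state 0 (both units up), state 1 (active unit failed, standby took over), state 2 (standby failed, active unit still running), state 3 (system down). The parameter names `lam_a`, `lam_s`, `alpha` (warm standby failure-rate factor, with `alpha = 1` corresponding to hot standby) and `ccf` (common-cause failure rate) are illustrative assumptions. The transient solution uses uniformization, and the sensitivity is approximated by a central finite difference, which stands in for the analytical sensitivity computation the abstract refers to.

```python
import math


def generator(lam_a, lam_s, alpha, ccf):
    """Generator matrix of the assumed 4-state chain (state 3 is absorbing)."""
    out0 = lam_a + alpha * lam_s + ccf
    return [
        [-out0, lam_a, alpha * lam_s, ccf],  # both units up
        [0.0, -lam_s, 0.0, lam_s],           # standby now active
        [0.0, 0.0, -lam_a, lam_a],           # active unit alone
        [0.0, 0.0, 0.0, 0.0],                # system down
    ]


def transient(Q, p0, t, eps=1e-12, max_terms=10000):
    """Transient state probabilities via uniformization.

    Suitable for small q*t only; exp(-q*t) underflows for very large q*t.
    """
    n = len(Q)
    q = max(-Q[i][i] for i in range(n)) or 1.0  # uniformization rate
    # Embedded discrete-time kernel P = I + Q / q
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / q for j in range(n)]
         for i in range(n)]
    w = math.exp(-q * t)           # Poisson weight for k = 0
    v = list(p0)                   # holds p0 * P^k
    out = [w * x for x in v]
    acc = w                        # accumulated Poisson mass
    for k in range(1, max_terms + 1):
        if acc >= 1.0 - eps:
            break
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        w *= q * t / k
        acc += w
        out = [o + w * x for o, x in zip(out, v)]
    return out


def reliability(t, lam_a, lam_s, alpha, ccf):
    """Probability that the system has not reached the down state by time t."""
    pi = transient(generator(lam_a, lam_s, alpha, ccf), [1.0, 0.0, 0.0, 0.0], t)
    return 1.0 - pi[3]


def sensitivity(t, params, name, h=1e-6):
    """Central finite-difference estimate of dR(t)/d(params[name])."""
    up = dict(params)
    dn = dict(params)
    up[name] += h
    dn[name] -= h
    return (reliability(t, **up) - reliability(t, **dn)) / (2.0 * h)


# Illustrative parameter values only; they are not taken from the paper.
params = dict(lam_a=0.01, lam_s=0.01, alpha=0.5, ccf=0.001)
```

Calling `sensitivity(100.0, params, "lam_a")` and `sensitivity(100.0, params, "lam_s")` gives a rough per-component ranking under these assumed values, and setting `ccf=0.0` or `alpha=1.0` in `params` mimics the with/without-CCF and warm/hot comparisons described in the abstract.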

Keywords- Component importance measures, Standby redundancy, Real-time computing system, Common-cause failure, Markov chains.

Citation

Zheng, J., Okamura, H., & Dohi, T. (2018). Reliability Importance of Components in a Real-Time Computing System with Standby Redundancy Schemes. International Journal of Mathematical, Engineering and Management Sciences, 3(2), 64-89. https://dx.doi.org/10.33889/IJMEMS.2018.3.2-007.

Conflict of Interest

Acknowledgements

The first author would like to thank the China Scholarship Council (CSC) for the financial support.
