Friday, December 18, 2009

Summary






In this chapter we discuss the definition and measurement of system availability, perhaps one of the most important quality attributes in the modern era of Internet and network computing. We reference a couple of industry studies to show the current state of system availability, and we explore the relationships among reliability, availability, and the traditional defect level measurement in software development. The concept and measurement of availability are broader than those of reliability and defect level: they encompass intrinsic product quality (reliability or defect level), customer impact, and recovery and maintenance strategies. System availability is a customer-oriented concept and measure.
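
As a rough illustration of how availability combines reliability with recovery, the sketch below uses the common steady-state formula, availability = MTTF / (MTTF + MTTR). The formula is not stated in this summary and is assumed here; the function names and the sample figures are hypothetical.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the system is up.

    Assumes availability = MTTF / (MTTF + MTTR), where MTTF is the mean
    time to failure and MTTR is the mean time to repair (both in hours).
    """
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected downtime in minutes over a 365-day year."""
    if not 0.0 <= avail <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - avail) * 365 * 24 * 60


if __name__ == "__main__":
    # Hypothetical system: fails on average every 999 hours and takes
    # 1 hour on average to recover.
    a = availability(mttf_hours=999, mttr_hours=1)
    print(f"availability: {a:.4f}")
    print(f"downtime: {downtime_minutes_per_year(a):.1f} minutes per year")
```

The same availability can be reached by failing less often (a larger MTTF, that is, better intrinsic quality) or by recovering faster (a smaller MTTR), which is one way to see why the measure is broader than defect level alone.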


It is clear that the current quality of typical software is far from adequate for meeting the high-availability requirements of businesses and society.


There are several ways to collect customer outage data for quality improvement: direct customer input, data from the service process, and special customer surveys. Root cause analysis of customer outages and a process similar to the defect prevention process (discussed in Chapter 2) are highly recommended as key elements of an outage reduction plan. Quality improvement from this process should include both corrective actions for current problems and preventive actions for long-term outage reduction.


Finally, to complete the closed-loop process in our discussion, we cite several in-process metrics that are pertinent to outages and availability. We highly recommend that all projects track system crashes and hangs during the final test phase. This simple but critical metric can be implemented in different ways, ranging from sophisticated automated tracking to paper and pencil, by large as well as small teams.
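
At the simple end of that range, crash and hang tracking can be little more than a weekly tally. The following is a minimal sketch, assuming each event is logged as a date plus a kind ("crash" or "hang"); the function name and the sample events are hypothetical and do not come from the chapter.

```python
from collections import Counter
from datetime import date


def weekly_crash_hang_counts(events):
    """Tally crash and hang events per ISO (year, week).

    `events` is an iterable of (date, kind) pairs, where kind is
    "crash" or "hang". Returns a dict ordered by week.
    """
    counts = Counter()
    for day, kind in events:
        if kind not in ("crash", "hang"):
            raise ValueError(f"unknown event kind: {kind!r}")
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(iso[0], iso[1])] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    # Hypothetical log from the final test phase.
    log = [
        (date(2009, 12, 14), "crash"),
        (date(2009, 12, 16), "hang"),
        (date(2009, 12, 22), "crash"),
    ]
    for (year, week), n in weekly_crash_hang_counts(log).items():
        print(f"{year}-W{week:02d}: {n} crash/hang event(s)")
```

A team would typically watch whether the weekly count trends toward zero as the release date approaches; dividing by test hours per week would make weeks with different test effort comparable.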






