Title:
Reliability and availability of cloud computing
Personal Author:
Publication Information:
NJ : Wiley-IEEE Press, 2012.
Physical Description:
xviii, 323 p. : ill. ; 24 cm.
ISBN:
9781118177013
Added Author:

Available:*

Item Barcode:
30000010327990
Call Number:
QA76.585 B394 2012
Material Type:
Open Access Book
Item Category 1:
Book

Summary

A holistic approach to service reliability and availability of cloud computing

Reliability and Availability of Cloud Computing provides IS/IT system and solution architects, developers, and engineers with the knowledge needed to assess the impact of virtualization and cloud computing on service reliability and availability. It reveals how to select the most appropriate design for reliability diligence to assure that user expectations are met.

Organized in three parts (basics, risk analysis, and recommendations), this resource is accessible to readers of diverse backgrounds and experience levels. Numerous examples and more than 100 figures help readers visualize problems and better understand the topic, and the authors present risks and options in bulleted lists that can be applied directly to specific applications and problems.

Special features of this book include:

Rigorous analysis of the reliability and availability risks that are inherent in cloud computing
Simple formulas that explain the quantitative aspects of reliability and availability
Enlightening discussions of the ways in which virtualized applications and cloud deployments differ from traditional system implementations and deployments
Specific recommendations for developing reliable virtualized applications and cloud-based solutions

Reliability and Availability of Cloud Computing is the guide for IS/IT staff in business, government, academia, and non-governmental organizations who are moving their applications to the cloud. It is also an important reference for professionals in technical sales, product management, and quality management, as well as software and quality engineers looking to broaden their expertise.


Author Notes

ERIC BAUER is a reliability engineering manager in the Software, Solutions and Services Group of Alcatel-Lucent. The holder of more than a dozen U.S. patents, he is the author of Design for Reliability: Information and Computer-Based Systems, Beyond Redundancy: How Geographic Redundancy Can Improve Service Availability and Reliability of Computer-Based Systems, and Practical System Reliability, also available from Wiley-IEEE Press.

RANDEE ADAMS is a consulting member of technical staff in the Software, Solutions and Services Group of Alcatel-Lucent and the coauthor of Beyond Redundancy: How Geographic Redundancy Can Improve Service Availability and Reliability of Computer-Based Systems.


Table of Contents

Figures
Tables
Equations
Introduction
I Basics
1 Cloud Computing
1.1 Essential Cloud Characteristics
1.2 Common Cloud Characteristics
1.3 But What, Exactly, Is Cloud Computing?
1.4 Service Models
1.5 Cloud Deployment Models
1.6 Roles in Cloud Computing
1.7 Benefits of Cloud Computing
1.8 Risks of Cloud Computing
2 Virtualization
2.1 Background
2.2 What Is Virtualization?
2.3 Server Virtualization
2.4 VM Lifecycle
2.5 Reliability and Availability Risks of Virtualization
3 Service Reliability and Service Availability
3.1 Errors and Failures
3.2 Eight-Ingredient Framework
3.3 Service Availability
3.4 Service Reliability
3.5 Service Latency
3.6 Redundancy and High Availability
3.7 High Availability and Disaster Recovery
3.8 Streaming Services
3.9 Reliability and Availability Risks of Cloud Computing
II Analysis
4 Analyzing Cloud Reliability and Availability
4.1 Expectations for Service Reliability and Availability
4.2 Risks of Essential Cloud Characteristics
4.3 Impacts of Common Cloud Characteristics
4.4 Risks of Service Models
4.5 IT Service Management and Availability Risks
4.6 Outage Risks by Process Area
4.7 Failure Detection Considerations
4.8 Risks of Deployment Models
4.9 Expectations of IaaS Data Centers
5 Reliability Analysis of Virtualization
5.1 Reliability Analysis Techniques
5.2 Reliability Analysis of Virtualization Techniques
5.3 Software Failure Rate Analysis
5.4 Recovery Models
5.5 Application Architecture Strategies
5.6 Availability Modeling of Virtualized Recovery Options
6 Hardware Reliability, Virtualization, and Service Availability
6.1 Hardware Downtime Expectations
6.2 Hardware Failures
6.3 Hardware Failure Rate
6.4 Hardware Failure Detection
6.5 Hardware Failure Containment
6.6 Hardware Failure Mitigation
6.7 Mitigating Hardware Failures via Virtualization
6.8 Virtualized Networks
6.9 MTTR of Virtualized Hardware
6.10 Discussion
7 Capacity and Elasticity
7.1 System Load Basics
7.2 Overload, Service Reliability, and Service Availability
7.3 Traditional Capacity Planning
7.4 Cloud and Capacity
7.5 Managing Online Capacity
7.6 Capacity-Related Service Risks
7.7 Capacity Management Risks
7.8 Security and Service Availability
7.9 Architecting for Elastic Growth and Degrowth
8 Service Orchestration Analysis
8.1 Service Orchestration Definition
8.2 Policy-Based Management
8.3 Cloud Management
8.4 Service Orchestration's Role in Risk Mitigation
9 Geographic Distribution, Georedundancy, and Disaster Recovery
9.1 Geographic Distribution versus Georedundancy
9.2 Traditional Disaster Recovery
9.3 Virtualization and Disaster Recovery
9.4 Cloud Computing and Disaster Recovery
9.5 Georedundancy Recovery Models
9.6 Cloud and Traditional Collateral Benefits of Georedundancy
9.7 Discussion
III Recommendations
10 Applications, Solutions, and Accountability
10.1 Application Configuration Scenarios
10.2 Application Deployment Scenario
10.3 System Downtime Budgets
10.4 End-to-End Solutions Considerations
10.5 Attributability for Service Impairments
10.6 Solution Service Measurement
10.7 Managing Reliability and Service of Cloud Computing
11 Recommendations for Architecting a Reliable System
11.1 Architecting for Virtualization and Cloud
11.2 Disaster Recovery
11.3 IT Service Management Considerations
11.4 Many Distributed Clouds versus Fewer Huge Clouds
11.5 Minimizing Hardware-Attributed Downtime
11.6 Architectural Optimizations
12 Design for Reliability of Virtualized Applications
12.1 Design for Reliability
12.2 Tailoring DfR for Virtualized Applications
12.3 Reliability Requirements
12.4 Qualitative Reliability Analysis
12.5 Quantitative Reliability Budgeting and Modeling
12.6 Robustness Testing
12.7 Stability Testing
12.8 Field Performance Analysis
12.9 Reliability Roadmap
12.10 Hardware Reliability
13 Design for Reliability of Cloud Solutions
13.1 Solution Design for Reliability
13.2 Solution Scope and Expectations
13.3 Reliability Requirements
13.4 Solution Modeling and Analysis
13.5 Element Reliability Diligence
13.6 Solution Testing and Validation
13.7 Track and Analyze Field Performance
13.8 Other Solution Reliability Diligence Topics
14 Summary
14.1 Service Reliability and Service Availability
14.2 Failure Accountability and Cloud Computing
14.3 Factoring Service Downtime
14.4 Service Availability Measurement Points
14.5 Cloud Capacity and Elasticity Considerations
14.6 Maximizing Service Availability
14.7 Reliability Diligence
14.8 Concluding Remarks
Abbreviations
References
About the authors
Index