Title:

Dependable computing systems : paradigms, performance issues, and applications

Series:

Wiley series on parallel and distributed computing

Publication Information:

Hoboken, NJ : John Wiley and Sons, 2005

ISBN:

9780471674221

Subject Term:

Fault-tolerant computing

Fault-tolerant computing -- Case studies

Computers -- Reliability

Added Author:

Diab, Hassan B.

Zomaya, Albert Y.

Available:*

Library	Item Barcode	Call Number	Material Type	Item Category 1	Status
PSZ JB	30000010099379	QA76.9.F38 D46 2005	Open Access Book	Book	Unknown

A team of recognized experts leads the way to dependable computing systems

With computers and networks pervading every aspect of daily life, there is an ever-growing demand for dependability. In this unique resource, researchers and organizations will find the tools needed to identify and engage state-of-the-art approaches used for the specification, design, and assessment of dependable computer systems.

The first part of the book addresses models and paradigms of dependable computing, and the second part deals with enabling technologies and applications. Tough issues in creating dependable computing systems are also tackled, including:
* Verification techniques
* Model-based evaluation
* Adjudication and data fusion
* Robust communications primitives
* Fault tolerance
* Middleware
* Grid security
* Dependability in IBM mainframes
* Embedded software
* Real-time systems

Each chapter of this contributed work has been authored by a recognized expert. This is an excellent textbook for graduate and advanced undergraduate students in electrical engineering, computer engineering, and computer science, as well as a must-have reference that will help engineers, programmers, and technologists develop systems that are secure and reliable.

Author Notes

HASSAN B. DIAB, PhD, is Professor of Electrical and Computer Engineering, Faculty of Engineering and Architecture, American University of Beirut (AUB). He is currently Dean of the School of Engineering at AUB and Acting President of Dhofar University, Sultanate of Oman. He is the Associate Editor of Simulation: Transactions of the Society for Modeling and Simulation International and a founding member of the Arab Computer Society.

ALBERT Y. ZOMAYA, PhD, is the CISCO Systems Chair Professor of Internetworking, School of Information Technologies, The University of Sydney, and Deputy Director for Information Technology of the Sydney University Biological Informatics and Technology Centre. Dr. Zomaya has been the chair of the IEEE Technical Committee on Parallel Processing and has been awarded the IEEE Computer Society's Meritorious Service Award.

Reviews 1

Choice Review

Computer science defines dependability as the ability to deliver a trusted service and to avoid more frequent and severe failures than are acceptable. Attributes of dependability are availability, reliability, safety, confidentiality, integrity, and maintainability. Threats include faults, errors, and failures. Means to attain dependability include fault prevention, fault tolerance, fault removal, and fault forecasting. The growing reliance on computing systems makes research in and applications of dependability increasingly important. Editors Zomaya (Univ. of Sydney; editor in chief, "Wiley Book Series on Parallel and Distributed Computing," of which the book under review is part) and Diab (American Univ. of Beirut) have produced a solid collection of research papers on the specification, design, and assessment of dependable computer systems. A balance is struck between theoretical papers on fundamentals and practical papers on solutions using case studies, discussions of pros and cons, and lessons learned. Topics encompass verification techniques, tolerating arbitrary failures, robust wireless sensor networks, safety critical systems, dependability evaluation, voting, telemedicine, tracking errors, network resilience, safeguarding critical infrastructures, adaptive metaheuristics for routing, and reconfigurable computing. A foundation work in the field is Dependability: Basic Concepts and Terminology in English, French, German, Italian, and Japanese, ed. by J. C. Laprie (1992). Summing Up: Recommended. Graduate students through professionals. M. Mounts, Dartmouth College

Masahiro Fujita and Satoshi Komatsu and Hiroshi Saito
Assia Doudou and Benoit Garbinato and Rachid Guerraoui
Andrea Bondavalli and Silvano Chiaradonna and Felicita di Giandomenico
Behrooz Parhami
Amol Bakshi and Viktor K. Prasanna
Arun K. Somani
Felix C. Gartner and Stefan Pleisch
Lorenzo Strigini
Ali E. Abdallah and Jonathan P. Bowen and Nimal Nissanke
Denis Gracanin and Mohamed Eltoweissy and Stephan Olariu and Ashraf Wadaa
Kishor S. Trivedi and Archana Sathaye and Srinivasan Ramani
Joao Gabriel Silva and Henrique Madeira
Stephan Olariu and Kurt Maly and Edwin C. Foudriat and Sameh M. Yamany and Thomas Luckenbach
Lisa Spainhower
Martin Hiller and Arshad Jhumka and Neeraj Suri
Mohamed Younis
Bjarne E. Helvik and Otto Wittner
David Gamez and Simin Nadjm-Tehrani and John Bigham and Claudio Balducelli and Kalle Burbeck and Tobias Chyssler
Geyong Min and Mohamed Ould-Khaoua and Demetres D. Kouvatsos and Irfan U. Awan
Albert Y. Zomaya and Tysun Chan and Miro Kraetzl
Hassan B. Diab
Mohamed Younis and I-Hong Yeh and Nicholas Kyriakopoulos and Nikitas Alexandridis and Tarek El-Ghazawi

Preface	p. xxiii
Contributors	p. xxxv
Acknowledgments	p. xxxix
Part I Models and Paradigms	p. 1
1 Formal Verification Techniques for Digital Systems	p. 3
1.1 Introduction	p. 3
1.2 Basic Techniques for Formal Verification	p. 4
1.3 Verification Techniques for Combinational Circuit Equivalence	p. 7
1.4 Verification Techniques for Sequential Circuits	p. 14
1.5 Summary	p. 24
References	p. 24
2 Tolerating Arbitrary Failures With State Machine Replication	p. 27
2.1 Introduction	p. 27
2.2 System Model	p. 31
2.3 Total Order Broadcast	p. 32
2.4 Weak Interactive Consistency	p. 36
2.5 Muteness Failure Detector	p. 44
2.6 Concluding Remarks	p. 52
References	p. 55
3 Model-Based Evaluation as a Support to the Design of Dependable Systems	p. 57
3.1 Introduction	p. 57
3.2 The Role of Model-Based Evaluation in the Development of Dependable Systems	p. 58
3.3 Dependability Modeling Methodologies and Tools	p. 61
3.4 Analytical Modeling to Support Design Decisions	p. 68
3.5 Analytical Modeling to Support Fault Removal During Operational Life	p. 76
3.6 Summary	p. 82
References	p. 82
4 Voting: A Paradigm for Adjudication and Data Fusion in Dependable Systems	p. 87
4.1 Introduction	p. 87
4.2 Voting in Dependable Systems	p. 88
4.3 Voting Schemes and Problems	p. 94
4.4 Voting for Data Fusion	p. 98
4.5 Implementation Issues	p. 102
4.6 Unifying Concepts	p. 107
4.7 Conclusion	p. 110
References	p. 111
5 Robust Communication Primitives for Wireless Sensor Networks	p. 115
5.1 Introduction	p. 115
5.2 Defining Realistic Models	p. 117
5.3 Our System Model	p. 119
5.4 Permutation Routing in a Single-hop Topology: State-of-the-Art	p. 121
5.5 An Energy-Efficient Protocol Using a Low-Power Control Channel	p. 125
5.6 Our Routing Protocol for a Faulty Network	p. 132
5.7 Our Generalized Protocol for a Multichannel Network	p. 135
5.8 Concluding Remarks	p. 140
References	p. 140
6 System-Level Diagnosis and Implications in Current Context	p. 143
6.1 Issues in Large and Complex Computing Systems	p. 143
6.2 System-Level Diagnosis	p. 145
6.3 Classification of Diagnosable Systems	p. 148
6.4 Diagnosability Algorithms	p. 157
6.5 Diagnosis Algorithms	p. 160
6.6 Application of System-Level Diagnosis Algorithm	p. 165
6.7 Summary and Conclusions	p. 166
References	p. 167
7 Predicate Detection in Asynchronous Systems With Crash Failures	p. 171
7.1 Introduction	p. 171
7.2 Predicate Detection in Fault-Free Environments	p. 173
7.3 Failures and Failure Detection	p. 177
7.4 Predicate Detection in Faulty Environments	p. 183
7.5 Solving Predicate Detection in Faulty Environments	p. 194
7.6 Conclusion	p. 209
References	p. 211
8 Fault Tolerance Against Design Faults	p. 213
8.1 Introduction	p. 213
8.2 Examples and Principles	p. 215
8.3 Potential and Actual Benefits	p. 225
8.4 Design Solutions	p. 230
8.5 Summary	p. 236
References	p. 238
9 Formal Methods for Safety Critical Systems	p. 243
9.1 Introduction	p. 243
9.2 Specification of Safety	p. 245
9.3 Historical Background	p. 247
9.4 Safety	p. 248
9.5 Application Areas	p. 253
9.6 Specification Framework	p. 256
9.7 System State and Behavior	p. 262
9.8 Discussion	p. 265
9.9 Conclusion	p. 268
References	p. 269
Part II Enabling Technologies and Applications	p. 273
10 Dependability Support in Wireless Sensor Networks	p. 275
10.1 Motivation and Background	p. 276
10.2 Service Centric Model	p. 279
10.3 Conclusion	p. 283
References	p. 283
11 Availability Modeling in Practice	p. 285
11.1 Introduction	p. 285
11.2 Modeling Approaches	p. 286
11.3 Composite Availability and Performance Model	p. 292
11.4 Digital Equipment Corporation Case Study	p. 297
11.5 Conclusion	p. 315
References	p. 315
12 Experimental Dependability Evaluation	p. 319
12.1 Field Measurement	p. 321
12.2 Fault Injection	p. 323
12.3 Robustness Testing	p. 337
12.4 Recent Developments: Dependability Benchmarking	p. 340
12.5 Conclusion	p. 342
References	p. 343
13 A Dependable Architecture for Telemedicine in Support of Disaster Relief	p. 349
13.1 Introduction	p. 349
13.2 Telemedicine: State of the Art	p. 350
13.3 The WIRM System Architecture	p. 352
13.4 A Novel 3D Data Compression Technique	p. 356
13.5 Interactive Remote Visualization	p. 358
13.6 An Overview of H3M: Our Wireless Architecture	p. 359
13.7 Concluding Remarks	p. 366
References	p. 366
14 An Overview of IBM Mainframe Dependable Computing: From System/360 to zSeries	p. 369
14.1 Introduction	p. 369
14.2 Error Detection and Fault Isolation	p. 375
14.3 Instruction Level Retry	p. 380
14.4 Online Repair	p. 386
14.5 Summary	p. 391
References	p. 392
15 Tracking the Propagation of Data Errors in Software	p. 395
15.1 Introduction	p. 395
15.2 Target System Model	p. 396
15.3 Overview of the Tool Suite	p. 397
15.4 Setup: Experiment Design and Target Instrumentation	p. 401
15.5 Injection: Running Experiments	p. 407
15.6 Analysis: Obtaining Error Propagation Characteristics	p. 408
15.7 Example Results Generated by Propane	p. 409
15.8 Propane's Attributes and Main Characteristics	p. 414
15.9 Summary	p. 415
References	p. 416
16 Integrated Reliable Real-Time Systems	p. 419
16.1 Background	p. 421
16.2 Integration Issues	p. 425
16.3 Few Forward Steps	p. 429
16.4 An Example Aerospace Application	p. 432
16.5 Conclusion	p. 442
References	p. 443
17 Network Resilience by Emergent Behavior from Simple Autonomous Agents	p. 449
17.1 Introduction	p. 449
17.2 Network Resilience	p. 450
17.3 Handling Routing and Resources in Networks by Emergence	p. 457
17.4 Cross-Entropy Based Path Finding	p. 460
17.5 Finding "Best-Effort" Primary/Backup Paths	p. 468
17.6 Discussion	p. 473
17.7 Concluding Remarks	p. 475
References	p. 475
18 Safeguarding Critical Infrastructures	p. 479
18.1 Introduction	p. 479
18.2 Attacks, Failures, and Accidents	p. 480
18.3 Solutions	p. 483
18.4 The Safeguard Architecture	p. 486
18.5 Future Work	p. 497
18.6 Conclusion	p. 497
References	p. 498
19 Impact of Traffic Self-Similarity on the Performance of Routing Algorithms in Multicomputer Systems	p. 501
19.1 Introduction	p. 502
19.2 The k-ary n-Cube and Dimension-Ordered Routing	p. 504
19.3 Modeling of Traffic Self-Similarity	p. 506
19.4 The Analytical Model	p. 507
19.5 Impact of Self-Similar Traffic on Routing Performance	p. 518
19.6 Conclusions	p. 519
References	p. 520
Appendix 19.1 Notation	p. 523
20 Some Observations on Adaptive Meta-Heuristics for Routing in Datagram Networks	p. 525
20.1 Introduction	p. 525
20.2 The Routing Problem	p. 526
20.3 Genetic Algorithms and Routing	p. 532
20.4 Genetic Routing Protocol Design	p. 536
20.5 Genetic Routing Protocol Implementation	p. 547
20.6 Results and Analysis	p. 552
20.7 Conclusions	p. 560
References	p. 561
21 Reconfigurable Computing for Cryptography	p. 563
21.1 Introduction	p. 564
21.2 Reconfigurable Computing	p. 565
21.3 AES Cryptography	p. 576
21.4 Case Study: The Twofish Cipher on a Dynamic RC System	p. 579
21.5 Future of RC	p. 589
21.6 Conclusion	p. 590
References	p. 591
22 Dependability of Reconfigurable Computing	p. 597
22.1 FPGA Preliminaries	p. 598
22.2 FPGA Fault Taxonomy	p. 603
22.3 Handling FPGA Failures	p. 608
22.4 Conclusion and Open Issues	p. 621
References	p. 622
Index	p. 627
