Item Barcode | Call Number | Material Type | Item Category 1 |
---|---|---|---|
30000010114267 | QA76.88 H535 2006 | Open Access Book | Book |
Summary
The state of the art of high-performance computing
Prominent researchers from around the world have gathered to present the state-of-the-art techniques and innovations in high-performance computing (HPC), including:
* Programming models for parallel computing: graph-oriented programming (GOP), OpenMP, the stages and transformation (SAT) approach, the bulk-synchronous parallel (BSP) model, Message Passing Interface (MPI), and Cilk
* Architectural and system support, featuring the code tiling compiler technique, the MigThread application-level migration and checkpointing package, the new prefetching scheme of atomicity, a new "receiver makes right" data conversion method, and lessons learned from applying reconfigurable computing to HPC
* Scheduling and resource management issues with heterogeneous systems, bus saturation effects on SMPs, genetic algorithms for distributed computing, and novel task-scheduling algorithms
* Clusters and grid computing: design requirements, grid middleware, distributed virtual machines, data grid services and performance-boosting techniques, security issues, and open issues
* Peer-to-peer computing (P2P) including the proposed search mechanism of hybrid periodical flooding (HPF) and routing protocols for improved routing performance
* Wireless and mobile computing, featuring discussions of implementing the Gateway Location Register (GLR) concept in 3G cellular networks, maximizing network longevity, and comparisons of QoS-aware scatternet scheduling algorithms
* High-performance applications including partitioners, running Bag-of-Tasks applications on grids, using low-cost clusters to meet high-demand applications, and advanced convergent architectures and protocols
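Among the programming models listed above, bulk-synchronous parallelism (BSP) organizes a computation into supersteps: each processor computes locally, exchanges messages, and then waits at a global barrier. As a minimal illustrative sketch (not code from the book), the superstep structure can be simulated with Python threads; all names here are hypothetical:

```python
import threading

N = 4                                  # number of simulated BSP "processors"
barrier = threading.Barrier(N)         # the superstep synchronization point
inboxes = [[] for _ in range(N)]       # per-processor message queues
results = [0] * N

def worker(pid, value):
    # Superstep 1: local computation, then one-sided communication.
    local = value * value
    inboxes[(pid + 1) % N].append(local)   # send result to the right neighbour
    barrier.wait()                          # barrier ends superstep 1

    # Superstep 2: messages sent in the previous superstep are now visible.
    results[pid] = sum(inboxes[pid])
    barrier.wait()

threads = [threading.Thread(target=worker, args=(i, i + 1)) for i in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

The key BSP property shown is that a message becomes visible only after the barrier that closes the superstep in which it was sent, which makes the cost of each superstep easy to reason about.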
High-Performance Computing: Paradigm and Infrastructure is an invaluable compendium for engineers, IT professionals, and researchers and students of computer science and applied mathematics.
Author Notes
LAURENCE T. YANG is a Professor of Computer Science at St. Francis Xavier University, Canada. Dr. Yang served as vice chair of the IEEE Technical Committee on Supercomputing Applications (TCSA) until 2004 and has been an executive committee member of the IEEE Technical Committee on Scalable Computing (TCSC) since 2004. Dr. Yang has received many awards, including the Distinguished Contribution Award, 2004; Technical Achievement Award, 2004; Outstanding Achievement Award, 2002; University Research/Publication/Teaching Awards, 2000-2001, 2002-2003, and 2003-2004; and a Canada Foundation for Innovation (CFI) Award, 2003.
MINYI GUO received his PhD from the University of Tsukuba, Japan. He is currently an Associate Professor in the Department of Computer Software at the University of Aizu, Japan. In addition, Dr. Guo is Editor in Chief of the International Journal of Embedded Systems, and has written and edited books in the area of parallel and distributed computing, as well as embedded and ubiquitous computing.
Table of Contents
Preface |
Contributors |
Part 1 Programming Model |
1 ClusterGOP: A High-Level Programming Environment for Clusters, Fan Chan, Jiannong Cao, and Minyi Guo |
1.1 Introduction |
1.2 GOP Model and ClusterGOP Architecture |
1.3 VisualGOP |
1.4 The ClusterGOP Library |
1.5 MPMD Programming Support |
1.6 Programming Using ClusterGOP |
1.7 Summary |
2 The Challenge of Providing a High-Level Programming Model for High-Performance Computing, Barbara Chapman |
2.1 Introduction |
2.2 HPC Architectures |
2.3 HPC Programming Models: The First Generation |
2.4 The Second Generation of HPC Programming Models |
2.5 OpenMP for DMPs |
2.6 Experiments with OpenMP on DMPs |
2.7 Conclusions |
3 SAT: Toward Structured Parallelism Using Skeletons, Sergei Gorlatch |
3.1 Introduction |
3.2 SAT: A Methodology Outline |
3.3 Skeletons and Collective Operations |
3.4 Case Study: Maximum Segment Sum (MSS) |
3.5 Performance Aspect in SAT |
3.6 Conclusions and Related Work |
4 Bulk-Synchronous Parallelism: An Emerging Paradigm of High-Performance Computing, Alexander Tiskin |
4.1 The BSP Model |
4.2 BSP Programming |
4.3 Conclusions |
5 Cilk Versus MPI: Comparing Two Parallel Programming Styles on Heterogeneous Systems, John Morris, KyuHo Lee, and JunSeong Kim |
5.1 Introduction |
5.2 Experiments |
5.3 Results |
5.4 Conclusion |
6 Nested Parallelism and Pipelining in OpenMP, Marc Gonzalez, E. Ayguade, X. Martorell, and J. Labarta |
6.1 Introduction |
6.2 OpenMP Extensions for Nested Parallelism |
6.3 OpenMP Extensions for Thread Synchronization |
6.4 Summary |
7 OpenMP for Chip Multiprocessors, Feng Liu and Vipin Chaudhary |
7.1 Introduction |
7.2 3SoC Architecture Overview |
7.3 The OpenMP Compiler/Translator |
7.4 Extensions to OpenMP for DSEs |
7.5 Optimization for OpenMP |
7.6 Implementation |
7.7 Performance Evaluation |
7.8 Conclusions |
Part 2 Architectural and System Support |
8 Compiler and Run-Time Parallelization Techniques for Scientific Computations on Distributed-Memory Parallel Computers, PeiZong Lee, Chien-Min Wang, and Jan-Jan Wu |
8.1 Introduction |
8.2 Background Material |
8.3 Compiling Regular Programs on DMPCs |
8.4 Compiler and Run-Time Support for Irregular Programs |
8.5 Library Support for Irregular Applications |
8.6 Related Work |
8.7 Concluding Remarks |
9 Enabling Partial Cache-Line Prefetching Through Data Compression, Youtao Zhang and Rajiv Gupta |
9.1 Introduction |
9.2 Motivation for Partial Cache-Line Prefetching |
9.3 Cache Design Details |
9.4 Experimental Results |
9.5 Related Work |
9.6 Conclusion |
10 MPI Atomicity and Concurrent Overlapping I/O, Wei-Keng Liao, Alok Choudhary, Kenin Coloma, Lee Ward, Eric Russell, and Neil Pundit |
10.1 Introduction |
10.2 Concurrent Overlapping I/O |
10.3 Implementation Strategies |
10.4 Experimental Results |
10.5 Summary |
11 Code Tiling: One Size Fits All, Jingling Xue and Qingguang Huang |
11.1 Introduction |
11.2 Cache Model |
11.3 Code Tiling |
11.4 Data Tiling |
11.5 Finding Optimal Tile Sizes |
11.6 Experimental Results |
11.7 Related Work |
11.8 Conclusion |
12 Data Conversion for Heterogeneous Migration/Checkpointing, Hai Jiang, Vipin Chaudhary, and John Paul Walters |
12.1 Introduction |
12.2 Migration and Checkpointing |
12.3 Data Conversion |
12.4 Coarse-Grain Tagged RMR in MigThread |
12.5 Microbenchmarks and Experiments |
12.6 Related Work |
12.7 Conclusions and Future Work |
13 Receiving-Message Prediction and Its Speculative Execution, Takanobu Baba, Takashi Yokota, Kanemitsu Ootsu, Fumihito Furukawa, and Yoshiyuki Iwamoto |
13.1 Background |
13.2 Receiving-Message Prediction Method |