Item Barcode | Call Number | Material Type | Item Category 1 |
---|---|---|---|
30000010061311 | QA76.58 V73 2002 | Open Access Book | Book |
Summary
Cluster computers provide a low-cost alternative to multiprocessor systems for many applications. Building a cluster computer is within the reach of any computer user with solid C programming skills and a knowledge of operating systems, hardware, and networking. This book leads you through the design and assembly of such a system, and shows you how to measure and tune its overall performance.
A cluster computer is a multicomputer, a network of node computers running distributed software that makes them work together as a team. Distributed software turns a collection of networked computers into a distributed system. It presents the user with a single-system image and gives the system its personality. Software can turn a network of computers into a transaction processor, a supercomputer, or even a novel design of your own.
Some of the techniques used in this book's distributed algorithms might be new to many readers, so several chapters are dedicated to them. You will learn about the hardware needed to network several PCs, the operating system files that must be changed to support that network, and the multitasking and interprocess communication skills needed to put the network to good use.
Finally, there is a simple distributed transaction processing application in the book. Readers can experiment with it, customize it, or use it as a basis for something completely different.
Author Notes
Alex Vrenios is the founder and principal analyst at the Distributed Systems Research Lab
Table of Contents
Introduction
1 Linux Cluster Computer Fundamentals
Why a Cluster?
Architectural Design Features
Scalability
High Availability
Fault Tolerance
Feature Interdependence
Cluster Applications
Supercomputers
Transaction Processors
The Sample System: The Master-Slave Interface
Reader Skills and OS Information
The Remainder of This Book
Summary
Further Reading
2 Multiprocessor Architecture
Alternative Computer Architectures
Overlap with Multiple CPUs
A Taxonomy for Multiprocessors
Tightly Versus Loosely Coupled Multiprocessors
Distributed Shared Memory Systems
Cluster Architectures
Hardware Options
Node Computers
Interconnection Networks
Software Options
Performance Issues
Our Cluster System's Architecture
Summary
Further Reading
3 Inter-Process Communication
Subtasking with fork and execl
Sending Signals and Handling Received Signals
Using Shared Memory Areas
Using Semaphores with Shared Data
Messaging: UDP Versus TCP
IPC with UDP
IPC with TCP/IP
Internetworking Protocol Addresses
Messaging Across the Network
Automatically Starting Remote Servers
Summary
Further Reading
4 Assembling the Hardware for Your Cluster
Node Processors and Accessories
Hardware Accessories
Network Media and Interfaces
Switches or Hubs?
Network Cabling
Implementing an OS
Adding Network Support to the OS Installation
Our Cluster System's Network Topology
Summary
Further Reading
5 Configuring the Relevant Operating System Files
A Brief Review of the Cluster Configuration
The Linux root User
Logging in as root at the System Prompt
Altering Linux System Files as root
Changes to the /etc/hosts File
Changes to the /etc/fstab and /etc/exports Files
Using NFS with the /etc/fstab and /etc/exports Files
Remote Access Security
Optional Addition of a /home/chief/.rhosts File
Optional Changes to the /etc/passwd File
Remote Reference Commands
Summary
Further Reading
6 Configuring a User Environment for Software Development
An Overview of the Linux File System
Your /home/chief Home Directory
Using the C Compiler
Using the make Utility
Backup and Recovery
Summary
Further Reading
7 The Master-Slave Interface Software Architecture
The Client Process
The Serial Server Process
The Concurrent Server Process
The Distributed Server Process
How the Master-Slave Interface Works
System Limitations
Summary
Further Reading
8 External Performance Measurement and Analysis
Query Generation
Inter-Arrival Time Distributions
Checking for an Accurate Response
Estimating and Displaying Network Utilization
Displaying Response Time Statistics
The External Performance of Your MSI Server
Summary
Further Reading
9 Internal Performance Measurement and Timing
Profiling Software Execution
Distributed System Execution Profiling
Event Timing Techniques
Execution Phase Timing Plots
System Performance Improvements
Final MSI Performance Results
Summary
Further Reading
10 Robust Software
Alarm Exits
Timeouts
Subtask Restarts
Main Task Restarts
Reattaching Shared Memory
Reliable UDP Communication
Summary
Further Reading
11 Further Explorations
Beowulf-like Supercomputers
Supercomputer Applications
Ad Hoc Peer-to-Peer Networks
Future Applications
Summary
Further Reading
12 Conclusions
Multiprocessor Architectures
Cluster Configuration
Distributed Applications
Final Comments
Appendix
A The Source Code
Query-Generating Client
Master-Slave Interface
Near Real-Time Performance
Makefile
Index