Item Barcode | Call Number | Material Type | Item Category 1 |
---|---|---|---|
30000010058228 | QA76.8.S65 B83 1996 | Open Access Book | Book |
Summary
This book describes the Splash 2 computing system as designed and built at the Supercomputing Research Center. Splash 2 is a novel attached processor that uses Xilinx 4010 FPGAs as its processing elements and VHDL as its application programming language. This is the first publication that details the complete Splash 2 project: the hardware and software systems, their architecture and implementations, and the design process by which the architecture evolved from an earlier version of the machine. The text explains why the machine was engineered the way it was. In addition to describing the machine, the book presents several applications in detail, giving the reader an understanding of the capabilities and the limitations of this kind of computing device.
The Splash 2 program is significant for two reasons. First, Splash 2 is part of a complete computer system that achieves supercomputer-like performance on a number of different applications. Second, this large system is capable of performing real computations on real problems. To understand what happens when the application programmer is permitted to design the processor architecture of the machine that executes his programs, it is necessary to see the system as a whole. This book looks in depth at one of the handful of data points in the design space of this new kind of machine.
Author Notes
Duncan A. Buell, NCR Professor of Computer Science and Engineering, Dept. of Computer Science and E University of South Carolina.
Table of Contents
Preface
1 Custom Computing Machines: an Introduction
1.1 Introduction
1.2 The Context for Splash 2
1.2.1 FPGAs
1.2.2 Architecture
1.2.3 Programming
2 The Architecture of Splash 2
2.1 Introduction
2.2 The Building Blocks
2.3 The System Architecture
2.4 Data Paths
2.5 The Splash 2 Array Board
2.5.1 The Linear Array
2.5.2 The Splash 2 Crossbar
2.5.3 Xilinx Chip X0 and Broadcast Mode
2.6 The Interface Board and Control Features
3 Hardware Implementation
3.1 Introduction
3.2 Development Board Design
3.3 Interface Board Design
3.3.1 DMA Channel
3.3.2 XL and XR
3.3.3 Interrupts
3.3.4 Clock
3.3.5 Programming and Readback
3.3.6 Miscellaneous Registers
3.4 Array Board Design
3.4.1 Processing Element
3.4.2 Control Element
3.4.3 External Memory Access
3.4.4 Crossbar
3.4.5 Programming and Readback
3.4.6 Miscellaneous Registers
4 Splash 2: the Evolution of a New Architecture
4.1 Splash 1
4.2 Splash 2: Thoughts on a Redesign
4.3 Programming Language
4.4 Choice of FPGAs
4.5 Choice of Host and Bus
4.6 Chip-to-Chip Interconnections
4.7 Multitasking
4.8 Chip X0 and Broadcast
4.9 Other Design Decisions
5 Software Architecture
5.1 Introduction
5.2 Background
5.3 VHDL as a Programming Language
5.3.1 History and Purpose of VHDL
5.3.2 VHDL Language Features
5.3.3 Problems with VHDL
5.4 Software Environment
5.5 Programmer's View of Splash 2
5.5.1 Programming Process
5.5.2 Processing Element View
5.5.3 Interface Board View
5.5.4 Host View
6 Software Implementation
6.1 Introduction
6.2 VHDL Environment
6.2.1 Splash 2 VHDL Library
6.2.2 Standard Entity Declarations
6.2.3 Programming Style
6.3 Splash 2 Simulator
6.3.1 Structure
6.3.2 Configuring the Simulator
6.3.3 Input and Output
6.3.4 Crossbar and Memory Models
6.3.5 Hardware Constraints
6.4 Compilation
6.4.1 Logic Synthesis
6.4.2 Physical Mapping
6.4.3 Debugging Support
6.5 Runtime System
6.5.1 T2: A Symbolic Debugger
6.5.2 Runtime Library
6.5.3 Device Driver
6.6 Diagnostics
7 A Data Parallel Programming Model
7.1 Introduction
7.2 Data-parallel Bit C
7.2.1 dbC Overview
7.2.2 dbC Example
7.3 Compiling from dbC to Splash 2
7.3.1 Creating a Specialized SIMD Engine
7.3.2 Generic SIMD Code
7.3.3 Generating VHDL
7.4 Global Operations
7.4.1 Nearest-Neighbor Communication
7.4.2 Reduction Operations
7.4.3 Host/Processor Communication
7.5 Optimization: Macro Instructions
7.5.1 Creating a Macro Instruction
7.5.2 Discussion
7.6 Evaluation: Genetic Database Search
7.7 Conclusions and Future Work
8 Searching Genetic Databases on Splash 2
8.1 Introduction
8.1.1 Edit Distance
8.1.2 Dynamic Programming Algorithm
8.2 Systolic Sequence Comparison
8.2.1 Bidirectional Array
8.2.2 Unidirectional Array
8.3 Implementation
8.3.1 Modular Encoding
8.3.2 Configurable Parameters
8.3.3 Bidirectional Array
8.3.4 Unidirectional Array
8.4 Benchmarks
8.5 Discussion
8.6 Conclusions
9 Text Searching on Splash 2
9.1 Introduction
9.2 The Text Searching Algorithm
9.3 Description of the Single-Byte Splash Program
9.4 Timings, Discussion
9.5 Outline of the 16-bit Approach
9.6 Conclusions
10 Fingerprint Matching on Splash 2
10.1 Introduction
10.2 Background
10.2.1 Pattern Recognition Systems
10.2.2 Terminology
10.2.3 Stages in AFIS
10.3 Splash 2 Architecture and Programming Models
10.4 Fingerprint Matching Algorithm
10.4.1 Minutia Matching
10.4.2 Matching Algorithm
10.5 Parallel Matching Algorithm
10.5.1 Preprocessing on the Host
10.5.2 Computations on Splash
10.5.3 VHDL Specification for X0
10.6 Simulation and Synthesis Results
10.7 Execution on Splash 2
10.7.1 User Interface
10.7.2 Performance Analysis
10.8 Conclusions
11 High-Speed Image Processing With Splash 2
11.1 Introduction
11.2 The VTSplash System
11.3 Image Processing Terminology and Architectural Issues
11.4 Case Study: Median Filtering
11.5 Case Study: Image Pyramid Generation
11.5.1 Gaussian Pyramid
11.5.2 Two Implementations for Gaussian Pyramid on Splash 2
11.5.3 The Hybrid Pipeline Gaussian Pyramid Structure
11.5.4 The Laplacian Pyramid
11.5.5 Implementation of the Laplacian Pyramid on Splash 2
11.6 Performance
11.7 Summary
12 The Promise and the Problems
12.1 Some Bottom-Line Conclusions
12.1.1 High Bandwidth I/O Is a Must
12.1.2 Memory Is a Must
12.1.3 Programming Is Possible, and Becoming More So
12.1.4 The Programming Environment Is Crucial
12.2 To Where from Here?
12.3 If Not Splash 3, Then What?
12.3.1 Architectures
12.3.2 Custom Processors
12.3.3 Languages
12.4 The "Killer" Applications
12.5 Final Words
A Splash 2 Development--The Project Manager's Summary
B An Example Application
References