Title:
Practical parallel rendering
Publication Information:
Natick, MA : AK Peters, 2002.
Physical Description:
xiii, 370 p. : ill. (some col.) ; 24 cm.
ISBN:
9781568811796
Available:
Item Barcode | Call Number | Material Type | Item Category 1 |
---|---|---|---|
30000010193567 | QA76.58 P72 2002 | Open Access Book | Book |
Summary
Meeting the growing demands for speed and quality in rendering computer graphics images requires new techniques, and parallel rendering is one of the most practical solutions. This book addresses the basic issues of rendering within a parallel or distributed computing environment and considers the strengths and weaknesses of multiprocessor machines and networked render farms for graphics rendering. Case studies of working applications demonstrate in detail practical ways of dealing with the complex issues involved in parallel processing.
Author Notes
Alan Chalmers
Table of Contents
Preface | p. xi |
I Parallel Rendering | p. 1 |
1 Introduction to Parallel Processing | p. 3 |
1.1 Concepts | p. 4 |
1.1.1 Dependencies | p. 5 |
1.1.2 Scalability | p. 6 |
1.1.3 Control | p. 7 |
1.2 Classification of Parallel Systems | p. 7 |
1.2.1 Parallel versus Distributed Systems | p. 13 |
1.3 The Relationship of Tasks and Data | p. 14 |
1.3.1 Inherent Difficulties | p. 15 |
1.3.2 Tasks | p. 16 |
1.3.3 Data | p. 16 |
1.4 Evaluating Parallel Implementations | p. 17 |
1.4.1 Realization Penalties | p. 18 |
1.4.2 Performance Metrics | p. 20 |
1.4.3 Efficiency | p. 25 |
2 Task Scheduling and Data Management | p. 31 |
2.1 Problem Decomposition | p. 31 |
2.1.1 Algorithmic Decomposition | p. 32 |
2.1.2 Domain Decomposition | p. 32 |
2.1.3 Abstract Definition of a Task | p. 34 |
2.1.4 System Architecture | p. 34 |
2.2 Computational Models | p. 36 |
2.2.1 Data Driven Model | p. 36 |
2.2.2 Demand Driven Model | p. 41 |
2.2.3 Hybrid Computational Model | p. 46 |
2.3 Task Management | p. 46 |
2.3.1 Task Definition and Granularity | p. 46 |
2.3.2 Task Distribution and Control | p. 48 |
2.3.3 Algorithmic Dependencies | p. 49 |
2.4 Task Scheduling Strategies | p. 53 |
2.4.1 Data Driven Task Management Strategies | p. 53 |
2.4.2 Demand Driven Task Management Strategies | p. 54 |
2.4.3 Task Manager Process | p. 59 |
2.4.4 Distributed Task Management | p. 62 |
2.4.5 Preferred Bias Task Allocation | p. 64 |
2.5 Data Management | p. 65 |
2.5.1 World Model of the Data: No Data Management Required | p. 66 |
2.5.2 Virtual Shared Memory | p. 67 |
2.5.3 The Data Manager | p. 69 |
2.5.4 Consistency | p. 76 |
2.5.5 Minimizing the Impact of Remote Data Requests | p. 80 |
2.5.6 Data Management for Multistage Problems | p. 85 |
3 Parallel Global Illumination Algorithms | p. 89 |
3.1 Rendering | p. 90 |
3.2 Parallel Processing | p. 92 |
3.3 Ray Tracing | p. 94 |
3.4 Spatial Subdivisions | p. 96 |
3.4.1 Parallel Ray Tracing | p. 99 |
3.4.2 Demand Driven Ray Tracing | p. 99 |
3.4.3 Data Parallel Ray Tracing | p. 104 |
3.4.4 Hybrid Scheduling | p. 107 |
3.5 Radiosity | p. 109 |
3.5.1 Form Factors | p. 110 |
3.5.2 Parallel Radiosity | p. 111 |
3.6 Full Matrix Radiosity | p. 112 |
3.6.1 Setting Up the Matrix of Form Factors | p. 113 |
3.6.2 Solving the Matrix of Form Factors | p. 115 |
3.6.3 Group Iterative Methods | p. 115 |
3.7 Progressive Refinement | p. 116 |
3.7.1 Parallel Shooting | p. 118 |
3.8 Hierarchical Radiosity | p. 120 |
3.8.1 Parallel Hierarchical Radiosity | p. 121 |
3.9 Particle Tracing | p. 122 |
3.9.1 Parallel Particle Tracing | p. 123 |
3.9.2 Density Estimation | p. 124 |
3.10 Data Distribution and Data Locality | p. 125 |
3.10.1 Data Distribution | p. 126 |
3.10.2 Visibility Preprocessing | p. 127 |
3.10.3 Environment Mapping | p. 128 |
3.10.4 Geometric Simplification | p. 128 |
3.10.5 Directional Caching | p. 130 |
3.10.6 Reordering Computations | p. 130 |
3.11 Discussion | p. 131 |
4 Overview of Parallel Graphics Hardware | p. 133 |
4.1 Pipelining | p. 133 |
4.2 Parallelism in Graphics Cards | p. 136 |
4.2.1 3DLABS Products | p. 136 |
4.2.2 Hewlett-Packard Products | p. 139 |
4.2.3 SGI Products (Silicon Graphics, Inc.) | p. 142 |
4.2.4 UNC Products | p. 146 |
4.2.5 Pomegranate Graphics Chip | p. 149 |
4.3 Conclusion | p. 151 |
5 Coherence in Ray Tracing | p. 153 |
5.1 Scene Analysis | p. 154 |
5.1.1 Distribution of Data Accesses | p. 155 |
5.1.2 Temporal Characteristics | p. 158 |
5.1.3 Temporal Behaviour per Ray Type | p. 162 |
5.1.4 Conclusions | p. 164 |
5.2 Animation Analysis | p. 165 |
5.2.1 Background | p. 166 |
5.2.2 Related Work | p. 167 |
5.2.3 Frame Coherence Algorithm | p. 169 |
5.2.4 Parallel Frame Coherence Algorithm | p. 174 |
5.2.5 Results | p. 176 |
5.2.6 Summary | p. 184 |
II Case Studies | p. 185 |
6 Interactive Ray Tracing on a Supercomputer | p. 187 |
6.1 System Architecture | p. 188 |
6.1.1 Conventional Operation | p. 189 |
6.1.2 Frameless Rendering | p. 192 |
6.1.3 Performance | p. 193 |
6.2 Ray Tracing for Volume Visualization | p. 194 |
6.2.1 Background | p. 195 |
6.2.2 Traversal Optimizations | p. 197 |
6.2.3 Algorithms | p. 201 |
6.2.4 Results | p. 205 |
6.2.5 Discussion | p. 211 |
6.3 Ray Tracing for Terrain Visualization | p. 214 |
6.4 Conclusions | p. 215 |
7 Interactive Ray Tracing on PCs | p. 217 |
7.1 Introduction | p. 217 |
7.1.1 Previous Work | p. 220 |
7.2 An Optimized Ray Tracing Implementation | p. 221 |
7.2.1 Code Complexity | p. 221 |
7.2.2 Caching | p. 222 |
7.2.3 Coherence through Packets of Rays | p. 223 |
7.2.4 Parallelism through SIMD Extensions | p. 223 |
7.3 Ray Triangle Intersection Computation | p. 223 |
7.3.1 Optimized Barycentric Coordinate Test | p. 223 |
7.3.2 Evaluating Instruction Level Parallelism | p. 224 |
7.3.3 SIMD Barycentric Coordinate Test | p. 224 |
7.4 BSP Traversal | p. 226 |
7.4.1 Traversal Algorithm | p. 226 |
7.4.2 Memory Layout for Better Caching | p. 228 |
7.4.3 Traversal Overhead | p. 229 |
7.5 SIMD Phong Shading | p. 229 |
7.6 Performance of the Ray Tracing Engine | p. 231 |
7.6.1 Comparison to Other Ray Tracers | p. 231 |
7.6.2 Reflection and Shadow Rays | p. 233 |
7.6.3 Comparison with Rasterization Hardware | p. 234 |
7.7 Interactive Ray Tracing on PC Clusters | p. 236 |
7.7.1 Overview | p. 238 |
7.8 Distributed Data Management | p. 239 |
7.8.1 Explicit Data Management | p. 239 |
7.8.2 Preprocessing | p. 241 |
7.9 Load Balancing | p. 241 |
7.10 Implementation | p. 242 |
7.11 Results | p. 243 |
7.12 Conclusions | p. 246 |
8 The "Kilauea" Massively Parallel Ray Tracer | p. 249 |
8.1 What Is the Kilauea Project? | p. 249 |
8.2 Basic Idea | p. 250 |
8.3 System Design | p. 251 |
8.3.1 Hardware Environment | p. 251 |
8.3.2 Pthreads | p. 252 |
8.3.3 Message Passing | p. 252 |
8.3.4 Front-End Process | p. 253 |
8.3.5 Launching Kilauea | p. 253 |
8.3.6 Single Executable Binary | p. 254 |
8.3.7 Multiframe Rendering | p. 254 |
8.3.8 Global Illumination Renderer | p. 255 |
8.4 The ShotData File Format | p. 256 |
8.5 Parallel Ray Tracing | p. 261 |
8.6 Implementation | p. 268 |
8.6.1 Low-Level Data Structure | p. 268 |
8.6.2 MPI (Message Passing Interface) Layer | p. 277 |
8.6.3 Tcl Command Interface | p. 279 |
8.6.4 Rank and Task | p. 281 |
8.6.5 Details of Ray Tracing | p. 288 |
8.6.6 Shading Computation | p. 291 |
8.6.7 Photon Map Method | p. 306 |
8.6.8 Things to Note in Shading Computation | p. 310 |
8.6.9 Development in General | p. 314 |
8.7 Rendering Results | p. 316 |
8.7.1 Sample 1: Quatro | p. 316 |
8.7.2 Sample 2: Jeep | p. 319 |
8.7.3 Sample 3: Jeep 8 | p. 321 |
8.7.4 Consideration of Rendering Results | p. 322 |
8.8 Conclusion | p. 323 |
8.9 Future Plans and Tasks | p. 325 |
9 Parallel Ray Tracing on a Chip | p. 329 |
9.1 The Smart Memories Chip | p. 329 |
9.2 The SHARP Ray Tracer | p. 331 |
9.3 Simulation Results | p. 333 |
9.3.1 Caching | p. 334 |
9.3.2 Estimated Performance | p. 335 |
9.4 Conclusions | p. 336 |
Bibliography | p. 337 |
Index | p. 363 |
Author Biographies | p. 369 |