Availability:

| Library | Item Barcode | Call Number | Material Type | Item Category 1 | Status |
|---|---|---|---|---|---|
|  | 30000003155888 | QA76.87 G34 1994 | Open Access Book | Book | On Order |
Summary
Neural Network Learning and Expert Systems is the first book to present a unified, in-depth development of neural network learning algorithms and neural network expert systems. Especially suitable for students and researchers in computer science, engineering, and psychology, this text and reference provides a systematic development of neural network learning algorithms from a computational perspective, coupled with an extensive exploration of neural network expert systems that shows how the power of neural network learning can be harnessed to generate expert systems automatically.
Features include a comprehensive treatment of the standard learning algorithms (with many proofs), along with much original research on algorithms and expert systems. Additional chapters explore constructive algorithms, introduce computational learning theory, and focus on expert system applications to noisy and redundant problems.
For students there is a large collection of exercises, as well as a series of programming projects that lead to an extensive neural network software package. All of the neural network models examined can be implemented using standard programming languages on a microcomputer.
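The claim that every model in the book runs in a standard language on a modest machine is easy to check for the simplest case. Below is a minimal sketch, not taken from the book's own projects, of the single-cell perceptron learning rule covered in section 3.1: cycle through the training examples and, on each mistake, add the (label-scaled) input to the weights. The function name, data encoding, and epoch cap are illustrative assumptions.

```python
def perceptron_learn(examples, epochs=100):
    """Train a single threshold cell on examples labeled +1 or -1.

    examples: list of (inputs, label), with inputs a tuple of +1/-1
    values. Returns the learned (weights, bias).
    """
    n = len(examples[0][0])
    w = [0.0] * n
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in examples:
            s = b + sum(wi * xi for wi, xi in zip(w, x))
            out = 1 if s > 0 else -1
            if out != y:  # misclassified: nudge weights toward the label
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                errors += 1
        if errors == 0:  # a full error-free pass: done (separable case)
            break
    return w, b

# Logical AND in the +1/-1 encoding is linearly separable, so the
# perceptron convergence theorem guarantees this loop terminates.
data = [((-1, -1), -1), ((-1, 1), -1), ((1, -1), -1), ((1, 1), 1)]
w, b = perceptron_learn(data)
for x, y in data:
    s = b + sum(wi * xi for wi, xi in zip(w, x))
    assert (1 if s > 0 else -1) == y
```

For nonseparable training sets this rule never settles down, which is exactly the gap the book's pocket algorithm (section 3.2) addresses.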
Reviews
Choice Review
Gallant's book deserves to go to the head of the list of the many excellent introductions to neural networks. Well organized, the nonspecialist sections provide a coherent and comprehensive introduction and the starred specialist sections take the reader to the frontiers of research. A list of research problems is also supplied. The first major section clearly and logically discusses fundamental principles; the second covers learning in single-layer networks. The third section proceeds to multilayer networks and includes a thorough discussion of back-propagation. The fourth section, a special feature, covers pioneering work by Gallant on neural network expert systems. Because of the lucid and engaging style of writing and the clarity of exposition, the serious general reader would be able to read at least the nonspecialist sections with pleasure and profit. A worthy acquisition for public libraries, and definitely a must for all college and university libraries. All levels. R. Bharath; Northern Michigan University
Table of Contents
Foreword |
I Basics |
1 Introduction and Important Definitions |
1.1 Why Connectionist Models? |
1.1.1 The Grand Goals of AI and Its Current Impasse |
1.1.2 The Computational Appeal of Neural Networks |
1.2 The Structure of Connectionist Models |
1.2.1 Network Properties |
1.2.2 Cell Properties |
1.2.3 Dynamic Properties |
1.2.4 Learning Properties |
1.3 Two Fundamental Models: Multilayer Perceptrons (MLP's) and Backpropagation Networks (BPN's) |
1.3.1 Multilayer Perceptrons (MLP's) |
1.3.2 Backpropagation Networks (BPN's) |
1.4 Gradient Descent |
1.4.1 The Algorithm |
1.4.2 Practical Problems |
1.4.3 Comments |
1.5 Historic and Bibliographic Notes |
1.5.1 Early Work |
1.5.2 The Decline of the Perceptron |
1.5.3 The Rise of Connectionist Research |
1.5.4 Other Bibliographic Notes |
1.6 Exercises |
1.7 Programming Project |
2 Representation Issues |
2.1 Representing Boolean Functions |
2.1.1 Equivalence of {+1, -1, 0} and {1, 0} Forms |
2.1.2 Single-Cell Models |
2.1.3 Nonseparable Functions |
2.1.4 Representing Arbitrary Boolean Functions |
2.1.5 Representing Boolean Functions Using Continuous Connectionist Models |
2.2 Distributed Representations |
2.2.1 Definition |
2.2.2 Storage Efficiency and Resistance to Error |
2.2.3 Superposition |
2.2.4 Learning |
2.3 Feature Spaces and ISA Relations |
2.3.1 Feature Spaces |
2.3.2 Concept-Function Unification |
2.3.3 ISA Relations |
2.3.4 Binding |
2.4 Representing Real-Valued Functions |
2.4.1 Approximating Real Numbers by Collections of Discrete Cells |
2.4.2 Precision |
2.4.3 Approximating Real Numbers by Collections of Continuous Cells |
2.5 Example: Taxtime! |
2.6 Exercises |
2.7 Programming Projects |
II Learning In Single-Layer Models |
3 Perceptron Learning and the Pocket Algorithm |
3.1 Perceptron Learning for Separable Sets of Training Examples |
3.1.1 Statement of the Problem |
3.1.2 Computing the Bias |
3.1.3 The Perceptron Learning Algorithm |
3.1.4 Perceptron Convergence Theorem |
3.1.5 The Perceptron Cycling Theorem |
3.2 The Pocket Algorithm for Nonseparable Sets of Training Examples |
3.2.1 Problem Statement |
3.2.2 Perceptron Learning Is Poorly Behaved |
3.2.3 The Pocket Algorithm |
3.2.4 Ratchets |
3.2.5 Examples |
3.2.6 Noisy and Contradictory Sets of Training Examples |
3.2.7 Rules |
3.2.8 Implementation Considerations |
3.2.9 Proof of the Pocket Convergence Theorem |
3.3 Khachiyan's Linear Programming Algorithm |
3.4 Exercises |
3.5 Programming Projects |
4 Winner-Take-All Groups or Linear Machines |
4.1 Generalization of Single-Cell Models |
4.2 Perceptron Learning for Winner-Take-All Groups |
4.3 The Pocket Algorithm for Winner-Take-All Groups |
4.4 Kessler's Construction, Perceptron Cycling, and the Pocket Algorithm Proof |
4.5 Independent Training |
4.6 Exercises |
4.7 Programming Projects |
5 Autoassociators and One-Shot Learning |
5.1 Linear Autoassociators and the Outer-Product Training Rule |
5.2 Anderson's BSB Model |
5.3 Hopfield's Model |
5.3.1 Energy |
5.4 The Traveling Salesman Problem |
5.5 The Cohen-Grossberg Theorem |
5.6 Kanerva's Model |
5.7 Autoassociative Filtering for Feedforward Networks |
5.8 Concluding Remarks |
5.9 Exercises |
5.10 Programming Projects |
6 Mean Squared Error (MSE) Algorithms |
6.1 Motivation |
6.2 MSE Approximations |
6.3 The Widrow-Hoff Rule or LMS Algorithm |
6.3.1 Number of Training Examples Required |
6.4 Adaline |
6.5 Adaptive Noise Cancellation |
6.6 Decision-Directed Learning |
6.7 Exercises |
6.8 Programming Projects |
7 Unsupervised Learning |
7.1 Introduction |
7.1.1 No Teacher |
7.1.2 Clustering Algorithms |
7.2 k-Means Clustering |
7.2.1 The Algorithm |
7.2.2 Comments |
7.3 Topology-Preserving Maps |
7.3.1 Introduction |
7.3.2 The Algorithm |
7.3.3 Example |
7.3.4 Demonstrations |
7.3.5 Dimensionality, Neighborhood Size, and Final Comments |
7.4 ART1 |
7.4.1 Important Aspects of the Algorithm |
7.4.2 The Algorithm |
7.5 ART2 |
7.6 Using Clustering Algorithms for Supervised Learning |
7.6.1 Labeling Clusters |
7.6.2 ARTMAP or Supervised ART |
7.7 Exercises |
7.8 Programming Projects |
III Learning In Multilayer Models |
8 The Distributed Method and Radial Basis Functions |
8.1 Rosenblatt's Approach |
8.2 The Distributed Method |
8.2.1 Cover's Formula |
8.2.2 Robustness-Preserving Functions |
8.3 Examples |
8.3.1 Hepatobiliary Data |
8.3.2 Artificial Data |
8.4 How Many Cells? |
8.4.1 Pruning Data |
8.4.2 Leave-One-Out |
8.5 Radial Basis Functions |
8.6 A Variant: The Anchor Algorithm |
8.7 Scaling, Multiple Outputs, and Parallelism |
8.7.1 Scaling Properties |
8.7.2 Multiple Outputs and Parallelism |
8.7.3 A Computational Speedup for Learning |
8.7.4 Concluding Remarks |
8.8 Exercises |
8.9 Programming Projects |
9 Computational Learning Theory and the BRD Algorithm |
9.1 Introduction to Computational Learning Theory |
9.1.1 PAC-Learning |
9.1.2 Bounded Distributed Connectionist Networks |
9.1.3 Probabilistic Bounded Distributed Concepts |
9.2 A Learning Algorithm for Probabilistic Bounded Distributed Concepts |
9.3 The BRD Theorem |
9.3.1 Polynomial Learning |
9.4 Noisy Data and Fallback Estimates |
9.4.1 Vapnik-Chervonenkis Bounds |
9.4.2 Hoeffding and Chernoff Bounds |
9.4.3 Pocket Algorithm |
9.4.4 Additional Training Examples |
9.5 Bounds for Single-Layer Algorithms |
9.6 Fitting Data by Limiting the Number of Iterations |
9.7 Discussion |
9.8 Exercise |
9.9 Programming Project |
10 Constructive Algorithms |
10.1 The Tower and Pyramid Algorithms |
10.1.1 The Tower Algorithm |
10.1.2 Example |
10.1.3 Proof of Convergence |
10.1.4 A Computational Speedup |
10.1.5 The Pyramid Algorithm |
10.2 The Cascade-Correlation Algorithm |
10.3 The Tiling Algorithm |
10.4 The Upstart Algorithm |
10.5 Other Constructive Algorithms and Pruning |
10.6 Easy Learning Problems |
10.6.1 Decomposition |
10.6.2 Expandable Network Problems |
10.6.3 Limits of Easy Learning |
10.7 Exercises |
10.8 Programming Projects |
11 Backpropagation |
11.1 The Backpropagation Algorithm |
11.1.1 Statement of the Algorithm |
11.1.2 A Numerical Example |
11.2 Derivation |
11.3 Practical Considerations |
11.3.1 Determination of Correct Outputs |
11.3.2 Initial Weights |
11.3.3 Choice of r |
11.3.4 Momentum |
11.3.5 Network Topology |
11.3.6 Local Minima |
11.3.7 Activations in [0,1] versus [-1, 1] |
11.3.8 Update after Every Training Example |
11.3.9 Other Squashing Functions |
11.4 NP-Completeness |
11.5 Comments |
11.5.1 Overuse |
11.5.2 Interesting Intermediate Cells |
11.5.3 Continuous Outputs |
11.5.4 Probability Outputs |
11.5.5 Using Backpropagation to Train Multilayer Perceptrons |
11.6 Exercises |
11.7 Programming Projects |
12 Backpropagation: Variations and Applications |
12.1 NETtalk |
12.1.1 Input and Output Representations |
12.1.2 Experiments |
12.1.3 Comments |
12.2 Backpropagation through Time |
12.3 Handwritten Character Recognition |
12.3.1 Neocognitron Architecture |
12.3.2 The Network |
12.3.3 Experiments |
12.3.4 Comments |
12.4 Robot Manipulator with Excess Degrees of Freedom |
12.4.1 The Problem |
12.4.2 Training the Inverse Network |
12.4.3 Plan Units |
12.4.4 Comments |
12.5 Exercises |
12.6 Programming Projects |
13 Simulated Annealing and Boltzmann Machines |
13.1 Simulated Annealing |
13.2 Boltzmann Machines |
13.2.1 The Boltzmann Model |
13.2.2 Boltzmann Learning |
13.2.3 The Boltzmann Algorithm and Noise Clamping |
13.2.4 Example: The 4-2-4 Encoder Problem |
13.3 Remarks |
13.4 Exercises |
13.5 Programming Project |
IV Neural Network Expert Systems |
14 Expert Systems and Neural Networks |
14.1 Expert Systems |
14.1.1 What Is an Expert System? |
14.1.2 Why Expert Systems? |
14.1.3 Historically Important Expert Systems |
14.1.4 Critique of Conventional Expert Systems |
14.2 Neural Network Decision Systems |
14.2.1 Example: Diagnosis of Acute Coronary Occlusion |
14.2.2 Example: Autonomous Navigation |
14.2.3 Other Examples |
14.2.4 Decision Systems versus Expert Systems |
14.3 MACIE, and an Example Problem |
14.3.1 Diagnosis and Treatment of Acute Sarcophagal Disease |
14.3.2 Network Generation |
14.3.3 Sample Run of MACIE |
14.3.4 Real-Valued Variables and Winner-Take-All Groups |
14.3.5 Not-Yet-Known versus Unavailable Variables |
14.4 Applicability of Neural Network Expert Systems |
14.5 Exercise |
14.6 Programming Projects |
15 Details of the MACIE System |
15.1 Inferencing and Forward Chaining |
15.1.1 Discrete Multilayer Perceptron Models |
15.1.2 Continuous Variables |
15.1.3 Winner-Take-All Groups |
15.1.4 Using Prior Probabilities for More Aggressive Inferencing |
15.2 Confidence Estimation |
15.2.1 A Confidence Heuristic Prior to Inference |
15.2.2 Confidence in Inferences |
15.3 Information Acquisition and Backward Chaining |
15.4 Concluding Comment |
15.5 Exercises |
15.6 Programming Projects |
16 Noise, Redundancy, Fault Detection, and Bayesian Decision Theory |
16.1 The High Tech Lemonade Corporation's Problem |
16.2 The Deep Model and the Noise Model |
16.3 Generating the Expert System |
16.4 Probabilistic Analysis |
16.5 Noisy Single-Pattern Boolean Fault Detection Problems |
16.6 Convergence Theorem |
16.7 Comments |
16.8 Exercises |
16.9 Programming Projects |
17 Extracting Rules from Networks |
17.1 Why Rules? |
17.2 What Kind of Rules? |
17.2.1 Criteria |
17.2.2 Inference Justifications versus Rule Sets |
17.2.3 Which Variables in Conditions |
17.3 Inference Justifications |
17.3.1 MACIE's Algorithm |
17.3.2 The Removal Algorithm |
17.3.3 Key Factor Justifications |
17.3.4 Justifications for Continuous Models |
17.4 Rule Sets |
17.4.1 Limiting the Number of Conditions |
17.4.2 Approximating Rules |
17.5 Conventional + Neural Network Expert Systems |
17.5.1 Debugging an Expert System Knowledge Base |
17.5.2 The Short-Rule Debugging Cycle |
17.6 Concluding Remarks |
17.7 Exercises |
17.8 Programming Projects |
Appendix Representation Comparisons |
A.1 DNF Expressions and Polynomial Representability |
A.1.1 DNF Expressions |
A.1.2 Polynomial Representability |
A.1.3 Space Comparison of MLP and DNF Representations |
A.1.4 Speed Comparison of MLP and DNF Representations |
A.1.5 MLP versus DNF Representations |
A.2 Decision Trees |
A.2.1 Representing Decision Trees by MLP's |
A.2.2 Speed Comparison |
A.2.3 Decision Trees versus MLP's |
A.3 p-l Diagrams |
A.4 Symmetric Functions and Depth Complexity |
A.5 Concluding Remarks |
A.6 Exercises |
Bibliography |
Index |