Title:
Neural network learning and expert systems
Personal Author:
Gallant, Stephen I.
Publication Information:
Cambridge, Mass.: MIT Press, 1994
ISBN:
9780262071451

Availability

Item Barcode: 30000003155888
Call Number: QA76.87 G34 1994
Material Type: Open Access Book
Item Category: Book

Summary

Neural Network Learning and Expert Systems is the first book to present a unified and in-depth development of neural network learning algorithms and neural network expert systems. Especially suitable for students and researchers in computer science, engineering, and psychology, this text and reference provides a systematic development of neural network learning algorithms from a computational perspective, coupled with an extensive exploration of neural network expert systems, which shows how the power of neural network learning can be harnessed to generate expert systems automatically.

Features include a comprehensive treatment of the standard learning algorithms (with many proofs), along with much original research on algorithms and expert systems. Additional chapters explore constructive algorithms, introduce computational learning theory, and focus on expert system applications to noisy and redundant problems.

For students there is a large collection of exercises, as well as a series of programming projects that lead to an extensive neural network software package. All of the neural network models examined can be implemented using standard programming languages on a microcomputer.


Reviews

Choice Review

Gallant's book deserves to go to the head of the list of the many excellent introductions to neural networks. Well organized, the nonspecialist sections provide a coherent and comprehensive introduction and the starred specialist sections take the reader to the frontiers of research. A list of research problems is also supplied. The first major section clearly and logically discusses fundamental principles; the second covers learning in single-layer networks. The third section proceeds to multilayer networks and includes a thorough discussion of back-propagation. The fourth section, a special feature, covers pioneering work by Gallant on neural network expert systems. Because of the lucid and engaging style of writing and the clarity of exposition, the serious general reader would be able to read at least the nonspecialist sections with pleasure and profit. A worthy acquisition for public libraries, and definitely a must for all college and university libraries. All levels. R. Bharath; Northern Michigan University


Table of Contents

Foreword
I Basics
1 Introduction and Important Definitions
1.1 Why Connectionist Models?
1.1.1 The Grand Goals of AI and Its Current Impasse
1.1.2 The Computational Appeal of Neural Networks
1.2 The Structure of Connectionist Models
1.2.1 Network Properties
1.2.2 Cell Properties
1.2.3 Dynamic Properties
1.2.4 Learning Properties
1.3 Two Fundamental Models: Multilayer Perceptrons (MLP's) and Backpropagation Networks (BPN's)
1.3.1 Multilayer Perceptrons (MLP's)
1.3.2 Backpropagation Networks (BPN's)
1.4 Gradient Descent
1.4.1 The Algorithm
1.4.2 Practical Problems
1.4.3 Comments
1.5 Historic and Bibliographic Notes
1.5.1 Early Work
1.5.2 The Decline of the Perceptron
1.5.3 The Rise of Connectionist Research
1.5.4 Other Bibliographic Notes
1.6 Exercises
1.7 Programming Project
2 Representation Issues
2.1 Representing Boolean Functions
2.1.1 Equivalence of {+1, -1, 0} and {1, 0} Forms
2.1.2 Single-Cell Models
2.1.3 Nonseparable Functions
2.1.4 Representing Arbitrary Boolean Functions
2.1.5 Representing Boolean Functions Using Continuous Connectionist Models
2.2 Distributed Representations
2.2.1 Definition
2.2.2 Storage Efficiency and Resistance to Error
2.2.3 Superposition
2.2.4 Learning
2.3 Feature Spaces and ISA Relations
2.3.1 Feature Spaces
2.3.2 Concept-Function Unification
2.3.3 ISA Relations
2.3.4 Binding
2.4 Representing Real-Valued Functions
2.4.1 Approximating Real Numbers by Collections of Discrete Cells
2.4.2 Precision
2.4.3 Approximating Real Numbers by Collections of Continuous Cells
2.5 Example: Taxtime!
2.6 Exercises
2.7 Programming Projects
II Learning In Single-Layer Models
3 Perceptron Learning and the Pocket Algorithm
3.1 Perceptron Learning for Separable Sets of Training Examples
3.1.1 Statement of the Problem
3.1.2 Computing the Bias
3.1.3 The Perceptron Learning Algorithm
3.1.4 Perceptron Convergence Theorem
3.1.5 The Perceptron Cycling Theorem
3.2 The Pocket Algorithm for Nonseparable Sets of Training Examples
3.2.1 Problem Statement
3.2.2 Perceptron Learning Is Poorly Behaved
3.2.3 The Pocket Algorithm
3.2.4 Ratchets
3.2.5 Examples
3.2.6 Noisy and Contradictory Sets of Training Examples
3.2.7 Rules
3.2.8 Implementation Considerations
3.2.9 Proof of the Pocket Convergence Theorem
3.3 Khachiyan's Linear Programming Algorithm
3.4 Exercises
3.5 Programming Projects
4 Winner-Take-All Groups or Linear Machines
4.1 Generalizes Single-Cell Models
4.2 Perceptron Learning for Winner-Take-All Groups
4.3 The Pocket Algorithm for Winner-Take-All Groups
4.4 Kessler's Construction, Perceptron Cycling, and the Pocket Algorithm Proof
4.5 Independent Training
4.6 Exercises
4.7 Programming Projects
5 Autoassociators and One-Shot Learning
5.1 Linear Autoassociators and the Outer-Product Training Rule
5.2 Anderson's BSB Model
5.3 Hopfield's Model
5.3.1 Energy
5.4 The Traveling Salesman Problem
5.5 The Cohen-Grossberg Theorem
5.6 Kanerva's Model
5.7 Autoassociative Filtering for Feedforward Networks
5.8 Concluding Remarks
5.9 Exercises
5.10 Programming Projects
6 Mean Squared Error (MSE) Algorithms
6.1 Motivation
6.2 MSE Approximations
6.3 The Widrow-Hoff Rule or LMS Algorithm
6.3.1 Number of Training Examples Required
6.4 Adaline
6.5 Adaptive Noise Cancellation
6.6 Decision-Directed Learning
6.7 Exercises
6.8 Programming Projects
7 Unsupervised Learning
7.1 Introduction
7.1.1 No Teacher
7.1.2 Clustering Algorithms
7.2 k-Means Clustering
7.2.1 The Algorithm
7.2.2 Comments
7.3 Topology-Preserving Maps
7.3.1 Introduction
7.3.2 The Algorithm
7.3.3 Example
7.3.4 Demonstrations
7.3.5 Dimensionality, Neighborhood Size, and Final Comments
7.4 ART1
7.4.1 Important Aspects of the Algorithm
7.4.2 The Algorithm
7.5 ART2
7.6 Using Clustering Algorithms for Supervised Learning
7.6.1 Labeling Clusters
7.6.2 ARTMAP or Supervised ART
7.7 Exercises
7.8 Programming Projects
III Learning In Multilayer Models
8 The Distributed Method and Radial Basis Functions
8.1 Rosenblatt's Approach
8.2 The Distributed Method
8.2.1 Cover's Formula
8.2.2 Robustness-Preserving Functions
8.3 Examples
8.3.1 Hepatobiliary Data
8.3.2 Artificial Data
8.4 How Many Cells?
8.4.1 Pruning Data
8.4.2 Leave-One-Out
8.5 Radial Basis Functions
8.6 A Variant: The Anchor Algorithm
8.7 Scaling, Multiple Outputs, and Parallelism
8.7.1 Scaling Properties
8.7.2 Multiple Outputs and Parallelism
8.7.3 A Computational Speedup for Learning
8.7.4 Concluding Remarks
8.8 Exercises
8.9 Programming Projects
9 Computational Learning Theory and the BRD Algorithm
9.1 Introduction to Computational Learning Theory
9.1.1 PAC-Learning
9.1.2 Bounded Distributed Connectionist Networks
9.1.3 Probabilistic Bounded Distributed Concepts
9.2 A Learning Algorithm for Probabilistic Bounded Distributed Concepts
9.3 The BRD Theorem
9.3.1 Polynomial Learning
9.4 Noisy Data and Fallback Estimates
9.4.1 Vapnik-Chervonenkis Bounds
9.4.2 Hoeffding and Chernoff Bounds
9.4.3 Pocket Algorithm
9.4.4 Additional Training Examples
9.5 Bounds for Single-Layer Algorithms
9.6 Fitting Data by Limiting the Number of Iterations
9.7 Discussion
9.8 Exercise
9.9 Programming Project
10 Constructive Algorithms
10.1 The Tower and Pyramid Algorithms
10.1.1 The Tower Algorithm
10.1.2 Example
10.1.3 Proof of Convergence
10.1.4 A Computational Speedup
10.1.5 The Pyramid Algorithm
10.2 The Cascade-Correlation Algorithm
10.3 The Tiling Algorithm
10.4 The Upstart Algorithm
10.5 Other Constructive Algorithms and Pruning
10.6 Easy Learning Problems
10.6.1 Decomposition
10.6.2 Expandable Network Problems
10.6.3 Limits of Easy Learning
10.7 Exercises
10.8 Programming Projects
11 Backpropagation
11.1 The Backpropagation Algorithm
11.1.1 Statement of the Algorithm
11.1.2 A Numerical Example
11.2 Derivation
11.3 Practical Considerations
11.3.1 Determination of Correct Outputs
11.3.2 Initial Weights
11.3.3 Choice of r
11.3.4 Momentum
11.3.5 Network Topology
11.3.6 Local Minima
11.3.7 Activations in [0,1] versus [-1, 1]
11.3.8 Update after Every Training Example
11.3.9 Other Squashing Functions
11.4 NP-Completeness
11.5 Comments
11.5.1 Overuse
11.5.2 Interesting Intermediate Cells
11.5.3 Continuous Outputs
11.5.4 Probability Outputs
11.5.5 Using Backpropagation to Train Multilayer Perceptrons
11.6 Exercises
11.7 Programming Projects
12 Backpropagation: Variations and Applications
12.1 NETtalk
12.1.1 Input and Output Representations
12.1.2 Experiments
12.1.3 Comments
12.2 Backpropagation through Time
12.3 Handwritten Character Recognition
12.3.1 Neocognitron Architecture
12.3.2 The Network
12.3.3 Experiments
12.3.4 Comments
12.4 Robot Manipulator with Excess Degrees of Freedom
12.4.1 The Problem
12.4.2 Training the Inverse Network
12.4.3 Plan Units
12.4.4 Comments
12.5 Exercises
12.6 Programming Projects
13 Simulated Annealing and Boltzmann Machines
13.1 Simulated Annealing
13.2 Boltzmann Machines
13.2.1 The Boltzmann Model
13.2.2 Boltzmann Learning
13.2.3 The Boltzmann Algorithm and Noise Clamping
13.2.4 Example: The 4-2-4 Encoder Problem
13.3 Remarks
13.4 Exercises
13.5 Programming Project
IV Neural Network Expert Systems
14 Expert Systems and Neural Networks
14.1 Expert Systems
14.1.1 What Is an Expert System?
14.1.2 Why Expert Systems?
14.1.3 Historically Important Expert Systems
14.1.4 Critique of Conventional Expert Systems
14.2 Neural Network Decision Systems
14.2.1 Example: Diagnosis of Acute Coronary Occlusion
14.2.2 Example: Autonomous Navigation
14.2.3 Other Examples
14.2.4 Decision Systems versus Expert Systems
14.3 MACIE, and an Example Problem
14.3.1 Diagnosis and Treatment of Acute Sarcophagal Disease
14.3.2 Network Generation
14.3.3 Sample Run of MACIE
14.3.4 Real-Valued Variables and Winner-Take-All Groups
14.3.5 Not-Yet-Known versus Unavailable Variables
14.4 Applicability of Neural Network Expert Systems
14.5 Exercise
14.6 Programming Projects
15 Details of the MACIE System
15.1 Inferencing and Forward Chaining
15.1.1 Discrete Multilayer Perceptron Models
15.1.2 Continuous Variables
15.1.3 Winner-Take-All Groups
15.1.4 Using Prior Probabilities for More Aggressive Inferencing
15.2 Confidence Estimation
15.2.1 A Confidence Heuristic Prior to Inference
15.2.2 Confidence in Inferences
15.3 Information Acquisition and Backward Chaining
15.4 Concluding Comment
15.5 Exercises
15.6 Programming Projects
16 Noise, Redundancy, Fault Detection, and Bayesian Decision Theory
16.1 The High Tech Lemonade Corporation's Problem
16.2 The Deep Model and the Noise Model
16.3 Generating the Expert System
16.4 Probabilistic Analysis
16.5 Noisy Single-Pattern Boolean Fault Detection Problems
16.6 Convergence Theorem
16.7 Comments
16.8 Exercises
16.9 Programming Projects
17 Extracting Rules from Networks
17.1 Why Rules?
17.2 What Kind of Rules?
17.2.1 Criteria
17.2.2 Inference Justifications versus Rule Sets
17.2.3 Which Variables in Conditions
17.3 Inference Justifications
17.3.1 MACIE's Algorithm
17.3.2 The Removal Algorithm
17.3.3 Key Factor Justifications
17.3.4 Justifications for Continuous Models
17.4 Rule Sets
17.4.1 Limiting the Number of Conditions
17.4.2 Approximating Rules
17.5 Conventional + Neural Network Expert Systems
17.5.1 Debugging an Expert System Knowledge Base
17.5.2 The Short-Rule Debugging Cycle
17.6 Concluding Remarks
17.7 Exercises
17.8 Programming Projects
Appendix Representation Comparisons
A.1 DNF Expressions and Polynomial Representability
A.1.1 DNF Expressions
A.1.2 Polynomial Representability
A.1.3 Space Comparison of MLP and DNF Representations
A.1.4 Speed Comparison of MLP and DNF Representations
A.1.5 MLP versus DNF Representations
A.2 Decision Trees
A.2.1 Representing Decision Trees by MLP's
A.2.2 Speed Comparison
A.2.3 Decision Trees versus MLP's
A.3 p-l Diagrams
A.4 Symmetric Functions and Depth Complexity
A.5 Concluding Remarks
A.6 Exercises
Bibliography
Index