Title:
Neural network learning and expert systems
Personal Author:
Gallant, Stephen I.
Publication Information:
Cambridge, Mass.: MIT Press, 1994
ISBN:
9780262071451

Availability

Item Barcode: 30000003155888
Call Number: QA76.87 G34 1994
Material Type: Open Access Book
Item Category: Book

Summary

Neural Network Learning and Expert Systems is the first book to present a unified and in-depth development of neural network learning algorithms and neural network expert systems. Especially suitable for students and researchers in computer science, engineering, and psychology, this text and reference provides a systematic development of neural network learning algorithms from a computational perspective, coupled with an extensive exploration of neural network expert systems, which shows how the power of neural network learning can be harnessed to generate expert systems automatically.

Features include a comprehensive treatment of the standard learning algorithms (with many proofs), along with much original research on algorithms and expert systems. Additional chapters explore constructive algorithms, introduce computational learning theory, and focus on expert system applications to noisy and redundant problems.

For students there is a large collection of exercises, as well as a series of programming projects that lead to an extensive neural network software package. All of the neural network models examined can be implemented using standard programming languages on a microcomputer.


Reviews

Choice Review

Gallant's book deserves to go to the head of the list of the many excellent introductions to neural networks. Well organized, the nonspecialist sections provide a coherent and comprehensive introduction and the starred specialist sections take the reader to the frontiers of research. A list of research problems is also supplied. The first major section clearly and logically discusses fundamental principles; the second covers learning in single-layer networks. The third section proceeds to multilayer networks and includes a thorough discussion of back-propagation. The fourth section, a special feature, covers pioneering work by Gallant on neural network expert systems. Because of the lucid and engaging style of writing and the clarity of exposition, the serious general reader would be able to read at least the nonspecialist sections with pleasure and profit. A worthy acquisition for public libraries, and definitely a must for all college and university libraries. All levels. R. Bharath; Northern Michigan University


Table of Contents

Foreword
I Basics
1 Introduction and Important Definitions
1.1 Why Connectionist Models?
1.1.1 The Grand Goals of AI and Its Current Impasse
1.1.2 The Computational Appeal of Neural Networks
1.2 The Structure of Connectionist Models
1.2.1 Network Properties
1.2.2 Cell Properties
1.2.3 Dynamic Properties
1.2.4 Learning Properties
1.3 Two Fundamental Models: Multilayer Perceptrons (MLP's) and Backpropagation Networks (BPN's)
1.3.1 Multilayer Perceptrons (MLP's)
1.3.2 Backpropagation Networks (BPN's)
1.4 Gradient Descent
1.4.1 The Algorithm
1.4.2 Practical Problems
1.4.3 Comments
1.5 Historic and Bibliographic Notes
1.5.1 Early Work
1.5.2 The Decline of the Perceptron
1.5.3 The Rise of Connectionist Research
1.5.4 Other Bibliographic Notes
1.6 Exercises
1.7 Programming Project
2 Representation Issues
2.1 Representing Boolean Functions
2.1.1 Equivalence of {+1, -1, 0} and {1, 0} Forms
2.1.2 Single-Cell Models
2.1.3 Nonseparable Functions
2.1.4 Representing Arbitrary Boolean Functions
2.1.5 Representing Boolean Functions Using Continuous Connectionist Models
2.2 Distributed Representations
2.2.1 Definition
2.2.2 Storage Efficiency and Resistance to Error
2.2.3 Superposition
2.2.4 Learning
2.3 Feature Spaces and ISA Relations
2.3.1 Feature Spaces
2.3.2 Concept-Function Unification
2.3.3 ISA Relations
2.3.4 Binding
2.4 Representing Real-Valued Functions
2.4.1 Approximating Real Numbers by Collections of Discrete Cells
2.4.2 Precision
2.4.3 Approximating Real Numbers by Collections of Continuous Cells
2.5 Example: Taxtime!
2.6 Exercises
2.7 Programming Projects
II Learning In Single-Layer Models
3 Perceptron Learning and the Pocket Algorithm
3.1 Perceptron Learning for Separable Sets of Training Examples
3.1.1 Statement of the Problem
3.1.2 Computing the Bias
3.1.3 The Perceptron Learning Algorithm
3.1.4 Perceptron Convergence Theorem
3.1.5 The Perceptron Cycling Theorem
3.2 The Pocket Algorithm for Nonseparable Sets of Training Examples
3.2.1 Problem Statement
3.2.2 Perceptron Learning Is Poorly Behaved
3.2.3 The Pocket Algorithm
3.2.4 Ratchets
3.2.5 Examples
3.2.6 Noisy and Contradictory Sets of Training Examples
3.2.7 Rules
3.2.8 Implementation Considerations
3.2.9 Proof of the Pocket Convergence Theorem
3.3 Khachiyan's Linear Programming Algorithm
3.4 Exercises
3.5 Programming Projects
4 Winner-Take-All Groups or Linear Machines
4.1 Generalizes Single-Cell Models
4.2 Perceptron Learning for Winner-Take-All Groups
4.3 The Pocket Algorithm for Winner-Take-All Groups
4.4 Kessler's Construction, Perceptron Cycling, and the Pocket Algorithm Proof
4.5 Independent Training
4.6 Exercises
4.7 Programming Projects
5 Autoassociators and One-Shot Learning
5.1 Linear Autoassociators and the Outer-Product Training Rule
5.2 Anderson's BSB Model
5.3 Hopfield's Model
5.3.1 Energy
5.4 The Traveling Salesman Problem
5.5 The Cohen-Grossberg Theorem
5.6 Kanerva's Model
5.7 Autoassociative Filtering for Feedforward Networks
5.8 Concluding Remarks
5.9 Exercises
5.10 Programming Projects
6 Mean Squared Error (MSE) Algorithms
6.1 Motivation
6.2 MSE Approximations
6.3 The Widrow-Hoff Rule or LMS Algorithm
6.3.1 Number of Training Examples Required
6.4 Adaline
6.5 Adaptive Noise Cancellation
6.6 Decision-Directed Learning
6.7 Exercises
6.8 Programming Projects
7 Unsupervised Learning
7.1 Introduction
7.1.1 No Teacher
7.1.2 Clustering Algorithms
7.2 k-Means Clustering
7.2.1 The Algorithm
7.2.2 Comments
7.3 Topology-Preserving Maps
7.3.1 Introduction
7.3.2 The Algorithm
7.3.3 Example
7.3.4 Demonstrations
7.3.5 Dimensionality, Neighborhood Size, and Final Comments
7.4 ART1
7.4.1 Important Aspects of the Algorithm
7.4.2 The Algorithm
7.5 ART2
7.6 Using Clustering Algorithms for Supervised Learning
7.6.1 Labeling Clusters
7.6.2 ARTMAP or Supervised ART
7.7 Exercises
7.8 Programming Projects
III Learning In Multilayer Models
8 The Distributed Method and Radial Basis Functions
8.1 Rosenblatt's Approach
8.2 The Distributed Method
8.2.1 Cover's Formula
8.2.2 Robustness-Preserving Functions
8.3 Examples
8.3.1 Hepatobiliary Data
8.3.2 Artificial Data
8.4 How Many Cells?
8.4.1 Pruning Data
8.4.2 Leave-One-Out
8.5 Radial Basis Functions
8.6 A Variant: The Anchor Algorithm
8.7 Scaling, Multiple Outputs, and Parallelism
8.7.1 Scaling Properties
8.7.2 Multiple Outputs and Parallelism
8.7.3 A Computational Speedup for Learning
8.7.4 Concluding Remarks
8.8 Exercises
8.9 Programming Projects
9 Computational Learning Theory and the BRD Algorithm
9.1 Introduction to Computational Learning Theory
9.1.1 PAC-Learning
9.1.2 Bounded Distributed Connectionist Networks
9.1.3 Probabilistic Bounded Distributed Concepts
9.2 A Learning Algorithm for Probabilistic Bounded Distributed Concepts
9.3 The BRD Theorem
9.3.1 Polynomial Learning
9.4 Noisy Data and Fallback Estimates
9.4.1 Vapnik-Chervonenkis Bounds
9.4.2 Hoeffding and Chernoff Bounds
9.4.3 Pocket Algorithm
9.4.4 Additional Training Examples
9.5 Bounds for Single-Layer Algorithms
9.6 Fitting Data by Limiting the Number of Iterations
9.7 Discussion
9.8 Exercise
9.9 Programming Project
10 Constructive Algorithms
10.1 The Tower and Pyramid Algorithms
10.1.1 The Tower Algorithm
10.1.2 Example
10.1.3 Proof of Convergence
10.1.4 A Computational Speedup
10.1.5 The Pyramid Algorithm
10.2 The Cascade-Correlation Algorithm
10.3 The Tiling Algorithm
10.4 The Upstart Algorithm
10.5 Other Constructive Algorithms and Pruning
10.6 Easy Learning Problems
10.6.1 Decomposition
10.6.2 Expandable Network Problems
10.6.3 Limits of Easy Learning
10.7 Exercises
10.8 Programming Projects
11 Backpropagation
11.1 The Backpropagation Algorithm
11.1.1 Statement of the Algorithm
11.1.2 A Numerical Example
11.2 Derivation
11.3 Practical Considerations
11.3.1 Determination of Correct Outputs
11.3.2 Initial Weights
11.3.3 Choice of r
11.3.4 Momentum
11.3.5 Network Topology
11.3.6 Local Minima
11.3.7 Activations in [0,1] versus [-1, 1]
11.3.8 Update after Every Training Example
11.3.9 Other Squashing Functions
11.4 NP-Completeness
11.5 Comments
11.5.1 Overuse
11.5.2 Interesting Intermediate Cells
11.5.3 Continuous Outputs
11.5.4 Probability Outputs
11.5.5 Using Backpropagation to Train Multilayer Perceptrons
11.6 Exercises
11.7 Programming Projects
12 Backpropagation: Variations and Applications
12.1 NETtalk
12.1.1 Input and Output Representations
12.1.2 Experiments
12.1.3 Comments
12.2 Backpropagation through Time
12.3 Handwritten Character Recognition
12.3.1 Neocognitron Architecture
12.3.2 The Network
12.3.3 Experiments
12.3.4 Comments
12.4 Robot Manipulator with Excess Degrees of Freedom
12.4.1 The Problem
12.4.2 Training the Inverse Network
12.4.3 Plan Units
12.4.4 Comments
12.5 Exercises
12.6 Programming Projects
13 Simulated Annealing and Boltzmann Machines
13.1 Simulated Annealing
13.2 Boltzmann Machines
13.2.1 The Boltzmann Model
13.2.2 Boltzmann Learning
13.2.3 The Boltzmann Algorithm and Noise Clamping
13.2.4 Example: The 4-2-4 Encoder Problem
13.3 Remarks
13.4 Exercises
13.5 Programming Project
IV Neural Network Expert Systems
14 Expert Systems and Neural Networks
14.1 Expert Systems
14.1.1 What Is an Expert System?
14.1.2 Why Expert Systems?
14.1.3 Historically Important Expert Systems
14.1.4 Critique of Conventional Expert Systems
14.2 Neural Network Decision Systems
14.2.1 Example: Diagnosis of Acute Coronary Occlusion
14.2.2 Example: Autonomous Navigation
14.2.3 Other Examples
14.2.4 Decision Systems versus Expert Systems
14.3 MACIE, and an Example Problem
14.3.1 Diagnosis and Treatment of Acute Sarcophagal Disease
14.3.2 Network Generation
14.3.3 Sample Run of MACIE
14.3.4 Real-Valued Variables and Winner-Take-All Groups
14.3.5 Not-Yet-Known versus Unavailable Variables
14.4 Applicability of Neural Network Expert Systems
14.5 Exercise
14.6 Programming Projects
15 Details of the MACIE System
15.1 Inferencing and Forward Chaining
15.1.1 Discrete Multilayer Perceptron Models
15.1.2 Continuous Variables
15.1.3 Winner-Take-All Groups
15.1.4 Using Prior Probabilities for More Aggressive Inferencing
15.2 Confidence Estimation
15.2.1 A Confidence Heuristic Prior to Inference
15.2.2 Confidence in Inferences
15.3 Information Acquisition and Backward Chaining
15.4 Concluding Comment
15.5 Exercises
15.6 Programming Projects
16 Noise, Redundancy, Fault Detection, and Bayesian Decision Theory
16.1 The High Tech Lemonade Corporation's Problem
16.2 The Deep Model and the Noise Model
16.3 Generating the Expert System
16.4 Probabilistic Analysis
16.5 Noisy Single-Pattern Boolean Fault Detection Problems
16.6 Convergence Theorem
16.7 Comments
16.8 Exercises
16.9 Programming Projects
17 Extracting Rules from Networks
17.1 Why Rules?
17.2 What Kind of Rules?
17.2.1 Criteria
17.2.2 Inference Justifications versus Rule Sets
17.2.3 Which Variables in Conditions
17.3 Inference Justifications
17.3.1 MACIE's Algorithm
17.3.2 The Removal Algorithm
17.3.3 Key Factor Justifications
17.3.4 Justifications for Continuous Models
17.4 Rule Sets
17.4.1 Limiting the Number of Conditions
17.4.2 Approximating Rules
17.5 Conventional + Neural Network Expert Systems
17.5.1 Debugging an Expert System Knowledge Base
17.5.2 The Short-Rule Debugging Cycle
17.6 Concluding Remarks
17.7 Exercises
17.8 Programming Projects
Appendix Representation Comparisons
A.1 DNF Expressions and Polynomial Representability
A.1.1 DNF Expressions
A.1.2 Polynomial Representability
A.1.3 Space Comparison of MLP and DNF Representations
A.1.4 Speed Comparison of MLP and DNF Representations
A.1.5 MLP versus DNF Representations
A.2 Decision Trees
A.2.1 Representing Decision Trees by MLP's
A.2.2 Speed Comparison
A.2.3 Decision Trees versus MLP's
A.3 p-l Diagrams
A.4 Symmetric Functions and Depth Complexity
A.5 Concluding Remarks
A.6 Exercises
Bibliography
Index