Title:
Intelligent document retrieval : exploiting markup structure
Personal Author:
Series:
Springer international series on information retrieval ; 17
Publication Information:
Dordrecht, Netherlands : Springer, 2005
ISBN:
9781402037672

9781402037689

Available:
Item Barcode:
30000010134672
Call Number:
TK5105.884 K78 2005
Material Type:
Open Access Book
Item Category 1:
Book

Summary

Collections of digital documents can nowadays be found everywhere in institutions, universities or companies. Examples are Web sites or intranets. But searching them for information can still be painful. Searches often return either large numbers of matches or no suitable matches at all.

Such document collections vary widely in size and in how much structure they carry. What they have in common is that they typically have some structure and cover a limited range of topics. This second property sets them apart from the Web in general.

The type of search system proposed in this book suggests ways of refining or relaxing a query to assist the user in the search process. To suggest sensible query modifications, the system needs to know what the documents are about, which calls for explicit knowledge about the document collection encoded in some electronic form. Such knowledge is typically not available, so we construct it automatically.


Table of Contents

Foreword
Preface
List of Figures
List of Tables
1 Introduction
1.1 Introductory Examples
1.2 Using Markup to Extract Knowledge
1.3 Applying the Extracted Knowledge
1.4 Structure of the Book
Part I The Model
2 Related Work
2.1 Information Retrieval
2.2 Information Extraction
2.3 Clustering
2.4 Classification
2.5 Web Search Techniques
2.6 Ontologies
2.7 Layout Analysis
2.8 Web Search Studies
2.9 Navigating Concept Hierarchies
2.10 Dialogue Systems
2.11 Usability Issues
2.12 Concluding Remarks on Related Work
3 Data Analysis and Domain Model Construction
3.1 Documents
3.2 Concepts
3.3 A Domain Model Based on Concepts
3.4 Model Structure
3.5 Model Construction
3.6 Using the Model for Query Modification
3.7 Implementational Issues
4 Incorporating Additional Knowledge
4.1 Internal Knowledge
4.2 External Knowledge
5 A Dialogue System for Partially Structured Data
5.1 Dialogue as Movement in Space
5.2 Dialogue Example
5.3 Static vs. Dynamic Clusters
5.4 Real User Queries
5.5 Properties
5.5.1 Document Properties
5.5.2 System Properties
5.5.3 Goal Description
5.6 Dialogue
5.6.1 High Level Dialogue States
5.6.2 Low Level Dialogue States
5.6.3 Constructing Potential Choices
5.6.4 Dialogue Strategies
5.6.5 Customization
Part II Practical Applications
6 UKSearch - Intelligent Web Search
6.1 Indexing Web Pages
6.2 The UKSearch System
6.2.1 Indexing and Model Construction
6.2.2 Dialogue Strategy
6.3 Sample Domain 1: Essex University
6.3.1 Index Tables
6.3.2 Domain Model
6.3.3 Concepts vs. Real User Queries
6.4 Sample Domain 2: BBC News
6.4.1 Index Tables
6.4.2 Domain Model
6.4.3 Adjusted Dialogue Strategy
6.5 Implementational Issues
7 UKSearch - Evaluation and Discussion
7.1 Log Analysis
7.1.1 System Setup
7.1.2 Results
7.1.3 Discussion
7.2 Investigating Domain Model Relations
7.2.1 Task and Setup
7.2.2 Results
7.2.3 Discussion
7.3 Task-Based Evaluation: Essex University
7.3.1 Search Tasks
7.3.2 Experimental Setup
7.3.3 Procedure
7.3.4 Results
7.3.5 Discussion
7.4 Task-Based Evaluation: BBC News
7.4.1 Search Tasks
7.4.2 Experimental Setup and Procedure
7.4.3 Results
7.4.4 Discussion
8 YPA - Searching Classified Directories
8.1 System Overview
8.2 Indexing Classified Advertisements
8.2.1 Structure of the Backend
8.2.2 Domain Model Construction
8.3 Dialogue Strategy in the YPA
8.3.1 Properties
8.3.2 Dialogue Setup
8.3.3 Dialogue Function
8.3.4 Calculation of Potential Choices
8.4 Implementational Issues
9 Future Directions and Conclusions
9.1 Towards Evolving Domain Models
9.2 Dialogue Management
9.3 An Outlook on Future Evaluations
9.4 Conclusions
References
Index