Cover image for Textual information access : statistical models
Title:
Textual information access : statistical models
Publication Information:
London : ISTE ; Hoboken, N.J. : Wiley, 2012
Physical Description:
xvi, 429 p. : ill. ; 25 cm.
ISBN:
9781848213227

Availability:

Item Barcode: 30000010324844
Call Number: QA76.9.T48 T49 2012
Material Type: Open Access Book
Item Category: Book
Summary

This book presents statistical models recently developed within several research communities to access the information contained in text collections. The problems considered are linked to applications that aim to facilitate information access:
- information extraction and retrieval;
- text classification and clustering;
- opinion mining;
- comprehension aids (automatic summarization, machine translation, visualization).
To give the reader as complete a description as possible, the focus is placed on the probability models underlying these applications, highlighting the relationship between models and applications and illustrating the behavior of each model on real collections.
Textual Information Access is organized around four themes: information retrieval and ranking models; classification and clustering (logistic regression, kernel methods, conditional random fields, etc.); multilingualism and machine translation; and emerging applications such as information exploration.

Contents

Part 1: Information Retrieval
1. Probabilistic Models for Information Retrieval, Stéphane Clinchant and Eric Gaussier.
2. Learnable Ranking Models for Automatic Text Summarization and Information Retrieval, Massih-Réza Amini, David Buffoni, Patrick Gallinari, Tuong Vinh Truong and Nicolas Usunier.
Part 2: Classification and Clustering
3. Logistic Regression and Text Classification, Sujeevan Aseervatham, Eric Gaussier, Anestis Antoniadis, Michel Burlet and Yves Denneulin.
4. Kernel Methods for Textual Information Access, Jean-Michel Renders.
5. Topic-Based Generative Models for Text Information Access, Jean-Cédric Chappelier.
6. Conditional Random Fields for Information Extraction, Isabelle Tellier and Marc Tommasi.
Part 3: Multilingualism
7. Statistical Methods for Machine Translation, Alexandre Allauzen and François Yvon.
Part 4: Emerging Applications
8. Information Mining: Methods and Interfaces for Accessing Complex Information, Josiane Mothe, Kurt Englmeier and Fionn Murtagh.
9. Opinion Detection as a Topic Classification Problem, Juan-Manuel Torres-Moreno, Marc El-Bèze, Patrice Bellot and Frédéric Béchet.


Author Notes

Eric Gaussier has been Professor of Computer Science at Joseph Fourier University in France since September 2006. He currently leads the AMA team, whose research falls within the general framework of machine learning and information modeling. Since 2010, he has also been deputy director of the Grenoble Informatics Laboratory, one of the largest computer science laboratories in France.
François Yvon is Professor of Computer Science at the University of Paris Sud in Orsay, France and a member of the Spoken Language Processing group of LIMSI/CNRS. His main research interests include analogy-based and statistical language learning, speech recognition and synthesis, and machine translation. He currently leads LIMSI's research activities on statistical machine translation.


Table of Contents

Introduction p. xiii
Part 1 Information Retrieval p. 1
Chapter 1 Probabilistic Models for Information Retrieval p. 3
1.1 Introduction p. 3
1.1.1 Heuristic retrieval constraints p. 6
1.2 2-Poisson models p. 8
1.3 Probability ranking principle (PRP) p. 10
1.3.1 Reformulation p. 12
1.3.2 BM25 p. 13
1.4 Language models p. 15
1.4.1 Smoothing methods p. 16
1.4.2 The Kullback-Leibler model p. 19
1.4.3 Noisy channel model p. 20
1.4.4 Some remarks p. 20
1.5 Informational approaches p. 21
1.5.1 DFR models p. 22
1.5.2 Information-based models p. 25
1.6 Experimental comparison p. 27
1.7 Tools for information retrieval p. 28
1.8 Conclusion p. 28
1.9 Bibliography p. 29
Chapter 2 Learnable Ranking Models for Automatic Text Summarization and Information Retrieval p. 33
2.1 Introduction p. 33
2.1.1 Ranking of instances p. 34
2.1.2 Ranking of alternatives p. 42
2.1.3 Relation to existing frameworks p. 44
2.2 Application to automatic text summarization p. 45
2.2.1 Presentation of the application p. 45
2.2.2 Automatic summary and learning p. 48
2.3 Application to information retrieval p. 49
2.3.1 Application presentation p. 49
2.3.2 Search engines and learning p. 50
2.3.3 Experimental results p. 53
2.4 Conclusion p. 54
2.5 Bibliography p. 54
Part 2 Classification and Clustering p. 59
Chapter 3 Logistic Regression and Text Classification p. 61
3.1 Introduction p. 61
3.2 Generalized linear model p. 62
3.3 Parameter estimation p. 65
3.4 Logistic regression p. 68
3.4.1 Multinomial logistic regression p. 69
3.5 Model selection p. 70
3.5.1 Ridge regularization p. 71
3.5.2 LASSO regularization p. 71
3.5.3 Selected Ridge regularization p. 72
3.6 Logistic regression applied to text classification p. 74
3.6.1 Problem statement p. 74
3.6.2 Data pre-processing p. 75
3.6.3 Experimental results p. 76
3.7 Conclusion p. 81
3.8 Bibliography p. 82
Chapter 4 Kernel Methods for Textual Information Access p. 85
4.1 Kernel methods: context and intuitions p. 85
4.2 General principles of kernel methods p. 88
4.3 General problems with kernel choices (kernel engineering) p. 95
4.4 Kernel versions of standard algorithms: examples of solvers p. 97
4.4.1 Kernel logistic regression p. 98
4.4.2 Support vector machines p. 99
4.4.3 Principal component analysis p. 101
4.4.4 Other methods p. 102
4.5 Kernels for text entities p. 103
4.5.1 "Bag-of-words" kernels p. 104
4.5.2 Semantic kernels p. 105
4.5.3 Diffusion kernels p. 107
4.5.4 Sequence kernels p. 109
4.5.5 Tree kernels p. 112
4.5.6 Graph kernels p. 116
4.5.7 Kernels derived from generative models p. 119
4.6 Summary p. 123
4.7 Bibliography p. 124
Chapter 5 Topic-Based Generative Models for Text Information Access p. 129
5.1 Introduction p. 129
5.1.1 Generative versus discriminative models p. 129
5.1.2 Text models p. 131
5.1.3 Estimation, prediction and smoothing p. 133
5.1.4 Terminology and notations p. 134
5.2 Topic-based models p. 135
5.2.1 Fundamental principles p. 135
5.2.2 Illustration p. 136
5.2.3 General framework p. 138
5.2.4 Geometric interpretation p. 139
5.2.5 Application to text categorization p. 141
5.3 Topic models p. 142
5.3.1 Probabilistic Latent Semantic Indexing p. 143
5.3.2 Latent Dirichlet Allocation p. 146
5.3.3 Conclusion p. 160
5.4 Term models p. 161
5.4.1 Limitations of the multinomial p. 161
5.4.2 Dirichlet compound multinomial p. 162
5.4.3 DCM-LDA p. 163
5.5 Similarity measures between documents p. 164
5.5.1 Language models p. 165
5.5.2 Similarity between topic distributions p. 165
5.5.3 Fisher kernels p. 166
5.6 Conclusion p. 168
5.7 Topic model software p. 169
5.8 Bibliography p. 170
Chapter 6 Conditional Random Fields for Information Extraction p. 179
6.1 Introduction p. 179
6.2 Information extraction p. 180
6.2.1 The task p. 180
6.2.2 Variants p. 182
6.2.3 Evaluations p. 182
6.2.4 Approaches not based on machine learning p. 183
6.3 Machine learning for information extraction p. 184
6.3.1 Usage and limitations p. 184
6.3.2 Some applicable machine learning methods p. 185
6.3.3 Annotating to extract p. 186
6.4 Introduction to conditional random fields p. 187
6.4.1 Formalization of a labelling problem p. 187
6.4.2 Maximum entropy model approach p. 188
6.4.3 Hidden Markov model approach p. 190
6.4.4 Graphical models p. 191
6.5 Conditional random fields p. 193
6.5.1 Definition p. 193
6.5.2 Factorization and graphical models p. 195
6.5.3 Junction tree p. 196
6.5.4 Inference in CRFs p. 198
6.5.5 Inference algorithms p. 200
6.5.6 Training CRFs p. 201
6.6 Conditional random fields and their applications p. 203
6.6.1 Linear conditional random fields p. 204
6.6.2 Links between linear CRFs and hidden Markov models p. 205
6.6.3 Interests and applications of CRFs p. 208
6.6.4 Beyond linear CRFs p. 210
6.6.5 Existing libraries p. 211
6.7 Conclusion p. 214
6.8 Bibliography p. 215
Part 3 Multilingualism p. 221
Chapter 7 Statistical Methods for Machine Translation p. 223
7.1 Introduction p. 223
7.1.1 Machine translation in the age of the Internet p. 223
7.1.2 Organization of the chapter p. 226
7.1.3 Terminological remarks p. 227
7.2 Probabilistic machine translation: an overview p. 227
7.2.1 Statistical machine translation: the standard model p. 228
7.2.2 Word-based models and their limitations p. 230
7.2.3 Phrase-based models p. 234
7.3 Phrase-based models p. 235
7.3.1 Building word alignments p. 237
7.3.2 Word alignment models: a summary p. 245
7.3.3 Extracting bisegments p. 246
7.4 Modeling reorderings p. 250
7.4.1 The space of possible reorderings p. 250
7.4.2 Evaluating permutations p. 255
7.5 Translation: a search problem p. 259
7.5.1 Combining models p. 259
7.5.2 The decoding problem p. 261
7.5.3 Exact search algorithms p. 262
7.5.4 Heuristic search algorithms p. 267
7.5.5 Decoding: a solved problem? p. 272
7.6 Evaluating machine translation p. 272
7.6.1 Subjective evaluations p. 273
7.6.2 The BLEU metric p. 275
7.6.3 Alternatives to BLEU p. 277
7.6.4 Evaluating machine translation: an open problem p. 279
7.7 State-of-the-art and recent developments p. 279
7.7.1 Using source context p. 279
7.7.2 Hierarchical models p. 281
7.7.3 Translating with linguistic resources p. 283
7.8 Useful resources p. 287
7.8.1 Bibliographic data and online resources p. 288
7.8.2 Parallel corpora p. 288
7.8.3 Tools for statistical machine translation p. 288
7.9 Conclusion p. 289
7.10 Acknowledgments p. 291
7.11 Bibliography p. 291
Part 4 Emerging Applications p. 305
Chapter 8 Information Mining: Methods and Interfaces for Accessing Complex Information p. 307
8.1 Introduction p. 307
8.2 The multidimensional visualization of information p. 309
8.2.1 Accessing information based on the knowledge of the structured domain p. 309
8.2.2 Visualization of a set of documents via their content p. 313
8.2.3 OLAP principles applied to document sets p. 317
8.3 Domain mapping via social networks p. 320
8.4 Analyzing the variability of searches and data merging p. 323
8.4.1 Analysis of IR engine results p. 323
8.4.2 Use of data unification p. 325
8.5 The seven types of evaluation measures used in IR p. 327
8.6 Conclusion p. 331
8.7 Acknowledgments p. 332
8.8 Bibliography p. 332
Chapter 9 Opinion Detection as a Topic Classification Problem p. 337
9.1 Introduction p. 337
9.2 The TREC and TAC evaluation campaigns p. 339
9.2.1 Opinion detection by question-answering p. 340
9.2.2 Automatic summarization of opinions p. 342
9.2.3 The text mining challenge of opinion classification (DEFT (DÉfi Fouille de Textes)) p. 343
9.3 Cosine weights - a second glance p. 347
9.4 Which components for opinion vectors? p. 348
9.4.1 How to pass from words to terms? p. 349
9.5 Experiments p. 352
9.5.1 Performance, analysis, and visualization of the results on the IMDB corpus p. 354
9.6 Extracting opinions from speech: automatic analysis of phone polls p. 357
9.6.1 France Télécom opinion investigation corpus p. 358
9.6.2 Automatic recognition of spontaneous speech in opinion corpora p. 360
9.6.3 Evaluation p. 363
9.7 Conclusion p. 365
9.8 Bibliography p. 366
Appendix A Probabilistic Models: An Introduction p. 369
A.1 Introduction p. 369
A.2 Supervised categorization p. 370
A.2.1 Filtering documents p. 370
A.2.2 The Bernoulli model p. 372
A.2.3 The multinomial model p. 376
A.2.4 Evaluating categorization systems p. 379
A.2.5 Extensions p. 380
A.2.6 A first summary p. 383
A.3 Unsupervised learning: the multinomial mixture model p. 384
A.3.1 Mixture models p. 384
A.3.2 Parameter estimation p. 386
A.3.3 Applications p. 390
A.4 Markov models: statistical models for sequences p. 391
A.4.1 Modeling sequences p. 391
A.4.2 Estimating a Markov model p. 394
A.4.3 Language models p. 395
A.5 Hidden Markov models p. 397
A.5.1 The model p. 398
A.5.2 Algorithms for hidden Markov models p. 399
A.6 Conclusion p. 410
A.7 A primer of probability theory p. 411
A.7.1 Probability space, event p. 411
A.7.2 Conditional independence and probability p. 412
A.7.3 Random variables, moments p. 413
A.7.4 Some useful distributions p. 418
A.8 Bibliography p. 420
List of Authors p. 423
Index p. 425