Title:
Building and using comparable corpora
Publication Information:
Heidelberg : Springer, 2013
Physical Description:
xiii, 335 p. : ill. (some col.) ; 25 cm.
ISBN:
9783642201271

Available:

Item Barcode | Call Number | Material Type | Item Category 1
35000000003260 | QA76.9.N38 B85 2013 | Open Access Book | Book
30000010333716 | QA76.9.N38 B85 2013 | Open Access Book | Book

Summary

The 1990s saw a paradigm change in the use of corpus-driven methods in NLP. In the field of multilingual NLP (such as machine translation and terminology mining), this implied the use of parallel corpora. However, parallel resources are relatively scarce: many more texts are produced daily by native speakers of any given language than are translated. This situation resulted in a natural drive towards the use of comparable corpora, i.e. non-parallel texts in the same domain or genre. Nevertheless, this research direction has not produced a single authoritative source suitable for researchers and students coming to the field.

The proposed volume provides a reference source, identifying the state of the art in the field as well as future trends. The book is intended for specialists and students in natural language processing, machine translation and computer-assisted translation.


Table of Contents

Preface - Building and Using Comparable Corpora / S. Sharoff, R. Rapp and P. Zweigenbaum
Overviewing Important Aspects of the Last 20 Years of Research in Comparable Corpora / S. Sharoff, R. Rapp and P. Zweigenbaum
Part I Compiling and Measuring Comparable Corpora
Multilingual Corpus Collection / S. Shi and P. Fung
Automatic Comparable Web Corpora Collection and Bilingual Terminology Extraction for Specialized Dictionary Making / A. Gurrutxaga, I. Leturia, I. San Vicente and X. Saralegi
Statistical Comparability: Methodological Caveats / R. Köhler
Methods for Collection and Evaluation of Comparable Documents / M. Lestari Paramita, D. Guthrie, E. Kanoulas, R. Gaizauskas, P. Clough and M. Sanderson
Measuring the Distance between Comparable Corpora between Languages / S. Sharoff
Exploiting Comparable Corpora for Lexicon Extraction: Measuring and Improving Corpus Quality / B. Li and E. Gaussier
Statistical Corpus and Language Comparison on Comparable Corpora / T. Eckart and U. Quasthoff
Comparable Multilingual Patents as Large-scale Parallel Corpora / B. Lu and B. Tsou
Part II Using Comparable Corpora
Extracting Parallel Phrases from Comparable Data / S. Hewavitharana and S. Vogel
Exploiting Comparable Corpora / D.S. Munteanu and D. Marcu
Paraphrase Detection in Comparable Monolingual Corpora / L. Deleger, B. Cartoni and P. Zweigenbaum
Information Network Construction and Alignment from Automatically Acquired Comparable Corpora / H. Ji and W.-P. Lin
Bilingual Terminology Mining from Comparable Corpora / B. Daille and E. Morin
The Place of Comparable Corpora in Providing Terminological Reference Information to Online Translators: A Strategic Framework / K. Kageura and T. Abekawa
Old Needs, New Solutions: Comparable Corpora for Language Professionals / S. Bernardini and A. Ferraresi
Exploiting the Incomparability of Comparable Corpora for Contrastive Linguistics and Translation Studies / S. Neumann and S. Hansen-Schirra