Title:
LMF lexical markup framework
Series:
Computer engineering and IT series
Publication Information:
London : ISTE ; Hoboken, NJ : Wiley, 2013
Physical Description:
xiv, 266 p. : ill. ; 24 cm.
ISBN:
9781848214309
Item Barcode:
30000010323136
Call Number:
QA76.9.N38 L54 2013
Material Type:
Open Access Book
Item Category 1:
Book

Summary

The community responsible for developing lexicons for Natural Language Processing (NLP) and Machine Readable Dictionaries (MRDs) started its ISO standardization activities in 2003. These activities resulted in an ISO standard, the Lexical Markup Framework (LMF).
After selecting and defining a common terminology, the LMF team had to identify the common notions shared by all lexicons in order to specify a common skeleton (called the core model) and understand the various requirements coming from different groups of users.
The goals of LMF are to provide a common model for the creation and use of lexical resources, to manage the exchange of data between and among these resources, and to enable the merging of a large number of individual electronic resources to form extensive global electronic resources.
The various types of individual instantiations of LMF can include monolingual, bilingual or multilingual lexical resources. The same specifications can be used for small and large lexicons, both simple and complex, as well as for both written and spoken lexical representations. The descriptions range from morphology, syntax and computational semantics to computer-assisted translation. The languages covered are not restricted to European languages, but apply to all natural languages.
The LMF specification is now a success and numerous lexicon managers currently use LMF in different languages and contexts.
This book starts with the historical context of LMF, before providing an overview of the LMF model and the Data Category Registry, which provides a flexible means for applying constants like /grammatical gender/ in a variety of different settings. It then presents concrete applications and experiments on real data, which are important for developers who want to learn about the use of LMF.

Contents

1. LMF - Historical Context and Perspectives, Nicoletta Calzolari, Monica Monachini and Claudia Soria.
2. Model Description, Gil Francopoulo and Monte George.
3. LMF and the Data Category Registry: Principles and Application, Menzo Windhouwer and Sue Ellen Wright.
4. Wordnet-LMF: A Standard Representation for Multilingual Wordnets, Piek Vossen, Claudia Soria and Monica Monachini.
5. Prolmf: A Multilingual Dictionary of Proper Names and their Relations, Denis Maurel and Béatrice Bouchou-Markhoff.
6. LMF for Arabic, Aida Khemakhem, Bilel Gargouri, Kais Haddar and Abdelmajid Ben Hamadou.
7. LMF for a Selection of African Languages, Chantal Enguehard and Mathieu Mangeot.
8. LMF and its Implementation in Some Asian Languages, Takenobu Tokunaga, Sophia Y.M. Lee, Virach Sornlertlamvanich, Kiyoaki Shirai, Shu-Kai Hsieh and Chu-Ren Huang.
9. DUELME: Dutch Electronic Lexicon of Multiword Expressions, Jan Odijk.
10. UBY-LMF - Exploring the Boundaries of Language-Independent Lexicon Models, Judith Eckle-Kohler, Iryna Gurevych, Silvana Hartmann, Michael Matuschek and Christian M. Meyer.
11. Conversion of Lexicon-Grammar Tables to LMF: Application to French, Éric Laporte, Elsa Tolone and Matthieu Constant.
12. Collaborative Tools: From Wiktionary to LMF, for Synchronic and Diachronic Language Data, Thierry Declerck, Piroska Lendvai and Karlheinz Mörth.
13. LMF Experiments on Format Conversions for Resource Merging: Converters and Problems, Marta Villegas, Muntsa Padró and Núria Bel.
14. LMF as a Foundation for Servicized Lexical Resources, Yoshihiko Hayashi, Monica Monachini, Bora Savas, Claudia Soria and Nicoletta Calzolari.
15. Creating a Serialization of LMF: The Experience of the RELISH Project, Menzo Windhouwer, Justin Petro, Irina Nevskaya, Sebastian Drude, Helen Aristar-Dry and Jost Gippert.
16. Global Atlas: Proper Nouns, From Wikipedia to LMF, Gil Francopoulo, Frédéric Marcoul, David Causse and Grégory Piparo.
17. LMF in U.S. Government Language Resource Management, Monte George.

About the Authors

Gil Francopoulo works for Tagmatica (www.tagmatica.com), a company specializing in software development in the field of linguistics and documentation in the semantic web, in Paris, France, as well as for Spotter (www.spotter.com), a company specializing in media and social media analytics.


Author Notes

Gil Francopoulo works for Tagmatica (www.tagmatica.com), a company specializing in software development in the field of linguistics and documentation in the semantic web, in Paris, France, as well as for Spotter (www.spotter.com), a company specializing in eReputation computation and text mining.


Table of Contents

Chapter 1: Nicoletta Calzolari, Monica Monachini and Claudia Soria
Chapter 2: Gil Francopoulo and Monte George
Chapter 3: Menzo Windhouwer and Sue Ellen Wright
Chapter 4: Piek Vossen, Claudia Soria and Monica Monachini
Chapter 5: Denis Maurel and Béatrice Bouchou-Markhoff
Chapter 6: Aida Khemakhem, Bilel Gargouri, Kais Haddar and Abdelmajid Ben Hamadou
Chapter 7: Chantal Enguehard and Mathieu Mangeot
Chapter 8: Takenobu Tokunaga, Sophia Y.M. Lee, Virach Sornlertlamvanich, Kiyoaki Shirai, Shu-Kai Hsieh and Chu-Ren Huang
Chapter 9: Jan Odijk
Chapter 10: Judith Eckle-Kohler, Iryna Gurevych, Silvana Hartmann, Michael Matuschek and Christian M. Meyer
Chapter 11: Éric Laporte, Elsa Tolone and Matthieu Constant
Chapter 12: Thierry Declerck, Piroska Lendvai and Karlheinz Mörth
Chapter 13: Marta Villegas, Muntsa Padró and Núria Bel
Chapter 14: Yoshihiko Hayashi, Monica Monachini, Bora Savas, Claudia Soria and Nicoletta Calzolari
Chapter 15: Menzo Windhouwer, Justin Petro, Irina Nevskaya, Sebastian Drude, Helen Aristar-Dry and Jost Gippert
Chapter 16: Gil Francopoulo, Frédéric Marcoul, David Causse and Grégory Piparo
Chapter 17: Monte George
Preface p. xiii
Chapter 1 LMF - Historical Context and Perspectives p. 1
1.1 Introduction p. 1
1.2 The context p. 2
1.3 The foundations: the Grosseto Workshop and the "X-Lex" projects p. 4
1.4 EAGLES and ISLE p. 5
1.5 Setting up methodologies and principles for standards p. 6
1.5.1 The MILE methodology: toward LMF p. 8
1.6 EAGLES/ISLE legacy p. 10
1.6.1 Lessons learned for standard design p. 12
1.6.2 Moving closer to LMF p. 13
1.7 Interoperability: the keystone of the field p. 14
1.8 Bibliography p. 15
Chapter 2 Model Description p. 19
2.1 Objectives p. 19
2.2 The ISO specification p. 19
2.3 Means of description p. 20
2.4 Core model p. 21
2.5 Core model and extension packages p. 22
2.6 Morphology extension p. 23
2.7 Machine-Readable Dictionary extension p. 26
2.8 NLP syntax extension p. 27
2.9 NLP semantic extension p. 29
2.10 Multilingual notation extension p. 31
2.11 NLP morphological pattern extension p. 33
2.12 NLP multiword expression pattern extension p. 36
2.13 Constraint expression extension p. 38
2.14 Conclusion p. 39
2.15 Bibliography p. 40
Chapter 3 LMF and the Data Category Registry: Principles and Application p. 41
3.1 Introduction p. 41
3.2 Data category specifications p. 42
3.2.1 Data model p. 42
3.2.2 Persistent identifiers p. 43
3.2.3 Standardization p. 43
3.3 The ISOcat Data Category Registry p. 44
3.3.1 A web user interface p. 44
3.3.2 Communities p. 45
3.4 LMF and data categories p. 45
3.4.1 Data category selections p. 45
3.4.2 Referring to data categories p. 45
3.4.3 Standardizing data categories p. 48
3.5 Conclusions and future work p. 49
3.6 Bibliography p. 49
Chapter 4 Wordnet-LMF: A Standard Representation for Multilingual Wordnets p. 51
4.1 Introduction p. 51
4.2 The KYOTO project p. 52
4.3 LMF and Wordnet representation p. 54
4.4 Wordnet-LMF p. 56
4.4.1 Designing Wordnet-LMF p. 57
4.4.2 LMF components p. 58
4.4.3 Additional and custom components p. 59
4.4.4 Comparing LMF and Wordnet-LMF p. 60
4.5 Conclusions p. 62
4.6 Bibliography p. 65
Chapter 5 Prolmf: A Multilingual Dictionary of Proper Names and their Relations p. 67
5.1 Motivation p. 67
5.2 Prolmf basis p. 69
5.3 More on lexica and relations in Prolmf p. 73
5.4 Conclusion p. 77
5.5 Bibliography p. 79
5.6 Appendix p. 80
Chapter 6 LMF for Arabic p. 83
6.1 Introduction p. 83
6.2 Modeling of the basic properties p. 85
6.3 Modeling of the morphologic extension p. 86
6.4 Modeling of the morphologic pattern extension p. 88
6.5 Modeling of the syntactic extension p. 90
6.6 Modeling of the semantic extension p. 92
6.7 Arabic LMF applications p. 94
6.8 Implementation p. 95
6.9 Conclusion p. 96
6.10 Bibliography p. 96
Chapter 7 LMF for a Selection of African Languages p. 99
7.1 Introduction p. 99
7.2 Less-resourced languages p. 99
7.2.1 Definition p. 99
7.2.2 Socio-economic context p. 100
7.2.3 Linguistic resources p. 101
7.2.4 Building electronic lexical resources p. 101
7.3 From published dictionaries to LMF p. 102
7.3.1 Objectives p. 102
7.3.2 Methodology p. 102
7.4 Illustrations p. 104
7.4.1 Definition of the copy format p. 104
7.4.2 From original format to copy format p. 107
7.4.3 From copy format to pivot format p. 109
7.4.4 From pivot format to target format p. 110
7.5 Difficulties and proposals p. 113
7.5.1 Data category p. 113
7.5.2 LMF structure p. 113
7.5.3 Adding annotations p. 116
7.6 Conclusion p. 117
7.7 Acknowledgments p. 117
7.8 Bibliography p. 117
Chapter 8 LMF and its Implementation in Some Asian Languages p. 119
8.1 Introduction p. 119
8.2 Lexical specification and data categories p. 120
8.2.1 Lexical specification p. 120
8.2.2 Data categories p. 121
8.3 Upper-layer ontology p. 125
8.4 Evaluation platform p. 126
8.5 Discussion p. 128
8.6 Conclusion p. 129
8.7 Acknowledgments p. 130
8.8 Bibliography p. 131
Chapter 9 DUELME: Dutch Electronic Lexicon of Multiword Expressions p. 133
9.1 Introduction p. 133
9.2 DUELME p. 134
9.3 LMF p. 135
9.4 The DUELME class model p. 135
9.5 Comparison with the LMF Core Package p. 137
9.6 Comparison with the LMF NLP multiword expression patterns extension p. 139
9.7 Conclusions p. 142
9.8 Acknowledgments p. 143
9.9 Bibliography p. 143
Chapter 10 UBY-LMF - Exploring the Boundaries of Language-Independent Lexicon Models p. 145
10.1 Introduction p. 145
10.2 Architecture of UBY-LMF p. 147
10.3 Language independence of UBY-LMF p. 148
10.3.1 Language-specific lexical-syntactic information p. 148
10.3.2 Translation information p. 149
10.3.3 Language-independent lexical-semantic information p. 150
10.3.4 Language-independent semantic information at the interface to syntax p. 150
10.4 FrameNet in UBY-LMF p. 151
10.5 Conclusion p. 153
10.6 Acknowledgments p. 154
10.7 Bibliography p. 154
Chapter 11 Conversion of Lexicon-Grammar Tables to LMF: Application to French p. 157
11.1 Motivation p. 157
11.2 The Lexicon-Grammar p. 157
11.2.1 Lexicon-Grammar tables p. 157
11.2.2 The LGLex dictionary p. 159
11.2.3 The LGLex-Lefff dictionary p. 160
11.3 Lexical entries p. 160
11.4 Subcategorization frames p. 163
11.4.1 Subcategorization frame sets p. 163
11.4.2 Grammatical functions p. 164
11.4.3 Representation of syntactic arguments p. 165
11.4.4 Levels of generality of syntactic constructions p. 168
11.4.5 Constituents p. 169
11.5 Results p. 170
11.6 Conclusion p. 171
11.7 Bibliography p. 172
Chapter 12 Collaborative Tools: From Wiktionary to LMF, for Synchronic and Diachronic Language Data p. 175
12.1 Introduction p. 175
12.2 Wiktionary p. 175
12.3 Related work p. 177
12.4 Additional challenges: how to encode the diversity of Wiktionary lexicon in LMF? p. 179
12.4.1 Diachronic language data in Wiktionary p. 179
12.4.2 A possible solution for interlinking dictionaries converted into LMF p. 181
12.5 Conclusion p. 183
12.6 Bibliography p. 184
Chapter 13 LMF Experiments on Format Conversions for Resource Merging: Converters and Problems p. 187
13.1 Introduction p. 187
13.2 Automatic merging of resources p. 188
13.3 Moving from PAROLE Genelex to LMF p. 191
13.3.1 Lexical entry p. 192
13.3.2 Subcategorization p. 193
13.3.3 Properties (attributes vs. complex data categories) p. 194
13.4 Conclusion p. 197
13.5 Availability of resources p. 198
13.6 Bibliography p. 198
Chapter 14 LMF as a Foundation for Servicized Lexical Resources p. 201
14.1 Introduction p. 201
14.2 Lexical resources as lexical Web services p. 201
14.3 LMF-aware Web services in the RESTful style p. 202
14.4 Implementation showcases p. 203
14.4.1 Servicizing WordNet-type computational semantic lexicons p. 204
14.4.2 Bilingual machine-readable dictionaries p. 207
14.4.3 Status of the developed services p. 211
14.5 Summary p. 212
14.6 Bibliography p. 212
Chapter 15 Creating a Serialization of LMF: The Experience of the RELISH Project p. 215
15.1 Introduction p. 215
15.2 Overview of the RELISH interchange format p. 216
15.3 Mapping of equivalent elements p. 217
15.3.1 Entry and headword p. 218
15.3.2 Sense and its contained elements p. 218
15.4 Complex mappings p. 219
15.4.1 Relations p. 219
15.4.2 Notes and feature structures p. 219
15.4.3 Grammatical information p. 221
15.4.4 Examples and extending LMF p. 222
15.5 Harmonization of linguistic concepts p. 223
15.6 Conclusions and future work p. 224
15.7 Bibliography p. 225
Chapter 16 Global Atlas: Proper Nouns, From Wikipedia to LMF p. 227
16.1 Motivation p. 227
16.2 Preparing recognition p. 227
16.3 Context of usage p. 230
16.4 Ontology of types p. 231
16.5 Main source: Wikipedia p. 232
16.6 Extraction p. 233
16.7 Auxiliary machine learning p. 234
16.8 LMF structures p. 234
16.9 Example p. 235
16.10 Results p. 237
16.11 Current limitations and planned improvements p. 237
16.12 LMF limitations p. 238
16.13 Related work p. 238
16.14 Conclusion p. 239
16.15 Bibliography p. 239
Chapter 17 LMF in U.S. Government Language Resource Management p. 243
17.1 Introduction p. 243
17.2 Wordscape overview p. 244
17.3 The goal p. 245
17.4 The importance of data standards p. 245
17.5 Language base exchange p. 246
17.6 Managing multilingual representations p. 249
17.7 Managing grammatical information p. 251
17.8 Grammatical information, an MRD example p. 255
17.9 Managing LBX schema and document instances p. 258
17.10 Data exchange using LBX p. 259
17.11 Summary p. 260
List of Authors p. 263
Index p. 267