Workshop on Open Data in Linguistics

June 2, 2011

The workshop will be held on June 30th at 17:30 in Workshop II during OKCon 2011.

To read up on the current status of the Open Linguistic Group, see this blog post.

The workshop will open with six presentations:

  • Dennis Spohr: Linking lexical resources and ontologies on the Semantic Web with lemon (Slides) – There are a large number of ontologies currently available on the Semantic Web. However, in order to exploit them within natural language processing applications, more linguistic information than can be represented in current Semantic Web standards is required. Further, there are a large number of lexical resources available representing a wealth of linguistic information, but this data exists in various formats and is difficult to link to ontologies and other resources. We present a model we call lemon (Lexicon Model for Ontologies) that supports the sharing of terminological and lexicon resources on the Semantic Web as well as their linking to the existing semantic representations provided by ontologies. We demonstrate that lemon can succinctly represent existing lexical resources and that, in combination with standard NLP tools, we can easily generate new lexica for domain ontologies according to the lemon model. We demonstrate that by combining generated and existing lexica we can collaboratively develop rich lexical descriptions of ontology entities. We also show that the adoption of Semantic Web standards can provide added value for lexicon models by supporting a rich axiomatization of linguistic categories that can be used to constrain the usage of the model and to perform consistency checks.
  • Ernesto William De Luca: Multilingual Lexical Linked Data – A lot of the information that is already available on the Web, or retrieved from local information systems and social networks, is structured in data silos that are not semantically related. Semantic technologies show that typed links that directly express the relations between data are an advantage for every application that can reuse the incorporated knowledge about the data. In this presentation, we present our work on providing Lexical Linked Data (LLD) through a meta-model that contains all the resources and makes it possible to retrieve and navigate them from different perspectives. We show some use cases in which we link lexical data, and show how to reuse and infer semantic data derived from lexical data.
  • Christian Chiarcos: Modelling linguistic corpora and their annotations with OWL/DL (Slides)
  • Sebastian Nordhoff: The Glottolog/Langdoc Project: Publishing a bibliographical database of 200,000 references for 7,000 languages as Linked Data (Slides)
  • Sebastian Hellmann: NIF: NLP Interchange Format (Slides)
  • Richard Littauer: Towards Open Methods: Using Scientific Workflows in Linguistics (Slides)

The second part will be a mix of an open panel and a Q&A session. At any time, attendees can come to the front and make a statement, which we will then discuss. Topics include, but are not limited to:

  • Incentives for publishing data: Requirement analysis for a Scientific Journal as a forum for publishing data
  • Best practices for publishing Open Linguistic Data

If you have any questions, or if you’re interested in keeping in touch, please write to the open-linguistics mailing list.

