Between Corpora and Dictionaries /
COST ENeL WG3 meeting
Budapest, Hungary, 24-25 February 2017


24 February 2017 (14.00 – 18.00)
14.00-14.10 Introduction: Simon Krek
14.10-14.30 A corpus-based lexical database for Sign Language of the Netherlands
(Onno Crasborn, Inge Zwitserlood, Els van der Kooij and Richard Bank)
14.30-14.50 A corpus-based dictionary of spoken German – a short insight into a new project
(Christine Möhrs)
14.50-15.10 Which vocabulary thematic fields should be illustrated? Dictionary tradition versus pictorial corpora
(Monika Biesaga)
15.10-15.30 Corpora-based lexicography in service of overcoming errors in the production of prepositions in L2
(Noam Ordan, Omaima Abboud, Rani Abboud, Yifat Ben-Moshe, Ilan Kernerman)
15.30-16.00 coffee break and poster session
16.00-16.20 Integrating Trandix and corpus in a single e-resource
(Isabel Durán-Muñoz)
16.20-16.40 Corpus of Learner Translations and its role in communication, translation and research
(Julia Ostanina-Olszewska, Janusz Parfieniuk)
16.40-17.00 Treq: A Translation Equivalents Database (demo)
(Michal Škrabal, Martin Vavřín)
17.00-17.20 Using Domain Corpora for Semi-automatic Building of a Multilingual Terminology Thesaurus
(Aleš Horák, Adam Rambousek)
17.20-17.40 Visualisation as an afterthought: lessons learned
(Arvi Tavast, Maria Tuulik, Jelena Kallas)
(moderator: Simon Krek)
25 February 2017 (9.00 – 13.00)
9.00-9.20 A ColWordNet API
(Luis Espinosa-Anke, Jose Camacho-Collados, Sara Rodríguez-Fernández, Horacio Saggion, Leo Wanner)
9.20-9.40 Extracting Lexical Data from Classification Schemes
(Thierry Declerck and Kseniya Egorova)
9.40-10.00 Sketch Engine and Lexonomy: towards a Single Workspace for Lexicography
(Miloš Jakubíček, Michal Měchura)
10.00-10.20 A new COST Action of interest to lexicographers: European Network for Combining Language Learning with Crowdsourcing Techniques (enetCollect)
(Lionel Nicolas, Verena Lyding)
10.20-10.50 coffee break and poster session
10.50-11.10 Corpus Informed Enrichment of Lexical Resources (with Fixed Similes) via Facebook-enabled Crowdsourcing
(Stella Markantonatou, Jelena Mitrovic)
11.10-11.30 Online gamified psycholinguistic experiments (demo)
(Arvi Tavast)
11.30-11.50 Crowdsourcing Second Language learner data: experiences and prospects
(Elena Volodina)
11.50-12.10 An overview of NLP crowdsourcing systems
(Federico Sangati)
(moderator: Simon Krek)
poster session
The enrichment of the lexical information and the corpus resources by using the results of the morphological analysis of historical texts
(Renata Bronikowska, Emanuel Modrzejewski)


  1. Topic description:

Between corpora and dictionaries

  • (online) linking, portals with corpora and dictionaries
  • multimodal corpora and dictionaries
  • integration of corpora and dictionaries into one resource
  • corpus statistics in dictionaries, visualisation of corpus data for lexicography
  • corpora and dictionaries in the lexicographic workflow
  • (automatic) generation of lexicographic data from corpora

Crowdsourcing and gamification

  • collaborative lexicographic projects, use cases of crowdsourcing
  • (online, mobile) crowdsourcing platforms for lexicography
  • (online, mobile) gamification platforms for lexicography
  • educational games with dictionary or corpus data
  • Wiktionary and use of wiki software
  • crowdsourcing workflow (task design) and motivation
  • reliability issues/studies in crowdsourcing for lexicography
  • crowdsourcing strategies to reach a large public

