WG2 has started Digilex, a platform for sharing tips, raising questions and discussing methods of digitizing print dictionaries. See the latest blog entries below; for more, visit the website https://digilex.hypotheses.org.

RSS WG2 blog

  • A TEI XML Version of the Historical Thesaurus of English Database December 29, 2018
    My project during the Lexical Data Masterclass was to trial recreating a section of the Historical Thesaurus of English in TEI-compliant XML, to which I could potentially add further information not presently contained in the Thesaurus database. In particular, I’m interested in adding words which collocate with word senses throughout the lifetime of those senses. […]
    fraser
  • Slavic Corpora Terminology Dictionary in TEI December 22, 2018
    The project: The corpora linguistics research group at the Institute of Western and Southern Slavic Studies (University of Warsaw) has recently started a project collecting Slavic corpora terminology with definitions so as to be able to investigate this type of lexica. The collected data will be stored in the form of a TEI encoded dictionary. […]
    joannabilinska
  • TEI-encoding of classical Arabic grammatical sources December 17, 2018
    The project: The major grammatical treatises of classical Arabic, widely investigated by scholars, are unfortunately accessible online mostly in the form of digitized or scanned copies of modern printed editions, while standards-compliant digital collections and semantic or linguistic annotations of their contents are not available. For this reason, I have started working on a […]
    Simona Olivieri
  • BTB – WordNet: From LMF to TEI with XSLT December 12, 2018
    Why do we need this transformation? The short answer is visibility, accessibility and reuse of our data. WordNet is a language resource used in various NLP tasks, and we want researchers to have access to the Bulgarian WordNet to do experiments and to be able to reproduce our results. In the LMF standard the data is […]
    radev
  • FROM LEGACY FORMATS AND DATABASES TO TEI: Converting the Academy of Sciences Portuguese Dictionary to TEI Lex-0 December 9, 2018
    Ana de Castro Salgado, NOVA CLUNL, Lisbon, Portugal. 1. Project: In Portugal, the last, unique and complete print edition of an academic Portuguese dictionary was published in 2001. At that time, the authors opted for a computational approach, developing a database using Microsoft Access. In 2015, the Academy wanted a new dictionary. With the goal of updating the […]
    Ana de Castro Salgado
  • From Àbèsàbèsì to XPath: An Overview of the Lexical Data Masterclass 2018 December 9, 2018
    Notes from the participants’ symposium, December 7, 2018. After a whole week of intense work, the participants of the DARIAH Lexical Data Masterclass presented their projects and results and discussed a variety of encoding and transformation issues, which are summarized here. Specialized dictionaries: Claudia Bonsi – Encoding an Italian meta-dictionary. The project provided a good […]
    Laurent Romary
  • A born-digital author lexicon for 17th c. French: Sévigné’s case December 8, 2018
    While preparing an edition of Madame de Sévigné’s correspondence encoded in TEI, we are currently facing two problems. First, while French medievalists have a long experience of establishing lexicons, specialists of 17th c. French literature traditionally do not provide such a study in their editions. Second, we are not aware of any born-digital author lexicon in […]
    Simon Gabay
  • Lexical resources for processing 18th-century French correspondence with NLP tools December 6, 2018
    Background: Electronic Enlightenment is a scholarly digital edition of letters and correspondence. The collection started with Voltaire & Rousseau, expanded into other Enlightenment thinkers and writers, and has since expanded into other eras, languages and domains. EE now has more than 80 000 letters, involving more than 15 000 people as writers or recipients. Access to […]
    Martin Wynne
  • The Lexical Data Masterclass is back! October 12, 2018
    Co-organized by DARIAH-EU, the Berlin Brandenburg Academy of Sciences (BBAW), Inria and the Belgrade Center for Digital Humanities, with the support of the French Ministry of Higher Education and Research (MESRI), CLARIN and the European Lexicographic Infrastructure (ELEXIS), the 2018 edition of the Lexical Data Masterclass will take place in Berlin at the BBAW from […]
    Laurent Romary
  • The Lexical Data Masterclass – An Overview February 27, 2018
    From 4 to 8 December 2017, 21 participants met with 8 trainers and 2 keynote speakers to work jointly on improving their digital dictionary projects. The meeting, co-organized by the Centre Marc Bloch, DARIAH-EU, the Berlin Brandenburg Academy of Sciences (BBAW), Inria (Paris, France) and the Belgrade Center for Digital Humanities (BCDH, Serbia), with […]
    Laurent Romary