WG2 blog

WG2 has started Digilex, a platform for sharing tips, raising questions and discussing methods of digitizing print dictionaries. See the latest blog entries below; for more, visit the website https://digilex.hypotheses.org.

Toward TEI Lex-0 Publisher: A Workshop Announcement November 29, 2019
When: December 16-17, 2019
Where: DARIAH Coordination Office, Germaine-Tillion-Saal (7th Floor), Friedrichstr. 191, Berlin
Instructors: Magdalena Turska and Wolfgang Meier, eXist Solutions
Sponsor: Belgrade Center for Digital Humanities
Local Organizer: DARIAH WG “Lexical Resources”
Goal: The goal of the two-day workshop/hackathon is to: introduce members of the DARIAH WG “Lexical Resources” and other interested parties […]
Toma Tasovac
A TEI XML Version of the Historical Thesaurus of English Database December 29, 2018
My project during the Lexical Data Masterclass was to trial recreating a section of the Historical Thesaurus of English in TEI-compliant XML, to which I could potentially add further information not presently contained in the Thesaurus database. In particular, I’m interested in adding words which collocate with word senses throughout the lifetime of those senses. […]
fraser
Slavic Corpora Terminology Dictionary in TEI December 22, 2018
The project: The corpora linguistics research group at the Institute of Western and Southern Slavic Studies (University of Warsaw) has recently started a project collecting Slavic corpora terminology with definitions so as to be able to investigate this type of lexica. The collected data will be stored in the form of a TEI encoded dictionary. […]
joannabilinska
TEI-encoding of classical Arabic grammatical sources December 17, 2018
The project: The major grammatical treatises of classical Arabic, widely investigated by scholars, are unfortunately accessible online mostly in the form of digitized or scanned copies of modern printed editions, but standards-compliant digital collections and semantic as well as linguistic annotations of their contents are not available. For this reason, I have started working on a […]
Simona Olivieri
BTB – WordNet: From LMF to TEI with XSLT December 12, 2018
Why do we need this transformation? The short answer is visibility, accessibility and reuse of our data. WordNet is a language resource used in various NLP tasks and we want researchers to have access to the Bulgarian WordNet to do experiments and to be able to reproduce our results. In the LMF standard the data is […]
radev
FROM LEGACY FORMATS AND DATABASES TO TEI: Converting the Academy of Sciences Portuguese Dictionary to TEI Lex-0 December 9, 2018
Ana de Castro Salgado, NOVA CLUNL, Lisbon, Portugal. 1. Project: In Portugal, the last, unique and complete print edition of an academic Portuguese dictionary was published in 2001. At that time, the authors decided on a computational approach, developing a database using Microsoft Access. In 2015, the Academy wanted a new dictionary. With the goal of updating the […]
Ana de Castro Salgado
From Àbèsàbèsì to XPath: An Overview of the Lexical Data Masterclass 2018 December 9, 2018
Notes from the participants’ symposium, December 7, 2018. After a whole week of intense work, the participants of the DARIAH Lexical Data Masterclass presented their projects and results and discussed a variety of encoding and transformation issues, which are summarized here. Specialized dictionaries: Claudia Bonsi – Encoding an Italian meta-dictionary. The project provided a good […]
Laurent Romary
A born-digital author lexicon for 17th c. French: Sévigné’s case December 8, 2018
Preparing an edition of Madame de Sévigné’s correspondence encoded in TEI, we are currently facing two problems. First, while French medievalists have a long experience of establishing lexicons, specialists of 17th c. French literature traditionally do not provide such a study in their editions. Second, we are not aware of any born-digital author lexicon in […]
Simon Gabay
Lexical resources for processing 18th-century French correspondence with NLP tools December 6, 2018
Background: Electronic Enlightenment is a scholarly digital edition of letters and correspondence. The collection started with Voltaire & Rousseau, expanded into other Enlightenment thinkers and writers, and has expanded into other eras, languages and domains. EE now has more than 80 000 letters, involving more than 15 000 people as writers or recipients. Access to […]
Martin Wynne
The Lexical Data Masterclass is back! October 12, 2018
Co-organized by DARIAH-EU, the Berlin Brandenburg Academy of Sciences (BBAW), Inria and the Belgrade Center for Digital Humanities, with the support of the French Ministry of Higher Education and Research (MESRI), CLARIN and the European Lexicographic Infrastructure (ELEXIS), the 2018 edition of the Lexical Data Masterclass will take place in Berlin at the BBAW from […]
Laurent Romary

MoU	029/13
Approval date	16/05/2013
Start of Action	11/10/2013
End of Action	10/10/2017