In February 2026, representatives of the CLARIN ERIC infrastructure, CLARIN:EL, and the DataGEMS consortium co-organized a workshop at the Department of Informatics and Telecommunications of the National and Kapodistrian University of Athens (NKUA). The workshop aimed to foster in-depth discussion of the services and resources provided by CLARIN and CLARIN:EL, explore opportunities for data sharing, and identify potential research synergies with the DataGEMS project. The event brought together experts from CLARIN ERIC, CLARIN:EL, and DataGEMS partner institutions, as well as affiliated teams from the Athena Research Center (ATHENA RC), Communication & Information Technologies Experts (CITE), the National Observatory of Athens (NOA), and the Leibniz Institute for the German Language (IDS), in addition to external participants and contributors.

Key presentations were delivered by Prof. Andreas Witt (CLARIN ERIC and IDS, representing DataGEMS) and Kanella Pouli (CLARIN:EL, ILSP/ATHENA RC), who provided comprehensive overviews of CLARIN and CLARIN:EL language resources, their research communities, and service portfolios. They also outlined prospective avenues for collaboration with DataGEMS. Dr. Georgia Koutrika, Coordinator of the DataGEMS Project, presented the project’s objectives and anticipated service offerings, while the Project Manager, Eleni Petra, discussed its alignment with the European Open Science Cloud (EOSC) and the corresponding evaluation strategy. Focusing on standards and interoperability, Dr. Piotr Bański, Chair of the Standards and Interoperability Committee (SIC) of CLARIN ERIC, examined the adoption of CLARIN-supported standards within the technical framework of DataGEMS and discussed strategic perspectives for integrating DataGEMS resources into the broader CLARIN ecosystem. Dr. Paweł Kamocki, co-chair of the Working Group on Legal and Ethical Issues of the Text+ consortium (part of the National Research Data Infrastructure (NFDI) and the German EOSC Node), addressed the role of legal metadata in both CLARIN and DataGEMS, highlighting key challenges anticipated during the implementation phase. Dr. Thora Hagen (IDS), representing the Language Use Case and its associated pilot activities, presented the challenges emerging from the contemporary language data ecosystem and discussed how DataGEMS aims to address them through advanced data discovery and semantic technologies.

The ensuing discussions addressed cross-cutting issues, including standardization, interoperability, and legal metadata management. Particular emphasis was placed on practical strategies for implementing a language use case within a combined CLARIN–DataGEMS framework, while also examining how CLARIN’s extensive language datasets could support DataGEMS’ research objectives. These exchanges identified concrete opportunities for collaboration and laid the groundwork for future joint initiatives.

Overall, the workshop underscored the mutual benefits of combining the rich language datasets and services offered by CLARIN and CLARIN:EL with the advanced data analytics and semantic capabilities of DataGEMS, thereby paving the way for enhanced research outcomes in linguistics and AI-driven language technologies.