From the European Court of Human Rights to CLARIN:EL
To celebrate Human Rights Day (10 December), we dedicate this month to language resources that are hosted in the CLARIN:EL Infrastructure and belong to the Human Rights Resource Family.
Katerina Korre talks about the ECtHR (European Court of Human Rights) cases dataset created by the AUEB Natural Language Processing Group, of which she was a member:
The European Court of Human Rights (ECtHR) hears allegations that European states have breached the human rights provisions of the European Convention on Human Rights (ECHR).
This dataset comprises 11k cases of the European Court of Human Rights (ECtHR) and can be considered an enriched version of the Chalkidis et al. (2019) dataset. It consists of 3 files (dev, train and test) in plain text (TXT) format, each containing information about different cases. Each case is assigned a unique identifier (ID). The facts of each case are given in chronological order as numbered paragraphs. The articles that were allegedly violated in each case are also listed (allegedly violated articles), as are the articles that the Court found to have actually been violated (violated articles). For each judgment of the Court, there are references to the facts of the respective case (silver allegation rationales). Finally, a lawyer with experience in ECtHR cases annotated, for 50 cases, the paragraphs describing the facts that constitute a violation of the articles of the Convention; these are listed as the gold allegation rationales, i.e. the desired result. The ECtHR cases dataset has been used in AI model development and experiments, for example in predicting, based on the facts, which articles were to be discussed in Court.
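To make the description above more concrete, here is a minimal Python sketch of how one case could be represented once it has been read from the files. It is an illustration only: the class name `Case`, all field names, the helper functions and the toy example are assumptions made for this sketch, not the dataset's actual schema, and rationales are assumed to be stored as positions in the list of fact paragraphs.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """One ECtHR case. All field names are hypothetical, chosen for illustration."""
    case_id: str                            # unique identifier of the case
    facts: list[str]                        # numbered fact paragraphs, in chronological order
    allegedly_violated_articles: list[str]  # articles the applicant claims were violated
    violated_articles: list[str]            # articles the Court found were violated
    silver_rationales: list[int]            # assumed: indices into `facts` referenced in the judgment
    gold_rationales: list[int] = field(default_factory=list)  # lawyer annotations, only for 50 cases


def article_targets(case, label_space):
    """Return a 0/1 vector marking which articles in `label_space` were allegedly violated."""
    alleged = set(case.allegedly_violated_articles)
    return [1 if article in alleged else 0 for article in label_space]


def rationale_paragraphs(case, use_gold=False):
    """Return the fact paragraphs marked as rationales (silver by default)."""
    indices = case.gold_rationales if use_gold else case.silver_rationales
    return [case.facts[i] for i in indices]


# Invented toy case, not taken from the dataset.
example = Case(
    case_id="toy-001",
    facts=[
        "1. The applicant was arrested.",
        "2. He was held for ten days.",
        "3. He lodged a complaint.",
    ],
    allegedly_violated_articles=["3", "5"],
    violated_articles=["5"],
    silver_rationales=[1],
)

print(article_targets(example, ["2", "3", "5", "6"]))  # [0, 1, 1, 0]
print(rationale_paragraphs(example))
```

A vector like the one returned by `article_targets` is the kind of target a model would be trained to predict from the facts, while the rationale paragraphs indicate which facts support that prediction.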
The ECtHR (European Court of Human Rights) cases dataset is freely available for research purposes through CLARIN:EL under a CC-BY-NC-SA License of Use (Attribution, Non-Commercial use, Share Alike).

Katerina Korre
PhD candidate in Computational Linguistics, University of Bologna
