Professor Haidee Kotze (UIL OTS) and Dr. Gys-Walt van Egdom (ICON), both working in the linguistic research field of Translation Studies, approached the Digital Humanities Lab (DH Lab) because they wanted to use big data to explore reader responses to translated literature. They wanted to study reviews written by real, ‘non-professional’ readers of contemporary literature, such as those found on websites like Goodreads and Amazon.

The DH Lab developed the DIOPTRA-L corpus to enable their research. DIOPTRA-L is a database of 280,000 user-generated reviews of more than 150 books that have been translated into Afrikaans, Dutch, English, French, German, Italian, Portuguese and Spanish. The corpus has been enriched with numerous computed labels (for example, the number of words in a review) and annotated information (for example, the genre of the reviewed book). DIOPTRA-L is freely available to anyone interested in readers’ reviews of translations or in literary texts in general. The corpus is hosted in I-Analyzer, a user-friendly interface in which you can search by title, source language, translation language, year and review language.

Kotze and Van Egdom’s first research with DIOPTRA-L examined readers’ awareness of translation in their reviews and the way translations are received. The aim of these first studies was to see how large amounts of user-generated content can contribute to translation reception research. The Digital Humanities Lab supported this research with custom Python and R scripts that count and visualize the occurrence of positive, negative and hedge terms in the discussion of translations. This enabled us to quantify Kotze and Van Egdom’s observation that reviewers may respond to a book they did not like by expressing uncertainty about the quality of the translation. The corpus is also available for download here and the scripts can be inspected here.
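
To give an impression of what counting positive, negative and hedge terms can look like, here is a minimal Python sketch. It is an illustration only and not the scripts used in the project: the term lists, the function name `count_term_categories` and the example review are all hypothetical, and the tokenization is deliberately simple.

```python
import re
from collections import Counter

# Hypothetical term lists, for illustration only; the lists used in the
# actual research are not reproduced here.
TERMS = {
    "positive": {"beautiful", "fluent", "excellent"},
    "negative": {"clumsy", "awkward", "poor"},
    "hedge": {"maybe", "perhaps", "possibly"},
}


def count_term_categories(review):
    """Count how many tokens in a review fall into each term category."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", review.lower())
    counts = Counter()
    for token in tokens:
        for category, terms in TERMS.items():
            if token in terms:
                counts[category] += 1
    # Always report every category, including those with a count of zero.
    return {category: counts[category] for category in TERMS}


# Invented example review, not taken from the corpus.
example = "Perhaps the translation is clumsy, but the story itself is beautiful."
print(count_term_categories(example))
```

Applied to every review that discusses the translation, counts like these can be aggregated and plotted, for instance per star rating or per review language.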