There are many publicly available text corpora. The list below is not exhaustive, but it is a good starting point.

A selection of corpora

If you are looking for digitized resources, the Utrecht University Library is a good place to start. It holds licenses for a large number of e-books and digital text corpora:

More than 130 million pages from Dutch newspapers, books and magazines:

Library of over 60,000 free digitized books (eBooks) of the world’s great literature, with a focus on older works for which U.S. copyright has expired:

The online text and data mining application I-Analyzer makes the following corpora accessible:

  • Dutch Annual Reports
  • Public Dutch Newspapers
  • Eighteenth Century Collections Online (ECCO)
  • Goodreads reviews of translated literary texts (DIOPTRA-L)
  • Guardian-Observer, archive 1791-2003
  • Jewish Funerary Inscriptions
  • Periodicals, archive 19th century
  • The Times, newspaper archive 1785-2010
  • De Troonredes, 1814-2018

Other corpora

Do you need advice on where to find certain corpora? Ask one of our affiliated members or send an email to cdh@uu.nl. Do you want to build your own corpus for your research and need help doing so? Contact the Digital Humanities Lab for a no-obligation consultation.