There are many publicly available text corpora. The list below is not exhaustive, but it is a good starting point.
A selection of corpora
If you are looking for digitized resources, the Utrecht University Library is a good place to start. It holds licenses for a large number of e-books and digital text corpora:
More than 130 million pages from Dutch newspapers, books and magazines:
Library of over 60,000 free digitized books (eBooks) of the world’s great literature, with a focus on older works for which U.S. copyright has expired:
The online text and data mining application I-Analyzer makes the following corpora accessible:
- Dutch Annual Reports
- Public Dutch Newspapers
- Eighteenth Century Collections Online (ECCO)
- Goodreads reviews of translated literary texts (DIOPTRA-L)
- Guardian-Observer, archive 1791-2003
- Jewish Funerary Inscriptions
- Periodicals, archive 19th century
- The Times, newspaper archive 1785-2010
- De Troonredes, 1814-2018
Other corpora
Do you need advice on where to find certain corpora? Ask one of our affiliated members or send an email to cdh@uu.nl. Do you want to build your own corpus for your research and need help doing so? Contact the Digital Humanities Lab for a no-obligation consultation.