About Research Data

Much has been written about what ought to count as research data and about the philosophical implications of using the term ‘data’. In this guide we use a broad, practical definition: every ‘source’, ‘observation’, ‘text fragment’, ‘book’ or ‘procedure’ (e.g. an algorithm) that is used in a research project and that underpins or leads to a research result is considered data.

Handling research data is nowadays also called processing research data, a term taken from the European law on the protection of personal data, the General Data Protection Regulation (GDPR).

Establishing the types of your datasets

Once you have your research question(s), you start thinking about how to answer them. Among many other things, this means deciding which material and sources you will use, e.g. secondary literature, material in an archive, experimental set-ups, or stories of people captured in interviews. These sources either are your ‘data’ or will generate your data.

Once you start listing them, you will end up with a number of different datasets, for example:

ID  Type                                        Format
1   Interviews – RAW audio files                Audio
2   Interviews – anonymized transcripts         Text
3   Electronic Surveys                          Text
4   Focus group discussions – RAW audio files   Audio
5   Notes from observations & discussions       Text
6   Key file                                    Text
7   Participant contact list                    Text
8   Literature list                             Text

Once you have this list, you can start asking follow-up questions for each dataset, e.g. whether it contains personal data and, if so, whether that data falls under the special categories of personal data.

Also, when personal data is involved, you can indicate what legal grounds you use for processing the data.
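If you prefer to keep the list in a script rather than a spreadsheet, the follow-up questions can be recorded as extra fields per dataset. The following is only a minimal sketch: the class name `Dataset`, its field names and the helper `open_questions` are hypothetical, and the personal-data flags and the legal ground shown are illustrative assumptions, not an assessment of any real project.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Dataset:
    """One row of the data list, extended with the follow-up answers."""
    id: int
    type: str
    format: str
    personal_data: bool = False         # does the dataset contain personal data?
    special_categories: bool = False    # only relevant when personal_data is True
    legal_ground: Optional[str] = None  # only relevant when personal_data is True


def open_questions(datasets: List[Dataset]) -> List[int]:
    """Return the IDs of datasets with personal data but no legal ground recorded."""
    return [d.id for d in datasets if d.personal_data and d.legal_ground is None]


# Illustrative values only: the flags and the legal ground are assumptions.
data_list = [
    Dataset(1, "Interviews - RAW audio files", "Audio",
            personal_data=True, legal_ground="consent"),
    Dataset(2, "Interviews - anonymized transcripts", "Text"),
    Dataset(7, "Participant contact list", "Text", personal_data=True),
    Dataset(8, "Literature list", "Text"),
]

# Datasets that still need a legal ground before the list is complete.
print(open_questions(data_list))  # prints [7]
```

Keeping the answers next to each dataset makes it easy to see which rows still need attention before you write the data management plan.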

Data list (example) Data list (empty)

You can empty the example and fill it out to create your own list. Once you have filled out this table, you have almost all the information needed for a data management plan.

2. Working with personal data