Developing a Data Management Plan (DMP)

Research data defined

It can be difficult to determine the types of data within your project. What is data and how should it be collected and recorded?

Here is the very broad definition of research data used by the University of Melbourne:

Research data means any information, facts or observations that have been collected, recorded or used during the research process for the purpose of substantiating research findings. Research data may exist in digital, analogue or combined forms and such data may be numerical, descriptive or visual, raw or processed, analysed or unanalysed, experimental, observational or machine generated. Examples of research data include: documents, spreadsheets, audio and video recordings, transcripts, databases, images, field notebooks, diaries, process journals, artworks, compositions, laboratory notebooks, algorithms, scripts, survey responses and questionnaires.

Management of Research Data and Records Policy (MPF1242): Definitions.

Not all researchers agree about what research data actually is, but all research projects have data. Some disciplines only produce digital data; for others, research may include a significant amount of physical material as data.

The Australian Research Data Commons (ARDC) suggests that one way of defining research data for your research is to consider 'what needs to be kept to validate your research'. What would you need to keep to defend your research outputs and conclusions? As a graduate researcher, you will need to do just that!

The ARDC also suggests that you can gain further insight into research data by thinking about the re-use of your data beyond the current project. What would another researcher need to re-use your data? In addition to the dataset, what peripheral data and information are needed? Examples include file-naming protocols, metadata and documentation such as codebooks and data dictionaries, all of which give insight into how your data are structured.
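As an illustration of the file-naming protocols and data dictionaries mentioned above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the naming pattern (project, description, date, version), the function names build_filename and undocumented_columns, and the variables listed in the dictionary are hypothetical and are not prescribed by the ARDC or the University of Melbourne.

```python
from datetime import date


def build_filename(project: str, description: str, collected: date,
                   version: int, ext: str) -> str:
    """Compose a file name from a hypothetical convention:
    project_description_YYYYMMDD_vNN.ext (lower case, no spaces)."""
    parts = [
        project.lower(),
        description.lower().replace(" ", "-"),
        collected.strftime("%Y%m%d"),   # ISO-style date sorts chronologically
        f"v{version:02d}",              # zero-padded version number
    ]
    return "_".join(parts) + "." + ext.lstrip(".").lower()


# A hypothetical data dictionary: one entry per variable in the dataset,
# recording its type, meaning and any allowed values or units.
DATA_DICTIONARY = {
    "participant_id": {
        "type": "string",
        "description": "Anonymised participant code",
    },
    "age_years": {
        "type": "integer",
        "description": "Age at time of interview",
        "units": "years",
    },
    "consent_given": {
        "type": "boolean",
        "description": "Whether written consent was recorded",
    },
}


def undocumented_columns(columns, dictionary):
    """Return the column names that have no entry in the data dictionary."""
    return [name for name in columns if name not in dictionary]


if __name__ == "__main__":
    print(build_filename("ProjX", "interview p01", date(2024, 3, 15), 2, ".TXT"))
    print(undocumented_columns(["participant_id", "age_years", "postcode"],
                               DATA_DICTIONARY))
```

A consistent naming pattern and a check for undocumented variables are two small ways of making a dataset easier for another researcher to interpret; the exact pattern and fields should follow whatever conventions your discipline or research group already uses.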

For further information and examples of what research data is and what it is NOT, check out this ARDC Guide.

Different types of data

For the purposes of these modules, research data are described as:

Data created in a digital form (born digital)

  • Documents (text, MS Word), PDFs, spreadsheets
  • Electronic laboratory notebooks, field notebooks, diaries
  • Questionnaires, transcripts, codebooks
  • Audio files
  • Digital photographs, video
  • Slides, artefacts, specimens, samples
  • Collection of digital objects acquired and generated during the process of research
  • Statistical or other data files
  • Database contents (video, audio, text, images)
  • Models, algorithms, scripts
  • Contents of an application (input, output, logfiles for analysis software, simulation software, schemas)
  • Methodologies and workflows
  • Standard operating procedures and protocols

Physical materials created or collected as part of the research process

  • Slides, artefacts, specimens, samples
  • Audio tapes, video tapes

Physical materials converted to a digital form (digitised)

  • Slides, photographs of artefacts, specimens and samples
  • Audio tapes, video tapes
  • Scanned/imaged documents
  • Scanned 35mm film
  • 3D models of physical objects

Data collection methods

By now you will have some idea of the types of research data you'll be collecting. Some of this data may be physical and some may be digital. Some may be collected as physical data (e.g. paper forms or archival records) and then organised and stored in a digital format (e.g. an Excel spreadsheet or a Word document), or kept in both forms. You will need to consider the different requirements of both digital and physical data when developing your data management plan.