How to write a research data management plan

Stavrina Dimosthenous

Henry Royce Institute, University of Manchester

2023-07-12

What is a data management plan?

A research data management plan (DMP) is a living document that can be updated at any point, as required.

What is a data management plan?

A DMP outlines every stage of a project’s data lifecycle before data collection begins, including how the data will be:

  • collected,
  • stored,
  • protected,
  • shared,
  • licensed.

The research data lifecycle

Why write a data management plan?

Organisational reasons

  1. It is required
    1. By your funders
    2. By your school/faculty/department
  2. It will help with project reporting to funders

Why write a data management plan?

Personal reasons

  1. It helps you to:
    1. Formalise a process
    2. Assess what is needed
    3. Identify areas that require attention
    4. Enforce consistency in data management
      • Consistent data management leads to faster data retrieval
  2. You can adapt a pre-existing DMP for a similar project
  3. You can refer to the DMP when writing your thesis
  4. It guides the project management of your project ⇛ timely thesis completion

When to write a Research DMP?

Plan

Gather your resources

  • Guidance and policy documents including:
    1. Funder guidance
    2. Institutional guidance
    3. Private funder guidance (e.g. industrial sponsor)
  • Decide on your DMP workflow
    • When to update
    • One vs. several DMPs
  • Seek reviewers for your DMP

Plan

Consider your project

  • Pre-existing data
  • Research group data management methods
    1. Does the group use a particular storage system?
    2. How does my group share files internally and externally?
    3. How does my group collaborate?
    4. How does my group manage physical data storage, i.e. samples?
  • Institutional data management resources
  • Is the data sensitive, e.g. because it relates to a defence or commercial project?

Collect

  • What type of data?
    • Physical (Don’t forget samples are data too!)
      • What types of samples?
    • Digital
      • What file formats?

Collect

  • How?
    • Physical
      • Received from partner?
      • Created in lab?
    • Digital
      • What pieces of data acquisition equipment?
      • How will data be generated from simulations?

Collect

  • Where will the data be acquired from?
    • Novel?
    • Publicly available?
    • Data from group?
    • Data from partners?

Collect

  • What software do I need?
    • Do I need specialised software to acquire and analyse the data?
    • What type of file formats does the software I need export to?
    • Can I convert acquired data to open file formats?
    • How will I acquire this software?
      • Free and open source? Paid licence?
  • What support will I need from institutional resources (people)?
  • What software will I be using to track samples?
    • Sample tracking software?
  • What software will I be using to track experiments?
    • Electronic laboratory notebooks?

Process

  • Does data need to be transferred between locations?
  • How will data be organised?
    • Sensible directory naming conventions
    • Sensible file naming conventions
  • Does data come in proprietary file formats?
    • Does it need to be converted to an open file format?
    • Will it be easily accessible by analysis programs?
  • Will data need to be cleaned before analysis?
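One way to make directory and file naming conventions concrete is to generate and check names in code. The sketch below assumes a hypothetical convention of `<YYYYMMDD>_<project>_<sample>_<description>_v<NN>.<ext>`; the pattern and the function names are illustrative and are not prescribed by any funder or institution.

```python
import re
from datetime import date

# Hypothetical convention: <YYYYMMDD>_<project>_<sample>_<description>_v<NN>.<ext>
FILENAME_PATTERN = re.compile(
    r"^\d{8}_[a-z0-9-]+_[a-z0-9-]+_[a-z0-9-]+_v\d{2}\.[a-z0-9]+$"
)


def slugify(text: str) -> str:
    """Lower-case the text and replace runs of other characters with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(acquired: date, project: str, sample: str,
                   description: str, version: int, extension: str) -> str:
    """Assemble a file name that follows the hypothetical convention above."""
    parts = [
        acquired.strftime("%Y%m%d"),
        slugify(project),
        slugify(sample),
        slugify(description),
        f"v{version:02d}",
    ]
    return "_".join(parts) + "." + extension.lower().lstrip(".")


def follows_convention(filename: str) -> bool:
    """Return True if the file name matches the hypothetical convention."""
    return FILENAME_PATTERN.fullmatch(filename) is not None
```

For example, `build_filename(date(2023, 7, 12), "Alloy Study", "S-001", "tensile test", 1, "CSV")` gives `20230712_alloy-study_s-001_tensile-test_v01.csv`. Putting the date first in `YYYYMMDD` form means an alphabetical sort is also a chronological one.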

How will the processing step be documented?
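One possible answer is a machine-readable log written alongside the data. The sketch below is an assumption about how such a log might look, not a required format: each processing step appends one JSON line recording the input file, the output file, their SHA-256 checksums and the parameters used. The function names and field names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_processing_step(log_path: Path, step: str, input_path: Path,
                           output_path: Path, parameters: dict) -> dict:
    """Append one JSON line describing a processing step to the log file."""
    entry = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "step": step,
        "input": {"file": input_path.name, "sha256": sha256_of(input_path)},
        "output": {"file": output_path.name, "sha256": sha256_of(output_path)},
        "parameters": parameters,
    }
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry
```

The checksums make it possible to confirm later that a given output really came from a given input, and an append-only text log is readable without specialised software.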

Analyse

Well documented == reproducible

Preserve

  • How will the data be stored temporarily, short-term, long-term?
  • How long will the data be retained for after the end of the project?
  • Will the data be archived?
  • Are there any data destruction procedures that need to be followed?
  • Which type of repository will be used: public vs. institutional, open vs. subscription-based?
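The retention question can be turned into a date that is easy to record in the DMP. This is a minimal sketch, assuming the retention period is counted in whole years from the project end date; the actual period and its starting point are set by funder and institutional policy, and the function names are hypothetical.

```python
from datetime import date


def retention_end(project_end: date, retention_years: int) -> date:
    """Return the date on which the retention period ends."""
    try:
        return project_end.replace(year=project_end.year + retention_years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; use 28 February.
        return project_end.replace(year=project_end.year + retention_years, day=28)


def may_be_destroyed(today: date, project_end: date, retention_years: int) -> bool:
    """Return True once the retention period has fully elapsed."""
    return today > retention_end(project_end, retention_years)
```

With a placeholder period of 10 years, `retention_end(date(2026, 9, 30), 10)` gives `date(2036, 9, 30)`. Any data destruction procedure would still need to follow the applicable policy rather than this check alone.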

Share

Two aspects to sharing research and data:

  1. Internal to the project
    • Perhaps external to your institution
  2. Publishing

Share

Consider:

  • Who owns the intellectual property rights to the data?
  • Partner policy
  • Funder policy

These will determine:

  • Who has the rights to reuse the data
  • Whether the data will remain with the research group
  • How the data can be shared
  • Where the data can be shared

Share

Repositories:

  • Which repository will make my data more visible? (Findable to the community)
  • Is there a specific repository to deposit to in accordance with my funder policy?
  • Could my data be made more Accessible, Interoperable and Reusable?
  • How do I want to license my data/code?

Share

Open access:

  1. Diamond
  2. Gold
  3. Green

Financial aspect:

  • What will I do if I can’t publish Gold Open Access?

Tips

Weave RDM practices into your everyday workflow

Not just an afterthought or box-ticking activity

Even as a one-person lab

It is part of your work and responsibilities

Senior researchers need to encourage, support and promote RDM

Ask if your research group has a nominated data management champion

Remember

It is a living document: it can change to reflect changes in the direction of your project

Resources

Data Curators RDMP Guide
