Organise and select your data

Funders and publishers want to see evidence that you follow good practice in keeping your data safe and well managed, so that they can be confident of the data's integrity.

Organising and documenting your data will also help you:

communicate with colleagues about your research
find data you have created earlier
prepare your data for archiving and sharing with others at the end of the project
make your data FAIR (findable, accessible, interoperable, reusable).

File structure

Think about the best hierarchy for files. Should they be organised by:

type of data: text, dataset, images?
research activities: interviews, surveys?
type of material: documentation, publications, data?

Make an effective hierarchy and stick to it. It will help you reliably find files in the future and know exactly where files should go as your project progresses.
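One way to stick to a hierarchy is to write it down once and create the folders from that record. The sketch below is a minimal illustration in Python. The layout (type of material at the top level, research activities beneath "data") and the function names are assumptions chosen for the example, not a recommended structure.

```python
from pathlib import Path

# Hypothetical layout: type of material at the top level,
# research activities as subfolders of "data".
LAYOUT = {
    "data": ["interviews", "surveys"],
    "documentation": [],
    "publications": [],
}


def planned_folders(root, layout=LAYOUT):
    """Return the folder paths implied by the layout, without creating them."""
    root = Path(root)
    paths = []
    for top, subfolders in layout.items():
        paths.append(root / top)
        paths.extend(root / top / sub for sub in subfolders)
    return paths


def create_folders(root, layout=LAYOUT):
    """Create every planned folder; folders that already exist are left alone."""
    for path in planned_folders(root, layout):
        path.mkdir(parents=True, exist_ok=True)
```

Calling create_folders("my_project") would create the folders under a directory named my_project. Running it again is harmless, so the same script can be reused whenever a new team member sets up a working copy.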

File names

Meaningful file names help you know the content and status of a file:

use terms like project acronyms, researcher’s initials, or information that describes the type of file
add version numbers, file status, or a date
keep file names short
don't use spaces or special characters

See examples of file structures and naming conventions.
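The naming advice above can be applied consistently by building file names from their parts rather than typing them by hand. The following Python sketch is one possible convention, not a prescribed one: the order of the parts, the use of underscores between parts and hyphens within them, and the function names are all assumptions made for illustration.

```python
import re
from datetime import date


def clean(part):
    """Replace spaces and special characters with single hyphens."""
    return re.sub(r"[^A-Za-z0-9]+", "-", part).strip("-")


def build_file_name(project, initials, description, version, day, extension):
    """Join the parts as PROJECT_INITIALS_description_YYYY-MM-DD_vNN.ext."""
    parts = [
        clean(project).upper(),
        clean(initials).upper(),
        clean(description).lower(),
        day.isoformat(),       # ISO dates sort correctly in a file listing
        f"v{version:02d}",     # zero-padded so v02 sorts before v10
    ]
    return "_".join(parts) + "." + extension.lstrip(".").lower()


print(build_file_name("rdm", "JS", "Interview transcript #3", 2, date(2024, 5, 1), "txt"))
# prints RDM_JS_interview-transcript-3_2024-05-01_v02.txt
```

Keeping the description short remains the author's job: the function removes spaces and special characters but does not shorten anything.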

Versioning

Version control helps you distinguish between different iterations of your work, so that you can find correct versions as needed. You should decide:

how many copies of a file you need to keep
how long you need to keep them
how you will tell each version apart, for example by using a consistent naming convention (see above)

If you store files in several places, remember to synchronise the copies regularly.

See up-to-date recommendations on versioning.
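If versions are marked in the file name, the decisions above (how many copies to keep and how to tell them apart) can be checked with a few lines of code. This Python sketch assumes a hypothetical convention in which the version appears as _vNN immediately before the file extension. The pattern and function names are illustrative only.

```python
import re

# Assumed convention: the version is written as "_v" plus digits,
# directly before the extension, for example "survey_v02.csv".
VERSION_PATTERN = re.compile(r"_v(\d+)\.")


def version_of(name):
    """Return the version number in a file name, or None if it has none."""
    match = VERSION_PATTERN.search(name)
    return int(match.group(1)) if match else None


def next_version_name(latest_name):
    """Return the file name to use for the version after latest_name."""
    current = version_of(latest_name)
    if current is None:
        raise ValueError(f"no version number in {latest_name!r}")
    return VERSION_PATTERN.sub(f"_v{current + 1:02d}.", latest_name, count=1)


def versions_to_keep(names, keep=3):
    """Return the `keep` most recent versioned names, newest first."""
    versioned = [name for name in names if version_of(name) is not None]
    return sorted(versioned, key=version_of, reverse=True)[:keep]
```

For example, next_version_name("survey_v02.csv") returns "survey_v03.csv". The sketch only works out names; deciding whether older versions can actually be deleted still depends on your retention requirements.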

If your data isn't digital

If notebooks, physical documents, models, artefacts or live experiences form part of your data, you need to manage and care for them as well as your digital data.

Make digital copies to preserve the data against physical damage and prepare it for long-term storage and sharing. You'll need to identify appropriate formats that represent the data and are suitable for preservation and reuse - plan ahead for this.

Choosing file formats for archiving

Data created in digital formats may also need converting to other formats for storage and sharing. Make sure you build time and resources into your project to allow for this. Using open or standard formats from the start, and making sure your data are backed up in open formats as you go along, can save a lot of time at the end of the project.

Choose file formats that keep to a minimum the work others need to do to reuse your data, and that give your data the best chance of long-term preservation.

See advice on file formats and recommended formats for various types of data.

Interviews and other audio data will need transcribing so make sure you have planned for this specialist task.

More about creating digital surrogates of your work and preparing your data for archiving.

Keep your data safe

Data can be lost in many different ways: through human error, hardware failure, software or media faults, or malicious hacking and virus infection. Digital data files can also be corrupted in storage or through file transfer.

How you can protect your data:

Use the University of Kent network and Data Centre to process, store and back up your data. The University of Kent Data Centre is guaranteed by Cyber Essentials certification.
Back up your work regularly. If files contain sensitive data or personal information, create only a minimum number of copies (for example one master file and a backup copy).
Encrypt your files
Make sure your laptop is encrypted
Follow our IT safety and security advice
Make sure you choose the optimum file format for your data in the long term.
Deposit it in a trustworthy repository at the end of your project.
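Corruption in storage or during file transfer, mentioned above, is often silent. A common way to detect it is to record a checksum for each file and compare it again later, for example after copying to a backup. The Python sketch below uses SHA-256 from the standard library. The function names and the idea of holding the manifest as a dictionary are assumptions for illustration, and this is not a description of any University of Kent service.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        path.relative_to(folder).as_posix(): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def changed_files(folder, manifest):
    """List files from the manifest that are missing or no longer match."""
    current = build_manifest(folder)
    return sorted(
        name for name, checksum in manifest.items()
        if current.get(name) != checksum
    )
```

A typical use is to build a manifest of the master copy, then call changed_files on the backup folder with that manifest. An empty list means every recorded file arrived intact. A checksum detects changes but cannot repair them, so it complements backups rather than replacing them.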

Document your data

Keep a record of decisions you make about how you organise your data.

During your project this will help you remember your decisions and ensure that all the members of the team are doing the same thing. Documentation will keep your data consistent and help you interpret your data and give it context in the short and long term.

Once your project is completed, you will need to archive this information with your data to help future users understand your data.

The more detail you record, the more useful the documentation will be. Include:

study-level information: details about the design of the research, methodologies, processes
data-level information, which may be embedded in the files
metadata: according to an established schema that is used by data repositories to describe your data.

You'll need to create a README file to describe everything needed to replicate the data and help others use it and understand it properly.

Advice on documenting your data (UK Data Service)

README files explained

When you archive your data at the end of the project, a README file accompanies the datasets to introduce them and give them context. Its purpose is to describe everything needed to replicate the data, or to use it and understand it properly.

If you have kept notes throughout the project and have an up-to-date data management plan, creating the README file will be straightforward.

The outline below shows one way of approaching a README file, questions you could answer, and information you could include.

Outline

Data and file overview

For each file, a short description of what it contains, and who created it

format of the file if not obvious from the file name
if the data set includes multiple files that relate to one another, the relationship between the files or a description of the file structure that holds them - you could use terminology like 'dataset' or 'study' or 'data package'

Date the file was created, dates of updates (versions) and the nature of the updates
Information about any related data that was collected but isn't in the described dataset.

Methodological information

Description of methods for data collection or generation - include links or references to publications or other documentation containing experimental design or protocols used
Description of methods used for data processing - describe how the data was generated from the raw or collected data

any instrument-specific information needed to understand or interpret the data
standards and calibration information, if appropriate
any quality-assurance procedures performed on the data
definitions of codes or symbols used to flag low-quality or questionable values, or outliers, that people should be aware of

People involved with sample collection, processing, analysis and/or submission
Legal or ethical considerations and agreements.

Data-specific information

Count of number of variables, and number of cases or rows
List of variables, including full names and definitions of column headings for tabular data - spell out any abbreviated words
Units of measurement
Definitions for codes or symbols used to record missing data
Specialized formats or other abbreviations used.
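Some of the data-specific information can be drafted directly from a tabular file, which reduces the chance of the README and the data disagreeing. The Python sketch below counts variables and rows in CSV text and tallies missing values for each column. The missing-value codes ("" and "NA"), the function names and the wording of the output lines are assumptions for the example. Definitions and units still have to be written by someone who knows the data.

```python
import csv
import io


def summarise_table(text, missing_codes=("", "NA")):
    """Count variables and rows in CSV text and tally missing values per column."""
    reader = csv.DictReader(io.StringIO(text))
    variables = list(reader.fieldnames or [])
    missing = {name: 0 for name in variables}
    rows = 0
    for row in reader:
        rows += 1
        for name in variables:
            # Short rows yield None for absent columns; treat that as empty.
            if (row.get(name) or "").strip() in missing_codes:
                missing[name] += 1
    return {"variables": variables, "rows": rows, "missing": missing}


def readme_lines(summary):
    """Draft README lines; definitions and units are left as placeholders."""
    lines = [
        f"Number of variables: {len(summary['variables'])}",
        f"Number of rows: {summary['rows']}",
        "Variables:",
    ]
    for name in summary["variables"]:
        count = summary["missing"][name]
        lines.append(f"  {name}: [definition and units] (missing values: {count})")
    return lines
```

To use it on a file, read the file's contents as text and pass them to summarise_table, then paste the output of readme_lines into the README and fill in the placeholders.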

Email researchsupport@kent.ac.uk to obtain a blank README file template.

Select your data

To make your data FAIR, that is, 'as open as possible, as restricted as necessary', you need to decide which datasets to archive and share.

The Digital Curation Centre (DCC) outlines five steps to help you decide what to keep. They are:

Identify what purposes the data could fulfil
Identify what data must be kept
Identify what data should be kept
Weigh up the costs
Complete the data appraisal - using the DCC checklist.

Get support

Get support with organising and selecting your research data by emailing the Open Research Team.

Email us about your research data
