Research Data Management: Data Basics

A guide to KU Libraries' services for managing and preserving research data.

More Stable Formats

Data files often become unreadable or difficult to use with changes in hardware and software over time. Consider migrating your data to formats more likely to be accessible in the future, with the following characteristics:

  • Non-proprietary
  • Open, documented standard
  • Common usage by research community
  • Standard representation (ASCII, Unicode)
  • Unencrypted
  • Uncompressed
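Two of the characteristics above, standard representation and absence of compression, can be checked mechanically. The following is a minimal Python sketch, not a complete format validator. The names `describe_file` and `COMPRESSED_SIGNATURES` are illustrative assumptions. Only the gzip and ZIP signatures are checked, and the whole file is read into memory, so it suits small files only.

```python
from pathlib import Path

# Leading "magic" bytes of two common compressed containers.
# Assumption: only gzip and ZIP are of interest here; other
# compression formats are not detected by this sketch.
COMPRESSED_SIGNATURES = {
    b"\x1f\x8b": "gzip",
    b"PK\x03\x04": "zip",
}


def describe_file(path):
    """Return a rough label: compressed, ASCII text, UTF-8 text, or other."""
    data = Path(path).read_bytes()

    # Check for a known compression signature at the start of the file.
    for magic, name in COMPRESSED_SIGNATURES.items():
        if data.startswith(magic):
            return f"compressed ({name})"

    # Try the stricter encoding first: all ASCII is also valid UTF-8.
    for encoding in ("ascii", "utf-8"):
        try:
            data.decode(encoding)
            return f"text ({encoding})"
        except UnicodeDecodeError:
            continue

    return "binary or unknown encoding"
```

Note that modern Office files such as .docx and .xlsx are ZIP containers, so this sketch reports them as compressed.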

Some preferable formats:

  • PDF/A, not Word
  • ASCII, not Excel
  • MPEG-4, not QuickTime
  • TIFF or JPEG2000, not GIF or JPG
  • XML or RDF, not RDBMS

See the UK Data Archive page on data file formats for more examples of preferred formats for preservation. Not all repositories are able to migrate data to newer formats for preservation, so keep a copy in the original file format as well.
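As an illustration of migrating data out of a database into a plain-text format, the sketch below exports one table from a SQLite file to a UTF-8 CSV file using only the Python standard library. It is a hedged example under stated assumptions, not a recommended workflow: SQLite stands in for whatever RDBMS is in use, CSV stands in for the plain-text target, and `export_table_to_csv` is a hypothetical helper name. The original database file is only read, never modified, so the original copy is kept.

```python
import csv
import sqlite3


def export_table_to_csv(db_path, table, out_path):
    """Write every row of one SQLite table to a UTF-8 CSV file.

    The first CSV row holds the column names. Returns the number of
    data rows written.
    """
    conn = sqlite3.connect(db_path)
    try:
        # Table names cannot be bound as SQL parameters, so check the
        # name against the database's own catalogue before using it.
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'")]
        if table not in names:
            raise ValueError(f"no such table: {table}")

        cursor = conn.execute(f'SELECT * FROM "{table}"')
        header = [col[0] for col in cursor.description]

        rows = 0
        with open(out_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(header)
            for row in cursor:
                writer.writerow(row)
                rows += 1
    finally:
        conn.close()
    return rows
```

A call such as `export_table_to_csv("study.db", "readings", "readings.csv")` (both file names are placeholders) would produce one CSV file per table. Column types and constraints are not carried over, so they would need to be documented separately, for example in a codebook.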

Basics

The term 'data' refers to:

  • Observational data: captured in real time, usually irreplaceable

Examples: interviews, neuroimages, sample data, sensor data, survey data, telemetry

  • Experimental data: from lab equipment, often reproducible, but can be expensive

Examples: gene sequences, chromatograms, toroid magnetic field data

  • Simulation data: generated from test models, where the model and its metadata (inputs) are more important than the output data

Examples: climate models, economic models

  • Derived or compiled data: reproducible, but very expensive

Examples: text and data mining, compiled database, 3D models, data gathered from public documents

Examples

Research data (traditional and electronic) may include:

  • Application contents (input, output, log files, simulation software, schemas)
  • Audiotapes, videotapes
  • Collections of digital objects acquired and generated during the research process
  • Data files
  • Database contents (video, audio, text, images)
  • Laboratory notebooks, field notebooks, diaries
  • Methodologies and workflows
  • Models, algorithms, scripts
  • Photographs, films
  • Questionnaires, transcripts, codebooks
  • Slides, artefacts, specimens, samples
  • Standard operating procedures and protocols
  • Test responses
  • Word documents, spreadsheets

File Formats

Data formats can be:

Text: ASCII, Word, PDF

Models: 3D, statistical

Multimedia: JPEG, TIFF, DICOM, MPEG, QuickTime

Numerical: ASCII, SPSS, Stata, Excel, Access, MySQL

Software: Java, C

Instrument-specific: Olympus Confocal Microscope Data Format

Discipline-specific: FITS in astronomy, CIF in chemistry