Glossary of Terms


All complex subjects have their own terminology, which can make it hard for newcomers to break into the field. Some of this terminology consists of uncommon words, but more often a subject gives very specific meanings to common words; the discussion of errors vs mistakes in this video is a good example.

This glossary is a reference of some of the uncommon terms and specific definitions of more common words that you will encounter throughout Data Tree and your broader dealings with data. 

Many of these definitions come from the course materials and experts that helped develop Data Tree. Others come from the CASRAI Dictionary. Those definitions are kindly made available under a Creative Commons Attribution 4.0 International License.




E

e-Infrastructure

A combination and interworking of digitally-based technology (hardware and software), resources (data, services, digital libraries), communications (protocols, access rights and networks), and the people and organisational structures needed to support modern, internationally leading collaborative research be it in the arts and humanities or the sciences. http://www.rcuk.ac.uk/research/xrcprogrammes/otherprogs/einfrastructure/

E-Research

Computationally intensive, large-scale, networked and collaborative forms of research and scholarship across all disciplines, including all of the natural and physical sciences, related applied and technological disciplines, biomedicine, social science and the digital humanities.

- CASRAI Dictionary

Earth Observation

Gathering information about the Earth's physical systems via remote sensing technologies, often satellites which look down at the Earth from their orbit.

Electromagnetic Spectrum

The range of wavelengths of electromagnetic radiation, from gamma rays, which have short wavelengths and high energy, to radio waves, which have long wavelengths and low energy. Visible light is part of the electromagnetic spectrum. Examples of the use of electromagnetic radiation: https://www.bbc.co.uk/education/guides/z66g87h/revision/3

ENIAC

Electronic Numerical Integrator And Computer, generally regarded as the world's first general-purpose electronic digital computer. It was designed and built in the 1940s to calculate artillery firing tables and was later used for computer weather predictions. https://www.thoughtco.com/history-of-the-eniac-computer-1991601

Ensemble

In weather forecasting, an ensemble is a method whereby, instead of making a single forecast, a set of forecasts is produced that presents a range of future weather possibilities. https://www.ecmwf.int/en/about/media-centre/fact-sheet-ensemble-weather-forecasting
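The idea of producing a set of forecasts rather than one can be sketched with a toy example. The code below is only an illustration under stated assumptions: `toy_model` is a simple chaotic map standing in for a forecast model (it is not a weather model), and the starting value, perturbation size and member count are all made up.

```python
import random
import statistics


def toy_model(x0, steps, r=3.9):
    """Hypothetical stand-in for a forecast model: iterate a chaotic map."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x


def run_ensemble(best_guess, n_members, perturbation, steps, seed=0):
    """Run the toy model from slightly different starting conditions."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        # Each member starts from the best guess plus a small random nudge.
        x0 = best_guess + rng.uniform(-perturbation, perturbation)
        members.append(toy_model(x0, steps))
    return members


forecasts = run_ensemble(best_guess=0.4, n_members=20, perturbation=0.001, steps=30)

# The spread of the members indicates the range of possible outcomes.
print("lowest member: ", min(forecasts))
print("highest member:", max(forecasts))
print("ensemble mean: ", statistics.mean(forecasts))
```

A narrow spread between the lowest and highest members would suggest a more confident forecast; a wide spread suggests the outcome is sensitive to small uncertainties in the starting conditions.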

Environmental analytics

Analysis of data sourced from the environment, or data with an application relating to the environment.

Environmental consultant

Works on a contractual basis for private and public sector clients, addressing environmental issues such as water pollution, air quality and soil contamination. [www.sokanu.com]

Environmental research data

Individual items or records (both digital and analogue) usually obtained by measurement, observation or modelling of the natural world and the impact of humans upon it, including all necessary calibration and quality control. This includes data generated through complex systems, such as information retrieval algorithms, data assimilation techniques and the application of numerical models. However, it does not include the models themselves. 

 - NERC Data Policy


Examples of research data:

  • Model output from running a numerical climate model
  • Time series logged by environmental instrumentation
  • Conductivity-Temperature-Depth casts from oceanographic cruises
  • Groundwater chemistry and stable isotope measurements
  • Butterfly abundance observations.

Error

Error is the difference between the measured value and the ‘true value’ (NPL, 1999). Errors can come from the measuring device itself, including bias, changes due to wear, instrument drift, electrical noise and device resolution. Other errors can be introduced by difficulties in performing the measurement and by operator skill. To avoid sampling error, measurements should be sufficiently dense in space and time to capture the full variability, e.g. diurnal cycles or variations across a site.

Errors can be random or systematic (NPL, 1999).  With random errors, each measurement gives a different result, so the more measurements (of the same thing) the better the estimate and the more certain the measurement becomes.  Systematic errors arise from a bias, e.g., a stretched tape measure, and more measurements do not produce a better estimate of the ‘true value’.
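The difference between random and systematic error can be illustrated with a small simulation. This is a minimal sketch, not a measurement procedure: the 'true value', noise level and bias below are hypothetical numbers, and the random error is assumed to follow a normal distribution.

```python
import random
import statistics


def simulate_readings(true_value, n, noise_sd, bias, seed=0):
    """Return n simulated readings: true value + constant bias + random noise."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]


TRUE_LENGTH = 100.0  # hypothetical 'true value' of the quantity being measured

# Random error only: the readings scatter around the true value.
random_only = simulate_readings(TRUE_LENGTH, n=10_000, noise_sd=0.5, bias=0.0)

# Random error plus a systematic error (a constant offset of 2.0 units).
with_bias = simulate_readings(TRUE_LENGTH, n=10_000, noise_sd=0.5, bias=2.0)

random_only_error = statistics.mean(random_only) - TRUE_LENGTH
with_bias_error = statistics.mean(with_bias) - TRUE_LENGTH

print("error of the mean, random error only:", random_only_error)
print("error of the mean, with systematic bias:", with_bias_error)
```

Averaging many readings brings the first result close to zero, while the second stays close to the bias however many readings are taken.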


Estimation

Estimation is the process by which sample data are used to indicate the value of an unknown quantity in a population. The results of estimation can be expressed as a single value, known as a point estimate. It is usual to also give a measure of the precision of the estimate, called the standard error of the estimate. A range of values, known as a confidence interval, can also be given.

Estimator

An estimator is a quantity calculated from the sample data, which is used to give information about an unknown quantity (usually a parameter) in the population. For example, the sample mean is an estimator of the population mean.
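The terms in the Estimation and Estimator entries can be seen together in a short calculation. This is a sketch under assumptions: the sample values are made up, and the confidence interval uses a normal approximation (for a sample this small, a t-distribution would normally be more appropriate).

```python
import math
import statistics

# Made-up sample of measurements drawn from some larger population.
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]
n = len(sample)

# Estimator: the sample mean, used as a point estimate of the population mean.
point_estimate = statistics.mean(sample)

# Standard error of the mean: sample standard deviation divided by sqrt(n).
standard_error = statistics.stdev(sample) / math.sqrt(n)

# Approximate 95% confidence interval (normal approximation, an assumption).
z = statistics.NormalDist().inv_cdf(0.975)
confidence_interval = (
    point_estimate - z * standard_error,
    point_estimate + z * standard_error,
)

print("point estimate:", point_estimate)
print("standard error:", standard_error)
print("approx. 95% confidence interval:", confidence_interval)
```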

Exa-

Prefix denoting a factor of 10^18, or a billion billion.
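As a quick arithmetic check of this definition, the sketch below confirms that 10^18 equals a billion billion (taking a billion as 10^9) and converts one exa-unit into giga-units. The constant and function names are illustrative only.

```python
EXA = 10**18      # factor denoted by the prefix exa-
GIGA = 10**9      # factor denoted by the prefix giga- (a billion)
BILLION = 10**9


def exa_to_giga(value_in_exa):
    """Convert a quantity expressed in exa-units to giga-units."""
    return value_in_exa * EXA // GIGA


is_billion_billion = EXA == BILLION * BILLION
gigabytes_per_exabyte = exa_to_giga(1)

print("10^18 is a billion billion:", is_billion_billion)
print("gigabytes in one exabyte:", gigabytes_per_exabyte)
```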

Experimental research data

Research data obtained from experimental results. Such data are often reproducible, but reproducing them can be expensive.

Examples: data from lab equipment, metagenomic sequences recovered from soil samples, results of a field experiment.