Friday, 19 April 2024, 2:38 PM
Site: Datatree - Data Training Engaging End-users
Course: Introduction to Data Tree (Intro)
Glossary: Glossary of Terms
S

Sample

A sample is a group of units selected from a larger group (the population). The aim of studying the sample is to draw valid conclusions (inferences) about the population. A sample is usually used because the population is too large to study in its entirety. The sample should be representative of the population, which is best achieved by random sampling. The sample is then called a random sample.
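
As a minimal sketch of random sampling, the Python snippet below draws a simple random sample without replacement. The population of 10,000 numbered units, the sample size of 100 and the fixed seed are all assumptions made up for illustration.

```python
import random

# Hypothetical population: units numbered 1 to 10,000 (invented for illustration).
population = list(range(1, 10_001))

# A fixed seed makes the draw repeatable; it is an arbitrary choice.
rng = random.Random(42)

# Simple random sample without replacement: every unit has the same
# chance of being selected, and no unit is selected twice.
sample = rng.sample(population, k=100)

print(len(sample), "units drawn,", len(set(sample)), "distinct")
```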

Sampling Distribution

A sampling distribution describes the probabilities associated with an estimator when a random sample is drawn from a population. The random sample is considered as one of the many samples that might have been taken, each of which would have given a different value for the estimator. The distribution of these different values is called the sampling distribution of the estimator. Deriving the sampling distribution is the first step in calculating a confidence interval or in conducting a hypothesis test.
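
The idea of "many samples that might have been taken" can be approximated by simulation. The sketch below assumes a made-up population of 10,000 exponentially distributed values and uses the sample mean as the estimator; the sample size (50) and the number of repeated samples (2,000) are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed, arbitrary

# Hypothetical population: 10,000 values from a skewed (exponential) distribution.
population = [rng.expovariate(1.0) for _ in range(10_000)]

def sample_mean(n):
    """Draw one random sample of size n and return its mean (the estimate)."""
    return statistics.fmean(rng.sample(population, n))

# Repeat the sampling many times; each sample gives a different estimate.
estimates = [sample_mean(50) for _ in range(2_000)]

# The collection of estimates approximates the sampling distribution of the mean.
print("population mean:      ", statistics.fmean(population))
print("mean of the estimates:", statistics.fmean(estimates))
print("spread of estimates:  ", statistics.stdev(estimates))
```

The estimates cluster around the population mean, and their spread shows how much the estimator varies from one sample to the next.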

Satellite imagery

An image of part of the Earth taken using artificial satellites in orbit around the Earth. These images have a variety of uses including

Secondary Data

Existing data which is being reused for a purpose other than the one for which it was collected.

Sentinel satellites

A family of Earth observation satellite missions developed by the European Space Agency for the Copernicus programme. See http://m.esa.int/Our_Activities/Observing_the_Earth/Copernicus/Overview4

Signal to noise ratio

A measure of how much useful information there is in a system. The phrase is applied generally but originates in electrical systems, where it indicates the strength of the information (signal) compared to unwanted interference (noise). A low signal to noise ratio means that it is difficult to determine the useful information.
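
One common convention in signal processing, which this glossary does not itself specify, expresses the ratio as signal power divided by noise power, often in decibels. The sketch below follows that convention; the sine-wave signal and the two noise levels are invented for illustration.

```python
import math
import random

def mean_power(values):
    """Average of the squared values (mean power)."""
    return sum(v * v for v in values) / len(values)

def snr_db(signal, noise):
    """Signal to noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    return 10 * math.log10(mean_power(signal) / mean_power(noise))

rng = random.Random(1)  # fixed seed, arbitrary

# Hypothetical signal: a sine wave with amplitude 1.
signal = [math.sin(2 * math.pi * i / 50) for i in range(1000)]

# Two hypothetical noise sources: weak and strong random interference.
weak_noise = [rng.gauss(0, 0.1) for _ in range(1000)]
strong_noise = [rng.gauss(0, 2.0) for _ in range(1000)]

print("SNR with weak noise (dB):  ", snr_db(signal, weak_noise))
print("SNR with strong noise (dB):", snr_db(signal, strong_noise))
```

A positive value in decibels means the signal is stronger than the noise; a negative value means the noise dominates, which is the "low signal to noise ratio" case.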

Simulation research data

Research data generated from test models, where the model and its metadata may be more important than the output data from the model.

Examples: Climate or ocean circulation models.

Skew

If the distribution (or “shape”) of a variable is not symmetrical about the median or the mean, it is said to be skewed. The distribution has positive skewness if the tail of high values is longer than the tail of low values, and negative skewness if the reverse is true.
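
Skewness can be quantified in several ways. The sketch below uses one common choice, the moment coefficient of skewness (third central moment divided by the second central moment raised to the power 1.5), which is an assumption here rather than a definition given in this glossary. The three small data sets are invented.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets (invented for illustration).
long_high_tail = [1, 2, 2, 3, 3, 3, 4, 10, 25]   # a few very high values
long_low_tail = [-v for v in long_high_tail]      # mirror image
symmetric = [1, 2, 3, 4, 5]

print(skewness(long_high_tail))  # positive
print(skewness(long_low_tail))   # negative
print(skewness(symmetric))       # zero
```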

Smart Meter

A new kind of energy meter that can digitally send meter readings to your energy supplier. It comes with an in-home display unit that shows in real time how much energy is being used in a household.

Software developer

A person who researches, designs, programs and tests computer code.

Stakeholder

Individuals, groups or organisations that have an interest or share in an undertaking or relationship and its outcome - they may be affected by it, impact or influence it, and in some way be accountable for it.

- CASRAI Dictionary

Standard deviation

The standard deviation (s.d.) is a commonly used summary measure of the variation or spread of a set of data. It is a “typical” distance from the mean. Usually, about 70% of the observations are within 1 s.d. of the mean and most (about 95%) are within 2 s.d. of the mean.
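
The rule of thumb can be checked on simulated data. The sketch below assumes normally distributed observations (mean 50, s.d. 10, both invented), for which the expected proportions are roughly 68% and 95%; data with a very different shape may give other proportions.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed, arbitrary

# Hypothetical data: 10,000 normally distributed observations.
data = [rng.gauss(50, 10) for _ in range(10_000)]

mean = statistics.fmean(data)
sd = statistics.stdev(data)  # sample standard deviation

def fraction_within(k):
    """Proportion of observations within k standard deviations of the mean."""
    return sum(abs(x - mean) <= k * sd for x in data) / len(data)

print("standard deviation:", sd)
print("within 1 s.d.:", fraction_within(1))
print("within 2 s.d.:", fraction_within(2))
```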

Standard error

The standard error (s.e.) is a measure of precision and a key component of statistical inference. The standard error of an estimator measures how close the estimator is likely to be to the parameter it is estimating. It is the standard deviation of the estimator's sampling distribution.
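
For the most familiar case, the mean of a simple random sample, the standard error is usually estimated as the sample standard deviation divided by the square root of the sample size. The sketch below applies that formula to a handful of invented readings; other estimators have different standard error formulas.

```python
import math
import statistics

def standard_error_of_mean(values):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

# Hypothetical sample of eight readings (invented for illustration).
readings = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7]

print("sample mean:   ", statistics.fmean(readings))
print("standard error:", standard_error_of_mean(readings))
```

Because the sample size sits under a square root, a sample four times as large roughly halves the standard error.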

Stream processing

The practice of computing over individual data items as they move through a system. This allows for real-time analysis of the data being fed to the system and is useful for time-sensitive operations using high-velocity metrics.
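
A minimal sketch of the idea, using a Python generator: each item is processed as it arrives and an up-to-date result is produced immediately, without storing the whole data set. The function name running_mean and the four readings are hypothetical; real stream processing systems add concerns such as windowing and fault tolerance that are not shown here.

```python
from typing import Iterable, Iterator

def running_mean(stream: Iterable[float]) -> Iterator[float]:
    """Yield the mean of all items seen so far, one result per incoming item."""
    count = 0
    total = 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count  # result is available before the next item arrives

# Hypothetical stream of readings; an iterator stands in for a live data feed.
readings = iter([3.0, 5.0, 4.0, 8.0])

for mean_so_far in running_mean(readings):
    print(mean_so_far)  # prints 3.0, 4.0, 4.0, 5.0 on successive lines
```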