Unit 1: Review of Relevant Statistical Concepts

In this unit, you’ll be presented with a review of statistical definitions and concepts related to describing and defining statistical data. These definitions will be explained using a selection of detailed examples.

Statistical characteristics

Before diving into structural modelling, let’s review a few statistical concepts used for describing and defining macro statistics.

A statistical characteristic is:

a statistical measure applied to
the values of one or more statistical variables
for the objects in a certain statistical unit.

So, what do these terms mean? Select each term to find out.

Statistical unit

A statistical unit is an (abstract) entity in the population for which information is sought and for which statistics are ultimately compiled, that is, the counted object. Statistical units can be persons, households, geographical areas, events, etc. To identify the statistical unit, answer the question, ‘what is the counted object?’

A population is a complete set of a certain type of statistical unit. Whereas statistical units are abstract, populations are concrete sets of objects with at least one characteristic in common.

Statistical measure

A statistical measure is a summarising (aggregation) function, such as count, sum, or average, applied to objects in the population. For example, in “Number of accidents per thousand of population”, the statistical measure is “number of accident events”.

Statistical variable

A statistical variable is a characteristic of a statistical unit which is measured or counted, such as height, country of birth, grades obtained at school, or income.

Statistical variables have a variable type and a value set. A type of statistical variable which divides the population into subdomains of interest is referred to as a classification variable. For example, sex is a classification variable for the population: “Number of persons living in Canada by sex at the end of year 2022”.

Statistical characteristics in context

Now select each of these detailed examples to see what statistical characteristics look like in context.

For a reminder of what the key terms mean, select the links.

Example 1

“Persons living in Canada at the end of year 2022.”

Statistical unit (counted object) is ‘person’ (abstract).
Population is ‘persons living in Canada at the end of 2022’ (a complete set of ‘persons’).
Statistical measure is ‘number (of objects in the population)’ which is ‘number of persons’.
Variables
- Population-defining properties:
  - Country of residence = Canada
- Classification variables:
  - Sex
Reference time: year 2022
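
To make the roles of the terms in Example 1 concrete, here is a minimal Python sketch. The records and field names (`country_of_residence`, `sex`) are invented for illustration and are not real data; the point is only to show a population-defining property selecting the population, and the measure "number of persons" being broken down by a classification variable.

```python
from collections import Counter

# Hypothetical micro-records: one dict per statistical unit (a person).
# All values are invented purely for illustration.
persons = [
    {"country_of_residence": "Canada", "sex": "female"},
    {"country_of_residence": "Canada", "sex": "male"},
    {"country_of_residence": "Canada", "sex": "female"},
    {"country_of_residence": "France", "sex": "male"},
]

# Population-defining property: country of residence = Canada.
population = [p for p in persons if p["country_of_residence"] == "Canada"]

# Statistical measure "number of persons" for the whole population ...
number_of_persons = len(population)

# ... and the same measure broken down by the classification variable "sex".
number_of_persons_by_sex = Counter(p["sex"] for p in population)
```

The filter plays the role of the population definition, while `Counter` plays the role of the count measure applied within each subdomain.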

Example 2

“Number of foreign citizens living in Sweden and their average yearly incomes by citizenship, region, sex, and age. Years 1996-2003.”

Statistical unit (counted object) is ‘person’ (abstract).
Population is ‘foreign citizens living in Sweden at the end of year y’ (a complete set of ‘persons’).
There are two statistical measures in this example:
- ‘number (of objects in the population)’ which is the ‘number of persons’.
- ‘average (income) (of objects in the population)’ which is the ‘average income of persons’.
Variables
- Population-defining properties:
  - Country of residence = Sweden
  - Citizenship = non-Swedish
- Classification variables:
  - Citizenship
  - Region
  - Sex
  - Age
Reference times: years 1996-2003
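
Example 2 applies two statistical measures to the same population. The sketch below shows one way this could look in Python; the records, field names and income figures are hypothetical, and only two of the four classification variables are used to keep it short.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical micro-records (invented values, one dict per person).
persons = [
    {"residence": "Sweden", "citizenship": "Finnish", "sex": "female", "income": 300000},
    {"residence": "Sweden", "citizenship": "Finnish", "sex": "female", "income": 340000},
    {"residence": "Sweden", "citizenship": "Polish", "sex": "male", "income": 280000},
    {"residence": "Sweden", "citizenship": "Swedish", "sex": "male", "income": 310000},
    {"residence": "Norway", "citizenship": "Finnish", "sex": "male", "income": 400000},
]

# Population-defining properties: residence = Sweden, citizenship = non-Swedish.
population = [
    p for p in persons
    if p["residence"] == "Sweden" and p["citizenship"] != "Swedish"
]

# Group incomes by the classification variables (citizenship, sex).
incomes_by_group = defaultdict(list)
for p in population:
    incomes_by_group[(p["citizenship"], p["sex"])].append(p["income"])

# Apply both measures (count and average) within each group.
table = {
    group: {"number_of_persons": len(incomes), "average_income": mean(incomes)}
    for group, incomes in incomes_by_group.items()
}
```

Each key of `table` is one combination of classification values, and each entry holds both measures for that subdomain.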

Example 3

“Number of working persons in Germany, 16 years of age and older, living in the region (night population) by region of dwelling, region of work, sex, occupation, socio-economic status, income class, activity class of working place. Year 1990.”

Statistical unit (counted object) is ‘person’ (abstract).
Population is ‘working persons, 16+ years old, living in Germany at the reference time, t’ (a complete set of ‘persons’).
Statistical measure is the ‘number (of objects in the population)’ which is the ‘number of persons’.
Variables
- Population-defining properties:
  - Country of residence = Germany
  - Working status = working
  - Age ≥ 16 years
- Classification variables:
  - Region of dwelling
  - Region of work
  - Sex
  - Occupation
  - Socio-economic status
  - Income class
  - Activity class of working place
Reference time: year 1990

Example 4

“Number of migrations in Greece by sex, age, from_region and to_region. Years 1998-2003.”

Statistical unit (counted object) is ‘migration event’ (abstract).
Population is ‘migration events concerning persons living in Greece (before and/or after the event) that have taken place during the reference year, y’ (a complete set of ‘migration events’).
Statistical measure is the ‘number (of objects in the population)’ which is the ‘number of migration events’.
Variables
- Population-defining properties:
  - Country of residence of person = Greece
- Classification variables:
  - Sex of the migrating person
  - Age of the migrating person
  - Region of dwelling from which the person migrates
  - Region of dwelling to which the person migrates
Reference times: years 1998-2003
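
Example 4 differs from the earlier examples because the counted object is an event, not a person. The following sketch, using invented event records and a hypothetical `person_id` field, illustrates why the choice of statistical unit matters: a person who migrates twice in a year contributes two migration events.

```python
from collections import Counter

# Hypothetical event records (invented values, one dict per migration event).
migration_events = [
    {"person_id": 1, "year": 1998, "from_region": "Attica", "to_region": "Crete"},
    {"person_id": 1, "year": 1998, "from_region": "Crete", "to_region": "Attica"},
    {"person_id": 2, "year": 1998, "from_region": "Attica", "to_region": "Epirus"},
    {"person_id": 3, "year": 1999, "from_region": "Crete", "to_region": "Attica"},
]

# Population for reference year 1998: events that took place during that year.
events_1998 = [e for e in migration_events if e["year"] == 1998]

# Statistical measure: number of migration events (the counted object is the event).
number_of_events = len(events_1998)

# For contrast only: the number of distinct persons involved is a different figure.
number_of_migrating_persons = len({e["person_id"] for e in events_1998})

# Count of events by the classification variables from_region and to_region.
events_by_flow = Counter((e["from_region"], e["to_region"]) for e in events_1998)
```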

Statistical unit

(Abstract) entity in the population for which information is sought and for which statistics are ultimately compiled, that is, the counted object.

Population

A complete set of a certain type of statistical unit. A concrete set of objects with at least one characteristic in common.

Statistical measure

The summarising (aggregation) function like count, sum, and average applied to objects in the population.

Statistical variable

The characteristic of a statistical unit which is measured or counted, such as height, country of birth, grades obtained at school, or income.

Classification variable

A type of statistical variable which divides the population into subdomains of interest.

Unit of measure

The unit of measure is the unit in which the statistical measure values are expressed.

The unit of measure is a quantity or increment by which something is counted or described, such as:

kg, mm, pounds, inches, °C, °F,
monetary unit such as Euro or US dollar,
simple number counts or index numbers.

The unit multiplier is used to indicate if the observations are reported in units, thousands, millions, etc.
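
As a small illustration of how a unit multiplier relates a reported figure to its full value, consider the sketch below. The dictionary of multiplier labels and the function name `to_base_units` are assumptions made for this example, not part of any standard.

```python
# Assumed mapping from a multiplier label to its numeric factor.
UNIT_MULTIPLIERS = {"units": 1, "thousands": 1_000, "millions": 1_000_000}


def to_base_units(reported_value, multiplier):
    """Convert a reported observation to the underlying unit of measure."""
    return reported_value * UNIT_MULTIPLIERS[multiplier]


# A figure reported as 250 "thousands" of persons stands for 250,000 persons.
persons = to_base_units(250, "thousands")

# A figure reported as 38.5 "millions" of euros stands for 38,500,000 euros.
euros = to_base_units(38.5, "millions")
```

The unit of measure (persons, euros) stays the same; only the scale at which the number is written changes.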

Variable types

Variable types are either categorical or numerical.

Select each variable type to learn more.

Categorical variable

A categorical variable (also called qualitative variable) refers to a characteristic that can’t be quantified. Categorical variables can be either nominal or ordinal.

Nominal: A nominal variable is one that describes a name, label or category without natural order. Sex and type of dwelling are examples of nominal variables.
Example: Variable = “Sex”, Value set = “male”, “female”.
Ordinal: An ordinal variable is a variable whose values are defined by an order relation between the different categories. “Behaviour” is ordinal because the category “Excellent” is better than the category “Very good,” which is better than the category “Good,” etc. There is some natural ordering.
Example: Variable = “Behaviour”, Value set = “Excellent”, “Very Good”, “Good”, “Bad”, “Very Bad”.
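
The difference between nominal and ordinal value sets can be sketched in Python as follows. Representing the nominal value set as an unordered `set` and the ordinal one as an ordered `list` is a choice made for this illustration; the helper `is_better` is hypothetical.

```python
# Nominal: the value set has no natural order, so an unordered set is enough.
SEX_VALUES = {"male", "female"}

# Ordinal: the order of the categories carries meaning (listed lowest to highest).
BEHAVIOUR_VALUES = ["Very Bad", "Bad", "Good", "Very Good", "Excellent"]


def is_better(a, b):
    """Return True if behaviour category a ranks above category b."""
    return BEHAVIOUR_VALUES.index(a) > BEHAVIOUR_VALUES.index(b)


# Ordinal values can be sorted by their rank; nominal values cannot be ranked.
ranked = sorted(["Good", "Excellent", "Bad"], key=BEHAVIOUR_VALUES.index)
```

Both variables are still categorical: the ranks order the categories but do not say how far apart they are.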

Numeric variable

A numeric variable (also called quantitative variable) is a quantifiable characteristic whose values are numbers. Numeric variables may be either discrete or continuous.

Discrete: A discrete variable can only assume a finite number of real values within a given interval. An example of a discrete variable would be the score given by a judge to a gymnast in competition: the range is 0 to 10 and the score is always given to one decimal (e.g. a score of 8.5). You can enumerate all possible values (0, 0.1, 0.2…) and see that the number of possible values is finite: it is 101.
Example: Variable = “Score”, Value set = A real number greater than or equal to 0 and less than or equal to 10, given to one decimal place.
Continuous: A variable is said to be continuous if it can assume an infinite number of real values within a given interval. For instance, consider the height of a student. The height can’t take just any value: it can’t be negative, and it can’t be higher than three metres. But between 0 and 3, the number of possible values is theoretically infinite. A student may be 1.6321748755… metres tall.
Example: Variable = “Height of student”, Value set = A real number greater than or equal to 0 and less than or equal to 3.
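
The two value sets above can be contrasted in a short sketch: the discrete value set can be listed in full, whereas the continuous one can only be described by its bounds. The names `score_values` and `is_valid_height` are introduced here for illustration only.

```python
# Discrete: every possible score from 0.0 to 10.0 in steps of 0.1 can be listed.
score_values = [i / 10 for i in range(0, 101)]
number_of_possible_scores = len(score_values)  # 101 values


def is_valid_height(height_in_metres):
    """Continuous: any real number in the interval [0, 3] is a possible value."""
    return 0 <= height_in_metres <= 3
```

For example, `8.5 in score_values` is true, and `is_valid_height(1.6321748755)` is true even though that value could never appear in an exhaustive list.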

Statistical data structures

Statistical data are organised in certain typical structures, for example:

a series of time periods (rather than a single one) – “time series data”, and/or
a structured set of object populations (rather than a single one) – “cross-sectional data” (census data is a good example).
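
One way to picture these two structures is as different slices of the same set of observations. In the sketch below, the region names, years and values are placeholders invented for illustration: fixing the region and varying the time period gives a time series, while fixing the time period and varying the region gives a cross-section.

```python
# Hypothetical observations keyed by (region, year); all values are placeholders.
observations = {
    ("North", 2001): 10, ("North", 2002): 12, ("North", 2003): 15,
    ("South", 2001): 20, ("South", 2002): 19, ("South", 2003): 23,
}


def time_series(region):
    """One population followed over a series of time periods."""
    return {year: value
            for (r, year), value in sorted(observations.items())
            if r == region}


def cross_section(year):
    """A structured set of populations observed for a single time period."""
    return {region: value
            for (region, y), value in sorted(observations.items())
            if y == year}
```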

What do you know?

Now that you’ve completed our review of statistical definitions and concepts related to describing and defining statistical data, try this.

Which of the following statements about what you have just learned are TRUE?

Select all that apply and then select Submit.

Statistical unit is an (abstract) entity in the population for which information is sought and for which statistics are ultimately compiled such as persons or households.

Population is a complete set of a certain type of statistical unit.

Statistical measure is a characteristic of a statistical unit such as height, country of birth, or income.

Unit of measure is a quantity or increment by which something is counted or described.

Variable types are either nominal or ordinal.

That's right.

Statistical units are abstract, whereas populations are concrete sets of objects with at least one characteristic in common.

Statistical measure is a summarising (aggregation) function like count, sum, and average applied to objects in the population.

Unit of measure is a quantity or increment by which something is counted or described.

Statistical variable is a characteristic of a statistical unit which is measured or counted. Variable types are either categorical or numerical.

That's incorrect. The correct answers are options 1, 2 and 4.

Coming next…

The statistical concepts you covered in this opening unit provide a common vocabulary for statisticians to describe statistical data. When data are described in this manner using a common vocabulary, statisticians from different domains or from different countries can understand the meaning of the data.

These concepts will be used extensively throughout the remainder of this module.