Unit 3: Building a Conceptual Data Model

In this unit, you’ll apply the concepts learned so far to evaluate a statistical table and produce a Conceptual Data Model by identifying the core statistical concepts of:

  • Statistical Unit
  • Population
  • Statistical Measure
  • Variables
  • Reference Times

Making a start: Data in tables

Data in tables are the starting point for most structural modelling (and SDMX) projects. So, it’s with data in a table that we’ll practice structural modelling.

We’ll use Table 9.2, Fish Catch by Vessel Locality, 2015-2020, from the Statistical Yearbook of the Republic of Maldives.

Apply statistical concepts

In applying our statistical concepts to the above table, we start to define our structural model by identifying the statistics and statistical concepts. There are also some key considerations we need to take into account.

There are three statistics referenced in this table:

  1. Fish catch by vessel locality, 2015-2020
  2. Percentage of fish catch by vessel locality, 2015-2020
  3. % change over previous year of fish catch by vessel locality, 2015-2020

To develop a clear understanding of what the statistics and statistical unit refer to in our table, it’s imperative to read the metadata. The metadata clearly states that vessel locality does not reference where the fish catch occurs; it references where the vessel was registered.

Source: Ministry of Fisheries, Marine Resources and Agriculture
Note: This table does not in anyway or form represent the area of fish catch but the catches from the vessels registered to islands in the specific atoll
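
The second and third statistics listed above can be derived from the first. The sketch below shows one way to do that, assuming that the percentage is each locality's share of the year's total catch and that the percentage change compares a locality with its own previous-year figure. The locality name ‘Atoll A’ and all figures are illustrative placeholders, not values from Table 9.2.

```python
# Illustrative figures only: these are NOT values from Table 9.2.
# Keys are reference years; inner keys are vessel localities.
catch = {
    2019: {"Malé": 100.0, "Atoll A": 300.0},
    2020: {"Malé": 120.0, "Atoll A": 280.0},
}


def percentage_share(totals):
    """Each locality's share of the year's total catch, in percent."""
    grand_total = sum(totals.values())
    return {loc: 100.0 * value / grand_total for loc, value in totals.items()}


def percentage_change(current, previous):
    """Percent change of each locality's catch over the previous year."""
    return {
        loc: 100.0 * (current[loc] - previous[loc]) / previous[loc]
        for loc in current
    }


share_2020 = percentage_share(catch[2020])
change_2020 = percentage_change(catch[2020], catch[2019])
print(share_2020)
print(change_2020)
```

All three statistics describe the same fish catch; only the summarising function applied to it differs.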

Earlier, population was defined as a concrete set of objects. Here, the only information given is that the fish catch came from a vessel registered in an atoll, not where the fish catch occurred.

To be complete, the geographic area within which the fish catch is being measured must be specified. This introduces the need for a new concept, the geographic area. For this exercise, we can assume that the fish catch is being measured in the geographic area of the Republic of Maldives and this will be added to the model in the definition of the population and as a population-defining property.

When we review the data against these concepts to determine the statistical unit, population, and statistical measure, it is evident that:

  • there’s a reference to ‘vessel (registration) locality’ but
  • there’s no reference to ‘registered vessels’.

Since the data refers to fish catch by locality of vessel registration, the fish catch must come from registered vessels. The concept of ‘registered vessel’ needs to be added to the model for it to be complete.

The options for adding this concept are:

  1. Change the statistical unit from ‘fish catch’ to ‘fish catch by registered vessel’. This option is not logical, because it is too specific for what is supposed to be an abstract concept.
  2. Add ‘from registered vessel’ to the definition of the population and include the concept ‘from registered vessel’ as a population-defining property.

Of the two options, Option 2 is the preferred approach and the one adopted here.

Population
The complete set of a certain type of statistical unit. A concrete set of objects with at least one characteristic in common.
Statistical unit
(Abstract) entity in the population for which information is sought and for which statistics are ultimately compiled, that is, the counted object.
Statistical measure
The summarising (aggregation) function like count, sum, and average applied to objects in the population.
Statistical variable
The characteristic of a statistical unit which is measured or counted, such as height, country of birth, grades obtained at school, or income.
Classification variable
A type of statistical variable which divides the population into subdomains of interest.
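
To make the difference between a population-defining property and a classification variable concrete, here is a minimal sketch. It assumes hypothetical catch records with made-up field names and values: the population-defining properties decide whether a record belongs to the population at all, while the classification variable (vessel locality) only splits the population into subdomains before the statistical measure (a total) is applied.

```python
from collections import defaultdict

# Hypothetical catch records; field names and values are illustrative only.
records = [
    {"area": "Republic of Maldives", "registered_vessel": True,
     "vessel_locality": "Malé", "year": 2020, "catch": 5.0},
    {"area": "Republic of Maldives", "registered_vessel": True,
     "vessel_locality": "Atoll A", "year": 2020, "catch": 7.0},
    {"area": "Republic of Maldives", "registered_vessel": False,
     "vessel_locality": "Atoll A", "year": 2020, "catch": 2.0},
    {"area": "Elsewhere", "registered_vessel": True,
     "vessel_locality": "Malé", "year": 2020, "catch": 4.0},
    {"area": "Republic of Maldives", "registered_vessel": True,
     "vessel_locality": "Malé", "year": 2019, "catch": 3.0},
]


def in_population(record, year):
    """Population-defining properties: area, registered vessel, reference year."""
    return (
        record["area"] == "Republic of Maldives"
        and record["registered_vessel"]
        and record["year"] == year
    )


def total_by_locality(records, year):
    """Statistical measure (total) per value of the classification variable."""
    totals = defaultdict(float)
    for record in records:
        if in_population(record, year):
            totals[record["vessel_locality"]] += record["catch"]
    return dict(totals)


totals_2020 = total_by_locality(records, 2020)
print(totals_2020)
```

Only the first two records satisfy all of the population-defining properties for 2020, so only they contribute to the totals.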

Structural modelling of the data

By applying the concepts we covered earlier to evaluate our data table, we can produce a Conceptual Data Model for the three referenced statistics by identifying the core statistical concepts of statistical unit, population, statistical measure, variables, and reference times.

Total

Statistical concepts for statistic:
FISH CATCH BY VESSEL LOCALITY, 2015 - 2020

Statistical unit:

  • the statistical unit (counted object) is ‘fish catch’ (abstract)

Population:

  • fish catch, in Republic of Maldives, from registered vessel, in year X

Statistical Measure:

  • ‘Total (of objects in the population)’ which is ‘total fish catch’

Variables:

  • Population-defining properties:
    • where caught: Republic of Maldives
    • method of catch: registered vessel
  • Classification variables:
    • vessel locality (locality of vessel registration) (Malé or atoll)

Reference times:

  • year 2015 - 2020

Percentage share

Statistical concepts for statistic:
Percentage of FISH CATCH BY VESSEL LOCALITY, 2015 - 2020

Statistical unit:

  • the statistical unit (counted object) is ‘fish catch’ (abstract)

Population:

  • fish catch, in Republic of Maldives, from registered vessel, in year X

Statistical Measure:

  • ‘Percentage of total (of objects in the population)’ which is ‘percentage of total fish catch’

Variables:

  • Population-defining properties:
    • where caught: Republic of Maldives
    • method of catch: registered vessel
  • Classification variables:
    • vessel locality (locality of vessel registration) (Malé or atoll)

Reference times:

  • year 2015 - 2020

Percentage change

Statistical concepts for statistic:
% change over previous year of FISH CATCH BY VESSEL LOCALITY, 2015 - 2020

Statistical unit:

  • the statistical unit (counted object) is ‘fish catch’ (abstract)

Population:

  • fish catch, in Republic of Maldives, from registered vessel, in year X

Statistical Measure:

  • ‘Percentage change over previous year (of objects in the population)’ which is ‘percentage change over previous year of fish catch’

Variables:

  • Population-defining properties:
    • where caught: Republic of Maldives
    • method of catch: registered vessel
  • Classification variables:
    • vessel locality (locality of vessel registration) (Malé or atoll)

Reference times:

  • year 2015 - 2020
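
The three breakdowns above can also be written down as structured records, which makes it easy to check how they relate. The sketch below is one possible encoding; the class name `ConceptualModel` and its field names are hypothetical choices for illustration, not part of SDMX or any library.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ConceptualModel:
    """Hypothetical container for the core statistical concepts of one statistic."""
    statistic: str
    statistical_unit: str
    population: str
    statistical_measure: str
    population_defining_properties: tuple
    classification_variables: tuple
    reference_times: str


# Concepts that are identical for all three statistics in the table.
shared = dict(
    statistical_unit="fish catch",
    population="fish catch, in Republic of Maldives, from registered vessel, in year X",
    population_defining_properties=(
        ("where caught", "Republic of Maldives"),
        ("method of catch", "registered vessel"),
    ),
    classification_variables=("vessel locality",),
    reference_times="year 2015 - 2020",
)

models = [
    ConceptualModel(
        statistic="Fish catch by vessel locality",
        statistical_measure="total fish catch",
        **shared,
    ),
    ConceptualModel(
        statistic="Percentage of fish catch by vessel locality",
        statistical_measure="percentage of total fish catch",
        **shared,
    ),
    ConceptualModel(
        statistic="% change over previous year of fish catch by vessel locality",
        statistical_measure="percentage change over previous year of fish catch",
        **shared,
    ),
]


def differing_fields(a, b):
    """Names of the fields whose values differ between two models."""
    da, db = asdict(a), asdict(b)
    return sorted(name for name in da if da[name] != db[name])


print(differing_fields(models[0], models[1]))
print(differing_fields(models[0], models[2]))
```

Comparing any two of the records shows that only the statistic's name and its statistical measure change; the statistical unit, population, variables, and reference times are shared.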

Conceptual Data Model Table

In tabular format, the Conceptual Data Model is presented as follows:
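
One plausible plain-text layout can be generated with a short sketch like the one below. It assumes one row per statistic and one column per concept; the column selection and the helper name `render_table` are assumptions made for illustration and are not taken from the course table.

```python
# Column choice is an assumption; the cell values come from the breakdowns above.
columns = [
    "Statistic",
    "Statistical unit",
    "Statistical measure",
    "Classification variable",
    "Reference times",
]

rows = [
    {
        "Statistic": "Fish catch by vessel locality",
        "Statistical unit": "fish catch",
        "Statistical measure": "total fish catch",
        "Classification variable": "vessel locality",
        "Reference times": "year 2015 - 2020",
    },
    {
        "Statistic": "Percentage of fish catch by vessel locality",
        "Statistical unit": "fish catch",
        "Statistical measure": "percentage of total fish catch",
        "Classification variable": "vessel locality",
        "Reference times": "year 2015 - 2020",
    },
    {
        "Statistic": "% change over previous year of fish catch by vessel locality",
        "Statistical unit": "fish catch",
        "Statistical measure": "percentage change over previous year of fish catch",
        "Classification variable": "vessel locality",
        "Reference times": "year 2015 - 2020",
    },
]


def render_table(rows, columns):
    """Render rows as a fixed-width text table with a header and separator line."""
    widths = [
        max([len(col)] + [len(row[col]) for row in rows]) for col in columns
    ]

    def fmt(cells):
        padded = (cell.ljust(width) for cell, width in zip(cells, widths))
        return " | ".join(padded).rstrip()

    lines = [fmt(columns), "-+-".join("-" * width for width in widths)]
    lines += [fmt([row[col] for col in columns]) for row in rows]
    return "\n".join(lines)


table = render_table(rows, columns)
print(table)
```

The population and its population-defining properties are the same for every row, so they could be added as further columns or stated once alongside the table.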

Coming next…

You now know how to apply the statistical concepts we covered earlier to evaluate a statistical table and produce a Conceptual Data Model. Next, we’ll look at how we extend this Conceptual Data Model to produce a Logical Data Model.