Implementation notes
sdmx.model implements the SDMX version 2.1 Information Model (IM).
(What is an ‘information model’?)
This page gives brief explanations of how sdmx implements the standards, focusing on additional features, conveniences, or interpretations/naming choices that are not strictly determined by the standards.
Although this page is organized to correspond to the standards, it neither recapitulates them nor sets out to teach all their details. For those purposes, see Resources, or the Walkthrough, which includes some incidental explanations.
Abstract classes and data types
Many classes inherit from one of the following. For example, every Code is a NameableArtefact [1]; this means it has name and description attributes. Because every NameableArtefact is an IdentifiableArtefact, a Code also has id, URI, and URN attributes.
AnnotableArtefact
- has a list of annotations.
IdentifiableArtefact
- has an id, URI, and URN, and
- is “annotable”; this means it also has the annotations attribute of an AnnotableArtefact.
The id uniquely identifies the object against others of the same type in an SDMX message. The URI and URN are globally unique. See Wikipedia for a discussion of the differences between the two.
NameableArtefact
- has a name and description, and
- is identifiable, therefore also annotable.
VersionableArtefact
- has a version number,
- may be valid between certain times (valid_from, valid_to), and
- is nameable, identifiable, and annotable.
MaintainableArtefact
- is under the authority of a particular maintainer, and
- is versionable, nameable, identifiable, and annotable.
In an SDMX message, a maintainable object might not be given in full, but only as a reference (with is_external_reference set to True). If so, it might have a structure_url, where the maintainer provides more information about the object.
The API reference for sdmx.model shows the parent classes for each class, to describe whether they are versionable, nameable, identifiable, and/or maintainable.
Because SDMX is used worldwide, an InternationalString type is used in the IM. For instance, the name of a Nameable object is an InternationalString, with zero or more localizations in different locales.
[1] Indirectly, through Item.
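The inheritance chain described in this section can be pictured with a minimal sketch. The dataclasses below are illustrative stand-ins written for this page, not the definitions in sdmx.model; the field types and default values are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class InternationalString:
    # Maps a locale such as "en" or "fr" to the text in that locale.
    localizations: Dict[str, str] = field(default_factory=dict)


@dataclass
class AnnotableArtefact:
    annotations: List[str] = field(default_factory=list)


@dataclass
class IdentifiableArtefact(AnnotableArtefact):
    id: str = ""
    uri: Optional[str] = None
    urn: Optional[str] = None


@dataclass
class NameableArtefact(IdentifiableArtefact):
    name: InternationalString = field(default_factory=InternationalString)
    description: InternationalString = field(default_factory=InternationalString)


@dataclass
class VersionableArtefact(NameableArtefact):
    version: Optional[str] = None
    valid_from: Optional[str] = None
    valid_to: Optional[str] = None


@dataclass
class MaintainableArtefact(VersionableArtefact):
    maintainer: Optional[str] = None
    is_external_reference: bool = False
    structure_url: Optional[str] = None


@dataclass
class Item(NameableArtefact):
    pass


@dataclass
class Code(Item):
    pass


# A Code is nameable (through Item), hence identifiable and annotable.
code = Code(id="CA", name=InternationalString({"en": "Canada", "fr": "Canada"}))
```

Because Code only reaches NameableArtefact through Item, `isinstance(code, IdentifiableArtefact)` holds without Code declaring any of these attributes itself.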
Items and schemes
ItemScheme, Item
These abstract classes allow for the creation of flat or hierarchical taxonomies.
ItemSchemes are maintainable (see above); their items attribute is a collection of Items. See the class documentation for details.
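As a rough illustration of a hierarchical taxonomy, the sketch below links items through parent and child references and collects them in a scheme. The classes, the `append_child` and `add` helpers, and the ids are hypothetical and written only for this example; see the class documentation for the actual sdmx.model interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)  # identity comparison; avoids recursing through parent/child
class Item:
    id: str
    parent: Optional["Item"] = None
    child: List["Item"] = field(default_factory=list)

    def append_child(self, other: "Item") -> None:
        """Record `other` as a child of this item, and this item as its parent."""
        other.parent = self
        self.child.append(other)


@dataclass(eq=False)
class ItemScheme:
    id: str
    items: Dict[str, Item] = field(default_factory=dict)

    def add(self, item: Item) -> None:
        """Store the item under its id."""
        self.items[item.id] = item


scheme = ItemScheme(id="GEO")
world = Item("WORLD")
canada = Item("CA")
world.append_child(canada)
for item in (world, canada):
    scheme.add(item)
```

A flat taxonomy is the special case in which no item has a parent.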
Data
Observation
A single data point/datum. The value is stored as the value attribute.
DataSet
A collection of Observations, SeriesKeys, and/or GroupKeys.
Note
There are no ‘Series’ or ‘Group’ classes in the IM!
Instead, the idea of ‘data series’ within a DataSet is modeled as:
- SeriesKeys and GroupKeys are associated with a DataSet.
- Observations are each associated with one SeriesKey and, optionally, referred to by one or more GroupKeys.
One can choose to think of a SeriesKey and the associated Observations, collectively, as a ‘data series’. But, in order to avoid confusion with the IM, sdmx does not provide ‘Series’ or ‘Group’ objects.
sdmx provides:
- the DataSet.series and DataSet.group mappings from SeriesKey or GroupKey (respectively) to lists of Observations.
- DataSet.obs, which is a list of all observations in the DataSet.
Depending on its structure, a DataSet may be flat, cross-sectional or time series.
Key
Values (Key.values) for one or more Dimensions. The meaning varies:
- Ordinary Keys, e.g. Observation.dimension: the dimension(s) varying at the level of a specific observation.
- SeriesKey: the dimension(s) shared by all Observations in a conceptual series.
- GroupKey: the dimension(s) comprising the group. These may be a subset of all the dimensions in the DataSet, in which case all matching Observations are considered part of the ‘group’, even if they are associated with different SeriesKeys. GroupKeys are often used to attach AttributeValues; see below.
AttributeValue
Value (AttributeValue.value) for a DataAttribute (AttributeValue.value_for).
May be attached to any of: DataSet, SeriesKey, GroupKey, or Observation. In the first three cases, the attachment means that the attribute applies to all Observations associated with the object.
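The relationship between observations, series keys, and group keys can be sketched without any ‘Series’ object. The code below is a simplified stand-in, not the sdmx.model classes: keys are plain tuples of (dimension id, value) pairs, the `add` helper is hypothetical, and the dimension ids and values are made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# A key is sketched as a tuple of (dimension id, value) pairs so it is hashable.
Key = Tuple[Tuple[str, str], ...]


@dataclass
class Observation:
    dimension: Key  # dimension(s) varying at the observation level
    value: float
    series_key: Optional[Key] = None


@dataclass
class DataSet:
    obs: List[Observation] = field(default_factory=list)
    series: Dict[Key, List[Observation]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def add(self, observation: Observation) -> None:
        """Append to the flat list and, if keyed, to the matching series list."""
        self.obs.append(observation)
        if observation.series_key is not None:
            self.series[observation.series_key].append(observation)


ca = (("FREQ", "A"), ("REF_AREA", "CA"))
mx = (("FREQ", "A"), ("REF_AREA", "MX"))

ds = DataSet()
ds.add(Observation((("TIME_PERIOD", "2020"),), 1.5, series_key=ca))
ds.add(Observation((("TIME_PERIOD", "2021"),), 1.7, series_key=ca))
ds.add(Observation((("TIME_PERIOD", "2020"),), 2.1, series_key=mx))

# A group key on a subset of dimensions matches observations across series.
group = (("FREQ", "A"),)
in_group = [o for o in ds.obs if set(group) <= set(o.series_key)]
```

Here the ‘data series’ for Canada is simply `ds.series[ca]`, a list of the Observations that share that SeriesKey, while the group on FREQ spans both series.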
Data structures
Concept, ConceptScheme
An abstract idea or general notion, such as ‘age’ or ‘country’.
Concepts are one kind of Item, and are collected in an ItemScheme subclass called ConceptScheme.
Dimension, DataAttribute
These are Components of a data structure, linking a Concept (concept_identity) to its Representation (local_representation); see below.
A component can be either a DataAttribute that appears as an AttributeValue in data sets, or a Dimension that appears in Keys.
Representation, Facet
For example, the concept ‘country’ can be represented:
- as a value of a certain type (e.g. ‘Canada’, a str), called a Facet;
- using a Code from a specific Codelist (e.g. ‘CA’); multiple lists of codes are possible (e.g. ‘CAN’). See below.
DataStructureDefinition (DSD)
Collects structures used in data sets and data flows. These are stored as dimensions, attributes, group_dimensions, and measures.
For example, dimensions is a DimensionDescriptor object that collects a number of Dimensions in a particular order. Data that is “structured by” this DSD must have all the described dimensions.
See the API documentation for details.
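The requirement that structured data carry every described dimension, in the descriptor's order, can be illustrated with plain Python. This sketch represents the ordered dimensions as a list of ids; the function names and the dimension ids are hypothetical and not part of sdmx.

```python
from typing import Dict, List, Sequence


def missing_dimensions(dimension_order: Sequence[str], key: Dict[str, str]) -> List[str]:
    """Return ids from dimension_order that the key does not provide, in order."""
    return [dim for dim in dimension_order if dim not in key]


def order_key(dimension_order: Sequence[str], key: Dict[str, str]) -> List[str]:
    """Return the key's values arranged in the descriptor's dimension order."""
    missing = missing_dimensions(dimension_order, key)
    if missing:
        raise ValueError(f"key lacks dimensions: {missing}")
    return [key[dim] for dim in dimension_order]


# Stand-in for the ordered dimensions collected by a DimensionDescriptor.
dimensions = ["FREQ", "CURRENCY", "REF_AREA"]
full_key = {"REF_AREA": "CA", "FREQ": "A", "CURRENCY": "CAD"}
partial_key = {"FREQ": "A"}
```

With these inputs, `order_key(dimensions, full_key)` arranges the values as FREQ, CURRENCY, REF_AREA regardless of the order in which the key was written, while `partial_key` is rejected because two dimensions are absent.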
Metadata
Code, Codelist
…
Category, CategoryScheme, Categorisation
Categories serve to classify or categorise things like dataflows, e.g. by subject matter.
A Categorisation links the thing to be categorised, e.g., a DataFlowDefinition, to a particular Category.
Constraints
Constraint, ContentConstraint
Classes that specify a subset of data or metadata to, for example, limit the contents of a data flow.
A ContentConstraint may have:
- Zero or more CubeRegion stored at data_content_region.
- Zero or one DataKeySet stored at Constraint.data_content_keys.
Currently, ContentConstraint.to_query_string(), used by Request.get() to validate keys based on a data flow definition, only uses data_content_region, if any. data_content_keys are ignored. None of the data sources supported by sdmx appears to use this latter form.
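The idea of validating a key against a cube region can be sketched as follows. This is not the implementation of ContentConstraint.to_query_string(); it assumes a region can be reduced to a mapping from dimension id to permitted values, and it assumes the SDMX REST key convention of joining alternative values with "+" and dimensions with ".", leaving unconstrained dimensions empty. The function names and sample values are hypothetical.

```python
from typing import Dict, List, Set

# Simplified stand-in for a cube region: dimension id -> permitted values.
CubeRegion = Dict[str, Set[str]]


def key_allowed(region: CubeRegion, key: Dict[str, str]) -> bool:
    """True if every constrained dimension in the key has a permitted value."""
    return all(
        value in region[dim] for dim, value in key.items() if dim in region
    )


def region_to_key_string(region: CubeRegion, dimension_order: List[str]) -> str:
    """Join permitted values with '+' per dimension, and dimensions with '.'."""
    return ".".join(
        "+".join(sorted(region.get(dim, ()))) for dim in dimension_order
    )


region = {"FREQ": {"A", "M"}, "REF_AREA": {"CA"}}
order = ["FREQ", "CURRENCY", "REF_AREA"]
```

A key such as `{"FREQ": "Q"}` fails `key_allowed` because "Q" is outside the region, whereas dimensions the region does not mention (CURRENCY here) are left unconstrained.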
Formats
The IM provides terms and concepts for data and metadata, but does not specify how that (meta)data is stored or represented. The SDMX standards include multiple ways to store data, in the following formats:
- SDMX-ML
Based on eXtensible Markup Language (XML). SDMX-ML provides a complete specification: it can represent every class and property in the IM.
Reference: https://sdmx.org/?page_id=5008
An SDMX-ML document contains exactly one Message. See sdmx.message for the different types of Messages and their component parts.
See reader.sdmxml.
- SDMX-JSON
Based on JavaScript Object Notation (JSON). The SDMX-JSON format is only defined for data, not metadata.
Reference: https://github.com/sdmx-twg/sdmx-json
See reader.sdmxjson.
New in version 0.5: Support for SDMX-JSON.
- SDMX-CSV
Based on Comma-Separated Value (CSV). Like SDMX-JSON, the SDMX-CSV format is only defined for data, not metadata.
Reference: https://github.com/sdmx-twg/sdmx-csv
sdmx does not currently support SDMX-CSV.
sdmx:
- reads all kinds of SDMX-ML and SDMX-JSON messages.
- contains, in the tests/data/ source directory, specimens of messages in both data formats. These are used by the test suite to check that the code functions as intended, but can also be viewed to understand the data formats.
Web services
The SDMX standards describe both RESTful and SOAP web service APIs. See Resources for the SDMX Technical Working Group’s specification of the REST API. The Eurostat and ECB help materials provide descriptions and examples of using HTTP URLs, parameters, and headers to construct queries.
sdmx supports:
- REST web services, i.e. not SOAP services;
- Data retrieved in SDMX version 2.1 formats. Some existing services offer a parameter to select SDMX 2.1 or 2.0 format; sdmx does not support the latter. Other services only provide SDMX 2.0-formatted data; these cannot be used with sdmx.
Request constructs valid URLs and automatically adds some parameter and header values. These can be overridden; see Request.get().
In some cases, Request will make an additional query to fetch metadata and validate a query.
sdmx.Source and its subclasses handle idiosyncrasies of the web services operated by different agencies, such as:
- parameters or headers that are not supported, or must take very specific, non-standard values, or
- unusual ways of returning data.
For data sources that support it, sdmx automatically adds the HTTP header Accept: application/vnd.sdmx.structurespecificdata+xml; when the dsd argument is provided to Request.get().
See Data sources and the source code for the details for each data source.
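To make the URL and header construction concrete, here is a minimal sketch of assembling a REST data query by hand. It is not how Request is implemented: the `build_data_request` function, its parameters, and the example.org base URL are hypothetical, and the `/data/{flow}/{key}` path layout and the startPeriod parameter are assumed from the SDMX REST specification. The Accept value is the media type named above, with no further parameters assumed.

```python
from typing import Dict, Optional, Tuple
from urllib.parse import urlencode


def build_data_request(
    base_url: str,
    flow_id: str,
    key: str = "",
    params: Optional[Dict[str, str]] = None,
    structure_specific: bool = False,
) -> Tuple[str, Dict[str, str]]:
    """Return (url, headers) for a data query against a REST endpoint."""
    url = f"{base_url.rstrip('/')}/data/{flow_id}"
    if key:
        url += f"/{key}"
    if params:
        url += "?" + urlencode(params)
    headers: Dict[str, str] = {}
    if structure_specific:
        # Ask for structure-specific data, as when a DSD is already at hand.
        headers["Accept"] = "application/vnd.sdmx.structurespecificdata+xml"
    return url, headers


url, headers = build_data_request(
    "https://example.org/sdmx/rest/",  # placeholder endpoint, not a real service
    "EXR",
    key="A+M..CA",
    params={"startPeriod": "2020"},
    structure_specific=True,
)
```

A source-specific subclass could then adjust the returned URL or headers for a service that rejects or requires particular values.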