Convert from/to SDMX

sdmx.convert and sdmx.convert.common provide generic and extensible features for converting:

  • from sdmx.message and sdmx.model objects to arbitrary Python data structures, or

  • from arbitrary Python data structures to sdmx objects.

The included sdmx.convert.pandas, PandasConverter, and to_pandas() build on these features to provide conversion to types from the pandas package. This conversion is not described by the SDMX standards; in other words, it is particular to the sdmx package.

In contrast, submodules of sdmx.writer, to_csv(), and to_xml() provide conversion to standard SDMX formats.

User code can also subclass Converter or DispatchConverter to convert to/from other Python types or non-standard formats.

class sdmx.convert.common.Converter[source]

Base class for conversion to or from sdmx objects.

convert(data: Any, **kwargs) → Any[source]

Convert data.

classmethod handles(data: Any, kwargs: dict) → bool[source]

Return True if the class can convert data using kwargs.
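The two methods above suggest a simple protocol: ask a class whether it handles the data, then call convert(). The following is a minimal sketch of that protocol, not the package's implementation. The Converter class here is a local stand-in so the example runs without sdmx installed; UpperCaseConverter, convert_with(), and the "format" keyword are hypothetical names used only for illustration.

```python
from typing import Any


class Converter:
    """Local stand-in for sdmx.convert.common.Converter (assumption)."""

    @classmethod
    def handles(cls, data: Any, kwargs: dict) -> bool:
        return False

    def convert(self, data: Any, **kwargs) -> Any:
        raise NotImplementedError


class UpperCaseConverter(Converter):
    """Hypothetical converter that upper-cases strings."""

    @classmethod
    def handles(cls, data: Any, kwargs: dict) -> bool:
        # Claim only str input, and only when the caller asks for this format
        return isinstance(data, str) and kwargs.get("format") == "upper"

    def convert(self, data: Any, **kwargs) -> Any:
        return data.upper()


def convert_with(converters: list, data: Any, **kwargs) -> Any:
    """Hypothetical helper: use the first class whose handles() returns True."""
    for cls in converters:
        if cls.handles(data, kwargs):
            return cls().convert(data, **kwargs)
    raise NotImplementedError(f"no converter for {type(data).__name__}")


result = convert_with([UpperCaseConverter], "obs_value", format="upper")
```

Note that handles() receives the keyword arguments as a single dict, while convert() receives them unpacked, as in the signatures documented above.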

class sdmx.convert.common.DispatchConverter[source]

Base class for recursive converters.

Usage:

>>> import sdmx.model
>>> from sdmx.convert.common import DispatchConverter
>>> class CustomConverter(DispatchConverter):
...     pass
>>> @CustomConverter.register
... def _(c: "CustomConverter", obj: sdmx.model.ItemScheme):
...     result = ...  # Code to convert an ItemScheme goes here
...     return result
>>> @CustomConverter.register
... def _(c: "CustomConverter", obj: sdmx.model.Codelist):
...     result = ...  # Code to convert a Codelist goes here
...     return result

convert(obj, **kwargs)[source]

Convert data.

classmethod register(func: Callable)[source]

Register func as a conversion function.

func must have an argument named obj that is annotated with a particular type.
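Registration by annotated type resembles the standard library's functools.singledispatch. The sketch below uses singledispatch as an analogy only; whether DispatchConverter resolves types in exactly the same way is an assumption. ItemScheme, Codelist, and ConceptScheme are local stand-in classes, not the sdmx.model classes.

```python
from functools import singledispatch


class ItemScheme:
    """Local stand-in for an SDMX item scheme (assumption)."""

    def __init__(self, id: str):
        self.id = id


class Codelist(ItemScheme):
    """Stand-in subclass with its own registered function."""


class ConceptScheme(ItemScheme):
    """Stand-in subclass with no function of its own."""


@singledispatch
def convert(obj):
    # Fallback for types without a registered function
    raise NotImplementedError(type(obj).__name__)


@convert.register
def _(obj: ItemScheme):
    return f"item scheme {obj.id}"


@convert.register
def _(obj: Codelist):
    return f"code list {obj.id}"


specific = convert(Codelist("CL_FREQ"))  # most specific registration is used
inherited = convert(ConceptScheme("CS_MAIN"))  # falls back to the ItemScheme function
```

Under this model, a function registered for a parent type also serves its subclasses, and a function registered for a subclass takes precedence for that subclass.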

convert.pandas: Convert to pandas objects

PandasConverter(format_options, attributes, ...)

Convert SDMX messages and IM objects to pandas.DataFrame and similar.

convert_dataset(c, obj)

Convert DataSet.

convert_datamessage(c, obj)

Convert DataMessage.

convert_itemscheme(c, obj)

Convert ItemScheme.

convert_structuremessage(c, obj)

Convert StructureMessage.

Other objects are converted as follows:

Component

The id attribute of the concept_identity is returned.

DataMessage

The DataSet or data sets within the Message are converted to pandas objects. Returns:

dict

The values of the mapping are converted individually. If the resulting values are str or Series with indexes that share the same name, then they are converted to a Series, possibly with a pandas.MultiIndex. Otherwise, a DictLike is returned.

DimensionDescriptor

The components of the DimensionDescriptor are converted.

list

For the following obj, returns Series instead of a list:

  • a list of Observation: the Observations are converted using convert_dataset().

  • a list with only 1 DataSet (e.g. the data attribute of DataMessage): the Series for the single element is returned.

  • a list of SeriesKey: the key values (but no data) are returned.

NameableArtefact

The name attribute of obj is returned.

Todo

Support selection of language for conversion of InternationalString.

Code reference

Convert sdmx.message and sdmx.model objects to pandas objects.

sdmx.convert.pandas.ALL_CONTENTS = {'category_scheme', 'codelist', 'concept_scheme', 'constraint', 'dataflow', 'organisation_scheme', 'structure'}[source]

TODO Retrieve this info from the StructureMessage class.

class sdmx.convert.pandas.Column[source]

Representation of conversion of a column.

Todo

Unify with reader.csv.Handler.

id: str[source]

SDMX component ID.

name: str[source]

Column name/header.

class sdmx.convert.pandas.ColumnSpec(pc: PandasConverter | None = None, ds: BaseDataSet | None = None)[source]

Information about columns for conversion.

add_obs_attrib(values: Iterable[str]) → None[source]

Extend obs_attrib using values.

property assign: Mapping[str, str][source]

Return values for pandas.DataFrame.assign().

convert_obs(obs: BaseObservation) → list[source]

Convert a single Observation to a data row.

The items of the result correspond to the column names in obs.

end: list[Fixed][source]

Final columns.

property full: list[str][source]

Full list of column names.

key: list[Column][source]

Columns related to observation keys.

measure: list[Column][source]

Column(s) related to observation measure(s).

property obs: list[str][source]

List of column names for observation data.

obs_attrib: list[Column][source]

Columns related to observation-attached attributes.

start: list[Fixed][source]

Initial columns.
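One way to picture how these attributes fit together is sketched below. This is not the package's code: ColumnSpecSketch, its column ordering (start, key, measure, obs_attrib, end), and the use of a plain dict in place of an Observation are all assumptions made for illustration, and the Column and Fixed classes are simplified local stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    """Simplified stand-in: SDMX component ID and column header."""

    id: str
    name: str


@dataclass
class Fixed(Column):
    """Simplified stand-in: column with the same value in every row."""

    value: str = ""


@dataclass
class ColumnSpecSketch:
    """Hypothetical layout; the ordering of the groups is an assumption."""

    start: list = field(default_factory=list)
    key: list = field(default_factory=list)
    measure: list = field(default_factory=list)
    obs_attrib: list = field(default_factory=list)
    end: list = field(default_factory=list)

    @property
    def obs(self) -> list:
        # Column names that are filled from each observation
        return [c.name for c in self.key + self.measure + self.obs_attrib]

    @property
    def full(self) -> list:
        # Fixed columns surround the per-observation columns
        return [c.name for c in self.start] + self.obs + [c.name for c in self.end]

    def convert_obs(self, obs: dict) -> list:
        # One row item per name in self.obs, in the same order
        return [obs.get(c.id) for c in self.key + self.measure + self.obs_attrib]


spec = ColumnSpecSketch(
    start=[Fixed(id="", name="DATAFLOW", value="EXAMPLE:DF(1.0)")],
    key=[Column("FREQ", "FREQ"), Column("TIME_PERIOD", "TIME_PERIOD")],
    measure=[Column("OBS_VALUE", "OBS_VALUE")],
)
row = spec.convert_obs({"FREQ": "A", "TIME_PERIOD": "2020", "OBS_VALUE": "1.5"})
```

In this sketch, the row produced by convert_obs() lines up item by item with the names in obs, which matches the description of convert_obs() above.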

class sdmx.convert.pandas.ComponentBoth(component: Component)[source]

Labels.both column.

id: str[source]

SDMX component ID.

name: str[source]

Column name/header.

class sdmx.convert.pandas.ComponentColumn(component: Component)[source]

A column taking its header from a Component.

id: str[source]

SDMX component ID.

name: str[source]

Column name/header.

class sdmx.convert.pandas.ComponentID(component: Component)[source]

Labels.id column.

id: str[source]

SDMX component ID.

name: str[source]

Column name/header.

class sdmx.convert.pandas.ComponentName(component: Component)[source]

Labels.name column.

id: str[source]

SDMX component ID.

name: str[source]

Column name/header.

class sdmx.convert.pandas.Fixed(name: str, value: str)[source]

Column with fixed value.

id: str[source]

SDMX component ID.

name: str[source]

Column name/header.

class sdmx.convert.pandas.PandasConverter(format_options: CSVFormatOptions = <factory>, attributes: Attributes = <Attributes.none: 0>, constraint: ContentConstraint | None = None, dtype: type[np.generic] | type[ExtensionDtype] | str | None = <class 'numpy.float64'>, datetime_axis: int | str = -1, datetime_dimension: common.DimensionComponent | None = None, datetime_freq: PeriodFrequency | None = None, include: set[str] = <factory>, locale: str = 'en', datetime: InitVar = None, rtype: dataclasses.InitVar[str] = '', _columns: ColumnSpec = <factory>, _strict: bool = False, _unstack: list[str] = <factory>, _context: dict[str | type, ~typing.Any]=<factory>)[source]

Convert SDMX messages and IM objects to pandas.DataFrame and similar.

PandasConverter implements a dispatch pattern according to the type of the object to be converted. The attributes/arguments to the class control the conversion behaviour and return types.

attributes: Attributes = 0[source]

Attributes to include.

constraint: ContentConstraint | None = None[source]

If given, only Observations included by the constraint are returned.

datetime: InitVar = None[source]

True to convert datetime.

Deprecated since version 2.23.0: Use datetime_axis, datetime_dimension, or datetime_freq.

datetime_axis: int | str = -1[source]

Axis on which to place a time dimension. One of:

  • -1: disabled.

  • 0, "index": first/index axis.

  • 1, "columns": second/columns axis.

datetime_dimension: common.DimensionComponent | None = None[source]

Dimension to convert to pandas.DatetimeIndex. A str value is interpreted as a dimension ID.

datetime_freq: PeriodFrequency | None = None[source]

Frequency for conversion to pandas.PeriodIndex. A str value is interpreted as one of the Period aliases.

dtype[source]

Datatype for observation values. If None, data values remain object/str.

alias of float64

format_options: CSVFormatOptions[source]

SDMX-CSV format options.

get_components(kind) → list[Component][source]

Return an appropriate list of dimensions or attributes.

handle_compat() → None[source]

Analyse and alter settings for the deprecated rtype="compat" argument.

handle_datetime(value: Any) → None[source]

Handle alternate forms of datetime.

If given, return a DataFrame with a DatetimeIndex or PeriodIndex as the index and all other dimensions as columns. Valid datetime values include:

  • bool: if True, determine the time dimension automatically by detecting a TimeDimension.

  • str: ID of the time dimension.

  • Dimension: the matching Dimension is the time dimension.

  • dict: advanced behaviour. Keys may include:

    • dim (Dimension or str): the time dimension or its ID.

    • axis ({0 or ‘index’, 1 or ‘columns’}): axis on which to place the time dimension (default: 0).

    • freq (True or str or Dimension): produce pandas.PeriodIndex. If str, the ID of a Dimension containing a frequency specification. If a Dimension, the specified dimension is used for the frequency specification.

      Any Dimension used for the frequency specification does not appear in the returned DataFrame.
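The alternate forms listed above can be reduced to three settings comparable to datetime_axis, datetime_dimension, and datetime_freq. The function below is a rough sketch of such a normalisation, not the package's handle_datetime(); the name normalize_datetime, the returned dict, and the handling of unknown keys are assumptions.

```python
from typing import Any


def normalize_datetime(value: Any) -> dict:
    """Hypothetical helper: map a `datetime` value to axis/dimension/freq."""
    settings = {"axis": -1, "dimension": None, "freq": None}  # -1: disabled

    if value is None or value is False:
        return settings

    if value is True:
        # Time dimension is left unset, to be detected automatically
        settings["axis"] = 0
    elif isinstance(value, str):
        # A str is the ID of the time dimension
        settings.update(axis=0, dimension=value)
    elif isinstance(value, dict):
        unknown = set(value) - {"dim", "axis", "freq"}
        if unknown:
            raise ValueError(f"unexpected keys: {sorted(unknown)}")
        axis = value.get("axis", 0)  # documented default: 0
        settings.update(
            axis={"index": 0, "columns": 1}.get(axis, axis),
            dimension=value.get("dim"),
            freq=value.get("freq"),
        )
    else:
        # Anything else (e.g. a Dimension object) is used as the time dimension
        settings.update(axis=0, dimension=value)

    return settings


auto = normalize_datetime(True)
by_id = normalize_datetime("TIME_PERIOD")
advanced = normalize_datetime({"dim": "TIME_PERIOD", "axis": "columns", "freq": "FREQ"})
```

Because the datetime argument is deprecated, new code would set the three explicit arguments directly rather than rely on a normalisation step like this one.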

include: set[str][source]

One or more of the attributes of the StructureMessage (‘category_scheme’, ‘codelist’, etc.) to transform.

Type:

iterable of str or str, optional

rtype: dataclasses.InitVar[str] = ''[source]

Return type for convert_dataset() and similar methods.

Deprecated since version 2.23.0: User code should instead explicitly modify the returned DataFrame or Series.

sdmx.convert.pandas.convert_datamessage(c: PandasConverter, obj: DataMessage)[source]

Convert DataMessage.

Parameters:
  • rtype ('compat' or 'rows', optional) – Data type to return; default DEFAULT_RTYPE. See the HOWTO.

  • kwargs – Passed to convert_dataset() for each data set.

Returns:

sdmx.convert.pandas.convert_dataset(c: PandasConverter, obj: BaseDataSet)[source]

Convert DataSet.

See the walkthrough for examples of using the datetime argument.

Returns:

sdmx.convert.pandas.convert_itemscheme(c: PandasConverter, obj: ItemScheme)[source]

Convert ItemScheme.

Parameters:

locale (str, optional) – Locale for names to return.

Return type:

pandas.Series or pandas.DataFrame

sdmx.convert.pandas.convert_structuremessage(c: PandasConverter, obj: StructureMessage)[source]

Convert StructureMessage.

Returns:

Keys are StructureMessage attributes; values are pandas objects.

Return type:

DictLike