Write/convert sdmx objects

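The term write refers to both writing SDMX objects to standard SDMX formats (SDMX-CSV, SDMX-ML) and converting them to other Python objects, such as pandas objects.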

writer.csv: Write to SDMX-CSV

New in version 2.9.0.

See to_csv().

SDMX-CSV 1.0 writer.

See SDMX-CSV.

sdmx.writer.csv.dataset(obj: DataSet, *, labels: Literal['id', 'both'] = 'id', time_format: Literal['original', 'normalized'] = 'original', **kwargs) → DataFrame

Convert DataSet.

The two optional parameters are exactly as described in the specification.

Because SDMX-CSV includes a DATAFLOW column with an identifier (partial URN) for the dataflow to which the data conform, it is mandatory that the described_by attribute of obj gives an association to a DataflowDefinition object, from which a urn can be constructed.
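
The relationship between a full URN and the partial URN used in the DATAFLOW column can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name partial_urn and the identifier EXAMPLE:DF_DEMO(1.0) are hypothetical, and the sketch assumes the usual SDMX URN layout in which the class name is followed by "=" and then agency, ID, and version.

```python
def partial_urn(urn: str) -> str:
    """Return the part of an SDMX URN after the class name.

    Hypothetical helper for illustration only. A full SDMX URN is assumed
    to look like:
    urn:sdmx:org.sdmx.infomodel.<package>.<Class>=<agency>:<id>(<version>)
    """
    _, sep, rest = urn.partition("=")
    if not sep or not rest:
        raise ValueError(f"not a complete SDMX URN: {urn!r}")
    return rest


# Made-up dataflow URN; the agency and ID are placeholders
full = "urn:sdmx:org.sdmx.infomodel.datastructure.Dataflow=EXAMPLE:DF_DEMO(1.0)"
dataflow_column = partial_urn(full)
print(dataflow_column)
```

Without an associated DataflowDefinition there is no agency, ID, or version from which such an identifier could be built, which is why described_by is required.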

Parameters:
  • labels ("id" or "both", optional) –

    “id”

    Display only Dimension.id / DataAttribute.id in column headers and Code.id in data rows.

    “both”

    Display both the ID and the localized NameableArtefact.name. Not yet implemented.

  • time_format ("original" or "normalized", optional) –

    “original”

    Values for any dimension or attribute with ID TIME_PERIOD are displayed as recorded.

    “normalized”

    TIME_PERIOD values are converted to the most granular ISO 8601 representation taking into account the highest frequency of the data in the message and the moment in time when the lower-frequency values were collected. Not yet implemented.

    This parameter is called timeFormat in the spec and in HTTP Accept headers.

  • kwargs – Keyword arguments passed to to_pandas(). In particular, attributes is useful to control which attribute values are included in the returned CSV.

Return type:

pandas.DataFrame

Raises:

writer.pandas: Convert to pandas objects

Changed in version 1.0: sdmx.to_pandas() handles all types of objects, replacing the earlier, separate data2pandas and structure2pd writers.

to_pandas() implements a dispatch pattern according to the type of obj. Some of the internal methods take specific arguments and return varying values. These arguments can be passed to to_pandas() when obj is of the appropriate type:

sdmx.writer.pandas.write_dataset(obj[, ...])

Convert DataSet.

sdmx.writer.pandas.write_datamessage(obj, *args)

Convert DataMessage.

sdmx.writer.pandas.write_itemscheme(obj[, ...])

Convert ItemScheme.

sdmx.writer.pandas.write_structuremessage(obj)

Convert StructureMessage.

sdmx.writer.pandas.DEFAULT_RTYPE

Default return type for write_dataset() and similar methods.

Other objects are converted as follows:

Component

The id attribute of the concept_identity is returned.

DataMessage

The DataSet or data sets within the Message are converted to pandas objects. Returns:

dict

The values of the mapping are converted individually. If the resulting values are str or Series with indexes that share the same name, then they are converted to a Series, possibly with a pandas.MultiIndex. Otherwise, a DictLike is returned.

DimensionDescriptor

The components of the DimensionDescriptor are written.

list

For the following obj, returns Series instead of a list:

  • a list of Observation: the Observations are written using write_dataset().

  • a list with only 1 DataSet (e.g. the data attribute of DataMessage): the Series for the single element is returned.

  • a list of SeriesKey: the key values (but no data) are returned.

NameableArtefact

The name attribute of obj is returned.
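
For the dict case above, the difference between a flat Series and one with a pandas.MultiIndex can be illustrated with plain pandas. This sketch only shows the resulting shapes and does not reproduce how the library combines values; the keys CL_FREQ and CL_UNIT and the index name "code" are made up for the example.

```python
import pandas as pd

# A mapping whose converted values are all str becomes a flat Series
names = {"A": "Annual", "M": "Monthly"}
as_series = pd.Series(names)

# A mapping whose converted values are Series with the same index name
# can be combined into one Series with a two-level MultiIndex
parts = {
    "CL_FREQ": pd.Series({"A": "Annual", "M": "Monthly"}).rename_axis("code"),
    "CL_UNIT": pd.Series({"KG": "Kilogram"}).rename_axis("code"),
}
combined = pd.concat(parts)

print(as_series)
print(combined)
```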

sdmx.writer.pandas.DEFAULT_RTYPE = 'rows'

Default return type for write_dataset() and similar methods. Either ‘compat’ or ‘rows’. See the HOWTO.

sdmx.writer.pandas.write_datamessage(obj: DataMessage, *args, rtype=None, **kwargs)

Convert DataMessage.

Parameters:
Returns:

sdmx.writer.pandas.write_dataset(obj: DataSet, attributes='', dtype=numpy.float64, constraint=None, datetime=False, **kwargs)

Convert DataSet.

See the walkthrough for examples of using the datetime argument.

Parameters:
  • obj (DataSet or iterable of Observation) –

  • attributes (str) –

    Types of attributes to return with the data. A string containing zero or more of:

    • 'o': attributes attached to each Observation.

    • 's': attributes attached to any (0 or 1) SeriesKey associated with each Observation.

    • 'g': attributes attached to any (0 or more) GroupKey associated with each Observation.

    • 'd': attributes attached to the DataSet containing the Observations.

  • dtype (str or numpy.dtype or None) – Datatype for values. If None, do not return the values of a series. In this case, attributes must not be an empty string so that some attribute is returned.

  • constraint (ContentConstraint, optional) – If given, only Observations included by the constraint are returned.

  • datetime (bool or str or Dimension or dict, optional) –

    If given, return a DataFrame with a DatetimeIndex or PeriodIndex as the index and all other dimensions as columns. Valid datetime values include:

    • bool: if True, determine the time dimension automatically by detecting a TimeDimension.

    • str: ID of the time dimension.

    • Dimension: the matching Dimension is the time dimension.

    • dict: advanced behaviour. Keys may include:

      • dim (Dimension or str): the time dimension or its ID.

      • axis ({0 or ‘index’, 1 or ‘columns’}): axis on which to place the time dimension (default: 0).

      • freq (True or str or Dimension): produce pandas.PeriodIndex. If str, the ID of a Dimension containing a frequency specification. If a Dimension, the specified dimension is used for the frequency specification.

        Any Dimension used for the frequency specification does not appear in the returned DataFrame.

Returns:

  • pandas.DataFrame

    • if attributes is not '', a data frame with one row per Observation, value as the first column, and additional columns for each attribute;

    • if datetime is given, various layouts as described above; or

    • if _rtype (passed from write_datamessage()) is ‘compat’, various layouts as described in the HOWTO.

  • pandas.Series with pandas.MultiIndex – Otherwise.
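
The two kinds of return value can be pictured with plain pandas. The sketch below is an illustration of the layouts only, not the library's code: it starts from a hand-built Series with a MultiIndex (the default case) and rearranges it into a DataFrame with a PeriodIndex and the remaining dimension as columns, roughly the kind of result the datetime argument describes. The dimension IDs CURRENCY and TIME_PERIOD and all values are made up.

```python
import pandas as pd

# Default layout: one value per observation, indexed by all dimensions
idx = pd.MultiIndex.from_tuples(
    [
        ("USD", "2020-01"),
        ("USD", "2020-02"),
        ("JPY", "2020-01"),
        ("JPY", "2020-02"),
    ],
    names=["CURRENCY", "TIME_PERIOD"],
)
series = pd.Series([1.11, 1.09, 121.9, 120.6], index=idx, name="value")

# Time-indexed layout: move the non-time dimension to the columns...
frame = series.unstack("CURRENCY")
# ...and turn the remaining index of period strings into a PeriodIndex
frame.index = pd.PeriodIndex(frame.index, freq="M", name="TIME_PERIOD")

print(frame)
```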

sdmx.writer.pandas.write_itemscheme(obj: ItemScheme, locale='en')

Convert ItemScheme.

Parameters:

locale (str, optional) – Locale for names to return.

Return type:

pandas.Series or pandas.DataFrame

sdmx.writer.pandas.write_structuremessage(obj: StructureMessage, include=None, **kwargs)

Convert StructureMessage.

Parameters:
  • obj (StructureMessage) –

  • include (iterable of str or str, optional) – One or more of the attributes of the StructureMessage (‘category_scheme’, ‘codelist’, etc.) to transform.

  • kwargs – Passed to write() for each attribute.

Returns:

Keys are StructureMessage attributes; values are pandas objects.

Return type:

DictLike

Todo

Support selection of language for conversion of InternationalString.

writer.xml: Write to SDMX-ML

New in version 1.1.

See to_xml().

SDMX-ML v2.1 writer.

sdmx.writer.xml.i11lstring(obj, name) → List[_Element]

InternationalString.

Returns a list of elements with name name.

sdmx.writer.xml.identifiable(obj, *args, **kwargs) → _Element

Write IdentifiableArtefact.

Unless the keyword argument _with_urn is False, a URN is generated for objects lacking one, and forwarded to annotable().

sdmx.writer.xml.reference(obj, parent=None, tag=None, *, style: Literal['Ref', 'URN'])

Write a reference to obj.

Todo

Currently other functions in writer.xml all pass the style argument to this function. As an enhancement, allow user or automatic selection of different reference styles.

Writer API

sdmx.writer.to_csv(obj, *args, path: os.PathLike | None = None, rtype: Type[str | DataFrame] = str, **kwargs) → None | str | DataFrame

Convert an SDMX obj to SDMX-CSV.

With rtype = DataFrame, the returned object is not necessarily in SDMX-CSV format. In particular, writing this to file using pandas.DataFrame.to_csv() will yield invalid SDMX-CSV, because pandas includes a CSV column corresponding to the index of the data frame. You must pass index=False to disable this behaviour. With rtype = str or when giving path, this is done automatically.
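
The index pitfall described above is a property of pandas itself and can be seen without any SDMX objects. The following sketch uses a hand-built data frame with made-up SDMX-CSV-style columns and values; it is not output from to_csv().

```python
import pandas as pd

# Hypothetical data frame with SDMX-CSV-style columns
df = pd.DataFrame(
    {
        "DATAFLOW": ["EXAMPLE:DF_DEMO(1.0)", "EXAMPLE:DF_DEMO(1.0)"],
        "FREQ": ["A", "A"],
        "TIME_PERIOD": ["2020", "2021"],
        "OBS_VALUE": [1.0, 2.0],
    }
)

# By default pandas writes the index as an extra, unnamed first column
header_with_index = df.to_csv().splitlines()[0]

# index=False omits that column, so the header starts with DATAFLOW
header_without_index = df.to_csv(index=False).splitlines()[0]

print(header_with_index)
print(header_without_index)
```

The first header begins with an empty field for the index, which is the extra column that makes the file invalid as SDMX-CSV.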

Parameters:
  • path (os.PathLike, optional) – Path to write an SDMX-CSV file. If given, nothing is returned.

  • rtype – Return type; see below. Pass literally str or pd.DataFrame; not an instance of either class.

  • kwargs – Keyword arguments passed to dataset().

Returns:

Raises:

NotImplementedError – If obj is any class except DataSet; this is the only class for which the SDMX-CSV standard describes a format.

See also

sdmx.writer.csv.

class sdmx.writer.base.BaseWriter(format_name)

Base class for recursive writers.

Usage:

  • Create an instance of this class.

  • Use register() in the same manner as Python’s built-in functools.singledispatch() to decorate functions that write certain types of sdmx.model or sdmx.message objects.

  • Call recurse() to kick off recursive writing of objects, including from inside other functions.

Example

>>> import sdmx.model
>>> from sdmx.writer.base import BaseWriter
>>> MyWriter = BaseWriter('my')
>>> @MyWriter.register
... def _(obj: sdmx.model.ItemScheme):
...     # ... code to write an ItemScheme ...
...     return result
>>> @MyWriter.register
... def _(obj: sdmx.model.Codelist):
...     # ... code to write a Codelist ...
...     return result

recurse(obj, *args, **kwargs)

Recursively write obj.

If no function has been registered with register() for the class of obj, then the parent class of obj is used to find a method.
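
The fallback to a parent class mirrors how functools.singledispatch() resolves types, which the usage notes above name as the model for register(). The sketch below uses the standard library only, with stand-in classes rather than the real sdmx.model classes and a hypothetical write() function, so it shows the dispatch idea and not BaseWriter itself.

```python
from functools import singledispatch


# Stand-in classes for illustration; not the sdmx.model classes
class ItemScheme:
    pass


class Codelist(ItemScheme):
    pass


class ConceptScheme(ItemScheme):
    pass


@singledispatch
def write(obj):
    # Reached only when neither the class nor any parent is registered
    raise NotImplementedError(f"no writer for {type(obj).__name__}")


@write.register
def _(obj: ItemScheme):
    return f"generic ItemScheme writer used for {type(obj).__name__}"


@write.register
def _(obj: Codelist):
    return "specific Codelist writer"


# Codelist has its own function; ConceptScheme falls back to ItemScheme
print(write(Codelist()))
print(write(ConceptScheme()))
```

Registering a function for a base class therefore covers all of its subclasses until a more specific function is registered.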