Write/convert sdmx objects
The term write refers to both:
- Converting sdmx.message and sdmx.model objects to the SDMX standard file formats.
- Converting sdmx.model objects to pandas objects.
writer.csv: Write to SDMX-CSV
Added in version 2.9.0.
See to_csv().
SDMX-CSV 1.0 writer.
See SDMX-CSV.
- sdmx.writer.csv.dataset(obj: DataSet, *, labels: Literal['id', 'both'] = 'id', time_format: Literal['original', 'normalized'] = 'original', **kwargs) -> DataFrame
Convert DataSet.
The two optional parameters are exactly as described in the specification.
Because SDMX-CSV includes a DATAFLOW column with an identifier (partial URN) for the dataflow to which the data conform, it is mandatory that the described_by attribute of obj gives an association to a DataflowDefinition object, from which a urn can be constructed.
- Parameters:
labels ("id" or "both", optional) –
- "id": Display only Dimension.id / DataAttribute.id in column headers and Code.id in data rows.
- "both": Display both the ID and the localized NameableArtefact.name. Not yet implemented.
time_format ("original" or "normalized", optional) –
- "original": Values for any dimension or attribute with ID TIME_PERIOD are displayed as recorded.
- "normalized": TIME_PERIOD values are converted to the most granular ISO 8601 representation, taking into account the highest frequency of the data in the message and the moment in time when the lower-frequency values were collected. Not yet implemented.
This parameter is called timeFormat in the spec and in HTTP Accept headers.
kwargs – Keyword arguments passed to to_pandas(). In particular, attributes is useful to control which attribute values are included in the returned CSV.
- Return type: DataFrame
- Raises:
NotImplementedError – For labels="both" or time_format="normalized".
ValueError – If DataSet.described_by is None.
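As a rough illustration of the layout described above, the following sketch builds a small SDMX-CSV-like table with the standard library only; it does not call sdmx.writer.csv.dataset(). The dataflow identifier, dimension IDs, and values are hypothetical, and the AGENCY:ID(VERSION) form of the DATAFLOW value is an assumption about how the partial URN is written.

```python
import csv
import io

# Hypothetical values: a dataflow reference written as AGENCY:ID(VERSION),
# two dimensions, the observation value, and one observation attribute.
dataflow = "ESTAT:NA_MAIN(1.6)"
observations = [
    {"FREQ": "A", "TIME_PERIOD": "2021", "OBS_VALUE": 1.5, "OBS_STATUS": "A"},
    {"FREQ": "A", "TIME_PERIOD": "2022", "OBS_VALUE": 1.7, "OBS_STATUS": "P"},
]

buffer = io.StringIO()
writer = csv.DictWriter(
    buffer,
    fieldnames=["DATAFLOW", "FREQ", "TIME_PERIOD", "OBS_VALUE", "OBS_STATUS"],
    lineterminator="\n",
)
writer.writeheader()
for obs in observations:
    # Every row repeats the dataflow reference in the first column.
    writer.writerow({"DATAFLOW": dataflow, **obs})

sdmx_csv = buffer.getvalue()
print(sdmx_csv)
```

The sketch shows why the writer needs described_by: without a dataflow to reference, the first column of every row could not be filled.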
writer.pandas: Convert to pandas objects
Changed in version 1.0: sdmx.to_pandas() handles all types of objects, replacing the earlier, separate data2pandas and structure2pd writers.
to_pandas() implements a dispatch pattern according to the type of obj.
Some of the internal methods take specific arguments and return varying values.
These arguments can be passed to to_pandas() when obj is of the appropriate type:
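| write_datamessage() | Convert DataMessage. |
| write_dataset() | Convert DataSet. |
| write_itemscheme() | Convert ItemScheme. |
| write_structuremessage() | Convert StructureMessage. |
| DEFAULT_RTYPE | Default return type for write_dataset() and similar methods. |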
Other objects are converted as follows:
- Component: The id attribute of the concept_identity is returned.
- DataMessage: The DataSet or data sets within the Message are converted to pandas objects. Returns:
  - pandas.Series or pandas.DataFrame, if obj has only one data set.
  - list of (Series or DataFrame), if obj has more than one data set.
- dict: The values of the mapping are converted individually. If the resulting values are str or Series with indexes that share the same name, then they are converted to a Series, possibly with a pandas.MultiIndex. Otherwise, a DictLike is returned.
- DimensionDescriptor: The components of the DimensionDescriptor are written.
- list: For the following obj, returns Series instead of a list:
  - a list of Observation: the Observations are written using write_dataset().
  - a list with only 1 DataSet (e.g. the data attribute of DataMessage): the Series for the single element is returned.
  - a list of SeriesKey: the key values (but no data) are returned.
- NameableArtefact: The name attribute of obj is returned.
- sdmx.writer.pandas.DEFAULT_RTYPE = 'rows'
Default return type for write_dataset() and similar methods. Either 'compat' or 'rows'. See the HOWTO.
- sdmx.writer.pandas.write_datamessage(obj: DataMessage, *args, rtype=None, **kwargs)
Convert DataMessage.
- Parameters:
rtype ('compat' or 'rows', optional) – Data type to return; default DEFAULT_RTYPE. See the HOWTO.
kwargs – Passed to write_dataset() for each data set.
- Returns:
pandas.Series or pandas.DataFrame – if obj has only one data set.
list of (pandas.Series or pandas.DataFrame) – if obj has more than one data set.
- sdmx.writer.pandas.write_dataset(obj: sdmx.model.common.BaseDataSet, attributes='', dtype=numpy.float64, constraint=None, datetime=False, **kwargs)
Convert DataSet.
See the walkthrough for examples of using the datetime argument.
- Parameters:
obj (DataSet or iterable of Observation)
attributes (str) – Types of attributes to return with the data. A string containing zero or more of:
- 'o': attributes attached to each Observation.
- 's': attributes attached to any (0 or 1) SeriesKey associated with each Observation.
- 'g': attributes attached to any (0 or more) GroupKey associated with each Observation.
- 'd': attributes attached to the DataSet containing the Observations.
dtype (str or numpy.dtype or None) – Datatype for values. If None, do not return the values of a series. In this case, attributes must not be an empty string, so that some attribute is returned.
constraint (ContentConstraint, optional) – If given, only Observations included by the constraint are returned.
datetime (bool or str or Dimension or dict, optional) – If given, return a DataFrame with a DatetimeIndex or PeriodIndex as the index and all other dimensions as columns. Valid datetime values include:
- bool: if True, determine the time dimension automatically by detecting a TimeDimension.
- str: ID of the time dimension.
- Dimension: the matching Dimension is the time dimension.
- dict: advanced behaviour. Keys may include:
  - axis ({0 or 'index', 1 or 'columns'}): axis on which to place the time dimension (default: 0).
  - freq (True or str or Dimension): produce pandas.PeriodIndex. If str, the ID of a Dimension containing a frequency specification. If a Dimension, the specified dimension is used for the frequency specification.
  Any Dimension used for the frequency specification does not appear in the returned DataFrame.
- Returns:
pandas.DataFrame –
- if attributes is not '', a data frame with one row per Observation, value as the first column, and additional columns for each attribute;
- if datetime is given, various layouts as described above; or
- if _rtype (passed from write_datamessage()) is 'compat', various layouts as described in the HOWTO.
pandas.Series with pandas.MultiIndex – Otherwise.
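To picture the two main return shapes, the following sketch builds them in plain pandas rather than by calling write_dataset(). The dimension IDs (CURRENCY, TIME_PERIOD) and the values are hypothetical, and the reshaping steps are only an assumption about what an equivalent layout looks like, not the library's implementation.

```python
import pandas as pd

# Default shape: one value per observation, indexed by all dimensions.
index = pd.MultiIndex.from_tuples(
    [
        ("USD", "2023-01"),
        ("USD", "2023-02"),
        ("JPY", "2023-01"),
        ("JPY", "2023-02"),
    ],
    names=["CURRENCY", "TIME_PERIOD"],
)
series = pd.Series([1.08, 1.07, 140.5, 142.3], index=index, name="value")

# Shape with a time index: the time dimension stays on the rows and
# every other dimension moves to the columns.
frame = series.unstack("CURRENCY")

# Replace the string labels with monthly periods, giving a PeriodIndex.
frame.index = pd.PeriodIndex(frame.index, freq="M")

print(series)
print(frame)
```

The first object corresponds to the "pandas.Series with pandas.MultiIndex" case; the second resembles what a datetime argument with a frequency is described as producing.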
- sdmx.writer.pandas.write_itemscheme(obj: ItemScheme, locale='en')
Convert ItemScheme.
- Parameters:
locale (str, optional) – Locale for names to return.
- Return type:
- sdmx.writer.pandas.write_structuremessage(obj: StructureMessage, include=None, **kwargs)
Convert StructureMessage.
- Parameters:
obj (StructureMessage)
include (iterable of str or str, optional) – One or more of the attributes of the StructureMessage ('category_scheme', 'codelist', etc.) to transform.
kwargs – Passed to write() for each attribute.
- Returns:
Keys are StructureMessage attributes; values are pandas objects.
- Return type:
Todo
Support selection of language for conversion of
InternationalString.
writer.xml: Write to SDMX-ML
Added in version 1.1.
See to_xml().
SDMX-ML v2.1 writer.
- sdmx.writer.xml.i11lstring(obj, name) -> list[_Element]
InternationalString.
Returns a list of elements with name name.
- sdmx.writer.xml.identifiable(obj: IdentifiableArtefact, *args, **kwargs) -> _Element
Write IdentifiableArtefact.
Unless the keyword argument _with_urn is False, a URN is generated for objects lacking one, and forwarded to annotable().
- sdmx.writer.xml.reference(obj, parent=None, tag=None, *, style: Literal['Ref', 'URN'])
Write a reference to obj.
Todo
Currently other functions in writer.xml all pass the style argument to this function. As an enhancement, allow user or automatic selection of different reference styles.
Writer API
- sdmx.writer.to_csv(obj, *args, path: os.PathLike | None = None, rtype: type[str | pandas.core.frame.DataFrame] = str, **kwargs) -> None | str | DataFrame
Convert an SDMX obj to SDMX-CSV.
With rtype = DataFrame, the returned object is not necessarily in SDMX-CSV format. In particular, writing this to file using pandas.DataFrame.to_csv() will yield invalid SDMX-CSV, because pandas includes a CSV column corresponding to the index of the data frame. You must pass index=False to disable this behaviour. With rtype = str or when giving path, this is done automatically.
- Parameters:
path (os.PathLike, optional) – Path to write an SDMX-CSV file. If given, nothing is returned.
rtype – Return type; see below. Pass literally str or pd.DataFrame; not an instance of either class.
kwargs – Keyword arguments passed to dataset().
- Returns:
- Raises:
NotImplementedError – If obj is any class except
DataSet; this is the only class for which the SDMX-CSV standard describes a format.
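The index pitfall described above can be reproduced with pandas alone. The frame below is a hypothetical stand-in for the object returned with rtype=pd.DataFrame; its columns and values are invented for illustration.

```python
import pandas as pd

# Hypothetical frame shaped like SDMX-CSV: DATAFLOW first, then the data.
df = pd.DataFrame(
    {
        "DATAFLOW": ["ESTAT:NA_MAIN(1.6)", "ESTAT:NA_MAIN(1.6)"],
        "TIME_PERIOD": ["2021", "2022"],
        "OBS_VALUE": [1.5, 1.7],
    }
)

# pandas default: the index is written as an extra, unnamed first column.
with_index = df.to_csv()

# Passing index=False omits that column, so DATAFLOW comes first.
without_index = df.to_csv(index=False)

header_with_index = with_index.splitlines()[0]
header_without_index = without_index.splitlines()[0]
print(repr(header_with_index))
print(repr(header_without_index))
```

The first header begins with an empty field for the unnamed index, which is not part of SDMX-CSV; the second begins with DATAFLOW as expected.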
See also
- class sdmx.writer.base.BaseWriter(format_name)
Base class for recursive writers.
Usage:
- Create an instance of this class.
- Use register() in the same manner as Python's built-in functools.singledispatch() to decorate functions that write certain types of sdmx.model or sdmx.message objects.
- Call recurse() to kick off recursive writing of objects, including from inside other functions.
Example
>>> MyWriter = BaseWriter('my')

>>> @MyWriter.register
... def _(obj: sdmx.model.ItemScheme):
...     # ... code to write an ItemScheme ...
...     return result

>>> @MyWriter.register
... def _(obj: sdmx.model.Codelist):
...     # ... code to write a Codelist ...
...     return result
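The register/recurse pattern can be sketched with the standard library alone. MiniWriter, ItemScheme, Codelist, and Code below are hypothetical stand-ins written for this illustration; they are not the sdmx classes, and the real BaseWriter may differ in its details.

```python
from functools import singledispatch


class MiniWriter:
    """Hypothetical stand-in for a recursive writer; not sdmx's BaseWriter."""

    def __init__(self, format_name):
        self.format_name = format_name

        # Fallback used when no function is registered for type(obj).
        def _fallback(obj, *args, **kwargs):
            raise NotImplementedError(
                f"write {type(obj).__name__} to {self.format_name}"
            )

        self._dispatcher = singledispatch(_fallback)

    def register(self, func):
        """Register func for the type annotated on its first argument."""
        return self._dispatcher.register(func)

    def recurse(self, obj, *args, **kwargs):
        """Write obj using the function registered for its type."""
        return self._dispatcher(obj, *args, **kwargs)


# Hypothetical model classes; Codelist specializes ItemScheme.
class ItemScheme:
    def __init__(self, id, items):
        self.id, self.items = id, items


class Codelist(ItemScheme):
    pass


class Code:
    def __init__(self, id):
        self.id = id


writer = MiniWriter("text")


@writer.register
def _(obj: Code):
    return obj.id


@writer.register
def _(obj: ItemScheme):
    # Recurse into the items; each one is dispatched on its own type.
    return f"{obj.id}: " + ", ".join(writer.recurse(i) for i in obj.items)


@writer.register
def _(obj: Codelist):
    # More specific than ItemScheme, so it is chosen for Codelist instances.
    return "codelist " + writer.recurse(ItemScheme(obj.id, obj.items))


text = writer.recurse(Codelist("CL_FREQ", [Code("A"), Code("M")]))
print(text)
```

Registering a function for the subclass overrides the one for its parent class, while the subclass function can still delegate shared work by recursing on a parent-typed object.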