Write/convert sdmx objects
The term write refers to both:
- Converting sdmx.message and sdmx.model objects to the SDMX standard file formats.
- Converting sdmx.model objects to pandas objects.
writer.csv: Write to SDMX-CSV
Added in version 2.9.0.
See to_csv().
SDMX-CSV 1.0 writer.
See SDMX-CSV.
- sdmx.writer.csv.dataset(obj: DataSet, *, labels: Literal['id', 'both'] = 'id', time_format: Literal['original', 'normalized'] = 'original', **kwargs) -> DataFrame
Convert DataSet.
The two optional parameters are exactly as described in the specification.
Because SDMX-CSV includes a DATAFLOW column with an identifier (partial URN) for the dataflow to which the data conform, the described_by attribute of obj must reference a DataflowDefinition object, from which a urn can be constructed.
- Parameters:
labels ("id" or "both", optional) –
- "id": Display only Dimension.id / DataAttribute.id in column headers and Code.id in data rows.
- "both": Display both the ID and the localized NameableArtefact.name. Not yet implemented.
time_format ("original" or "normalized", optional) –
- "original": Values for any dimension or attribute with ID TIME_PERIOD are displayed as recorded.
- "normalized": TIME_PERIOD values are converted to the most granular ISO 8601 representation, taking into account the highest frequency of the data in the message and the moment in time when the lower-frequency values were collected. Not yet implemented.
This parameter is called timeFormat in the spec and in HTTP Accept headers.
kwargs – Keyword arguments passed to to_pandas(). In particular, attributes is useful to control which attribute values are included in the returned CSV.
- Return type: pandas.DataFrame
- Raises:
NotImplementedError – For labels="both" or time_format="normalized".
ValueError – If DataSet.described_by is None.
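The role of the DATAFLOW column can be sketched with the standard library alone. This is not the sdmx implementation: the helper names, the "AGENCY:ID(VERSION)" form of the partial URN, and the sample exchange-rate values are assumptions made for illustration.

```python
import csv
import io


def dataflow_ref(agency_id, flow_id, version):
    """Format a partial URN for the DATAFLOW column (assumed form)."""
    return f"{agency_id}:{flow_id}({version})"


def sketch_sdmx_csv(flow, dimensions, observations):
    """Render (key, value) pairs as CSV text with a leading DATAFLOW column."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    # Header: DATAFLOW first, then one column per dimension, then the value.
    writer.writerow(["DATAFLOW", *dimensions, "OBS_VALUE"])
    for key, value in observations:
        # Every row repeats the same dataflow reference.
        writer.writerow([flow, *key, value])
    return buf.getvalue()


# Illustrative data only; not taken from any real data source.
flow = dataflow_ref("ECB", "EXR", "1.0")
text = sketch_sdmx_csv(
    flow,
    ["FREQ", "CURRENCY", "TIME_PERIOD"],
    [(("A", "USD", "2020"), 1.1422), (("A", "USD", "2021"), 1.1827)],
)
print(text)
```

Without a dataflow reference there is nothing to put in the first column of every row, which is why a missing described_by is treated as an error.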
writer.pandas: Convert to pandas objects
Changed in version 1.0: sdmx.to_pandas() handles all types of objects, replacing the earlier, separate data2pandas and structure2pd writers.
to_pandas() implements a dispatch pattern according to the type of obj.
Some of the internal methods take specific arguments and return varying values.
These arguments can be passed to to_pandas() when obj is of the appropriate type:
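| write_datamessage() | Convert DataMessage. |
| write_dataset() | Convert DataSet. |
| write_itemscheme() | Convert ItemScheme. |
| write_structuremessage() | Convert StructureMessage. |
| DEFAULT_RTYPE | Default return type for write_dataset() and similar methods. |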
Other objects are converted as follows:
Component
The id attribute of the concept_identity is returned.
DataMessage
The DataSet or data sets within the Message are converted to pandas objects. Returns:
- pandas.Series or pandas.DataFrame, if obj has only one data set.
- list of (Series or DataFrame), if obj has more than one data set.
dict
The values of the mapping are converted individually. If the resulting values are str or Series with indexes that share the same name, then they are converted to a Series, possibly with a pandas.MultiIndex. Otherwise, a DictLike is returned.
DimensionDescriptor
The components of the DimensionDescriptor are written.
list
For the following obj, returns Series instead of a list:
- a list of Observation: the Observations are written using write_dataset().
- a list with only 1 DataSet (e.g. the data attribute of DataMessage): the Series for the single element is returned.
- a list of SeriesKey: the key values (but no data) are returned.
NameableArtefact
The name attribute of obj is returned.
- sdmx.writer.pandas.DEFAULT_RTYPE = 'rows'
Default return type for write_dataset() and similar methods. Either 'compat' or 'rows'. See the HOWTO.
- sdmx.writer.pandas.write_datamessage(obj: DataMessage, *args, rtype=None, **kwargs)
Convert DataMessage.
- Parameters:
rtype ('compat' or 'rows', optional) – Data type to return; default DEFAULT_RTYPE. See the HOWTO.
kwargs – Passed to write_dataset() for each data set.
- Returns:
pandas.Series or pandas.DataFrame – if obj has only one data set.
list of (pandas.Series or pandas.DataFrame) – if obj has more than one data set.
- sdmx.writer.pandas.write_dataset(obj: DataSet, attributes='', dtype=numpy.float64, constraint=None, datetime=False, **kwargs)
Convert DataSet.
See the walkthrough for examples of using the datetime argument.
- Parameters:
obj (DataSet or iterable of Observation)
attributes (str) – Types of attributes to return with the data. A string containing zero or more of:
- 'o': attributes attached to each Observation.
- 's': attributes attached to any (0 or 1) SeriesKey associated with each Observation.
- 'g': attributes attached to any (0 or more) GroupKey associated with each Observation.
- 'd': attributes attached to the DataSet containing the Observations.
dtype (str or numpy.dtype or None) – Datatype for values. If None, do not return the values of a series. In this case, attributes must not be an empty string, so that some attribute is returned.
constraint (ContentConstraint, optional) – If given, only Observations included by the constraint are returned.
datetime (bool or str or Dimension or dict, optional) – If given, return a DataFrame with a DatetimeIndex or PeriodIndex as the index and all other dimensions as columns. Valid datetime values include:
- bool: if True, determine the time dimension automatically by detecting a TimeDimension.
- str: ID of the time dimension.
- Dimension: the matching Dimension is the time dimension.
- dict: advanced behaviour. Keys may include:
  - axis ({0 or 'index', 1 or 'columns'}): axis on which to place the time dimension (default: 0).
  - freq (True or str or Dimension): produce pandas.PeriodIndex. If str, the ID of a Dimension containing a frequency specification. If a Dimension, the specified dimension is used for the frequency specification.
  Any Dimension used for the frequency specification does not appear in the returned DataFrame.
- Returns:
pandas.DataFrame –
- if attributes is not '', a data frame with one row per Observation, value as the first column, and additional columns for each attribute;
- if datetime is given, various layouts as described above; or
- if _rtype (passed from write_datamessage()) is 'compat', various layouts as described in the HOWTO.
pandas.Series with pandas.MultiIndex – Otherwise.
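The rules for the attributes string and its interaction with dtype can be restated as a small check. The helper below is hypothetical and is not part of sdmx; in particular, raising ValueError for an unknown letter is an assumption, not documented behaviour of write_dataset().

```python
# Hypothetical helper, not part of sdmx: it only restates the rules above.
ATTACHMENT_LEVELS = {
    "o": "Observation",
    "s": "SeriesKey",
    "g": "GroupKey",
    "d": "DataSet",
}


def check_arguments(attributes="", dtype="float64"):
    """Return the attachment levels selected by `attributes`."""
    unknown = sorted(set(attributes) - set(ATTACHMENT_LEVELS))
    if unknown:
        # Assumption: unknown letters are treated as an error in this sketch.
        raise ValueError(f"unknown attribute level(s): {unknown}")
    if dtype is None and attributes == "":
        # With no values and no attributes there would be nothing to return.
        raise ValueError("with dtype=None, attributes must not be empty")
    # Report levels in the fixed order o, s, g, d, whatever order was given.
    return [name for char, name in ATTACHMENT_LEVELS.items() if char in attributes]


selected = check_arguments("do")
print(selected)
```

The empty string, which is the default, selects no attributes, so only observation values are returned.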
- sdmx.writer.pandas.write_itemscheme(obj: ItemScheme, locale='en')
Convert ItemScheme.
- Parameters:
locale (str, optional) – Locale for names to return.
- sdmx.writer.pandas.write_structuremessage(obj: StructureMessage, include=None, **kwargs)
Convert StructureMessage.
- Parameters:
obj (StructureMessage)
include (iterable of str or str, optional) – One or more of the attributes of the StructureMessage ('category_scheme', 'codelist', etc.) to transform.
kwargs – Passed to write() for each attribute.
- Returns:
Keys are StructureMessage attributes; values are pandas objects.
Todo
Support selection of language for conversion of InternationalString.
writer.xml: Write to SDMX-ML
Added in version 1.1.
See to_xml().
SDMX-ML v2.1 writer.
- sdmx.writer.xml.i11lstring(obj, name) -> List[_Element]
InternationalString.
Returns a list of elements with name name.
- sdmx.writer.xml.identifiable(obj, *args, **kwargs) -> _Element
Write IdentifiableArtefact.
Unless the keyword argument _with_urn is False, a URN is generated for objects lacking one, and forwarded to annotable().
- sdmx.writer.xml.reference(obj, parent=None, tag=None, *, style: Literal['Ref', 'URN'])
Write a reference to obj.
Todo
Currently other functions in writer.xml all pass the style argument to this function. As an enhancement, allow user or automatic selection of different reference styles.
Writer API
- sdmx.writer.to_csv(obj, *args, path: os.PathLike | None = None, rtype: Type[str | DataFrame] = str, **kwargs) -> None | str | DataFrame
Convert an SDMX obj to SDMX-CSV.
With rtype = DataFrame, the returned object is not necessarily in SDMX-CSV format. In particular, writing this to file using pandas.DataFrame.to_csv() will yield invalid SDMX-CSV, because pandas includes a CSV column corresponding to the index of the data frame. You must pass index=False to disable this behaviour. With rtype = str or when giving path, this is done automatically.
- Parameters:
path (os.PathLike, optional) – Path to write an SDMX-CSV file. If given, nothing is returned.
rtype – Return type; see below. Pass literally str or pd.DataFrame; not an instance of either class.
kwargs – Keyword arguments passed to dataset().
- Returns:
None if path is given; otherwise a str or pandas.DataFrame, according to rtype.
- Raises:
NotImplementedError – If obj is any class except DataSet; this is the only class for which the SDMX-CSV standard describes a format.
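The index pitfall described for rtype = DataFrame can be reproduced with pandas alone. The frame below is a hand-built stand-in for the object to_csv() might return; its column names and values are illustrative assumptions, and sdmx itself is not called.

```python
import pandas as pd

# Stand-in for a frame returned with rtype=pd.DataFrame (illustrative data).
df = pd.DataFrame(
    {
        "DATAFLOW": ["ECB:EXR(1.0)", "ECB:EXR(1.0)"],
        "FREQ": ["A", "A"],
        "TIME_PERIOD": ["2020", "2021"],
        "OBS_VALUE": [1.1422, 1.1827],
    }
)

# Default behaviour: pandas writes the index as an extra, unnamed first column.
header_with_index = df.to_csv().splitlines()[0]

# index=False drops that column, so DATAFLOW is the first field again.
header_without_index = df.to_csv(index=False).splitlines()[0]

print(header_with_index)
print(header_without_index)
```

The first header starts with an empty field for the index, so DATAFLOW is no longer the first column; the second header starts with DATAFLOW.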
- class sdmx.writer.base.BaseWriter(format_name)
Base class for recursive writers.
Usage:
- Create an instance of this class.
- Use register() in the same manner as Python's built-in functools.singledispatch() to decorate functions that write certain types of sdmx.model or sdmx.message objects.
- Call recurse() to kick off recursive writing of objects, including from inside other functions.
Example
>>> import sdmx.model
>>> from sdmx.writer.base import BaseWriter
>>> MyWriter = BaseWriter('my')
>>> @MyWriter.register
... def _(obj: sdmx.model.ItemScheme):
...     # ... code to write an ItemScheme ...
...     return result
>>> @MyWriter.register
... def _(obj: sdmx.model.Codelist):
...     # ... code to write a Codelist ...
...     return result
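Since register() is described as working like functools.singledispatch(), the dispatch behaviour can be sketched with the standard library alone. The classes below are simplified stand-ins, not the real sdmx.model classes, and the plain function write() plays the role that recurse() has on a BaseWriter instance.

```python
from functools import singledispatch


# Stand-in classes for illustration; not the real sdmx.model classes.
class ItemScheme:
    def __init__(self, id):
        self.id = id


class Codelist(ItemScheme):
    pass


class ConceptScheme(ItemScheme):
    pass


@singledispatch
def write(obj):
    """Fallback for types with no registered writer."""
    raise NotImplementedError(f"no writer for {type(obj).__name__}")


@write.register
def _(obj: ItemScheme):
    # Used for ItemScheme and any subclass without a more specific writer.
    return f"ItemScheme {obj.id}"


@write.register
def _(obj: Codelist):
    # More specific than the ItemScheme writer, so it wins for Codelist.
    return f"Codelist {obj.id}"


@write.register
def _(obj: list):
    # Recursion: each element is dispatched again on its own type.
    return [write(item) for item in obj]


results = write([Codelist("CL_FREQ"), ConceptScheme("CS_1")])
print(results)
```

ConceptScheme has no writer of its own, so it falls back to the function registered for its parent class, while Codelist uses its more specific function.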