Development

This page describes the development of sdmx. Contributions are welcome!

Code style

  • Apply the following to new or modified code:

    isort -rc . && black . && mypy . && flake8
    

    Respectively, these tools:

    • isort: sort import lines at the top of code files in a consistent way.

    • black: apply the black code style.

    • mypy: check static typing.

    • flake8: check code style against PEP 8.

  • Follow the 7 rules of a great Git commit message.

  • Write docstrings in the numpydoc style.
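As an illustration of the numpydoc layout, here is a minimal sketch. The function `scale` and its parameters are hypothetical and are not part of sdmx; only the section headings and their underlines follow the numpydoc convention.

```python
def scale(values, factor=1.0):
    """Multiply each value by a constant factor.

    Parameters
    ----------
    values : list of float
        Numbers to scale.
    factor : float, optional
        Multiplier applied to every element of `values`.

    Returns
    -------
    list of float
        The scaled values, in the same order as `values`.

    Examples
    --------
    >>> scale([1.0, 2.0], factor=2.0)
    [2.0, 4.0]
    """
    # Build a new list; the input is left unchanged.
    return [v * factor for v in values]
```

The one-line summary comes first, followed by underlined sections in the order Parameters, Returns, Examples.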

Test specimens

New in version 2.0.

A variety of specimens—example files from real web services, or published with the standards—are used to test that sdmx correctly reads and writes the different SDMX message formats. Since v2.0, specimens are stored in the separate sdmx-test-data repository.

Running the test suite requires these files. To make them available:

  1. Obtain the files by one of two methods:

    1. Clone khaeru/sdmx-test-data:

      $ git clone git@github.com:khaeru/sdmx-test-data.git
      
    2. Download https://github.com/khaeru/sdmx-test-data/archive/master.zip

  2. Indicate where pytest can find the files, by one of two methods:

    1. Set the SDMX_TEST_DATA environment variable:

      # Set the variable only for one command
      $ SDMX_TEST_DATA=/path/to/files pytest
      
      # Export the variable to the environment
      $ export SDMX_TEST_DATA=/path/to/files
      $ pytest
      
    2. Give the option --sdmx-test-data=<PATH> when invoking pytest:

      $ pytest --sdmx-test-data=/path/to/files
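The two ways of indicating the location can be pictured with a small sketch. The helper `resolve_test_data_path` is hypothetical; the actual handling lives in sdmx.testing and may differ. In particular, the precedence of the command-line option over the environment variable is an assumption made for this example.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_test_data_path(
    option: Optional[str] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Path:
    """Return the specimen directory from the option or the environment.

    Hypothetical helper, not the sdmx implementation.
    """
    environ = os.environ if environ is None else environ
    # Assumption: an explicit --sdmx-test-data value wins over the variable.
    value = option or environ.get("SDMX_TEST_DATA")
    if not value:
        raise RuntimeError("Give --sdmx-test-data=<PATH> or set SDMX_TEST_DATA")
    return Path(value).expanduser()
```

Passing the environment as an argument keeps the function easy to test without modifying `os.environ`.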
      

The files are:

  • Arranged in directories with names matching particular sources in sources.json.

  • Named with:

    • Certain keywords:

      • -structure: a structure message, often associated with a file with a similar name containing a data message.

      • ts: time-series data, i.e. with a TimeDimension at the level of individual Observations.

      • xs: cross-sectional data arranged in other ways.

      • flat: flat DataSets with all Dimensions at the Observation level.

      • ss: structure-specific data messages.

    • In some cases, the query string or data flow/structure ID as the file name.

    • Hyphens - instead of underscores _.
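The naming conventions above lend themselves to simple inspection. The following sketch splits a file name on hyphens and reports which of the listed keywords it contains. The function `keywords_in` and the example file names are hypothetical and serve only to illustrate the convention.

```python
from pathlib import Path
from typing import List

# Keywords from the list above, with short descriptions.
KEYWORDS = {
    "structure": "structure message",
    "ts": "time-series data",
    "xs": "cross-sectional data",
    "flat": "flat data set",
    "ss": "structure-specific data message",
}


def keywords_in(filename: str) -> List[str]:
    """Return the known keywords among the hyphen-separated parts of a name."""
    # Drop the extension, then split on the hyphens used as separators.
    parts = Path(filename).stem.split("-")
    return [part for part in parts if part in KEYWORDS]
```

For instance, `keywords_in("example-ts-ss.xml")` gives `["ts", "ss"]`, and `keywords_in("example-structure.xml")` gives `["structure"]`.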

Releasing

Before releasing, check:

Address any failures before releasing.

  1. Edit doc/whatsnew.rst. Comment the heading “Next release”, then insert another heading below it, at the same level, with the version number and date. Make a commit with a message like “Mark vX.Y.Z in doc/whatsnew”.

  2. Tag the version as a release candidate, i.e. with a rcN suffix, and push:

    $ git tag v1.2.3rc1
    $ git push --tags origin master
    
  3. Check:

    Address any warnings or errors that appear. If needed, make a new commit and go back to step (2), incrementing the rc number.

  4. Optional. Unlike step (2), this step can also be performed directly on GitHub; see (5), next. Tag the release itself and push:

    $ git tag v1.2.3
    $ git push --tags origin master
    
  5. Visit https://github.com/khaeru/sdmx/releases and mark the new release: either using the pushed tag from (4), or by creating the tag and release simultaneously.

  6. Check at https://github.com/khaeru/sdmx/actions?query=workflow:publish and https://pypi.org/project/sdmx1/ that the distributions are published.
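Incrementing the rc number by hand is easy to get wrong. As a sketch only, the hypothetical helper below computes the next release-candidate tag from the current one, assuming tags of the form vX.Y.Z or vX.Y.ZrcN as in the examples above. It is not part of sdmx or of the release workflow.

```python
import re

# Matches e.g. "v1.2.3" or "v1.2.3rc1"; the rc suffix is optional.
_TAG = re.compile(r"^v(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?$")


def next_rc_tag(tag: str) -> str:
    """Return the release-candidate tag that follows `tag`."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"Unrecognized tag: {tag!r}")
    major, minor, patch, rc = match.groups()
    # A tag without an rc suffix yields the first candidate.
    number = int(rc) + 1 if rc is not None else 1
    return f"v{major}.{minor}.{patch}rc{number}"
```

Here `next_rc_tag("v1.2.3rc1")` returns `"v1.2.3rc2"`, and `next_rc_tag("v1.2.3")` returns `"v1.2.3rc1"`.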

Internal code reference

testing: Testing utilities

class sdmx.testing.MessageTest

Bases: object

Base class for tests of specific specimen files.

directory: Union[str, pathlib.Path] = PosixPath('.')
filename: str
msg(path)
path(test_data_path)

class sdmx.testing.SpecimenCollection(base_path)

Bases: object

Collection of test specimens.

as_params(format=None, kind=None, marks={})

Generate pytest.param() from specimens.

One param() is generated for each specimen that matches the format and kind arguments (if any). Marks are attached to each param from marks, wherein the keys are partial paths.
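To picture how marks keyed by partial paths could be attached, consider this sketch. It assumes that a key matches when it occurs anywhere within the specimen path; the real matching rule in as_params() may differ. The function `marks_for` is hypothetical, and plain strings stand in for the pytest mark objects a real test would pass.

```python
from typing import Any, Dict, List


def marks_for(path: str, marks: Dict[str, Any]) -> List[Any]:
    """Return the values in `marks` whose key occurs within `path`.

    Assumption: "partial path" means a substring of the full path.
    """
    return [mark for partial, mark in marks.items() if partial in path]
```

With `marks = {"SRC_A/": "skip", "-ts": "xfail"}`, the call `marks_for("SRC_B/example-ts.xml", marks)` returns `["xfail"]`, because only the second key occurs in the path.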

expected_data(path)

Return the expected to_pandas() result for the specimen path.

sdmx.testing.assert_pd_equal(left, right, **kwargs)

Assert equality of two pandas objects.

sdmx.testing.generate_endpoint_tests(metafunc)

pytest hook for parametrizing tests that need an “endpoint” fixture.

sdmx.testing.parametrize_specimens(metafunc)

Handle @pytest.mark.parametrize_specimens(…).

sdmx.testing.pytest_addoption(parser)

Add the --sdmx-test-data command-line option to pytest.

sdmx.testing.pytest_configure(config)

Handle the --sdmx-test-data command-line option.

sdmx.testing.pytest_generate_tests(metafunc)

Generate tests.

Calls both parametrize_specimens() and generate_endpoint_tests().

sdmx.testing.specimen(pytestconfig)

Fixture: the SpecimenCollection.

sdmx.testing.test_data_path(pytestconfig)

Fixture: the Path given as --sdmx-test-data.

sdmx.testing.unsupported = MarkDecorator(mark=Mark(name='xfail', args=(), kwargs={'strict': True, 'reason': 'Known non-supported endpoint.', 'raises': <class 'NotImplementedError'>}))

The NotImplementedError is raised by client.Client._request_from_args.

Todo

Parametrize force=True to query these endpoints anyway; XPASS will then reveal when data sources change their support for endpoints.

util: Utilities

class sdmx.util.BaseModel

Bases: pydantic.main.BaseModel

Shim for pydantic.BaseModel.

This class changes two behaviours in pydantic. The methods are direct copies from pydantic’s code, with marked changes.

  1. https://github.com/samuelcolvin/pydantic/issues/524

    • “Multiple RecursionErrors with self-referencing models”

    • In e.g. Item, having both .parent and .child references leads to infinite recursion during validation.

    • Fix: override BaseModel.__setattr__.

    • New value ‘limited’ for Config.validate_assignment: no sibling field values are passed to Field.validate().

    • New key Config.validate_assignment_exclude: list of field names that are not validated per se and not passed to Field.validate() when validating a sibling field.

  2. https://github.com/samuelcolvin/pydantic/issues/521

    • “Assignment to attribute changes id() but not referenced object,” marked as wontfix by pydantic maintainer.

    • When cls.attr is typed as BaseModel (or a subclass), then a.attr is b.attr is always False, even when set to the same reference.

    • Fix: override BaseModel.validate() without copy().

class Config

Bases: object

validate_assignment = 'limited'
validate_assignment_exclude: List[str] = []
classmethod validate(value: Any) → Model

sdmx.util.summarize_dictlike(dl, maxwidth=72)

Return a string summary of the DictLike contents.

sdmx.util.validate_dictlike(*fields)

Inline TODOs

Todo

Support selection of language for conversion of InternationalString.

(The original entry is located in doc/api.rst, line 141.)

Todo

Parametrize force=True to query these endpoints anyway; XPASS will then reveal when data sources change their support for endpoints.

(The original entry is located in docstring of sdmx.testing.unsupported, line 3.)