This page was generated from docs/source/notebooks/plugin_system/plugin_howto_write_a_io_plugin.ipynb.

How to Write your own Io plugin

There are many different data formats, so it is quite likely that your experimental setup uses one that isn’t yet supported by a glotaran plugin, and that you will want to write your own DataIo plugin to support it.

Since json is a very common format (admittedly not for data, but in general) and Python has built-in support for it, we will use it as an example.

First let’s have a look at which DataIo plugins are already installed and which functions they support.

[1]:
from glotaran.io import data_io_plugin_table
[2]:
data_io_plugin_table()
[2]:

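===========  ============  ============
Format name  load_dataset  save_dataset
===========  ============  ============
ascii        *             *
nc           *             *
sdt          *             /
===========  ============  ============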

It looks like there isn’t a json plugin installed yet, but maybe someone else has already written one, so have a look at the `3rd party plugins list in the user documentation <https://pyglotaran.readthedocs.io/en/latest/user_documentation/using_plugins.html>`__ before you start writing your own plugin.

For the sake of the example, we will write our json plugin even if there already exists one by the time you read this.

First you need to import all needed libraries and functions.

  • from __future__ import annotations: needed to use the Python 3.10 typing syntax (|), even with a lower Python version

  • json, xarray: needed for the reading and writing itself

  • DataIoInterface: the class to subclass; this way you get proper type checking and, especially, signature checking

  • register_data_io: registers the DataIo plugin under the given format_names

[3]:
from __future__ import annotations

import json

import xarray as xr

from glotaran.io.interface import DataIoInterface
from glotaran.plugin_system.data_io_registration import register_data_io
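As a small aside on the first import, here is a minimal sketch of what it changes: annotations are stored as plain strings instead of being evaluated, which is why the | syntax does not fail at definition time on older Python versions. The function name describe is made up for this illustration.

```python
from __future__ import annotations


def describe(value: int | None = None) -> str | None:
    """Return a short text for ``value``, or None if nothing was passed."""
    return None if value is None else f"value={value}"


# With the __future__ import the annotations are kept as unevaluated strings,
# so ``int | None`` is never executed as an expression when the function is defined.
print(describe.__annotations__)
```

The quoted annotations in the help output further down this page are a consequence of the same import.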

DataIoInterface has two methods we could implement, load_dataset and save_dataset, which are used by the identically named functions in glotaran.io.

We will just implement both for our example to be complete. The quickest way to get started is to copy over the code from DataIoInterface, which already has the right signatures and some boilerplate docstrings for the method arguments.

If the default arguments aren’t enough for your plugin and you need your methods to have additional options, you can just add those. Note the * between file_name and my_extra_option: this tells Python that my_extra_option is a keyword-only argument, and `mypy <https://github.com/python/mypy>`__ won’t raise an [override] type error for changing the signature of the method. To help others who might use your plugin, as well as your future self, it is good practice to document what each parameter does in the method’s docstring, which can be accessed with the help function.
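To see what the * does in isolation, here is a minimal sketch that doesn’t need glotaran at all. BaseReader and MyReader are hypothetical stand-ins for the interface and the plugin class; only the calling behaviour of the keyword-only argument is shown.

```python
from typing import Optional


class BaseReader:
    """Hypothetical stand-in for an interface class such as DataIoInterface."""

    def load_dataset(self, file_name: str):
        raise NotImplementedError


class MyReader(BaseReader):
    """Subclass that adds an optional keyword-only argument after ``*``."""

    def load_dataset(self, file_name: str, *, my_extra_option: Optional[str] = None):
        return file_name, my_extra_option


reader = MyReader()
# Callers that only know the base signature keep working.
print(reader.load_dataset("data.json"))
# The extra option has to be passed by name ...
print(reader.load_dataset("data.json", my_extra_option="demo"))

# ... because passing it positionally is rejected by Python itself.
try:
    reader.load_dataset("data.json", "demo")
    positional_call_failed = False
except TypeError as error:
    positional_call_failed = True
    print(f"TypeError: {error}")
```

Because the extra argument is optional and can never be mixed up with a positional argument, every call that is valid for the base class stays valid for the subclass.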

Finally, add the @register_data_io decorator with the format_names you want to register the plugin under, in our case json and my_json.

Pro tip: You don’t need to implement the whole functionality inside of the method itself.

[4]:
@register_data_io(["json", "my_json"])
class JsonDataIo(DataIoInterface):
    """My new shiny glotaran plugin for json data io"""

    def load_dataset(
        self, file_name: str, *, my_extra_option: str | None = None
    ) -> xr.Dataset | xr.DataArray:
        """Read json data to xarray.Dataset


        Parameters
        ----------
        file_name : str
            File containing the data.
        my_extra_option: str
            This argument is only for demonstration
        """
        if my_extra_option is not None:
            print(f"Using my extra option loading json: {my_extra_option}")

        with open(file_name) as json_file:
            data_dict = json.load(json_file)
        return xr.Dataset.from_dict(data_dict)

    def save_dataset(
        self, dataset: xr.Dataset | xr.DataArray, file_name: str, *, my_extra_option=None
    ):
        """Write xarray.Dataset to a json file

        Parameters
        ----------
        dataset : xr.Dataset
            Dataset to be saved to file.
        file_name : str
            File to write the result data to.
        my_extra_option: str
            This argument is only for demonstration
        """
        if my_extra_option is not None:
            print(f"Using my extra option for writing json: {my_extra_option}")

        data_dict = dataset.to_dict()
        with open(file_name, "w") as json_file:
            json.dump(data_dict, json_file)

Let’s verify that our new plugin was registered successfully under the format_names json and my_json.

[5]:
data_io_plugin_table()
[5]:

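===========  ============  ============
Format name  load_dataset  save_dataset
===========  ============  ============
ascii        *             *
json         *             *
my_json      *             *
nc           *             *
sdt          *             /
===========  ============  ============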
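If you are curious how one decorator call can make the same plugin show up under several format names, the following is a minimal sketch of a registration decorator. It is not glotaran’s actual implementation: PLUGIN_REGISTRY, register_plugin and JsonPlugin are hypothetical names, and storing one instance per format name is an assumption made for this illustration.

```python
# Hypothetical registry mapping a format name to a plugin instance.
PLUGIN_REGISTRY = {}


def register_plugin(format_names):
    """Return a class decorator that registers the class under each format name."""
    if isinstance(format_names, str):
        format_names = [format_names]

    def decorator(plugin_class):
        for format_name in format_names:
            # One instance per format name, so each instance knows its own format.
            PLUGIN_REGISTRY[format_name] = plugin_class(format_name)
        # Return the class unchanged so it can still be used and subclassed normally.
        return plugin_class

    return decorator


@register_plugin(["json", "my_json"])
class JsonPlugin:
    def __init__(self, format_name):
        self.format_name = format_name


print(sorted(PLUGIN_REGISTRY))
```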

Now let’s use the example data from the quickstart to test the reading and writing capabilities of our plugin.

[6]:
from glotaran.io import load_dataset
from glotaran.io import save_dataset
from glotaran.testing.simulated_data.sequential_spectral_decay import DATASET as dataset
[7]:
dataset
[7]:
<xarray.Dataset> Size: 1MB
Dimensions:   (time: 2100, spectral: 72)
Coordinates:
  * time      (time) float64 17kB -1.0 -0.99 -0.98 -0.97 ... 19.97 19.98 19.99
  * spectral  (spectral) float64 576B 600.0 601.4 602.8 ... 696.6 698.0 699.4
Data variables:
    data      (time, spectral) float64 1MB 0.005208 0.008865 ... 2.548 2.31
Attributes:
    source_path:  dataset_1.nc

To get a feeling for our data, let’s plot some traces.

[8]:
plot_data = dataset.data.sel(spectral=[620, 630, 650], method="nearest")
plot_data.plot.line(x="time", aspect=2, size=5)
[8]:
[<matplotlib.lines.Line2D at 0x7f3c30d73d60>,
 <matplotlib.lines.Line2D at 0x7f3c30d73d90>,
 <matplotlib.lines.Line2D at 0x7f3c30d73e80>]
[Figure: line plot of the data traces over time at the spectral values nearest to 620, 630 and 650]

Since we want to see a difference between the original data and the data we save and load again, we divide the amplitudes by 2 (for no other reason).

[9]:
dataset["data"] = dataset.data / 2

Now that we changed the data, let’s write them to a file.

But in which order were the arguments again? And are there any additional options?

Good thing we documented our new plugin, so we can just look up the help.

[10]:
from glotaran.io import show_data_io_method_help

show_data_io_method_help("json", "save_dataset")
Help on method save_dataset in module __main__:

save_dataset(dataset: 'xr.Dataset | xr.DataArray', file_name: 'str', *, my_extra_option=None) method of __main__.JsonDataIo instance
    Write xarray.Dataset to a json file

    Parameters
    ----------
    dataset : xr.Dataset
        Dataset to be saved to file.
    file_name : str
        File to write the result data to.
    my_extra_option: str
        This argument is only for demonstration
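The same information can also be read programmatically with the standard library. The sketch below uses a stand-in function, save_dataset_example, that only mimics the signature of our method (it is not the plugin itself) and lists its keyword-only parameters with inspect.

```python
import inspect


def save_dataset_example(dataset, file_name, *, my_extra_option=None):
    """Write a dataset to a file.

    Parameters
    ----------
    dataset
        Dataset to be saved to file.
    file_name
        File to write the data to.
    my_extra_option
        This argument is only for demonstration
    """


signature = inspect.signature(save_dataset_example)
# Collect every parameter that can only be passed by name.
keyword_only = [
    name
    for name, parameter in signature.parameters.items()
    if parameter.kind is inspect.Parameter.KEYWORD_ONLY
]
summary_line = inspect.getdoc(save_dataset_example).splitlines()[0]

print(signature)
print(keyword_only)
print(summary_line)
```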

Note that the function save_dataset has additional arguments:

  • format_name: overrides the inferred plugin selection

  • allow_overwrite: allows overwriting existing files (USE WITH CAUTION!!!)

[11]:
help(save_dataset)
Help on function save_dataset in module glotaran.plugin_system.data_io_registration:

save_dataset(dataset: 'xr.Dataset | xr.DataArray', file_name: 'StrOrPath', format_name: 'str | None' = None, *, data_filters: 'list[str] | None' = None, allow_overwrite: 'bool' = False, update_source_path: 'bool' = True, **kwargs: 'Any') -> 'None'
    Save data from :xarraydoc:`Dataset` or :xarraydoc:`DataArray` to a file.

    Parameters
    ----------
    dataset : xr.Dataset | xr.DataArray
        Data to be written to file.
    file_name : StrOrPath
        File to write the data to.
    format_name : str
        Format the file should be in, if not provided it will be inferred from the file extension.
    data_filters : list[str] | None
        Optional list of items in the dataset to be saved.
    allow_overwrite : bool
        Whether or not to allow overwriting existing files, by default False
    update_source_path: bool
        Whether or not to update the ``source_path`` attribute to ``file_name`` when saving.
        by default True
    **kwargs : Any
        Additional keyword arguments passes to the ``write_dataset`` implementation
        of the data io plugin. If you aren't sure about those use ``get_datasaver``
        to get the implementation with the proper help and autocomplete.
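The help text above says that the format is inferred from the file extension if format_name isn’t provided. A minimal sketch of that rule could look like the following, assuming the format name is simply the extension without the leading dot. infer_format_name is a hypothetical helper written for this page, not a glotaran function.

```python
from pathlib import Path
from typing import Optional


def infer_format_name(file_name, format_name: Optional[str] = None) -> str:
    """Return ``format_name`` if given, else the file extension without the dot."""
    if format_name is not None:
        # An explicit format always wins over the file extension.
        return format_name
    extension = Path(file_name).suffix.lstrip(".")
    if not extension:
        raise ValueError(f"Cannot infer a format from {str(file_name)!r}, pass format_name.")
    return extension


print(infer_format_name("half_intensity.json"))
print(infer_format_name("half_intensity.txt", format_name="my_json"))
```

This is also why a file saved as half_intensity.json ends up in our json plugin without us naming the format explicitly.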

Since this is just an example and we aren’t overwriting any important data, we will use allow_overwrite=True. It also makes writing this documentation easier, since the test file doesn’t have to be deleted manually each time the cell is run.

[12]:
save_dataset(
    dataset, "half_intensity.json", allow_overwrite=True, my_extra_option="just as an example"
)
Using my extra option for writing json: just as an example
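To illustrate what an allow_overwrite guard protects against, here is a small stand-alone sketch. write_json is a hypothetical helper, and raising FileExistsError is an assumption for this example; it only mirrors the idea of refusing to replace an existing file unless explicitly told to.

```python
import json
import tempfile
from pathlib import Path


def write_json(data, file_name, *, allow_overwrite=False):
    """Write ``data`` as json, refusing to replace an existing file by default."""
    path = Path(file_name)
    if path.exists() and not allow_overwrite:
        raise FileExistsError(f"{path} already exists, pass allow_overwrite=True to replace it.")
    path.write_text(json.dumps(data))


with tempfile.TemporaryDirectory() as tmp_dir:
    target = Path(tmp_dir) / "half_intensity.json"
    # The first write succeeds because the file doesn't exist yet.
    write_json({"data": [1, 2]}, target)
    # The second write is refused because the file is already there.
    try:
        write_json({"data": [3, 4]}, target)
        second_write_refused = False
    except FileExistsError:
        second_write_refused = True
    # Only an explicit opt-in replaces the existing content.
    write_json({"data": [3, 4]}, target, allow_overwrite=True)
    final_content = json.loads(target.read_text())

print(second_write_refused, final_content)
```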

Now let’s test our data loading functionality.

[13]:
reloaded_data = load_dataset("half_intensity.json", my_extra_option="just as an example")
reloaded_data
Using my extra option loading json: just as an example
[13]:
<xarray.Dataset> Size: 1MB
Dimensions:   (time: 2100, spectral: 72)
Coordinates:
  * time      (time) float64 17kB -1.0 -0.99 -0.98 -0.97 ... 19.97 19.98 19.99
  * spectral  (spectral) float64 576B 600.0 601.4 602.8 ... 696.6 698.0 699.4
Data variables:
    data      (time, spectral) float64 1MB 0.002604 0.004432 ... 1.274 1.155
Attributes:
    source_path:  half_intensity.json
    loader:       <function load_dataset at 0x7f3c44199c60>
[14]:
reloaded_plot_data = reloaded_data.data.sel(spectral=[620, 630, 650], method="nearest")
reloaded_plot_data.plot.line(x="time", aspect=2, size=5)
[14]:
[<matplotlib.lines.Line2D at 0x7f3c3077a020>,
 <matplotlib.lines.Line2D at 0x7f3c3077a050>,
 <matplotlib.lines.Line2D at 0x7f3c3077a140>]
[Figure: line plot of the reloaded data traces over time at the spectral values nearest to 620, 630 and 650]

This looks like the plot above, but with half the amplitudes, so writing and reading our data worked as we hoped it would.
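Comparing plots by eye is fine for a tutorial, but a numeric check is more reliable. The sketch below round-trips a small synthetic dataset (made up for this example, not the quickstart data) through json the same way our plugin does, and compares the result with xarray.testing.assert_allclose, which raises an AssertionError if the values or coordinates differ.

```python
import json
import tempfile
from pathlib import Path

import numpy as np
import xarray as xr

# Small synthetic dataset with the same dimension names as the example data.
original = xr.Dataset(
    {"data": (("time", "spectral"), np.arange(6, dtype=float).reshape(2, 3))},
    coords={"time": [0.0, 0.1], "spectral": [600.0, 610.0, 620.0]},
)

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = Path(tmp_dir) / "roundtrip.json"
    # Same steps as in the plugin: Dataset -> dict -> json file -> dict -> Dataset
    json_path.write_text(json.dumps(original.to_dict()))
    reloaded = xr.Dataset.from_dict(json.loads(json_path.read_text()))

# Raises an AssertionError if data, dimensions or coordinates don't match.
xr.testing.assert_allclose(original, reloaded)
print("round trip ok")
```

Note that assert_allclose doesn’t compare attributes, so attributes added while loading (like source_path above) wouldn’t make such a check fail.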

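Writing a ProjectIo plugin works analogously:

=================  ============================================================  ==================================================================
                   DataIo plugin                                                 ProjectIo plugin
=================  ============================================================  ==================================================================
Register function  glotaran.plugin_system.data_io_registration.register_data_io  glotaran.plugin_system.project_io_registration.register_project_io
Baseclass          glotaran.io.interface.DataIoInterface                         glotaran.io.interface.ProjectIoInterface
Possible methods   load_dataset, save_dataset                                    load_model, save_model, load_parameters, save_parameters,
                                                                                 load_scheme, save_scheme, load_result, save_result
=================  ============================================================  ==================================================================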

Of course you don’t have to implement all methods (sometimes that doesn’t even make sense), but only the ones you need.
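A plugin that only implements some of the methods is still a valid plugin. As a rough sketch of how the implemented methods of such a plugin could be detected, the following compares each method of a plugin class with the one on its interface. InterfaceSketch, ReadOnlyPlugin and supported_methods are hypothetical names for this illustration, and this isn’t necessarily how glotaran builds its plugin table.

```python
class InterfaceSketch:
    """Hypothetical interface whose methods raise until a plugin overrides them."""

    def load_dataset(self, file_name):
        raise NotImplementedError

    def save_dataset(self, dataset, file_name):
        raise NotImplementedError


class ReadOnlyPlugin(InterfaceSketch):
    """Plugin that only implements reading."""

    def load_dataset(self, file_name):
        return f"data from {file_name}"


def supported_methods(plugin_class, interface=InterfaceSketch):
    """Map each public interface method to True if the plugin overrides it."""
    return {
        name: getattr(plugin_class, name) is not getattr(interface, name)
        for name, member in vars(interface).items()
        if callable(member) and not name.startswith("_")
    }


print(supported_methods(ReadOnlyPlugin))
```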

Last but not least:

Chances are that if you need a plugin, someone else does too, so it would be awesome if you published it as open source, so the wheel isn’t reinvented over and over again.