{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How to Write your own Io plugin"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are all kinds of different data formats, so it is quite likely that your experimental setup uses a format which isn't yet supported by a `glotaran` plugin and want to write your own `DataIo` plugin to support this format.\n",
"\n",
"Since `json` is very common format (admittedly not for data, but in general) and python has builtin support for it we will use it as an example.\n",
"\n",
"First let's have a look which `DataIo` plugins are already installed and which functions they support."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from glotaran.io import data_io_plugin_table"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"data_io_plugin_table()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Looks like there isn't a `json` plugin installed yet, but maybe someone else did already write one, so have a look at the [`3rd party plugins` list in the user docsumentation](https://pyglotaran.readthedocs.io/en/latest/user_documentation/using_plugins.html) before you start writing your own plugin.\n",
"\n",
"For the sake of the example, we will write our `json` plugin even if there already exists one by the time you read this.\n",
"\n",
"First you need to import all needed libraries and functions.\n",
"\n",
"- `from __future__ import annotations`: needed to write python 3.10 typing syntax (`|`), even with a lower python version\n",
"- `json`,`xarray`: Needed for reading and writing itself\n",
"- `DataIoInterface`: needed to subclass from, this way you get the proper type and especially signature checking\n",
"- `register_data_io`: registers the DataIo plugin under the given `format_name`s"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from __future__ import annotations\n",
"\n",
"import json\n",
"\n",
"import xarray as xr\n",
"\n",
"from glotaran.io.interface import DataIoInterface\n",
"from glotaran.plugin_system.data_io_registration import register_data_io"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`DataIoInterface` has two methods we could implement `load_dataset` and `save_dataset`, which are used by the identically named functions in `glotaran.io`.\n",
"\n",
"We will just implement both for our example to be complete.\n",
"the quickest way to get started is to just copy over the code from `DataIoInterface` which already has the right signatures and some boilerplate docstrings, for the method arguments.\n",
"\n",
"If the default arguments aren't enough for your plugin and you need your methods to have additional option, you can just add those.\n",
"Note the `*` between `file_name` and `my_extra_option`, this tell python that `my_extra_option` is an [keyword only argument](https://www.python.org/dev/peps/pep-3102/) and [`mypy`](https://github.com/python/mypy) won't raise an `[override]` type error for changing the signature of the method.\n",
"To help others who might use your plugin and your future self, it is good practice to documents what each parameter does in the methods docstring, which will be accessed by the help function.\n",
"\n",
"Finally add the `@register_data_io` with the `format_name`'s you want to register the plugin to, in our case `json` and `my_json`.\n",
"\n",
"Pro tip: You don't need to implement the whole functionality inside of the method itself,"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"@register_data_io([\"json\", \"my_json\"])\n",
"class JsonDataIo(DataIoInterface):\n",
" \"\"\"My new shiny glotaran plugin for json data io\"\"\"\n",
"\n",
" def load_dataset(\n",
" self, file_name: str, *, my_extra_option: str = None\n",
" ) -> xr.Dataset | xr.DataArray:\n",
" \"\"\"Read json data to xarray.Dataset\n",
"\n",
"\n",
" Parameters\n",
" ----------\n",
" file_name : str\n",
" File containing the data.\n",
" my_extra_option: str\n",
" This argument is only for demonstration\n",
" \"\"\"\n",
" if my_extra_option is not None:\n",
" print(f\"Using my extra option loading json: {my_extra_option}\")\n",
"\n",
" with open(file_name) as json_file:\n",
" data_dict = json.load(json_file)\n",
" return xr.Dataset.from_dict(data_dict)\n",
"\n",
" def save_dataset(\n",
" self, dataset: xr.Dataset | xr.DataArray, file_name: str, *, my_extra_option=None\n",
" ):\n",
" \"\"\"Write xarray.Dataset to a json file\n",
"\n",
" Parameters\n",
" ----------\n",
" dataset : xr.Dataset\n",
" Dataset to be saved to file.\n",
" file_name : str\n",
" File to write the result data to.\n",
" my_extra_option: str\n",
" This argument is only for demonstration\n",
" \"\"\"\n",
" if my_extra_option is not None:\n",
" print(f\"Using my extra option for writing json: {my_extra_option}\")\n",
"\n",
" data_dict = dataset.to_dict()\n",
" with open(file_name, \"w\") as json_file:\n",
" json.dump(data_dict, json_file)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's verify that our new plugin was registered successfully under the `format_name`s `json` and `my_json`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"data_io_plugin_table()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's use the example data from the quickstart to test the reading and writing capabilities of our plugin."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from glotaran.io import load_dataset\n",
"from glotaran.io import save_dataset\n",
"from glotaran.testing.simulated_data.sequential_spectral_decay import DATASET as dataset"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dataset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To get a feeling for our data, let's plot some traces."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"plot_data = dataset.data.sel(spectral=[620, 630, 650], method=\"nearest\")\n",
"plot_data.plot.line(x=\"time\", aspect=2, size=5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since we want to see a difference of our saved and loaded data, we divide the amplitudes by 2 for no reason."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"dataset[\"data\"] = dataset.data / 2"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we changed the data, let's write them to a file.\n",
"\n",
"But in which order were the arguments again? And are there any additional option?\n",
"\n",
"Good thing we documented our new plugin, so we can just lookup the help."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from glotaran.io import show_data_io_method_help\n",
"\n",
"show_data_io_method_help(\"json\", \"save_dataset\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the __function__ `save_dataset` has additional arguments: \n",
"\n",
"- `format_name`: overwrites the inferred plugin selection\n",
"- `allow_overwrite`: Allows to overwrite existing files __(USE WITH CAUTION!!!)__"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"help(save_dataset)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since this is just an example and we don't overwrite important data we will use `allow_overwrite=True`.\n",
"Also it makes writing this documentation easier, not having to manually delete the test file each time you run the cell."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"save_dataset(\n",
" dataset, \"half_intensity.json\", allow_overwrite=True, my_extra_option=\"just as an example\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's test our data loading functionality."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"reloaded_data = load_dataset(\"half_intensity.json\", my_extra_option=\"just as an example\")\n",
"reloaded_data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"reloaded_plot_data = reloaded_data.data.sel(spectral=[620, 630, 650], method=\"nearest\")\n",
"reloaded_plot_data.plot.line(x=\"time\", aspect=2, size=5)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Since this looks like the above plot, but with half the amplitudes, so writing and reading our data worked as we hoped it would.\n",
"\n",
"Writing a `ProjectIo` plugin words analogous:\n",
"\n",
" | `DataIo` plugin | `ProjectIo` plugin \n",
"---|---|---\n",
"Register function| `glotaran.plugin_system.data_io_registration.register_data_io`| `glotaran.plugin_system.project_io_registration.register_project_io`\n",
"Baseclass | `glotaran.io.interface.DataIoInterface` | `glotaran.io.interface.DataIoInterface`\n",
"Possible methods| `load_dataset` ,
`save_dataset` | `load_model` ,
`save_model` ,
`load_parameters` ,
`save_parameters` ,
`load_scheme` ,
`save_scheme` ,
`load_result` ,
`save_result`\n",
"\n",
"Of course you don't have to implement all methods (sometimes that doesn't even make sense), but only the ones you need.\n",
"\n",
"Last but not least:\n",
"\n",
"Chances are that if you need a plugin someone else does too, so it would awesome if you would publish it open source, so the wheel isn't reinvented over and over again."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}