Welcome to pyglotaran’s documentation!
Introduction
Pyglotaran is a Python library for global analysis of time-resolved spectroscopy data. It is designed to provide a state-of-the-art modeling toolbox to researchers in a user-friendly manner.
Its features are:
user-friendly modeling with a custom YAML (*.yml) based modeling language
parameter optimization using variable projection and non-negative least-squares algorithms
easy to extend modeling framework
battle-hardened model and algorithms for fluorescence dynamics
built upon and fully integrated in the standard Python science stack (NumPy, SciPy, Jupyter)
A Note To Glotaran Users
Although closely related and developed in the same lab, pyglotaran is not a replacement for Glotaran - A GUI For TIMP. Pyglotaran only aims to provide the modeling and optimization framework and algorithms. It is of course possible to develop a new GUI which leverages the power of pyglotaran (contributions welcome).
The current ‘user-interface’ for pyglotaran is Jupyter Notebook. It is designed to seamlessly integrate in this environment and be compatible with all major visualization and data analysis tools in the scientific python environment.
If you are a non-technical user, you should give these tools a try; there are numerous tutorials on how to use them. You don't really need to learn to program: if you can use e.g. Matlab or Mathematica, you can use Jupyter and Python.
Installation
Prerequisites
Python 3.6 or later
Windows
The easiest way of getting Python (and some basic tools to work with it) on Windows is to use Anaconda.
You will need a terminal for the installation. One is provided by Anaconda and is called Anaconda Console. You can find it in the start menu.
Note
If you use a Windows Shell like cmd.exe or PowerShell, you might have to prefix ‘$PATH_TO_ANACONDA/’ to all commands (e.g. C:/Anaconda/pip.exe instead of pip)
Stable release
Warning
pyglotaran is in early development, so for the moment stable releases are sparse and outdated. We try to keep the master branch stable, so please install from source for now.
This is the preferred method to install pyglotaran, as it will always install the most recent stable release.
To install pyglotaran, run this command in your terminal:
$ pip install pyglotaran
If you don't have pip installed, this Python installation guide can help you through the process.
If you want to install it via conda, you can run the following command:
$ conda install -c conda-forge pyglotaran
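You can verify the installation by printing the version number (the version lives in glotaran/__init__.py, as noted in the Deploying section below):

import glotaran

print(glotaran.__version__)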
From sources
First you have to install or update some dependencies.
Within a terminal:
$ pip install -U numpy scipy Cython
Alternatively, for Anaconda users:
$ conda install numpy scipy Cython
Afterwards you can simply use pip to install it directly from GitHub.
$ pip install git+https://github.com/glotaran/pyglotaran.git
For updating pyglotaran, just re-run the command above.
If you prefer to manually download the source files, you can find them on Github. Alternatively you can clone them with git (preferred):
$ git clone https://github.com/glotaran/pyglotaran.git
Within a terminal, navigate to the directory where you have unpacked or cloned the code and enter
$ pip install -e .
For updating, simply download and unpack the newest version (or run $ git pull in the pyglotaran directory if you used git) and re-run the command above.
Quickstart/Cheat-Sheet
Since this documentation is written in a Jupyter notebook, we will import a little IPython helper function to display files with syntax highlighting.
[1]:
from glotaran.utils.ipython import display_file
To start using pyglotaran in your project, you have to import it first. In addition, we need to import some extra components for later use.
[2]:
from glotaran.analysis.optimize import optimize
from glotaran.io import load_model
from glotaran.io import load_parameters
from glotaran.io import save_dataset
from glotaran.io.prepare_dataset import prepare_time_trace_dataset
from glotaran.project.scheme import Scheme
Let us get some example data to analyze:
[3]:
from glotaran.examples.sequential import dataset
dataset
[3]:
<xarray.Dataset>
Dimensions:   (time: 2100, spectral: 72)
Coordinates:
  * time      (time) float64 -1.0 -0.99 -0.98 -0.97 ... 19.96 19.97 19.98 19.99
  * spectral  (spectral) float64 600.0 601.4 602.8 604.2 ... 696.6 698.0 699.4
Data variables:
    data      (time, spectral) float64 -0.008312 -0.01426 ... 1.715 1.533
Like all data in pyglotaran, the dataset is an xarray.Dataset. You can find more information about the xarray library on the xarray homepage. The loaded dataset is a simulated dataset based on a sequential model.
Plotting raw data
Now let us plot some time traces.
[4]:
plot_data = dataset.data.sel(spectral=[620, 630, 650], method="nearest")
plot_data.plot.line(x="time", aspect=2, size=5);
We can also plot spectra at different times.
[5]:
plot_data = dataset.data.sel(time=[1, 10, 20], method="nearest")
plot_data.plot.line(x="spectral", aspect=2, size=5);
Preparing data
To get an idea about how to model your data, you should inspect the singular value decomposition. Pyglotaran has a function to calculate it (among other things).
[6]:
dataset = prepare_time_trace_dataset(dataset)
dataset
[6]:
<xarray.Dataset>
Dimensions:                      (time: 2100, spectral: 72, left_singular_value_index: 72, singular_value_index: 72, right_singular_value_index: 72)
Coordinates:
  * time                         (time) float64 -1.0 -0.99 -0.98 ... 19.98 19.99
  * spectral                     (spectral) float64 600.0 601.4 ... 698.0 699.4
Dimensions without coordinates: left_singular_value_index, singular_value_index, right_singular_value_index
Data variables:
    data                         (time, spectral) float64 -0.008312 ... 1.533
    data_left_singular_vectors   (time, left_singular_value_index) float64 -7...
    data_singular_values         (singular_value_index) float64 4.62e+03 ... ...
    data_right_singular_vectors  (right_singular_value_index, spectral) float64 ...
First, take a look at the first 10 singular values:
[7]:
plot_data = dataset.data_singular_values.sel(singular_value_index=range(0, 10))
plot_data.plot(yscale="log", marker="o", linewidth=0, aspect=2, size=5);
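The corresponding first left and right singular vectors can be plotted the same way; a minimal sketch using the variables that prepare_time_trace_dataset added above:

# Plot the first left (time) and right (spectral) singular vectors of the data:
dataset.data_left_singular_vectors.sel(left_singular_value_index=0).plot.line(x="time", aspect=2, size=5)
dataset.data_right_singular_vectors.sel(right_singular_value_index=0).plot.line(x="spectral", aspect=2, size=5);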
Working with models
To analyze our data, we need to create a model.
Create a file called model.yaml in your working directory and fill it with the following:
[8]:
display_file("model.yaml", syntax="yaml")
[8]:
type: kinetic-spectrum
initial_concentration:
  input:
    compartments: [s1, s2, s3]
    parameters: [input.1, input.0, input.0]
k_matrix:
  k1:
    matrix:
      (s2, s1): kinetic.1
      (s3, s2): kinetic.2
      (s3, s3): kinetic.3
megacomplex:
  m1:
    k_matrix: [k1]
irf:
  irf1:
    type: gaussian
    center: irf.center
    width: irf.width
dataset:
  dataset1:
    initial_concentration: input
    megacomplex: [m1]
    irf: irf1
Now you can load the model file.
[9]:
model = load_model("model.yaml")
You can check your model for problems with model.validate.
[10]:
model.validate()
[10]:
'Your model is valid.'
Working with parameters
Now define some starting parameters. Create a file called parameters.yaml with the following content.
[11]:
display_file("parameters.yaml", syntax="yaml")
[11]:
input:
  - ['1', 1, {'vary': False, 'non-negative': False}]
  - ['0', 0, {'vary': False, 'non-negative': False}]
kinetic: [
    0.5,
    0.3,
    0.1,
]
irf:
  - ['center', 0.3]
  - ['width', 0.1]
[12]:
parameters = load_parameters("parameters.yaml")
You can also use model.validate to check for missing parameters.
[13]:
model.validate(parameters=parameters)
[13]:
'Your model is valid.'
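You can also inspect single parameters programmatically; a small sketch, assuming the ParameterGroup.get accessor which takes the full parameter label:

# Look up the starting value of the first kinetic rate parameter:
print(parameters.get("kinetic.1").value)  # 0.5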
Since not all problems in the model can be detected automatically it is wise to visually inspect the model. For this purpose, you can just print the model.
[14]:
model
[14]:
Model

Type: kinetic-spectrum

Initial Concentration

input:
  Label: input
  Compartments: ['s1', 's2', 's3']
  Parameters: [input.1, input.0, input.0]
  Exclude From Normalize: []

K Matrix

k1:
  Label: k1
  Matrix:
    ('s2', 's1'): kinetic.1
    ('s3', 's2'): kinetic.2
    ('s3', 's3'): kinetic.3

Irf

irf1 (gaussian):
  Label: irf1
  Type: gaussian
  Center: irf.center
  Width: irf.width
  Normalize: True
  Backsweep: False

Dataset

dataset1:
  Label: dataset1
  Megacomplex: ['m1']
  Initial Concentration: input
  Irf: irf1

Megacomplex

m1 (None):
  Label: m1
  K Matrix: ['k1']
The same way you should inspect your parameters.
[15]:
parameters
[15]:
input:

| Label | Value | StdErr | Min  | Max | Vary  | Non-Negative | Expr |
|-------|-------|--------|------|-----|-------|--------------|------|
| 1     | 1     | 0      | -inf | inf | False | False        | None |
| 0     | 0     | 0      | -inf | inf | False | False        | None |

irf:

| Label  | Value | StdErr | Min  | Max | Vary | Non-Negative | Expr |
|--------|-------|--------|------|-----|------|--------------|------|
| center | 0.3   | 0      | -inf | inf | True | False        | None |
| width  | 0.1   | 0      | -inf | inf | True | False        | None |

kinetic:

| Label | Value | StdErr | Min  | Max | Vary | Non-Negative | Expr |
|-------|-------|--------|------|-----|------|--------------|------|
| 1     | 0.5   | 0      | -inf | inf | True | False        | None |
| 2     | 0.3   | 0      | -inf | inf | True | False        | None |
| 3     | 0.1   | 0      | -inf | inf | True | False        | None |
Optimizing data
Now we have everything together to optimize our parameters: we create a Scheme from the model, parameters and data, and pass it to optimize.
[16]:
scheme = Scheme(model, parameters, {"dataset1": dataset})
result = optimize(scheme)
result
Iteration Total nfev Cost Cost reduction Step norm Optimality
0 1 7.5712e+00 1.36e+02
1 2 7.5710e+00 1.95e-04 1.97e-05 1.16e-02
2 3 7.5710e+00 1.38e-12 3.77e-09 2.27e-06
Both `ftol` and `xtol` termination conditions are satisfied.
Function evaluations 3, initial cost 7.5712e+00, final cost 7.5710e+00, first-order optimality 2.27e-06.
[16]:
| Optimization Result           |          |
|-------------------------------|----------|
| Number of residual evaluation | 3        |
| Number of variables           | 5        |
| Number of datapoints          | 151200   |
| Degrees of freedom            | 151195   |
| Chi Square                    | 1.51e+01 |
| Reduced Chi Square            | 1.00e-04 |
| Root Mean Square Error (RMSE) | 1.00e-02 |
Model

Type: kinetic-spectrum

Initial Concentration

input:
  Label: input
  Compartments: ['s1', 's2', 's3']
  Parameters: [input.1: 1.00000e+00 (fixed), input.0: 0.00000e+00 (fixed), input.0: 0.00000e+00 (fixed)]
  Exclude From Normalize: []

K Matrix

k1:
  Label: k1
  Matrix:
    ('s2', 's1'): kinetic.1: 4.99982e-01 (StdErr: 7e-05, initial: 5.00000e-01)
    ('s3', 's2'): kinetic.2: 2.99994e-01 (StdErr: 4e-05, initial: 3.00000e-01)
    ('s3', 's3'): kinetic.3: 1.00005e-01 (StdErr: 5e-06, initial: 1.00000e-01)

Irf

irf1 (gaussian):
  Label: irf1
  Type: gaussian
  Center: irf.center: 2.99998e-01 (StdErr: 5e-06, initial: 3.00000e-01)
  Width: irf.width: 1.00000e-01 (StdErr: 7e-06, initial: 1.00000e-01)
  Normalize: True
  Backsweep: False

Dataset

dataset1:
  Label: dataset1
  Megacomplex: ['m1']
  Initial Concentration: input
  Irf: irf1

Megacomplex

m1 (None):
  Label: m1
  K Matrix: ['k1']
[17]:
result.optimized_parameters
[17]:
input:

| Label | Value | StdErr | Min  | Max | Vary  | Non-Negative | Expr |
|-------|-------|--------|------|-----|-------|--------------|------|
| 1     | 1     | 0      | -inf | inf | False | False        | None |
| 0     | 0     | 0      | -inf | inf | False | False        | None |

irf:

| Label  | Value    | StdErr      | Min  | Max | Vary | Non-Negative | Expr |
|--------|----------|-------------|------|-----|------|--------------|------|
| center | 0.299998 | 5.01464e-06 | -inf | inf | True | False        | None |
| width  | 0.1      | 6.70888e-06 | -inf | inf | True | False        | None |

kinetic:

| Label | Value    | StdErr      | Min  | Max | Vary | Non-Negative | Expr |
|-------|----------|-------------|------|-----|------|--------------|------|
| 1     | 0.499982 | 7.26317e-05 | -inf | inf | True | False        | None |
| 2     | 0.299994 | 4.19618e-05 | -inf | inf | True | False        | None |
| 3     | 0.100005 | 4.78474e-06 | -inf | inf | True | False        | None |
You can get the resulting data for your dataset with result.data.
[18]:
result_dataset = result.data["dataset1"]
result_dataset
[18]:
<xarray.Dataset>
Dimensions:                       (time: 2100, spectral: 72, left_singular_value_index: 72, singular_value_index: 72, right_singular_value_index: 72, clp_label: 3, species: 3, component: 3, to_species: 3, from_species: 3)
Coordinates:
  * time                          (time) float64 -1.0 ... 19.99
  * spectral                      (spectral) float64 600.0 ... 699.4
  * clp_label                     (clp_label) <U2 's1' 's2' 's3'
  * species                       (species) <U2 's1' 's2' 's3'
    rate                          (component) float64 -0.5 -0.3 -0.1
    lifetime                      (component) float64 -2.0 ... -10.0
  * to_species                    (to_species) <U2 's1' 's2' 's3'
  * from_species                  (from_species) <U2 's1' 's2' 's3'
Dimensions without coordinates: left_singular_value_index, singular_value_index, right_singular_value_index, component
Data variables: (12/23)
    data                          (time, spectral) float64 -0.008...
    data_left_singular_vectors    (time, left_singular_value_index) float64 ...
    data_singular_values          (singular_value_index) float64 ...
    data_right_singular_vectors   (spectral, right_singular_value_index) float64 ...
    matrix                        (time, clp_label) float64 6.097...
    clp                           (spectral, clp_label) float64 1...
    ...                            ...
    decay_associated_spectra      (spectral, component) float64 2...
    a_matrix                      (component, species) float64 1....
    k_matrix                      (to_species, from_species) float64 ...
    k_matrix_reduced              (to_species, from_species) float64 ...
    irf_center                    float64 0.3
    irf_width                     float64 0.1
Attributes:
    root_mean_square_error:           0.010007267157487098
    weighted_root_mean_square_error:  0.010007267157487098
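The fit quality metrics listed under Attributes above are stored on the result dataset, so they can be read programmatically:

# Read the root mean square error from the dataset attributes:
print(result_dataset.attrs["root_mean_square_error"])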
Visualize the Result
The resulting data can be visualized the same way as the dataset. To judge the quality of the fit, you should look at the first left and right singular vectors of the residual.
[19]:
residual_left = result_dataset.residual_left_singular_vectors.sel(left_singular_value_index=0)
residual_right = result_dataset.residual_right_singular_vectors.sel(right_singular_value_index=0)
residual_left.plot.line(x="time", aspect=2, size=5)
residual_right.plot.line(x="spectral", aspect=2, size=5);
Finally, you can save your result.
[20]:
save_dataset(result_dataset, "dataset1.nc")
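The saved file can be loaded again later with load_dataset from glotaran.io:

from glotaran.io import load_dataset

# Reload the previously saved result dataset from the NetCDF file:
result_dataset = load_dataset("dataset1.nc")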
Changelog
0.4.2 (2021-12-31)
🩹 Bug fixes
🩹🚧 Backport of bugfix #927 discovered in PR #860 related to initial_concentration normalization when saving results (#935).
🚧 Maintenance
0.4.1 (2021-09-07)
✨ Features
Integration test result validation (#760)
🩹 Bug fixes
Fix unintended saving of sub-optimal parameters (0ece818, backport from #747)
Improve ordering in k_matrix involved_compartments function (#791)
0.4.0 (2021-06-25)
✨ Features
Add basic spectral model (#672)
Add Channel/Wavelength dependent shift parameter to irf. (#673)
Refactored Problem class into GroupedProblem and UngroupedProblem (#681)
Plugin system was rewritten (#600, #665)
Deprecation framework (#631)
Better notebook integration (#689)
🩹 Bug fixes
Fix excessive memory usage in _create_svd (#576)
Fix several issues with KineticImage model (#612)
Fix exception in sdt reader index calculation (#647)
Avoid crash in result markdown printing when optimization fails (#630)
ParameterNotFoundException doesn’t prepend ‘.’ if path is empty (#688)
Ensure Parameter.label is str or None (#678)
Properly scale StdError of estimated parameters with RMSE (#704)
More robust covariance_matrix calculation (#706)
ParameterGroup.markdown() output independent of order of parameter groups (#592)
🔌 Plugins
ProjectIo 'folder'/'legacy' plugin to save results (#620)
Model 'spectral-model' (#672)
📚 Documentation
User documentation is written in notebooks (#568)
Documentation on how to write a DataIo plugin (#600)
🗑️ Deprecations (due in 0.6.0)
glotaran.ParameterGroup -> glotaran.parameter.ParameterGroup
glotaran.read_model_from_yaml -> glotaran.io.load_model(..., format_name="yaml_str")
glotaran.read_model_from_yaml_file -> glotaran.io.load_model(..., format_name="yaml")
glotaran.read_parameters_from_csv_file -> glotaran.io.load_parameters(..., format_name="csv")
glotaran.read_parameters_from_yaml -> glotaran.io.load_parameters(..., format_name="yaml_str")
glotaran.read_parameters_from_yaml_file -> glotaran.io.load_parameters(..., format_name="yaml")
glotaran.io.read_data_file -> glotaran.io.load_dataset
result.save -> glotaran.io.save_result(result, ..., format_name="legacy")
result.get_dataset("<dataset_name>") -> result.data["<dataset_name>"]
glotaran.analysis.result -> glotaran.project.result
glotaran.analysis.scheme -> glotaran.project.scheme
model.simulate -> glotaran.analysis.simulation.simulate(model, ...)
0.3.3 (2021-03-18)
Force recalculation of SVD attributes in scheme._prepare_data (#597)
Remove unneeded check in spectral_penalties._get_area (fixes #598)
Added Python 3.9 support (#450)
0.3.2 (2021-02-28)
Re-release of version 0.3.1 due to packaging issue
0.3.1 (2021-02-28)
Added compatibility for numpy 1.20 and raised minimum required numpy version to 1.20 (#555)
Fixed excessive memory consumption in result creation due to full SVD computation (#574)
Added feature parameter history (#557)
Moved setup logic to setup.cfg (#560)
0.3.0 (2021-02-11)
Significant code refactor with small API changes to parameter relation specification (see docs)
Replaced lmfit with scipy.optimize
0.2.0 (2020-12-02)
Large refactor with significant improvements but also small API changes (see docs)
Removed doas plugin
0.1.0 (2020-07-14)
Package was renamed to pyglotaran on PyPI
0.0.8 (2018-08-07)
Changed nan_policy to omit
0.0.7 (2018-08-07)
Added support for multiple shapes per compartment.
0.0.6 (2018-08-07)
First release on PyPI, support for Windows installs added.
Pre-Alpha Development
API Documentation
The API Documentation for pyglotaran is automatically created from its docstrings.
Plugins
To be as flexible as possible, pyglotaran uses a plugin system to handle new Models, DataIo and ProjectIo.
Those plugins can be defined by pyglotaran itself, the user, or a 3rd party plugin package.
Builtin plugins
Models
KineticSpectrumModel
KineticImageModel
Data Io
Plugins reading and writing data to and from xarray.Dataset or xarray.DataArray.
AsciiDataIo
NetCDFDataIo
SdtDataIo
Project Io
Plugins reading and writing Model, Scheme, ParameterGroup or Result.
YmlProjectIo
CsvProjectIo
FolderProjectIo
Reproducibility and plugins
With a plugin ecosystem there is always the possibility that multiple plugins try to register under the same format/name. This is why plugins are registered at least twice: once under the name the developer intended, and secondly under their full name (full import path). This makes it possible to ensure that a specific plugin is used by manually specifying it, so if someone wants to run your analysis, the results will be reproducible even if they have conflicting plugins installed.
You can gain all information about the installed plugins by calling the corresponding *_plugin_table function with both options (plugin_names and full_names) set to true.
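For example, for data io plugins (a short sketch using the two options named above):

from glotaran.io import data_io_plugin_table

# List all data io plugins with both their registered names and full import paths:
data_io_plugin_table(plugin_names=True, full_names=True)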
To pin a used plugin, use the corresponding set_*_plugin function with the intended name (format_name/model_name) and the full name (full_plugin_name) of the plugin to use.
If you wanted to ensure that the pyglotaran builtin plugin is used for sdt files, you could add the following lines to the beginning of your analysis code.
from glotaran.io import set_data_plugin
set_data_plugin("sdt", "glotaran.builtin.io.sdt.sdt_file_reader.SdtDataIo_sdt")
Models
The functions for model plugins are located in glotaran.model
and called model_plugin_table
and set_model_plugin
.
Data Io
The functions for data io plugins are located in glotaran.io
and called data_io_plugin_table
and set_data_plugin
.
Project Io
The functions for project io plugins are located in glotaran.io
and called project_io_plugin_table
and set_project_plugin
.
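Put together, a minimal sketch that prints an overview of all installed plugins, using the functions named in the three sections above:

from glotaran.model import model_plugin_table
from glotaran.io import data_io_plugin_table
from glotaran.io import project_io_plugin_table

# One overview table per plugin category:
model_plugin_table()
data_io_plugin_table()
project_io_plugin_table()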
3rd party plugins
Plugins not part of pyglotaran itself.
Not yet, why not be the first? Tell us about your plugin and we will feature it here.
Contributing
Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.
You can contribute in many ways:
Types of Contributions
Report Bugs
Report bugs at https://github.com/glotaran/pyglotaran/issues.
If you are reporting a bug, please include:
Your operating system name and version.
Any details about your local setup that might be helpful in troubleshooting.
Detailed steps to reproduce the bug.
Fix Bugs
Look through the GitHub issues for bugs. Anything tagged with “bug” and “help wanted” is open to whoever wants to implement it.
Implement Features
Look through the GitHub issues for features. Anything tagged with “enhancement” and “help wanted” is open to whoever wants to implement it.
Write Documentation
pyglotaran could always use more documentation, whether as part of the official pyglotaran docs, in docstrings, or even on the web in blog posts, articles, and such. If you are writing docstrings please use the NumPyDoc style to write them.
Submit Feedback
The best way to send feedback is to file an issue at https://github.com/glotaran/pyglotaran/issues.
If you are proposing a feature:
Explain in detail how it would work.
Keep the scope as narrow as possible, to make it easier to implement.
Remember that this is a volunteer-driven project, and that contributions are welcome :)
Get Started!
Ready to contribute? Here's how to set up pyglotaran for local development.
Fork the pyglotaran repo on GitHub.
Clone your fork locally:
$ git clone https://github.com/<your_name_here>/pyglotaran.git
Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, this is how you set up your fork for local development:
$ mkvirtualenv pyglotaran
(pyglotaran)$ cd pyglotaran
(pyglotaran)$ python -m pip install -r requirements_dev.txt
(pyglotaran)$ pip install -e . --process-dependency-links
Install the pre-commit hooks, to automatically format and check your code:
$ pre-commit install
Create a branch for local development:
$ git checkout -b name-of-your-bugfix-or-feature
Now you can make your changes locally.
When you’re done making changes, check that your changes pass flake8 and the tests, including testing other Python versions with tox:
$ pre-commit run -a
$ py.test
Or to run all at once:
$ tox
Commit your changes and push your branch to GitHub:
$ git add .
$ git commit -m "Your detailed description of your changes."
$ git push origin name-of-your-bugfix-or-feature
Submit a pull request through the GitHub website.
Note
By default pull requests will use the template located at .github/PULL_REQUEST_TEMPLATE.md. But we also provide custom tailored templates located inside of .github/PULL_REQUEST_TEMPLATE.
Sadly the GitHub Web Interface doesn't provide an easy way to select them as it does for issue templates (see this comment for more details). To use them you need to add the following query parameters to the URL when creating the pull request and hit enter:
✨ Feature PR: ?expand=1&template=feature_PR.md
🩹 Bug Fix PR: ?expand=1&template=bug_fix_PR.md
📚 Documentation PR: ?expand=1&template=docs_PR.md
Pull Request Guidelines
Before you submit a pull request, check that it meets these guidelines:
The pull request should include tests.
If the pull request adds functionality, the docs should be updated. Put your new functionality into a function with a docstring.
The pull request should work for Python 3.8 and 3.9. Check your GitHub Actions at https://github.com/<your_name_here>/pyglotaran/actions and make sure that the tests pass for all supported Python versions.
Docstrings
We use numpy style docstrings, which can also be autogenerated from function/method signatures by extensions for your editor.
Extensions that autogenerate docstring stubs exist for most popular editors.
Note
If your pull request improves the docstring coverage (check pre-commit run -a interrogate), please raise the value of the interrogate setting fail-under in pyproject.toml. That way the next person will improve the docstring coverage as well, and everyone can enjoy better documentation.
Warning
As soon as all our docstrings are in proper shape we will enforce that it stays that way. If you want to check if your docstrings are fine you can use pydocstyle and darglint.
Tips
To run a subset of tests:
$ py.test tests.test_pyglotaran
Deprecations
Only maintainers are allowed to decide about deprecations, so you should first open an issue and check with them whether they are ok with deprecating something.
To make deprecations as robust as possible and give users all needed information to adjust their code, we provide helper functions inside the module glotaran.deprecation.
The functions you most likely want to use are:
deprecate() for functions, methods and classes
warn_deprecated() for call arguments
deprecate_module_attribute() for module attributes
deprecate_submodule() for modules
Those functions not only make it easier to deprecate something, but they also check that deprecations will be removed when they are due and that at least the imports in the warning work. Thus all deprecations need to be tested.
Tests for deprecations should be placed in glotaran/deprecation/modules/test, which also provides the test helper functions deprecation_warning_on_call_test_helper and changed_import_test_warn.
Since the tests for deprecation are mainly for maintainability and not to test the functionality (those tests should be in the appropriate place), deprecation_warning_on_call_test_helper will by default just test that a DeprecationWarning was raised and ignore all raised exceptions.
An exception to this rule is when adding back removed functionality (which shouldn't happen in the first place, but might). Those tests should be implemented in a file under glotaran/deprecation/modules, with filenames like the relative import path from the glotaran root but with _ instead of .; e.g. glotaran.analysis.scheme would map to analysis_scheme.py. The only exceptions to this rule are the root __init__.py, which is named glotaran_root.py, and testing changed imports, which should be placed in test_changed_imports.py.
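As a sketch, a test for a function deprecated in glotaran.analysis.scheme would live in analysis_scheme.py and could look roughly like this (the helper's exact signature is assumed; check its docstring):

from glotaran.deprecation.modules.test import deprecation_warning_on_call_test_helper

def test_deprecated_function():
    from glotaran.some_module import function_to_deprecate  # hypothetical example

    # Only asserts that a DeprecationWarning was raised; raised exceptions
    # are ignored by default, as described above.
    deprecation_warning_on_call_test_helper(function_to_deprecate)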
Deprecating a Function, method or class
Deprecating a function, method or class is as easy as adding the deprecate decorator to it. Other decorators (e.g. @staticmethod or @classmethod) should be placed above deprecate in order to work.
from glotaran.deprecation import deprecate

@deprecate(
    deprecated_qual_name_usage="glotaran.some_module.function_to_deprecate(filename)",
    new_qual_name_usage='glotaran.some_module.new_function(filename, format_name="legacy")',
    to_be_removed_in_version="0.6.0",
)
def function_to_deprecate(*args, **kwargs):
    ...
Deprecating a call argument
When deprecating a call argument you should use warn_deprecated and set the argument to deprecate to a default value (e.g. "deprecated") to check against. Note that for this use case we need to set check_qual_names=(False, False), which will deactivate the import testing. This might not always be possible, e.g. if the argument is positional only, so it might make more sense to deprecate the whole callable; just discuss what to do with our trusted maintainers.
from glotaran.deprecation import warn_deprecated

def function_to_deprecate(args1, new_arg="new_default_behavior", deprecated_arg="deprecated", **kwargs):
    if deprecated_arg != "deprecated":
        warn_deprecated(
            deprecated_qual_name_usage="deprecated_arg",
            new_qual_name_usage='new_arg="legacy"',
            to_be_removed_in_version="0.6.0",
            check_qual_names=(False, False),
        )
        new_arg = "legacy"
    ...
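Calling the function with the deprecated argument then emits the warning and falls back to the new behavior:

# Emits a DeprecationWarning because deprecated_arg deviates from its sentinel value:
function_to_deprecate("some_file.txt", deprecated_arg="legacy")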
Deprecating a module attribute
Sometimes it might be necessary to remove an attribute (function, class, or constant)
from a module to prevent circular imports or just to streamline the API.
In those cases you would use deprecate_module_attribute inside a module __getattr__ function definition. This will import the attribute from the new location and return it when an import or use is requested.
def __getattr__(attribute_name: str):
    from glotaran.deprecation import deprecate_module_attribute

    if attribute_name == "deprecated_attribute":
        return deprecate_module_attribute(
            deprecated_qual_name="glotaran.old_package.deprecated_attribute",
            new_qual_name="glotaran.new_package.new_attribute_name",
            to_be_removed_in_version="0.6.0",
        )
    raise AttributeError(f"module {__name__} has no attribute {attribute_name}")
Deprecating a submodule
For a better logical structure, it might be needed to move modules to a different location in the project. In those cases, you would use deprecate_submodule, which imports the module from the new location, adds it to sys.modules and as an attribute to the parent package.
from glotaran.deprecation import deprecate_submodule

module_name = deprecate_submodule(
    deprecated_module_name="glotaran.old_package.module_name",
    new_module_name="glotaran.new_package.new_module_name",
    to_be_removed_in_version="0.6.0",
)
Testing Result consistency
To test the consistency of results locally you need to clone the pyglotaran-examples and run them:
$ git clone https://github.com/glotaran/pyglotaran-examples
$ cd pyglotaran-examples
$ python scripts/run_examples.py run-all --headless
Note
Make sure you got the latest version (git pull) and are on the correct branch for both pyglotaran and pyglotaran-examples.
The results from the examples will be saved in your home folder under pyglotaran_examples_results. Those results will then be compared to the 'gold standard' defined by the maintainers.
To test the result consistency run:
$ pytest .github/test_result_consistency.py
If needed, this will clone the 'gold standard' results to the folder comparison-results, update them, and test your current results against them.
Deploying
A reminder for the maintainers on how to deploy.
Make sure all your changes are committed (including an entry in HISTORY.rst); the version number only needs to be changed in glotaran/__init__.py. Then make a new release on GitHub and give the tag a proper name, e.g. 0.3.0, since it might be included in a citation. GitHub Actions will then deploy to PyPI if the tests pass.
Plugin development
If you don't find the plugin that fits your needs, you can always write your own. This section will explain how, and what you need to know. In time we will also provide you with a cookiecutter template to kickstart your new plugin for publishing as a package on PyPI.
How to Write your own Io plugin
There are all kinds of different data formats, so it is quite likely that your experimental setup uses a format which isn't yet supported by a glotaran plugin, and you want to write your own DataIo plugin to support this format. Since json is a very common format (admittedly not for data, but in general) and Python has builtin support for it, we will use it as an example.
First let's have a look at which DataIo plugins are already installed and which functions they support.
[1]:
from glotaran.io import data_io_plugin_table
[2]:
data_io_plugin_table()
[2]:
| Format name | load_dataset | save_dataset |
|-------------|--------------|--------------|
| ascii       | *            | *            |
| nc          | *            | *            |
| sdt         | *            | /            |
Looks like there isn't a json plugin installed yet, but maybe someone else already wrote one, so have a look at the 3rd party plugins list in the user documentation (https://pyglotaran.readthedocs.io/en/latest/user_documentation/using_plugins.html) before you start writing your own plugin.
For the sake of the example, we will write our json plugin even if there already exists one by the time you read this.
First you need to import all needed libraries and functions.
from __future__ import annotations: needed to write Python 3.10 typing syntax (|), even with a lower Python version
json, xarray: needed for reading and writing itself
DataIoInterface: needed to subclass from; this way you get the proper type and especially signature checking
register_data_io: registers the DataIo plugin under the given format_names
[3]:
from __future__ import annotations
import json
import xarray as xr
from glotaran.io.interface import DataIoInterface
from glotaran.plugin_system.data_io_registration import register_data_io
DataIoInterface has two methods we could implement, load_dataset and save_dataset, which are used by the identically named functions in glotaran.io. We will implement both for our example to be complete. The quickest way to get started is to just copy over the code from DataIoInterface, which already has the right signatures and some boilerplate docstrings for the method arguments.
If the default arguments aren't enough for your plugin and you need your methods to have additional options, you can just add those. Note the * between file_name and my_extra_option; this tells Python that my_extra_option is a keyword-only argument, and mypy (https://github.com/python/mypy) won't raise an [override] type error for changing the signature of the method. To help others who might use your plugin, and your future self, it is good practice to document what each parameter does in the method's docstring, which will be accessed by the help function.
Finally, add the @register_data_io decorator with the format_names you want to register the plugin to, in our case json and my_json.
Pro tip: You don't need to implement the whole functionality inside of the method itself; you can also delegate the actual work to helper functions or another library and only use the method as a thin adapter.
[4]:
@register_data_io(["json", "my_json"])
class JsonDataIo(DataIoInterface):
    """My new shiny glotaran plugin for json data io"""

    def load_dataset(
        self, file_name: str, *, my_extra_option: str = None
    ) -> xr.Dataset | xr.DataArray:
        """Read json data to xarray.Dataset

        Parameters
        ----------
        file_name : str
            File containing the data.
        my_extra_option : str
            This argument is only for demonstration
        """
        if my_extra_option is not None:
            print(f"Using my extra option loading json: {my_extra_option}")

        with open(file_name) as json_file:
            data_dict = json.load(json_file)
        return xr.Dataset.from_dict(data_dict)

    def save_dataset(
        self, dataset: xr.Dataset | xr.DataArray, file_name: str, *, my_extra_option=None
    ):
        """Write xarray.Dataset to a json file

        Parameters
        ----------
        dataset : xr.Dataset
            Dataset to be saved to file.
        file_name : str
            File to write the result data to.
        my_extra_option : str
            This argument is only for demonstration
        """
        if my_extra_option is not None:
            print(f"Using my extra option for writing json: {my_extra_option}")

        data_dict = dataset.to_dict()
        with open(file_name, "w") as json_file:
            json.dump(data_dict, json_file)
Let's verify that our new plugin was registered successfully under the format_names json and my_json.
[5]:
data_io_plugin_table()
[5]:
| Format name | load_dataset | save_dataset |
|-------------|--------------|--------------|
| ascii       | *            | *            |
| json        | *            | *            |
| my_json     | *            | *            |
| nc          | *            | *            |
| sdt         | *            | /            |
Now let’s use the example data from the quickstart to test the reading and writing capabilities of our plugin.
[6]:
from glotaran.examples.sequential import dataset
from glotaran.io import load_dataset
from glotaran.io import save_dataset
[7]:
dataset
[7]:
<xarray.Dataset>
Dimensions:   (time: 2100, spectral: 72)
Coordinates:
  * time      (time) float64 -1.0 -0.99 -0.98 -0.97 ... 19.96 19.97 19.98 19.99
  * spectral  (spectral) float64 600.0 601.4 602.8 604.2 ... 696.6 698.0 699.4
Data variables:
    data      (time, spectral) float64 -0.00178 0.0028 -0.002776 ... 1.717 1.53
To get a feeling for our data, let’s plot some traces.
[8]:
plot_data = dataset.data.sel(spectral=[620, 630, 650], method="nearest")
plot_data.plot.line(x="time", aspect=2, size=5)
[8]:
[<matplotlib.lines.Line2D at 0x7f93c11461f0>,
<matplotlib.lines.Line2D at 0x7f93c1146100>,
<matplotlib.lines.Line2D at 0x7f93c1146790>]
Since we want to see a difference between our saved and loaded data, we divide the amplitudes by 2 for no reason.
[9]:
dataset["data"] = dataset.data / 2
Now that we changed the data, let’s write them to a file.
But in which order were the arguments again? And are there any additional options? Good thing we documented our new plugin, so we can just look up the help.
[10]:
from glotaran.io import show_data_io_method_help
show_data_io_method_help("json", "save_dataset")
Help on method save_dataset in module __main__:
save_dataset(dataset: 'xr.Dataset | xr.DataArray', file_name: 'str', *, my_extra_option=None) method of __main__.JsonDataIo instance
Write xarray.Dataset to a json file
Parameters
----------
dataset : xr.Dataset
Dataset to be saved to file.
file_name : str
File to write the result data to.
my_extra_option: str
This argument is only for demonstration
Note that the function save_dataset has additional arguments:
format_name: overwrites the inferred plugin selection
allow_overwrite: allows overwriting existing files (USE WITH CAUTION!!!)
[11]:
help(save_dataset)
Help on function save_dataset in module glotaran.plugin_system.data_io_registration:
save_dataset(dataset: 'xr.Dataset | xr.DataArray', file_name: 'str | PathLike[str]', format_name: 'str' = None, *, allow_overwrite: 'bool' = False, **kwargs: 'Any') -> 'None'
Save data from :xarraydoc:`Dataset` or :xarraydoc:`DataArray` to a file.
Parameters
----------
dataset : xr.Dataset | xr.DataArray
Data to be written to file.
file_name : str | PathLike[str]
File to write the data to.
format_name : str
Format the file should be in, if not provided it will be inferred from the file extension.
allow_overwrite : bool
Whether or not to allow overwriting existing files, by default False
**kwargs : Any
Additional keyword arguments passes to the ``write_dataset`` implementation
of the data io plugin. If you aren't sure about those use ``get_datawriter``
to get the implementation with the proper help and autocomplete.
Since this is just an example and we don't overwrite important data, we will use allow_overwrite=True. It also makes writing this documentation easier, since we don't have to manually delete the test file each time we run the cell.
[12]:
save_dataset(
dataset, "half_intensity.json", allow_overwrite=True, my_extra_option="just as an example"
)
Using my extra option for writing json: just as an example
Now let’s test our data loading functionality.
[13]:
reloaded_data = load_dataset("half_intensity.json", my_extra_option="just as an example")
reloaded_data
Using my extra option loading json: just as an example
[13]:
<xarray.Dataset>
Dimensions:   (time: 2100, spectral: 72)
Coordinates:
  * time      (time) float64 -1.0 -0.99 -0.98 -0.97 ... 19.96 19.97 19.98 19.99
  * spectral  (spectral) float64 600.0 601.4 602.8 604.2 ... 696.6 698.0 699.4
Data variables:
    data      (time, spectral) float64 -0.0008901 0.0014 ... 0.8583 0.765
[14]:
reloaded_plot_data = reloaded_data.data.sel(spectral=[620, 630, 650], method="nearest")
reloaded_plot_data.plot.line(x="time", aspect=2, size=5)
[14]:
[<matplotlib.lines.Line2D at 0x7f93c0170190>,
<matplotlib.lines.Line2D at 0x7f93b84c36a0>,
<matplotlib.lines.Line2D at 0x7f93b84c37c0>]
This looks like the plot above, but with half the amplitudes, so writing and reading our data worked as we hoped it would.
Writing a ProjectIo plugin works analogously:
|                   | DataIo                     | ProjectIo                  |
|-------------------|----------------------------|----------------------------|
| Register function | register_data_io           | register_project_io        |
| Baseclass         | DataIoInterface            | ProjectIoInterface         |
| Possible methods  | load_dataset, save_dataset | load/save methods for Model, ParameterGroup, Scheme and Result |
Of course you don’t have to implement all methods (sometimes that doesn’t even make sense), but only the ones you need.
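For instance, a ProjectIo plugin that only supports saving parameters could look like this minimal sketch (the registration module path and the format name my_fmt are assumed by analogy with the DataIo example above):

from glotaran.io.interface import ProjectIoInterface
from glotaran.plugin_system.project_io_registration import register_project_io  # path assumed

@register_project_io(["my_fmt"])
class MyFmtProjectIo(ProjectIoInterface):
    """Example ProjectIo plugin that only implements save_parameters."""

    def save_parameters(self, parameters, file_name: str):
        # Write the string representation of the ParameterGroup, just for demonstration:
        with open(file_name, "w") as text_file:
            text_file.write(str(parameters))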
Last but not least: chances are that if you need a plugin, someone else does too, so it would be awesome if you published it open source, so the wheel isn't reinvented over and over again.