Welcome to pyglotaran’s documentation!

Introduction

Pyglotaran is a Python library for the global analysis of time-resolved spectroscopy data. It is designed to provide a state-of-the-art modeling toolbox to researchers in a user-friendly manner.

Its features are:

  • user-friendly modeling with a custom YAML (*.yml) based modeling language

  • parameter optimization using variable projection and non-negative least-squares algorithms

  • easy to extend modeling framework

  • battle-hardened models and algorithms for fluorescence dynamics

  • built upon and fully integrated into the standard Python science stack (NumPy, SciPy, Jupyter)

A Note To Glotaran Users

Although closely related and developed in the same lab, pyglotaran is not a replacement for Glotaran - A GUI For TIMP. Pyglotaran only aims to provide the modeling and optimization framework and algorithms. It is of course possible to develop a new GUI which leverages the power of pyglotaran (contributions welcome).

The current ‘user-interface’ for pyglotaran is Jupyter Notebook. It is designed to seamlessly integrate into this environment and to be compatible with all major visualization and data analysis tools in the scientific Python environment.

If you are a non-technical user, you should give these tools a try; there are numerous tutorials on how to use them. You don’t really need to learn to program: if you can use e.g. Matlab or Mathematica, you can use Jupyter and Python.

Installation

Prerequisites

  • Python 3.10 or 3.11

Windows

The easiest way of getting Python (and some basic tools to work with it) on Windows is to use Anaconda.

You will need a terminal for the installation. One is provided by Anaconda and is called Anaconda Console. You can find it in the start menu.

Note

If you use a Windows Shell like cmd.exe or PowerShell, you might have to prefix ‘$PATH_TO_ANACONDA/’ to all commands (e.g. C:/Anaconda/pip.exe instead of pip)

Stable release

Warning

pyglotaran is in early development, so for the moment stable releases are sparse and outdated. We try to keep the master branch stable, so please install from source for now.

This is the preferred method to install pyglotaran, as it will always install the most recent stable release.

To install pyglotaran, run this command in your terminal:

$ pip install pyglotaran

If you don’t have pip installed, this Python installation guide can guide you through the process.

If you want to install it via conda, you can run the following command:

$ conda install -c conda-forge pyglotaran

From sources

First you have to install or update some dependencies.

Within a terminal:

$ pip install -U numpy scipy Cython

Alternatively, for Anaconda users:

$ conda install numpy scipy Cython

Afterwards you can simply use pip to install it directly from GitHub:

$ pip install git+https://github.com/glotaran/pyglotaran.git

For updating pyglotaran, just re-run the command above.

If you prefer to manually download the source files, you can find them on GitHub. Alternatively, you can clone them with git (preferred):

$ git clone https://github.com/glotaran/pyglotaran.git

Within a terminal, navigate to the directory where you have unpacked or cloned the code and enter

$ pip install -e .

For updating, simply download and unpack the newest version (or run $ git pull in the pyglotaran directory if you used git) and re-run the command above.
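To check that the installation worked, you can print the installed version (the version string lives in glotaran/__init__.py):

$ python -c "import glotaran; print(glotaran.__version__)"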


Quickstart/Cheat-Sheet

To start using pyglotaran in your analysis, you only have to import the Project class and open a project.

[1]:
from glotaran.project import Project

quickstart_project = Project.open("quickstart_project")
quickstart_project
[1]:

Project (quickstart_project)

pyglotaran version: 0.7.2

Data

None

Model

  • my_model

Parameters

  • my_parameters

Results

None

If the project does not already exist, this will create a new project and its folder structure for you. In our case, only the models and parameters folders existed beforehand; the data and results folders were created when the project was opened.

[2]:
%ls quickstart_project
data/  models/  parameters/  project.gta  results/

Let us get some example data to analyze:

[3]:
from glotaran.testing.simulated_data.sequential_spectral_decay import DATASET as my_dataset

my_dataset
[3]:
<xarray.Dataset>
Dimensions:   (time: 2100, spectral: 72)
Coordinates:
  * time      (time) float64 -1.0 -0.99 -0.98 -0.97 ... 19.96 19.97 19.98 19.99
  * spectral  (spectral) float64 600.0 601.4 602.8 604.2 ... 696.6 698.0 699.4
Data variables:
    data      (time, spectral) float64 -0.01242 -0.007345 ... 2.573 2.303
Attributes:
    source_path:  dataset_1.nc

Like all data in pyglotaran, the dataset is an xarray.Dataset. You can find more information about the xarray library on the xarray homepage.

The loaded dataset was simulated with a sequential model.

Plotting raw data

Now let’s plot some time traces.

[4]:
plot_data = my_dataset.data.sel(spectral=[620, 630, 650], method="nearest")
plot_data.plot.line(x="time", aspect=2, size=5);

We can also plot spectra at different times.

[5]:
plot_data = my_dataset.data.sel(time=[1, 10, 20], method="nearest")
plot_data.plot.line(x="spectral", aspect=2, size=5);

Import the data into your project

As long as you can read your data into an xarray.Dataset or xarray.DataArray, you can directly import it into your project.

This will save your data as NetCDF (.nc) file into the data folder inside of your project with the name that you gave it (here quickstart_project/data/my_data.nc).

If the data format you are using is supported by a plugin you can simply copy the file to the data folder of the project (here quickstart_project/data).
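Alternatively, you can pass a file path to import_data; a short sketch (my_data.ascii is a hypothetical file in a plugin-supported format):

# Sketch: import from a file instead of an in-memory xarray object.
quickstart_project.import_data("my_data.ascii", dataset_name="my_data")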

[6]:
quickstart_project.import_data(my_dataset, dataset_name="my_data")
quickstart_project
[6]:

Project (quickstart_project)

pyglotaran version: 0.7.2

Data

  • my_data

Model

  • my_model

Parameters

  • my_parameters

Results

None

After importing, our quickstart_project is aware of the data that we named my_data.

Preparing data

To get an idea of how to model your data, you should inspect its singular value decomposition. As a convenience, the load_data method has the option to add the SVD data on the fly.

[7]:
dataset_with_svd = quickstart_project.load_data("my_data", add_svd=True)
dataset_with_svd
[7]:
<xarray.Dataset>
Dimensions:                      (time: 2100, spectral: 72,
                                  left_singular_value_index: 72,
                                  singular_value_index: 72,
                                  right_singular_value_index: 72)
Coordinates:
  * time                         (time) float64 -1.0 -0.99 -0.98 ... 19.98 19.99
  * spectral                     (spectral) float64 600.0 601.4 ... 698.0 699.4
Dimensions without coordinates: left_singular_value_index,
                                singular_value_index, right_singular_value_index
Data variables:
    data                         (time, spectral) float64 -0.01242 ... 2.303
    data_left_singular_vectors   (time, left_singular_value_index) float64 -7...
    data_singular_values         (singular_value_index) float64 6.577e+03 ......
    data_right_singular_vectors  (spectral, right_singular_value_index) float64 ...
Attributes:
    source_path:  /home/docs/checkouts/readthedocs.org/user_builds/pyglotaran...
    loader:       <function load_dataset at 0x7f4b372c56c0>

First, take a look at the first 10 singular values:

[8]:
plot_data = dataset_with_svd.data_singular_values.sel(singular_value_index=range(10))
plot_data.plot(yscale="log", marker="o", linewidth=0, aspect=2, size=5);

This tells us that our data contain at least three components which we need to model.
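If counting singular values leaves you unsure, plotting the first few left and right singular vectors of the data can also help; this mirrors the residual inspection at the end of this quickstart and uses the SVD variables shown in the dataset above:

# First three left singular vectors (time) and right singular vectors (spectral).
dataset_with_svd.data_left_singular_vectors.sel(left_singular_value_index=[0, 1, 2]).plot.line(x="time", aspect=2, size=5)
dataset_with_svd.data_right_singular_vectors.sel(right_singular_value_index=[0, 1, 2]).plot.line(x="spectral", aspect=2, size=5);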

Working with models

To analyze our data, we need to create a model.

Create a file called my_model.yaml in your project’s models directory and fill it with the following content.

[9]:
quickstart_project.show_model_definition("my_model")
[9]:
default_megacomplex: decay

initial_concentration:
  input:
    compartments: [s1, s2, s3]
    parameters: [input.1, input.0, input.0]

k_matrix:
  k1:
    matrix:
      (s2, s1): kinetic.1
      (s3, s2): kinetic.2
      (s3, s3): kinetic.3

megacomplex:
  m1:
    k_matrix: [k1]

irf:
  irf1:
    type: gaussian
    center: irf.center
    width: irf.width

dataset:
  my_data:
    initial_concentration: input
    megacomplex: [m1]
    irf: irf1

You can check your model for problems with the validate method.

[10]:
quickstart_project.validate("my_model")
[10]:

Your model is valid.

Working with parameters

Now define some starting parameters. Create a file called my_parameters.yaml in your project’s parameters directory with the following content.

[11]:
quickstart_project.show_parameters_definition("my_parameters")
[11]:
input:
  - ["1", 1, { "vary": False }]
  - ["0", 0, { "vary": False }]

kinetic: [0.51, 0.31, 0.11]

irf:
  - ["center", 0.31]
  - ["width", 0.11]

Note the { "vary": False }, which tells pyglotaran that those parameters should not be changed.
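Each entry gets a dotted label (e.g. kinetic.1 for the first value of the kinetic list), which is what the model file above refers to. A minimal sketch for looking one up, assuming the get accessor of the loaded parameters object:

parameters = quickstart_project.load_parameters("my_parameters")

# Look up single parameters by their dotted labels (get is an assumed accessor).
print(parameters.get("kinetic.1").value)  # 0.51
print(parameters.get("input.1").vary)     # False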

You can also use the validate method to check for missing parameters.

[12]:
quickstart_project.validate("my_model", "my_parameters")
[12]:

Your model is valid.

Since not all problems in the model can be detected automatically, it is wise to visually inspect the model. For this purpose, you can simply load the model and inspect its markdown-rendered version.

[13]:
quickstart_project.load_model("my_model")
[13]:

Model

Dataset Groups
  • default

    • Label: default

    • Residual Function: variable_projection

K Matrix
  • k1

    • Label: k1

    • Matrix: {(‘s2’, ‘s1’): ‘kinetic.1’, (‘s3’, ‘s2’): ‘kinetic.2’, (‘s3’, ‘s3’): ‘kinetic.3’}

Megacomplex
  • m1

    • Label: m1

    • Dimension: time

    • Type: decay

    • K Matrix: [‘k1’]

Initial Concentration
  • input

    • Label: input

    • Compartments: [‘s1’, ‘s2’, ‘s3’]

    • Parameters: [‘input.1’, ‘input.0’, ‘input.0’]

    • Exclude From Normalize: []

Irf
  • irf1

    • Label: irf1

    • Normalize: True

    • Backsweep: False

    • Type: gaussian

    • Center: irf.center

    • Width: irf.width

Dataset
  • my_data

    • Label: my_data

    • Group: default

    • Force Index Dependent: False

    • Megacomplex: [‘m1’]

    • Initial Concentration: input

    • Irf: irf1

You should inspect your parameters in the same way.

[14]:
quickstart_project.load_parameters("my_parameters")
[14]:
  • input:

    | Label | Value | Standard Error | t-value | Minimum | Maximum | Vary | Non-Negative | Expression |
    |---|---|---|---|---|---|---|---|---|
    | 1 | 1.000e+00 | nan | nan | -inf | inf | False | False | None |
    | 0 | 0.000e+00 | nan | nan | -inf | inf | False | False | None |

  • irf:

    | Label | Value | Standard Error | t-value | Minimum | Maximum | Vary | Non-Negative | Expression |
    |---|---|---|---|---|---|---|---|---|
    | center | 3.100e-01 | nan | nan | -inf | inf | True | False | None |
    | width | 1.100e-01 | nan | nan | -inf | inf | True | False | None |

  • kinetic:

    | Label | Value | Standard Error | t-value | Minimum | Maximum | Vary | Non-Negative | Expression |
    |---|---|---|---|---|---|---|---|---|
    | 1 | 5.100e-01 | nan | nan | -inf | inf | True | False | None |
    | 2 | 3.100e-01 | nan | nan | -inf | inf | True | False | None |
    | 3 | 1.100e-01 | nan | nan | -inf | inf | True | False | None |

Optimizing data

Now we have everything together to optimize our parameters.

[15]:
result = quickstart_project.optimize("my_model", "my_parameters")
result
   Iteration     Total nfev        Cost      Cost reduction    Step norm     Optimality
       0              1         1.1175e+04                                    1.73e+06
       1              2         1.4973e+01      1.12e+04       1.96e-02       1.26e+04
       2              3         7.5378e+00      7.44e+00       5.86e-03       1.01e+03
       3              4         7.5339e+00      3.87e-03       1.69e-05       6.53e-02
       4              5         7.5339e+00      1.64e-11       7.31e-09       8.74e-06
`ftol` termination condition is satisfied.
Function evaluations 5, initial cost 1.1175e+04, final cost 7.5339e+00, first-order optimality 8.74e-06.
[15]:

Optimization Result

| Number of residual evaluation             | 5        |
| Number of residuals                       | 151200   |
| Number of free parameters                 | 5        |
| Number of conditionally linear parameters | 216      |
| Degrees of freedom                        | 150979   |
| Chi Square                                | 1.51e+01 |
| Reduced Chi Square                        | 9.98e-05 |
| Root Mean Square Error (RMSE)             | 9.99e-03 |

Model

Dataset Groups
  • default

    • Label: default

    • Residual Function: variable_projection

K Matrix
  • k1

    • Label: k1

    • Matrix: {(‘s2’, ‘s1’): ‘kinetic.1(5.00e-01±6.78e-05, t-value: 7373, initial: 5.10e-01)’, (‘s3’, ‘s2’): ‘kinetic.2(3.00e-01±3.93e-05, t-value: 7630, initial: 3.10e-01)’, (‘s3’, ‘s3’): ‘kinetic.3(1.00e-01±4.22e-06, t-value: 23696, initial: 1.10e-01)’}

Megacomplex
  • m1

    • Label: m1

    • Dimension: time

    • Type: decay

    • K Matrix: [‘k1’]

Initial Concentration
  • input

    • Label: input

    • Compartments: [‘s1’, ‘s2’, ‘s3’]

    • Parameters: [‘input.1(1.00e+00, fixed)’, ‘input.0(0.00e+00, fixed)’, ‘input.0(0.00e+00, fixed)’]

    • Exclude From Normalize: []

Irf
  • irf1

    • Label: irf1

    • Normalize: True

    • Backsweep: False

    • Type: gaussian

    • Center: irf.center(3.00e-01±5.03e-06, t-value: 59620, initial: 3.10e-01)

    • Width: irf.width(1.00e-01±6.71e-06, t-value: 14894, initial: 1.10e-01)

Dataset
  • my_data

    • Label: my_data

    • Group: default

    • Force Index Dependent: False

    • Megacomplex: [‘m1’]

    • Initial Concentration: input

    • Irf: irf1

Each time you run an optimization, the result will be saved in the project’s results folder.

[16]:
%ls "quickstart_project/results"
my_model_run_0000/
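A saved result can be reloaded later with load_result; passing the result folder uses the yaml plugin by default (see #1230 in the changelog). The folder name below is the one from the listing above:

from glotaran.io import load_result

# Reload the result of the optimization run above.
saved_result = load_result("quickstart_project/results/my_model_run_0000")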

To visualize how quickly the optimization converged, we can plot the optimality of the optimization_history.

[17]:
result.optimization_history.data["optimality"].plot(logy=True)
[17]:
<Axes: xlabel='iteration'>
[18]:
result.optimized_parameters
[18]:
  • input:

    | Label | Value | Standard Error | t-value | Minimum | Maximum | Vary | Non-Negative | Expression |
    |---|---|---|---|---|---|---|---|---|
    | 1 | 1.000e+00 | nan | nan | -inf | inf | False | False | None |
    | 0 | 0.000e+00 | nan | nan | -inf | inf | False | False | None |

  • irf:

    | Label | Value | Standard Error | t-value | Minimum | Maximum | Vary | Non-Negative | Expression |
    |---|---|---|---|---|---|---|---|---|
    | center | 3.000e-01 | 5.032e-06 | 59620 | -inf | inf | True | False | None |
    | width | 1.000e-01 | 6.714e-06 | 14894 | -inf | inf | True | False | None |

  • kinetic:

    | Label | Value | Standard Error | t-value | Minimum | Maximum | Vary | Non-Negative | Expression |
    |---|---|---|---|---|---|---|---|---|
    | 1 | 4.999e-01 | 6.780e-05 | 7373 | -inf | inf | True | False | None |
    | 2 | 3.000e-01 | 3.933e-05 | 7630 | -inf | inf | True | False | None |
    | 3 | 9.999e-02 | 4.220e-06 | 23696 | -inf | inf | True | False | None |

You can inspect the data of your result by accessing its data attribute. In our example it only contains our single my_data dataset, but it can contain as many datasets as your analysis needs.

[19]:
result.data
[19]:
{'my_data': <xarray.Dataset>}
my_data
<xarray.Dataset>
Dimensions:                          (time: 2100, spectral: 72,
                                      left_singular_value_index: 72,
                                      singular_value_index: 72,
                                      right_singular_value_index: 72,
                                      clp_label: 3, species: 3,
                                      component_m1: 3, species_m1: 3,
                                      to_species_m1: 3, from_species_m1: 3)
Coordinates:
  * time                             (time) float64 -1.0 -0.99 ... 19.98 19.99
  * spectral                         (spectral) float64 600.0 601.4 ... 699.4
  * clp_label                        (clp_label) <U2 's1' 's2' 's3'
  * species                          (species) <U2 's1' 's2' 's3'
  * component_m1                     (component_m1) int64 1 2 3
    rate_m1                          (component_m1) float64 0.4999 0.3 0.09999
    lifetime_m1                      (component_m1) float64 2.0 3.333 10.0
  * species_m1                       (species_m1) <U2 's1' 's2' 's3'
    initial_concentration_m1         (species_m1) float64 1.0 0.0 0.0
  * to_species_m1                    (to_species_m1) <U2 's1' 's2' 's3'
  * from_species_m1                  (from_species_m1) <U2 's1' 's2' 's3'
Dimensions without coordinates: left_singular_value_index,
                                singular_value_index, right_singular_value_index
Data variables: (12/21)
    data                             (time, spectral) float64 -0.01242 ... 2.303
    data_left_singular_vectors       (time, left_singular_value_index) float64 ...
    data_singular_values             (singular_value_index) float64 6.577e+03...
    data_right_singular_vectors      (spectral, right_singular_value_index) float64 ...
    residual                         (time, spectral) float64 -0.01242 ... -0...
    matrix                           (time, clp_label) float64 6.087e-39 ... ...
    ...                               ...
    irf_center                       float64 0.3
    irf_width                        float64 0.1
    decay_associated_spectra_m1      (spectral, component_m1) float64 31.32 ....
    a_matrix_m1                      (component_m1, species_m1) float64 1.0 ....
    k_matrix_m1                      (to_species_m1, from_species_m1) float64 ...
    k_matrix_reduced_m1              (to_species_m1, from_species_m1) float64 ...
Attributes:
    source_path:                      /home/docs/checkouts/readthedocs.org/us...
    model_dimension:                  time
    global_dimension:                 spectral
    root_mean_square_error:           0.009982719463118765
    weighted_root_mean_square_error:  0.009982719463118765
    dataset_scale:                    1
    loader:                           <function load_dataset at 0x7f4b372c56c0>

Visualize the Result

The resulting data can be visualized the same way as the dataset. To judge the quality of the fit, you should look at the first left and right singular vectors of the residual.

[20]:
result_dataset = result.data["my_data"]

residual_left = result_dataset.residual_left_singular_vectors.sel(left_singular_value_index=0)
residual_right = result_dataset.residual_right_singular_vectors.sel(right_singular_value_index=0)
residual_left.plot.line(x="time", aspect=2, size=5)
residual_right.plot.line(x="spectral", aspect=2, size=5);

Changelog

🚀 0.7.2 (2023-12-06)

✨ Features

  • ✨ Official numpy 1.26 support (#1374)

🚧 Maintenance

  • 🧹 Remove unused dependency: ‘rich’ (#1345)

🚀 0.7.1 (2023-07-28)

✨ Features

  • ✨ Python 3.11 support (#1161)

🩹 Bug fixes

  • 🩹 Fix coherent artifact clp label duplication (#1292)

🚀 0.7.0 (Unreleased)

💥 BREAKING CHANGE

  • 💥🚧 Dropped support for Python 3.8 and 3.9 and only support 3.10 (#1135)

✨ Features

  • ✨ Add optimization history to result and iteration column to parameter history (#1134)

  • ♻️ Complete refactor of model and parameter packages using attrs (#1135)

  • ♻️ Move index dependent calculation to megacomplexes for speed-up (#1175)

  • ✨ Add PreProcessingPipeline (#1256, #1263)

👌 Minor Improvements:

  • 👌🎨 Wrap model section in result markdown in details tag for notebooks (#1098)

  • 👌 Allow more natural column names in pandas parameters file reading (#1174)

  • ✨ Integrate plugin system into Project (#1229)

  • 👌 Make yaml the default plugin when passing a folder to save_result and load_result (#1230)

  • ✨ Allow usage of subfolders in project API for parameters, models and data (#1232)

  • ✨ Allow import of xarray objects in project API import_data (#1235)

  • 🩹 Add number_of_clps to result and correct degrees_of_freedom calculation (#1249)

  • 👌 Improve Project API data handling (#1257)

  • 🗑️ Deprecate Result.number_of_parameters in favor of Result.number_of_free_parameters (#1262)

  • 👌 Improve reporting of standard error in case of non_negative constraint in the parameter (#1320)

🩹 Bug fixes

  • 🩹 Fix result data overwritten when using multiple dataset_groups (#1147)

  • 🩹 Fix for normalization issue described in #1157 (multi-gaussian irfs and multiple time ranges (streak))

  • 🩹 Fix for crash described in #1183 when doing an optimization using more than 30 datasets (#1184)

  • 🩹 Fix pretty_format_numerical for negative values (#1192)

  • 🩹 Fix yaml result saving with relative paths (#1199)

  • 🩹 Fix model markdown render for items without label (#1213)

  • 🩹 Fix wrong file loading due to partial filename matching in Project (#1212)

  • 🩹 Fix Project.import_data path resolving for different script and cwd (#1214)

  • 👌 Refine project API (#1240)

  • 🩹📚 Fix search in docs (#1268)

📚 Documentation

  • 📚 Update quickstart guide to use Project API (#1241)

🗑️ Deprecations (due in 0.8.0)

  • <model_file>.clp_area_penalties -> <model_file>.clp_penalties

  • glotaran.ParameterGroup -> glotaran.Parameters

  • Command Line Interface (removed without replacement) (#1228)

  • Project.generate_model (removed without replacement)

  • Project.generate_parameters (removed without replacement)

  • glotaran.project.Result.number_of_data_points -> glotaran.project.Result.number_of_residuals

  • glotaran.project.Result.number_of_parameters -> glotaran.project.Result.number_of_free_parameters

🗑️❌ Deprecated functionality removed in this release

  • glotaran.project.Scheme(..., non_negative_least_squares=...)

  • glotaran.project.Scheme(..., group=...)

  • glotaran.project.Scheme(..., group_tolerance=...)

  • <model_file>.non-negative-least-squares: true

  • <model_file>.non-negative-least-squares: false

  • glotaran.parameter.ParameterGroup.to_csv(file_name=parameters.csv)

🚧 Maintenance

  • 🚇🩹 Fix wrong comparison in pr_benchmark workflow (#1097)

  • 🔧 Set sourcery-ai target python version to 3.8 (#1095)

  • 🚇🩹🔧 Fix manifest check (#1099)

  • ♻️ Refactor: optimization (#1060)

  • ♻️🚇 Use GITHUB_OUTPUT instead of set-output in github actions (#1166, #1177)

  • 🚧 Add pinned version of odfpy to requirements_dev.txt (#1164)

  • ♻️ Use validation action and validation as a git submodule (#1165)

  • 🧹 Upgrade syntax to py310 using pyupgrade (#1162)

  • 🧹 Remove unused ‘type: ignore’ (#1168)

  • 🚧 Raise minimum dependency version to releases that support py310 (#1170)

  • 🔧 Make mypy and doc string linters opt out instead of opt in (#1173)

🚀 0.6.0 (2022-06-06)

✨ Features

  • ✨ Python 3.10 support (#977)

  • ✨ Add simple decay megacomplexes (#860)

  • ✨ Feature: Generators (#866)

  • ✨ Project Class (#869)

  • ✨ Add clp guidance megacomplex (#1029)

👌 Minor Improvements:

  • 👌🎨 Add proper repr for DatasetMapping (#957)

  • 👌 Add SavingOptions to save_result API (#966)

  • ✨ Add parameter IO support for more formats supported by pandas (#896)

  • 👌 Apply IRF shift in coherent artifact megacomplex (#992)

  • 👌 Added IRF shift to result dataset (#994)

  • 👌 Improve Result, Parameter and ParameterGroup markdown (#1012)

  • 👌🧹 Add suffix to rate and lifetime and guard for missing datasets (#1022)

  • ♻️ Move simulation to own module (#1041)

  • ♻️ Move optimization to new module glotaran.optimization (#1047)

  • 🩹 Fix missing installation of clp-guide megacomplex as plugin (#1066)

  • 🚧🔧 Add ‘extras’ and ‘full’ extras_require installation options (#1089)

🩹 Bug fixes

  • 🩹 Fix Crash in optimization_group_calculator_linked when using guidance spectra (#950)

  • 🩹 ParameterGroup.get degrades full_label of nested Parameters with nesting over 2 (#1043)

  • 🩹 Show validation problem if parameters are missing values (default: NaN) (#1076)

📚 Documentation

  • 🎨 Add new logo (#1083, #1087)

🗑️ Deprecations (due in 0.8.0)

  • glotaran.io.save_result(result, result_path, format_name='legacy') -> glotaran.io.save_result(result, Path(result_path) / 'result.yml')

  • glotaran.analysis.simulation -> glotaran.simulation.simulation

  • glotaran.analysis.optimize -> glotaran.optimization.optimize

🗑️❌ Deprecated functionality removed in this release

  • glotaran.ParameterGroup -> glotaran.parameter.ParameterGroup

  • glotaran.read_model_from_yaml -> glotaran.io.load_model(..., format_name="yaml_str")

  • glotaran.read_model_from_yaml_file -> glotaran.io.load_model(..., format_name="yaml")

  • glotaran.read_parameters_from_csv_file -> glotaran.io.load_parameters(..., format_name="csv")

  • glotaran.read_parameters_from_yaml -> glotaran.io.load_parameters(..., format_name="yaml_str")

  • glotaran.read_parameters_from_yaml_file -> glotaran.io.load_parameters(..., format_name="yaml")

  • glotaran.io.read_data_file -> glotaran.io.load_dataset

  • result.get_dataset("<dataset_name>") -> result.data["<dataset_name>"]

  • glotaran.analysis.result -> glotaran.project.result

  • glotaran.analysis.scheme -> glotaran.project.scheme

🚧 Maintenance

  • 🔧 Improve packaging tooling (#923)

  • 🔧🚇 Exclude test files from duplication checks on sonarcloud (#959)

  • 🔧🚇 Only run check-manifest on the CI (#967)

  • 🚇👌 Exclude dependabot push CI runs (#978)

  • 🚇👌 Exclude sourcery AI push CI runs (#1014)

  • 👌📚🚇 Auto remove notebook written data when building docs (#1019)

  • 👌🚇 Change integration tests to use self managed examples action (#1034)

  • 🚇🧹 Exclude pre-commit bot branch from CI runs on push (#1085)

🚀 0.5.1 (2021-12-31)

🩹 Bug fixes

  • 🩹 Bugfix Use normalized initial_concentrations in result creation for decay megacomplex (#927)

  • 🩹 Fix save_result crashes on Windows if input data are on a different drive than result (#931)

🚧 Maintenance

  • 🚧 Forward port Improve result comparison workflow and v0.4 changelog (#938)

  • 🚧 Forward port of #936 test_result_consistency

🚀 0.5.0 (2021-12-01)

✨ Features

  • ✨ Feature: Megacomplex Models (#736)

  • ✨ Feature: Full Models (#747)

  • ✨ Damped Oscillation Megacomplex (a.k.a. DOAS) (#764)

  • ✨ Add Dataset Groups (#851)

  • ✨ Performance improvements (in some cases up to 5x) (#740)

👌 Minor Improvements:

  • 👌 Add dimensions to megacomplex and dataset_descriptor (#702)

  • 👌 Improve ordering in k_matrix involved_compartments function (#788)

  • 👌 Improvements to application of clp_penalties (equal area) (#801)

  • ♻️ Refactor model.from_dict to parse megacomplex_type from dict and add simple_generator for testing (#807)

  • ♻️ Refactor model spec (#836)

  • ♻️ Refactor Result Saving (#841)

  • ✨ Use ruamel.yaml parser for roundtrip support (#893)

  • ♻️ Refactor Result and Scheme loading/initializing from files (#903)

  • ♻️ Several refactoring in glotaran.Parameter (#910)

  • 👌 Improved Reporting of Parameters (#910, #914, #918)

  • 👌 Scheme now accepts paths to model, parameter and data files without initializing them first (#912)

🩹 Bug fixes

  • 🩹 Fix/cli0.5 (#765)

  • 🩹 Fix compartment ordering randomization due to use of set (#799)

  • 🩹 Fix check_deprecations not showing deprecation warnings (#775)

  • 🩹 Fix and re-enable IRF Dispersion Test (#786)

  • 🩹 Fix coherent artifact crash for index dependent models (#808)

  • 🩹 False positive model validation fail when combining multiple default megacomplexes (#797)

  • 🩹 Fix ParameterGroup repr when created with ‘from_list’ (#827)

  • 🩹 Fix for DOAS with reversed oscillations (negative rates) (#839)

  • 🩹 Fix parameter expression parsing (#843)

  • 🩹 Use a context manager when opening a nc dataset (#848)

  • 🚧 Disallow xarray versions breaking plotting in integration tests (#900)

  • 🩹 Fix ‘dataset_groups’ not shown in model markdown (#906)

📚 Documentation

  • 📚 Moved API documentation from User to Developer Docs (#776)

  • 📚 Add docs for the CLI (#784)

  • 📚 Fix deprecation in model used in quickstart notebook (#834)

🗑️ Deprecations (due in 0.7.0)

  • glotaran.model.Model.model_dimension -> glotaran.project.Scheme.model_dimension

  • glotaran.model.Model.global_dimension -> glotaran.project.Scheme.global_dimension

  • <model_file>.type.kinetic-spectrum -> <model_file>.default_megacomplex.decay

  • <model_file>.type.spectral-model -> <model_file>.default_megacomplex.spectral

  • <model_file>.spectral_relations -> <model_file>.clp_relations

  • <model_file>.spectral_relations.compartment -> <model_file>.clp_relations.source

  • <model_file>.spectral_constraints -> <model_file>.clp_constraints

  • <model_file>.spectral_constraints.compartment -> <model_file>.clp_constraints.target

  • <model_file>.equal_area_penalties -> <model_file>.clp_area_penalties

  • <model_file>.irf.center_dispersion -> <model_file>.irf.center_dispersion_coefficients

  • <model_file>.irf.width_dispersion -> <model_file>.irf.width_dispersion_coefficients

  • glotaran.project.Scheme(..., non_negative_least_squares=...) -> <model_file>dataset_groups.default.residual_function

  • glotaran.project.Scheme(..., group=...) -> <model_file>dataset_groups.default.link_clp

  • glotaran.project.Scheme(..., group_tolerance=...) -> glotaran.project.Scheme(..., clp_link_tolerance=...)

  • <scheme_file>.maximum-number-function-evaluations -> <scheme_file>.maximum_number_function_evaluations

  • <model_file>.non-negative-least-squares: true -> <model_file>dataset_groups.default.residual_function: non_negative_least_squares

  • <model_file>.non-negative-least-squares: false -> <model_file>dataset_groups.default.residual_function: variable_projection

  • glotaran.parameter.ParameterGroup.to_csv(file_name=parameters.csv) -> glotaran.io.save_parameters(parameters, file_name=parameters.csv)

🚧 Maintenance

  • 🩹 Fix Performance Regressions (between version) (#740)

  • 🧪🚇 Add integration test result validation (#754)

  • 🔧 Add more QA tools for parts of glotaran (#739)

  • 🔧 Fix interrogate usage (#781)

  • 🚇 Speedup PR benchmark (#785)

  • 🚇🩹 Use pinned versions of dependencies to run integration CI tests (#892)

  • 🧹 Move megacomplex integration tests from root level to megacomplexes (#894)

  • 🩹 Fix artifact download in pr_benchmark_reaction workflow (#907)

🚀 0.4.2 (2021-12-31)

🩹 Bug fixes

  • 🩹🚧 Backport of bugfix #927 discovered in PR #860 related to initial_concentration normalization when saving results (#935).

🚧 Maintenance

  • 🚇🚧 Updated ‘gold standard’ result comparison reference (old -> new)

  • 🚇 Refine test_result_consistency (#936).

🚀 0.4.1 (2021-09-07)

✨ Features

  • Integration test result validation (#760)

🩹 Bug fixes

  • Fix unintended saving of sub-optimal parameters (0ece818, backport from #747)

  • Improve ordering in k_matrix involved_compartments function (#791)

🚀 0.4.0 (2021-06-25)

✨ Features

  • Add basic spectral model (#672)

  • Add Channel/Wavelength dependent shift parameter to irf. (#673)

  • Refactored Problem class into GroupedProblem and UngroupedProblem (#681)

  • Plugin system was rewritten (#600, #665)

  • Deprecation framework (#631)

  • Better notebook integration (#689)

🩹 Bug fixes

  • Fix excessive memory usage in _create_svd (#576)

  • Fix several issues with KineticImage model (#612)

  • Fix exception in sdt reader index calculation (#647)

  • Avoid crash in result markdown printing when optimization fails (#630)

  • ParameterNotFoundException doesn’t prepend ‘.’ if path is empty (#688)

  • Ensure Parameter.label is str or None (#678)

  • Properly scale StdError of estimated parameters with RMSE (#704)

  • More robust covariance_matrix calculation (#706)

  • ParameterGroup.markdown() independent parametergroups of order (#592)

🔌 Plugins

  • ProjectIo ‘folder’/’legacy’ plugin to save results (#620)

  • Model ‘spectral-model’ (#672)

📚 Documentation

  • User documentation is written in notebooks (#568)

  • Documentation on how to write a DataIo plugin (#600)

🗑️ Deprecations (due in 0.6.0)

  • glotaran.ParameterGroup -> glotaran.parameter.ParameterGroup

  • glotaran.read_model_from_yaml -> glotaran.io.load_model(..., format_name="yaml_str")

  • glotaran.read_model_from_yaml_file -> glotaran.io.load_model(..., format_name="yaml")

  • glotaran.read_parameters_from_csv_file -> glotaran.io.load_parameters(..., format_name="csv")

  • glotaran.read_parameters_from_yaml -> glotaran.io.load_parameters(..., format_name="yaml_str")

  • glotaran.read_parameters_from_yaml_file -> glotaran.io.load_parameters(..., format_name="yaml")

  • glotaran.io.read_data_file -> glotaran.io.load_dataset

  • result.save -> glotaran.io.save_result(result, ..., format_name="legacy")

  • result.get_dataset("<dataset_name>") -> result.data["<dataset_name>"]

  • glotaran.analysis.result -> glotaran.project.result

  • glotaran.analysis.scheme -> glotaran.project.scheme

  • model.simulate -> glotaran.analysis.simulation.simulate(model, ...)

🚀 0.3.3 (2021-03-18)

  • Force recalculation of SVD attributes in scheme._prepare_data (#597)

  • Remove unneeded check in spectral_penalties._get_area Fixes (#598)

  • Added python 3.9 support (#450)

🚀 0.3.2 (2021-02-28)

  • Re-release of version 0.3.1 due to packaging issue

🚀 0.3.1 (2021-02-28)

  • Added compatibility for numpy 1.20 and raised minimum required numpy version to 1.20 (#555)

  • Fixed excessive memory consumption in result creation due to full SVD computation (#574)

  • Added feature parameter history (#557)

  • Moved setup logic to setup.cfg (#560)

🚀 0.3.0 (2021-02-11)

  • Significant code refactor with small API changes to parameter relation specification (see docs)

  • Replaced lmfit with scipy.optimize

🚀 0.2.0 (2020-12-02)

  • Large refactor with significant improvements but also small API changes (see docs)

  • Removed doas plugin

🚀 0.1.0 (2020-07-14)

  • Package was renamed to pyglotaran on PyPI

🚀 0.0.8 (2018-08-07)

  • Changed nan_policy to omit

🚀 0.0.7 (2018-08-07)

  • Added support for multiple shapes per compartment.

🚀 0.0.6 (2018-08-07)

  • First release on PyPI, support for Windows installs added.

  • Pre-Alpha Development

Authors

Development Lead

Contributors

Special Thanks

  • Stefan Schuetz

  • Sergey P. Laptenok

Supervision

Original publications

  1. Joris J. Snellenburg, Sergey Laptenok, Ralf Seger, Katharine M. Mullen, Ivo H. M. van Stokkum. “Glotaran: A Java-Based Graphical User Interface for the R Package TIMP”. Journal of Statistical Software (2012), Volume 49, Number 3, Pages: 1–22. URL https://dx.doi.org/10.18637/jss.v049.i03

  2. Katharine M. Mullen, Ivo H. M. van Stokkum. “TIMP: An R Package for Modeling Multi-way Spectroscopic Measurements”. Journal of Statistical Software (2007), Volume 18, Number 3, Pages 1-46, ISSN 1548-7660. URL https://dx.doi.org/10.18637/jss.v018.i03

  3. Ivo H. M. van Stokkum, Delmar S. Larsen, Rienk van Grondelle, “Global and target analysis of time-resolved spectra”. Biochimica et Biophysica Acta (BBA) - Bioenergetics (2004), Volume 1657, Issues 2–3, Pages 82-104, ISSN 0005-2728. URL https://doi.org/10.1016/j.bbabio.2004.04.011

Overview

Data IO

Plotting

Modelling

Parameter

Optimizing

Plugins

To be as flexible as possible pyglotaran uses a plugin system to handle new Models, DataIo and ProjectIo. Those plugins can be defined by pyglotaran itself, the user or a 3rd party plugin package.

Builtin plugins

Models

  • KineticSpectrumModel

  • KineticImageModel

Data Io

Plugins reading and writing data to and from xarray.Dataset or xarray.DataArray.

  • AsciiDataIo

  • NetCDFDataIo

  • SdtDataIo

Project Io

Plugins reading and writing Model, Scheme, ParameterGroup or Result.

  • YmlProjectIo

  • CsvProjectIo

  • FolderProjectIo

Reproducibility and plugins

With a plugin ecosystem there is always the possibility that multiple plugins try to register under the same format/name. This is why plugins are registered at least twice: once under the name the developer intended, and secondly under their full name (full import path). This makes it possible to ensure that a specific plugin is used by specifying it manually, so if someone wants to run your analysis, the results will be reproducible even if they have conflicting plugins installed. You can get all information about the installed plugins by calling the corresponding *_plugin_table function with both options (plugin_names and full_names) set to True. To pin a used plugin, use the corresponding set_*_plugin function with the intended name (format_name/model_name) and the full name (full_plugin_name) of the plugin to use.

If you want to ensure that the pyglotaran builtin plugin is used for sdt files, you could add the following lines to the beginning of your analysis code:

from glotaran.io import set_data_plugin
set_data_plugin("sdt", "glotaran.builtin.io.sdt.sdt_file_reader.SdtDataIo_sdt")

Models

The functions for model plugins are located in glotaran.model and called model_plugin_table and set_model_plugin.

Data Io

The functions for data io plugins are located in glotaran.io and called data_io_plugin_table and set_data_plugin.

Project Io

The functions for project io plugins are located in glotaran.io and called project_io_plugin_table and set_project_plugin.
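For example, to print the data io plugin table with both naming options mentioned above enabled:

from glotaran.io import data_io_plugin_table

# Show registered format names alongside the full import path of each plugin.
print(data_io_plugin_table(plugin_names=True, full_names=True))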

3rd party plugins

Plugins not part of pyglotaran itself.

  • Not yet, why not be the first? Tell us about your plugin and we will feature it here.

Command-line Interface

glotaran

The glotaran CLI main function.

glotaran [OPTIONS] COMMAND [ARGS]...

Options

--version

Show the version and exit.

optimize

Optimizes a model. e.g.: glotaran optimize --

glotaran optimize [OPTIONS] [SCHEME_FILE]

Options

-dfmt, --dataformat <dataformat>

The input format of the data. Will be inferred from extension if not set.

Options:

ascii | nc | sdt

-d, --data <data>

Path to a dataset in the form ‘--data DATASET_LABEL PATH_TO_DATA’

-o, --out <out>

Path to an output directory.

-ofmt, --outformat <outformat>

The format of the output.

Default:

folder

Options:

folder | legacy | yaml

-n, --nfev <nfev>

Maximum number of function evaluations.

--nnls

Use non-negative least squares.

-y, --yes

Don’t ask for confirmation.

-p, --parameters_file <parameters_file>

(optional) Path to parameter file.

-m, --model_file <model_file>

Path to model file.

Arguments

SCHEME_FILE

Optional argument
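A typical invocation could look like this (all file names are placeholders):

$ glotaran optimize -m my_model.yml -p my_parameters.yml -d dataset1 my_data.ascii -o my_results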

pluginlist

Prints a list of installed plugins.

glotaran pluginlist [OPTIONS]

print

Parses a scheme, a model or a parameter file and prints the result as a Markdown formatted string.

glotaran print [OPTIONS] [SCHEME_FILE]

Options

-p, --parameters_file <parameters_file>

(optional) Path to parameter file.

-m, --model_file <model_file>

Path to model file.

Arguments

SCHEME_FILE

Optional argument

validate

Validates a model file and optionally a parameter file.

glotaran validate [OPTIONS] [SCHEME_FILE]

Options

-p, --parameters_file <parameters_file>

(optional) Path to parameter file.

-m, --model_file <model_file>

Path to model file.

Arguments

SCHEME_FILE

Optional argument

Contributing

Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.

You can contribute in many ways:

Types of Contributions

Report Bugs

Report bugs at https://github.com/glotaran/pyglotaran/issues.

If you are reporting a bug, please include:

  • Your operating system name and version.

  • Any details about your local setup that might be helpful in troubleshooting.

  • Detailed steps to reproduce the bug.

Fix Bugs

Look through the GitHub issues for bugs. Anything tagged with “bug” and “help wanted” is open to whoever wants to implement it.

Implement Features

Look through the GitHub issues for features. Anything tagged with “enhancement” and “help wanted” is open to whoever wants to implement it.

Write Documentation

pyglotaran could always use more documentation, whether as part of the official pyglotaran docs, in docstrings, or even on the web in blog posts, articles, and such. If you are writing docstrings please use the NumPyDoc style to write them.

Submit Feedback

The best way to send feedback is to file an issue at https://github.com/glotaran/pyglotaran/issues.

If you are proposing a feature:

  • Explain in detail how it would work.

  • Keep the scope as narrow as possible, to make it easier to implement.

  • Remember that this is a volunteer-driven project, and that contributions are welcome :)

Get Started!

Ready to contribute? Here’s how to set up pyglotaran for local development.

  1. Fork the pyglotaran repo on GitHub.

  2. Clone your fork locally:

    $ git clone https://github.com/<your_name_here>/pyglotaran.git
    
  3. Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, this is how you set up your fork for local development:

    $ mkvirtualenv pyglotaran
    (pyglotaran)$ cd pyglotaran
    (pyglotaran)$ python -m pip install -r requirements_dev.txt
    (pyglotaran)$ pip install -e . --process-dependency-links
    
  4. Install the pre-commit hooks, to automatically format and check your code:

    $ pre-commit install
    
  5. Create a branch for local development:

    $ git checkout -b name-of-your-bugfix-or-feature
    

    Now you can make your changes locally.

  6. When you’re done making changes, check that your changes pass flake8 and the tests, including testing other Python versions with tox:

    $ pre-commit run -a
    $ py.test
    

    Or to run all at once:

    $ tox
    
  7. Commit your changes and push your branch to GitHub:

    $ git add .
    $ git commit -m "Your detailed description of your changes."
    $ git push origin name-of-your-bugfix-or-feature
    
  8. Submit a pull request through the GitHub website.

  9. Add the change, referring to the pull request ((#<PR_nr>)), to changelog.md. If you are in doubt about which section your pull request belongs in, just ask a maintainer what they think.

Note

By default pull requests will use the template located at .github/PULL_REQUEST_TEMPLATE.md. But we also provide custom tailored templates located inside of .github/PULL_REQUEST_TEMPLATE. Sadly the GitHub Web Interface doesn’t provide an easy way to select them as it does for issue templates (see this comment for more details).

To use them you need to add the following query parameters to the url when creating the pull request and hit enter:

  • ✨ Feature PR: ?expand=1&template=feature_PR.md

  • 🩹 Bug Fix PR: ?expand=1&template=bug_fix_PR.md

  • 📚 Documentation PR: ?expand=1&template=docs_PR.md

Pull Request Guidelines

Before you submit a pull request, check that it meets these guidelines:

  1. The pull request should include tests.

  2. If the pull request adds functionality, the docs should be updated. Put your new functionality into a function with a docstring.

  3. The pull request should work for Python 3.10 and 3.11. Check your GitHub Actions at https://github.com/<your_name_here>/pyglotaran/actions and make sure that the tests pass for all supported Python versions.

Docstrings

We use numpy style docstrings, which can also be autogenerated from function/method signatures by extensions for your editor.

Some extensions for popular editors are:

Note

If your pull request improves the docstring coverage (check pre-commit run -a interrogate), please raise the value of the interrogate setting fail-under in pyproject.toml. That way the next person will improve the docstring coverage as well, and everyone can enjoy better documentation.

Warning

As soon as all our docstrings are in proper shape we will enforce that it stays that way. If you want to check if your docstrings are fine you can use pydocstyle and darglint.

Tips

To run a subset of tests:

$ py.test tests.test_pyglotaran

Deprecations

Only maintainers are allowed to decide about deprecations, thus you should first open an issue and check back with them if they are ok with deprecating something.

To make deprecations as robust as possible and give users all needed information to adjust their code, we provide helper functions inside the module glotaran.deprecation.

The functions you most likely want to use are deprecate, warn_deprecated, deprecate_module_attribute, deprecate_submodule and deprecate_dict_entry, each of which is described below.

Those functions not only make it easier to deprecate something, but they also check that deprecations will be removed when they are due and that at least the imports in the warning work. Thus all deprecations need to be tested.

Tests for deprecations should be placed in glotaran/deprecation/modules/test, which also provides the test helper functions deprecation_warning_on_call_test_helper and changed_import_test_warn. Since the tests for deprecations are mainly for maintainability and not to test the functionality (those tests should be in the appropriate place), deprecation_warning_on_call_test_helper will by default just test that a GlotaranApiDeprecationWarning was raised and ignore all raised Exceptions. An exception to this rule is when adding back removed functionality (which shouldn’t happen in the first place, but might); such deprecations should be implemented in a file under glotaran/deprecation/modules, with the filename mirroring the relative import path from the glotaran root, but with _ instead of .

E.g. glotaran.analysis.scheme would map to analysis_scheme.py

The only exceptions to this rule are the root __init__.py, which is named glotaran_root.py, and testing changed imports, which should be placed in test_changed_imports.py.
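For instance, a minimal test for the deprecated glotaran.analysis.scheme module could look like the following sketch (assuming GlotaranApiDeprecationWarning is importable from glotaran.deprecation; the real tests use the helper functions named above):

glotaran/deprecation/modules/test/test_analysis_scheme.py
import pytest

from glotaran.deprecation import GlotaranApiDeprecationWarning


def test_analysis_scheme_import_warns():
    # Importing from the deprecated location should emit the deprecation warning.
    with pytest.warns(GlotaranApiDeprecationWarning):
        from glotaran.analysis import scheme  # noqa: F401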

Deprecating a Function, method or class

Deprecating a function, method or class is as easy as adding the deprecate decorator to it. Other decorators (e.g. @staticmethod or @classmethod) should be placed above deprecate in order to work.

glotaran/some_module.py
from glotaran.deprecation import deprecate

@deprecate(
    deprecated_qual_name_usage="glotaran.some_module.function_to_deprecate(filename)",
    new_qual_name_usage='glotaran.some_module.new_function(filename, format_name="legacy")',
    to_be_removed_in_version="0.6.0",
)
def function_to_deprecate(*args, **kwargs):
    ...

Deprecating a call argument

When deprecating a call argument you should use warn_deprecated and set the argument to deprecate to a default value (e.g. "deprecated") to check against. Note that for this use case we need to set check_qual_names=(False, False) which will deactivate the import testing. This might not always be possible, e.g. if the argument is positional only, so it might make more sense to deprecate the whole callable, just discuss what to do with our trusted maintainers.

glotaran/some_module.py
from glotaran.deprecation import warn_deprecated

def function_to_deprecate(args1, new_arg="new_default_behavior", deprecated_arg="deprecated", **kwargs):
    if deprecated_arg != "deprecated":
        warn_deprecated(
            deprecated_qual_name_usage="deprecated_arg",
            new_qual_name_usage='new_arg="legacy"',
            to_be_removed_in_version="0.6.0",
            check_qual_names=(False, False)
        )
        new_arg = "legacy"
    ...

Deprecating a module attribute

Sometimes it might be necessary to remove an attribute (function, class, or constant) from a module to prevent circular imports or just to streamline the API. In those cases you would use deprecate_module_attribute inside a module __getattr__ function definition. This will import the attribute from the new location and return it when an import or use is requested.

glotaran/old_package/__init__.py
def __getattr__(attribute_name: str):
    from glotaran.deprecation import deprecate_module_attribute

    if attribute_name == "deprecated_attribute":
        return deprecate_module_attribute(
            deprecated_qual_name="glotaran.old_package.deprecated_attribute",
            new_qual_name="glotaran.new_package.new_attribute_name",
            to_be_removed_in_version="0.6.0",
        )

    raise AttributeError(f"module {__name__} has no attribute {attribute_name}")

Deprecating a submodule

For a better logical structure, it might be needed to move modules to a different location in the project. In those cases, you would use deprecate_submodule, which imports the module from the new location, adds it to sys.modules, and adds it as an attribute to the parent package.

glotaran/old_package/__init__.py
from glotaran.deprecation import deprecate_submodule

module_name = deprecate_submodule(
    deprecated_module_name="glotaran.old_package.module_name",
    new_module_name="glotaran.new_package.new_module_name",
    to_be_removed_in_version="0.6.0",
)

Deprecating dict entries

The possible dict deprecation actions are:

  • Swapping of keys {"foo": 1} -> {"bar": 1} (done via swap_keys=("foo", "bar"))

  • Replacing of matching values {"foo": 1} -> {"foo": 2} (done via replace_rules=({"foo": 1}, {"foo": 2}))

  • Replacing of matching values and swapping of keys {"foo": 1} -> {"bar": 2} (done via replace_rules=({"foo": 1}, {"bar": 2}))

For full examples, have a look at the docstring of deprecate_dict_entry().
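A minimal sketch of the key-swapping case (the keyword names other than swap_keys and replace_rules are assumptions, following the pattern of the other helpers in this section):

from glotaran.deprecation import deprecate_dict_entry

my_dict = {"foo": 1}

# Swap the deprecated key "foo" for "bar", keeping the value.
deprecate_dict_entry(
    dict_to_check=my_dict,
    deprecated_usage="foo: 1",
    new_usage="bar: 1",
    to_be_removed_in_version="0.6.0",
    swap_keys=("foo", "bar"),
)
# my_dict is now {"bar": 1} and a deprecation warning was emitted.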

Deprecation Errors

In some cases deprecations cannot have a replacement which maintains the original behavior. This will mostly be the case when, at this point in time and in the object hierarchy, there isn’t enough information available to calculate the appropriate values. Rather than using a ‘dummy’ value not to break the API, which could cause undefined behavior down the line, those cases should throw an error which informs the users about the new usage. In general this should only be used if it is unavoidable due to massive refactoring of the internal structure, and it should be avoided by all reasonable means.

If you have one of those rare cases you can use raise_deprecation_error().

Testing Result consistency

To test the consistency of results locally you need to clone the pyglotaran-examples and run them:

$ git clone https://github.com/glotaran/pyglotaran-examples
$ cd pyglotaran-examples
$ python scripts/run_examples.py run-all --headless

Note

Make sure you have the latest version (git pull) and are on the correct branch for both pyglotaran and pyglotaran-examples.

The results from the examples will be saved in your home folder under pyglotaran_examples_results. Those results will then be compared to the ‘gold standard’ defined by the maintainers.

To test the result consistency run:

$ pytest validation/pyglotaran-examples/test_result_consistency.py

If needed this will clone the ‘gold standard’ results to the folder comparison-results, update them and test your current results against them.

Deploying

A reminder for the maintainers on how to deploy. Make sure all your changes are committed (including an entry in changelog.md); the version number only needs to be changed in glotaran/__init__.py.

Then make a new release on GitHub and give the tag a proper name, e.g. v0.3.0 since it might be included in a citation.

GitHub Actions will then deploy to PyPI if the tests pass.

API Documentation

The API Documentation for pyglotaran is automatically created from its docstrings.

glotaran

Glotaran package root.

Plugin development

If you don’t find a plugin that fits your needs, you can always write your own. This section will explain how, and what you need to know.

In time we will also provide you with a cookiecutter template to kickstart your new plugin for publishing as a package on PyPI.


How to Write your own Io plugin

There are all kinds of different data formats, so it is quite likely that your experimental setup uses a format which isn’t yet supported by a glotaran plugin, and you want to write your own DataIo plugin to support this format.

Since json is a very common format (admittedly not for data, but in general) and Python has builtin support for it, we will use it as an example.

First let’s have a look at which DataIo plugins are already installed and which functions they support.

[1]:
from glotaran.io import data_io_plugin_table
[2]:
data_io_plugin_table()
[2]:

| Format name | load_dataset | save_dataset |
|-------------|--------------|--------------|
| ascii       | *            | *            |
| nc          | *            | *            |
| sdt         | *            | /            |

Looks like there isn’t a json plugin installed yet, but maybe someone else already wrote one, so have a look at the 3rd party plugins list in the user documentation (https://pyglotaran.readthedocs.io/en/latest/user_documentation/using_plugins.html) before you start writing your own plugin.

For the sake of the example, we will write our json plugin even if there already exists one by the time you read this.

First you need to import all needed libraries and functions.

  • from __future__ import annotations: needed to write Python 3.10 typing syntax (|), even with a lower Python version

  • json, xarray: needed for the reading and writing itself

  • DataIoInterface: needed to subclass from; this way you get the proper type and especially signature checking

  • register_data_io: registers the DataIo plugin under the given format_names

[3]:
from __future__ import annotations

import json

import xarray as xr

from glotaran.io.interface import DataIoInterface
from glotaran.plugin_system.data_io_registration import register_data_io

DataIoInterface has two methods we could implement, load_dataset and save_dataset, which are used by the identically named functions in glotaran.io.

We will implement both for our example to be complete. The quickest way to get started is to copy over the code from DataIoInterface, which already has the right signatures and some boilerplate docstrings for the method arguments.

If the default arguments aren’t enough for your plugin and you need your methods to have additional options, you can just add those. Note the * between file_name and my_extra_option; this tells Python that my_extra_option is a keyword-only argument, and mypy (https://github.com/python/mypy) won’t raise an [override] type error for changing the signature of the method. To help others who might use your plugin, and your future self, it is good practice to document what each parameter does in the method’s docstring, which will be accessed by the help function.

Finally, add the @register_data_io decorator with the format_names you want to register the plugin under, in our case json and my_json.

Pro tip: You don’t need to implement the whole functionality inside of the method itself; you can also delegate the actual work to helper functions or an external library.

[4]:
@register_data_io(["json", "my_json"])
class JsonDataIo(DataIoInterface):
    """My new shiny glotaran plugin for json data io"""

    def load_dataset(
        self, file_name: str, *, my_extra_option: str | None = None
    ) -> xr.Dataset | xr.DataArray:
        """Read json data to xarray.Dataset


        Parameters
        ----------
        file_name : str
            File containing the data.
        my_extra_option: str
            This argument is only for demonstration
        """
        if my_extra_option is not None:
            print(f"Using my extra option loading json: {my_extra_option}")

        with open(file_name) as json_file:
            data_dict = json.load(json_file)
        return xr.Dataset.from_dict(data_dict)

    def save_dataset(
        self, dataset: xr.Dataset | xr.DataArray, file_name: str, *, my_extra_option=None
    ):
        """Write xarray.Dataset to a json file

        Parameters
        ----------
        dataset : xr.Dataset
            Dataset to be saved to file.
        file_name : str
            File to write the result data to.
        my_extra_option: str
            This argument is only for demonstration
        """
        if my_extra_option is not None:
            print(f"Using my extra option for writing json: {my_extra_option}")

        data_dict = dataset.to_dict()
        with open(file_name, "w") as json_file:
            json.dump(data_dict, json_file)

Let’s verify that our new plugin was registered successfully under the format_names json and my_json.

[5]:
data_io_plugin_table()
[5]:

| Format name | load_dataset | save_dataset |
|-------------|--------------|--------------|
| ascii       | *            | *            |
| json        | *            | *            |
| my_json     | *            | *            |
| nc          | *            | *            |
| sdt         | *            | /            |

Now let’s use the example data from the quickstart to test the reading and writing capabilities of our plugin.

[6]:
from glotaran.io import load_dataset
from glotaran.io import save_dataset
from glotaran.testing.simulated_data.sequential_spectral_decay import DATASET as dataset
[7]:
dataset
[7]:
<xarray.Dataset>
Dimensions:   (time: 2100, spectral: 72)
Coordinates:
  * time      (time) float64 -1.0 -0.99 -0.98 -0.97 ... 19.96 19.97 19.98 19.99
  * spectral  (spectral) float64 600.0 601.4 602.8 604.2 ... 696.6 698.0 699.4
Data variables:
    data      (time, spectral) float64 -0.007945 0.008873 ... 2.582 2.299
Attributes:
    source_path:  dataset_1.nc

To get a feeling for our data, let’s plot some traces.

[8]:
plot_data = dataset.data.sel(spectral=[620, 630, 650], method="nearest")
plot_data.plot.line(x="time", aspect=2, size=5)
[8]:
[<matplotlib.lines.Line2D at 0x7f47c0f2f5e0>,
 <matplotlib.lines.Line2D at 0x7f47c0f2f610>,
 <matplotlib.lines.Line2D at 0x7f47c0f2f700>]

Since we want to see a difference between our saved and loaded data, we divide the amplitudes by 2 for no reason.

[9]:
dataset["data"] = dataset.data / 2

Now that we changed the data, let’s write them to a file.

But in which order were the arguments again? And are there any additional options?

Good thing we documented our new plugin, so we can just look up the help.

[10]:
from glotaran.io import show_data_io_method_help

show_data_io_method_help("json", "save_dataset")
Help on method save_dataset in module __main__:

save_dataset(dataset: 'xr.Dataset | xr.DataArray', file_name: 'str', *, my_extra_option=None) method of __main__.JsonDataIo instance
    Write xarray.Dataset to a json file

    Parameters
    ----------
    dataset : xr.Dataset
        Dataset to be saved to file.
    file_name : str
        File to write the result data to.
    my_extra_option: str
        This argument is only for demonstration

Note that the function save_dataset has additional arguments:

  • format_name: overwrites the inferred plugin selection

  • allow_overwrite: Allows to overwrite existing files (USE WITH CAUTION!!!)

[11]:
help(save_dataset)
Help on function save_dataset in module glotaran.plugin_system.data_io_registration:

save_dataset(dataset: 'xr.Dataset | xr.DataArray', file_name: 'StrOrPath', format_name: 'str | None' = None, *, data_filters: 'list[str] | None' = None, allow_overwrite: 'bool' = False, update_source_path: 'bool' = True, **kwargs: 'Any') -> 'None'
    Save data from :xarraydoc:`Dataset` or :xarraydoc:`DataArray` to a file.

    Parameters
    ----------
    dataset : xr.Dataset | xr.DataArray
        Data to be written to file.
    file_name : StrOrPath
        File to write the data to.
    format_name : str
        Format the file should be in, if not provided it will be inferred from the file extension.
    data_filters : list[str] | None
        Optional list of items in the dataset to be saved.
    allow_overwrite : bool
        Whether or not to allow overwriting existing files, by default False
    update_source_path: bool
        Whether or not to update the ``source_path`` attribute to ``file_name`` when saving.
        by default True
    **kwargs : Any
        Additional keyword arguments passes to the ``write_dataset`` implementation
        of the data io plugin. If you aren't sure about those use ``get_datasaver``
        to get the implementation with the proper help and autocomplete.

Since this is just an example and we don’t overwrite important data, we will use allow_overwrite=True. It also makes writing this documentation easier, since we don’t have to manually delete the test file each time we run the cell.

[12]:
save_dataset(
    dataset, "half_intensity.json", allow_overwrite=True, my_extra_option="just as an example"
)
Using my extra option for writing json: just as an example

Now let’s test our data loading functionality.

[13]:
reloaded_data = load_dataset("half_intensity.json", my_extra_option="just as an example")
reloaded_data
Using my extra option loading json: just as an example
[13]:
<xarray.Dataset>
Dimensions:   (time: 2100, spectral: 72)
Coordinates:
  * time      (time) float64 -1.0 -0.99 -0.98 -0.97 ... 19.96 19.97 19.98 19.99
  * spectral  (spectral) float64 600.0 601.4 602.8 604.2 ... 696.6 698.0 699.4
Data variables:
    data      (time, spectral) float64 -0.003973 0.004437 ... 1.291 1.15
Attributes:
    source_path:  half_intensity.json
    loader:       <function load_dataset at 0x7f47d391d630>
[14]:
reloaded_plot_data = reloaded_data.data.sel(spectral=[620, 630, 650], method="nearest")
reloaded_plot_data.plot.line(x="time", aspect=2, size=5)
[14]:
[<matplotlib.lines.Line2D at 0x7f47c09b54b0>,
 <matplotlib.lines.Line2D at 0x7f47c09b5570>,
 <matplotlib.lines.Line2D at 0x7f47c09b5660>]

This looks like the plot above, but with half the amplitude, so writing and reading our data worked as we hoped it would.

Writing a ProjectIo plugin works analogously:

|                   | DataIo plugin                                                 | ProjectIo plugin                                                    |
|-------------------|---------------------------------------------------------------|----------------------------------------------------------------------|
| Register function | glotaran.plugin_system.data_io_registration.register_data_io | glotaran.plugin_system.project_io_registration.register_project_io |
| Baseclass         | glotaran.io.interface.DataIoInterface                        | glotaran.io.interface.ProjectIoInterface                           |
| Possible methods  | load_dataset, save_dataset                                   | load_model, save_model, load_parameters, save_parameters, load_scheme, save_scheme, load_result, save_result |

Of course you don’t have to implement all methods (sometimes that doesn’t even make sense), but only the ones you need.
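As a sketch, a ProjectIo plugin that only supports loading parameters could look like this (the format name and class are hypothetical):

from glotaran.io.interface import ProjectIoInterface
from glotaran.plugin_system.project_io_registration import register_project_io


@register_project_io(["my_fmt"])
class MyFmtProjectIo(ProjectIoInterface):
    """ProjectIo plugin that only implements load_parameters."""

    def load_parameters(self, file_name: str):
        """Read glotaran parameters from a file in the hypothetical 'my_fmt' format."""
        # Parse file_name and return a glotaran parameters object here;
        # the actual parsing is left out of this sketch.
        raise NotImplementedError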

Last but not least:

Chances are that if you need a plugin, someone else does too, so it would be awesome if you published it open source, so the wheel isn’t reinvented over and over again.
