Welcome to pyglotaran’s documentation!

Introduction

Pyglotaran is a Python library for the global analysis of time-resolved spectroscopy data. It is designed to provide a state-of-the-art modeling toolbox to researchers in a user-friendly manner.

Its features are:

  • user-friendly modeling with a custom YAML (*.yml) based modeling language

  • parameter optimization using variable projection and non-negative least-squares algorithms

  • easily extensible modeling framework

  • battle-hardened models and algorithms for fluorescence dynamics

  • built upon and fully integrated into the standard Python science stack (NumPy, SciPy, Jupyter)

A Note To Glotaran Users

Although closely related and developed in the same lab, pyglotaran is not a replacement for Glotaran - A GUI For TIMP. Pyglotaran only aims to provide the modeling and optimization framework and algorithms. It is of course possible to develop a new GUI which leverages the power of pyglotaran (contributions welcome).

The current ‘user interface’ for pyglotaran is the Jupyter Notebook. It is designed to integrate seamlessly into this environment and to be compatible with all major visualization and data analysis tools in the scientific Python ecosystem.

If you are a non-technical user, you should give these tools a try; there are numerous tutorials on how to use them. You don’t really need to learn to program: if you can use e.g. Matlab or Mathematica, you can use Jupyter and Python.

Installation

Prerequisites

  • Python 3.6 or later

Windows

The easiest way to get Python (and some basic tools to work with it) on Windows is Anaconda.

You will need a terminal for the installation. Anaconda provides one, called Anaconda Prompt. You can find it in the start menu.

Note

If you use a Windows shell like cmd.exe or PowerShell, you might have to prefix ‘$PATH_TO_ANACONDA/’ to all commands (e.g. use C:/Anaconda/pip.exe instead of pip).

Stable release

Warning

pyglotaran is in early development, so for the moment stable releases are sparse and outdated. We try to keep the master branch stable, so please install from source for now.

This is normally the preferred method to install pyglotaran, as it will always install the most recent stable release.

To install pyglotaran, run this command in your terminal:

$ pip install pyglotaran

If you don’t have pip installed, the Python installation guide can walk you through the process.

If you want to install it via conda, you can run the following command:

$ conda install -c conda-forge pyglotaran

From sources

First you have to install or update some dependencies.

Within a terminal:

$ pip install -U numpy scipy Cython

Alternatively, for Anaconda users:

$ conda install numpy scipy Cython

Afterwards you can simply use pip to install it directly from GitHub:

$ pip install git+https://github.com/glotaran/pyglotaran.git

For updating pyglotaran, just re-run the command above.

If you prefer to manually download the source files, you can find them on GitHub. Alternatively, you can clone them with git (preferred):

$ git clone https://github.com/glotaran/pyglotaran.git

Within a terminal, navigate to the directory where you unpacked or cloned the code and enter

$ pip install -e .

For updating, simply download and unpack the newest version (or run $ git pull in the pyglotaran directory if you used git) and re-run the command above.

Quickstart/Cheat-Sheet

Warning

This is documentation for an early-access release of pyglotaran; certain aspects of how it works (as demonstrated in this quickstart) may be subject to change in future 0.x releases of the software. Please consult the pyglotaran readme to learn more about what this means.

To start using pyglotaran in your project, you have to import it first. In addition, we import some extra components for later use.

In [1]: import glotaran as gta

In [2]: from glotaran.analysis.optimize import optimize

In [3]: from glotaran.analysis.scheme import Scheme

Let us get some data to analyze:

In [4]: from glotaran.examples.sequential import dataset

In [5]: dataset
Out[5]: 
<xarray.Dataset>
Dimensions:   (spectral: 72, time: 2100)
Coordinates:
  * time      (time) float64 -1.0 -0.99 -0.98 -0.97 ... 19.96 19.97 19.98 19.99
  * spectral  (spectral) float64 600.0 601.4 602.8 604.2 ... 696.6 698.0 699.4
Data variables:
    data      (time, spectral) float64 0.01551 0.02087 0.001702 ... 1.714 1.538

Like all data in pyglotaran, the dataset is an xarray.Dataset. You can find more information about the xarray library on the xarray homepage.

The loaded dataset was simulated using a sequential model.
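Since it is a regular xarray.Dataset, you can inspect it with standard xarray accessors. A minimal sketch (plain xarray API, nothing pyglotaran-specific):

# The 'data' variable as a plain numpy array
values = dataset.data.values              # shape: (time, spectral)

# Coordinate values are numpy arrays as well
times = dataset.coords['time'].values

# Select a single time trace at the wavelength closest to 620 nm
trace = dataset.data.sel(spectral=620, method='nearest')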

To plot our data, we must first import matplotlib.

In [6]: import matplotlib.pyplot as plt

Now we can plot some time traces.

In [7]: plot_data = dataset.data.sel(spectral=[620, 630, 650], method='nearest')

In [8]: plot_data.plot.line(x='time', aspect=2, size=5);
[Image: time traces of the dataset (plot_usage_dataset_traces.png)]

We can also plot spectra at different times.

In [9]: plot_data = dataset.data.sel(time=[1, 10, 20], method='nearest')

In [10]: plot_data.plot.line(x='spectral', aspect=2, size=5);
[Image: spectra at selected times (plot_usage_dataset_spectra.png)]

To get an idea of how to model your data, you should inspect the singular value decomposition. Pyglotaran has a function to calculate it (among other things).

In [11]: dataset = gta.io.prepare_time_trace_dataset(dataset)

In [12]: dataset
Out[12]: 
<xarray.Dataset>
Dimensions:                      (left_singular_value_index: 72, right_singular_value_index: 72, singular_value_index: 72, spectral: 72, time: 2100)
Coordinates:
  * time                         (time) float64 -1.0 -0.99 -0.98 ... 19.98 19.99
  * spectral                     (spectral) float64 600.0 601.4 ... 698.0 699.4
Dimensions without coordinates: left_singular_value_index, right_singular_value_index, singular_value_index
Data variables:
    data                         (time, spectral) float64 0.01551 ... 1.538
    data_left_singular_vectors   (time, left_singular_value_index) float64 -8...
    data_singular_values         (singular_value_index) float64 4.62e+03 ... ...
    data_right_singular_vectors  (right_singular_value_index, spectral) float64 ...

First, take a look at the first 10 singular values:

In [13]: plot_data = dataset.data_singular_values.sel(singular_value_index=range(0, 10))

In [14]: plot_data.plot(yscale='log', marker='o', linewidth=0, aspect=2, size=5);
[Image: first 10 singular values (quickstart_data_singular_values.png)]

To analyze our data, we need to create a model. Create a file called model.yml in your working directory and fill it with the following:

type: kinetic-spectrum

initial_concentration:
  input:
    compartments: [s1, s2, s3]
    parameters: [input.1, input.0, input.0]

k_matrix:
  k1:
    matrix:
      (s2, s1): kinetic.1
      (s3, s2): kinetic.2
      (s3, s3): kinetic.3

megacomplex:
  m1:
    k_matrix: [k1]

irf:
  irf1:
    type: gaussian
    center: irf.center
    width: irf.width

dataset:
  dataset1:
    initial_concentration: input
    megacomplex: [m1]
    irf: irf1

Now you can load the model file.

In [15]: model = gta.read_model_from_yaml_file('model.yml')

You can check your model for problems with model.validate.

In [16]: print(model.validate())
Your model is valid.

Now define some starting parameters. Create a file called parameters.yml with the following content.

input:
  - ['1', 1, {'vary': False, 'non-negative': False}]
  - ['0', 0, {'vary': False, 'non-negative': False}]

kinetic: [
     0.5,
     0.3,
     0.1,
]

irf:
  - ['center', 0.3]
  - ['width', 0.1]

In [17]: parameters = gta.read_parameters_from_yaml_file('parameters.yml')

You can also use model.validate to check for missing parameters.

In [18]: print(model.validate(parameters=parameters))
Your model is valid.

Since not all problems in the model can be detected automatically, it is wise to inspect the model visually. For this purpose, you can simply print the model.

In [19]: print(model)
# Model

_Type_: kinetic-spectrum

## Initial Concentration

* **input**:
  * *Label*: input
  * *Compartments*: ['s1', 's2', 's3']
  * *Parameters*: [input.1, input.0, input.0]
  * *Exclude From Normalize*: []

## K Matrix

* **k1**:
  * *Label*: k1
  * *Matrix*: 
    * *('s2', 's1')*: kinetic.1
    * *('s3', 's2')*: kinetic.2
    * *('s3', 's3')*: kinetic.3
  

## Irf

* **irf1** (gaussian):
  * *Label*: irf1
  * *Type*: gaussian
  * *Center*: irf.center
  * *Width*: irf.width
  * *Normalize*: False
  * *Backsweep*: False

## Dataset

* **dataset1**:
  * *Label*: dataset1
  * *Megacomplex*: ['m1']
  * *Initial Concentration*: input
  * *Irf*: irf1

## Megacomplex

* **m1**:
  * *Label*: m1
  * *K Matrix*: ['k1']

In the same way, you should inspect your parameters.

In [20]: print(parameters)
* __None__:
  * __input__:
    * __1__: _Value_: 1.0, _StdErr_: 0.0, _Min_: -inf, _Max_: inf, _Vary_: False, _Non-Negative_: False, _Expr_: None
    * __0__: _Value_: 0.0, _StdErr_: 0.0, _Min_: -inf, _Max_: inf, _Vary_: False, _Non-Negative_: False, _Expr_: None
  * __kinetic__:
    * __1__: _Value_: 0.5, _StdErr_: 0.0, _Min_: -inf, _Max_: inf, _Vary_: True, _Non-Negative_: False, _Expr_: None
    * __2__: _Value_: 0.3, _StdErr_: 0.0, _Min_: -inf, _Max_: inf, _Vary_: True, _Non-Negative_: False, _Expr_: None
    * __3__: _Value_: 0.1, _StdErr_: 0.0, _Min_: -inf, _Max_: inf, _Vary_: True, _Non-Negative_: False, _Expr_: None
  * __irf__:
    * __center__: _Value_: 0.3, _StdErr_: 0.0, _Min_: -inf, _Max_: inf, _Vary_: True, _Non-Negative_: False, _Expr_: None
    * __width__: _Value_: 0.1, _StdErr_: 0.0, _Min_: -inf, _Max_: inf, _Vary_: True, _Non-Negative_: False, _Expr_: None

Now we have everything together to optimize our parameters. First we bundle the model, the parameters and the data into a Scheme, then we run the optimization.

In [21]: scheme = Scheme(model, parameters, {'dataset1': dataset})

In [22]: result = optimize(scheme)
   Iteration     Total nfev        Cost      Cost reduction    Step norm     Optimality   
       0              1         7.5381e+00                                    3.74e+01    
       1              2         7.5377e+00      4.62e-04       1.02e-04       3.28e-01    
       2              3         7.5377e+00      9.13e-10       3.48e-08       3.49e-06    
`ftol` termination condition is satisfied.
Function evaluations 3, initial cost 7.5381e+00, final cost 7.5377e+00, first-order optimality 3.49e-06.

In [23]: print(result)
Optimization Result            |            |
-------------------------------|------------|
 Number of residual evaluation |          3 |
           Number of variables |          5 |
          Number of datapoints |     151200 |
            Degrees of freedom |     151195 |
                    Chi Square |   1.51e+01 |
            Reduced Chi Square |   9.97e-05 |
 Root Mean Square Error (RMSE) |   9.99e-03 |


In [24]: print(result.optimized_parameters)
* __None__:
  * __input__:
    * __1__: _Value_: 1.0, _StdErr_: 0.0, _Min_: -inf, _Max_: inf, _Vary_: False, _Non-Negative_: False, _Expr_: None
    * __0__: _Value_: 0.0, _StdErr_: 0.0, _Min_: -inf, _Max_: inf, _Vary_: False, _Non-Negative_: False, _Expr_: None
  * __kinetic__:
    * __1__: _Value_: 0.4999391441068187, _StdErr_: 0.007261698651628417, _Min_: -inf, _Max_: inf, _Vary_: True, _Non-Negative_: False, _Expr_: None
    * __2__: _Value_: 0.30008042937966384, _StdErr_: 0.0041961923515427, _Min_: -inf, _Max_: inf, _Vary_: True, _Non-Negative_: False, _Expr_: None
    * __3__: _Value_: 0.0999874440333988, _StdErr_: 0.0004778304467913974, _Min_: -inf, _Max_: inf, _Vary_: True, _Non-Negative_: False, _Expr_: None
  * __irf__:
    * __center__: _Value_: 0.30000199460726046, _StdErr_: 0.0005010729238540931, _Min_: -inf, _Max_: inf, _Vary_: True, _Non-Negative_: False, _Expr_: None
    * __width__: _Value_: 0.09999158675249081, _StdErr_: 0.0006703684374783712, _Min_: -inf, _Max_: inf, _Vary_: True, _Non-Negative_: False, _Expr_: None

You can get the resulting data for your dataset with result.get_dataset.

In [25]: result_dataset = result.get_dataset('dataset1')

In [26]: result_dataset
Out[26]: 
<xarray.Dataset>
Dimensions:                                   (clp_label: 3, component: 3, from_species: 3, left_singular_value_index: 72, right_singular_value_index: 72, singular_value_index: 72, species: 3, spectral: 72, time: 2100, to_species: 3)
Coordinates:
  * time                                      (time) float64 -1.0 ... 19.99
  * spectral                                  (spectral) float64 600.0 ... 699.4
  * clp_label                                 (clp_label) <U2 's1' 's2' 's3'
  * species                                   (species) <U2 's1' 's2' 's3'
    rate                                      (component) float64 -0.4999 ......
    lifetime                                  (component) float64 -2.0 ... -10.0
  * to_species                                (to_species) <U2 's1' 's2' 's3'
  * from_species                              (from_species) <U2 's1' 's2' 's3'
Dimensions without coordinates: component, left_singular_value_index, right_singular_value_index, singular_value_index
Data variables: (12/24)
    data                                      (time, spectral) float64 0.0155...
    data_left_singular_vectors                (time, left_singular_value_index) float64 ...
    data_singular_values                      (singular_value_index) float64 ...
    data_right_singular_vectors               (right_singular_value_index, spectral) float64 ...
    matrix                                    (time, clp_label) float64 6.006...
    clp                                       (spectral, clp_label) float64 1...
    ...                                        ...
    a_matrix                                  (component, species) float64 1....
    k_matrix                                  (to_species, from_species) float64 ...
    k_matrix_reduced                          (to_species, from_species) float64 ...
    irf_center                                float64 0.3
    irf_width                                 float64 0.09999
    irf                                       (time) float64 1.976e-37 ... 0.0
Attributes:
    root_mean_square_error:           0.009985218742264433
    weighted_root_mean_square_error:  0.009985218742264433

The resulting data can be visualized the same way as the dataset. To judge the quality of the fit, you should look at the first left and right singular vectors of the residual.

In [27]: plot_data = result_dataset.residual_left_singular_vectors.sel(left_singular_value_index=0)

In [28]: plot_data.plot.line(x='time', aspect=2, size=5);
[Image: first left singular vector of the residual (plot_quickstart_lsv.png)]

In [29]: plot_data = result_dataset.residual_right_singular_vectors.sel(right_singular_value_index=0)

In [30]: plot_data.plot.line(x='spectral', aspect=2, size=5);
[Image: first right singular vector of the residual (plot_quickstart_rsv.png)]

Finally, you can save your result.

In [31]: result_dataset.to_netcdf('dataset1.nc')
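Because the result is a regular xarray.Dataset saved as netCDF, it can be reloaded in a later session with plain xarray. A minimal sketch using the standard xarray API:

import xarray as xr

# Reload the saved result
result_dataset = xr.open_dataset('dataset1.nc')

# Attributes such as the root mean square error are preserved
print(result_dataset.attrs['root_mean_square_error'])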

History

0.3.3 (2021-03-18)

  • Force recalculation of SVD attributes in scheme._prepare_data (#597)

  • Remove unneeded check in spectral_penalties._get_area (fixes #598)

  • Added python 3.9 support (#450)

0.3.2 (2021-02-28)

  • Re-release of version 0.3.1 due to packaging issue

0.3.1 (2021-02-28)

  • Added compatibility for numpy 1.20 and raised minimum required numpy version to 1.20 (#555)

  • Fixed excessive memory consumption in result creation due to full SVD computation (#574)

  • Added feature parameter history (#557)

  • Moved setup logic to setup.cfg (#560)

0.3.0 (2021-02-11)

  • Significant code refactor with small API changes to parameter relation specification (see docs)

  • Replaced lmfit with scipy.optimize

0.2.0 (2020-12-02)

  • Large refactor with significant improvements but also small API changes (see docs)

  • Removed doas plugin

0.1.0 (2020-07-14)

  • Package was renamed to pyglotaran on PyPi

0.0.8 (2018-08-07)

  • Changed nan_policy to omit

0.0.7 (2018-08-07)

  • Added support for multiple shapes per compartment.

0.0.6 (2018-08-07)

  • First release on PyPI, support for Windows installs added.

  • Pre-Alpha Development

Authors

Development Lead

Contributors

Special Thanks

  • Stefan Schuetz

  • Sergey P. Laptenok

Supervision

Original publications

  1. Snellenburg JJ, Laptenok SP, Seger R, Mullen KM, van Stokkum IHM (2012). “Glotaran: A Java-Based Graphical User Interface for the R Package TIMP.” Journal of Statistical Software, 49(3), 1–22. URL http://www.jstatsoft.org/v49/i03/.

  2. Mullen KM, van Stokkum IHM (2007). “TIMP: An R Package for Modeling Multi-way Spectroscopic Measurements.” Journal of Statistical Software, 18(3), 1–46. URL https://www.jstatsoft.org/article/view/v018i03

  3. van Stokkum IHM, Larsen DS, van Grondelle R (2004). “Global and target analysis of time-resolved spectra.” Biochimica et Biophysica Acta (BBA) - Bioenergetics, 1657(2-3), 82–104. URL https://doi.org/10.1016/j.bbabio.2004.04.011

Overview

Data IO

Plotting

Modelling

Parameter

Optimizing

API Documentation

The API Documentation for pyglotaran is automatically created from its docstrings.

glotaran

Glotaran package __init__.py

Contributing

Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.

You can contribute in many ways:

Types of Contributions

Report Bugs

Report bugs at https://github.com/glotaran/pyglotaran/issues.

If you are reporting a bug, please include:

  • Your operating system name and version.

  • Any details about your local setup that might be helpful in troubleshooting.

  • Detailed steps to reproduce the bug.

Fix Bugs

Look through the GitHub issues for bugs. Anything tagged with “bug” and “help wanted” is open to whoever wants to implement it.

Implement Features

Look through the GitHub issues for features. Anything tagged with “enhancement” and “help wanted” is open to whoever wants to implement it.

Write Documentation

pyglotaran could always use more documentation, whether as part of the official pyglotaran docs, in docstrings, or even on the web in blog posts, articles, and such. If you are writing docstrings, please use the NumPyDoc style.
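For reference, a minimal sketch of the NumPyDoc layout (the function itself is hypothetical):

def scale_traces(data, factor=1.0):
    """Scale time traces by a constant factor.

    Parameters
    ----------
    data : numpy.ndarray
        Array of shape (time, spectral) holding the traces.
    factor : float, optional
        Multiplicative scaling factor, by default 1.0.

    Returns
    -------
    numpy.ndarray
        The scaled traces.
    """
    return data * factor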

Submit Feedback

The best way to send feedback is to file an issue at https://github.com/glotaran/pyglotaran/issues.

If you are proposing a feature:

  • Explain in detail how it would work.

  • Keep the scope as narrow as possible, to make it easier to implement.

  • Remember that this is a volunteer-driven project, and that contributions are welcome :)

Get Started!

Ready to contribute? Here’s how to set up pyglotaran for local development.

  1. Fork the pyglotaran repo on GitHub.

  2. Clone your fork locally:

    $ git clone https://github.com/<your_name_here>/pyglotaran.git
    
  3. Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, this is how you set up your fork for local development:

    $ mkvirtualenv pyglotaran
    (pyglotaran)$ cd pyglotaran
    (pyglotaran)$ python -m pip install -r requirements_dev.txt
    (pyglotaran)$ pip install -e . --process-dependency-links
    
  4. Install the pre-commit hooks, to automatically format and check your code:

    $ pre-commit install
    
  5. Create a branch for local development:

    $ git checkout -b name-of-your-bugfix-or-feature
    

    Now you can make your changes locally.

  6. When you’re done making changes, check that your changes pass flake8 and the tests, including testing other Python versions with tox:

    $ pre-commit run -a
    $ py.test
    

    Or to run all at once:

    $ tox
    
  7. Commit your changes and push your branch to GitHub:

    $ git add .
    $ git commit -m "Your detailed description of your changes."
    $ git push origin name-of-your-bugfix-or-feature
    
  8. Submit a pull request through the GitHub website.

Pull Request Guidelines

Before you submit a pull request, check that it meets these guidelines:

  1. The pull request should include tests.

  2. If the pull request adds functionality, the docs should be updated. Put your new functionality into a function with a docstring.

  3. The pull request should work for Python 3.8 and 3.9. Check your GitHub Actions (https://github.com/<your_name_here>/pyglotaran/actions) and make sure that the tests pass for all supported Python versions.

Docstrings

We use numpy style docstrings, which can also be autogenerated from function/method signatures by extensions for your editor.

Some extensions for popular editors are:

Note

If your pull request improves the docstring coverage (check with pre-commit run -a interrogate), please raise the value of the interrogate setting fail-under in pyproject.toml. That way the next person will improve the docstring coverage as well, and everyone can enjoy better documentation.
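The relevant setting lives in the [tool.interrogate] section of pyproject.toml and might look like this (the threshold below is only illustrative):

[tool.interrogate]
# Raise this threshold whenever a pull request improves docstring coverage
fail-under = 60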

Warning

As soon as all our docstrings are in proper shape, we will enforce that they stay that way. If you want to check whether your docstrings are fine, you can use pydocstyle and darglint.

Tips

To run a subset of tests:

$ py.test tests.test_pyglotaran
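pytest can also select tests by keyword expression with its standard -k option, for example (the keyword below is only illustrative):

$ py.test -k "kinetic"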

Deploying

A reminder for the maintainers on how to deploy. Make sure all your changes are committed (including an entry in HISTORY.rst). The version number only needs to be changed in glotaran/__init__.py.

Then make a new release on GitHub and give the tag a proper name, e.g. 0.3.0, since it might be included in a citation.

GitHub Actions will then deploy to PyPI if the tests pass.
