Welcome to pyglotaran’s documentation!
Introduction
Pyglotaran is a Python library for global analysis of time-resolved spectroscopy data. It is designed to provide a state-of-the-art modeling toolbox to researchers in a user-friendly manner.
Its features are:
user-friendly modeling with a custom YAML (*.yml) based modeling language
parameter optimization using variable projection and non-negative least-squares algorithms
easy to extend modeling framework
battle-hardened model and algorithms for fluorescence dynamics
built upon and fully integrated in the standard Python science stack (NumPy, SciPy, Jupyter)
A Note To Glotaran Users
Although closely related and developed in the same lab, pyglotaran is not a replacement for Glotaran - A GUI For TIMP. Pyglotaran only aims to provide the modeling and optimization framework and algorithms. It is of course possible to develop a new GUI which leverages the power of pyglotaran (contributions welcome).
The current ‘user-interface’ for pyglotaran is the Jupyter Notebook. It is designed to seamlessly integrate into this environment and to be compatible with all major visualization and data analysis tools in the scientific Python environment.
If you are a non-technical user, you should give these tools a try; there are numerous tutorials on how to use them. You don’t really need to learn to program: if you can use e.g. Matlab or Mathematica, you can use Jupyter and Python.
Installation
Prerequisites
Python 3.10 or 3.11
Windows
The easiest way of getting Python (and some basic tools to work with it) on Windows is to use Anaconda.
You will need a terminal for the installation. Anaconda provides one, called Anaconda Prompt; you can find it in the start menu.
Note
If you use a Windows Shell like cmd.exe or PowerShell, you might have to prefix ‘$PATH_TO_ANACONDA/’ to all commands (e.g. C:/Anaconda/pip.exe instead of pip)
Stable release
Warning
pyglotaran is in early development, so for the moment stable releases are sparse and outdated. We try to keep the master branch stable, so please install from source for now.
This is the preferred method to install pyglotaran, as it will always install the most recent stable release.
To install pyglotaran, run this command in your terminal:
$ pip install pyglotaran
If you don’t have pip installed, this Python installation guide can guide you through the process.
If you want to install it via conda, you can run the following command:
$ conda install -c conda-forge pyglotaran
From sources
First you have to install or update some dependencies.
Within a terminal:
$ pip install -U numpy scipy Cython
Alternatively, for Anaconda users:
$ conda install numpy scipy Cython
Afterwards you can simply use pip to install it directly from Github.
$ pip install git+https://github.com/glotaran/pyglotaran.git
For updating pyglotaran, just re-run the command above.
If you prefer to manually download the source files, you can find them on Github. Alternatively you can clone them with git (preferred):
$ git clone https://github.com/glotaran/pyglotaran.git
Within a terminal, navigate to the directory where you have unpacked or cloned the code and enter
$ pip install -e .
For updating, simply download and unpack the newest version (or run $ git pull in the pyglotaran directory if you used git) and re-run the command above.
Quickstart/Cheat-Sheet
To start using pyglotaran in your analysis, you only have to import the Project class and open a project.
[1]:
from glotaran.project import Project
quickstart_project = Project.open("quickstart_project")
quickstart_project
[1]:
If the project does not already exist, this will create a new project and its folder structure for you. In our case we had only the models and parameters folders, and the data and results folders were created when opening the project.
[2]:
%ls quickstart_project
data/ models/ parameters/ project.gta results/
Let us get some example data to analyze:
[3]:
from glotaran.testing.simulated_data.sequential_spectral_decay import DATASET as my_dataset
my_dataset
[3]:
<xarray.Dataset> Dimensions: (time: 2100, spectral: 72) Coordinates: * time (time) float64 -1.0 -0.99 -0.98 -0.97 ... 19.96 19.97 19.98 19.99 * spectral (spectral) float64 600.0 601.4 602.8 604.2 ... 696.6 698.0 699.4 Data variables: data (time, spectral) float64 0.002915 -0.008417 ... 2.562 2.291 Attributes: source_path: dataset_1.nc
Like all data in pyglotaran, the dataset is a xarray.Dataset. You can find more information about the xarray library on the xarray homepage.
The loaded dataset was simulated with a sequential model.
Plotting raw data
Now let’s plot some time traces.
[4]:
plot_data = my_dataset.data.sel(spectral=[620, 630, 650], method="nearest")
plot_data.plot.line(x="time", aspect=2, size=5);
We can also plot spectra at different times.
[5]:
plot_data = my_dataset.data.sel(time=[1, 10, 20], method="nearest")
plot_data.plot.line(x="spectral", aspect=2, size=5);
Import the data into your project
As long as you can read your data into a xarray.Dataset or xarray.DataArray, you can directly import it into your project.
This will save your data as a NetCDF (.nc) file into the data folder inside of your project, with the name that you gave it (here quickstart_project/data/my_data.nc).
If the data format you are using is supported by a plugin, you can simply copy the file to the data folder of the project (here quickstart_project/data).
[6]:
quickstart_project.import_data(my_dataset, dataset_name="my_data")
quickstart_project
[6]:
After importing, our quickstart_project is aware of the dataset that we named my_data.
Preparing data
To get an idea about how to model your data, you should inspect the singular value decomposition. As a convenience, the load_data method has the option to add SVD data on the fly.
[7]:
dataset_with_svd = quickstart_project.load_data("my_data", add_svd=True)
dataset_with_svd
[7]:
<xarray.Dataset> Dimensions: (time: 2100, spectral: 72, left_singular_value_index: 72, singular_value_index: 72, right_singular_value_index: 72) Coordinates: * time (time) float64 -1.0 -0.99 -0.98 ... 19.98 19.99 * spectral (spectral) float64 600.0 601.4 ... 698.0 699.4 Dimensions without coordinates: left_singular_value_index, singular_value_index, right_singular_value_index Data variables: data (time, spectral) float64 0.002915 ... 2.291 data_left_singular_vectors (time, left_singular_value_index) float64 2.... data_singular_values (singular_value_index) float64 6.577e+03 ...... data_right_singular_vectors (spectral, right_singular_value_index) float64 ... Attributes: source_path: /home/docs/checkouts/readthedocs.org/user_builds/pyglotaran... loader: <function load_dataset at 0x7fd2b5ed8820>
First, take a look at the first 10 singular values:
[8]:
plot_data = dataset_with_svd.data_singular_values.sel(singular_value_index=range(10))
plot_data.plot(yscale="log", marker="o", linewidth=0, aspect=2, size=5);
This tells us that our data have at least three components which we need to model.
Working with models
To analyze our data, we need to create a model.
Create a file called my_model.yaml in your project’s models directory and fill it with the following content.
[9]:
quickstart_project.show_model_definition("my_model")
[9]:
default_megacomplex: decay

initial_concentration:
  input:
    compartments: [s1, s2, s3]
    parameters: [input.1, input.0, input.0]

k_matrix:
  k1:
    matrix:
      (s2, s1): kinetic.1
      (s3, s2): kinetic.2
      (s3, s3): kinetic.3

megacomplex:
  m1:
    k_matrix: [k1]

irf:
  irf1:
    type: gaussian
    center: irf.center
    width: irf.width

dataset:
  my_data:
    initial_concentration: input
    megacomplex: [m1]
    irf: irf1
You can check your model for problems with the validate method.
[10]:
quickstart_project.validate("my_model")
[10]:
Your model is valid.
Working with parameters
Now define some starting parameters. Create a file called my_parameters.yaml in your project’s parameters directory with the following content.
[11]:
quickstart_project.show_parameters_definition("my_parameters")
[11]:
input:
  - ["1", 1, { "vary": False }]
  - ["0", 0, { "vary": False }]

kinetic: [0.51, 0.31, 0.11]

irf:
  - ["center", 0.31]
  - ["width", 0.11]
Note the { "vary": False }, which tells pyglotaran that those parameters should not be changed.
You can also use the validate method to check for missing parameters.
[12]:
quickstart_project.validate("my_model", "my_parameters")
[12]:
Your model is valid.
Since not all problems in the model can be detected automatically, it is wise to visually inspect the model. For this purpose, you can just load the model and inspect its markdown-rendered version.
[13]:
quickstart_project.load_model("my_model")
[13]:
Model
Dataset Groups
default
Label: default
Residual Function: variable_projection
K Matrix
k1
Label: k1
Matrix: {(‘s2’, ‘s1’): ‘kinetic.1’, (‘s3’, ‘s2’): ‘kinetic.2’, (‘s3’, ‘s3’): ‘kinetic.3’}
Megacomplex
m1
Label: m1
Dimension: time
Type: decay
K Matrix: [‘k1’]
Initial Concentration
input
Label: input
Compartments: [‘s1’, ‘s2’, ‘s3’]
Parameters: [‘input.1’, ‘input.0’, ‘input.0’]
Exclude From Normalize: []
Irf
irf1
Label: irf1
Normalize: True
Backsweep: False
Type: gaussian
Center: irf.center
Width: irf.width
Dataset
my_data
Label: my_data
Group: default
Force Index Dependent: False
Megacomplex: [‘m1’]
Initial Concentration: input
Irf: irf1
The same way you should inspect your parameters.
[14]:
quickstart_project.load_parameters("my_parameters")
[14]:
input:

Label | Value | Standard Error | t-value | Minimum | Maximum | Vary | Non-Negative | Expression
---|---|---|---|---|---|---|---|---
1 | 1.000e+00 | nan | nan | -inf | inf | False | False | None
0 | 0.000e+00 | nan | nan | -inf | inf | False | False | None

irf:

Label | Value | Standard Error | t-value | Minimum | Maximum | Vary | Non-Negative | Expression
---|---|---|---|---|---|---|---|---
center | 3.100e-01 | nan | nan | -inf | inf | True | False | None
width | 1.100e-01 | nan | nan | -inf | inf | True | False | None

kinetic:

Label | Value | Standard Error | t-value | Minimum | Maximum | Vary | Non-Negative | Expression
---|---|---|---|---|---|---|---|---
1 | 5.100e-01 | nan | nan | -inf | inf | True | False | None
2 | 3.100e-01 | nan | nan | -inf | inf | True | False | None
3 | 1.100e-01 | nan | nan | -inf | inf | True | False | None
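If you prefer to work with parameters programmatically, you can also look up individual values by label. A small sketch (it assumes the loaded parameters object exposes a get method, as ParameterGroup did in earlier versions):

parameters = quickstart_project.load_parameters("my_parameters")
# Look up a single parameter by its label; 0.51 is the starting value from above
print(parameters.get("kinetic.1").value)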
Optimizing data
Now we have everything together to optimize our parameters.
[15]:
result = quickstart_project.optimize("my_model", "my_parameters")
result
Iteration Total nfev Cost Cost reduction Step norm Optimality
0 1 1.1176e+04 1.73e+06
1 2 1.4981e+01 1.12e+04 1.96e-02 1.27e+04
2 3 7.5475e+00 7.43e+00 5.86e-03 1.01e+03
3 4 7.5437e+00 3.86e-03 1.65e-05 6.31e-02
4 5 7.5437e+00 1.50e-11 1.67e-09 7.92e-06
Both `ftol` and `xtol` termination conditions are satisfied.
Function evaluations 5, initial cost 1.1176e+04, final cost 7.5437e+00, first-order optimality 7.92e-06.
[15]:
Optimization Result |
---|---
Number of residual evaluation | 5
Number of residuals | 151200
Number of free parameters | 5
Number of conditionally linear parameters | 216
Degrees of freedom | 150979
Chi Square | 1.51e+01
Reduced Chi Square | 9.99e-05
Root Mean Square Error (RMSE) | 1.00e-02
Model
Dataset Groups
default
Label: default
Residual Function: variable_projection
K Matrix
k1
Label: k1
Matrix: {(‘s2’, ‘s1’): ‘kinetic.1(5.00e-01±6.78e-05, t-value: 7372, initial: 5.10e-01)’, (‘s3’, ‘s2’): ‘kinetic.2(3.00e-01±3.93e-05, t-value: 7629, initial: 3.10e-01)’, (‘s3’, ‘s3’): ‘kinetic.3(1.00e-01±4.22e-06, t-value: 23676, initial: 1.10e-01)’}
Megacomplex
m1
Label: m1
Dimension: time
Type: decay
K Matrix: [‘k1’]
Initial Concentration
input
Label: input
Compartments: [‘s1’, ‘s2’, ‘s3’]
Parameters: [‘input.1(1.00e+00, fixed)’, ‘input.0(0.00e+00, fixed)’, ‘input.0(0.00e+00, fixed)’]
Exclude From Normalize: []
Irf
irf1
Label: irf1
Normalize: True
Backsweep: False
Type: gaussian
Center: irf.center(3.00e-01±5.03e-06, t-value: 59583, initial: 3.10e-01)
Width: irf.width(1.00e-01±6.72e-06, t-value: 14884, initial: 1.10e-01)
Dataset
my_data
Label: my_data
Group: default
Force Index Dependent: False
Megacomplex: [‘m1’]
Initial Concentration: input
Irf: irf1
Each time you run an optimization, the result will be saved in the project’s results folder.
[16]:
%ls "quickstart_project/results"
my_model_run_0000/
To visualize how quickly the optimization converged, we can plot the optimality of the optimization_history.
[17]:
result.optimization_history.data["optimality"].plot(logy=True)
[17]:
<Axes: xlabel='iteration'>
[18]:
result.optimized_parameters
[18]:
input:

Label | Value | Standard Error | t-value | Minimum | Maximum | Vary | Non-Negative | Expression
---|---|---|---|---|---|---|---|---
1 | 1.000e+00 | nan | nan | -inf | inf | False | False | None
0 | 0.000e+00 | nan | nan | -inf | inf | False | False | None

irf:

Label | Value | Standard Error | t-value | Minimum | Maximum | Vary | Non-Negative | Expression
---|---|---|---|---|---|---|---|---
center | 3.000e-01 | 5.035e-06 | 59583 | -inf | inf | True | False | None
width | 9.999e-02 | 6.718e-06 | 14884 | -inf | inf | True | False | None

kinetic:

Label | Value | Standard Error | t-value | Minimum | Maximum | Vary | Non-Negative | Expression
---|---|---|---|---|---|---|---|---
1 | 5.000e-01 | 6.782e-05 | 7372 | -inf | inf | True | False | None
2 | 3.000e-01 | 3.932e-05 | 7629 | -inf | inf | True | False | None
3 | 1.000e-01 | 4.224e-06 | 23676 | -inf | inf | True | False | None
You can inspect the data of your result by accessing its data attribute. In our example it only contains our single my_data dataset, but it can contain as many datasets as your analysis needs.
[19]:
result.data
[19]:
{'my_data': <xarray.Dataset>}
my_data
<xarray.Dataset> Dimensions: (time: 2100, spectral: 72, left_singular_value_index: 72, singular_value_index: 72, right_singular_value_index: 72, clp_label: 3, species: 3, component_m1: 3, species_m1: 3, to_species_m1: 3, from_species_m1: 3) Coordinates: * time (time) float64 -1.0 -0.99 ... 19.98 19.99 * spectral (spectral) float64 600.0 601.4 ... 699.4 * clp_label (clp_label) <U2 's1' 's2' 's3' * species (species) <U2 's1' 's2' 's3' * component_m1 (component_m1) int64 1 2 3 rate_m1 (component_m1) float64 0.5 0.3 0.1 lifetime_m1 (component_m1) float64 2.0 3.333 10.0 * species_m1 (species_m1) <U2 's1' 's2' 's3' initial_concentration_m1 (species_m1) float64 1.0 0.0 0.0 * to_species_m1 (to_species_m1) <U2 's1' 's2' 's3' * from_species_m1 (from_species_m1) <U2 's1' 's2' 's3' Dimensions without coordinates: left_singular_value_index, singular_value_index, right_singular_value_index Data variables: (12/21) data (time, spectral) float64 0.002915 ... 2.291 data_left_singular_vectors (time, left_singular_value_index) float64 ... data_singular_values (singular_value_index) float64 6.577e+03... data_right_singular_vectors (spectral, right_singular_value_index) float64 ... residual (time, spectral) float64 0.002915 ... -0... matrix (time, clp_label) float64 6e-39 ... 0.2516 ... ... irf_center float64 0.3 irf_width float64 0.09999 decay_associated_spectra_m1 (spectral, component_m1) float64 31.29 .... a_matrix_m1 (component_m1, species_m1) float64 1.0 .... k_matrix_m1 (to_species_m1, from_species_m1) float64 ... k_matrix_reduced_m1 (to_species_m1, from_species_m1) float64 ... Attributes: source_path: /home/docs/checkouts/readthedocs.org/us... model_dimension: time global_dimension: spectral root_mean_square_error: 0.00998919004747027 weighted_root_mean_square_error: 0.00998919004747027 dataset_scale: 1 loader: <function load_dataset at 0x7fd2b5ed8820>
Visualize the Result
The resulting data can be visualized the same way as the dataset. To judge the quality of the fit, you should look at the first left and right singular vectors of the residual.
[20]:
result_dataset = result.data["my_data"]
residual_left = result_dataset.residual_left_singular_vectors.sel(left_singular_value_index=0)
residual_right = result_dataset.residual_right_singular_vectors.sel(right_singular_value_index=0)
residual_left.plot.line(x="time", aspect=2, size=5)
residual_right.plot.line(x="spectral", aspect=2, size=5);
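To compare fit and data directly, you can also overlay a fitted time trace with the measured one. A small sketch (it assumes the result dataset contains a fitted_data variable, as result datasets of recent pyglotaran versions do):

import matplotlib.pyplot as plt

# Select one spectral trace and plot data and fit on the same axes
trace = result_dataset.sel(spectral=650, method="nearest")
fig, ax = plt.subplots(figsize=(10, 5))
trace.data.plot.line(x="time", ax=ax, label="data")
trace.fitted_data.plot.line(x="time", ax=ax, label="fit")
ax.legend();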
Changelog
🚀 0.7.1 (2023-07-28)
✨ Features
✨ Python 3.11 support (#1161)
🩹 Bug fixes
🩹 Fix coherent artifact clp label duplication (#1292)
🚀 0.7.0 (Unreleased)
💥 BREAKING CHANGE
💥🚧 Dropped support for Python 3.8 and 3.9 and only support 3.10 (#1135)
✨ Features
✨ Add optimization history to result and iteration column to parameter history (#1134)
♻️ Complete refactor of model and parameter packages using attrs (#1135)
♻️ Move index dependent calculation to megacomplexes for speed-up (#1175)
✨ Add PreProcessingPipeline (#1256, #1263)
👌 Minor Improvements:
👌🎨 Wrap model section in result markdown in details tag for notebooks (#1098)
👌 Allow more natural column names in pandas parameters file reading (#1174)
✨ Integrate plugin system into Project (#1229)
👌 Make yaml the default plugin when passing a folder to save_result and load_result (#1230)
✨ Allow usage of subfolders in project API for parameters, models and data (#1232)
✨ Allow import of xarray objects in project API import_data (#1235)
🩹 Add number_of_clps to result and correct degrees_of_freedom calculation (#1249)
👌 Improve Project API data handling (#1257)
🗑️ Deprecate Result.number_of_parameters in favor of Result.number_of_free_parameters (#1262)
👌 Improve reporting of standard error in case of non_negative constraint in the parameter (#1320)
🩹 Bug fixes
🩹 Fix result data overwritten when using multiple dataset_groups (#1147)
🩹 Fix for normalization issue described in #1157 (multi-gaussian irfs and multiple time ranges (streak))
🩹 Fix for crash described in #1183 when doing an optimization using more than 30 datasets (#1184)
🩹 Fix pretty_format_numerical for negative values (#1192)
🩹 Fix yaml result saving with relative paths (#1199)
🩹 Fix model markdown render for items without label (#1213)
🩹 Fix wrong file loading due to partial filename matching in Project (#1212)
🩹 Fix Project.import_data path resolving for different script and cwd (#1214)
👌 Refine project API (#1240)
🩹📚 Fix search in docs (#1268)
📚 Documentation
📚 Update quickstart guide to use Project API (#1241)
🗑️ Deprecations (due in 0.8.0)
<model_file>.clp_area_penalties -> <model_file>.clp_penalties
glotaran.ParameterGroup -> glotaran.Parameters
Command Line Interface (removed without replacement) (#1228)
Project.generate_model (removed without replacement)
Project.generate_parameters (removed without replacement)
glotaran.project.Result.number_of_data_points -> glotaran.project.Result.number_of_residuals
glotaran.project.Result.number_of_parameters -> glotaran.project.Result.number_of_free_parameters
🗑️❌ Deprecated functionality removed in this release
glotaran.project.Scheme(..., non_negative_least_squares=...)
glotaran.project.Scheme(..., group=...)
glotaran.project.Scheme(..., group_tolerance=...)
<model_file>.non-negative-least-squares: true
<model_file>.non-negative-least-squares: false
glotaran.parameter.ParameterGroup.to_csv(file_name=parameters.csv)
🚧 Maintenance
🚇🩹 Fix wrong comparison in pr_benchmark workflow (#1097)
🔧 Set sourcery-ai target python version to 3.8 (#1095)
🚇🩹🔧 Fix manifest check (#1099)
♻️ Refactor: optimization (#1060)
♻️🚇 Use GITHUB_OUTPUT instead of set-output in github actions (#1166, #1177)
🚧 Add pinned version of odfpy to requirements_dev.txt (#1164)
♻️ Use validation action and validation as a git submodule (#1165)
🧹 Upgrade syntax to py310 using pyupgrade (#1162)
🧹 Remove unused ‘type: ignore’ (#1168)
🚧 Raise minimum dependency version to releases that support py310 (#1170)
🔧 Make mypy and doc string linters opt out instead of opt in (#1173)
🚀 0.6.0 (2022-06-06)
✨ Features
✨ Python 3.10 support (#977)
✨ Add simple decay megacomplexes (#860)
✨ Feature: Generators (#866)
✨ Project Class (#869)
✨ Add clp guidance megacomplex (#1029)
👌 Minor Improvements:
👌🎨 Add proper repr for DatasetMapping (#957)
👌 Add SavingOptions to save_result API (#966)
✨ Add parameter IO support for more formats supported by pandas (#896)
👌 Apply IRF shift in coherent artifact megacomplex (#992)
👌 Added IRF shift to result dataset (#994)
👌 Improve Result, Parameter and ParameterGroup markdown (#1012)
👌🧹 Add suffix to rate and lifetime and guard for missing datasets (#1022)
♻️ Move simulation to own module (#1041)
♻️ Move optimization to new module glotaran.optimization (#1047)
🩹 Fix missing installation of clp-guide megacomplex as plugin (#1066)
🚧🔧 Add ‘extras’ and ‘full’ extras_require installation options (#1089)
🩹 Bug fixes
🩹 Fix Crash in optimization_group_calculator_linked when using guidance spectra (#950)
🩹 ParameterGroup.get degrades full_label of nested Parameters with nesting over 2 (#1043)
🩹 Show validation problem if parameters are missing values (default: NaN) (#1076)
📚 Documentation
🎨 Add new logo (#1083, #1087)
🗑️ Deprecations (due in 0.8.0)
glotaran.io.save_result(result, result_path, format_name='legacy') -> glotaran.io.save_result(result, Path(result_path) / 'result.yml')
glotaran.analysis.simulation -> glotaran.simulation.simulation
glotaran.analysis.optimize -> glotaran.optimization.optimize
🗑️❌ Deprecated functionality removed in this release
glotaran.ParameterGroup -> glotaran.parameter.ParameterGroup
glotaran.read_model_from_yaml -> glotaran.io.load_model(..., format_name="yaml_str")
glotaran.read_model_from_yaml_file -> glotaran.io.load_model(..., format_name="yaml")
glotaran.read_parameters_from_csv_file -> glotaran.io.load_parameters(..., format_name="csv")
glotaran.read_parameters_from_yaml -> glotaran.io.load_parameters(..., format_name="yaml_str")
glotaran.read_parameters_from_yaml_file -> glotaran.io.load_parameters(..., format_name="yaml")
glotaran.io.read_data_file -> glotaran.io.load_dataset
result.get_dataset("<dataset_name>") -> result.data["<dataset_name>"]
glotaran.analysis.result -> glotaran.project.result
glotaran.analysis.scheme -> glotaran.project.scheme
🚧 Maintenance
🔧 Improve packaging tooling (#923)
🔧🚇 Exclude test files from duplication checks on sonarcloud (#959)
🔧🚇 Only run check-manifest on the CI (#967)
🚇👌 Exclude dependabot push CI runs (#978)
🚇👌 Exclude sourcery AI push CI runs (#1014)
👌📚🚇 Auto remove notebook written data when building docs (#1019)
👌🚇 Change integration tests to use self managed examples action (#1034)
🚇🧹 Exclude pre-commit bot branch from CI runs on push (#1085)
🚀 0.5.1 (2021-12-31)
🩹 Bug fixes
🩹 Bugfix Use normalized initial_concentrations in result creation for decay megacomplex (#927)
🩹 Fix save_result crashes on Windows if input data are on a different drive than result (#931)
🚧 Maintenance
🚧 Forward port Improve result comparison workflow and v0.4 changelog (#938)
🚧 Forward port of #936 test_result_consistency
🚀 0.5.0 (2021-12-01)
✨ Features
✨ Feature: Megacomplex Models (#736)
✨ Feature: Full Models (#747)
✨ Damped Oscillation Megacomplex (a.k.a. DOAS) (#764)
✨ Add Dataset Groups (#851)
✨ Performance improvements (in some cases up to 5x) (#740)
👌 Minor Improvements:
👌 Add dimensions to megacomplex and dataset_descriptor (#702)
👌 Improve ordering in k_matrix involved_compartments function (#788)
👌 Improvements to application of clp_penalties (equal area) (#801)
♻️ Refactor model.from_dict to parse megacomplex_type from dict and add simple_generator for testing (#807)
♻️ Refactor model spec (#836)
♻️ Refactor Result Saving (#841)
✨ Use ruaml.yaml parser for roundtrip support (#893)
♻️ Refactor Result and Scheme loading/initializing from files (#903)
♻️ Several refactorings in glotaran.Parameter (#910)
👌 Improved reporting of parameters (#910, #914, #918)
👌 Scheme now accepts paths to model, parameter and data files without initializing them first (#912)
🩹 Bug fixes
🩹 Fix/cli0.5 (#765)
🩹 Fix compartment ordering randomization due to use of set (#799)
🩹 Fix check_deprecations not showing deprecation warnings (#775)
🩹 Fix and re-enable IRF Dispersion Test (#786)
🩹 Fix coherent artifact crash for index dependent models #808
🩹 False positive model validation fail when combining multiple default megacomplexes (#797)
🩹 Fix ParameterGroup repr when created with ‘from_list’ (#827)
🩹 Fix for DOAS with reversed oscillations (negative rates) (#839)
🩹 Fix parameter expression parsing (#843)
🩹 Use a context manager when opening a nc dataset (#848)
🚧 Disallow xarray versions breaking plotting in integration tests (#900)
🩹 Fix ‘dataset_groups’ not shown in model markdown (#906)
📚 Documentation
📚 Moved API documentation from User to Developer Docs (#776)
📚 Add docs for the CLI (#784)
📚 Fix deprecation in model used in quickstart notebook (#834)
🗑️ Deprecations (due in 0.7.0)
glotaran.model.Model.model_dimension -> glotaran.project.Scheme.model_dimension
glotaran.model.Model.global_dimension -> glotaran.project.Scheme.global_dimension
<model_file>.type.kinetic-spectrum -> <model_file>.default_megacomplex.decay
<model_file>.type.spectral-model -> <model_file>.default_megacomplex.spectral
<model_file>.spectral_relations -> <model_file>.clp_relations
<model_file>.spectral_relations.compartment -> <model_file>.clp_relations.source
<model_file>.spectral_constraints -> <model_file>.clp_constraints
<model_file>.spectral_constraints.compartment -> <model_file>.clp_constraints.target
<model_file>.equal_area_penalties -> <model_file>.clp_area_penalties
<model_file>.irf.center_dispersion -> <model_file>.irf.center_dispersion_coefficients
<model_file>.irf.width_dispersion -> <model_file>.irf.width_dispersion_coefficients
glotaran.project.Scheme(..., non_negative_least_squares=...) -> <model_file>dataset_groups.default.residual_function
glotaran.project.Scheme(..., group=...) -> <model_file>dataset_groups.default.link_clp
glotaran.project.Scheme(..., group_tolerance=...) -> glotaran.project.Scheme(..., clp_link_tolerance=...)
<scheme_file>.maximum-number-function-evaluations -> <scheme_file>.maximum_number_function_evaluations
<model_file>.non-negative-least-squares: true -> <model_file>dataset_groups.default.residual_function: non_negative_least_squares
<model_file>.non-negative-least-squares: false -> <model_file>dataset_groups.default.residual_function: variable_projection
glotaran.parameter.ParameterGroup.to_csv(file_name=parameters.csv) -> glotaran.io.save_parameters(parameters, file_name=parameters.csv)
🚧 Maintenance
🩹 Fix Performance Regressions (between version) (#740)
🧪🚇 Add integration test result validation (#754)
🔧 Add more QA tools for parts of glotaran (#739)
🔧 Fix interrogate usage (#781)
🚇 Speedup PR benchmark (#785)
🚇🩹 Use pinned versions of dependencies to run integration CI tests (#892)
🧹 Move megacomplex integration tests from root level to megacomplexes (#894)
🩹 Fix artifact download in pr_benchmark_reaction workflow (#907)
🚀 0.4.2 (2021-12-31)
🩹 Bug fixes
🩹🚧 Backport of bugfix #927 discovered in PR #860 related to initial_concentration normalization when saving results (#935).
🚧 Maintenance
🚀 0.4.1 (2021-09-07)
✨ Features
Integration test result validation (#760)
🩹 Bug fixes
Fix unintended saving of sub-optimal parameters (0ece818, backport from #747)
Improve ordering in k_matrix involved_compartments function (#791)
🚀 0.4.0 (2021-06-25)
✨ Features
Add basic spectral model (#672)
Add Channel/Wavelength dependent shift parameter to irf. (#673)
Refactored Problem class into GroupedProblem and UngroupedProblem (#681)
Plugin system was rewritten (#600, #665)
Deprecation framework (#631)
Better notebook integration (#689)
🩹 Bug fixes
Fix excessive memory usage in _create_svd (#576)
Fix several issues with KineticImage model (#612)
Fix exception in sdt reader index calculation (#647)
Avoid crash in result markdown printing when optimization fails (#630)
ParameterNotFoundException doesn’t prepend ‘.’ if path is empty (#688)
Ensure Parameter.label is str or None (#678)
Properly scale StdError of estimated parameters with RMSE (#704)
More robust covariance_matrix calculation (#706)
ParameterGroup.markdown() independent of order of parametergroups (#592)
🔌 Plugins
ProjectIo ‘folder’/’legacy’ plugin to save results (#620)
Model ‘spectral-model’ (#672)
📚 Documentation
User documentation is written in notebooks (#568)
Documentation on how to write a DataIo plugin (#600)
🗑️ Deprecations (due in 0.6.0)
glotaran.ParameterGroup -> glotaran.parameter.ParameterGroup
glotaran.read_model_from_yaml -> glotaran.io.load_model(..., format_name="yaml_str")
glotaran.read_model_from_yaml_file -> glotaran.io.load_model(..., format_name="yaml")
glotaran.read_parameters_from_csv_file -> glotaran.io.load_parameters(..., format_name="csv")
glotaran.read_parameters_from_yaml -> glotaran.io.load_parameters(..., format_name="yaml_str")
glotaran.read_parameters_from_yaml_file -> glotaran.io.load_parameters(..., format_name="yaml")
glotaran.io.read_data_file -> glotaran.io.load_dataset
result.save -> glotaran.io.save_result(result, ..., format_name="legacy")
result.get_dataset("<dataset_name>") -> result.data["<dataset_name>"]
glotaran.analysis.result -> glotaran.project.result
glotaran.analysis.scheme -> glotaran.project.scheme
model.simulate -> glotaran.analysis.simulation.simulate(model, ...)
🚀 0.3.3 (2021-03-18)
Force recalculation of SVD attributes in scheme._prepare_data (#597)
Remove unneeded check in spectral_penalties._get_area, fixes (#598)
Added Python 3.9 support (#450)
🚀 0.3.2 (2021-02-28)
Re-release of version 0.3.1 due to packaging issue
🚀 0.3.1 (2021-02-28)
Added compatibility for numpy 1.20 and raised minimum required numpy version to 1.20 (#555)
Fixed excessive memory consumption in result creation due to full SVD computation (#574)
Added feature parameter history (#557)
Moved setup logic to setup.cfg (#560)
🚀 0.3.0 (2021-02-11)
Significant code refactor with small API changes to parameter relation specification (see docs)
Replaced lmfit with scipy.optimize
🚀 0.2.0 (2020-12-02)
Large refactor with significant improvements but also small API changes (see docs)
Removed doas plugin
🚀 0.1.0 (2020-07-14)
Package was renamed to pyglotaran on PyPi
🚀 0.0.8 (2018-08-07)
Changed nan_policy to omit
🚀 0.0.7 (2018-08-07)
Added support for multiple shapes per compartment.
🚀 0.0.6 (2018-08-07)
First release on PyPI, support for Windows installs added.
Pre-Alpha Development
Plugins
To be as flexible as possible, pyglotaran uses a plugin system to handle new Models, DataIo and ProjectIo.
Those plugins can be defined by pyglotaran itself, the user, or a 3rd party plugin package.
Builtin plugins
Models
KineticSpectrumModel
KineticImageModel
Data Io
Plugins reading and writing data to and from xarray.Dataset or xarray.DataArray.
AsciiDataIo
NetCDFDataIo
SdtDataIo
Project Io
Plugins reading and writing Model, Scheme, ParameterGroup or Result.
YmlProjectIo
CsvProjectIo
FolderProjectIo
Reproducibility and plugins
With a plugin ecosystem there always is the possibility that multiple plugins try register under the same format/name.
This is why plugins are registered at least twice. Once under the name the developer intended and secondly
under their full name (full import path).
This allows to ensure that a specific plugin is used by manually specifying the plugin,
so if someone wants to run your analysis the results will be reproducible even if they have conflicting plugins installed.
You can gain all information about the installed plugins by calling the corresponding *_plugin_table
function with both
options (plugin_names
and full_names
) set to true.
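For the data io plugins this could look like the following sketch, using both keyword arguments described above:

from glotaran.io import data_io_plugin_table

# Show installed data io plugins under both their short names
# and their full import paths
data_io_plugin_table(plugin_names=True, full_names=True)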
To pin a used plugin, use the corresponding set_*_plugin function with the intended name (format_name/model_name) and the full name (full_plugin_name) of the plugin to use.
If you wanted to ensure that the pyglotaran builtin plugin is used for sdt files, you could add the following lines to the beginning of your analysis code.
from glotaran.io import set_data_plugin
set_data_plugin("sdt", "glotaran.builtin.io.sdt.sdt_file_reader.SdtDataIo_sdt")
Models
The functions for model plugins are located in glotaran.model and are called model_plugin_table and set_model_plugin.
Data Io
The functions for data io plugins are located in glotaran.io and are called data_io_plugin_table and set_data_plugin.
Project Io
The functions for project io plugins are located in glotaran.io and are called project_io_plugin_table and set_project_plugin.
3rd party plugins
Plugins not part of pyglotaran itself.
Not yet, why not be the first? Tell us about your plugin and we will feature it here.
Command-line Interface
glotaran
The glotaran CLI main function.
glotaran [OPTIONS] COMMAND [ARGS]...
Options
- --version
Show the version and exit.
optimize
Optimizes a model, e.g.: glotaran optimize
glotaran optimize [OPTIONS] [SCHEME_FILE]
Options
- -dfmt, --dataformat <dataformat>
The input format of the data. Will be inferred from extension if not set.
- Options:
ascii | nc | sdt
- -d, --data <data>
Path to a dataset in the form ‘--data DATASET_LABEL PATH_TO_DATA’
- -o, --out <out>
Path to an output directory.
- -ofmt, --outformat <outformat>
The format of the output.
- Default:
folder
- Options:
folder | legacy | yaml
- -n, --nfev <nfev>
Maximum number of function evaluations.
- --nnls
Use non-negative least squares.
- -y, --yes
Don’t ask for confirmation.
- -p, --parameters_file <parameters_file>
(optional) Path to parameter file.
- -m, --model_file <model_file>
Path to model file.
Arguments
- SCHEME_FILE
Optional argument
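Putting the options together, an invocation could look like this (all file and dataset names purely illustrative):

$ glotaran optimize -m my_model.yml -p my_parameters.yml -d dataset1 data.ascii -o my_results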
pluginlist
Prints a list of installed plugins.
glotaran pluginlist [OPTIONS]
print
Parses scheme, a model or a parameter file and prints the result as a Markdown formatted string.
glotaran print [OPTIONS] [SCHEME_FILE]
Options
- -p, --parameters_file <parameters_file>
(optional) Path to parameter file.
- -m, --model_file <model_file>
Path to model file.
Arguments
- SCHEME_FILE
Optional argument
validate
Validates a model file and optionally a parameter file.
glotaran validate [OPTIONS] [SCHEME_FILE]
Options
- -p, --parameters_file <parameters_file>
(optional) Path to parameter file.
- -m, --model_file <model_file>
Path to model file.
Arguments
- SCHEME_FILE
Optional argument
Contributing
Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.
You can contribute in many ways:
Types of Contributions
Report Bugs
Report bugs at https://github.com/glotaran/pyglotaran/issues.
If you are reporting a bug, please include:
Your operating system name and version.
Any details about your local setup that might be helpful in troubleshooting.
Detailed steps to reproduce the bug.
Fix Bugs
Look through the GitHub issues for bugs. Anything tagged with “bug” and “help wanted” is open to whoever wants to implement it.
Implement Features
Look through the GitHub issues for features. Anything tagged with “enhancement” and “help wanted” is open to whoever wants to implement it.
Write Documentation
pyglotaran could always use more documentation, whether as part of the official pyglotaran docs, in docstrings, or even on the web in blog posts, articles, and such. If you are writing docstrings please use the NumPyDoc style to write them.
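For reference, a minimal NumPyDoc-style docstring looks like this (the function itself is just a made-up example):

def scale_dataset(dataset, factor=1.0):
    """Scale all data variables of a dataset by a constant factor.

    Parameters
    ----------
    dataset : xr.Dataset
        Dataset to scale.
    factor : float
        Scaling factor, by default 1.0

    Returns
    -------
    xr.Dataset
        The scaled dataset.
    """
    return dataset * factor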
Submit Feedback
The best way to send feedback is to file an issue at https://github.com/glotaran/pyglotaran/issues.
If you are proposing a feature:
Explain in detail how it would work.
Keep the scope as narrow as possible, to make it easier to implement.
Remember that this is a volunteer-driven project, and that contributions are welcome :)
Get Started!
Ready to contribute? Here’s how to set up pyglotaran for local development.

Fork the pyglotaran repo on GitHub.

Clone your fork locally:
$ git clone https://github.com/<your_name_here>/pyglotaran.git
Install your local copy into a virtualenv. Assuming you have virtualenvwrapper installed, this is how you set up your fork for local development:
$ mkvirtualenv pyglotaran
(pyglotaran)$ cd pyglotaran
(pyglotaran)$ python -m pip install -r requirements_dev.txt
(pyglotaran)$ pip install -e . --process-dependency-links
Install the pre-commit hooks, to automatically format and check your code:

$ pre-commit install
Create a branch for local development:
$ git checkout -b name-of-your-bugfix-or-feature
Now you can make your changes locally.
When you’re done making changes, check that your changes pass flake8 and the tests, including testing other Python versions with tox:
$ pre-commit run -a
$ py.test
Or to run all at once:
$ tox
Commit your changes and push your branch to GitHub:
$ git add .
$ git commit -m "Your detailed description of your changes."
$ git push origin name-of-your-bugfix-or-feature
Submit a pull request through the GitHub website.
Add the change, referring to the pull request ((#<PR_nr>)), to changelog.md. If you are in doubt which section your pull request belongs in, just ask a maintainer.
Note
By default pull requests will use the template located at .github/PULL_REQUEST_TEMPLATE.md.
But we also provide custom tailored templates located inside of .github/PULL_REQUEST_TEMPLATE.
Sadly the GitHub Web Interface doesn’t provide an easy way to select them as it does for issue templates
(see this comment for more details).
To use them you need to add the following query parameters to the url when creating the pull request and hit enter:
✨ Feature PR: ?expand=1&template=feature_PR.md
🩹 Bug Fix PR: ?expand=1&template=bug_fix_PR
📚 Documentation PR: ?expand=1&template=docs_PR.md
Pull Request Guidelines
Before you submit a pull request, check that it meets these guidelines:
The pull request should include tests.
If the pull request adds functionality, the docs should be updated. Put your new functionality into a function with a docstring.
The pull request should work for Python 3.10 and 3.11. Check your GitHub Actions (https://github.com/<your_name_here>/pyglotaran/actions) and make sure that the tests pass for all supported Python versions.
Docstrings
We use numpy style docstrings, which can also be autogenerated from function/method signatures by extensions for your editor.
Note
If your pull request improves the docstring coverage (check pre-commit run -a interrogate), please raise the value of the interrogate setting fail-under in pyproject.toml. That way the next person will improve the docstring coverage as well, and everyone can enjoy better documentation.
Warning
As soon as all our docstrings are in proper shape we will enforce that it stays that way. If you want to check if your docstrings are fine you can use pydocstyle and darglint.
Tips
To run a subset of tests:
$ py.test tests.test_pyglotaran
Deprecations
Only maintainers are allowed to decide about deprecations, thus you should first open an issue and check back with them if they are ok with deprecating something.
To make deprecations as robust as possible and give users all needed information to adjust their code, we provide helper functions inside the module glotaran.deprecation.
The functions you most likely want to use are:

deprecate() for functions, methods and classes
warn_deprecated() for call arguments
deprecate_module_attribute() for module attributes
deprecate_submodule() for modules
deprecate_dict_entry() for dict entries
raise_deprecation_error() if the original behavior cannot be maintained
Those functions not only make it easier to deprecate something, but they also check that deprecations will be removed when they are due and that at least the imports in the warning work. Thus all deprecations need to be tested.
Tests for deprecations should be placed in glotaran/deprecation/modules/test, which also provides the test helper functions deprecation_warning_on_call_test_helper and changed_import_test_warn.
Since the tests for deprecation are mainly for maintainability and not to test the functionality (those tests should be in the appropriate place), deprecation_warning_on_call_test_helper will by default just test that a GlotaranApiDeprecationWarning was raised and ignore all raised Exceptions.
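For illustration, such a test could look roughly like this (a sketch; function_to_deprecate stands for the hypothetical deprecated function from the example further below, and the args keyword is an assumption about the helper’s signature):

from glotaran.deprecation.modules.test import deprecation_warning_on_call_test_helper

def test_function_to_deprecate():
    # Asserts that calling the (hypothetical) deprecated function raises
    # a GlotaranApiDeprecationWarning; exceptions raised by the call
    # itself are ignored by default.
    deprecation_warning_on_call_test_helper(function_to_deprecate, args=["filename"])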
An exception to this rule is when adding back removed functionality (which shouldn’t happen in the first place but might). Deprecations should be implemented in a file under glotaran/deprecation/modules, and filenames should be like the relative import path from the glotaran root, but with _ instead of . (e.g. glotaran.analysis.scheme would map to analysis_scheme.py).
The only exceptions to this rule are the root __init__.py, which is named glotaran_root.py, and testing changed imports, which should be placed in test_changed_imports.py.
Deprecating a Function, method or class
Deprecating a function, method or class is as easy as adding the deprecate decorator to it. Other decorators (e.g. @staticmethod or @classmethod) should be placed above deprecate in order to work.
from glotaran.deprecation import deprecate

@deprecate(
    deprecated_qual_name_usage="glotaran.some_module.function_to_deprecate(filename)",
    new_qual_name_usage='glotaran.some_module.new_function(filename, format_name="legacy")',
    to_be_removed_in_version="0.6.0",
)
def function_to_deprecate(*args, **kwargs):
    ...
Deprecating a call argument
When deprecating a call argument you should use warn_deprecated and set the argument to deprecate to a default value (e.g. "deprecated") to check against.
Note that for this use case we need to set check_qual_names=(False, False), which will deactivate the import testing.
This might not always be possible, e.g. if the argument is positional only, so it might make more sense to deprecate the whole callable; just discuss what to do with our trusted maintainers.
from glotaran.deprecation import warn_deprecated

def function_to_deprecate(args1, new_arg="new_default_behavior", deprecated_arg="deprecated", **kwargs):
    if deprecated_arg != "deprecated":
        warn_deprecated(
            deprecated_qual_name_usage="deprecated_arg",
            new_qual_name_usage='new_arg="legacy"',
            to_be_removed_in_version="0.6.0",
            check_qual_names=(False, False),
        )
        new_arg = "legacy"
    ...
Deprecating a module attribute
Sometimes it might be necessary to remove an attribute (function, class, or constant) from a module to prevent circular imports or just to streamline the API.
In those cases you would use deprecate_module_attribute inside a module __getattr__ function definition. This will import the attribute from the new location and return it when an import or use is requested.
def __getattr__(attribute_name: str):
    from glotaran.deprecation import deprecate_module_attribute

    if attribute_name == "deprecated_attribute":
        return deprecate_module_attribute(
            deprecated_qual_name="glotaran.old_package.deprecated_attribute",
            new_qual_name="glotaran.new_package.new_attribute_name",
            to_be_removed_in_version="0.6.0",
        )

    raise AttributeError(f"module {__name__} has no attribute {attribute_name}")
Deprecating a submodule
For a better logical structure, it might be needed to move modules to a different location in the project. In those cases, you would use deprecate_submodule, which imports the module from the new location, adds it to sys.modules and as an attribute to the parent package.
from glotaran.deprecation import deprecate_submodule

module_name = deprecate_submodule(
    deprecated_module_name="glotaran.old_package.module_name",
    new_module_name="glotaran.new_package.new_module_name",
    to_be_removed_in_version="0.6.0",
)
Deprecating dict entries
The possible dict deprecation actions are:
Swapping of keys: {"foo": 1} -> {"bar": 1} (done via swap_keys=("foo", "bar"))
Replacing of matching values: {"foo": 1} -> {"foo": 2} (done via replace_rules=({"foo": 1}, {"foo": 2}))
Replacing of matching values and swapping of keys: {"foo": 1} -> {"bar": 2} (done via replace_rules=({"foo": 1}, {"bar": 2}))
For full examples have a look at the examples from the docstring (deprecate_dict_entry()).
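As a quick illustration, a key swap inside a hypothetical dict-consuming function might look like this (the parameter names dict_to_check, deprecated_usage and new_usage are assumptions based on the docstring):

from glotaran.deprecation import deprecate_dict_entry

def load_model_dict(model_dict: dict):
    # Warn about the deprecated key "foo" and rename it to "bar" in place
    deprecate_dict_entry(
        dict_to_check=model_dict,
        deprecated_usage="foo: 1",
        new_usage="bar: 1",
        to_be_removed_in_version="0.6.0",
        swap_keys=("foo", "bar"),
    )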
Deprecation Errors
In some cases deprecations cannot have a replacement with the original behavior maintained. This will be mostly the case when at this point in time and in the object hierarchy there isn’t enough information available to calculate the appropriate values. Rather than using a ‘dummy’ value not to break the API, which could cause undefined behavior down the line, those cases should throw an error which informs the users about the new usage. In general this should only be used if it is unavoidable due to massive refactoring of the internal structure and tried to avoid by any means in a reasonable context.
If you have one of those rare cases you can use raise_deprecation_error().
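A minimal sketch (assuming raise_deprecation_error takes the same qualified-name arguments as the other helpers; the function being deprecated is hypothetical):

from glotaran.deprecation import raise_deprecation_error

def function_without_replacement(*args, **kwargs):
    # The original behavior cannot be maintained, so fail loudly and
    # point users to the new usage instead of guessing a 'dummy' value.
    raise_deprecation_error(
        deprecated_qual_name_usage="glotaran.old_package.function_without_replacement(...)",
        new_qual_name_usage="glotaran.new_package.new_function(...)",
        to_be_removed_in_version="0.8.0",
    )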
Testing Result consistency
To test the consistency of results locally you need to clone the pyglotaran-examples and run them:
$ git clone https://github.com/glotaran/pyglotaran-examples
$ cd pyglotaran-examples
$ python scripts/run_examples.py run-all --headless
Note
Make sure you have the latest version (git pull) and are on the correct branch for both pyglotaran and pyglotaran-examples.
The results from the examples will be saved in your home folder under pyglotaran_examples_results.
Those results will then be compared to the ‘gold standard’ defined by the maintainers.
To test the result consistency run:
$ pytest validation/pyglotaran-examples/test_result_consistency.py
If needed, this will clone the ‘gold standard’ results to the folder comparison-results, update them and test your current results against them.
Deploying
A reminder for the maintainers on how to deploy.
Make sure all your changes are committed (including an entry in changelog.md); the version number only needs to be changed in glotaran/__init__.py.
Then make a new release on GitHub and give the tag a proper name, e.g. v0.3.0, since it might be included in a citation.
Github Actions will then deploy to PyPI if the tests pass.
API Documentation
The API Documentation for pyglotaran is automatically created from its docstrings.
Glotaran package root.
Plugin development
If you don’t find a plugin that fits your needs, you can always write your own. This section will explain how, and what you need to know.
In time we will also provide you with a cookiecutter template to kickstart your new plugin for publishing as a package on PyPi.
How to Write your own Io plugin
There are all kinds of different data formats, so it is quite likely that your experimental setup uses a format which isn’t yet supported by a glotaran plugin, and you want to write your own DataIo plugin to support this format.
Since json is a very common format (admittedly not for data, but in general) and Python has builtin support for it, we will use it as an example.
First let’s have a look at which DataIo plugins are already installed and which functions they support.
[1]:
from glotaran.io import data_io_plugin_table
[2]:
data_io_plugin_table()
[2]:
Format name | load_dataset | save_dataset
---|---|---
ascii | * | *
nc | * | *
sdt | * | /
Looks like there isn’t a json plugin installed yet, but maybe someone else already wrote one, so have a look at the 3rd party plugins list in the user documentation (https://pyglotaran.readthedocs.io/en/latest/user_documentation/using_plugins.html) before you start writing your own plugin.
For the sake of the example, we will write our json plugin even if there already exists one by the time you read this.
First you need to import all needed libraries and functions.
from __future__ import annotations: needed to write python 3.10 typing syntax (|), even with a lower python version
json, xarray: needed for the reading and writing itself
DataIoInterface: needed to subclass from, this way you get the proper type and especially signature checking
register_data_io: registers the DataIo plugin under the given format_names
[3]:
from __future__ import annotations
import json
import xarray as xr
from glotaran.io.interface import DataIoInterface
from glotaran.plugin_system.data_io_registration import register_data_io
DataIoInterface has two methods we could implement, load_dataset and save_dataset, which are used by the identically named functions in glotaran.io.
We will just implement both for our example to be complete. The quickest way to get started is to just copy over the code from DataIoInterface, which already has the right signatures and some boilerplate docstrings for the method arguments.
If the default arguments aren’t enough for your plugin and you need your methods to have additional options, you can just add those. Note the * between file_name and my_extra_option; this tells Python that my_extra_option is a keyword-only argument, and mypy (https://github.com/python/mypy) won’t raise an [override] type error for changing the signature of the method. To help others who might use your plugin, and your future self, it is good practice to document what each parameter does in the method’s docstring, which will be accessed by the help function.
Finally add the @register_data_io decorator with the format_names you want to register the plugin to, in our case json and my_json.
Pro tip: You don’t need to implement the whole functionality inside of the method itself,
[4]:
@register_data_io(["json", "my_json"])
class JsonDataIo(DataIoInterface):
    """My new shiny glotaran plugin for json data io"""

    def load_dataset(
        self, file_name: str, *, my_extra_option: str = None
    ) -> xr.Dataset | xr.DataArray:
        """Read json data to xarray.Dataset

        Parameters
        ----------
        file_name : str
            File containing the data.
        my_extra_option: str
            This argument is only for demonstration
        """
        if my_extra_option is not None:
            print(f"Using my extra option loading json: {my_extra_option}")

        with open(file_name) as json_file:
            data_dict = json.load(json_file)
        return xr.Dataset.from_dict(data_dict)

    def save_dataset(
        self, dataset: xr.Dataset | xr.DataArray, file_name: str, *, my_extra_option=None
    ):
        """Write xarray.Dataset to a json file

        Parameters
        ----------
        dataset : xr.Dataset
            Dataset to be saved to file.
        file_name : str
            File to write the result data to.
        my_extra_option: str
            This argument is only for demonstration
        """
        if my_extra_option is not None:
            print(f"Using my extra option for writing json: {my_extra_option}")

        data_dict = dataset.to_dict()
        with open(file_name, "w") as json_file:
            json.dump(data_dict, json_file)
Let’s verify that our new plugin was registered successfully under the format_names json and my_json.
[5]:
data_io_plugin_table()
[5]:
Format name | load_dataset | save_dataset
---|---|---
ascii | * | *
json | * | *
my_json | * | *
nc | * | *
sdt | * | /
Now let’s use the example data from the quickstart to test the reading and writing capabilities of our plugin.
[6]:
from glotaran.io import load_dataset
from glotaran.io import save_dataset
from glotaran.testing.simulated_data.sequential_spectral_decay import DATASET as dataset
[7]:
dataset
[7]:
<xarray.Dataset> Dimensions: (time: 2100, spectral: 72) Coordinates: * time (time) float64 -1.0 -0.99 -0.98 -0.97 ... 19.96 19.97 19.98 19.99 * spectral (spectral) float64 600.0 601.4 602.8 604.2 ... 696.6 698.0 699.4 Data variables: data (time, spectral) float64 0.01221 -1.546e-05 ... 2.562 2.302 Attributes: source_path: dataset_1.nc
To get a feeling for our data, let’s plot some traces.
[8]:
plot_data = dataset.data.sel(spectral=[620, 630, 650], method="nearest")
plot_data.plot.line(x="time", aspect=2, size=5)
[8]:
[<matplotlib.lines.Line2D at 0x7f63b8dc1150>,
<matplotlib.lines.Line2D at 0x7f63b8dc1210>,
<matplotlib.lines.Line2D at 0x7f63b8dc11e0>]
Since we want to see a difference of our saved and loaded data, we divide the amplitudes by 2 for no reason.
[9]:
dataset["data"] = dataset.data / 2
Now that we changed the data, let’s write them to a file.
But in which order were the arguments again? And are there any additional options?
Good thing we documented our new plugin, so we can just look up the help.
[10]:
from glotaran.io import show_data_io_method_help
show_data_io_method_help("json", "save_dataset")
Help on method save_dataset in module __main__:
save_dataset(dataset: 'xr.Dataset | xr.DataArray', file_name: 'str', *, my_extra_option=None) method of __main__.JsonDataIo instance
Write xarray.Dataset to a json file
Parameters
----------
dataset : xr.Dataset
Dataset to be saved to file.
file_name : str
File to write the result data to.
my_extra_option: str
This argument is only for demonstration
Note that the function save_dataset has additional arguments:

format_name: overwrites the inferred plugin selection
allow_overwrite: allows overwriting existing files (USE WITH CAUTION!!!)
[11]:
help(save_dataset)
Help on function save_dataset in module glotaran.plugin_system.data_io_registration:
save_dataset(dataset: 'xr.Dataset | xr.DataArray', file_name: 'StrOrPath', format_name: 'str | None' = None, *, data_filters: 'list[str] | None' = None, allow_overwrite: 'bool' = False, update_source_path: 'bool' = True, **kwargs: 'Any') -> 'None'
Save data from :xarraydoc:`Dataset` or :xarraydoc:`DataArray` to a file.
Parameters
----------
dataset : xr.Dataset | xr.DataArray
Data to be written to file.
file_name : StrOrPath
File to write the data to.
format_name : str
Format the file should be in, if not provided it will be inferred from the file extension.
data_filters : list[str] | None
Optional list of items in the dataset to be saved.
allow_overwrite : bool
Whether or not to allow overwriting existing files, by default False
update_source_path: bool
Whether or not to update the ``source_path`` attribute to ``file_name`` when saving.
by default True
**kwargs : Any
Additional keyword arguments passes to the ``write_dataset`` implementation
of the data io plugin. If you aren't sure about those use ``get_datasaver``
to get the implementation with the proper help and autocomplete.
Since this is just an example and we don’t overwrite important data, we will use allow_overwrite=True. It also makes writing this documentation easier, since we don’t have to manually delete the test file each time the cell runs.
[12]:
save_dataset(
    dataset, "half_intensity.json", allow_overwrite=True, my_extra_option="just as an example"
)
Using my extra option for writing json: just as an example
Now let’s test our data loading functionality.
[13]:
reloaded_data = load_dataset("half_intensity.json", my_extra_option="just as an example")
reloaded_data
Using my extra option loading json: just as an example
[13]:
<xarray.Dataset> Dimensions: (time: 2100, spectral: 72) Coordinates: * time (time) float64 -1.0 -0.99 -0.98 -0.97 ... 19.96 19.97 19.98 19.99 * spectral (spectral) float64 600.0 601.4 602.8 604.2 ... 696.6 698.0 699.4 Data variables: data (time, spectral) float64 0.006104 -7.732e-06 ... 1.281 1.151 Attributes: source_path: half_intensity.json loader: <function load_dataset at 0x7f63cb32c790>
[14]:
reloaded_plot_data = reloaded_data.data.sel(spectral=[620, 630, 650], method="nearest")
reloaded_plot_data.plot.line(x="time", aspect=2, size=5)
[14]:
[<matplotlib.lines.Line2D at 0x7f63b69941f0>,
<matplotlib.lines.Line2D at 0x7f63b6994280>,
<matplotlib.lines.Line2D at 0x7f63b69942b0>]
This looks like the above plot but with half the amplitudes, so writing and reading our data worked as we hoped it would.
Writing a ProjectIo plugin works analogously:

 | DataIo | ProjectIo
---|---|---
Register function | register_data_io | register_project_io
Baseclass | DataIoInterface | ProjectIoInterface
Possible methods | load_dataset, save_dataset | load_model, save_model, load_parameters, save_parameters, load_scheme, save_scheme, load_result, save_result
Of course you don’t have to implement all methods (sometimes that doesn’t even make sense), but only the ones you need.
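As a sketch, a minimal ProjectIo plugin that only implements parameter loading could look like this (the format name, the json handling and the Parameters.from_dict call are illustrative assumptions; the import locations mirror the DataIo ones used above):

import json

from glotaran.io.interface import ProjectIoInterface
from glotaran.parameter import Parameters
from glotaran.plugin_system.project_io_registration import register_project_io

@register_project_io(["json_parameters"])
class JsonParametersProjectIo(ProjectIoInterface):
    """Example ProjectIo plugin implementing only parameter loading."""

    def load_parameters(self, file_name: str) -> Parameters:
        """Load parameters from a json file (illustrative only)."""
        with open(file_name) as json_file:
            return Parameters.from_dict(json.load(json_file))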
Last but not least:
Chances are that if you need a plugin someone else does too, so it would be awesome if you would publish it open source, so the wheel isn’t reinvented over and over again.