This page was generated from docs/source/notebooks/quickstart/quickstart.ipynb.

Quickstart/Cheat-Sheet

Since this documentation is written in a Jupyter notebook, we first import a small IPython helper function to display files with syntax highlighting.

[1]:
from glotaran.utils.ipython import display_file

To start using pyglotaran in your project, you have to import it first. In addition, we need to import some extra components for later use.

[2]:
from glotaran.analysis.optimize import optimize
from glotaran.io import load_model
from glotaran.io import load_parameters
from glotaran.io import save_dataset
from glotaran.io.prepare_dataset import prepare_time_trace_dataset
from glotaran.project.scheme import Scheme

Let us get some example data to analyze:

[3]:
from glotaran.examples.sequential import dataset

dataset
[3]:
<xarray.Dataset>
Dimensions:   (time: 2100, spectral: 72)
Coordinates:
  * time      (time) float64 -1.0 -0.99 -0.98 -0.97 ... 19.96 19.97 19.98 19.99
  * spectral  (spectral) float64 600.0 601.4 602.8 604.2 ... 696.6 698.0 699.4
Data variables:
    data      (time, spectral) float64 -0.000586 0.00234 ... 1.719 1.528
Attributes:
    source_path:  dataset_1.nc

Like all data in pyglotaran, the dataset is an xarray.Dataset. You can find more information about the xarray library on the xarray homepage.

The loaded dataset is a simulated sequential model.

Plotting raw data

Now let's plot some time traces.

[4]:
plot_data = dataset.data.sel(spectral=[620, 630, 650], method="nearest")
plot_data.plot.line(x="time", aspect=2, size=5);
[Output figure: time traces at the selected spectral points]

We can also plot spectra at different times.

[5]:
plot_data = dataset.data.sel(time=[1, 10, 20], method="nearest")
plot_data.plot.line(x="spectral", aspect=2, size=5);
[Output figure: spectra at the selected times]

Preparing data

To get an idea about how to model your data, you should inspect the singular value decomposition. Pyglotaran has a function to calculate it (among other things).
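The idea behind this inspection can be sketched with plain numpy (this is not pyglotaran's implementation, just an illustration): the singular values of a low-rank "time × spectral" matrix separate cleanly from the noise floor, and the number of values standing out above it suggests how many decay components the model needs.

```python
import numpy as np

# Synthetic low-rank "time x spectral" matrix plus noise, standing in
# for a measured dataset. The true rank here is 3 by construction.
rng = np.random.default_rng(1)
n_time, n_spectral, rank = 200, 50, 3
signal = rng.normal(size=(n_time, rank)) @ rng.normal(size=(rank, n_spectral))
data = signal + rng.normal(scale=1e-3, size=(n_time, n_spectral))

# Singular value decomposition of the data matrix.
lsv, singular_values, rsv = np.linalg.svd(data, full_matrices=False)

# Singular values well above the noise floor indicate the number of
# independent components in the data.
noise_floor = singular_values[rank:].mean()
n_components = int(np.sum(singular_values > 10 * noise_floor))
```

For real data the cutoff is rarely this sharp, which is why the quickstart plots the first singular values on a log scale and judges them by eye.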

[6]:
dataset = prepare_time_trace_dataset(dataset)
dataset
[6]:
<xarray.Dataset>
Dimensions:                      (time: 2100, spectral: 72, left_singular_value_index: 72, singular_value_index: 72, right_singular_value_index: 72)
Coordinates:
  * time                         (time) float64 -1.0 -0.99 -0.98 ... 19.98 19.99
  * spectral                     (spectral) float64 600.0 601.4 ... 698.0 699.4
Dimensions without coordinates: left_singular_value_index, singular_value_index, right_singular_value_index
Data variables:
    data                         (time, spectral) float64 -0.000586 ... 1.528
    data_left_singular_vectors   (time, left_singular_value_index) float64 2....
    data_singular_values         (singular_value_index) float64 4.62e+03 ... ...
    data_right_singular_vectors  (right_singular_value_index, spectral) float64 ...
Attributes:
    source_path:  dataset_1.nc

First, take a look at the first 10 singular values:

[7]:
plot_data = dataset.data_singular_values.sel(singular_value_index=range(0, 10))
plot_data.plot(yscale="log", marker="o", linewidth=0, aspect=2, size=5);
[Output figure: first 10 singular values on a log scale]

Working with models

To analyze our data, we need to create a model.

Create a file called model.yaml in your working directory and fill it with the following:

[8]:
display_file("model.yaml", syntax="yaml")
[8]:
default_megacomplex: decay

initial_concentration:
  input:
    compartments: [s1, s2, s3]
    parameters: [input.1, input.0, input.0]

k_matrix:
  k1:
    matrix:
      (s2, s1): kinetic.1
      (s3, s2): kinetic.2
      (s3, s3): kinetic.3

megacomplex:
  m1:
    k_matrix: [k1]

irf:
  irf1:
    type: gaussian
    center: irf.center
    width: irf.width

dataset:
  dataset1:
    initial_concentration: input
    megacomplex: [m1]
    irf: irf1
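The k_matrix above encodes the sequential scheme s1 → s2 → s3, with s3 decaying to the ground state: each (to, from) entry assigns a rate constant to a transfer. As a numpy-only sketch (not pyglotaran code), the compartment concentrations follow dc/dt = K·c, which can be solved by eigendecomposition using the rate constants that parameters.yaml defines below:

```python
import numpy as np

# Rate constants as defined in parameters.yaml (kinetic.1 .. kinetic.3).
k1, k2, k3 = 0.5, 0.3, 0.1

# Transfer matrix of the sequential scheme s1 -> s2 -> s3 -> ground state,
# matching the (to, from) entries of the k_matrix above: dc/dt = K @ c.
K = np.array([
    [-k1, 0.0, 0.0],
    [ k1, -k2, 0.0],
    [ 0.0, k2, -k3],
])

c0 = np.array([1.0, 0.0, 0.0])  # all initial concentration in s1

# c(t) = expm(K t) @ c0, computed via eigendecomposition
# (K has three distinct eigenvalues, so it is diagonalizable).
w, V = np.linalg.eig(K)

def concentrations(t):
    return (V @ np.diag(np.exp(w * t)) @ np.linalg.inv(V) @ c0).real

c = concentrations(5.0)  # s1 has decayed to exp(-k1 * 5) by t = 5
```

In the actual model these concentration profiles are additionally convolved with the Gaussian IRF defined under irf.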

Now you can load the model file.

[9]:
model = load_model("model.yaml")

You can check your model for problems with model.validate.

[10]:
model.validate()
[10]:
'Your model is valid.'

Working with parameters

Now define some starting parameters. Create a file called parameters.yaml with the following content.

[11]:
display_file("parameters.yaml", syntax="yaml")
[11]:
input:
 - ['1', 1, {'vary': False, 'non-negative': False}]
 - ['0', 0, {'vary': False, 'non-negative': False}]

kinetic: [
    0.5,
    0.3,
    0.1,
]

irf:
 - ['center', 0.3]
 - ['width', 0.1]
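Labels inside a group are addressed with dotted names, group.label, which is how model.yaml refers to them (e.g. kinetic.1 or input.0); for a plain list like the one under kinetic, 1-based indices serve as the labels. A hypothetical pure-Python sketch of that name resolution (not pyglotaran's actual parameter class):

```python
# Grouped parameter values as in parameters.yaml; the kinetic list gets
# implicit 1-based labels "1", "2", "3".
groups = {
    "input": {"1": 1.0, "0": 0.0},
    "kinetic": {"1": 0.5, "2": 0.3, "3": 0.1},
    "irf": {"center": 0.3, "width": 0.1},
}

# Resolve to the dotted names used in model.yaml.
flat = {
    f"{group}.{label}": value
    for group, labels in groups.items()
    for label, value in labels.items()
}
```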
[12]:
parameters = load_parameters("parameters.yaml")

You can also use model.validate to check for missing parameters.

[13]:
model.validate(parameters=parameters)
[13]:
'Your model is valid.'

Since not all problems in the model can be detected automatically, it is wise to inspect the model visually. For this purpose, you can simply print the model.

[14]:
model
[14]:

Model

Megacomplex Types: decay

Dataset Groups

  • default:

  • Label: default

  • residual_function: variable_projection

  • link_clp: None

K Matrix

  • k1:

    • Label: k1

    • Matrix:

    • (‘s2’, ‘s1’): kinetic.1(nan)

    • (‘s3’, ‘s2’): kinetic.2(nan)

    • (‘s3’, ‘s3’): kinetic.3(nan)

Initial Concentration

  • input:

    • Label: input

    • Compartments:

    • s1

    • s2

    • s3

    • Parameters:

    • input.1(nan)

    • input.0(nan)

    • input.0(nan)

    • Exclude From Normalize:

Irf

  • irf1 (gaussian):

    • Label: irf1

    • Type: gaussian

    • Center: irf.center(nan)

    • Width: irf.width(nan)

    • Normalize: True

    • Backsweep: False

Megacomplex

  • m1 (None):

    • Label: m1

    • Dimension: time

    • K Matrix:

    • k1

Dataset

  • dataset1:

    • Label: dataset1

    • Group: default

    • Megacomplex:

    • m1

    • Initial Concentration: input

    • Irf: irf1

The same way you should inspect your parameters.

[15]:
parameters
[15]:
  • input:

    | Label | Value     | Standard Error | Minimum | Maximum | Vary  | Non-Negative | Expression |
    |-------|-----------|----------------|---------|---------|-------|--------------|------------|
    | 1     | 1.000e+00 | nan            | -inf    | inf     | False | False        | None       |
    | 0     | 0.000e+00 | nan            | -inf    | inf     | False | False        | None       |

  • irf:

    | Label  | Value     | Standard Error | Minimum | Maximum | Vary | Non-Negative | Expression |
    |--------|-----------|----------------|---------|---------|------|--------------|------------|
    | center | 3.000e-01 | nan            | -inf    | inf     | True | False        | None       |
    | width  | 1.000e-01 | nan            | -inf    | inf     | True | False        | None       |

  • kinetic:

    | Label | Value     | Standard Error | Minimum | Maximum | Vary | Non-Negative | Expression |
    |-------|-----------|----------------|---------|---------|------|--------------|------------|
    | 1     | 5.000e-01 | nan            | -inf    | inf     | True | False        | None       |
    | 2     | 3.000e-01 | nan            | -inf    | inf     | True | False        | None       |
    | 3     | 1.000e-01 | nan            | -inf    | inf     | True | False        | None       |

Optimizing data

Now we have everything together to optimize our parameters. We bundle the model, the starting parameters and the data into a Scheme and pass it to optimize.

[16]:
scheme = Scheme(model, parameters, {"dataset1": dataset})
result = optimize(scheme)
result
   Iteration     Total nfev        Cost      Cost reduction    Step norm     Optimality
       0              1         7.5629e+00                                    6.68e+01
       1              2         7.5624e+00      4.68e-04       8.07e-05       2.26e-01
       2              3         7.5624e+00      4.33e-10       2.33e-08       6.04e-06
`ftol` termination condition is satisfied.
Function evaluations 3, initial cost 7.5629e+00, final cost 7.5624e+00, first-order optimality 6.04e-06.
[16]:

Optimization Result

| Number of residual evaluations | 3                          |
| Number of variables            | 5                          |
| Number of datapoints           | 151200                     |
| Degrees of freedom             | 151195                     |
| Chi Square                     | 1.51e+01                   |
| Reduced Chi Square             | 1.00e-04                   |
| Root Mean Square Error (RMSE)  | 1.00e-02                   |
| RMSE additional penalty        | [array([], dtype=float64)] |

Model

Megacomplex Types: decay

Dataset Groups

  • default:

  • Label: default

  • residual_function: variable_projection

  • link_clp: None

K Matrix

  • k1:

    • Label: k1

    • Matrix:

    • (‘s2’, ‘s1’): kinetic.1(5.00e-01±7.26e-05, initial: 5.00e-01)

    • (‘s3’, ‘s2’): kinetic.2(3.00e-01±4.19e-05, initial: 3.00e-01)

    • (‘s3’, ‘s3’): kinetic.3(1.00e-01±4.78e-06, initial: 1.00e-01)

Initial Concentration

  • input:

    • Label: input

    • Compartments:

    • s1

    • s2

    • s3

    • Parameters:

    • input.1(1.00e+00, fixed)

    • input.0(0.00e+00, fixed)

    • input.0(0.00e+00, fixed)

    • Exclude From Normalize:

Irf

  • irf1 (gaussian):

    • Label: irf1

    • Type: gaussian

    • Center: irf.center(3.00e-01±5.01e-06, initial: 3.00e-01)

    • Width: irf.width(1.00e-01±6.71e-06, initial: 1.00e-01)

    • Normalize: True

    • Backsweep: False

Megacomplex

  • m1 (None):

    • Label: m1

    • Dimension: time

    • K Matrix:

    • k1

Dataset

  • dataset1:

    • Label: dataset1

    • Group: default

    • Megacomplex:

    • m1

    • Initial Concentration: input

    • Irf: irf1
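The summary statistics above are related in a simple way: the chi-square is the sum of squared residuals, the reduced chi-square divides it by the degrees of freedom (datapoints minus free parameters), and the RMSE is the square root of the reduced chi-square. A small numpy sketch with a stand-in residual matrix (the real residual lives in the result dataset):

```python
import numpy as np

# Stand-in residual matrix with ~1e-2 noise, mimicking the fit above.
rng = np.random.default_rng(0)
residual = rng.normal(scale=0.01, size=(2100, 72))

n_datapoints = residual.size  # 2100 time points x 72 spectral points = 151200
n_variables = 5               # 3 rate constants + IRF center and width
degrees_of_freedom = n_datapoints - n_variables

chi_square = float(np.sum(residual ** 2))
reduced_chi_square = chi_square / degrees_of_freedom
rmse = np.sqrt(reduced_chi_square)  # ~1e-2, matching the table above
```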

[17]:
result.optimized_parameters
[17]:
  • input:

    | Label | Value     | Standard Error | Minimum | Maximum | Vary  | Non-Negative | Expression |
    |-------|-----------|----------------|---------|---------|-------|--------------|------------|
    | 1     | 1.000e+00 | nan            | -inf    | inf     | False | False        | None       |
    | 0     | 0.000e+00 | nan            | -inf    | inf     | False | False        | None       |

  • irf:

    | Label  | Value     | Standard Error | Minimum | Maximum | Vary | Non-Negative | Expression |
    |--------|-----------|----------------|---------|---------|------|--------------|------------|
    | center | 3.000e-01 | 5.012e-06      | -inf    | inf     | True | False        | None       |
    | width  | 1.000e-01 | 6.705e-06      | -inf    | inf     | True | False        | None       |

  • kinetic:

    | Label | Value     | Standard Error | Minimum | Maximum | Vary | Non-Negative | Expression |
    |-------|-----------|----------------|---------|---------|------|--------------|------------|
    | 1     | 5.000e-01 | 7.256e-05      | -inf    | inf     | True | False        | None       |
    | 2     | 2.999e-01 | 4.191e-05      | -inf    | inf     | True | False        | None       |
    | 3     | 1.000e-01 | 4.783e-06      | -inf    | inf     | True | False        | None       |

You can get the resulting data for your dataset from result.data.

[18]:
result_dataset = result.data["dataset1"]
result_dataset
[18]:
<xarray.Dataset>
Dimensions:                                   (clp_label: 3, time: 2100, spectral: 72, left_singular_value_index: 72, singular_value_index: 72, right_singular_value_index: 72, species: 3, component: 3, to_species: 3, from_species: 3)
Coordinates:
  * clp_label                                 (clp_label) object 's1' 's2' 's3'
  * time                                      (time) float64 -1.0 ... 19.99
  * spectral                                  (spectral) float64 600.0 ... 699.4
  * species                                   (species) <U2 's1' 's2' 's3'
  * component                                 (component) int64 0 1 2
    rate                                      (component) float64 -0.5 ... -0.1
    lifetime                                  (component) float64 -2.0 ... -9...
  * to_species                                (to_species) <U2 's1' 's2' 's3'
  * from_species                              (from_species) <U2 's1' 's2' 's3'
Dimensions without coordinates: left_singular_value_index, singular_value_index, right_singular_value_index
Data variables: (12/24)
    data                                      (time, spectral) float64 -0.000...
    data_left_singular_vectors                (time, left_singular_value_index) float64 ...
    data_singular_values                      (singular_value_index) float64 ...
    data_right_singular_vectors               (spectral, right_singular_value_index) float64 ...
    matrix                                    (time, clp_label) float64 6.101...
    clp                                       (spectral, clp_label) float64 1...
    ...                                        ...
    irf_center                                float64 0.3
    irf_width                                 float64 0.1
    decay_associated_spectra                  (spectral, component) float64 2...
    a_matrix                                  (component, species) float64 1....
    k_matrix                                  (to_species, from_species) float64 ...
    k_matrix_reduced                          (to_species, from_species) float64 ...
Attributes:
    source_path:                      dataset_1.nc
    root_mean_square_error:           0.01000158369561628
    weighted_root_mean_square_error:  0.01000158369561628
    dataset_scale:                    1

Visualize the Result

The resulting data can be visualized the same way as the dataset. To judge the quality of the fit, you should look at the first left and right singular vectors of the residual.

[19]:
residual_left = result_dataset.residual_left_singular_vectors.sel(left_singular_value_index=0)
residual_right = result_dataset.residual_right_singular_vectors.sel(right_singular_value_index=0)
residual_left.plot.line(x="time", aspect=2, size=5)
residual_right.plot.line(x="spectral", aspect=2, size=5);
[Output figures: first left (time) and first right (spectral) singular vectors of the residual]

Finally, you can save your result.

[20]:
save_dataset(result_dataset, "dataset1.nc")