BioUML plug-in for nonlinear parameter estimation using multiple experimental data

Elena Kutumova; Anna Ryabova; Tagir Valeev; Fedor Kolpakov

Journal Information

Journal ID (publisher-id): vb

Title: Virtual Biology

ISSN (electronic): 2306-8140

Article Information

Publication date (electronic): 25 March 2013

Electronic Location Identifier: e10

DOI: 10.12704/vb/e10

Elena Kutumova^{1,2}

Anna Ryabova^{1}

Tagir Valeev^{1,3}

Fedor Kolpakov^{1,2}

[1] Institute of Systems Biology, Ltd, Novosibirsk, Russia

[2] Design Technological Institute of Digital Techniques SB RAS, Novosibirsk, Russia

[3] Institute of Informatics Systems SB RAS, Novosibirsk, Russia

Abstract

Motivation: Systems biology deals with many different types of experimental data representing individual components of biological systems. The behavior of these systems over time can be described by systems of ordinary differential equations (ODEs). To analyze the dynamics of such systems and estimate their parameters from data obtained under different experimental conditions, biologists need a flexible framework that allows them to create dynamic models and perform multi-experiment parameter fitting.

Results: We present the optimization tools of the BioUML software (http://biouml.org), which was developed for modeling and analysis of biochemical systems. We created an optimization plug-in that solves nonlinear optimization problems by minimizing a function of deviations between experimental data and model simulation results. Experimental data can be supplied as separate sets of time courses or steady states stored in different tab-separated files. BioUML includes several deterministic and stochastic optimization methods, which find reasonably accurate solutions faster than the COPASI software. Some of these methods support constrained optimization, and some are parallelized.

Keywords

BioUML, parameter estimation, multi-experiment parameter fitting

Introduction

The development of experimental technologies in molecular biology has led to the accumulation of huge volumes of data relating to various levels of the organization of life. However, the data alone cannot be used to reconstruct the full organization of biological systems. Bioinformatics therefore now focuses on data processing, including the integration and systematization of primary experimental data and the production of knowledge using mathematics and modern information technologies. The challenge of systems biology is to construct mathematical models that describe the dynamic behavior of biological systems on the basis of experimental data. Such problems involve large volumes of data and require software to process and interpret them.

The standard tools for working with biological data include access to biological databases, formalized description of biological systems, as well as visualization, simulation, parameter fitting and analysis of ODE models representing these systems. The BioUML software is an integrated environment developed to span all of these capabilities. Here we present the optimization tools of this software, which are intended for multi-experiment fitting of models created in BioUML notation or imported in the SBML format [Hucka et al, 2003]. These tools are available in both the desktop and web editions of BioUML.

Optimization problem in BioUML

The general nonlinear optimization problem [Runarsson and Yao, 2000] can be formulated as follows: find a minimum of the objective function $ϕ (x)$ , where $x$ lies in the intersection of the $N$ -dimensional search space

$Ω = \{ y \in ℝ^{N} \mid \underline{y_{i}} \leq y_{i} \leq \bar{y_{i}}, \ \underline{y_{i}}, \bar{y_{i}} \in ℝ, \ i = 1, \dots, N \}$ ,

and the admissible region $ℱ \subseteq ℝ^{N}$ defined by a set of equality and/or inequality constraints on $x$ . Since the equality $g_{s} (x) = 0$ can be replaced by two inequalities $g_{s} (x) \leq 0$ and $- g_{s} (x) \leq 0$ , the admissible region can be defined without loss of generality as

$ℱ = \{ y \in ℝ^{N} \mid g_{s} (y) \leq 0, \ s = 1, \dots, p \}$ .

In order to obtain a solution situated inside $ℱ$ , we minimize the penalty function

$ψ (x) = \sum_{s = 1}^{p} {(\max \{ 0, g_{s} (x) \})}^{2}$ .

The problem can be solved by different optimization methods. We implemented the following methods in the BioUML software:
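The definitions above can be illustrated with a minimal Python sketch. The function names and the example constraints are hypothetical and are not taken from BioUML; the sketch only shows the box-shaped search space, the replacement of an equality by two inequalities, and the sum of squared violations.

```python
def in_search_space(x, lower, upper):
    """Return True if every coordinate lies within its box bounds (the set Omega)."""
    return all(lo <= xi <= hi for xi, lo, hi in zip(x, lower, upper))


def penalty(x, constraints):
    """Sum of squared violations of constraints written as g_s(x) <= 0."""
    return sum(max(0.0, g(x)) ** 2 for g in constraints)


# Hypothetical example: require x0 + x1 <= 1 and x0 == x1.
# The equality is replaced by two inequalities, as described in the text.
constraints = [
    lambda x: x[0] + x[1] - 1.0,
    lambda x: x[0] - x[1],
    lambda x: -(x[0] - x[1]),
]

feasible = [0.25, 0.25]
infeasible = [1.0, 0.5]

psi_feasible = penalty(feasible, constraints)      # all constraints hold
psi_infeasible = penalty(infeasible, constraints)  # two constraints are violated
```

A guess with zero penalty lies inside the admissible region; a positive value measures how far the guess is from it.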

stochastic ranking evolution strategy (SRES) [Runarsson and Yao, 2000];
cellular genetic algorithm MOCell [Nebro et al, 2009];
particle swarm optimization (PSO) [Sierra and Coello, 2005];
deterministic method of global optimization glbSolve [Björkman and Holmström, 1999];
adaptive simulated annealing (ASA) [Ingber, 1996].

Table 1 shows the generic scheme of the optimization process for these methods. SRES, MOCell, PSO and glbSolve run a predefined number of iterations $N_{it}$ , considering a sequence of sets (populations) $P^{i}$ , $i = 0, \dots, N_{it} - 1$ , of potential solutions (guesses). For the first three methods, the population size $s \in ℕ^{+}$ is fixed. In glbSolve, the initial population $P^{0}$ consists of a single guess, and the size $s_{k + 1}$ of the population $P^{k + 1}$ is determined during iteration $k = 0, \dots, N_{it} - 1$ . ASA considers sequentially generated guesses $x^{k} \in Ω$ , $k \in ℕ^{+}$ , and stops when the distance between $x^{k}$ and $x^{k + 1}$ , defined as the Euclidean norm $\| x^{k} - x^{k + 1} \| = \sqrt{\sum_{i = 1}^{N} {(x_{i}^{k} - x_{i}^{k + 1})}^{2}}$ , becomes less than a predefined accuracy $ε$ .

Table 1. An overview of the optimization process for methods SRES, MOCell, PSO, glbSolve and ASA.

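Step	SRES, MOCell, PSO	glbSolve	ASA
1	Set $k = 0$ .	Set $k = 0$ .	Set $k = 0$ .
2	Generate $P^{0} = \{ x_{i}^{0} \in Ω, i = 1, \dots, s \}$ , $s \in ℕ^{+}$ . Find the best guess $y \in P^{0}$ . Set $x^{min} = y$ .	Generate $P^{0} = \{ x^{0} \in Ω \}$ . Set $x^{min} = x^{0}$ .	Generate $x^{0} \in Ω$ . Set $x^{min} = x^{0}$ , $err = + \infty$ .
3	Evaluate values of the functions $ϕ$ and $ψ$ for all guesses in $P^{0}$ .	Evaluate values of the functions $ϕ$ and $ψ$ for all guesses in $P^{0}$ .	Evaluate $ϕ (x^{0})$ and $ψ (x^{0})$ .
4	If the predefined number of iterations $N_{it}$ has been completed, go to step 9; otherwise go to step 5.	If the predefined number of iterations $N_{it}$ has been completed, go to step 9; otherwise go to step 5.	If $err < ε$ , where $ε$ is a predefined accuracy, go to step 9; otherwise go to step 6.
5	Set $s_{k + 1} = s$ .	Find $s_{k + 1}$ for the current iteration.	–
6	Generate $P^{k + 1} = \{ x_{i}^{k + 1} \in Ω, i = 1, \dots, s_{k + 1} \}$ .	Generate $P^{k + 1} = \{ x_{i}^{k + 1} \in Ω, i = 1, \dots, s_{k + 1} \}$ .	Generate $x^{k + 1} \in Ω$ .
7	Evaluate values of the functions $ϕ$ and $ψ$ for all guesses in $P^{k + 1}$ .	Evaluate values of the functions $ϕ$ and $ψ$ for all guesses in $P^{k + 1}$ .	Evaluate $ϕ (x^{k + 1})$ and $ψ (x^{k + 1})$ . Set $err = \| x^{k} - x^{k + 1} \|$ .
8	Update $x^{min}$ . Increment $k$ by one and go back to step 4.	Update $x^{min}$ . Increment $k$ by one and go back to step 4.	Update $x^{min}$ . Increment $k$ by one and go back to step 4.
9	Return $x^{min}$ as the solution.	Return $x^{min}$ as the solution.	Return $x^{min}$ as the solution.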
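The population column of Table 1 can be sketched in Python as follows. This is only an illustration of the control flow: uniform random sampling is a placeholder for the method-specific rule that SRES, MOCell or PSO would use to build the next population, the penalty function is left out for brevity, and all names are hypothetical.

```python
import random


def minimize_by_populations(phi, lower, upper, pop_size=20, n_iter=50, seed=0):
    """Follow steps 1-9 of Table 1 for a method with a fixed population size."""
    rng = random.Random(seed)

    def new_population():
        # Placeholder generator: independent uniform samples from the box Omega.
        return [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
                for _ in range(pop_size)]

    # Steps 1-3: initial population, its objective values and the best guess.
    k = 0
    population = new_population()
    x_min = min(population, key=phi)
    phi_min = phi(x_min)

    # Steps 4-8: process N_it further populations.
    while k < n_iter:
        population = new_population()          # steps 5-6
        candidate = min(population, key=phi)   # step 7
        candidate_value = phi(candidate)
        if candidate_value < phi_min:          # step 8: update x_min
            x_min, phi_min = candidate, candidate_value
        k += 1

    return x_min, phi_min                      # step 9


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


best, best_value = minimize_by_populations(sphere, [-1.0, -1.0], [1.0, 1.0])
```

A real method would replace `new_population` with a rule that uses the previous population, which is where the listed methods differ from each other.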

All methods except glbSolve are stochastic and seek the global minimum of the function $ϕ$ while taking the admissible region $ℱ$ into account. Thus, at any iteration a guess $x \in Ω$ is preferable to a guess $y \in Ω$ if $ψ (x) = 0$ and $ψ (y) \neq 0$ , or if $ψ (x) < ψ (y)$ . The method glbSolve is suited only to problems with $Ω \subseteq ℱ$ : values of the function $ψ$ are calculated but do not affect the generation of potential solutions.
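The preference rule can be written as a small comparison function. Because $ψ$ is never negative, the first condition in the text is a special case of $ψ (x) < ψ (y)$ . The text does not say how guesses with equal penalties are compared; the sketch below assumes that the objective value decides in that case. It is a deterministic reading of the rule and does not reproduce, for example, the stochastic ranking used by SRES.

```python
def is_preferable(phi_x, psi_x, phi_y, psi_y):
    """Return True if guess x should be ranked ahead of guess y."""
    if psi_x != psi_y:
        # Smaller penalty wins; this covers psi_x == 0 and psi_y != 0.
        return psi_x < psi_y
    # Assumption (not stated in the text): equal penalties, compare objectives.
    return phi_x < phi_y


def rank(guesses):
    """Sort (phi, psi) pairs from most to least preferable."""
    return sorted(guesses, key=lambda g: (g[1], g[0]))


ranked = rank([(1.0, 0.5), (9.0, 0.0), (3.0, 0.0), (0.1, 2.0)])
```

Note that an admissible guess with a large objective value is still ranked ahead of an inadmissible guess with a small one.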

To implement the optimization scheme in BioUML, we designed the OptimizationProblem interface (fig. 1), which comprises the following procedures:

getParameters specifies the list of parameters to fit, including their initial values and variation intervals (lower and upper bounds);
testGoodnessOfFit defines the form of the functions $ϕ$ and $ψ$ and evaluates them for a population of guesses;
getEvaluationsNumber returns the number of evaluations performed during the optimization process.

The abstract class OptimizationMethod has a number of subclasses that implement the methods listed above. These subclasses search for optimal parameters through the procedure getSolution, according to the settings of the optimization problem.
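As a rough Python analogue of the interface described above, the three procedures could be grouped as follows. The Java signatures are not given in this text, so the argument and return types here are assumptions, and the `SphereProblem` class is a hypothetical toy problem.

```python
from abc import ABC, abstractmethod


class OptimizationProblem(ABC):
    """Python analogue of the interface described in the text (types assumed)."""

    @abstractmethod
    def get_parameters(self):
        """Return (name, initial value, lower bound, upper bound) tuples."""

    @abstractmethod
    def test_goodness_of_fit(self, population):
        """Return one (phi, psi) pair per guess in the population."""

    @abstractmethod
    def get_evaluations_number(self):
        """Return how many guesses have been evaluated so far."""


class SphereProblem(OptimizationProblem):
    """Hypothetical toy problem: minimize x^2 + y^2 with no constraints."""

    def __init__(self):
        self._evaluations = 0

    def get_parameters(self):
        return [("x", 0.5, -1.0, 1.0), ("y", -0.5, -1.0, 1.0)]

    def test_goodness_of_fit(self, population):
        self._evaluations += len(population)
        return [(sum(v * v for v in guess), 0.0) for guess in population]

    def get_evaluations_number(self):
        return self._evaluations


problem = SphereProblem()
values = problem.test_goodness_of_fit([[0.0, 0.0], [1.0, 1.0]])
```

Keeping the problem behind such an interface lets every optimization method work with any objective without knowing how it is simulated.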

Figure 1. The class diagram representing implementation of the optimization process in BioUML.

Application of non-linear optimization to systems biology

We assume that a mathematical model of some biological process consists of a set of chemical species $S = \{ S_{1}, \dots, S_{m} \}$ associated with variables $C (t) = (C_{1} (t), \dots, C_{m} (t))$ representing their concentrations, and a set of biochemical reactions $ℛ = \{ R_{1}, \dots, R_{n} \}$ with rates $v (t) = (v_{1} (t), \dots, v_{n} (t))$ depending on a set of kinetic constants $K$ . Reaction rates are modeled by standard laws of chemical kinetics. A Cauchy problem for ordinary differential equations representing a linear combination of reaction rates is used to describe the model behavior over time:

(1) $\frac{d C (t)}{d t} = N \cdot v (C, K, t)$ , $C (0) = C^{0}$ .

Here $N$ is the $m \times n$ stoichiometric matrix (one row per species, one column per reaction). We say that $C^{ss}$ is a steady state of the system (1) if

$N \cdot v (C^{ss}, K, t) = 0$ , $\lim_{t \to \infty} C_{i} (t) = C_{i}^{ss}$ .
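The system (1) and the steady-state condition can be illustrated with a short pure-Python sketch. A fixed-step fourth-order Runge-Kutta scheme stands in for whatever ODE solver is configured, and the reversible reaction A ⇌ B with its rate constants is a hypothetical toy model, not one of the models discussed in this paper.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rk4_step(rhs, c, t, h):
    """Advance the state c by one classical Runge-Kutta step of size h."""
    k1 = rhs(c, t)
    k2 = rhs([ci + 0.5 * h * ki for ci, ki in zip(c, k1)], t + 0.5 * h)
    k3 = rhs([ci + 0.5 * h * ki for ci, ki in zip(c, k2)], t + 0.5 * h)
    k4 = rhs([ci + h * ki for ci, ki in zip(c, k3)], t + h)
    return [ci + h / 6.0 * (a + 2.0 * b + 2.0 * d + e)
            for ci, a, b, d, e in zip(c, k1, k2, k3, k4)]


def simulate(stoichiometry, rates, c0, t_end, h):
    """Integrate dC/dt = N * v(C, t) from C(0) = c0 up to t_end."""
    def rhs(c, t):
        return mat_vec(stoichiometry, rates(c, t))

    c, t = list(c0), 0.0
    for _ in range(round(t_end / h)):
        c = rk4_step(rhs, c, t, h)
        t += h
    return c


# Hypothetical toy model: A -> B with rate K_F * A, and B -> A with rate K_R * B.
K_F, K_R = 2.0, 1.0
N = [[-1, 1],   # row for species A: consumed by reaction 1, produced by reaction 2
     [1, -1]]   # row for species B


def rates(c, t):
    return [K_F * c[0], K_R * c[1]]


c_end = simulate(N, rates, c0=[3.0, 0.0], t_end=20.0, h=0.01)
residual = mat_vec(N, rates(c_end, 20.0))  # N * v, close to zero at a steady state
```

For this toy model the analytical steady state is A = 1, B = 2, so the simulated state at the end of the run can be compared against it.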

Identification of the parameters $K$ and initial concentrations $C^{0}$ is based on experimental data represented by a set of points $C_{i}^{exp} (t_{ij})$ defining the dynamics of the variables $C_{1} (t), \dots, C_{l} (t)$ , $l \leq m$ , at given times $t_{ij}$ , $j = 1, \dots, r_{i}$ , where $r_{i}$ is the number of such points for the concentration $C_{i} (t)$ , $i = 1, \dots, l$ . The problem of parameter identification consists in minimizing the function of deviations, defined as the normalized sum of squares [Hoops et al, 2006]:

(2) $ϕ (C^{0}, K) = \sum_{i = 1}^{l} \sum_{j = 1}^{r_{i}} \frac{ω_{\min}}{ω_{i}} \cdot {(C_{i} (t_{ij}) - C_{i}^{exp} (t_{ij}))}^{2}$ ,

where the normalization factors $ω_{\min} / ω_{i}$ with $ω_{\min} = \min_{i} ω_{i}$ are used to give all concentration trajectories similar importance. The weights $ω_{i}$ are computed from the experimentally measured concentrations by one of the following formulas: $ω_{i}^{sq} = \sqrt{r_{i}^{- 1} \cdot \sum_{j} {(C_{i}^{exp} (t_{ij}))}^{2}}$ (mean square value), $ω_{i}^{mean} = | r_{i}^{- 1} \cdot \sum_{j} C_{i}^{exp} (t_{ij}) |$ (mean value) and $ω_{i}^{st} = \sqrt{ω_{i}^{sq} \cdot ω_{i}^{sq} - ω_{i}^{mean} \cdot ω_{i}^{mean}}$ (standard deviation).
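A minimal Python sketch of formula (2) and of the three weight formulas is given below. The data layout (dictionaries mapping a species name to a list of values at the measured time points) and the function names are assumptions made for the example.

```python
import math


def weight(values, method="mean_square"):
    """Compute the weight of one experimental trajectory."""
    r = len(values)
    mean_sq = math.sqrt(sum(v * v for v in values) / r)
    mean = abs(sum(values) / r)
    if method == "mean_square":
        return mean_sq
    if method == "mean":
        return mean
    if method == "std":
        # max() guards against a tiny negative value caused by rounding.
        return math.sqrt(max(0.0, mean_sq * mean_sq - mean * mean))
    raise ValueError("unknown weight method: " + method)


def objective(simulated, experimental, method="mean_square"):
    """Normalized sum of squared deviations over all measured species.

    A zero weight (for example an all-zero trajectory) is not handled here.
    """
    weights = {name: weight(vals, method) for name, vals in experimental.items()}
    w_min = min(weights.values())
    total = 0.0
    for name, exp_values in experimental.items():
        factor = w_min / weights[name]
        total += factor * sum((s - e) ** 2
                              for s, e in zip(simulated[name], exp_values))
    return total


# Hypothetical data: species B is measured on a ten times larger scale than A.
experimental = {"A": [1.0, 1.0], "B": [10.0, 10.0]}
simulated = {"A": [2.0, 1.0], "B": [10.0, 12.0]}
value = objective(simulated, experimental)
```

In this example the squared deviation of B (4) is scaled by 1/10, while that of A (1) is taken in full, so the large-scale species does not dominate the sum.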

If we want to consider additional constraints

(3) $g_{s} (C, K) \leq 0$ , $s = 1, \dots, p$ ,

holding for concentrations $C (t)$ and parameters $K$ over some period of time $t \in [t_{s}^{start}, t_{s}^{end}]$ , the penalty function is defined as

(4) $ψ (C^{0}, K) = \sum_{s = 1}^{p} {\left( \frac{\sum_{t = t_{s}^{start}}^{t_{s}^{end}} \max \{ 0, g_{s} (C, K) \}}{t_{s}^{end} - t_{s}^{start} + 1} \right)}^{2}$ .

The inner sum is taken over the nodes of the grid defined by the ODE solver when it computes a numerical solution of the system (1).
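Formula (4) can be sketched in Python as follows. The sketch assumes that the denominator counts the grid nodes falling inside the time window of each constraint, and it passes the parameters $K$ to the constraint functions through closures; both choices, like the names used, are assumptions of this example.

```python
def trajectory_penalty(times, trajectory, constraints):
    """Sum over constraints of the squared mean violation inside each window.

    times       -- grid nodes produced by the ODE solver
    trajectory  -- concentration vectors, one per grid node
    constraints -- (g, t_start, t_end) triples with g(C) <= 0 when satisfied
    """
    total = 0.0
    for g, t_start, t_end in constraints:
        violations = [max(0.0, g(c))
                      for t, c in zip(times, trajectory)
                      if t_start <= t <= t_end]
        if violations:
            # Assumption: the denominator is the number of nodes in the window.
            total += (sum(violations) / len(violations)) ** 2
    return total


# Hypothetical example: the single species must stay below 1 for 1 <= t <= 3.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
trajectory = [[0.5], [1.5], [2.5], [1.5], [0.5]]
constraints = [(lambda c: c[0] - 1.0, 1.0, 3.0)]

psi = trajectory_penalty(times, trajectory, constraints)
```

Here the violations at the three nodes inside the window are 0.5, 1.5 and 0.5, so the penalty equals (2.5 / 3)^2.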

In a particular case, experimental data may be represented by steady state values of species concentrations. Then the functions $ϕ$ and $ψ$ take simpler forms:

$ϕ (C^{0}, K) = \sum_{i = 1}^{l} \frac{ω_{\min}}{ω_{i}} \cdot {(C_{i}^{ss} - C_{i}^{exp\_ss})}^{2}$ , $ψ (C^{0}, K) = \sum_{s = 1}^{p} {(\max \{ 0, g_{s} (C^{ss}, K) \})}^{2}$ ,

where $C_{i}^{exp\_ss}$ and $C_{i}^{ss}$ , $i = 1, \dots, l$ , denote experimental and simulated steady state values.

Typically, researchers want to estimate model parameters using experimental data obtained under different experimental conditions, i.e. with different initial concentrations $C^{01}, \dots, C^{0k}$ of species. In such a case, we consider the functions

(5) $ϕ (C^{01}, \dots, C^{0k}, K) = \sum_{i = 1}^{k} ϕ (C^{0i}, K)$ and $ψ (C^{01}, \dots, C^{0k}, K) = \sum_{i = 1}^{k} ψ (C^{0i}, K)$ .

Implementation of the parameter estimation process in BioUML

Initiation of the parameter estimation process requires definition of many details including specification of the search space, the admissible region, settings of numerical methods to solve ODE system and optimization problem, links to the files with experimental data and model description, etc. In order to structure this information, we designed an appropriate hierarchy of classes (fig. 2) taking into account the following rules:

experimental data must be represented by time-courses or steady states of chemical species concentrations; in the first case, these data may be expressed as percentage values obtained, for example, on the basis of the Western-blot technology;
an initial state of the Cauchy problem must be specified for each considered file with experimental data (these states may be different for some experiments);
parameters to fit may be divided into local and global:
- the local parameters can take different values for some groups of experiments (for example, conducted for different cell lines);
- the global parameters have the same value for all experiments.
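The distinction between local and global parameters, combined with the summation over experiments in (5), can be sketched as follows. The dictionary layout, the parameter names and the toy objective are hypothetical; the sketch only shows how each experiment sees the shared global values plus the local values of its own group.

```python
def expand_parameters(global_params, local_params, cell_line):
    """Merge global values with the local values of one experimental group."""
    merged = dict(global_params)
    merged.update(local_params.get(cell_line, {}))
    return merged


def total_objective(experiments, global_params, local_params, single_objective):
    """Sum the single-experiment objective over all experiments."""
    total = 0.0
    for experiment in experiments:
        params = expand_parameters(global_params, local_params,
                                   experiment["cell_line"])
        total += single_objective(experiment, params)
    return total


def toy_objective(experiment, params):
    """Hypothetical one-point model: predicted = k_local * k_global * c0."""
    predicted = params["k_local"] * params["k_global"] * experiment["c0"]
    return (predicted - experiment["observed"]) ** 2


experiments = [
    {"cell_line": "lineA", "c0": 1.0, "observed": 2.0},
    {"cell_line": "lineB", "c0": 2.0, "observed": 6.0},
]

# k_local is fitted separately for each cell line, k_global is shared.
with_local = total_objective(
    experiments,
    {"k_global": 1.0},
    {"lineA": {"k_local": 2.0}, "lineB": {"k_local": 3.0}},
    toy_objective,
)

# The same data with k_local forced to a single shared value.
all_global = total_objective(
    experiments, {"k_global": 1.0, "k_local": 2.0}, {}, toy_objective
)
```

In this toy case the data can be matched exactly only when `k_local` is allowed to differ between the two groups.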

The main class Optimization in fig. 2 comprises definition of the optimization method and parameters including the model parameters to fit, parametric constraints holding for given time intervals, and experiments. The following fields keep information about each experiment:

cellLine defines an experimental group to evaluate local parameters of the model;
diagramStateName corresponds to the model initial state;
experimentType defines the time-course or steady state type of experimental data;
tableSupport contains the link to the file with experimental data, and specifies a method for calculation of weights $ω_{i}$ in the formula (2);
parameterConnections is a list of correspondences between variable names in the model and in the experimental file including information whether data in this file are expressed as exact values or as values related to a given time point.

Figure 2. The class diagram of the optimization plug-in in BioUML.

An object of the OptimizationMethod class contains a set of parameters (methodParameters), including links to the file with the processed model (diagramPath) and to the directory where results are saved when parameter estimation is completed (resultPath). The estimation process begins immediately after the object control receives the command doRun. The goodness of the current fit is determined through the object optimizationProblem, which provides correct simulation of the model. First, functions (2) and (4) are evaluated separately for each initial state of the model by calling the method testGoodnessOfFit on each object of the class SingleExperimentParameterEstimation associated with a particular experiment. The total values of these functions are then calculated in the class ParameterEstimationProblem by formulas (5).

The parameter estimation process is optimized using the following technologies:

Simulation of the Cauchy problem for different values of the fitted parameters is accelerated by automatically generating and compiling a Java class file at the first iteration of the optimization algorithm. At subsequent iterations, the current values are passed to an object of this compiled class, which computes a solution of the Cauchy problem.
Optimization methods that consider a population of guesses are accelerated by parallelizing the calculations. The following task (SimulationTask in fig. 2) is generated for each guess:
- find a solution or a steady state of the system (1) for the adjustable parameter values;
- evaluate the objective function (2) and the penalty function (4).
When all tasks have been generated, they are passed to the executor service, which distributes their execution among the predefined threads.
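The task-per-guess scheme can be sketched in Python with the standard `concurrent.futures` module. The `simulation_task` function is a hypothetical stand-in for the simulation and evaluation steps listed above. In standard CPython builds, threads do not speed up CPU-bound pure-Python code, so the sketch illustrates the task structure only, not the speed-up obtained in Java.

```python
from concurrent.futures import ThreadPoolExecutor


def simulation_task(guess):
    """Stand-in for one task: return (objective, penalty) for a single guess."""
    phi = sum((value - 1.0) ** 2 for value in guess)      # toy objective
    psi = sum(max(0.0, -value) ** 2 for value in guess)   # toy constraint: value >= 0
    return phi, psi


def evaluate_population(population, task=simulation_task, n_threads=4):
    """Create one task per guess and run the tasks on a fixed pool of threads."""
    with ThreadPoolExecutor(max_workers=n_threads) as executor:
        # map() returns the results in the order of the input guesses.
        return list(executor.map(task, population))


population = [[1.0, 1.0], [0.0, 2.0], [-1.0, 1.0]]
results = evaluate_population(population)
```

Because the tasks are independent of each other, the results can be collected in input order and passed back to the optimization method unchanged.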

Fig. 3 shows a graphical user interface of optimization methods in the BioUML workbench (desktop edition). The upper left panel includes a list of methods. Description of the selected method is provided below. The upper right panel defines the search space. Under it you can find the tab panel with settings of optimization problem. In fig. 3 the selected tab contains method parameters and fields displaying intermediate values of objective/penalty functions and the number of passed evaluations. The next tab includes description of experimental data and specifies settings for all experiment fields listed above.

Figure 3. The user interface of the optimization plug-in provided by the BioUML software.

Web edition optimization

BioUML web edition (http://ie.biouml.org/bioumlweb/) is a web application providing access to BioUML tools and data via the Internet. The user can manipulate the data stored on the BioUML server and run analyses through the web browser. The web interface is a set of HTML pages with interactive JavaScript content. Ajax technology is used to communicate with the server and process user activities without page reloading.

The web edition provides a set of optimization tools similar to that of the workbench edition. The user can set the bounds and initial values of optimization parameters, select and adjust an optimization method, manage experimental data, and set up constraints. Once all parameters are adjusted, the optimization process can be launched. Optimization results can be saved and then viewed as a table of fitted parameter values. A graphical representation of the optimization process is also available.

Analysis of the methods convergence

Stochastic methods (SRES, MOCell, PSO and ASA) for global optimization rely on probabilistic approaches and have weak theoretical guarantees of convergence to the global optimum. However, they can locate its vicinity relatively efficiently [Moles et al, 2003]. In contrast, the deterministic method glbSolve guarantees global optimality if the objective function is continuous, or at least continuous in the neighborhood of a global optimum [Björkman and Holmström, 1999]. However, it cannot solve general global optimization problems with certainty in finite time.

To analyze convergence rate of the implemented methods, we considered a reaction chain (fig. 4, table 2) extracted from the model by L. Neumann et al. [Neumann et al, 2010] and representing activation of caspase-8 triggered by the receptor CD95 (APO-1/Fas).

Figure 4. The test model of caspase-8 activation.

Table 2. List of reactions of caspase-8 activation.

№	Reactions	Kinetic laws
r1	CD95L + FADD:CD95R → DISC	$k_{1} \cdot C_{C D 95 L} \cdot C_{C D 95 R : F A D D}$
r2	DISC + pro8 → DISC:pro8	$k_{2} \cdot C_{D I S C} \cdot C_{p r o 8}$
r3	DISC:pro8 + pro8 → 2 ⋅ p43/p41	$k_{3} \cdot C_{D I S C : p r o 8} \cdot C_{p r o 8}$
r4	2 ⋅ p43/p41 → casp8	$k_{4} \cdot C_{p 43 / p 41}^{2}$
r5	casp8 →	$k_{5} \cdot C_{c a s p 8}$

We performed estimation of parameters using the search space defined as

$0 \leq k_{1} \leq 1$ , $0 \leq k_{5} \leq 0.1$ , $0 \leq k_{i} \leq 10^{- 3}$ , $i = 2, 3, 4,$

where upper bounds were chosen based on the order of magnitude of parameter values proposed by authors of the original model. Initial values of variables were fixed according to [Neumann et al, 2010]:

$C_{C D 95 L} (0) = 113.220$ , $C_{C D 95 R : F A D D} (0) = 91.266$ , $C_{p r o 8} (0) = 64.477$ .

Estimation was based on the experimental data obtained by Neumann et al. for procaspase-8 and its cleaved products p43/p41 and caspase-8 (table 3).

Table 3. Experimental data obtained by Neumann et al. for total procaspase-8 (pro-8), p43/p41 and caspase-8 (casp-8).

Time (min)	p43/p41 (nM)	pro-8 (nM)	casp-8 (nM)
0.0	0.058	59.963	0.000
10.0	0.268	57.565	0.041
20.0	4.760	58.590	0.316
30.0	8.252	59.422	1.397
45.0	16.144	48.190	3.520
60.0	17.021	38.950	3.947
90.0	15.269	23.502	4.871
120.0	12.530	13.127	4.878
150.0	10.335	10.703	4.228

Fig. 5 shows how the mean value of the objective function depends on the number of considered guesses over 100 runs of the optimization process. The best, mean and worst values of $ϕ$ , and the best guesses found by the methods after considering 10⁷ guesses, are listed in tables 4 and 5, respectively.

As can be seen from these tables, the best results were obtained by particle swarm optimization (PSO) and the cellular genetic algorithm (MOCell). SRES, MOCell and PSO found similar solutions. ASA and glbSolve found markedly different values of the parameters $k_{1}$ and $k_{2}$ and reached higher best and mean values of $ϕ$ than the first three methods (table 4).

For comparison, SRES was the best-performing method in some test cases studied by Moles et al. [Moles et al, 2003]. However, those authors did not examine genetic algorithms or particle swarm optimization.

Figure 5. Dynamics of mean values of the objective function for 100 runs of the optimization process. The best value obtained by the particle swarm optimization is marked by the red line.

Table 4. Statistics of values of the function $ϕ$ for 100 runs of the optimization process (the number of considered guesses was 10⁷).

Methods	The best value of $ϕ$	The mean value of $ϕ$	The worst value of $ϕ$
PSO	11.787	13.164	14.703
MOCell	12.082	13.484	14.771
SRES	12.466	14.987	18.283
ASA	13.728	15.794	16.610
glbSolve	16.614	16.614	16.614

Table 5. The best guesses obtained by optimization methods for 100 runs of the optimization process.

Parameters	SRES	MOCell	PSO	ASA	glbSolve
$k_{1}$	0.0004691	0.0004611	0.0004277	0.0001028	0.0020576
$k_{2}$	0.0002059	0.0002046	0.0002155	0.0007875	0.0001228
$k_{3}$	0.0009999	0.0010000	0.0009984	0.0009930	0.0009527
$k_{4}$	0.0007915	0.0008225	0.0008419	0.0008117	0.0007790
$k_{5}$	0.0325900	0.0336720	0.0334167	0.0334118	0.0313443

Comparison of parameter estimation features of BioUML with other software

We compared the optimization tools of BioUML with the following software applications, which are designed for analysis of biochemical networks and support model fitting:

COPASI (Complex Pathway Simulator) – a stand-alone program providing a C++ API [Hoops et al, 2006];
AMIGO (Advanced Model Identification using Global Optimization) – a multi-platform (Windows and Linux) MATLAB-based toolbox [Balsa-Canto and Banga, 2011];
SBToolbox 2 (Systems Biology Toolbox 2) – a part of the SBPOP package requiring MATLAB [Schmidt and Jirstrand, 2005];
PET (Parameter Estimation Toolkit) – a graphical user interface intended to run under Windows, Mac OS X, and Unix [Shaffer et al, 2009];
PottersWheel – a framework designed as a MATLAB toolbox [Maiwald and Timmer, 2008].

Details of the comparison are given in table 6.

Table 6. Comparison of the parameter estimation features for different software applications.

Features	BioUML	COPASI	AMIGO	SBToolbox 2	PET	PottersWheel
Environment	Java	C++	MATLAB	MATLAB	Perl, Gtk+	MATLAB
Experimental data:
– Multi-experiment fitting	+	+	+	+	+	+
– Experiment types:
– time course	+	+	+	+	+	+
– steady-state	+	+	−	−	+	−
– Individual initial state of the model for each experiment	+	−	+	+	+	+
– Error bars	−	−	+	+	−	+
– Normalization of data using weights	+	+	+	+	+	+
Local (experiment dependent) and global parameters	+	−	+	+	−	−

Further, we compared computation speed of the optimization methods implemented in BioUML and COPASI. For this purpose, we considered a series of test cases. A brief description of the models used in these test cases is provided below. For more details, including specification of experimental data and fitting parameters, see Additional file 1 of the supplementary materials.

In the first test case, we analyzed three models of CD95-induced caspase-8 activation constructed on the basis of the model by Neumann et al. [Neumann et al, 2010] with varying degrees of detail (fig. 6, A-C). The second test case was proposed by Mendes et al. [Mendes et al, 2009] and corresponds to the model of the MAP kinase cascade (fig. 7) developed by Kholodenko [Kholodenko, 2000]. Finally, we tested the model of Bagci et al. [Bagci et al, 2006] representing mitochondria-dependent apoptosis resulting from the cooperative formation of the heptameric apoptosome complex and activation of caspase-9 and caspase-3 (fig. 8).

Figure 6. The models of caspase-8 activation constructed based on the model by Neumann et al. with the varying number of species: 7 (A), 13 (B), and 18 (C).

Figure 7. The model of the MAP kinase cascade constructed by Kholodenko et al.

Figure 8. The model of mitochondria-dependent apoptosis proposed by Bagci et al.

Comparing the number of objective function evaluations per second for these test cases, we found that BioUML outperformed COPASI (fig. 9).

Figure 9. The number of the objective function evaluations per second for different test cases in COPASI and BioUML: the model by Neumann et al. including 7 (A), 13 (B), and 18 (C) species; the model by Kholodenko et al. consisting of 8 species (D); the model of Bagci et al. consisting of 32 species (E).

Discussion

In this paper, we considered the parameter estimation tools of the BioUML software. These tools can be applied to biological systems described by a set of ODEs. The fitting process is based on experimental time-course or steady-state measurements and minimizes the error between these measurements and the corresponding model predictions. We implemented several stochastic and deterministic global optimization methods as a new plug-in for BioUML. None of these methods is effective in all cases. Nevertheless, on the basis of our observations, we conclude that adaptive simulated annealing can be used when it is necessary to find the vicinity of the solution quickly. When the quality of the solution matters more than the rate of convergence, methods such as MOCell, PSO and SRES are preferable.

Parameter fitting is an important part of quantitative biological modeling. However, if the model includes more elements than are necessary to approximate the experimental data with the given accuracy, we face the problem of overfitting [Hawkins, 2004]. In this case, there is no single best solution, and it is expedient to find the distribution of parameter values that are compatible with the observed experimental dynamics. For this purpose, the parameter estimation process should be run many times (preferably with different optimization methods) and bounds should be evaluated for all parameters. Implementing this technique in BioUML is a task for future work.

We successfully applied our optimization plug-in to create the combined model of the CD95 and NF-κB signaling pathways [Kutumova et al, 2013], where the problem of parameter overfitting was solved using the methodology of model reduction [Gorban et al, 2010].

References

Bagci E Z, Vodovotz Y, Billiar T R, Ermentrout G B, Bahar I. Bistability in apoptosis: roles of bax, bcl-2, and mitochondrial permeability transition pores. Biophys J. 2006;90(5):1546–1559. DOI:10.1529/biophysj.105.068122 [PMID:16339882]

Balsa-Canto Eva, Banga Julio R. AMIGO, a toolbox for advanced model identification in systems biology using global optimization. Bioinformatics. 2011;27(16):2311–2313. DOI:10.1093/bioinformatics/btr370 [PMID:21685047]

Björkman M, Holmström K. Global Optimization Using the DIRECT Algorithm in Matlab. Advanced Modeling and Optimization. 1999;1(2):17–37

Brown Peter N, Byrne George D, Hindmarsh Alan C. VODE: A Variable-Coefficient ODE Solver. SIAM J. Sci. and Stat. Comput. 1989;10(5):1038–1051. ISSN: 0196-5204 DOI:10.1137/0910062

Dormand J R, Prince P J. A family of embedded Runge-Kutta formulae. Journal of Computational and Applied Mathematics. 1980;6(1):19–26. ISSN: 03770427 DOI:10.1016/0771-050X(80)90013-3

Gorban A N, Radulescu O, Zinovyev A Y. Asymptotology of chemical reaction networks. Chemical Engineering Science. 2010;65(7):2310–2324. ISSN: 00092509 DOI:10.1016/j.ces.2009.09.005

Hairer E, Wanner G. Solving Ordinary Differential Equations II: Stiff and Differential-Algebraic Problems. 2nd revised ed. Berlin: Springer; 1996.

Hawkins Douglas M. The problem of overfitting. J Chem Inf Comput Sci. 2004;44(1):1–12. DOI:10.1021/ci0342472 [PMID:14741005]

Hoops Stefan, Sahle Sven, Gauges Ralph, Lee Christine, Pahle Jürgen, Simus Natalia, Singhal Mudita, Xu Liang, Mendes Pedro, Kummer Ursula. COPASI--a COmplex PAthway SImulator. Bioinformatics. 2006;22(24):3067–3074. DOI:10.1093/bioinformatics/btl485 [PMID:17032683]

Hucka M, Finney A, Sauro H M, Bolouri H, Doyle J C, Kitano H, Arkin A P, Bornstein B J, Bray D, Cornish-Bowden A, Cuellar A A, Dronov S, Gilles E D, Ginkel M, Gor V, Goryanin I I, Hedley W J, Hodgman T C, Hofmeyr J-H, Hunter P J, Juty N S, Kasberger J L, Kremling A, Kummer U, Le Novère N, Loew L M, Lucio D, Mendes P, Minch E, Mjolsness E D, Nakayama Y, Nelson M R, Nielsen P F, Sakurada T, Schaff J C, Shapiro B E, Shimizu T S, Spence H D, Stelling J, Takahashi K, Tomita M, Wagner J, Wang J. The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003;19(4):524–531. [PMID:12611808]

Ingber L. Adaptive simulated annealing (ASA): Lessons learned. Control and Cybernetics. 1996;25(1):33–54

Kholodenko B N. Negative feedback and ultrasensitivity can bring about oscillations in the mitogen-activated protein kinase cascades. Eur J Biochem. 2000;267(6):1583–1588. [PMID:10712587]

Kutumova Elena, Zinovyev Andrei, Sharipov Ruslan, Kolpakov Fedor. Model composition through model reduction: a combined model of CD95 and NF-κB signaling pathways. BMC Syst Biol. 2013;7:13. DOI:10.1186/1752-0509-7-13 [PMID:23409788]

Maiwald Thomas, Timmer Jens. Dynamical modeling and multi-experiment fitting with PottersWheel. Bioinformatics. 2008;24(18):2037–2043. DOI:10.1093/bioinformatics/btn350 [PMID:18614583]

Mendes Pedro, Hoops Stefan, Sahle Sven, Gauges Ralph, Dada Joseph, Kummer Ursula. Computational modeling of biochemical networks using COPASI. Methods Mol Biol. 2009;500:17–59. DOI:10.1007/978-1-59745-525-1_2 [PMID:19399433]

Moles Carmen G, Mendes Pedro, Banga Julio R. Parameter estimation in biochemical pathways: a comparison of global optimization methods. Genome Res. 2003;13(11):2467–2474. DOI:10.1101/gr.1262503 [PMID:14559783]

Nebro Antonio J, Durillo Juan J, Luna Francisco, Dorronsoro Bernabé, Alba Enrique. MOCell: A cellular genetic algorithm for multiobjective optimization. Int. J. Intell. Syst. 2009;24(7):726–746. ISSN: 08848173 DOI:10.1002/int.20358

Neumann Leo, Pforr Carina, Beaudouin Joel, Pappa Alexander, Fricker Nicolai, Krammer Peter H, Lavrik Inna N, Eils Roland. Dynamics within the CD95 death-inducing signaling complex decide life and death of cells. Mol Syst Biol. 2010;6:352. DOI:10.1038/msb.2010.6 [PMID:20212524]

Runarsson T P, Yao Xin. Stochastic ranking for constrained evolutionary optimization. IEEE Trans. Evol. Computat. 2000;4(3):284–294. ISSN: 1089778X DOI:10.1109/4235.873238

Schmidt Henning, Jirstrand Mats. Systems Biology Toolbox for MATLAB: a computational platform for research in systems biology. Bioinformatics. 2005;22(4):514–515. DOI:10.1093/bioinformatics/bti799 [PMID:16317076]

Shaffer Clifford A, Zwolak Jason W, Randhawa Ranjit, Tyson John J. Modeling molecular regulatory networks with JigCell and PET. Methods Mol Biol. 2009;500:81–111. DOI:10.1007/978-1-59745-525-1_4 [PMID:19399431]

Sierra Margarita R, Coello Carlos A Coello. Improving PSO-Based Multi-objective Optimization Using Crowding, Mutation and ∈-Dominance. In: Evolutionary Multi-Criterion Optimization. Berlin, Heidelberg: Springer Berlin Heidelberg; 2005. p. 505–519. DOI:10.1007/978-3-540-31880-4_35
