Dräger, Andreas

Computational Modeling of Biochemical Networks

Ph.D. thesis, University of Tuebingen, Tübingen, Germany, 2011. Published by Verlag Dr. Hut, Sternstraße 18, München.

Abstract

In recent years, biology has shifted its focus from the mainly descriptive exploration of individual organisms, cells, or molecules towards a holistic investigation of the complex interactions within living systems. This progress is driven by the development of large-scale measurement techniques, e.g., the time-resolved simultaneous quantification of many cellular components. The exploration of these data sets requires the application of methods from other areas of natural science and the development of novel analysis methods. At this intersection, systems biology aims to understand mutual interactions and influences between biological components. Control mechanisms and regulation of processes, which often show nonlinear behavior, are to be described. Predicting the dynamics of complex interaction systems constitutes its main challenge.

Computational modeling methods play a central role in attaining this goal. Without loss of generality, we here focus on metabolic systems in preparation for an extension to gene-regulatory systems. Physico-chemical, in particular thermodynamic, constraints restrict the investigated systems and define the domain of plausible and valid models. Setting up such a mathematical description is not only complicated and highly error-prone; it also requires knowledge from many different fields and is therefore not easily accessible to non-experts. This thesis aims at the development of a method that performs the model construction steps automatically to the widest possible extent, reducing the number of necessary human interactions to a minimum while still leading to thermodynamically feasible and correct model systems.

To this end, it introduces a five-step modeling pipeline that ultimately leads to a mathematical description of a biochemical reaction system. We discuss how to automate each individual step and how to put these steps together: First, we create a topology of interconversion processes and mutual influences between reactive species encoded in the Systems Biology Markup Language (SBML) including semantic information. Second, a procedure is established that generates kinetic equations in a context-sensitive manner. The resulting model can then be combined with already existing models. Third, we estimate the values of all newly introduced parameters in each created rate law. This procedure requires that a time series of quantitative measurements of the reactive species within this system be available, because we calibrate the parameters with the aim that the model will fit these experimental data. The applicability and functioning of these approaches are demonstrated on a model of the valine and leucine biosynthesis in C. glutamicum. In a large-scale benchmark the results of seven modeling techniques are systematically compared when estimating their parameters with a multitude of nature-inspired heuristic optimization processes. This step is crucial for the development of an automatic modeling procedure because it highlights the strengths and weaknesses of all these approaches. Fourth, an experimental validation of the resulting model is advisable. Fifth, a model report is generated automatically to document the model with all of its components.

In an extension, this computer-aided modeling pipeline is further developed into a fully automatic procedure, the AMUSE algorithm (Automated Modeling Using Specialized Enzyme kinetics). The network topology in the form of an SBML file and biological reference data, i.e., at least one time series of participating metabolite concentrations and some relevant target fluxes, constitute its only required input. Based on the latest estimation methods for standard reaction Gibbs energies, AMUSE determines a thermodynamically feasible, minimal equilibrium configuration of the system, identifies the key reactions and selects kinetic equations describing all reaction velocities. Furthermore, it estimates all remaining parameters with respect to given experimental data.

For a better understanding, we begin with an introduction to fundamental modeling and generalized approaches for common rate equations. Subsequently, parameter estimation techniques are introduced and evaluated on the valine and leucine biosynthesis in C. glutamicum. Thereafter, we discuss the standardization efforts in systems biology that are required for the computer-aided modeling pipeline. This pipeline necessitates extending the existing standards. In particular, JSBML, the Java library for SBML, has been developed and implemented to simplify computation and manipulation of complex models. A discussion of possible further improvements and an extension of the suggested modeling pipeline to gene-regulatory networks completes this thesis.

Downloads and Links

[pdf]

BibTeX

@phdthesis{Draeger2011a,
  author = {Dr\"ager, Andreas},
  title = {Computational Modeling of Biochemical Networks},
  school = {University of Tuebingen},
  year = {2011},
  address = {T\"ubingen, Germany},
  month = jan,
  abstract = {In recent years, biology has shifted its focus from the mainly descriptive
	exploration of individual organisms, cells, or molecules towards
	a holistic investigation of the complex interactions within living
	systems. This progress is driven by the development of large-scale
	measurement techniques, e.g., the time-resolved simultaneous quantification
	of many cellular components. The exploration of these data sets requires
	the application of methods from other areas of natural science and
	the development of novel analysis methods. At this intersection, systems
	biology aims to understand mutual interactions and influences between
	biological components. Control mechanisms and regulation of processes,
	which often show nonlinear behavior, are to be described. Predicting
	the dynamics of complex interaction systems constitutes its main
	challenge.

	Computational modeling methods play a central role in attaining this goal.
	Without loss of generality, we here focus on metabolic systems in preparation
	for an extension to gene-regulatory systems. Physico-chemical, in particular
	thermodynamic, constraints restrict the investigated systems and define the
	domain of plausible and valid models. Setting up such a mathematical
	description is not only complicated and highly error-prone; it also requires
	knowledge from many different fields and is therefore not easily accessible
	to non-experts. This thesis aims at the development of a method that performs
	the model construction steps automatically to the widest possible extent,
	reducing the number of necessary human interactions to a minimum while still
	leading to thermodynamically feasible and correct model systems.

        To this end, it introduces a five-step modeling pipeline that ultimately
	leads to a mathematical description of a biochemical reaction system.
	We discuss how to automate each individual step and how to put these
	steps together: First, we create a topology of interconversion processes
	and mutual influences between reactive species encoded in the Systems
	Biology Markup Language (SBML) including semantic information. Second,
	a procedure is established that generates kinetic equations in a
	context-sensitive manner. The resulting model can then be combined
	with already existing models. Third, we estimate the values of all
	newly introduced parameters in each created rate law. This procedure
	requires that a time series of quantitative measurements of the reactive
	species within this system be available, because we calibrate the
	parameters with the aim that the model will fit these experimental
	data. The applicability and functioning of these approaches are demonstrated
	on a model of the valine and leucine biosynthesis in \emph{C.~glutamicum}.
	In a large-scale benchmark the results of seven modeling techniques
	are systematically compared when estimating their parameters with
	a multitude of nature-inspired heuristic optimization processes.
	This step is crucial for the development of an automatic modeling
	procedure because it highlights the strengths and weaknesses of all
	these approaches. Fourth, an experimental validation of the resulting
	model is advisable. Fifth, a model report is generated automatically
	to document the model with all of its components.

	In an extension, this computer-aided modeling pipeline is further developed
	into a fully automatic procedure, the AMUSE algorithm (Automated Modeling
	Using Specialized Enzyme kinetics). The network topology in the form of an
	SBML file and biological reference data, i.e., at least one time series
	of participating metabolite concentrations and some relevant target
	fluxes, constitute its only required input. Based on the latest estimation
	methods for standard reaction Gibbs energies, AMUSE determines a
	thermodynamically feasible, minimal equilibrium configuration of
	the system, identifies the key reactions and selects kinetic equations
	describing all reaction velocities. Furthermore, it estimates all
	remaining parameters with respect to given experimental data.

        For a better understanding, we begin with an introduction to fundamental
	modeling and generalized approaches for common rate equations. Subsequently,
	parameter estimation techniques are introduced and evaluated on the
	valine and leucine biosynthesis in \emph{C.~glutamicum}. Thereafter,
	we discuss the standardization efforts in systems biology that
	are required for the computer-aided modeling pipeline. This pipeline
	necessitates extending the existing standards. In particular,
	JSBML, the Java library for SBML, has been developed and implemented
	to simplify computation and manipulation of complex
	models. A discussion of possible further improvements and an extension
	of the suggested modeling pipeline to gene-regulatory networks completes
	this thesis.},
  isbn = {978-3-86853-850-2},
  keywords = {automated modeling, AMUSE algorithm, C. glutamicum, enzyme kinetics,
	evolutionary algorithms, heuristic optimization, JSBML, leucine biosynthesis,
	metabolic modeling, gene-regulatory network, parameter estimation,
	rate law, SBML, SBML2LaTeX, SBMLsqueezer, valine biosynthesis},
  publisher = {Verlag Dr.~Hut, Sternstra{\ss}e 18, M\"unchen},
  url = {http://www.dr.hut-verlag.de/978-3-86853-850-2.html}
}