1. To analyze existing evolutionary simulation systems.
2. To formulate a general framework for evolutionary simulations from the above analysis.
3. To design and implement a class library to facilitate the construction of evolutionary simulations based on the general framework.
4. To apply the class library to published examples of evolutionary simulation to demonstrate the abilities of the toolkit.
The first objective was accomplished in Chapter 2 beginning with an overview of
the two foremost areas of biologically inspired computational methods: genetic
algorithms and artificial neural networks. This was followed by a survey of
several representative evolutionary simulation systems. The second objective
was satisfied in the third chapter with the description of a general framework
for evolutionary simulation. Chapter 4 began with a discussion of the
developer-oriented design methodology. A framework of classes was proposed
based on the set of abstract components derived from the analysis. Chapter 5
completed the third objective with a discussion of the implementation of the
class framework. The fourth objective was satisfied in Chapter 6 with a
description of three test applications.
Discussion
The flow of the thesis follows a traditional software-engineering model for
application development. System requirements are gathered through an analysis
of the problem domain. The design process translates these requirements into a
representation of software components. The implementation translates the
abstract design into a real system. Finally, the system is tested to ensure the
requirements have been met.
To understand the problem domain, a general framework for evolutionary systems was formulated from an examination of several representative systems. The framework comprises several criteria in two categories: the simulation model and the implementation model. The simulation model criteria focus on the abstractions of the simulations, including the physics and biology, while the implementation model criteria focus on programmer and user issues such as implementation platform and extensibility. The surveyed systems were re-examined within the context of the general framework. This allowed common features of the systems to be factored out into abstract components, which became the basis for the design requirements of the toolkit.
The components of the toolkit were designed using a developer-oriented design methodology, a variation of user-centered design that advocates focusing on the needs of the developer. The toolkit domain was narrowed to that of evolutionary ethological simulations, that is, programs that evolve behavior through genetic algorithms. Microsoft Windows running on PCs was chosen as the delivery platform because it is the one most accessible to biology labs. The components of the toolkit were designed to serve two purposes: to be assembled into new simulations, and to be extended into new components. This design choice led to a discussion of how object-oriented programming concepts and techniques provide a practical means of fulfilling the design requirements. The resulting design consisted of a set of base classes:
· an environment class, which represents the virtual world where the simulation occurs,
· a thing class, which represents objects that inhabit the world,
· an agent class, a specialization of thing that displays autonomous behavior,
· a program class, which drives an agent's behavior,
· a phenotype class, a specialization of agent that evolves over time, and
· a genotype class, which stores an evolvable representation of a phenotype.
These classes and several specialized classes were implemented in C++ and tested in three applications.
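The relationships among the base classes listed above might be sketched in C++ roughly as follows. This is an illustrative outline only: bioWorld, bioProgram, bioPType, and bioGType are names used in this thesis, but bioThing, bioAgent, EchoProgram, and every member function shown here are hypothetical placeholders, not the actual SIMBIOSYS interface.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch of the base-class relationships. All member names are
// assumptions made for illustration, not the real SIMBIOSYS API.

class bioThing;  // forward declaration

// Environment: the virtual world in which the simulation occurs.
class bioWorld {
public:
    virtual ~bioWorld() = default;
    virtual void add(bioThing* t) { things_.push_back(t); }
    std::size_t count() const { return things_.size(); }
private:
    std::vector<bioThing*> things_;  // non-owning pointers to inhabitants
};

// Thing: any object that inhabits the world.
class bioThing {
public:
    virtual ~bioThing() = default;
    virtual std::string kind() const { return "thing"; }
};

// Program: drives an agent's behavior by mapping a percept to an action.
class bioProgram {
public:
    virtual ~bioProgram() = default;
    virtual int nextAction(int percept) = 0;
};

// Agent: a thing that displays autonomous behavior through its program.
class bioAgent : public bioThing {
public:
    explicit bioAgent(std::unique_ptr<bioProgram> p) : program_(std::move(p)) {}
    std::string kind() const override { return "agent"; }
    int act(int percept) { return program_->nextAction(percept); }
private:
    std::unique_ptr<bioProgram> program_;
};

// Genotype: an evolvable representation of a phenotype.
class bioGType {
public:
    virtual ~bioGType() = default;
    std::vector<int> genes;
};

// Phenotype: an agent that evolves over time and carries a genotype.
class bioPType : public bioAgent {
public:
    bioPType(std::unique_ptr<bioProgram> p, std::shared_ptr<bioGType> g)
        : bioAgent(std::move(p)), genotype_(std::move(g)) {}
    std::string kind() const override { return "phenotype"; }
    const bioGType& genotype() const { return *genotype_; }
private:
    std::shared_ptr<bioGType> genotype_;
};

// Trivial concrete program, used only to exercise the sketch.
class EchoProgram : public bioProgram {
public:
    int nextAction(int percept) override { return percept; }
};
```

In this arrangement a new simulation would be assembled by deriving concrete worlds, programs, and genotypes from the abstract classes, which mirrors the two design purposes stated above: assembly and extension.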
The implementation of Conway's Game of Life demonstrated some of the basic components of the toolkit, namely a two-dimensional discrete environment inhabited by a number of interacting autonomous agents. The implementation of the Genesys/Tracker system introduced the use of the evolutionary components of the framework, specifically populations, genotypes, and phenotypes. The Prisoner's Dilemma simulation brought all the components of the class framework together with its two simultaneously evolving populations of autonomous agents with complex interactions.
The three test simulations demonstrate the capabilities of the current set of classes implemented in the framework; however, they are too similar to be a good test of generality. All three make use of the class that implements a discrete two-dimensional environment. Further testing with more diverse applications will certainly reveal some limitations in the base classes. In the same way that ostensibly portable source code is not truly portable until it has actually been ported to at least one other platform, a general base class is not truly general until it has more than one derived class. Under this criterion, four of the base classes (bioWorld, bioPopulation, bioGType, and bioPType) have not been shown to be general. Since iteration is arguably the most important step of the developer-oriented design methodology, the design of the SIMBIOSYS framework cannot be considered complete until the process has been repeated several times.
Despite its deficiencies, the SIMBIOSYS framework in its present form has
utility as a class framework for facilitating the development of evolutionary
simulations. In many ways it provides merely a skeleton for a framework, that
is, the abstract base classes specify the overall architecture of the
simulation at a high level but the lower level details, in the form of derived
classes, have yet to be implemented. This and other enhancements and extensions
to the system will be discussed in the next section.
Future Work
Further enhancements to the SIMBIOSYS framework fall into three categories:
implementation improvements, specialized base classes, and new facilities. The
first refers to the aspects of the framework design which did not make it into
the implementation described here. The second category refers to filling out
the class hierarchy with simulation components derived from the framework's
abstract base classes that could be useful in constructing new systems.
Finally, some useful facilities that fall outside the original design of the
framework will be discussed.
Implementation Enhancements
Evolution does not have an ultimate goal and the same can be said for some
evolutionary simulations. Often a simulation is started without a predefined
limit to the length of time it will run, but rather just to see what kinds of
patterns and behaviors evolve. For these types of experiments, which can last
for many days, it would be very useful to have the capability to interrupt a
simulation in progress, save its state, and restart it again at a later time.
This would require some support at the library level in the form of
persistence. Each class would have to be able to save enough information to a
secondary storage device (e.g. the disk) so that the objects could be recreated
in the same state during a separate run of the program.
The implementation of the framework uses random number functions to introduce unpredictability into the simulation, for example in the mutation operator of the bioGType class. These functions present a problem for persistence. To arrive at the same state, a random number generator will usually have to be seeded with the same value and invoked the same number of times. The first part is easy for a separate invocation of the simulation, but keeping track of the number of calls made in the previous invocation and repeating them would be a major inconvenience. This suggests that the random number generator should be encapsulated in a new framework class with persistence capability.
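One way such an encapsulating class could look is sketched below. It is not part of the SIMBIOSYS implementation: the class name bioRandom and its member functions are hypothetical, and the sketch assumes a modern C++ standard library, whose random engines (here std::mt19937) can write and read their complete internal state through the stream operators. Saving the state directly avoids having to count and replay the calls made in an earlier run.

```cpp
#include <cassert>
#include <random>
#include <sstream>
#include <string>

// Hypothetical persistent random number generator (illustration only).
class bioRandom {
public:
    explicit bioRandom(unsigned seed = 1u) : engine_(seed) {}

    // Returns an integer in [0, n); n must be positive.
    // The modulo introduces a slight bias, which is accepted in this sketch.
    unsigned below(unsigned n) {
        assert(n > 0);
        return static_cast<unsigned>(engine_() % n);
    }

    // Serializes the full engine state as text, suitable for writing to disk.
    std::string save() const {
        std::ostringstream out;
        out << engine_;
        return out.str();
    }

    // Restores a state produced by save(). Returns false and leaves the
    // generator unchanged if the text cannot be parsed.
    bool restore(const std::string& state) {
        std::istringstream in(state);
        std::mt19937 tmp;
        in >> tmp;
        if (in.fail()) return false;
        engine_ = tmp;
        return true;
    }

private:
    std::mt19937 engine_;
};
```

A simulation would call save() together with the persistence routines of its other objects when it is interrupted, and restore() when it is restarted, so that the sequence of random numbers continues exactly where it left off.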
An entire category of design elements, the Instrument classes described
on p.70, were not realized in the implementation. At the time the test
simulations were being developed it was simply more convenient to separate the
user interface into a Visual Basic module. This solution may not be as elegant
as it could have been because the development was split across two loosely
coupled environments: C++ and Visual Basic. However, this deficiency may be
looked at as an opportunity, with a slight change in perspective, because the
final implementation ended up being more portable than it would have been
following the original design. Currently the main barrier to portability
between PC and UNIX systems is the user interface; the C++ compilers are
relatively standardized across platforms but the standard PC user-interface
(Microsoft Windows) is incompatible with the one on UNIX (X/Motif). With the
user-interface component removed (or at least deferred), the rest of the
library is portable with little change. Work towards this end is already
underway with the application of the SIMBIOSYS framework to an economic
simulation based on [Tesfatsion 94].
Specialized Classes
The evaluation of the current implementation of the class framework
demonstrated that it can be successfully applied to the development of
evolutionary simulations. Much of its value comes from its extensibility, as
the framework was designed to be easily specialized to the specific
requirements of new applications. This leaves room for a middle layer of
functionality in the form of classes derived from the top-level abstract base
classes in the framework. These classes would be partly specialized so that
they alleviate some, if not all, of the effort needed to assimilate them into
an application. The bioCellWorld class is an example of this type of middle
layer component: it provides implementations for all five of the pure virtual
functions specified by the bioWorld base class. The trade-off is that it is not
as general as the bioWorld class, so it is not usable in all applications.
More specialized classes derived from bioWorld are implied by the different metaphysics observed in the surveyed simulation systems. A Tierra-like system could make use of a linear discrete world to simulate its RAM environment and several simulations displayed a two-dimensional continuous world. A sophisticated implementation of a three-dimensional world will be in demand to facilitate research along the lines of [Terzopolous 94] and [Sims 94].
Similarly, the branches of the class hierarchy under bioProgram and bioGType
can be usefully extended. The bioNNet class provides a very basic implementation of
PDP programs as described in the section on Neural Networks on p. 17.
However, a demand for more sophisticated models is anticipated because the
interaction between evolution and learning is becoming a popular area of
research [Beer 92; Belew 90; Yamauchi 94]. The existence of the bioHapGType
implies a diploid counterpart which has yet to be realized. Another variation
of the haploid genotype that is not so obvious is a variable length version,
that is, a genotype which can become longer or shorter through the use of the
crossover operators.
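The variable-length idea can be illustrated with a one-point crossover that chooses the cut point independently in each parent, so the offspring generally differ in length from their parents. The sketch below is an assumption about how such an operator might work, not a description of any existing SIMBIOSYS class; the Genes alias and the function crossoverVariable are hypothetical names, and the cut points are passed in explicitly so that the result is deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Genes = std::vector<int>;  // hypothetical haploid gene string

// One-point crossover with an independent cut point in each parent.
// cutA must lie in [0, a.size()] and cutB in [0, b.size()].
// child1 = head of a + tail of b; child2 = head of b + tail of a.
std::pair<Genes, Genes> crossoverVariable(const Genes& a, std::size_t cutA,
                                          const Genes& b, std::size_t cutB) {
    assert(cutA <= a.size() && cutB <= b.size());
    const auto ca = static_cast<std::ptrdiff_t>(cutA);
    const auto cb = static_cast<std::ptrdiff_t>(cutB);

    Genes child1(a.begin(), a.begin() + ca);
    child1.insert(child1.end(), b.begin() + cb, b.end());

    Genes child2(b.begin(), b.begin() + cb);
    child2.insert(child2.end(), a.begin() + ca, a.end());

    return {child1, child2};
}
```

For example, crossing a six-gene parent cut after its second gene with a four-gene parent cut after its third gene yields children of three and seven genes. The total amount of genetic material is conserved, but each individual genotype can grow or shrink.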
Framework Extensions
Morphogenesis refers to the developmental process of an organism, or cast in
the terms of the class framework, the process of translating an instance of
class bioGType into a corresponding instance of class bioPType. It was
previously mentioned that facilities to model morphogenesis were explicitly
excluded in order to limit the scope of this project. The developmental process
has many similarities to the evolutionary process in that global patterns
emerge from the interaction of many units [Taylor 92; Prusinkiewicz 94].
It may be possible to exploit these similarities within the class framework by
adding facilities to create simulations within simulations.
On a single-processor computer, the intrinsic parallelism of evolutionary simulations has to be simulated. The class framework does this crudely by iterating over a list of all the agents inhabiting the world at a given time step, allowing each to examine its local environment and decide on its next intended action. A disadvantage of this method is that it places no limit on the amount of actual time each agent may take to decide, so there is no evolutionary pressure for efficiency. Sophisticated new operating systems such as Windows NT provide a possible remedy in the form of lightweight processes, or threads. These are essentially processes within processes. If each agent in the simulation were assigned its own thread, the agents would behave as if each had its own processor. This has the added advantage of possibly making the system more portable to true multiprocessor platforms.
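The sequential approach described above is commonly organized as a two-phase update: every agent first records its intended action based on the same snapshot of the world, and only then are all intentions applied. The following sketch illustrates that general technique under simplifying assumptions; the Agent structure, its fields, and the rule that each agent copies its left neighbor on a ring are all hypothetical and are not taken from the SIMBIOSYS source.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal agent for illustration.
struct Agent {
    int state;     // current, externally visible state
    int intended;  // action decided in phase 1, applied in phase 2
};

// Advances the simulation by one time step while simulating parallelism.
void step(std::vector<Agent>& agents) {
    const std::size_t n = agents.size();

    // Phase 1: each agent examines its local environment (here, its left
    // neighbor on a ring) and decides on an intended action. No visible
    // state changes yet, so every agent sees the same snapshot.
    for (std::size_t i = 0; i < n; ++i) {
        const Agent& left = agents[(i + n - 1) % n];
        agents[i].intended = left.state;
    }

    // Phase 2: all intended actions take effect together.
    for (Agent& a : agents) {
        a.state = a.intended;
    }
}
```

With states 1, 2, 3, one call to step rotates them to 3, 1, 2. Updating each agent in place within a single loop would instead let later agents observe the already changed state of earlier ones, which is exactly the ordering artifact that the two phases avoid. Note that nothing in this loop bounds how long an individual agent may spend deciding.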
Finally, the SIMBIOSYS class framework can be viewed as an instantiation of the ontology for evolutionary simulations derived from the analysis of the surveyed systems presented at the end of Chapter 3. It is important to note that this is only one of possibly many ontologies. To a large extent, the makeup of the set of classes in the framework described in this thesis is a function of the particular systems that were chosen for the survey. Since the entities were derived by abstracting common features from the simulations, it is quite possible that, had a different set of systems been examined, a different ontology would have resulted that would have been useful in its own right.