by M. Marazakis and C. Nikolaou
This article outlines the design and implementation of the TPsim simulator for distributed transaction processing systems, in the context of an integrated simulation-support software architecture. The purpose of the simulator is to facilitate the performance evaluation of such systems under a variety of operational conditions, in order to gather data that can be used to validate system design decisions that affect performance. Emphasis is currently given to the study of transaction routing and scheduling algorithms. A distinguishing feature of the simulator is that it offers a simple specification language that can be used to describe system configuration and workload. The TPsim simulator can also be embedded in an experiment support environment.
The transaction processing system is assumed to comprise a number of processing nodes and one or more front-end nodes, where user requests for the execution of units of work arrive, coupled in a Shared-Nothing architecture. A front-end node routes each transaction in the sequence of steps that makes up a unit of work to a processing node for service. Simulation of transaction execution entails simulation of concurrency control, database buffer management, the logging protocol, and I/O accesses. In order to access data on multiple nodes, a ``primary'' transaction is initiated at a node selected by the front-end to execute the application program that issues database calls, and ``secondary'' transactions are started on other nodes to process the requests forwarded (``function-shipped'') to them by the primary transaction. Commitment of a transaction that has accessed data at multiple nodes requires that all transaction agents participate in a distributed two-phase commit protocol. A queuing facility is used for asynchronous communication and data passing among transactions, in the context of a multi-step workflow. Thus TPsim can model the execution of multi-transaction units of work.
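The commit step described above can be illustrated with a minimal Python sketch. It is not taken from TPsim: the `Agent` class, its fields, and the choice of the primary's node as coordinator are assumptions made purely for illustration, and failures and timeouts are ignored.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    """Hypothetical transaction agent (primary or secondary) at one node."""
    node: str
    can_commit: bool = True          # vote this agent will cast in phase 1
    state: str = "active"
    log: List[str] = field(default_factory=list)

    def prepare(self) -> bool:
        # Phase 1: write a prepare record if able to commit, then vote.
        if self.can_commit:
            self.log.append("prepare")
            self.state = "prepared"
        return self.can_commit

    def finish(self, decision: str) -> None:
        # Phase 2: log and apply the coordinator's decision.
        self.log.append(decision)
        self.state = "committed" if decision == "commit" else "aborted"


def two_phase_commit(primary: Agent, secondaries: List[Agent]) -> str:
    """Coordinator logic; assumes the primary's node coordinates."""
    agents = [primary] + secondaries
    votes = [a.prepare() for a in agents]      # phase 1: collect every vote
    decision = "commit" if all(votes) else "abort"
    for a in agents:                           # phase 2: broadcast the decision
        a.finish(decision)
    return decision
```

A single negative vote from any secondary agent aborts the whole transaction, which is why every agent that touched data must take part in the protocol.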
To give the simulator the flexibility needed to adapt to a variety of modeling requirements, a model description language has been defined. This avoids having to ``hardwire'' assumptions about the configuration of the system under study into the code implementing the simulator. Using a formal language to describe system configuration and workload also enhances portability, since explicitly stating all assumptions makes it easier for others to reproduce the results of a simulation experiment independently.
The model description language offers a number of constructs to describe the configuration of the simulated system, the design of the (distributed) database accessed by transactions, and the workload that the system has to handle. The information passed by the system modeler to the parser of this simple specification language is used to implement models of the system's structural components (processing nodes, storage devices, communication network) by instantiating objects supported by a simulation-support library. The parser also produces a data structure that is used during the simulation as the system's data catalog. It then sets in motion an event-driven simulation engine.
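To make the notion of an event-driven simulation engine concrete, the following Python sketch shows the usual core of such an engine: a time-ordered event queue and a loop that advances the simulated clock to the next event. The `Engine` class and the `simulate` helper are hypothetical names, not TPsim's interface, and the toy workload (CPU service followed by I/O, with no queueing for resources) is an assumption chosen for brevity.

```python
import heapq
import itertools
from typing import Callable, List, Tuple


class Engine:
    """Minimal event-driven simulation engine (illustrative only)."""

    def __init__(self) -> None:
        self.now = 0.0
        self._queue: List[Tuple[float, int, Callable[[], None]]] = []
        self._seq = itertools.count()   # tie-breaker: same-time events run FIFO

    def schedule(self, delay: float, action: Callable[[], None]) -> None:
        heapq.heappush(self._queue, (self.now + delay, next(self._seq), action))

    def run(self) -> None:
        while self._queue:
            # Jump the clock straight to the earliest pending event.
            self.now, _, action = heapq.heappop(self._queue)
            action()


def simulate(cpu_time: float, io_time: float, arrivals: List[float]) -> List[float]:
    """Return completion times of transactions that use CPU, then I/O.

    Assumes no contention: each transaction proceeds without waiting.
    """
    engine = Engine()
    done: List[float] = []

    def start() -> None:
        engine.schedule(cpu_time, io)            # CPU burst, then I/O

    def io() -> None:
        engine.schedule(io_time, lambda: done.append(engine.now))

    for t in arrivals:
        engine.schedule(t, start)
    engine.run()
    return done
```

For example, `simulate(1.0, 2.0, [0.0, 0.5])` returns `[3.0, 3.5]`. A realistic model would add resource queues so that transactions contend for CPUs and storage devices.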
The actual simulation of the system under study relies on a small number of primitives that allow the construction of a multi-threaded application within a single address space. Such primitives are provided by the simulation-support library. The simulator is therefore built on top of the parser for the model description language and the simulation-support library (see figure).
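One common way to obtain multiple threads of control within a single address space is cooperative scheduling, sketched below with Python generators. This is only an analogy for the kind of primitives described; the actual primitives of the simulation-support library are not specified here, and the names `transaction` and `run_round_robin` are hypothetical.

```python
from collections import deque
from typing import Deque, Generator, List


def transaction(name: str, steps: int, trace: List[str]) -> Generator[None, None, None]:
    """A simulated thread: performs one step of work, then yields control."""
    for i in range(steps):
        trace.append(f"{name}:{i}")
        yield                       # voluntary switch back to the scheduler


def run_round_robin(threads: List[Generator[None, None, None]]) -> None:
    """Hypothetical scheduler: resume each thread in turn until all finish."""
    ready: Deque[Generator[None, None, None]] = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)
            ready.append(thread)    # still has work: back of the queue
        except StopIteration:
            pass                    # thread finished: drop it
```

Because all threads share one address space and switch only at explicit yield points, shared simulation state such as the event list needs no locking.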
TPsim can be used in an experiment support environment that enables the specification of simulation experiments and automates the collection of measurements from multiple runs executed in a distributed environment.
The current implementation of the simulator has been developed on DEC Alpha AXP workstations, under the DEC OSF/1 operating system. Several transaction routing algorithms have been implemented in this framework, together with a variety of CPU scheduling policies for transactions that appear as steps of multi-transaction units of work. Work on the development and application of TPsim is carried out as part of the LYDIA research project (ESPRIT III P8144 - see URL http://www.ics.forth.gr/proj/pleiades/projects/LYDIA/).