CE-Ants: Ant-Like Agents for Path Management in the Next-Generation Internet

by Otto Wittner and Bjarne E. Helvik


In general, today's Internet can provide only 'best effort' connections, with no quality of service (QoS) guarantees. Both new and popular existing applications and services would benefit from connections with QoS guarantees, eg minimum capacity, maximum delay, continuity of service and availability. How can connections in the Next-Generation Internet (NGI) be managed such that desired QoS guarantees can be provided while the robustness of today's network is maintained? We pursue an approach inspired by the robustness and emergent behaviour seen in swarms combined with stochastic optimization techniques.

One of the fundamental design objectives of today's Internet was connectivity robustness. First of all, two terminals connected to the network should be able to communicate. Secondly, the network should stay operative and provide reliable communication even when a significant number of network nodes and transmission links fail. Hence packet forwarding is controlled in a fully distributed manner and routing units are more or less independent of each other. In this way, the Internet provides connectivity to everybody with a high probability. Unfortunately, no QoS guarantees or differentiations can be given.

Resource Control and Path Management
It is common for network operators to use over-provisioning combined with virtual connections realized by multiprotocol label switching (MPLS), as a way of providing sufficiently low packet losses and delays, load distribution and capacity control. Continuity of service, short downtimes and high availability are obtained by supplementing the traditional Internet restoration techniques with protection switching based on MPLS and/or in the underlying optical layer. These techniques require centralized planning of resource utilization and path management, which significantly weakens the inherent robustness and autonomy of the traditional Internet.

Furthermore, when multiple paths exist between two communicating terminals, today's routing systems by default forward all packets along the path regarded as having the lowest cost. The exact definition of the lowest cost is generally decided by the network operator, and is not influenced by applications and users. Hence path management does not handle service differentiation. As for MPLS configurations, cost metrics are set manually and infrequently during centralized planning. Hence it is only possible to consider long-term variation (ie over months to years) in general traffic patterns and topology.

Would it be possible to design a robust path management system that is compliant with the 'traditional Internet', that operates on short and medium timescales (minutes to hours), provides controlled QoS differentiation of services or service classes, and enforces operator policies in an optimal way? We believe self-organizing systems inspired by the robustness and emergent behaviour seen in swarms may have the necessary properties to enable such next-generation path management.

CE-Ants
At the Centre for Quantifiable Quality of Service in Communication Systems (Q2S) and the Department of Telematics (ITEM) at NTNU, a set of path management systems is being developed based on a bio-inspired stochastic optimization technique known as Cross Entropy ants (CE-ants). The CE-ant system is a distributed version of Reuven Rubinstein's popular cross entropy method for optimization. CE-ants have been shown to find near-optimal solutions to NP-complete path-finding problems of large scale. Given its fully distributed nature, it is a promising candidate for path management in the NGI. The figure illustrates the typical components in a CE-ant system. A high number of ant-like agents traverse the network in search of paths between given source-destination node pairs while seeking to fulfil a set of requirements (the path objective) provided by the owner/producer of the agents. In their simplest form, agents may only be small signalling packets transporting state information. Associating a behaviour with them, however, makes the system description more intuitive. Each node manages a database of messages that are read and updated by the agents, that is, agents communicate asynchronously. Selected nodes (source nodes) provide agent factory functionality, while other nodes (destination nodes) provide search focus control. All nodes provide path utilization functionality. Note that no single node or component is significantly more important than any other. There is no centralized control and no centralized storage of information.

During a single CE-ant agent life cycle, a path is found and reported in the following manner:

  1. After visiting a node (initially the factory node) an agent decides which node to visit next by applying a probability distribution. The distribution is based on the agent's memory of previously visited nodes, the path objective, and the messages (written or updated by other cooperating/competing agents) available in the current node.
  2. When reaching its given destination node, ie completing a path, an agent evaluates the cost of the path found and adjusts search focus parameters shared with cooperating agents.
  3. Finally, the agent backtracks along the path, writing or updating messages at every node visited. Messages are updated according to the evaluated cost of the path.

After a number of agent life cycles, the overall distribution of messages will converge toward a pattern indicating not only the optimal path but also alternative close-to-optimal paths. As long as factory nodes produce new agents, the message distribution will be maintained. Hence, if traffic conditions change or link/node failures occur, the optimal path (and the close-to-optimal path) will be updated accordingly. The time scales of such updates are governed by the production frequency in the factory nodes. Finally, any node may initiate a connection set-up by applying the information provided by the message distribution, ie it may regard the message distribution as a distributed routing table and utilize any of the paths found.

CE-ant-based path management.
CE-ant-based path management.

Three path management systems based on CE-ants are currently being developed at Q2S and ITEM. Adaptive restoration ants simply maintain routing information continuously in a network. Traffic is assumed to be routed (stochastically or along the 'best' path) hop-by-hop by applying the information. On failures, traffic is forwarded along 'second-best' hops. Primary backup ants find two (link and/or node) disjoint paths between source-destination pairs, meaning they provide 1:1 protection. Back-up paths share capacity. P-cycle ants find cyclic paths that may be applied as protection rings. On link and node failures, selected traffic may be switched onto and rerouted around the ring.

A Java-based prototype of a CE-ant system has been developed at Q2S and ITEM. A testbed in the EU-project BISON will soon be ready, in which CE-ants perform monitoring tasks in a system based on Click software routers. These prototypes have shown that only a limited effort is required to realize CE-ant based path management.

Links:
http://www.q2s.ntnu.no
http://www.item.ntnu.no

Please contact:
Otto Wittner, Centre for Quantifiable Quality of Service in Communication Systems (Q2S), NTNU, Norway
E-mail: wittner@q2s.ntnu.no