Distributed Traffic Engineering (TE) is considered an effective and scalable way to control traffic distribution in large networks. However, distributed TE struggles to reach globally optimal performance because each controller observes only part of the network state. To tackle this issue, we propose a federated and collaborative TE approach that deploys intelligent agents at the regional controllers of a multi-region network. By exchanging lightweight Graph Neural Network (GNN)-encoded messages, regional controllers can collaboratively devise local routing strategies that approach optimal performance.
Project Overview:
(1) Scenario: A large network can be divided into multiple network regions, where distributed TE controllers can compute local routing decisions with low computation overhead and fast reaction. However, a critical research problem is how to achieve close-to-optimal network performance without a global view of the network.
(2) FedTe: Regional controllers exchange lightweight GNN-encoded messages to make collaborative local routing decisions that approach globally optimal load balancing performance.
(3) Centralized training pipeline of FedTe, followed by distributed online deployment.
(4) Main Techniques: Leverage Supervised Learning (SL) to learn from globally optimal routing, and use a Graph Neural Network (GNN) to support efficient message exchanges among network nodes and regions.
(5) We design a novel 2-layer GNN architecture to model the hierarchical structure of multi-region networks with intra-region encoder and inter-region encoder.
(6) Evaluation results: FedTe can achieve near-optimal load balancing performance under dynamic traffic variations and single link failure scenarios.
(7) As a distributed TE solution, FedTe also incurs much lower computation costs compared to centralized TE solutions.
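The two-level structure described in items (2) and (5) can be illustrated with a minimal sketch. It is not the FedTe model: the learned GNN encoders are replaced by plain mean aggregation with no trainable weights, node features are single floats, and the toy topology, region names, and function names are all hypothetical. It only shows the information flow: nodes first exchange messages inside their region, each region is then compressed into one small message, and regions exchange those messages with neighboring regions.

```python
from statistics import fmean


def intra_region_encode(features, adjacency, rounds=2):
    """Mean-aggregate scalar node features over neighbors inside one region."""
    h = dict(features)
    for _ in range(rounds):
        # Build the next round from the previous round's values only.
        h = {
            node: fmean([h[node]] + [h[nbr] for nbr in adjacency.get(node, [])])
            for node in h
        }
    return h


def region_summary(node_embeddings):
    """Compress a region into one lightweight message (here: a single float)."""
    return fmean(node_embeddings.values())


def inter_region_encode(summaries, region_adjacency):
    """Combine each region's summary with those of its neighboring regions."""
    return {
        region: fmean(
            [summaries[region]]
            + [summaries[nbr] for nbr in region_adjacency.get(region, [])]
        )
        for region in summaries
    }


# Hypothetical toy network: two regions, node feature = a local load reading.
regions = {
    "A": {"features": {"a1": 0.9, "a2": 0.3},
          "adjacency": {"a1": ["a2"], "a2": ["a1"]}},
    "B": {"features": {"b1": 0.1, "b2": 0.1},
          "adjacency": {"b1": ["b2"], "b2": ["b1"]}},
}
region_adjacency = {"A": ["B"], "B": ["A"]}

# Level 1: intra-region encoding, then one summary message per region.
summaries = {
    name: region_summary(intra_region_encode(r["features"], r["adjacency"]))
    for name, r in regions.items()
}
# Level 2: inter-region exchange of the summary messages.
context = inter_region_encode(summaries, region_adjacency)
```

In this sketch each region sends a single number to its neighbors, which is the sense in which the exchanged messages stay lightweight. A trained model would replace the mean aggregation with learned encoders and vector-valued embeddings.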
Contributions:
We designed a 2-layer GNN architecture to model different levels of network abstractions in multi-region networks (intra-region and inter-region).
We exploited lightweight GNN message exchanges to facilitate collaborations among regions and adopted SL to predict cross-region traffic at border routers for local intra-region routing optimization.
The proposed TE solution boosted distributed TE’s performance by up to 28.9% with low computation cost in large networks (<1s execution time in the BRITE network with 204 nodes and 964 links).
Abstract:
Network operators usually adopt Traffic Engineering (TE) to configure the routing in their networks to achieve good load balancing performance and high resource utilization. While centralized TE can effectively improve network performance with a global view of the network, distributed TE has been considered as an alternative to manage large-scale networks that are usually partitioned into multiple regions. However, it is challenging for distributed TE to reach a global optimal performance since each region can make its local routing decisions only based on partially observed network states.
In this paper, we propose a novel distributed TE scheme called FedTe, which leverages supervised learning coupled with a collaborative approach to improve the overall load balancing performance for multi-region networks. FedTe learns from the global optimal routing strategy in a centralized offline manner and predicts the optimal distribution of cross-region traffic among different regions through distributed deployment in real time. The predicted cross-region traffic distribution is integrated with measured local traffic to construct each region’s optimal regional traffic matrix, which is used to perform intra-region TE optimization. FedTe can also handle dynamic traffic variation and link failures with a 2-layer hierarchical graph neural network architecture. To validate the effectiveness of the proposed scheme, we evaluate FedTe with two real-world network topologies and a large-scale synthetic topology. Extensive evaluation results show that FedTe can achieve near-optimal load balancing performance and outperform state-of-the-art distributed TE approaches by up to 28.9% on average.
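The step that combines predicted cross-region traffic with measured local traffic can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: traffic matrices are dictionaries keyed by hypothetical (source, destination) router pairs, the predicted cross-region demands are given as fixed numbers rather than produced by a trained model, routing is a fixed single path per demand, and load balancing is scored by maximum link utilization, a common TE metric that the text above does not name explicitly.

```python
def build_regional_tm(local_tm, predicted_cross_tm):
    """Add predicted cross-region demands to measured intra-region demands."""
    regional = dict(local_tm)
    for pair, demand in predicted_cross_tm.items():
        regional[pair] = regional.get(pair, 0.0) + demand
    return regional


def max_link_utilization(tm, routes, capacity):
    """Route each demand on its given path; return the highest load/capacity."""
    load = {link: 0.0 for link in capacity}
    for pair, demand in tm.items():
        for link in routes[pair]:
            load[link] += demand
    return max(load[link] / capacity[link] for link in capacity)


# Hypothetical region with routers r1, r2 and one border router b1.
local_tm = {("r1", "r2"): 4.0}            # measured intra-region demand
predicted_cross_tm = {                     # assumed output of the predictor
    ("r1", "b1"): 3.0,                     # traffic leaving the region via b1
    ("b1", "r2"): 2.0,                     # traffic entering the region via b1
}
routes = {                                 # one fixed path (list of links) per demand
    ("r1", "r2"): [("r1", "r2")],
    ("r1", "b1"): [("r1", "r2"), ("r2", "b1")],
    ("b1", "r2"): [("b1", "r2")],
}
capacity = {("r1", "r2"): 10.0, ("r2", "b1"): 10.0, ("b1", "r2"): 10.0}

regional_tm = build_regional_tm(local_tm, predicted_cross_tm)
mlu = max_link_utilization(regional_tm, routes, capacity)
```

Here the link from r1 to r2 carries both the local demand and the predicted outbound demand, so it determines the region's maximum utilization. An intra-region TE optimizer would then search over the routes to reduce that value instead of using fixed paths.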
Publications:
[ICNP ’21] Minghao Ye, Junjie Zhang, Zehua Guo, and H. Jonathan Chao, “Federated Traffic Engineering with Supervised Learning in Multi-region Networks,” The 29th IEEE International Conference on Network Protocols (ICNP), 2021. (One of the 11 pre-accepted papers without conditional acceptance. Acceptance rate: 24.7%, 38/154)