Showing papers by "Sugang Xu" published in 2020

Open Access

Journal Article

Joint Progressive Network and Datacenter Recovery After Large-Scale Disasters

Sifat Ferdousi¹, Massimo Tornatore¹, Ferhat Dikbiyik, Charles U. Martel¹, Sugang Xu², Yusuke Hirota², Yoshinari Awaji², Biswanath Mukherjee¹

University of California, Davis¹, National Institute of Information and Communications Technology²

27 Mar 2020-IEEE Transactions on Network and Service Management

TL;DR: This work solves the optimization problem of joint progressive recovery to find the optimal sequence of network element and DC repairs with the objective to maximize cumulative weighted content reachability in the network, and proposes a scalable heuristic for scheduling the sequential repair of network nodes/links and DCs.

Abstract: Large-scale disasters affecting both network and datacenter (DC) infrastructures can cause severe disruptions in cloud-based services. During post-disaster recovery, repairs are usually carried out in stages in a progressive manner due to limited repair resource availability. The order in which network elements and DCs are repaired can significantly impact users’ reachability to important contents/services. We investigate joint progressive network and DC recovery in which network recovery and DC recovery are conducted in a coordinated manner such that users have access to the maximum possible amount of contents/services at each repair stage. We first solve the optimization problem of joint progressive recovery to find the optimal sequence of network element and DC repairs with the objective to maximize cumulative weighted content reachability in the network. We then propose a scalable heuristic for scheduling the sequential repair of network nodes/links and DCs. Our model assumes that, at each repair stage, one network node with adjacent links and one DC can be fully repaired; however, full recovery may not be guaranteed due to limited resource availability. Hence, we also propose a “resource-aware” approach (with two resource-allocation strategies, namely “selective allocation” and “adaptive allocation”), which considers both full and partial recovery of elements based on available resources at each stage. We show that, compared to disjoint progressive recovery approach, in which network recovery and DC recovery plans are independent, our joint progressive recovery approach provides significantly higher per-stage content reachability in the network.
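
The staged repair idea in this abstract can be illustrated with a small sketch. The code below is not the paper's optimization model or its heuristic, which the abstract does not spell out; it is a plain greedy rule assumed here for illustration. At each stage it repairs the one node and one DC that together give the highest weighted content reachability. All names (`greedy_joint_schedule`, `dc_contents`, the toy topology) are hypothetical, and a link is assumed to work whenever both of its endpoints are up.

```python
from itertools import product


def reachable(adj, up_nodes, src):
    """Working nodes reachable from src when only up_nodes are usable."""
    if src not in up_nodes:
        return set()
    seen, stack = {src}, [src]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in up_nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def weighted_reachability(adj, up_nodes, up_dcs, dc_contents, weights, users):
    """Sum content weights over every (user, content) pair the user can reach."""
    total = 0.0
    for user in users:
        seen = reachable(adj, up_nodes, user)
        available = set()
        # A DC is identified by its host node; it serves content only if
        # the DC itself is repaired and its host node is reachable.
        for dc in up_dcs & seen:
            available |= dc_contents[dc]
        total += sum(weights[c] for c in available)
    return total


def greedy_joint_schedule(adj, damaged_nodes, damaged_dcs, dc_contents, weights, users):
    """Assumed greedy rule: per stage, repair the best (node, DC) pair."""
    up_nodes = set(adj) - set(damaged_nodes)
    up_dcs = set(dc_contents) - set(damaged_dcs)
    todo_nodes, todo_dcs = set(damaged_nodes), set(damaged_dcs)
    schedule = []
    while todo_nodes or todo_dcs:
        best = None
        # [None] stands in when one of the two repair queues is already empty.
        for n, d in product(sorted(todo_nodes) or [None], sorted(todo_dcs) or [None]):
            nodes = up_nodes | ({n} if n is not None else set())
            dcs = up_dcs | ({d} if d is not None else set())
            score = weighted_reachability(adj, nodes, dcs, dc_contents, weights, users)
            if best is None or score > best[0]:
                best = (score, n, d)
        score, n, d = best
        if n is not None:
            up_nodes.add(n)
            todo_nodes.discard(n)
        if d is not None:
            up_dcs.add(d)
            todo_dcs.discard(d)
        schedule.append((n, d, score))
    return schedule


# Toy instance: user node "u" with two damaged neighbours, each hosting a damaged DC.
adj = {"u": ["a", "b"], "a": ["u"], "b": ["u"]}
dc_contents = {"a": {"news"}, "b": {"maps"}}
weights = {"news": 1, "maps": 5}
schedule = greedy_joint_schedule(adj, ["a", "b"], ["a", "b"], dc_contents, weights, ["u"])
print(schedule)
```

In this toy case the greedy rule repairs node `b` together with its DC first, because the content hosted there carries the larger weight. Repairing a node and a DC that do not belong together would leave the content unreachable, which is the kind of mismatch that independent (disjoint) network and DC recovery plans can produce.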

18 citations

Journal Article

Emergency OPM Recreation and Telemetry for Disaster Recovery in Optical Networks

Sugang Xu¹, Yusuke Hirota¹, Masaki Shiraiwa¹, Massimo Tornatore², Sifat Ferdousi², Yoshinari Awaji¹, Naoya Wada¹, Biswanath Mukherjee²

National Institute of Information and Communications Technology¹, University of California, Davis²

01 May 2020-Journal of Lightwave Technology

TL;DR: This work proposes an approach for quick recreation of OPM and for achieving robust telemetry based on OpenConfig YANG that can tolerate low post-disaster bandwidth and can adapt the telemetry system following the changing conditions of the C/M-plane network.

Abstract: Optical performance monitoring (OPM) and the corresponding telemetry systems play an important role in modern optical transport networks based on software-defined networking (SDN). There have been extensive studies and standardization activities to build high-speed and high-accuracy OPM/telemetry systems that can ensure sufficient monitoring data for effective network control and management. However, current solutions for OPM/telemetry assume that control and management planes (C/M-plane) always provide sufficient bandwidth (BW) to deliver telemetry data. Unfortunately, in the event of several concurrent network failures (e.g., following a large-scale disaster), C/M-plane networks can become heavily degraded and/or unstable, and even experience isolation of some of their parts. Under such circumstances, the existing OPM systems would hardly function. To enhance resiliency and to ensure the quick recovery of OPM/telemetry in case of disaster, we propose an approach for quick recreation of OPM and for achieving robust telemetry based on OpenConfig YANG. Our proposal addresses three key problems: (1) how to quickly recreate the lost OPM capability, (2) how to address the mismatch between the high data rate of OPM and the low BW in the C/M-plane network, and (3) how to flexibly reconfigure the telemetry system to be adaptive to sudden BW changes in the C/M-plane network. We implement a testbed and experimentally demonstrate that our proposal can tolerate low post-disaster bandwidth and can adapt the telemetry system following the changing conditions of the C/M-plane network.
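
Problem (2) in this abstract, the mismatch between the OPM data rate and the C/M-plane bandwidth, can be made concrete with a little arithmetic. The sketch below is not the mechanism proposed in the paper and uses no OpenConfig API. It only assumes that telemetry can be thinned by averaging consecutive samples, and computes how strongly it must be thinned to fit a given bandwidth. The function names and the numbers are hypothetical.

```python
import math


def decimation_factor(sample_bits, samples_per_s, available_bps):
    """Smallest integer k such that sending 1 of every k samples fits the link."""
    if available_bps <= 0:
        raise ValueError("available bandwidth must be positive")
    required_bps = sample_bits * samples_per_s
    return max(1, math.ceil(required_bps / available_bps))


def downsample_mean(samples, factor):
    """Replace each run of `factor` samples by its mean (last run may be shorter)."""
    chunks = (samples[i:i + factor] for i in range(0, len(samples), factor))
    return [sum(chunk) / len(chunk) for chunk in chunks]


# Assumed figures: 64-bit samples at 1000 samples/s over a degraded 16 kbit/s channel.
k = decimation_factor(sample_bits=64, samples_per_s=1000, available_bps=16_000)
thinned = downsample_mean([1, 2, 3, 4, 5, 6], k)
print(k, thinned)
```

Recomputing the factor whenever the measured bandwidth changes gives a simple form of the adaptivity described in problem (3), although a real system would also have to account for protocol overhead and latency.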

15 citations

Journal Article

Survivable virtual network mapping with content connectivity against multiple link failures in optical metro networks

Giap Le¹, Sifat Ferdousi¹, Andrea Marotta², Sugang Xu³, Yusuke Hirota³, Yoshinari Awaji³, Massimo Tornatore¹, Biswanath Mukherjee¹

University of California, Davis¹, University of L'Aquila², National Institute of Information and Communications Technology³

01 Nov 2020-IEEE/OSA Journal of Optical Communications and Networking

TL;DR: This work derives necessary and sufficient conditions and develops what it believes to be a novel mathematical formulation to map a virtual network over a physical network such that content connectivity for the virtual network is ensured against multiple link failures in the physical network.

Abstract: Network connectivity, i.e., the reachability of any network node from all other nodes, is often considered as the default network survivability metric against failures. However, in the case of a large-scale disaster disconnecting multiple network components, network connectivity may not be achievable. On the other hand, with the shifting service paradigm towards the cloud in today’s networks, most services can still be provided as long as at least a content replica is available in all disconnected network partitions. As a result, the concept of content connectivity has been introduced as a new network survivability metric under a large-scale disaster. Content connectivity is defined as the reachability of content from every node in a network under a specific failure scenario. In this work, we investigate how to ensure content connectivity in optical metro networks. We derive necessary and sufficient conditions and develop what we believe to be a novel mathematical formulation to map a virtual network over a physical network such that content connectivity for the virtual network is ensured against multiple link failures in the physical network. In our numerical results, obtained under various network settings, we compare the performance of mapping with content connectivity and network connectivity and show that mapping with content connectivity can guarantee higher survivability, lower network bandwidth utilization, and significant improvement of service availability.
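
The difference between the two survivability metrics defined in this abstract can be checked directly on a small graph. The sketch below covers only the definitions, not the paper's virtual-network mapping formulation. After removing the failed links, network connectivity holds if a single component remains, while content connectivity holds if every component still contains at least one content replica. The function names and the four-node ring are hypothetical.

```python
def components(nodes, links):
    """Connected components of an undirected graph given as node and link lists."""
    adj = {n: set() for n in nodes}
    for u, v in links:
        adj[u].add(v)
        adj[v].add(u)
    seen, parts = set(), []
    for start in nodes:
        if start in seen:
            continue
        part, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in part:
                    part.add(v)
                    stack.append(v)
        seen |= part
        parts.append(part)
    return parts


def check_connectivity(nodes, links, failed_links, replica_nodes):
    """Return (network_connected, content_connected) under one failure scenario."""
    failed = {frozenset(link) for link in failed_links}
    surviving = [link for link in links if frozenset(link) not in failed]
    parts = components(nodes, surviving)
    replicas = set(replica_nodes)
    network_connected = len(parts) == 1
    # Every partition must still hold at least one replica of the content.
    content_connected = all(bool(part & replicas) for part in parts)
    return network_connected, content_connected


# Four-node ring with replicas at nodes 1 and 3.
nodes = [1, 2, 3, 4]
links = [(1, 2), (2, 3), (3, 4), (4, 1)]
replicas = {1, 3}
single_cut = check_connectivity(nodes, links, [(1, 2)], replicas)
split_ok = check_connectivity(nodes, links, [(1, 2), (3, 4)], replicas)
isolated = check_connectivity(nodes, links, [(1, 2), (2, 3)], replicas)
print(single_cut, split_ok, isolated)
```

The second scenario splits the ring into two halves that each keep a replica, so content connectivity survives although network connectivity is lost. The third scenario isolates node 2, which holds no replica, so both metrics fail.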

13 citations

Proceedings Article

Transfer Learning across Different Lightpaths for Failure-Cause Identification in Optical Networks

Francesco Musumeci¹, Virajit Garbhapu Venkata¹, Yusuke Hirota², Yoshinari Awaji², Sugang Xu², Masaki Shiraiwa², Biswanath Mukherjee³, Massimo Tornatore³

Polytechnic University of Milan¹, National Institute of Information and Communications Technology², University of California, Davis³

01 Dec 2020

TL;DR: This work performs transfer learning across different lightpaths for failure-cause identification using OSNR traces collected over NICT's Sendai optical-network testbed, and the results suggest that limited additional data on the target lightpath is enough to achieve satisfactory accuracy.

Abstract: We perform transfer learning across different lightpaths for failure-cause identification using OSNR traces collected over NICT's Sendai optical-network testbed. Results suggest that limited additional data on the target lightpath is enough to achieve satisfactory accuracy.

13 citations

Proceedings Article

Experimental Demonstration of Optical Multicast Packet Transmissions in Optical Packet/Circuit Integrated Networks

Yusuke Hirota, Sugang Xu, Masaki Shiraiwa, Yoshinari Awaji, Massimo Tornatore¹, Biswanath Mukherjee¹, Hideaki Furukawa, Naoya Wada

University of California, Davis¹

08 Mar 2020

TL;DR: An SDN-based control for optical-multicast packet transmission is developed, and multicast functionality is demonstrated by validating it using an application-layer network service for efficient content duplication in an Optical Packet/Circuit Integrated (OPCI) network.

Abstract: We develop an SDN-based control for optical-multicast packet transmission and experimentally demonstrate multicast functionality by validating it using an application-layer network service for efficient content duplication in an Optical Packet/Circuit Integrated (OPCI) network.

3 citations

Proceedings Article

Toward Disaster-Resilient Optical Networks with Open and Disaggregated Subsystems [Invited]

Sugang Xu, Noboru Yoshikane, Masaki Shiraiwa, Yusuke Hirota, Takehiro Tsuritani, Sifat Ferdousi¹, Yoshinari Awaji, Naoya Wada, Biswanath Mukherjee¹

University of California, Davis¹

25 Mar 2020

TL;DR: Various approaches for rapid post-disaster recovery in optical networks (including legacy optical networks) employing disaggregated subsystems, namely, the emergency first-aid unit (FAU) with open application programming interfaces and protocols are discussed.

Abstract: Novel open and disaggregated optical-networking technologies promise to enhance multi-vendor interoperability thanks to their open interfaces in both data-plane and control/management-plane (C/M-plane). From the viewpoint of disaster resilience in optical networks, such interoperability will significantly improve the flexibility in product selection with regard to replacing damaged subsystems with products of different vendors. In this paper, we discuss various approaches for rapid post-disaster recovery in optical networks (including legacy optical networks) employing disaggregated subsystems, namely, the emergency first-aid unit (FAU) with open application programming interfaces and protocols. We address the following problems (and introduce the solutions that we are currently investigating): (1) how to take advantage of the new disaggregated resources and surviving legacy optical resources to achieve early recovery, (2) how to achieve integrated control of FAUs and non-FAU legacy ROADMs, and (3) how to quickly recreate the lost optical performance monitoring (OPM) capability with FAUs and perform a robust telemetry under the restricted bandwidth in the degraded C/M-plane networks.

3 citations

Proceedings Article

Automatic Resource Mapping using Functional Block Based Disaggregation Model for ROADM Networks

Kiyo Ishii¹, Sugang Xu², Noboru Yoshikane, Atsuko Takefusa³, Shigeyuki Yanagimachi⁴, Takeshi Hoshida⁵, Kohei Shiomoto⁶, Tomohiro Kudoh⁷, Takehiro Tsuritani, Yoshinari Awaji², Shu Namiki¹

National Institute of Advanced Industrial Science and Technology¹, National Institute of Information and Communications Technology², National Institute of Informatics³, NEC⁴, Fujitsu⁵, Tokyo City University⁶, University of Tokyo⁷

08 Mar 2020

TL;DR: The functional-block-based model precisely describing the physical layer structures can act as a hardware abstraction layer for more abstracted models like OpenROADM.

Abstract: Automated mapping of real hardware composition onto a ROADM-based model is demonstrated. The functional-block-based model precisely describing the physical layer structures can act as a hardware abstraction layer for more abstracted models like OpenROADM.

1 citation

Proceedings Article

First Demonstration of Automated Updates of Disaggregate Blades in Multi-Domain/Layer Optical Path Network

08 Mar 2020

TL;DR: Updating an OpenROADM node and subsequent re-routing were automated using a mathematical component-based model, triggered by the addition of node components.

Abstract: Updating an OpenROADM node and subsequent re-routing were automated using a mathematical component-based model, triggered by the addition of node components. This process required only five minutes on an orchestrated testbed using SINET5 and a field optical network.

1 citation

Book Chapter

A Novel Carrier-Cooperation Scheme with an Incentive for Offering Emergency Lightpath Support in Disaster Recovery

Sugang Xu, Noboru Yoshikane, Naoki Miyata¹, Masaki Shiraiwa, Takehiro Tsuritani, Xiaocheng Zhang¹, Yoshinari Awaji, Naoya Wada

NTT Communications Corp¹

16 Feb 2020

TL;DR: The evaluation results reveal that the proposal can significantly reduce the burden on recovery and the corresponding cost for carriers, resulting in fast and efficient disaster recovery.

Abstract: To achieve the fast recovery of optical transport networks following a disaster, we investigate a novel scheme to enable cooperation between carriers. Carriers can take advantage of their surviving or recovered optical resources to aid one another with emergency lightpath support to reduce efficiently the burden of recovery, which is heavy immediately after disasters. These lightpaths can be employed exclusively by the counterpart carriers to satisfy their highest priority traffic demands, such as safety confirmation and victim relief. In addition, we introduce an incentive to carriers to prompt cooperation. The carrier cooperation-planning problem is decomposed into eight tasks, and distributed to individual carriers and a third-party organization. During cooperation, the carriers’ confidential information can be strictly protected by employing a carrier optical network abstraction mechanism. The evaluation results reveal that our proposal can significantly reduce the burden on recovery and the corresponding cost for carriers, resulting in fast and efficient disaster recovery.
