Data center interconnect (DCI) serves as the foundation of enterprise digital transformation. DCI networks carry a growing volume of services and require ever higher reliability. In addition, as data centers (DCs) rapidly expand, it is increasingly difficult to manage the large number of optical fibers and services. When a fault occurs, upper-layer services, such as those on the DCN switches, storage servers, and IT applications, are interrupted, resulting in a deluge of complaints. DCI O&M departments can only respond reactively. A fiber or board fault triggers a domino effect of alarms, which makes it difficult for O&M personnel without transmission backgrounds to quickly locate the cause, particularly when only a few O&M personnel maintain hundreds (sometimes thousands) of devices. The mismatch between limited network O&M staffing and rapidly growing network requirements is becoming increasingly prominent.
In such a scenario, how can we build an intelligent network system to better manage services, and optical fibers in particular? The security, reliability, and maintainability of optical fibers need to be ensured, so that faulty fibers can be located quickly and accurately using software alone, without other tools or site visits. Better still, early warnings would allow faults to be prevented before they occur.
Accurately collecting a large number of optical-layer network status parameters is the basis for implementing intelligent network management. A coherent transceiver in a typical transmission system includes three main parts: an optical digital signal processor (oDSP), a transmitter, and a receiver. The oDSP is the core component of the entire system. With the optical-layer neuron module integrated, the oDSP can provide distributed, full coverage of the optical-layer network and accurately detect all network status parameters in real time. The module is built in and requires no extra devices. The control module analyzes and calculates the parameters to provide regional or global fault warnings, service scheduling, configuration, and optimization.
The optical-layer neuron integrated in Huawei's latest generation of oDSP detects the operational status parameters of all wavelength channels at the L0 optical layer. At the transmit end, the oDSP loads an optical-layer label onto every wavelength channel. When the wavelengths pass through any node with a detection function, the labels are extracted and the optical-layer status information is digitized, enabling visualized management and control of the network.
The optical-layer neurons detect more than just the transmission state parameters of optical signals at the L0 layer. The collected status information, which is the basis of network optimization, also includes:
• Optical signal-to-noise ratio (OSNR)
• Polarization state
• Polarization change
• Non-linear effects
• Link margin
Huawei's optical network neurons can collect more than 400 optical network physical parameters at high frequency, at intervals on the order of seconds. The collected information consists of network maintenance parameters only; it contains no customer service data and does not touch on information security. The data is then used to describe the physical optical network more precisely and map it to a digital optical network. With technologies such as big data mining and intelligent algorithms, the data can also be used for training and modeling for upper-layer applications, improving intelligence and automation levels and reducing operating expense (OPEX).
Growing AI algorithm complexity and the processing of massive amounts of data demand high computing power. To meet this demand, each device has a built-in intelligent chip for processing and analyzing optical network parameters.
These chips are fast and energy-efficient. Working with the hardware acceleration card, the chip provides computing power 10 times higher than that of the original CPU, and can be used for big data collection, storage, analysis, training, and reporting. The devices are deployed at each node of the network to process delay-sensitive data and applications.
The cloud deployment also integrates an intelligent chip. This chip delivers the highest computing power (256T) in the industry, far exceeding that of the next in line (125T). These chips enable the deployment of more complex intelligent algorithms and applications, and centrally process data that is not delay-sensitive.
Algorithms are key to enabling intelligence. AI algorithms can extract patterns from large amounts of data, and create models from existing fault patterns or resource features based on expert experience to quickly resolve known issues. In addition, AI algorithms based on machine learning can create a knowledge map and predict trends, enabling advance fault prevention and performance optimization.
In short, intelligent network management and control rests on three elements: the optical-layer neurons detect all status parameters of the optical-layer network, the hardware provides enhanced computing power, and the control layer applies optimized algorithms. Together, these technologies enable a wide range of applications.
• Optical network health assurance: From reactive O&M to predictive O&M, reducing OPEX and improving customer experience.
The O&M of optical services has always been reactive. Maintenance is performed only after a fault occurs or when a user complains. The tell-tale signs of sub-optimal optical services cannot be identified until they deteriorate into faults, interrupting services and increasing maintenance costs. According to network fault data analysis, fiber issues account for 68% of total network faults. Of these fiber issues, incipient OTS/OCh faults account for 56% (68% × 56% ≈ 38% of all network faults), and the four most common types of faults are bending, swinging, loose contact, and fiber core faults (together accounting for 90%).
Huawei's optical network health assurance package contains use cases in optical network health visualization, optical network health prediction, and optical network intelligent commissioning to implement automatic O&M of OTS/OCh. The optical network health prediction function uses machine learning and prediction algorithms to analyze the health status of each optical fiber and channel. Based on optical performance fluctuations, it predicts fault risks and locations and provides suggestions, enabling proactive O&M and reducing service interruptions.
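The idea of predicting a fault from performance fluctuations can be illustrated with a minimal sketch. It assumes the simplest possible model: a least-squares straight line fitted to recent received-power samples and extrapolated to an alarm threshold. The function name, units, and threshold are hypothetical, and the source does not state which algorithms the product actually uses.

```python
from typing import Optional, Sequence


def hours_until_threshold(
    hours: Sequence[float],
    power_dbm: Sequence[float],
    threshold_dbm: float,
) -> Optional[float]:
    """Fit a least-squares line to (hours, power_dbm) and return how many
    hours after the last sample the line crosses threshold_dbm.

    Returns None when the trend is flat or improving (no crossing ahead).
    All names and units are illustrative assumptions, not a product API.
    """
    n = len(hours)
    if n < 2 or n != len(power_dbm):
        raise ValueError("need at least two paired samples")

    mean_t = sum(hours) / n
    mean_p = sum(power_dbm) / n
    var_t = sum((t - mean_t) ** 2 for t in hours)
    if var_t == 0:
        raise ValueError("timestamps must not all be identical")

    # Ordinary least-squares slope and intercept.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(hours, power_dbm))
    slope = cov / var_t
    intercept = mean_p - slope * mean_t

    if slope >= 0:
        return None  # power is stable or rising: no predicted degradation

    crossing_time = (threshold_dbm - intercept) / slope
    # If the fitted line is already below the threshold, report zero hours.
    return max(0.0, crossing_time - hours[-1])
```

For example, a channel losing 0.5 dB per hour from -10 dBm, last sampled at hour 4, would be flagged as reaching a -14 dBm threshold four hours later. A real system would need to handle noise, step changes, and seasonal patterns, which a single straight line cannot capture.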
• Fault root cause analysis, and one alarm per fault: enabling quick and accurate troubleshooting.
When a fault occurs on a DC network, corresponding alarms are generated for services. However, the same fault affects the OPU at the service layer, the ODU at the electrical layer, the OCh at the optical channel layer, and the OTS on the optical line side. As a result, a domino effect of alarms occurs, making it difficult to locate the actual cause. Intelligent computing and analysis can determine the root alarm and suppress the consequential alarms, so that only one alarm is reported for each fault. This significantly simplifies O&M and enables quick and accurate fault location.
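A minimal sketch of the "one alarm per fault" idea follows. It assumes that every alarm carries a key identifying the fiber span it rides on (the hypothetical `span_id` field) and that the alarm raised at the lowest layer is the root. Real correlation engines rely on full topology and timing information; the class and function names here are illustrative only.

```python
from dataclasses import dataclass
from typing import Dict, List

# Layers from lowest (closest to the fiber) to highest, as named in the text.
LAYER_ORDER = ["OTS", "OCh", "ODU", "OPU"]


@dataclass(frozen=True)
class Alarm:
    alarm_id: str
    layer: str    # must be one of LAYER_ORDER
    span_id: str  # hypothetical key tying the alarm to a fiber span


def root_alarms(alarms: List[Alarm]) -> List[Alarm]:
    """Keep one alarm per span: the one raised at the lowest layer.

    Higher-layer alarms on the same span are treated as consequences of
    the root alarm and are suppressed. Unknown layer names raise KeyError.
    """
    rank = {layer: i for i, layer in enumerate(LAYER_ORDER)}
    best: Dict[str, Alarm] = {}
    for alarm in alarms:
        current = best.get(alarm.span_id)
        if current is None or rank[alarm.layer] < rank[current.layer]:
            best[alarm.span_id] = alarm
    return list(best.values())
```

Given OPU, ODU, OCh, and OTS alarms on the same span, only the OTS alarm survives; alarms on other spans are evaluated independently.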
• Intelligent fiber management tools, and accurate online fiber quality monitoring: enabling rapid fiber repair.
When a fault occurs on a long-distance optical fiber between DCs, the traditional solution can only notify customers of the fault via alarms on the management system. Skilled engineers then need to be dispatched to the site to check the fiber quality with an OTDR meter, which is expensive and time-consuming. With the built-in eOTDR function, the intelligent fiber management system can accurately locate fiber faults with one click in software. O&M personnel can quickly repair the faults, shortening service recovery time and reducing the OPEX attributed to troubleshooting.
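The general OTDR principle behind fault location can be sketched as follows: a pulse is sent into the fiber, and the distance to a break or reflection is derived from the time the backscattered light takes to return. The source does not describe the internals of the eOTDR function, so the sketch below only shows the standard time-of-flight calculation, with a typical group index for single-mode fiber as an assumed default.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0  # in vacuum, exact by definition


def fault_distance_km(round_trip_time_us: float, group_index: float = 1.468) -> float:
    """Distance to a reflection or break, from the pulse's round-trip time.

    group_index defaults to a typical value for standard single-mode fiber
    (an assumption; use the value from the fiber's data sheet in practice).
    """
    if round_trip_time_us < 0:
        raise ValueError("round-trip time cannot be negative")
    if group_index < 1:
        raise ValueError("group index must be at least 1")
    # The pulse travels to the fault and back, so halve the measured time.
    one_way_time_s = round_trip_time_us * 1e-6 / 2
    speed_in_fiber = SPEED_OF_LIGHT_M_PER_S / group_index
    return speed_in_fiber * one_way_time_s / 1000.0
```

With the default group index, a round-trip time of 490 microseconds corresponds to a fault about 50 km from the measuring end. The accuracy of the result depends directly on how well the group index matches the installed fiber.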
Network solutions are becoming automated. Building on in-depth research into computing power, labeled data, and algorithms, Huawei's intelligent OptiX network solution is designed to enable end-to-end lifecycle automation, assisting the digital transformation of enterprise customers while tackling the challenges of increasing DC traffic and growing network O&M difficulty.