CSPs seeking to provide a new generation of on-demand, dynamic services to consumers, to enterprises and through partners need to rethink the design of their networks for flexibility and agility, and need to be able to deal with complexity. “Network as a service” is a big opportunity if we can quickly and cheaply assemble new end-to-end services out of pre-existing components.
How do we deliver such composite, dynamically changing services when they are built on top of complex and evolving network technologies? It is not possible in the current model, which mixes manual and automated operations. We need to make sure that our networks and their operations are autonomous or self-managing, minimizing manual operations, so that we can manage the complexity at a reasonable cost, as proven in other industries and in hyperscaler environments.
Transport networks form the backbone of many of these high-value enterprise services, cloud interconnect and 5G backhaul. They are also becoming more sophisticated, with optical networks moving to 400G, 800G and beyond using coherent optics/routing and flexible photonics, and more programmable, with features like segment routing and more pervasive optical domain switching.
Coupled with full real-time observability of all the layers (e.g., span-by-span performance of IP layers), this enables sophisticated IP/optical multi-layer path computation and simulation to find optimal and diverse paths against multiple policy constraints such as bandwidth, latency, availability and utilization. AI/ML techniques can enhance traditional methods of intelligent traffic management.
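To make the idea of constraint-based path computation concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any vendor's path computation engine: the topology, the attribute names (`latency_ms`, `free_gbps`) and the function names are all hypothetical. It finds the lowest-latency path that satisfies a bandwidth constraint, and then a link-diverse backup by simply removing the primary path's links and recomputing, which is a simple heuristic rather than an optimal disjoint-path algorithm.

```python
import heapq

# Hypothetical topology: each link has a latency and spare bandwidth.
LINKS = {
    ("A", "B"): {"latency_ms": 2.0, "free_gbps": 400},
    ("B", "D"): {"latency_ms": 2.0, "free_gbps": 100},
    ("A", "C"): {"latency_ms": 3.0, "free_gbps": 400},
    ("C", "D"): {"latency_ms": 3.0, "free_gbps": 400},
}


def path_links(path):
    """Return the undirected links used by a path, for exclusion later."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def shortest_path(links, src, dst, min_gbps, excluded=frozenset()):
    """Dijkstra on latency, using only links that meet the bandwidth constraint."""
    adj = {}
    for (a, b), attrs in links.items():
        if attrs["free_gbps"] < min_gbps or frozenset((a, b)) in excluded:
            continue  # link violates a policy constraint or is excluded
        adj.setdefault(a, []).append((b, attrs["latency_ms"]))
        adj.setdefault(b, []).append((a, attrs["latency_ms"]))

    queue = [(0.0, src, [src])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, latency in adj.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (cost + latency, nxt, path + [nxt]))
    return None  # no path satisfies the constraints


primary = shortest_path(LINKS, "A", "D", min_gbps=50)
backup = shortest_path(LINKS, "A", "D", min_gbps=50,
                       excluded=path_links(primary[1]))
print(primary, backup)
```

A production engine would work across both IP and optical layers and weigh availability and utilization as well; the sketch only shows how policy constraints prune the candidate links before a path is chosen.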
To enable autonomy, CSPs need to change the architecture of their networks and their operations models fundamentally, thinking across organizational boundaries and across domains, to seize these new service opportunities. The figure below shows how all the drivers and enablers are lined up to enable this transformation. But CSPs face challenges from the distributed and heterogeneous nature of the networks that they have built over the generations. Domain-by-domain automation is a good starting point: CSPs can identify a high-value domain as a key building block for this transformation.
In our recent analysis for our Network Automation Software Market Share (2023) report, we found that CSPs are finally prioritising and increasing their spending on automation, revealing a fundamental change taking place in where telcos direct their investment; network automation is gaining over legacy system maintenance.
Autonomy starts with intent, meaning that the high-level requirement and goal for the service is provided to the domain as intent, leaving it to the domain to set it up based on its own intent-driven algorithms and to self-manage the service in terms of healing and scaling as well, in a continuous closed loop. This gives the domain the flexibility to manage its resources on its own, in an optimal manner without needing any over-provisioning, and provide the best customer experience.
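The closed loop described above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the `Intent` and `Observed` fields, the action names and the simulated effect of each action are all hypothetical, and a real domain would call its controller rather than mutate an in-memory object. The point is the shape of the loop: the intent states the goal, and the domain repeatedly compares observed state with that goal and acts until the two agree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """What the service needs, not how to build it (hypothetical fields)."""
    service: str
    min_gbps: float
    max_latency_ms: float


@dataclass
class Observed:
    """Simulated measurements for the service."""
    gbps: float
    latency_ms: float


def reconcile(intent, observed):
    """Compare observed state with the intent and list corrective actions."""
    actions = []
    if observed.gbps < intent.min_gbps:
        actions.append("scale_bandwidth")
    if observed.latency_ms > intent.max_latency_ms:
        actions.append("reroute")
    return actions


def apply_action(action, observed):
    """Stand-in for a controller call; here it only changes simulated state."""
    if action == "scale_bandwidth":
        observed.gbps *= 2
    elif action == "reroute":
        observed.latency_ms -= 5


def closed_loop(intent, observed, max_cycles=10):
    """Run observe, decide, act until the intent is met; return cycles used."""
    for cycle in range(max_cycles):
        actions = reconcile(intent, observed)
        if not actions:
            return cycle
        for action in actions:
            apply_action(action, observed)
    return max_cycles


intent = Intent(service="enterprise-vpn", min_gbps=100, max_latency_ms=10)
state = Observed(gbps=40.0, latency_ms=18.0)
print(closed_loop(intent, state), state)
```

Because the caller supplies only the intent, the domain remains free to choose how it scales or heals the service, which is the flexibility the paragraph above describes.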
Intent-driven algorithms must have access to the dynamic real-time inventory. Inventory, state and traffic data need to be maintained in the domain, or close to the domain, so that the intent processing algorithms have access to the best data for fast decision-making. SLAs and observability should be possible in real time: understanding span-by-span performance, with a view of the ports and the traffic carried, and of the best and alternative routes. Agility requires that slices, bandwidth on demand, service turn-up on demand and constant changes of topology to support changing enterprise needs should all be possible. For this to happen, the intent algorithms need real-time data to understand performance with respect to SLAs.
The decision making is done by the intent algorithms, which may be informed by learning algorithms working on network data of all kinds. The intent algorithms react to triggers such as congestion or a capacity threshold crossing by making an immediate change on the network. They are coded based on the network knowledge and expertise of the domain experts. They can also be created from knowledge learned from network data using AI/ML/GenAI methods, which can map and understand complex patterns in great volumes of network data on an ongoing basis. This can eventually lead to autonomous domains where the control loop logic and process itself can be modified based on recommendations from these slow learning loops, leading to truly adaptive domains.
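As a small illustration of a threshold-crossing trigger, the Python sketch below scans a series of link utilisation samples and reports only the moments when utilisation rises above a threshold. The sample values, the 0.8 threshold and the function name are assumptions made for the example. Firing on the crossing rather than on every sample above the threshold is one simple way to avoid issuing the same corrective action repeatedly.

```python
def threshold_crossings(samples, threshold):
    """Return the indices where utilisation rises above the threshold."""
    events = []
    above = False  # whether the previous sample was above the threshold
    for index, utilisation in enumerate(samples):
        if utilisation > threshold and not above:
            events.append(index)  # upward crossing: raise a trigger here
        above = utilisation > threshold
    return events


# Hypothetical utilisation samples for one link (fraction of capacity).
samples = [0.5, 0.7, 0.85, 0.9, 0.6, 0.95]
print(threshold_crossings(samples, threshold=0.8))
```

In the terms used above, each reported index is a trigger for the fast control loop, while the slower learning loops could adjust the threshold itself over time.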
In our whitepaper Autonomous Networks – Thinking differently, Appledore outlines how the architecture needs to evolve.
At the Huawei Analyst Summit in Shenzhen in April 2024, Appledore spoke with Stephen Shao, Vice President of General Development Department, Huawei. As part of its ADN strategy, Huawei explained its iMaster NCE “One Map, One Master” approach, based on FBB ADN solutions for its products. Huawei referred to its map application One Map as the digital base and One Master based on a Large Language Model (LLM) as the decision-making master.
Huawei iMaster NCE ‘One Map and One Master’
Huawei first built an advanced digital base, One Map, in the iMaster NCE product, and then added One Master, an LLM for the telecom domain. An AI agent combines the LLM and the digital base, using the LLM “chain of thought” to plan the optimal solution, first for managing the network and then for optimization and closed-loop operation, based on the data and tools of the digital base. Based on One Master, China Mobile has implemented a mobile application that uses the intent-based APIs from One Map and other capabilities of iMaster NCE to visualize and manage the entire network troubleshooting process.
Huawei’s One Map refers to the Network Digital Map of iMaster NCE, which has full-stack visibility of all the essential data from the network physical level to the application level, with network service status awareness built on real-time data collection. Technologies such as In-situ Flow Information Telemetry are used to implement precise monitoring and diagnosis of the traffic, achieving full-time and full-path monitoring and root cause analysis. It also allows for configuration simulation using live network data, allowing for error-free configurations.
Huawei’s One Master is an application built on the FBB Telecom Foundation Model, which Huawei has created based on its own Pangu large language model, using a corpus of data accumulated over its many years of experience: product documents, customer cases, expert experience, solution designs, etc.
According to information provided by Huawei, in 2023 China Mobile Guangdong and Huawei jointly implemented the Telecom Foundation Model Application – Huawei NCE One Master, which includes comprehensive telecom knowledge, and service intent understanding and fault self-diagnosis capabilities, and this has helped China Mobile Guangdong keep up its network quality and customer satisfaction.
China Mobile Guangdong has 140 million subscribers and about 500K 5G base stations. With such a large scale, it has a strong demand for network autonomy and has gone through a digital transformation to achieve this.
The solution has been fully implemented in 13 cities across Guangdong province and has achieved good results: in the service recovery scenario, field engineers are now able to use mobile apps to obtain network resources and fault information in real-time. The mobile application uses the intent-based APIs from One Map. In this way the entire troubleshooting process can be visualized and managed, significantly reducing the site visit cost and the average fault response time by 83%. MTTR is reduced from hours to minutes using intelligent conversational interaction based on the AI agent, which reduces the complexity of network configuration and troubleshooting.
Huawei iMaster NCE with One Map and One Master is building toward a vision of autonomy, with a good foundation consistent with Appledore best practices.
Picture credit: Photo by Cris Ovalle on Unsplash