New businesses, new modes of operation, and new technologies are increasingly disrupting traditional industries. Most global enterprises have concluded that digital transformation is imperative. Professional operation and maintenance (O&M) services play a key role when enterprises go digital, which is why the O&M management of information technology (IT) infrastructure is changing.
Data center O&M service departments no longer focus solely on infrastructure; they also focus on the platform and upper-layer applications. As enterprise IT architecture evolves towards cloud computing, the flaws of traditional passive O&M, which is manual in nature, are exposed. Enterprises are attracted to automatic O&M because it integrates services and products. In the future, the combination of big data and artificial intelligence (AI) technologies is expected to further reduce enterprises' dependence on O&M personnel through machine analysis, judgment, and decision-making, thereby promoting the development of an automatic and intelligent O&M process.
O&M Problems in Digital Transformation
Problem 1: The limited use of automatic O&M processes extends the time required to bring resources online. Because digital process management measures are lacking, service provisioning activities such as VPN applications and network provisioning are completed manually. Resource application and provisioning require offline communication among personnel, which makes delivery time-consuming and labor-intensive. Without standardized processes, process management and control are inadequate, and services respond slowly. In addition, the lack of automatic inspection tools means that service logs must be filtered manually. As a result, most service log analysis is imprecise and offers little insight into service quality.
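As an illustration of what replacing manual log filtering could look like, the following is a minimal sketch rather than a description of any particular product. The log line format, the service names, and the function name summarize_errors are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumed (hypothetical) log format: "<timestamp> <LEVEL> <service>: <message>"
LOG_LINE = re.compile(r"^(\S+)\s+(DEBUG|INFO|WARN|ERROR)\s+([\w-]+):\s+(.*)$")


def summarize_errors(lines):
    """Count WARN and ERROR entries per (service, level).

    Lines that do not match the assumed format are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue
        _, level, service, _ = match.groups()
        if level in ("WARN", "ERROR"):
            counts[(service, level)] += 1
    return counts


# Illustrative sample data, not taken from a real system.
sample_log = [
    "2018-10-10T08:00:01 INFO vpn-gateway: tunnel established",
    "2018-10-10T08:00:05 ERROR vpn-gateway: authentication failed",
    "2018-10-10T08:00:09 WARN net-provision: retrying port allocation",
    "2018-10-10T08:00:12 ERROR vpn-gateway: authentication failed",
    "malformed line",
]

summary = summarize_errors(sample_log)
for (service, level), count in sorted(summary.items()):
    print(f"{service} {level}: {count}")
```

A per-service summary of this kind is only a first step, but it shows how a scheduled script can surface the noisiest services without anyone reading raw logs.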
Problem 2: The O&M modes are too scattered for related owners to coordinate. Each department uses O&M tools from various manufacturers, which often follow different compliance standards. Because these tools cannot be effectively connected with each other, the O&M capabilities of each department cannot be shared. The large number of device types leads to siloed O&M models that are difficult to monitor in a unified manner. Resource usage lacks transparency because resource management and O&M lack unified configuration management, and resources are wasted as a result. Due to improper life-cycle management, assets are added blindly, and zombie hosts consume resources without anyone noticing. Cloud data center services often integrate internal services provided by multiple parties. If different service providers use varying technology stacks and protocols and offer separate services, effective overall coordination is difficult to achieve.
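To make the zombie host problem concrete, the sketch below flags hosts whose recorded usage never rises above small thresholds. It is only an illustration: the UsageSample record, the threshold values, and the host names are hypothetical, and a real inventory system would draw on unified configuration data rather than a hand-built list.

```python
from dataclasses import dataclass


@dataclass
class UsageSample:
    """One hypothetical monitoring sample for a host."""
    host: str
    cpu_percent: float
    network_kbps: float


def find_zombie_hosts(samples, cpu_limit=2.0, net_limit=5.0):
    """Return, sorted by name, hosts whose every sample is below both limits."""
    seen = set()
    active = set()
    for sample in samples:
        seen.add(sample.host)
        # A single sample at or above either limit marks the host as active.
        if sample.cpu_percent >= cpu_limit or sample.network_kbps >= net_limit:
            active.add(sample.host)
    return sorted(seen - active)


# Illustrative data: host-b stays idle in every sample.
samples = [
    UsageSample("host-a", 35.0, 820.0),
    UsageSample("host-b", 0.4, 1.2),
    UsageSample("host-a", 0.5, 2.0),
    UsageSample("host-b", 0.9, 0.8),
    UsageSample("host-c", 1.0, 40.0),
]

print(find_zombie_hosts(samples))
```

Requiring every sample to be idle is a deliberately conservative choice, so that a host with occasional batch work is not reclaimed by mistake.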
Problem 3: The O&M team lacks capabilities and is overly reliant on core personnel. The O&M team is unable to provide consistently high-quality 24/7 services. When core personnel are absent, the service quality may deteriorate. Team members do not readily summarize their personal experiences to develop organizational capabilities. In addition, there is no repository platform where team members can create automation scripts, document a knowledge base, and share organizational experience.
Problem 4: There is no unified platform for automatic network inspection, traffic monitoring, and anticipation of network errors. Traditional O&M tools and methods generate alarms only when a fault occurs, so O&M work begins after the fault and amounts to repairing damage that has already been done. In addition, some O&M personnel are not even aware of a fault until they receive complaints about a service interruption.
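One simple way to move from purely reactive alarms toward earlier warning is to flag traffic values that deviate sharply from recent history. The sketch below uses a rolling z-score. This technique is chosen here for illustration and is not described in the article, and the window size, threshold, and sample values are assumptions.

```python
from statistics import mean, pstdev


def flag_anomalies(values, window=5, threshold=3.0):
    """Return indices of values that deviate from the preceding window.

    A value is flagged when it lies more than `threshold` population
    standard deviations from the mean of the previous `window` values.
    If the window is perfectly flat, any different value is flagged.
    """
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            if values[i] != mu:
                flagged.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Illustrative traffic readings with one spike at index 6.
traffic = [100, 102, 98, 101, 99, 100, 180, 101, 100]
print(flag_anomalies(traffic))
```

A check like this does not predict faults on its own, but running it continuously on monitored metrics lets personnel see unusual behavior before users report an interruption.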
O&M Transformation Establishes the Foundation for a Successful Digital Transformation
Digital transformation gives rise to many O&M problems, so there is an urgent need for O&M transformation. Enterprises are using rapidly growing ICT resources and fast-developing technologies in increasingly extensive business scenarios, and they need to make large investments in human and material resources. A top priority is employing "generalists", who have the wide range of ICT knowledge needed to maintain complex O&M systems. However, an individual's capacity is limited. Even outstanding O&M generalists cannot monitor a large number of services 24/7, nor are they able to quickly deduce faults from the wide range of possible alarms. For enterprises, the cost of recruiting O&M generalists is high. In addition, because O&M is managed by people, a significant amount of interpersonal communication is inevitable, which means that many business departments are also pulled into the O&M process.
O&M transformation trend 1: AI technologies are spurring the evolution from traditional O&M to artificial intelligence for IT operations (AIOps). AIOps is based on highly complete O&M automation technology. Through machine learning, the system continuously extracts and summarizes rules from O&M big data such as logs, monitoring information, and application information, and then performs intelligent analysis and makes decisions to achieve the overall objectives. The self-analysis, self-judgment, and self-determination of machines will gradually reduce the risks caused by overreliance on O&M personnel. AIOps is expected to become a new growth point in the O&M field.
O&M transformation trend 2: Application O&M has become the focus for cloud users. O&M departments of many enterprises primarily perform basic O&M (for enterprise IT infrastructure) and application O&M (for specific enterprise services). Some large-scale O&M departments may also establish O&M development teams to develop O&M tools and platforms.
When customers decide to migrate to the cloud, especially an IaaS public cloud, they are handing over basic O&M and the related tool platform development work to the cloud provider. When O&M departments put application O&M at the heart of their work, they are achieving the intended design objective of cloud computing, which is to let users focus on their service development. With this approach, customers can make the stable running of their main services their top concern.
O&M transformation trend 3: Flexibility and self-service have become basic requirements for infrastructure transformation. Traditional infrastructure cannot be used flexibly. Therefore, to streamline resource management and planning, many O&M teams set rules and procedures for the use of the infrastructure. However, when applied to cloud infrastructure management, these rules and procedures weaken the infrastructure's flexibility.
Enhancing infrastructure flexibility and the degree of self-service with measures such as automatic service expansion and reduction can greatly reduce O&M costs. In addition, as infrastructure costs become flexible, the operational costs of the entire service are reduced, and market competitiveness is improved. Cloud not only enables infrastructure flexibility but also enables the large-scale deployment of self-service IT infrastructure services. Any user can obtain required infrastructure resources within minutes. This greatly improves the iteration speed of the entire process and reduces the time O&M personnel spend on resource provisioning and statistics collection.
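Automatic service expansion and reduction can be sketched as a proportional scaling rule: the instance count grows or shrinks so that average utilization moves toward a target. The function below is a hypothetical illustration. Its name, its parameters, and the default target of 60 percent are assumptions, not part of any platform described here.

```python
def desired_instances(current, avg_utilization_pct, target_pct=60,
                      min_instances=1, max_instances=20):
    """Return the instance count that would bring utilization near the target.

    Utilization values are whole-number percentages. The result is
    ceil(current * avg_utilization_pct / target_pct), clamped to the
    range [min_instances, max_instances].
    """
    if current < 1:
        raise ValueError("current must be at least 1")
    if target_pct <= 0:
        raise ValueError("target_pct must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    raw = (current * avg_utilization_pct + target_pct - 1) // target_pct
    return max(min_instances, min(max_instances, raw))


# Scale out under load, scale in when idle.
print(desired_instances(4, 90))  # 4 instances at 90% utilization
print(desired_instances(4, 30))  # 4 instances at 30% utilization
```

The minimum and maximum bounds keep an automatic rule from removing all capacity or from expanding without limit, which is where the cost flexibility described above comes from.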
O&M transformation trend 4: The value of third-party O&M services is becoming increasingly clear, and the number of key application fields continues to increase. The complex heterogeneous environment of enterprise IT infrastructure requires a highly professional data center O&M team that can provide specific O&M services for different software and hardware. As the data center O&M services evolve from hardware O&M to software O&M, comprehensive O&M service providers that are specialized across products, platforms, and applications are needed to integrate the upstream and downstream service ecosystems. They also need to be able to provide customers with end-to-end O&M services from infrastructure to platforms and upper-layer applications. In 2017, the market scale of third-party O&M services for IT data centers reached 79.22 billion RMB, accounting for 45.7% of the total market.
Huawei Launches the I•MOC Platform to Address the O&M Transformation Trends
Over the past three decades, Huawei has served 50,000+ customers in 170+ countries and regions around the world. Huawei IT platforms have benefited a large number of customers, including organizations in industries such as R&D, sales, service, and finance. Huawei also has an amazing number of partners and hundreds of millions of terminal consumers around the world. Handling such a wide range of business processes posed great challenges to Huawei's IT systems.
To tackle these challenges, Huawei IT platforms began their cloud-based transformation in 2014. After several years, as the scale of the Huawei IT cloud platform grew dramatically and the services became more diversified, Huawei entered an all-cloud era. Currently, Huawei is managing 200+ data centers, 50,000+ cabinets, 300,000+ servers, 1,000+ PB of data, millions of virtual machines (VMs), and several heterogeneous cloud environments.
How does Huawei overcome the difficulties and challenges in the digital transformation of IT systems? Leveraging decades of in-house development and practical experience, Huawei launched the I•MOC platform, a unified O&M platform that provides core functions including management, monitoring, control, service, operation, and multi-tenancy. The management function registers all resources in a unified manner. The monitoring function provides the real-time running status of resources. The control and service functions handle O&M problems. The operation function visualizes the usage, running status, and health of all assets and resources to help O&M personnel get an overview of the situation and quickly solve problems. The multi-tenancy function performs tenant isolation, permission management and control, and authentication and authorization to ensure platform security.
Huawei officially launched the unified O&M platform I•MOC for global enterprises at HUAWEI CONNECT 2018. By sharing Huawei's successful practices in automated and intelligent O&M, Huawei aims to help customers implement comprehensive and refined O&M processes and automate O&M tasks, providing customers with a visualized, intelligent, and easy-to-use operations experience.