As digital transformation in the global banking industry accelerates and world-leading banks adopt data-driven strategies, converged data lakes have become the preferred service innovation platform for many of the world’s top banks. To meet these data-driven service innovation demands, Huawei has launched a Converged Data Lake Solution that is supported by partnerships with specialized Independent Software Vendors (ISVs).
A Converged Data Lake: The Most Popular Way for Financial Institutions to Construct Data Platforms
A data warehouse system has long been an important part of enterprise IT architecture, especially for traditional industries that rely on digital technologies, including the banking industry. Data warehouses play a critical role in traditional supervision and reporting, and increasingly in business intelligence, which has grown in importance in recent years.
However, with the rapid development of mobile technologies, especially the mobile Internet, online, mobile, and scenario-based financial services have become mainstream, resulting in explosive growth in data of diverse types. Traditional data warehouse platforms are built to process data volumes ranging from hundreds of gigabytes (GB) to hundreds of terabytes (TB). A large modern bank, by contrast, generates several TB, or even dozens of TB, of data every day, and the amount of new data each year reaches the petabyte (PB) level.
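As a rough check on these orders of magnitude, the short Python sketch below converts a daily data volume into a yearly one. The specific daily figures (3, 10, and 30 TB) and the function name are illustrative assumptions, not numbers reported by any bank; decimal units (1 PB = 1,000 TB) are also assumed.

```python
# Back-of-the-envelope check; the daily volumes below are illustrative assumptions.
TB_PER_PB = 1000  # decimal units assumed: 1 PB = 1,000 TB


def yearly_volume_pb(daily_tb: float, days: int = 365) -> float:
    """Return the yearly data volume in PB for a given daily volume in TB."""
    return daily_tb * days / TB_PER_PB


if __name__ == "__main__":
    # "Several TB" to "dozens of TB" per day, as hypothetical sample points.
    for daily_tb in (3, 10, 30):
        print(f"{daily_tb} TB/day -> about {yearly_volume_pb(daily_tb):.1f} PB/year")
```

Even at the low end of "several TB" per day, a year of new data already exceeds 1 PB, which is beyond the hundreds-of-TB range that traditional platforms are designed for.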
Meanwhile, as banking services become more and more integrated with customers’ everyday lives, a large amount of unstructured data is generated on a daily basis, from tracing point data and transaction logs to images, audio, and video. This represents a significant challenge for traditional data warehouse platforms, which process only structured data and only in limited volumes. IT managers therefore need to rebuild existing data warehouse platforms so that they can process massive, diverse data and support data-driven service innovation.
• Challenges Traditional Data Warehouse Platforms Face
Traditional data warehouse platform costs are high, accounting for a large proportion of an IT department’s spend, both in construction and subsequent capacity expansion. For example, a medium-sized bank will invest millions of dollars in its data warehouse platform each year; for a large bank, this figure may reach tens of millions of dollars.
A traditional data warehouse also lacks real-time analysis capability. As data volume and user numbers grow, traditional data warehouses can’t meet Service Level Agreement (SLA) requirements for real-time analysis, such as real-time anti-fraud measures.
Because traditional data warehouses are mostly relational databases, they lack diverse computing capabilities and are weak at processing semi-structured and unstructured data.
They also can’t offer online capacity expansion. On a traditional data warehouse platform, existing service systems are usually suspended during capacity expansion. And, as the scale of data increases, capacity expansion becomes increasingly time-consuming, posing a real challenge to service continuity.
Finally, traditional data warehouse platforms use an all-in-one architecture, which fails to meet banking’s strategic requirement for IT architecture platform decoupling.
• Development Trends of Future Data Platforms
Open distributed architecture is key to the future. Indeed, an open platform combined with a Massively Parallel Processing Database (MPP DB) has become the preferred choice for an increasing number of large financial institutions. Open distributed architecture helps financial institutions decouple software from hardware, provides processing capabilities for massive amounts of data, and supports linear platform expansion.
Real-time service decision-making capabilities will also be critical. Real-time processing has become a universal requirement for banks around the world seeking to deliver real-time services and a personalized user experience. Indeed, real-time processing capability is now a basic requirement for banks when building a data warehouse platform.
Banks also require the ability to process diverse types of data. Data platforms must therefore be capable of storing, processing, and analyzing structured, semi-structured, and unstructured data. By applying the latest technologies, financial institutions are able to mine and analyze the different data types they own, to create more value.
Financial institutions and banks now need always-on services that remain uninterrupted even during system expansion and upgrades; quite simply, 24/7 operations are essential for mission-critical service systems.
Finally — and unsurprisingly — integration with Artificial Intelligence (AI) platforms is also essential. Financial institutions are increasingly exploring the application of AI in more and more fields. AI depends on data, and this means that integration with an AI platform must be a key consideration when planning and constructing a data platform.
• A Converged Data Lake Is the Main Direction for Data Platform Construction in the Financial Sector
A converged data lake integrates distributed data warehouses with big data processing platforms. It can process structured and unstructured data at the same time, and it handles both real-time and offline batch workloads. Such a platform also supports the processing of massive amounts of data through distributed linear expansion. As more and more financial services are made available online and on mobile apps, the customer experience is continuously improving; a converged data lake has become an important platform for banks to deliver customer-centric, scenario-based financial services, and a way to implement rapid service innovation.
Huawei’s Converged Data Lake Solution
Huawei is the only vendor in the industry that can provide a converged big data platform (FusionInsight MRS), a distributed data warehouse platform (FusionInsight DWS), an AI development platform (ModelArts), and a distributed storage solution. Huawei’s self-developed data virtualization platform and data enablement platform — including data governance (DAYU) and data integration (ROMA) — are integrated to provide end-to-end solutions for financial customers, including data access, storage, processing, analysis, and governance. Huawei’s full-stack hardware also allows industry customers to conduct chip- and platform-level performance optimization, empowering them with impressive data analytics and processing capabilities, accelerating data-driven service innovation.
Building a Converged Data Platform for Global Top Banks
With the rapid development of mobile Internet technologies — especially the widespread use of mobile payments — China’s traditional financial institutions face fierce competition from emerging FinTech companies. Bank G — a world-class bank — defined its data-driven strategy in a digital transformation blueprint launched in 2015. To execute this strategy, the bank selected an open architecture-based distributed data platform to cope with the challenges brought by the surge of service data and the need for rapid service innovation.
Before this transformation to a distributed data platform, Bank G faced significant challenges.
The bank was under tremendous cost pressure in both early-stage platform construction and follow-up capacity expansion. From 2005 to 2015, Bank G paid a data warehouse vendor as much as CNY800 million (approximately US$113 million), and its average annual maintenance cost reached tens of millions of yuan.
Furthermore, traditional, closed appliance architecture was in direct conflict with the bank’s technological decoupling strategy. Indeed, many financial companies — heavily reliant on innovative digital technology — are frustrated by such vendor lock-in.
The bank also urgently needed to upgrade its online services. Given the twin pressures of increasing customer demand for an improved service experience and the need for efficient and timely service reporting, Bank G’s traditional data warehouse platform was simply unable to accommodate online capacity expansion.
There was also a severe lack of real-time data processing capabilities. Traditional data warehouse platforms are, of course, based on offline analysis and processing, and they lack real-time capabilities in Internet scenarios. In particular, traditional data warehouse platforms can’t handle real-time data flow processing for anti-fraud operations.
Traditional data warehouse platforms are mainly used to process relational structured data and can’t handle the diverse — semi-structured and unstructured — data generated in mobile Internet scenarios, including log and tracing point data, voice, and images. In pursuit of its data-driven strategy, Bank G opted for a data platform that had the ability to analyze, process, and explore these diverse types of data.
After evaluating platforms of multiple vendors, Bank G chose Huawei’s converged data lake platform for two main reasons. Firstly, Huawei has years of experience in data platform technologies, and its big data platform has been successfully deployed by many global customers, including the large-scale deployment of more than 10,000 nodes by Huawei’s consumer business, as well as deployments of thousands of nodes by global Tier-1 carriers. Indeed, one of China’s most innovative banks, Bank Z, has developed many new applications based on Huawei’s data platform. Secondly, the data warehouse platform and the big data platform are integrated in Huawei’s solution, which supports the future development trend toward converged data lakes and data warehouses. Huawei also provides a complete data migration solution to ensure smooth data migration from a legacy platform to the new platform, achieving zero data loss and zero service interruption.
After investigation and analysis, migration solution design, solution verification, and solution implementation, Huawei helped Bank G replace all of its traditional data warehouse platforms in June 2019, completing the smooth migration and deployment of nearly 1,000 nodes and more than 2 PB of data in the production environment, ensuring service security.
As digital transformation in the global banking industry accelerates and world-leading banks adopt data-driven strategies, converged data lakes have become the preferred service innovation platform for many of the world’s top banks. To meet their data-driven service innovation demands, Huawei partners with specialized ISVs. And while many leading banks in China have already deployed Huawei’s data lake platform solution, mainstream banks in Malaysia, Singapore, and Nordic Europe are beginning to follow suit. If the future for the finance sector is data, now is the time to upgrade legacy infrastructure or risk being left behind.