In today’s commercial society, a person’s credit rating affects many aspects of life, whether applying for a credit card, purchasing insurance, or applying for a loan. Looking ahead, personal credit ratings are expected to play an important role in more fields, including social networking, marriage prospects, and career development.
Credit Reporting Is Entering the Big Data Era
Credit reporting is an activity in which professional and independent third parties build risk profiles for individuals and organizations for use by banks and other external credit granting agencies.
The business logic behind credit reporting is based on the collection and analysis of data. Specific actions include processing massive amounts of multi-dimensional, heterogeneous data for the purpose of extracting value and knowledge for easy and interactive delivery to desired application scenarios.
With the expansive growth of the mobile Internet and big data marketplace, the service objectives for credit reporting have changed greatly. Service scenarios are transforming from offline face-to-face interactions to online, real-time activities. At the same time, data sources are proliferating and data structures are becoming more complex to adequately cover users’ online behavior, mobile tracking, transaction records, and consumption habits. The credit reporting market is on a fast track to enter the big data era. Over the next two years, the big data-powered credit reporting market will grow rapidly, with the potential to generate over USD 14.3 billion (CNY 100 billion) in value.
On March 3, 2016, China Mobile Communications Corporation and China Merchants Group established Shijinshi Credit Information Services Co. Ltd. (Shijinshi) as a joint venture in response to a government initiative promoting the use of big data in the credit reporting market. Shijinshi has entered China’s credit reporting market with impressive credentials and stature: China Mobile is the country’s largest mobile communications carrier with access to real-name materials for more than 800 million subscribers, and China Merchants Group is a longstanding leader in the financial services industry that is deeply familiar with the application of credit reporting services.
In the past two years, Shijinshi has been dedicated to developing a real-time big data platform for commercial use. This platform uses the micro-service architecture that reduces the difficulty of deploying customer applications. Customers can concentrate on the application logic without worrying about how to handle the massive store of data that has been accessed, how data with different table structures is stored, or how to implement offline analytics conveniently. With more than two years of effort, Shijinshi has launched a mature platform for both internal use and for access by customers from other industries.
Take the financial industry as an example. Drawing on the abundant big data resources of telecommunications carriers, Shijinshi can quickly deliver that data, including user location, social circles, and account registration information, to financial services institutions. Through real-time analysis and calculation, it supports risk management solutions such as identity authentication, anti-fraud investigations, and comprehensive credit scoring. The solution can also serve insurance, telecom, government, and other industries, playing an important role in insurance underwriting, transportation services, and government affairs.
Currently, Shijinshi’s platform is successfully serving small- and medium-sized municipal and commercial banks. In the future, the use of big data in credit reporting will be extended for wider applications, such as insurance policies. Big data-capable credit reporting can also be widely used in other scenarios related to living, social, and vocational activities.
Big Data Real-Time Service Platform with Multiple Advantages
In October 2015, the founders of Shijinshi began to explore how to meet the requirements of high-frequency, real-time service scenarios in the financial industry. With the help of Huawei, Shijinshi planned and developed a real-time big data framework using the Hadoop platform as the underlying software. After more than one year of commercial application, this platform has achieved good results, with the following specific advantages:
• Second-level Real-time Transaction Services
The Shijinshi platform has put the HBase columnar database directly into the production transaction system and uses the HBase column structure to store transaction data directly. This approach brings three benefits. First, no intermediate data conversion is needed before data analysis, and the columnar database is designed for data reliability, which helps avoid data loss. Second, the table structure is flexible: whether data arrives from Tencent WeChat, QQ, or Sina Weibo, the HBase table structure need not be modified or reformatted, which greatly reduces the burden of data processing. Third, unlike a typical relational database, HBase scales out by adding nodes, so there is no fixed upper limit on scale.
Complex batch computations are handled by decoupling the source data into multiple real-time requests, so that time-consuming calculation and storage operations are separated from the main request thread. These operations are then performed by a real-time stream-processing framework that stores the results in the columnar database. A micro-service framework has also been adopted to meet the requirements of real-time services.
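The separation of slow work from the main request thread can be illustrated with a minimal Python sketch. It is not Shijinshi's implementation: a standard-library queue and worker thread stand in for the stream-processing framework, a dictionary stands in for the columnar database, and every name here (`handle_request`, `score_worker`, `results_store`) is hypothetical.

```python
import queue
import threading
import time

# Hypothetical in-memory stand-in for the columnar store (not HBase itself).
results_store = {}
work_queue = queue.Queue()


def score_worker():
    """Background processor: drains the queue and stores computed results."""
    while True:
        item = work_queue.get()
        if item is None:              # sentinel value: stop the worker
            work_queue.task_done()
            break
        request_id, amounts = item
        time.sleep(0.01)              # stands in for a time-consuming calculation
        results_store[request_id] = sum(amounts) / len(amounts)
        work_queue.task_done()


def handle_request(request_id, amounts):
    """Main request path: enqueue the heavy work and return immediately."""
    work_queue.put((request_id, amounts))
    return {"request_id": request_id, "status": "accepted"}


def run_demo():
    worker = threading.Thread(target=score_worker, daemon=True)
    worker.start()
    acks = [handle_request(f"req-{i}", [100 * i, 200 * i]) for i in range(1, 4)]
    work_queue.join()                 # wait until the background work is finished
    work_queue.put(None)              # ask the worker to stop
    worker.join()
    return acks, dict(results_store)
```

Calling `run_demo()` returns the immediate acknowledgements together with the results that the worker wrote afterwards. The request handler never waits on the calculation, which is the property the paragraph above describes.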
• Minute-level High-speed Analytics
In addition to real-time processing, credit-reporting services also require high-speed analytics. Columnar transaction data is stored in HBase, which is located at the underlying layer of the Shijinshi platform architecture. When analysis is required, data is directly synchronized from the primary data cluster to the backup cluster where the high-speed analysis is performed. There are three reasons why this platform architecture supports minute-level analysis. First, it enables fast data acquisition without incurring the time or cost of a long Extract, Transform, and Load (ETL) process. Second, preprocessing ensures that the machine is properly configured before the data is transferred. Third, a batch analysis is performed by the backup cluster rather than consuming the resources of the primary cluster — a technique that has been designed to complete the data modeling calculations within minutes for delivery to multiple application scenarios.
• High Availability for Real-time Services
The HA architecture is the most important and complex part of the credit-reporting platform; designing solutions for its long list of issues took more than nine months. Real-time services require continuity, so primary and backup clusters are set up in the production system to guarantee resource availability if the primary system fails. A key design goal is to give data consistency and service continuity equal weight. Specifically, data consistency is ensured in two ways: first, while the primary cluster is working properly, its data is fully synchronized to the backup cluster; second, no data is lost while the backup cluster serves requests until the primary cluster recovers.
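The two consistency guarantees above can be sketched in a few lines of Python. This is a toy model under stated assumptions: the source does not describe the actual synchronization mechanism, so the sketch simply writes to both clusters while the primary is healthy and keeps a catch-up log while it is down. The `Cluster` and `HAStore` classes are hypothetical, and the backup is assumed to stay available.

```python
class Cluster:
    """Hypothetical stand-in for one storage cluster."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.healthy = True

    def put(self, key, value):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        self.rows[key] = value


class HAStore:
    """Routes writes to the active cluster and tracks what the primary missed."""

    def __init__(self):
        self.primary = Cluster("primary")
        self.backup = Cluster("backup")
        self.pending_for_primary = []   # writes accepted while the primary was down

    def put(self, key, value):
        if self.primary.healthy:
            self.primary.put(key, value)
            self.backup.put(key, value)     # keep the backup fully synchronized
        else:
            self.backup.put(key, value)     # backup takes over the service
            self.pending_for_primary.append((key, value))

    def get(self, key):
        active = self.primary if self.primary.healthy else self.backup
        return active.rows.get(key)

    def recover_primary(self):
        """Replay the missed writes so both clusters hold the same data again."""
        self.primary.healthy = True
        for key, value in self.pending_for_primary:
            self.primary.put(key, value)
        self.pending_for_primary.clear()
```

A write made during the outage is served from the backup and replayed when the primary returns, so both clusters end with identical rows. A production design would also have to handle backup failures and in-flight writes at the moment of failover, which this sketch leaves out.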
• Columnar Information Chain with Full Time Series
Kx Systems of Palo Alto, California, introduced time-series-based, column-oriented databases to the financial services industry in 1998. A columnar design differs from a row-oriented design in how it presents user credit histories and other credit information. The structure used here is called a “columnar information chain with full time series,” and it overcomes many inherent limitations of relational databases. For example, querying a user’s repayment, occupation, and traffic-violation records over the previous five years is a complicated task in a row-oriented relational database, but straightforward in a columnar database.
The columnar data structure allows external data to be written to the database quickly after simple classification. On reads, the structure supports retrieving data by the required classification, which reduces the data payload per interaction. In terms of scalability, when the classification of some information changes, a new classification table can be created or merged with existing classification tables, and the data migrated or consolidated into columns that match the original organization. By classifying and layering entity information, the platform can efficiently store a full time series of information for each person or enterprise. As the external raw-information layers are enriched and analysis methods deepen, the platform iteratively improves the information available to the final cognitive analysis layer, which supports upper-layer applications and scenarios.
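To make the idea of one time-ordered column per classification concrete, here is a small Python sketch. It models the layout with plain dictionaries rather than HBase, and the `EntityTimeline` class, the category names, and the sample records are all invented for illustration.

```python
from collections import defaultdict
from datetime import date


class EntityTimeline:
    """Hypothetical store: entity -> category (column) -> time-ordered cells."""

    def __init__(self):
        self._cells = defaultdict(lambda: defaultdict(list))

    def write(self, entity_id, category, when, value):
        """Append a cell to one column after simple classification."""
        column = self._cells[entity_id][category]
        column.append((when, value))
        column.sort(key=lambda cell: cell[0])   # keep each column in time order

    def read(self, entity_id, categories, since):
        """Return only the requested columns, limited to cells on or after `since`."""
        row = self._cells.get(entity_id, {})
        return {
            category: [(d, v) for d, v in row.get(category, []) if d >= since]
            for category in categories
        }


# Invented sample data, for illustration only.
store = EntityTimeline()
store.write("user-1", "repayment", date(2016, 3, 1), "late")
store.write("user-1", "repayment", date(2012, 5, 1), "on time")
store.write("user-1", "occupation", date(2015, 1, 10), "engineer")
store.write("user-1", "traffic_violation", date(2016, 7, 4), "speeding")

history = store.read(
    "user-1",
    ["repayment", "occupation", "traffic_violation"],
    since=date(2013, 1, 1),
)
```

The read touches only the three requested columns and returns each as a time-ordered list, which is what keeps the payload per interaction small. Adding a new classification is just a new column key, with no change to existing data.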
• Big Data Integrated Monitoring
Real-time monitoring with data visualization gives a clear view of the running status of an integrated platform and is an important indicator of technical maturity. Key monitoring activities include resource scheduling and service tracking, reported at different time granularities: real-time services are updated every second, the latest changes for high-speed analytics are shown every minute, and offline services are refreshed hourly.
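One simple way to report the same event stream at the three granularities described above is to bucket timestamps by slice width. The sketch below is only an illustration: the tier names and the `bucket_counts` helper are hypothetical, and the source does not say how the monitoring is actually built.

```python
from collections import Counter

# Refresh granularity per service tier, in seconds (second, minute, hour).
SLICE_SECONDS = {"real_time": 1, "analytics": 60, "offline": 3600}


def bucket_counts(events, tier):
    """Count events per time slice; `events` are Unix timestamps in seconds."""
    width = SLICE_SECONDS[tier]
    # Round each timestamp down to the start of its slice.
    return Counter(int(ts) // width * width for ts in events)
```

Each key in the returned counter is the start time of a slice, so a dashboard could redraw the per-second view often and the hourly view rarely from the same raw events.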
Using only nine nodes, this credit-reporting platform has processed up to 36 million requests per month and 1,000 concurrent Tabular Data Streams (TDSs), with the number of transactions increasing monthly. Many of China’s largest financial and Internet companies, including Bank of Communications, China Minsheng Bank, China CITIC Bank, SPD Bank, China Merchants Bank, and Suning are now connected to the system. The complete system continues to meet all functional expectations in both stress test and production environments.
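For a sense of scale, the monthly figure can be converted to an average request rate. The short calculation below assumes a 30-day month and an even spread of load across the nine nodes; both are simplifying assumptions, and real traffic would peak well above the average.

```python
REQUESTS_PER_MONTH = 36_000_000   # figure quoted in the text
NODES = 9                         # figure quoted in the text
DAYS_PER_MONTH = 30               # assumption


def average_rate(requests, days):
    """Average requests per second over the given number of days."""
    return requests / (days * 24 * 60 * 60)


avg_rps = average_rate(REQUESTS_PER_MONTH, DAYS_PER_MONTH)
per_node_rps = avg_rps / NODES

print(f"average: {avg_rps:.1f} requests/s, per node: {per_node_rps:.2f} requests/s")
```

Under these assumptions the platform averages about 13.9 requests per second, or roughly 1.5 per node.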
Huawei Digital Platform, Services, and Ecosystem
For the credit reporting industry, the choice of digital platform is always a critical decision. For Shijinshi, Huawei offers a Hadoop-based big data digital platform that combines powerful data storage with the ability to ingest large multi-dimensional data streams and output high-speed analytics. The platform is expected to keep improving Shijinshi’s ability to support real-time services, with the potential to eventually replace relational databases. To support this effort, Huawei is committed to substantial and ongoing contributions to the open-source Apache Hadoop project. In particular, Huawei invests in R&D and is committed to continuous component-level improvements and bug fixes in the areas of security, reliability, ease of use, and performance optimization.
The cooperation agreement between Huawei and Shijinshi brings benefits to both parties. Huawei manufactures and delivers a complete set of big data solutions, from x86-based hardware platforms and the enterprise-level FusionInsight big data platforms to a full range of vertical market applications and technical support services. Shijinshi’s contributions have extended the Huawei ICT platform with the addition of its own specialized business applications.
Shijinshi designed a universal big data architecture intended for use across many industries. Built on HA, real-time performance, and high reliability, the result has been packaged into a platform-as-a-product that can be applied in fields beyond credit reporting that require real-time big data processing.
Huawei played an important role in the development of the Shijinshi platform and continues to provide generous levels of support when implementation issues are encountered. One such example occurred when Huawei sent a big data team to help with the Operations and Maintenance (O&M) of the big data cluster — where problems were detected, resolved, verified, and put into production in a timely manner.
Shijinshi hopes to lower the barrier for entry to big data markets in order that more industries, organizations, and enterprises are able to reap the business value that these powerful systems are built to deliver. Through the collaboration of Huawei and its channel partner network, Shijinshi is committed to developing a solid partner ecosystem throughout China.
Huawei was selected based on the company’s capabilities in digital platforms, services and ecosystems. As a commercial system, the credit reporting service platform must have a digital platform vendor such as Huawei that is able to provide comprehensive capabilities such as reliable and trustworthy ICT solutions and services, as well as powerful channels and a comprehensive ecosystem.
— Shijinshi Credit Information Services Co. Ltd.