FusionInsight Unlocks Big Data
By Fan Zhaofu, Senior Engineer, Elastic Computing Cloud Marketing Dept., Huawei IT Product Line
Big Data Technologies
As the amount of available data grows, the overabundance of information becomes increasingly difficult to manage. The “information explosion” phenomenon describes the rapid increase in the amount of published information or data and its consequences.
A transposition of Moore’s Law to the Information and Communications Technology (ICT) industry holds that the total amount of data generated by mankind doubles every 18 months. The mobile Internet and the Internet of Things (IoT) are major contributors to this explosive growth, which renders conventional data processing technologies inadequate. To keep pace, the industry has been developing Big Data technologies that allow organizations to manage massive data sets and extract value from them.
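To make the 18-month doubling claim concrete, the short sketch below computes the implied growth factor over several horizons. It assumes the doubling period stays constant, which the article does not assert beyond the rule of thumb itself; the function name is illustrative only.

```python
def growth_factor(months: float, doubling_period_months: float = 18.0) -> float:
    """Return how many times larger the data volume is after `months`,
    assuming it doubles once every `doubling_period_months`."""
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    for years in (1.5, 3, 5, 10):
        factor = growth_factor(years * 12)
        print(f"{years:>4} years: about {factor:.1f} times the starting volume")
```

Under this assumption, data volume grows roughly tenfold in five years and roughly a hundredfold in ten, which is the scale of growth the article describes as overwhelming conventional data processing.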
Big Data technologies collectively refer to technologies and infrastructures that enable enterprises, institutions, governments, and other organizations to effectively manage Big Data and explore its potential economic benefits. Typical Big Data technologies include distributed-storage technologies, parallel-computing technologies that facilitate mass data queries and analytics, mass data mining algorithms, industry-specific mass data modeling methods, applications intended to monetize Big Data, and hardware infrastructure for carrying huge volumes of data. These technologies are fostering a complete ecosystem that links industries, and they are already applied by large Internet companies, telecommunication carriers, governments, and the financial services industry.
Within 5 to 10 years, Big Data technologies will likely be applied in nearly every corner of society and will reshape IT infrastructures.
In 2009, Huawei started a Big Data R&D project, and in 2011 it launched its first Big Data solution, code-named Galax HD, which was renamed FusionInsight Hadoop in 2013. To date, FusionInsight has won more than 100 contracts globally from customers in telecommunications, finance, scientific research, public security, and government. More than 40 projects have been delivered, and more than 10 have been commercialized.
FusionInsight is an enterprise-grade unified platform for mass data storage, query, and analytics that lets enterprises quickly build their Big Data technical capabilities. By performing real-time and non-real-time analytics and mining on various kinds of mass data, FusionInsight helps enterprises identify opportunities and risks in a timely manner and make decisions that improve profits.
With a completely open architecture, FusionInsight can run on any standard x86 server without additional hardware components or storage media. Huawei FusionInsight features a highly reliable, secure, and easy-to-use Operation and Maintenance (O&M) system, application development services, and full data modeling middleware. These capabilities allow data-intensive industries to monetize the value that is embedded in their data.
Open source is the prevailing trend for Big Data applications seeking optimal performance. Huawei closely follows and continuously contributes to the development of the Apache Hadoop open source community. Statistics show that, by the end of 2013, Huawei was ranked as the number four contributor to Apache Hadoop globally, ahead of all other IT equipment vendors. To maintain the complete openness of FusionInsight, Huawei does not use proprietary architecture or components. In addition, Huawei's engineering teams rapidly and routinely incorporate the latest components and technologies from the open source community into FusionInsight releases.
The FusionInsight team provides localization engineering capabilities, resolves kernel-level problems, and develops industry-specific data services and open platforms for customers. These platforms help enterprise clients get the maximum benefit from Big Data.
FusionInsight provides enterprise-grade enhancements in four areas: reliability, security, performance, and scalability.
The components of all FusionInsight management nodes are built with High Availability (HA). HBase clustering technology enables long-distance disaster recovery over distances of more than 1,000 kilometers. Other major reliability enhancements include table-level cluster backup, log retrieval, and data integrity checks and restoration.
FusionInsight supports Role-based Access Control (RBAC). The Web-based user interface supports Single Sign-on (SSO) authentication. HBase permission control is implemented at the HBase database, table, column family, and column levels. Hive-based access control isolates the data of authorized users while still allowing user data to be cross-referenced.
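The idea of role-based permissions checked at the database, table, column family, and column levels can be illustrated with a small sketch. This is not FusionInsight or HBase code: the `Scope`, `covers`, and `RoleBasedAcl` names are hypothetical, and the rule that a grant at a higher level covers everything beneath it is an assumption made for illustration.

```python
from typing import NamedTuple, Optional


class Scope(NamedTuple):
    """A position in the hierarchy; None means 'not specified at this level'."""
    database: str
    table: Optional[str] = None
    family: Optional[str] = None
    column: Optional[str] = None


def covers(grant: Scope, target: Scope) -> bool:
    """A grant covers a target if every level the grant names matches it."""
    for granted, requested in zip(grant, target):
        if granted is None:
            return True   # the grant stops here, so it covers all lower levels
        if granted != requested:
            return False
    return True


class RoleBasedAcl:
    def __init__(self) -> None:
        self._role_grants = {}  # role name -> set of (Scope, action)
        self._user_roles = {}   # user name -> set of role names

    def grant(self, role: str, scope: Scope, action: str) -> None:
        self._role_grants.setdefault(role, set()).add((scope, action))

    def assign(self, user: str, role: str) -> None:
        self._user_roles.setdefault(user, set()).add(role)

    def is_allowed(self, user: str, target: Scope, action: str) -> bool:
        # Users never hold permissions directly; they inherit them from roles.
        for role in self._user_roles.get(user, ()):
            for scope, granted_action in self._role_grants.get(role, ()):
                if granted_action == action and covers(scope, target):
                    return True
        return False


if __name__ == "__main__":
    acl = RoleBasedAcl()
    acl.grant("analyst", Scope("sales", "orders", "info"), "read")
    acl.assign("alice", "analyst")
    print(acl.is_allowed("alice", Scope("sales", "orders", "info", "price"), "read"))
    print(acl.is_allowed("alice", Scope("sales", "orders", "billing", "card"), "read"))
```

In this sketch a read grant on one column family lets the analyst role read any column in that family but nothing in a sibling family, which is the kind of fine-grained isolation the four permission levels make possible.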
FusionInsight can encrypt file systems and prohibits storing user information in plaintext within a cluster. Encryption algorithms are available as plug-ins and can be extended or customized based on service needs.
Security hardening measures are also applied to the operating system: unnecessary components are removed, and tools are available for automated vulnerability scans. Key components, such as service nodes, management nodes, and the user management portal, strictly comply with industry standards to protect the operating system and infrastructure.
FusionInsight uses an innovative CTBase scheme to meet data-intensive service needs involving associated tables in databases and data warehouses. Built on HBase, CTBase encapsulates the original HBase APIs to form an HBase-based ClusterTable/ClusterIndex framework. Service tables with similar functions are associated or combined into one big HBase table (ClusterTable). CTBase provides ClusterTable-level interfaces for third-party development. On the visualized management page, users can create tables and indexes and define data columns and the RowKey schema.
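The benefit of combining associated service tables into one big table can be sketched with an in-memory stand-in for a sorted key-value store. This is not the CTBase API, and the article does not describe CTBase's actual RowKey schema; the composite key layout (shared key, source table name, record ID), the separator, and all names below are assumptions for illustration. The sketch relies only on the fact that HBase keeps rows sorted by row key, so rows with a common key prefix are stored next to each other.

```python
import bisect

SEP = "\x00"  # assumed separator; it sorts before any printable character


class ClusterTableSketch:
    """In-memory stand-in for one sorted table holding several logical tables."""

    def __init__(self) -> None:
        self._keys = []  # row keys kept in lexicographic order
        self._rows = {}  # row key -> value

    @staticmethod
    def row_key(cluster_key: str, source_table: str, record_id: str) -> str:
        # Leading with the shared key places related records side by side.
        return SEP.join((cluster_key, source_table, record_id))

    def put(self, cluster_key, source_table, record_id, value) -> None:
        key = self.row_key(cluster_key, source_table, record_id)
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def scan_prefix(self, prefix: str):
        # Seek to the first key >= prefix, then read until the prefix ends.
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break
            yield key.split(SEP), self._rows[key]

    def scan_cluster(self, cluster_key: str):
        """Return every record, from every source table, for one shared key."""
        return list(self.scan_prefix(cluster_key + SEP))


if __name__ == "__main__":
    table = ClusterTableSketch()
    table.put("cust42", "account", "a1", {"type": "savings"})
    table.put("cust42", "txn", "t1", {"amount": 120})
    table.put("cust7", "account", "a9", {"type": "checking"})
    for key_parts, value in table.scan_cluster("cust42"):
        print(key_parts, value)
```

With this layout, fetching a customer's account and transaction records is one contiguous scan rather than a join across separate tables, which is one plausible reason to cluster associated tables in this way.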
The FusionInsight O&M center supports dual-host backup and distributed parallel processing, and can complete cluster installations in as few as ten minutes. The fully automated online maintenance, customized dashboard, and automated application development assistant allow enterprises to easily manage their Big Data systems. Wizards guide users through upgrade and rollback operations, allowing complete Hadoop cluster upgrades in less than six minutes. FusionInsight also provides northbound interfaces that allow integration with existing enterprise network management systems. Currently, the solution supports the syslog interface; additional northbound interfaces will be made available in the future.
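As an illustration of what a syslog-based northbound interface involves, the sketch below formats an event in the traditional BSD syslog layout (RFC 3164) and shows how it could be sent over UDP. The article does not specify which syslog format, facility, or fields FusionInsight emits, so the host name, tag, message text, and the choice of the local0 facility are all hypothetical.

```python
import socket
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
FACILITY_LOCAL0 = 16  # standard syslog facility code for local0
SEVERITY = {"emerg": 0, "alert": 1, "crit": 2, "err": 3,
            "warning": 4, "notice": 5, "info": 6, "debug": 7}


def format_syslog(severity: str, hostname: str, tag: str, message: str,
                  when: datetime, facility: int = FACILITY_LOCAL0) -> str:
    """Build one BSD-style syslog line: <PRI>TIMESTAMP HOSTNAME TAG: MSG."""
    pri = facility * 8 + SEVERITY[severity]  # priority value defined by syslog
    # RFC 3164 timestamps use a space-padded day of month, e.g. "Mar  5".
    timestamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{timestamp} {hostname} {tag}: {message}"


def send_udp(line: str, server: str, port: int = 514) -> None:
    """Send one syslog line to a collector over UDP (the usual syslog port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (server, port))


if __name__ == "__main__":
    line = format_syslog("err", "mgmt-node-1", "cluster-monitor",
                         "DataNode unreachable", datetime(2014, 3, 5, 9, 7, 30))
    print(line)
```

Because syslog is a widely supported transport, an existing network management system can usually ingest such events without a product-specific adapter.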
To help developers use Big Data technologies, FusionInsight provides a series of data collection and analytic functions, including data center O&M log analysis, historical data queries, real-time event processing, and customer profile analytics. More customized functions can be provided to accommodate the growing service needs of enterprises.
Unlocking Business Potential
Unlocking the business potential of Big Data has become a key area of focus and investment for enterprises. Huawei FusionInsight provides businesses with an intelligent Big Data platform to transform mountains of data into critical business insights, rapidly and accurately. With Huawei FusionInsight, enterprises can move from relying on static, pre-processed information to using Big Data technologies that provide real-time, better-informed, and revenue-generating services.