Huawei Succeeds in Upgrading HPC System for EPFL
High-Performance Computing (HPC) improves a variety of fields, including aerodynamics and space technology development, long-term climate predictions, high-precision weather forecasting, ocean current calculations, air and water pollution simulation analysis, flood and earthquake predictions, engine and mold designs, biological medicine designs, wind tunnel simulation testing, petroleum exploration, and new materials research.
Currently, HPC continues to develop rapidly and spread widely for two reasons:
Demand: In this data era, as data volumes increase and people pursue higher data-analysis efficiency, stronger computing capacity is required.
Technology development: Information technologies have advanced rapidly in recent years, so HPC’s strong computing capacity is now available at low cost, without large outlays on manpower and materials.
These two factors reinforce each other, so that more industries can implement HPC and benefit from the resulting improvements.
The higher education industry is a typical example. Among the world’s top 500 HPC clusters in the list released in June 2017, 41 belong to universities, a share of just over 8 percent. Why does the higher education industry benefit from HPC? Universities use HPC in teaching and research in fields such as physics, chemistry, and biology, which is why HPC is developing so rapidly in this sector.
École Polytechnique Fédérale de Lausanne (EPFL) is a world-class university, ranking twelfth in the QS World University Rankings. EPFL is well known in engineering technology and the natural sciences, with students, professors, and staff from over 120 countries and regions. To maintain its leading level of scientific research, EPFL established its first HPC system to serve all students and teachers in 2008, and has continuously improved it since.
To enhance its future competitiveness, EPFL began planning in 2016 to upgrade and expand its HPC system, because existing resources had become insufficient. EPFL listed all the requirements to be met, including HPC benchmarks, the HPL test, the HPCG test, and various applications in science, engineering, biology, and medical care. All of these applications had to continue running properly.
In addition, there are many mandatory requirements, such as theoretical computing capability ≥ 475 TFLOPS, shared storage ≥ 340 TB, read/write bandwidth up to 40 Gbit/s, cabinets ≤ 8, and power consumption per cabinet ≤ 25 kW. The system must be open, easy to manage, and scalable. Partners should be forward-looking in technologies and able to offer sufficient support to EPFL to build a 5 PFLOPS HPC cluster in the next five years.
Huawei and Transtec analyzed the requirements in detail over several rounds and devised a solution. The topology diagram is as follows:
This solution uses an InfiniBand network with a two-layer fat-tree topology. The storage system is composed of six OceanStor 5800 systems and a General Parallel File System (GPFS), with a capacity of 350 TB.
Huawei selected a number of its advanced products and technologies for EPFL’s solution, with remarkable results. For example, FusionServer X6800 high-density servers increase the single-cabinet computing capacity by 70 percent and reduce the number of cabinets by 40 percent. A 4U X6800 chassis accommodates 8 compute nodes with 16 CPUs, whereas the same 8 nodes built from ordinary 1U two-socket servers occupy 8U of rack space. As a result, only 6 rather than 10 cabinets are required to accommodate 408 compute nodes. FusionServer X6800 also adopts a heat dissipation design and Dynamic Energy Management Technology (DEMT), so its power consumption is 10 percent to 20 percent lower than that of a traditional rack server. Other features are not listed here.
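The density figures above can be checked with simple arithmetic. The following Python sketch uses only the numbers quoted in this article (408 nodes, 8 nodes per 4U chassis, 6 versus 10 cabinets); the variable names are illustrative, and the calculation ignores switches, storage, and power limits, which also affect real cabinet planning.

```python
import math

# Figures quoted in the article
NODES = 408                 # compute nodes in the cluster
NODES_PER_X6800 = 8         # compute nodes per X6800 chassis
X6800_HEIGHT_U = 4          # rack units per X6800 chassis
RACK_SERVER_HEIGHT_U = 1    # ordinary 1U two-socket server, one node each
CABINETS_X6800 = 6
CABINETS_1U = 10

# Rack space needed by the compute nodes alone
x6800_chassis = math.ceil(NODES / NODES_PER_X6800)
x6800_rack_units = x6800_chassis * X6800_HEIGHT_U
rack_server_units = NODES * RACK_SERVER_HEIGHT_U

# Relative change in cabinet count and in nodes per cabinet
cabinet_reduction = 1 - CABINETS_X6800 / CABINETS_1U
density_gain = CABINETS_1U / CABINETS_X6800 - 1

print(f"X6800 chassis: {x6800_chassis}, rack units: {x6800_rack_units}")
print(f"1U servers, rack units: {rack_server_units}")
print(f"Cabinet reduction: {cabinet_reduction:.0%}")
print(f"Nodes-per-cabinet gain: {density_gain:.0%}")
```

The cabinet reduction works out to 40 percent, matching the article. The nodes-per-cabinet gain computed this way is about 67 percent, close to the quoted 70 percent, which may be rounded or may rest on factors not modeled here.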
This solution has been widely recognized and deployed thanks to its high efficiency, performance, scalability, and ease of management. In actual use, the solution fully meets users’ requirements: the measured computing power is 401 TFLOPS, with a computing efficiency of up to 84.4 percent.
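Computing efficiency is conventionally the ratio of measured performance to theoretical peak performance. As a rough sketch, assuming that definition applies here, the Python snippet below derives the theoretical peak implied by the article's figures and compares the delivered system with the mandatory requirements quoted earlier. The function and dictionary names are illustrative, and because 84.4 percent is a rounded value, the implied peak is only approximate.

```python
def implied_peak_tflops(measured_tflops: float, efficiency: float) -> float:
    """Theoretical peak implied by measured performance and efficiency."""
    return measured_tflops / efficiency

# Figures quoted in the article
MEASURED_TFLOPS = 401.0
EFFICIENCY = 0.844          # rounded, so the result below is approximate
STORAGE_TB = 350
CABINETS = 6

# Mandatory requirements quoted in the article
REQUIRED_PEAK_TFLOPS = 475.0
REQUIRED_STORAGE_TB = 340
MAX_CABINETS = 8

peak = implied_peak_tflops(MEASURED_TFLOPS, EFFICIENCY)

checks = {
    "theoretical peak": peak >= REQUIRED_PEAK_TFLOPS,
    "shared storage": STORAGE_TB >= REQUIRED_STORAGE_TB,
    "cabinet count": CABINETS <= MAX_CABINETS,
}

print(f"Implied theoretical peak: {peak:.1f} TFLOPS")
for name, ok in checks.items():
    print(f"{name}: {'met' if ok else 'not met'}")
```

Under these assumptions the implied peak is roughly 475 TFLOPS, which is consistent with the stated requirement.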
EPFL and Huawei announced that the Fidis HPC cluster, developed by EPFL SCITAS, was successfully rolled out in June 2017.
Miss Vittoria Rezzonico, Executive Officer of EPFL SCITAS, spoke highly of Huawei in an interview: “Transtec has closely cooperated with Huawei to provide EPFL with a top-quality system, which meets our demands in the high-performance computing field. We are impressed with the excellent hardware solutions introduced by Huawei engineers and professional planning, installation, and configuration services from Transtec.”