What Can You Do with 400 Teraflops?
High-performance computing (HPC) is used in a wide range of fields, including aerodynamics and space technology development, long-term climate prediction, high-precision weather forecasting, ocean current calculation, air and water pollution simulation, flood and earthquake prediction, engine and mold design, biomedicine design, wind tunnel simulation, petroleum exploration, and new materials research.
HPC is currently developing rapidly and being applied widely for two reasons:
• One is demand. In this data era, data volumes keep growing and people expect analysis results faster, both of which require strong computing capacity.
• The other is technology development. Information technologies have advanced rapidly in recent years, so HPC’s computing capacity is now available at low cost, without a huge investment in manpower and materials.
These two factors reinforce each other, so more and more industries can adopt HPC and benefit from the changes it brings.
The higher education industry is a typical example. Of the world’s top 500 HPC clusters on the list released in June 2017, 41 belong to universities, a share of more than 8 percent. Why does higher education need HPC so strongly? The reasons are similar to those behind HPC’s general popularization, but they are more pronounced in this industry.
Take automobile manufacturing as a comparison. Automobile manufacturers use HPC to design vehicles, and universities that teach related courses may use it for the same purpose, so both have HPC demands. The difference is that manufacturers use HPC only for vehicle design, while universities also use it in physics, chemistry, and biology. In other words, HPC is applied more broadly in universities than in enterprises, which is why it is developing so rapidly in higher education.
Here, I would like to introduce a case that further illustrates HPC usage in higher education: École Polytechnique Fédérale de Lausanne (EPFL). EPFL is a world-class university, ranked twelfth in the QS World University Rankings. It has a strong reputation in engineering and the natural sciences and has students, professors, and staff from over 120 countries and regions. To maintain its leading level of scientific research, EPFL keeps strengthening its HPC systems; in 2008 it established its first HPC system serving all students and teachers.
To enhance its future competitiveness, EPFL has been planning since last year to upgrade and expand its HPC system, because existing resources are insufficient. In EPFL’s list of requirements, the application demand column stands out. It lists every item to be met, including the HPC benchmark, HPL test, HPCG test, and various applications in science, engineering, biology, and medical care. All of these applications must run properly.
In addition, there are many mandatory requirements: theoretical computing capability ≥ 475 TFLOPS, shared storage ≥ 340 TB, read/write bandwidth up to 40 Gbit/s, cabinets ≤ 8, and power consumption per cabinet ≤ 25 kW. The system must be open and easy to manage and scale. Partners must be forward-looking in technology and able to give EPFL sufficient support to build a 5 PFLOPS HPC cluster within the next 5 years.
Meeting any single requirement is easy; meeting all of them at once is difficult. No pressure, no drive. Huawei worked through repeated, detailed analysis with Transtec and arrived at a solution.
This solution deploys 408 FusionServer XH620 servers as compute nodes. Each node has two Intel Xeon E5-2690 v4 CPUs, giving the cluster a theoretical peak of 475.2 TFLOPS. The nodes are connected by an InfiniBand network in a two-layer fat-tree topology. The storage system consists of six OceanStor 5800 systems running the General Parallel File System (GPFS), with a capacity of 350 TB.
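The 475.2 TFLOPS figure can be reproduced with a short calculation. The sketch below is only an illustration: the article gives the node and CPU counts, but the per-CPU values (14 cores, a 2.6 GHz base clock, and 16 double-precision floating-point operations per core per cycle) are assumptions taken from the processor's commonly published specifications, and `peak_tflops` is a hypothetical helper name.

```python
# Figures stated in the article.
NODES = 408
CPUS_PER_NODE = 2

# Assumed processor figures for the Xeon E5-2690 v4 (not given in the article).
CORES_PER_CPU = 14
BASE_CLOCK_GHZ = 2.6
FLOPS_PER_CYCLE = 16  # double precision per core, assuming AVX2 with FMA


def peak_tflops(nodes, cpus_per_node, cores, clock_ghz, flops_per_cycle):
    """Theoretical peak = total cores x clock rate x FLOPs per cycle."""
    gflops = nodes * cpus_per_node * cores * clock_ghz * flops_per_cycle
    return gflops / 1000.0  # convert GFLOPS to TFLOPS


peak = peak_tflops(NODES, CPUS_PER_NODE, CORES_PER_CPU,
                   BASE_CLOCK_GHZ, FLOPS_PER_CYCLE)
print(f"Theoretical peak: {peak:.1f} TFLOPS")  # Theoretical peak: 475.2 TFLOPS
```

Under these assumptions the result matches the stated peak and clears the 475 TFLOPS requirement by a narrow margin.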
Huawei uses a number of advanced products and technologies in this solution, with remarkable results. For example, FusionServer X6800 high-density servers increase the computing capacity per cabinet by 70 percent and reduce the number of cabinets by 40 percent. A 4U X6800 chassis accommodates 8 compute nodes with 16 CPUs, whereas ordinary 1U two-socket servers need 8U of rack space for the same number of nodes. As a result, 6 rather than 10 cabinets are required to accommodate the 408 compute nodes. Thanks to its heat dissipation design and Dynamic Energy Management Technology (DEMT), the FusionServer X6800 consumes 10 percent to 20 percent less power than a traditional rack server. Other features are not listed exhaustively here.
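The density figures can be checked with simple arithmetic. The sketch below uses only numbers from the article; the variable names are illustrative. The cabinet counts (6 and 10) are taken as given rather than derived, since they also depend on per-cabinet power and cooling limits that the article does not break down.

```python
import math

# Figures stated in the article.
NODES = 408
NODES_PER_X6800_CHASSIS = 8
X6800_CHASSIS_HEIGHT_U = 4
RACK_SERVER_HEIGHT_U = 1   # ordinary 1U two-socket server, one node each
CABINETS_X6800 = 6
CABINETS_RACK_SERVERS = 10

# Rack space needed for all compute nodes in each design.
chassis_count = math.ceil(NODES / NODES_PER_X6800_CHASSIS)
x6800_space_u = chassis_count * X6800_CHASSIS_HEIGHT_U
rack_server_space_u = NODES * RACK_SERVER_HEIGHT_U

# Relative change in cabinet count and in nodes per cabinet.
cabinet_reduction = 1 - CABINETS_X6800 / CABINETS_RACK_SERVERS
density_gain = CABINETS_RACK_SERVERS / CABINETS_X6800 - 1

print(f"X6800 chassis needed: {chassis_count}")
print(f"Rack space: {x6800_space_u}U vs {rack_server_space_u}U")
print(f"Cabinet reduction: {cabinet_reduction:.0%}")
print(f"Per-cabinet density gain: {density_gain:.0%}")
```

The cabinet reduction comes out at 40 percent. The density gain comes out at about 67 percent, which is presumably what the article rounds to 70 percent.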
This solution is widely recognized and deployed thanks to its efficiency, performance, scalability, and ease of management. In actual use, it fully meets users’ requirements, with a measured computing power of 402 TFLOPS and a reported computing efficiency of up to 89.3 percent.
EPFL and Huawei announced that the Fidis HPC cluster, developed by EPFL SCITAS, was successfully rolled out in June 2017.
Vittoria Rezzonico, Executive Officer of EPFL SCITAS, spoke highly of Huawei in an interview. “Transtec has closely cooperated with Huawei to provide EPFL with a top-quality system, which meets our demands in the high-performance computing field. We are impressed with the excellent hardware solutions introduced by Huawei engineers and professional planning, installation, and configuration services from Transtec,” she said.