School of Computer Science Lecture Series, Elite Forum No. 3 | Empowering Azure Storage with RDMA - Wei Bai (白巍)

Date and Time:

2023/05/05 14:00

Location:

Room 1126, Science Building No. 1 (理科一号楼1126)

Speaker: Wei Bai (白巍), Senior Researcher, Microsoft Research Redmond

Host: Qun Huang (黄群)

Title: Empowering Azure Storage with RDMA

Given the wide adoption of disaggregated storage in public clouds, networking is the key to enabling high performance and high reliability in a cloud storage service. In Azure, we choose Remote Direct Memory Access (RDMA) as our transport and aim to enable it for both storage frontend traffic (between compute virtual machines and storage clusters) and backend traffic (within a storage cluster) to fully realize its benefits. As compute and storage clusters may be located in different datacenters within an Azure region, we need to support RDMA at regional scale.

In this talk, I will present our experience in deploying intra-region RDMA to support storage workloads in Azure. The high complexity and heterogeneity of our infrastructure bring a series of new challenges, such as the problem of interoperability between different types of RDMA network interface cards. We have made several changes to our network infrastructure to address these challenges. Today, around 70% of traffic in Azure is RDMA and intra-region RDMA is supported in all Azure public regions. RDMA helps us achieve significant disk I/O performance improvements and CPU core savings.


Speaker Bio

Wei Bai is a senior researcher in the Networking Research Group at Microsoft Research Redmond. He received his PhD degree in computer science from Hong Kong University of Science and Technology. Wei is broadly interested in computer networking, with a special focus on data center networking. His research has been published in many top conferences and journals, such as SIGCOMM, NSDI, and IEEE/ACM Transactions on Networking. Wei also has rich experience in developing and operating production cloud networks. Currently, he focuses mainly on high-performance networking for storage, general compute, and AI supercomputers.


