With more than 75 percent of our internet traffic set to use QUIC and HTTP/3 together, QUIC is steadily becoming the de facto protocol for internet communication at Meta. In Meta’s data center network, TCP remains the primary transport protocol, supporting thousands of services. As our network continues to expand, our engineers are continually looking for ways to make our data centers more efficient and reliable, and to bring better network performance to people using our family of apps. Solutions we’ve deployed in production via QUIC and TCP innovations have helped improve performance, congestion management, and platform extensibility across the entire breadth of our network (CDN, edge, backbone, WAN, and data center layers).
At the recently held Networking @Scale 2022 virtual conference, themed around transport innovation, engineers from Meta discussed the challenges faced in our network around efficiency, reliability, and deployment at scale.
Here is some of the latest work being done at Meta to enhance network performance at scale:
QUIC cache DSR
Matt Joras, Software Engineer, Meta
Yair Gottdenker, Production Engineer, Meta
Matt Joras and Yair Gottdenker present a unique solution that uses QUIC’s properties at the CDN layer to implement a form of direct server return (DSR) from the caching layer directly to the client. When serving cached content, this solution bypasses most intracluster communication in a typical CDN architecture and avoids streaming content through multiple hops, resulting in significant savings in CPU cycles and intracluster network bandwidth. Their talk covers the implementation details, performance improvements, and future applications.
Improving transfer times in the backbone network using QUIC Jump Start
Joseph Beshay, Research Scientist, Meta
Transfers over links with a high bandwidth-delay product (BDP) incur a startup delay while congestion control probes the bandwidth of the underlying link. The impact of this delay is inversely proportional to the size of the transfer: small transfers may repeatedly spend all their transfer time probing for the available bandwidth and never reach or utilize it. Joseph Beshay presents an application of QUIC in Meta’s backbone network. In this talk, he shows how the congestion control state can be cached in QUIC and how this state can be used to “jump-start” new connections, significantly reducing startup delays on high-BDP links.
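To make the startup cost concrete, here is a minimal sketch of the idea, not the implementation described in the talk. It assumes an idealized slow start in which the congestion window doubles every round trip until it reaches the BDP, and a hypothetical `CongestionStateCache` keyed by peer. The link speed, round-trip time, initial window, and transfer size are illustrative assumptions.

```python
class CongestionStateCache:
    """Hypothetical per-peer cache of the last known congestion window."""

    def __init__(self):
        self._cwnd_by_peer = {}

    def save(self, peer, cwnd_bytes):
        self._cwnd_by_peer[peer] = cwnd_bytes

    def initial_cwnd(self, peer, default_cwnd_bytes):
        # Fall back to the default window when nothing is cached for this peer.
        return self._cwnd_by_peer.get(peer, default_cwnd_bytes)


def rtts_to_send(transfer_bytes, initial_cwnd_bytes, bdp_bytes):
    """Round trips needed to send a transfer under idealized slow start.

    Assumption: the window doubles each round trip and is capped at the BDP;
    loss, pacing, and ACK timing are ignored.
    """
    cwnd = initial_cwnd_bytes
    sent = 0
    rtts = 0
    while sent < transfer_bytes:
        sent += min(cwnd, bdp_bytes)
        cwnd = min(cwnd * 2, bdp_bytes)
        rtts += 1
    return rtts


# Illustrative numbers (assumptions): 1 Gbps link, 100 ms RTT.
BDP_BYTES = 12_500_000          # 1e9 bits/s * 0.1 s / 8
DEFAULT_CWND = 10 * 1200        # 10 packets of 1200 bytes
TRANSFER = 1_000_000            # a 1 MB transfer

cache = CongestionStateCache()
cold = rtts_to_send(TRANSFER, cache.initial_cwnd("peer-a", DEFAULT_CWND), BDP_BYTES)

# A previous connection to the same peer stored the window it had reached.
cache.save("peer-a", BDP_BYTES)
warm = rtts_to_send(TRANSFER, cache.initial_cwnd("peer-a", DEFAULT_CWND), BDP_BYTES)
```

Under these assumed numbers the cold connection needs 7 round trips and the jump-started one needs 1, which is why small transfers benefit most. A real deployment would also have to decide when a cached window is too stale to trust.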
Tackling data center congestion and bursts
Abhishek Dhamija, Production Engineer, Meta
Balasubramanian Madhavan, Software Engineer, Meta
With Meta’s increasing user base, its data center (DC) network is growing fast. It is critical to ensure that the network delivers the highest levels of reliability and performance. Abhishek Dhamija and Balasubramanian Madhavan discuss two specific DC transport tuning initiatives that allow (a) handling sustained congestion in the network using DCTCP, which uses ECN-based congestion signals, and (b) tackling bursts in the network using receiver window tuning. The talk covers the motivation, implementation overview, handling the coexistence of multiple congestion control mechanisms in the DC using BPF-based enablement knobs, wins, and lessons learned for these initiatives.
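For readers unfamiliar with how DCTCP reacts to ECN marks, the sketch below follows the update rule published in RFC 8257. It is not Meta’s tuning. The function names, the gain `g = 1/16` (the RFC’s recommended default), and the minimum window are illustrative choices.

```python
def update_alpha(alpha, marked_bytes, acked_bytes, g=1 / 16):
    """Update DCTCP's running estimate of the fraction of ECN-marked bytes.

    Per RFC 8257: alpha = (1 - g) * alpha + g * F, where F is the fraction
    of bytes acknowledged in the last observation window that carried a mark.
    """
    fraction = marked_bytes / acked_bytes if acked_bytes else 0.0
    return (1 - g) * alpha + g * fraction


def reduce_cwnd(cwnd_bytes, alpha, min_cwnd_bytes):
    """Shrink the window in proportion to the measured congestion.

    alpha near 0 leaves the window almost unchanged; alpha == 1 halves it,
    which matches the response of classic loss-based TCP.
    """
    return max(int(cwnd_bytes * (1 - alpha / 2)), min_cwnd_bytes)
```

Because the reduction scales with the share of marked bytes rather than reacting to a single loss, light congestion trims the window slightly while sustained congestion approaches a full halving.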
NetEdit: Fine-grained network tuning at scale
Prashanth Kannan, Software Engineer, Meta
Prankur Gupta, Software Engineer, Meta
Large-scale network changes must be executed without compromising production traffic, making it essential for every change to be thoroughly developed, validated, and tested before deployment. Prashanth Kannan and Prankur Gupta share the design, implementation, and production experience of a highly extensible, stateless, and modular BPF-based network feature platform called NetEdit that was developed with monitoring and observability at its core, to effectively tune the network transport across millions of servers at Meta.
Network entitlement: From hose-based approval to host-based admission
Guanqing Yan, Software Engineer, Meta
Manikandan Somasundaram, Software Engineer, Meta
The wide area network (WAN) connects Meta’s many data center (DC) regions and hundreds of points of presence (POPs). WAN capacity is shared by several services with high network demand. The network must be built for peak demand and account for failure scenarios to reduce the impact on Meta products. However, building a resilient, overprovisioned network for all service peak demands at our current growth rates is practically infeasible due to fiber sourcing, deployment constraints, and the costs involved.
This talk by Guanqing Yan and Manikandan Somasundaram presents Meta’s production traffic classification and WAN entitlement solution currently used by Meta’s services to share the network safely and efficiently. The network entitlement framework aims to provide a simple, stable, operations-friendly network abstraction for sharing the backbone. The framework includes two key parts: (1) a hose-based entitlement granting system that establishes an agile contract while achieving network efficiency and meeting long-term SLO guarantees, and (2) a flexible large-scale distributed host-based traffic admission system that enforces the contract on the production traffic.
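One common way to enforce a rate contract at a host is a token bucket. The sketch below illustrates that general technique only, as an assumption about what host-based admission could look like, and does not describe Meta’s system. The class name `EntitlementBucket`, its parameters, and the explicit `now` timestamp (passed in to keep the example deterministic) are all hypothetical.

```python
class EntitlementBucket:
    """Hypothetical token bucket that admits traffic up to an entitled rate."""

    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s   # entitled long-term rate
        self.burst = burst_bytes       # maximum burst the contract tolerates
        self.tokens = burst_bytes      # start with a full bucket
        self.last = 0.0                # timestamp of the previous decision

    def admit(self, size_bytes, now):
        """Return True if a send of size_bytes fits within the entitlement."""
        # Refill tokens for the time elapsed, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        # Out of entitlement: a real system might drop, delay, or
        # reclassify this traffic. That policy is outside this sketch.
        return False
```

For example, a bucket created with `EntitlementBucket(1000, 1500)` admits a 1,500-byte send at time 0, rejects a second one half a second later, and admits it again once enough time has passed for the bucket to refill.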