dc.contributor.author: Tajbakhsh, Hesam
dc.date.accessioned: 2024-01-02T11:39:26Z
dc.date.available: 2024-01-02T11:39:26Z
dc.date.issued: 2023-12-26
dc.identifier.uri: http://hdl.handle.net/10222/83340
dc.description.abstract: The slowdown in CPU progress has prompted system designers to incorporate diverse programmable accelerators (e.g., graphics processing units (GPUs) and smart network interface cards (SmartNICs)) to compensate for insufficient computational capacity in various components of computer systems. While these programmable accelerators enhance computational capabilities, their architectures and capacities differ from those of standard CPUs. It is therefore essential to distribute computing tasks judiciously among servers and their accelerators to avoid performance degradation. Software-defined networking is a paradigm that enables network programmability for agile and efficient network management and operations. Programmable hardware (e.g., a switch) has recently become a promising alternative for making task-distribution decisions. A programmable switch can process packets in real time at line rate (Tbps), significantly faster than legacy server-based load balancers (LBs). Furthermore, such in-network load balancers reduce decision-making delay by eliminating the latency of sending packets from the switch to load-balancing servers. Several load balancers have been deployed in programmable switches, but none incorporates the capabilities of accelerators in its design. In this thesis, we propose the first in-network accelerator-aware load balancers for improving the performance of machine learning applications in data centers. The first load balancer, P4Mite, deploys agents in application processing servers and accelerators to measure their capacity and share these statuses with the switch. The switch uses this information and load balancing policies (e.g., weighted round robin) to dispatch load among servers and their accelerators. However, P4Mite supports a limited number of policies. Thus, we introduce P4Hauler, a load balancing framework that supports a wide range of policies. Within this framework, we propose configurable building blocks that operators can dynamically select to implement various policies on the fly, without rebooting the switch or interrupting its services. In addition to knowing the policies and the statuses of accelerators, an LB must be aware of traffic conditions, which makes operating the LB tedious. Thus, we propose P4Wise, a learning-based LB that selects the most suitable distribution policy automatically. We implement a prototype of the proposed load balancer and deploy it on a testbed consisting of a programmable switch (Intel Tofino), SmartNICs (Mellanox BlueField), and legacy servers to demonstrate deployment feasibility and efficiency over existing solutions. We then develop a realistic simulator to show the performance at scale. Specifically, P4Hauler can handle 27% more load than traditional LBs using only a single accelerator. With hundreds of servers and multiple accelerators, the performance improvement is proportional to the number of available accelerators. Finally, P4Wise consistently selects appropriate weights with an accuracy of at least 90%. Furthermore, it responds to changes in the environment by adapting the load balancing approach accordingly. [en_US]
dc.language.iso: en_US [en_US]
dc.subject: Accelerators [en_US]
dc.subject: Load Balancer [en_US]
dc.subject: P4 [en_US]
dc.subject: Software Defined Networking [en_US]
dc.subject: Programmable Data Plane [en_US]
dc.title: Programmable and Intelligent Accelerator-aware Load Balancers in Data Centers [en_US]
dc.type: Thesis [en_US]
dc.date.defence: 2023-12-08
dc.contributor.department: Faculty of Computer Science [en_US]
dc.contributor.degree: Doctor of Philosophy [en_US]
dc.contributor.external-examiner: Prof. Kui Wu [en_US]
dc.contributor.thesis-reader: Dr. Saurabh Dey [en_US]
dc.contributor.thesis-reader: Dr. Yujie Tang [en_US]
dc.contributor.thesis-supervisor: Dr. Israat Haque [en_US]
dc.contributor.ethics-approval: Received [en_US]
dc.contributor.manuscripts: Not Applicable [en_US]
dc.contributor.copyright-release: Not Applicable [en_US]
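
The abstract names weighted round robin as one policy for dispatching load among servers and their accelerators. As a rough illustration only, the sketch below shows that policy in plain Python using the "smooth" weighted round robin variant. It is not the thesis's P4 data-plane implementation: the `Target` class, the `pick` and `dispatch` functions, the target names, and the integer weights are all hypothetical, standing in for capacities that agents might report.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Target:
    """A dispatch target: a server CPU or an accelerator (hypothetical model)."""
    name: str
    weight: int       # assumed relative capacity, e.g. as reported by an agent
    current: int = 0  # running score used by smooth weighted round robin


def pick(targets: List[Target]) -> Target:
    """Select the next target using smooth weighted round robin."""
    total = sum(t.weight for t in targets)
    # Every target earns credit proportional to its weight.
    for t in targets:
        t.current += t.weight
    # The target with the most credit is chosen and pays back the total,
    # which spreads its turns evenly through the cycle.
    best = max(targets, key=lambda t: t.current)
    best.current -= total
    return best


def dispatch(targets: List[Target], n: int) -> List[str]:
    """Return the names of the targets chosen for n consecutive requests."""
    return [pick(targets).name for _ in range(n)]


if __name__ == "__main__":
    # Assumed example: the server CPU has three times the accelerator's capacity.
    targets = [Target("server-cpu", 3), Target("smartnic", 1)]
    print(dispatch(targets, 8))
```

Over any full cycle (here, four requests), each target receives a share of requests equal to its share of the total weight, so updating the weights from fresh capacity reports would shift the split accordingly.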