Mell: Memory-Efficient Large Language Model Serving via Multi-GPU KV Cache Management

Publication
IEEE International Conference on Computer Communications (INFOCOM) (CCF-A)
Qianli Liu
PhD, 2024-Now
Zicong Hong
PhD, 2020-2024, HKPFS
Song Guo
Chair Professor