ZHU LEI (祝磊)

I’m a researcher at Huawei Noah’s Ark Lab (Hong Kong). I received my PhD from CityUHK in July 2024 under the supervision of Prof. Rynson W.H. Lau. Before that, I received my bachelor’s degree in Computer Science and Engineering from Dalian University of Technology (rank 1/102) and spent one year as a PhD student at CUHK under the supervision of Eric Lo.

My research focuses on long-sequence modelling, efficient deep learning, and agentic AI.

News

Sep 19, 2025 One paper on MoE speculative decoding has been accepted to NeurIPS 2025 as a spotlight.
Sep 12, 2025 We have open-sourced our multi-agent deep research project, DeepDiverV2.
Aug 21, 2025 One paper on latency-aware test-time scaling has been accepted to EMNLP Findings.
Nov 14, 2024 Invited by UCSD Hao AI Lab to give a talk on “Efficient Attention Mechanisms”.
Oct 30, 2024 I’m pleased to have received a NeurIPS travel award and to have been selected as a Top Reviewer.

Selected Publications

  1. ACL
    RelayAttention for Efficient Large Language Model Serving with Long System Prompts
    Lei Zhu, Xinjiang Wang, Wayne Zhang, and 1 more author
    In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
  2. CVPR
    BiFormer: Vision Transformer with Bi-Level Routing Attention
    Lei Zhu, Xinjiang Wang, Zhanghan Ke, and 2 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023