Gen Li

Research Fellow @ MAE, NTU

On Westminster Bridge, London (2025)

I am a Postdoctoral Research Fellow in the School of Mechanical and Aerospace Engineering at Nanyang Technological University (NTU), working with Jianfei Yang at the MARS Lab. I completed my PhD in Robotics and Autonomous Systems at the University of Edinburgh, where I was supervised by Laura Sevilla-Lara and co-supervised by Timothy Hospedales. I was fortunate to be partially supported by Google DeepMind and Stability AI, where I collaborated with Deqing Sun and Varun Jampani.

🎯 My research aims to build intelligent physical agents that can perceive, reason, and act in real-world environments with human-like capability and high efficiency. This spans topics including:

  • Embodied AI: Robot Learning, VLA, RL
  • Multimodal AI: VLMs, MLLMs
  • Generative AI: Image / Video Generation, World Models
  • Efficient AI: Transfer Learning, Learning under Limited Data & Supervision
  • Human / Robot Interaction: Human-Robot Collaboration, Human-to-Robot Learning

📢 If you are interested in these topics and would like to explore working together, please feel free to reach out via email.

news

Mar 24, 2026 📖 Invited to serve as an Area Chair for NeurIPS 2026!
Feb 21, 2026 🎉 Evo-1 and PALM have been accepted to CVPR 2026!
Feb 11, 2026 📖 Invited to serve as an Area Chair for BMVC 2026!
Nov 08, 2025 🎉 Mask2IV has been accepted to AAAI 2026.
Sep 15, 2025 💂 Started my new position as a Postdoctoral Research Fellow at NTU!
Jun 26, 2025 🎉 Two papers accepted to ICCV 2025. See you in Hawaii!
May 28, 2025 📖 Invited to serve as a Reviewer for Nature Machine Intelligence!

selected publications

  1. CVPR’26
    Evo-1: Lightweight Vision-Language-Action Model with Preserved Semantic Alignment
    Tao Lin, Yilei Zhong, Yuxin Du, Jingjing Zhang, Jiting Liu, and 9 more authors
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2026
  2. AAAI’26
    Mask2IV: Interaction-Centric Video Generation via Mask Trajectories
    Gen Li, Bo Zhao, Jianfei Yang, and Laura Sevilla-Lara
    In AAAI Conference on Artificial Intelligence, 2026
  3. ICCV’25
    Learning Precise Affordances from Egocentric Videos for Robotic Manipulation
    Gen Li, Nikolaos Tsagkas, Jifei Song, Ruaridh Mon-Williams, Sethu Vijayakumar, and 2 more authors
    In IEEE/CVF International Conference on Computer Vision, 2025
  4. NMI
    Embodied Large Language Models Enable Robots to Complete Complex Tasks in Unpredictable Environments
    Ruaridh Mon-Williams†, Gen Li†, Ran Long, Wenqian Du, and Chris Lucas
    Nature Machine Intelligence, 2025
  5. CVPR’24
    One-Shot Open Affordance Learning with Foundation Models
    Gen Li, Deqing Sun, Laura Sevilla-Lara, and Varun Jampani
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
  6. CVPR’23
    LOCATE: Localize and Transfer Object Parts for Weakly Supervised Affordance Grounding
    Gen Li, Varun Jampani, Deqing Sun, and Laura Sevilla-Lara
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
  7. CVPR’21
    Adaptive Prototype Learning and Allocation for Few-Shot Segmentation
    Gen Li, Varun Jampani, Laura Sevilla-Lara, Deqing Sun, Jonghyun Kim, and 1 more author
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021