CVPR 2026
Learning Latent Proxies for Controllable Single-Image Relighting
Haoze Zheng, Zihao Wang, Xianfeng Wu, Yajing Bai, Yexin Liu, Yun Li, Xiaogang Xu,
Harry Yang
CVPR 2026
Group Editing: Edit Multiple Images in One Go
Yue Ma, Xinyu Wang, Qianli Ma, Qinghe Wang, Mingzhe Zheng, Xiangpeng Yang, Hao
Li, Chongbo Zhao, Jixuan Ying, Harry Yang, Hongyu Liu, Qifeng Chen
CVPR 2026 (Findings)
DenDiff: Density-Guided Diffusion for Quantity-Aware Image Synthesis
Bo Gao, Haoyu Liang, Harry Yang, Ser-Nam Lim
CVPR 2026 (Findings)
OpenSubject: Leveraging Video-Derived Identity and Diversity Priors for
Subject-driven Image Generation and Manipulation
Yexin Liu, Manyuan Zhang, Yueze Wang, Hongyu Li, Dian Zheng, Weiming Zhang,
Changsheng Lu, Xunliang Cai, Yan Feng, Peng Pei, Harry Yang
CVPR 2026 (Findings)
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis
Shunian Chen, Hejin Huang, Yexin Liu, Zihan Ye, Pengcheng Chen, Chenghao Zhu,
Michael Guan, Rongsheng Wang, Junying Chen, Jianye Hou, Bo Li, Guanbin Li,
Ser-Nam Lim, Harry Yang, Benyou Wang
ICLR 2026
AC-Foley: Reference-Audio-Guided Video-to-Audio Synthesis with Acoustic Transfer
Pengjun Fang, Yingqing He, Yazhou Xing, Qifeng Chen, Ser-Nam Lim, Harry Yang
ICLR 2026
EditAnyShape: Shape-Aware Image Editing via Trajectory-Guided Region Control
Zeqian Long, Mingzhe Zheng, Kunyu Feng, Xinhua Zhang, Hongyu Liu, Harry Yang,
Linfeng Zhang, Qifeng Chen, Yue Ma
ICLR 2026
Deforming Videos to Masks: Flow Matching for Referring Video Segmentation
Zanyi Wang, Dengyang Jiang, Liuzhuozheng Li, Sizhe Dang, Chengzu Li, Harry Yang,
Guang Dai, Mengmeng Wang, Jingdong Wang
arXiv 2026
RainFusion2.0: Temporal-Spatial Awareness and Hardware-Efficient Block-wise Sparse
Attention
Aiyue Chen, Yaofu Liu, Junjian Huang, Guang Lian, Yiwu Yao, Wangli Lan, Jing Lin,
Zhixin Ma, Tingting Zhou, Harry Yang
Technical Report
INT4 Quantization for FlashAttention
Yaofu Liu, Harry Yang
arXiv 2026
Thinking in Loops: Scaling Visual ARC with Looped Transformers
Wen-Jie Shu, Xuerui Qiu, Rui-Jie Zhu, Harold Haodong Chen, Yexin Liu, Harry Yang
Journal of Technology in Behavioral Science (Conditional Accept)
Reducing depressive symptoms through AI-guided narrative self-films: Results from a
randomized controlled trial
Elvin Yao, Harry Yang
arXiv 2025
AlignVid: Training-Free Attention Scaling for Semantic Fidelity in Text-Guided
Image-to-Video Generation
Yexin Liu, Wen-Jie Shu, Zile Huang, Haoze Zheng, Yueze Wang, Manyuan Zhang,
Ser-Nam Lim, Harry Yang
arXiv 2025
Distribution Matching Distillation Meets Reinforcement Learning
Dengyang Jiang, Dongyang Liu, Zanyi Wang, Qilong Wu, Liuzhuozheng Li,
Hengzhuang Li, Xin Jin, David Liu, Zhen Li, Bo Zhang, Mengmeng Wang, Steven Hoi,
Peng Gao, Harry Yang
AAAI 2026 (Poster)
Next Patch Prediction for AutoRegressive Visual Generation
Yatian Pang, Peng Jin, Shuo Yang, Bin Lin, Bin Zhu, Zhenyu Tang, Liuhan Chen,
Francis E. H. Tay, Ser-Nam Lim, Harry Yang, Li Yuan
COLM 2025
Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments
Yuzhe Yang, Yipeng Du, Ahmad Farhan, Claudio Angione, Yue Zhao, Harry Yang,
Fielding Johnston, James Buban, Patrick Colangelo
NeurIPS 2025
Hierarchical Fine-Grained Preference Optimization for Physically Plausible Video
Generation
Harold Haodong Chen, Haojian Huang, Qifeng Chen, Harry Yang, Ser-Nam Lim
NeurIPS 2025
When Semantics Mislead Vision: Mitigating Hallucinations in MLLMs
Yan Shu, Hangui Lin, Yexin Liu, Yan Zhang, Gangyan Zeng, Yan Li, Yu Zhou, Ser-Nam
Lim, Harry Yang, Nicu Sebe
NeurIPS 2025 NextVid Workshop (Oral)
VideoGen-of-Thought: Step-by-Step Generation of Multi-Shot Videos
Mingzhe Zheng, Yongqi Xu, Haojian Huang, Xuran Ma, Yexin Liu, Wenjie Shu, Yatian
Pang, Feilong Tang, Qifeng Chen, Harry Yang, Ser-Nam Lim
ICCV 2025
DreamDance: Animating Human Images by Enriching 3D Geometry Cues
Yatian Pang, Bin Zhu, Bin Lin, Mingzhe Zheng, Francis E. H. Tay, Ser-Nam Lim,
Harry Yang, Li Yuan
ICCV 2025
Model Reveals What to Cache: Profiling-Based Feature Reuse
Xuran Ma, Yexin Liu, Yaofu Liu, Xianfeng Wu, Mingzhe Zheng, Zihao Wang, Ser-Nam
Lim, Harry Yang
CVPR 2025
Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly
Yexin Liu, Zhengyang Liang, Yueze Wang, Xianfeng Wu, Feilong Tang, Muyang He,
Jian Li, Zheng Liu, Harry Yang, Ser-Nam Lim, Bo Zhao
ICLR 2025
Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations
Feilong Tang, Zile Huang, Chengzhi Liu, Qiang Sun, Harry Yang, Ser-Nam Lim
ICLR 2023
Make-A-Video: Text-to-Video Generation without Text-Video Data
Uriel Singer, Adam Polyak, Thomas Hayes, Xi Yin, Jie An, Songyang Zhang, Qiyuan
Hu, Harry Yang, Oron Ashual, Oran Gafni, Devi Parikh, Sonal Gupta, Yaniv Taigman
ECCV 2022
Long Video Generation with Time-Agnostic VQGAN and Time-Sensitive Transformer
Songwei Ge, Thomas Hayes, Harry Yang, Xi Yin, Guan Pang, David Jacobs, Jia-Bin
Huang, Devi Parikh
CVPR 2017
High-Resolution Image Inpainting using Multi-Scale Neural Patch Synthesis
Chao Yang, Xin Lu, Zhe Lin, Eli Shechtman, Oliver Wang, Hao Li