I'm a Ph.D. candidate in the School of Computing at the National University of Singapore. My advisor is Professor Leong Tze Yun.

My research interests include Reinforcement Learning, Reward Shaping, and Search Systems.

Publications

NeurIPS 2025

Centralized Reward Agent for Knowledge Sharing and Transfer in Multi-Task Reinforcement Learning
Haozhe Ma, Zhengding Luo, Thanh Vinh Vo, Kuankuan Sima, Tze-Yun Leong.
39th Annual Conference on Neural Information Processing Systems (NeurIPS), 2025

ICML 2025

Catching Two Birds with One Stone: Reward Shaping with Dual Random Networks for Balancing Exploration and Exploitation
Haozhe Ma, Fangling Li, Jing Yu Lim, Zhengding Luo, Thanh Vinh Vo, Tze-Yun Leong.
42nd International Conference on Machine Learning (ICML), 2025

ICLR 2025

Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning
Haozhe Ma, Zhengding Luo, Thanh Vinh Vo, Kuankuan Sima, Tze-Yun Leong.
13th International Conference on Learning Representations (ICLR), 2025

ICML 2024

Reward Shaping for Reinforcement Learning with An Assistant Reward Agent.
Haozhe Ma, Kuankuan Sima, Thanh Vinh Vo, Di Fu, Tze-Yun Leong.
41st International Conference on Machine Learning (ICML), 2024

Preprint

Exploration by Random Reward Perturbation
Haozhe Ma, Guoji Fu, Zhengding Luo, Jiele Wu, Tze-Yun Leong.

AAMAS 2024
Oral

Mixed-Initiative Bayesian Sub-Goal Optimization in Hierarchical Reinforcement Learning.
Haozhe Ma, Thanh Vinh Vo, Tze-Yun Leong.
23rd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024

NN 2024

GFANC-RL: Reinforcement Learning-based Generative Fixed-filter Active Noise Control.
Zhengding Luo*, Haozhe Ma*, Dongyuan Shi, Woon-Seng Gan.
*Equal contribution.
Neural Networks Journal, 2024. (Impact Factor: 6.0)

AAMAS 2023

Hierarchical Reinforcement Learning with Human-AI Collaborative Sub-Goals Optimization.
Haozhe Ma, Thanh Vinh Vo, Tze-Yun Leong.
22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2023

AAAI Sym. 2023

Human-AI Collaborative Sub-Goal Optimization in Hierarchical Reinforcement Learning.
Haozhe Ma, Thanh Vinh Vo, Tze-Yun Leong.
Inaugural AAAI Summer Symposium Series, 2023

MSSP 2025

Deep learning-based Generative Fixed-Filter Active Noise Control: Transferability and implementation.
Zhengding Luo, Junwei Ji, Boxiang Wang, Dongyuan Shi, Haozhe Ma, Woon-Seng Gan.
Mechanical Systems and Signal Processing Journal, 2025. (Impact Factor: 8.9)

CoLLAs 2024

Decoupled Prompt-Adapter Tuning for Continual Activity Recognition
Di Fu, Thanh Vinh Vo, Haozhe Ma, Tze-Yun Leong.
Conference on Lifelong Learning Agents (CoLLAs), 2024

Preprint

Causal Policy Learning in Reinforcement Learning: Backdoor-Adjusted Soft Actor-Critic
Thanh Vinh Vo, Young Lee, Haozhe Ma, Chien Lu, Tze-Yun Leong.

Preprint

JEDI: Latent End-to-end Diffusion Mitigates Agent-Human Performance Asymmetry in Model-Based Reinforcement Learning
Jing Yu Lim, Zarif Ikram, Samson Yu, Haozhe Ma, Tze-Yun Leong, Dianbo Liu.

Education

National University of Singapore (NUS)

School of Computing

National University of Singapore (NUS)

School of Computing

Xi'an Jiaotong University (XJTU)

Computer Science and Engineering

Awards

PhD Research Achievement Award of National University of Singapore (AY2024-2025).    

PhD Research Achievement Award of National University of Singapore (AY2023-2024).    

Research Incentive Award of the School of Computing, National University of Singapore (AY2022-2023).

Research Scholarship from the Ministry of Education in Singapore (2022-2026).

Scholarship of International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2024).

Academic Duties

Student Area Search Committee for faculty recruitment at the School of Computing, National University of Singapore.

Reviewer for conferences and journals: ICLR, ICML, NeurIPS, AAMAS, TSP, etc.

Teaching Assistant for the undergraduate course Foundations of Artificial Intelligence.

Teaching Assistant for the graduate course AI Planning and Decision Making.

Open-Source Projects

[ICML 2025] Dual Random Network Distillation (DuRND)

[ICLR 2025] Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning

[ICML 2024] Reward Shaping for Reinforcement Learning with an Assistant Reward Agent

Efficient Reinforcement Learning Algorithms and Environments in PyTorch

Flat Reinforcement Learning Algorithms on StarCraft II Mini-Games

Tutorial and Document: Using StarCraft II as Learning Environment

Automatic Text Recognition and Translation from Pasted Screenshots