I am an incoming PhD student in Computer Science with research interests in multimodal large language models, world models, and embodied intelligence. My work focuses on building AI systems that can understand visual environments, reason over spatial and temporal information, and support interaction in complex real-world settings.
A full publication list is available on my research page and on Google Scholar.