I am an Assistant Professor at The University of Tokyo, Japan. My research spans multimodal AI and data-centric AI. My recent work focuses on leveraging generative AI for education (e.g., LLM-code-detector), evaluation (LMM-as-a-judge), and applications in Japanese cultural contexts (JMMMU, MangaLMM). I previously worked on OCR, including multilingual text recognition and synthetic visual text generation (TRBA, COO, CLL-STR).

CV | Email | Google Scholar | LinkedIn | GitHub

Work experience

Education

Publications

(*: Equal contribution)

MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding

JMMMU-Pro: Image-based Japanese Multi-discipline Multimodal Understanding Benchmark via Vibe Benchmark Construction

MaskingAgent: Preventing LLM Tutor from Providing Full Solutions in Python Programming Courses

LLM-Based Explainable Detection of LLM-Generated Code in Python Programming Courses

Exploring LMM-as-a-Judge for Image Harmonization Evaluation

Enhancing Safety Judgment on LLM Responses via Text-to-Image Generation

Toward Responsible Federated Large Language Models: Leveraging a Safety Filter and Constitutional AI

Harnessing PDF Data for Improving Japanese Large Multimodal Models

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation
