
I am an Assistant Professor at the University of Tokyo, Japan. I am interested in computer vision, multimedia processing, and data-centric AI. I have worked on OCR problems such as multilingual text recognition and synthetic visual text generation. Currently, I am focusing on large language models (LLMs) and large multimodal models (LMMs).
CV | email | Google Scholar | LinkedIn | GitHub
Work experience
Education
Publications
(*: Equal contribution)

MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding
- Jeonghun Baek*, Kazuki Egashira*, Shota Onohara*, Atsuyuki Miyai*, Yuki Imajuku, Hikaru Ikuta, Kiyoharu Aizawa
- arXiv preprint 2025 [Paper] [Code]
- Extended abstract version at the International Conference on Computer Vision (ICCV) COMIQ Workshop (oral), 2025

Exploring LMM-as-a-Judge for Image Harmonization Evaluation
- Jeonghun Baek*, Eunchung Noh*
- International Conference on Computer Vision (ICCV) UniLight Workshop, 2025

Enhancing Safety Judgment on LLM Responses via Text-to-Image Generation
- Eunchung Noh*, Jeonghun Baek*
- International Conference on Computer Vision (ICCV) WiCV Workshop, 2025

Toward Responsible Federated Large Language Models: Leveraging a Safety Filter and Constitutional AI
- Eunchung Noh*, Jeonghun Baek*
- arXiv preprint 2025
[Paper]

Harnessing PDF Data for Improving Japanese Large Multimodal Models
- Jeonghun Baek, Akiko Aizawa, Kiyoharu Aizawa
- Association for Computational Linguistics (ACL), Findings, 2025
[Paper] [Code (placeholder)]

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation
- Shota Onohara*, Atsuyuki Miyai*, Yuki Imajuku*, Kazuki Egashira*, Jeonghun Baek*, Xiang Yue, Graham Neubig, Kiyoharu Aizawa
- Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL), 2025, and Neural Information Processing Systems (NeurIPS) EvalEval Workshop (oral), 2024
[Project page]

Leveraging LLM for Detecting and Explaining LLM-generated Code in Python Programming Courses
- Jeonghun Baek, Tetsuro Yamazaki, Akimasa Morihata, Junichiro Mori, Yoko Yamakata, Kenjiro Taura, Shigeru Chiba
- ACM Special Interest Group on Computer Science Education (SIGCSE) Technical Symposium, poster, 2025
[Paper]

Cross-Lingual Learning in Multilingual Scene Text Recognition
- Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa
- International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024
[Paper] [Code]

Character Image Combination for Multilingual Scene Text Recognition: Can We Make High-Performance Synthetic Data Without Fonts?
- Jeonghun Baek, Eunchung Noh, Yusuke Matsui, Kiyoharu Aizawa
- International Conference on Computer Vision (ICCV) Workshop Towards the Next Generation of Computer Vision Datasets (TNGCV) and Doctoral Consortium (ICCVDC), 2023
