
I am an Assistant Professor at the University of Tokyo, Japan. I am interested in computer vision, multimedia processing, and data-centric AI. I have worked on OCR tasks such as multilingual text recognition and synthetic visual text generation. Currently, I am focusing on large language models (LLMs) and large multimodal models (LMMs).

CV | Email | Google Scholar | LinkedIn | GitHub

Work experience

Assistant Professor

The University of Tokyo, Japan, Apr. 2024 - Present

I work at the Mathematics and Informatics Center, Graduate School of Information Science and Technology. I am also a member of the Yamasaki Lab.

Part-time Researcher

Mantra Inc., Japan, Jun. 2023 - Mar. 2024

I worked with Ryota Hinami on recognizing onomatopoeia in Japanese comics using LMMs, for comic translation (about 8 hours per week).

Postdoc

The University of Tokyo, Japan, Apr. 2023 - Mar. 2024

I worked on several projects related to text recognition. I obtained funding from the Japanese government.

Student Researcher

Google Research, Oct. 2022 - Jan. 2023

As a student researcher, I worked on the Google OCR team (16 hours per week, from the Google Japan office or from home). I surveyed the TextVQA task (visual question answering with text recognition) and implemented some of the baselines with Yasuhisa Fujii.

Research Engineer (alternative military service)

Clova AI Research, NAVER Corp., South Korea, Jan. 2018 - Mar. 2020

I developed scene text recognition (STR) models, which recognize text in natural scenes.

Research Engineer (alternative military service)

Language Analytics, NCSOFT Corp., South Korea, Apr. 2016 - Dec. 2017

I developed sentence embedding models for question/document clustering and text style transfer models for colloquial text generation.