I am an Assistant Professor at the University of Tokyo, Japan. I am interested in computer vision, multimedia processing, and data-centric AI. I have worked on OCR tasks such as multilingual text recognition and synthetic visual text generation. Currently, I am learning about and focusing on large language models (LLMs) and large multimodal models (LMMs).
CV | email | Google Scholar | LinkedIn | GitHub
The University of Tokyo, Japan, Apr. 2024 - Present
I work at the Mathematics and Informatics Center, Graduate School of Information Science and Technology. Note: In Japan, 助教 is generally translated as Assistant Professor. However, the official website of the University of Tokyo uses the title Research Associate. My employment certificate states Assistant Professor.
Mantra Inc., Japan, Jun. 2023 - Mar. 2024
I worked with Ryota Hinami on recognizing onomatopoeic text in Japanese comics for comic translation, using LMMs (about 8 hours per week).
The University of Tokyo, Japan, Apr. 2023 - Mar. 2024
I worked on several projects related to text recognition and obtained funding from the Japanese government.
Google Research, Oct. 2022 - Jan. 2023
As a student researcher, I worked on the Google OCR team (16 hours per week, from the Google Japan office or from home). I surveyed the TextVQA task (Visual Question Answering with text recognition) and implemented some of the baselines with Yasuhisa Fujii.
Clova AI Research, NAVER Corp., South Korea, Jan. 2018 - Mar. 2020
Developed a scene text recognition (STR) model, which recognizes text in natural scenes.
Language Analytics, NCSOFT Corp., South Korea, Apr. 2016 - Dec. 2017
Developed a sentence embedding model for question/document clustering and a text style transfer model for colloquial text generation.