I am a postdoc at the Aizawa-Yamakata-Matsui Lab. I am interested in computer vision, multimedia processing, and data-centric AI. I have worked on OCR tasks such as multilingual text recognition and synthetic visual text generation. Currently, I am focusing on large language models (LLMs) and large multimodal models (LMMs).

CV | Email | Google Scholar | LinkedIn | GitHub

Work experience

Part-time Researcher

Mantra Inc., Japan, Jun. 2023 - Present

I am working with Ryota Hinami on recognizing onomatopoeic text in Japanese comics using LMMs, for comic translation (about 8 hours per week).

Postdoc

The University of Tokyo, Japan, Apr. 2023 - Present

I am working on several projects related to text recognition. I obtained funding from the Japanese government.

Student Researcher

Google Research, Oct. 2022 - Jan. 2023

As a student researcher, I worked in the Google OCR team (16 hours per week, from the Google Japan office or home). I surveyed the TextVQA task (Visual Question Answering with text recognition) and implemented part of the baselines with Yasuhisa Fujii.

Researcher (alternative military service)

Clova AI Research, NAVER Corp., South Korea, Jan. 2018 - Mar. 2020

Developed a scene text recognition (STR) model, which recognizes text in natural scenes.

Researcher (alternative military service)

Language Analytics, NCSOFT Corp., South Korea, Apr. 2016 - Dec. 2017

Developed a sentence embedding model for question/document clustering and a text style transfer model for colloquial text generation.

Education

Ph.D. in Information Science and Technology
