I am a principal research engineer at KT AI2XL, South Korea.
My research lies at the intersection of natural language processing (NLP), computer vision (CV), and machine learning (ML), with a focus on representation learning and generative modeling.
I earned my Ph.D. degree in Computer Science and Engineering from Seoul National University under the supervision of Prof. Byoung-Tak Zhang.
Research interests: Multimodal learning, Knowledge-enhanced reasoning
Email: yj.heo@kt.com | yjheo@snu.ac.kr | yjheo.ai@gmail.com
BI-MDRG: Bridging Image History in Multimodal Dialogue Response Generation
Hee Suk Yoon*, Eunseop Yoon*, Joshua Tian Jin Tee*, Kang Zhang, Yu-Jung Heo, Du-Seong Chang, and Chang D. Yoo ECCV 2024
Translation Deserves Better: Analyzing Translation Artifacts in Cross-lingual Visual Question Answering
ChaeHun Park*, Koanho Lee*, Hyesu Lim, Jaeseok Kim, Junmo Park, Yu-Jung Heo, Du-Seong Chang, and Jaegul Choo ACL 2024 Findings [pdf]
CogME: A Cognition-Inspired Multi-Dimensional Evaluation Metric for Story Understanding
Minjung Shin, Seongho Choi, Yu-Jung Heo, Minsu Lee, Byoung-Tak Zhang, and Jeh-Kwang Ryu COGSCI 2024 [pdf]
Structure-aware Multimodal Sequential Learning for Visual Dialog
Young-Jin Kim*, Min-Jun Kim*, Kyunghwan An, Jinwoo Ahn, Jaeseok Kim, Yu-Jung Heo, Du-Seong Chang, and Eun-Sol Kim AAAI 2024 (acceptance ratio: 2342/9862~23.75%) [pdf]
Video Turing Test: A First Step Towards Human-Level AI
Minsu Lee*, Yu-Jung Heo*, Seongho Choi, Woo Suk Choi and Byoung-Tak Zhang AI Magazine, Volume 44, Issue 4 (Winter 2023) [pdf]
Hypergraph Transformer: Weakly-supervised Multi-hop Reasoning for Knowledge-based Visual Question Answering
Yu-Jung Heo, Eun-Sol Kim, Woo Suk Choi and Byoung-Tak Zhang ACL 2022 (acceptance ratio: 701/3378~20.75%) [pdf] [code]
Toward a Human-Level Video Understanding Intelligence
Yu-Jung Heo*, Minsu Lee*, Seongho Choi, Woo Suk Choi, Minjung Shin, Minjoon Jung, Jeh-Kwang Ryu and Byoung-Tak Zhang AAAI 2021 Fall Symposium Series on Artificial Intelligence for Human-Robot Interaction [pdf]
DramaQA: Character-Centered Video Story Understanding with Hierarchical QA
Seongho Choi, Kyoung-Woon On, Yu-Jung Heo, Ahjeong Seo, Youwon Jang, Minsu Lee and Byoung-Tak Zhang AAAI 2021 (acceptance ratio: 1692/7911~21.39%) [pdf] [dataset] [code]
Hypergraph Attention Networks for Multimodal Learning
Eun-Sol Kim*, Woo-Young Kang*, Kyoung-Woon On, Yu-Jung Heo and Byoung-Tak Zhang CVPR 2020 (acceptance ratio: 1470/6656~22.09%) [pdf]
†We ranked 1/51 in the 1st GQA Challenge at the Visual Question Answering and Dialog Workshop, CVPR 2019.
Cut-Based Graph Learning Networks to Discover Compositional Structure of Sequential Video Data
Kyoung-Woon On, Eun-Sol Kim, Yu-Jung Heo and Byoung-Tak Zhang AAAI 2020 Oral (acceptance ratio: 454/7737~5.87%) [pdf]
†A preliminary version of this paper was presented at the ICML 2019 Workshop on Learning and Reasoning with Graph-Structured Representations [pdf] and the AAAI 2019 Workshop on Network Interpretability for Deep Learning [pdf].
Constructing Hierarchical Q&A Datasets for Video Story Understanding
Yu-Jung Heo, Kyoung-Woon On, Seongho Choi, Jaeseo Lim, Jinah Kim, Jeh-Kwang Ryu, Byung-Chull Bae and Byoung-Tak Zhang AAAI 2019 Spring Symposium Series on Story-Enabled Intelligence [pdf]
Answerer in Questioner's Mind: Information Theoretic Approach to Goal-Oriented Visual Dialog
Sang-Woo Lee, Yu-Jung Heo and Byoung-Tak Zhang NeurIPS 2018 Spotlight (acceptance ratio: 198/4856~4.07%) [pdf]
†A preliminary version of this paper was presented at the NeurIPS 2017 Workshop on Visually-Grounded Interaction and Language.
Instruction-tuned Self-Questioning Framework for Multimodal Reasoning
You-Won Jang, Yu-Jung Heo, Jaeseok Kim, Minsu Lee, Du-Seong Chang, and Byoung-Tak Zhang ICCV 2023 Workshop on Closing the Loop between Vision and Language (CLVL)
Scene Graph Parsing via Abstract Meaning Representation in Pre-trained Language Models
Woo Suk Choi, Yu-Jung Heo, Dharani Punitan and Byoung-Tak Zhang NAACL 2022 Workshop on Deep Learning on Graphs for Natural Language Processing (DLG4NLP) [pdf]
Toward General Scene Graph: Integration of Visual Semantic Knowledge with Entity Synset Alignment
Woo Suk Choi, Kyoung-Woon On, Yu-Jung Heo and Byoung-Tak Zhang ACL 2020 Workshop on Advances in Language and Vision Research (ALVR) [pdf] [code]
Temporal Attention Mechanism with Conditional Inference for Large-scale Multi-Label Video Classification
Eun-Sol Kim, Kyoung-Woon On, Jongseok Kim, Yu-Jung Heo, Seoungho Choi, Hyun-Dong Lee and Byoung-Tak Zhang ECCV 2018 Workshop on the 2nd YouTube-8M Large-Scale Video Understanding [pdf][slide]
†We ranked 5/312 (~1.6%, in-the-money) in the 2nd YouTube-8M Video Understanding Challenge at ECCV 2018.
Attention Memory for Locating an Object through Visual Dialogue
Cheolho Han*, Yu-Jung Heo*, Woo-Young Kang, Jae-Hyun Jun and Byoung-Tak Zhang CVPR 2017 Workshop on VQA Challenge [pdf]
Criteria for Human-Compatible AI in Two-Player Vision-Language Tasks
Cheolho Han*, Sang-Woo Lee*, Yu-Jung Heo, Woo-Young Kang, Jae-Hyun Jun and Byoung-Tak Zhang IJCAI 2017 Workshop on Linguistic and Cognitive Approaches to Dialog Agents (LaCATODA) [pdf]
Domestic Journal
Scene Graph Generation Framework using Image Region Description
Woo Suk Choi, Yu-Jung Heo, and Byoung-Tak Zhang KIISE Transactions on Computer Practices, Vol. 29, No. 12, Dec, 2023
Efficient Compositional Translation Embedding for Visual Relationship Detection
Yu-Jung Heo, Eun-Sol Kim, Woo Suk Choi, Kyoung-Woon On and Byoung-Tak Zhang Journal of KIISE, Vol. 49, No. 7, Jul, 2022[pdf]
DramaQA: Character-Centered Video Story Understanding with Hierarchical QA
Seongho Choi, Kyoung-Woon On, Yu-Jung Heo, Youwon Jang, Ahjeong Seo, Seungchan Lee, Minsu Lee and Byoung-Tak Zhang KIISE Transactions on Computer Practices, Vol. 27, No. 1, Jan, 2021[pdf]
Analyzing and Solving GuessWhat?!
Sang-Woo Lee, Cheolho Han, Yu-Jung Heo, Woo-Young Kang, Jae-Hyun Jun and Byoung-Tak Zhang Journal of KIISE, Vol. 45, No. 1, Jan, 2018[pdf]
Robust Scheduling based on Daily Activity Learning by using Markov Decision Process and Inverse Reinforcement Learning
Sang-Woo Lee, Dong-Hyun Kwak, Kyoung-Woon On, Yu-Jung Heo, Woo-Young Kang, Ceyda Cinarel and Byoung-Tak Zhang KIISE Transactions on Computer Practices, Vol. 23, No. 10, Oct, 2017[pdf]
Regional Projection Histogram Matching and Linear Regression based Video Stabilization for a Moving Vehicle
Yu-Jung Heo, Min-Kook Choi, Hyun-Gyu Lee and Sang-Chul Lee Journal of Broadcast Engineering, Vol. 19, No. 6, Nov, 2014[pdf]
Domestic Conference
Scene Graph Generation Model utilizing Image Region Descriptions
Woo Suk Choi, Yu-Jung Heo and Byoung-Tak Zhang Proc. Korea Computer Congress 2023 (KCC 2023)
✨ Best presentation award
Event Detection based on Predictive Uncertainty of User World Models
Yu-Jung Heo, Kibeom Kim, HoJoon Song, Hyejung Yoon and Byoung-Tak Zhang Proc. Korea Computer Congress 2022 (KCC 2022)
✨ Award for 7 top-performing teams (announced at the ETRI human understanding AI challenge: Learning and Reasoning lifelog)
Video Story Understanding with Multi-level Character Attention Model
Seongho Choi, Kyoung-Woon On, Yu-Jung Heo, Ahjeong Seo, Youwon Jang, Minsu Lee and Byoung-Tak Zhang Proc. Korea Computer Congress 2021 (KCC 2021)[pdf]
✨ Best paper award
Future State Generation for Action Prediction in Cross Domain
Hyunseo Kim, Yu-Jung Heo, Kibeom Kim and Byoung-Tak Zhang Proc. Korea Computer Congress 2021 (KCC 2021)[pdf]
Knowledge-aware Visual Question Answering with Structural Attention Model
Yu-Jung Heo, Eun-Sol Kim, Woo Suk Choi and Byoung-Tak Zhang Proc. Korea Computer Congress 2020 (KCC 2020)[pdf]
A study on Scene Graph Unification of Visual Semantic Knowledge using synonym
Woo Suk Choi, Kyoung-Woon On, Yu-Jung Heo and Byoung-Tak Zhang Proc. Korea Computer Congress 2020 (KCC 2020)[pdf]
A study on analysis of human and machine visual attention map for Visual Question Answering
Hyuk-Gi Lee, Yu-Jung Heo and Byoung-Tak Zhang Proc. Korea Computer Congress 2020 (KCC 2020)[pdf]
DramaQA: Human Level Video Story Understanding through Multilevel Question-Answering
Seongho Choi, Kyoung-Woon On, Yu-Jung Heo, Gi-Cheon Kang and Byoung-Tak Zhang Proc. Korea Software Congress 2019 (KSC 2019)[pdf]
✨ Best presentation award
Compositional Structure Learning for Sequential Video Data
Kyoung-Woon On, Eun-Sol Kim, Yu-Jung Heo, and Byoung-Tak Zhang Proc. Korea Computer Congress 2019 (KCC 2019)[pdf]
✨ Best paper award
A Study on Object Detection Technology for an Improved Visual Relationship Detection
Hyunji Choi, Yu-Jung Heo and Byoung-Tak Zhang Proc. Korea Computer Congress 2019 (KCC 2019)[pdf]
✨ Best paper award
Analysis of Learning Strategy in AQM Framework for Goal-Oriented Visual Dialogue
Yu-Jung Heo, Sang-Woo Lee and Byoung-Tak Zhang Proc. Korea Computer Congress 2018 (KCC 2018)[pdf]
Comparison of Generative Classification Model and Discriminative Classification Model for AQM Framework
Yu-Jung Heo, Sang-Woo Lee and Byoung-Tak Zhang Proc. Korea Software Congress 2017 (KSC 2017)[pdf]
Structural Knowledge Representation Learning for Content-based Question Answering
Yu-Jung Heo, Kyoung-Woon On, Eun-Sol Kim and Byoung-Tak Zhang Proc. Korea Computer Congress 2017 (KCC 2017)[pdf]
✨ Best presentation award
Analyzing and Solving GuessWhat?!
Sang-Woo Lee, Cheolho Han, Yu-Jung Heo, Woo-Young Kang, Jae-Hyun Jun and Byoung-Tak Zhang Proc. Korea Computer Congress 2017 (KCC 2017)[pdf]
✨ Best paper award
Goal-oriented Question Generator model using Attention for GuessWhat?!
Jae-Hyun Jun, Woo-Young Kang, Cheolho Han, Yu-Jung Heo and Byoung-Tak Zhang Proc. Korea Computer Congress 2017 (KCC 2017)[pdf]
✨ Best presentation award
Adaptive Question Answering System for Personalized Language Education
Yu-Jung Heo, Eun-Sol Kim, Kyoung-Woon On and Byoung-Tak Zhang Proc. Korean Institute of Intelligence Systems Spring Conference 2017 (KIIS 2017)[pdf]
Multimodal Story Learning with Dynamic Memory Construction
Yu-Jung Heo, Eun-Sol Kim, Kyoung-Woon On and Byoung-Tak Zhang Proc. Korea Software Congress 2016 (KSC 2016)[pdf]
Robust Scheduling based on Daily Activity Learning by using Markov Decision Process and Inverse Reinforcement Learning
Sang-Woo Lee, Dong-Hyun Kwak, Kyoung-Woon On, Yu-Jung Heo, Woo-Young Kang, Ceyda Cinarel and Byoung-Tak Zhang Proc. Korea Software Congress 2016 (KSC 2016)[pdf]
✨ Best presentation award
Dual Deep Memories for Video Question Answering
Kyung-Min Kim, Changjun Nan, Jung-Woo Ha, Yu-Jung Heo and Byoung-Tak Zhang Proc. Korea Software Congress 2016 (KSC 2016)[pdf]
✨ Best presentation award
Pororobot: A Deep Learning Robot that Plays Video Q&A Games
Yu-Jung Heo, Kyung-Min Kim, and Byoung-Tak Zhang Proc. Korea Software Congress 2015 (KSC 2015)[pdf]
✨ Best paper award
Automated Visualization Methodology for Surface of Driving Road by Extracting Motion Parameters of Road Images
Yu-Jung Heo, Bo-Gyu Park, Hyun-Gyu Lee, Min-Kook Choi and Sang-Chul Lee Workshop on Image Processing and Image Understanding 2015 (IPIU 2015)[pdf]
Classification of Driving Events using Multi-sensor and Visualization of Driving Information
Bo-Gyu Park, Yu-Jung Heo, Hyun-Gyu Lee, Min-Kook Choi and Sang-Chul Lee Workshop on Image Processing and Image Understanding 2015 (IPIU 2015)[pdf]
Regional Projection Histogram Matching and Linear Regression based Video Stabilization for a Moving Vehicle
Yu-Jung Heo, Min-Kook Choi, Hyun-Gyu Lee, and Sang-Chul Lee Proc. Korean Institute of Broadcast and Media Engineers summer conference 2014 (KIBME 2014)[pdf]
✨ Best paper award