Yuekun Yao

About me

Hi! I am Yuekun Yao, a Ph.D. student in the Department of Language Science and Technology at Saarland University, working with Prof. Alexander Koller. I am part of the Computational Linguistics Group. Previously, I received my MSc degree in artificial intelligence from the University of Edinburgh. Before that, I did my BS in computer science at East China Normal University.

The main research question I am interested in is: How do NLP models generalize to unfamiliar data, and how can we improve their generalization? I investigate out-of-distribution generalization, with a focus on compositional generalization, to bridge the gap between training and test distributions in realistic applications. I am also interested in the trustworthiness of NLP models, in particular in detecting their generalization errors when they are deployed in real-world settings.

My work aims both to understand model behaviours and to develop more effective and reliable NLP models by addressing the following research questions:

  • Can NLP models perform human-like generalization, and why or why not? [1] Does this also apply to large language models? [2]
  • How can we improve compositional generalization with general-purpose (seq2seq) models? [3]
  • How can we build trustworthy models that generalize reliably? Can we train one model (a discriminator) to judge the outputs of another model (a parser)? [4]

Publications [Google Scholar][Semantic Scholar]

2024

Predicting generalization performance with correctness discriminators [paper]

Yuekun Yao, Alexander Koller

EMNLP 2024 Findings


Simple and effective data augmentation for compositional generalization [paper][code]

Yuekun Yao, Alexander Koller

NAACL 2024


2023

SLOG: A Structural Generalization Benchmark for Semantic Parsing [paper] [code]

Bingzhi Li, Lucia Donatelli, Alexander Koller, Tal Linzen, Yuekun Yao, Najoung Kim

EMNLP 2023


2022

Structural generalization is hard for sequence-to-sequence models [paper][code][data]

Yuekun Yao, Alexander Koller

EMNLP 2022


2020

Dynamic masking for improved stability in online spoken language translation [paper]

Yuekun Yao, Barry Haddow

AMTA 2020


ELITR non-native speech translation at IWSLT 2020 [paper]

Dominik Macháček, Jonáš Kratochvíl, Sangeet Sagar, Matúš Žilinec, Ondřej Bojar, Thai-Son Nguyen, Felix Schneider, Philip Williams, Yuekun Yao

IWSLT 2020

Contact me

ykyao [dot] cs [at] gmail [dot] com