About me
Hi! I am Yuekun Yao, a Ph.D. student in the Department of Language Science and Technology at Saarland University, working with Prof. Alexander Koller. I am part of the Computational Linguistics Group. Previously, I received my MSc degree in Artificial Intelligence from the University of Edinburgh. Before that, I completed my BS in Computer Science at East China Normal University.
The main research question I am interested in is: How do NLP models generalize to unfamiliar data, and how can we improve this ability? I investigate out-of-distribution generalization, with a focus on compositional generalization, to bridge the gap between training and test distributions in realistic applications. I am also interested in the trustworthiness of NLP models, i.e., detecting their generalization errors when they are deployed in real-world settings.
My work aims both to understand model behaviours and to develop more effective and reliable NLP models, guided by the following research questions.
- Can NLP models perform human-like generalization, and why? [1] Does this also hold for large language models? [2]
- How can we improve the compositional generalization ability of general-purpose (seq2seq) models? [3]
- How can we build trustworthy models that generalize reliably? Can we train one model (a discriminator) to judge the outputs of another model (a parser)? [4]
Publications [Google Scholar][Semantic Scholar]
2024
Predicting generalization performance with correctness discriminators [paper]
Yuekun Yao, Alexander Koller
EMNLP 2024 Findings
Simple and effective data augmentation for compositional generalization [paper][code]
Yuekun Yao, Alexander Koller
NAACL 2024
2023
SLOG: A Structural Generalization Benchmark for Semantic Parsing [paper][code]
Bingzhi Li, Lucia Donatelli, Alexander Koller, Tal Linzen, Yuekun Yao, Najoung Kim
EMNLP 2023
2022
Structural generalization is hard for sequence-to-sequence models [paper][code][data]
Yuekun Yao, Alexander Koller
EMNLP 2022
2020
Dynamic masking for improved stability in online spoken language translation [paper]
Yuekun Yao, Barry Haddow
AMTA 2020
ELITR non-native speech translation at IWSLT 2020 [paper]
Dominik Macháček, Jonáš Kratochvíl, Sangeet Sagar, Matúš Žilinec, Ondřej Bojar, Thai-Son Nguyen, Felix Schneider, Philip Williams, Yuekun Yao
IWSLT 2020
Contact me
ykyao [dot] cs [at] gmail [dot] com