NLP
2024
Chunpeng Ma and Takuya Makino
Detecting out-of-scope (OOS) utterances is crucial in task-oriented dialogue systems, but obtaining enough annotated OOS dialogues to train a binary classifier directly is difficult in practice. Existing data augmentation methods generate OOS dialogues automatically, but their performance usually depends on an external corpus. Herein we propose SILVER, a self data augmentation method that does not use external data. It improves the accuracy of OOS detection (false positive rate: 90.5% → 47.4%). Furthermore, SILVER successfully generates high-quality in-domain (IND) OOS dialogues in terms of naturalness (percentage: 8% → 68%) and OOS correctness (percentage: 74% → 88%), as evaluated by human workers.