Understanding the intent of an utterance in a dialog often requires reading both the utterance and its context. We propose a task, Self-Contained Utterance Description (SCUD), which describes the intent of an utterance in a dialog using multiple simple natural-language sentences that can be understood without the context. If this task can be performed with high accuracy concurrently with an ongoing conversation, such as an accommodation search dialog, the operator can easily suggest candidates to the customer by inputting SCUDs of the customer’s utterances into the accommodation search system. SCUDs can also describe, from the dialog log, how customer requests change over the course of a dialog. We construct a Japanese corpus to train and evaluate automatic SCUD generation. The corpus consists of 210 dialogs containing 10,814 sentences. We conduct an experiment to verify that SCUDs can be generated automatically.
Additionally, using 8,200 additional examples, we investigate how the amount of training data affects automatic generation performance.
https://github.com/megagonlabs/asdc