Zero-shot triplet extraction is the task of extracting relation triplets of the form <head entity, relation label, tail entity> from sentences when the training data contains no examples of the relation labels seen at test time. Existing approaches typically handle this data scarcity by learning to generate synthetic data for the unseen relations. However, the synthetic data is often of poor quality, and the model still requires additional fine-tuning on the unseen relation labels.
In this project, we hypothesize that relation triplet extraction can be reformulated so that it aligns with the pre-training objective of large pre-trained language models. This allows the model to leverage knowledge acquired during pre-training and to generalize better to unseen relations. We formulate relation triplet extraction as a template infilling task: we concatenate the input text with a relation template and train the model to predict the masked head and tail entities in the template.
Example:
Input Text: His brother Byron LaBeach, also a sprinter, competed in the 1952 Summer Olympics representing Jamaica + Relation: <X> is a participant in <Y>
↓
T5
↓
Output: <X> Byron LaBeach <Y> 1952 Summer Olympics <Z>
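The example above can be sketched in a few lines of Python. This is only an illustrative sketch, not the project's actual code: the helper names (`build_input`, `build_target`, `parse_output`) and the literal " Relation: " separator are assumptions made for this example, and the placeholders <X>, <Y>, <Z> are kept as plain strings exactly as written above. With the Hugging Face T5 tokenizer one would presumably map them to its sentinel tokens (`<extra_id_0>`, `<extra_id_1>`, ...), which is also an assumption.

```python
import re
from typing import Optional, Tuple

# Placeholder markers as written in the example above.
HEAD, TAIL, END = "<X>", "<Y>", "<Z>"

# Lazily capture the text between <X> and <Y>, and between <Y> and <Z>
# (or the end of the string if the model never emitted <Z>).
_OUTPUT_PATTERN = re.compile(r"<X>(.*?)<Y>(.*?)(?:<Z>|$)", re.DOTALL)


def build_input(text: str, template: str) -> str:
    """Concatenate the sentence with a relation template.

    The " Relation: " separator is a hypothetical choice for this sketch.
    """
    return f"{text} Relation: {template}"


def build_target(head: str, tail: str) -> str:
    """Build the output sequence the model is trained to generate."""
    return f"{HEAD} {head} {TAIL} {tail} {END}"


def parse_output(output: str) -> Optional[Tuple[str, str]]:
    """Recover (head, tail) from a generated sequence, or None if malformed."""
    match = _OUTPUT_PATTERN.search(output)
    if match is None:
        return None
    head, tail = match.group(1).strip(), match.group(2).strip()
    if not head or not tail:
        return None
    return head, tail


if __name__ == "__main__":
    sentence = (
        "His brother Byron LaBeach, also a sprinter, competed in the "
        "1952 Summer Olympics representing Jamaica"
    )
    template = "<X> is a participant in <Y>"
    print(build_input(sentence, template))

    target = build_target("Byron LaBeach", "1952 Summer Olympics")
    print(target)

    # The relation label is known from the template that was supplied,
    # so the full triplet is assembled from it and the parsed entities.
    parsed = parse_output(target)
    if parsed is not None:
        head, tail = parsed
        print((head, "participant in", tail))
```

At test time, the same input construction would be repeated once per candidate relation template, and the parsed head and tail would be combined with that template's relation label to form a triplet. How outputs for relations that do not hold in the sentence are rejected is not specified in the text above, so it is left out of this sketch.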
The model is trained to maximize the probability of generating the output sequence. Beyond better generalization, this approach has several other advantages. First, the relation templates provide implicit information about the entity types compatible with each relation, which can benefit entity extraction. Second, the approach does not require a pipeline, data augmentation, or additional fine-tuning. Our experiments show that our approach outperforms state-of-the-art approaches that require synthetic data and/or fine-tuning on unseen relations.
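The training objective mentioned above, maximizing the probability of the output sequence, can be written out as follows. This assumes the standard token-level factorization used for sequence-to-sequence models; the symbols are introduced here for illustration and do not appear in the original text: x is the concatenated input (sentence plus relation template), y = (y_1, ..., y_T) is the target output sequence, and θ denotes the model parameters.

```latex
\mathcal{L}(\theta)
  = -\log p_\theta(y \mid x)
  = -\sum_{t=1}^{T} \log p_\theta\bigl(y_t \mid y_{<t},\, x\bigr)
```

Minimizing this negative log-likelihood over the training pairs is equivalent to maximizing the probability of generating the output sequence.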