Deep Dive with WiTQA: When Does Retrieval Augmentation Help (or Hurt) Language Models?

This article presents WiTQA, a dataset designed to assess how retrieval affects the performance of language models on question answering (QA). It reports when retrieval augmentation improves QA accuracy and when it instead introduces errors, offering guidance for optimizing retrieval-augmented language models (RALMs).