Nikita Bhutani, Aaron Taylor, Chen Chen, Xiaolan Wang, Behzad Golshan, Wang-Chiew Tan
Knowledge bases (KBs) have long been the backbone of many real-world applications and services. Many KB construction (KBC) methods can extract factual information when relationships between entities are explicitly stated in text. However, they cannot model implications between opinions, which are abundant in user-generated text such as reviews and often have to be mined. Our goal is to develop a technique for building KBs that capture both opinions and their implications. Because obtaining training data to learn to extract implications can be expensive for each new domain of reviews, we propose SAMPO, an unsupervised KBC system based on matrix factorization techniques. SAMPO is tailored to build KBs for domains where many reviews from the same domain are available. We generate KBs for 20 different domains using SAMPO and manually evaluate the KBs for 6 of them. Our experiments show that KBs generated using SAMPO capture information that other KBC methods miss. In particular, we show that our KBs can provide additional training data to fine-tune language models used for downstream tasks such as review comprehension.