ExtremeReader: An Interactive Explorer for Customizable and Explainable Review Summarization

The rise of e-commerce has spurred extensive growth in the volume of user reviews. Since reading reviews is a tedious and time-consuming process, automatic summarization systems have received significant attention in the data mining, machine learning, and natural language processing (NLP) communities.

Despite extensive research in this area, current state-of-the-art automatic text summarization systems still suffer from two significant limitations:

  1. They only provide static summaries that cannot be tailored to specific user needs.

  2. They do not explain or let users explore crucial aspects of the generated summary.


We developed ExtremeReader, an interactive review summary explorer, to address these obstacles:

  1. ExtremeReader generates both a structured summary and an abstractive summary that are easier to interpret. To produce the abstractive summary, ExtremeReader serializes the opinions with a breadth-first search and then uses a seq2seq model to generate the output text from this sequence of opinions.

  2. It also allows users to explore and see explanations of these summaries by drilling down or up to the desired level of granularity. Users can even see the sentence from which the opinion features were extracted.
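
To make the serialization step more concrete, here is a minimal sketch of a breadth-first traversal over an opinion hierarchy. The tree layout, the example opinions, the `[SEP]` separator, and the function names are all assumptions made for illustration; they are not taken from the ExtremeReader implementation. The `max_depth` parameter is a rough stand-in for drilling down or up to a chosen level of granularity.

```python
from collections import deque

# Hypothetical opinion hierarchy: each opinion maps to its more specific children.
OPINION_TREE = {
    "ROOT": ["good food", "friendly staff"],
    "good food": ["tasty pasta", "fresh salad"],
    "friendly staff": ["helpful waiter"],
    "tasty pasta": [],
    "fresh salad": [],
    "helpful waiter": [],
}


def serialize_opinions(tree, root="ROOT", max_depth=None):
    """Return opinions in breadth-first order, optionally stopping at max_depth."""
    ordered = []
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        if node != root:
            ordered.append(node)  # the artificial root is not an opinion
        if max_depth is not None and depth >= max_depth:
            continue  # do not expand below the requested granularity
        for child in tree.get(node, []):
            queue.append((child, depth + 1))
    return ordered


def to_model_input(opinions, sep=" [SEP] "):
    """Join the ordered opinions into one string (assumed seq2seq input format)."""
    return sep.join(opinions)


full = serialize_opinions(OPINION_TREE)
coarse = serialize_opinions(OPINION_TREE, max_depth=1)
print(to_model_input(full))
print(to_model_input(coarse))
```

Breadth-first order places the general opinions before the specific ones, so a downstream generator sees the coarse picture first. Whether ExtremeReader orders or truncates opinions in exactly this way is not stated here.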


ExtremeReader is the first system with these capabilities. We believe that this system holds vast potential to transform how we interact with and understand user reviews at scale.

The Megagon Labs team had the honor of presenting ExtremeReader’s advantages at the World Wide Web (WWW) 2020 conference through a demonstration on a public Yelp restaurant review corpus and a private hotel review dataset. WWW is an international conference that highlights how computer science, deep learning, and other technological fields impact the future of the internet.


OpinionDigest: A Simple Framework for Opinion Summarization (PDF)
Yoshihiko Suhara*, Xiaolan Wang*, Stefanos Angelidis, Wang-Chiew Tan – ACL 2020 (short paper) (to appear)
* Equal contribution

ExtremeReader: An Interactive Explorer For Customizable And Explainable Review Summarization (PDF)
Xiaolan Wang, Yoshihiko Suhara, Natalie Nuno, Yuliang Li, Jinfeng Li, Nofar Carmeli, Stefanos Angelidis, Eser Kandogan, Wang-Chiew Tan – WWW 2020 (demo)

Rotom: A multi-purposed data augmentation framework for training high-quality machine learning models

We propose Rotom, a multi-purposed data augmentation framework for training high-quality machine learning models while requiring only a small number (e.g., 200) of labeled examples.

Snippext: An Opinion Mining Pipeline that Uses Less Training Data

Snippext is a state-of-the-art (SOTA) opinion mining pipeline that extracts aspects, opinions, and sentiments from user-generated content such as online reviews. It reduces the amount of training data usually required by 50% or more.

HappyDB: a happiness database of 100,000 happy moments

We built HappyDB, a crowd-sourced collection of 100,000 happy moments that we make publicly available. Our goal is to build NLP technology that understands how people express their happiness in text while gaining insights into happiness-leading events and scenarios at scale.

OpineDB and Voyageur: How Subjective Databases and Experiential Search Can Improve Customer Experiences

We developed OpineDB, a subjective database system that addresses these challenges by interpreting subjective predicates against a database schema through a combination of natural language processing (NLP) and information retrieval (IR) techniques.