OpineDB and Voyageur: How Subjective Databases and Experiential Search Can Improve Customer Experiences

Online users are always seeking experiences that fulfill various desires. Whether it’s a hotel with clean rooms, a lively bar, or a restaurant with a romantic ambiance, these searches play an integral role in nearly every aspect of travel. Unfortunately, e-commerce search engines do not support experiential queries. Even though text reviews often contain experiential data, users must rely on objective attributes like location, price, and cuisine to find the experience they are looking for.

Sentiment analysis and opinion mining techniques can extract relevant descriptions from text reviews. But to support cognitive and experiential search queries, a database system must be able to model subjective data and process queries expressed in the user's natural language. It must also let users specify predicates over objective attributes.

We developed OpineDB, a subjective database system that addresses these challenges by interpreting subjective predicates against a database schema through a combination of natural language processing (NLP) and information retrieval (IR) techniques. In a conservative evaluation, OpineDB outperformed an IR-based search engine (IR) and an attribute-based query engine (AB) by up to 15% for hotel queries and 10% for restaurant queries. It also accelerated query processing by up to 660% without compromising on result quality.

We also built Voyageur, a frontend experiential search engine for travel, on top of OpineDB. Unlike traditional search engines, this application can handle subjective queries and combine them with objective attributes to surface more insights and tips for the user.
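To make the idea of combining objective attributes with a subjective predicate concrete, here is a minimal sketch. It is not OpineDB's implementation: the `Hotel` class, the `PREDICATE_PHRASES` table, and the keyword-matching score are all hypothetical stand-ins for the NLP and IR interpretation described above, and the hotel data is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Hotel:
    name: str
    city: str            # objective attribute
    price: float         # objective attribute
    reviews: List[str] = field(default_factory=list)  # source of subjective data


# Hypothetical mapping from a subjective predicate to review phrases.
# A real system would interpret the predicate with NLP/IR models instead.
PREDICATE_PHRASES = {
    "clean rooms": ["clean", "spotless", "tidy"],
    "lively bar": ["lively bar", "great bar", "fun bar"],
}


def subjective_score(hotel: Hotel, predicate: str) -> float:
    """Fraction of the hotel's reviews mentioning any phrase tied to the predicate."""
    phrases = PREDICATE_PHRASES[predicate]
    if not hotel.reviews:
        return 0.0
    hits = sum(any(p in review.lower() for p in phrases) for review in hotel.reviews)
    return hits / len(hotel.reviews)


def search(hotels: List[Hotel], city: str, max_price: float,
           predicate: str) -> List[Tuple[str, float]]:
    """Filter on objective attributes, then rank by the subjective score."""
    candidates = [h for h in hotels if h.city == city and h.price <= max_price]
    scored = [(h.name, subjective_score(h, predicate)) for h in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented sample data, for illustration only.
hotels = [
    Hotel("Harbor Inn", "Lisbon", 120.0,
          ["Spotless room, friendly staff.", "Very clean and quiet."]),
    Hotel("Old Town Stay", "Lisbon", 95.0,
          ["Room was dusty.", "Clean sheets but noisy street."]),
    Hotel("Grand Palace", "Lisbon", 300.0, ["Spotless and luxurious."]),
]

results = search(hotels, city="Lisbon", max_price=150.0, predicate="clean rooms")
print(results)
```

In this toy setup the price filter removes the most expensive hotel before any review text is examined, and the remaining hotels are ordered by how often their reviews support the subjective predicate.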

Rotom: A Multi-Purpose Data Augmentation Framework for Training High-Quality Machine Learning Models

We propose Rotom, a multi-purpose data augmentation framework for training high-quality machine learning models from only a small number (e.g., 200) of labeled examples.

Snippext: An Opinion Mining Pipeline that Uses Less Training Data

Snippext is a state-of-the-art (SOTA) opinion mining pipeline that extracts aspects, opinions, and sentiments from user-generated content such as online reviews. It reduces the amount of training data usually required by 50% or more.
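As an illustration of what aspect, opinion, and sentiment extraction produces, the sketch below represents the result for one review as a list of records. The `OpinionTuple` class and the labels are hypothetical and written by hand; they do not reflect Snippext's actual output format or any model prediction.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class OpinionTuple:
    aspect: str      # what is being described, e.g. "room"
    opinion: str     # the phrase describing it, e.g. "spotless"
    sentiment: str   # polarity label: "positive", "negative", or "neutral"


review = "The room was spotless but the breakfast was bland."

# Hand-written example of the kind of output an opinion mining
# pipeline aims for; a trained model would produce these records.
extracted: List[OpinionTuple] = [
    OpinionTuple(aspect="room", opinion="spotless", sentiment="positive"),
    OpinionTuple(aspect="breakfast", opinion="bland", sentiment="negative"),
]


def sentiments_by_aspect(tuples: List[OpinionTuple]) -> Dict[str, str]:
    """Index the extracted sentiments by the aspect they refer to."""
    return {t.aspect: t.sentiment for t in tuples}


def spans_in_text(text: str, tuples: List[OpinionTuple]) -> bool:
    """Check that every aspect and opinion span literally occurs in the review."""
    return all(t.aspect in text and t.opinion in text for t in tuples)


summary = sentiments_by_aspect(extracted)
print(summary)
```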

ExtremeReader: An Interactive Explorer for Customizable and Explainable Review Summarization

ExtremeReader generates both structured and abstractive summaries that are easier to interpret. It also lets users explore these summaries and see explanations for them by drilling down or up to the desired level of granularity. Users can even see the sentences from which the opinion features were extracted.

HappyDB: A Happiness Database of 100,000 Happy Moments

We built HappyDB, a crowd-sourced collection of 100,000 happy moments that we make publicly available. Our goal is to build NLP technology that understands how people express their happiness in text, while gaining insights at scale into the events and scenarios that lead to happiness.