Minun: Evaluating Counterfactual Explanations for Entity Matching
Comparative Opinion Summarization via Collaborative Decoding
Annotating Columns with Pre-trained Language Models
Characterizing Practices, Limitations, and Opportunities Related to Text Information Extraction Workflows: A Human-in-the-loop Perspective
Convex Aggregation for Opinion Summarization
Recent advances in text autoencoders have significantly improved the quality of the latent space, which enables models to generate grammatical and consistent text from aggregated latent vectors. As a successful application of this property, unsupervised opinion summarization models generate a summary by decoding the aggregated latent vectors of inputs. More specifically, they perform the aggregation […]
Machamp: A Generalized Entity Matching Benchmark
Entity Matching (EM) refers to the problem of determining whether two different data representations refer to the same real-world entity. It has been a long-standing interest of the data management community, and much effort has gone into creating benchmark tasks and developing advanced matching techniques. However, existing benchmark tasks for EM […]
Towards integrated, interactive, and extensible text data analytics with LEAM
From tweets to product reviews, text is ubiquitous on the web and often contains valuable information for both enterprises and consumers. However, online text is generally noisy and incomplete, requiring users to process and analyze the data to extract insights. While effective systems exist for different stages of text analysis, users lack extensible […]