Machamp: A Generalized Entity Matching Benchmark

Entity Matching (EM) refers to the problem of determining whether two different data representations refer to the same real-world entity. It has been a long-standing interest of the data management community, and much effort has gone into creating benchmark tasks and developing advanced matching techniques. However, existing benchmark tasks for EM […]

Towards integrated, interactive, and extensible text data analytics with LEAM

From tweets to product reviews, text is ubiquitous on the web and often contains valuable information for both enterprises and consumers. However, online text is generally noisy and incomplete, requiring users to process and analyze the data to extract insights. While effective systems exist for different stages of text analysis, users lack extensible […]