Weedle

Data-centric NLP requires careful examination and diagnosis of data throughout the iterative data science workflow. Unfortunately, existing data science tools tend to focus on specific steps in the workflow (e.g., labeling, wrangling), specific data types (e.g., tabular), or specific domains, and thus fail to support comprehensive, continuous understanding of the data across the entire workflow. To address this issue for data-centric NLP, we present Weedle: Widget-Enabled Exploratory Data Analysis for NLP Experts. Weedle offers global and local exploration of text data via built-in and customizable transformation operations. It provides a dashboard widget that can be composed programmatically or interactively. Weedle is implemented as a Python package containing a Jupyter widget, so it can be seamlessly integrated into existing data science environments. Our data model and dashboard components are designed based on our survey of existing NLP notebooks and the visual text analysis literature.