Semantic tagging, which has extensive applications in text mining,
predicts whether a given piece of text conveys the meaning of a
given semantic tag. Semantic tagging is mostly addressed with
supervised learning, and today deep learning models are widely
perceived to be the better choice for it. However, no comprehensive
study supports this popular belief. Practitioners therefore often have to train several types of models for each semantic tagging task to identify the best one, a process that is both
expensive and inefficient.
We conduct a systematic study to investigate the following
question: Are deep models the best-performing models for all semantic tagging tasks? To answer this question, we compare deep
models against “simple models” over datasets with varying characteristics. Specifically, we select three prevalent deep models (i.e.,
CNN, LSTM, and BERT) and two simple models (i.e., LR and
SVM), and compare their performance on the semantic tagging task
over 21 datasets. Results show that the size, the label ratio, and the
label cleanliness of a dataset significantly impact the quality of semantic tagging. Simple models achieve similar tagging quality to
deep models on large datasets, but the runtime of simple models is
much shorter. Moreover, simple models can achieve better tagging
quality than deep models when the target datasets have noisier labels
and/or a more severely imbalanced label ratio. Based on these findings, our study can systematically guide practitioners in