Sudowoodo: Contrastive Self-supervised Learning for Data Integration Applications
We introduce Sudowoodo, an end-to-end framework for a variety of data integration applications. Sudowoodo addresses the need for labeled training data by using contrastive learning to learn a data representation model from a large collection of unlabeled data items. The contrastive objective trains the model to distinguish pairs of similar data items from pairs of dissimilar items that are likely to be distinct.
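
To make the contrastive objective concrete, the following is a minimal sketch of a generic InfoNCE-style loss over a batch of embedding pairs. It is an illustration under stated assumptions, not necessarily the exact objective used in Sudowoodo: the function names `cosine` and `contrastive_loss`, the `temperature` value, and the toy two-dimensional embeddings are all hypothetical. Each anchor is treated as similar to the positive at the same index and as dissimilar to every other positive in the batch.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style loss (assumed form, for illustration only).

    anchors[i] and positives[i] are embeddings of a similar pair;
    positives[j] for j != i act as in-batch dissimilar items.
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        # Scaled similarity of this anchor to every candidate in the batch.
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        max_logit = max(logits)
        log_denom = max_logit + math.log(
            sum(math.exp(x - max_logit) for x in logits)
        )
        # Negative log-probability of picking the true partner.
        total += log_denom - logits[i]
    return total / len(anchors)


# Toy embeddings (hypothetical): two items and their similar counterparts.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]    # each positive matches its anchor
shuffled = [[0.1, 0.9], [0.9, 0.1]]   # partners swapped

loss_aligned = contrastive_loss(anchors, aligned)
loss_shuffled = contrastive_loss(anchors, shuffled)
```

In this toy setup the loss is small when each anchor is closest to its own partner and large when the partners are swapped, which is the signal that pushes a representation model to place similar items together and likely-distinct items apart without any labels.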