Dataset discovery from data lakes is essential in many real-world applications that require table search over open datasets. Dataset discovery has many important downstream tasks, such as table union search, joinable table search, and column clustering. Starmie is an end-to-end framework for dataset discovery, with table union search as the main use case. Given a query table and a collection of data lake tables, table union search aims to find all tables that are unionable with the query table. Starmie features a novel contrastive learning technique that trains column encoders from pre-trained language models in a fully unsupervised manner. It also introduces a multi-column pre-training strategy that incorporates contextual information from the rest of the table into each column representation. To accelerate query processing for table search, we also propose a filter-and-verification framework that supports multiple design choices in indexing and pruning. Experimental results on public datasets show that Starmie significantly outperforms state-of-the-art methods in terms of precision, recall, and mean average precision (MAP). The results also suggest that the design choice of using a Hierarchical Navigable Small World (HNSW) index enables Starmie to scale well on large real datasets with 50 million tables and 250 million columns.
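As a rough illustration of the filter-and-verification idea, the following minimal sketch ranks data lake tables by a column-matching score and skips verification for candidates whose cheap upper bound cannot beat the current top-k. Everything in it is an assumption made for illustration: the function names (`upper_bound`, `matching_score`, `top_k_unionable`) are hypothetical, the column vectors are hand-made stand-ins for encoder output, and the particular bound and the brute-force matching are simple choices that may differ from what Starmie actually implements. A vector index such as HNSW would replace the linear scan over candidates used here.

```python
import heapq
import math
from itertools import permutations


def sim(u, v):
    """Cosine similarity clamped at 0, so every matched pair contributes >= 0."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return max(0.0, dot / norm) if norm else 0.0


def upper_bound(query, cand):
    """Filter step: let each query column take its best match, ignoring conflicts.

    This can only overestimate a one-to-one matching score, so it is a safe bound.
    """
    return sum(max(sim(q, c) for c in cand) for q in query)


def matching_score(query, cand):
    """Verification step: best one-to-one column matching.

    Brute force over assignments; only suitable for tables with a few columns.
    """
    small, large = (query, cand) if len(query) <= len(cand) else (cand, query)
    sims = [[sim(s, l) for l in large] for s in small]
    return max(
        sum(sims[i][j] for i, j in enumerate(perm))
        for perm in permutations(range(len(large)), len(small))
    )


def top_k_unionable(query, lake, k):
    """Return the top-k (score, table name) pairs and the number of verifications."""
    heap = []  # min-heap holding the best k verified scores seen so far
    verified = 0
    # Visit candidates in decreasing bound order so pruning can stop the scan early.
    bounds = sorted(
        ((upper_bound(query, cols), name) for name, cols in lake.items()),
        reverse=True,
    )
    for bound, name in bounds:
        if len(heap) == k and bound <= heap[0][0]:
            break  # no remaining candidate can beat the current k-th best score
        score = matching_score(query, lake[name])
        verified += 1
        if len(heap) < k:
            heapq.heappush(heap, (score, name))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, name))
    return sorted(heap, reverse=True), verified


# Toy column embeddings (assumed values, one vector per column).
query = [[1, 0, 0], [0, 1, 0]]
lake = {
    "A": [[1, 0, 0], [0, 1, 0]],  # both columns match the query
    "B": [[1, 0, 0], [0, 0, 1]],  # only one column matches
    "C": [[0, 0, 1]],             # no column matches
}

results, verified = top_k_unionable(query, lake, k=1)
print(results, verified)
```

In this toy run only the first candidate needs the expensive verification, because the bounds of the remaining tables already fall below the best verified score. The same pattern applies with any verification score, as long as the filter never underestimates it.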