Nima Shahbazi, Jin Wang, Zhengjie Miao, Nikita Bhutani
Entity matching is a crucial task in many real-world
applications. Despite the substantial body of research that focuses
on improving the effectiveness of entity matching, enhancing
its fairness has received scant attention. To fill this gap, this
paper introduces a new problem of preparing fairness-aware
datasets for entity matching. We formally outline the problem,
drawing upon the principles of group fairness and statistical
parity. We devise three highly efficient algorithms to accelerate
the process of identifying an unbiased dataset from the vast
search space. Our experiments on four real-world datasets show
that our proposed algorithms can significantly improve fairness
in the results while achieving effectiveness comparable to that of existing
fairness-agnostic methods. Furthermore, we conduct case studies
to demonstrate that our proposed techniques can be seamlessly
integrated into end-to-end entity matching pipelines to support
fairness requirements in real-world applications.
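
To make the notion of statistical parity concrete, the following is a minimal sketch of the generic textbook measure and not the formal problem definition of this paper, which is not reproduced here. It assumes each record pair carries a group attribute and a binary match label; the function name `parity_gap` and the keys `group` and `match` are hypothetical and chosen only for illustration. The sketch computes the rate of positive labels per group and reports the largest gap between any two groups, where a gap of zero would correspond to exact statistical parity under these assumptions.

```python
from collections import Counter


def parity_gap(records, group_key="group", label_key="match"):
    """Return (gap, rates): per-group positive-label rates and the
    largest difference between any two of them.

    Assumption: every record is a dict holding a group attribute and a
    truthy/falsy label. This is the generic statistical parity
    difference, not necessarily the measure used in the paper.
    """
    positives = Counter()
    totals = Counter()
    for record in records:
        group = record[group_key]
        totals[group] += 1
        if record[label_key]:
            positives[group] += 1
    if not totals:
        raise ValueError("records must not be empty")
    # Every group in `totals` has at least one record, so no division by zero.
    rates = {group: positives[group] / totals[group] for group in totals}
    gap = max(rates.values()) - min(rates.values())
    return gap, rates


if __name__ == "__main__":
    # Hypothetical toy data: group A has 2 of 4 matches, group B has 1 of 4.
    sample = (
        [{"group": "A", "match": True}] * 2
        + [{"group": "A", "match": False}] * 2
        + [{"group": "B", "match": True}] * 1
        + [{"group": "B", "match": False}] * 3
    )
    gap, rates = parity_gap(sample)
    print(rates, gap)
```

A dataset preparation step could, under the same assumptions, compare this gap against a chosen tolerance before and after selecting records; how the tolerance is set and how candidates are searched is left open in this sketch.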