WIT: Workshop On Deriving Insights From User-Generated Text @KDD2021

Workshop Overview

Users of Web platforms and online services generate tremendous amounts of data in the form of search requests, reviews, questions, and answers. We believe there is a great opportunity to apply advanced AI/NLP techniques to this user-generated text, which is rich in user insights and experiences.

The WIT workshop provides a venue for researchers from academia and industry to address the challenges of harnessing text-heavy user-generated data that is available to different types of organizations, especially on topics pertaining to the pipeline of extracting data from unstructured text into a structured form to obtain insights. We will have a great line-up of invited speakers and panelists.

WIT will be held as a virtual single-day event at SIGKDD 2021.


Confirmed Invited Speakers

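Confirmed speakers (alphabetically ordered list):

  • Luna Dong (Facebook)
  • Yunyao Li (IBM Research)
  • Vagelis Papalexakis (UC Riverside)
  • William Wang (UC Santa Barbara)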

The workshop will take place on August 15, 2021 (all times are PST).
8:00am – 8:10am: Welcome Message
8:10am – 9:10am: Invited Talk by Luna Dong (Facebook) – session chair: Estevam Hruschka
9:10am – 9:40am: Session I – Oral Presentations (2 papers) – session chair: Sajjadur Rahman
  • Evan Shieh, Saul Simhon, Geetha Aluri, Giorgos Papachristoudis, Doa Yakut and Dhanya Raghu. Attribute Similarity and Relevance-Based Product Schema Matching for Targeted Catalog Enrichment (Video Presentation)
  • Natasha Z. Foutz, Xuan Zhang, Zhilei Qiao, Wenqi Shen and Weiguo Fan. Diamond in the Rough? Product Defect Detection and Summarization from UGCs (Video Presentation)
9:40am – 9:50am: Break
9:50am – 10:50am: Invited Talk by Yunyao Li (IBM Research) – session chair: Sajjadur Rahman
10:50am – 12:20pm: Panel (Bias and Fairness using user generated content: Sorting out “the good”, “the bad”, and “the ugly”)
  • Behzad Golshan (Megagon Labs) – Moderator
  • Thom Lake (Indeed.com) – Panelist
  • Yunyao Li (IBM Research) – Panelist
  • Vagelis Papalexakis (UC Riverside) – Panelist
  • William Wang (UC Santa Barbara) – Panelist
12:20pm – 1:30pm: Lunch
1:30pm – 2:30pm: Invited Talk by Vagelis Papalexakis (UC Riverside) – session chair: Nikita Bhutani
2:30pm – 3:00pm: Session II – Oral Presentations (2 papers) – session chair: Hannah Kim
3:00pm – 3:10pm: Break
3:10pm – 4:10pm: Invited Talk by William Wang (UC Santa Barbara) – session chair: Hannah Kim
4:10pm – 4:50pm: Poster Session
4:50pm – 5:00pm: Closing Remarks

Call For Papers

We encourage submissions that describe a well-defined piece of research or are thought-provoking. Topics include, but are not limited to, information extraction, data cleaning, entity matching, schema matching, semantic search, summarization, language generation, (common-sense) knowledge bases, and information-seeking Q&A/dialogue.

Submitted papers can be regular papers or extended abstracts. If there is sufficient interest from the authors of accepted papers, we may publish the post-proceedings at CEUR. The maximum length of a regular paper is 8 pages, plus an unlimited number of pages for references. The maximum length of an extended abstract is 4 pages, plus an unlimited number of pages for references. At least one author of every accepted paper is expected to attend the workshop. Regular papers will be given an oral presentation slot. Extended abstracts will be presented as posters, demos, or short talks, depending on the workshop schedule.


Important Dates

  • Submission deadline (all submission types): June 4, 2021 (extended from May 20, 2021)
  • Notification of acceptance: June 10, 2021
  • Camera ready papers due: TBD



Papers should be formatted following the KDD2021 template (as described in the guidelines here: https://www.acm.org/publications/proceedings-template) and submitted using the submission system available at this link: https://easychair.org/conferences/?conf=wit2021




Program Committee

Everton Alvares Cherman – Birdie

Maisa Cristina Duarte – Bradesco Bank 

Sanaz Bahargam – Twitter

Ricardo Marcacini – ICMC/USP

Nelson Ebecken – COPPE/UFRJ Federal University of Rio de Janeiro

Aljaz Kosmerlj – Jozef Stefan Institute

Nikita Bhutani – Megagon Labs

Sajjadur Rahman – Megagon Labs

Grace Hui Yang – Georgetown University

Jun Ma – Amazon

Vinicius Carida – Itaú Unibanco

Joao Gama – Porto University



If you have any questions or inquiries regarding the workshop or need further information, please do not hesitate to send an email to wit@megagon.ai.