We are wrapping up another successful summer of working alongside talented interns from a variety of universities. As usual, we want to share with you the talented people who contributed to our lab projects and give them an opportunity to talk about their experiences.
Bosung Kim
University: UC San Diego, CA, USA
Program: PhD in Computer Science and Engineering
Project: Knowledge extraction with zero-shot settings. The main goal of this initiative is to extract knowledge (in the form of triplets) from raw text, and we’re studying ways to extract new types of knowledge that the model hasn’t seen during training. (Project team members: Nikita Bhutani and Hayate Iso with support from Estevam Hruschka and Tom Mitchell, CMU (advisory board))
What did you want to achieve during your internship before you started? What was your experience working with the team at Megagon?
I was looking to explore new research topics in industry. I also wanted to be able to submit a paper to a conference. As of now, I am happy with our progress and what we have so far. I am hopeful that we can submit to a top-tier NLP conference.
I also wanted to learn to collaborate since I usually work one-on-one with my advisor. With Megagon, I’m working with at least three people. I’ve really learned a lot about collaboration. At school, I just met with my advisor and skipped a lot of details during meeting updates and discussions. But during the internship, if I skip details, I get asked for more information. I now see the importance of communicating the details of my work and improving that skill. As a big bonus, providing more detailed explanations helps me see what I might be missing, gain new ideas, and develop the project further.
What's something you learned during your internship that you will take with you?
I learned to collaborate. I also learned that discussion is the most important thing when working with a team. Usually, I focus a lot on implementation; sometimes, I forget about other aspects as a result. But here at Megagon Labs, I get reminded of that. Working with code reviews and having discussions on implementation ideas really helps.
Hantian Zhang
University: Georgia Institute of Technology, GA, USA
Program: PhD in the intersection of ML and data-centric systems, with a focus on fairness
Project: Explaining complex models used for data integration tasks by counterfactual examples. A counterfactual example is an example where you change the original data point by a little bit and get a different result. For example, an explanation such as, “if the product IDs and colors become identical, the model would classify them as the same entity” is a counterfactual explanation. I focused on generating these counterfactual examples for different data integration tasks. (Project team member: Yuliang Li with support from Jin Wang and Nikita Bhutani)
What was your experience like while working with the team at Megagon?
It has been a pretty good experience so far. The team is small but diverse, and people have close relationships with one another. I like the coffee culture here, where we can enjoy coffee together after lunch.
I’ve learned a lot. I’ve learned how important it is to communicate with team members and iterate through ideas. The experience of working in a research lab in industry is very important to me because I would potentially be looking for a similar position in the future. This first-hand experience will help me decide what to do after graduation.
If you could single-handedly solve one problem within the field of ML-NLP, what would it be?
I think that would be making ML algorithms fair and responsible. To enable ML algorithms to make decisions that impact our lives, we must make them responsible. That includes making the algorithms fair as they apply to different groups and individuals, as well as making them interpretable so that humans can understand the logic behind the models and make sure that they work as intended.
Grace Fan
University: Northeastern University, Boston, MA
Program: PhD in Computer Science at the DATA Lab
Project: Dataset discovery in data lakes, where tables are diverse, heterogeneous, and often lack structure. I have used contrastive learning to encode columns in order to effectively align columns that are semantically similar to query columns. That way, we can find the most related or appropriate table. (Project team members: Yuliang Li and Jin Wang with support from Renee Miller, NU (advisory board), Dan Zhang, and Estevam Hruschka)
What did you want to achieve during your internship before you started? What was your experience working with the team at Megagon?
On the technical side, I hadn’t worked in ML models and NLP, but now I’ve dipped my toes in those subjects and would like to continue working with them and learning more as I continue with my PhD.
Professionally, my mentors have helped me improve how I present myself and showed me how to tell a story about my work in my presentations. They have really gone above and beyond, giving me notes on how to give presentations and make my stories cohesive. It has really helped me with note-taking, presentations, and selling myself to people.
Personally, I’ve learned not to be afraid to talk to others, no matter what level they are at. I’ve lost the fear of talking to others who are ahead of me professionally. My mentors have pushed me to be more inquisitive and have shown me that they are open to helping other emerging professionals like me.
What is one piece of advice you would give future interns?
I would suggest that they prepare some related reading before starting the internship and continue to brush up on the literature throughout their project. Also, I would advise that they ask their mentors clarifying questions on their project and how it relates to the literature, even after they settle in. Lastly, it is really important for an intern to connect with not only their mentors but also other people in the lab. Ask those other employees questions, both about their projects and also life advice. They might give you some technical suggestions that you cannot get anywhere else, as well as non-technical insight such as life outside of academia.
Fred Choi
University: University of Illinois, Urbana-Champaign
Program: PhD in Computer Science with a focus on interactive and social computing
Project: Building a widget suite on Jupyter notebooks for graph exploration. We are focusing on knowledge graph exploration, with an emphasis on integrating interactive visualizations with the Jupyter notebook-style interactive coding environment. The goal is to create a suite of configurable widgets with reasonable defaults grounded in the literature that studies graph exploration techniques. (Project team member: Sajjadur Rahman with support from Hannah Kim and Dan Zhang)
What was your experience like while working with the team at Megagon?
For me, it’s been a change in perspective, since I previously had a software internship at Wayfair where I was part of a nine-person team. Now I’m doing a research internship with this smaller team at Megagon Labs. I’ve liked working with a smaller team because you get to know what your project really is and who the project is for. I’ve also found that I have more autonomy with the project. It’s a smaller code base, and I get the freedom to design the project within the constraints that I’m given. With that freedom, I created my own library/framework for our project. As a result, I contributed to a lot more than just my project. The library will be able to be reused for future projects.
Overall, I really enjoyed the structure of having morning meetings with Sajjadur Rahman where we’d come to a question, task, or issue, and then I’d take the day to tackle it. It’s a fun way to do the internship: start with a question and experiment with it throughout the rest of the day.
What's something you learned during your internship that you'll take with you?
One big thing I learned is how to write software while also thinking about writing a paper for the software. I’d have to think of where my software is going to fit into the literature as a research-oriented software design.
Before, I was focused on writing software for more software and for programmers. Now I have to know how to tell those who aren’t using the software how it is important for the research.
What advice would you give future interns?
Spend a lot of time in the first few meetings just trying to understand the problem. Even if you think you understand it, keep asking questions. There’s no harm in getting a good picture of the constraints that you are designing to.
Nahyun Kwon
University: Texas A&M
Program: PhD with a focus on HCI and Interactive Systems
Project: Building an interactive Jupyter notebook widget development environment for text data exploration that’s geared toward data scientists. Our objective is to support more interactive features through a multi-plot dashboard and predefined text transformations such as sentiment analysis and topic modeling. Ultimately, it’s about supporting data scientists’ text data exploration when developing NLP models by putting all resources in one tool. (Project team members: Hannah Kim, Dan Zhang, and Sajjadur Rahman)
What advice would you give future interns?
Stay open to hearing about research topics outside of your research focus. Try to expand your knowledge base and learn about other topics and other projects in the office. Doing so is very helpful to your research; you can get fresh insight and learn about more technologies and approaches that might be applicable to your research.
Get to know other people aside from the mentors, especially during lunch and during talks. Having casual conversations with them is very good. These chats made me feel comfortable. I asked many questions about other people’s research focuses, both for my professional benefit as well as purely out of curiosity. In that way, you can find future collaboration opportunities. It’s good networking.
Having advanced researchers around, I also tried to get general advice for my PhD life. They already went through it, so they can understand what I am going through and the challenges I might face. If I have a concern, they are willing to give me good advice from different perspectives.
Coming from Texas, what did you enjoy most about the Bay Area?
The weather is amazing: good sunlight and cool air. I had to ask myself: “Is this heaven?” There are a lot of nice places and events to explore. I really enjoyed the many food options. I liked having access to more Asian food. I also found many cultural and general events.
We are grateful to have had a successful summer of projects, outings, and even a potluck. We leave you with a collection of images from our summer experiences.
Interns: We’d like to thank you for all of your hard work this summer!
If you would like to learn more about our internship program, or to apply, check out our internship page.