Megagon Labs Summer 2023 Internship Experience

We are wrapping up another successful summer of working alongside talented interns from a variety of universities. As usual, we want to introduce the people who contributed to our lab projects and give them an opportunity to talk about their experiences.

Haopeng Zhang

University: UC Davis

Program: PhD in Computer Science

Project: We proposed an explainable, fine-grained benchmark for instruction-based text editing and explored the performance of LLMs on the task under both zero-shot and fine-tuning settings.
(Project mentors: Hayate Iso, Sairam Gurajada)

How did your project tie in with your studies?

I mostly worked on summarization before, so it was very interesting to extend my scope to text editing during the project. This was also my first time creating a benchmark and using large-scale data annotations, both of which are important for my future research.

What’s been your favorite part of the internship?

I love the conversations we had during lunches the most. It is really fun to learn about different cultures and backgrounds.

Xinru Wang

University: Purdue University

Program: PhD in the intersection of ML and data-centric systems, with a focus on fairness

Project: Our project was a multi-step LLM-human collaborative annotation framework to reduce costs and increase efficiency. The framework could be broken into two primary steps:

(1) LLMs generate (and explain) data labels,

(2) humans re-annotate a subset of those.

(Project mentors: Hannah Kim, Zhengjie Miao, Sajjadur Rahman, Kushan Mitra)

How did your project tie in with your studies? If it did not, how did the project help expand your research focus?

My studies were mostly human-subject studies on how AI assists humans in making decisions, while my internship project is on how LLMs assist human annotators in making accurate annotations. It was an adjustment in perspective and focus.

For my PhD studies, I’m mostly working with simple traditional machine learning (ML) models in human user studies, while at Megagon the project is about how large language models (LLMs) can help humans make annotations. This was my first big introduction to LLMs.

What's something you liked about working with the team?

The mentors are very hands-on and always available. Overall, everyone in the office is nice and there’s a very multicultural environment. Everyone is willing to share about their culture.

Do you have any advice for future interns?

Yes. Don’t be too panicked or nervous during the first month, when you might feel overwhelmed or even lost. Ask questions; the mentors are happy to help.

Nima Shahbazi

University: University of Illinois Chicago

Program: PhD in Computer Science

Project: Systematic bias in entity matching could cause severe harm if not resolved in all stages of the entity matching pipeline. In my project, we strive to design fairness-aware data preparation algorithms for entity matching. (Project mentors: Jin Wang, Zhengjie Miao, Nikita Bhutani)

How has the Megagon team helped you broaden your research interest or deepen your understanding of your research topic?

The way I’ve had to approach research was completely different, looking at research problems in a way that I had not done before. I was familiar with the problem definition, but how we were solving the problem was new to me. We had to explore a lot to get to the right problem definition. We had to formulate the right definition for fairness, and this was challenging since fairness is context-specific and can have multiple definitions. We then had to come up with formal proofs and baseline algorithms to compare to our final solution. In short, I’ve explored a wide range of research interests and topics with guidance from my mentors.

Because this was a three-month internship, the project’s timeline pushed me a lot; it was very different from my PhD project’s timeline.

For example, I might work nine months on this project within the parameters of a regular PhD program. But my internship project’s compressed timeline has been good for me because we all had to learn how to make decisions quickly: when to iterate and when to pull the plug on something. With so many big project decisions to make, time is of the essence. This wouldn’t have been possible without the support of my mentors.

What is something you liked about working at Megagon during your internship?

The people at Megagon are amazing. They’re very approachable and wonderful. There is no hierarchy here; everybody’s sitting next to each other. I made a lot of friends with the interns, staff, and even with people outside the office.

I came into my PhD during the pandemic, so it was hard to collaborate and interact with people in person. My Megagon internship was a good change, and it exposed me more to physical collaboration.

Aditi Mishra

University: Arizona State University

Program: PhD in Computer Science

Project: We worked on rationalizing knowledge-intensive task outputs using LLMs. With an increase in black-boxed models in the field of ML, it is of utmost importance to ensure transparency for such models, not just for experts but also for non-expert users. Our work primarily focused on generating natural language rationales for end users. We also evaluated the generated rationales on both coarse- and fine-grained parameters by performing multiple human subjects studies. (Project mentors: Sajjadur Rahman, Hannah Kim, Kushan Mitra)

How has the Megagon team helped you broaden your research interest or deepen your understanding of your research topic?

I have primarily worked on designing visualization tools to assist non-expert users. Before Megagon, I never had much experience working on core NLP where I run experiments. This summer I was knee-deep in using LLMs and running multiple iterations of experiments on the topic. This experience has not only given me an idea of how LLMs work, but also an intuitive sense of how I can get better outputs from them. Even smaller discussions with people outside the HITL team like Pouya and Estevam really helped me throughout the internship. My previous PhD work was related to LLMs and after having done the internship, I feel more confident and comfortable running experiments for my own work.

Apart from this, we also ran a large-scale human subjects study, which was very new to me, as I had primarily done only lab-based studies. The iterations we went through to improve the interface for Amazon Mechanical Turk workers gave me a perspective on the interfaces I design for my own research work and how I can make them more user-friendly.

What is something you liked about working at Megagon during your internship?

The people are really nice. Everyone’s really fun to talk to. No one just talks about work; when it’s lunchtime, everyone chit-chats. I appreciated the snacks and kitchen amenities at the office. I can go talk to Sairam, and also go talk to someone else about my own project who might simply be interested. It’s a very collaborative environment, which was new for me because of the smaller lab I am coming from. There are only three of us at my current lab, so our collaborations are smaller. I like that at Megagon, as an intern, we have more people pitching in. That was really nice.

Alfonso Amayuelas

University: UC Santa Barbara

Program: PhD in Computer Science

Project: Link prediction in knowledge graphs is an important task with many downstream applications such as knowledge-base completion and question answering. Neural-based methods are the de facto choice for building effective link prediction models. However, training the models is very expensive because of the sheer size of the knowledge graphs. This project explores the idea of applying data distillation techniques to reduce the training costs of link prediction models. (Project mentors: Sairam Gurajada, Seiji Maekawa, Tanmay Laud, and Nedelina Teneva)

What advice would you give future interns?

Make sure you really align with the projects before you arrive. Do some review of the subject you will be working on beforehand, since you only have three months and time goes fast. I’d also suggest you reach out to your mentors before you start to ensure you have a good understanding of the project goals. The project is yours, so make sure that it’s feasible in three months and be prepared because it’s a very short time.

What did you enjoy most about being in the Bay Area?

As far as the office goes, I liked learning about Japanese culture. This was my first experience with the culture and with Japanese people. When it comes to the Bay Area overall, I had different expectations but I enjoyed it. I expected more of a big city feel from Mountain View or Silicon Valley, but it was quiet and not as lively as I would’ve liked. I did venture out and enjoyed going to a lot of cities close by like San Francisco and Santa Cruz. I went surfing in Santa Cruz a couple of times.

Yunshu Wu

University: UC Riverside

Program: PhD in Computer Science

Project: LLMs offer strong potential as summary evaluators but face issues of high computational costs and the “Lost-in-the-middle” problem in long document summaries. To tackle this, we developed a straightforward method: extracting key sentences from lengthy source documents to create concise, information-packed summaries for LLM-based evaluations. (Project mentors: Hayate Iso and Pouya Pezeshkpour)

How did your project tie in with your studies?

Previously I was also interested in neural networks and graph studies. I was open to having a project that was different from what I was already focusing on. I wanted to take the opportunity to venture out of my studies. I was very happy to go into NLP.

How has it been working with your mentors?

My mentors are very supportive and very helpful. Hayate has been very supportive in helping me learn a lot about NLP. He helped me review and dive into the knowledge I needed to be successful. Pouya helps challenge me in the best ways. We have good conversations where he shares his opinion based on his experience and I share mine. Because my experience is different (I like theory, for example), we get to dig into these topics and differences in perspective.

We are grateful to have had a successful summer of projects, outings, and even a potluck. We leave you with a collection of images from our summer experiences.

Interns: We’d like to thank you for all of your hard work and dedication. We had a great time!

If you would like to learn more about our internship program, or to apply, check out our internship page.

Biking, Lunch, and Potluck with the Team

Written by: Megagon Labs

Follow us on LinkedIn and Twitter to stay up to date with new research and projects.
