We bring you another long-awaited team profile blog post! In this article, we interview Dan Zhang, Senior Research Engineer and Research Manager at Megagon Labs. Learn about the trajectory of a research scientist, from PhD to management, and the experience and opportunities for growth Megagon Labs has to offer.
What is your focus at Megagon?
I’m a Senior Research Engineer and the Research Manager for our human-in-the-loop (HITL) research group.
What was your journey into the field?
I’ve been interested in machines and programs since I was young. As a child, my dad would help me take toys apart. In elementary school, I had a computer class where the teacher introduced us to basic programming with a high-level language called LOGO. It consisted of giving commands and doing geometric calculations to move a turtle around the screen and have it perform different actions. Some kids found it boring or didn’t like having to do the calculations. But for me and a few others, it was exciting.
Later I joined a high school programming club, and we attended some local competitions for the NOI (National Olympiad in Informatics). I liked it so much that I chose computer science as my college major, and before I knew it, I was doing my PhD in privacy-preserving data management at UMass and interning at Megagon Labs.
How did your research project during your internship at Megagon Labs affect your career trajectory?
During my PhD research, I was in a database group working on data privacy-related topics. My internship was a minor transition into data mining and data discovery, where we tried to predict the type of a table column from its contents. It was not directly related to my PhD research. It sat at the intersection of data management and Natural Language Processing, and we used neural networks and language embeddings to solve the problem. It was a small jump from data management to a machine learning project applied to databases. I had to learn how to train and test models using PyTorch, something I hadn’t done in my PhD before the internship. I knew it would be a bit of a switch, but that made me more interested in the internship. Plus, my mentor/supervisor was very supportive.
My internship assignment pushed me to learn something a bit different from what I had been studying for my PhD. The learning process was fun and exciting and got me interested in what I worked on at Megagon Labs. I enjoyed the experience and learned a lot while interning. After graduation, I applied to and joined Megagon Labs. It was a smooth transition.
What are you currently working on?
I’m currently working on the human-in-the-loop effort at Megagon Labs. A lot of the machine learning pipeline can be automated, but we still need human input. The question is how to take advantage of that human input effectively. Our goal is to design tools and algorithms that make better use of human effort and improve the whole pipeline.
More specifically, we have been working on the MegAnno project, which focuses on supporting the annotation needs of researchers and data scientists.
I have also been working with our previous winter intern and new colleague, Seiji Maekawa, on solving challenges in active label selection in scenarios where annotation resources are limited.
Who are some of your mentors or role models in the field?
I have to acknowledge them all. The list isn’t short; it has been a long journey that has required a lot of support. There is my undergraduate professor, Hongzhi Wang, in whose group I had my first research experiences. Then there’s my graduate advisor, Professor Gerome Miklau, and the people in the DB (DREAM) lab. During my internship at Megagon Labs, I received help from Çağatay Demiralp and Yoshi Suhara, and later, after joining full-time, from Eser Kandogan.
I appreciate my Megagon Labs mentors because, aside from what I learn from their research and engineering techniques and experience, they help me keep envisioning where I want to be in 5 years. Eser made me think not only about my current task, but further out: what I want to grow into and how I want to be recognized. They helped me make a road map for developing my career. That’s knowledge I didn’t have when I was just starting out.
How do you keep yourself up to date with new developments, research, tools, and more?
I subscribe to some information services, and I follow and search for related topics on arXiv and Medium. I also review conference papers and highlights. At Megagon we have an internal paper-sharing Slack channel, which is also a good source of exciting recent progress in our research area.
It’s important to make time for studying and reading about new developments in our field. I look forward to reading the interviews with my colleagues to see how they keep up, because it is a habit I am still developing.