Welcome to our team profile blog post! In this article, we highlight Kaoru Tanaka, a research engineer working on image recognition technology at the Tokyo office. Through this interview he shares a bit about his background at Megagon Labs, his interests and current projects, as well as the secret to his continued success as a research engineer.
Please tell us about your background and why you joined Megagon Labs.
I worked for a Japanese manufacturer developing and planning new services after earning my PhD from Japan Advanced Institute of Science and Technology. Since I had a strong interest in implementing technology to impact and change society, I wanted to work closer to business planning, so I joined Recruit Holdings and worked on various projects as an engineering lead in the new business development department. I joined Megagon Labs(*) because I thought I could take on the challenge of contributing to the growth of the business by working on the development of cutting-edge technologies.
Please tell us about your current work and research.
Recently, I have been working on extracting information from document images. There are still a lot of documents circulating in the world, in paper and document image form, but when we try to utilize these documents with the help of computers, we face the problem of how to convert them into data.
If you just want to get the text information, Optical Character Recognition (OCR), which has improved considerably in recent years, makes transcription easier. However, with OCR alone, the semantic information defined by the layout of the document is lost. For example, the format and layout of many business documents are meaningful in themselves. From the format, humans can easily understand how important each piece of information is and where the important information sits in the document. On the other hand, it is difficult for a computer to reproduce this task, so a human still has to specify or confirm which information is really necessary.
To solve this problem, a machine needs to be able to know where and how something is described in a document. For this purpose, we are working on methods that can classify, understand, and extract text, images, and other regions in images.
What led us to start this project was a consultation from an engineer. As I mentioned earlier, the performance of OCR has recently improved significantly. However, when we look at service development in practice, there are various issues that cannot be addressed by OCR alone. I am currently working on this theme with the hope of helping to solve these issues.
This may sound a bit extreme, but I personally think that technology itself can be the second priority. For me, the most important thing is to think about what social and business issues exist and how to solve them. With that in mind, I will continue to work on this project so that our current efforts will be useful for actual services.
What is the appeal of working at Megagon Labs and what are your future goals?
One of the attractive points of working at Megagon Labs(*) is the amount of discretion engineers have. In our team, the goal is the top priority, and we can choose our own methods to achieve that goal. For example, in other organizations, the methods themselves are specified, which can be quite restricting for the engineers, but in our team, there are no restrictions that say, “do it this way”. From that point of view, working here provides ease and freedom.
As for my future goals, the idea of a world where AI can do things on its own sounds nice, but I prefer a world where people and machines interact to tackle things more creatively, and that is what I would like to work on.
In the future, as information extraction technology advances and machines are able to guarantee the conversion of documents into data, it will become possible to reconstruct and re-edit documents based on the vast and diverse documents that people have created so far. As a result, new forms of expression that have never been seen before may be created.
Music DJs may be a good analogy to describe what I mean. They create new music by interpreting and arranging the past sound sources of records in various ways. Maybe I want to realize this kind of remix culture in the field of technology as well.
Do you have any advice for working as a research engineer?
Personally, I think it’s important not to define a role too much as “this is research” or “this is engineering” and dive into one or the other. If there is a specialist on the team, of course I will ask for their help, but I think it is necessary to have an attitude of making progress on your own in areas that are lacking or where it is better to take the initiative.
I am a research engineer, but in my mind I do not draw a line between the two. If I have to define my role first, it makes it difficult for me to achieve my goals and to move according to the growth of the project or business. I worked in research before becoming an engineer, and I think it is very important to be able to move back and forth between both practices in order to improve and implement technology in society.
To stay proficient and engaged in both, I attend conferences and other events on the research side to gather well-sourced information. As for engineering, I use social networking services to keep abreast of the latest trends, although it can be a mixed bag.
Also, from a slightly different angle, it is essential to look at things other than technology and programming in order to implement technology that can make a change in society. There are many different ways of thinking in the world, and this is just my own way of thinking, but I try to be flexible and remember that technology is a tool, not a purpose. If you fixate too much on a specific technology, it becomes hard to apply it in the real world.
It is important to look around the world, have various experiences, see and hear strange things, and look for fields and issues where you can utilize your technology through these experiences. And once you have found a certain challenge, think about what barriers stand in the way of solving it, and what status quo biases are at work. Those constraints are often very strong, so it is very important to break them down into small, solvable pieces.
Lastly, I recommend you get a doctoral degree. Even just a few years back, doctoral graduates were not widely seen as a resource in the corporate world. That mindset has changed, and corporations now recognize how valuable doctoral graduates are. I also believe that it is quite valuable to study hard and engage in research.
Please tell us about your favorite spots in Japan.
Since joining the company, I have been working almost exclusively in Ginza, which I recommend because you can enjoy lunch at relatively reasonable prices, even at restaurants that cost more at night. It’s worthwhile and fun to explore. However, it is difficult to eat out these days due to the pandemic, which I hope will end soon.
We hope you’ve enjoyed this special interview with Kaoru! Check out our blog to see other team member profiles and learn more about Megagon’s recent work. Follow us on LinkedIn and Twitter to stay up to date with us.