Machine Intelligence Lab @SNUEE

The Machine Intelligence Lab (MILAB) at Seoul National University currently focuses on natural language processing, deep learning and its applications, and data analysis and web services. We develop deep learning algorithms for natural language processing, including question answering (QA), context-aware dialogue, machine translation, and speech recognition. We also explore deep learning applications for logical inference, graph-structured data, multi-modal tasks, and clinical events, and we conduct data analysis and web service research on topics such as crowdsourcing, recommendation systems, and identifying misinformation.

Collaborators

Our group encourages collaboration with researchers from other institutions and research groups to broaden our research capabilities. We do joint work with Samsung Advanced Institute of Technology, MIT, KAIST, Microsoft Research Cambridge, Adobe Research, Hyundai AIRLab, and Naver.

Research

Natural Language Processing

Question Answering System with Deep Learning

Question answering has long been considered a primary objective of artificial intelligence, and advances in QA systems have recently attracted great interest from both academia and industry. We have been developing ranking algorithms that select the best answer among a set of candidates. We also investigate human-like QA systems that generate answers in natural language consistent with the given context. Furthermore, our research aims to incorporate external knowledge into QA systems by integrating deep neural network-based models with human-curated expert knowledge bases. This is joint work with Adobe Research.
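To make the answer-ranking setting concrete, the following is a minimal sketch of ranking candidate answers against a question. It is not the lab's ranking algorithm: it swaps in bag-of-words cosine similarity as a stand-in for a learned neural scorer, and the names `bag_of_words`, `cosine`, and `rank_answers` are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase, whitespace-split token counts (a deliberately crude featurizer)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_answers(question, candidates):
    """Return candidates sorted from most to least similar to the question."""
    q = bag_of_words(question)
    return sorted(candidates, key=lambda c: cosine(q, bag_of_words(c)), reverse=True)


question = "what is the capital of france"
candidates = [
    "the weather is nice today",
    "paris is the capital of france",
    "france is in europe",
]
best = rank_answers(question, candidates)[0]
print(best)
```

In a neural ranker, the scoring function would presumably be replaced by a trained model that scores each question-candidate pair, while the surrounding sort-and-select logic stays the same.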

Context Aware Dialogue System with Deep Neural Network

There has been growing interest in human-like dialogue systems such as Apple Siri, Google Assistant, and other intelligent assistant services. However, current chatbots often fail to recognize the context of utterances, which causes irritation and inconvenience. To overcome these difficulties, we are developing a deep learning-based dialogue model that can understand context that accumulates over the course of a dialogue. Furthermore, we aim to develop an intelligent personal assistant by combining the dialogue system with speech recognition and image processing. This is joint work with the Social Computing Lab at KAIST.

Ethical Biases in AI: fairness, privacy, hostility and legality

With the increasing use of AI in industries such as medicine, law, and advertising, it is important to address fairness and eliminate algorithmic bias in automated decision-making. This means that an AI system's results should not be influenced by social prejudices present in the training data. Furthermore, AI models must protect personal information and avoid producing illegal results. We are working towards defining fairness in deep learning models and finding ways to resolve ethical biases in these models. For instance, in law, the aim is to develop AI that does not discriminate against protected social groups; in medicine, the focus is on protecting patients' privacy; and in advertising, the goal is to create algorithms that prevent unethical practices and promote fair advertising. This is joint work with the Ministry of Science and ICT (IITP).

Deep Neural Machine Translation

As one of the major tasks in deep learning, Neural Machine Translation (NMT) trains deep neural networks to translate a sentence from a source language into a target language while preserving its implicit meaning. Our NMT research uses not only the input sentence but also its implicit and explicit context, such as surrounding paragraphs, additional audio/video signals, and linguistic features. We have developed a hierarchical Transformer structure that exploits several preceding sentences, and we have shown that using this extra context improves translation performance, especially on spoken-language-style text. On an English-Korean spoken language corpus, our hierarchical Transformer achieved better translation quality than other contemporary deep learning-based translation models. We also study more efficient architectures to improve the Transformer and multi-head attention.

Language Representation Learning for natural language understanding

Interest in self-supervised learning for general language understanding has grown since Bidirectional Encoder Representations from Transformers (BERT) was proposed. Following this trend, we are developing a new bidirectional language model based on the Transformer encoder that, unlike BERT, has a language auto-encoding property. We have confirmed that our model has a wider range of applications, including unsupervised tasks such as evaluating the naturalness of text. We continue to study and enhance the contextual word representations produced by the proposed approach.

Speech Recognition system with Deep Neural Network

As Deep Neural Networks (DNNs) have led to significant improvements in Automatic Speech Recognition (ASR), we are studying how to further improve ASR systems using additional modalities such as text or images. To increase recognition accuracy, we research combining the ASR system with a language model that can evaluate the naturalness of a sentence, which helps the system choose the more plausible sentence among candidates with similar pronunciation. We also study how to effectively extract and combine information from additional modalities in ASR systems. For ASR systems to be used in applications such as virtual assistants or speaker diarisation, techniques such as sentiment analysis, dialogue systems, and object detection are essential.
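The idea of letting a language model choose between similar-sounding candidates can be sketched as n-best rescoring. This is an illustrative sketch only, not the lab's system: the function names, the hand-assigned word scores, and the acoustic scores are all hypothetical, and a real system would use scores from a trained recognizer and language model.

```python
def rescore(hypotheses, lm_score, lm_weight=0.5):
    """Pick the hypothesis with the best combined score.

    hypotheses: list of (text, acoustic_log_prob) pairs from a recognizer.
    lm_score:   function mapping text to a log-probability-like score.
    lm_weight:  how much the language model counts relative to the acoustics.
    """
    return max(hypotheses, key=lambda h: h[1] + lm_weight * lm_score(h[0]))[0]


# Toy stand-in for a language model: hand-assigned word log-probabilities.
TOY_WORD_LOGP = {
    "recognize": -2.0, "speech": -2.5,
    "wreck": -6.0, "a": -1.0, "nice": -3.0, "beach": -4.0,
}
UNKNOWN_LOGP = -10.0  # assumed penalty for words missing from the toy table


def toy_lm_score(text):
    """Sum of per-word scores; longer or rarer word sequences score lower."""
    return sum(TOY_WORD_LOGP.get(w, UNKNOWN_LOGP) for w in text.split())


# Two candidates with nearly identical (made-up) acoustic scores.
n_best = [
    ("wreck a nice beach", -11.8),
    ("recognize speech", -12.0),
]
acoustic_only = rescore(n_best, toy_lm_score, lm_weight=0.0)
with_lm = rescore(n_best, toy_lm_score, lm_weight=0.5)
print(acoustic_only, "|", with_lm)
```

With the language model weight set to zero the acoustically preferred candidate wins; with a positive weight the more natural sentence is selected instead.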

Deep Learning and Applications

Deep Multi-modal Learning

Real-world data exist in various modalities, such as images and sounds. We study multimodal deep learning algorithms in which a model fuses multimodal inputs. For example, we study visual question answering, which requires understanding both text and images, and multimodal speech emotion recognition, which requires understanding both text and sound.

Deep Learning for Logical Inference

In contrast to the ground-breaking success of deep learning in recognition domains, a growing body of literature points out that deep models fail to extend rules in logical inference tasks. Without logical inference abilities, the following applications of deep models are not possible: designing mathematical solvers that can handle variables within an unprecedented range; developing context-aware QA systems that can track relevant sentences within an evolving dialogue; and implementing high-level code processors that understand user-defined algorithms written in program code. Our research aims to make neural networks generalize systematically, i.e., in a rule-based way, by combining learned rules as humans do. This is joint work with Samsung Research Funding & Incubation Center for Future Technology.

Graph Structured Data Learning with Graph Neural Network

Graph neural networks are deep learning architectures that process graph-structured data by learning explicitly from its structure, and they have a wide range of applications, from molecular property prediction to modeling physical systems. We aim to identify the shortcomings of current graph neural network architectures and to improve on them by integrating ideas from graph theory, information theory, and deep learning. We also extend graph neural networks to the natural language and vision domains to explicitly learn the inherent relationships within the data.
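As a rough illustration of what learning from graph structure means, the sketch below runs one round of mean-aggregation message passing in plain Python. It is a simplified assumption-laden example, not any specific architecture from our work: the function name `message_passing_step` is hypothetical, and practical graph neural network layers additionally apply learned weight matrices and nonlinearities.

```python
def message_passing_step(features, edges):
    """One round of mean-aggregation message passing on an undirected graph.

    features: dict mapping node -> list of floats (all the same length).
    edges:    list of (u, v) pairs.
    Each node's new feature is the mean of its own and its neighbours' features.
    """
    neighbours = {node: [node] for node in features}  # include a self-loop
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    updated = {}
    for node, nbrs in neighbours.items():
        dim = len(features[node])
        updated[node] = [
            sum(features[n][d] for n in nbrs) / len(nbrs) for d in range(dim)
        ]
    return updated


# A three-node path graph: a - b - c
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.0, 0.0]}
edges = [("a", "b"), ("b", "c")]
updated = message_passing_step(features, edges)
print(updated)
```

After one step, each node's representation mixes in information from its direct neighbours; stacking several such steps lets information travel further along the graph.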

Image Processing with Deep Neural Network

Many of the tasks that drove the development of deep learning originated in image processing, and the field remains an active area of study. Image processing tasks require an in-depth understanding of continuous distributions. We work on image processing with both generative and discriminative models. More specifically, we conduct research on image generation using GANs with semantic segmentation information, as well as on classical object detection.

Predicting Clinical Events with Deep Learning

Although large amounts of clinical data have been collected from very large numbers of patients over many years and the related ML problems have been studied, most existing work does not achieve the required accuracy and scalability. Building on recent advances in RNNs and scalable computation, we are developing an RNN model that forecasts future medication prescriptions by assessing the entire history of each patient.

Data Analysis and Web Service

Crowdsourcing

Crowdsourcing has become one of the cornerstones of research on intelligent systems based on human computation. Internet-based services like ‘Mechanical Turk’ allow workers from around the world to be easily hired and managed to solve a variety of problems. Crowdsourcing systems are now in widespread use for large-scale data-processing tasks such as image classification, video annotation, form data entry, optical character recognition, translation, recommendation, and proofreading. We are developing a general model of such crowdsourcing tasks and devising an efficient algorithm that determines the most likely answers by combining workers' responses. Even though these low-paid workers can be unreliable, our algorithm can achieve nearly optimal results.
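Combining the responses of unreliable workers can be illustrated with two simple baselines: a plain majority vote, and a second vote in which each worker is weighted by how often they agreed with the majority. This is a sketch under stated assumptions and is not the algorithm described above; the function names and the example data are hypothetical.

```python
from collections import Counter, defaultdict


def majority_vote(responses):
    """responses: list of (worker, task, label). Returns {task: most common label}."""
    by_task = defaultdict(list)
    for _worker, task, label in responses:
        by_task[task].append(label)
    return {
        task: Counter(labels).most_common(1)[0][0]
        for task, labels in by_task.items()
    }


def worker_weights(responses):
    """Estimate each worker's reliability as their agreement rate with the majority."""
    consensus = majority_vote(responses)
    agree, total = Counter(), Counter()
    for worker, task, label in responses:
        total[worker] += 1
        agree[worker] += int(label == consensus[task])
    return {worker: agree[worker] / total[worker] for worker in total}


def weighted_vote(responses):
    """Vote again, counting each response by its worker's estimated reliability."""
    weight = worker_weights(responses)
    scores = defaultdict(Counter)
    for worker, task, label in responses:
        scores[task][label] += weight[worker]
    return {task: max(s, key=s.get) for task, s in scores.items()}


responses = [
    ("w1", "t1", "cat"), ("w2", "t1", "cat"), ("w3", "t1", "dog"),
    ("w1", "t2", "dog"), ("w2", "t2", "dog"), ("w3", "t2", "cat"),
]
print(majority_vote(responses))
print(worker_weights(responses))
print(weighted_vote(responses))
```

Iterating the weighting step until the answers stop changing is one common way to refine such estimates, at the cost of additional passes over the responses.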

Recommendation Systems

With the growth of the e-commerce market and a flood of products, recommendation systems are becoming increasingly important. A recommendation system not only increases user satisfaction but also increases product sales. However, recommending the products that users actually want from large-scale data is a difficult problem. Our research goal is to build an effective deep learning-based recommendation system that can capture user preferences from diverse user information.

Identifying Misinformation on Social Media [link to the project page]

Over the past few years, social media has been awash with false and misleading information, sometimes called “fake news”. Much of the information shared online lacks verification, and many organizations therefore engage in debunking and fact-checking to combat the massive spread of disinformation. As part of this effort, we have been tackling the headline incongruence problem, in which a news headline makes claims that are unrelated to or distinct from the story in the body text. Drawing on our extensive NLP research experience, we have developed various models based on an attentive hierarchical encoder and graph neural networks. This is joint work with the Social Computing Lab at KAIST.

Selected Projects

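Research on developing AI technology that evolves flexibly in step with increasingly strict ethics policies (IITP, 2022-2026)
Development of a consistent and scalable knowledge-grounded dialogue model (LG AI Research, 2022-2023)
IITP Global Core Talent Development Support Program (Microsoft Research Asia, 2021-2022)
Improving the reliability of large language models and applying them to knowledge-based dialogue models (Naver, 2021-2024)
Development of a deep learning-based question answering system capable of logical inference (Mid-Career Researcher Program, National Research Foundation of Korea, 2021-2024)
Construction of AI training data for multifaceted competency analysis of students and youth, for innovating regional public education (NIA, 2021)
Research on natural language processing techniques for adverse drug reaction reports (Korea Food and Drug Administration, 2021-2022)
Development of fast and efficient deep learning-based language representation for context awareness (Samsung DS, 2020-2025)
Development of a knowledge-based conversational question answering system (Hyundai Motor AIRLab, 2020-2022)
Research on dialogue models based on multimodal context understanding (Samsung Research, 2020-2021)
Development of deep learning architectures for logical inference (Samsung Science & Technology Foundation, 2019-2021)
Development of deep learning-based context-aware dialogue model algorithms (Samsung Advanced Institute of Technology, 2019-2020)
Development of an AI-based automatic temperature control system (다산지앤지, 2018-2019)
Research on rumor detection in Naver blogs (Naver, 2017-2018)
Development of an AI dialogue system aware of conversational context and emotion (Industrial Core Technology Program, Ministry of Trade, Industry and Energy, 2016-2021)
Development of a petaflops-class heterogeneous supercomputer (Ministry of Science, ICT and Future Planning, 2016-2021)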
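Development of a deep learning engine for online product recommendation (Small and Medium Business Administration, 2016-2017)
Research on improving neural network-based Japanese-English translation (NHN, 2016-2017)
Development of machine learning methods for detecting undiscovered adverse drug information from medical big data (Mid-Career Researcher Program, National Research Foundation of Korea, 2016-2019)
Development of a learning-based Korean natural language processing system and an English-Korean machine translation system (한우물로 홈런치기 program, Institute of Engineering Research, Seoul National University, 2016-2026)
Deep learning-based machine translation for smart classrooms (Samsung Advanced Institute of Technology, 2015-2020)
Hearing aid app combining smartphones and wearable devices (소리노리닷컴, 2015-2016)
Development of a deep learning-based character recognition system for PCB inspection systems (Koh Young Technology, 2015-2016)
Development of privacy-preserving learning technology for personalized services (Samsung Electronics, 2015)
Excellent New Faculty Funding on “Analysis of Information Diffusion and Network Dynamics in Social Networks Based on Big Data Analytics.” (National Research Foundation of Korea, 2012-2015)
Development of a smartphone app recommendation system (LG Electronics, 2013-2014)
Advisory professor on data analysis techniques, Samsung Advanced Institute of Technology, Samsung Electronics (Data Analytics Group, 2010-2011)
Research on large-scale data processing and friend recommendation algorithms for me2day (NHN, 2011-2012)
Research on influence maximization in social networks (Microsoft Research, 2011-2012)
Research on forensic analysis of social networks using information diffusion theory (ETRI, 2011-2012)
Development of specialized technology for zero-defect verification of optimization software (Ministry of Education, Science and Technology, ERC, 2010-2016)
Development of algorithms for social network and web data analysis and recommendation (KOLON-KAIST, 2010-2013)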