The Machine Intelligence Lab (MILAB) at Seoul National University focuses on analyzing massive data systems and designing scalable machine learning algorithms based on structural and contextual properties. Our current research topics include deep learning for text analysis and recommendation, machine learning and statistical inference, and social network analysis. Using GPUs, we build systems for machine translation, question answering (QA), and recommendation.

Our group encourages collaboration with researchers from other institutions and research groups to broaden our research capabilities. We are currently collaborating with researchers from Samsung Advanced Institute of Technology, Microsoft Research Cambridge, MIT, IBM Research, Bell Labs Murray Hill, HKUST, LG Electronics, NHN, KAIST, and ETRI.

Research

Deep Learning and Applications

  • Sentence prediction with RNN Language Model

We are currently developing a recommendation and prediction system for SMS on an Android client. Given the messages entered so far, the system uses an LSTM-RNN language model to predict and recommend the user's next message. We expect this study to give us meaningful insight into language models and short-text analysis. This is joint work with Samsung Electronics Software Center.
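The idea of suggesting what comes next from what has been typed can be illustrated with a much simpler stand-in. The sketch below uses a bigram count model, not the LSTM-RNN language model described above, and it predicts the next word rather than a full message. The function names `train_bigram` and `predict_next` and the toy message history are hypothetical, made up for illustration only.

```python
from collections import Counter, defaultdict


def train_bigram(messages):
    """Count word-to-next-word transitions over a list of messages."""
    counts = defaultdict(Counter)
    for message in messages:
        words = message.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts, text, k=3):
    """Return up to k most frequent followers of the last word in text."""
    words = text.lower().split()
    if not words or words[-1] not in counts:
        return []
    return [word for word, _ in counts[words[-1]].most_common(k)]


# Toy message history (made up for illustration only).
history = [
    "see you soon",
    "see you tomorrow",
    "see you soon then",
    "talk to you later",
]
model = train_bigram(history)
suggestions = predict_next(model, "ok see you")
```

A recurrent language model replaces the count table with a learned hidden state, so it can condition on the whole input rather than only the last word.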

  • Predicting Clinical Events with Deep Learning

Large amounts of clinical data have been collected from very large patient populations over many years, and the related ML problems have been studied, but most existing work does not achieve the required accuracy and scalability. Building on recent advances in RNNs and scalable computation, we are currently developing an RNN model that forecasts future medication prescriptions from a patient's entire history.

  • Machine Translation with Deep Neural Networks

We are currently developing a multilingual word embedding for Korean, English, and Chinese that helps a sentence-sequence-based LSTM-RNN machine translation architecture capture meaningful elements efficiently. The goal of this study is to introduce a word representation that captures similar semantic and grammatical features across languages. This is joint work with Samsung Advanced Institute of Technology.

Machine Learning and Recommendation System

  • Statistical Inference and Optimization

In many machine learning problems, using prior knowledge as constraints in statistical inference can be formulated as constrained discrete optimization, which is NP-hard in general. We study efficient algorithms for such inference problems. We apply our algorithm to image segmentation and show that imposing constraints greatly improves segmentation quality. This is joint work with the Machine Learning and Perception group at Microsoft Research Cambridge.

  • Crowdsourcing

Crowdsourcing has become one of the cornerstones of research on intelligent systems based on human computation. Internet-based services such as ‘Mechanical Turk’ allow workers from around the world to be easily hired and managed to solve various problems. Crowdsourcing systems are now in widespread use for large-scale data-processing tasks such as image classification, video annotation, form data entry, optical character recognition, translation, recommendation, and proofreading. We are currently developing a general model of such crowdsourcing tasks and devising an efficient algorithm that determines the most likely answers by combining workers' responses. Even though these low-paid workers can be unreliable, our algorithm can achieve nearly optimal results.
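As a rough illustration of combining unreliable responses, the sketch below alternates between a weighted vote per task and a reweighting of each worker by how often that worker agrees with the current estimates. This is a simple heuristic chosen for illustration, not the algorithm developed in this project. The function name `aggregate`, the binary label set {-1, +1}, and the toy responses are all assumptions.

```python
def aggregate(responses, iterations=10):
    """Estimate binary task labels from noisy worker responses.

    responses maps (worker, task) -> label in {-1, +1}.
    Returns a dict mapping task -> estimated label.
    """
    workers = {w for w, _ in responses}
    tasks = {t for _, t in responses}
    weight = {w: 1.0 for w in workers}  # start with equal trust
    estimate = {}
    for _ in range(iterations):
        # Step 1: weighted vote per task (ties resolve to +1).
        score = {t: 0.0 for t in tasks}
        for (w, t), label in responses.items():
            score[t] += weight[w] * label
        estimate = {t: (1 if s >= 0 else -1) for t, s in score.items()}
        # Step 2: reweight workers by agreement with the current estimates.
        agree = {w: 0 for w in workers}
        total = {w: 0 for w in workers}
        for (w, t), label in responses.items():
            total[w] += 1
            agree[w] += int(label == estimate[t])
        # Agreement rate in [0, 1] becomes a weight in [-1, 1], so a worker
        # who is mostly wrong ends up counting against their own answer.
        weight = {w: 2.0 * agree[w] / total[w] - 1.0 for w in workers}
    return estimate


# Toy data (made up): workers "a" and "b" are always right, "c" always wrong.
truth = {"t1": 1, "t2": -1, "t3": 1, "t4": -1}
responses = {}
for task, label in truth.items():
    responses[("a", task)] = label
    responses[("b", task)] = label
    responses[("c", task)] = -label
labels = aggregate(responses)
```

The first round is plain majority voting; later rounds differ from it only when some workers have been found to be more reliable than others.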

  • Decentralized Graph Inference Algorithm

We present a novel meta-algorithm, Partition-Merge (PM), which takes existing centralized algorithms for graph computation and makes them distributed and faster. In a nutshell, PM divides the graph into small subgraphs using our novel randomized partitioning scheme, runs the centralized algorithm on each partition separately, and then stitches the resulting solutions together to produce a global solution. We demonstrate the efficiency of the PM algorithm on two popular problems: computation of the Maximum A Posteriori (MAP) assignment in an arbitrary pairwise Markov Random Field (MRF), and modularity optimization for community detection. This is joint work with MIT.
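The partition, solve, and stitch structure can be sketched as follows for MAP inference in a small binary pairwise MRF. The partitioning here grows breadth-first balls of a fixed radius around randomly ordered seed nodes, which is an assumption standing in for the randomized scheme mentioned above, and the "centralized algorithm" is plain exhaustive search. All function names, the scoring functions, and the toy graph are hypothetical.

```python
import itertools
import random


def random_partition(adj, radius, rng):
    """Assign every node to a block by growing BFS balls around random seeds."""
    unassigned = set(adj)
    order = list(adj)
    rng.shuffle(order)
    blocks = []
    for seed in order:
        if seed not in unassigned:
            continue
        block, frontier = {seed}, {seed}
        unassigned.discard(seed)
        for _ in range(radius):
            frontier = {v for u in frontier for v in adj[u] if v in unassigned}
            unassigned -= frontier
            block |= frontier
        blocks.append(block)
    return blocks


def brute_force_map(nodes, adj, node_score, edge_score):
    """Exhaustive MAP on one block; edges leaving the block are ignored."""
    nodes = sorted(nodes)
    best, best_value = None, float("-inf")
    for labels in itertools.product((0, 1), repeat=len(nodes)):
        x = dict(zip(nodes, labels))
        value = sum(node_score[v][x[v]] for v in nodes)
        value += sum(edge_score(x[u], x[v])
                     for u in nodes for v in adj[u] if v in x and u < v)
        if value > best_value:
            best, best_value = x, value
    return best


def partition_merge(adj, node_score, edge_score, radius=1, seed=0):
    """Solve each block independently and merge the per-block assignments."""
    rng = random.Random(seed)
    solution = {}
    for block in random_partition(adj, radius, rng):
        solution.update(brute_force_map(block, adj, node_score, edge_score))
    return solution


# Toy instance (made up): a path of 8 nodes where every node prefers label 1
# and neighbouring nodes prefer to agree.
n = 8
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}
node_score = {i: (0.0, 1.0) for i in range(n)}
result = partition_merge(adj, node_score, lambda a, b: 0.5 if a == b else 0.0)
```

Because edges between blocks are dropped, the merged assignment is in general only an approximation of the global optimum; how close it gets depends on how the partition is drawn.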

Social Network Analysis

  • Decentralized Ranking Learning in Information Networks

The network consensus problem is to compute the mode of a distribution over items held at the nodes of an information network through decentralized message exchanges. We generalized this problem to decentralized ranking learning in the network, devised a scalable protocol for the task based on the voter model, and proved its correctness and efficiency. This is joint work with Microsoft Research Cambridge.
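For readers unfamiliar with the voter model, a minimal sketch of the basic dynamic follows: at each step one random node copies the opinion of one random neighbor, using only local message exchange. This is the textbook asynchronous voter model, not the ranking protocol described above, and the plain model reaches the initially most common opinion only with some probability rather than with certainty. The function name `voter_model` and the toy network are assumptions.

```python
import random


def voter_model(adj, opinion, rng, max_steps=100_000):
    """Run the asynchronous voter model until consensus or max_steps.

    adj maps node -> list of neighbors; opinion maps node -> initial item.
    Returns (final opinions, number of steps taken).
    """
    opinion = dict(opinion)  # do not modify the caller's dict
    nodes = list(adj)
    for step in range(max_steps):
        if len(set(opinion.values())) == 1:
            return opinion, step
        node = rng.choice(nodes)
        # The chosen node adopts the opinion of one random neighbor.
        opinion[node] = opinion[rng.choice(adj[node])]
    return opinion, max_steps


# Toy network (made up): complete graph on 8 nodes, 5 holding "A", 3 holding "B".
n = 8
adj = {i: [j for j in range(n) if j != i] for i in range(n)}
initial = {i: ("A" if i < 5 else "B") for i in range(n)}
final, steps = voter_model(adj, initial, random.Random(0))
```

Each update touches a single edge of the network, which is what makes this family of protocols attractive for decentralized settings.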

  • Tipping Point Analysis of Information Diffusion in Social Networks

Information diffusion plays an essential role in numerous human interactions, including the diffusion of innovations and the propagation of rumors. Understanding how information flows on networks is a central problem for industry and academia. A tipping point is a moment at which information begins to spread rapidly and dramatically. Our research goal is to identify tipping points and to analyze spreading behaviors. This is closely related to the initiation of a trend in marketing and to the emergence of phase transitions in many complex systems.
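One crude way to make "the moment at which information spreads rapidly" concrete is to look for the largest one-step jump in a cumulative adoption curve. That working definition is an assumption made for this sketch, not the criterion used in this research, and the function name `tipping_point` and the hourly counts are made up.

```python
def tipping_point(cumulative):
    """Return the index t where cumulative[t] - cumulative[t - 1] is largest."""
    if len(cumulative) < 2:
        raise ValueError("need at least two observations")
    growth = [b - a for a, b in zip(cumulative, cumulative[1:])]
    # growth[i] is the jump from time i to time i + 1, hence the offset of 1.
    return 1 + max(range(len(growth)), key=growth.__getitem__)


# Made-up cumulative share counts per hour (illustration only).
shares = [1, 2, 4, 7, 30, 95, 110, 115]
t = tipping_point(shares)  # the jump from 30 to 95 is the largest, so t == 5
```

On real data the curve would need smoothing first, since a single noisy interval can otherwise dominate the result.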

  • Trend and Rumor Prediction in Social Media

The problem of identifying trends is of particular practical importance in social media, where information can diffuse more rapidly and widely than it does offline. In this work, we classify trends by examining three aspects of diffusion: temporal, structural, and linguistic. For the temporal characteristics, we propose a new periodic time series model that considers both the daily cycle and the external shock cycle. This work is the first attempt to use periodic temporal features to identify trends and rumors, and we test it rigorously on a large annotated dataset based on a complete social media stream. This is joint work with the Social Computing Lab at KAIST and Microsoft Research Asia.

Projects

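  • Development of an AI dialogue system that recognizes conversational context and emotion (Industrial Core Technology Program, Ministry of Commerce, Industry and Energy, 2016-2021)
  • Development of a petaflops (PF)-class heterogeneous supercomputer (Ministry of Science, ICT and Future Planning, 2016-2021)
  • Research on improving the performance of neural-network-based Japanese-English translation (NHN, 2016-2017)
  • Development of machine learning methods for detecting undiscovered adverse drug information from medical big data (Mid-Career Researcher Program, National Research Foundation of Korea, 2016-2019)
  • Development of a learning-based Korean natural language processing system and an English-Korean machine translation system ("Hitting a Home Run by Digging One Well" (한우물로 홈런치기) program, Institute of Engineering Research, Seoul National University, 2016-2026)
  • Deep learning-based machine translation for smart classrooms (Samsung Advanced Institute of Technology, 2015-2017)
  • Hearing aid app combining a smartphone and wearable devices (Sorinori.com (소리노리닷컴), 2015-2016)
  • Development of a deep learning-based character recognition system for PCB inspection systems (Koh Young Technology, 2015-2016)
  • Development of privacy-preserving learning technology for personalized services (Samsung Electronics, 2015)
  • Excellent New Faculty Funding on “Analysis of Information Diffusion and Network Dynamics in Social Networks Based on Big Data Analytics.” (National Research Foundation of Korea, 2012-2015)
  • Development of a smartphone app recommendation system (LG Electronics, 2013-2014)
  • Advisory professor on data analysis methods, Samsung Advanced Institute of Technology (Data Analytics Group, 2010-2011)
  • Research on large-scale data processing and friend recommendation algorithms for me2day (NHN, 2011-2012)
  • Research on influence maximization in social networks (Microsoft Research, 2011-2012)
  • Research on social network forensic analysis using information diffusion theory (ETRI, 2011-2012)
  • Development of specialized zero-defect verification technology for optimization software (Ministry of Education, Science and Technology, ERC, 2010-2016)
  • Development of analysis and recommendation algorithms for social network and web data (KOLON-KAIST, 2010-2013)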