Projects

Improving Robustness of Machine Translation with Synthetic Noise

Published in NAACL, 2019   [pdf]

Modern Machine Translation (MT) systems perform consistently well on clean, in-domain text. However, most human-generated text, particularly on social media, is full of typos, slang, dialect, idiolect, and other noise that can have a disastrous impact on translation accuracy. In this paper we leverage the Machine Translation of Noisy Text (MTNT) dataset to enhance the robustness of MT systems by emulating naturally occurring noise in otherwise clean data. By synthesizing noise in this manner, we are ultimately able to make a vanilla MT system resilient to naturally occurring noise and partially mitigate the resulting loss in accuracy.
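The idea of emulating noise in clean data can be illustrated with a character-level corruption function. This is only a sketch: the specific operations (swap, drop, duplicate) and the noise rate `p` are illustrative assumptions, not the paper's exact noise model, which is derived from the MTNT data.

```python
import random

def add_synthetic_noise(sentence, p=0.1, seed=0):
    """Inject character-level noise into a clean sentence.

    Illustrative operations only (swap adjacent chars, drop a char,
    duplicate a char); the real noise model is estimated from MTNT.
    """
    rng = random.Random(seed)
    chars = list(sentence)
    out, i = [], 0
    while i < len(chars):
        if chars[i] != " " and rng.random() < p:
            op = rng.choice(["swap", "drop", "dup"])
            if op == "swap" and i + 1 < len(chars) and chars[i + 1] != " ":
                out.extend([chars[i + 1], chars[i]])  # transpose neighbors
                i += 2
                continue
            if op == "drop":
                i += 1  # skip this character entirely
                continue
            if op == "dup":
                out.extend([chars[i], chars[i]])  # double the character
                i += 1
                continue
        out.append(chars[i])
        i += 1
    return "".join(out)
```

Corrupted copies of the clean parallel corpus can then be mixed into training so the model sees noisy inputs paired with clean references.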

Evaluating Influence Functions for Memory Replay in Continual Learning

May, 2019

Replaying examples from previous tasks is a popular way to overcome catastrophic forgetting in machine learning systems aimed at continual learning (CL). Effectively selecting these examples under memory and time constraints, however, is a challenging problem. We experimented with several principled approaches to fixed-memory replay sampling derived from influence functions (Cook & Weisberg, 1982), selecting examples that help overcome catastrophic forgetting on previously encountered tasks. We performed an in-depth study of the effectiveness of influence-based sampling on the Split-MNIST benchmark in three different continual learning settings and compared it with other competitive subset sampling techniques. We empirically evaluated Herding, K-means, and influence-function-based sampling as replay sampling strategies across different CL settings.
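The influence-based selection idea can be sketched on a model where the Hessian is tractable. The following is a toy version for binary logistic regression: it scores each training example by the classical influence of up-weighting that example on a validation loss, `-g_val^T H^{-1} g_i`, and keeps the top-k as the replay buffer. The function name, the logistic-regression setting, and the exact Hessian inverse are simplifying assumptions; with neural networks, `H^{-1}` must be approximated.

```python
import numpy as np

def influence_replay_selection(X, y, X_val, y_val, k, lam=0.1):
    """Pick k replay examples by influence on validation loss (toy sketch).

    Binary logistic regression with an exact, regularized Hessian;
    deep-learning variants approximate the inverse-Hessian-vector product.
    """
    n, d = X.shape
    # Fit w by gradient descent on the L2-regularized logistic loss.
    w = np.zeros(d)
    for _ in range(200):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= 0.1 * (X.T @ (p - y) / n + lam * w)
    # Hessian of the training objective at the fitted w.
    p = 1.0 / (1.0 + np.exp(-X @ w))
    H = (X * (p * (1 - p))[:, None]).T @ X / n + lam * np.eye(d)
    H_inv = np.linalg.inv(H)
    # Gradient of the validation loss at w.
    p_val = 1.0 / (1.0 + np.exp(-X_val @ w))
    g_val = X_val.T @ (p_val - y_val) / len(y_val)
    # Per-example training-loss gradients: g_i = x_i * (p_i - y_i).
    per_example_grads = X * (p - y)[:, None]
    # Influence of up-weighting example i on validation loss.
    scores = -per_example_grads @ (H_inv @ g_val)
    # Keep the k examples with the largest absolute influence.
    return np.argsort(-np.abs(scores))[:k]
```

In a CL loop, this selection would run once per task to decide which of its examples enter the fixed-size replay memory.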

Language Informed Modeling of Code-Switched Text

Published in ACL, 2018   [pdf]

Code-switching (CS), the practice of alternating between two or more languages in conversation, is pervasive in most multilingual communities. CS texts exhibit a complex interplay between languages and occur in informal contexts, which makes them harder to collect and build NLP tools for. We approach this problem through Language Modeling (LM) on a new Hindi-English mixed corpus containing 59,189 unique sentences collected from blogging websites. We implement and discuss different language models derived from a multi-layered LSTM architecture. We hypothesize that encoding language information strengthens a language model by helping it learn code-switching points. Our highest-performing model achieves a test perplexity of 19.52 on the CS corpus we collected and processed, an improvement over the AWD-LSTM LM (a recent state of the art on monolingual English).
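The perplexity figure reported above is the standard exponentiated average per-token negative log-likelihood, which a minimal helper makes concrete (the function name is ours; losses are assumed to be in nats, as with typical cross-entropy implementations):

```python
import math

def perplexity(neg_log_likelihoods):
    """Corpus perplexity from per-token negative log-likelihoods (in nats).

    Lower is better: a uniform model over V tokens has perplexity V.
    """
    return math.exp(sum(neg_log_likelihoods) / len(neg_log_likelihoods))
```

So a test perplexity of 19.52 means the model is, on average, as uncertain as a uniform choice among roughly 20 tokens at each position.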