PhD Defense: Temporal Context Modeling for Text Streams

Jinfeng Rao
06.07.2018 13:00 to 15:00
AVW 4172

There is increasing recognition that time plays an essential role in many information-seeking tasks. This dissertation explores temporal models over evolving streams of text and how such models can improve information access. I consider two major cases: a stream of social media posts by many users, for tweet search; and a stream of queries by an individual user, for voice search. My work explores the relationship between temporal models and context models: for tweet search, the evolution of an event serves as the context for clustering relevant tweets; for voice search, a user's query history provides the context for understanding her true information need.

First, I address the tweet search problem by modeling the temporal contexts of the underlying collection. The intuition is that an information need on Twitter usually correlates with a breaking news event, so tweets posted during that event are more likely to be relevant. I explore techniques to model two different types of temporal signals: pseudo trends and query trends. A pseudo trend is estimated from the distribution of timestamps in an initial list of documents retrieved for a query, which I model with a continuous hidden Markov approach as well as neural network-based methods for relevance ranking and sequence modeling. A query trend, in contrast, is estimated directly from the temporal statistics of query terms, obviating the need for an initial retrieval. I propose two approaches to exploiting query trends: a linear feature-based ranking model and a regression-based model that aims to recover the distribution of relevant documents directly from the query trends.
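To make the pseudo-trend idea concrete, here is a minimal sketch (not the dissertation's actual model, which uses hidden Markov and neural approaches): it estimates a discrete temporal distribution from the timestamps of an initially retrieved document list, with Laplace smoothing so no time bin is assigned zero probability. The function name, bin count, and smoothing choice are illustrative assumptions.

```python
import numpy as np

def pseudo_trend(timestamps, t_min, t_max, num_bins=6):
    """Estimate a pseudo trend: a smoothed discrete distribution over time,
    built from the timestamps of an initial list of retrieved documents."""
    hist, _ = np.histogram(timestamps, bins=num_bins, range=(t_min, t_max))
    smoothed = hist + 1.0          # Laplace smoothing
    return smoothed / smoothed.sum()

# Toy example: most retrieved documents cluster early in the time window,
# so the trend assigns more mass to the earliest bin.
trend = pseudo_trend([1, 2, 2, 3, 20], t_min=0, t_max=24, num_bins=6)
```

Such a trend could then serve as a temporal prior, e.g., reweighting each candidate document's retrieval score by the probability mass of the bin its timestamp falls into.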
Extensive experiments on standard Twitter collections demonstrate the superior effectiveness of my proposed techniques.

Second, I introduce the novel problem of voice search on an entertainment platform, where users interact with a voice-enabled remote control through voice queries to search for TV programs. Such queries range from specific program navigation (e.g., watching a particular movie) to requests with vague intents, and even queries that have nothing to do with watching TV. I present successively richer neural network architectures to tackle this challenge, based on two key insights. The first is that session context can be exploited to disambiguate queries and recover from ASR errors, which I operationalize with hierarchical recurrent neural networks. The second is that query understanding requires integrating evidence across multiple related tasks, which I identify as program prediction, intent classification, and query tagging; I present a novel multi-task neural architecture that jointly learns to accomplish all three. The first model, already deployed in production, serves millions of queries daily with an improved customer experience. The multi-task learning model is evaluated in carefully controlled laboratory experiments, which demonstrate further gains in effectiveness and increased system capabilities. This work also won an Emmy Award in 2017 for its technical contribution to advancing television technologies.

To conclude, this dissertation presents families of techniques for modeling temporal information as context to assist applications with streaming inputs, such as tweet search and voice search. My models not only establish state-of-the-art effectiveness on many related tasks, but also reveal how various temporal patterns can affect real information-seeking processes.
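The hierarchical recurrent structure used for session context can be sketched as follows. This is a toy illustration with random, untrained parameters and a plain tanh recurrence, not the production architecture (which would use learned embeddings and gated units such as LSTMs or GRUs); all names and dimensions here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8  # toy hidden/embedding size

def rnn_final_state(inputs, W, U, h0):
    """Plain tanh RNN; returns the hidden state after the last input."""
    h = h0
    for x in inputs:
        h = np.tanh(W @ x + U @ h)
    return h

# Hypothetical toy parameters (random, untrained) for the two levels.
W_word, U_word = rng.normal(size=(DIM, DIM)), rng.normal(size=(DIM, DIM))
W_query, U_query = rng.normal(size=(DIM, DIM)), rng.normal(size=(DIM, DIM))

def encode_session(session):
    """session: a list of queries; each query is a list of word vectors.
    A lower-level RNN encodes the words within each query; an upper-level
    RNN runs over the query encodings, producing a session context vector
    that can help disambiguate the current query."""
    h_session = np.zeros(DIM)
    for query in session:
        q_vec = rnn_final_state(query, W_word, U_word, np.zeros(DIM))
        h_session = np.tanh(W_query @ q_vec + U_query @ h_session)
    return h_session

# Two queries of three words each, as random toy word vectors.
session = [[rng.normal(size=DIM) for _ in range(3)] for _ in range(2)]
context = encode_session(session)
```

The resulting session vector would typically be concatenated with the current query's encoding and fed to downstream heads, which is also where a multi-task setup (program prediction, intent classification, query tagging) could share the same encoder.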

Examining Committee:

Chair: Dr. Jimmy Lin
Dean's Representative: Dr. Alan Sussman
Members: Dr. Marine Carpuat, Dr. Jordan Boyd-Graber, Dr. John Dickerson