Streamlining Reader Q&A with Zero-Shot Learning

This video series from R&D features team members describing their roles, processes and the specific technical challenges they encounter while building and shipping projects. Along with each episode, we’ll share relevant background, resources, references and advice for anyone interested in creating something similar or learning more. If you have any questions, you can email us at [email protected].

In this episode, R&D engineer Jack Cook describes the latest improvements to Switchboard, an internal tool that uses natural language processing to cluster reader questions and quickly surface answers from reporters and experts in the newsroom. When Switchboard was first deployed in March 2020 to help answer questions about Covid-19, it had to be trained on thousands of topic-specific question pairs, a process that took months. To cut down on setup time and make Switchboard easier to deploy for other topics and new areas of coverage, we’re experimenting with a new approach that uses zero-shot learning. With zero-shot learning, we can explore a much broader set of use cases for Switchboard and, someday, potentially enable readers to get answers to any question across the entire news report.

By Jack Cook, A.J. Chavar, Adam Blufarb, Rony Karadi, Minkyoung Kim, Lana Z Porter, Olivia Feld

Jack’s Notes

  • We’ve experimented with several approaches to achieve high-accuracy answer retrieval for readers. Switchboard’s clustering model was initially based on DistilBERT, a smaller, distilled version of BERT, Google’s general-purpose language model. The model was fine-tuned on question pairs to learn how similar two questions are (a minimal sketch of this approach appears after these notes). However, without a high-quality training dataset, the accuracy of our downstream predictions suffered. We considered building our own dataset, but realized that a zero-shot approach, which would eliminate the need for training data, would be more efficient.
  • Zero-shot models, which have recently become more mainstream in N.L.P., can be prompted to solve a variety of tasks without any training data. That flexibility makes zero-shot models ideal for a news organization: we don’t know what the news will be about next week or next month, so it’s difficult to source training data and train topic-specific models ahead of time. If we eventually want to use Switchboard within our daily report, we need a different approach.
  • We experimented with two zero-shot learning approaches: one based on BERT and trained on a natural language inference task, and a second using T0, a newer zero-shot model from BigScience (hedged sketches of both appear after these notes). Instead of building an entire dataset, we describe the task we want the model to perform with a prompt, which gives the model enough information to match each question. We tested both versions in an internal experiment during the Tony Awards, where members of R&D watched the broadcast and submitted questions to Switchboard. The test confirmed that the zero-shot model could adapt to topics we had never covered before. For example, when asked, “Has anyone named Tony ever won a Tony?” the BERT-based model responded with the same answer as “Who is Tony?”, whereas the T0-based model correctly identified these as two different questions.
  • So far, Switchboard has helped answer upwards of 35,000 reader questions. We continue to iterate on model improvements, UX and user testing. In the future, we hope to use this zero-shot learning approach to power live Q&As between readers and reporters.
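
To make the original approach concrete, here is a minimal sketch of similarity-based question matching with a DistilBERT sentence encoder, in the spirit of Switchboard’s first clustering model. The checkpoint name, example questions and thresholding are illustrative assumptions, not Switchboard’s actual configuration.

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative checkpoint: a DistilBERT encoder fine-tuned for sentence
# similarity (not Switchboard's actual model).
model = SentenceTransformer("distilbert-base-nli-stsb-mean-tokens")

questions = [
    "When will a Covid-19 vaccine be available?",
    "How soon can I get vaccinated?",
    "Are masks required on public transit?",
]

# Encode each question into a dense vector.
embeddings = model.encode(questions, convert_to_tensor=True)

# Pairwise cosine similarity; pairs scoring above a tuned threshold
# would be grouped into the same cluster and share an answer.
scores = util.cos_sim(embeddings, embeddings)
for i in range(len(questions)):
    for j in range(i + 1, len(questions)):
        print(f"{scores[i][j]:.2f}  {questions[i]!r} <-> {questions[j]!r}")
```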
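
The first zero-shot approach frames question matching as natural language inference: if a model trained on N.L.I. judges that a new question asks the same thing as an already-answered one, the two can share an answer. Below is a hedged sketch using Hugging Face’s zero-shot classification pipeline with an off-the-shelf MNLI checkpoint; the model and hypothesis template are assumptions, not the ones Switchboard uses.

```python
from transformers import pipeline

# An off-the-shelf NLI-trained checkpoint stands in for Switchboard's model.
nli = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

new_question = "Has anyone named Tony ever won a Tony?"
answered_question = "Who is Tony?"

# Score whether the new question asks the same thing as one we have
# already answered; a low score means they should not share an answer.
result = nli(
    new_question,
    candidate_labels=[answered_question],
    hypothesis_template="This question asks the same thing as: {}",
)
print(result["labels"][0], round(result["scores"][0], 2))
```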
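
The T0 approach replaces training data entirely with a plain-language prompt describing the task. A minimal sketch, assuming the publicly released bigscience/T0_3B checkpoint and an illustrative prompt wording:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# T0_3B is the smaller public T0 release from BigScience; the prompt
# below is illustrative, not Switchboard's production prompt.
tokenizer = AutoTokenizer.from_pretrained("bigscience/T0_3B")
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0_3B")

prompt = (
    "Question 1: Has anyone named Tony ever won a Tony?\n"
    "Question 2: Who is Tony?\n"
    "Are these two questions asking the same thing? Yes or no?"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the task is specified in the prompt rather than learned from examples, the same model can be pointed at a new topic simply by rewording the prompt.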
