In August 2011, Michele Bachmann held a small rally for a few dozen people in the parking lot of a sports bar in Indianola, Iowa. Over the course of the event, Bachmann, like many politicians, repeatedly misled her audience. The Post’s National Political Editor, Steven Ginsberg, was at the event and detected a problem: no one attending seemed to realize they were being misled. From that moment, the Post set out to try to fix that problem and to give the public the information it needs at the moment it needs it.
Our solution is Truth Teller, which aims to fact check speeches in as close to real time as possible. Truth Teller is a prototype of a news application built by the Post with funding from Knight Foundation's Prototype Fund. The prototype, built in three months, is a big step toward real-time fact checking.
The Post is dedicated to this project because we believe strongly that informing and educating the public is one of the most critical missions we can perform, particularly when it comes to our elected officials - regardless of their political affiliation. Amid the cacophony of an instant-news culture, identifying the truth is both harder and more important than ever. Facts themselves are increasingly under attack and falsehoods can easily and instantly find their way to a mass audience. In fact, many are designed to.
For the prototype, we focused on the coming debate over tax reform because of both its timing and its importance. The tax debate will play out over several months and naturally lends itself to deceit and deception - even more so than many policy discussions. We hope that our application will help direct the conversation toward the truth as it is happening so that Americans get a fair shot at deciding this critical issue.
The Truth Teller prototype was built and runs with a combination of several technologies - some new, some very familiar. We've combined video and audio extraction with a speech-to-text technology to search a database of facts and fact checks. The Post also worked with Dan Schultz, creator of Truth Goggles, who consulted on the project and shared his knowledge of real-time fact checking. We are effectively taking in video, converting the audio to text, matching that text to our database, and then displaying, in real time, what’s true and what’s false. The key to the project’s success is building an authoritative database - our goal is to identify falsehoods, not create more of them.
We are transcribing videos using Microsoft Audio Video indexing service (MAVIS) technology. MAVIS is a Windows Azure application which uses Deep Neural Net (DNN) based speech recognition technology to convert audio signals into words. Using this service, we are extracting audio from videos and saving the information in our Lucene search index as a transcript. We are then looking for the facts in the transcription. Finding distinct phrases to match is difficult. Instead, we are focusing on patterns.
We are using approximate string matching, also known as fuzzy string searching. We have implemented a modified version of the Rabin-Karp algorithm that uses Levenshtein distance. In the future, this will be extended to recognize paraphrasing and negative connotations.
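To make the idea of approximate matching concrete, here is a minimal sketch in Python. It is not the Post's implementation: it slides a window of words across a transcript and compares each window to a known claim using word-level Levenshtein distance, leaving out the rolling hash that Rabin-Karp normally uses for exact matching. The function names, the tokenizer, the sample sentences, and the max_distance threshold are all assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute cost 1)."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def find_claims(transcript, claims, max_distance=1):
    """Return (claim, word_offset, distance) for the closest window per claim.

    A claim is reported only if its closest window is within max_distance
    word edits. Hypothetical helper, not part of Truth Teller.
    """
    words = tokenize(transcript)
    hits = []
    for claim in claims:
        pattern = tokenize(claim)
        best = None
        # Compare every window of len(pattern) words against the claim.
        for start in range(max(1, len(words) - len(pattern) + 1)):
            window = words[start:start + len(pattern)]
            distance = levenshtein(window, pattern)
            if distance <= max_distance and (best is None or distance < best[2]):
                best = (claim, start, distance)
        if best is not None:
            hits.append(best)
    return hits


# Illustrative data only; neither sentence comes from a real fact database.
transcript = "We will cut taxes for every middle class family in America"
claims = ["cut taxes for all middle class families"]
matches = find_claims(transcript, claims, max_distance=2)
```

With a threshold of two word edits, the sample claim matches the transcript starting at the third word, even though "every" and "family" differ from "all" and "families". Comparing whole words rather than characters is one design choice among several; a production matcher would also need stemming and some handling of negation, which this sketch does not attempt.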
What you see in the prototype is actual live fact checking - each time the video is played the fact checking starts anew. It needs more technical work and we need more facts, but it works and we’ll keep working on it. Can this be applied to streaming video in the future? Yes. Can this work if someone is holding up a phone to record a politician in the middle of a parking lot in Iowa? Yes, we believe it can.
The Washington Post Truth Teller team
Cory Haik, Executive Producer for Digital News
Steven Ginsberg, National Political Editor
Joey Marburger, Mobile Product Director
Yuri Victor, UX Director
Siva Ghatti, Director, Application Development
Ravi Bhaskar, Principal Software Engineer
Gaurang Sathaye, Principal Software Engineer
Julia Beizer, Mobile Projects Editor
Sara Carothers, Producer
Related: "Realtime Political Fact-Checking Becomes A Reality With WaPo's 'Truth Teller'" in TechCrunch