How DocumentCloud bolsters investigative journalism and the exchange of public information

July 27, 2017 by Aron Pilhofer

DocumentCloud's open source platform provides unfiltered access to millions of news and information documents sourced and uploaded by journalists the world over. Photo: Dave.Miles under License

Aron Pilhofer is a co-founder of DocumentCloud. Today Knight Foundation is announcing $250,000 in new support to the open source journalism tool, to encourage investigative journalism and the exchange of public information.

It was just about nine years ago when Eric Umansky, Scott Klein and I hit upon an idea we hoped might help journalists be a little more transparent in their reporting. Then as now, more and more questionable content masquerading as news was being published online every day, with the job of separating fact from fiction falling increasingly to the reader.

We believed then (and still do today) that if journalists were more open about their sourcing, it would help people differentiate between real reporting and nonsense. It would not only increase the trust people had in journalism, it would encourage people to expect this same level of transparency from institutions and elected officials.

But there was a problem: There was no good way for news organizations to share, annotate or publish documents online that didn’t involve working with terrible proprietary platforms or formats - so we set out to build one. We believed such a platform would encourage journalists to show readers how they know what they know, not just tell them. Or in the words of your high school algebra teacher, they would show their work.

Thus, DocumentCloud.org was born.

In terms of adoption, DocumentCloud has been a runaway success beyond our wildest dreams. As of this writing, our repository hosts 3.6 million source documents, and has been used by more than 8,400 journalists in 1,619 organizations worldwide. Documents in our collection have been viewed more than 824 million times by the public. DocumentCloud has been used by some of the largest news organizations in the world for high-profile stories such as WikiLeaks, the Panama Papers and the Snowden documents.

Numerous media outlets used DocumentCloud to support their coverage of high-profile stories like WikiLeaks and the release of classified documents by Edward Snowden, pictured here on the cover of Wired. Photo: Mike Mozart under License

But that success has come at a cost, quite literally. We’ve built sophisticated features into DocumentCloud, including named entity extraction, multilanguage support, optical character recognition, a mobile-friendly viewer, faceted search and a powerful application programming interface. We’ve had to learn to scale quickly and massively when our users upload hundreds, thousands and even tens of thousands of documents at once, as frequently happens when news breaks. And we’ve had to handle this demand at all hours of the day or night as DocumentCloud became an increasingly global platform.

Obviously, none of this is free. That is why our goal over the next few months is to start putting in place a plan to sustain DocumentCloud, to ensure this valuable tool remains available to journalists for as long as there is a need. And at a time when trust in journalism is at an all-time low, I think it’s safe to say DocumentCloud and projects like it are needed now more than ever.

As a first step, we will begin asking our users to help support the platform directly. The details of who is going to be asked to pay and how much are still being worked out. We promise we’ll announce details well in advance and give our users plenty of time for feedback.

Let me assure our users of one thing: As a nonprofit created by journalists for journalists, the board, staff and I are keenly aware of the increasingly limited resources available to most newsrooms. We will scale our ask accordingly. Our goal here is sustainability, not profit.

Let me also reiterate a promise we made when we launched DocumentCloud: There will always be a free version available to journalists. Always. But we can no longer afford to offer unlimited free access to all of our users, much as we would like to.

The bottom line is this: DocumentCloud as it is operating now isn’t sustainable. Thanks to the Knight Foundation grant announced today, we know we’ll make it to our 10th year. But if we can’t find a way to make DocumentCloud self-sufficient, our 10th year may very well be our last.

New Knight Foundation funding announced today will help DocumentCloud institute operational practices to make the organization sustainable for the long term. Photo: Richard Eriksson under License
