DocumentCloud, the 2009 Knight News Challenge project recently seen launching open-source parallel processing software, made two big announcements today.
Imagine being able to search across the New York Times' cache of records on Guantánamo Bay detainees, the ACLU's unrivaled set of documents on detention policy, Jane Mayer's source material for her coverage of the CIA in The New Yorker, and The Washington Post's valuable contributions to all of the above. That's the promise of DocumentCloud, which I've explained at length in previous posts.
Second, the project announced a partnership with Reuters' OpenCalais:
OpenCalais uses natural language processing to extract information from documents, instantly identifying and tagging the relevant people, places, companies, facts and events. This will make it easy for readers and journalists to explore connections between documents and across the full collection of source materials.
In other words, not only will the service be filled with a never-before-seen assemblage of hard-won source documents from some of our biggest journalism heavyweights, but those documents will also be deeply searchable and linked in immensely powerful ways. Once again, NiemanLab's got the goods.
As ReadWriteWeb's Marshall Kirkpatrick says, "DocumentCloud is building up a whole lot of steam!"