<?xml version="1.0" encoding="UTF-8" standalone="yes"?><oembed><version><![CDATA[1.0]]></version><provider_name><![CDATA[Open Parenthesis]]></provider_name><provider_url><![CDATA[http://www.openparenthesis.org]]></provider_url><author_name><![CDATA[John]]></author_name><author_url><![CDATA[http://www.openparenthesis.org/author/admin/]]></author_url><title><![CDATA[Media Cloud(s) On the Horizon]]></title><type><![CDATA[link]]></type><html><![CDATA[The <a href="http://cyber.law.harvard.edu/">Berkman Center for Internet & Society</a> launched <a href="http://www.mediacloud.org/">Media Cloud</a> in early March, though it had been quietly available for a few months before that. It's an exciting concept, limited in its current implementation but sure to grow in utility as more features get added. 
[caption id="attachment_1162" align="aligncenter" width="458" caption="MediaCloud"]<a href="http://www.openparenthesis.org/wp-content/uploads/2009/04/mediacloud.png"><img src="http://www.openparenthesis.org/wp-content/uploads/2009/04/mediacloud.png" alt="MediaCloud" title="mediacloud" width="458" height="46" class="size-full wp-image-1162" /></a>[/caption]

<!--more-->

In essence, Media Cloud monitors a set of sources and then semantically processes the news items from those sources, creating a rich, structured dataset that enables various queries and visualizations.

[caption id="attachment_1155" align="aligncenter" width="300" caption="Media Cloud Summary (Image from MediaCloud.org)"]<a href="http://www.mediacloud.org/about-2/"><img src="http://www.openparenthesis.org/wp-content/uploads/2009/04/mc-flow-2b.png" alt="Media Cloud Summary (Image from MediaCloud.org)" title="mc-flow-2b" width="300" height="210" class="size-full wp-image-1155" /></a>[/caption]

The project also relies on a partnership with <a href="http://www.opencalais.com/">Calais</a> to provide the term extraction and entity identification capability.
 
Currently, the <a href="http://www.mediacloud.org/visualizations/">visualizations</a> are rather limited. You can create a comparative graphic across any three media sources in the system, of one of the following types:
<ul>
	<li>Top 10 most mentioned terms</li>
	<li>Top 10 Term Pivot</li>
	<li>World Map</li>
</ul>

Unfortunately there's no easy way to identify what sources are in the database, other than starting to type and seeing if the autocomplete finds what you're hoping to use. There's also no way to tell what "terms" are considered significant, though the error message notes:

<blockquote>The available terms that you can currently search for are focused on prominent people, places, and events. This will broaden considerably in the future.</blockquote>

It's the long-term plans, not the current visualizations, that make Media Cloud worth <a href="http://www.mediacloud.org/2009/01/15/keep-up-to-date-with-media-cloud/">watching</a>. Ultimately, the Media Cloud project <a href="http://www.mediacloud.org/about-2/">describes itself as becoming</a>:

<blockquote>A platform for open, collaborative research by scholars around the world . . . [which] does the heavy lifting in the "cloud" and provides the results as a web service</blockquote>

It isn't clear at this point what specifically is meant by "in the 'cloud'", except in the limited sense that all remote web services could be said to be in the cloud. (See my colleague Andrew Webb's <a href="http://openenterprise.wordpress.com/2009/03/11/open-source-and-cloud-computing/">The Open Cloud</a> for a good overview of the various things "cloud" might mean in today's environment.) Similarly, I believe the only current access to the "web service" is via the front-end site at mediacloud.org; no programmatic APIs are exposed yet.

Assuming, however, that the project can reach its goal of an infinitely scalable, cloud-hosted web service which would semantically index a great portion of the relevant media stream, and could be accessed by researchers at low or no cost - that would be a very powerful tool for understanding how media operates online. 
 
Media Cloud is also a free and open source software project, licensed under the <a href="http://www.fsf.org/licensing/licenses/agpl-3.0.html">GNU Affero General Public License</a> and built in Perl using the <a href="http://www.catalystframework.org/">Catalyst web framework</a> and a <a href="http://www.postgresql.org/">PostgreSQL</a> database. (<a href="http://www.mediacloud.org/code/">Get the code here</a>.)

Related: 
<a href="http://drupal.org/node/303763">Calais for Drupal</a> 
]]></html></oembed>