Innovative Collaboration

Tuesday, May 27, 2008

Last night I drifted off to sleep thinking about scientists and online collaboration. That's been happening much more often since I read the chapter on "social information foraging" in Peter Pirolli's Information Foraging Theory. In that chapter, Pirolli describes a number of studies of large groups of specialists working toward a common set of goals, and of how much such communities collectively help their individual members contribute to those goals. Two subjects from these studies really caught my eye: the use of co-citation analysis to visualize a field of study or network of specialists, and, as Pirolli puts it, the "brokerage of structural holes" in these networks. I was familiar with the former, as the technique's been pretty well explored from a variety of angles, but I had never seen the latter presented the way Pirolli presents it.
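For anyone who hasn't run into co-citation analysis before, here's a minimal sketch of the counting step behind it. This is my own toy illustration, not anything taken from Pirolli's chapter: the paper labels and the cocitation_counts function are made up for the example. Two works are "co-cited" each time a later paper cites both of them, and the pairs with the highest counts are the ones a visualization would draw close together.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how often each pair of works is cited together.

    reference_lists: an iterable of reference lists, one per citing paper.
    Returns a Counter keyed by alphabetically sorted (work_a, work_b) pairs.
    """
    counts = Counter()
    for refs in reference_lists:
        # Deduplicate and sort so each pair is counted once per citing paper
        # and always appears in the same order.
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical citing papers (P1..P4) and the works they cite (A..E).
papers = {
    "P1": ["A", "B", "C"],
    "P2": ["A", "B"],
    "P3": ["B", "C"],
    "P4": ["D", "E"],
}

counts = cocitation_counts(papers.values())
for pair, n in counts.most_common():
    print(pair, n)
```

In this toy data, A, B, and C are co-cited with one another, D and E are co-cited with each other, and no paper cites across the two groups. That empty space between clusters is the kind of gap a visualization would make visible.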

By 'structural holes', Pirolli means the areas within a field that the specialists in the network explore far less heavily than other areas. In any area of study, there will be a number of hot topics around which the majority of research within the field crystallizes. As a result, perfectly viable research topics go unexplored, because the community is collectively unaware of their viability, and a visualization of the network will show clusters of closely related papers separated by gaps, or 'structural holes'. Members of the community rarely explore these gaps, because nobody knows how fruitful such endeavors would be. So, disinclined to risk wasting their time looking for a chance breakthrough, they work to make incremental contributions to areas of ongoing study. However, when discoveries are made within these gaps, the benefit to the community is enormous, as they open up brand new areas of study around which further research may develop. So, by way of a little cost/benefit analysis, it's possible to work out the optimal number of people a research community should have exploring these underdeveloped areas at any given time.
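To make that cost/benefit idea a bit more concrete, here's a toy model of my own. It is not the model from Pirolli's chapter, and every number and name in it is an assumption chosen for illustration. Suppose each researcher doing incremental work reliably produces a fixed amount of value, each researcher exploring a gap has a small independent chance of a breakthrough, and a breakthrough is worth a large fixed amount to the community no matter how many explorers hit on it. Each extra explorer then adds a little less expected benefit than the last, so there's a point past which sending another person into the gap costs more than it's expected to return.

```python
def expected_payoff(k, n, incremental_value, p_breakthrough, breakthrough_value):
    """Expected community payoff with k of n researchers exploring gaps.

    Toy assumptions (all hypothetical):
      - each of the n - k incremental researchers yields incremental_value;
      - each explorer independently succeeds with probability p_breakthrough;
      - at least one success yields breakthrough_value, counted once.
    """
    p_at_least_one = 1 - (1 - p_breakthrough) ** k
    return (n - k) * incremental_value + breakthrough_value * p_at_least_one


def optimal_explorers(n, incremental_value, p_breakthrough, breakthrough_value):
    """Return the number of explorers (0..n) that maximizes expected payoff."""
    return max(
        range(n + 1),
        key=lambda k: expected_payoff(
            k, n, incremental_value, p_breakthrough, breakthrough_value
        ),
    )


# Made-up numbers: 100 researchers, a 5% chance of a breakthrough per explorer,
# and a breakthrough worth 200 times one incremental contribution.
best = optimal_explorers(
    n=100, incremental_value=1.0, p_breakthrough=0.05, breakthrough_value=200.0
)
print("explorers:", best)
```

The shape of the result matters more than the number: if a breakthrough isn't worth much relative to its odds, the model says nobody should explore, and as the payoff grows, so does the share of the community it makes sense to send into the gaps.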

This is all very intriguing, but what really makes me raise an eyebrow is the application of these ideas to a few online collaboration support projects that have caught my eye in the last couple of months. Berkeley's had a couple of interesting projects along these lines, called BOINC and Bossa, which provide frameworks for distributed computing and what they call distributed thinking, respectively. BOINC provides a means for projects requiring a lot of processing power to recruit volunteers and form a computing cloud. Bossa does the same for people rather than processors, enabling large numbers of individuals to cooperate in working toward some common research goal, like the cataloging of stars in a particular region of the sky. In the same vein, MIT's Center for Collective Intelligence is dedicated to fleshing out the ways in which a bunch of people and a bunch of computers can work together to "act more intelligently than any individuals, groups, or computers have ever done before." To this end they've launched iCKN.org, which aims to use community visualizations to support the creation of "innovative collaborative knowledge networks."

All this brings us back to why I'm so enamored with the idea of online collaboration between professional and amateur scientists alike. The whole idea behind Helioid is to leverage the benefits gained by visualizing web searches and online communities, in order to facilitate a more effective form of exploration of the information available on the web. Aside from the impression we'll make on the general web search market, there's a huge impact to be made in supporting these sorts of collaborative endeavors on a massive scale. I can't help but dream of such massive collaborations, wherein groups of millions of professionals and amateurs are directed to unexplored research areas in a collective effort to push every scientific field forward. Those efforts would be supported by a core set of tools for visualizing the field of research and determining where the frontier lies.