It's a Glog!

Wednesday, March 30, 2011

The Helioid Team has launched a new tool to share and comment upon Google searches that fall short of satisfactory, and to take a look at other Googlers' struggles. We affectionately call it the Glog. Back in January, we somewhat merrily reported on a spate of bloggers' criticisms of Google's performance on certain searches, some of which paralleled arguments we've been making about the need for more varied approaches to search since 2008. The reason for our merriness was that search startups in general, including Helioid, encounter a pervasive attitude that there's no room for new players in search, largely because Google does such a good job with the majority of people's queries. But, as we and a number of others have argued for some time, there are many searches with which Google still struggles, such as searches involving ambiguous queries or spam-saturated searches for song lyrics or home appliances.

Enter the Glog. Submit your query and your comments on what you wanted and didn't quite get, and your Glog post will immediately be added to the stack. The idea is to go beyond griping about the frustrations we've had with some Google searches and to directly capture examples of the sources of those frustrations. By keeping a collectively compiled log of the areas in which the web search status quo doesn't quite get the job done, we can further dispel the myth that there's no room for more players in search, while providing brainstorming fuel for any and all of our fellow innovators in the field.

On Google's Spam Woes and the Need for More Players in Search

Wednesday, January 26, 2011

Fresh on the heels of search newcomer Blekko's launch in November, and the ensuing debate over whether it's particularly wise for anyone to challenge Google, the last few weeks have seen a surprising, though gratifying, spate of criticisms of the declining quality of Google's results for some searches. And, surprise surprise, by and large the critiques run parallel to issues raised by Blekko Founder, Rich Skrenta, in the months leading up to Blekko's launch, as well as to arguments we've been making since Helioid's first launch in 2008. Finally, it seems as though people are catching on to the fact that Google isn't infallible, that it doesn't handle every search perfectly, that the kinds of searches it struggles with are actually increasing in number, and therefore that there is indeed room in the field for fresh players.

On his personal blog, Stack Overflow Co-Founder, Jeff Atwood, laments the increase in content-copying search spam sites regularly beating out Stack Overflow pages in Google searches, simply by scraping Stack Overflow content and displaying it with more ads. On TechCrunch, Vivek Wadhwa regales us with tales of his struggles against the mass manufacturers of quasi-content, like Demand Media, which are increasingly dominating Google's search results, without actually offering users any worthwhile information. The response from Google has been to brush these complaints off as being in response to a "recent uptick" in spammers infiltrating the top results, but Atwood points to complaints dating back to 2009, from Richard McManus and Paul Kedrosky touching on the very same issues. Kedrosky in particular shares his personal horror after struggling to find trustworthy comparisons of dishwasher brands online and being bombarded with page after page of search spam, and concludes that the "appliance search" genre is too spam-laden for even Google to handle. Clearly this problem isn't due to an uptick in spammer activities confined to the last few months. And all of these complaints resonate strongly with an issue we discussed in our very first blog entry in 2008, which had first been raised shortly prior by Nova Spivack: that the exponential explosion of content on the web would sooner or later start to strain the ability of keyword search engines to consistently place the most relevant pages amongst the top results. It seems as though the success of content mass-manufacturers like Demand Media and Associated Content is helping Spivack's prediction come about sooner rather than later.

[Figure: Web Spam Example]

The good news is, the creeping dissatisfaction with at least some Google searches may help abate some of the knee-jerk skepticism of new search engines like Helioid or Blekko, especially when they propose innovative ways of filtering out the noise plaguing certain Google searches. Blekko CEO Rich Skrenta actually touches on a similar issue to the ones raised by Kedrosky and McManus in the CrunchTV interview we discussed in our last blog post, concerning song lyrics. Skrenta rightly points out that if you search for lyrics online, because it's so easy for search spammers to copy and paste lyrics supplied by trusted sites, you end up getting bombarded with gratuitous ads and may even pick up some malware. Blekko has dealt with this problem by only drawing results for lyrics searches from a limited list of trusted sites added by users under the "/lyrics" slashtag. Blekko's ability to avoid Demand Media filler content by drawing upon user input has been one of Skrenta's favorite reasons to give for switching to Blekko. Even a slight decline in the quality of Google's results may push more users into experimenting a bit, and making that switch, and a little extra adventurousness on the part of the users will make for a more fertile environment for innovative search start-ups in general, including future releases of Helioid's web exploration tools.
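To make the trusted-list idea concrete, here is a minimal sketch of restricting results to a user-curated set of sites. It is only an illustration of the general approach described above, not Blekko's actual implementation; the function name `filter_to_trusted` and the example hostnames are hypothetical.

```python
from urllib.parse import urlparse


def filter_to_trusted(urls, trusted_hosts):
    """Keep only URLs whose host appears in a curated set of trusted sites.

    `trusted_hosts` stands in for a user-maintained list such as the one
    behind a slashtag; the hostnames used below are made up.
    """
    kept = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        # Treat "www.example.org" and "example.org" as the same site.
        if host.startswith("www."):
            host = host[4:]
        if host in trusted_hosts:
            kept.append(url)
    return kept


trusted = {"lyrics.example.org"}
results = [
    "http://www.lyrics.example.org/some-song",
    "http://scraper.example.com/some-song",
]
print(filter_to_trusted(results, trusted))
```

The trade-off is the obvious one: a curated list keeps scraped copies out, but only covers sites that someone has already vouched for.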

Google's SearchWiki as a step towards more user control

Wednesday, December 03, 2008

Google recently released their new SearchWiki feature, which allows users who are logged into a Google account to rearrange search results (by clicking on arrows that move them up or down one slot), remove results from the returned list, and comment on results (all comments are made public). More information is in this Google "blog article":http://googleblog.blogspot.com/2008/11/searchwiki-make-search-your-own.html.

It's encouraging to see Google taking user responses into account. It has always been our opinion that this is something sadly missing from the mainstream search world. Google also states that the results' movements, removals, and comments will not be used as input to their search algorithms. Well, at least not yet.

Google knows that customization specific to the searcher's individual information is important to delivering relevant results. In this "blog article":http://googleblog.blogspot.com/2008/07/more-transparency-in-customized-search.html about customized search transparency they detail the introduction of messages that inform users when a search has been customized based on location, recent searches, or web history (for those with Google accounts). Using the links from these messages you can also rerun the search without the specified customization that was used as input to the results they provided.

Whether meant this way or not, the option to remove customization is an important step towards users controlling the way Google's algorithms function. For a relatively unlikely example, suppose you're in New York using "Tor":http://tor.eff.org/ to proxy your connection and the exit node is in San Francisco, so Google thinks you're in San Francisco. Your search for "new york bagel" might bring up the "New York Bagel" café in Mill Valley, which is entirely irrelevant. You can then tell Google to strip out their local search and re-searching will bring you the relevant results you want.

The sort of customization offered by SearchWiki is in a different vein. You're not meant to interact with or influence the search algorithm. If you perform a search and then start removing results you can continue until there are none left. This also means that if you perform a search for "new brunswick":http://www.google.com/search?q=new+brunswick, remove all the results mentioning Canada, and then go on to the second page of results, you'll still get a whole bunch of results mentioning Canada. Given your actions, it is unlikely you're looking for information about Canada's New Brunswick province, but after having removed all those irrelevant results Google will still present more of them.

One way Google could address this problem is to bias the results it returns once you begin removing and reordering them. If they end up getting things wrong the user should be able to tell them so explicitly through customization options. The user should also have an option to turn this functionality off – or compare results with it on and with it off – if they wish. Helioid's algorithms can tell implicitly when they are getting things wrong by looking through the results the user removes and the way the user navigates through the returned results.
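As a rough illustration of what biasing results after removals could look like, the sketch below pushes down remaining results that share non-query terms with the results a user has removed. This is a toy example under our own assumptions, not a description of Google's or Helioid's algorithms; the function names, the `weight` parameter, and the sample snippets are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a snippet and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rerank_after_removals(query, remaining, removed, weight=2.0):
    """Reorder `remaining` so results resembling the removed ones sink.

    Each remaining snippet keeps its original rank as a base score and is
    penalized for every term it shares with removed snippets, ignoring the
    query terms themselves. Lower scores are shown first.
    """
    query_terms = tokenize(query)
    # Count how many removed results contain each non-query term.
    disliked = Counter()
    for snippet in removed:
        disliked.update(tokenize(snippet) - query_terms)

    def score(item):
        rank, snippet = item
        penalty = sum(disliked[term] for term in tokenize(snippet))
        return rank + weight * penalty

    ranked = sorted(enumerate(remaining), key=score)
    return [snippet for _, snippet in ranked]


removed = [
    "New Brunswick Canada tourism",
    "New Brunswick Canada government services",
]
remaining = [
    "New Brunswick province Canada travel guide",
    "New Brunswick, New Jersey city website",
    "Rutgers University New Brunswick",
]
for snippet in rerank_after_removals("new brunswick", remaining, removed):
    print(snippet)
```

In this example the Canadian travel guide drops below the two New Jersey results because it shares the term "canada" with both removed snippets. A real system would need something far more robust than raw term overlap, and, as argued above, a way for the user to switch the bias off.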

It's great to see Google innovating. There's a lot more that can be done.