This morning’s Defrag keynote opened with a panel discussion on where the business of search and discovery has come from, and where it needs to go.
Marti Hearst (UC Berkeley) made a strong opening analogy by explaining that search is currently an “experience in orienteering”: the journey to your information begins with a few words, followed by a long period of sniffing out clues. This process repeats until, hopefully, you find what you were looking for.
Obviously, this creates a demand for a major upgrade to the human experience of search. One natural solution is natural language processing, where Microsoft is doing a lot of great work (Tellme, Windows Speech Recognition, and Ford Sync).
Another interesting thread in the evolution of search was a pitch for building more topic-specific indexes. Steve Larsen (Krugle) made a compelling case for vertical search indexes: “In a code-writing search index, Python is never a reptile and always a language.”
This feels like a workaround for the lack of effective NLP implementations, but unlike most workarounds, this sort of design sounds extremely valuable. I think I would rather use a half-dozen different search engines across a day, assuming each is much more targeted toward the information I’m looking for. This type of strategy could allow for a lot of smaller-scale growth in the search market.
Bradley Horowitz (Yahoo) gave an explanation of Flickr’s Interestingness system, which brought up the powerful difference between explicit and implicit discovery systems. Explicit systems (voting, rating) are ripe for gaming and rigging, which obviously prevents an honest view of the landscape from emerging. The Interestingness recipe watches views, comments, and favourites across all the system’s photos. In addition, though, it weights the value of those actions based on the relationship the author has with the viewer (your brother marking your photo as a favourite is different from a stranger doing the same). Intricacies like that help develop a more honest result set.
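To make the idea concrete, here is a minimal sketch of relationship-weighted implicit scoring. It is not Flickr’s actual recipe, which wasn’t spelled out in the panel; the action weights, the relationship factors, and the names `Action` and `interestingness` are all hypothetical, and I’m assuming that attention from strangers should count for more than attention from family.

```python
from dataclasses import dataclass

# Hypothetical base weights per action type (not Flickr's real values).
ACTION_WEIGHTS = {"view": 1.0, "comment": 3.0, "favourite": 5.0}

# Hypothetical scaling by the viewer's relationship to the author.
# Assumption: attention from people close to the author counts for less.
RELATIONSHIP_FACTORS = {"family": 0.25, "contact": 0.5, "stranger": 1.0}


@dataclass(frozen=True)
class Action:
    kind: str          # "view", "comment", or "favourite"
    relationship: str  # "family", "contact", or "stranger"


def interestingness(actions):
    """Sum each action's base weight scaled by its relationship factor."""
    return sum(
        ACTION_WEIGHTS[a.kind] * RELATIONSHIP_FACTORS[a.relationship]
        for a in actions
    )


# Two photos with identical activity, differing only in who acted.
family_photo = [Action("favourite", "family"), Action("view", "family")]
stranger_photo = [Action("favourite", "stranger"), Action("view", "stranger")]

print(interestingness(family_photo))    # 1.5
print(interestingness(stranger_photo))  # 6.0
```

Both photos received the same view and the same favourite, yet the one noticed by a stranger scores four times higher. Nothing here asks the user to vote or rate; the score falls out of behaviour the system already observes.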
The beauty I see in these implicit designs is their invisibility to the user. Rather than putting the onus on the user to think consciously about the value they place on something, the software simply listens and reacts. As Bradley put it so eloquently, “The system changes in the user’s wake.”
When the discussion turned to the uncrawlable information trapped behind paywalls and corporate firewalls, Jeremie Miller (Search Wikia) commented that because people define knowledge by what is discovered through search, knowledge that doesn’t make it into the index may cease to exist (so far as many people are concerned).
Honestly, I’m terrified by the social implications of such a reality; the rift between those with access to the indexes and those without could have dire effects, not to mention the concerns around who controls those indexes.