Open Questions on Wikia Search, Advertising, and User Privacy

I think we’ve all heard now that Jimbo Wales has set his sights on creating a search engine to rival Google, Yahoo and the rest of the gang. It is currently under development as Wikia Search (but I’m guessing the name will change at some point to something more slick, like Mahalo or Sproose). The mission of Wikia Search is to “generate a new kind of search engine, which relies on human intelligence to do what algorithms cannot.” At its core are 4 principles:

  • Transparency – Openness in how the systems and algorithms operate, both in the form of open source licenses and open content + APIs.
  • Collaboration – Everyone is able to contribute in some way (as individuals or entire organizations), strong social and community focus.
  • Quality – Significantly improve the relevancy and accuracy of search results and the searching experience.
  • Privacy – “Pursuing the Holy Grail of Privacy Protection”

Principle #4, of course, is dear to my heart, but it is not clear what is actually meant by “the holy grail of privacy protection.”

When Wales made the major announcement of this project at NYU earlier this year, Helen Nissenbaum (then my adviser) probed him about the privacy implications of current search engine design and whether he envisions Wikia Search doing something different that might help mitigate these privacy concerns. (The exchange begins at about 26:12 in this video, but the audio of the question isn’t very clear.)

Nissenbaum indirectly referenced my example of how when seeking information at the library, no one necessarily knows what books you look for in the card catalog or which you page through in the stacks; the library simply has created a very robust and navigable organizational structure to help people find information relatively anonymously. But given the drive for the “perfect search engine,” search providers increasingly want to know exactly who a particular searcher is so they can provide the most “personalized” and “relevant” results (and advertising; I’ll get back to that in a minute). Thus, when using a search engine to seek the same information, all of your searches and clicks are automatically recorded by the search engine provider in order to help them build a dossier of your personal and intellectual interests, resulting in significant privacy concerns. Her question boiled down to whether Wikia Search would follow a similar path and try to know everything possible about the user in order to deliver the quality results it seeks.

Wales almost seemed to be thrown off guard by the question, and didn’t appear to have a canned answer at the ready (which is surprising, given that privacy is one of Wikia Search’s core principles). He did state, with the caveat that it was an uninformed opinion, that he doubts whether knowing everything about a person really can help with search results. He did, however, recognize there might be a benefit to reviewing search histories for “short term patterns,” so if I was just looking at hotels in London and Berlin, and then search for “Paris Hilton,” it is a safe guess that I’m looking for a hotel, and not the celebrity (similar tracking of immediately-previous search strings by way of the referer field is already happening). Wales goes on to say that the goals and benefits of monitoring a user’s search activity have less to do with improving the quality of the search results and more to do with the advertising, and he admits that he doesn’t have too much of a problem with personalization of ads since he’d rather see an ad on a topic he cares about than some random ad. All that said, Wales acknowledged an increasing anxiety about how companies like Google are “watching everything I’m doing.”

So, in a nutshell, Wales helped identify the Faustian bargain inherent in the quest for the perfect search engine: we can (perhaps) get better results and more relevant ads, but search providers end up monitoring and capturing all kinds of personal and intellectual information about us. Unfortunately, given the current structure of the search engine marketplace, which is overwhelmingly dependent on advertising dollars for its revenue, the drive for more personalized advertising (more bucks if they can offer more targeted ads) will mean the continued collection of user search histories and related personal information.

Nissenbaum’s question implied hope that Wikia Search might be different, that it might hold the pursuit of principle #4 above the need to provide personalized results or advertising. That exchange with Wales was 6 months ago, and nothing has really progressed since then to shed new light on how Wikia Search might strive to achieve the “holy grail” of privacy protection, or whether it will be just as threatening to user privacy as the other players in the search industry.

Recently, however, Wikia announced a new partnership with LookSmart, an online advertising and technology company. Wikia will use LookSmart’s ad platform to manage display and text-based advertising across its wide-ranging collection of projects and sites (there are quite a few). What this meant in terms of Wikia Search remained unclear until this press release a few days ago announcing that Wikia purchased the Web crawler Grub — a very interesting distributed crawling technology — to help jump-start the Wikia Search project. Alas, a connection between the advertising deal and the search engine project becomes clear. As reported here:

“Specifics of the deal were not revealed, though it is part of a larger advertising deal between Wikia and LookSmart which was announced last week. Under the deal, LookSmart will provide text and display ads in Wikia’s freely hosted wiki communities, and eventually on the Search Wikia site, Wales said.” (emphasis added)

It appears, now, that Wikia Search will be an advertising-supported service, powered by LookSmart (Wales confirmed that for me here). And since Wikia is a for-profit company (unlike the related Wikipedia project), my concern is that the allure of earning more dollars and cents for each of those “contextual ads” will be too powerful, and that Wikia Search will indeed resort to creating vast server logs of users’ search histories in order to provide “better quality” results, and more profitable ads. (Seth Finkelstein has already commented on the troubles of having a for-profit company benefit from the free labor of others.)

So, some open questions for Wikia Search remain:

  1. What kind of server logs will Wikia Search maintain of user activity?
  2. If logs are kept, what will the data retention policy be?
  3. Will Wikia Search offer (and encourage) the creation of user accounts (enabling easier tracking of user activity)?
  4. Will attempts be made to personalize advertising on Wikia Search based on a user’s past search activity?
  5. If so, can users opt-out of such personalization to prevent their search histories from being tracked?

Answers to these questions will not only help us determine what steps Wikia Search intends to take to protect the “holy grail” of privacy, but also the extent to which they will be transparent (another of their core principles) as to the collection and use of user data.
