
Is it possible to delete search data from a search engine's servers?

Search engine history can be very sensitive, and can be used against the searcher if it falls into the wrong hands. Security threats expert Ed Skoudis addresses the possibility of deleting search history from a search engine's servers.

I've heard that some search engines enable users to automatically delete their search data from the search engine's servers. Is this really possible, and do you think archived search data represents a corporate data security or intellectual property threat?
Search engine history is a very controversial subject. People often search for very sensitive personal topics, including information about their medical conditions and love lives. Many people also search for their own names, performing vanity searches to see what information is available about them on the Internet. Search engines store all of this information, usually associating each search with the source IP address of the person making it and with a cookie placed in the user's browser by the search engine. Thus, if a user searches for their name, plus certain medical conditions, someone viewing this history can reasonably infer that they have the given condition.

Back in 2006, AOL released the search history of about 650,000 users, roughly 20 million searches in all, and made it freely available on the Web; it remains available at about half a dozen mirrors. The goal was to make the search information available to researchers, but the company didn't properly consider the invasion of privacy that such a release entails. AOL tried to anonymize the information, passing it through software that remapped each user's identity to another value before releasing it publicly. Thus, for a hypothetical example, you can't tell that user Fred Smith did a search for "Fred Smith" and later searched for "halitosis." However, even with this remapping, a person can tell that the same anonymous number performed both searches, implying pretty heavily that good old Fred suffers from bad breath.
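To make the weakness of that remapping concrete, here is a minimal Python sketch. It is not AOL's actual software, whose details were not published; the function names, the sample log and the starting number 1000 are all assumptions made up for illustration. It shows only that a consistent substitute identifier still ties a user's searches together.

```python
import itertools
from collections import defaultdict


def pseudonymize(log):
    """Replace each real user name with an arbitrary number, consistently.

    `log` is a list of (user, query) pairs. This is an illustrative
    stand-in for the remapping step, not AOL's real implementation.
    """
    counter = itertools.count(1000)  # arbitrary starting value (assumption)
    mapping = {}
    released = []
    for user, query in log:
        if user not in mapping:
            mapping[user] = next(counter)
        released.append((mapping[user], query))
    return released


def group_by_pseudonym(released):
    """Rebuild per-user profiles from the released, 'anonymized' log."""
    profiles = defaultdict(list)
    for pseudonym, query in released:
        profiles[pseudonym].append(query)
    return dict(profiles)


# Hypothetical sample data, echoing the Fred Smith example above.
raw_log = [
    ("fred", "Fred Smith"),
    ("alice", "cheap flights"),
    ("fred", "halitosis"),
]

released = pseudonymize(raw_log)
profiles = group_by_pseudonym(released)
print(profiles[1000])  # ['Fred Smith', 'halitosis']
```

The real name never appears in the released data, yet the vanity search and the medical search end up in the same profile, which is exactly the inference described above.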

But that's personal information. To get to the point of your question, how does this impact corporate data security and intellectual property? Enterprise employees, especially those associated with a company's most important intellectual property assets, frequently research new applications of their products, new markets they are considering entering, competitors' products, potential merger and acquisition targets, and so on. Imagine looking at the search engine history for all IP addresses associated with some large company and sorting the queries by user, with users distinguished by the cookie the search engine left on each browser. Surely, some very sensitive information about the organization's plans would be revealed.
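The thought experiment above can be sketched in a few lines of Python. Everything here is hypothetical: the log layout of (IP address, cookie, query), the cookie values, the queries and the network range (taken from the address blocks reserved for documentation) are assumptions, not the format any real search engine uses.

```python
import ipaddress
from collections import defaultdict

# Hypothetical address block standing in for "all IP addresses
# associated with some large company".
CORPORATE_NET = ipaddress.ip_network("203.0.113.0/24")


def queries_by_cookie(log, network):
    """Keep only queries sent from `network`, grouped by browser cookie."""
    per_cookie = defaultdict(list)
    for ip, cookie, query in log:
        if ipaddress.ip_address(ip) in network:
            per_cookie[cookie].append(query)
    return dict(per_cookie)


# Invented sample log entries: (source IP, cookie, query).
log = [
    ("203.0.113.7", "cookie-a", "competitor X pricing"),
    ("198.51.100.4", "cookie-z", "weather"),
    ("203.0.113.7", "cookie-b", "patent filing deadlines"),
    ("203.0.113.9", "cookie-a", "acquisition valuation methods"),
]

corporate = queries_by_cookie(log, CORPORATE_NET)
for cookie, queries in corporate.items():
    print(cookie, queries)
```

Note that the cookie keeps one person's searches together even when the source address changes within the company's range, which is why the combination of the two identifiers is so revealing.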

Because of this concern about the sensitivity of search histories, Google announced in March 2007 that it would anonymize its search logs after 18 to 24 months. That's better than keeping all search queries around forever, but it's a pretty long time. Also, even after that timeframe, Google doesn't delete user searches; it merely anonymizes them. Google has said that this anonymization process involves dropping some of the bits of a user's IP address as well as changing the cookie value, but details are murky.
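As a rough illustration of what "dropping some of the bits" of an IP address could look like, here is a short Python sketch. It is not Google's procedure; the function name is made up, and the choice of zeroing the low-order 8 bits of an IPv4 address by default is purely an assumption, since the number of bits is among the murky details.

```python
import ipaddress


def truncate_ipv4(address, dropped_bits=8):
    """Zero out the lowest `dropped_bits` bits of an IPv4 address.

    The default of 8 bits is an assumption chosen for illustration only.
    """
    ip = ipaddress.IPv4Address(address)
    mask = (0xFFFFFFFF << dropped_bits) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(int(ip) & mask))


print(truncate_ipv4("203.0.113.77"))                   # 203.0.113.0
print(truncate_ipv4("203.0.113.77", dropped_bits=16))  # 203.0.0.0
```

With 8 bits dropped, a logged address still narrows the sender down to one of 256 possibilities, which for many organizations is a single office network. Truncation coarsens the address rather than removing it.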

To address this issue, other search engine companies have jumped on board the privacy bandwagon, offering users an option to avoid storing search history on their servers entirely. In July 2007, Ask.com announced their AskEraser feature, which allows users to configure the Ask.com search engine to not log any search history on their servers. By default, Ask.com logs search queries for 18 months. To change this, when accessing Ask.com, simply click on the "AskEraser" link near the top of their page. A message pops up asking if you want to turn on AskEraser. The service is pretty easy to use, and it's a helpful option for those people who desire more anonymity. While Ask.com hasn't revealed the detailed technical underpinnings of how they omit or destroy search history on their servers, such functionality is certainly possible.

Please note that the discussion above concerns the search history stored on the search engine company's own servers. Even with AskEraser and Google's 18-to-24-month anonymizing process, browsers still maintain a local browsing history that includes all recent searches, completely independent of what the search engine itself does with that information.

This was last published in March 2008
