Google-Hacking Made Easy

With a name like "Cult of the Dead Cow" you know these guys are probably up to no good, and they are living up to expectations with the release of Goolag Scan, a tool to automate the use of search engines to scan for vulnerable applications, back doors and sensitive information on Web sites.

This is a technique called "Google hacking," named for the Web's predominant search engine, and it isn't new. What's new is the improved tool that makes it easier to do the searches.

"I don't think they have anything new in terms of new capabilities," said Amichai Shulman, chief technology officer of Imperva Inc. of Foster City, Calif., and head of the company's Application Defense Center. "They do have a tool that makes Google hacking more accessible to script kiddies."

Goolag Scan runs on Windows and has a good graphical interface along with a library of about 1,500 carefully crafted searches that can reveal sensitive information about or from queried Web sites. The tool is neutral; it can be used by administrators for penetration testing, by application owners to identify weaknesses or by hackers to find vulnerabilities to exploit.

"Tools like this scanner are a wake-up call for application owners," Shulman said. "And that is a good thing. The issue of data leakage into search engines is a big issue."

The Cult of the Dead Cow has said much of its research in this area has been against government servers where it has been able to turn up sensitive information that has been unwittingly exposed.

"With a lot of script kiddies having this tool, I think the government can expect a rough period of headlines," Shulman said.

The practice of using search engines to find sensitive information has been around for years. Johnny Long, a security researcher and penetration tester for Computer Sciences Corp. in El Segundo, Calif., wrote the book on the subject, Google Hacking for Penetration Testers, in 2005. The government became acutely aware of the practice in the wake of the terrorist attacks in 2001, Long said. It is one of the reasons agencies began scrubbing Web sites of sensitive data following the attacks.

The primary difference between Google hacking and an ordinary Google search is the searcher's frame of mind; in both cases the search engine is being used as intended. It's all a matter of what queries are submitted and how the resulting hits are used.

For example, a Google hacker might ignore the content of the links returned in a search and focus instead on the names of the servers that responded. Or, through a properly constructed query, the hacker might retrieve a list of Social Security numbers along with the names and addresses of their holders.
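To illustrate the first case, here is a minimal Python sketch of how someone might reduce a list of result links to the server names behind them. The function name and the sample links are hypothetical and are not taken from Goolag Scan or any other tool; the links use the reserved example.gov placeholder rather than real search output.

```python
from urllib.parse import urlsplit


def hostnames_from_results(result_urls):
    """Return the unique host names found in a list of result URLs, sorted."""
    hosts = set()
    for url in result_urls:
        # hostname is None when the string has no network location
        host = urlsplit(url).hostname
        if host:
            hosts.add(host.lower())
    return sorted(hosts)


# Hypothetical result links, used only to show the idea.
sample_results = [
    "https://www.example.gov/reports/2007.pdf",
    "http://intranet.example.gov:8080/login",
    "https://WWW.example.gov/index.html",
]

print(hostnames_from_results(sample_results))
```

The page content is never fetched; only the addresses in the hit list are examined, which is why an administrator reviewing search results for his own domain may want to look at which hosts appear as well as which pages.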

Long compiled a catalog of more than 1,300 such queries that are used by legitimate developers of penetration-testing tools. Queries can return hits containing:

  • Login pages for a variety of services and servers.
  • Security logs from firewalls, honeypots and intrusion detection and prevention systems that can reveal a wealth of details on vulnerabilities.
  • Lists of networked devices such as printers and cameras.
  • Servers operating with default configurations, which could include default passwords.
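As a rough sketch of what such queries can look like, the Python below composes a few searches restricted to a single domain, the way an administrator auditing his own site might. The three patterns are simple illustrations built from Google's standard site:, intitle: and filetype: operators; they are assumptions made for this example and are not drawn from Long's catalog or from Goolag Scan. The function names and the example.gov domain are likewise hypothetical.

```python
from urllib.parse import quote_plus

# Illustrative patterns only; assumed for this sketch, not copied from any catalog.
AUDIT_PATTERNS = {
    "login pages": 'intitle:"login"',
    "log files": "filetype:log",
    "directory listings": 'intitle:"index of"',
}


def build_audit_queries(domain):
    """Scope every pattern to one domain with the site: operator."""
    return {
        label: f"site:{domain} {pattern}"
        for label, pattern in AUDIT_PATTERNS.items()
    }


def search_url(query):
    """Encode a query as a Google search URL that can be opened in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# example.gov stands in for a domain the administrator is responsible for.
for label, query in build_audit_queries("example.gov").items():
    print(f"{label}: {search_url(query)}")
```

The sketch only builds the search strings and URLs; it does not submit them. Running the searches by hand against a domain you administer is one way to see what a search engine has already indexed.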

Long kept his catalog of queries confidential, but the queries themselves are not secret, and the new Goolag Scan tool comes with its own catalog.

Google has taken steps to block the technique -- or at least make it harder -- by blocking blatantly automated searches. However, this also can stop or slow legitimate penetration testing, giving an advantage to the hacker, who can search slowly for a limited number of vulnerabilities. The administrator doing penetration testing, by contrast, has to scan for all 1,500 or so vulnerabilities to be secure.

Still, the best defense against this type of problem is to be proactive, Shulman said. "Find the leakage before others find it."

About the Author

William Jackson is the senior writer for Government Computer News (