Google-Hacking Made Easy
- By William Jackson
- March 6, 2008
With a name like "Cult of the Dead Cow" you know these guys are
probably up to no good, and they are living up to expectations with the release
of Goolag Scan, a tool to automate the use of search engines to scan for vulnerable
applications, back doors and sensitive information on Web sites.
This is a technique called "Google hacking," named for the Web's
predominant search engine, and it isn't new. What's new is the improved
tool that makes it easier to do the searches.
"I don't think they have anything new in terms of new capabilities,"
said Amichai Shulman, chief technology officer of Imperva Inc. of Foster City,
Calif., and head of the company's Application Defense Center. "They
do have a tool that makes Google hacking more accessible to script kiddies."
Goolag Scan runs on Windows and has a good graphical interface along with a
library of about 1,500 carefully crafted searches that can reveal sensitive
information about or from queried Web sites. The tool is neutral; it can be
used for penetration-testing by administrators, by application owners to identify
weaknesses or by hackers to find vulnerabilities to exploit.
"Tools like this scanner are a wake-up call for application owners,"
Shulman said. "And that is a good thing. The issue of data leakage into
search engines is a big issue."
The Cult of the Dead Cow has said much of its research in this area has been
against government servers where it has been able to turn up sensitive information
that has been unwittingly exposed.
"With a lot of script kiddies having this tool, I think the government
can expect a rough period of headlines," Shulman said.
The practice of using search engines to find sensitive information has been
around for years. Johnny Long, a security researcher and penetration tester
for Computer Sciences Corp. in El Segundo, Calif., wrote the book on the subject,
Google Hacking for Penetration Testers, in 2005. The government became
acutely aware of the practice in the wake of the terrorist attacks in 2001,
Long said. It is one of the reasons agencies began scrubbing Web sites of sensitive
data following the attacks.
The primary difference between Google hacking and doing an ordinary Google search
is the frame of mind; in both cases the search engine is being used as intended.
It's all a matter of what queries are used and how the resulting hits are used.
For example, a Google hacker might ignore the content of the links returned
in a search and focus instead on the names of the servers that responded. Or
the hacker might, through a properly constructed query, access a list of Social
Security numbers along with the names and addresses of their holders.
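To illustrate the first case, here is a minimal Python sketch of how someone might pull server names out of a list of search hits while ignoring the page content. The function name unique_hosts and the example.gov URLs are hypothetical placeholders, not anything taken from Goolag Scan or from Long's work.

```python
from urllib.parse import urlsplit


def unique_hosts(result_urls):
    """Return the distinct hostnames in a list of result URLs, in first-seen order."""
    seen = []
    for url in result_urls:
        # hostname is None when the string has no network location
        host = urlsplit(url).hostname
        if host and host not in seen:
            seen.append(host)
    return seen


# Hypothetical hits; example.gov is a placeholder domain.
hits = [
    "http://www.example.gov/reports/index.html",
    "http://intranet.example.gov/login.asp",
    "http://www.example.gov/about.html",
]

print(unique_hosts(hits))
```

The point of the sketch is that the interesting output is the list of machines, such as an intranet host that was never meant to be indexed, rather than anything on the pages themselves.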
Long compiled a catalog of more than 1,300 such queries that are used by legitimate
developers of penetration-testing tools. Queries can return hits containing:
- Login pages for a variety of services and servers.
- Security logs from firewalls, honeypots and intrusion detection and prevention
systems that can reveal a wealth of details on vulnerabilities.
- Lists of networked devices such as printers and cameras.
- Servers operating with default configurations, which could include default
Long kept his catalog of queries confidential, but they are not secrets and
the new Goolag Scan tool has its own catalog to use.
Google has taken steps to block the technique -- or at least make it less easy
-- by blocking blatantly automated searches. However, this also can stop or
slow down legitimate penetration-testing, giving an advantage to the hacker
who can search slowly for a limited number of vulnerabilities. The administrator
doing penetration testing has to scan for all 1,500 or so vulnerabilities to
Still, the best defense against this type of problem is to be proactive, Shulman
said. "Find the leakage before others find it."
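One way an application owner might act on that advice is to restrict each query to a domain he or she controls. The Python sketch below assumes Google's site:, intitle:, inurl: and filetype: search operators; the three patterns and the example.gov domain are illustrative assumptions and are not drawn from Long's catalog or from Goolag Scan's library.

```python
def site_queries(domain, patterns):
    """Prefix each search pattern with a site: restriction for one domain."""
    return [f"site:{domain} {pattern}" for pattern in patterns]


# Illustrative patterns only (assumed for this sketch).
PATTERNS = [
    'intitle:"index of"',  # open directory listings
    "inurl:login",         # login pages
    "filetype:log",        # log files left on a public server
]

for query in site_queries("example.gov", PATTERNS):
    print(query)
```

The queries are only built and printed here; running them by hand and at a modest pace keeps an audit clear of the automated-search blocking described above.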
William Jackson is the senior writer for Government Computer News (GCN.com).