To defend against Google hackers, keep any sensitive files off your Web server. Just because a file can't be accessed through your Web pages doesn't mean that a hacker can't find it anyway. Even if a sensitive file is only on your Web site temporarily, you are not safe.
Then try Google hacking your own Web server and see what you find. You may be surprised at how much information Google already knows about your server and how vulnerable your computer might really be.
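One way to start such a self-check is to generate a handful of queries restricted to your own domain. The sketch below is only an illustration. The site:, filetype:, and intitle: operators and the list of file extensions are assumptions that this section does not name, example.com stands in for your own domain, and the helper names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed list of file types worth checking; adjust it to what you consider sensitive.
SENSITIVE_FILETYPES = ["xls", "bak", "sql", "log"]


def self_audit_queries(domain):
    """Build search queries limited to one domain (hypothetical helper)."""
    # Look for open directory listings first, then for each file type.
    queries = [f'site:{domain} intitle:"index of"']
    queries += [f"site:{domain} filetype:{ext}" for ext in SENSITIVE_FILETYPES]
    return queries


def search_url(query):
    """Turn a query into a Google search URL by URL-encoding the q parameter."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    for query in self_audit_queries("example.com"):
        print(search_url(query))
```

Paste each printed URL into a browser and review the results by hand. Anything that turns up and shouldn't be public is a candidate for removal from the server.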
Search engines like Google constantly troll different Web sites and store the files they find in a storage area called the cache. Once your Web site's files have been stored in Google's (or some other search engine's) cache, anyone can view them by using the cache operator. For example, if you want to view pages that were previously displayed by a Web site, you can use the cache operator followed by the Web site address, as shown below:
cache:www.cnn.com
This Google query will show you the Web pages currently stored on Google for the CNN.com Web site. These pages will remain in Google's cache until the next time Google refreshes its cache by visiting the CNN.com Web site, even if CNN.com has removed or altered the pages in the meantime.
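The reason a removed file can stay visible is easier to see in a toy model. The sketch below is not how Google's cache is actually built. It simply assumes a cache that keeps one snapshot per site and overwrites it on each crawl, and the class name, site name, and file names are all invented for illustration.

```python
class ToySearchCache:
    """Toy stand-in for a search engine cache: one snapshot per site."""

    def __init__(self):
        self._snapshots = {}

    def crawl(self, site, live_pages):
        # Copy the pages so later changes on the live site don't touch the snapshot.
        self._snapshots[site] = dict(live_pages)

    def cached_pages(self, site):
        return self._snapshots.get(site, {})


# Hypothetical site content, including a file uploaded "only temporarily."
live_site = {"index.html": "Welcome", "passwords.txt": "temporary upload"}

cache = ToySearchCache()
cache.crawl("www.example.com", live_site)   # the crawler visits while the file is present

del live_site["passwords.txt"]              # the administrator removes the file
still_exposed = "passwords.txt" in cache.cached_pages("www.example.com")

cache.crawl("www.example.com", live_site)   # the next crawl refreshes the snapshot
exposed_after_refresh = "passwords.txt" in cache.cached_pages("www.example.com")

print(still_exposed, exposed_after_refresh)
```

In this model the file stays readable from the cache between the deletion and the next crawl, which is the window the paragraph above warns about.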
Google, like most search engines that regularly "crawl" the Internet to find Web sites to index, follows certain rules when visiting Web sites. One of those rules is that Web site administrators can create a special robots.txt file that specifies which parts of the Web site the search engine should not explore and store in its cache. So if there are sensitive files on your computer that you don't want others to see, you can create a robots.txt file to tell Google not to index them. (Of course, it's much safer not to put sensitive files on your Web server computer in the first place.) To learn more about how the robots.txt file works, visit www.robotstxt.org. Just be aware that hackers can also peek at your robots.txt file to see what information you want to protect, and then they'll know exactly what type of information to look for on your computer.
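To make both halves of that trade-off concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt contents, the directory names, and the example.com URLs are hypothetical, and the disallowed_paths helper is written just for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the paths are invented for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /backup/
"""


def build_parser(robots_text):
    """Parse robots.txt text the way a well-behaved crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def disallowed_paths(robots_text):
    """List the Disallow paths, which is what a snooping visitor would read."""
    paths = []
    for line in robots_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


parser = build_parser(ROBOTS_TXT)

# A crawler that honors the rules skips the disallowed directory...
print(parser.can_fetch("Googlebot", "http://www.example.com/private/report.html"))
print(parser.can_fetch("Googlebot", "http://www.example.com/index.html"))

# ...but anyone who downloads the file sees where the interesting material is.
print(disallowed_paths(ROBOTS_TXT))
```

The same file that keeps a cooperative crawler out of /private/ and /backup/ also advertises those directories to anyone who requests robots.txt, and nothing in it stops a visitor from asking for those paths directly.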
Another option is to request that search engines (for example, Google) ignore your Web site altogether. However, while this can prevent hackers from scanning your site through the search engine, it also keeps legitimate users from finding it that way. To request that Google remove your site from its index, follow the steps listed at www.google.com/remove.html.
Finally, visit the Google Hacking Database (GHDB) at http://johnny.ihackstuff.com to see how Google has exposed other Web sites to attack. You can (hopefully) thus learn how not to fall victim to the same tricks.
Every tool on the Internet can be used for good or for bad, and Google is no exception. If you run a Web site, you must learn about Google hacking in order to lock down your system's defenses. If you're just a curious and non-malicious individual, have fun experimenting with Google. You may find more than you ever imagined.
Download the full chapter to learn what other hacking tools and techniques are used to solicit an attack.