How to Get Out of Google

by Chess

"Just when I thought that I was out they pull me back in!"  Learn to stay out of Google.

Most people are dying to get their sites listed in Google.

But what if you want your site out of Google's listings?

Maybe you want to keep your site private, or you don't want a bunch of creeps surfing to your page trying to find animal porn.  Maybe you just hate Google, are paranoid, or have some copyrighted material on your page that you need out of Google's cache today.

Whatever the case, it's actually pretty easy to get out of Google and start to bask in relative anonymity.  Once you're out, your page is off the Internet for all intents and purposes.

Having your page delisted in Google is almost like having your page password protected where the password is your URL!

(In this article, I alternate between keeping Google's bots out of your page and keeping all search engine bots out.  There are other search engines now?  I'm assuming that if you want out of Google you want out of them all.  If you really only want out of Google, then use "googlebot" instead of "robots" in the meta tags below, and "Googlebot" instead of "*" in the robots.txt file.)

The first thing you want to do is add some meta tags to your index.html.  If you want Google, and every other engine, to ignore your page while spidering the web, add this meta tag inside the page's <head> section:

<meta name="robots" content="noindex, nofollow">

Alternatively, you can allow every search engine except for Google to index your page.  Just add this tag:

<meta name="googlebot" content="noindex, nofollow">

This next tag will remove the "snippets" from the results Google returns.  Snippets are the descriptive text underneath the URL when you pull up a list of Google results.  They have your search terms bolded to show you what context your terms are being used in.

<meta name="googlebot" content="nosnippet">

If you want your page to be listed in Google but don't want them to store an archive of your page then add only this next tag to your header:

<meta name="robots" content="noarchive">

This is handy if you have a page that changes frequently, is time critical, or if you don't want searchers to be able to see your old pages.  For example, if you're a professor posting test solutions or something similar you'd definitely want to remove Google's cache if you plan on reusing the test.
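If you want to double-check which of these tags actually made it into a page before the bots come around, here is a minimal Python sketch.  It only uses the standard library.  The names RobotsMetaParser, robots_directives, and SAMPLE_PAGE are made up for this example, and the sample page is hypothetical; swap in your own HTML.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        # Maps the meta name to its directives, e.g. {"robots": {"noindex"}}
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key: (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in ("robots", "googlebot"):
            # content is a comma-separated list such as "noindex, nofollow"
            values = {
                part.strip().lower()
                for part in attr.get("content", "").split(",")
                if part.strip()
            }
            self.directives.setdefault(name, set()).update(values)


def robots_directives(html):
    """Return the robots/googlebot meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page using the tags from this article.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
<meta name="googlebot" content="nosnippet">
</head><body>Private stuff</body></html>"""

found = robots_directives(SAMPLE_PAGE)
print(found)
```

This only tells you what your page says, not what any search engine will do with it.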

After you add all the meta tags you want, you may be finished.  But if you're trying to keep bots out of your entire site permanently, the next thing to do is create a robots.txt file in your website's root directory.

Pull up Vim/Notepad and type in the following two lines:

User-agent: *
Disallow: /

Save this file as robots.txt and FTP it to your site's root directory.

This will tell Googlebot, and every other search engine bot that honors robots.txt, not to bother looking at your site and to spider somewhere else.  Obviously, if you create this file then you shouldn't need the meta tags, but if you're extra paranoid like I am, use both methods.
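Before uploading, you can sanity check the rules with Python's standard urllib.robotparser module.  This is only a sketch: the helper name can_crawl, the bot name "SomeOtherBot", and the example.com URL are placeholders I made up, and it shows how Python's parser reads the file, which is not a guarantee of how any particular search engine will behave.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The two-line file from this article: keep every bot out.
BLOCK_EVERYONE = "User-agent: *\nDisallow: /\n"

# The Google-only variant: other bots are still welcome.
BLOCK_GOOGLE_ONLY = "User-agent: Googlebot\nDisallow: /\n"

# Placeholder URL; use a page from your own site.
PAGE = "http://www.example.com/index.html"

print(can_crawl(BLOCK_EVERYONE, "Googlebot", PAGE))        # False
print(can_crawl(BLOCK_EVERYONE, "SomeOtherBot", PAGE))     # False
print(can_crawl(BLOCK_GOOGLE_ONLY, "Googlebot", PAGE))     # False
print(can_crawl(BLOCK_GOOGLE_ONLY, "SomeOtherBot", PAGE))  # True
```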

After you've done all that, go and sign up for a Google account at: services.google.com/urlconsole/controller

This page is for people who urgently want their pages removed from the index.  Even then it can take up to 24 hours.  But if you'd rather wait six to eight weeks, be my guest.

After you create an account, Google will email you a link.  There you enter the URL of the robots.txt file you just uploaded, and Google sends its bot over to your site right away to read it.

With any luck, you're out of the index in a day or two.  I was out in less than 12 hours.

If you want to get back in, just remove all the meta tags and the robots.txt file.  As long as someone is linking to you somewhere, you'll be listed again after Google's next web crawl.

Special thanks to Google's Listing Removal Resource which is at: www.google.com/remove.html

The above page can also help you if you want to remove images from Google's image search engine.  That's especially handy if you don't want people to be able to link your name to your face or find your wedding photos.

You can learn more about robots.txt files and what they can do here: www.robotstxt.org

Of course, it may simply be easier to password protect your page if you don't want people seeing what's inside.  But sometimes that's not feasible because of the inconvenience it may pose to your audience.

Besides, Google can index password-protected pages according to Google's corporate information page.  Not only that, but anything that is simply sharing space on your server is fair game to the Googlebot, like Excel or Word files.

Even SSL pages can be indexed.

The above methods will serve to hide your page by practically disconnecting it from the web.  Once I was out, I tried to Google for my name and page and, sure enough, it was gone.  It was like the page didn't exist, and it gave me such a nice warm fuzzy feeling inside.

One disclaimer though: if you were using Google as your in-house search engine to help your users find information on your site, it will no longer work once you've been delisted.

Have fun!

Shoutouts to the Boneware Crew.
