Web Copy Blog


Blocking Search Engine Spiders from Your Best Content

655 Comments · SEO

If you’re like your peers, you have spent an enormous amount of time, budget, and energy creating a gargantuan content library.

It probably contains white papers, past canned webinars, PowerPoint slides from show presentations, etc.

You fondly consider this ‘resource center’ a lead generation and qualifying device.  Any humans who drop by have to register or log in to get at the content.  That’s fine.

Just one thing.  Search engine spiders aren’t human.  They are robots following a trail of hyperlinks to publicly available content.

That content must (at this time anyway) be either flat HTML text or text in another standard format such as PDF, Word, PowerPoint, or Excel.  Failing that, the spiders can read meta tags accompanying the non-readable content.  They can’t read audio files, Flash files, images, or video directly.

They also can’t read anything that’s behind your registration barrier.

In the old days (i.e., 42 months ago) you could create a program that would lift that barrier automatically for selected visitors, such as spiders.

However, Google and the other search engines now say that’s a bad idea.  They don’t want their spiders seeing anything that a normal, non-registered human being can’t see.

How do you get around the problem?  Talk to your Web department about creating a new, more extensive, pre-barrier HTML page for every piece of content in your library.  It should use highly relevant keywords to describe what’s behind that barrier.
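As a rough illustration of what such a pre-barrier page might contain, here is a minimal Python sketch your Web team could adapt.  The function name `render_teaser_page`, its parameters, and the sample title and URL are all hypothetical, invented for this example.  It simply shows one way to turn a title, a keyword-rich summary, and a registration link into a plain HTML page that both visitors and spiders can read.

```python
from html import escape


def render_teaser_page(title, summary, keywords, register_url):
    """Return a static HTML page describing one gated resource.

    All names here are illustrative assumptions, not part of any real system.
    """
    keyword_list = ", ".join(keywords)
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        "<head>\n"
        f"  <title>{escape(title)}</title>\n"
        # The meta description repeats the summary for search result snippets.
        f'  <meta name="description" content="{escape(summary)}">\n'
        "</head>\n"
        "<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        f"  <p>{escape(summary)}</p>\n"
        f"  <p>Topics covered: {escape(keyword_list)}</p>\n"
        # The full document stays behind the registration form.
        f'  <p><a href="{escape(register_url)}">Register to download</a></p>\n'
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    page = render_teaser_page(
        title="Choosing a CRM: A Buyer's Checklist",
        summary="A white paper covering CRM selection criteria, "
                "pricing models, and rollout planning.",
        keywords=["CRM selection", "CRM pricing", "rollout planning"],
        register_url="/register?resource=crm-checklist",
    )
    print(page)
```

The page describes the resource in detail but does not reveal the resource itself, so spiders and unregistered humans see exactly the same thing.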

Humans and search engines alike may find this content more useful than your resource page descriptions, which are perhaps too brief.

Plus, start adding content to your site that’s not registration-dependent.  That might be more technical FAQs, a glossary, a blog, past newsletter issues, etc.

Be sure to ask your Web team whether the site is dynamic or static.  If it’s dynamic, you’ll need more static HTML workaround pages for this content too.
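If your Web team does go the static-page route, the mechanics can be quite simple.  The sketch below is an assumption about how that might look, not a description of any particular system: the helper names `slugify` and `write_static_pages` and the sample page are made up for illustration.  It writes each piece of HTML to its own `.html` file with a readable, keyword-based file name.

```python
import re
import tempfile
from pathlib import Path


def slugify(title):
    """Turn a page title into a lowercase, hyphen-separated file name stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "untitled"


def write_static_pages(pages, out_dir):
    """Write each {title: html} entry to out_dir as a flat .html file.

    Returns the list of paths written. Names are hypothetical.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for title, html in pages.items():
        path = out / f"{slugify(title)}.html"
        path.write_text(html, encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    sample = {
        "Glossary of SEO Terms": "<html><body><h1>Glossary</h1></body></html>",
    }
    # A temporary folder stands in for the real Web server directory.
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_static_pages(sample, tmp):
            print(path.name)
```

Each file ends up at a fixed address with no query string, which is the kind of plain page the spiders described above can follow a link to.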

