WordPress blogs have several duplicate content issues. One of them is that the same content can be read on the post itself, the index page, the archives and the category pages. To avoid a search engine penalty, it is important to optimize your WordPress installation so that this duplication is avoided.

There are a few ways to do that:


Optimizing using WordPress excerpt

Instead of duplicating the whole content, you may show just an excerpt. As explained in "Customizing the Read More", you may replace the the_content() PHP function in your theme with the_excerpt(), which displays the first 55 words of the post instead of the whole post. Since the WordPress 2.1 release, WordPress supports the <!--more--> tag, which marks where the excerpt ends in each post. The only problem with the more tag is that it also cuts the posts in your RSS feed, which you can fix by using the "Full Text Feed" WordPress plugin.
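A minimal sketch of that swap, assuming you wrap the choice in a small helper inside your theme (the name mytheme_post_body is hypothetical, not a WordPress function). It prints the full text on single posts and pages, and only the excerpt everywhere else:

```php
<?php
// Hypothetical theme helper, e.g. defined in functions.php.
// is_single(), is_page(), the_content() and the_excerpt() are WordPress template tags.
function mytheme_post_body() {
    if ( is_single() || is_page() ) {
        the_content();   // full text on the post or page itself
    } else {
        the_excerpt();   // short summary on index, archive and category pages
    }
}
```

In index.php or archive.php you would then call mytheme_post_body() inside the Loop, in the place where the_content() used to be.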

If you want some excerpt customization, then I suggest using one of the various plugins available for that, like the good old the_excerpt Reloaded or Post Teaser.

By the way, since your homepage, archive and category pages shrink when you use excerpts, it can be nicer to show more posts on each page. To play around with those settings, I use the wonderful Custom Query String plugin (which no longer seems to be supported, but is fully functional and stable), where you can configure how many posts are shown on the various pages of your blog.
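If you would rather not depend on a plugin, something similar can probably be done with the pre_get_posts action. This is a different technique from the plugin, shown only as a sketch: the function name and the value 20 are my own assumptions, and it needs a WordPress version that provides is_main_query().

```php
<?php
// Assumption: placed in the theme's functions.php; the function name is hypothetical.
function mytheme_more_posts_on_listings( $query ) {
    // Only touch the main front-end query, never admin screens or secondary loops.
    if ( is_admin() || ! $query->is_main_query() ) {
        return;
    }
    // Home page and all archive-type pages (date, category, tag, author).
    if ( $query->is_home() || $query->is_archive() ) {
        $query->set( 'posts_per_page', 20 ); // 20 is an arbitrary example value
    }
}

// The guard only lets this file load outside WordPress; inside WordPress it always passes.
if ( function_exists( 'add_action' ) ) {
    add_action( 'pre_get_posts', 'mytheme_more_posts_on_listings' );
}
```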


Optimizing using the Noindex meta-tag

Another way to go about this is not to index the archive, category and index pages at all. To do this, add the "noindex" meta tag to those pages by placing the following code inside the <head> section of your theme's header template (usually header.php):

<?php
// Index single posts and static pages; keep all listing pages out of the index.
if ( is_single() || is_page() ) {
   echo '<meta name="googlebot" content="index,follow" />';
} else {
   echo '<meta name="googlebot" content="noindex,follow" />';
}
?>

Naturally, you can tweak this to your own needs:

  • Change the conditions to match your own taste; see the WordPress "Conditional Tags" page.
  • Right now, as you can see, it only blocks googlebot. You can change that to block all engines (name="robots", use with care!) or any other engine you wish (msnbot, etc.).
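As a sketch of the second tweak, the tag can be built by a small helper that takes the crawler name, so switching from googlebot to "robots" or msnbot is a one-word change. The function name mytheme_robots_meta and the use of the wp_head action (instead of editing header.php directly) are my assumptions; wp_head only fires if the theme calls wp_head().

```php
<?php
// Hypothetical helper: builds the meta tag for a given crawler name.
// $crawler is e.g. 'googlebot', 'msnbot' or 'robots' (all engines).
function mytheme_robots_meta( $crawler, $indexable ) {
    $content = $indexable ? 'index,follow' : 'noindex,follow';
    return '<meta name="' . htmlspecialchars( $crawler ) . '" content="' . $content . '" />';
}

// Inside WordPress, print the tag from the wp_head action.
// The guard only lets this file load outside WordPress as well.
if ( function_exists( 'add_action' ) ) {
    add_action( 'wp_head', function () {
        // Same condition as the original snippet: only posts and pages get indexed.
        echo mytheme_robots_meta( 'robots', is_single() || is_page() ) . "\n";
    } );
}
```

Keeping the condition in one place also makes it easier to swap in other conditional tags later.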

While you’re at it, you may also add other tags, such as one that prevents search engines from showing a cached copy:

<meta name="robots" content="noarchive">

If you’d rather use a plugin for this, then I suggest you check out "Duplicate Content Cure Plugin for WordPress".


Optimizing using robots.txt

A robots.txt file will do just about the same thing, but might be a little trickier to set up correctly. Check out "WordPress SEO : using robots.txt to avoid content duplication" for more.
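As a rough illustration only, a robots.txt along these lines would ask all crawlers to skip category listings and paged listing URLs. The paths are assumptions based on a permalink structure that uses /category/ and /page/ prefixes, which may not match your blog, so adjust them before use. Keep in mind that robots.txt stops crawling rather than adding a noindex instruction to the page.

```
# Sketch only: paths assume /category/ and /page/ permalink prefixes
User-agent: *
Disallow: /category/
Disallow: /page/
```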

9 Responses to WordPress SEO : Using excerpt, robots.txt and noindex meta-tag for duplicate content in index, archives and categories

  1. […]   WordPress SEO : Using excerpt, robots.txt and noindex meta-tag for duplicate content in index, archi… […]

  2. Can you help me to understand here…

    Does it mean your home page won’t get indexed if you have noindex meta tag in your home page? How will that help your home page to get page ranking? Isn’t the home page the most important page for people to search? If it’s not indexed, how can people find you and how can you get better PR?

    How big is the penalty? Doesn’t it cost you PR?


  3. fiLi says:

    You’re right, of course, which is why the excerpt method is better for the index page. Thanks for pointing that out, since it’s not mentioned in the article.

    PR isn’t directly influenced by duplicate content, as it’s a matter of backlinks and their authority. The only catch is that PR might be split between variants such as the www and non-www versions of a URL, and sticking to one version combines the backlinks.

    So the penalty isn’t for your PR, it’s for your SERP rankings. How much? Nobody knows, but it’s enough for a page to be marked as “supplemental” to make it drop significantly.

  4. Thank you for the quick response. Yeah, I do see some of my pages dropping to the supplemental results.

    I guess it’s not too late for me, since my PR is still low.

  5. […] seen recently a lot of plugins for WordPress aimed at taking care of the duplicate content issue in search engines. Don’t […]

  6. […] WordPress SEO : Using excerpt, robots.txt and noindex meta-tag for duplicate content in index, archi… […]

  7. […] WordPress SEO: Using excerpt, robots.txt and no index meta-tag for duplicate content in index, archi… A great article which explains about WordPress SEO concepts which are not covered by others. […]

  8. Arjen says:

    I’m using the All-in-one-SEO plugin to do the job. Should be fine too.

  9. פורקס says:

    Can’t Google recognize what is the major post?
