And why should I care?
This week I’ve had two potential clients talk to me about their websites (we’re going to revamp what they already have) and both need to understand what Canonical means when it comes to the content on their websites. Maybe you do too? It’s all about Duplicate Content.
I’ll get to the meaning in just a minute but let me first explain a little about these two clients.
Client #1: The bulk of the content on this site is put there automatically. The client subscribes to a service that writes blog posts for websites and they automagically show up on the site. It’s pretty easy to do – you can just pull from an RSS source and import most anything as a blog post.
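To give you a feel for how little work that import takes, here is a minimal sketch in Python. It is not the client's actual setup: the feed contents, the example.com URLs, and the feed_to_posts helper are all made up for illustration, and a real site would hand the resulting posts to its own blogging platform.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real content service would supply its own items.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Content Service</title>
    <item>
      <title>Five Signs Your Brakes Need Attention</title>
      <link>http://example.com/brakes</link>
      <description>Squealing, grinding, and more.</description>
    </item>
    <item>
      <title>When to Rotate Your Tires</title>
      <link>http://example.com/tires</link>
      <description>A quick guide.</description>
    </item>
  </channel>
</rss>"""


def feed_to_posts(rss_text):
    """Turn each <item> in an RSS 2.0 feed into a simple post dictionary."""
    root = ET.fromstring(rss_text)
    posts = []
    for item in root.iter("item"):
        posts.append({
            "title": item.findtext("title", default=""),
            "source_url": item.findtext("link", default=""),
            "body": item.findtext("description", default=""),
        })
    return posts


posts = feed_to_posts(SAMPLE_RSS)
for post in posts:
    print(post["title"], "<-", post["source_url"])
```

Notice that every subscriber to the same feed ends up with the same titles and the same body text, which is where the trouble starts.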
Client #2: They have a template site. It’s not built on WordPress (I believe it’s a Drupal site if you really want to know). In their menu bar they have a “BLOG” item. When you click on the blog you are brought to a new site that almost, but not quite, resembles the original site. Chuck the idea of continuity out the window, eh? The ‘blog’ portion of the site is built on WordPress.com and even has that in the URL. The client has been copying and pasting the posts from the ‘blog’ to their main site on a weekly basis in an effort to get their content better seen.
A few years ago, the internet got together and decided to try to do something about a problem they saw happening in the form of ‘duplicate content’. Think of duplicate content as being the same stuff, but under a different URL.
There was the Bad Guys version of duplicate content, where they would create zillions of sites all with the exact same content. Oftentimes you would see this with domains like www.bestautorepairSanFrancisco.com, and of course they’ve created a domain for nearly every city imaginable. They were spamming the internet.
Initially the search engines fought back against this with what was called a “Duplicate Content Penalty”. If they found a bunch of duplicate content they would simply block ALL of it from showing up in their results. A knee-jerk reaction. It was effective, but it hurt the good guys too.
The Good Guys were people like you and me. We’re creating duplicate content every time we post something. It’s a problem because nearly every site has duplicate content these days. For example: this post has its own unique URL, but it’s also going to be listed under the archive section, and under the category of WordPress, and on and on. This isn’t just on WordPress sites – you do the same thing when you use a hashtag in a post on Facebook and Twitter. Can you see what I mean?
The Good News:
Google and the others got together and started recognizing what is called the Canonical Link Element.
The good news is that everything out there now uses “Canonical” in what they do.
The other good thing is that it’s automatic. You don’t have to even concern yourself about it. It’s going to happen whether you like it or not.
It looks like this: <link rel="canonical" href="http://example.com/page.html"/> but you don’t see it. It’s hidden in the code of your site.
Here’s what this post looks like
In the simplest terms what it says to the search engines is that this is the FIRST instance of this article and to ignore any and all other instances they might find.
Let me say that again. “this is the FIRST instance of this article and to ignore any and all other instances”
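If you’d like to check for the tag yourself rather than take my word for it, here is a minimal sketch in Python that pulls the canonical URL out of a page’s HTML. The sample page and the CanonicalFinder and find_canonical names are my own inventions for illustration; in practice you would feed it the source of a real page (View Source in your browser shows the same thing).

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Only <link> tags matter, and only the first canonical one.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html_text):
    """Return the canonical URL declared in html_text, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Hypothetical page source using the same example URL as above.
SAMPLE_PAGE = """<html><head>
<title>Sample post</title>
<link rel="canonical" href="http://example.com/page.html"/>
</head><body><p>Hello</p></body></html>"""

print(find_canonical(SAMPLE_PAGE))
```

If the function hands back nothing for one of your pages, that page isn’t declaring a canonical URL at all, and that’s worth asking your web person about.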
So when you hear someone say something about a “Duplicate Content Penalty” you can now point to them and laugh. There isn’t a duplicate content penalty anymore and there hasn’t been one in years!
It doesn’t matter if it’s within your own site or across the internet: Google simply recognizes the FIRST instance of any content and disregards the rest. Yay for Canonical!
The Bad News
Of course there’s bad news, right?
Remember Client #1, whose posts all came from a service he subscribed to? All those great posts are also sent to countless other sites that subscribe to the same service. Can you spot the problem yet?
Google is going to index ONLY the FIRST instance they find of those articles, and you can bet it isn’t going to be my client’s site!
As for Client #2… The content was theirs, written genuinely by them. For every piece of content they created the WordPress.com site (their ‘blog’) got full credit in the search engines and never their Home Base.
The fix for Client #2 is already in the works. I’m exporting all the original posts. We’re rebuilding the site as a single site (no separate blog) so each and every page will have the same consistent look and feel. The other win for this client is that we’ll also have consistent calls to action sitewide – that’s something they didn’t have on WordPress.com.
As for Client #1… They don’t want to write their own content. Too much work for them, they say, so I’ve recommended a ghost writer who will produce properly written content, centered around the local content they need. Hiring a ghost writer isn’t cheap. You want the proper keywords, the proper image tags and so much more. Sure, you can go over to Fiverr.com, but you’ll get what you pay for. Since you are wondering, the going rate for a properly written guest post is anywhere from $25 to $50.
I found the original post announcing Canonical from Matt Cutts – it’s dated Feb. 15th 2009!