Getting Rid of Duplicate Content Issues Once and For All: PubCon Las Vegas 2008, Day 3
November 13th, 2008
Duplicate content is becoming a major issue not only for the search engines, but also for Webmasters around the world. In this session, representatives of the big three, namely Google, Yahoo! Search and MSN Live, explained some of their strategies for dealing with duplicate content.
Moderator:
- Rand Fishkin
Speakers:
- Ben D'Angelo, Software Engineer of Google
- Derrick Wheeler, Senior Search Engine Optimization Architect of Microsoft
- Priyank Garg, Director Product Management of Yahoo! Search
The session was opened by Ben D'Angelo, who started off by pointing out the central duplicate content issue: multiple URLs pointing to the same page or to very similar pages. Duplicate content is also found across other websites as syndicated or scraped content. The ideal situation is one URL leading to one piece of content.
There are a number of examples of duplicates, such as www vs. non-www hostnames, session IDs, URL parameters, print-version pages, CNAMEs, etc. There is also similar content on different URLs, as well as sites in different countries with the same content.
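To make the list of variants concrete, here is a minimal Python sketch of how several of them can collapse to one URL. The parameter names in IGNORED_PARAMS and the example.com URLs are assumptions made up for illustration; a real site would list its own parameters, and nothing here reflects how any search engine actually normalizes URLs.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names that do not change the page content.
IGNORED_PARAMS = {"sessionid", "sid", "print"}

def normalize(url: str) -> str:
    """Collapse common duplicate-URL variants into one comparable key."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # treat www and non-www as the same host
    # Keep only the parameters that actually select content.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k.lower() not in IGNORED_PARAMS]
    path = parts.path or "/"
    return urlunsplit((parts.scheme.lower(), host, path, urlencode(query), ""))

variants = [
    "http://www.example.com/widgets?id=7",
    "http://example.com/widgets?id=7&sessionid=abc123",
    "http://EXAMPLE.com/widgets?id=7&print=1",
]
keys = {normalize(u) for u in variants}
print(keys)
```

All three variants reduce to the same key, which is exactly the "one URL, one piece of content" situation described above.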
Ben went on to explain how Google handles duplicate content. Google basically clusters duplicate pages together and chooses the page that best represents the cluster for the search. It employs different kinds of filters for the different kinds of duplicate content. But this is simply a filter, not any kind of penalty.
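The idea of clustering pages and picking one representative can be illustrated with a toy Python sketch. This is not Google's algorithm, which the session did not describe in detail; word shingles, Jaccard similarity, the 0.8 threshold and the "shortest URL wins" rule are all assumptions chosen for simplicity.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams in the text."""
    words = text.lower().split()
    return {tuple(words[i:i + size])
            for i in range(max(len(words) - size + 1, 1))}

def jaccard(a: set, b: set) -> float:
    """Share of shingles two pages have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b) if (a or b) else 1.0

def cluster(pages: dict, threshold: float = 0.8) -> list:
    """Greedily group URLs whose text is similar to a cluster's first page."""
    sigs = {url: shingles(text) for url, text in pages.items()}
    clusters = []
    for url in pages:
        for group in clusters:
            if jaccard(sigs[url], sigs[group[0]]) >= threshold:
                group.append(url)
                break
        else:
            clusters.append([url])
    return clusters

pages = {
    "http://example.com/a": "duplicate content is filtered by search engines and not penalized",
    "http://example.com/a?print=1": "duplicate content is filtered by search engines and not penalized",
    "http://example.com/b": "a completely different page about link building strategies",
}
clusters = cluster(pages)
# Hypothetical rule: show the shortest URL of each cluster, filter the rest.
representatives = [min(group, key=len) for group in clusters]
print(representatives)
```

The other URLs in a cluster are simply not shown; nothing is subtracted from them, which matches the "filter, not penalty" point.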
So how do you prevent this from happening to you? You can take some of the following measures:
- To prevent exact duplicates, one could use a 301 redirect.
- To prevent near duplicates, one could use robots.txt.
- A different language is not a duplicate. One could use unique content specific to each country.
- Don't put extraneous parameters in the URLs.
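The first two measures can be sketched in Python. The /print/ path, the example.com host and the choice of the non-www host as the preferred one are assumptions for illustration only. The robots.txt rules are checked with the standard library's urllib.robotparser, and redirect_for is a hypothetical helper showing what a server might answer, not a real server configuration.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt keeping crawlers away from print-version pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("*", "http://example.com/print/article-1"))  # blocked
print(parser.can_fetch("*", "http://example.com/article-1"))        # allowed

def redirect_for(url: str):
    """Return (301, target) for the non-preferred www host, else None."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        target = urlunsplit(parts._replace(netloc=host[4:]))
        return 301, target
    return None

print(redirect_for("http://www.example.com/page?x=1"))
print(redirect_for("http://example.com/page?x=1"))
```

In practice the 301 would be issued by the web server or application framework; the function only shows the decision being made.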
But there is a chance that other sites could cause duplicate content. If you are syndicating your content out, ensure that there is a link back to your original article or content. You could also syndicate only a short summary of it. If you are syndicating someone else's content, you could do the reverse.
It would be an extremely rare case for scrapers to have an impact on you or your content. However, one can't rule out the possibility, and if it does happen, one should file a DMCA request and/or a spam report.
Ben was followed by Priyank Garg, who explained how Yahoo! Search deals with the same problem. Yahoo! applies duplicate filters at every step in the pipeline. He went on to showcase some examples and stated that most duplicate content is accidental. A large number of duplicates come from soft 404s (error pages that are served with a success status), not from real 404s. Many duplicates are also abusive, such as those created by scrapers.
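One common way to check a site for soft 404s is to request a URL that should not exist and see whether the server still answers with a 200 status. The Python sketch below assumes a hypothetical fetch callable that returns a (status, body) pair; the two fake sites stand in for real HTTP requests so the example runs offline. It is not a description of how Yahoo! detects soft 404s.

```python
import uuid
from typing import Callable, Tuple

Fetch = Callable[[str], Tuple[int, str]]

def serves_soft_404(fetch: Fetch, base_url: str) -> bool:
    """Probe a random path that should not exist; a 200 suggests soft 404s."""
    probe = f"{base_url.rstrip('/')}/{uuid.uuid4().hex}"
    status, _body = fetch(probe)
    return status == 200

# Stand-ins for real HTTP clients (hypothetical behavior).
def good_site(url: str) -> Tuple[int, str]:
    return 404, "Not Found"

def sloppy_site(url: str) -> Tuple[int, str]:
    return 200, "Sorry, we could not find that page."

print(serves_soft_404(good_site, "http://example.com"))
print(serves_soft_404(sloppy_site, "http://example.com/"))
```

Every missing URL on the sloppy site returns the same "not found" page with a success status, so each one looks like a new duplicate page to a crawler.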
The final speaker of the session was Derrick Wheeler of Microsoft. Derrick made no bones about the fact that duplicate content is Microsoft's worst nightmare. Microsoft follows a methodology called CIRTA, which stands for:
- C= Crawl
- I= Index
- R= Rank
- T= Traffic
- A= Action
He offered the following tips on how to handle the problem of duplicate content:
- Try to detect when an engine comes to your site.
- Detecting an engine can sometimes be helpful, for example when dealing with session IDs.
- Be fully aware of your parameters.
- Make sure that you link to your parameters in a consistent order.
- Exclude any form of duplicates using robots.txt, noindex, nofollow, etc.
- Never assume that search engines can't find JavaScript.
- Try to get hold of a tool that can crawl your site. This will enable you to see how an engine will be looking at your site.
- Always focus on the strong URLs of your website first.
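The tip about linking to parameters in a consistent order can be sketched as a small Python helper that a site might run over every internal link it generates. The function name and the choice of alphabetical order are assumptions; any fixed order works, as long as it is the same everywhere.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def with_sorted_params(url: str) -> str:
    """Rewrite a URL so its query parameters always appear in sorted order."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(params)))

a = with_sorted_params("http://example.com/shoes?size=9&color=red")
b = with_sorted_params("http://example.com/shoes?color=red&size=9")
print(a)
print(a == b)
```

Without this step, the two input links would be crawled as two different URLs serving the same page.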
4 Responses to “Getting Rid of Duplicate Content Issues Once and For All: PubCon Las Vegas 2008, Day 3”
November 15th, 2008 at 07:41
I wonder how this is going to turn out. Hope we get to see something..
July 13th, 2009 at 07:24
I used to publish my articles, but now I wonder whether I should stop doing this because of the risk of a duplicate content penalty. Should I stop publishing my articles on article directories?
August 2nd, 2009 at 20:54
Hi,
Have you heard about DupeMagic? It's a Wordpress plugin to help you creating high quality unique content(not garbage). It will save a lot of your time.
You can read the review here :
http://internetmarketingtool.co.cc/seo-tools/dupemagic-review/
thanks,
Samuel
October 6th, 2009 at 19:26
[...] unaware of the fact whether their site is being penalized or is under filter. There might be some duplicate content in their site or some optimization issue. The problems can be numerous. Here are 4 flags that might [...]