Wednesday, December 16, 2009

To Duplicate or Not to Duplicate: What Is Duplicate Content?

What is duplicate content?

The patent contains a definition of duplicate content:

"Duplicate documents are documents that have substantially identical content, and in some embodiments wholly identical content, but different document addresses."

The patent describes three scenarios in which duplicate documents are encountered by a web crawler:

1. Two pages, comprising any combination of regular web page(s) and temporary redirect page(s), are duplicate documents if they share the same page content, but have different URLs.

2. Two temporary redirect pages are duplicate documents if they share the same target URL, but have different source URLs.

3. A regular web page and a temporary redirect page are duplicate documents if the URL of the regular web page is the target URL of the temporary redirect page or the content of the regular web page is the same as that of the temporary redirect page.

A permanent redirect page is not directly involved in duplicate document detection because the crawlers are configured not to download the content of the redirecting page.
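The three scenarios above can be sketched as a small check. This is only an illustration under stated assumptions, not the patent's implementation: the Page class and the are_duplicates function are hypothetical names, and "same content" is simplified to exact string equality, whereas the patent speaks of "substantially identical" content. Permanent redirects are left out, since their content is never downloaded.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Page:
    """Hypothetical record for one crawled address (not from the patent)."""
    url: str                               # address the crawler fetched
    content: Optional[str] = None          # page body, if one was downloaded
    redirect_target: Optional[str] = None  # set only for temporary redirects


def are_duplicates(a: Page, b: Page) -> bool:
    """Apply the three scenarios to two crawled pages."""
    if a.url == b.url:
        return False  # same address, so not a duplicate pair
    # Scenario 1: same content, different URLs (exact match is an assumption).
    if a.content is not None and a.content == b.content:
        return True
    # Scenario 2: two temporary redirects that share one target URL.
    if a.redirect_target is not None and a.redirect_target == b.redirect_target:
        return True
    # Scenario 3: one page is the target of the other's temporary redirect.
    return a.redirect_target == b.url or b.redirect_target == a.url
```

For example, a regular page at one URL and a temporary redirect pointing to that URL would be reported as duplicates by scenario 3.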

How does Google detect duplicate content?

According to the patent description, Google's web crawler consults the duplicate content server to check if a found page is a copy of another document. The algorithm then determines which version is the most important version.

Google can use different methods to detect duplicate content. For example, Google might take "content fingerprints" and compare them when a new web page is found.
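To make the idea of a content fingerprint more concrete, here is a minimal sketch. It assumes a fingerprint is simply a SHA-256 hash of the page text after whitespace and case are normalized; the patent text quoted here does not say which fingerprint Google uses, and the FingerprintIndex class is a hypothetical in-memory stand-in for the duplicate content server.

```python
import hashlib
import re
from typing import Dict, Optional


def content_fingerprint(text: str) -> str:
    """Return a SHA-256 hex digest of the text after light normalization."""
    # Collapse whitespace runs and ignore case so trivial differences vanish.
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class FingerprintIndex:
    """Hypothetical stand-in for a duplicate content server."""

    def __init__(self) -> None:
        self._first_url: Dict[str, str] = {}

    def check(self, url: str, text: str) -> Optional[str]:
        """Return the URL first seen with this content, or None if it is new."""
        fingerprint = content_fingerprint(text)
        seen = self._first_url.setdefault(fingerprint, url)
        return None if seen == url else seen
```

An exact hash like this only catches wholly identical content. Detecting "substantially identical" pages would need a fuzzier technique, which is beyond this sketch.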

Interestingly, it's not always the page with the highest PageRank that is chosen as the most important URL for the content:

"In some embodiments, a canonical page of an equivalence class is not necessarily the document that has the highest score (e.g., the highest page rank or other query-independent metric)."

How does this affect your website?

If you want to get high rankings, it is easier to do so with unique content. Try to use as much original content as possible on your web pages.

If your website must use the same content as another website, make sure that your website has better inbound links than the other websites that carry the same content. Your website is then more likely to be chosen as the most important URL for that content.

If your web site has unique content, you don't have to worry about potential duplicate content penalties. Optimize that content for search engines and make sure that your web site has good inbound links.

Wednesday, December 9, 2009

Google Caffeine: what's new?

Google Caffeine is the name given to Google's next algorithm update, which is going live after the holidays. It seems that Google Caffeine will be more than one of Google's regular updates. It will probably be a major overhaul of the calculations that Google uses to rank web pages.

Caffeine: what is going to change?

Of course, Google hasn't revealed the details of Google Caffeine yet. However, the new index has been live on some test servers and some Google employees also talked about the next index. The following factors might play a larger role in Google's next index:

1. Website speed: if you have a slow loading website, it might not get high rankings on Google.

2. Broken links: if your website contains many broken links, this might have a negative impact on the position of your web pages in Google search results.

3. Bad neighborhoods: Linking to known spammers and getting a lot of links from known spammers isn't good for your rankings in Google's current algorithm. The negative impact of a bad neighborhood will probably be even worse with Google Caffeine.

4. The overall quality of your website: Google's new algorithm probably will take a closer look at the overall quality of your website. It's not enough to have one or two ranking factors in place.

5. You'll probably need well-optimized content, a good website design with clear navigation, good inbound links, a low bounce rate, etc. The number of social bookmarks might also play an increased role.

6. Factors like the age of a website, its past history, authority etc. will still play a role in Google's new index. However, the effect of the different factors on your rankings will shift.
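Of the factors above, broken links are the easiest to check yourself. The following is a rough sketch, not a description of how Google evaluates links: it collects the links on a page with Python's standard html.parser module and reports those whose HTTP status is 400 or higher. The names LinkCollector, find_broken_links and get_status are assumptions made for this example; get_status is any function you supply that returns the status code for a URL.

```python
from html.parser import HTMLParser
from typing import Callable, List


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_broken_links(html: str, get_status: Callable[[str], int]) -> List[str]:
    """Return the links whose status, as reported by get_status, is >= 400."""
    collector = LinkCollector()
    collector.feed(html)
    return [url for url in collector.links if get_status(url) >= 400]
```

Passing the status lookup in as a function keeps the sketch testable without network access. In practice it could wrap a real HTTP request.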

How can you adjust your web pages to Google's new Caffeine index?

Although Google's Caffeine update hasn't been released yet, there are some things that you can do to increase the chances that your website will get good rankings in Google's new index:

1. Remove all spam elements from your web pages. Anything that might be considered spam can and will have a negative effect on the position of your web pages sooner or later. This includes text that has nearly the same color as the background, cloaking and fully automated linking systems.

2. Check your website design and the navigation of your website. Your website should have a professional look and feel. The navigation should be easy to understand and your web pages should easily be parseable by search engine spiders. You can test this with the search engine spider simulator in

3. Get links from social bookmark websites. Social bookmark links already play a role in Google's current algorithm and that role might increase.

4. Check your links. You shouldn't link to websites that look like spammers. It's better to focus on selected quality links instead of as many links as possible.
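The first tip above mentions text that has nearly the same color as the background. As a rough self-check, you can compare the two colors numerically. This sketch assumes six-digit hex colors and uses plain distance in RGB space with an arbitrary threshold of 30; both the function names and the threshold are assumptions made for illustration, and nothing here reflects how Google actually detects hidden text.

```python
from typing import Tuple


def hex_to_rgb(color: str) -> Tuple[int, int, int]:
    """Convert '#rrggbb' (or 'rrggbb') to an (r, g, b) tuple of ints."""
    color = color.lstrip("#")
    if len(color) != 6:
        raise ValueError("expected a six-digit hex color")
    return (int(color[0:2], 16), int(color[2:4], 16), int(color[4:6], 16))


def looks_hidden(text_color: str, background_color: str,
                 threshold: float = 30.0) -> bool:
    """Flag text whose color lies within `threshold` RGB units of the background."""
    r1, g1, b1 = hex_to_rgb(text_color)
    r2, g2, b2 = hex_to_rgb(background_color)
    # Euclidean distance between the two colors in RGB space.
    distance = ((r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2) ** 0.5
    return distance < threshold
```

With these assumptions, very light gray text on a white background is flagged, while black text on white is not.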

Google Caffeine is going to be released after the holidays. If you follow the tips above, your website will be in a good position when Google's new index goes online.

Tuesday, December 1, 2009

Once upon a time.....

Once upon a time there was a very hot-tempered young man. One day his father gave him a bag of nails and told him to drive one nail into the fence post every time he failed to control his anger.

On the first day there were several dozen nails in the post. Over the following week he learned to restrain his anger, and with each passing day the number of nails hammered into the post decreased. The young man realized that it is easier to control his temper than to hammer nails.

Finally the day came when he did not lose his composure even once. He told his father, who said that from now on, for every day his son managed to control himself, he could pull one nail out of the post.

Time passed, and the day came when he could tell his father that not a single nail was left in the post. Then the father took his son by the hand and led him to the fence:

- You have done well, but do you see how many holes are in the post? It will never be the same as before. When you say something hurtful to a person, it leaves a scar, just like these holes. And no matter how many times you apologize afterwards, the scar remains.